The minimum requirement for active attendance: 3 tasks done
Exercise group 4 (Tuesday, 19th Oct, 16-18) will exceptionally be at B450.
On week 44 (Nov 1-5) we have neither lectures nor exercise groups. The first exam will be as indicated: Nov 3rd. The requirements of the exam will be expresses later.
1. Sketch the structure of the course registration database used at our department so that you can roughly estimate how much disk space the registration system needs for registrations of one term. Suppose that there are 1500 students, 50 lecture courses with 1-20 exercise groups each (seminars and lab groups are omitted), 70 final exams, 5000 registrations for courses and 1000 for final exams. The system is based on an Oracle database. Could it be a system based on data stored in the main memory (pros/cons of that)? 2. Let us implement the EMPLOYEE relation of the Company database as a heap file. The size of a disk block is 4 KB. What kind of an internal structure would you use for an EMPLOYEE tuple? Estimate the size of the file (the number of blocks), when there are 10000 employees in all. 3. a) A file of 80000 blocks is stored optimally on a disk pack having 12 disk surfaces, 2000 tracks per surface and 36 blocks per track. How much time is spent scanning through the file when the blocks are read in the order they are stored on the disk. The seek time is 1-12 ms (the average is 6 ms) and block transfer time 0.2 ms. b) What is the time needed to read all blocks separately (in an arbitrary order)? 4. A hash file structure contains B = 4 buckets. The hash key is integer-valued and the hash function h is defined by the equation h(v) = v mod B. A disk block can hold two records of the file. Give the contents of the file structure when records with the keys 9, 10, 11, 31, 32, 41, 42, 43, 44, 45, and 46 (in this order) have first been inserted into the file and when the records with keys 10, 41, and 42 have then been deleted from the file. 5. Let us implement the WORKS_ON relation of the Company database as a hash file with ESSN as the hash key. How would you define the hash function (consider 2 to 3 alternatives). Suppose that the SSN is formed as in Finland: DDMMYY-123X (day, month, year, date-based counter which is even for women and uneven for men, check character). 6. What strategy would you use in executing the query select ESSN, count(PNO) from WORKS_ON group by ESSN, when the structure of the WORKS_ON relation is as in the previous problem? How many disk accesses are needed? How much main memory (buffer blocks) is consumed?