Database management, Autumn 2000, Exercise 1 (18.-22.9.)

The minimum requirement for active attendance: 3 tasks done

The purpose of tasks 1 and 2 is simply to repeat some basic features of the SQL query language. For SQL, see the textbook (E&N, Ch. 8), Oracle manual etc. The SQL exercises concern the database described in E&N, page 205.

1. Explain the meaning of the following SQL query.

select x.LNAME, y.LNAME
from EMPLOYEE x, EMPLOYEE y
where x.SUPERSSN = y.SSN
union select x.LNAME, z.LNAME
from EMPLOYEE x, EMPLOYEE y, EMPLOYEE z
where x.SUPERSSN = y.SSN and y.SUPERSSN = z.SSN.
2. Write SQL queries for printing
a) the names and addresses of the employees working for projects of Administration department;
b) employees with no children;
c) the employees that work on no project.

3. Sketch the structure of the course registration database used at our department so that you can roughly estimate how much disk space the registration system needs for registrations of one term. Suppose that there are 3000 students, 100 lecture courses with 1-20 exercise groups each (seminars and lab groups are omitted), 80 final exams, 10000 registrations for courses and 2000 for final exams. (The data indicated should be sufficient to find the needed relations.)

The system uses a disk based database (Oracle). Would it be possible to construct the system based on data stored in the main memory? (What would be pros/cons of that)?

4. Let us implement the EMPLOYEE relation of the Company database as a heap file (E&N, Ch. 5.7). The size of a disk block is 4 KB. What kind of an internal structure would you use for an EMPLOYEE tuple? (What alternative structures there exist?) Estimate the size of the file (the number of blocks), when there are 10000 employees in all.

5. (**) a) A file of 80000 blocks is stored optimally on a disk pack having 12 disk surfaces, 2000 tracks per surface and 36 blocks per track. How much time is spent in scanning through the file when the blocks are read in the order they are stored on the disk. The seek time is 1-12 ms (the average is 6 ms) and block transfer time 0.2 ms. (Interblock gaps need not to be considered.)

b) What is the time needed to read all blocks separately (in an arbitrary order)?

c) With what number of blocks X it is more efficient to scan through the whole file as compared to the series of single requests?

d) Which of the indicated disk parameters have the most effect to these results? (Estimate shortly how the changes in parameter values are visible in results.)

(**) this means that the task 5 is graded as 2 tasks