Database management, Autumn 2000, Exercise 5 (16.-19.10.)

The minimum requirement for active attendance: 3 tasks done

These tasks (2-7) concern transaction management and especially log-based recovery. In E&N, the corresponding material is in Chs. 19 and 21. Ch. 19 contains general transaction management and Ch. 21 recovery in more detail. It is not easy to express exactly the needed parts but I will do something and write it on the English course page (until the weekend).

1. a) Explain two essentially different ways to implement the query
select * from employee where salary < 20000 or dno = 3;
The essential point is that the two conditions are disjunktively combined. Consider the use of indexes.

b) On page 40 (ch. 4) in the lecture notes there exists the query

SELECT * 
  FROM accounts 
  WHERE custno IN 
    (SELECT custno FROM customers); 

which is transformed to the query

SELECT accounts.* FROM accounts, customers WHERE accounts.custno = customers.custno;

The former contains a subquery and the latter a join operation. The queries could be interpreted as two alternative execution plans for the same query.
Is this transformation always possible? How about the efficiency of these versions?

2. a) What kind of an operation sequence is included in a typical online banking terminal transaction (a cash withdrawal)? We assume that the card's verification code is checked locally (without a database) but the accounts are managed by a database. Which exceptions there can happen (concerning the transaction management)?

b) Suppose that a system makes reservations for several successive flights (from A to B with some transfers between) or for a holiday trip (with hotel, flights, and rental car). Are these situations somehow problematic from the transaction management point of view?

3. Assume that the following records are found in the log file when the system crashes:

100: [start, T1]
101: [start, T2]
102: [write, T1, A, 10, 20]
103: [write, T2, B, 11, 22]
104: [commit, T1]
105: [start, T3]
106: [write, T3, A, 20, 30]
107: [abort, T2]
108: [checkpoint]
109: [start, T4]
110: [write, T3, B, 11, 33]
111: [write, T3, A, 30, 40]
112: [write, T4, C, 15, 25]
113: [commit, T3]
114: [write, T4, C, 25, 35]
a) What recovery operations are needed? We assume that the PageLSN (page log sequence number) field of the page containing data item A is 106, the PageLSN for B is 110 and the PageLSN for C is 112.

b) Let us assume that no checkpoint record (# 108) has been written and some failure occurs. In what phase of the log sequence the failure causes the largest number of recovery operations?

4. What are the advantages of write-ahead logging (WAL) protocol in recovery? Is this protocol necessary (or can you find some viable alternative)?

5. Why are all undos performed in backward order, and all redos in forward order, from the log? Give a concrete example describing the significance of the backward/forward order.

6. What kind of a commit protocol should be used were it required that no redos be necessary in recovery? What drawbacks there are with this scheme?

7. Assume that when the database management system is recovering from a failure (that is, performing undo or redo operations), a new failure occurs before the recovery from the old one is complete. How is this situation handled? Should one also record undos and redos in the log, so that these could be "undone" or "redone"?

The Finnish lecture notes are available via the Finnish course page ('Luentokalvot') in pdf-format or in a course folder in room A412 (on paper).