Design and Analysis of Algorithms

582630
5
Algorithms and machine learning
Advanced studies
General design principles of algorithms. Examples of central problems and typical solutions. Average case analysis. Amortised complexity. Recurrences. NP-completeness. Prerequisites: the course Data Structures and Algorithms or equivalent.

Exam

22.10.2015 09.00 A111
Year Semester Date Period Language In charge
2015 autumn 02.09-14.10. 1-1 English Veli Mäkinen

Lectures

Time Room Lecturer Date
Wed 14-16 CK112 Veli Mäkinen 02.09.2015-14.10.2015

Exercise groups

Group: 1
Time Room Instructor Date Observe
Fri 10-12 B222 Veli Mäkinen 04.09.2015—16.10.2015 Study group
Group: 2
Time Room Instructor Date Observe
Fri 12-14 B222 Alexandru Tomescu 04.09.2015—16.10.2015 Study group
Group: 3
Time Room Instructor Date Observe
Wed 10-12 B119 Topi Talvitie 09.09.2015—14.10.2015
Group: 4
Time Room Instructor Date Observe
Wed 12-14 B119 Topi Talvitie 09.09.2015—14.10.2015
Group: 5
Time Room Instructor Date Observe
Fri 14-16 C222 Veli Mäkinen 04.09.2015—04.09.2015 Study group
Fri 14-16 B222 Veli Mäkinen 11.09.2015—25.09.2015 Study group
Group: 6
Time Room Instructor Date Observe
Wed 16-18 D122 Johannes Verwijnen 09.09.2015—14.10.2015

The room for lectures has changed!

Information for international students

Language: The course will be given in English. The questions in the course exam, renewal exam, and separate exams will be in English.

General

News:

  • ABOUT THE EXAM:
    • Exam is now graded: see grounds for grading. Feedback session is on Friday 6.11. 10-11.30 C222. 8.30-10.00 C222.
    • Please check the precise time and place in advance. You need to be prepared to prove your ID. You may bring only the very basic equipment, like pen or pencil, but no calculator, lecture notes etc. Paper sheets are provided to you by the house. You will have a total of 2 and a half hours for answering all the questions given in a separate sheet. You may expect the exam questions to be mostly similar to the exercise problems. See an old exam, a set of sufficient model solutions, and a set more detailed solutions and grading guidelines. [Note. The content has changed quite a lot from last year]
    • Course exam is recommended for all who followed the course actively, as it will be pretty educational; if you are missing points from exercises to exceed the level required, ask the lecturer for permission to attend anyways (e.g. by activity at lectures & study groups).
    • There will be 4 questions that will touch somehow 7 topics.  See the learning goal matrix to know what you should know in the exam. This year's edition adds to the cell "Reaches the learning objectives" / "Graph algorithms": Reduces related problems to maximum flow and can describe the principles of how maximum flows are solved.
    • There will be one voluntary extra (difficult) * question in the course exam, whose points can complement missing exercise points. Later separate exams will test the knowledge more directly without educational goals.
    • In study groups (and lectures) we sometimes had advanced topics (e.g. median finding & related extra exercises for week 1, universal hashing, proofs of flow algorithms, cook theorem, LCA queries, speeding up dynamic programming with amortized data structures). These advanced topics were for motivational purposes, to see that the techniques we learn are useful and inter-connected. If you feel you master well those advanced topics, we could organize a separate exam (under different course name) on those. Answer this poll if you are interested in this option.
  • Added one more optional assignment to exercise 6 (after the printouts where shared at Friday study groups).
  • Check the corrected version of lecture nodes 5 and exercise sheet 5: the expected number of comparisons that RandomizedQuickSort does is at most  2n ln n <= 2n lg n. In this course we denote lg := log_2 and ln := log_e.
  • Check the corrected version of  exercise sheet 4: Sliding window definition min_j A[i] -> min_j A[j]
  • Friday study group 14-16 cancelled for the rest of the course; we'll fit  to two study groups.
  • Added a proof of correctness for the Cartesian tree construction (see lecture material).
  • A hint re-extended in exercise 3.3 and ... added in exercise 2.6 between 2 and (i-1).
  • Extra study group on Monday 16-18 A218 for those who absolutely cannot travel on this Friday (limited space, register here)
  • Check the corrected version of  exercise sheet 2: Smaller -> greater (2.3, line -3). Link added to the example graph (2.1).
  • Solutions to to the study group 2 problems added
  • Learning goal matrix added (see couple of lines down).
  • Lecture material now linked below (scroll down). I'll add these after the lecture so that I have chance to make corrections that typically come to the mind while lecturing. These are not for self-studying, but hopefully help in recalling what the lecture was about; sorry for the bad handwriting, these are just notes to myself and typically many things are added spontaneously at the lecture. Check always the corresponding book chapters that contain more details. I'm fixing (and updating below) the lecture plan typically on Mondays, so you can even study the planned topics in advance.

Description: General design principles of algorithms. Examples of central problems and typical solutions. Average case analysis. Amortised complexity. Recurrences. NP-completeness.

Prerequisites: the course Data Structures and Algorithms or equivalent. Strongly recommended: basic courses in mathematics, especially in calculus, probability, and discrete math (basics of logic, graphs, combinatorics).

Learning goals: see this draft matrix. This year's edition adds to the cell "Reaches the learning objectives" / "Graph algorithms": Reduces related problems to maximum flow and can describe the principles of how maximum flows are solved.

Weekly agenda:

  • Wednesday lecture by V. M.:  Introduction to the week's topic and setting up goals what to study before attending your selected Friday study group.
  • Friday study groups by V. M. and A. T.: There will be 4 different questions written on the black/white boards and the attendants are split evenly to solve those questions. Once all groups have reached a satisfactory solution, groups are mixed until each attendant has got an opportunity to teach a solution and/or learn the rest from others.
  • Wednesday exercise sessions by T. T.: Exercise questions are announced at latest on Friday after the study groups, and the goal of this independent homework is to reflect how well the learning goals are met. Seeing solutions by others and presenting your own solution at the exercise session gives the final opportunity to learn the previous week's topic before the next lecture.          

Note: The estimated work load of the course is 5 credit units. As 1 credit unit amounts to about 27 hours of work, the estimated total work load in the course is about 135 hours, meaning about 19 hours of work per week. For instance, 2 hours for the lecture, 4 hours preparing for the study group, 2 hours for attending the study group, 9 hours for solving the exercise problems, and the remaining 2 hours for attending the exercise session.

Completing the course

To pass the course you need to earn more than 30 points (tentative, may be lowered depending on the difficulty of the exam). From the exam you may get up to 54 points and from weekly exercises you may earn up to 6  points. To attend the course exam, you should earn at least one point from exercises. Otherwise, you can alway attend a separate exam.

To get exercise points you need to either (a) show up in an exercise session and describe your solution if asked, or (b) send your solutions to the instructur (topi.talvitie[at]helsinki.fi) by email before the first exercise session of the week. The number of exercise points you get is (the ceil of) the number of problems you have solved (marked) times six divided by the sum of weekly maxima. At any time, if you want to know the points you have earned so far, please send email to the instructor (topi.talvitie[at]helsinki.fi).

Literature and material

 

Course book: T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein: Introduction to Algorithms, 3rd ed., MIT Press, 2009.  

Chapters 1, 2, 3, 22, 23, 24, 25 are recommended as background material before (during) attending the course; these concepts are briefly revisited when needed, but not thoroughly covered as these chapters are part of prerequisite Data Structures and Algorithms course.

The main content of the course consist of Chapters 4 (except 4.6), 5 (except 5.4), 7, 9, 15, 17, 26, 34, 35, scheduled along seven lecture weeks as follows:

Week I (36): Divide and conquer; Recurrences; Sorting; Matrix multiplication - Chapters 4, 7.1, 9.3

Week II (37): Dynamic programming - Chapter 15

Week III (38): NP-completeness; TSP, SAT, Subset Sum, and other examples - Chapter 34 

  • Wednesday lecture covered definitions of NP-completeness concepts through encodings and polynomial reductions, an example reduction from SAT to 3-CNF (this was continued at the study groups), and some words about NP-hardness of clustering (Proc. WALCOM '09)

Week IV (39): Amortized analysis - Chapter 17

Week V (40): Randomization; Average case analysis - Chapters 5, 7

  • Wednesday lecture covered Definitions of worst-case, average-case and expected case running time, RandomizedSelect, RandomizedQuickSort and their expected time analysis, Discussion of Las Vegas/Monte Carlo randomized algorithms, Freivalds' randomized Monte Carlo algorithm for verifying matrix products. Read Appendix C.2 and C.3 for basics on probabilities. The definition of expected running time was sloppy; in the theorem about RandomizedQuickSort lg (i.e. log_2) was incorrectly used instead of ln (i.e. log_e). See the corrected hand-written notes

Week VI (41): Flows; Matchings; Approximation algorithms; - Chapters 26, 35

  • Wednesday lecture gave an overview of Ford-Fulkerson and Edmonds-Karp algorithms for maximum flows, and introduced approximation algorithms through 2-approximation algorithms for vertex cover and metric TSP. 

Week VII (42): Recap and rerun of topics needing more rehearsal

  • Wednesday lecture covered Cook-Levin theorem / proof of NP-completeness of SAT (Cook1, Cook2, Cook3, typos on last screen: T_{i,\sigma} should be T_{i,\sigma,k} and T_{i,k+1} should be T_{i,\sigma',k+1}), and recapped some combinations of techniques we studied in previous weeks, i.e. those that could make it to the exam.
  • If you are curious to deepen your knowledge about possible combinations of techniques, see this material on how to support min-deque operations in constant amortized time and how to apply sliding window minima to speed-up dynamic programming; these are not asked in the exam, but if you can follow this material, you should be safe when it regards to knowledge in dynamic programming and amortized analysis required at the exam.
  • Solutions to the first week exercises not yet covered will appear below and are good for rehersing recurrences / divide and conquer.

 

Study group topics

Week I (36): Divide and conquer; Recurrences; Sorting; Matrix multiplication - Chapters 4, 7.1, 9.3

1. Substitution method (pp. 83-87)

  • Voluntary homework: try 4.3-2 with c log n observing it fails and then try c log (n-2)

2. Recursion-tree method (pp. 88-93)

3. Master method (pp. 93-96)

4. Selection/median finding in linear time (pp. 220-222)

Week II (37): Dynamic programming - Chapter 15

  • We'll do some problem solving. For preparation, check Chapter 15 from the book (other examples beyond what we covered at the lecture). Also you might find useful e.g. this video lecture on dynamic programming.
  • Week III (38): NP-completeness; TSP, Subset Sum, and other examples - Chapter 34
    Read one of the following before attending the study group:
  1. Finishing the proof of  Theorem 34.10: The 3-CNF problem is NP-complete (pages 1082-1085)
  2.  Proof of Theorem 34.11: The clique problem is NP-complete (pages 1086-1089)
  3.  Proof of Theorem 34.12: The vertex-cover problem is NP-complete (pages 1089-1091)
  4.  Proof of Theorem 34.14: The traveling salesman problem is NP-complete (pages 1096-1097); Exercise 34.5-6: Show that hamiltonian-path problem is NP-complete.

Week IV (39): Amortized analysis - Chapter 17

  1. 17.1 Aggregate analysis (pp. 452-455)
  2. 17.2 The accounting method (pp. 456-458)
  3. 17.3. The potential method (pp. 459-462)
  4. Proof of Lemma 7 in (http://www.ics.uci.edu/~eppstein/261/BenFar-LCA-00.pdf)

Week V (40): Randomized algorithms. Read Appendix C.2 and C.3 for basics on probabilities.

  1. Randomize-In-Place (Section 5.3, pages 126-128)
  2. Balls and Bins (Section 5.4.2)
  3. A randomized approximation algorithm for MAX-3-CNF satisfiability (in Section 35.4) [Monte Carlo algorithm]
  4. Universal hashing (Section 11.3.3, pages 265-266 only)

Week VI (41): Flows; Matchings; Approximation algorithms; - Chapters 26, 35

  • Lemma 26.4 & Theorem 26.6 (Max-flow min-cut), pages 720-724
  • Lemma 26.7  & Theorem 26.8 (Edmods-Karp shortest-path monotonicity & running time), pages 727-730
  • Reduction from maximum bipartite matching to max-flow, pages 732-735
  • Theorem 35.3 (inapproximability of non-metric TSP), pages 1115-1116

Week VII (42): Recap and rerun of topics needing more rehearsal

Exercises

 

Additional material:

DAA@MIT.OCW, TCS cheatsheet, Compedium of NP-complete problems, a graph of NP-complete problems

Associated lecture slides  (by Valentin Polishchuk (c) 2011)

Home works and their solutions will appear here.