University of Helsinki - Department of Computer Science


581550 Data mining (tietämyksen muodostaminen), 3 cu

10 Sep - 17 Oct 2002, Tue and Thu 10-12
Lecture room A414, Teollisuuskatu 23, Vallila

http://www.cs.helsinki.fi/hannu.toivonen/teaching/timuS02/



Course grades

Grades for the April 2003 exam are now available on the intranet.

If your name is not on the list, you did not pass. To pass the course, 10/20 points are required for the project work and 20/40 points for the course exam. Students who did not pass because of the project work were informed of this before the exam. All other failures are due to the exam; ask Taneli or Hannu for your points if you are interested.

Course description

Finnish course description: http://www.cs.helsinki.fi/hannu.toivonen/teaching/timuS02/kuvaus.html

See slides 2-6 for English information about the course.

The course is lectured in Finnish. Students who do not speak Finnish can nevertheless take the course: all course material is in English, and the Tuesday (8-10) exercise group is held in English. (Finnish students are also encouraged to attend the Tuesday exercise group, to avoid overfilling the Friday group. Discussions in the Tuesday group can be partly in Finnish when necessary.)

Course material

Project work

Exercises

Course contents and schedule

  • Course overview
    • Tue 10 Sep 02: slides 1-8
  • Introduction to data mining
    • Tue 10 Sep 02: Chapter 1 of the text; slides A1-A37, slides 17-18
  • Association rules and Apriori algorithm
    • Thu 12 Sep 02: Chapter 2, pages 11-19; slides 34-54
    • Tue 17 Sep 02: Chapter 2, pages 20-30; slides 55-73
  • An example problem: alarm correlation
    • Tue 17 Sep 02: Chapter 3, pages 31-34; slides 74-78
    • Thu 19 Sep 02: Chapter 3, pages 34-38; slides 79-81
  • Frequent episodes
    • Thu 19 Sep 02: Chapter 4, pages 39-47; slides 82-97
    • Tue 24 Sep 02: Chapter 4, pages 47-60; Chapter 5, pages 61-62; slides 98-107
    • Thu 26 Sep 02: Chapter 5, pages 62-70; slides 108-113
  • The knowledge discovery process
    • Thu 26 Sep 02: Chapter 6, pages 71-80; slides 114-124
  • Generalized framework
    • Tue 1 Oct 02: Chapter 7, pages 81-90; slides 125-135
  • Complexity of finding frequent patterns
    • Tue 1 Oct 02: Chapter 8, pages 91-92; slides 136-140
    • Thu 3 Oct 02: Chapter 8, pages 92-102; slides 141-159
  • Closed sets and generators
    • Tue 8 Oct 02: Articles [1,2]; slides "closed sets"
  • FP-tree
    • Thu 10 Oct 02: Articles [3,4]; slides "fptree"
  • Sampling
    • Tue 15 Oct 02: Chapter 9, pages 103-119; slides 160-183
  • Summary
    • Thu 17 Oct 02: summary slides

References