Course grades
Grades for the April 2003 exam are now available
on the intranet.
If your name is not on the list, you didn't pass.
To pass the course, 10/20 points are required for the project work
and 20/40 points for the course exam.
Students who didn't pass because of the project work were informed
of this before the exam.
Other failures are due to the exam; ask Taneli or Hannu for
your points if you are interested.
Course description
Finnish course description:
http://www.cs.helsinki.fi/hannu.toivonen/teaching/timuS02/kuvaus.html
See slides 2-6 for English information
about the course.
The course is lectured in Finnish.
Students who do not speak Finnish can nevertheless take the course:
all course material is in English and the Tuesday (8-10)
exercise group is held in English.
(Finnish students are also encouraged to attend the Tuesday
exercise group, to avoid overfilling the Friday group.
Discussions in that group can be partially in Finnish, too,
when necessary.)
Course material
-
All text so far:
ps (1.9MB)
|
gzipped ps (0.3MB)
|
pdf (1MB)
|
gzipped pdf (0.8MB)
-
Articles:
-
N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal:
Discovering Frequent Closed Itemsets for Association Rules.
In International Conference on Database Theory - ICDT '99,
Jerusalem, Israel, 1999.
Lecture Notes in Computer Science 1540, 398-416. Springer.
-
J.-F. Boulicaut, A. Bykowski, and C. Rigotti:
Approximation of frequency queries by means of free-sets.
In Proceedings of the Fourth European Conference on Principles
and Practice of Knowledge Discovery in Databases PKDD'00,
Lyon, France, 2000.
Lecture Notes in Computer Science LNAI 1910, 75-85.
-
Jiawei Han, Jian Pei, Yiwen Yin:
Mining Frequent Patterns without Candidate Generation.
2000 ACM SIGMOD Intl. Conference on Management of Data.
-
Jian Pei, Jiawei Han, Runying Mao:
CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets.
ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery 2000.
-
All slides to the text so far:
ps (6MB)
|
gzipped ps (0.3MB)
|
pdf (1MB)
|
pdf (0.6MB)
Additional introductory slides (A1-A41):
ps
|
pdf
Slides about closed sets:
ps
|
pdf
Slides about FP-tree:
ps (huge!)
|
ps (gzipped)
|
pdf (small, uncompressed)
Summary slides:
ps
|
pdf
Project work
Exercises
-
Exercises 1 (due Sep 16 - 20)
ps | pdf
Links to the datasets of the exercise:
-
Exercises 2 (due Sep 23 - 27)
ps | pdf
-
Exercises 3 (due Sep 30 - Oct 4)
ps | pdf
-
Exercises 4 (due Oct 7 - Oct 11)
ps | pdf
-
Exercises 5 (due Oct 14 - Oct 18)
ps | pdf
Course contents and schedule
-
Course overview
- Tue 10 Sep 02: slides 1-8
-
Introduction to data mining
- Tue 10 Sep 02: Chapter 1 of the text, slides A1-A37, slides 17-18
-
Association rules and Apriori algorithm
- Thu 12 Sep 02: Chapter 2, pages 11-19; slides 34-54
- Tue 17 Sep 02: Chapter 2, pages 20-30; slides 55-73
-
An example problem: alarm correlation
- Tue 17 Sep 02: Chapter 3, pages 31-34; slides 74-78
- Thu 19 Sep 02: Chapter 3, pages 34-38; slides 79-81
-
Frequent episodes
- Thu 19 Sep 02: Chapter 4, pages 39-47; slides 82-97
- Tue 24 Sep 02: Chapter 4, pages 47-60;
Chapter 5, pages 61-62; slides 98-107
- Thu 26 Sep 02: Chapter 5, pages 62-70; slides 108-113
-
The knowledge discovery process
- Thu 26 Sep 02: Chapter 6, pages 71-80; slides 114-124
-
Generalized framework
- Tue 1 Oct 02: Chapter 7, pages 81-90; slides 125-135
-
Complexity of finding frequent patterns
- Tue 1 Oct 02: Chapter 8, pages 91-92; slides 136-140
- Thu 3 Oct 02: Chapter 8, pages 92-102; slides 141-159
-
Closed sets and generators
- Tue 8 Oct 02: Articles [1,2]; slides "closed sets"
-
FP-tree
- Thu 10 Oct 02: Articles [3,4]; slides "fptree"
-
Sampling
- Tue 15 Oct 02: Chapter 9, pages 103-119; slides 160-183
-
Summary
- Thu 17 Oct 02: summary slides
References
-
Association rules and Apriori algorithm
Rakesh Agrawal, Heikki Mannila, Ramakrishnan Srikant, Hannu Toivonen,
and A. Inkeri Verkamo:
Fast discovery of association rules.
In Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and
Ramasamy Uthurusamy, editors,
Advances in Knowledge Discovery and Data Mining, 307 - 328.
AAAI Press, 1996.
-
Episodes
Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo:
Discovery of frequent episodes in event sequences,
Data Mining and Knowledge Discovery 1(3): 259 - 289, November 1997.
-
Alarm analysis in telecommunication networks, KDD process
Mika Klemettinen, Heikki Mannila, and Hannu Toivonen:
Interactive exploration of interesting patterns in
the Telecommunication network alarm sequence analyzer TASA,
Information and Software Technology 41(9): 557 - 567, June 1999.
-
Generalization of the problem, levelwise search and borders
Heikki Mannila and Hannu Toivonen:
Levelwise search and borders of theories in knowledge discovery,
Data Mining and Knowledge Discovery 1(3): 241 - 258, November 1997.
-
Closed sets and generators
See the articles at the top of the page under the heading "Course material".
-
FP-tree
See the articles at the top of the page under the heading "Course material".
-
Sampling in the discovery of association rules
Hannu Toivonen:
Sampling large databases for association rules.
In 22nd International Conference on Very Large Databases (VLDB'96),
134 - 145, Mumbai, India, September 1996. Morgan Kaufmann.