Data Mining

Algorithms and machine learning
Advanced studies
This course focuses on concepts and methods for frequent pattern discovery, also known as association analysis. This edition of the course is a structured and guided self-study course with weekly tasks and supervision, with mandatory attendance. Prerequisites: BSc degree and the course Introduction to Machine Learning or equivalent. Course book: Tan P., Steinbach M. & Kumar V.: Introduction to Data Mining, Chapters 6 and 7. Addison Wesley, 2006.


03.05.2011 16.00 A111
Year Semester Date Period Language In charge
2011 spring 14.03-28.04. 4-4 English Hannu Toivonen


Time Room Lecturer Date
Mon 9-12 B222 Hannu Toivonen 14.03.2011-28.04.2011
Thu 9-12 B222 Hannu Toivonen 14.03.2011-28.04.2011

Registration for this course starts on Tuesday, 22 February, at 9.00.

Information for international students

The course will be taught in English. The exam can be taken in English or Finnish. A Swedish exam is also available by advance request.


Results from the course are available in the department intranet at

Data mining or knowledge discovery ("tiedon louhinta" in Finnish) is the process of discovering interesting regularities in large masses of data. This course focuses on a fundamental and generic class of regularities, frequent patterns, whose discovery is also known as association analysis.

The course uses a problem-based approach where students learn by actively acquiring knowledge and skills, individually and in groups, to solve data mining challenges identified during the course. Participation in the course requires commitment and initiative, as well as regular and active attendance in the course meetings on Mon and Thu at 9-12. An alternative to course participation is to take the course by exam, without participating in the course meetings.

There are no separate lectures and exercises. The course meetings (Mon and Thu at 9-12) will mostly consist of discussions and teamwork. There will be a weekly cycle in which the teacher presents a problem and the students discuss and analyse it, identify and set their learning objectives, study individually, and then present and discuss the learned content together. All of these activities except individual studying take place during the course meetings, so participation in them is crucial.

In each cycle, the students will set their own learning objectives and then work to reach them. There will be few regular lectures by the teacher. Instead, the students will request short lectures on topics that they want to learn more about or had trouble understanding.

Prerequisites: BSc degree and the course Introduction to Machine Learning or equivalent.


Completing the course

The course can be taken either

  • A. by active participation in the course meetings and independent studies between them (see above), and reporting of this work as instructed during the course, OR
  • B. by a written exam.

Case A requires active attendance in at least 10 sessions. In addition, at most one learning journal and at most one group report may be missing or unacceptable (see Reporting). Grading in case A is based on the journals and on oral and written reporting. If these requirements are not met, case B is the only option. In case B, participation in sessions and delivered journals and reports do NOT count at all, only the exam does.


Literature and material

The course is based on Chapters

  • 6.  Association Analysis: Basic Concepts and Algorithms and
  • 7.  Association Analysis: Advanced Concepts

of the book Introduction to Data Mining by Tan, Steinbach, and Kumar (Pearson Education, 2006). See the book website for supplementary material for the book and material taken from it. The final (or separate) exam will be based solely on the two chapters above.

Additional material that may be helpful when studying: