Data Mining

Algorithms and machine learning
Advanced studies
This course focuses on concepts and methods for frequent pattern discovery, also known as association analysis. This edition of the course is a structured and guided self-study course with weekly tasks and supervision, with mandatory attendance. Prerequisites: BSc degree and the course Introduction to Machine Learning or equivalent. Course book: Tan P., Steinbach M. & Kumar V.: Introduction to Data Mining, Chapters 6 and 7. Addison Wesley, 2006.
Year Semester Date Period Language In charge
2012 spring 12.03-26.04. 4-4 English Hannu Toivonen


Time Room Lecturer Date
Mon 12-15 D122 Hannu Toivonen 12.03.2012-26.04.2012
Thu 12-15 D122 Hannu Toivonen 12.03.2012-26.04.2012

Information for international students

The course will be taught in English. Much of the reporting and the oral examinations will take place in groups, in English. (Contact the teacher in advance if you want to take the course, i.e., the exam, in Finnish or Swedish.)


Attendance at the first lecture is absolutely obligatory. Students cannot join the course after the first lecture.

Data mining or knowledge discovery ("tiedon louhinta" in Finnish) is the process of discovering interesting regularities in large masses of data. This course will focus on a fundamental and generic class of regularities, that of frequent patterns, also known as association analysis.

The course uses a problem-based approach where students learn by actively acquiring knowledge and skills, individually and in groups, to solve data mining challenges identified during the course. Participation in the course requires commitment and initiative, as well as regular and active attendance in the course meetings on Mon and Thu at 12-15. An alternative to course participation is to take the course by exam, without participating in the course meetings. (See below for more information.)

There are no separate lectures and exercises. The course meetings (Mon and Thu at 12-15) will mostly consist of discussions and team work. There will be a number of cycles in which the teacher presents a problem and the students discuss and analyse it, identify and set their learning objectives, study individually, and then present and discuss the learned content together. All of the above activities except individual studying take place during the course meetings, and therefore participation in them is crucial.

In each cycle, the students will set their own learning objectives and then work to reach them. There will be few regular lectures by the teacher. Instead, the students will be able to order short lectures on topics that they want to learn more about or had trouble understanding.

Prerequisites: BSc degree and the course Introduction to Machine Learning or equivalent.

Completing the course

The course can be taken either

  • A. by active participation in the course meetings and independent studies between them (see above), and reporting of this work as instructed during the course, OR
  • B. by taking a separate exam.

Mixing these two options is not possible. Either you take the course by active participation, or you take the exam. The current plan is that in both options, there will be an oral examination.

Case A requires active attendance and reporting throughout the whole course (see Reporting). Grading is based on the reporting and the exam. If you cannot meet these requirements, case B is the only option. In case B, activity and reports do NOT count at all; only the exam does.

Added Mar 15th: An extra separate exam will be organized on Tue Apr 17th, at 16.00 in room A111. For participation in the Data Mining project later this spring, the students must have taken the course OR this exam. It is not possible to participate in the project without this course or exam.

Literature and material

The course is based on Chapters

  • 6.  Association Analysis: Basic Concepts and Algorithms
  • 7.  Association Analysis: Advanced Concepts

of the book Introduction to Data Mining by Tan, Steinbach, and Kumar (Pearson Education, 2006). See the book website for material for and from the book. The final (or separate) exam will be based solely on the two chapters above.

Additional material that may be helpful when studying:

