Data Mining
Exam
Year | Semester | Date | Period | Language | Person in charge |
---|---|---|---|---|---|
2011 | spring | 14.03-28.04. | 4-4 | English | Hannu Toivonen |
Lectures
Time | Room | Lecturer | Date |
---|---|---|---|
Mon 9-12 | B222 | Hannu Toivonen | 14.03.2011-28.04.2011 |
Thu 9-12 | B222 | Hannu Toivonen | 14.03.2011-28.04.2011 |
Registration for this course starts on Tuesday 22nd of February at 9.00.
Information for international students
The course will be taught in English. The exam can be taken in English or Finnish. A Swedish exam is also available by advance request.
General
Results from the course are available on the department intranet at http://www.cs.helsinki.fi/i/htoivone/DM2011/results29042011.txt
Data mining or knowledge discovery ("tiedon louhinta" in Finnish) is the process of discovering interesting regularities in large masses of data. This course will focus on a fundamental and generic class of regularities, that of frequent patterns; the task of finding them is also known as association analysis.
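As a rough illustration of what a "frequent pattern" means, the sketch below counts how often small sets of items occur together and keeps those that reach a minimum support count. It is not taken from the course material: the function name `frequent_itemsets`, the parameters `min_support` and `max_size`, and the basket data are all assumptions made up for this example, and the brute-force counting stands in for the more efficient algorithms covered in the textbook chapters.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Brute-force sketch: enumerates every itemset of up to max_size items
    in each transaction. Fine for toy data, too slow for large data sets.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # drop duplicates, fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical market-basket data, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

result = frequent_itemsets(baskets, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Here an itemset counts as frequent if it appears in at least two of the four baskets; raising `min_support` shrinks the result, which is the basic trade-off that association analysis explores.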
The course uses a problem-based approach where students learn by actively acquiring knowledge and skills, individually and in groups, to solve data mining challenges identified during the course. Participation in the course requires commitment and initiative, as well as regular and active attendance in the course meetings on Mon and Thu at 9-12. An alternative to course participation is to take the course by exam, without participating in course meetings.
There are no separate lectures and exercises. The course meetings (Mon and Thu at 9-12) will mostly consist of discussions and teamwork. Each week follows a cycle: the teacher presents a problem, and the students discuss and analyse it, identify and set their learning objectives, study individually, and then present and discuss the learned content together. All of these activities except individual studying take place during the course meetings, and therefore participation in them is crucial.
In each cycle, the students will set their own learning objectives and then work to reach them. There will be few regular lectures by the teacher. Instead, the students will order short lectures on topics that they want to learn more about or that they had trouble understanding.
Prerequisites: BSc degree and the course Introduction to Machine Learning or equivalent.
Completing the course
The course can be taken either
- A. by active participation in the course meetings and independent studies between them (see above), and reporting of this work as instructed during the course, OR
- B. by a written exam.
Case A requires active attendance in at least 10 sessions, and a student may fail to deliver at most one acceptable learning journal and one acceptable group report (see Reporting). Grading is based on the journals and on oral and written reporting. If these requirements are not met, case B is the only option. In case B, participation in sessions and delivered journals and reports do NOT count at all; only the exam does.
Literature and material
The course is based on Chapters
- 6. Association Analysis: Basic Concepts and Algorithms and
- 7. Association Analysis: Advanced Concepts
of the book Introduction to Data Mining by Tan, Steinbach, and Kumar (Pearson Education, 2006). See the book website http://www-users.cs.umn.edu/~kumar/dmbook/index.php for supplementary material and for material from the book. The final (or separate) exam will be based solely on the two chapters above.
Additional material that may be helpful when studying:
- Text from an earlier data mining course: http://www.cs.helsinki.fi/u/htoivone/teaching/timuS02/b.ps
- Textbook manuscript by Mohammed Zaki: available on the intranet at http://www.cs.helsinki.fi/i/htoivone/DM2011/zakibook.pdf (see esp. Part II)
- Solutions to exercises in the textbook of Tan et al: available in Room C127.