Data Mining Project

Basic information

Course code: 582635

Credit units: 2

Subprogramme: Algorithms and machine learning

Level: Advanced studies

Description:

Application of data mining to a data analysis problem. The project covers the whole data mining process, and includes either implementing a data mining algorithm or using a wider range of available implementations. The project is completed by a research report describing and justifying the steps taken and decisions made, and discussing the results obtained. Prerequisites: The course Data Mining. The project can only be taken during the specified period. There are no final exams.

Year	Semester	Date	Period	Language	In charge
2014	spring	05.05-16.05.	4-4	English	Fabio Cunial

Lectures

Time	Room	Lecturer	Date
Tue 14-16	B222	Fabio Cunial	06.05.2014-06.05.2014
Mon 10-12	B222	Fabio Cunial	12.05.2014-12.05.2014
Fri 10-14	B222	Fabio Cunial	16.05.2014-16.05.2014

Note:

Registration for this course starts on Tuesday, 18 February, at 9.00.

General

The objectives of this project are:

to gain exposure to advanced concepts or practices in itemset and association rule discovery;
to understand where the field is currently heading;
to do something cool that you could put on your CV;
to have fun :-)

Completing the course

The project can be completed using one of the following mutually exclusive strategies. Regardless of the strategy, the student must submit a detailed report of her activity.

1. (Algorithms) Study one of the papers listed in section "Literature and material", and either:
1.1. write a detailed summary of the paper, or
1.2. implement the main idea described in the paper, or
1.3. improve the theoretical results of the paper.
2. (Implementations) Perform an in-depth review of the implementations that are currently available for itemset and association rule discovery. In particular, choose one of the options below:
2.1. Review the whole state of the art. What is the architecture of such implementations? Do they support parallelism? How do they handle large datasets? Which implementation choices do they make? Which of them performs best on benchmark datasets? Collect and plot performance metrics.
2.2. Study the fine details of one specific implementation. Answer the same questions as in point (2.1), but in greater depth. Read and possibly change the source code.
3. (Datasets) Using the algorithms studied in the Data Mining course, and possibly interacting with a domain expert, design a controlled set of experiments to find semantically meaningful patterns in the course datasets. Perform a detailed analysis of the discovered patterns.
4. (Applications) Design and implement an innovative application of the algorithms studied in the Data Mining course (for example a smartphone app, a Facebook app, a Gmail app, or a Google Calendar app; for possible inspiration, see e.g. this blog post, this Facebook app, this smartphone app, and this example of app integration: can you do better?). The application must be agreed on beforehand with the instructor, and it must have a well-defined purpose and a clear utility (but it can use existing algorithms and implementations). The student is expected to have prior working knowledge of the technologies required to implement the application.

Strategies (2), (3) and (4) allow students to form groups of at least two people and to submit a joint report.

Literature and material

Any other paper from the following conferences/journals can be used as well, but the student needs to prove to the instructor that the chosen paper conforms to the learning objectives of the project.

Address: Department of Computer Science, P.O. 68 (Gustaf Hällströmin katu 2b), FI-00014 UNIVERSITY OF HELSINKI, FINLAND
Opening Hours: During spring and autumn semesters Mon - Fri 7.45 - 19.45 (7.45 am - 7.45 pm)
Phone: +358 9 1911 (University switch)
General e-mail: info [at] cs.helsinki.fi
Fax: +358 9 876 4314
