Data Mining Project

Basic information

Completed projects

Main themes and learning objectives

Course code: 582635

Credits: 2

Specialisation: Algorithms and machine learning

Level: Advanced studies

Description:

Application of data mining to a data analysis problem. The project covers the whole data mining process, and includes either implementing a data mining algorithm or using a wider range of available implementations. The project is completed by a research report describing and justifying the steps taken and decisions made, and discussing the results obtained. Prerequisites: The course Data Mining. The project can only be taken during the specified period. There are no final exams.

Year	Semester	Date	Period	Language	Person in charge
2014	spring	05.05-16.05.	4-4	English	Fabio Cunial

Lectures

Time	Room	Lecturer	Date
Tue 14-16	B222	Fabio Cunial	06.05.2014-06.05.2014
Mon 10-12	B222	Fabio Cunial	12.05.2014-12.05.2014
Fri 10-14	B222	Fabio Cunial	16.05.2014-16.05.2014

Note:

Registration for this course starts on Tuesday, 18 February, at 9:00.

General

The objectives of this project are:

to gain exposure to advanced concepts or practices in itemset and association rule discovery;
to understand where the field is currently going;
to do something cool that you could put on your CV;
to have fun :-)

Completing the course

The project can be completed using one of the following mutually exclusive strategies. Regardless of the strategy, the student must submit a detailed report of her activity.

1. (Algorithms) Study one of the papers listed in section "Literature and material", and either:
   1. write a detailed summary of the paper, or
   2. implement the main idea described in the paper, or
   3. improve the theoretical results of the paper.
2. (Implementations) Perform an in-depth review of the implementations that are currently available for itemset and association rule discovery. In particular, choose one of the options below:
   1. Review the whole state of the art. What is the architecture of such implementations? Do they support parallelism? How do they handle large datasets? Which implementation choices do they make? Which of them performs best on benchmark datasets? Collect and plot performance metrics.
   2. Study the fine details of one specific implementation. Answer the same questions as in point (2.1), but in greater depth. Read and possibly change the source code.
3. (Datasets) Using the algorithms studied in the Data Mining course, and possibly interacting with a domain expert, design a controlled set of experiments to find semantically meaningful patterns in the course datasets. Perform a detailed analysis of the discovered patterns.
4. (Applications) Design and implement an innovative application of the algorithms studied in the Data Mining course (for example a smartphone app, a Facebook app, a Gmail app, or a Google Calendar app; for possible inspiration, see e.g. this blog post, this Facebook app, this smartphone app, and this example of app integration: can you do better?). The application must be agreed on with the instructor beforehand, and it must have a well-defined purpose and a clear utility (but it can use existing algorithms and implementations). The student is expected to have prior working knowledge of the technologies required to implement the application.

Strategies (2), (3) and (4) allow students to form groups of at least two people and to submit a joint report.
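
For students weighing strategy (2), where performance metrics are collected, it may help to have a tiny baseline to compare real implementations against. The following is only a minimal sketch and is not part of the course material: it assumes a level-wise (Apriori-style) search, and the names `frequent_itemsets`, `confidence` and `time_miner`, as well as the toy transactions, are hypothetical and made up for illustration.

```python
import time
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {frozenset: support count}."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Join frequent (k-1)-itemsets into size-k candidates and keep only
        # those whose (k-1)-subsets are all frequent (downward closure).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count the support of each candidate with one pass over the data.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from support counts."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


def time_miner(miner, transactions, min_support, repeats=3):
    """Best-of-N wall-clock time in seconds for one call to `miner`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        miner(transactions, min_support)
        best = min(best, time.perf_counter() - start)
    return best


# Toy data, invented for this sketch (not one of the course datasets).
toy = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
itemsets = frequent_itemsets(toy, min_support=3)
print(len(itemsets), "frequent itemsets")
print("conf(diapers -> beer) =", confidence(itemsets, {"diapers"}, {"beer"}))
print("best time (s):", time_miner(frequent_itemsets, toy, 3))
```

Because `time_miner` accepts any callable with the same two arguments, a wrapper around an existing implementation could be timed on the same data in the same way. A real comparison would of course need benchmark datasets and memory measurements as well.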

Literature and material

Any other paper from the following conferences/journals can be used as well, but the student needs to prove to the instructor that the chosen paper conforms to the learning objectives of the project.

Department of Computer Science [pre 2018 site]

University of Helsinki

Faculty of Science
