Project in Practical Machine Learning

582739
2-6
Algorithms and Machine Learning
Advanced studies
A project in implementing an online machine learning system. Each student (or group) will create an ML system deployed on a web server that periodically imports data over the internet and publishes its results. The system must be implemented in a web-server-friendly programming language and framework (i.e. no R/MATLAB/Octave). The number of credit points awarded varies per group, depending on group size and amount of work. Grading is based on a project report and a possible presentation. Prerequisites: Introduction to Machine Learning and Scientific Writing (or similar knowledge). Students should be very fluent in the programming language/framework of their choice.
Year  Semester  Dates         Period  Language  Instructor
2015  Spring    14.01-29.05.  3-3     English   Johannes Verwijnen

Lectures

Time       Room  Lecturer            Date
Wed 16-18  C222  Johannes Verwijnen  14.01.2015
Wed 16-18  C222  Johannes Verwijnen  21.01.2015

General

The purpose of the course is to introduce students to the challenges of machine learning in a realistic setting. Students should be able to identify and take into account the "dirtiness" of real online data; select, justify and implement a machine learning algorithm or technique in a programming environment that runs on a web server; and monitor and report the accuracy of their implementation, including reflection on their choices.
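
As an illustration only, the cycle described above (import, clean, learn, monitor) could be sketched roughly as follows. Everything specific here is an assumption made for the example and not a course requirement: the record fields `x1`, `x2` and `label`, the choice of logistic regression trained by stochastic gradient descent, and the helper names `clean_record` and `run_batch`. Accuracy is tracked by predicting each record before training on it (test-then-train), which is one common way to monitor an online learner.

```python
import math


def clean_record(raw):
    """Parse one raw record into (features, label), or return None if unusable.

    `raw` is a hypothetical dict as it might arrive from a feed: values may be
    strings, missing, or non-numeric.
    """
    try:
        features = [float(raw["x1"]), float(raw["x2"])]
        label = int(raw["label"])
    except (KeyError, TypeError, ValueError):
        return None
    if label not in (0, 1) or not all(math.isfinite(v) for v in features):
        return None
    return features, label


class OnlineLogisticRegression:
    """Logistic regression trained one example at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, features):
        return 1 if self.predict_proba(features) >= 0.5 else 0

    def update(self, features, label):
        # Gradient of the log loss with respect to z is (p - y).
        error = self.predict_proba(features) - label
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


def run_batch(model, raw_records, stats):
    """Test-then-train on one imported batch, updating running statistics."""
    for raw in raw_records:
        cleaned = clean_record(raw)
        if cleaned is None:
            stats["skipped"] += 1  # dirty record: count it, do not learn from it
            continue
        features, label = cleaned
        stats["correct"] += int(model.predict(features) == label)
        stats["seen"] += 1
        model.update(features, label)
    return stats
```

In a deployed system, `run_batch` would be called from a scheduled job each time new data is imported. The published result could then include the running accuracy `stats["correct"] / stats["seen"]` and the share of skipped records as a rough measure of how dirty the source is.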

Lecture date          Guest lecture                                               Course lecture
Wed, Jan 14th, 16:15  Janne Sinkkonen, PhD, Senior Data Scientist at Reaktor      Administrative issues
Wed, Jan 21st, 16:15  Matti Aksela, PhD, VP, Analytics and Technology at Comptel  Data sources, dirtiness and context, existing tools & libraries, expected outcomes

Completing the course

Lecture attendance is not mandatory, but each group should arrange for at least one member to attend each lecture. Slides will be available on this page.

The project will be implemented in groups of 1-4 students. Each group will meet with the instructor at the beginning of the project to validate the planned data source and implementation and to go through the expected outcomes in detail. A second meeting will be scheduled roughly halfway through the project to ensure that the group is on schedule and to refresh expectations. During the project, guidance and simple clarifications are available via email.

The number of study points awarded depends on the amount of work done on the project. Higher numbers of study points require implementing a machine learning algorithm yourself in the language of your choice, whereas lower numbers can be earned by using available libraries. Individual work hours need to be recorded during the project and submitted every Sunday (alternatively, you can share an online spreadsheet with the instructor). All project work should be available in a public GitHub repository.

The course is graded based on the written report and presentation.

Preliminary example schedule

Week starting  Tasks
12.1.2015      Lectures, deciding on project data source and ML algorithm
19.1.2015      1st meeting, starting work, finding hosting environment
26.1.2015      Working on implementation
9.2.2015       Starting to run the system, start writing report
16.2.2015      No more changes to implementation, writing report
23.2.2015      Submit report, presentation

Literature and material

Course lecture slides

Dr. Sinkkonen's slides

You can find peer support (and the instructor) on the IRC channel #tkt-ppml.

Data Sources:

ML libraries (in no particular order):

Places to host your system: