Project in Probabilistic Models

582637
2-3
Algorithms and Machine Learning
Advanced studies
The task in this course is to implement and empirically validate probabilistic modeling techniques on a real-world data analysis problem. The progress of each participant will be monitored weekly, and at the end the participants are also expected to summarize their results by submitting a project report and giving a short talk. Prerequisites: 582636 Probabilistic Models.
Year Semester Date Period Language Person in charge
2015 spring 11.03-29.04. 4-4 English Petri Myllymäki

Lectures

Time Room Lecturer Date
Wed 16-18 C220 Petri Myllymäki 11.03.2015-29.04.2015

Registration for this course starts on Tuesday 17 February at 9.00.

General

The task in the project is to build a probabilistic predictive model based on the given set of training data. More details to be added soon.

Schedule (modified, note that the 1st deadline has been extended!):

  • Wed 11.03. First meeting. Walk-through of the project.
  • Wed 18.03. Meeting with the supervising team. Introduction to the evaluation method used.
  • Wed 25.03. Meeting with the supervising team. Q&A.
  • Wed 01.04. Meeting with the supervising team. Q&A.
  • Mon 06.04. Deadline of Round 1: submission of first set of solutions.
  • Wed 08.04. Meeting: results of round 1, feedback.
  • Wed 15.04. No meeting.
  • Mon 20.04. Deadline of Round 2: submission of second set of solutions.
  • Wed 22.04. Meeting: results of round 2, feedback.
  • Mon 27.04. Deadline of Round 3: submission of third set of solutions.
  • Wed 29.04. Final meeting: results of round 3, short presentations by each student/team.
  • Tue 05.05. Deadline for written report (midnight).
  • Wed 06.05. Feedback meeting.

The teaching assistant is Johannes Verwijnen.

Completing the course

To pass the course, you need to:

  • build a program that reads a set of training data, and calculates probabilities of new, unseen data vectors
  • participate in the weekly meetings
  • give a short (5 min) presentation at the final meeting
  • write a report of your accomplishments during the course
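
As a rough illustration of the first requirement, the sketch below fits a very simple baseline and returns the log-probability of an unseen data vector. It is only one possible starting point, not the expected solution or the course's evaluation method. The assumptions are ours: each column is modeled independently as a categorical distribution over the integers 0..99, the value 0 ("no data") is treated as just another symbol, and add-one smoothing is used. The names fit, log_prob, N_VALUES and ALPHA are hypothetical.

```python
import math

N_VALUES = 100  # assumption: values are integers in 0..99 (0 means "no data")
ALPHA = 1.0     # assumption: add-one (Laplace) smoothing


def fit(rows):
    """Count how often each value occurs in each column of the training data."""
    n_cols = len(rows[0])
    counts = [[0] * N_VALUES for _ in range(n_cols)]
    for row in rows:
        for j, v in enumerate(row):
            counts[j][v] += 1
    return counts, len(rows)


def log_prob(model, row):
    """Log-probability of one data vector under the independent-column model."""
    counts, n = model
    total = 0.0
    for j, v in enumerate(row):
        # Smoothed relative frequency; sums to 1 over the 100 possible values.
        p = (counts[j][v] + ALPHA) / (n + ALPHA * N_VALUES)
        total += math.log(p)
    return total


# Tiny made-up training set standing in for the real data.
train = [[12, 0, 7], [12, 3, 7], [15, 3, 8]]
model = fit(train)
lp = log_prob(model, [12, 3, 9])
```

Working with log-probabilities avoids numerical underflow when the per-column probabilities of all 303 sensors are multiplied together. A stronger model would capture dependencies between sensors, which this baseline ignores.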

Grading criteria:

  • demonstration of capability to apply modeling methods, innovativeness, versatility
  • quality of the produced results
  • work effort/productivity during the course

Literature and material

Data:

  • The data sets and the row numbers for the test sets are available at http://www.cs.helsinki.fi/u/jverwijn/teaching/PPM15/
  • The data is a (comma-separated) matrix of 303 columns and 67785 rows where each value is an integer measurement by one of the 303 sensors.
  • The values are real measurements, discretized to integers. 0 has the special meaning of "no data". All other values are measurements, and you should take noise into account. You can assume that there are no negative values and that all values are < 100.
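
The format described above could be read along the following lines. This is a minimal sketch under our own assumptions: the file has no header row, and missing readings are marked with None (one option among several). The helper names load_matrix and missing_fraction are hypothetical.

```python
import csv
import io

NO_DATA = 0  # per the description above, 0 means "no data"


def load_matrix(lines):
    """Parse comma-separated integer rows; missing readings (0) become None."""
    rows = []
    for record in csv.reader(lines):
        if not record:
            continue  # skip blank lines
        rows.append([None if int(v) == NO_DATA else int(v) for v in record])
    return rows


def missing_fraction(rows):
    """Fraction of missing readings per column (sensor)."""
    n_cols = len(rows[0])
    counts = [0] * n_cols
    for row in rows:
        for j, v in enumerate(row):
            if v is None:
                counts[j] += 1
    return [c / len(rows) for c in counts]


# Tiny made-up sample standing in for the real 67785 x 303 matrix.
sample = io.StringIO("12,0,7\n0,0,9\n15,3,8\n")
rows = load_matrix(sample)
fractions = missing_fraction(rows)
```

To read an actual file, pass an open file object instead of the sample, for example open("train.csv", newline=""), where the file name is a placeholder. Checking the fraction of missing readings per sensor is a quick way to see how much the "no data" value matters before choosing a model.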

Evaluation environment:

FAQ:

  • What type of models are allowed for solving the prediction task?
    • Any type of model is OK, as long as you produce a probability distribution as the final outcome.
  • Can I use available open-source software packages, or do I have to program everything myself?
    • You are allowed to use existing software packages, but writing the program yourself is considered a plus.
  • Can we work in teams?
    • You can, in which case you can work on the same program and give a single joint presentation at the final meeting.
    • Each participant still needs to deliver an individual final report (not a joint report; everybody writes a report in his/her own words).
    • The size of the team affects the estimation of the work effort, and for this reason teams of more than two people are NOT advisable.