Unsupervised Machine Learning Projects

582674
3
Algorithms and machine learning
Advanced studies
Practical implementation of methods taught in the course Unsupervised Machine Learning, in a number of short computer projects. The projects are done in parallel to the course. The project work can be done in addition to or as an alternative to taking the course exam.
Year   Semester   Date            Period   Language   In charge
2014   spring     10.03-25.04.    4-4      English    Hugo Eyherabide

Information for international students

 
The course will be held in ENGLISH.
 

NEWS:

 
  • 02.05.14: The evaluation of Project 2 is available in the students folder (here). You will also find a file called modelreport21.pdf, a report from last year that you may take as an example of an enjoyable and well-written report. Note, however, that some of its answers may be inaccurate or incorrect.
  • 23.04.14: VERY IMPORTANT: Project 3 has been updated. It now includes more tips for solving the exercises and a few corrections to exercise 2. More tips on exercise 3 will come soon.
  • 21.04.14: Project 3 Exercise 1 has been updated to make the instructions clearer and to give you useful tips. More comments and tips for the other exercises will appear soon.
  • 17.04.14: Project 2 has been updated to make the instructions clearer.
  • 11.04.14: There is a new version of project 3 with some English mistakes corrected.
  • 11.04.14: The deadline for submission of the second project has been delayed to 18.04.14.
  • 11.04.14: Third project is now available for download.
  • 07.04.14: Please note that you can still submit your exercises for project 1, though a points penalty will be applied. The news item of 26.03.14 was contradictory in this respect and has now been modified to reflect this possibility.
  • 04.04.14: Code should be submitted in separate files that can be run, not only embedded in the report. If you haven't done that, you have until midnight today.
  • 04.04.14: VERY IMPORTANT: As explained in the presentation of the course, ALL projects must be completed in order to complete the course. That means that each project must reach at least 50% of the total points assigned to that project. You may score less than 50% of the points in an individual exercise, but the project as a whole must reach at least 50%.
  • 04.04.14: For those having problems with exercise 2 of project 1: the gradient of the quartimax rotation will be worked out in today's exercise session. You may then correct your exercise and resubmit it. To do so, announce your intention no later than midnight today, and submit your report with the corrections included by midnight on Saturday 05.04.14. Note that, because of the late submission, 5% or 10% of the points will be subtracted, depending on the submission day.
  • 27.03.14: Second project is now available for download.
  • 26.03.14: Please notice that, in order to complete the projects, ALL exercises must be completed and submitted.
  • 26.03.14: All downloads will be located exclusively at the bottom of the course page.
  • 26.03.14: Code can also be written in Python. The code visual.py is now available for download.
  • 23.03.14: Markus Kaukonen (markus.kaukonen[at]iki.fi) is looking for another student willing to do the projects as a pair.
  • 20.03.14: Code can also be written in Octave. The code visual.m provided with the first project works fine in both Matlab and Octave.
  • 18.03.14: Today's presentation is available for download.
  • 18.03.14: The deadline of the first project and the publication date of the second project have been delayed by two days.
  • 14.03.14: Send an email to hugo.eyherabide@helsinki.fi if you want to receive the news by email.
  • 14.03.14: The code provided with the project has been cleaned up and updated. There is a small change in how the code should be run compared to the previous version.
  • 13.03.14: First project is now available for download.

 

General

In this course, students will learn how to implement well-known unsupervised-machine-learning methods using Matlab or R and apply them to real data. These skills are essential for understanding the benefits and limitations of existing algorithms and for customizing them in real applications. The course also teaches scientific reporting skills, which are fundamental for communicating the results obtained with existing methods and the rationale behind improvements and breakthroughs.

The practical implementation of unsupervised-machine-learning methods requires a sufficiently good understanding of the underlying theory. For that reason, it is essential to take this course together with the course Unsupervised Machine Learning, where the theory will be explained in detail.

The project assignments will be published on this page soon after the essential theory has been covered in the lectures. There will be three assignments altogether, with the following topics, publication dates and deadlines:

 

Assignment                              Publication date   Deadline
Principal component analysis            14.03.14           03.04.14
Independent component analysis          27.03.14           18.04.14
Clustering and Non-linear projection    11.04.14           16.05.14

 

Assignments can be done in pairs (in which case one report, to whose writing both students contributed, is sufficient) or individually. Each assignment will be divided into two or three parts, and each part can be submitted individually. Missing a deadline reduces the points of the late parts by 5% per day.
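
As an illustration of the late-submission rule, here is a minimal sketch in Python. It rests on assumptions that this page does not state: the penalty is taken to grow linearly at 5% per day, to apply to the points earned in each late part, and to be capped so that a part never scores below zero. The function names and the example point values are hypothetical.

```python
def late_penalty_fraction(days_late, rate_per_day=0.05):
    """Fraction of a part's points lost after `days_late` days.

    Assumption: the penalty is linear in the number of days and capped at 100%.
    """
    if days_late <= 0:
        return 0.0
    return min(1.0, rate_per_day * days_late)


def points_after_penalty(points_by_part, days_late_by_part):
    """Sum the points of all parts, reducing only the parts submitted late."""
    total = 0.0
    for part, points in points_by_part.items():
        # Parts not listed in days_late_by_part are treated as on time.
        days_late = days_late_by_part.get(part, 0)
        total += points * (1.0 - late_penalty_fraction(days_late))
    return total


# Hypothetical project with three parts worth 10 points each;
# only part 2 is submitted two days late.
example = points_after_penalty(
    {"part1": 10, "part2": 10, "part3": 10},
    {"part2": 2},
)
print(round(example, 2))
```

Under these assumptions, only the late part is reduced; parts submitted on time keep their full points.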

 

However, deadlines are tentative and will take into consideration the workload of all students. For example, the deadline of the last assignment has already been extended by one week because of exams in the preceding week. Therefore, if any of the deadlines are hard to meet, please explain the reasons as soon as possible by writing an email to Hugo Gabriel Eyherabide (hugo.eyherabide@helsinki.fi). No changes will be made within 72 hours of a deadline.

 

Notes and basic information:

  • All the relevant information concerning the project assignments will be available on this web page. However, a short description of how to complete the computer assignments and write the reports will be given at the beginning of the first exercise session of Unsupervised Machine Learning (Tuesday 18th of March in C222).
  • There will be no organized exercise sessions in this course, but personal guidance is provided during the projects. To that end, please contact Hugo Gabriel Eyherabide (hugo.eyherabide@helsinki.fi) by email or visit office A332 on Tuesdays and Thursdays between 13:00 and 14:00.
  • Basic skills in Matlab or R are necessary to finish the course! This reference can be helpful for self-study: matlab R reference.
  • To improve your writing skills, you can take the course Scientific Writing for MSc in Computer Science or read the books mentioned below. This course and material, however, may go far beyond what is required in this course.

 

Completing the course

For each computer project, students are required to:

  • Implement core methods from the course Unsupervised Machine Learning using Matlab or R.
  • Write a report where you present and discuss the solutions to the questions posed in the computer projects.

Reports should be sent to Hugo Gabriel Eyherabide (hugo.eyherabide{at}helsinki.fi) by the deadlines mentioned above.

 

Grades will be based on the quality of both software implementations and reports, as well as your ability to meet the deadline. Software implementations should be bug-free and readable. Reports should provide correct results and clear discussions. Failure to meet the deadlines will yield a reduction of points as mentioned above.

Please send an email to Hugo Gabriel Eyherabide (hugo.eyherabide@helsinki.fi) if you have further questions.

 

Literature and material

THIRD PROJECT: DEADLINE: 16.05.14

DOWNLOAD

You will need to log in with the username uml and the password uml.

 

SECOND PROJECT: DEADLINE: 18.04.14

DOWNLOAD

You will need to log in with the username uml and the password uml.

 

 

FIRST PROJECT: DEADLINE: 03.04.14

DOWNLOAD (includes additional codes in Matlab/Octave, R and Python).

You will need to log in with the username uml and the password uml.

 

UMLP Introduction presentation:

DOWNLOAD

 

English dictionaries and grammar:

  • Multilanguage dictionary with translations to English LINK
  • Synonyms dictionary LINK
  • Thesaurus LINK
  • BBC grammar LINK
  • Longman English Dictionary LINK

Improving writing skills:

  • Gopen and Swan, The science of scientific writing, LINK
  • Schultz, Eloquent science: a practical guide to becoming a better writer, speaker, and atmospheric scientist. KUMPULA LIBRARY
  • Lebrun, Scientific writing: A reader and writer's guide.

Improving coding skills:

  • Martin, Clean code: A handbook of Agile Software Craftsmanship. KUMPULA LIBRARY