Unsupervised Machine Learning Projects

582674
3
Algorithms and machine learning
Advanced studies
Practical implementation of methods taught in the course Unsupervised Machine Learning, in a number of short computer projects. The projects are done in parallel to the course. The project work can be done in addition to or as an alternative to taking the course exam.
Year Semester Date Period Language In charge
2015 spring 09.03-01.05. 4-4 English Hugo Eyherabide

Information for international students

 
The course will be held in ENGLISH.
 

NEWS:

 
  • 18.05.15: The deadline for the third project has been extended until May 29th.
  • 16.05.15: Your grades for the second project are available at https://www.cs.helsinki.fi/u/eyherabi/Grades/ .
  • 15.05.15: In task 2 of the first exercise of the third project, do not use Eq. 12.21 as the objective function for the convergence criterion. Use the log-likelihood as defined in Eq. 12.13 instead. This objective is guaranteed to increase at each step of EM, whereas the former is not. If you used the other one, make sure your convergence criterion is not met when the function decreases, particularly at the beginning of the optimization.
  • 07.05.15: Third project ready for download.
  • 04.05.15: IMPORTANT - I have corrected all reports that I was aware of. If you have submitted your report but you cannot find your grade, chances are you have submitted it to the wrong email address, or I have completely missed your email. Whatever the reason, please let me know as soon as possible.
  • 04.05.15: There is a typo in Exercise 3, task 4.d. Please add a bracket at the end of the second line, that is, at the end of the equation for updating gamma_i. The equation must look like the one in the lecture notes. I will update the pdf as soon as possible.
  • 01.05.15: Hyvää vappua! Happy workers' day!
  • 27.04.15: IMPORTANT NOTES FOR PROJECT 2:
  1. What you don't write in the report gives you no points.
  2. Do not use shortcuts in your algorithms, like computing gradients numerically with packages. In case of doubt, ask me.
  3. Implement convergence criteria, and stop based on a maximum number of iterations.
  4. Keep aspect ratio near unity for figures showing directions.
  5. Write down the formulas for errors, contrast functions, and whatever you calculate.
  6. No references. All in the report.
  7. If you find, or believe there may be, some inconsistency or error in the instructions, please point it out. What could be better than finding the instructor made a mistake?
  • 27.04.15: Your grades for the first project are available at https://www.cs.helsinki.fi/u/eyherabi/Grades/ . You will need to enter a username and a password. Your username is your surname, and your password is your student number. Please let me know if you cannot access the page. You have access to a histogram of the grades, and also to a file containing your grades, the evaluation point by point, and all the comments I made on all reports, with an indication of which ones apply specifically to your report. Please read them all, because they may be of interest to you anyway. I have largely overlooked many of the issues they raise during this grading, but most of them will not be overlooked in the next projects. By the way, very good work. Most reports have been outstanding.
  • 25.04.15: Deadline and title in the pdf with the instructions for the second project have been corrected to match the ones specified in the table on the webpage.
  • 18.04.15: Second project ready for download.
  • 09.04.15: First project ready for download.
  • 03.04.15: News will appear here. Please check this webpage periodically, or send me an email to hugo.eyherabide@helsinki.fi with the exact heading SUBSCRIBE UMLP NEWS (all in upper case) to receive updates in your email. You will receive a confirmation when added to the email list.
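
The notes above for the second and third projects ask for a convergence criterion combined with a maximum number of iterations, and for a criterion that is not met when the objective decreases. The following is only a minimal Python sketch of that idea, not the solution to any exercise. It assumes a one-dimensional Gaussian mixture with a fixed, shared variance, which is not necessarily the model of Eq. 12.13 in the lecture notes. The names log_likelihood, em_step, fit, tol and max_iter are illustrative choices.

```python
import math


def log_likelihood(data, means, weights, var=1.0):
    """Log-likelihood of 1-D data under a Gaussian mixture with shared variance."""
    total = 0.0
    for x in data:
        density = sum(
            w * math.exp(-((x - m) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
            for w, m in zip(weights, means)
        )
        total += math.log(density)
    return total


def em_step(data, means, weights, var=1.0):
    """One EM iteration that updates the means and the mixing weights."""
    k = len(means)
    resp_sums = [0.0] * k   # sum of responsibilities per component
    weighted_x = [0.0] * k  # responsibility-weighted sum of the data
    for x in data:
        # E-step: the Gaussian normalisation constant cancels (shared variance).
        dens = [w * math.exp(-((x - m) ** 2) / (2.0 * var))
                for w, m in zip(weights, means)]
        norm = sum(dens)
        for j in range(k):
            r = dens[j] / norm
            resp_sums[j] += r
            weighted_x[j] += r * x
    # M-step: closed-form updates for means and weights.
    new_means = [weighted_x[j] / resp_sums[j] for j in range(k)]
    new_weights = [resp_sums[j] / len(data) for j in range(k)]
    return new_means, new_weights


def fit(data, means, weights, tol=1e-8, max_iter=500):
    """Run EM until the log-likelihood stops improving or max_iter is reached."""
    history = [log_likelihood(data, means, weights)]
    for _ in range(max_iter):
        means, weights = em_step(data, means, weights)
        history.append(log_likelihood(data, means, weights))
        improvement = history[-1] - history[-2]
        # Converged only if the improvement is small AND non-negative,
        # so a decreasing objective never triggers the stop.
        if 0.0 <= improvement < tol:
            break
    return means, weights, history


if __name__ == "__main__":
    data = [-2.3, -1.9, -2.1, -1.7, -2.0, 1.8, 2.2, 2.0, 2.4, 1.6]
    means, weights, history = fit(data, [-0.5, 0.5], [0.5, 0.5])
    print("means:", means)
    print("iterations:", len(history) - 1)
```

Storing the whole history of the objective, as done here, also makes it easy to plot it in the report and to check that it never decreases.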

 

General

In this course, you will implement well-known unsupervised-machine-learning methods using Matlab, Octave, R or Python, and apply them to real data. These skills are extremely important to understand the benefits and limitations of existing algorithms, and how to customize them in real applications. The course also teaches scientific communication skills, fundamental for explaining the methods employed, the significance of your results and the rationale behind improvements and breakthroughs.

The practical implementation of unsupervised-machine-learning methods requires a sufficiently good understanding of the underlying theory. For that reason, it is essential to take this course together with the course Unsupervised Machine Learning, where the theory will be explained in detail.

The course consists of three projects, whose subjects, publication dates and submission deadlines are given below:

 

Assignment Publication date Deadline
Principal component analysis 09.04.15 24.04.15
Independent component analysis 17.04.15 08.05.15
Clustering and Non-linear projection 30.04.15 29.05.15

 

The projects can be completed by each student individually or in pairs (in which case, only one report per pair must be submitted). Each project consists of three exercises, ALL of which must be completed before submission. Each project will be graded using a scale from 0 to 100, a pass corresponding to 50 points or more. Failure to meet the deadlines will yield a reduction of 10 points per day. Passing the course requires passing each and every project.

Notes and basic information:

  • All the relevant information concerning the project assignments will be available on this web page. However, a short description of how to complete the computer assignments and write the reports will be given on Thursday 9th of April in Room C222.
  • There will be no organized exercise sessions in this course but personal guidance is provided during the projects by email (hugo.eyherabide@helsinki.fi).
  • Basic skills in Matlab or R are necessary to finish the course! This reference can be helpful for self-study: matlab R reference.
  • To improve your communication and writing skills, you can read at least a few chapters of the books mentioned below. The first project will be used to assess whether you actually need to.

 

Completing the course

For each computer project, students are required to:

  • Implement core methods from the course Unsupervised Machine Learning using Matlab, Octave, R or Python.
  • Write a report where you explain the methods you used and their implementations.
  • Present and discuss the solutions to the questions posed in the projects.

Reports should be sent by email (hugo.eyherabide{at}helsinki.fi) by the deadlines mentioned above.

 

Grades will be based on the quality of both software implementations and reports, as well as your ability to meet the deadline. Each project will be graded using a scale from 0 to 100, a pass corresponding to 50 points or more. Software implementations should be bug-free and readable. Reports should provide correct answers to each and every question posed in the projects. Failure to meet the deadlines will yield a reduction of 10 points per day. Passing the course requires passing each and every project.

 

Should you have further questions, please send me an email (hugo.eyherabide@helsinki.fi).

 

 

Literature and material

MY EMAIL: hugo.eyherabide{at}helsinki.fi

 

Third project: DOWNLOAD

 

Second project: DOWNLOAD

 

First project: DOWNLOAD

 

Submission guidelines: DOWNLOAD

 

UMLP Introduction presentation: DOWNLOAD

 

English dictionaries and grammar:

  • Multilanguage dictionary with translations to English LINK
  • Synonyms dictionary LINK
  • Thesaurus LINK
  • BBC grammar LINK
  • Longman English Dictionary LINK

Improving writing skills:

  • Gopen and Swan, The science of scientific writing, LINK
  • Schultz, Eloquent science: A practical guide to becoming a better writer, speaker, and atmospheric scientist. KUMPULA LIBRARY
  • Lebrun, Scientific writing: A reader and writer's guide.

Improving coding skills:

  • Martin, Clean code: A handbook of Agile Software Craftsmanship. KUMPULA LIBRARY