# Supervised Machine Learning

## Exam

Year | Semester | Date | Period | Language | In charge |
---|---|---|---|---|---|
2012 | spring | 17.01-22.02. | 3-3 | English | Jyrki Kivinen |

## Lectures

Time | Room | Lecturer | Date |
---|---|---|---|
Tue 10-12 | C222 | Jyrki Kivinen | 17.01.2012-22.02.2012 |
Wed 10-12 | C222 | Jyrki Kivinen | 17.01.2012-22.02.2012 |

## Exercise groups

Time | Room | Instructor | Date | Observe |
---|---|---|---|---|
Fri 12-14 | B119 | Joonas Paalasmaa | 23.01.2012-24.02.2012 | |

## General

The course is over and the course examination has been graded. See below for the results and exam questions.

Thanks to all students who provided feedback, either on the web form or otherwise. A brief summary of the feedback is available.

## Prerequisites

The course Introduction to Machine Learning is not strictly required. However, it includes a lot of very useful motivation, context and other background which we will cover here only very briefly.

The students are expected to have basic knowledge of linear algebra, probability theory and calculus. Some multivariate calculus will be needed, but we will briefly cover the necessary tools for those not familiar with them.

Although there are not many specific prerequisites from mathematics, the approach taken on the course is mainly mathematical, and familiarity with mathematical manipulations will help a lot.

Part of the homework will require some programming. Basic programming skills are essential, and familiarity with tools such as Matlab, Octave or R will be very helpful.

## Contents

The form and content of the course will mostly be the same as in Spring 2011.

The course covers a selection of topics mainly related to (binary) classification. This is a large research area, and the choice of topics is somewhat based on the personal preferences of the lecturer (whose research area is computational learning theory, in particular online learning).

Emphasis will be on provable performance guarantees we can provide for learning algorithms.

Table of contents (preliminary):

- Introduction
  - basic concepts
  - mathematical frameworks for supervised learning: online and statistical
- Online learning
  - combining expert advice
  - linear classification and the perceptron algorithm
  - relative loss bounds (also known as regret bounds)
- Statistical learning
  - basic statistical model, connection to online learning
  - complexity measures: VC-dimension and Rademacher complexity
  - Support Vector Machine
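
As a taste of the online-learning part, the perceptron algorithm listed above fits in a few lines. The sketch below is illustrative only (the function name, the stopping rule and the fixed unit learning rate are my choices, not taken from the lecture notes):

```python
def perceptron(X, y, max_epochs=100):
    """Online perceptron for binary linear classification.
    X: list of feature vectors (lists of floats), y: labels in {-1, +1}.
    Returns a weight vector w such that sign(w . x) predicts the label."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # a mistake is a prediction with the wrong sign (ties count as mistakes)
            if y_i * sum(wj * xj for wj, xj in zip(w, x_i)) <= 0:
                # standard perceptron update: move w towards y_i * x_i
                w = [wj + y_i * xj for wj, xj in zip(w, x_i)]
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: the data are separated
            break
    return w
```

On linearly separable data the loop terminates after a finite number of updates; the Perceptron Convergence Theorem covered in the lectures bounds that number.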

## Completing the course

The grade consists of homework (20%) and exam (80%).

To get credit for homework, you need to turn in your solutions on paper before the weekly exercise session.

### Course exam

The course exam was on Tuesday 28 February at 16:00–19:00 in auditorium A111.

The exam will in principle cover everything discussed in the lectures, lecture notes and homework. However, the following pages of the lecture notes are not included, since we skipped them in class:

- pp. 63–65 (lower bound proof for Weighted Majority)
- pp. 128–129 (comments on Hilbert spaces)

The section on statistical learning theory (pp. 163–198) was covered in class only in a cursory manner. From that part, you should understand

- what VC-dimension and Rademacher complexity mean (the precise definitions and the intuition behind them), and how the two are related (Theorem 3.19)
- what the problem setting is in which VC-dimension and Rademacher complexity are used (the statistical learning framework)
- what the main results are (sample size bounds in terms of VC-dimension and Rademacher complexity).

You are not expected to know any of the proofs in this part, *except* for Example 3.11 (VC-dimension of linear classifiers).
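
For reference, the standard statement of this result (the formulation in Example 3.11 may differ in details, such as whether a bias term is included):

```latex
% Affine linear classifiers in R^d have VC-dimension d + 1:
\mathrm{VCdim}\bigl(\{\, x \mapsto \operatorname{sign}(\langle w, x \rangle + b)
    : w \in \mathbb{R}^d,\ b \in \mathbb{R} \,\}\bigr) = d + 1
% Without the bias term (homogeneous classifiers, b = 0) the VC-dimension is d.
```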

### Course feedback

Please provide course feedback using the web form, or by just contacting the lecturer or teaching assistant.

## Literature and material

The material consists of lecture notes, exercise problems and their solutions, and possibly some additional original articles. All the lecture notes are now available: pages 1–204.

The course does not follow any single textbook. Some recommended textbooks and articles on the course topic are listed under the tab Additional references.

### Progress of the lectures

Page numbers refer to the lecture notes.

- Tue 17 Jan: administrivia, general background, basic concepts (pages 1–21)
- Wed 18 Jan: linear classification (basic concepts), Halving Algorithm, Weighted Majority (pages 22–34 and page 37)
- Tue 24 Jan: Weighted Majority finished, the setting of statistical learning (pages 34–45)
- Wed 25 Jan: empirical risk minimisation, sample size bounds for finite hypothesis classes (pages 46–54)
- Tue 31 Jan: sample size for finite classes finished; the expert framework; Aggregating Algorithm (pages 54–62 and 66–76; proof on pages 63–65 was skipped)
- Wed 1 Feb: loss bounds for the Aggregating Algorithm; tuning the learning rate; basics of linear classifiers (pages 77–88)
- Tue 7 Feb: Perceptron Algorithm; Perceptron Convergence Theorem (pages 89–101)
- Wed 8 Feb: feature mappings and kernels (pages 102–113)
- Tue 14 Feb: marginalised perceptron, with non-separable data; introduction to SVMs (pages 114–132)
- Wed 15 Feb: basics of convex optimisation, with application to hard-margin SVM (pages 133–152)
- Tue 21 Feb: hard-margin SVM finished; soft-margin SVM (pages 153–162)
- Wed 22 Feb: VC-dimension; Rademacher complexity; closing remarks (pages 163–204)
  *Comment:* this part was covered only superficially and many details were skipped; see the section on the exam regarding what you should realistically remember about it.
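
Several of the algorithms above can be summarised in a few lines of code. As one illustration, here is a minimal sketch of the deterministic Weighted Majority algorithm from the expert-advice lectures (the function name, the penalty factor beta = 1/2 and the tie-breaking rule are my choices; the notes' version may differ):

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Deterministic Weighted Majority over n experts.
    expert_preds[t][i]: prediction in {0, 1} of expert i at trial t;
    outcomes[t]: correct label at trial t.
    Returns (number of master mistakes, final weights)."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y_t in zip(expert_preds, outcomes):
        # predict with the weighted majority vote (ties go to 1)
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y_t:
            mistakes += 1
        # multiply the weight of every expert that erred by beta
        w = [wi * (beta if p != y_t else 1.0) for wi, p in zip(w, preds)]
    return mistakes, w
```

The point of the algorithm is the relative loss bound proved in the lectures: the master's mistake count is within a constant factor (plus a log n term) of the best single expert's.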

### Homework

Turn your homework in either on paper into the box in the door of Jyrki Kivinen's office (B229a) or by e-mail to Joonas Paalasmaa.

- problems (solutions due on Friday 27 January at 9:00 am); solutions
- problems (solutions due on Friday 3 February at 9:00 am); solutions
- problems (solutions due on Friday 10 February at 9:00 am); solutions
- problems (solutions due on Friday 17 February at 9:00 am); solutions
- problems (solutions due on Friday 24 February at 9:00 am); data for Problem 4; LIBSVM; solutions