582638 Unsupervised Machine Learning (ohtk 25.8.2011)

For each principal theme, the entries below give the prerequisites and the learning objectives at three levels: Approaches the learning objectives, Reaches the learning objectives, and Deepens the learning objectives.
Theme: Unsupervised learning
Prerequisites: Introduction to ML or introductory statistics
Approaches: Understands the difference between supervised and unsupervised learning. Understands the principle of probabilistic learning.

Theme: Optimization
Prerequisites: Vector analysis
Approaches: Knows the definition of the gradient method and the idea of local vs. global maxima.
Reaches: Can derive the gradient method in basic cases. Understands the gradient method and its variants (projected, stochastic). Can define Newton's method. Can derive practical algorithms based on these.
Deepens: Understands the Lagrangian, and how to derive the projected gradient from it. Can derive and reproduce some variant of the conjugate gradient algorithm.
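The gradient iteration these objectives refer to can be sketched in a few lines. The objective function, step size, and iteration count below are illustrative choices, not part of the course material:

```python
# Minimal sketch of the gradient method (here: descent on a smooth,
# convex function). The iteration is x <- x - step * grad f(x).

def grad_descent(grad, x0, step=0.1, iters=200):
    """Plain gradient iteration with a fixed step size."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# f(x, y) = (x - 1)^2 + 2*(y + 2)^2 has its minimum at (1, -2); since f
# is convex, this local minimum is also the global one.
def grad_f(v):
    x, y = v
    return [2 * (x - 1), 4 * (y + 2)]

x_min = grad_descent(grad_f, [0.0, 0.0])
```

On a non-convex function the same iteration only finds a local optimum, which is the local vs. global distinction named in the objectives.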
Theme: Principal Component Analysis (PCA) and Factor Analysis
Prerequisites: Linear algebra I & II, Introduction to probability theory
Approaches: Can give the definition of PCA and explain its main uses. Can describe the computation using the eigenvectors of the covariance matrix. Can define the factor analysis model. Understands the connection between PCA and factor analysis. Can formulate the factor rotation problem.
Reaches: Can derive PCA as the solution to one or more optimization problems. Can see whether the solution is unique or not. Can show the connection between PCA and factor analysis. Knows at least one basic solution to the factor rotation problem.
Deepens: Understands the connection to the singular value decomposition. Knows more than one classic factor rotation method, and can compare them.
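The computation via the eigenvectors of the covariance matrix, and its connection to the singular value decomposition, can be sketched as follows. The toy data set is an illustrative assumption:

```python
import numpy as np

# Sketch: PCA as eigendecomposition of the sample covariance matrix,
# plus the SVD connection (squared singular values of the centered data
# give the same eigenvalues, up to the 1/(n-1) normalization).
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 2))
X = z @ np.array([[3.0, 0.0], [1.0, 0.5]])   # correlated 2-D toy data
Xc = X - X.mean(axis=0)                       # PCA requires centering

C = Xc.T @ Xc / (len(Xc) - 1)                 # sample covariance
evals, evecs = np.linalg.eigh(C)              # ascending order
order = np.argsort(evals)[::-1]               # sort descending
evals, evecs = evals[order], evecs[:, order]

# SVD connection: singular values of Xc recover the same eigenvalues.
s = np.linalg.svd(Xc, compute_uv=False)
evals_from_svd = s**2 / (len(Xc) - 1)

# The first principal component: projection on the top eigenvector.
# Its sample variance equals the largest eigenvalue.
pc1 = Xc @ evecs[:, 0]
```

Note the solution is unique only up to the sign of each eigenvector, and not unique at all when eigenvalues coincide.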
Theme: Independent Component Analysis (ICA)
Prerequisites: All of the above
Approaches: Can reproduce the definition of ICA. Understands the uniqueness result and the relevance of non-gaussianity. Knows two basic applications of ICA.
Reaches: Understands how ICA estimation is related to maximization of non-gaussianity, based on the central limit theorem; can formulate at least two measures of non-gaussianity. Can reproduce the basic formulae for the likelihood and mutual information; can show how they are related to each other and to non-gaussianity. Can show the effect of whitening. Can show that the problem is impossible to solve for gaussian data. Can derive methods for computationally maximizing measures of non-gaussianity, and can compare different measures. Can derive a practical algorithm for maximizing the likelihood, including a simple family of density models.
Deepens: Can show in more than one way that the problem is impossible for gaussian data. Can reproduce the optimality proof of kurtosis. Understands the non-parametric nature of the likelihood, and different methods for tackling it.
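The central-limit-theorem argument behind ICA estimation can be illustrated numerically: a mixture of independent non-gaussian sources is closer to gaussian than the sources themselves, so maximizing the non-gaussianity of a projection recovers a source. Here kurtosis serves as the measure of non-gaussianity; the uniform sources are an illustrative assumption:

```python
import numpy as np

def excess_kurtosis(v):
    """Excess kurtosis: zero for gaussian data, negative for sub-gaussian."""
    v = (v - v.mean()) / v.std()
    return np.mean(v**4) - 3.0

rng = np.random.default_rng(1)
s1 = rng.uniform(-1, 1, 100_000)     # uniform: excess kurtosis ~ -1.2
s2 = rng.uniform(-1, 1, 100_000)
mix = (s1 + s2) / np.sqrt(2)         # equal-weight mixture of the sources

k_source = excess_kurtosis(s1)       # strongly non-gaussian
k_mix = excess_kurtosis(mix)         # closer to zero, i.e. more gaussian
```

Since mixing pushes kurtosis toward the gaussian value of zero, a projection of the observed data is maximally non-gaussian when it equals one of the original sources.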
Theme: Clustering
Prerequisites: Introduction to ML or introductory statistics; Introduction to probability theory
Approaches: Can reproduce the k-means algorithm. Understands the gaussian mixture model and the basic idea of the EM algorithm.
Reaches: Can explain the differences between k-means and the gaussian mixture model with EM. Can derive the likelihood of the gaussian mixture model. Can derive (with some help) the EM algorithm for that model.
Deepens: Understands the theory of the EM algorithm.
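The k-means algorithm named in the objectives alternates an assignment step and an update step; a minimal sketch, with illustrative toy data and a fixed initialization, is:

```python
import random

def kmeans(points, centers, iters=20):
    """Plain k-means: alternate nearest-center assignment and mean update."""
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[d.index(min(d))].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers

random.seed(0)
pts = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(100)]
pts += [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(100)]
centers = kmeans(pts, [(0.0, 1.0), (4.0, 4.0)])
```

Unlike EM for a gaussian mixture, which keeps soft posterior responsibilities for each component, k-means makes hard assignments and implicitly assumes equal spherical clusters; this is the difference the Reaches objective asks for.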
Theme: Nonlinear projections
Prerequisites: The PCA & FA theme above
Approaches: Understands the concept of nonlinear projections. Can explain at least two of the following methods: linear MDS, kernel PCA, Laplacian Eigenmaps, IsoMap. Understands the connection of these methods to PCA.
Reaches: Can show the equivalence of linear MDS and PCA. Can reproduce the following algorithms: kernel PCA, Laplacian Eigenmaps, IsoMap, SOM.
Deepens: Knows further algorithms such as curvilinear component analysis and locally linear embedding.
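The equivalence of linear MDS and PCA can be checked numerically: double-centering the matrix of squared Euclidean distances recovers the Gram matrix of the centered data, whose nonzero eigenvalues coincide with those of the scatter matrix used by PCA. The toy data are an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 3))
Xc = X - X.mean(axis=0)

# Squared-distance matrix D and double centering B = -1/2 J D J,
# which equals Xc @ Xc.T for centered data.
D = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
n = len(Xc)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D @ J

# Linear MDS embeds via the top eigenvectors of B; PCA uses the
# eigenvectors of Xc.T @ Xc. The nonzero spectra agree.
eig_mds = np.sort(np.linalg.eigvalsh(B))[-3:]     # 3 nonzero eigenvalues
eig_pca = np.sort(np.linalg.eigvalsh(Xc.T @ Xc))  # PCA scatter matrix
```

Kernel PCA, Laplacian Eigenmaps, and IsoMap all follow the same template, an eigendecomposition of a (differently constructed) similarity matrix, which is the connection to PCA mentioned in the objectives.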
28.08.2011 - 18:04 Jyrki Kivinen
23.02.2011 - 12:59 Aapo Hyvärinen