Aapo Hyvärinen: Publications

Publications by topic

Unsupervised deep learning

Nonlinear Independent Component Analysis

[Very recently we have developed a new framework for a nonlinear version of ICA, which is a principled approach to unsupervised deep learning.]

H. Hälvä and A. Hyvärinen. Hidden Markov Nonlinear ICA: Unsupervised Learning from Nonstationary Time Series. UAI 2020.
[A generalization of TCL (see below) where the segmentation is estimated as part of the process, thus leading to a more unsupervised method for nonlinear ICA of nonstationary time series.]

Hiroaki Sasaki, Takashi Takenouchi, Ricardo Monti, Aapo Hyvärinen. Robust contrastive learning and nonlinear ICA in the presence of outliers. UAI 2020.
[A robust version (i.e. not sensitive to outliers) of the nonlinear ICA methods of TCL and PCL (see below).]

H. Morioka and A. Hyvärinen. Independent innovation analysis for nonlinear vector autoregressive process. ArXiv, June 2020.
[A generalization of nonlinear ICA to time series, where it is the innovations that are decomposed into components.]

Ilyes Khemakhem, Diederik P. Kingma, Ricardo P. Monti, and Aapo Hyvärinen. ICE-BeeM: Identifiable Conditional Energy-Based Deep Models. ArXiv, Feb 2020.
[A generalization of nonlinear ICA with energy-based models. Shows how even models with dependent nonlinear components can be identifiable.]

Ilyes Khemakhem, Diederik P. Kingma, Ricardo P. Monti, and Aapo Hyvärinen. Variational Autoencoders and Nonlinear ICA: A Unifying Framework. AISTATS 2020.
[Performs nonlinear ICA with VAEs; put another way, modifies VAEs so that they perform nonlinear ICA.]

A. Hyvärinen, H. Sasaki, and R.E. Turner. Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning. AISTATS 2019.
[A general framework for identifiable nonlinear ICA, unifying the two frameworks below.]

A. Hyvärinen and H. Morioka. Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA. NIPS 2016.
[Our first method for identifiable nonlinear ICA, called time-contrastive learning (TCL), based on the temporal structure of the independent components. Unlike previous approaches to disentanglement or nonlinear ICA, we can actually prove that the method recovers the independent components, i.e. it is identifiable.]

A. Hyvärinen and H. Morioka. Nonlinear ICA of Temporally Dependent Stationary Sources. AISTATS 2017.
[Our second method for identifiable nonlinear ICA, called permutation-contrastive learning (PCL), with different assumptions on the temporal structure of the components.]

A. Hyvärinen and P. Pajunen. Nonlinear Independent Component Analysis: Existence and Uniqueness results. Neural Networks 12(3): 429--439, 1999.
[Older work showing that the solution of the nonlinear ICA problem is highly non-unique if the data has no temporal structure. Here we further propose an identifiable version by strongly restricting the nonlinearity.]

Density estimation / Energy-based modelling

[An alternative goal in unsupervised learning is to model the probability density of data.]

S. Saremi and A. Hyvärinen. Neural Empirical Bayes. J. Machine Learning Research, 20(181):1-23, 2019.
[A combination of density estimation by the DEEN method below with denoising by empirical Bayes. Leads to highly efficient denoising, and to deeper ideas such as a new kind of associative memory and even computational creativity.]

S. Saremi, A. Mehrjou, B. Schölkopf and A. Hyvärinen. Deep Energy Estimator Networks. ArXiv, May 2018.
[Shows how to use score matching with neural networks to achieve universal approximation of the energy function (log-density) of data.]

H. Sasaki and A. Hyvärinen. Neural-Kernelized Conditional Density Estimation. ArXiv, June 2018.
[A framework for modelling conditional densities, combining kernel methods and neural networks.]

Further unsupervised deep learning

Luigi Gresele, Giancarlo Fissore, Adrián Javaloy, Bernhard Schölkopf, Aapo Hyvärinen. Relative gradient optimization of the Jacobian term in unsupervised deep learning. ArXiv, June 2020.
[Solves the problem of optimizing the log determinant of the Jacobian as found in deep latent variable models, including nonlinear ICA.]

J. Hirayama, A. Hyvärinen and M. Kawanabe. SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling. ICML 2017.
[A hierarchical extension of ICA and ISA that emphasizes pooling and can be used, for example, to model V2. The model is tractable from an estimation viewpoint, which was earlier believed to be impossible.]

T. Matsuda and A. Hyvärinen. Estimation of Non-Normalized Mixture Models and Clustering Using Deep Representation. AISTATS 2019.
[Shows how to estimate mixtures of non-normalized densities, and applies this to develop a probabilistically principled model for clustering based on the hidden representation of a neural network (e.g. one trained on ImageNet).]