Aapo Hyvärinen: Publications

Publications by topic

Unsupervised deep learning

Review papers

A. Hyvärinen, I. Khemakhem, and H. Morioka. Nonlinear Independent Component Analysis for Principled Disentanglement in Unsupervised Deep Learning. Patterns, 4(10):100844, 2023.
[Review focused on algorithms for nonlinear ICA]

A. Hyvärinen, I. Khemakhem, and R. Monti. Identifiability of latent-variable and structural-equation models: from linear to nonlinear. Annals of the Institute of Statistical Mathematics, in press.
[Review focused on identifiability theory for ICA and causal discovery]

Nonlinear Independent Component Analysis

[Recently we have developed a new framework for a nonlinear version of ICA, which can be seen as a principled approach to "disentanglement".]

Hermanni Hälvä, Sylvain Le Corff, Luc Lehéricy, Jonathan So, Yongjie Zhu, Elisabeth Gassiat, Aapo Hyvärinen. Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA. NeurIPS 2021.
[Our most general framework for nonlinear ICA of time series or spatial data so far. Requires no extra information beyond the temporal (or spatial) structure, and makes no exponential-family assumptions. Includes an observational noise model of arbitrary distribution.]

H. Morioka, H. Hälvä, and A. Hyvärinen. Independent innovation analysis for nonlinear vector autoregressive process. AISTATS 2021.
[A generalization of nonlinear ICA to time series, in which it is the innovations that are decomposed into components.]

Ilyes Khemakhem, Diederik P. Kingma, Ricardo P. Monti, and Aapo Hyvärinen. ICE-BeeM: Identifiable Conditional Energy-Based Deep Models. NeurIPS 2020.
[A generalization of nonlinear ICA with energy-based models. Shows how even models with dependent nonlinear components can be identifiable.]

Luigi Gresele, Giancarlo Fissore, Adrián Javaloy, Bernhard Schölkopf, Aapo Hyvärinen. Relative gradient optimization of the Jacobian term in unsupervised deep learning. NeurIPS 2020.
[Solves the problem of optimizing the log determinant of the Jacobian as found in deep latent variable models, including nonlinear ICA.]

H. Hälvä and A. Hyvärinen. Hidden Markov Nonlinear ICA: Unsupervised Learning from Nonstationary Time Series. UAI 2020.
[A generalization of TCL (see below) in which the segmentation is estimated as part of the process, leading to a more fully unsupervised method for nonlinear ICA of nonstationary time series.]

Hiroaki Sasaki, Takashi Takenouchi, Ricardo Monti, Aapo Hyvärinen. Robust contrastive learning and nonlinear ICA in the presence of outliers. UAI 2020.
[A robust version (i.e. not sensitive to outliers) of the nonlinear ICA methods of TCL and PCL (see below).]

Ilyes Khemakhem, Diederik P. Kingma, Ricardo P. Monti, and Aapo Hyvärinen. Variational Autoencoders and Nonlinear ICA: A Unifying Framework. AISTATS 2020.
[Performs nonlinear ICA using VAEs or, equivalently, modifies VAEs so that they perform nonlinear ICA.]

A. Hyvärinen, H. Sasaki, and R.E. Turner. Nonlinear ICA Using Auxiliary Variables and Generalized Contrastive Learning. AISTATS 2019.
[A general framework for identifiable nonlinear ICA, unifying the two frameworks below.]

A. Hyvärinen and H. Morioka. Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA. NIPS 2016.
[Our first method for identifiable nonlinear ICA, based on the temporal structure of the independent components. Unlike previous approaches to disentanglement or nonlinear ICA, we can actually prove that the method recovers the independent components, i.e. that it is identifiable.]

A. Hyvärinen and H. Morioka. Nonlinear ICA of Temporally Dependent Stationary Sources. AISTATS 2017.
[Our second method for identifiable nonlinear ICA, with different assumptions about the temporal structure of the components.]

A. Hyvärinen and P. Pajunen. Nonlinear Independent Component Analysis: Existence and Uniqueness results. Neural Networks, 12(3):429-439, 1999.
[Older work showing that the solution of the nonlinear ICA problem is highly non-unique if the data has no temporal structure. Here we further propose an identifiable version by strongly restricting the nonlinearity.]

Density estimation / Energy-based modelling

[An alternative goal in unsupervised learning is to model the probability density of data.]

S. Saremi and A. Hyvärinen. Neural Empirical Bayes. J. Machine Learning Research, 20(181):1-23, 2019.
[A combination of density estimation by the DEEN method below with denoising by empirical Bayes. Leads to highly efficient denoising and deeper ideas such as a new kind of associative memory, and even computational creativity.]

S. Saremi, A. Mehrjou, B. Schölkopf and A. Hyvärinen. Deep Energy Estimator Networks. arXiv, May 2018.
[Shows how to use score matching with neural networks to achieve universal approximation of the energy function (log-density) of data.]

H. Sasaki and A. Hyvärinen. Neural-Kernelized Conditional Density Estimation. arXiv, June 2018.
[A framework for modelling conditional densities, combining kernel methods and neural networks.]

Further unsupervised deep learning

J. Hirayama, A. Hyvärinen and M. Kawanabe. SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling. ICML 2017.
[A hierarchical extension of ICA and ISA that emphasizes pooling and can be used, for example, to model V2. The model is tractable from an estimation viewpoint, which was earlier believed to be impossible.]

T. Matsuda and A. Hyvärinen. Estimation of Non-Normalized Mixture Models and Clustering Using Deep Representation. AISTATS 2019.
[Shows how to estimate mixtures of non-normalized densities, and applies this to develop a probabilistically principled model for clustering based on the hidden representation of a neural network (e.g. one trained on ImageNet).]