Estimation theory
[These papers propose two principles for estimation of statistical models, especially non-normalized ones: noise-contrastive estimation and score matching.]
Noise-contrastive estimation
M. Gutmann and A. Hyvärinen. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. Proc. AISTATS2010.
pdf
[Our newest method for estimating statistical models when the normalization constant (partition function) is not known.]
M. Pihlaja, M. Gutmann and A. Hyvärinen. A Family of Computationally Efficient and Simple Estimators for Unnormalized Statistical Models. Proc. UAI2010.
pdf
[Generalizes the method above and shows its connection to importance sampling.]
M. Gutmann and A. Hyvärinen.
Learning features by contrasting natural images with noise.
Proc. Int. Conf. on Artificial Neural Networks (ICANN2009), Limassol, Cyprus, 2009.
pdf
[The first paper on noise-contrastive estimation, which introduces the method from a very intuitive viewpoint.]
Score matching
A. Hyvärinen. Estimation of non-normalized statistical models using score matching.
Journal of Machine Learning Research, 6:695-709, 2005.
pdf errata
[A computationally simple yet consistent method for estimating statistical models when the normalization constant (partition function) is not known.]
A. Hyvärinen. Optimal approximation of signal priors.
Neural Computation, 20:3087-3110, 2008.
pdf gzipped ps
[Shows that the optimal method for estimating a prior model (e.g. of natural images) for Bayesian inference (e.g. denoising) is not maximum likelihood, but score matching and some of its generalizations.]
A. Hyvärinen. Some extensions of score matching.
Computational Statistics & Data Analysis, 51:2499-2512, 2007.
pdf
[Extends score matching to binary data and non-negative data, and shows that the estimator can be obtained in closed form for exponential families.]
A. Hyvärinen. Connections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables.
IEEE Transactions on Neural Networks, 18(5):1529-1531, 2007.
pdf gzipped ps
[Shows how score matching can be viewed as a deterministic first-order approximation of contrastive divergence.]
A. Hyvärinen. Estimation theory and information geometry based on denoising.
Proc. Workshop on Information Theory in Science and Engineering, Tampere, Finland, 2008.
pdf
[A short review of the theory of score matching.]
Some applications of score matching can be found here and here.