582408 Lectures on Statistical Modeling Theory (2 cu)
Lecturer
Jorma Rissanen
Complex System Computation Group, Helsinki Institute for Information Technology
University of London, Royal Holloway (UK)
Technical University of Tampere
email: rissanen@mdl-research.org
Time and place
August 27-31, 2001
Lectures at 10-12, exercises at 12-13
Room A516
Abstract
These lectures are an introduction to a theory of statistical modeling based on information theory. The basic idea is to decompose the observed data sequence into an information-bearing part and the rest, which is just noise: it contains no useful information that can be described in terms of the models in a suggested class. This is accomplished by finding the shortest code length, called the complexity of the data, with which the data can be encoded when advantage is taken of the models in the suggested class. The complexity, in turn, breaks up into two parts: the shortest code length for the optimal model in a set of models that can be 'distinguished' from the data, and the rest, which defines 'noise' as the incompressible part of the data. The code length for the optimal model is defined as the amount of information in the data that can be learned with the suggested models. It may also be viewed as the complexity of the optimal model. In this view, then, the objective of statistical modeling is to achieve such a decomposition of the data, which, unlike in customary statistics, need not be assumed to be a sample from any distribution.
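As a rough illustration of the decomposition described in the abstract, the following sketch splits the code length of a binary sequence into a model part and a noise part. It assumes a crude two-part code for a Bernoulli model class, with the parameter cost approximated by (1/2) log2(n) bits; this is a textbook simplification, not the normalized or NML models covered in the lectures, and the function names are hypothetical.

```python
import math


def two_part_code_length(bits):
    """Return (model_bits, noise_bits) for a Bernoulli model fitted to `bits`.

    Assumption: the fitted parameter is encoded to precision 1/sqrt(n),
    costing about (1/2) * log2(n) bits. This is only an approximation.
    """
    n = len(bits)
    ones = sum(bits)
    # Model part: code length of the (quantized) parameter estimate.
    model_bits = 0.5 * math.log2(n)
    # Noise part: ideal code length of the data given the fitted parameter,
    # i.e. n times the empirical entropy in bits.
    noise_bits = 0.0
    for count in (ones, n - ones):
        if count > 0:
            noise_bits -= count * math.log2(count / n)
    return model_bits, noise_bits


def fixed_code_length(bits):
    """Code length with no fitted parameter: one bit per symbol."""
    return float(len(bits))


# A skewed sequence: the fitted model pays for itself.
skewed = [1] * 90 + [0] * 10
skewed_model, skewed_noise = two_part_code_length(skewed)
skewed_total = skewed_model + skewed_noise

# A balanced sequence: nothing to learn, so the parameter cost is wasted.
balanced = [0, 1] * 50
balanced_model, balanced_noise = two_part_code_length(balanced)
balanced_total = balanced_model + balanced_noise

print(skewed_total < fixed_code_length(skewed))
print(balanced_total > fixed_code_length(balanced))
```

For the skewed sequence the two-part code is shorter than one bit per symbol, so the model part carries learnable information. For the balanced sequence the whole string is incompressible noise with respect to this model class.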
The lecture material (postscript)
Instructions for the homework project
The lectures cover the following topics to the extent time permits:
I. Basics of Coding Theory
- prefix codes and the Kraft inequality
- Shannon's Noiseless Coding Theorem
- coding of random processes
II. Universal Coding
- general
- Lempel-Ziv algorithm
- Algorithm Context
III. Kolmogorov Complexity
- universal algorithmic model
- sufficient statistics decomposition
IV. Complexity of Loss Functions
- models
- logarithmic and nonlogarithmic loss functions
- universal models and the MDL principle
- information
V. Four Universal Models
- normalized 2-part models
- NML-models
- mixture models
- predictive models
- universal sufficient statistics decomposition
VI. Applications
- linear LS regression
- MDL denoising
Biography
Tommi.Mononen@cs.Helsinki.FI

