
The approximating posterior distribution

The approximating posterior distribution needed in ensemble learning is a distribution over all the possible hidden state sequences $ \boldsymbol{M}$ and the parameter values $ \boldsymbol{\theta}$. The approximation is chosen to be of a factorial form

$\displaystyle q(\boldsymbol{M}, \boldsymbol{\theta}) = q(\boldsymbol{M}) q(\boldsymbol{\theta}).$ (5.13)

The approximation $ q(\boldsymbol {M})$ is a discrete distribution and it factorises as

$\displaystyle q(\boldsymbol{M}) = q(M_1) \prod_{t=1}^{T-1} q(M_{t+1} \vert M_{t}).$ (5.14)

The parameters of this distribution are the discrete probabilities $ q(M_1 = i)$ and $ q(M_{t+1} = i \vert M_{t} = j)$.
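Because $ q(\boldsymbol{M})$ factorises as a Markov chain, a hidden state sequence can be drawn from it one state at a time, using only the initial probabilities and the transition conditionals. The following sketch illustrates this; the dimensions and the parameter values are hypothetical, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: M hidden states, sequence length T.
M, T = 3, 5

# Parameters of q(M): initial probabilities q(M_1 = i) and the
# transition conditionals q(M_{t+1} = i | M_t = j), one row per j.
q_init = np.full(M, 1.0 / M)
q_trans = rng.dirichlet(np.ones(M), size=M)

def sample_state_sequence():
    """Draw one state sequence from the chain-factorised q(M)."""
    states = [rng.choice(M, p=q_init)]
    for _ in range(T - 1):
        states.append(rng.choice(M, p=q_trans[states[-1]]))
    return states

seq = sample_state_sequence()
```

Only the factors $ q(M_1)$ and $ q(M_{t+1} \vert M_{t})$ are ever needed, so the full joint distribution over the exponentially many sequences is never represented explicitly.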

The distribution $ q(\boldsymbol {\theta })$ is also formed as a product of independent distributions for the different parameters. The parameters with Dirichlet priors have posterior approximations that are either a single Dirichlet distribution, as for $ \boldsymbol{\pi}$,

$\displaystyle q( \boldsymbol{\pi} ) = \ensuremath{\text{Dirichlet}}( \boldsymbol{\pi};\; \hat{\boldsymbol{\pi}} ),$ (5.15)

or a product of Dirichlet distributions, as for $ \mathbf{A}$,

$\displaystyle q( \mathbf{A}) = \prod_{i=1}^M \ensuremath{\text{Dirichlet}}( \mathbf{a}_i;\; \hat{\mathbf{a}}_i).$ (5.16)

These are in fact the optimal choices among all possible distributions, given the factorisation $ q(\boldsymbol{M}, \boldsymbol{\pi}, \mathbf{A}) =
q(\boldsymbol{M}) q(\boldsymbol{\pi}) q(\mathbf{A})$.
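Once the Dirichlet parameters $ \hat{\boldsymbol{\pi}}$ and $ \hat{\mathbf{a}}_i$ are available, posterior expectations of the probabilities follow directly from the standard Dirichlet mean formula. A minimal sketch with made-up parameter values:

```python
import numpy as np

# Hypothetical Dirichlet posterior parameters \hat{pi} for q(pi).
pi_hat = np.array([2.0, 5.0, 3.0])

# Posterior mean of pi under Dirichlet(pi; pi_hat): normalise the
# parameter vector to sum to one.
pi_mean = pi_hat / pi_hat.sum()

# Each row a_i of A gets an independent Dirichlet(a_i; \hat{a}_i)
# posterior, so the rows of the mean transition matrix are
# normalised independently.
a_hat = np.array([[1.0, 4.0, 2.0],
                  [3.0, 2.0, 1.0],
                  [2.0, 2.0, 2.0]])
A_mean = a_hat / a_hat.sum(axis=1, keepdims=True)
```

The posterior means are valid probability vectors by construction, which is why the conjugate Dirichlet form is convenient here.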

The parameters with Gaussian priors have Gaussian posterior approximations of the form

$\displaystyle q( \theta_i ) = N(\theta_i;\; \overline{\theta_i}, \widetilde{\theta_i}).$ (5.17)

Here $ \overline{\theta_i}$ denotes the posterior mean and $ \widetilde{\theta_i}$ the posterior variance of $ \theta_i$. All these parameters are assumed to be independent.


Antti Honkela 2001-05-30