
Finding optimal $ q(\boldsymbol {M})$

Assuming $ q(\boldsymbol {\theta })$ is fixed, the cost function can be written, up to an additive constant, in the form

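\begin{displaymath}\begin{split}C(\boldsymbol{M}) &= \sum_{\boldsymbol{M}} q(\boldsymbol{M}) \bigg[ \log q(\boldsymbol{M}) - \int q(\boldsymbol{\pi}) \log \pi_{M_1} d\boldsymbol{\pi} \\ &\quad - \sum_{t=1}^{T-1} \int q(\mathbf{A}) \log a_{M_t M_{t+1}} d\mathbf{A} + \sum_{t=1}^T C(\mathbf{x}(t) \vert M_t) \bigg] \end{split}\end{displaymath} (6.11)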

where $ C(\mathbf{x}(t) \vert M_t)$ is the value of $ \operatorname{E}[ -\log p(\mathbf{x}(t) \vert M_t, \boldsymbol{\theta}) ]$, with the expectation taken over $ q(\boldsymbol{\theta})$, i.e. the ``cost'' of the current data sample given the HMM state.

By defining

$\displaystyle \pi_i^* = \exp\left(\int q(\boldsymbol{\pi}) \log \pi_{i} d\boldsymbol{\pi}\right)$    and $\displaystyle a_{ij}^* = \exp\left(\int q(\mathbf{A}) \log a_{ij} d\mathbf{A}\right),$ (6.12)

Equation (6.11) can be written in the form

$\displaystyle C_{q(\boldsymbol{M})} = \sum_{\boldsymbol{M}} q(\boldsymbol{M}) \log \frac{q(\boldsymbol{M})}{\pi_{M_1}^* \left[ \prod_{t=1}^{T-1} a_{M_t M_{t+1}}^* \right] \left[ \prod_{t=1}^T \exp(- C(\mathbf{x}(t) \vert M_t)) \right]}.$ (6.13)
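
The quantities in Equation (6.12) have a closed form when the approximating distributions are Dirichlet. The following is a minimal sketch under that assumption: if $ q(\boldsymbol{\pi})$ is Dirichlet with parameters `alpha`, then $ \operatorname{E}[\log \pi_i] = \psi(\alpha_i) - \psi(\sum_k \alpha_k)$, where $ \psi$ is the digamma function. The function names and the example parameter values are hypothetical, and the digamma routine is a simple stand-in for a library implementation.

```python
import math

def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift the argument upwards using psi(x) = psi(x + 1) - 1/x.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic expansion; the truncation error is of order 1e-8 for x >= 6.
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def exp_expected_log(alpha):
    """exp(E[log p_i]) for p ~ Dirichlet(alpha), i.e. the starred values of (6.12)."""
    psi_total = digamma(sum(alpha))
    return [math.exp(digamma(a) - psi_total) for a in alpha]

# Assumed example: Dirichlet parameters for pi and for each row of A.
alpha_pi = [2.0, 1.0, 0.5]
alpha_A = [[3.0, 1.0, 1.0],
           [1.0, 4.0, 1.0],
           [0.5, 0.5, 2.0]]

pi_star = exp_expected_log(alpha_pi)
a_star = [exp_expected_log(row) for row in alpha_A]
```

Note that `pi_star` and the rows of `a_star` generally sum to less than one, which is why the normalising constant $ Z_M$ is needed later on.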

The expression $ \int q(x) \log \frac{q(x)}{p^*(x)} \, dx$ is minimised with respect to $ q(x)$ by setting $ q(x) = \frac{1}{Z} p^*(x)$, where $ Z = \int p^*(x) \, dx$ is the normalising constant [39]. With this choice the expression attains its minimum value $ -\log Z$. This can be proved by reasoning similar to that used for Equation (3.13).
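
The minimisation result can be checked numerically in the discrete case, where the integral becomes a sum. The sketch below uses an arbitrary, assumed unnormalised distribution `p_star` and compares the cost at the claimed minimiser with the cost at a uniform distribution. It is only an illustration, not a proof.

```python
import math

def cost(q, p_star):
    """sum_x q(x) log(q(x) / p_star(x)) for a discrete q and unnormalised p_star."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p_star) if qi > 0.0)

p_star = [0.2, 0.5, 0.1]            # assumed example values, sums to Z = 0.8
Z = sum(p_star)
q_opt = [p / Z for p in p_star]     # the claimed minimiser p_star / Z
q_other = [1.0 / 3, 1.0 / 3, 1.0 / 3]

cost_opt = cost(q_opt, p_star)      # equals -log Z
cost_other = cost(q_other, p_star)  # larger than cost_opt
```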

The cost in Equation (6.13) can thus be minimised by setting

$\displaystyle q(\boldsymbol{M}) = \frac{1}{Z_M} \pi_{M_1}^* \left[ \prod_{t=1}^{T-1} a_{M_t M_{t+1}}^* \right] \left[ \prod_{t=1}^T \exp(- C(\mathbf{x}(t) \vert M_t)) \right]$ (6.14)

where $ Z_M$ is the appropriate normalising constant.

The derived optimal approximation is very similar in form to the exact posterior in Equation (4.6). The point probabilities $ q(M_1 = i)$ and $ q(M_t = j \vert M_{t-1}=i)$ can therefore be evaluated with a modified forward-backward iteration. The result is the same as in Equation (4.9), except that in the iteration $ \pi_i$ is replaced with $ \pi_i^*$, $ a_{ij}$ with $ a_{ij}^*$, and $ b_i(\mathbf{x}(t))$ with $ \exp(-C(\mathbf{x}(t) \vert M_t=i))$.
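
The modified iteration can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names, the per-step rescaling scheme and the example values are assumptions. The brute-force routine sums Equation (6.14) over all state sequences and serves only as a check for very short sequences.

```python
import math
from itertools import product

def forward_backward(pi_star, a_star, cost):
    """Marginals of q(M) in Equation (6.14).

    pi_star[i], a_star[i][j]: the quantities defined in Equation (6.12).
    cost[t][i]: C(x(t) | M_t = i).
    Returns gamma[t][i] = q(M_t = i) and xi[t][i][j] = q(M_t = i, M_{t+1} = j).
    """
    T, N = len(cost), len(pi_star)
    b = [[math.exp(-c) for c in row] for row in cost]  # replaces b_i(x(t))

    # Forward pass, rescaled at every step to avoid underflow.
    alpha = []
    for t in range(T):
        if t == 0:
            raw = [pi_star[i] * b[0][i] for i in range(N)]
        else:
            raw = [b[t][j] * sum(alpha[t - 1][i] * a_star[i][j] for i in range(N))
                   for j in range(N)]
        s = sum(raw)
        alpha.append([v / s for v in raw])

    # Backward pass, rescaled in the same way.
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        raw = [sum(a_star[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(N))
               for i in range(N)]
        s = sum(raw)
        beta[t] = [v / s for v in raw]

    # Single-time marginals, normalised separately for each t.
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(N)]
        s = sum(g)
        gamma.append([v / s for v in g])

    # Pairwise marginals of consecutive states.
    xi = []
    for t in range(T - 1):
        m = [[alpha[t][i] * a_star[i][j] * b[t + 1][j] * beta[t + 1][j]
              for j in range(N)] for i in range(N)]
        s = sum(map(sum, m))
        xi.append([[v / s for v in row] for row in m])
    return gamma, xi

def brute_force_marginals(pi_star, a_star, cost):
    """q(M_t = i) obtained by enumerating every state sequence in (6.14)."""
    T, N = len(cost), len(pi_star)
    gamma = [[0.0] * N for _ in range(T)]
    for path in product(range(N), repeat=T):
        w = pi_star[path[0]]
        for t in range(T - 1):
            w *= a_star[path[t]][path[t + 1]]
        for t in range(T):
            w *= math.exp(-cost[t][path[t]])
        for t in range(T):
            gamma[t][path[t]] += w
    z_m = sum(gamma[0])  # the normalising constant Z_M
    return [[v / z_m for v in row] for row in gamma]

# Assumed example: two states, three time steps, subnormalised parameters.
pi_star = [0.5, 0.3]
a_star = [[0.6, 0.2], [0.1, 0.7]]
cost = [[0.5, 2.0], [1.5, 0.3], [0.2, 1.0]]
gamma, xi = forward_backward(pi_star, a_star, cost)
```

In this sketch the conditional probability $ q(M_{t+1} = j \vert M_t = i)$ is obtained as `xi[t][i][j] / gamma[t][i]`.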


Antti Honkela 2001-05-30