
The terms originating from the hidden state sequence $ \boldsymbol {M}$

The term $ C_q(\boldsymbol{M})$ is just a sum over the discrete distribution. It can be further simplified into

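\begin{displaymath}\begin{split}C_q(\boldsymbol{M}) &= \sum_{\boldsymbol{M}} q(\boldsymbol{M}) \log q(\boldsymbol{M}) \\ &= \sum_{i=1}^N q(M_1=i) \log q(M_1=i) + \sum_{t=1}^{T-1} \sum_{i,j=1}^N q(M_{t+1}=j, M_{t}=i) \log q(M_{t+1}=j \vert M_{t}=i). \end{split}\end{displaymath} (6.9)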
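
As a rough illustration, $ C_q(\boldsymbol{M})$ can be evaluated directly from the initial-state marginals $ q(M_1=i)$ and the pairwise marginals $ q(M_{t+1}=j, M_{t}=i)$. The sketch below is not taken from the original text. It assumes that the approximation factorises as a Markov chain, that the conditional $ q(M_{t+1}=j \vert M_{t}=i)$ is obtained by dividing the joint by its row sum $ q(M_{t}=i)$, and that terms with zero probability contribute nothing. The names `cq_hidden_states`, `q_init` and `q_pair` are hypothetical.

```python
import math


def cq_hidden_states(q_init, q_pair):
    """Evaluate C_q(M) = sum_M q(M) log q(M) for a Markov-chain q.

    q_init[i]       -- q(M_1 = i)                    (hypothetical layout)
    q_pair[t][i][j] -- q(M_{t+1} = j, M_t = i) for the t-th transition
    """
    # Initial-state term; zero-probability states contribute 0 (0 log 0 = 0).
    cost = sum(p * math.log(p) for p in q_init if p > 0.0)

    # Transition terms: joint marginal times log of the conditional.
    for joint in q_pair:
        for row in joint:
            row_sum = sum(row)  # q(M_t = i), the marginal of the joint
            for p in row:
                if p > 0.0:
                    cost += p * math.log(p / row_sum)
    return cost
```

For a chain of three time steps with two states and a uniform $ q$, the result equals $ 3 \log 0.5$, the negative entropy of a uniform distribution over the eight possible state sequences.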

The other term, $ C_p(\boldsymbol{M})$, can be broken down into

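\begin{displaymath}\begin{split}C_p(\boldsymbol{M}) &= \operatorname{E}\left[ - \log p(\boldsymbol{M} \vert \boldsymbol{\theta}) \right] \\ &= - \sum_{i=1}^N q(M_1=i) \operatorname{E}\left[ \log \pi_i \right] - \sum_{t=1}^{T-1} \sum_{i,j=1}^N q(M_{t+1}=j, M_{t}=i) \operatorname{E}\left[ \log a_{ij} \right] \end{split}\end{displaymath} (6.10)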

where, according to Equation (A.13), $ \operatorname{E}\left[ \log \pi_i \right] = \Psi(\hat{\pi}_i) - \Psi(\sum_{j=1}^N \hat{\pi}_j)$ and similarly $ \operatorname{E}\left[ \log a_{ij} \right] = \Psi(\hat{a}_{ij}) - \Psi(\sum_{k=1}^N \hat{a}_{ik})$. Here $ \Psi$ denotes the digamma function.
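
The expectations above can be turned into a small numerical sketch of $ C_p(\boldsymbol{M})$. The code below is an illustration under stated assumptions rather than the original implementation: the function and argument names (`cp_hidden_states`, `q_init`, `q_pair`, `pi_hat`, `a_hat`) are hypothetical, and the digamma function $ \Psi$ is approximated by hand with the standard recurrence and asymptotic series so that the example needs only the Python standard library. In practice a library routine such as `scipy.special.digamma` could be used instead.

```python
import math


def digamma(x):
    """Approximate Psi(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")
    result = 0.0
    # Shift upwards using Psi(x) = Psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Psi(x) ~ log x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def expected_log(params):
    """E[log x_i] = Psi(param_i) - Psi(sum_j param_j) for each component."""
    total = digamma(sum(params))
    return [digamma(a) - total for a in params]


def cp_hidden_states(q_init, q_pair, pi_hat, a_hat):
    """Evaluate C_p(M) from marginals of q and the parameters pi_hat, a_hat.

    q_init[i]       -- q(M_1 = i)                    (hypothetical layout)
    q_pair[t][i][j] -- q(M_{t+1} = j, M_t = i)
    pi_hat[i]       -- parameter \\hat{pi}_i
    a_hat[i][j]     -- parameter \\hat{a}_{ij} (row i is normalised separately)
    """
    e_log_pi = expected_log(pi_hat)
    e_log_a = [expected_log(row) for row in a_hat]

    # Initial-state term: - sum_i q(M_1 = i) E[log pi_i]
    cost = -sum(q * e for q, e in zip(q_init, e_log_pi))

    # Transition terms: - sum_t sum_ij q(M_{t+1} = j, M_t = i) E[log a_ij]
    for joint in q_pair:
        for i, row in enumerate(joint):
            for j, p in enumerate(row):
                cost -= p * e_log_a[i][j]
    return cost
```

With all parameters equal to one and two states, each expectation is $ \Psi(1) - \Psi(2) = -1$, so every time step contributes exactly one unit to the cost, which makes a convenient sanity check.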

The above equations give the value of the cost function for a given approximating distribution $ q(\boldsymbol{\theta}, \boldsymbol{M})$. This value is important because it can be used to compare different models, as shown in Section 3.3. Additionally, it can be used to monitor whether the iterative optimisation procedure has converged.



Antti Honkela 2001-05-30