Assignment 3

Generating samples from Hidden Markov Models

Results

hmm1seqmany.png

HMM1

hmm2seq.png

HMM2

hmm3seq.png

HMM3

hmm4seq.png

HMM4

Questions

  1. How can you verify that a transition matrix is valid?
    • A valid transition matrix has zeros in the first column, because no state should transition to the starting state.
    • Each row should sum to 1, to form a valid PDF.
    • The last row should be all zeros with a one in the last column, because the final state goes only to itself.
    • At least one state besides the final state should have a nonzero value in the last column, so that the final state can be reached
    • For an ergodic system, a_{ij} is nonzero for all 0 < i < N, 0 < j < N, where N is the number of states
    • For a pure L-R system,
      • the first row has all zeros and a one in the column for the first emitting state
      • the last row has all zeros and a one in the last column
      • all other rows have non zero values only on the main diagonal and in the column for the next state
      • the path from the initial state through each next state should reach the final state after visiting each state once
  2. What is the effect of the different transition matrices on the sequences obtained during the current experiment? Hence, what is the role of the transition probabilities in the Markovian modeling framework?
    • The contrast is obvious in these examples between the transition matrices that heavily favor steady-state transitions and those that make transitioning to other states more likely. The graphs clearly show a tendency to stay in one state in the first case, where as the others transition state-to-state more frequently.
  3. What would happen if we didn’t have a final state?
    • The system would keep going indefinitely
  4. -
    1. In the case of HMMs with plain Gaussian emission probabilities, what quantities should be present in the complete parameter set \Theta that specifies a particular model?
      • A - The set of transition probabilities a_{ij}
      • For each state there is a mean and variance for each dimension of the observation vector
    2. If the model is ergodic with N states (including the initial and final), and represents data of dimension D, what is the total number of parameters in \Theta?
      • The transition matrix will be N x N, but the first column and last row are known, so the variables are really (N-1) x (N-1)
      • A mean and variance for each dimension and for each emitting state give 2 x (N-2) x D
      • so the total is (N-1) x (N-1) + 2 x (N-2) x D
  5. Which type of HMM (ergodic or left-right) would you use to model words?
    • If we have a limited vocabulary we are trying to recognize, we could model each potential word as a left-right system, run them in parallel, and choose the most likely overall. For a large vocabulary, it might make more sense to use an ergodic model, and try to find a word that matches that sequence of phenomes.

Maximum Likelihood Classification

Sequence log p(X|Θ1 ) log p(X|Θ2 ) log p(X|Θ3 ) log p(X|Θ4 ) log p(X|Θ5 ) log p(X|Θ6 ) Most Likely Model
X1 -559.39 -621.22 -1439.84 -1421.07 -1294.46 -993.23 1
X2 -115.97 -117.48 -111.43 -114.49 -246.16 -140.22 3
X3 -826.50 -787.79 -1180.20 -1154.30 -741.00 -1329.14 5
X4 -878.87 -823.31 -855.12 -820.31 -1158.25 -1077.90 4
X5 -776.49 -760.90 -811.55 -788.94 -997.82 -603.51 6
X6 -1396.76 -1322.77 -1905.32 -1852.02 -2110.50 -2023.40 2

Optimal State Sequence

X6hmm2.png

This is the only example for which the viterbi algorithm didn't match the original state sequence.

hmm2X6seq.png

Here we see why the mistake was made (notice the blue dot among the red)

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License