1. Forward values for AGTT ----------------------- f0(0) = 1 f1(0) = f2(0) = f3(0) = f4(0) = f5(0) = 0 f1(1) = e1(A) * a_0,1 * f0(0) = .4*.5*.1 = .2 f2(1) = e2(A) * a_0,2 * f0(0) = .2 f1(2) = e1(G) * a_1,1 * f1(1) = 0.2*0.2*0.2 = 0.008 f2(2) = e2(G) * a_2,2 * f2(1) = 0.1*0.8*0.2 = 0.016 f3(2) = e3(G) * a_1,3 * f1(1) = 0.3*0.8*0.2 = 0.048 f4(2) = e4(G) * a_2,4 * f2(1) = 0.4*0.2*0.2 = 0.016 f1(3) = e1(T) * a_1,1 * f1(2) = 0.3*0.2*0.008 = 0.00048 f2(3) = e2(T) * a_2,2 * f2(2) = 0.4*0.8*0.016 = 0.00512 f3(3) = e3(T) * (a_1,3 * f1(2)+a_3,3*f3(2)) = 0.2*(0.8*0.008 + 0.4*0.048) = 0.00512 f4(3) = e4(T) * (a_2,4 * f2(2)+a_4,4*f4(2)) = 0.1*(0.2*0.016 + 0.1*0.016) = 0.00048 f1(4) = e1(T) * a_1,1 * f1(3) = 0.3*0.2*0.00048 = 0.0000288 f2(4) = e2(T) * a_2,2 * f2(3) = 0.4*0.8*0.00512 = 0.00164 f3(4) = e3(T) * (a_1,3 * f1(3)+a_3,3*f3(3)) = 0.2*(0.8*0.00048 + 0.4*0.00512) = 0.000486 f4(4) = e4(T) * (a_2,4 * f2(3)+a_4,4*f4(3)) = 0.1*(0.2*0.00512 + 0.1*0.00048) = 0.000107 f5(4) = a_3,5 * f3(4)+a_4,5*f4(4) = 0.6 * 0.000486 + 0.9 * 0.000107 = 0.000388 f 0 1 2 3 4 5 0 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 A 1 0.00000 0.20000 0.20000 0.00000 0.00000 0.00000 G 2 0.00000 0.00800 0.01600 0.04800 0.01600 0.00000 T 3 0.00000 0.00048 0.00512 0.00512 0.00048 0.00000 T 4 0.0000000 0.0000288 0.0016384 0.0004864 0.000107 0.000388 ------------------------ Backward values for AGTT ------------------------ b5(4) = 1 b4(4) = 0.9 b3(4) = 0.6 b4(3) = b4(4) * e4(T) * a_4,4 = 0.9 * 0.1 * 0.1 = 0.009 b3(3) = b3(4) * e3(T) * a_3,3 = 0.6 * 0.2 * 0.4 = 0.048 b2(3) = b4(4) * e4(T) * a_2,4 = 0.9 * 0.1 * 0.2 = 0.018 b1(3) = b3(4) * e3(T) * a_1,3 = 0.6 * 0.2 * 0.8 = 0.096 b4(2) = b4(3) * e4(T) * a_4,4 = 0.009 * 0.1 * 0.1 = 0.00009 b3(2) = b3(3) * e3(T) * a_3,3 = 0.048 * 0.2 * 0.4 = 0.0384 b2(2) = b4(3) * e4(T) * a_2,4 + b2(3) * e2(T) * a_2,2 = 0.009 * 0.1 * 0.2 + 0.018 * 0.4 *0.8 = 0.00594 b1(2) = b3(3) * e3(T) * a_1,3 + b1(3) * e1(T) * a_1,1= 0.048 * 0.2 * 0.8 + 0.096*0.3*0.2= 0.01344 b 0 1 2 3 4 5 0 0.000388 0.000190 0.000154 0.0000369 0.000000036 0 A 0.001641 0.00146 0.000482 0.0004608 0.0000036 0 G 0.018 0.0134 0.00594 0.00384 0.00009 0 T 0 0.096 0.018 0.048 0.009 0 T 0 0 0 0.6 0.9 1 =============================================================================== 2. Viterbi values for AGTT ----------------------- v 0 1 2 3 4 5 0 1.00000 0.00000 0.00000 0.00000 0.00000 0.00000 T 1 0.00000 0.15000 0.20000 0.00000 0.00000 0.00000 A 2 0.00000 0.01200 0.06400 0.02400 0.00400 0.00000 T 3 0.00000 0.00072 0.02048 0.00192 0.00128 0.00000 A 4 0.00000 0.0000576 0.00655 0.000154 0.000410 0.0003686 ptr 0 1 2 3 4 5 0 T 1 0 0 A 2 1 2 1 2 T 3 1 2 1,3 2 A 4 1 2 3 2 4 TRACE: T A T A 2 2 2 4 =============================================================================== 3. HMM for boxes of fruit (NOTE: you could also have a BEGIN and END state which would give slightly different values ---------------------- a 1 2 3 e 1 2 3 1 0.33 0.33 0.33 A 0.5 0.75 0.25 2 0.33 0.33 0.33 O 0.5 0.25 0.75 3 0.33 0.33 0.33 ____________________________________ | | --[box 1] --> [box 2] --> [box 3]<-- ->[A:.5 ] <-- [A:.75] <-- [A:.25]--- | [O:.5 ] [O:.25] [O:.75] | ------------------------------------ =============================================================================== 4. P(pi = (1,1,3,3,2); x = (A,A,O,O,A)) ------------------------------------ P(pi,x) = (0.33*0.5)^2 * (0.33*0.75)^2 * (0.33*0.75) = .000434 =============================================================================== 5. Calculate pi* for x = (A,A,O,O,A) Use Viterbi v 0 1 2 3 0 1 A 0.17 0.25 0.08 A 0.04 0.06 0.02 O 0.01 0.01 0.02 O 0.0026 0.0013 0.0039 A 0.0007 0.0010 0.0003 v 0 1 2 3 A 1 2 3 A 2 2 2 O 2 2 2 O 3 3 3 A 3 3 3 2 A A O O A 2 2 3 3 2 =============================================================================== 6. Log odds ration of pi* to pi_4 ------------------------------- P(pi*|x) = P(pi*,x)/P(x) = 0.33^5 * 0.75^5 / P(x) P(pi_4|x) P(pi_4,x)/P(x) = 0.00434 / P(x) P(pi*|x) / P(pi_4|x) = 0.000977 / 0.000434 Answer depends on log base, either is acceptable: ln(0.000977 / 0.000434) = 0.8109 log(0.000977 / 0.000434) = 0.3522 =============================================================================== 7. Cheating dealer HMM -------------------- [BEGIN] -> [F1] -> [F2] -> [F3] -> [F4] -> [F5] (arrows from F5 to F1, B1, END) \ --> [B1] --> [B2] --> [B3] --> [B4] --> [B5] (arrows from B5 to F1, B1, END) No difference with algos but should tie emission parameters of all fair states and also within bias states =============================================================================== 8. Adding an END state ------------------ The end state must reduce the probability values of all other outgoing edges for any state that leads to the end state. Therefore, the probability decreases for each transition in the model making longer sequences less likely for each transition used E.g., consider a model for a state with a self transition with P() of tau, and end transition of 1-tau Without an end state, tau would be 1. So a sequence of length N would accumulate tau^N = 1 probability no matter the length For an end state, tau is less than 1. The prob because tau^(N-1)*(1-tau). This is an exponential decay wrt to N =============================================================================== 9 Applications where END is not a good choice ------------------------------------------- Infinite process (weather, stock markets) Fixed length processes