1 MM : Viterbi algorithm - a toy example et's consider the following simple MM. This model is composed of 2 states, (high C content) and (low C content). We can for example consider that state characterizes coding DN while characterizes non-coding DN. The model can then be used to predict the region of coding DN from a given sequence. Sources: For the theory, see Durbin et al (1998); For the example, see Borodovsky & Ekisheva (26), pp 8-81
2 MM : Viterbi algorithm - a toy example Consider the sequence S= CCT There are several paths through the hidden states ( and ) that lead to the given sequence S. Example: P = The probability of the MM to produce sequence S through the path P is: p = p () * p () * p * p () * p * p (C) *... =.5 *.2 *.6 *.2 *.4 *.3 *... =...
3 MM : Viterbi algorithm - a toy example CCT There are several paths through the hidden states ( and ) that lead to the given sequence, but they do not have the same probability. The Viterbi algorithm is a dynamical programming algorithm that allows us to compute the most probable path. Its principle is similar to the DP programs used to align 2 sequences (i.e. Needleman-Wunsch) Source: Borodovsky & Ekisheva, 26
4 MM : Viterbi algorithm - a toy example Viterbi algorithm: principle C C T The probability of the most probable path ending in state k with observation "i" is p l (i, x) = e l ( i)max k ( ) p k ( j, x "1).p kl! probability to observe element i in state l probability of the most probable path ending at position x-1 in state k with element j probability of the transition from state l to state k
5 MM : Viterbi algorithm - a toy example Viterbi algorithm: principle C C T The probability of the most probable path ending in state k with observation "i" is In our example, the probability of the most probable path ending in state with observation "" at the 4th position is:! p l (i, x) = e l ( i)max ( ) p (,4) = e ()max p (C,3) p, p (C,3) p We can thus compute recursively (from the first to the last element of our sequence) the probability of the most probable path. k ( ) p k ( j, x "1).p kl
6 MM : Viterbi algorithm - a toy example C C T T Remark: for the calculations, it is convenient to use the log of the probabilities (rather than the probabilities themselves). Indeed, this allows us to compute sums instead of products, which is more efficient and accurate. We used here log 2 (p).
7 MM : Viterbi algorithm - a toy example C T C T CCT Probability (in log 2 ) that at the first position was emitted by state Probability (in log 2 ) that at the first position was emitted by state p (,1) = = p (,1) = =
8 MM : Viterbi algorithm - a toy example C T C T CCT Probability (in log 2 ) that at the 2nd position was emitted by state Probability (in log 2 ) that at the 2nd position was emitted by state p (,2) = max (p (,1)+p, p (,1)+p ) = max ( , ) = (obtained from p (,1)) p (,2) = max (p (,1)+p, p (,1)+p ) = max ( , ) = (obtained from p (,1))
9 MM : Viterbi algorithm - a toy example C T C T CCT C C T We then compute iteratively the probabilities p (i,x) and p (i,x) that nucleotide i at position x was emitted by state or, respectively. The highest probability obtained for the nucleotide at the last position is the probability of the most probable path. This path can be retrieved by back-tracking.
10 MM : Viterbi algorithm - a toy example C T C T CCT back-tracking (= finding the path which corresponds to the highest probability, ) C C T The most probable path is: Its probability is = 4.25E-8 (remember that we used log 2 (p))
11 MM : Forward algorithm - a toy example Consider now the sequence S= C What is the probability P(S) that this sequence S was generated by the MM model? This probability P(S) is given by the sum of the probabilities p i (S) of each possible path that produces this sequence. The probability P(S) can be computed by dynamical programming using either the so-called Forward or the Backward algorithm.
12 MM : Forward algorithm - a toy example Forward algorithm Consider now the sequence S= C C.5*.3=.15.5*.2=.1
13 MM : Forward algorithm - a toy example Consider now the sequence S= C C.5*.3=.15.15*.5*.3 +.1*.4*.3=.345.5*.2=.1
14 MM : Forward algorithm - a toy example Consider now the sequence S= C C.5*.3=.15.15*.5*.3 +.1*.4*.3=.345.5*.2=.1.1*.6* *.5*.2=.27
15 MM : Forward algorithm - a toy example Consider now the sequence S= C C.5*.3=.15.15*.5*.3 +.1*.4*.3= *.2=.1.1*.6* *.5*.2=
16 MM : Forward algorithm - a toy example Consider now the sequence S= C C.5*.3=.15.15*.5*.3 +.1*.4*.3= *.2=.1.1*.6* *.5*.2= => The probability that the sequence S was generated by the MM model is thus P(S)= Σ =.38432
17 MM : Forward algorithm - a toy example The probability that sequence S="C" was generated by the MM model is P MM (S) = To assess the significance of this value, we have to compare it to the probability that sequence S was generated by the background model (i.e. by chance). Ex: If all nucleotides have the same probability, p bg =.25; the probability to observe S by chance is: P bg (S) = p bg 4 =.25 4 =.396. Thus, for this particular example, it is likely that the sequence S does not match the MM model (P bg > P MM ). NB: Note that this toy model is very simple and does not reflect any biological motif. If fact both states and are characterized by probabilities close to the background probabilities, which makes the model not realistic and not suitable to detect specific motifs.
18 MM : Summary Summary The Viterbi algorithm is used to compute the most probable path (as well as its probability). It requires knowledge of the parameters of the MM model and a particular output sequence and it finds the state sequence that is most likely to have generated that output sequence. It works by finding a maximum over all possible state sequences. In sequence analysis, this method can be used for example to predict coding vs non-coding sequences. In fact there are often many state sequences that can produce the same particular output sequence, but with different probabilities. It is possible to calculate the probability for the MM model to generate that output sequence by doing the summation over all possible state sequences. This also can be done efficiently using the Forward algorithm (or the Backward algorithm), which is also a dynamical programming algorithm. In sequence analysis, this method can be used for example to predict the probability that a particular DN region match the MM motif (i.e. was emitted by the MM model). MM motif can represent a TF binding site for ex.
19 MM : Summary Remarks To create a MM model (i.e. find the most likely set of state transition and output probabilities of each state), we need a set of (training) sequences, that does not need to be aligned. No tractable algorithm is known for solving this problem exactly, but a local maximum likelihood can be derived efficiently using the Baum-Welch algorithm or the Baldi-Chauvin algorithm. The Baum-Welch algorithm is an example of a forward-backward algorithm, and is a special case of the Expectation-maximization algorithm. For more details: see Durbin et al (1998) MMER The UMMER3 package contains a set of programs (developed by S. Eddy) to build MM models (from a set of aligned sequences) and to use MM models (to align sequences or to find sequences in databases). These programs are available at the Mobyle plateform (http://mobyle.pasteur.fr/cgi-bin/mobyleportal/portal.py)
MiSeq: Imaging and Page Welcome Navigation Presenter Introduction MiSeq Sequencing Workflow Narration Welcome to MiSeq: Imaging and. This course takes 35 minutes to complete. Click Next to continue. Please
Finding the M Most Probable Configurations Using Loopy Belief Propagation Chen Yanover and Yair Weiss School of Computer Science and Engineering The Hebrew University of Jerusalem 91904 Jerusalem, Israel
Foundations of Data Science John Hopcroft Ravindran Kannan Version /4/204 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the notes
1 Base Arithmetic 1.1 Binary Numbers We normally work with numbers in base 10. In this section we consider numbers in base 2, often called binary numbers. In base 10 we use the digits 0, 1, 2, 3, 4, 5,
7 The Backpropagation Algorithm 7. Learning as gradient descent We saw in the last chapter that multilayered networks are capable of computing a wider range of Boolean functions than networks with a single
1 An Introduction to Conditional Random Fields for Relational Learning Charles Sutton Department of Computer Science University of Massachusetts, USA email@example.com http://www.cs.umass.edu/ casutton
(To appear in ALGORITHMICA) On line construction of suffix trees 1 Esko Ukkonen Department of Computer Science, University of Helsinki, P. O. Box 26 (Teollisuuskatu 23), FIN 00014 University of Helsinki,
Selecting a Sub-set of Cases in SPSS: The Select Cases Command When analyzing a data file in SPSS, all cases with valid values for the relevant variable(s) are used. If I opened the 1991 U.S. General Social
and its Applications to Clustering Koby Crammer firstname.lastname@example.org Partha Pratim Talukdar email@example.com Department of Computer & Information Science, University of Pennsylvania, Philadelphia,
Space-Time Embedded Intelligence Laurent Orseau 1 and Mark Ring 2 1 AgroParisTech UMR 518 / INRA 16 rue Claude Bernard, 75005 Paris, France firstname.lastname@example.org http://www.agroparistech.fr/mia/orseau/
Monte Carlo methods in PageRank computation: When one iteration is sufficient K.Avrachenkov, N. Litvak, D. Nemirovsky, N. Osipova Abstract PageRank is one of the principle criteria according to which Google
A Google-like Model of Road Network Dynamics and its Application to Regulation and Control Emanuele Crisostomi, Steve Kirkland, Robert Shorten August, 2010 Abstract Inspired by the ability of Markov chains
600 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 3, MARCH 1997 Sparse Signal Reconstruction from Limited Data Using FOCUSS: A Re-weighted Minimum Norm Algorithm Irina F. Gorodnitsky, Member, IEEE,
One-year reserve risk including a tail factor : closed formula and bootstrap approaches Alexandre Boumezoued R&D Consultant Milliman Paris email@example.com Yoboua Angoua Non-Life Consultant
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany firstname.lastname@example.org WWW home page: http://www.tuebingen.mpg.de/ carl
Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity Wei Dai and Olgica Milenkovic Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign
Chapter 4 High-Dimensional Image Warping John Ashburner & Karl J. Friston The Wellcome Dept. of Imaging Neuroscience, 12 Queen Square, London WC1N 3BG, UK. Contents 4.1 Introduction.................................
WHAT IS '''SIGNIFICANT LEARNING"? Dr. L. Dee Fink Director, Instructional Development Program University of Oklahoma Author of Creating Significant Learning Experiences (Jossey-Bass, 2003) One of the first
Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters Fan Deng University of Alberta email@example.com Davood Rafiei University of Alberta firstname.lastname@example.org ABSTRACT
Exploiting generative models discriminative classifiers In Tommi S. Jaakkola* MIT Artificial Intelligence Laboratorio 545 Technology Square Cambridge, MA 02139 David Haussler Department of Computer Science
Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data Neil D. Lawrence Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield,
OpenGamma Quantitative Research Adjoint Algorithmic Differentiation: Calibration and Implicit Function Theorem Marc Henrard email@example.com OpenGamma Quantitative Research n. 1 November 2011 Abstract
Journal of Machine Learning Research 6 (2005) 1961 1998 Submitted 8/04; Revised 3/05; Published 12/05 What s Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks