An Introduction to Factor Graphs


1 An Introduction to Factor Graphs. Hans-Andrea Loeliger, MLSB 2008, Copenhagen.

2 Definition

A factor graph represents the factorization of a function of several variables. We use Forney-style factor graphs (Forney, 2001).

Example:

f(x_1, x_2, x_3, x_4, x_5) = f_A(x_1, x_2, x_3) f_B(x_3, x_4, x_5) f_C(x_4).

[Figure: factor nodes f_A, f_B, f_C; edge x_3 between f_A and f_B, edge x_4 between f_B and f_C; half-edges x_1, x_2 at f_A and x_5 at f_B.]

Rules:
- A node for every factor.
- An edge or half-edge for every variable.
- Node g is connected to edge x iff variable x appears in factor g.

(What if some variable appears in more than two factors?)
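
It can help to see this definition in executable form. The sketch below (function names and numbers are made up for illustration, not from the talk) stores a Forney-style factor graph as a list of factors, each paired with the names of the variables it is attached to; variables shared by two factors are ordinary edges, variables appearing in only one factor are half-edges.

    # Minimal sketch: a Forney-style factor graph as a list of
    # (factor function, attached variable names). All factors are made up.

    def f_A(x1, x2, x3):
        return 1.0 if x3 == (x1 ^ x2) else 0.5

    def f_B(x3, x4, x5):
        return 0.25 + 0.5 * (x3 == x4 == x5)

    def f_C(x4):
        return 0.9 if x4 == 0 else 0.1

    factor_graph = [
        (f_A, ("x1", "x2", "x3")),
        (f_B, ("x3", "x4", "x5")),
        (f_C, ("x4",)),
    ]

    def global_function(assignment):
        """Evaluate f(x1,...,x5) = f_A * f_B * f_C at one configuration."""
        value = 1.0
        for factor, variables in factor_graph:
            value *= factor(*(assignment[v] for v in variables))
        return value

    print(global_function({"x1": 0, "x2": 1, "x3": 1, "x4": 0, "x5": 1}))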

3 Example: Markov Chain

p_{XYZ}(x, y, z) = p_X(x) p_{Y|X}(y|x) p_{Z|Y}(z|y).

[Figure: chain factor graph with nodes p_X, p_{Y|X}, p_{Z|Y} connected by edges X and Y, and half-edge Z.]

We will often use capital letters for the variables. (Why?) Further examples will come later.

4 Message Passing

Message passing algorithms operate by passing messages along the edges of a factor graph.

[Figure: a factor graph with messages traveling along its edges.]

5 A main point of factor graphs (and similar graphical notations): A Unified View of Historically Different Things

Statistical physics:
- Markov random fields (Ising 1925)

Signal processing:
- linear state-space models and Kalman filtering: Kalman; recursive least-squares adaptive filters
- hidden Markov models: Baum et al.; unification: Levy et al.

Error correcting codes:
- low-density parity check codes: Gallager 1962; Tanner 1981; MacKay 1996; Luby et al.
- convolutional codes and Viterbi decoding: Forney
- turbo codes: Berrou et al.

Machine learning, statistics:
- Bayesian networks: Pearl 1988; Shachter 1988; Lauritzen and Spiegelhalter 1988; Shafer and Shenoy

6 Other Notation Systems for Graphical Models

Example: p(u, w, x, y, z) = p(u) p(w) p(x|u, w) p(y|x) p(z|x).

[Figure: the same model over U, W, X, Y, Z drawn in four notations: Forney-style factor graph, original factor graph [FKLW 1997], Bayesian network, and Markov random field.]

7 Outline

1. Introduction
2. The sum-product and max-product algorithms
3. More about factor graphs
4. Applications of sum-product & max-product to hidden Markov models
5. Graphs with cycles, continuous variables, ...

8 The Two Basic Problems

1. Marginalization: Compute

f_k(x_k) = \sum_{x_1, \dots, x_n \text{ except } x_k} f(x_1, \dots, x_n)

2. Maximization: Compute the max-marginal

\hat{f}_k(x_k) = \max_{x_1, \dots, x_n \text{ except } x_k} f(x_1, \dots, x_n),

assuming that f is real-valued and nonnegative and has a maximum. Note that

\operatorname{argmax} f(x_1, \dots, x_n) = \big( \operatorname{argmax} \hat{f}_1(x_1), \dots, \operatorname{argmax} \hat{f}_n(x_n) \big).

For large n, both problems are in general intractable, even for x_1, \dots, x_n \in \{0, 1\}.
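
Both quantities can be computed by brute force when n is tiny; the cost grows as |X|^n, which is exactly why the factor-graph machinery below is needed. A minimal sketch (the function f and its factors are made up for illustration):

    import itertools
    import numpy as np

    # Toy global function over n = 4 binary variables (made-up factors).
    def f(x):
        x1, x2, x3, x4 = x
        return (0.5 + x1 * x2) * (1.0 + x2 + x3) * (0.1 + x3 * x4)

    n, k = 4, 2                      # marginalize / maximize onto x_k (0-based index 2)
    marg = np.zeros(2)               # f_k(x_k): sum over all other variables
    maxm = np.zeros(2)               # f^_k(x_k): max over all other variables
    for x in itertools.product((0, 1), repeat=n):
        marg[x[k]] += f(x)
        maxm[x[k]] = max(maxm[x[k]], f(x))

    print("marginal:", marg, "max-marginal:", maxm)
    # The loop visits 2**n configurations; for large n this is hopeless.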

9 Factorization Helps

For example, if

f(x_1, \dots, x_n) = f_1(x_1) f_2(x_2) \cdots f_n(x_n),

then

f_k(x_k) = \Big( \sum_{x_1} f_1(x_1) \Big) \cdots \Big( \sum_{x_{k-1}} f_{k-1}(x_{k-1}) \Big) \, f_k(x_k) \, \Big( \sum_{x_{k+1}} f_{k+1}(x_{k+1}) \Big) \cdots \Big( \sum_{x_n} f_n(x_n) \Big)

and

\hat{f}_k(x_k) = \Big( \max_{x_1} f_1(x_1) \Big) \cdots \Big( \max_{x_{k-1}} f_{k-1}(x_{k-1}) \Big) \, f_k(x_k) \, \Big( \max_{x_{k+1}} f_{k+1}(x_{k+1}) \Big) \cdots \Big( \max_{x_n} f_n(x_n) \Big).

Factorization helps also beyond this trivial example.
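
In this fully factorized case the n-fold sum and maximum split into n one-dimensional ones, so the cost drops from |X|^n to roughly n |X| operations. A small sketch with made-up factor tables:

    import numpy as np

    # Fully factorized f(x_1,...,x_n) = f_1(x_1) * ... * f_n(x_n),
    # each factor stored as a table over a binary alphabet (made-up numbers).
    factors = [np.array([0.2, 0.8]), np.array([1.5, 0.5]),
               np.array([0.3, 0.7]), np.array([2.0, 1.0])]
    k = 2

    # Marginal and max-marginal of x_k: scale f_k by the sums / maxima of the others.
    sum_scale = np.prod([t.sum() for i, t in enumerate(factors) if i != k])
    max_scale = np.prod([t.max() for i, t in enumerate(factors) if i != k])
    print("f_k:", sum_scale * factors[k], " f^_k:", max_scale * factors[k])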

10 Towards the sum-product algorithm: Computing Marginals (A Generic Example)

Assume we wish to compute

f_3(x_3) = \sum_{x_1, \dots, x_7 \text{ except } x_3} f(x_1, \dots, x_7)

and assume that f can be factored as

f(x_1, \dots, x_7) = f_1(x_1) f_2(x_2) f_3(x_1, x_2, x_3) f_4(x_4) f_5(x_3, x_4, x_5) f_6(x_5, x_6, x_7) f_7(x_7).

[Figure: the corresponding factor graph with nodes f_1, ..., f_7 and edges X_1, ..., X_7.]

11 Example cont'd: Closing Boxes by the Distributive Law

[Figure: the same factor graph with dashed boxes enclosing the subgraphs that are summed out.]

f_3(x_3) = \Big( \sum_{x_1, x_2} f_1(x_1) f_2(x_2) f_3(x_1, x_2, x_3) \Big) \cdot \Big( \sum_{x_4, x_5} f_4(x_4) f_5(x_3, x_4, x_5) \Big( \sum_{x_6, x_7} f_6(x_5, x_6, x_7) f_7(x_7) \Big) \Big)

12 Example cont'd: Message Passing View

[Figure: the same factor graph; the closed boxes now appear as messages flowing along the edges towards X_3.]

f_3(x_3) = \Big( \sum_{x_1, x_2} f_1(x_1) f_2(x_2) f_3(x_1, x_2, x_3) \Big) \cdot \Big( \sum_{x_4, x_5} f_4(x_4) f_5(x_3, x_4, x_5) \Big( \sum_{x_6, x_7} f_6(x_5, x_6, x_7) f_7(x_7) \Big) \Big),

where the first bracket is the message \overrightarrow{\mu}_{X_3}(x_3), the innermost bracket is \overleftarrow{\mu}_{X_5}(x_5), and the whole second bracket is \overleftarrow{\mu}_{X_3}(x_3).

13 Example cont'd: Messages Everywhere

[Figure: the same factor graph with a message along every edge.]

With \overrightarrow{\mu}_{X_1}(x_1) = f_1(x_1), \overrightarrow{\mu}_{X_2}(x_2) = f_2(x_2), etc., we have

\overrightarrow{\mu}_{X_3}(x_3) = \sum_{x_1, x_2} \overrightarrow{\mu}_{X_1}(x_1) \, \overrightarrow{\mu}_{X_2}(x_2) \, f_3(x_1, x_2, x_3)

\overleftarrow{\mu}_{X_5}(x_5) = \sum_{x_6, x_7} \overrightarrow{\mu}_{X_7}(x_7) \, f_6(x_5, x_6, x_7)

\overleftarrow{\mu}_{X_3}(x_3) = \sum_{x_4, x_5} \overrightarrow{\mu}_{X_4}(x_4) \, \overleftarrow{\mu}_{X_5}(x_5) \, f_5(x_3, x_4, x_5)

14 The Sum-Product Algorithm (Belief Propagation)

[Figure: a node g with incoming edges Y_1, ..., Y_n and outgoing edge X.]

Sum-product message computation rule:

\overrightarrow{\mu}_X(x) = \sum_{y_1, \dots, y_n} g(x, y_1, \dots, y_n) \, \overrightarrow{\mu}_{Y_1}(y_1) \cdots \overrightarrow{\mu}_{Y_n}(y_n)

Sum-product theorem: If the factor graph for some function f has no cycles, then

f_X(x) = \overrightarrow{\mu}_X(x) \, \overleftarrow{\mu}_X(x).
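
A minimal sketch of this node update, assuming discrete variables, messages stored as numpy vectors, and the local factor g given as a table with the outgoing variable on axis 0 (all names and numbers are illustrative):

    import numpy as np
    from functools import reduce

    def sum_product_message(g, incoming):
        """Outgoing sum-product message out of a factor node.

        g        -- table g[x, y_1, ..., y_n] for the local factor
        incoming -- list of message vectors [mu_Y1, ..., mu_Yn]
        returns  -- mu_X(x) = sum_{y_1..y_n} g(x, y_1..y_n) * prod_k mu_Yk(y_k)
        """
        # Outer product of the incoming messages, shape (|Y_1|, ..., |Y_n|).
        prod = reduce(np.multiply.outer, incoming) if incoming else 1.0
        # Multiply into g and sum out every axis except the first (X).
        return (g * prod).reshape(g.shape[0], -1).sum(axis=1)

    # Tiny usage example: g over (X, Y1, Y2), all binary, made-up entries.
    g = np.random.rand(2, 2, 2)
    mu_Y1, mu_Y2 = np.array([0.3, 0.7]), np.array([0.9, 0.1])
    print(sum_product_message(g, [mu_Y1, mu_Y2]))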

15 Towards the max-product algorithm: Computing Max-Marginals (A Generic Example)

Assume we wish to compute

\hat{f}_3(x_3) = \max_{x_1, \dots, x_7 \text{ except } x_3} f(x_1, \dots, x_7)

and assume that f can be factored as before.

[Figure: the same factor graph as in the marginalization example.]

16 Example: Closing Boxes by the Distributive Law

[Figure: the factor graph with the same dashed boxes as before.]

\hat{f}_3(x_3) = \Big( \max_{x_1, x_2} f_1(x_1) f_2(x_2) f_3(x_1, x_2, x_3) \Big) \cdot \Big( \max_{x_4, x_5} f_4(x_4) f_5(x_3, x_4, x_5) \Big( \max_{x_6, x_7} f_6(x_5, x_6, x_7) f_7(x_7) \Big) \Big)

17 Example cont'd: Message Passing View

[Figure: the factor graph with messages flowing towards X_3.]

\hat{f}_3(x_3) = \Big( \max_{x_1, x_2} f_1(x_1) f_2(x_2) f_3(x_1, x_2, x_3) \Big) \cdot \Big( \max_{x_4, x_5} f_4(x_4) f_5(x_3, x_4, x_5) \Big( \max_{x_6, x_7} f_6(x_5, x_6, x_7) f_7(x_7) \Big) \Big),

where the first bracket is the message \overrightarrow{\mu}_{X_3}(x_3), the innermost bracket is \overleftarrow{\mu}_{X_5}(x_5), and the whole second bracket is \overleftarrow{\mu}_{X_3}(x_3).

18 Example cont'd: Messages Everywhere

[Figure: the factor graph with a message along every edge.]

With \overrightarrow{\mu}_{X_1}(x_1) = f_1(x_1), \overrightarrow{\mu}_{X_2}(x_2) = f_2(x_2), etc., we have

\overrightarrow{\mu}_{X_3}(x_3) = \max_{x_1, x_2} \overrightarrow{\mu}_{X_1}(x_1) \, \overrightarrow{\mu}_{X_2}(x_2) \, f_3(x_1, x_2, x_3)

\overleftarrow{\mu}_{X_5}(x_5) = \max_{x_6, x_7} \overrightarrow{\mu}_{X_7}(x_7) \, f_6(x_5, x_6, x_7)

\overleftarrow{\mu}_{X_3}(x_3) = \max_{x_4, x_5} \overrightarrow{\mu}_{X_4}(x_4) \, \overleftarrow{\mu}_{X_5}(x_5) \, f_5(x_3, x_4, x_5)

19 The Max-Product Algorithm

[Figure: a node g with incoming edges Y_1, ..., Y_n and outgoing edge X.]

Max-product message computation rule:

\overrightarrow{\mu}_X(x) = \max_{y_1, \dots, y_n} g(x, y_1, \dots, y_n) \, \overrightarrow{\mu}_{Y_1}(y_1) \cdots \overrightarrow{\mu}_{Y_n}(y_n)

Max-product theorem: If the factor graph for some global function f has no cycles, then

\hat{f}_X(x) = \overrightarrow{\mu}_X(x) \, \overleftarrow{\mu}_X(x).
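
The node update is identical to the sum-product case with the sum replaced by a maximum; a sketch under the same assumptions as the sum-product snippet above:

    import numpy as np
    from functools import reduce

    def max_product_message(g, incoming):
        """Outgoing max-product message: like sum_product_message, but with max."""
        prod = reduce(np.multiply.outer, incoming) if incoming else 1.0
        return (g * prod).reshape(g.shape[0], -1).max(axis=1)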

20 Outline

1. Introduction
2. The sum-product and max-product algorithms
3. More about factor graphs
4. Applications of sum-product & max-product to hidden Markov models
5. Graphs with cycles, continuous variables, ...

21 Terminology

[Figure: the factor graph of slide 2 with factors f_A, f_B, f_C and variables x_1, ..., x_5.]

Global function f = product of all factors; usually (but not always!) real and nonnegative.

A configuration is an assignment of values to all variables.

The configuration space is the set of all configurations, which is the domain of the global function.

A configuration ω = (x_1, ..., x_5) is valid iff f(ω) ≠ 0.

22 Invalid Configurations Do Not Affect Marginals

A configuration ω = (x_1, ..., x_n) is valid iff f(ω) ≠ 0.

Recall:

1. Marginalization: Compute

f_k(x_k) = \sum_{x_1, \dots, x_n \text{ except } x_k} f(x_1, \dots, x_n)

2. Maximization: Compute the max-marginal

\hat{f}_k(x_k) = \max_{x_1, \dots, x_n \text{ except } x_k} f(x_1, \dots, x_n),

assuming that f is real-valued and nonnegative and has a maximum.

Constraints may be expressed by factors that evaluate to 0 if the constraint is violated.

23 Branching Point = Equality Constraint Factor

[Figure: a branching point on edge X redrawn as an equality-constraint node "=" with edges X, X', X''.]

The factor

f_=(x, x', x'') = 1 if x = x' = x'', and 0 otherwise,

enforces X = X' = X'' for every valid configuration.

More generally,

f_=(x, x', x'') = \delta(x - x') \, \delta(x - x''),

where \delta denotes the Kronecker delta for discrete variables and the Dirac delta for continuous variables.

24 Example: Independent Observations

Let Y_1 and Y_2 be two independent observations of X:

p(x, y_1, y_2) = p(x) p(y_1|x) p(y_2|x).

[Figure: node p_X, an equality-constraint node replicating X, and nodes p_{Y_1|X}, p_{Y_2|X} with half-edges Y_1, Y_2.]

Literally, the factor graph represents an extended model

p(x, x', x'', y_1, y_2) = p(x) p(y_1|x') p(y_2|x'') f_=(x, x', x'')

with the same marginals and max-marginals as p(x, y_1, y_2).

25 From A Priori to A Posteriori Probability

Example (cont'd): Let Y_1 = y_1 and Y_2 = y_2 be two independent observations of X, i.e.,

p(x, y_1, y_2) = p(x) p(y_1|x) p(y_2|x).

For fixed y_1 and y_2, we have

p(x | y_1, y_2) = p(x, y_1, y_2) / p(y_1, y_2) \propto p(x, y_1, y_2).

[Figure: the same factor graph, now with the half-edges fixed to the observed values y_1 and y_2.]

The factorization is unchanged (except for a scale factor). Known variables will be denoted by small letters; unknown variables will usually be denoted by capital letters.

26 Outline

1. Introduction
2. The sum-product and max-product algorithms
3. More about factor graphs
4. Applications of sum-product & max-product to hidden Markov models
5. Graphs with cycles, continuous variables, ...

27 Example: Hidden Markov Model

p(x_0, x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n) = p(x_0) \prod_{k=1}^{n} p(x_k | x_{k-1}) \, p(y_k | x_{k-1})

[Figure: chain factor graph with nodes p(x_0), p(x_1|x_0), p(x_2|x_1), ... along edges X_0, X_1, X_2, ..., and nodes p(y_1|x_0), p(y_2|x_1), ... attached below with half-edges Y_1, Y_2, ...]

28 Sum-product algorithm applied to HMM: Estimation of Current State

p(x_n | y_1, \dots, y_n) = p(x_n, y_1, \dots, y_n) / p(y_1, \dots, y_n)
  \propto p(x_n, y_1, \dots, y_n)
  = \sum_{x_0, \dots, x_{n-1}} p(x_0, x_1, \dots, x_n, y_1, y_2, \dots, y_n)
  = \overrightarrow{\mu}_{X_n}(x_n).

[Figure (for n = 2): forward sum-product messages on the HMM factor graph, with Y_1 = y_1 and Y_2 = y_2 observed.]
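
On the chain, the forward message obeys the recursion \overrightarrow{\mu}_{X_k}(x_k) = \sum_{x_{k-1}} \overrightarrow{\mu}_{X_{k-1}}(x_{k-1}) \, p(y_k|x_{k-1}) \, p(x_k|x_{k-1}), with \overrightarrow{\mu}_{X_0}(x_0) = p(x_0). A sketch for a discrete HMM with the emission attached to the previous state as on slide 27 (all matrices are made-up examples):

    import numpy as np

    # Discrete HMM matching slide 27: emission p(y_k | x_{k-1}).
    # All numbers are made up for illustration.
    P0 = np.array([0.6, 0.4])                    # p(x_0)
    A  = np.array([[0.7, 0.3], [0.2, 0.8]])      # A[i, j] = p(x_k = j | x_{k-1} = i)
    B  = np.array([[0.9, 0.1], [0.3, 0.7]])      # B[i, y] = p(y_k = y | x_{k-1} = i)
    y  = [0, 1, 1]                               # observed y_1, ..., y_n

    # Forward sum-product messages: mu[x_k] = p(x_k, y_1..y_k).
    mu = P0.copy()
    for yk in y:
        mu = (mu * B[:, yk]) @ A                 # sum over x_{k-1}
    posterior = mu / mu.sum()                    # p(x_n | y_1..y_n)
    print("p(x_n | y_1..y_n) =", posterior)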

29 Backward Message in Chain Rule Model

[Figure: factor graph p_X(x) p_{Y|X}(y|x) with edge X and half-edge Y.]

If Y = y is known (observed):

\overleftarrow{\mu}_X(x) = p_{Y|X}(y|x), the likelihood function.

If Y is unknown:

\overleftarrow{\mu}_X(x) = \sum_y p_{Y|X}(y|x) = 1.

30 Sum-product algorithm applied to HMM: Prediction of Next Output Symbol

p(y_{n+1} | y_1, \dots, y_n) = p(y_1, \dots, y_{n+1}) / p(y_1, \dots, y_n)
  \propto p(y_1, \dots, y_{n+1})
  = \sum_{x_0, x_1, \dots, x_n} p(x_0, x_1, \dots, x_n, y_1, y_2, \dots, y_n, y_{n+1})
  = \overrightarrow{\mu}_{Y_{n+1}}(y_{n+1}).

[Figure (for n = 2): forward messages on the HMM factor graph, with the message emerging on the half-edge Y_3.]

31 Sum-product algorithm applied to HMM: Estimation of Time-k State

p(x_k | y_1, y_2, \dots, y_n) = p(x_k, y_1, y_2, \dots, y_n) / p(y_1, y_2, \dots, y_n)
  \propto p(x_k, y_1, y_2, \dots, y_n)
  = \sum_{x_0, \dots, x_n \text{ except } x_k} p(x_0, x_1, \dots, x_n, y_1, y_2, \dots, y_n)
  = \overrightarrow{\mu}_{X_k}(x_k) \, \overleftarrow{\mu}_{X_k}(x_k).

[Figure (for k = 1): forward and backward messages meeting on edge X_1.]

32 Sum-product algorithm applied to HMM: All States Simultaneously

Computing p(x_k | y_1, \dots, y_n) for all k:

[Figure: one forward and one backward sweep of messages over the whole chain.]

In this application, the sum-product algorithm coincides with the Baum-Welch / BCJR forward-backward algorithm.
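
A sketch of the two sweeps for the same made-up discrete HMM as above, with the message scaling of slide 33 folded in (the missing scale factor is recovered at the end by normalizing):

    import numpy as np

    P0 = np.array([0.6, 0.4])                    # p(x_0)
    A  = np.array([[0.7, 0.3], [0.2, 0.8]])      # p(x_k = j | x_{k-1} = i)
    B  = np.array([[0.9, 0.1], [0.3, 0.7]])      # p(y_k = y | x_{k-1} = i)
    y  = [0, 1, 1]
    n  = len(y)

    # Forward messages fwd[k] proportional to p(x_k, y_1..y_k), rescaled every step.
    fwd = [P0.copy()]
    for yk in y:
        m = (fwd[-1] * B[:, yk]) @ A
        fwd.append(m / m.sum())

    # Backward messages bwd[k] proportional to p(y_{k+1}..y_n | x_k), also rescaled.
    bwd = [np.ones(2) for _ in range(n + 1)]
    for k in range(n - 1, -1, -1):
        m = (A @ bwd[k + 1]) * B[:, y[k]]        # y[k] is y_{k+1} (0-based list)
        bwd[k] = m / m.sum()

    # Posterior state marginals: product of the two messages, then normalize.
    for k in range(n + 1):
        p = fwd[k] * bwd[k]
        print(f"p(x_{k} | y) =", p / p.sum())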

33 Scaling of Messages

In all the examples so far, the final result (such as \overrightarrow{\mu}_{X_k}(x_k) \overleftarrow{\mu}_{X_k}(x_k)) equals the desired quantity (such as p(x_k | y_1, \dots, y_n)) only up to a scale factor.

The missing scale factor \gamma may be recovered at the end from the condition

\sum_{x_k} \gamma \, \overrightarrow{\mu}_{X_k}(x_k) \, \overleftarrow{\mu}_{X_k}(x_k) = 1.

It follows that messages may be scaled freely along the way. Such message scaling is often mandatory to avoid numerical problems.

34 Sum-product algorithm applied to HMM: Probability of the Observation

p(y_1, \dots, y_n) = \sum_{x_0, \dots, x_n} p(x_0, x_1, \dots, x_n, y_1, y_2, \dots, y_n) = \sum_{x_n} \overrightarrow{\mu}_{X_n}(x_n).

This is a number. Scale factors cannot be neglected in this case.

[Figure (for n = 2): forward messages on the HMM factor graph.]

35 Max-product algorithm applied to HMM: MAP Estimate of the State Trajectory

The estimate

(\hat{x}_0, \dots, \hat{x}_n)_{\text{MAP}} = \operatorname{argmax}_{x_0, \dots, x_n} p(x_0, \dots, x_n | y_1, \dots, y_n) = \operatorname{argmax}_{x_0, \dots, x_n} p(x_0, \dots, x_n, y_1, \dots, y_n)

may be obtained by computing the max-marginals

\hat{p}_k(x_k) = \max_{x_0, \dots, x_n \text{ except } x_k} p(x_0, \dots, x_n, y_1, \dots, y_n) = \overrightarrow{\mu}_{X_k}(x_k) \, \overleftarrow{\mu}_{X_k}(x_k)

for all k by forward-backward max-product sweeps. In this example, the max-product algorithm is a time-symmetric version of the Viterbi algorithm with soft output.
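
The same two sweeps as in the sum-product snippet, with sums replaced by maxima, give the max-marginals; taking the componentwise argmax recovers the MAP trajectory when it is unique. A sketch on the same made-up HMM:

    import numpy as np

    P0 = np.array([0.6, 0.4])
    A  = np.array([[0.7, 0.3], [0.2, 0.8]])
    B  = np.array([[0.9, 0.1], [0.3, 0.7]])
    y  = [0, 1, 1]
    n  = len(y)

    # Forward max-product messages (rescaled freely).
    fwd = [P0.copy()]
    for yk in y:
        m = ((fwd[-1] * B[:, yk])[:, None] * A).max(axis=0)
        fwd.append(m / m.max())

    # Backward max-product messages.
    bwd = [np.ones(2) for _ in range(n + 1)]
    for k in range(n - 1, -1, -1):
        m = (A * bwd[k + 1][None, :]).max(axis=1) * B[:, y[k]]
        bwd[k] = m / m.max()

    # Max-marginals and the componentwise MAP estimate of the trajectory.
    x_map = [int(np.argmax(fwd[k] * bwd[k])) for k in range(n + 1)]
    print("MAP state trajectory:", x_map)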

36 Max-product algorithm applied to HMM: MAP Estimate of the State Trajectory cont'd

Computing

\hat{p}_k(x_k) = \max_{x_0, \dots, x_n \text{ except } x_k} p(x_0, \dots, x_n, y_1, \dots, y_n) = \overrightarrow{\mu}_{X_k}(x_k) \, \overleftarrow{\mu}_{X_k}(x_k)

simultaneously for all k:

[Figure: one forward and one backward max-product sweep over the whole chain.]

37 Marginals and Output Edges

Marginals such as \overrightarrow{\mu}_X(x) \overleftarrow{\mu}_X(x) may be viewed as messages out of an output half-edge (without incoming message):

[Figure: edge X split by an equality-constraint node "=" into X' and X'', with an additional output half-edge X.]

\mu_X(x) = \int_{x'} \int_{x''} \overrightarrow{\mu}_X(x') \, \overleftarrow{\mu}_X(x'') \, \delta(x - x') \, \delta(x - x'') \, dx' \, dx'' = \overrightarrow{\mu}_X(x) \, \overleftarrow{\mu}_X(x)

Marginals are computed like messages out of "="-nodes.

38 Outline

1. Introduction
2. The sum-product and max-product algorithms
3. More about factor graphs
4. Applications of sum-product & max-product to hidden Markov models
5. Graphs with cycles, continuous variables, ...

39 Factor Graphs with Cycles? Continuous Variables?

40 Example: Error Correcting Codes (e.g. LDPC Codes)

Codeword x = (x_1, \dots, x_n) \in C \subset \{0, 1\}^n. Received y = (y_1, \dots, y_n) \in \mathbb{R}^n.

p(x, y) = p(y|x) \, \delta_C(x).

Code membership function:

\delta_C(x) = 1 if x \in C, and 0 otherwise.

Parity-check factors, attached to the code bits through random connections:

f_\oplus(z_1, \dots, z_m) = 1 if z_1 + \dots + z_m = 0 mod 2, and 0 otherwise.

Channel model p(y|x), e.g., a memoryless channel

p(y|x) = \prod_{l=1}^{n} p(y_l | x_l).

[Figure: factor graph with parity-check nodes on top, code-bit edges X_1, ..., X_n, and channel-model nodes attached to the observations y_1, ..., y_n.]

Decoding by iterative sum-product message passing.
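
For a parity-check factor the sum-product update has a well-known closed form: writing each normalized binary message as the difference \mu(0) - \mu(1), the outgoing difference is the product of the incoming ones. A small sketch of one check-node and one variable-node update (not a full decoder; all numbers are made up):

    import numpy as np

    def check_node_update(incoming):
        """Sum-product messages out of a parity-check factor (XOR constraint).

        incoming -- list of normalized binary messages, each np.array([mu(0), mu(1)])
        returns  -- list of outgoing messages, one toward each connected variable
        """
        diffs = np.array([m[0] - m[1] for m in incoming])
        out = []
        for j in range(len(incoming)):
            d = np.prod(np.delete(diffs, j))     # product of the other differences
            out.append(np.array([(1 + d) / 2, (1 - d) / 2]))
        return out

    def variable_node_update(incoming):
        """Sum-product message out of an equality-constraint (variable) node."""
        m = np.prod(np.vstack(incoming), axis=0)
        return m / m.sum()

    # Example: a degree-3 check node with made-up channel beliefs.
    msgs = [np.array([0.9, 0.1]), np.array([0.2, 0.8]), np.array([0.6, 0.4])]
    print(check_node_update(msgs))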

41 Example: Magnets, Spin Glasses, etc.

Configuration x = (x_1, \dots, x_n), x_k \in \{0, 1\}. Energy

w(x) = \sum_k \beta_k x_k + \sum_{\text{neighboring pairs } (k,l)} \beta_{k,l} (x_k - x_l)^2

p(x) \propto e^{-w(x)} = \prod_k e^{-\beta_k x_k} \prod_{\text{neighboring pairs } (k,l)} e^{-\beta_{k,l} (x_k - x_l)^2}

42 Example: Hidden Markov Model with Parameter(s)

p(x_0, x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n | \theta) = p(x_0) \prod_{k=1}^{n} p(x_k | x_{k-1}, \theta) \, p(y_k | x_{k-1})

[Figure: the HMM factor graph of slide 27 with an additional variable \Theta shared by all transition nodes p(x_k | x_{k-1}, \theta).]

43 Example: Least-Squares Problems

Minimizing \sum_{k=1}^{n} x_k^2 subject to (linear or nonlinear) constraints is equivalent to maximizing

e^{-\sum_{k=1}^{n} x_k^2} = \prod_{k=1}^{n} e^{-x_k^2}

subject to the given constraints.

[Figure: a "constraints" node with edges X_1, ..., X_n, each terminated by a factor e^{-x_k^2}.]

44 General Linear State Space Model

X_k = A_k X_{k-1} + B_k U_k
Y_k = C_k X_k

[Figure: factor graph of one section of the model: the state edge X_{k-1} passes through a matrix node A_k, the input U_k enters through B_k, the two meet in an addition node producing X_k, and C_k produces the output Y_k; the structure repeats for k+1.]

45 Gaussian Message Passing in Linear Models

Gaussian message passing in linear models encompasses much of classical signal processing and often appears as a component of more complex problems/algorithms.

Note:
1. Gaussianity of messages is preserved in linear models.
2. Includes Kalman filtering and recursive least-squares algorithms.
3. For Gaussian messages, the sum-product (integral-product) algorithm coincides with the max-product algorithm.
4. For jointly Gaussian random variables, MAP estimation = MMSE estimation = LMMSE estimation.
5. Even if X and Y are not jointly Gaussian, the LMMSE estimate of X from Y = y may be obtained by pretending that X and Y are jointly Gaussian (with their actual means and second-order moments) and forming the corresponding MAP estimate.

See "The factor graph approach to model-based signal processing", Proceedings of the IEEE, June
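
A sketch of the forward (filtering) Gaussian messages on the state space model of slide 44, assuming U_k ~ N(0, I) and, as an extra assumption not on the slide, an observation noise with covariance R added to Y_k so that the observed y_k enters as a Gaussian likelihood; this reduces to the standard Kalman filter recursion:

    import numpy as np

    # One forward Gaussian message update on X_k for the model
    #   X_k = A X_{k-1} + B U_k,  observed y_k = C X_k + noise,  U_k ~ N(0, I).
    # The observation noise covariance R is an assumption added for illustration.
    def kalman_step(m, V, y, A, B, C, R):
        """Update the forward message N(m, V) on the state edge."""
        m_pred = A @ m                           # message through the A and + nodes
        V_pred = A @ V @ A.T + B @ B.T           # U_k ~ N(0, I) enters through B
        S = C @ V_pred @ C.T + R                 # innovation covariance
        K = V_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
        m_new = m_pred + K @ (y - C @ m_pred)    # combine with the observation message
        V_new = V_pred - K @ C @ V_pred
        return m_new, V_new

    # Toy example with made-up matrices.
    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    B = np.eye(2) * 0.1
    C = np.array([[1.0, 0.0]])
    R = np.array([[0.5]])
    m, V = np.zeros(2), np.eye(2)
    for y in ([0.9], [2.1], [2.9]):
        m, V = kalman_step(m, V, np.array(y), A, B, C, R)
    print("filtered state mean:", m)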

46 Continuous Variables: Message Types

The following message types are widely applicable.

- Quantization of messages into discrete bins. Infeasible in higher dimensions.
- Single point: the message \mu(x) is replaced by a temporary or final estimate \hat{x}.
- Function value and gradient at a point selected by the receiving node. Allows steepest-ascent (descent) algorithms.
- Gaussians. Works for Kalman filtering.
- Gaussian mixtures.
- List of samples: a pdf can be represented by a list of samples. This data type is the basis of particle filters, but it can be used as a general data type in a graphical model.
- Compound messages: the product of other message types.

47 Particle Methods as Message Passing

Basic idea: represent a probability density f by a list

L = \big( (\hat{x}^{(1)}, w^{(1)}), \dots, (\hat{x}^{(L)}, w^{(L)}) \big)

of weighted samples ("particles").

[Figure: a probability density function f: R \to R^+ and its representation as a list of particles.]

Versatile data type for sum-product, max-product, EM, ... Not sensitive to dimensionality.
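
A sketch of this data type with a made-up target density: a particle list supports the two operations a message needs, evaluating expectations and being resampled into an unweighted list.

    import numpy as np

    rng = np.random.default_rng(0)

    # Represent a density f (here an unnormalized, made-up bimodal example)
    # by L weighted samples drawn from a proposal q = N(0, 3^2).
    def f(x):
        return np.exp(-0.5 * (x - 2.0) ** 2) + 0.5 * np.exp(-0.5 * (x + 1.0) ** 2)

    L = 5000
    samples = rng.normal(0.0, 3.0, size=L)
    q = np.exp(-0.5 * (samples / 3.0) ** 2) / (3.0 * np.sqrt(2 * np.pi))
    weights = f(samples) / q
    weights /= weights.sum()

    # Expectation of any function under f, estimated from the particle list.
    print("E[X]   approx.", np.sum(weights * samples))
    print("E[X^2] approx.", np.sum(weights * samples ** 2))

    # Resampling: draw an unweighted particle list from the weighted one.
    resampled = rng.choice(samples, size=L, replace=True, p=weights)
    print("resampled mean:", resampled.mean())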

48 On Factor Graphs with Cycles

Generally iterative algorithms. For example, alternating maximization

\hat{x}_{\text{new}} = \operatorname{argmax}_x f(x, \hat{y}) and \hat{y}_{\text{new}} = \operatorname{argmax}_y f(\hat{x}, y),

using the max-product algorithm in each iteration.

Iterative sum-product message passing gives excellent results for maximization(!) in some applications (e.g., the decoding of error correcting codes).

Many other useful algorithms can be formulated in message passing form (e.g., J. Dauwels): gradient ascent, Gibbs sampling, expectation maximization, variational methods, clustering by affinity propagation (B. Frey), ...

Factor graphs make it easy to mix and match different algorithmic techniques.

End of this talk.
