Mixture Models for Genomic Data


1 Mixture Models for Genomic Data
S. Robin, AgroParisTech / INRA
École de Printemps en Apprentissage automatique, Baie de Somme, May 2010

2 Outline
1 Some examples
2 Statistical inference of mixture models
3 Independent mixture model
4 Hidden Markov model
5 Mixture for random graphs
6 Variational Bayes inference
7 Some extensions

3 Some examples

4 Some examples / ChIP-chip experiments: ChIP on chip
ChIP = Chromatin Immuno-Precipitation, which aims at detecting protein-DNA interactions.
ChIP-chip: probes corresponding to different loci are spotted on a glass slide.
IP: DNA fragments interacting with the protein of interest. Input: whole genomic DNA. X = log(IP1/IP2).
A non-zero X reveals a differential protein-DNA interaction between samples 1 and 2.

5 Some examples / ChIP-chip experiments: Proposed model
Denoting X_i = log(IP1_i / IP2_i) the signal observed for probe i and Z_i its unknown status, we can assume that
the Z_i's are i.i.d.: Z_i ~ M(1; π), with π_k = Pr{Z_i = k} = Pr{Z_ik = 1};
the X_i's are independent conditionally on the Z_i's: (X_i | Z_i = k) ~ f_k(·) = f(·; γ_k), e.g. f_k = N(µ_k, σ_k²).
We have to estimate {π_k, µ_k, σ_k²}_k and Pr{Z_i = k | X_i}.
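
To make the generative scheme concrete, a minimal simulation sketch in Python/NumPy follows; the two-component setting and all parameter values below are illustrative assumptions, not those of the lecture.

    import numpy as np

    rng = np.random.default_rng(0)
    n, K = 1000, 2
    pi = np.array([0.8, 0.2])          # proportions pi_k (illustrative)
    mu = np.array([0.0, 1.5])          # component means mu_k
    sigma = np.array([0.3, 0.5])       # component standard deviations sigma_k

    Z = rng.choice(K, size=n, p=pi)    # hidden status Z_i ~ M(1; pi)
    X = rng.normal(mu[Z], sigma[Z])    # (X_i | Z_i = k) ~ N(mu_k, sigma_k^2)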

6 Some examples / Accounting for the genomic localisation: HMM
Probes are (almost) equally spaced along the genome, and probes with a large (positive or negative) ratio tend to be clustered.
Proposed model: hidden Markov model (HMM: Baum and Petrie (1966), Churchill (1992)).
The X_i's are still independent conditionally on the Z_i's: (X_i | Z_i = k) ~ f_k,
but the statuses are (Markov-)dependent: {Z_i} ~ MC(π), π_kl = Pr{Z_i = l | Z_{i-1} = k}.

7 Some examples / Regulatory network
Regulatory network = directed graph where
nodes = genes (or groups of genes, e.g. operons),
edges = regulations: {i → j} means i regulates j.
Typical questions: Do some nodes share similar connection profiles? Is there a macroscopic organisation of the network?

8 Some examples / Regulatory network: Proposed model
Denoting X_ij the presence of a regulation from operon i to operon j and Z_i the unknown status of operon i, we can assume that [Daudin et al. (2008)]
the Z_i's are i.i.d.: Z_i ~ M(1; π), with π_k = Pr{Z_i = k} = Pr{Z_ik = 1};
the X_ij's are independent conditionally on the Z's: (X_ij | Z_i = k, Z_j = l) ~ B(γ_kl).
We want to estimate θ = (π, γ) and Pr{Z_i = k | X}.

9 Statistical inference of mixture models: Statistical inference of incomplete data models

10 Statistical inference of mixture models / Model and likelihoods
Notations: X = observed data (typically X = {X_i}); Z = unobserved data; θ = the unknown parameters of the distributions of both Z and X.
Definitions:
Likelihood of the observed data (or observed likelihood): log P(X) = log P(X; θ).
Complete likelihood: log P(X, Z) = log P(X, Z; θ).

11 Statistical inference of mixture models / Maximum likelihood inference
Maximum likelihood estimate: we are looking for arg max_θ log P(X; θ).
Incomplete data model: the calculation of P(X; θ) = Σ_Z P(X, Z; θ) is not always possible, since this sum typically involves K^n terms. The calculation of P(X, Z; θ) is much easier... except that Z is unknown.

12 Statistical inference of mixture models / Variational approach
The Kullback-Leibler divergence between distributions F and G,
KL(F; G) = ∫ F(u) log[F(u)/G(u)] du ≥ 0,
is a non-symmetric dissimilarity measure; it is zero iff F = G.
Lower bound: for any distribution Q(Z), we have [Jordan et al. (1999), Jaakkola (2000)]
log P(X) ≥ log P(X) - KL[Q(Z); P(Z | X)]
= log P(X) - ∫ Q(Z) log Q(Z) dZ + ∫ Q(Z) log P(Z | X) dZ
= - ∫ Q(Z) log Q(Z) dZ + ∫ Q(Z) log P(X, Z) dZ
= H(Q) + E_Q[log P(X, Z)].
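
As a numerical sanity check of this bound, the short Python/NumPy sketch below uses a made-up joint distribution over a binary Z for one fixed observation X (all values are illustrative assumptions): for any Q, H(Q) + E_Q[log P(X, Z)] ≤ log P(X), with equality at Q(Z) = P(Z | X).

    import numpy as np

    p_xz = np.array([0.3, 0.1])        # P(X, Z = k) for the observed X (illustrative)
    log_px = np.log(p_xz.sum())        # log P(X)

    def lower_bound(q):
        """H(Q) + E_Q[log P(X, Z)] for a distribution q over Z."""
        return -(q * np.log(q)).sum() + (q * np.log(p_xz)).sum()

    q_post = p_xz / p_xz.sum()                               # P(Z | X)
    print(lower_bound(np.array([0.5, 0.5])), "<=", log_px)   # strict lower bound
    print(lower_bound(q_post), "=", log_px)                  # tight at the posterior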

13 Statistical inference of mixture models / Variational approach: Consequences
If P(Z | X) can be calculated: taking Q(Z) = P(Z | X) achieves the maximisation of log P(X) through that of E_Q[log P(X, Z)]: E-M algorithm for independent mixtures and hidden Markov models [Dempster et al. (1977), McLachlan and Peel (2000), Cappé et al. (2005)].
If P(Z | X) cannot be calculated: the best lower bound of log P(X) is obtained for Q* = arg min_{Q ∈ Q} KL[Q(Z); P(Z | X)]: mean-field approximation for random graphs.

14 Statistical inference of mixture models / Variational E-M
Expectation step (E-step): calculate P(Z | X; θ), or its approximation Q* = arg min_{Q ∈ Q} KL[Q(Z); P(Z | X; θ)].
Maximisation step (M-step): estimate θ with arg max_θ E_Q[log P(X, Z; θ)], which maximises log P(X) if Q(Z) = P(Z | X), and its lower bound otherwise.

15 Independent mixture model: Model

16 Independent mixture model / Model
Reminder [McLachlan and Peel (2000)]:
the Z_i's are i.i.d. M(1; π), π_k = Pr{Z_i = k}, k = 1..K;
the X_i's are independent conditionally on Z: (X_i | Z_i = k) ~ f(·; γ_k).
Identifiability: the model is invariant under any permutation of the labels {1, ..., K}, so the mixture model has K! equivalent definitions.
Distribution of the observed data: X_i ~ g(x) = Σ_k π_k f(x; γ_k), because Pr{X_i = x} = Σ_k Pr{X_i = x | Z_i = k} Pr{Z_i = k}.

17 Independent mixture model / Model: Dependency structure
Some properties: the {Z_i} are independent; the {X_i} are independent conditionally on Z; the couples {(X_i, Z_i)} are i.i.d.; (X_i, X_j | Z_i = Z_j) are not independent.
[Graphical representation: each hidden label Z_i points only to its own observation X_i; there are no edges between the Z's.]

18 Independent mixture model / Inference: Likelihoods
Observed likelihood:
log P(X; θ) = Σ_i log g(X_i; θ) = Σ_i log[ Σ_k π_k f(X_i; γ_k) ].
Complete likelihood:
log P(X, Z; θ) = log P(Z; θ) + log P(X | Z; θ) = Σ_i Σ_k Z_ik log π_k + Σ_i Σ_k Z_ik log f(X_i; γ_k) = Σ_i Σ_k Z_ik [log π_k + log f(X_i; γ_k)].

19 Independent mixture model / Inference: E-step
Since the couples {(X_i, Z_i)}_i are independent, we can calculate P(Z | X) = Π_i P(Z_i | X_i) = Π_i Π_k τ_ik^{Z_ik}, where τ_ik = Pr{Z_i = k | X} = E_Q[Z_ik]:
τ_ik = Pr{Z_i = k | X_i, θ} = π_k f(X_i; γ_k) / Σ_l π_l f(X_i; γ_l)  (Bayes rule).
Conditional expectation of the complete likelihood ('completed' likelihood):
E_Q[log P(X, Z)] = E_Q[ Σ_i Σ_k Z_ik log(π_k f(X_i; γ_k)) ] = Σ_i Σ_k τ_ik [log π_k + log f(X_i; γ_k)].
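
For the Gaussian case, a minimal E-step sketch in Python/NumPy/SciPy computing the posterior probabilities τ_ik by the Bayes rule above; X, pi, mu and sigma are assumed to come from the simulation sketch given earlier.

    import numpy as np
    from scipy.stats import norm

    def e_step(X, pi, mu, sigma):
        """Return the n x K matrix of posterior probabilities tau_ik."""
        dens = pi * norm.pdf(X[:, None], loc=mu, scale=sigma)   # pi_k * f(X_i; gamma_k)
        return dens / dens.sum(axis=1, keepdims=True)

    tau = e_step(X, pi, mu, sigma)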

20 Independent mixture model / Inference: M-step
We want to maximise E_Q[log P(X, Z)] = Σ_i Σ_k τ_ik [log π_k + log f(X_i; γ_k)] over θ: a weighted version of the usual maximum likelihood estimates (MLE).
Gaussian case, γ_k = (µ_k, σ_k):
π_k = (1/n) Σ_i τ_ik = n_k / n,   µ_k = (1/n_k) Σ_i τ_ik X_i,   σ_k² = (1/n_k) Σ_i τ_ik (X_i - µ_k)².
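
The corresponding M-step and the E-M alternation, continuing the same illustrative Python/NumPy sketch (the starting values are arbitrary assumptions; e_step is the function defined above).

    import numpy as np

    def m_step(X, tau):
        """Weighted MLE updates of (pi_k, mu_k, sigma_k)."""
        n_k = tau.sum(axis=0)
        pi = n_k / len(X)
        mu = (tau * X[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((tau * (X[:, None] - mu) ** 2).sum(axis=0) / n_k)
        return pi, mu, sigma

    # E-M alternation from arbitrary starting values
    pi_h, mu_h, sigma_h = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
    for _ in range(100):
        tau = e_step(X, pi_h, mu_h, sigma_h)
        pi_h, mu_h, sigma_h = m_step(X, tau)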

21 Independent mixture model / Inference: Graphical interpretation
Distributions: g(x) = π_1 f_1(x) + π_2 f_2(x) + π_3 f_3(x).
Posterior probabilities: τ_ik = π_k f_k(x_i) / g(x_i).
[Table: posterior probabilities τ_ik (%) for observations i = 1, 2, 3 and classes k = 1, 2, 3.] Demo.

22 Independent mixture model / Precision of the estimates
In the framework of the regular MLE (i.e. when P(Z | X) can be calculated), the asymptotic variance of the estimates is given by V(θ) = I(θ)^{-1}, where I(θ) = E[ (∂ log P(X; θ)/∂θ)² ] is the Fisher information matrix.
Louis (1982) provides a convenient way to calculate I(θ) based only on the complete likelihood, using the score identity ∂ log P(X; θ)/∂θ = E[ ∂ log P(X, Z; θ)/∂θ | X ] and the fact that this observed score is 0 at the maximum of log P(X; θ).

23 Independent mixture model / Application to ChIP-chip
[Table: estimated parameters (π_k, µ_k, σ_k) for each class k, under the common-variance and the different-variances models.]

24 Independent mixture model / Application to ChIP-chip: Probe classification
[Figures: probe classification under the common-variance and the different-variances models.]
Heterogeneous variances provide a better fit to the distribution, but the classification rule is not very convenient...
Accounting for annotation: when some annotation C_i is available for each probe, the model can account for it through different prior probabilities: π_k^c = Pr{Z_i = k | C_i = c}.

25 Independent mixture model / Application to ChIP-chip: Case of IP/Input experiments
Complete DNA is often used as a reference ('Input') to detect protein-DNA interaction.
The relation IP = f(Input) is seemingly linear, but the difference IP - Input is not (always) sufficient for classification... because the shape of the relation depends on the probe status.
[Figure: histogram of log(IP/Input).]

26 Independent mixture model / Application to ChIP-chip: Mixture of regressions
This mixture states that (IP_i | Z_i = k) ~ N(a_k + b_k Input_i, σ²), where k = 0 (normal) or 1 (enriched).
The estimates of a_k and b_k are simply weighted versions of the standard OLS intercept and slope estimates.
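
A sketch (Python/NumPy) of the corresponding weighted least-squares update for one component k; the responsibilities w = tau[:, k] are assumed to come from an E-step as above, and the variable names are illustrative.

    import numpy as np

    def weighted_regression(ip, inp, w):
        """Weighted OLS estimates of the intercept a_k and slope b_k."""
        D = np.column_stack([np.ones_like(inp), inp])      # design matrix [1, Input_i]
        W = np.diag(w)                                      # weights tau_ik
        a_k, b_k = np.linalg.solve(D.T @ W @ D, D.T @ W @ ip)
        return a_k, b_k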

27 Hidden Markov model: Model

28 Hidden Markov model / Model
Reminder:
{Z_i} ~ MC(π), π_kl = Pr{Z_i = l | Z_{i-1} = k}; Z_1 ~ M(1; ν) (e.g. ν = stationary distribution of π);
the X_i's are independent conditionally on Z: (X_i | Z_i = k) ~ f(·; γ_k).
Distribution of the observed data: X_i ~ g(x) = Σ_k ν_k^i f(x; γ_k), since Z_i ~ M(1; ν^i) where ν^i = ν π^{i-1}.
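
A short simulation sketch (Python/NumPy) of this HMM with Gaussian emissions; the transition matrix, initial distribution and emission parameters below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    n, K = 500, 2
    Pi = np.array([[0.95, 0.05],        # pi_kl = Pr{Z_i = l | Z_{i-1} = k}
                   [0.20, 0.80]])
    nu = np.array([0.8, 0.2])           # distribution of Z_1
    mu, sigma = np.array([0.0, 1.5]), np.array([0.3, 0.5])

    Z = np.empty(n, dtype=int)
    Z[0] = rng.choice(K, p=nu)
    for i in range(1, n):
        Z[i] = rng.choice(K, p=Pi[Z[i - 1]])
    X = rng.normal(mu[Z], sigma[Z])     # (X_i | Z_i = k) ~ N(mu_k, sigma_k^2)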

29 Hidden Markov model / Model: Dependency structure
Some properties: (Z_{i-1}, Z_i) are not independent; (X_{i-1}, X_i) are not independent; (X_{i-1}, X_i) are independent conditionally on Z_i.
[Graphical representation: Markov chain ... Z_{i-1} -> Z_i -> Z_{i+1} ..., each Z_i pointing to its own observation X_i.]

30 Hidden Markov model / Inference: Likelihood
Complete likelihood:
P(X, Z) = P(Z) P(X | Z) = [ Π_k ν_k^{Z_1k} ] [ Π_{i>1} Π_{k,l} π_kl^{Z_{i-1,k} Z_{i,l}} ] [ Π_i Π_k f_k(X_i)^{Z_ik} ],
log P(X, Z) = Σ_k Z_1k log ν_k + Σ_{i>1} Σ_{k,l} Z_{i-1,k} Z_{i,l} log π_kl + Σ_i Σ_k Z_ik log f_k(X_i).
Completed likelihood:
E_Q[log P(X, Z)] = Σ_k E_Q[Z_1k] log ν_k + Σ_{i>1} Σ_{k,l} E_Q[Z_{i-1,k} Z_{i,l}] log π_kl + Σ_i Σ_k E_Q[Z_ik] log f_k(X_i).

31 Hidden Markov model / Inference: E-step
For Q(Z) = P(Z | X), we need to compute τ_ik = E_Q[Z_ik] and η_ikl = E_Q[Z_{i-1,k} Z_{i,l}].
Forward equation: denoting F_il = Pr{Z_i = l | X_1^i}, we have [Devijver (1985)]
F_il ∝ f_l(X_i) Σ_k F_{i-1,k} π_kl.
Backward equation: once we have all the F_ik, we get the τ_ik as
τ_ik = F_ik Σ_l π_kl τ_{i+1,l} / G_{i+1,l},  with  G_{i+1,l} = Σ_k π_kl F_ik.
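
A minimal forward-backward sketch in Python/NumPy/SciPy implementing these two recursions for Gaussian emissions; Pi, nu, mu and sigma are assumed to be those of the HMM simulation sketch above.

    import numpy as np
    from scipy.stats import norm

    def forward_backward(X, Pi, nu, mu, sigma):
        """Return tau[i, k] = Pr{Z_i = k | X_1, ..., X_n}."""
        n, K = len(X), len(nu)
        emis = norm.pdf(X[:, None], loc=mu, scale=sigma)    # f_k(X_i)

        F = np.zeros((n, K))                                # F[i, l] = Pr{Z_i = l | X_1^i}
        F[0] = nu * emis[0]
        F[0] /= F[0].sum()
        for i in range(1, n):
            F[i] = emis[i] * (F[i - 1] @ Pi)                # forward recursion
            F[i] /= F[i].sum()

        tau = np.zeros((n, K))
        tau[-1] = F[-1]
        for i in range(n - 2, -1, -1):                      # backward recursion
            G = F[i] @ Pi                                   # G[l] = Pr{Z_{i+1} = l | X_1^i}
            tau[i] = F[i] * (Pi @ (tau[i + 1] / G))
        return tau

    tau = forward_backward(X, Pi, nu, mu, sigma)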

32 Hidden Markov model / Inference: Forward recursion, proof
F_1l = P(Z_1l | X_1) = P(X_1 | Z_1l) P(Z_1l) / P(X_1) ∝ ν_l f_l(X_1).
F_il = P(Z_il | X_1^i) = Σ_k P(Z_{i-1,k}, Z_il | X_1^i)
= Σ_k P(Z_il, Z_{i-1,k}, X_1^i) / P(X_1^i)
= Σ_k P(X_i | Z_il) P(Z_il | Z_{i-1,k}) P(Z_{i-1,k} | X_1^{i-1}) [ P(X_1^{i-1}) / P(X_1^i) ]
∝ Σ_k f_l(X_i) π_kl F_{i-1,k}.

33 Hidden Markov model / Inference: Backward recursion, proof
τ_nk = P(Z_nk | X) = P(Z_nk | X_1^n) = F_nk.
τ_ik = P(Z_ik | X_1^n) = Σ_l P(Z_ik, Z_{i+1,l}, X_1^n) / P(X_1^n)
= Σ_l P(X_1^i) P(Z_ik | X_1^i) P(Z_{i+1,l} | Z_ik) P(X_{i+1}^n | Z_{i+1,l}) / P(X_1^n)
= F_ik Σ_l π_kl P(X_1^i) P(X_{i+1}^n | Z_{i+1,l}) / P(X_1^n),
and
P(X_1^i) P(X_{i+1}^n | Z_{i+1,l}) / P(X_1^n) = P(X_1^i) P(X_{i+1}^n | Z_{i+1,l}) P(X_1^i | Z_{i+1,l}) / [ P(X_1^n) P(X_1^i | Z_{i+1,l}) ]
= P(X_1^i) P(X_1^n | Z_{i+1,l}) / [ P(X_1^n) P(X_1^i | Z_{i+1,l}) ]
= P(Z_{i+1,l} | X_1^n) / P(Z_{i+1,l} | X_1^i)
= τ_{i+1,l} / P(Z_{i+1,l} | X_1^i),
where P(Z_{i+1,l} | X_1^i) = Σ_k P(Z_{i+1,l}, Z_ik | X_1^i) = Σ_k F_ik π_kl.

34 Hidden Markov model / Application to ChIP-chip: heterogeneous variances
[Figures: distribution fit of the log-ratio (dotted = mixture) and classification along the genome position.]

35 Hidden Markov model / Application to ChIP-chip: common variance
[Figures: distribution fit of the log-ratio (dotted = mixture) and classification along the genome position.]

36 Hidden Markov model / Application to ChIP-chip: One step further in modelling
The observed signal is actually bi-dimensional: IPwt = signal observed at each probe in the wild type, IPmut = signal observed at each probe in the mutant.
A joint modelling makes it possible to distinguish between identical probes (same signal in both lines) and non-methylated probes (no signal in either line). Source: C. Bérard.

37 Hidden Markov model / Application to ChIP-chip: Comparison with genome annotation
The probe classification provided by the HMM is more consistent with their spatial organisation.
The methylation mark in META1 is lost in the mutant, mostly in the left-end part, near the regulatory region. Legend: lost, enriched, normal.

38 Mixture model for random graphs: Model

39 Mixture model for random graphs / Model
Reminder:
the Z_i's are i.i.d. M(1; π), π_k = Pr{Z_i = k}, k = 1..K;
the X_ij's are independent conditionally on Z: (X_ij | Z_i = k, Z_j = l) ~ f(γ_kl).
Distribution of the observed data:
X_ij ~ g(x) = Σ_{k,l} π_k π_l f(x; γ_kl), e.g. B( Σ_{k,l} π_k π_l γ_kl );
(X_ij | Z_i = k) ~ Σ_l π_l f(x; γ_kl), e.g. B( Σ_l π_l γ_kl ).
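
A short Python/NumPy sketch simulating this mixture for random graphs; the group proportions and the connectivity matrix below are illustrative assumptions, and self-loops are simply set to zero.

    import numpy as np

    rng = np.random.default_rng(2)
    n, K = 100, 2
    pi = np.array([0.6, 0.4])                  # group proportions pi_k
    gamma = np.array([[0.25, 0.02],            # gamma_kl = Pr{X_ij = 1 | Z_i = k, Z_j = l}
                      [0.02, 0.15]])

    Z = rng.choice(K, size=n, p=pi)            # hidden group of each node
    X = rng.binomial(1, gamma[Z][:, Z])        # adjacency matrix, X_ij ~ B(gamma_{Z_i Z_j})
    np.fill_diagonal(X, 0)                     # no self-loops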

40 Mixture model for random graphs / Model: Dependency structure
[Figures: graphical representation of P(Z) P(X | Z), its moral graph [Lauritzen (1996)], and the conditional dependency of Z given X, P(Z | X), drawn on nodes Z_i, Z_j, Z_k and edges X_ij, X_jk, X_ik.]
The conditional dependency graph of Z given X is a clique: no factorisation can be hoped for, so P(Z | X) can only be approximated.

41 Mixture model for random graphs / Inference: Likelihood
Complete likelihood:
log P(X, Z) = Σ_{i,k} Z_ik log π_k + Σ_{i,j} Σ_{k,l} Z_ik Z_jl [ X_ij log γ_kl + (1 - X_ij) log(1 - γ_kl) ].
Completed likelihood: denoting τ_ik = E_Q(Z_ik),
E_Q[log P(X, Z)] = Σ_{i,k} τ_ik log π_k + Σ_{i,j} Σ_{k,l} τ_ik τ_jl [ X_ij log γ_kl + (1 - X_ij) log(1 - γ_kl) ].
M-step: (again) a weighted version of the MLE.

42 Mixture model for random graphs / Inference: Approximation of P(Z | X)
Problem: we are looking for Q* = arg min_{Q ∈ Q} KL[Q(Z); P(Z | X)]. The optimum over all possible distributions is Q*(Z) = P(Z | X)... which cannot be calculated.
We restrict ourselves to the set of factorisable distributions:
Q = { Q : Q(Z) = Π_i Q_i(Z_i) = Π_i Π_k τ_ik^{Z_ik} }.
Q is then characterised by the set of parameters τ_ik ≈ Pr{Z_i = k | X}. The optimal τ_ik's can be found using standard (constrained) optimisation techniques.

43 Mixture model for random graphs / Inference: Optimisation
The optimal τ_ik's must satisfy
∂/∂τ_ik { KL[Q(Z); P(Z | X)] + Σ_i λ_i ( Σ_k τ_ik - 1 ) } = 0,
which leads to the fixed-point relation
τ_ik ∝ π_k Π_{j≠i} Π_l [ γ_kl^{X_ij} (1 - γ_kl)^{1 - X_ij} ]^{τ_jl},
also known as the mean-field approximation in physics.
Intuitive interpretation:
Pr{Z_i = k | X, Z_{-i}} ∝ π_k Π_{j≠i} Π_l [ γ_kl^{X_ij} (1 - γ_kl)^{1 - X_ij} ]^{Z_jl}.
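
A sketch (Python/NumPy) of one sweep of this fixed-point update, working on the log scale for numerical stability. It follows the displayed relation literally (only the X_ij term, as written above); X, pi and gamma are assumed to be those of the simulation sketch given earlier, and tau is a current n x K matrix of variational parameters.

    import numpy as np

    def mean_field_update(X, tau, pi, gamma):
        """One fixed-point sweep over the variational parameters tau_ik."""
        n, K = tau.shape
        log_g, log_1mg = np.log(gamma), np.log(1 - gamma)
        new_tau = np.zeros_like(tau)
        for i in range(n):
            mask = np.arange(n) != i                       # exclude j = i
            A = X[i, mask][:, None, None]                  # X_ij, shaped for broadcasting
            # term[j, k, l] = tau_jl * [ X_ij log g_kl + (1 - X_ij) log(1 - g_kl) ]
            term = tau[mask][:, None, :] * (A * log_g + (1 - A) * log_1mg)
            log_t = np.log(pi) + term.sum(axis=(0, 2))
            log_t -= log_t.max()                           # stabilise before exponentiating
            new_tau[i] = np.exp(log_t) / np.exp(log_t).sum()
        return new_tau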

44 Mixture model for random graphs / Application to regulatory networks
[Table: estimated connectivities γ_kl (%) and group proportions α (%) (source: F. Picard).]

45 Variational Bayes inference: Bayesian inference

46 Variational Bayes inference / Bayesian inference
Bayesian point of view: the parameter θ itself is random, θ ~ P(θ), where P(θ) is the prior distribution of θ.
Bayesian inference: the goal is then to calculate the posterior distribution P(θ | X) = P(θ) P(X | θ) / P(X).
Its explicit calculation is possible in nice cases, e.g. exponential family with conjugate prior. Monte Carlo (e.g. MCMC) sampling is often used to estimate it.

47 Variational Bayes inference / Bayesian inference: Incomplete data model
Hierarchical modelling: the model is typically defined with
the prior distribution of θ: P(θ);
the conditional distribution of the unobserved Z: P(Z | θ);
the conditional distribution of the observed X: P(X | Z, θ).
Inference: the goal is now to calculate (or estimate) the joint conditional distribution
P(Z, θ | X) = P(θ) P(Z | θ) P(X | Z, θ) / P(X),
which is often intractable, even when P(Z | X, θ) can be calculated (e.g. independent mixture models).

48 Variational Bayes inference / VB-EM
Exponential family / conjugate prior: if
log P(θ) = φ(θ) · ν + cst,   log P(X, Z | θ) = φ(θ) · u(X, Z) + cst.
Variational optimisation: the best approximate distribution
Q* = arg min_{Q ∈ Q} KL[Q(Z, θ); P(Z, θ | X)]
within the class of factorisable distributions Q = { Q : Q(Z, θ) = Q_Z(Z) Q_θ(θ) }
can be recovered via a variational Bayes E-M algorithm (VB-EM) [Beal and Ghahramani (2003)].

49 Variational Bayes inference / VB-EM algorithm
The approximate conditional distributions Q_Z(Z) and Q_θ(θ) are alternately updated.
VB-M step: approximate posterior of θ: log Q_θ(θ) = φ(θ) · { E_{Q_Z}[u(X, Z)] + ν } + cst.
VB-E step: approximate conditional distribution of Z: log Q_Z(Z) = E_{Q_θ}[φ(θ)] · u(X, Z) + cst.
General properties: still not well understood. Consistency in some particular cases. VB-EM generally tends to underestimate the posterior variance of θ. The quality obviously depends on the approximation Q(θ, Z).
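
To illustrate the VB-E / VB-M alternation, here is a minimal sketch (Python/NumPy/SciPy) for a toy conjugate setting chosen for brevity, not the model of the lecture: a univariate Gaussian mixture with known component standard deviation, a Dirichlet prior on π and Gaussian priors on the means; all hyperparameter values are assumptions.

    import numpy as np
    from scipy.special import digamma

    def vb_em(X, K, sigma=0.5, alpha0=1.0, m0=0.0, s0=10.0, n_iter=100):
        """VB-EM for a K-component Gaussian mixture with known std sigma."""
        rng = np.random.default_rng(0)
        tau = rng.dirichlet(np.ones(K), size=len(X))        # initial Q_Z
        for _ in range(n_iter):
            # VB-M step: Q_pi = Dirichlet(alpha), Q_mu_k = N(m_k, s2_k)
            Nk = tau.sum(axis=0)
            alpha = alpha0 + Nk
            s2 = 1.0 / (1.0 / s0**2 + Nk / sigma**2)
            m = s2 * (m0 / s0**2 + (tau * X[:, None]).sum(axis=0) / sigma**2)
            # VB-E step: log tau_ik = E[log pi_k] + E[log f(X_i; mu_k)] + cst
            e_log_pi = digamma(alpha) - digamma(alpha.sum())
            e_log_f = -0.5 * ((X[:, None] - m)**2 + s2) / sigma**2
            log_tau = e_log_pi + e_log_f
            log_tau -= log_tau.max(axis=1, keepdims=True)
            tau = np.exp(log_tau)
            tau /= tau.sum(axis=1, keepdims=True)
        return alpha, m, s2, tau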

50 Variational Bayes inference / VB-EM: Application to mixtures for networks
[Figure: credibility intervals with a mixture of 2 groups of nodes, for π_1, γ_11, γ_12 and γ_22.]
For all parameters, the VB-EM posterior credibility intervals achieve the nominal level (90%) as soon as n ≥ 25: the VB-EM approximation works well, at least for graphs.

51 Variational Bayes inference / Mixture for networks: Approximate posterior distribution

52 Some extensions: Model selection

53 Some extensions / Model selection
Problem: the number of groups K is often not known a priori.
Model fit: the observed log-likelihood L_K(X) = log P(X; {θ_1, ..., θ_K}), evaluated at the estimated parameters, is not a sufficient measure of how well the model fits the data, since it always increases with K.
Penalised criteria: a penalty term has to be added to avoid over-fitting:
BIC(K) = L_K - log(n) (#param.) / 2,
ICL(K) = BIC(K) - H[P(Z | X)].
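
A small sketch (Python/NumPy/SciPy) computing these two criteria for the Gaussian mixture fitted earlier; the parameter count (K - 1) + 2K assumes free proportions, means and variances.

    import numpy as np
    from scipy.stats import norm

    def bic_icl(X, pi, mu, sigma):
        """BIC and ICL criteria for a fitted K-component Gaussian mixture."""
        n, K = len(X), len(pi)
        dens = pi * norm.pdf(X[:, None], loc=mu, scale=sigma)
        loglik = np.log(dens.sum(axis=1)).sum()             # L_K(X)
        n_param = (K - 1) + 2 * K                            # proportions + means + variances
        bic = loglik - np.log(n) * n_param / 2
        tau = dens / dens.sum(axis=1, keepdims=True)         # Pr{Z_i = k | X_i}
        entropy = -(tau * np.log(np.clip(tau, 1e-300, None))).sum()
        return bic, bic - entropy                            # (BIC, ICL)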

54 References
Baum, L. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Statist.
Beal, M. J. and Ghahramani, Z. (2003). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. In Bayesian Statistics 7, Oxford University Press.
Cappé, O., Moulines, E. and Rydén, T. (2005). Inference in Hidden Markov Models. Springer.
Churchill, G. A. (1992). Hidden Markov chains and the analysis of genome structure. Computers Chem.
Daudin, J.-J., Picard, F. and Robin, S. (2008). A mixture model for random graphs. Stat. Comput. 18(2).
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B.
Devijver, P. (1985). Baum's forward-backward algorithm revisited. Pattern Recogn. Lett.
Jaakkola, T. (2000). Tutorial on variational approximation methods. In Advanced Mean Field Methods: Theory and Practice. MIT Press.
Jordan, M. I., Ghahramani, Z., Jaakkola, T. and Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning 37(2).
Lauritzen, S. (1996). Graphical Models. Oxford Statistical Science Series. Clarendon Press.
Louis, T. (1982). Finding the observed information matrix when using the EM algorithm. J. R. Statist. Soc. B 44(2).
McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley.
