Dynamic Topic Models


David M. Blei
Computer Science Department, Princeton University, Princeton, NJ 08544, USA

John D. Lafferty
School of Computer Science, Carnegie Mellon University, Pittsburgh PA 15213, USA

Appearing in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA. Copyright 2006 by the author(s)/owner(s).

Abstract

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.

1. Introduction

Managing the explosion of electronic document archives requires new tools for automatically organizing, searching, indexing, and browsing large collections. Recent research in machine learning and statistics has developed new techniques for finding patterns of words in document collections using hierarchical probabilistic models (Blei et al., 2003; McCallum et al., 2004; Rosen-Zvi et al., 2004; Griffiths and Steyvers, 2004; Buntine and Jakulin, 2004; Blei and Lafferty). These models are called topic models because the discovered patterns often reflect the underlying topics which combined to form the documents. Such hierarchical probabilistic models are easily generalized to other kinds of data; for example, topic models have been used to analyze images (Fei-Fei and Perona, 2005; Sivic et al., 2005), biological data (Pritchard et al., 2000), and survey data (Erosheva).

In an exchangeable topic model, the words of each document are assumed to be independently drawn from a mixture of multinomials. The mixing proportions are randomly drawn for each document; the mixture components, or topics, are shared by all documents. Thus, each document reflects the components with different proportions. These models are a powerful method of dimensionality reduction for large collections of unstructured documents. Moreover, posterior inference at the document level is useful for information retrieval, classification, and topic-directed browsing.

Treating words exchangeably is a simplification that is consistent with the goal of identifying the semantic themes within each document. For many collections of interest, however, the implicit assumption of exchangeable documents is inappropriate. Document collections such as scholarly journals, email, news articles, and search query logs all reflect evolving content. For example, the Science article "The Brain of Professor Laborde" may be on the same scientific path as the article "Reshaping the Cortical Motor Map by Unmasking Latent Intracortical Connections," but the study of neuroscience looked much different in 1903 than it did when the later article was published. The themes in a document collection evolve over time, and it is of interest to explicitly model the dynamics of the underlying topics.

In this paper, we develop a dynamic topic model which captures the evolution of topics in a sequentially organized corpus of documents. We demonstrate its applicability by analyzing over 100 years of OCR'ed articles from the journal Science, which was founded in 1880 by Thomas Edison and has been published through the present. Under this model, articles are grouped by year, and each year's articles arise from a set of topics that have evolved from the last year's topics.

In the subsequent sections, we extend classical state space models to specify a statistical model of topic evolution. We then develop efficient approximate posterior inference techniques for determining the evolving topics from a sequential collection of documents. Finally, we present qualitative results that demonstrate how dynamic topic models allow the exploration of a large document collection in new ways, and quantitative results that demonstrate greater predictive accuracy when compared with static topic models.

2. Dynamic Topic Models

While traditional time series modeling has focused on continuous data, topic models are designed for categorical data. Our approach is to use state space models on the natural parameter space of the underlying topic multinomials, as well as on the natural parameters for the logistic normal distributions used for modeling the document-specific topic proportions.

First, we review the underlying statistical assumptions of a static topic model, such as latent Dirichlet allocation (LDA) (Blei et al., 2003). Let $\beta_{1:K}$ be $K$ topics, each of which is a distribution over a fixed vocabulary. In a static topic model, each document is assumed drawn from the following generative process:

1. Choose topic proportions $\theta$ from a distribution over the $(K-1)$-simplex, such as a Dirichlet.
2. For each word:
   (a) Choose a topic assignment $Z \sim \mathrm{Mult}(\theta)$.
   (b) Choose a word $W \sim \mathrm{Mult}(\beta_z)$.

This process implicitly assumes that the documents are drawn exchangeably from the same set of topics. For many collections, however, the order of the documents reflects an evolving set of topics. In a dynamic topic model, we suppose that the data is divided by time slice, for example by year. We model the documents of each slice with a $K$-component topic model, where the topics associated with slice $t$ evolve from the topics associated with slice $t-1$.

For a $K$-component model with $V$ terms, let $\beta_{t,k}$ denote the $V$-vector of natural parameters for topic $k$ in slice $t$. The usual representation of a multinomial distribution is by its mean parameterization. If we denote the mean parameter of a $V$-dimensional multinomial by $\pi$, the $i$th component of the natural parameter is given by the mapping $\beta_i = \log(\pi_i / \pi_V)$. In typical language modeling applications, Dirichlet distributions are used to model uncertainty about the distributions over words. However, the Dirichlet is not amenable to sequential modeling.
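The mapping between the mean and natural parameterizations can be made concrete with a small numerical sketch. The function names `mean_to_natural` and `natural_to_mean` and the example vector are illustrative assumptions, not taken from the paper; the inverse map is the usual normalized exponential.

```python
import numpy as np

def mean_to_natural(pi):
    """Map mean parameters pi (a point on the simplex) to natural parameters.

    Uses beta_i = log(pi_i / pi_V), so the last component is always 0.
    """
    pi = np.asarray(pi, dtype=float)
    return np.log(pi) - np.log(pi[-1])

def natural_to_mean(beta):
    """Inverse map: pi_i = exp(beta_i) / sum_j exp(beta_j)."""
    beta = np.asarray(beta, dtype=float)
    expd = np.exp(beta - beta.max())  # subtract the max for numerical stability
    return expd / expd.sum()

# Hypothetical three-word vocabulary.
pi = np.array([0.5, 0.3, 0.2])
beta = mean_to_natural(pi)
pi_back = natural_to_mean(beta)
```

Because the inverse map normalizes, adding a constant to every component of `beta` leaves the recovered distribution unchanged. The natural parameters are unconstrained real numbers, which is what makes Gaussian noise a convenient way to perturb them.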
Instead, we chain the natural parameters of each topic $\beta_{t,k}$ in a state space model that evolves with Gaussian noise; the simplest version of such a model is

$$\beta_{t,k} \mid \beta_{t-1,k} \sim \mathcal{N}(\beta_{t-1,k}, \sigma^2 I). \tag{1}$$

Our approach is thus to model sequences of compositional random variables by chaining Gaussian distributions in a dynamic model and mapping the emitted values to the simplex. This is an extension of the logistic normal distribution (Aitchison, 1982) to time-series simplex data (West and Harrison).

Figure 1. Graphical representation of a dynamic topic model (for three time slices). Each topic's natural parameters $\beta_{t,k}$ evolve over time, together with the mean parameters $\alpha_t$ of the logistic normal distribution for the topic proportions.

In LDA, the document-specific topic proportions $\theta$ are drawn from a Dirichlet distribution. In the dynamic topic model, we use a logistic normal with mean $\alpha$ to express uncertainty over proportions. The sequential structure between models is again captured with a simple dynamic model

$$\alpha_t \mid \alpha_{t-1} \sim \mathcal{N}(\alpha_{t-1}, \delta^2 I). \tag{2}$$

For simplicity, we do not model the dynamics of topic correlation, as was done for static models by Blei and Lafferty.

By chaining together topics and topic proportion distributions, we have sequentially tied a collection of topic models. The generative process for slice $t$ of a sequential corpus is thus as follows:

1. Draw topics $\beta_t \mid \beta_{t-1} \sim \mathcal{N}(\beta_{t-1}, \sigma^2 I)$.
2. Draw $\alpha_t \mid \alpha_{t-1} \sim \mathcal{N}(\alpha_{t-1}, \delta^2 I)$.
3. For each document:
   (a) Draw $\eta \sim \mathcal{N}(\alpha_t, a^2 I)$.
   (b) For each word:
      i. Draw $Z \sim \mathrm{Mult}(\pi(\eta))$.
      ii. Draw $W_{t,d,n} \sim \mathrm{Mult}(\pi(\beta_{t,z}))$.

Note that $\pi$ maps the multinomial natural parameters to the mean parameters,

$$\pi(\beta_{k,t})_w = \frac{\exp(\beta_{k,t,w})}{\sum_{w} \exp(\beta_{k,t,w})}.$$

The graphical model for this generative process is shown in Figure 1. When the horizontal arrows are removed, breaking the time dynamics, the graphical model reduces to a set of independent topic models. With time dynamics, the $k$th topic at slice $t$ has smoothly evolved from the $k$th topic at slice $t-1$.

For clarity of presentation, we now focus on a model with $K$ dynamic topics evolving as in (1), and where the topic proportion model is fixed at a Dirichlet. The technical issues associated with modeling the topic proportions in a time series as in (2) are essentially the same as those for chaining the topics together.

3. Approximate Inference

Working with time series over the natural parameters enables the use of Gaussian models for the time dynamics; however, due to the nonconjugacy of the Gaussian and multinomial models, posterior inference is intractable. In this section, we present a variational method for approximate posterior inference. We use variational methods as deterministic alternatives to stochastic simulation, in order to handle the large data sets typical of text analysis. While Gibbs sampling has been effectively used for static topic models (Griffiths and Steyvers, 2004), nonconjugacy makes sampling methods more difficult for this dynamic model.

The idea behind variational methods is to optimize the free parameters of a distribution over the latent variables so that the distribution is close in Kullback-Leibler (KL) divergence to the true posterior; this distribution can then be used as a substitute for the true posterior. In the dynamic topic model, the latent variables are the topics $\beta_{t,k}$, mixture proportions $\theta_{t,d}$, and topic indicators $z_{t,d,n}$. The variational distribution reflects the group structure of the latent variables. There are variational parameters for each topic's sequence of multinomial parameters, and variational parameters for each of the document-level latent variables. The approximate variational posterior is

$$\prod_{k=1}^{K} q(\beta_{k,1}, \ldots, \beta_{k,T} \mid \hat\beta_{k,1}, \ldots, \hat\beta_{k,T}) \times \prod_{t=1}^{T} \left( \prod_{d=1}^{D_t} q(\theta_{t,d} \mid \gamma_{t,d}) \prod_{n=1}^{N_{t,d}} q(z_{t,d,n} \mid \phi_{t,d,n}) \right). \tag{3}$$

In the commonly used mean-field approximation, each latent variable is considered independently of the others.
In the variational distribution of $\{\beta_{k,1}, \ldots, \beta_{k,T}\}$, however, we retain the sequential structure of the topic by positing a dynamic model with Gaussian "variational observations" $\{\hat\beta_{k,1}, \ldots, \hat\beta_{k,T}\}$. These parameters are fit to minimize the KL divergence between the resulting posterior, which is Gaussian, and the true posterior, which is not Gaussian. (A similar technique for Gaussian processes is described in Snelson and Ghahramani.)

Figure 2. A graphical representation of the variational approximation for the time series topic model of Figure 1. The variational parameters $\hat\beta$ and $\hat\alpha$ are thought of as the outputs of a Kalman filter, or as observed data in a nonparametric regression setting.

The variational distribution of the document-level latent variables follows the same form as in Blei et al. (2003). Each proportion vector $\theta_{t,d}$ is endowed with a free Dirichlet parameter $\gamma_{t,d}$, each topic indicator $z_{t,d,n}$ is endowed with a free multinomial parameter $\phi_{t,d,n}$, and optimization proceeds by coordinate ascent. The updates for the document-level variational parameters have a closed form; we use the conjugate gradient method to optimize the topic-level variational observations. The resulting variational approximation for the natural topic parameters $\{\beta_{k,1}, \ldots, \beta_{k,T}\}$ incorporates the time dynamics; we describe one approximation based on a Kalman filter, and a second based on wavelet regression.

3.1. Variational Kalman Filtering

The view of the variational parameters as outputs is based on the symmetry properties of the Gaussian density, $f_{\mu,\sigma}(x) = f_{x,\sigma}(\mu)$, which enables the use of the standard forward-backward calculations for linear state space models. The graphical model and its variational approximation are shown in Figure 2. Here the triangles denote variational parameters; they can be thought of as "hypothetical outputs" of the Kalman filter, to facilitate calculation.
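To keep the model being approximated in view, the following is a minimal simulation sketch of the simplified setting adopted at the start of Section 3: $K$ topics whose natural parameters follow the random walk (1), with Dirichlet topic proportions. The function name `simulate_dtm`, the default sizes, the Dirichlet parameter, and the standard normal initialization of the first slice are all assumptions made for illustration; the paper does not specify them.

```python
import numpy as np

def softmax(x):
    """Map natural parameters to a probability vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def simulate_dtm(T=3, K=2, V=6, docs_per_slice=4, words_per_doc=20,
                 sigma=0.1, dirichlet_alpha=0.5, seed=0):
    """Simulate word indices from K random-walk topics over T time slices."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: natural parameters of slice 1 are standard normal.
    beta = rng.normal(0.0, 1.0, size=(K, V))
    corpus = []
    for t in range(T):
        if t > 0:
            # Random-walk step on the natural parameters, as in (1).
            beta = beta + rng.normal(0.0, sigma, size=(K, V))
        topics = np.array([softmax(b) for b in beta])
        slice_docs = []
        for _ in range(docs_per_slice):
            # Dirichlet topic proportions (the simplified proportion model).
            theta = rng.dirichlet(np.full(K, dirichlet_alpha))
            z = rng.choice(K, size=words_per_doc, p=theta)
            words = np.array([rng.choice(V, p=topics[k]) for k in z])
            slice_docs.append(words)
        corpus.append(slice_docs)
    return corpus

corpus = simulate_dtm()
```

Here `corpus[t][d]` is an array of word indices for document `d` in slice `t`. Posterior inference runs this process in reverse: given only the words, it recovers a distribution over the `beta` sequences.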
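To explain the main idea behind this technique in a simpler setting, consider the model where unigram models $\beta_t$ (in the natural parameterization) evolve over time. In this model there are no topics and thus no mixing parameters. The calculations are simpler versions of those we need for the more general latent variable models, but exhibit the essential features. Our state space model is

$$\beta_t \mid \beta_{t-1} \sim \mathcal{N}(\beta_{t-1}, \sigma^2 I)$$
$$w_{t,n} \mid \beta_t \sim \mathrm{Mult}(\pi(\beta_t))$$

and we form the variational state space model where

$$\hat\beta_t \mid \beta_t \sim \mathcal{N}(\beta_t, \hat\nu_t^2 I).$$

The variational parameters are $\hat\beta_t$ and $\hat\nu_t$. Using standard Kalman filter calculations (Kalman, 1960), the forward mean and variance of the variational posterior are given by

$$m_t \equiv E(\beta_t \mid \hat\beta_{1:t}) = \left( \frac{\hat\nu_t^2}{V_{t-1} + \sigma^2 + \hat\nu_t^2} \right) m_{t-1} + \left( 1 - \frac{\hat\nu_t^2}{V_{t-1} + \sigma^2 + \hat\nu_t^2} \right) \hat\beta_t$$

$$V_t \equiv E((\beta_t - m_t)^2 \mid \hat\beta_{1:t}) = \left( \frac{\hat\nu_t^2}{V_{t-1} + \sigma^2 + \hat\nu_t^2} \right) (V_{t-1} + \sigma^2)$$

with initial conditions specified by fixed $m_0$ and $V_0$. The backward recursion then calculates the marginal mean and variance of $\beta_t$ given $\hat\beta_{1:T}$ as

$$\tilde m_{t-1} \equiv E(\beta_{t-1} \mid \hat\beta_{1:T}) = \left( \frac{\sigma^2}{V_{t-1} + \sigma^2} \right) m_{t-1} + \left( 1 - \frac{\sigma^2}{V_{t-1} + \sigma^2} \right) \tilde m_t$$

$$\tilde V_{t-1} \equiv E((\beta_{t-1} - \tilde m_{t-1})^2 \mid \hat\beta_{1:T}) = V_{t-1} + \left( \frac{V_{t-1}}{V_{t-1} + \sigma^2} \right)^2 \left( \tilde V_t - (V_{t-1} + \sigma^2) \right)$$

with initial conditions $\tilde m_T = m_T$ and $\tilde V_T = V_T$. We approximate the posterior $p(\beta_{1:T} \mid w_{1:T})$ using the state space posterior $q(\beta_{1:T} \mid \hat\beta_{1:T})$. From Jensen's inequality, the log likelihood is bounded from below as

$$\log p(d_{1:T}) \ge \int q(\beta_{1:T} \mid \hat\beta_{1:T}) \log \left( \frac{p(\beta_{1:T})\, p(d_{1:T} \mid \beta_{1:T})}{q(\beta_{1:T} \mid \hat\beta_{1:T})} \right) d\beta_{1:T} = E_q \log p(\beta_{1:T}) + \sum_{t=1}^{T} E_q \log p(d_t \mid \beta_t) + H(q). \tag{4}$$

Details of optimizing this bound are given in an appendix.

3.2. Variational Wavelet Regression

The variational Kalman filter can be replaced with variational wavelet regression; for a readable introduction to standard wavelet methods, see Wasserman. We rescale time so it is between 0 and 1. For 128 years of Science we take $n = 2^J$ and $J = 7$. To be consistent with our earlier notation, we assume that

$$\hat\beta_t = \tilde m_t + \hat\nu \epsilon_t$$

where $\epsilon_t \sim \mathcal{N}(0, 1)$. Our variational wavelet regression algorithm estimates $\{\hat\beta_t\}$, which we view as observed data, just as in the Kalman filter method, as well as the noise level $\hat\nu$.

For concreteness, we illustrate the technique using the Haar wavelet basis; Daubechies wavelets are used in our actual examples. The model is then

$$\hat\beta_t = \alpha \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} D_{jk} \psi_{jk}(x_t)$$

where $x_t = t/n$, $\phi(x) = 1$ for $0 \le x \le 1$,

$$\psi(x) = \begin{cases} -1 & \text{if } 0 \le x \le \tfrac{1}{2}, \\ 1 & \text{if } \tfrac{1}{2} < x \le 1 \end{cases}$$

and $\psi_{jk}(x) = 2^{j/2} \psi(2^j x - k)$.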
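The Haar construction above can be sketched numerically. The code computes the scaling coefficient and the empirical wavelet coefficients of a series on the grid $x_t = t/n$, then rebuilds the series from (optionally soft-thresholded) coefficients, the step the text turns to next. The function names and the threshold argument `lam` are assumptions for illustration. One further assumption: the mother wavelet is taken to be zero outside $(0, 1]$ and the intervals are treated as half-open, so that neighbouring wavelets do not overlap at grid points.

```python
import numpy as np

def haar_mother(u):
    """Haar mother wavelet: -1 on (0, 1/2], +1 on (1/2, 1], 0 elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where((u > 0) & (u <= 0.5), -1.0,
                    np.where((u > 0.5) & (u <= 1.0), 1.0, 0.0))

def haar_psi(j, k, x):
    """Dilated and shifted wavelet psi_jk(x) = 2^(j/2) * psi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar_mother(2.0 ** j * x - k)

def haar_coefficients(beta_hat):
    """Return the mean (scaling coefficient) and the coefficients Z_jk."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    n = len(beta_hat)
    J = n.bit_length() - 1
    if 2 ** J != n:
        raise ValueError("series length must be a power of two")
    x = np.arange(1, n + 1) / n
    alpha = beta_hat.mean()
    Z = {(j, k): np.mean(beta_hat * haar_psi(j, k, x))
         for j in range(J) for k in range(2 ** j)}
    return alpha, Z

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def haar_fit(beta_hat, lam):
    """Reconstruct the series from soft-thresholded Haar coefficients."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    n = len(beta_hat)
    x = np.arange(1, n + 1) / n
    alpha, Z = haar_coefficients(beta_hat)
    m = np.full(n, alpha)
    for (j, k), z in Z.items():
        m += soft_threshold(z, lam) * haar_psi(j, k, x)
    return m
```

With `lam = 0` the fit reproduces the input exactly, because the $n$ basis vectors are orthonormal on the grid; a very large `lam` collapses the fit to the series mean. Intermediate values keep only the large coefficients, which is what preserves sharp peaks while smoothing small fluctuations.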
Our variational estimate for the posterior mean becomes

$$\tilde m_t = \hat\alpha \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} \hat D_{jk} \psi_{jk}(x_t)$$

where $\hat\alpha = n^{-1} \sum_{t=1}^{n} \hat\beta_t$, and $\hat D_{jk}$ are obtained by thresholding the coefficients

$$Z_{jk} = \frac{1}{n} \sum_{t=1}^{n} \hat\beta_t \psi_{jk}(x_t).$$

To estimate $\hat\beta_t$ we use gradient ascent, as for the Kalman filter approximation, requiring the derivatives $\partial \tilde m_t / \partial \hat\beta_t$. If soft thresholding is used, then we have that

$$\frac{\partial \tilde m_t}{\partial \hat\beta_s} = \frac{\partial \hat\alpha}{\partial \hat\beta_s} \phi(x_t) + \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} \frac{\partial \hat D_{jk}}{\partial \hat\beta_s} \psi_{jk}(x_t)$$

with $\partial \hat\alpha / \partial \hat\beta_s = n^{-1}$ and

$$\frac{\partial \hat D_{jk}}{\partial \hat\beta_s} = \begin{cases} \frac{1}{n} \psi_{jk}(x_s) & \text{if } |Z_{jk}| > \lambda, \\ 0 & \text{otherwise.} \end{cases}$$

Note also that $|Z_{jk}| > \lambda$ if and only if $|\hat D_{jk}| > 0$. These derivatives can be computed using off-the-shelf software for the wavelet transform in any of the standard wavelet bases.

Sample results of running this and the Kalman variational algorithm to approximate a unigram model are given in Figure 3. Both variational approximations smooth out the local fluctuations in the unigram counts, while preserving the sharp peaks that may indicate a significant change of content in the journal. While the fit is similar to that obtained using standard wavelet regression to the (normalized) counts, the estimates are obtained by minimizing the KL divergence as in standard variational approximations.

Figure 3. Comparison of the Kalman filter (top) and wavelet regression (bottom) variational approximations to a unigram model, shown for the words "Darwin," "Einstein," and "moon." The variational approximations (red and blue curves) smooth out the local fluctuations in the unigram counts (gray curves) of the words shown, while preserving the sharp peaks that may indicate a significant change of content in the journal. The wavelet regression is able to "superresolve" the double spikes in the occurrence of Einstein in the 1920s. (The spike in the occurrence of Darwin near 1910 may be associated with the centennial of Darwin's birth in 1809.)

In the dynamic topic model of Section 2, the algorithms are essentially the same as those described above. However, rather than fitting the observations from true observed counts, we fit them from expected counts under the document-level variational distributions in (3).

4. Analysis of Science

We analyzed a subset of 30,000 articles from Science, 250 from each of 120 years beginning in 1881. Our data were collected by JSTOR (www.jstor.org), a not-for-profit organization that maintains an online scholarly archive obtained by running an optical character recognition (OCR) engine over the original printed journals. JSTOR indexes the resulting text and provides online access to the scanned images of the original content through keyword search.

Our corpus is made up of approximately 7.5 million words. We pruned the vocabulary by stemming each term to its root, removing function terms, and removing terms that occurred fewer than 25 times.
The total vocabulary size is 15,955. To explore the corpus and its themes, we estimated a 20-component dynamic topic model. Posterior inference took approximately 4 hours on a 1.5GHz PowerPC Macintosh laptop. Two of the resulting topics are illustrated in Figure 4, showing the top several words from those topics in each decade, according to the posterior mean number of occurrences as estimated using the Kalman filter variational approximation. Also shown are example articles which exhibit those topics through the decades. As illustrated, the model captures different scientific themes, and can be used to inspect trends of word usage within them.

To validate the dynamic topic model quantitatively, we consider the task of predicting the next year of Science given all the articles from the previous years. We compare the predictive power of three 20-topic models: the dynamic topic model estimated from all of the previous years, a static topic model estimated from all of the previous years, and a static topic model estimated from the single previous year. All the models are estimated to the same convergence criterion. The topic model estimated from all the previous data and the dynamic topic model are initialized at the same point.

The dynamic topic model performs well; it always assigns higher likelihood to the next year's articles than the other two models (Figure 5). It is interesting that the predictive power of each of the models declines over the years. We can tentatively attribute this to an increase in the rate of specialization in scientific language.

Figure 4. Examples from the posterior analysis of a 20-topic dynamic model estimated from the Science corpus. For two topics, we illustrate: (a) the top ten words from the inferred posterior distribution at ten year lags; (b) the posterior estimate of the frequency as a function of year of several words from the same two topics; (c) example articles throughout the collection which exhibit these topics. Note that the plots are scaled to give an idea of the shape of the trajectory of the words' posterior probability (i.e., comparisons across words are not meaningful).

5. Discussion

We have developed sequential topic models for discrete data by using Gaussian time series on the natural parameters of the multinomial topics and logistic normal topic proportion models.
We derived variational inference algorithms that exploit existing techniques for sequential data; we demonstrated a novel use of Kalman filters and wavelet regression as variational approximations. Dynamic topic models can give a more accurate predictive model, and also offer new ways of browsing large, unstructured document collections.

There are many ways that the work described here can be extended. One direction is to use more sophisticated state space models. We have demonstrated the use of a simple Gaussian model, but it would be natural to include a drift term in a more sophisticated autoregressive model to explicitly capture the rise and fall in popularity of a topic, or in the use of specific terms. Another variant would allow for heteroscedastic time series.

Perhaps the most promising extension to the methods presented here is to incorporate a model of how new topics in the collection appear or disappear over time, rather than assuming a fixed number of topics. One possibility is to use a simple Galton-Watson or birth-death process for the topic population. While the analysis of birth-death or branching processes often centers on extinction probabilities, here a goal would be to find documents that may be responsible for spawning new themes in a collection.

Figure 5. [Plot: negative log likelihood (log scale) against year, for LDA-prev, LDA-all, and DTM.] This figure illustrates the performance of using dynamic topic models and static topic models for prediction. For each year between 1900 and 2000 (at 5 year increments), we estimated three models on the articles through that year. We then computed the variational bound on the negative log likelihood of next year's articles under the resulting model (lower numbers are better). DTM is the dynamic topic model; LDA-prev is a static topic model estimated on just the previous year's articles; LDA-all is a static topic model estimated on all the previous articles.

Acknowledgments

This research was supported in part by NSF IIS grants, the DARPA CALO project, and a grant from Google.

References

Aitchison, J. (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society, Series B, 44(2).

Blei, D., Ng, A., and Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3.

Blei, D. M. and Lafferty, J. D. Correlated topic models. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Buntine, W. and Jakulin, A. (2004). Applying discrete PCA in data analysis. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. AUAI Press.

Erosheva, E. Grade of membership and latent structure models with application to disability survey data. PhD thesis, Carnegie Mellon University, Department of Statistics.

Fei-Fei, L. and Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. IEEE Computer Vision and Pattern Recognition.

Griffiths, T. and Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Science, 101.

Kalman, R. (1960). A new approach to linear filtering and prediction problems. Transactions of the ASME: Journal of Basic Engineering, 82.

McCallum, A., Corrada-Emmanuel, A., and Wang, X. (2004). The author-recipient-topic model for topic and role discovery in social networks: Experiments with Enron and academic email. Technical report, University of Massachusetts, Amherst.

Pritchard, J., Stephens, M., and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155.

Rosen-Zvi, M., Griffiths, T., Steyvers, M., and Smyth, P. (2004). The author-topic model for authors and documents. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. AUAI Press.

Sivic, J., Russell, B., Efros, A., Zisserman, A., and Freeman, W. (2005). Discovering objects and their location in images. In International Conference on Computer Vision (ICCV).

Snelson, E. and Ghahramani, Z. Sparse Gaussian processes using pseudo-inputs. In Weiss, Y., Schölkopf, B., and Platt, J., editors, Advances in Neural Information Processing Systems 18. MIT Press, Cambridge, MA.

Wasserman, L. All of Nonparametric Statistics. Springer.

West, M. and Harrison, J. Bayesian Forecasting and Dynamic Models. Springer.

A. Derivation of Variational Algorithm

In this appendix we give some details of the variational algorithm outlined in Section 3.1, which calculates a distribution $q(\beta_{1:T} \mid \hat\beta_{1:T})$ to maximize the lower bound on $\log p(d_{1:T})$. The bound has three terms: the expected log prior, the expected log likelihood, and the entropy. The first term is

$$\sum_{t=1}^{T} E_q \log p(\beta_t \mid \beta_{t-1}) = -\frac{VT}{2} (\log \sigma^2 + \log 2\pi) - \frac{1}{2\sigma^2} \sum_{t=1}^{T} E_q (\beta_t - \beta_{t-1})^T (\beta_t - \beta_{t-1})$$

$$= -\frac{VT}{2} (\log \sigma^2 + \log 2\pi) - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \| \tilde m_t - \tilde m_{t-1} \|^2 - \frac{1}{\sigma^2} \sum_{t=1}^{T} \mathrm{Tr}(\tilde V_t) + \frac{1}{2\sigma^2} \left( \mathrm{Tr}(\tilde V_0) - \mathrm{Tr}(\tilde V_T) \right)$$

using the Gaussian quadratic form identity

$$E_{m,V} (x - \mu)^T \Sigma^{-1} (x - \mu) = (m - \mu)^T \Sigma^{-1} (m - \mu) + \mathrm{Tr}(\Sigma^{-1} V).$$

The second term is

$$\sum_{t=1}^{T} E_q \log p(d_t \mid \beta_t) = \sum_{t=1}^{T} \sum_{w} n_{tw} E_q \left( \beta_{tw} - \log \sum_{w} \exp(\beta_{tw}) \right)$$

$$\ge \sum_{t=1}^{T} \sum_{w} n_{tw} \tilde m_{tw} - \sum_{t=1}^{T} n_t \hat\zeta_t^{-1} \sum_{w} \exp(\tilde m_{tw} + \tilde V_{tw}/2) + \sum_{t=1}^{T} \left( n_t - n_t \log \hat\zeta_t \right)$$

where $n_t = \sum_w n_{tw}$, introducing additional variational parameters $\hat\zeta_{1:T}$. The third term is the entropy

$$H(q) = \sum_{t=1}^{T} \left( \frac{1}{2} \log |\tilde V_t| + \frac{T}{2} \log 2\pi \right) = \frac{1}{2} \sum_{t=1}^{T} \sum_{w} \log \tilde V_{tw} + \frac{TV}{2} \log 2\pi.$$

To maximize the lower bound as a function of the variational parameters we use a conjugate gradient algorithm. First, we maximize with respect to $\hat\zeta$; the derivative is

$$\frac{\partial \ell}{\partial \hat\zeta_t} = \frac{n_t}{\hat\zeta_t^2} \sum_{w} \exp(\tilde m_{tw} + \tilde V_{tw}/2) - \frac{n_t}{\hat\zeta_t}.$$

Setting to zero and solving for $\hat\zeta_t$ gives

$$\hat\zeta_t = \sum_{w} \exp(\tilde m_{tw} + \tilde V_{tw}/2).$$

Next, we maximize with respect to $\hat\beta_s$:

$$\frac{\partial \ell(\hat\beta, \hat\nu)}{\partial \hat\beta_{sw}} = -\frac{1}{\sigma^2} \sum_{t=1}^{T} (\tilde m_{tw} - \tilde m_{t-1,w}) \left( \frac{\partial \tilde m_{tw}}{\partial \hat\beta_{sw}} - \frac{\partial \tilde m_{t-1,w}}{\partial \hat\beta_{sw}} \right) + \sum_{t=1}^{T} \left( n_{tw} - n_t \hat\zeta_t^{-1} \exp(\tilde m_{tw} + \tilde V_{tw}/2) \right) \frac{\partial \tilde m_{tw}}{\partial \hat\beta_{sw}}.$$

The forward-backward equations for $\tilde m_t$ can be used to derive a recurrence for $\partial \tilde m_t / \partial \hat\beta_s$. The forward recurrence is

$$\frac{\partial m_t}{\partial \hat\beta_s} = \left( \frac{\hat\nu_t^2}{V_{t-1} + \sigma^2 + \hat\nu_t^2} \right) \frac{\partial m_{t-1}}{\partial \hat\beta_s} + \left( 1 - \frac{\hat\nu_t^2}{V_{t-1} + \sigma^2 + \hat\nu_t^2} \right) \delta_{s,t},$$

with the initial condition $\partial m_0 / \partial \hat\beta_s = 0$. The backward recurrence is then

$$\frac{\partial \tilde m_{t-1}}{\partial \hat\beta_s} = \left( \frac{\sigma^2}{V_{t-1} + \sigma^2} \right) \frac{\partial m_{t-1}}{\partial \hat\beta_s} + \left( 1 - \frac{\sigma^2}{V_{t-1} + \sigma^2} \right) \frac{\partial \tilde m_t}{\partial \hat\beta_s},$$

with the initial condition $\partial \tilde m_T / \partial \hat\beta_s = \partial m_T / \partial \hat\beta_s$.
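The forward and backward recursions of Section 3.1, on which the derivative recurrences in this appendix are built, can be sketched for a single vocabulary term as follows. This is a minimal illustration rather than the authors' implementation: the function names, the array layout (index 0 holds the initial state), and the default values of `m0` and `V0` are assumptions. The closed-form update for $\hat\zeta_t$ is included as a one-line helper.

```python
import numpy as np

def variational_kalman_smoother(beta_hat, sigma2, nu2, m0=0.0, V0=1.0):
    """Forward-backward pass for one term's sequence of natural parameters.

    beta_hat: variational observations for t = 1..T
    sigma2:   state noise variance sigma^2
    nu2:      observation noise variances nu_hat_t^2 for t = 1..T
    Returns smoothed means and variances for t = 0..T.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    nu2 = np.asarray(nu2, dtype=float)
    T = len(beta_hat)
    m = np.empty(T + 1)
    V = np.empty(T + 1)
    m[0], V[0] = m0, V0
    # Forward pass: filtered mean m_t and variance V_t.
    for t in range(1, T + 1):
        P = V[t - 1] + sigma2                  # predicted variance
        w = nu2[t - 1] / (P + nu2[t - 1])      # weight on the previous mean
        m[t] = w * m[t - 1] + (1.0 - w) * beta_hat[t - 1]
        V[t] = w * P
    # Backward pass: smoothed mean and variance, starting from t = T.
    m_s, V_s = m.copy(), V.copy()
    for t in range(T, 0, -1):
        P = V[t - 1] + sigma2
        c = sigma2 / P
        m_s[t - 1] = c * m[t - 1] + (1.0 - c) * m_s[t]
        V_s[t - 1] = V[t - 1] + (V[t - 1] / P) ** 2 * (V_s[t] - P)
    return m_s, V_s

def optimal_zeta(m_tilde_t, V_tilde_t):
    """Closed-form zeta_hat_t = sum_w exp(m_tilde_tw + V_tilde_tw / 2)."""
    m_tilde_t = np.asarray(m_tilde_t, dtype=float)
    V_tilde_t = np.asarray(V_tilde_t, dtype=float)
    return float(np.sum(np.exp(m_tilde_t + V_tilde_t / 2.0)))
```

A quick check with a single time step: calling `variational_kalman_smoother([4.0], 1.0, [2.0])` with the default initial state gives smoothed means `[1.0, 2.0]` and variances `[0.75, 1.0]`, which agree with the exact Gaussian posterior computed by hand for this two-variable model.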

Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

More information

E-mail Spam Filtering Using Genetic Algorithm: A Deeper Analysis

E-mail Spam Filtering Using Genetic Algorithm: A Deeper Analysis ISSN:0975-9646 Mandeep hodhary et al, / (IJSIT) International Journal of omputer Science and Information Technologies, Vol. 6 (5), 205, 4266-4270 E-mail Spam Filtering Using Genetic Algorithm: A Deeper

More information

Correlated Topic Models

Correlated Topic Models Correlated Topic Models David M. Blei Department of Computer Science Princeton University John D. Lafferty School of Computer Science Carnegie Mellon University Abstract Topic models, such as latent Dirichlet

More information

11. Time series and dynamic linear models

11. Time series and dynamic linear models 11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

More information

A Quantitative Approach to the Performance of Internet Telephony to E-business Sites

A Quantitative Approach to the Performance of Internet Telephony to E-business Sites A Quantitative Approach to the Performance of Internet Telephony to E-business Sites Prathiusha Chinnusamy TransSolutions Fort Worth, TX 76155, USA Natarajan Gautam Harold and Inge Marcus Department of

More information

CRITERIUM FOR FUNCTION DEFININING OF FINAL TIME SHARING OF THE BASIC CLARK S FLOW PRECEDENCE DIAGRAMMING (PDM) STRUCTURE

CRITERIUM FOR FUNCTION DEFININING OF FINAL TIME SHARING OF THE BASIC CLARK S FLOW PRECEDENCE DIAGRAMMING (PDM) STRUCTURE st Logistics International Conference Belgrade, Serbia 8-30 November 03 CRITERIUM FOR FUNCTION DEFININING OF FINAL TIME SHARING OF THE BASIC CLARK S FLOW PRECEDENCE DIAGRAMMING (PDM STRUCTURE Branko Davidović

More information

Latent Dirichlet Markov Allocation for Sentiment Analysis

Latent Dirichlet Markov Allocation for Sentiment Analysis Latent Dirichlet Markov Allocation for Sentiment Analysis Ayoub Bagheri Isfahan University of Technology, Isfahan, Iran Intelligent Database, Data Mining and Bioinformatics Lab, Electrical and Computer

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Topic models for Sentiment analysis: A Literature Survey

Topic models for Sentiment analysis: A Literature Survey Topic models for Sentiment analysis: A Literature Survey Nikhilkumar Jadhav 123050033 June 26, 2014 In this report, we present the work done so far in the field of sentiment analysis using topic models.

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Expert Systems with Applications

Expert Systems with Applications Expert Systems ith Applications 36 (2009) 5450 5455 Contents lists available at ScienceDirect Expert Systems ith Applications journal homepage:.elsevier.com/locate/esa Trading rule discovery in the US

More information

Network Traffic Prediction Based on the Wavelet Analysis and Hopfield Neural Network

Network Traffic Prediction Based on the Wavelet Analysis and Hopfield Neural Network Netork Traffic Prediction Based on the Wavelet Analysis and Hopfield Neural Netork Sun Guang Abstract Build a mathematical model is the key problem of netork traffic prediction. Traditional single netork

More information

Dirichlet Processes A gentle tutorial

Dirichlet Processes A gentle tutorial Dirichlet Processes A gentle tutorial SELECT Lab Meeting October 14, 2008 Khalid El-Arini Motivation We are given a data set, and are told that it was generated from a mixture of Gaussian distributions.

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

More information

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

The Research on Demand Forecasting of Supply Chain Based on ICCELMAN

The Research on Demand Forecasting of Supply Chain Based on ICCELMAN 505 A publication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 46, 2015 Guest Editors: Peiyu Ren, Yancang Li, Huiping Song Copyright 2015, AIDIC Servizi S.r.l., ISBN 978-88-95608-37-2; ISSN 2283-9216 The

More information

Analysis of Bayesian Dynamic Linear Models

Analysis of Bayesian Dynamic Linear Models Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main

Maximum likelihood estimation of mean reverting processes

Maximum likelihood estimation of mean reverting processes Maximum likelihood estimation of mean reverting processes José Carlos García Franco Onward, Inc. jcpollo@onwardinc.com Abstract Mean reverting processes are frequently used models in real options. For

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

1 Maximum likelihood estimation

1 Maximum likelihood estimation COS 424: Interacting with Data Lecturer: David Blei Lecture #4 Scribes: Wei Ho, Michael Ye February 14, 2008 1 Maximum likelihood estimation 1.1 MLE of a Bernoulli random variable (coin flips) Given N

The Exponential Family

The Exponential Family The Exponential Family David M. Blei Columbia University November 3, 2015 Definition A probability density in the exponential family has this form $p(x \mid \eta) = h(x) \exp\{\eta^\top t(x) - a(\eta)\}$, (1) where $\eta$ is the natural

Linear Threshold Units

Linear Threshold Units Linear Threshold Units [equation: h(x) thresholds the weighted sum w_1 x_1 + ... + w_n x_n] We assume that each feature x_j and each weight w_j is a real number (we will relax this later) We will study three different algorithms for learning linear

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (1/24/13) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

LCs for Binary Classification

LCs for Binary Classification Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product between the two vectors representing the document and the category, respectively. Therefore it

Choosing the Optimal Object-Oriented Implementation using Analytic Hierarchy Process

Choosing the Optimal Object-Oriented Implementation using Analytic Hierarchy Process Choosing the Optimal Object-Oriented Implementation using Analytic Hierarchy Process Naunong Sunanta Chonlameth Arpnikanondt King Mongkut's University of Technology Thonburi, naunong@sit.kmutt.ac.th King

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

OPTIMIZING WEB SERVER'S DATA TRANSFER WITH HOTLINKS

OPTIMIZING WEB SERVER'S DATA TRANSFER WITH HOTLINKS OPTIMIZING WEB SERVER'S DATA TRANSFER WITH HOTLINKS Evangelos Kranakis School of Computer Science, Carleton University Ottawa, ON. K1S 5B6 Canada kranakis@scs.carleton.ca Danny Krizanc Department of Mathematics,

Bayes and Naïve Bayes. cs534-machine Learning

Bayes and Naïve Bayes. cs534-machine Learning Bayes and Naïve Bayes cs534-machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule

A Probabilistic Model for Online Document Clustering with Application to Novelty Detection

A Probabilistic Model for Online Document Clustering with Application to Novelty Detection A Probabilistic Model for Online Document Clustering with Application to Novelty Detection Jian Zhang School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 jian.zhang@cs.cmu.edu Zoubin

Machine Learning for Data Science (CS4786) Lecture 1

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

Deterministic Sampling-based Switching Kalman Filtering for Vehicle Tracking

Deterministic Sampling-based Switching Kalman Filtering for Vehicle Tracking Proceedings of the IEEE ITSC 2006 2006 IEEE Intelligent Transportation Systems Conference Toronto, Canada, September 17-20, 2006 WA4.1 Deterministic Sampling-based Switching Kalman Filtering for Vehicle

Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu

Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

REDUCING RISK OF HAND-ARM VIBRATION INJURY FROM HAND-HELD POWER TOOLS INTRODUCTION

REDUCING RISK OF HAND-ARM VIBRATION INJURY FROM HAND-HELD POWER TOOLS INTRODUCTION Health and Safety Executive Information Document HSE 246/31 REDUCING RISK OF HAND-ARM VIBRATION INJURY FROM HAND-HELD POWER TOOLS INTRODUCTION 1 This document contains internal guidance which has been made

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

> plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) quantile diff (error)

> plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) [Figure: panels titled "treed GP LLM, mean" and "treed GP LLM, quantile diff (error)"; x-axis x1]

Text Mining in JMP with R Andrew T. Karl, Senior Management Consultant, Adsurgo LLC Heath Rushing, Principal Consultant and Co-Founder, Adsurgo LLC

Text Mining in JMP with R Andrew T. Karl, Senior Management Consultant, Adsurgo LLC Heath Rushing, Principal Consultant and Co-Founder, Adsurgo LLC Text Mining in JMP with R Andrew T. Karl, Senior Management Consultant, Adsurgo LLC Heath Rushing, Principal Consultant and Co-Founder, Adsurgo LLC 1. Introduction A popular rule of thumb suggests that

203.4770: Introduction to Machine Learning Dr. Rita Osadchy

203.4770: Introduction to Machine Learning Dr. Rita Osadchy 203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:

NEURAL NETWORKS A Comprehensive Foundation

NEURAL NETWORKS A Comprehensive Foundation NEURAL NETWORKS A Comprehensive Foundation Second Edition Simon Haykin McMaster University Hamilton, Ontario, Canada Prentice Hall Prentice Hall Upper Saddle River; New Jersey 07458 Preface xii Acknowledgments

Machine Learning Logistic Regression

Machine Learning Logistic Regression Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.

Online Courses Recommendation based on LDA

Online Courses Recommendation based on LDA Online Courses Recommendation based on LDA Rel Guzman Apaza, Elizabeth Vera Cervantes, Laura Cruz Quispe, José Ochoa Luna National University of St. Agustin Arequipa - Perú {r.guzmanap,elizavvc,lvcruzq,eduardo.ol}@gmail.com

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin Network Big Data: Facing and Tackling the Complexities Xiaolong Jin CAS Key Laboratory of Network Data Science & Technology Institute of Computing Technology Chinese Academy of Sciences (CAS) 2015-08-10

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

Bayesian Predictive Profiles with Applications to Retail Transaction Data

Bayesian Predictive Profiles with Applications to Retail Transaction Data Bayesian Predictive Profiles with Applications to Retail Transaction Data Igor V. Cadez Information and Computer Science University of California Irvine, CA 92697-3425, U.S.A. icadez@ics.uci.edu Padhraic

Tensor Factorization for Multi-Relational Learning

Tensor Factorization for Multi-Relational Learning Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany nickel@dbs.ifi.lmu.de 2 Siemens AG, Corporate

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

Predict the Popularity of YouTube Videos Using Early View Data

Predict the Popularity of YouTube Videos Using Early View Data

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014 Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don't have to worry about

Machine Learning in FX Carry Basket Prediction

Machine Learning in FX Carry Basket Prediction Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D'Alessandro Abstract Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Relevance Vector Machines

TRENDS IN AN INTERNATIONAL INDUSTRIAL ENGINEERING RESEARCH JOURNAL: A TEXTUAL INFORMATION ANALYSIS PERSPECTIVE. wernervanzyl@sun.ac.

TRENDS IN AN INTERNATIONAL INDUSTRIAL ENGINEERING RESEARCH JOURNAL: A TEXTUAL INFORMATION ANALYSIS PERSPECTIVE. wernervanzyl@sun.ac. TRENDS IN AN INTERNATIONAL INDUSTRIAL ENGINEERING RESEARCH JOURNAL: A TEXTUAL INFORMATION ANALYSIS PERSPECTIVE J.W. Uys 1 *, C.S.L. Schutte 2 and W.D. Van Zyl 3 1 Indutech (Pty) Ltd., Stellenbosch, South

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

Example: Credit card default, we may be more interested in predicting the probability of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probability of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

Detection of changes in variance using binary segmentation and optimal partitioning

Detection of changes in variance using binary segmentation and optimal partitioning Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel

Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

Christfried Webers. Canberra February June 2015

Christfried Webers. Canberra February June 2015 Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and Machine Learning") Part VIII Linear Classification 2 Logistic

10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html 10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

Bayesian Image Super-Resolution

Bayesian Image Super-Resolution Bayesian Image Super-Resolution Michael E. Tipping and Christopher M. Bishop Microsoft Research, Cambridge, U.K. Published as: Bayesian

Assessing Data Mining: The State of the Practice

Assessing Data Mining: The State of the Practice Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality

2010 Master of Science Computer Science Department, University of Massachusetts Amherst

2010 Master of Science Computer Science Department, University of Massachusetts Amherst Scott Niekum Postdoctoral Research Fellow The Robotics Institute, Carnegie Mellon University Contact Information Smith Hall, Room 228 Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213

I, ONLINE KNOWLEDGE TRANSFER FREELANCER PORTAL M. & A.

I, ONLINE KNOWLEDGE TRANSFER FREELANCER PORTAL M. & A. ONLINE KNOWLEDGE TRANSFER FREELANCER PORTAL M. Micheal Fernandas* & A. Anitha** * PG Scholar, Department of Master of Computer Applications, Dhanalakshmi Srinivasan Engineering College, Perambalur, Tamilnadu

Dissertation TOPIC MODELS FOR IMAGE RETRIEVAL ON LARGE-SCALE DATABASES. Eva Hörster

Dissertation TOPIC MODELS FOR IMAGE RETRIEVAL ON LARGE-SCALE DATABASES. Eva Hörster Dissertation TOPIC MODELS FOR IMAGE RETRIEVAL ON LARGE-SCALE DATABASES Eva Hörster Department of Computer Science University of Augsburg Adviser: Readers: Prof. Dr. Rainer Lienhart Prof. Dr. Rainer Lienhart

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Mobile Phone APP Software Browsing Behavior using Clustering Analysis Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Probabilistic. review articles. Surveying a suite of algorithms that offer a solution to managing large document archives.

Probabilistic. review articles. Surveying a suite of algorithms that offer a solution to managing large document archives. doi:10.1145/2133806.2133826 Surveying a suite of algorithms that offer a solution to managing large document archives. by David M. Blei Probabilistic Topic Models As our collective knowledge continues

Variational Mean Field for Graphical Models

Variational Mean Field for Graphical Models Variational Mean Field for Graphical Models CS/CNS/EE 155 Baback Moghaddam Machine Learning Group baback @ jpl.nasa.gov Approximate Inference Consider general UGs (i.e., not tree-structured) All basic

Designing a learning system

Designing a learning system Lecture Designing a learning system Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x4-8845 http://www.cs.pitt.edu/~milos/courses/cs750/ Design of a learning system (first view) Application or Testing

A Statistical Framework for Operational Infrasound Monitoring

A Statistical Framework for Operational Infrasound Monitoring A Statistical Framework for Operational Infrasound Monitoring Stephen J. Arrowsmith Rod W. Whitaker LA-UR 11-03040 The views expressed here do not necessarily reflect the views of the United States Government,

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledge discovery is emerging as a

When scientists decide to write a paper, one of the first

When scientists decide to write a paper, one of the first Colloquium Finding scientific topics Thomas L. Griffiths* and Mark Steyvers *Department of Psychology, Stanford University, Stanford, CA 94305; Department of Brain and Cognitive Sciences, Massachusetts

Big Ideas in Mathematics

Big Ideas in Mathematics Big Ideas in Mathematics which are important to all mathematics learning. (Adapted from the NCTM Curriculum Focal Points, 2006) The Mathematics Big Ideas are organized using the PA Mathematics Standards

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical

Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi

Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi Neural Networks CAP5610 Machine Learning Instructor: Guo-Jun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector

An Experience Based Learning Controller

An Experience Based Learning Controller J. Intelligent Learning Systems & Applications, 2010, 2: 80-85 doi:10.4236/jilsa.2010.22011 Published Online May 2010 (http://www.scirp.org/journal/jilsa) An Experience Based Learning Controller Debadutt

Gaussian Processes in Machine Learning

Gaussian Processes in Machine Learning Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany carl@tuebingen.mpg.de WWW home page: http://www.tuebingen.mpg.de/~carl

Simple and efficient online algorithms for real world applications

Simple and efficient online algorithms for real world applications Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

ADVANCED MACHINE LEARNING. Introduction

ADVANCED MACHINE LEARNING. Introduction 1 1 Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures

TOWARD BIG DATA ANALYSIS WORKSHOP

TOWARD BIG DATA ANALYSIS WORKSHOP TOWARD BIG DATA ANALYSIS WORKSHOP Toward Big Data Analysis Workshop: Collected Abstracts 2015.06.05-06 Matrix Visualization of Big Data Chun-Houh Chen (陳君厚) Institute of Statistical Science, Academia Sinica Abstract Visualization and Exploratory Data Analysis (EDA)

Cell Phone based Activity Detection using Markov Logic Network

Cell Phone based Activity Detection using Markov Logic Network Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart

1.2 Cultural Orientation and Consumer Behavior

1.2 Cultural Orientation and Consumer Behavior JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 25, 1605-1616 (2009) A New Perspective for Neural Networks: Application to a Marketing Management Problem JAESOO KIM AND HEEJUNE AHN + Department of Computer

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives 2. What is Data Mining? 3. Knowledge Discovery Process 4. KD Process Example 5. Typical Data Mining Architecture 6. Database vs. Data Mining 7.

TOPIC MODELS DAVID M. BLEI PRINCETON UNIVERSITY JOHN D. LAFFERTY CARNEGIE MELLON UNIVERSITY

TOPIC MODELS DAVID M. BLEI PRINCETON UNIVERSITY JOHN D. LAFFERTY CARNEGIE MELLON UNIVERSITY TOPIC MODELS DAVID M. BLEI PRINCETON UNIVERSITY JOHN D. LAFFERTY CARNEGIE MELLON UNIVERSITY 1. INTRODUCTION Scientists need new tools to explore and browse large collections of scholarly literature. Thanks

A Regime-Switching Model for Electricity Spot Prices. Gero Schindlmayr EnBW Trading GmbH g.schindlmayr@enbw.com

A Regime-Switching Model for Electricity Spot Prices. Gero Schindlmayr EnBW Trading GmbH g.schindlmayr@enbw.com A Regime-Switching Model for Electricity Spot Prices Gero Schindlmayr EnBW Trading GmbH g.schindlmayr@enbw.com May 31, 25 A Regime-Switching Model for Electricity Spot Prices Abstract Electricity markets

Improving predictive inference under covariate shift by weighting the log-likelihood function

Improving predictive inference under covariate shift by weighting the log-likelihood function Journal of Statistical Planning and Inference 90 (2000) 227-244 www.elsevier.com/locate/jspi Improving predictive inference under covariate shift by weighting the log-likelihood function Hidetoshi Shimodaira

Mind the Duality Gap: Logarithmic regret algorithms for online optimization

Mind the Duality Gap: Logarithmic regret algorithms for online optimization Mind the Duality Gap: Logarithmic regret algorithms for online optimization Sham M. Kakade Toyota Technological Institute at Chicago sham@tti-c.org Shai Shalev-Shwartz Toyota Technological Institute at

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

Learning Vector Quantization: generalization ability and dynamics of competing prototypes

Learning Vector Quantization: generalization ability and dynamics of competing prototypes Learning Vector Quantization: generalization ability and dynamics of competing prototypes Aree Witoelar 1, Michael Biehl 1, and Barbara Hammer 2 1 University of Groningen, Mathematics and Computing Science

CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York

CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not

8. Time Series and Prediction

8. Time Series and Prediction 8. Time Series and Prediction Definition: A time series is given by a sequence of the values of a variable observed at sequential points in time. e.g. daily maximum temperature, end of day share prices,

Nominal and ordinal logistic regression

Nominal and ordinal logistic regression Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome

DESIGN PROCEDURE FOR A LEARNING FEED-FORWARD CONTROLLER

DESIGN PROCEDURE FOR A LEARNING FEED-FORWARD CONTROLLER Proc. 1st IFAC Conf. on Mechatronic Systems (Darmstadt, Germany, 18-20 Sept. 2000), R. Isermann (ed.), VDI/VDE Gesellschaft Mess- und Automatisierungstechnik GMA, Düsseldorf, Germany. DESIGN PROCEDURE FOR

RELogAnalysis Well Log Analysis

RELogAnalysis Well Log Analysis RESys Reservoir Evaluation System RELogAnalysis Well Log Analysis Users Manual Copyright 2006-2010, The Comport Computing Company All Rights Reserved RESys is an integrated software system for performing
