Clustering example. Given some data $x_i \in X$, find a partitioning of the data into k disjoint clusters. Example: k-means clustering.


Transcription
1 Clustering example (Graph Mining and Graph Kernels)
Given some data $x_i \in X$, find a partitioning of the data into k disjoint clusters.
Example: k-means clustering.
[Figure: scatter plot of the data, axes $x_1$ and $x_2$]
2 Clustering example
Example: k-means clustering. Can we do something better?
[Figure: scatter plot of the data, axes $x_1$ and $x_2$]
3 Generative model view of clustering
Instead of partitioning the data, try to describe the underlying generative process of the data.
Each cluster can be seen as one distribution, for example a Gaussian distribution.
Objects $x_i$ are assumed to be independent samples from their cluster distribution, $x_i \sim \mathcal{N}(\mu_l, \Sigma_l)$ => Gaussian mixture model.
[Figure: univariate Gaussian probability density functions $f(x)$ over $x$ for three clusters; legend (partly illegible): $c_1$ = Normal(·, 1.5), $p(c_1)$ = 0.5; $c_2$ = Normal(3, 0.5); $c_3$ = Normal(·, 0.7), $p(c_3)$ = 0.3]
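The generative view can be made concrete by sampling: first pick a cluster according to $p(c_l)$, then draw the object from that cluster's Gaussian. The following is a minimal sketch in Python with NumPy; the function name `sample_mixture` and all parameter values are illustrative assumptions, not the values from the figure.

```python
import numpy as np

def sample_mixture(n, weights, means, stds, seed=0):
    """Draw n samples from a univariate Gaussian mixture.

    weights[l] plays the role of p(c_l); means[l] and stds[l] describe
    the Gaussian of cluster c_l. All values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Step 1: choose the (hidden) cluster of every object.
    labels = rng.choice(len(weights), size=n, p=weights)
    # Step 2: sample each object from the Gaussian of its cluster.
    x = rng.normal(loc=means[labels], scale=stds[labels])
    return x, labels

# Hypothetical three-cluster mixture.
x, labels = sample_mixture(1000, weights=[0.5, 0.2, 0.3],
                           means=[0.0, 3.0, -3.0], stds=[1.5, 0.5, 0.7])
```

In a clustering task only `x` is observed; `labels` is exactly the information the model has to recover.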
4 Gaussian Mixture Model - Introduction
Data $x_i$ are independent and identically distributed (i.i.d.) samples from a mixture of k distributions $c_l$:
$x_i \in \mathbb{R}^d,\ i \in \{1, \dots, N\}$; $c_l,\ l \in \{1, \dots, k\}$
Each cluster is a multivariate Gaussian distribution, $x_i \sim \mathcal{N}(\mu_l, \Sigma_l)$.
Sufficient statistics of each cluster: mean (centroid) $\mu_l \in \mathbb{R}^d$ and covariance (empirical covariance matrix) $\Sigma_l \in \mathbb{R}^{d \times d}$.
Probability density function of a Gaussian distribution:
$$P(x_i \mid c_l) = f_l(x_i) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_l)}} \exp\left(-\frac{1}{2}(x_i - \mu_l)^\top \Sigma_l^{-1} (x_i - \mu_l)\right)$$
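The density of a single cluster can be evaluated directly from its mean and covariance. The sketch below is one possible NumPy implementation; the name `gaussian_pdf` is an assumption made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density f_l(x) of a d-dimensional Gaussian N(mu, sigma) at point x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    d = mu.shape[0]
    diff = x - mu
    # Solve sigma * y = diff instead of forming the inverse explicitly.
    mahalanobis = diff @ np.linalg.solve(sigma, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
    return float(np.exp(-0.5 * mahalanobis) / norm)

# Standard 2-D Gaussian evaluated at its mean: 1 / (2 * pi).
p = gaussian_pdf([0.0, 0.0], mu=[0.0, 0.0], sigma=np.eye(2))
```

For many points or higher dimensions, working with log-densities is usually safer numerically than evaluating the density itself.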
5 Gaussian Mixture Model - Introduction
Mixture of one-dimensional Gaussians: $c_l = \mathcal{N}(\mu_l, \sigma_l)$
[Figure: univariate Gaussian probability density functions $f(x)$ over $x$ for the three clusters $c_1$, $c_2$, $c_3$]
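6 Gaussian Mixture Model - Introduction
Mixture of multivariate Gaussians
[Figure: scatter plot of the data, axes $x_1$ and $x_2$]
7 Gaussian Mixture Model - Introduction
Mixture of multivariate Gaussians
[Figure: scatter plot, axes $x_1$ and $x_2$, with $\mu_l$ and $\Sigma_l$ marked and clusters annotated "No covariance", "Negative covariance" and "Positive covariance"]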
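8 Gaussian Mixture Model - some maths
Probability of a cluster $c_l$:
$$P(c_l) = \frac{1}{N} \sum_{i=1}^{N} P(c_l \mid x_i)$$
Empirical estimate of the density of the cluster: low density => small $P(c_l)$, high density => large $P(c_l)$.
[Figure: scatter plot with a low-density and a high-density cluster marked, axes $x_1$ and $x_2$]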
9 Gaussian Mixture Model - some maths
Probability of a cluster $c_l$ (empirical estimate of the density of the cluster):
$$P(c_l) = \frac{1}{N} \sum_{i=1}^{N} P(c_l \mid x_i)$$
Probability of observing an object $x_i$:
$$P(x_i) = \sum_{l=1}^{k} P(c_l) P(x_i \mid c_l)$$
Probability of observing an object $x_i$ given its cluster $c_l$:
$$P(x_i \mid c_l) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_l)}} \exp\left(-\frac{1}{2}(x_i - \mu_l)^\top \Sigma_l^{-1} (x_i - \mu_l)\right)$$
10 Gaussian Mixture Model - likelihood function
Quality measure of the model: the probability that the data is generated by the GMM.
$$L = \prod_{i=1}^{N} P(x_i) = \prod_{i=1}^{N} \sum_{l=1}^{k} P(c_l) P(x_i \mid c_l)$$
It is also possible to use the log-likelihood $\log(L)$.
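In practice the product over N objects quickly underflows, which is one reason to prefer the log-likelihood. The sketch below computes $\log(L)$ for a univariate mixture with the log-sum-exp trick; the function name `log_likelihood` and the example values are assumptions for illustration.

```python
import numpy as np

def log_likelihood(x, weights, means, stds):
    """log(L) = sum_i log sum_l P(c_l) P(x_i | c_l) for a 1-D mixture."""
    x = np.asarray(x, dtype=float)[:, None]        # shape (N, 1)
    weights = np.asarray(weights, dtype=float)     # shape (k,)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # log of P(c_l) * P(x_i | c_l), shape (N, k)
    log_joint = (np.log(weights)
                 - 0.5 * np.log(2.0 * np.pi * stds ** 2)
                 - 0.5 * ((x - means) / stds) ** 2)
    # log-sum-exp over clusters, stabilised by subtracting the row maximum
    m = log_joint.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    return float(log_px.sum())

# Hypothetical two-cluster mixture and three objects.
ll = log_likelihood([-1.0, 0.5, 3.2], weights=[0.5, 0.5],
                    means=[0.0, 3.0], stds=[1.0, 1.0])
```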
11 Gaussian Mixture Model - clustering
Question: How can we use the GMM to partition the data?
Choose the most likely cluster assignment of each object:
$$\arg\max_l P(c_l \mid x_i) = \arg\max_l P(c_l) P(x_i \mid c_l)$$
[Figure: scatter plot of the data, axes $x_1$ and $x_2$]
12 Gaussian Mixture Model - clustering
Great! But ...
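Once the model parameters are known, the hard partition follows from the argmax rule above. A minimal sketch for a univariate mixture, with `assign_clusters` and the parameter values chosen purely for illustration:

```python
import numpy as np

def assign_clusters(x, weights, means, stds):
    """Return argmax_l P(c_l) P(x_i | c_l) for each object of a 1-D mixture."""
    x = np.asarray(x, dtype=float)[:, None]
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    density = (np.exp(-0.5 * ((x - means) / stds) ** 2)
               / (stds * np.sqrt(2.0 * np.pi)))
    joint = weights * density      # P(c_l) P(x_i | c_l), shape (N, k)
    return joint.argmax(axis=1)

# Two hypothetical clusters centred at 0 and 3.
labels = assign_clusters([-0.2, 2.9, 0.4], weights=[0.5, 0.5],
                         means=[0.0, 3.0], stds=[1.0, 1.0])
```

The normalising term $P(x_i)$ is the same for every cluster, so it can be dropped without changing the argmax.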
13 This is all that we have
[Figure: scatter plot of the data, axes $x_1$ and $x_2$]
How to estimate the sufficient statistics of each cluster?
Mean (centroid) $\mu_l \in \mathbb{R}^d$ and covariance (empirical covariance matrix) $\Sigma_l \in \mathbb{R}^{d \times d}$
=> use the Expectation Maximization algorithm
14 Expectation Maximization algorithm
Original algorithm by [Dempster, Laird and Rubin, 1977].
General method for finding the maximum-likelihood estimate of a data distribution when the data is partially missing or hidden.
How does this apply? The data $x_i$ are fully observed.
Trick: the cluster assignment of an object $x_i$ can be seen as a hidden variable.
15 Expectation Maximization algorithm
A short sketch of the EM algorithm:
Initialize the cluster assignments.
Two alternating steps:
E-step: re-estimate the expected values of the hidden data (cluster assignments) under the current estimate of the model.
M-step: re-estimate the model parameters such that the likelihood according to the current estimate of the complete data is maximized.
Repeat until convergence:
$$\frac{L^{new}}{L^{old}} < 1 + \epsilon$$
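The alternating structure can be sketched for a univariate mixture as follows. This is a minimal illustration under stated assumptions, not the lecture's reference implementation: the name `em_1d`, the quantile-based initialisation of the means (instead of initial cluster assignments), the floor on the standard deviations and the synthetic data are all choices made here. The convergence test is the ratio criterion written in log form, $\log L^{new} - \log L^{old} < \log(1 + \epsilon)$.

```python
import numpy as np

def em_1d(x, k, eps=1e-6, max_iter=500):
    """Fit a k-component univariate Gaussian mixture with EM."""
    x = np.asarray(x, dtype=float)
    # Initialisation (an assumption): spread the means over data quantiles,
    # use the global standard deviation and uniform cluster probabilities.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    stds = np.full(k, x.std())
    weights = np.full(k, 1.0 / k)
    old_ll = -np.inf
    for _ in range(max_iter):
        # E-step: posteriors P(c_l | x_i), shape (N, k)
        dens = (np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2)
                / (stds * np.sqrt(2.0 * np.pi)))
        joint = weights * dens
        px = joint.sum(axis=1)
        resp = joint / px[:, None]
        # Convergence test in log form: L_new / L_old < 1 + eps
        ll = np.log(px).sum()
        if ll - old_ll < np.log1p(eps):
            break
        old_ll = ll
        # M-step: re-estimate cluster probabilities, means and spreads
        mass = resp.sum(axis=0)
        weights = mass / len(x)
        means = (resp * x[:, None]).sum(axis=0) / mass
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / mass
        stds = np.maximum(np.sqrt(var), 1e-6)   # guard against collapse
    return weights, means, stds

# Synthetic data from two well-separated clusters (hypothetical values).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-5.0, 1.0, 300), rng.normal(5.0, 1.0, 300)])
weights, means, stds = em_1d(data, k=2)
```

EM only finds a local maximum of the likelihood, so the result can depend on the initialisation; running it from several starting points is a common precaution.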
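16 Expectation Maximization algorithm
E-step: re-estimate the expected values of the hidden data (cluster assignments) under the current estimate of the model. By Bayes' rule, with $P(x_i) = \sum_{l'=1}^{k} P(c_{l'}) P(x_i \mid c_{l'})$:
$$P^{new}(c_l \mid x_i) = \frac{P(c_l) P(x_i \mid c_l)}{P(x_i)}$$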
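For multivariate data the E-step amounts to evaluating every cluster density at every object and normalising by $P(x_i) = \sum_l P(c_l) P(x_i \mid c_l)$ (Bayes' rule), so that the posteriors of each object sum to one. A sketch, with `e_step` and the toy data as illustrative assumptions:

```python
import numpy as np

def e_step(X, weights, means, covs):
    """Posterior P(c_l | x_i) for every object and cluster, shape (N, k)."""
    N, d = X.shape
    k = len(weights)
    joint = np.empty((N, k))
    for l in range(k):
        diff = X - means[l]                                    # (N, d)
        # Mahalanobis term (x_i - mu_l)^T Sigma_l^{-1} (x_i - mu_l) per object
        maha = np.sum((diff @ np.linalg.inv(covs[l])) * diff, axis=1)
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs[l]))
        joint[:, l] = weights[l] * np.exp(-0.5 * maha) / norm  # P(c_l) P(x_i | c_l)
    # Divide by P(x_i) so that each row sums to one.
    return joint / joint.sum(axis=1, keepdims=True)

# Two hypothetical clusters with identity covariance.
X = np.array([[0.0, 0.0], [4.0, 4.0]])
resp = e_step(X, weights=[0.5, 0.5],
              means=np.array([[0.0, 0.0], [4.0, 4.0]]),
              covs=[np.eye(2), np.eye(2)])
```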
17 Expectation Maximization algorithm
M-step: re-estimate the model parameters by taking the maximum-likelihood estimate according to the current estimate of the complete data.
Cluster densities:
$$P^{new}(c_l) = \frac{1}{N} \sum_{i=1}^{N} P^{new}(c_l \mid x_i)$$
Cluster means:
$$\mu_l^{new} = \frac{\sum_{i=1}^{N} x_i \, P^{new}(c_l \mid x_i)}{\sum_{i=1}^{N} P^{new}(c_l \mid x_i)}$$
Cluster covariances:
$$\Sigma_l^{new} = \frac{\sum_{i=1}^{N} (x_i - \mu_l^{new})(x_i - \mu_l^{new})^\top P^{new}(c_l \mid x_i)}{\sum_{i=1}^{N} P^{new}(c_l \mid x_i)}$$
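The three update formulas translate almost line by line into array operations. The sketch below assumes the posteriors are given as a matrix `resp` with `resp[i, l]` $= P^{new}(c_l \mid x_i)$; the name `m_step` and the toy data are illustrative.

```python
import numpy as np

def m_step(X, resp):
    """Re-estimate P(c_l), mu_l and Sigma_l from posteriors resp[i, l]."""
    N, d = X.shape
    mass = resp.sum(axis=0)                  # sum_i P(c_l | x_i), shape (k,)
    weights = mass / N                       # cluster densities P(c_l)
    means = (resp.T @ X) / mass[:, None]     # weighted means, shape (k, d)
    covs = np.empty((len(mass), d, d))
    for l in range(len(mass)):
        diff = X - means[l]
        # weighted sum of outer products (x_i - mu_l)(x_i - mu_l)^T
        covs[l] = (resp[:, l, None] * diff).T @ diff / mass[l]
    return weights, means, covs

# Four objects with hard (0/1) posteriors, purely for illustration.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
resp = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
weights, means, covs = m_step(X, resp)
```

With hard posteriors the updates reduce to the ordinary per-cluster sample mean and (biased) sample covariance, which is a quick sanity check for an implementation.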