# Clustering - example. Given some data $x_i \in X$, find a partitioning of the data into $k$ disjoint clusters. Example: k-means clustering

## Transcription

1 Clustering - example (Graph Mining and Graph Kernels). Given some data $x_i \in X$, find a partitioning of the data into $k$ disjoint clusters. Example: k-means clustering. [Figure: plot of the data, axes $x_1$ and $x_2$.]

2 Clustering - example. Given some data $x_i \in X$, find a partitioning of the data into $k$ disjoint clusters. Example: k-means clustering. Can we do something better? [Figure: plot of the data, axes $x_1$ and $x_2$.]

3 Generative model view of clustering. Instead of partitioning the data, try to describe the underlying generative process of the data. Each cluster can be seen as one distribution, for example a Gaussian distribution. Objects $x_i$ are assumed to be independent samples from their cluster distribution, $x_i \sim \mathcal{N}(\mu_l, \Sigma_l)$, which gives a Gaussian mixture model. [Figure: univariate Gaussian probability density functions $f(x)$ over $x$ for three clusters; legend: $c_1 = \text{Normal}(\dots, 1.5)$, $p(c_1) = 0.5$; $c_2 = \text{Normal}(3, 0.5)$, $p(c_2) = \dots$; $c_3 = \text{Normal}(-\dots, 0.7)$, $p(c_3) = 0.3$.]
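The generative view can be made concrete by sampling: first pick a cluster according to its probability, then draw the object from that cluster's Gaussian. The following is a minimal sketch using only the Python standard library. The mixture parameters in `COMPONENTS` and the function name `sample_mixture` are hypothetical choices for illustration, not values taken from the slides.

```python
import random

# Hypothetical mixture parameters, chosen for illustration only:
# (weight P(c_l), mean, standard deviation) per cluster; weights sum to 1.
COMPONENTS = [(0.5, 0.0, 1.5), (0.2, 3.0, 0.5), (0.3, -2.0, 0.7)]

def sample_mixture(n, components, seed=0):
    """Draw n (cluster index, value) pairs from a univariate Gaussian mixture."""
    rng = random.Random(seed)
    weights = [w for w, _, _ in components]
    samples = []
    for _ in range(n):
        # Step 1: pick the cluster l with probability P(c_l).
        l = rng.choices(range(len(components)), weights=weights)[0]
        # Step 2: draw x_i from that cluster's Gaussian distribution.
        _, mu, sigma = components[l]
        samples.append((l, rng.gauss(mu, sigma)))
    return samples

samples = sample_mixture(1000, COMPONENTS)
```

In a clustering setting only the values are observed; the cluster indices returned here are exactly the hidden information that the model has to recover.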

4 Gaussian Mixture Model - Introduction. Data $x_i$ are independent and identically distributed (i.i.d.) samples from a mixture of $k$ distributions $c_l$, with $x_i \in \mathbb{R}^d$, $i \in \{1 \dots N\}$ and $c_l$, $l \in \{1 \dots k\}$. Each cluster is a multivariate Gaussian distribution, $x_i \sim \mathcal{N}(\mu_l, \Sigma_l)$. Sufficient statistics of each cluster: mean (centroid) $\mu_l \in \mathbb{R}^d$ and covariance (empirical covariance matrix) $\Sigma_l \in \mathbb{R}^{d \times d}$. Probability density function of a Gaussian distribution:

$$P(x_i \mid c_l) = f_l(x_i) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_l)}} \exp\left(-\frac{1}{2} (x_i - \mu_l)^\top \Sigma_l^{-1} (x_i - \mu_l)\right)$$
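As a hedged illustration of the density above, the sketch below evaluates $f_l(x_i)$ for a single point with NumPy. The function name `gaussian_pdf` is an assumption made for this example; solving a linear system instead of inverting $\Sigma_l$ explicitly is a numerical choice, not something prescribed by the slides.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at a single point x in R^d."""
    d = mu.shape[0]
    diff = x - mu
    # Solve sigma * z = diff instead of forming the explicit inverse.
    z = np.linalg.solve(sigma, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
    return float(np.exp(-0.5 * diff @ z) / norm)

# Standard normal in two dimensions, evaluated at its mean: 1 / (2 * pi).
p = gaussian_pdf(np.zeros(2), np.zeros(2), np.eye(2))
```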

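5 Gaussian Mixture Model - Introduction. Mixture of one-dimensional Gaussians, $c_l = \mathcal{N}(\mu_l, \sigma_l)$. [Figure: univariate Gaussian probability density functions $f(x)$ over $x$ for three clusters; legend: $c_1 = \text{Normal}(\dots, 1.5)$, $p(c_1) = 0.5$; $c_2 = \text{Normal}(3, 0.5)$, $p(c_2) = \dots$; $c_3 = \text{Normal}(-\dots, 0.7)$, $p(c_3) = 0.3$.]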

6 Gaussian Mixture Model - Introduction. Mixture of multivariate Gaussians. [Figure: plot of the data, axes $x_1$ and $x_2$.]

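7 Gaussian Mixture Model - Introduction. Mixture of multivariate Gaussians. [Figure: plot of the data, axes $x_1$ and $x_2$, with clusters annotated by their parameters $\mu_l$, $\Sigma_l$ and labeled "No covariance", "Negative covariance", and "Positive covariance".]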

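8 Gaussian Mixture Model - some maths. Probability of a cluster $c_l$:

$$P(c_l) = \frac{1}{N} \sum_{i=1}^{N} P(c_l \mid x_i)$$

This is an empirical estimate of the density of the cluster: low density => small $P(c_l)$, high density => large $P(c_l)$. [Figure: plot of the data, axes $x_1$ and $x_2$, with a low-density and a high-density cluster marked.]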

9 Gaussian Mixture Model - some maths. Probability of a cluster $c_l$ (empirical estimate of the density of the cluster):

$$P(c_l) = \frac{1}{N} \sum_{i=1}^{N} P(c_l \mid x_i)$$

Probability of observing an object $x_i$:

$$P(x_i) = \sum_{l=1}^{k} P(c_l) P(x_i \mid c_l)$$

Probability of observing an object $x_i$ given its cluster $c_l$:

$$P(x_i \mid c_l) = \frac{1}{\sqrt{(2\pi)^d \det(\Sigma_l)}} \exp\left(-\frac{1}{2} (x_i - \mu_l)^\top \Sigma_l^{-1} (x_i - \mu_l)\right)$$

10 Gaussian Mixture Model - likelihood function. Quality measure of the model: the probability that the data is generated by the GMM.

$$L = \prod_{i=1}^{N} P(x_i) = \prod_{i=1}^{N} \sum_{l=1}^{k} P(c_l) P(x_i \mid c_l)$$

It is also possible to use the log-likelihood $\log(L)$.
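A product of many small densities underflows quickly in floating point, which is one practical reason to prefer $\log(L)$. The following NumPy sketch computes the log-likelihood as a sum of logs of the mixture densities $P(x_i)$. The function names `component_densities` and `log_likelihood`, and the array layout (rows are objects, columns are clusters), are assumptions made for this example.

```python
import numpy as np

def component_densities(X, means, covs):
    """Return an (N, k) array whose entry [i, l] is P(x_i | c_l)."""
    N, d = X.shape
    dens = np.empty((N, len(means)))
    for l, (mu, sigma) in enumerate(zip(means, covs)):
        diff = X - mu                               # (N, d)
        z = np.linalg.solve(sigma, diff.T).T        # rows: Sigma^{-1} (x_i - mu)
        maha = np.sum(diff * z, axis=1)             # squared Mahalanobis distance
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
        dens[:, l] = np.exp(-0.5 * maha) / norm
    return dens

def log_likelihood(X, weights, means, covs):
    """log L = sum_i log( sum_l P(c_l) P(x_i | c_l) )."""
    mix = component_densities(X, means, covs) @ weights   # P(x_i), shape (N,)
    return float(np.sum(np.log(mix)))

# Toy check: two identical standard-normal clusters behave like one.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
ll = log_likelihood(X, np.array([0.5, 0.5]),
                    [np.zeros(2), np.zeros(2)], [np.eye(2), np.eye(2)])
```

For the toy input the result equals $-2\log(2\pi) - 1$, since the two points contribute $\log\frac{1}{2\pi}$ and $\log\frac{e^{-1}}{2\pi}$.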

11 Gaussian Mixture Model - clustering. Question: how can we use the GMM to partition the data? Choose the most likely cluster assignment of each object:

$$\arg\max_l P(c_l \mid x_i) = \arg\max_l P(c_l) P(x_i \mid c_l)$$

[Figure: plot of the data, axes $x_1$ and $x_2$.]
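The assignment rule can be sketched in a few lines, assuming the densities $P(x_i \mid c_l)$ have already been computed and stored in an array with one row per object. The function name `assign_clusters` and the toy numbers are hypothetical and only serve to show that the cluster probability $P(c_l)$ can change the outcome.

```python
import numpy as np

def assign_clusters(densities, weights):
    """densities[i, l] = P(x_i | c_l); returns argmax_l P(c_l) P(x_i | c_l)."""
    joint = densities * weights        # broadcast over rows: P(c_l) P(x_i | c_l)
    return np.argmax(joint, axis=1)

densities = np.array([[0.30, 0.05],
                      [0.10, 0.20],
                      [0.10, 0.12]])
weights = np.array([0.6, 0.4])
labels = assign_clusters(densities, weights)
```

The third object has a slightly higher density under the second cluster (0.12 versus 0.10), but the larger weight of the first cluster gives 0.06 versus 0.048, so it is assigned to the first cluster.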

12 Gaussian Mixture Model - clustering. Question: how can we use the GMM to partition the data? Choose the most likely cluster assignment of each object:

$$\arg\max_l P(c_l \mid x_i) = \arg\max_l P(c_l) P(x_i \mid c_l)$$

Great! But... [Figure: plot of the data, axes $x_1$ and $x_2$.]

13 This is all that we have. How to estimate the sufficient statistics of each cluster, the mean (centroid) $\mu_l \in \mathbb{R}^d$ and the covariance (empirical covariance matrix) $\Sigma_l \in \mathbb{R}^{d \times d}$? => Use the Expectation Maximization algorithm. [Figure: plot of the data, axes $x_1$ and $x_2$.]

14 Expectation Maximization algorithm. Original algorithm by [Dempster, Laird and Rubin, 1977]. General method for finding the maximum-likelihood estimate of a data distribution when the data is partially missing or hidden. How does this apply? The data $x_i$ are fully observed. Trick: the cluster assignment of an object $x_i$ can be seen as a hidden variable.

15 Expectation Maximization algorithm. A short sketch of the EM algorithm: initialize the cluster assignments, then alternate two steps until convergence. E-step: re-estimate the expected values of the hidden data (cluster assignments) under the current estimate of the model. M-step: re-estimate the model parameters such that the likelihood of the complete data, according to the current estimate, is maximized. Stop when the likelihood no longer improves noticeably:

$$\frac{L^{new}}{L^{old}} < 1 + \epsilon$$
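The alternation and the stopping rule can be sketched for a univariate mixture using only the standard library. This is a minimal sketch under several assumptions: all function names are hypothetical, the model is initialized from parameter guesses rather than from cluster assignments, and the ratio test is applied to log-likelihoods, where $L^{new} / L^{old} < 1 + \epsilon$ becomes $\log L^{new} - \log L^{old} < \log(1 + \epsilon)$.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(xs, params):
    # params is a list of (weight, mean, variance) tuples, one per cluster.
    return sum(math.log(sum(w * normal_pdf(x, mu, var) for w, mu, var in params))
               for x in xs)

def e_step(xs, params):
    # Posterior P(c_l | x_i) for every object and cluster.
    resp = []
    for x in xs:
        joint = [w * normal_pdf(x, mu, var) for w, mu, var in params]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

def m_step(xs, resp, k):
    # Weighted maximum-likelihood estimates of weight, mean, and variance.
    params = []
    for l in range(k):
        r = [row[l] for row in resp]
        n_l = sum(r)
        mu = sum(ri * x for ri, x in zip(r, xs)) / n_l
        var = sum(ri * (x - mu) ** 2 for ri, x in zip(r, xs)) / n_l
        params.append((n_l / len(xs), mu, var))
    return params

def run_em(xs, params, eps=1e-8, max_iter=500):
    old = log_likelihood(xs, params)
    history = [old]
    for _ in range(max_iter):
        params = m_step(xs, e_step(xs, params), len(params))
        new = log_likelihood(xs, params)
        history.append(new)
        # L_new / L_old < 1 + eps, tested on log-likelihoods to avoid underflow.
        if new - old < math.log1p(eps):
            break
        old = new
    return params, history

# Toy data from two well-separated Gaussians (illustrative values only).
rng = random.Random(0)
xs = ([rng.gauss(-2.0, 0.7) for _ in range(150)]
      + [rng.gauss(3.0, 0.5) for _ in range(100)])
params, history = run_em(xs, [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)])
```

The `history` list records the log-likelihood after every iteration; it should never decrease, which is a convenient sanity check when implementing EM.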

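16 Expectation Maximization algorithm. E-step: re-estimate the expected values of the hidden data (cluster assignments) under the current estimate of the model. By Bayes' rule, with $P(x_i) = \sum_{l'=1}^{k} P(c_{l'}) P(x_i \mid c_{l'})$ as the normalizing term:

$$P^{new}(c_l \mid x_i) = \frac{P(c_l) P(x_i \mid c_l)}{P(x_i)}$$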
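A vectorized version of the E-step might look as follows, assuming the densities $P(x_i \mid c_l)$ are given as an array with one row per object and one column per cluster. The function name `e_step` and the toy numbers are hypothetical.

```python
import numpy as np

def e_step(densities, weights):
    """densities[i, l] = P(x_i | c_l); returns P_new(c_l | x_i), shape (N, k)."""
    joint = densities * weights                   # P(c_l) P(x_i | c_l)
    evidence = joint.sum(axis=1, keepdims=True)   # P(x_i), one value per object
    return joint / evidence

resp = e_step(np.array([[0.30, 0.05],
                        [0.10, 0.20]]),
              np.array([0.6, 0.4]))
```

Each row of `resp` sums to one, so the result is a soft cluster assignment per object; for very small densities a log-space variant would be the more robust choice.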

17 Expectation Maximization algorithm. M-step: re-estimate the model parameters by taking the maximum likelihood estimate according to the current estimate of the complete data.

Cluster densities:

$$P^{new}(c_l) = \frac{1}{N} \sum_{i=1}^{N} P^{new}(c_l \mid x_i)$$

Cluster means:

$$\mu_l^{new} = \frac{\sum_{i=1}^{N} x_i \, P^{new}(c_l \mid x_i)}{\sum_{i=1}^{N} P^{new}(c_l \mid x_i)}$$

Cluster covariances:

$$\Sigma_l^{new} = \frac{\sum_{i=1}^{N} (x_i - \mu_l^{new})(x_i - \mu_l^{new})^\top P^{new}(c_l \mid x_i)}{\sum_{i=1}^{N} P^{new}(c_l \mid x_i)}$$
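The three update formulas translate almost directly into NumPy. The sketch below assumes the posteriors $P^{new}(c_l \mid x_i)$ are given as an array `resp` with one row per object; the function name `m_step` and the toy data are hypothetical.

```python
import numpy as np

def m_step(X, resp):
    """X: (N, d) data; resp[i, l] = P_new(c_l | x_i). Returns weights, means, covs."""
    N, d = X.shape
    k = resp.shape[1]
    n_l = resp.sum(axis=0)                    # sum_i P_new(c_l | x_i), shape (k,)
    weights = n_l / N                         # cluster densities P_new(c_l)
    means = (resp.T @ X) / n_l[:, None]       # weighted means, shape (k, d)
    covs = np.empty((k, d, d))
    for l in range(k):
        diff = X - means[l]                   # (N, d)
        # Weighted sum of outer products (x_i - mu_l)(x_i - mu_l)^T.
        covs[l] = (resp[:, l, None] * diff).T @ diff / n_l[l]
    return weights, means, covs

# Toy input with hard (0/1) posteriors so the result is easy to verify by hand.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
resp = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
weights, means, covs = m_step(X, resp)
```

With hard posteriors the updates reduce to the per-cluster sample mean and the (biased) empirical covariance, which matches the interpretation of mean and covariance as the sufficient statistics of each cluster.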
