Clustering Gene Expression Data. (Slides thanks to Dr. Mark Craven)

Transcription

1 Clustering Gene Expression Data. Slides thanks to Dr. Mark Craven.

2 Gene Expression Profiles: we'll assume we have a 2D matrix of gene expression measurements; rows represent genes; columns represent different experiments, time points, individuals, etc. (what we can measure using one* microarray); we'll refer to individual rows or columns as profiles; a row is a profile for a gene. * Depending on the number of genes being considered, we might actually use several arrays per experiment, time point, or individual.

3 Expression Profile Example: rows represent genes; columns represent people with leukemia.

4 Task Definition: Clustering Gene Expression Profiles. Given: expression profiles for a set of genes or experiments/individuals/time points (whatever columns represent). Do: organize profiles into clusters such that instances in the same cluster are highly similar to each other, and instances from different clusters have low similarity to each other.

5 Motivation for Clustering: exploratory data analysis (understanding general characteristics of data, visualizing data); generalization (infer something about an instance, e.g. a gene, based on how it relates to other instances); everyone else is doing it.

6 The Clustering Landscape: there are many different clustering algorithms; they differ along several dimensions: hierarchical vs. partitional (flat); hard (no uncertainty about which instances belong to a cluster) vs. soft clusters; disjunctive (an instance can belong to multiple clusters) vs. non-disjunctive; deterministic (same clusters produced every time for a given data set) vs. stochastic; distance (similarity) measure used.

7 Distance/Similarity Measures: many clustering methods employ a distance (similarity) measure to assess the distance between a pair of instances, a cluster and an instance, or a pair of clusters; given a distance value, it is straightforward to convert it into a similarity value: $\mathrm{sim}(x, y) = \frac{1}{1 + \mathrm{dist}(x, y)}$; it is not necessarily straightforward to go the other way; we'll describe our algorithms in terms of distances.

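8 Distance Metrics: properties of metrics: $\mathrm{dist}(x_i, x_j) \ge 0$; $\mathrm{dist}(x_i, x_i) = 0$; $\mathrm{dist}(x_i, x_j) = \mathrm{dist}(x_j, x_i)$; $\mathrm{dist}(x_i, x_j) \le \mathrm{dist}(x_i, x_k) + \mathrm{dist}(x_k, x_j)$. Some distance metrics: Manhattan: $\mathrm{dist}(x_i, x_j) = \sum_e |x_{i,e} - x_{j,e}|$; Euclidean: $\mathrm{dist}(x_i, x_j) = \sqrt{\sum_e (x_{i,e} - x_{j,e})^2}$; $e$ ranges over the individual measurements for $x_i$ and $x_j$.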
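The two metrics on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names and the two sample profiles `g1` and `g2` are made up, and `similarity` uses the common 1/(1 + dist) conversion as an assumed choice.

```python
import math

def manhattan(x, y):
    """Sum of absolute differences over the individual measurements e."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """Square root of the summed squared differences over the measurements e."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def similarity(d):
    """Convert a distance into a similarity in (0, 1] (assumed conversion)."""
    return 1.0 / (1.0 + d)

# Two hypothetical expression profiles with three measurements each.
g1 = [1.0, 2.0, 3.0]
g2 = [4.0, 6.0, 3.0]

print(manhattan(g1, g2))              # 7.0
print(euclidean(g1, g2))              # 5.0
print(similarity(manhattan(g1, g2)))  # 0.125
```

Both functions satisfy the metric properties listed above; which one is appropriate depends on how the expression measurements were normalized.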

9 Hierarchical Clustering: A Dendrogram. Height of bar indicates degree of distance within cluster (distance scale starts at 0); leaves represent instances (e.g. genes).

10 Hierarchical Clustering: can do top-down (divisive) or bottom-up (agglomerative); in either case, we maintain a matrix of distance (or similarity) scores for all pairs of instances, clusters (formed so far), and instances and clusters.

11 Distance Between Two Clusters: the distance between two clusters can be determined in several ways. Single link (distance of two most similar instances): $\mathrm{dist}(c_u, c_v) = \min\{\mathrm{dist}(a, b) \mid a \in c_u, b \in c_v\}$. Complete link (distance of two least similar instances): $\mathrm{dist}(c_u, c_v) = \max\{\mathrm{dist}(a, b) \mid a \in c_u, b \in c_v\}$. Average link (average distance between instances): $\mathrm{dist}(c_u, c_v) = \mathrm{avg}\{\mathrm{dist}(a, b) \mid a \in c_u, b \in c_v\}$.
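The three linkage definitions translate directly into code. The sketch below assumes Manhattan distance between instances and uses two made-up clusters `c_u` and `c_v`; the function names are illustrative only.

```python
def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def single_link(c_u, c_v, dist=manhattan):
    """Distance between the two most similar instances, one from each cluster."""
    return min(dist(a, b) for a in c_u for b in c_v)

def complete_link(c_u, c_v, dist=manhattan):
    """Distance between the two least similar instances, one from each cluster."""
    return max(dist(a, b) for a in c_u for b in c_v)

def average_link(c_u, c_v, dist=manhattan):
    """Average distance over all cross-cluster pairs of instances."""
    total = sum(dist(a, b) for a in c_u for b in c_v)
    return total / (len(c_u) * len(c_v))

# Hypothetical clusters of 2-D instances.
c_u = [[0.0, 0.0], [1.0, 0.0]]
c_v = [[3.0, 0.0], [5.0, 0.0]]

print(single_link(c_u, c_v))    # 2.0
print(complete_link(c_u, c_v))  # 5.0
print(average_link(c_u, c_v))   # 3.5
```

Single link tends to produce elongated, chained clusters, while complete link favors compact ones, so the choice changes the shape of the resulting tree.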

12 Complete-Link vs. Single-Link Distances. [Figure: the complete-link and single-link distances between two clusters $c_u$ and $c_v$]

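13 Updating Distances Efficiently: if we just merged $c_u$ and $c_v$ into $c_j$, we can determine the distance to each other cluster $c_k$ as follows. Single link: $\mathrm{dist}(c_j, c_k) = \min\{\mathrm{dist}(c_u, c_k), \mathrm{dist}(c_v, c_k)\}$. Complete link: $\mathrm{dist}(c_j, c_k) = \max\{\mathrm{dist}(c_u, c_k), \mathrm{dist}(c_v, c_k)\}$. Average link: $\mathrm{dist}(c_j, c_k) = \frac{|c_u| \, \mathrm{dist}(c_u, c_k) + |c_v| \, \mathrm{dist}(c_v, c_k)}{|c_u| + |c_v|}$.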
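The update rules mean that, after a merge, the new row of the distance matrix can be computed from two old entries and the cluster sizes, without revisiting the instances. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def merged_distance(d_uk, d_vk, size_u, size_v, linkage):
    """Distance from the merged cluster (c_u and c_v) to another cluster c_k.

    d_uk and d_vk are the already-known distances dist(c_u, c_k) and
    dist(c_v, c_k); size_u and size_v are |c_u| and |c_v|.
    """
    if linkage == "single":
        return min(d_uk, d_vk)
    if linkage == "complete":
        return max(d_uk, d_vk)
    if linkage == "average":
        return (size_u * d_uk + size_v * d_vk) / (size_u + size_v)
    raise ValueError(f"unknown linkage: {linkage}")

# Hypothetical values: c_u has 3 instances, c_v has 1.
print(merged_distance(2.0, 6.0, 3, 1, "single"))    # 2.0
print(merged_distance(2.0, 6.0, 3, 1, "complete"))  # 6.0
print(merged_distance(2.0, 6.0, 3, 1, "average"))   # 3.0
```

Each update costs constant time per remaining cluster, which is what makes maintaining the distance matrix cheaper than recomputing linkages from scratch.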

14 Dendrogram for Serum Stimulation of Fibroblasts. [Figure: dendrogram with labeled gene clusters: signaling & angiogenesis; cell cycle; cholesterol biosynthesis]

15 Partitional Clustering: divide instances into disjoint clusters (flat vs. tree structure); key issues: how many clusters should there be? how should clusters be represented?

16 Partitional Clustering Example

17 Partitional Clustering from a Hierarchical Clustering: we can always generate a partitional clustering from a hierarchical clustering by cutting the tree at some level. [Figure: a dendrogram with two cut levels marked; the number of clusters depends on the level, and one of the cuts shown results in 4 clusters]
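One way to see the connection is that stopping bottom-up merging when k clusters remain gives the same partition as cutting the finished tree at the level where it has k branches (ignoring ties between equal distances). The sketch below assumes single-link merging on made-up 1-D values; `agglomerate` is a hypothetical helper, not something from the slides.

```python
def agglomerate(points, k, dist):
    """Merge the two closest clusters (single link) until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second cluster)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def abs_diff(a, b):
    return abs(a - b)

points = [1.0, 2.0, 6.0, 7.0, 20.0]  # hypothetical 1-D measurements
print(agglomerate(points, 3, abs_diff))  # [[1.0, 2.0], [6.0, 7.0], [20.0]]
print(agglomerate(points, 2, abs_diff))  # [[1.0, 2.0, 6.0, 7.0], [20.0]]
```

Cutting lower in the tree (larger k) only splits existing clusters, so the partitions obtained this way are nested.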

18 K-Means Clustering: assume our instances are represented by vectors of real values; put k cluster centers in the same space as the instances; each cluster is represented by a vector. [Figure: an example with low-dimensional vectors plotted as points; the legend marks instances and cluster centers with + symbols]

19 K-Means Clustering: each iteration involves two steps: assignment of instances to clusters; re-computation of the means. [Figure: two panels, "assignment" and "re-computation of means"]

20 K-Means Clustering: Updating the Means. For a set of instances that have been assigned to a cluster $c_j$, we re-compute the mean of the cluster as follows: $\mu(c_j) = \frac{1}{|c_j|} \sum_{x_i \in c_j} x_i$.

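21 K-Means Clustering
given: a set $X = \{x_1, \ldots, x_n\}$ of instances
select k initial cluster centers $f_1, \ldots, f_k$
while stopping criterion not true do
    for all clusters $c_j$ do    // determine which instances are assigned to this cluster
        $c_j = \{x_i \mid \forall f_l : \mathrm{dist}(x_i, f_j) < \mathrm{dist}(x_i, f_l)\}$
    for all means $f_j$ do    // update the cluster center
        $f_j = \mu(c_j)$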
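The loop above can be sketched in plain Python. This is an illustration under stated assumptions rather than the slides' own code: it uses Manhattan distance, stops when assignments no longer change, breaks ties in favor of the lowest-numbered center (the pseudocode's strict inequality leaves ties undefined), and leaves a center unchanged if its cluster becomes empty. The data and initial centers are made up.

```python
def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def mean(cluster):
    """Component-wise mean of the instances assigned to one cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def k_means(instances, initial_centers, dist=manhattan, max_iter=100):
    centers = [list(c) for c in initial_centers]  # copy so the input is not modified
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every instance.
        new_assignment = [
            min(range(len(centers)), key=lambda j: dist(x, centers[j]))
            for x in instances
        ]
        if new_assignment == assignment:  # stopping criterion: nothing changed
            break
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [x for x, a in zip(instances, assignment) if a == j]
            if members:
                centers[j] = mean(members)
    return centers, assignment

instances = [[1.0, 1.0], [2.0, 1.0], [8.0, 8.0], [9.0, 8.0]]
centers, assignment = k_means(instances, [[0.0, 0.0], [10.0, 10.0]])
print(centers)     # [[1.5, 1.0], [8.5, 8.0]]
print(assignment)  # [0, 0, 1, 1]
```

Different initial centers can lead to different final clusters, which is why k-means is often restarted several times in practice.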

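22 K-Means Clustering Example: given the following 4 instances and clusters initialized as shown, assume the distance function is $\mathrm{dist}(x_i, x_j) = \sum_e |x_{i,e} - x_{j,e}|$. [Figure: the instances, the initial cluster centers, and the dist values computed between each instance and each cluster center]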

23 K-Means Clustering Example, Continued: assignments remain the same, so the procedure has converged.

24 EM Clustering: in k-means as just described, instances are assigned to one and only one cluster; we can do soft k-means clustering via an Expectation Maximization (EM) algorithm; each cluster is represented by a distribution (e.g. a Gaussian); E step: determine how likely it is that each cluster generated each instance; M step: adjust cluster parameters to maximize the likelihood of the instances.

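25 Representation of Clusters: in the EM approach, we'll represent each cluster using an m-dimensional multivariate Gaussian $N_j(x_i) = \frac{1}{\sqrt{(2\pi)^m |\Sigma_j|}} \exp\left[-\frac{1}{2} (x_i - \mu_j)^T \Sigma_j^{-1} (x_i - \mu_j)\right]$, where $\mu_j$ is the mean of the Gaussian and $\Sigma_j$ is the covariance matrix. [Figure: a representation of a Gaussian plotted in a low-dimensional space]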
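To make the density concrete, here is a small sketch that evaluates it for the special case of a diagonal covariance matrix, where the determinant is the product of the variances and the inverse is taken element by element. The diagonal restriction and the function name are assumptions of this example; a full covariance matrix would need a matrix inverse and determinant.

```python
import math

def gaussian_density(x, mean, variances):
    """Multivariate Gaussian density with a diagonal covariance matrix.

    `variances` holds the diagonal entries of the covariance matrix, so its
    determinant is their product and the quadratic form is a weighted sum
    of squared differences.
    """
    m = len(x)
    det = math.prod(variances)
    quad = sum((xi - mu) ** 2 / var for xi, mu, var in zip(x, mean, variances))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** m * det)

# Density at the mean of a 2-D Gaussian with unit variances: 1 / (2 * pi).
print(gaussian_density([0.0, 0.0], [0.0, 0.0], [1.0, 1.0]))
# The density drops as the instance moves away from the mean.
print(gaussian_density([2.0, 0.0], [0.0, 0.0], [1.0, 1.0]))
```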

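26 EM Clustering: the EM algorithm will try to set the parameters of the Gaussians, $\Theta$, to maximize the log likelihood of the data, $X$: $\log \mathrm{likelihood}(X \mid \Theta) = \log \prod_{i=1}^{n} \Pr(x_i) = \log \prod_{i=1}^{n} \sum_{j=1}^{k} N_j(x_i) = \sum_{i=1}^{n} \log \sum_{j=1}^{k} N_j(x_i)$.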
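The objective can be computed directly from its final form, a sum over instances of the log of the summed component densities. The sketch below assumes identity covariance matrices and, like the formula on the slide, omits prior weights; the data and the two candidate sets of means are made up.

```python
import math

def gaussian(x, mean):
    """Gaussian density with an identity covariance matrix (an assumption here)."""
    m = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return math.exp(-0.5 * sq) / math.sqrt((2 * math.pi) ** m)

def log_likelihood(instances, means):
    """Sum over instances of log(sum over clusters of N_j(x_i))."""
    return sum(
        math.log(sum(gaussian(x, mu) for mu in means))
        for x in instances
    )

instances = [[1.0], [2.0], [8.0], [9.0]]
good_means = [[1.5], [8.5]]   # close to the two groups of instances
poor_means = [[4.0], [6.0]]   # between the groups

print(log_likelihood(instances, good_means))
print(log_likelihood(instances, poor_means))  # lower than for good_means
```

If an instance lies extremely far from every mean, the densities can underflow to zero and the logarithm fails; production code usually works in log space to avoid this.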

27 EM Clustering: the parameters of the model, $\Theta$, include the means, the covariance matrix, and sometimes prior weights for each Gaussian; here, we'll assume that the covariance matrix and the prior weights are fixed; we'll focus just on setting the means.

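28 EM Clustering: the E-step. Recall that $z_{ij}$ is a hidden variable which is 1 if $N_j$ generated $x_i$ and 0 otherwise; in the E-step, we compute $h_{ij}$, the expected value of this hidden variable: $h_{ij} = E(z_{ij} \mid x_i) = \frac{N_j(x_i)}{\sum_{l=1}^{k} N_l(x_i)}$. [Figure: soft assignment of instances to clusters]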
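In code, the E-step normalizes each instance's component densities so they sum to 1. The sketch below assumes identity covariance matrices and equal prior weights (both held fixed, as on the previous slide); the instances and means are hypothetical.

```python
import math

def gaussian(x, mean):
    """Gaussian density with an identity covariance matrix (an assumption here)."""
    m = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return math.exp(-0.5 * sq) / math.sqrt((2 * math.pi) ** m)

def e_step(instances, means):
    """Return h where h[i][j] is the expected value of z_ij."""
    h = []
    for x in instances:
        densities = [gaussian(x, mu) for mu in means]
        total = sum(densities)
        h.append([d / total for d in densities])
    return h

instances = [[1.0], [5.0], [9.0]]
means = [[1.0], [9.0]]
h = e_step(instances, means)
for row in h:
    print(row)
# The middle instance is equally far from both means, so its row is [0.5, 0.5];
# the other two instances are assigned almost entirely to the nearer cluster.
```

Unlike the hard assignment in k-means, every instance contributes to every cluster, weighted by these expected values.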

29 EM Clustering: the M-step. Given the expected values $h_{ij}$, we re-estimate the means of the Gaussians: $\mu_j' = \frac{\sum_{i=1}^{n} h_{ij} \, x_i}{\sum_{i=1}^{n} h_{ij}}$; we can also re-estimate the covariance matrix and prior weights, if we're varying them.
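Putting the two steps together gives a complete, if simplified, EM loop. As in the slides, only the means are re-estimated; this sketch additionally assumes identity covariance matrices, equal prior weights, and a fixed number of iterations instead of a convergence test. All names and data are illustrative.

```python
import math

def gaussian(x, mean):
    """Gaussian density with an identity covariance matrix (an assumption here)."""
    m = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return math.exp(-0.5 * sq) / math.sqrt((2 * math.pi) ** m)

def e_step(instances, means):
    h = []
    for x in instances:
        densities = [gaussian(x, mu) for mu in means]
        total = sum(densities)
        h.append([d / total for d in densities])
    return h

def m_step(instances, h, k):
    """Each new mean is the h-weighted average of all instances."""
    dims = len(instances[0])
    new_means = []
    for j in range(k):
        weight = sum(h[i][j] for i in range(len(instances)))
        new_means.append([
            sum(h[i][j] * instances[i][e] for i in range(len(instances))) / weight
            for e in range(dims)
        ])
    return new_means

def em(instances, initial_means, iterations=20):
    means = [list(mu) for mu in initial_means]
    for _ in range(iterations):
        h = e_step(instances, means)
        means = m_step(instances, h, len(means))
    return means

instances = [[1.0], [2.0], [8.0], [9.0]]
means = em(instances, [[0.0], [10.0]])
print(means)  # approximately [[1.5], [8.5]]
```

Because the two groups are well separated, the soft assignments are nearly hard and the result is close to what k-means would give on the same data.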

30 EM and K-Means Clustering: both will converge to a local optimum, which is not necessarily the global one; both are sensitive to the initial positions (means) of the clusters; we have to choose a value of k for both.

31 Evaluating Clustering Results: given random data without any structure, clustering algorithms will still return clusters; the gold standard: do clusters correspond to natural categories? do clusters correspond to categories we care about? (there are lots of ways to partition the world)

32 Evaluating Clustering Results: some approaches: external validation (e.g. do genes clustered together have some common function?); internal validation (how well does the clustering optimize intra-cluster similarity and inter-cluster dissimilarity?); relative validation (how does it compare to other clusterings using these criteria? e.g. with a probabilistic method such as EM we can ask: how probable does held-aside data look as we vary the number of clusters?).
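As one illustration of internal validation, a clustering can be scored by comparing the average distance between instances in the same cluster with the average distance between instances in different clusters. This particular pair of measures is an assumed, simple choice and is not taken from the slides; the data are made up, and the helper names are hypothetical.

```python
from itertools import combinations

def avg_intra(clusters, dist):
    """Average distance over all pairs of instances that share a cluster."""
    pairs = [dist(a, b) for c in clusters for a, b in combinations(c, 2)]
    return sum(pairs) / len(pairs)

def avg_inter(clusters, dist):
    """Average distance over all pairs of instances in different clusters."""
    pairs = [
        dist(a, b)
        for c_u, c_v in combinations(clusters, 2)
        for a in c_u
        for b in c_v
    ]
    return sum(pairs) / len(pairs)

def abs_diff(a, b):
    return abs(a - b)

good = [[1.0, 2.0], [8.0, 9.0]]  # groups nearby values together
poor = [[1.0, 8.0], [2.0, 9.0]]  # mixes the two groups

print(avg_intra(good, abs_diff), avg_inter(good, abs_diff))  # 1.0 7.0
print(avg_intra(poor, abs_diff), avg_inter(poor, abs_diff))  # 7.0 4.0
```

A good clustering has a small intra-cluster average relative to the inter-cluster average. Both helpers assume at least one qualifying pair exists, so a clustering made only of singletons would need separate handling.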

33 Comments on Clustering: there are many different ways to do clustering; we've discussed just a few methods; hierarchical clusters may be more informative, but they're more expensive to compute; clusterings are hard to evaluate in many cases.


A Comparative Study of Data Clustering Techniques A COMPARATIVE STUDY OF DATA CLUSTERING TECHNIQUES A Comparatve Study of Data Clusterng Technques Khaled Hammouda Prof. Fakhreddne Karray Unversty of Waterloo, Ontaro, Canada Abstract Data clusterng s a

More information

Lecture 7 March 20, 2002

Lecture 7 March 20, 2002 MIT 8.996: Topc n TCS: Internet Research Problems Sprng 2002 Lecture 7 March 20, 2002 Lecturer: Bran Dean Global Load Balancng Scrbe: John Kogel, Ben Leong In today s lecture, we dscuss global load balancng

More information

Detecting Global Motion Patterns in Complex Videos

Detecting Global Motion Patterns in Complex Videos Detectng Global Moton Patterns n Complex Vdeos Mn Hu, Saad Al, Mubarak Shah Computer Vson Lab, Unversty of Central Florda {mhu,sal,shah}@eecs.ucf.edu Abstract Learnng domnant moton patterns or actvtes

More information

NON-CONSTANT SUM RED-AND-BLACK GAMES WITH BET-DEPENDENT WIN PROBABILITY FUNCTION LAURA PONTIGGIA, University of the Sciences in Philadelphia

NON-CONSTANT SUM RED-AND-BLACK GAMES WITH BET-DEPENDENT WIN PROBABILITY FUNCTION LAURA PONTIGGIA, University of the Sciences in Philadelphia To appear n Journal o Appled Probablty June 2007 O-COSTAT SUM RED-AD-BLACK GAMES WITH BET-DEPEDET WI PROBABILITY FUCTIO LAURA POTIGGIA, Unversty o the Scences n Phladelpha Abstract In ths paper we nvestgate

More information

Homework 11. Problems: 20.37, 22.33, 22.41, 22.67

Homework 11. Problems: 20.37, 22.33, 22.41, 22.67 Homework 11 roblems: 0.7,.,.41,.67 roblem 0.7 1.00-kg block o alumnum s heated at atmospherc pressure such that ts temperature ncreases rom.0 to 40.0. Fnd (a) the work done by the alumnum, (b) the energy

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Conversion between the vector and raster data structures using Fuzzy Geographical Entities Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,

More information

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001.

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001. Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

greatest common divisor

greatest common divisor 4. GCD 1 The greatest common dvsor of two ntegers a and b (not both zero) s the largest nteger whch s a common factor of both a and b. We denote ths number by gcd(a, b), or smply (a, b) when there s no

More information

Control Charts for Means (Simulation)

Control Charts for Means (Simulation) Chapter 290 Control Charts for Means (Smulaton) Introducton Ths procedure allows you to study the run length dstrbuton of Shewhart (Xbar), Cusum, FIR Cusum, and EWMA process control charts for means usng

More information

I. SCOPE, APPLICABILITY AND PARAMETERS Scope

I. SCOPE, APPLICABILITY AND PARAMETERS Scope D Executve Board Annex 9 Page A/R ethodologcal Tool alculaton of the number of sample plots for measurements wthn A/R D project actvtes (Verson 0) I. SOPE, PIABIITY AD PARAETERS Scope. Ths tool s applcable

More information

Generator Warm-Up Characteristics

Generator Warm-Up Characteristics NO. REV. NO. : ; ~ Generator Warm-Up Characterstcs PAGE OF Ths document descrbes the warm-up process of the SNAP-27 Generator Assembly after the sotope capsule s nserted. Several nqures have recently been

More information

Construction Rules for Morningstar Canada Target Dividend Index SM

Construction Rules for Morningstar Canada Target Dividend Index SM Constructon Rules for Mornngstar Canada Target Dvdend Index SM Mornngstar Methodology Paper October 2014 Verson 1.2 2014 Mornngstar, Inc. All rghts reserved. The nformaton n ths document s the property

More information

The Magnetic Field. Concepts and Principles. Moving Charges. Permanent Magnets

The Magnetic Field. Concepts and Principles. Moving Charges. Permanent Magnets . The Magnetc Feld Concepts and Prncples Movng Charges All charged partcles create electrc felds, and these felds can be detected by other charged partcles resultng n electrc force. However, a completely

More information

Genetic Algorithms applied to Clustering Problem and Data Mining

Genetic Algorithms applied to Clustering Problem and Data Mining Proceedngs of the 7th WSEAS Internatonal Conference on Smulaton, Modellng and Optmzaton, Beng, Chna, September 5-7, 007 9 Genetc Algorthms appled to Clusterng Problem and Data Mnng JF JIMENEZ a, FJ CUEVAS

More information

Mean Molecular Weight

Mean Molecular Weight Mean Molecular Weght The thermodynamc relatons between P, ρ, and T, as well as the calculaton of stellar opacty requres knowledge of the system s mean molecular weght defned as the mass per unt mole of

More information

Interpreting Patterns and Analysis of Acute Leukemia Gene Expression Data by Multivariate Statistical Analysis

Interpreting Patterns and Analysis of Acute Leukemia Gene Expression Data by Multivariate Statistical Analysis Interpretng Patterns and Analyss of Acute Leukema Gene Expresson Data by Multvarate Statstcal Analyss ChangKyoo Yoo * and Peter A. Vanrolleghem BIOMATH, Department of Appled Mathematcs, Bometrcs and Process

More information

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index Qualty Adustment of Second-hand Motor Vehcle Applcaton of Hedonc Approach n Hong Kong s Consumer Prce Index Prepared for the 14 th Meetng of the Ottawa Group on Prce Indces 20 22 May 2015, Tokyo, Japan

More information

CHAPTER 8 Potential Energy and Conservation of Energy

CHAPTER 8 Potential Energy and Conservation of Energy CHAPTER 8 Potental Energy and Conservaton o Energy One orm o energy can be converted nto another orm o energy. Conservatve and non-conservatve orces Physcs 1 Knetc energy: Potental energy: Energy assocated

More information

1 Approximation Algorithms

1 Approximation Algorithms CME 305: Dscrete Mathematcs and Algorthms 1 Approxmaton Algorthms In lght of the apparent ntractablty of the problems we beleve not to le n P, t makes sense to pursue deas other than complete solutons

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

Allocating fixed costs in the postal sector in the presence of changing letter and parcel volumes: applied in outdoor delivery

Allocating fixed costs in the postal sector in the presence of changing letter and parcel volumes: applied in outdoor delivery SE- 378 February 3 Allocatng ed costs n the postal sector n the presence o changng letter and parcel volumes: appled n outdoor delvery P.De Donder H.remer P.Dudley and F.Rodrguez Allocatng ed costs n the

More information

Data Modeling and Least Squares Fitting COS 323

Data Modeling and Least Squares Fitting COS 323 Data Modeln and Least Squares Fttn COS 33 Data Modeln or Reresson Gven: data ponts, unctonal orm, nd constants n uncton Eample: ven,, nd lne throuh them;.e., nd a and n a+ 3, 3 5, 5 6, 6 a+ 1, 1 7, 7,

More information