Review: Classification Outline


Data Mining, CS 341, Spring 2007
Lecture 6: Classification issues, regression, Bayesian classification
Prentice Hall

Review: Data Mining Core Techniques
  Classification
  Clustering
  Association Rules

Classification Outline
Goal: Provide an overview of the classification problem and introduce some of the basic algorithms.
  Classification Problem Overview
  Classification Techniques
    Regression
    Bayesian classification
    Distance
    Decision Trees
    Rules
    Neural Networks

Classification Outline
  Classification Problem Overview
  Classification Techniques
    Regression
    Bayesian classification

Classification Problem
Given a database $D = \{t_1, t_2, \ldots, t_n\}$ and a set of classes $C = \{C_1, \ldots, C_m\}$, the classification problem is to define a mapping $f: D \to C$ where each $t_i$ is assigned to one class.
  The mapping actually divides $D$ into equivalence classes.
  Prediction is similar, but may be viewed as having an infinite number of classes.

Classification Examples
  Teachers classify students' grades as A, B, C, D, or F.
  Identify mushrooms as poisonous or edible.
  Predict when a river will flood.
  Identify individuals with credit risks.
  Speech recognition
  Pattern recognition

Classification Ex: Grading
  If x >= 90 then grade = A.
  If 80 <= x < 90 then grade = B.
  If 70 <= x < 80 then grade = C.
  If 60 <= x < 70 then grade = D.
  If x < 60 then grade = F.
[Figure: decision tree that tests x against successive thresholds and ends in the leaves A, B, C, D, F]

Classification Ex: Letter Recognition
View letters as constructed from 5 components.
[Figure: component drawings for Letter A, Letter B, Letter C, Letter D, Letter E, Letter F]

Classification Techniques
Approach:
  1. Create a specific model by evaluating training data (or using domain experts' knowledge).
  2. Apply the model developed to new data.
Classes must be predefined.
Most common techniques use decision trees (DTs), neural networks (NNs), or are based on distances or statistical methods.

Defining Classes
  Partitioning Based
  Distance Based

Issues in Classification
  Missing Data
    Ignore
    Replace with assumed value
  Overfitting
    Large set of training data
    Filter out erroneous or noisy data
  Measuring Performance
    Classification accuracy on test data
    Confusion matrix
    OC Curve
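The grading rules from the "Classification Ex: Grading" slide can be sketched as a small classifier. This is only an illustration: the function name `grade` and the sample scores are assumptions, not part of the lecture; the thresholds are the ones listed on the slide.

```python
def grade(x: float) -> str:
    """Map a numeric score x to a letter grade using the slide's thresholds."""
    if x >= 90:
        return "A"
    if x >= 80:
        return "B"
    if x >= 70:
        return "C"
    if x >= 60:
        return "D"
    return "F"


# Hypothetical scores, one per class, to exercise every branch.
scores = [95, 83, 70, 61, 42]
grades = [grade(s) for s in scores]
```

Checking the thresholds from highest to lowest mirrors a decision tree: each test only needs the lower bound, because the upper bound was already ruled out by the previous test.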

Classification Accuracy
  True positive (TP): $t_i$ is predicted to be in $C_j$ and is actually in it.
  False positive (FP): $t_i$ is predicted to be in $C_j$ but is not actually in it.
  True negative (TN): $t_i$ is not predicted to be in $C_j$ and is not actually in it.
  False negative (FN): $t_i$ is not predicted to be in $C_j$ but is actually in it.

Classification Performance
[Figure: four panels labeled True Positive, False Positive, False Negative, True Negative]

Confusion Matrix
  An $m \times m$ matrix.
  Entry $C_{i,j}$ indicates the number of tuples assigned to $C_j$ but where the correct class is $C_i$.
  The best solution will only have non-zero values on the diagonal.

Height Example Data

  Name       Gender  Height  Output1  Output2
  Kristina   F       1.6 m   Short    Medium
  Jim        M       2 m     Tall     Medium
  Maggie     F       1.9 m   Medium   Tall
  Martha     F       1.88 m  Medium   Tall
  Stephanie  F       1.7 m   Short    Medium
  Bob        M       1.85 m  Medium   Medium
  Kathy      F       1.6 m   Short    Medium
  Dave       M       1.7 m   Short    Medium
  Worth      M       2.2 m   Tall     Tall
  Steven     M       2.1 m   Tall     Tall
  Debbie     F       1.8 m   Medium   Medium
  Todd       M       1.95 m  Medium   Medium
  Kim        F       1.9 m   Medium   Tall
  Amy        F       1.8 m   Medium   Medium
  Wynette    F       1.75 m  Medium   Medium

Confusion Matrix Example
Using the height example data, with Output1 as the correct membership and Output2 as the actual assignment:

                       Assignment
  Actual membership    Short  Medium  Tall
  Short                0      4       0
  Medium               0      5       3
  Tall                 0      1       2

Operating Characteristic Curve
[Figure: operating characteristic (OC) curve]
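The confusion matrix for the height example can be recomputed directly from the Output1 and Output2 columns. The sketch below is an illustration under stated assumptions: the names `PAIRS`, `confusion_matrix`, and `class_counts` are hypothetical, rows are indexed by the correct class (Output1) and columns by the assigned class (Output2), following the slide's definition of entry $C_{i,j}$.

```python
CLASSES = ["Short", "Medium", "Tall"]

# (Output1 = correct class, Output2 = assigned class), in table order.
PAIRS = [
    ("Short", "Medium"),   # Kristina
    ("Tall", "Medium"),    # Jim
    ("Medium", "Tall"),    # Maggie
    ("Medium", "Tall"),    # Martha
    ("Short", "Medium"),   # Stephanie
    ("Medium", "Medium"),  # Bob
    ("Short", "Medium"),   # Kathy
    ("Short", "Medium"),   # Dave
    ("Tall", "Tall"),      # Worth
    ("Tall", "Tall"),      # Steven
    ("Medium", "Medium"),  # Debbie
    ("Medium", "Medium"),  # Todd
    ("Medium", "Tall"),    # Kim
    ("Medium", "Medium"),  # Amy
    ("Medium", "Medium"),  # Wynette
]


def confusion_matrix(pairs, classes):
    """Entry [i][j] counts tuples whose correct class is i and assigned class is j."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for correct, assigned in pairs:
        matrix[index[correct]][index[assigned]] += 1
    return matrix


def class_counts(matrix, j):
    """Return (TP, FP, FN, TN) for class j, treating it as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[j][j]
    fp = sum(row[j] for row in matrix) - tp  # assigned to j, but not actually j
    fn = sum(matrix[j]) - tp                 # actually j, but assigned elsewhere
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


matrix = confusion_matrix(PAIRS, CLASSES)
# Accuracy is the share of tuples on the diagonal.
accuracy = sum(matrix[k][k] for k in range(len(CLASSES))) / len(PAIRS)
tall_counts = class_counts(matrix, CLASSES.index("Tall"))
```

Only 7 of the 15 tuples fall on the diagonal, so Output2 agrees with Output1 less than half of the time.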

Classification Outline
Goal: Provide an overview of the classification problem and introduce some of the basic algorithms.
  Classification Problem Overview
  Classification Techniques
    Regression
    Distance
    Decision Trees
    Rules
    Neural Networks

Regression
  Assume the data fits a predefined function.
  Determine the best values for the parameters in the model.
  Estimate an output value based on input values.
  Can be used for classification and prediction.

Linear Regression
  Assume the relation of the output variable to the input variables is a linear function of some parameters.
  Determine the best values for the regression coefficients $c_0, c_1, \ldots, c_n$.
  Assume an error: $y = c_0 + c_1 x_1 + \cdots + c_n x_n + \epsilon$
  Estimate the error using the mean squared error for the training set.

Example 4.3
  $y = c_0 + \epsilon$
  Find the value for $c_0$ that best partitions the height values into classes: short and medium.
  The training data for $y_i$ is {1.6, 1.9, 1.88, 1.7, 1.85, 1.6, 1.7, 1.8, 1.95, 1.9, 1.8, 1.75}.
  How?

Example 4.4
  $y = c_0 + c_1 x_1 + \epsilon$
  Find the values for $c_0$ and $c_1$ that best predict the class.
  Assume 0 for the short class, 1 for the medium class.
  The training data for $(x_i, y_i)$ is {(1.6, 0), (1.9, 1), (1.88, 1), (1.7, 0), (1.85, 1), (1.6, 0), (1.7, 0), (1.8, 1), (1.95, 1), (1.9, 1), (1.8, 1), (1.75, 1)}.
  How?

Linear Regression Poor Fit
[Figure: linear regression fit that matches the data poorly]
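One way to answer the "How?" in Examples 4.3 and 4.4 is ordinary least squares, which minimizes the squared error on the training set. The sketch below is not taken from the slides: it assumes the class labels follow Output1 from the height data (0 for short, 1 for medium), and the names `fit_line`, `mse`, and the 0.5 decision threshold are illustrative choices.

```python
# Heights of the 12 short or medium people, in the order given in Example 4.3.
heights = [1.6, 1.9, 1.88, 1.7, 1.85, 1.6, 1.7, 1.8, 1.95, 1.9, 1.8, 1.75]
# Assumed labels from Output1: 0 = short, 1 = medium.
labels = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Example 4.3: for y = c0 + error, the squared error is minimized by the mean.
c0_constant = mean(heights)


def fit_line(xs, ys):
    """Least-squares estimates (c0, c1) for y = c0 + c1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    c1 = sxy / sxx
    return y_bar - c1 * x_bar, c1


def mse(xs, ys, c0, c1):
    """Mean squared error of the fitted line on the training set."""
    return mean((y - (c0 + c1 * x)) ** 2 for x, y in zip(xs, ys))


# Example 4.4: fit the line, then classify as medium when the output reaches 0.5.
c0, c1 = fit_line(heights, labels)
training_error = mse(heights, labels, c0, c1)
predicted = [1 if c0 + c1 * x >= 0.5 else 0 for x in heights]
```

In Example 4.3 the fitted constant acts as the dividing height between the two classes; in Example 4.4 the fitted line is used as a rough class-membership score.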

Classification Using Regression
  Division: Use the regression function to divide the area into regions.
  Prediction: Use the regression function to predict a class membership function.

Division
[Figure: regression line dividing the area into regions]

Prediction
[Figure: regression function predicting class membership]

Logistic Regression
  A generalized linear model.
  Extensively used in the medical and social sciences.
  It has the following form:
    $\log_e \left( \frac{p}{1 - p} \right) = c_0 + c_1 x_1 + \cdots + c_k x_k$
  $p$ is the probability of being in the class; $1 - p$ is the probability of not being in it.
  The parameters $c_0, c_1, \ldots, c_k$ are usually estimated by maximum likelihood (maximize the probability of observing the given values).

Why Logistic Regression
  $p$ is in the range [0, 1].
  A good model would like to have a $p$ value close to 0 or 1.
  A linear function is not suitable for $p$.
  Consider the odds $p / (1 - p)$. As $p$ increases, the odds increase.
  The odds are in the range $[0, +\infty)$, asymmetric.
  The log odds lie in the range $-\infty$ to $+\infty$, symmetric.

Linear Regression vs. Logistic Regression
[Figure: comparison of a linear regression fit and a logistic regression fit]
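The relationship between the probability, the odds, and the log odds on the "Why Logistic Regression" slide can be made concrete with a few helper functions. This is a minimal sketch: the names `odds`, `log_odds`, `sigmoid`, and `predict_probability` are assumptions for illustration, and the coefficients passed in are hypothetical rather than fitted by maximum likelihood.

```python
import math


def odds(p: float) -> float:
    """Odds p / (1 - p); defined for 0 <= p < 1 and unbounded above."""
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural log of the odds; defined for 0 < p < 1, symmetric around p = 0.5."""
    return math.log(odds(p))


def sigmoid(z: float) -> float:
    """Inverse of log_odds: maps any real z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(coeffs, x):
    """p for the model log(p / (1 - p)) = c0 + c1*x1 + ... + ck*xk.

    coeffs = [c0, c1, ..., ck]; x = [x1, ..., xk].
    """
    z = coeffs[0] + sum(c * xi for c, xi in zip(coeffs[1:], x))
    return sigmoid(z)


# Symmetry of the log odds: p = 0.9 and p = 0.1 differ only in sign.
high, low = log_odds(0.9), log_odds(0.1)
```

Because the log odds cover the whole real line, a linear function of the inputs can model them without ever producing a probability outside [0, 1].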

Classification Outline
Goal: Provide an overview of the classification problem and introduce some of the basic algorithms.
  Classification Problem Overview
  Classification Techniques
    Regression
    Bayesian classification

Bayes Theorem
  Posterior probability: $P(h_1 \mid x_i)$
  Prior probability: $P(h_1)$
  Bayes theorem: $P(h_1 \mid x_i) = \frac{P(x_i \mid h_1) \, P(h_1)}{P(x_i)}$
  Assign probabilities of hypotheses given a data value.

Naïve Bayes Classification
  Assume that the contributions of all attributes are independent and that each contributes equally to the classification problem.
  $t_i$ has $m$ independent attributes $\{x_{i1}, \ldots, x_{im}\}$.
  $P(t_i \mid C_j) = \prod_{k=1}^{m} P(x_{ik} \mid C_j)$

Example: using Output1 as the classification result (the same 15 tuples as in the Height Example Data).

Example 4.5
Step 1: Calculate the prior probability.
  P(short) = 4/15 = 0.267
  P(medium) = 8/15 = 0.533
  P(tall) = 3/15 = 0.2
Step 2: Calculate the conditional probability.
  $P(\text{Gender}_i \mid C_j)$, where $\text{Gender}_i$ = F or M and $C_j$ = short, medium, or tall.
  $P(\text{Height}_i \mid C_j)$, where $\text{Height}_i$ is in (0, 1.6], (1.6, 1.7], (1.7, 1.8], (1.8, 1.9], (1.9, 2.0], or (2.0, ∞).
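Steps 1 and 2 of Example 4.5 amount to counting. The sketch below estimates the priors and conditional probabilities from the height data using exact fractions. The names `DATA`, `height_bin`, `p_gender`, and `p_height` are hypothetical, and using `bisect_left` on the interval upper bounds is just one convenient way to implement the right-closed height ranges.

```python
from bisect import bisect_left
from collections import Counter
from fractions import Fraction

# (gender, height in metres, Output1 class) from the height example data.
DATA = [
    ("F", 1.6, "Short"), ("M", 2.0, "Tall"), ("F", 1.9, "Medium"),
    ("F", 1.88, "Medium"), ("F", 1.7, "Short"), ("M", 1.85, "Medium"),
    ("F", 1.6, "Short"), ("M", 1.7, "Short"), ("M", 2.2, "Tall"),
    ("M", 2.1, "Tall"), ("F", 1.8, "Medium"), ("M", 1.95, "Medium"),
    ("F", 1.9, "Medium"), ("F", 1.8, "Medium"), ("F", 1.75, "Medium"),
]

# Upper bounds of (0,1.6], (1.6,1.7], (1.7,1.8], (1.8,1.9], (1.9,2.0]; bin 5 is above 2.0.
UPPER_BOUNDS = [1.6, 1.7, 1.8, 1.9, 2.0]


def height_bin(h: float) -> int:
    """Index of the right-closed height range that contains h."""
    return bisect_left(UPPER_BOUNDS, h)


# Step 1: prior probability of each class.
class_counts = Counter(c for _, _, c in DATA)
priors = {c: Fraction(n, len(DATA)) for c, n in class_counts.items()}

# Step 2: conditional probability of each attribute value given the class.
gender_counts = Counter((g, c) for g, _, c in DATA)
height_counts = Counter((height_bin(h), c) for _, h, c in DATA)


def p_gender(g: str, c: str) -> Fraction:
    return Fraction(gender_counts[(g, c)], class_counts[c])


def p_height(b: int, c: str) -> Fraction:
    return Fraction(height_counts[(b, c)], class_counts[c])
```

Keeping the probabilities as `Fraction` values makes it easy to compare them with the count-based entries in the example's table, such as 1/4 or 3/8.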

Example 4.5 (cont'd)

                         Count                  Probability p(x_i | C_j)
  Attribute  Value       short  medium  tall    short  medium  tall
  Gender     M           1      2       3       1/4    2/8     3/3
             F           3      6       0       3/4    6/8     0/3
  Height     (0, 1.6]    2      0       0       2/4    0       0
             (1.6, 1.7]  2      0       0       2/4    0       0
             (1.7, 1.8]  0      3       0       0      3/8     0
             (1.8, 1.9]  0      4       0       0      4/8     0
             (1.9, 2.0]  0      1       1       0      1/8     1/3
             (2.0, ∞)    0      0       2       0      0       2/3

Example 4.5 (cont'd)
Given a tuple t = {Adam, M, 1.95 m}
Step 3: Calculate $P(t \mid C_j)$.
  P(t | short) = 1/4 x 0 = 0
  P(t | medium) = 2/8 x 1/8 = 0.031
  P(t | tall) = 3/3 x 1/3 = 0.333
Step 4: Calculate P(t).
  P(t) = P(t | short) P(short) + P(t | medium) P(medium) + P(t | tall) P(tall) ≈ 0.083

Example 4.5 (cont'd)
Step 5: Calculate $P(C_j \mid t)$ using Bayes rule.
  P(short | t) = P(t | short) P(short) / P(t) = 0
  P(medium | t) = 0.2
  P(tall | t) = 0.8
Last step: Classify the new tuple as tall.

A Summary
Step 1: Calculate the prior probability of each class, $P(C_j)$.
Step 2: Calculate the conditional probability for each attribute value, for example $P(\text{Gender}_i \mid C_j)$.
Step 3: Calculate the conditional probability $P(t \mid C_j)$.
Step 4: Calculate the prior probability of a tuple, $P(t)$.
Step 5: Calculate the posterior probability for each class given the tuple, $P(C_j \mid t)$, using Bayes rule.
Step 6: Classify the tuple based on $P(C_j \mid t)$: the tuple belongs to the class which has the highest posterior probability.
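Steps 3 to 6 for the tuple t = {Adam, M, 1.95 m} can be checked with exact arithmetic. The sketch below hard-codes the priors and the two relevant conditional probabilities from Example 4.5 (gender M, height in (1.9, 2.0]); the dictionary names are hypothetical and chosen only for readability.

```python
from fractions import Fraction

# Step 1 results: prior probability of each class.
PRIORS = {"short": Fraction(4, 15), "medium": Fraction(8, 15), "tall": Fraction(3, 15)}

# Step 2 results needed for Adam: P(Gender = M | C_j) and P(Height in (1.9, 2.0] | C_j).
P_MALE = {"short": Fraction(1, 4), "medium": Fraction(2, 8), "tall": Fraction(3, 3)}
P_HEIGHT_19_20 = {"short": Fraction(0), "medium": Fraction(1, 8), "tall": Fraction(1, 3)}

# Step 3: P(t | C_j) is the product of the attribute probabilities (independence assumption).
likelihood = {c: P_MALE[c] * P_HEIGHT_19_20[c] for c in PRIORS}

# Step 4: P(t) sums P(t | C_j) P(C_j) over all classes.
evidence = sum(likelihood[c] * PRIORS[c] for c in PRIORS)

# Step 5: Bayes rule gives the posterior P(C_j | t).
posterior = {c: likelihood[c] * PRIORS[c] / evidence for c in PRIORS}

# Step 6: pick the class with the highest posterior probability.
predicted = max(posterior, key=posterior.get)
```

Since P(t) is the same for every class, the final ranking would be unchanged if Step 4 were skipped and the classes were compared on P(t | C_j) P(C_j) alone.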

Next Lecture: Classification
  Distance-based algorithms
  Decision tree-based algorithms
HW2 will be announced!

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

2-3 The Remainder and Factor Theorems

2-3 The Remainder and Factor Theorems - The Remaider ad Factor Theorems Factor each polyomial completely usig the give factor ad log divisio 1 x + x x 60; x + So, x + x x 60 = (x + )(x x 15) Factorig the quadratic expressio yields x + x x

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

Application and research of fuzzy clustering analysis algorithm under micro-lecture English teaching mode

Application and research of fuzzy clustering analysis algorithm under micro-lecture English teaching mode SHS Web of Cofereces 25, shscof/20162501018 Applicatio ad research of fuzzy clusterig aalysis algorithm uder micro-lecture Eglish teachig mode Yig Shi, Wei Dog, Chuyi Lou & Ya Dig Qihuagdao Istitute of

More information

Baan Service Master Data Management

Baan Service Master Data Management Baa Service Master Data Maagemet Module Procedure UP069A US Documetiformatio Documet Documet code : UP069A US Documet group : User Documetatio Documet title : Master Data Maagemet Applicatio/Package :

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

1. MATHEMATICAL INDUCTION

1. MATHEMATICAL INDUCTION 1. MATHEMATICAL INDUCTION EXAMPLE 1: Prove that for ay iteger 1. Proof: 1 + 2 + 3 +... + ( + 1 2 (1.1 STEP 1: For 1 (1.1 is true, sice 1 1(1 + 1. 2 STEP 2: Suppose (1.1 is true for some k 1, that is 1

More information

AP Calculus AB 2006 Scoring Guidelines Form B

AP Calculus AB 2006 Scoring Guidelines Form B AP Calculus AB 6 Scorig Guidelies Form B The College Board: Coectig Studets to College Success The College Board is a ot-for-profit membership associatio whose missio is to coect studets to college success

More information

Classification Techniques (1)

Classification Techniques (1) 10 10 Overview Classification Techniques (1) Today Classification Problem Classification based on Regression Distance-based Classification (KNN) Net Lecture Decision Trees Classification using Rules Quality

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

Firewall Modules and Modular Firewalls

Firewall Modules and Modular Firewalls Firewall Modules ad Modular Firewalls H. B. Acharya Uiversity of Texas at Austi acharya@cs.utexas.edu Aditya Joshi Uiversity of Texas at Austi adityaj@cs.utexas.edu M. G. Gouda Natioal Sciece Foudatio

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

How To Extract From Data From A College Course

How To Extract From Data From A College Course (IJACSA Iteratioal Joural of Advaced Computer Sciece ad Applicatios, Vol., No. 6, 0 Miig Educatioal Data to Aalyze Studets Performace Briesh Kumar Baradwa Research Scholor, Sighaiya Uiversity, Raastha,

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Detecting Auto Insurance Fraud by Data Mining Techniques

Detecting Auto Insurance Fraud by Data Mining Techniques Detectig Auto Isurace Fraud by Data Miig Techiques Rekha Bhowmik Computer Sciece Departmet Uiversity of Texas at Dallas, USA rxb080100@utdallas.edu ABSTRACT The paper presets fraud detectio method to predict

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

Infinite Sequences and Series

Infinite Sequences and Series CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Overview on S-Box Design Principles

Overview on S-Box Design Principles Overview o S-Box Desig Priciples Debdeep Mukhopadhyay Assistat Professor Departmet of Computer Sciece ad Egieerig Idia Istitute of Techology Kharagpur INDIA -721302 What is a S-Box? S-Boxes are Boolea

More information

CS100: Introduction to Computer Science

CS100: Introduction to Computer Science Review: History of Computers CS100: Itroductio to Computer Sciece Maiframes Miicomputers Lecture 2: Data Storage -- Bits, their storage ad mai memory Persoal Computers & Workstatios Review: The Role of

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is 0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values

More information

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

More information

NATIONAL SENIOR CERTIFICATE GRADE 12

NATIONAL SENIOR CERTIFICATE GRADE 12 NATIONAL SENIOR CERTIFICATE GRADE MATHEMATICS P EXEMPLAR 04 MARKS: 50 TIME: 3 hours This questio paper cosists of 8 pages ad iformatio sheet. Please tur over Mathematics/P DBE/04 NSC Grade Eemplar INSTRUCTIONS

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

Lesson 15 ANOVA (analysis of variance)

Lesson 15 ANOVA (analysis of variance) Outlie Variability -betwee group variability -withi group variability -total variability -F-ratio Computatio -sums of squares (betwee/withi/total -degrees of freedom (betwee/withi/total -mea square (betwee/withi

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

Theorems About Power Series

Theorems About Power Series Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real o-egative umber R, called the radius

More information

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig

More information

Confidence Intervals

Confidence Intervals Cofidece Itervals Cofidece Itervals are a extesio of the cocept of Margi of Error which we met earlier i this course. Remember we saw: The sample proportio will differ from the populatio proportio by more

More information

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 6-12 pages of text (ca be loger with appedix) 6-12 figures (please

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

CS100: Introduction to Computer Science

CS100: Introduction to Computer Science Course Iformatio CS100: Itroductio to Computer Sciece Lecture 1: Itroductio (Survey, Pictures) Istructor: Xiaoya Li Lecture: Mo. & Wed. 11:00am 12:15pm Room: Kedade Hall 305 Labs: Wed or Thu 1:00pm 2:50pm

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

3 Basic Definitions of Probability Theory

3 Basic Definitions of Probability Theory 3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

More information

LOAD BALANCING IN PUBLIC CLOUD COMBINING THE CONCEPTS OF DATA MINING AND NETWORKING

LOAD BALANCING IN PUBLIC CLOUD COMBINING THE CONCEPTS OF DATA MINING AND NETWORKING LOAD BALACIG I PUBLIC CLOUD COMBIIG THE COCEPTS OF DATA MIIG AD ETWORKIG Priyaka R M. Tech Studet, Dept. of Computer Sciece ad Egieerig, AIET, Karataka, Idia Abstract Load balacig i the cloud computig

More information

Plug-in martingales for testing exchangeability on-line

Plug-in martingales for testing exchangeability on-line Plug-i martigales for testig exchageability o-lie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk

More information

, a Wishart distribution with n -1 degrees of freedom and scale matrix.

, a Wishart distribution with n -1 degrees of freedom and scale matrix. UMEÅ UNIVERSITET Matematisk-statistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 004-0-9 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that

More information

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction THE ARITHMETIC OF INTEGERS - multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies Core Technical Syllabus Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value

More information

Cooley-Tukey. Tukey FFT Algorithms. FFT Algorithms. Cooley

Cooley-Tukey. Tukey FFT Algorithms. FFT Algorithms. Cooley Cooley Cooley-Tuey Tuey FFT Algorithms FFT Algorithms Cosider a legth- sequece x[ with a -poit DFT X[ where Represet the idices ad as +, +, Cooley Cooley-Tuey Tuey FFT Algorithms FFT Algorithms Usig these

More information

The Forgotten Middle. research readiness results. Executive Summary

The Forgotten Middle. research readiness results. Executive Summary The Forgotte Middle Esurig that All Studets Are o Target for College ad Career Readiess before High School Executive Summary Today, college readiess also meas career readiess. While ot every high school

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find the area. Example 1: Suppose we want to estimate

Clustering Algorithm Analysis of Web Users with Dissimilarity and SOM Neural Networks

JOURNAL OF SOFTWARE, VOL. 7, NO., NOVEMBER 533. Clustering Algorithm Analysis of Web Users with Dissimilarity and SOM Neural Networks. Xiao Qiang, School of Economics and Management, Lanzhou Jiaotong University, Lanzhou;

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles, Department of Statistics. Statistics 100B. Instructor: Nicolas Christou. Three important distributions: Distributions related to the normal distribution. Chi-square (χ²) distribution.

Z-TEST / Z-STATISTIC: used to test hypotheses about µ when the population standard deviation is known

Z-TEST / Z-STATISTIC: used to test hypotheses about µ when the population standard deviation is known and population distribution is normal or sample size is large. T-TEST / T-STATISTIC: used to test hypotheses about

Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series

8 Fourier Series. Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series f(x) = a + (a cos x + b sin x). (8.) By definition, the convergence of the series

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8

CME 302: NUMERICAL LINEAR ALGEBRA, FALL 2005/06, LECTURE 8. GENE H GOLUB. 1 Positive Definite Matrices. A matrix A is positive definite if xᵀAx > 0 for all nonzero x. A positive definite matrix has real and positive

1 Computing the Standard Deviation of Sample Means

Computing the Standard Deviation of Sample Means. Quality control charts are based on sample means not on individual values within a sample. A sample is a group of items, which are considered all together for our analysis.

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10

FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10. [C] Communication. Measurement. A1. Solve problems that involve linear measurement, using: SI and imperial units of measure; estimation strategies; measurement strategies.

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Social Studies 201, October 13, 2004. Note: The examples in these notes may be different than used in class. However, the examples are similar and the methods used are identical to what was presented in class.

Notes on exponential generating functions and structures.

Notes on exponential generating functions and structures. 1. The concept of a structure. Consider the following counting problems: (1) to find for each n the number of partitions of an n-element set, (2) to find for each n the

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means). X = the number obtained when rolling one six-sided die once. If we roll a six-sided die once, the mean of the probability distribution is X P(X = x) Simulation:

A gentle introduction to Expectation Maximization

A gentle introduction to Expectation Maximization. Mark Johnson, Brown University, November 2009. Outline: What is Expectation Maximization? Mixture models and clustering. EM for sentence topic modeling. Why Expectation

Week 3: Conditional probabilities, Bayes formula, expected value of a random variable

Week 3: Conditional probabilities, Bayes formula, expected value of a random variable. We recall our discussion of 5 card poker hands. Example 13: a) What is the probability of event A that a 5

A Model Based Mixture Supervised Classification Approach in Hyperspectral Data Analysis

A Model Based Mixture Supervised Classification Approach in Hyperspectral Data Analysis. M. Murat Dundar and David Landgrebe, Life Fellow, IEEE. School of Electrical and Computer Engineering, Purdue University. Copyright

CS103X: Discrete Structures Homework 4 Solutions

CS103X: Discrete Structures Homework 4 Solutions. Due February 22, 2008. Exercise 1 (10 points). Silicon Valley questions: (a) How many possible six-figure salaries in whole dollar amounts are there that contain at least

Basic Elements of Arithmetic Sequences and Series

MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED, NO NEED TO COPY NOTES FROM OVERHEAD). Basic Elements of Arithmetic Sequences and Series. Objective: To establish basic elements of arithmetic

MATH 083 Final Exam Review

MATH 083 Final Exam Review. Completing the problems in this review will greatly prepare you for the final exam. Calculator use is not required, but you are permitted to use a calculator during the final exam period

Semiconductor Devices

Semiconductor Devices. Prof. Zbigniew Lisik, Department of Semiconductor and Optoelectronics Devices, room: 116, e-mail: zbigniew.lisik@p.lodz.pl. Unipolar devices. IFE T&C. JFET Transistor. Unipolar Devices - Transistors. Basic

Automatic Tuning for FOREX Trading System Using Fuzzy Time Series

Automatic Tuning for FOREX Trading System Using Fuzzy Time Series. Kraimon Maneesilp and Pitikhate Sooraksa. Abstract: Efficiency of the automatic currency trading system is time dependent due to using fixed parameters which

How To Solve The Homework Problem Beautifully

Engineering 33 Beautiful Homework Set 3 of 7, Kuszmar. Problem .5.5: A large department store sells sport shirts in three sizes (small, medium, and large), three patterns (plaid, print, and stripe), and two sleeve lengths (long

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.

Integrated Production and Inventory Control System: MRP and MRP II. Framework of Manufacturing System. Inventory control, production scheduling, capacity planning and financial and business decisions in a production system are interrelated.

Factoring x^n - 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu>

(March 16, 004) Factoring x^n - 1: cyclotomic and Aurifeuillian polynomials. Paul Garrett. Polynomials of the form x^2 - 1, x^3 - 1, x^4 - 1 have at least one systematic factorization x^n - 1 = (x - 1)(x^(n-1)

Function factorization using warped Gaussian processes

Function factorization using warped Gaussian processes. Mikkel N. Schmidt, ms@imm.dtu.dk. University of Cambridge, Department of Engineering, Trumpington Street, Cambridge, CB2 PZ, UK. Abstract: We introduce a new approach

Lesson 17 Pearson's Correlation Coefficient

Outline: Measures of Relationships. Pearson's Correlation Coefficient (r): types of data, scatter plots, measure of direction, measure of strength. Computation: covariation of X and Y, unique variation in X and Y, measuring

Spam Detection. A Bayesian approach to filtering spam

Spam Detection: A Bayesian approach to filtering spam. Kunal Mehrotra, Shailendra Watave. Abstract: The ever increasing menace of spam is bringing down productivity. More than 70% of the email messages are spam, and it

Multiplexers and Demultiplexers

In this lesson, you will learn about Multiplexers and Demultiplexers: 1. Multiplexers 2. Combinational circuit implementation with multiplexers 3. Demultiplexers 4. Some examples. Multiplexer: A Multiplexer (see

SEQUENCES AND SERIES

Chapter 9: SEQUENCES AND SERIES. Natural numbers are the product of human spirit. DEDEKIND. 9.1 Introduction. In mathematics, the word "sequence" is used in much the same way as it is in ordinary English. When we say

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem. Sampling distribution of the sample mean: If we sample a number of samples (say k samples where k is very large number) each of size n,

Interference Alignment and the Generalized Degrees of Freedom of the X Channel

Interference Alignment and the Generalized Degrees of Freedom of the X Channel. Chiachi Huang, Viveck R. Cadambe, Syed A. Jafar. Electrical Engineering and Computer Science, University of California Irvine, Irvine, California,

1. C. The formula for the confidence interval for a population mean is: x̄ ± t·s/√n, which was

1. C. The formula for the confidence interval for a population mean is: x̄ ± t·s/√n, which was based on the sample mean. So, x̄ is guaranteed to be in the interval you form. 2. D. Use the rule: p-value

DATA MINING TO CLUSTER HUMAN PERFORMANCE BY USING ONLINE SELF REGULATING CLUSTERING METHOD

Istanbul, Turkey, May 7-30, 008. DATA MINING TO CLUSTER HUMAN PERFORMANCE BY USING ONLINE SELF REGULATING CLUSTERING METHOD. ADEM KARAHOCA, DILEK KARAHOCA, OSMAN KAYA. Bahcesehir University, Engineering Faculty, Computer

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Another important use of sampling distributions is to test hypotheses about population parameters, e.g. mean, proportion, regression coefficients, etc. For example, it is possible to stipulate

Inference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval

Chapter 8 Tests of Statistical Hypotheses. 8. Tests about Proportions. HT - Inference on Proportion. Parameter: Population Proportion p (or π) (Percentage of people has no health insurance). Statistic: Sample Proportion

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study: Normal and t Distributions. Bret Hanlon and Bret Larget, Department of Statistics, University of Wisconsin Madison, October 11-13, 2011. Case Study: Body temperature varies within individuals over time (it can

Descriptive Statistics

Descriptive Statistics. We learned to describe data sets graphically. We can also describe a data set numerically. Measures of Location. Definition: The sample mean is the arithmetic average of values. We denote
