Lecture 9: the χ 2 -test

Size: px
Start display at page:

Download "Lecture 9: the χ 2 -test"

Transcription

1 Lecture 9: the χ 2 -test S. Massa, Department of Statistics, University of Oxford 19 January 2016

2 Hypothesis Testing 1. Begin with a research hypothesis (this is usually the alternative hypothesis) 2. Decide significance level, e.g. 5%; 3. Set up the null hypothesis (roughly the opposite of the research hypothesis); 4. Calculate a test statistic from your data; 5. Calculate the p-value: the probability under the null hypothesis of observing something at least as extreme as your observation. 6. Reject the null if the p-value is less than the significance level, 5%.

3 The Z-test Roughly the Z-test is given by Z = observed expected. standard error Example (Population mean) Suppose X 1,, X n are n i.i.d. normal samples with known SD σ. H 0 : mean = µ; H 1 : mean µ; Then with X the sample mean, and since SE = σ/ n Z = X µ σ/ n N(0, 1). (1)

4 The Z-test Example (Test for a proportion) When testing the probability of success in n trials H 0 : P(success)= p 0 ; H 1 : P(success) p 0 ; Under the null hypothesis the expected proportion of successes is p 0, the standard error is p 0 (1 p 0 )/n and the test statistic is Z = proportion of successes p 0 p0 (1 p 0 )/n N(0, 1).

5 Z-test recap Some good properties of the Z-statistic computed from the data (definition of statistic); extreme values correspond intuitively to failure of the null; under the null hypothesis we know the distribution of Z; very good for testing whether the mean of quantitative data could have a certain value. But what about categorical/qualitative data?

6 Suicide by Month of Birth Month Prob Female Male Total Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Total Mean Table: Suicides by month of birth in England and Wales

7 Suicide by Month of Birth Of course different months of birth have different numbers of suicides, due to chance variation. Is there an underlying relation? Is it more likely that one commits suicide if they were born in a particular month? It seems there are slightly more suicides among those born in March, April and May. This will be our research hypothesis. H 0 : a suicide is equally to be born on any day of the year, 1/ Does the z-test work in this setting?

8 Does the z-test Work in This Setting? One choice is to group March to June into Spring, and Other for all the other months. We can now test for the proportion of spring born suicides. Then 9421 spring born, and other. 122 days in the spring category H 0 : p 0 = Probability of a suicide being born in spring = 122/ = Expected = p 0 = 8980, SE = p 0 (1 p 0 )n = Then the z-statistic becomes Z = observed expected SE = = 5.71, giving a p-value of less than 10 8.

9 Problem with Using the z-test with More than Two Levels The problem is that we made a choice about how to split up the months. This may give one the opportunity to choose the particular splitting that proves the desired result. This problem will always appear whenever there are more than two levels in our categorical variables (12 months). We need a method that deals with all categories simultaneously.

10 Goodness of Fit Tests Observe n samples of a categorical variable with k levels. H 0 : the categories have the probabilities p 1,..., p k. Under the null hypothesis, since there are n observations the expected counts in the categories are np 1,..., np k. Our statistic should measure how far the observed counts n 1,..., n k deviate from the expected ones np 1,..., np k. Add up (n i np i ) 2. Getting 1 rather than 10 in a category should count more than getting 1010 while expecting So we need to normalise. Therefore we get the χ 2 -statistic X 2 := (n 1 np 1 ) (n k np k ) 2 np 1 np k = (observed expected) 2. expected

11 The χ 2 Distribution Straightaway we know that (i) X 2 is a statistic: it can be computed from the data; (ii) large values imply that the null hypothesis is likely not true. (iii) do we know its distribution? YES, but approximately for large n. This distribution is called the χ 2 distribution and is parameterized by a positive integer, the number of degrees of freedom(d.f.) How do we figure out the degrees of freedom?

12 The Number of Degrees of Freedom The number of degrees of freedom depends on: the number of categories; the form of the null hypothesis, specifically the number of parameters. In general degrees of freedom = #categories #parameters fitted from data 1. n large enough: expected count in each category at least 5. If not, merge categories together.

13 Properties of χ 2 The χ 2 distribution with d degrees of freedom is a continuous distribution. It has mean d and variance 2d. Distributions right-skewed but skew drops as d grows. For large d becomes close to normal with mean d and variance 2d.

14 Some Plots Figure: χ 2 distributions with 0.5, 3, 5, and 10 degrees of freedom.

15 Table d.f. P = 0.05 P = If more than 60 degrees of freedom approximate with N(d, 2d). Find p-value for observed X 2 using Z = X2 d 2d N(0, 1), and the normal table.

16 Assumptions for the χ 2 Goodness of Fit Test Simple Random sample: The sample is taken from a fixed population where each individual has equal chance of being selected. Independence: The observations are independent. Sample size: We have a sufficiently large sample size for the whole table and also for the individuals cells: each cell must have expected count at least 5. The sample size assumption goes back to the central limit theorem: here under the null hypothesis you are sampling from a multinomial distribution. You need adequate expected cell counts for the CLT to kick in.

17 Fixed Distribution: a Fair Die Given a standard die we want to test if it s fair. H 0 : each face has probability 1/6. If it is fair then each side has probability 1/6. Roll it 60 times. Expected count 10 for each side. Compute the χ 2 statistic Observed Expected X 2 (16 10)2 (15 10)2 (5 10)2 = = = 15.4

18 Fixed Distribution: a Fair Die To compute the p-value we need to know the number of degrees of freedom. There are 6 categories and no parameters so d.f. = = 5. Therefore comparing X 2 = 15.4 with and we reject the null hypothesis at both the 5% and the 1% significance level.

19 Parametric Families Sometimes we want to know if the data came from a whole family of distributions, for example from the Binomial or Poisson distribution. H 0 : the data comes from a Poisson(µ) distribution, for some µ. At its current state, it s impossible to check the hypothesis. We don t know µ and thus we cannot calculate probabilities under the null hypothesis. First use the data to estimate the parameter. This gives you a concrete null hypothesis to test.

20 Example (Deaths from Horsekicks) Deaths Frequency Total=200 Table: Deaths from horsekicks in Prussian army (Preece, Ross and Kirby, 1988) We want to test H 0 : The observations come from a Poisson distribution. Recall Poisson has a single parameter µ equal to its mean. Approximate the mean with the sample mean: X = 1 ( ) = = 0.61.

21 Example (Deaths from Horsekicks) Now test: H 0 : The observations come from a Poisson(0.61). Construct the expected table: What is the expected frequency of zeroes in 200 samples from a Poisson distribution? expected frequency of zeroes = 200 P (zero) = 200 e = = !

22 Computing the Expected Row expected frequency of zeroes = 200 P (zero) = 200 e = = expected frequency of 1 = 200 e = = 66.29, 1! and so on until we arrive at Expected freq Observed freq The expected frequency for 2 or more is less than 5 so we combine the last three columns Expected freq Observed freq !

23 We then calculate X 2 = ( )2 ( )2 ( ) = The final table has three categories, so we would have 2 degrees of freedom, but we have also estimated one parameter from the data, so that leaves 1 degree of freedom. Consulting the table the critical value for the χ 2 distribution with 1 degree of freedom is 3.84 at 5% and 6.63 at 1%. Conclusion: Since our χ 2 statistic is 0.07 which is less than both critical values, we retain (do not reject) the null hypothesis that the data comes from a Poisson distribution.

24 The Normal Distribution Suppose we are given the heights of 100 students in centimetres Height (cm) Frequency and you want to test whether it comes from a Normal distribution. H 0 : The data follows N(µ, σ 2 ), for some µ and σ 2. H 1 : The data do not follow a Normal distribution. How do you compute the sample mean and sample standard deviation here? Replace each bin with its midpoint, e.g for , for [161,166], and so on up to for [ ].

25 Estimate the Parameters Like with the Poisson distribution, we use the data to estimate the parameters µ and σ 2. Use the sample mean and sample variance. x = 172 s 2 = H 0 : The data follows N(172, ). H 1 : The data do not follow N(172, ).

26 Compute Expected counts in Each Bin I Our observations are provided in bins, so what we can do is compute the expected count in each bin from N(172, ). So first compute the probability of each interval under the null hypothesis, e.g. let Y N(172, ), and Z N(0, 1) Replace the first interval with (, 160) ( Y ) P (Y 160) = P = P (Z 1.61) =

27 Compute Expected counts in Each Bin II For the second interval P (160.5 Y 166.5) ( = P Y ) = P ( 1.61 Z 0.77) = For the last interval replace [185, 190] with [184.5, ). Multiply each probability with 100, the total number of observations.

28 Test Statistic I Height (cm) prob Expected Observed Check: All categories have expected 5? No, so combine last two columns to get Height (cm) Expected Observed

29 Test Statistic II The χ 2 statistic is then X 2 = (5.4 6) ( ) = The number of degrees of freedom is 5 (categories)-1-2 (for the number of parameters estimated from the data)=2. Critical value for χ 2 (2) is 5.99 at the 5% confidence level. Conclusion: At the 5% confidence level, there is not enough evidence to reject the null hypothesis.

30 Contingency Tables Table: NHANES handedness data for Americans aged Men Women Total right-handed left-handed ambidextrous Total T:h This type of table is known as a contingency table. The data seem to suggest that women tend to be right-handed more than men. Let s test this. One can use the χ 2 statistic to test whether two variables are independent.

31 χ 2 Test of Association We have two variables, one for the gender, one for being right or left-handed. Our research hypothesis implies that gender has an effect on whether one is right- or left-handed. H 0 : The two variables are independent. H 1 : The two variables are associated. We will use the χ 2 test: we have the observed counts, but what are the expected counts under the null hypothesis? The null hypothesis implies that the two variables are independent. This means that and so on. P (man left-handed) = P (man) P (left-handed),

32 χ 2 Test of Association But we still don t know P (man) or P(left-handed): estimate from the data. No. of men in sample P (man) = = 1067 No. of participants 2237 = No. of lhanded in sample P (left-handed) = = 205 No. of participants 2237 = This gives us the expected table: e.g. P (man) P (left) 2237 = Men Women right-handed left-handed ambidextrous Table: Expected table

33 Computing the χ 2 Statistic Then X 2 = ( ) ( ) But how many degrees of freedom? + + (8 15)2 15 = 12.4.

34 d.f. for Test of Association I We have two categorical variables one with 2 levels, one with 3 levels. For gender, we estimated the probability of the event {man}, while the other one follows immediately since there are only two possibilities. For handedness, we had to estimate the probability of two of the three levels, the last one following since the probabilities must sum to 1. Thus for each variable, the number of parameters we estimated is one less the number of levels.

35 d.f. for Test of Association II The number of categories is 3 2 = 6. Therefore d.f. = = 2. Remark In general if we have two variables with r and c levels respectively d.f. = r c 1 (r 1) (c 1) = (r 1) (c 1). Conclusion: Since the critical value at the 1% confidence level is 9.21 and our statistic is 12.4 there is enough evidence to reject the null hypothesis. There is association between gender and being right or left-handed.

36 Recap I χ 2 distribution: parametrized by degrees of freedom; Goodness of fit test: Given n sample of a categorical variable with k levels we test the null hypothesis H 0 : the categories have probabilities p 0,..., p k. If observed counts are n 1,..., n k then X 2 := (n 1 np 1 ) 2 np (n k np k ) 2 np k = (observed expected) 2 expected d.f. = #categories #parameters fit from data 1. Assumptions: simple random sample, independence, expected cell count 5;

37 Recap II χ 2 test of association: two categorical variables summarized in a contingency table H 0 : the two variables are independent. H 1 : the two variables are associated. If variables have r and c levels resp. d.f. = (r 1) (c 1).

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015 AT&T Global Network Client for Windows Product Support Matrix January 29, 2015 Product Support Matrix Following is the Product Support Matrix for the AT&T Global Network Client. See the AT&T Global Network

More information

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals Summary sheet from last time: Confidence intervals Confidence intervals take on the usual form: parameter = statistic ± t crit SE(statistic) parameter SE a s e sqrt(1/n + m x 2 /ss xx ) b s e /sqrt(ss

More information

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS* COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) 2 Fixed Rates Variable Rates FIXED RATES OF THE PAST 25 YEARS AVERAGE RESIDENTIAL MORTGAGE LENDING RATE - 5 YEAR* (Per cent) Year Jan Feb Mar Apr May Jun

More information

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS* COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) 2 Fixed Rates Variable Rates FIXED RATES OF THE PAST 25 YEARS AVERAGE RESIDENTIAL MORTGAGE LENDING RATE - 5 YEAR* (Per cent) Year Jan Feb Mar Apr May Jun

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Comparing Multiple Proportions, Test of Independence and Goodness of Fit Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2

More information

Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2

Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2 Math 58. Rumbos Fall 2008 1 Solutions to Review Problems for Exam 2 1. For each of the following scenarios, determine whether the binomial distribution is the appropriate distribution for the random variable

More information

Is it statistically significant? The chi-square test

Is it statistically significant? The chi-square test UAS Conference Series 2013/14 Is it statistically significant? The chi-square test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chi-square? Tests whether two categorical

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

Analysis One Code Desc. Transaction Amount. Fiscal Period

Analysis One Code Desc. Transaction Amount. Fiscal Period Analysis One Code Desc Transaction Amount Fiscal Period 57.63 Oct-12 12.13 Oct-12-38.90 Oct-12-773.00 Oct-12-800.00 Oct-12-187.00 Oct-12-82.00 Oct-12-82.00 Oct-12-110.00 Oct-12-1115.25 Oct-12-71.00 Oct-12-41.00

More information

Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the ph in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

More information

Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 1 of 138. Exhibit 8

Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 1 of 138. Exhibit 8 Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 1 of 138 Exhibit 8 Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 2 of 138 Domain Name: CELLULARVERISON.COM Updated Date: 12-dec-2007

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

Enhanced Vessel Traffic Management System Booking Slots Available and Vessels Booked per Day From 12-JAN-2016 To 30-JUN-2017

Enhanced Vessel Traffic Management System Booking Slots Available and Vessels Booked per Day From 12-JAN-2016 To 30-JUN-2017 From -JAN- To -JUN- -JAN- VIRP Page Period Period Period -JAN- 8 -JAN- 8 9 -JAN- 8 8 -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -FEB- : days

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Chi Square Distribution

Chi Square Distribution 17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Section 5 Part 2. Probability Distributions for Discrete Random Variables

Section 5 Part 2. Probability Distributions for Discrete Random Variables Section 5 Part 2 Probability Distributions for Discrete Random Variables Review and Overview So far we ve covered the following probability and probability distribution topics Probability rules Probability

More information

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

More information

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

More information

MAT 155. Key Concept. September 27, 2010. 155S5.5_3 Poisson Probability Distributions. Chapter 5 Probability Distributions

MAT 155. Key Concept. September 27, 2010. 155S5.5_3 Poisson Probability Distributions. Chapter 5 Probability Distributions MAT 155 Dr. Claude Moore Cape Fear Community College Chapter 5 Probability Distributions 5 1 Review and Preview 5 2 Random Variables 5 3 Binomial Probability Distributions 5 4 Mean, Variance and Standard

More information

Exam 3 Review/WIR 9 These problems will be started in class on April 7 and continued on April 8 at the WIR.

Exam 3 Review/WIR 9 These problems will be started in class on April 7 and continued on April 8 at the WIR. Exam 3 Review/WIR 9 These problems will be started in class on April 7 and continued on April 8 at the WIR. 1. Urn A contains 6 white marbles and 4 red marbles. Urn B contains 3 red marbles and two white

More information

Chi-square test Fisher s Exact test

Chi-square test Fisher s Exact test Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

More information

Lecture 5 : The Poisson Distribution

Lecture 5 : The Poisson Distribution Lecture 5 : The Poisson Distribution Jonathan Marchini November 10, 2008 1 Introduction Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,

More information

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency

More information

Chapter 23. Two Categorical Variables: The Chi-Square Test

Chapter 23. Two Categorical Variables: The Chi-Square Test Chapter 23. Two Categorical Variables: The Chi-Square Test 1 Chapter 23. Two Categorical Variables: The Chi-Square Test Two-Way Tables Note. We quickly review two-way tables with an example. Example. Exercise

More information

Solutions to Homework 10 Statistics 302 Professor Larget

Solutions to Homework 10 Statistics 302 Professor Larget s to Homework 10 Statistics 302 Professor Larget Textbook Exercises 7.14 Rock-Paper-Scissors (Graded for Accurateness) In Data 6.1 on page 367 we see a table, reproduced in the table below that shows the

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013 STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico Fall 2013 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN. Conditions for Inference about mean

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

2015-16 BCOE Payroll Calendar. Monday Tuesday Wednesday Thursday Friday Jun 29 30 Jul 1 2 3. Full Force Calc

2015-16 BCOE Payroll Calendar. Monday Tuesday Wednesday Thursday Friday Jun 29 30 Jul 1 2 3. Full Force Calc July 2015 CM Period 1501075 July 2015 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 August 2015 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

More information

Unit 26 Estimation with Confidence Intervals

Unit 26 Estimation with Confidence Intervals Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference

More information

An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

More information

Comparing share-price performance of a stock

Comparing share-price performance of a stock Comparing share-price performance of a stock A How-to write-up by Pamela Peterson Drake Analysis of relative stock performance is challenging because stocks trade at different prices, indices are calculated

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

Mind on Statistics. Chapter 15

Mind on Statistics. Chapter 15 Mind on Statistics Chapter 15 Section 15.1 1. A student survey was done to study the relationship between class standing (freshman, sophomore, junior, or senior) and major subject (English, Biology, French,

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Characteristics of Binomial Distributions

Characteristics of Binomial Distributions Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions BMS 617 Statistical Techniques for the Biomedical Sciences Lecture 11: Chi-Squared and Fisher's Exact Tests Chi Squared and Fisher's Exact Tests This lecture presents two similarly structured tests, Chi-squared

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu

Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu Submission for ARCH, October 31, 2006 Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu Jed L. Linfield, FSA, MAAA, Health Actuary, Kaiser Permanente,

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Chi Square Tests. Chapter 10. 10.1 Introduction

Chi Square Tests. Chapter 10. 10.1 Introduction Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square

More information

The normal approximation to the binomial

The normal approximation to the binomial The normal approximation to the binomial The binomial probability function is not useful for calculating probabilities when the number of trials n is large, as it involves multiplying a potentially very

More information

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

More information

Topic 8. Chi Square Tests

Topic 8. Chi Square Tests BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

P/T 2B: 2 nd Half of Term (8 weeks) Start: 25-AUG-2014 End: 19-OCT-2014 Start: 20-OCT-2014 End: 14-DEC-2014

P/T 2B: 2 nd Half of Term (8 weeks) Start: 25-AUG-2014 End: 19-OCT-2014 Start: 20-OCT-2014 End: 14-DEC-2014 2014-2015 SPECIAL TERM ACADEMIC CALENDAR FOR SCRANTON EDUCATION ONLINE (SEOL), MBA ONLINE, HUMAN RESOURCES ONLINE, NURSE ANESTHESIA and ERP PROGRAMS SPECIAL FALL 2014 TERM Key: P/T = Part of Term P/T Description

More information

P/T 2B: 2 nd Half of Term (8 weeks) Start: 26-AUG-2013 End: 20-OCT-2013 Start: 21-OCT-2013 End: 15-DEC-2013

P/T 2B: 2 nd Half of Term (8 weeks) Start: 26-AUG-2013 End: 20-OCT-2013 Start: 21-OCT-2013 End: 15-DEC-2013 2013-2014 SPECIAL TERM ACADEMIC CALENDAR FOR SCRANTON EDUCATION ONLINE (SEOL), MBA ONLINE, HUMAN RESOURCES ONLINE, NURSE ANESTHESIA and ERP PROGRAMS SPECIAL FALL 2013 TERM Key: P/T = Part of Term P/T Description

More information

Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

More information

Ashley Institute of Training Schedule of VET Tuition Fees 2015

Ashley Institute of Training Schedule of VET Tuition Fees 2015 Ashley Institute of Training Schedule of VET Fees Year of Study Group ID:DECE15G1 Total Course Fees $ 12,000 29-Aug- 17-Oct- 50 14-Sep- 0.167 blended various $2,000 CHC02 Best practice 24-Oct- 12-Dec-

More information

Common Univariate and Bivariate Applications of the Chi-square Distribution

Common Univariate and Bivariate Applications of the Chi-square Distribution Common Univariate and Bivariate Applications of the Chi-square Distribution The probability density function defining the chi-square distribution is given in the chapter on Chi-square in Howell's text.

More information

DESCRIPTIVE STATISTICS & DATA PRESENTATION*

DESCRIPTIVE STATISTICS & DATA PRESENTATION* Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

More information

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS Hypothesis 1: People are resistant to the technological change in the security system of the organization. Hypothesis 2: information hacked and misused. Lack

More information

P/T 2B: 2 nd Half of Term (8 weeks) Start: 24-AUG-2015 End: 18-OCT-2015 Start: 19-OCT-2015 End: 13-DEC-2015

P/T 2B: 2 nd Half of Term (8 weeks) Start: 24-AUG-2015 End: 18-OCT-2015 Start: 19-OCT-2015 End: 13-DEC-2015 2015-2016 SPECIAL TERM ACADEMIC CALENDAR For Scranton Education Online (SEOL), Masters of Business Administration Online, Masters of Accountancy Online, Health Administration Online, Health Informatics

More information

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

More information

Chapter 10 Capital Markets and the Pricing of Risk

Chapter 10 Capital Markets and the Pricing of Risk Chapter 10 Capital Markets and the Pricing of Risk 10-1. The figure below shows the one-year return distribution for RCS stock. Calculate a. The expected return. b. The standard deviation of the return.

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Simple Inventory Management

Simple Inventory Management Jon Bennett Consulting http://www.jondbennett.com Simple Inventory Management Free Up Cash While Satisfying Your Customers Part of the Business Philosophy White Papers Series Author: Jon Bennett September

More information

Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

More information

a. mean b. interquartile range c. range d. median

a. mean b. interquartile range c. range d. median 3. Since 4. The HOMEWORK 3 Due: Feb.3 1. A set of data are put in numerical order, and a statistic is calculated that divides the data set into two equal parts with one part below it and the other part

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Independent Accountants Report on Applying Agreed-Upon Procedures

Independent Accountants Report on Applying Agreed-Upon Procedures Independent Accountants Report on Applying Agreed-Upon Procedures Board of Trustees We have performed the procedures, as discussed below, with respect to the employer contributions remitted by to the in

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

CENTERPOINT ENERGY TEXARKANA SERVICE AREA GAS SUPPLY RATE (GSR) JULY 2015. Small Commercial Service (SCS-1) GSR

CENTERPOINT ENERGY TEXARKANA SERVICE AREA GAS SUPPLY RATE (GSR) JULY 2015. Small Commercial Service (SCS-1) GSR JULY 2015 Area (RS-1) GSR GSR (LCS-1) Texarkana Incorporated July-15 $0.50690/Ccf $0.45450/Ccf $0.00000/Ccf $2.85090/MMBtu $17.52070/MMBtu Texarkana Unincorporated July-15 $0.56370/Ccf $0.26110/Ccf $1.66900/Ccf

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

IBM SPSS Statistics for Beginners for Windows

IBM SPSS Statistics for Beginners for Windows ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

More information

Comparing Means Between Groups

Comparing Means Between Groups Comparing Means Between Groups Michael Ash Lecture 6 Summary of Main Points Comparing means between groups is an important method for program evaluation by policy analysts and public administrators. The

More information

Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

More information

Statistiek I. Proportions aka Sign Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. http://www.let.rug.nl/nerbonne/teach/statistiek-i/

Statistiek I. Proportions aka Sign Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. http://www.let.rug.nl/nerbonne/teach/statistiek-i/ Statistiek I Proportions aka Sign Tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/34 Proportions aka Sign Test The relative frequency

More information

Crosstabulation & Chi Square

Crosstabulation & Chi Square Crosstabulation & Chi Square Robert S Michael Chi-square as an Index of Association After examining the distribution of each of the variables, the researcher s next task is to look for relationships among

More information

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015 Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information