Objectives. 9.1, 9.2 Inference for two-way tables. The hypothesis: no association. Expected cell counts. The chi-square test.


1 Objectives 9.1, 9.2 Inference for two-way tables The hypothesis: no association Expected cell counts The chi-square test Using software Further reading:

2 Independence/Association: Sample and Population

In the previous section we defined the notions of independence and dependence (also called association) using two-way contingency tables. Recall that two variables are independent if the probability of one variable conditioned on the other is the same as its marginal probability.

Example: In Chapter 13 we gave an example where the gender of a person did not change the chance of a pass or fail. This means that gender and passing are independent variables.

If the marginal probabilities and the conditional probabilities are not the same, then there is an association between the variables.

Example: In Chapter 13 we gave an example where the gender of a person changed (dramatically) the chance of them wearing a dress (or not) to the Oscars. This means there is an association between dress wearing and gender. Knowledge of one variable (i.e. that the person is female) changes the chance of wearing a dress.

In reality we never observe the whole population; the numbers in a two-way table are a sample from a population. In that case, even if the variables are independent, sampling variation means that the marginal probabilities will not be exactly the same as the conditional ones.

3 Example: The number of males and females in higher education is known to be equal. However, in a given class the numbers of males and females are likely to be different. Thus we need to 'test for independence' between the variables given the data set. We do this by 'predicting' what the numbers in the table would be under the scenario of independence and comparing them with what is actually observed in the data. This is best explained through several examples.

4 But in reality these are only samples from the entire population. For a sample we cannot expect that, even in the case of independence (no association), the marginal probabilities and the conditional probabilities will be exactly the same. They will differ due to random variation in the sample. As in everything we have done so far, we are interested not in the sample but in the population itself: what can we infer about the population based on the sample? We therefore want to see whether there is an association between the two variables in the population.

For a two-by-two table, such as the hair and Minidoxil example, we can test whether the conditional probabilities (the proportions who saw an improvement using Minidoxil vs. placebo) are the same by using the test for two proportions. However, this method cannot be extended to larger tables such as the student smoking example. Instead we take a slightly different approach: we calculate what we expect to see if there is no dependence and compare it with what we do observe. We formalize this on the next slides.

5 Hypothesis test for association

When we have two categorical variables it is often of interest to determine whether they are associated. As always, a firm decision can only be made by rejecting a null hypothesis using an appropriate test procedure. This is because we need to know if the apparent differences among sample proportions are likely to have occurred just by chance due to random sampling.

The hypotheses are H0: the variables are not associated vs. Ha: the variables are associated.

We will use the chi-square (χ²) test to assess the null hypothesis based on how well the data fit with what H0 predicts the counts of a two-way table to be.

6 Expected cell counts

To test the null hypothesis of no association between the variables, we must compare the actual (observed) counts from the sample data with the expected counts. The null hypothesis predicts that the cell proportions of the column variable within each row are the same as their overall proportions for the whole table. Specifically, the expected count in any cell of a two-way table when H0 is true is:

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$

Do not round the expected count; it usually is not a whole number. The expected count is a mean, not a value we would actually see.

7 Example 1: Oscars and dresses

Let us return to the Oscars data. Suppose we are missing all the numbers inside the table and only observe the subtotals, i.e. the number who wore dresses, the number who did not wear dresses, the number of females and the number of males.

            Male    Female   Total
Dress                          215
No Dress                       205
Total        200       220     420

Marginal proportions: Dress 215/420 = 0.512, No Dress 205/420 = 0.488, Male 200/420 = 0.476, Female 220/420 = 0.524.

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?

Answer: Only if dress and gender are independent variables. In this case the conditional and marginal probabilities are the same. If there is no dependence between gender and whether they wear a dress or not, then we can use the marginal probabilities to predict the numbers:

            Male                     Female                   Total
Dress       51.2% of 200 = 102.38    51.2% of 220 = 112.62    215
No Dress    48.8% of 200 = 97.62     48.8% of 220 = 107.38    205
Total       200                      220                      420

Compare the numbers from what is observed and expected under independence: they are completely different!
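The step of filling in the table from its margins can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is made up here, and the margins are the Oscars subtotals quoted on the slide.

```python
def expected_counts(row_totals, col_totals):
    """Expected counts under independence: row total * column total / grand total."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share the same grand total")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Margins from the Oscars table: rows are Dress / No Dress, columns are Male / Female.
oscars_expected = expected_counts([215, 205], [200, 220])

for label, row in zip(["Dress", "No Dress"], oscars_expected):
    # Round only for display; the unrounded values should be used in later steps.
    print(label, [round(x, 2) for x in row])
```

Each expected row still adds up to its observed row total, which is a quick sanity check on the calculation.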

8 We measure the difference between what is actually observed in the data and what we predict if there is no association:

$$\chi^2 = \sum_{\text{cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

This difference is huge! It means that our predictions are completely wrong. This is because we made the predictions under the assumption that dress and gender are completely independent; clearly they are not, which is why there is such a big difference. Observe that the p-value is very small. This tells us we reject the null hypothesis, which is that there is no dependence between gender and whether they wear a dress or not.

9 Example 2: Gender and grades

Let us return to the gender and grade data. Suppose we are missing all the numbers inside the table and only observe the subtotals, i.e. the number who passed, the number who did not pass, the number of females and the number of males.

            Male    Female   Total
Pass                           288
Fail                            32
Total        120       200     320

We 'fill in' the middle of the table with what we expect to see if there is no dependence between gender and grades. If there is no dependence between gender and grades, then we can use the marginal probabilities to predict the numbers:

            Male                     Female                   Total
Pass        90% of 120 = 108         90% of 200 = 180         288 (288/320 = 0.90)
Fail        10% of 120 = 12          10% of 200 = 20          32 (32/320 = 0.10)
Total       120 (120/320 = 0.375)    200 (200/320 = 0.625)    320

Compare the numbers from what is observed and expected under independence: they are exactly the same!

10 We measure the difference between what is actually observed in the data and what we predict if there is no association:

$$\chi^2 = \frac{(108-108)^2}{108} + \frac{(180-180)^2}{180} + \frac{(12-12)^2}{12} + \frac{(20-20)^2}{20} = 0$$

There is no difference at all: the observed counts are exactly what is expected under independence of grade and gender. Look at the p-value: it is one. This tells us we cannot reject the null hypothesis that there is no dependence between gender and grades.
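As a rough sketch, the statistic can be computed cell by cell in Python. The observed counts below are an assumption: they are set equal to the expected counts, which is what a statistic of exactly zero implies. The `shifted` table is purely hypothetical, added only to show how the statistic grows when counts with the same margins move away from independence.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Expected counts from the slide (rows: Pass, Fail; columns: Male, Female).
expected = [[108.0, 180.0], [12.0, 20.0]]

# Assumption: observed counts equal the expected ones, as the zero statistic implies.
observed = [[108, 180], [12, 20]]

# Hypothetical table with the same margins but 8 students moved between cells.
shifted = [[100, 188], [20, 12]]

print(chi_square_statistic(observed, expected))
print(round(chi_square_statistic(shifted, expected), 2))
```

Note that the small expected counts in the Fail row contribute most of the statistic for the hypothetical table, because each squared difference is divided by the expected count.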

11 Example 3: Minidoxil and hair

Let us return to the Minidoxil data. Suppose we are missing all the numbers inside the table and only observe the subtotals, i.e. the number who saw an improvement, the number who did not, and the numbers in both groups.

                 Minidoxil   Placebo   Subtotals
Improvement                                  159
No improvement                               453
Total                  310       302         612

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?

Again, we can only use the marginals if there is no dependence/association between the treatment and hair growth. If this is the case, we obtain the following numbers:

                 Minidoxil                Placebo                  Subtotals
Improvement      26% of 310 = 80.54       26% of 302 = 78.46       159 (159/612 = 26%)
No improvement   74% of 310 = 229.46      74% of 302 = 223.54      453 (453/612 = 74%)
Total            310 (310/612 = 50.7%)    302 (302/612 = 49.3%)    612

Unlike the previous two examples, the observed and expected numbers neither match nor are completely different. How should we interpret these differences?

12 We measure the difference between what is actually observed in the data and what we predict if there is no association:

$$\chi^2 = \sum_{\text{cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = 11.58$$

This difference is neither zero (as in Example 2) nor huge (as in Example 1). Can we explain this difference by sampling variation alone, when in fact Minidoxil is no different from the placebo? It is hard to judge based on the value 11.58 alone; we need to know the distribution associated with it and from there deduce the p-value. We see that the p-value is about 0.07%; in the next few slides we explain where this comes from.

13 The chi-squared distribution

This is what a chi-squared distribution looks like. It tells us that if there is no association between the two variables (here, gender and binge drinking), the chi-square statistic is likely to be quite small. In fact the chance of it being large is quite slim. These chances can be obtained using the critical values for the chi-squared distribution given in the chi-squared tables. Looking up the chi-squared tables with 1 df, we see that there is a 25% chance the chi-value will be more than 1.32 and a 5% chance it will be more than 3.84. We apply this to our chi-value.

14 For the chi-square test, H0 states that there is no association between the row and column variables in a two-way table. The alternative is that these variables are related.

If H0 is true, the chi-square test statistic has approximately a chi-square distribution with (r − 1)(c − 1) degrees of freedom.

Use the chi-square table (Table F) to get the P-value. The P-value for the chi-square test is the area to the right of the test statistic χ² under this distribution.

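15 Table F

df = (r − 1)(c − 1)

To use the table, find the row for the degrees of freedom (for example df = 2) and locate the two tabulated critical values on either side of the χ² statistic. The P-value lies between the corresponding upper-tail probabilities p.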

16 The chi-value for the Minidoxil example is 11.58. We see that it is far to the right of the chi-squared distribution. The area to the RIGHT of 11.58 is about 0.07%. This matches the StatCrunch output. This can also be deduced from the tables: since 11.58 > 3.84 (which corresponds to 5% in the chi table), it is clear that the p-value corresponding to 11.58 is a lot less than 5%.

Interpretation: If there is no association between treatment and hair, then only about 0.07% of the time will we observe a chi-square difference in the data of 11.58 or more.
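The table look-ups for 1 degree of freedom can be checked with a short Python sketch. It relies on the fact that a chi-square variable with 1 df is the square of a standard normal variable, so its upper tail can be written with the complementary error function. The helper name `chi_square_tail_df1` is made up for this illustration, and the formula applies to 1 df only.

```python
import math


def chi_square_tail_df1(x):
    """P(chi-square with 1 df > x).

    A chi-square(1) variable is Z**2 for standard normal Z, so
    P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))


# Two critical values quoted from the chi-squared table, then the Minidoxil statistic.
for value in (1.32, 3.84, 11.58):
    print(f"P(chi-square > {value}) = {chi_square_tail_df1(value):.4f}")
```

The first two tail areas come out close to 25% and 5%, in line with the table, while the tail beyond 11.58 is well below 0.1%.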

17 Example 4: Treating cocaine addiction

74 patients addicted to cocaine were assigned at random to one of three possible treatments. The observed variable is whether or not they relapsed into addiction after their treatment. We test whether the chance of relapse is related to treatment with α = 0.05.

[Figure: observed % of no relapse by treatment.]

Overall 26/74 = 35.14% did not relapse. If this occurrence does not depend on the treatment, then we would expect 35.14% of each group not to relapse. This is what a null hypothesis of no association would predict.

[Figure: expected % of no relapse by treatment: 35.14%, 35.14%, 35.14%.]

18 Treating cocaine addiction, cont.

We have noted that, assuming treatment has no effect on the relapse response, each treatment group should be 35.14% no relapse and 64.86% relapse. The expected counts are computed using the margin totals from the observed (actual) counts.

Observed counts (relapse: No / Yes)

               No    Yes   Total
Desipramine    15     10      25
Lithium         7     19      26
Placebo         4     19      23
Total          26     48      74

Expected counts (relapse: No / Yes)

               No                      Yes                      Total
Desipramine    (25 × 26)/74 = 8.78     (25 × 48)/74 = 16.22     25
Lithium        (26 × 26)/74 = 9.14     (26 × 48)/74 = 16.86     26
Placebo        (23 × 26)/74 = 8.08     (23 × 48)/74 = 14.92     23
Total          26                      48                       74

Row totals are the same for both tables. Column totals are the same for both tables.

Comparing the observed and expected counts, we can see that the two do not agree very well. But to say whether this is significant, we need to compute a test statistic and a P-value.

19 Treating cocaine addiction, cont.

Now we compute the χ² test statistic from the observed and expected counts:

$$\chi^2 = \sum \frac{(\text{obs.} - \text{exp.})^2}{\text{exp.}}$$

$$= \frac{(15-8.78)^2}{8.78} + \frac{(10-16.22)^2}{16.22} + \frac{(7-9.14)^2}{9.14} + \frac{(19-16.86)^2}{16.86} + \frac{(4-8.08)^2}{8.08} + \frac{(19-14.92)^2}{14.92} \approx 10.7$$

The degrees of freedom are (3 − 1)(2 − 1) = 2. From Table F, we find that the area to the right of 10.7 is between 0.0025 and 0.005.

The P-value is less than α = 0.05, so we conclude that the chance of relapse is related to the treatment. A causal effect is also indicated because the treatments were assigned at random and the response (relapse/no relapse) was observed afterwards. For many data sets we cannot say there is a causal effect; see HW9, Q4.
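The whole calculation for this example can be sketched in Python as follows. The observed counts are an assumption in the sense that they are reconstructed from the margins (25, 26, 23 and 26, 48) and the no-relapse figures quoted in the slides. The closed-form tail `exp(-x / 2)` is valid only for exactly 2 degrees of freedom.

```python
import math

# Observed counts per treatment as (no relapse, relapse); reconstructed from the
# margins and percentages in the slides, so treat them as an assumption.
treatments = ["Desipramine", "Lithium", "Placebo"]
observed = [[15, 10], [7, 19], [4, 19]]

row_totals = [sum(row) for row in observed]          # 25, 26, 23
col_totals = [sum(col) for col in zip(*observed)]    # 26, 48
n = sum(row_totals)                                  # 74

# Expected count = row total * column total / total sample size (not rounded).
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For exactly 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
if df != 2:
    raise ValueError("the closed-form tail used here only holds for df = 2")
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, P-value = {p_value:.4f}")
```

Using unrounded expected counts gives a statistic that differs slightly in the second decimal from a hand calculation with rounded values. The P-value still falls in the range read from Table F.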

20 Cocaine addiction, cont.

From StatCrunch: use Stat-Tables-Contingency-with summary. The counts are to be provided in a table format.

[Figure: StatCrunch output and observed % of no relapse.]

The P-value is about 0.0047, which is very significant. We reject the null hypothesis of no association and conclude there is a relationship between the treatment and the outcome (relapse or not).

21 Meaning of conditional probabilities: Cocaine addiction, cont.

Since the outcome (relapse or not) is a response variable, it is sensible to ask about its conditional distribution, given each treatment, and then to compare across treatments. For example, 60% of addicts treated with Desipramine did not relapse and 27% of addicts treated with Lithium did not relapse, while only 17% of addicts treated with the placebo did not relapse.

But it is not sensible to look at the conditional proportions for treatment, given the outcome, because treatment is not a response variable in this study: it was applied to the patients (subjects). In fact, the row totals are the sizes of the three samples and are not random.

P(No | Placebo) = 4/23 = 17.39% has meaning.
P(Placebo | No) = 4/26 = 15.38% has no interpretation.
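The two directions of conditioning can be made concrete with a small Python sketch. The counts are again reconstructed from the margins and percentages quoted in the slides, so they should be read as an assumption rather than as the original table.

```python
# Counts per treatment as (no relapse, relapse); reconstructed, hence an assumption.
counts = {"Desipramine": (15, 10), "Lithium": (7, 19), "Placebo": (4, 19)}

# Conditional distribution of the response given each treatment (meaningful here,
# because the treatment group sizes were fixed by the design).
no_relapse_given_treatment = {
    name: no / (no + yes) for name, (no, yes) in counts.items()
}

# Conditioning the other way: share of each treatment among those who did not relapse.
total_no = sum(no for no, _ in counts.values())
treatment_given_no_relapse = {
    name: no / total_no for name, (no, _) in counts.items()
}

for name in counts:
    print(
        f"P(No | {name}) = {no_relapse_given_treatment[name]:.2%}, "
        f"P({name} | No) = {treatment_given_no_relapse[name]:.2%}"
    )
```

The second set of proportions depends on how many patients were assigned to each treatment, which is why it carries no interpretation in this design.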

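22 Cocaine addiction, cont.

Using the confidence interval formula for a single proportion (with the 95% critical value 1.96), we can estimate the individual proportions of no relapse for each treatment.

For Desipramine:
$$0.600 \pm 1.96\sqrt{\frac{0.600 \times 0.400}{25}} = 0.600 \pm 0.192 = (0.408,\ 0.792)$$

For Lithium:
$$0.269 \pm 1.96\sqrt{\frac{0.269 \times 0.731}{26}} = 0.269 \pm 0.170 = (0.099,\ 0.439)$$

For Placebo:
$$0.174 \pm 1.96\sqrt{\frac{0.174 \times 0.826}{23}} = 0.174 \pm 0.155 = (0.019,\ 0.329)$$

We can also compare two groups, say Desipramine and Lithium:
$$(0.600 - 0.269) \pm 1.96\sqrt{\frac{0.600 \times 0.400}{25} + \frac{0.269 \times 0.731}{26}} = 0.331 \pm 0.257 = (0.074,\ 0.588)$$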
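These intervals can be reproduced with a short Python sketch. It assumes a 95% confidence level (critical value 1.96) and the counts 15 of 25, 7 of 26 and 4 of 23 for no relapse. The function names are illustrative only.

```python
import math

Z_95 = 1.96  # assumed critical value for a 95% confidence interval


def proportion_ci(successes, n, z=Z_95):
    """Large-sample confidence interval for a single proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def difference_ci(s1, n1, s2, n2, z=Z_95):
    """Large-sample confidence interval for the difference of two proportions."""
    p1, p2 = s1 / n1, s2 / n2
    margin = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


# (no relapse, group size) per treatment; reconstructed counts, hence an assumption.
groups = {"Desipramine": (15, 25), "Lithium": (7, 26), "Placebo": (4, 23)}

for name, (no_relapse, size) in groups.items():
    low, high = proportion_ci(no_relapse, size)
    print(f"{name}: ({low:.3f}, {high:.3f})")

low, high = difference_ci(15, 25, 7, 26)
print(f"Desipramine - Lithium: ({low:.3f}, {high:.3f})")
```

Because the code keeps full precision for each sample proportion, some endpoints can differ in the third decimal from a hand calculation that rounds the proportions first.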

23 Example 5: Left-handed students

Does right/left-handedness of students vary between genders? We will test H0: there is no relationship between handedness and gender vs. Ha: there is some relationship, at significance level α.

From the class survey, we get the summary in the accompanying table. In StatCrunch use: Stat-Tables-Contingency-with data.

The results of a chi-square test show that the P-value is 4.33%. Since this is bigger than α, we do not reject H0. There is insufficient evidence to conclude that handedness differs by gender.

Note: only 407 students responded to both questions.

24 Example 6: Parental smoking Does parental smoking influence the smoking habits of their high school children? High school students were asked whether they smoke and whether their parents smoke. In StatCrunch, request row percents (if the column variable is a response) and column percents (if the row variable is a response). The proportion of students who smoke, among those for whom both parents smoke, is 400/1780 = 22.47%. The proportion of students for whom both parents smoke, among those who smoke, is 400/1004 = 39.84%. The percent of students who smoke is greatest when both parents smoke and least when neither parent smokes (22% vs. 14%). If a student smokes, it is more likely that both parents do (40%) than that neither parent does (19%).

25 Experimental designs for two-way tables

The chi-square test is an overall technique for determining evidence of a relationship between two categorical variables. There are two cases.

- Compare category proportions for several populations. A simple random sample is selected from each population and a single categorical variable is observed. The cocaine addiction study is an example of this. The populations are the 3 treatments. An SRS was obtained for each treatment. One response was observed.
- Test independence of two categorical variables in a single population. A single random sample is obtained from the population and each individual is classified according to the two categorical variables. The handedness survey is an example of this. There was one random sample from the population and each student responded to the two categorical questions.

We use the χ² test to test the null hypothesis of no relationship in both cases.

26 Review: The chi-square test statistic

The chi-square statistic (χ²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. Summing over all r × c cells in the table (r and c are the number of rows and columns), the formula for the χ² statistic is

$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$

Beware: the denominators are expected counts. Do not include the margins in the sum.

Large values of χ² represent strong deviations from the distribution expected under H0 and provide evidence against H0. How large a χ² value is required for statistical significance depends on its degrees of freedom, df = (r − 1)(c − 1).

27 Summary of testing for association

This test pertains to association between two categorical variables only. The hypotheses are H0: the variables are not associated vs. Ha: the variables are associated.

The data are summarized in an r × c table with cells containing the observed counts for each combination of categories of the two variables. r is the number of rows (categories of the 1st variable) and c is the number of columns (categories of the 2nd variable).

The expected count for each cell in the table is computed by

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$

The test statistic is

$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$

The P-value is computed from the chi-square distribution as the area to the right of the χ² statistic, with df = (r − 1)(c − 1).

28 When is it safe to use a χ² test?

Like the z-tests for proportions, the chi-square test is based on an approximation. We can safely use the test when:

- The samples are simple random samples (SRS).
- All but one or two individual observed counts are 1 or more.
- All expected counts are 5 or more, except perhaps one which should be at least 1.
- For a 2 × 2 table, all four expected counts should be 5 or more.

If the approximation is not appropriate, a statistician should be consulted for the proper procedure.
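The expected-count conditions above can be turned into a simple check, sketched here in Python. It covers only the two rules about expected counts as stated on this slide; the function name `expected_counts_ok` is hypothetical, and the random-sampling condition cannot be checked from the table itself.

```python
def expected_counts_ok(expected):
    """Check the slide's expected-count conditions for the chi-square approximation.

    2 x 2 table: all four expected counts must be 5 or more.
    Larger tables: at most one expected count below 5, and it must be at least 1.
    """
    cells = [e for row in expected for e in row]
    small = [e for e in cells if e < 5]
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_2x2:
        return not small
    return len(small) <= 1 and all(e >= 1 for e in small)


# Expected counts from the cocaine example (3 x 2 table), rounded to two decimals.
cocaine_expected = [[8.78, 16.22], [9.14, 16.86], [8.08, 14.92]]
print(expected_counts_ok(cocaine_expected))

# Hypothetical 2 x 2 table with one expected count below 5.
print(expected_counts_ok([[4.5, 20.5], [15.5, 70.5]]))
```

For the cocaine example every expected count is above 8, so the conditions on expected counts are comfortably met.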

29 Accompanying problems associated with this Chapter Quiz Homework 9


More information

Simulating Chi-Square Test Using Excel

Simulating Chi-Square Test Using Excel Simulating Chi-Square Test Using Excel Leslie Chandrakantha John Jay College of Criminal Justice of CUNY Mathematics and Computer Science Department 524 West 59 th Street, New York, NY 10019 lchandra@jjay.cuny.edu

More information

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217 Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

More information

Two Correlated Proportions (McNemar Test)

Two Correlated Proportions (McNemar Test) Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS Hypothesis 1: People are resistant to the technological change in the security system of the organization. Hypothesis 2: information hacked and misused. Lack

More information

Mind on Statistics. Chapter 15

Mind on Statistics. Chapter 15 Mind on Statistics Chapter 15 Section 15.1 1. A student survey was done to study the relationship between class standing (freshman, sophomore, junior, or senior) and major subject (English, Biology, French,

More information

The Chi-Square Test. STAT E-50 Introduction to Statistics

The Chi-Square Test. STAT E-50 Introduction to Statistics STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Independent samples t-test. Dr. Tom Pierce Radford University

Independent samples t-test. Dr. Tom Pierce Radford University Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of

More information

Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

More information

Mind on Statistics. Chapter 12

Mind on Statistics. Chapter 12 Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference

More information

p ˆ (sample mean and sample

p ˆ (sample mean and sample Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Sample Practice problems - chapter 12-1 and 2 proportions for inference - Z Distributions Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

8 6 X 2 Test for a Variance or Standard Deviation

8 6 X 2 Test for a Variance or Standard Deviation Section 8 6 x 2 Test for a Variance or Standard Deviation 437 This test uses the P-value method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion

More information

ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters. Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013 STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico Fall 2013 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN. Conditions for Inference about mean

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

individualdifferences

individualdifferences 1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

More information

Point and Interval Estimates

Point and Interval Estimates Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

More information

Lesson 3: Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables

Lesson 3: Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables Classwork Example 1 Students at Rufus King High School were discussing some of the challenges of finding space for

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. 277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

More information

Comparing Means in Two Populations

Comparing Means in Two Populations Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1

Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals. 1 BASIC STATISTICAL THEORY / 3 CHAPTER ONE BASIC STATISTICAL THEORY "Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1 Medicine

More information