11.3 Contingency Tables


Objectives:
1. Perform a test of homogeneity.
2. Perform a test of independence.

Overview: In this section we consider contingency tables (or two-way frequency tables), which include frequency counts for categorical data arranged in a table with at least two rows and at least two columns. We present a method for testing the claim that the row and column variables are independent of each other. We will use the same method for a test of homogeneity, whereby we test the claim that different populations have the same proportion of some characteristic.

Contingency Tables: A contingency table (or two-way frequency table) is a table in which frequencies correspond to two variables. (One variable is used to categorize rows, and a second variable is used to categorize columns.) Contingency tables have at least two rows and at least two columns.

Test of Independence: A test of independence tests the null hypothesis that in a contingency table, the row and column variables are independent.

Notation:
O represents the observed frequency in a cell of a contingency table.
E represents the expected frequency in a cell, found by assuming that the row and column variables are independent.
r represents the number of rows in a contingency table (not including labels).
c represents the number of columns in a contingency table (not including labels).

Requirements:
1. The sample data are randomly selected.
2. The sample data are represented as frequency counts in a two-way table.
3. For every cell in the contingency table, the expected frequency E is at least 5. (There is no requirement that every observed frequency must be at least 5. Also, there is no requirement that the population must have a normal distribution or any other specific distribution.)

Null and Alternative Hypotheses:
H0: The row and column variables are independent.
H1: The row and column variables are dependent.

Test Statistic (Chi-Squared):

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed frequency in a cell and E is the expected frequency, found by evaluating

$$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$

Critical Values:
1. Found in Table A-4 using (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns.
2. Tests of independence are always right-tailed.

P-Values: P-values are typically provided by computer software, or a range of P-values can be found from Table A-4.

Warning:
1. This procedure cannot be used to establish a direct cause-and-effect link between the variables in question.
2. Dependence means only that there is a relationship between the two variables.

Relationships Among Key Components in Test of Independence: (figure not reproduced in this transcription)
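The formulas for the expected frequency and the chi-squared test statistic can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes: the function names `expected_frequencies` and `chi_square_statistic` are hypothetical, and the 2 x 2 table at the bottom contains made-up counts used only to exercise the code.

```python
def expected_frequencies(observed):
    """Return E = (row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Return (chi-squared statistic, degrees of freedom) for a two-way table."""
    expected = expected_frequencies(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical counts, chosen only to demonstrate the functions.
table = [[10, 20], [20, 10]]
statistic, df = chi_square_statistic(table)
print(round(statistic, 3), df)
```

The statistic returned here still has to be compared with a critical value from Table A-4 (or converted to a P-value) before any conclusion is drawn.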

Example: Responses to a survey question are broken down according to gender and the sample results are given below. At the 0.05 significance level, test the claim that response and gender are independent.

Solution: Requirements are satisfied: randomly assigned to treatment groups, frequency counts, expected frequencies are all at least 5.

Step 1: We are testing the claim that response and gender are independent.

Step 2: The opposite of the claim is that response and gender are dependent.

Step 3: The null hypothesis contains equality, therefore:
H0: The response is independent of gender.
H1: The response and gender are dependent.

Step 4: Significance level is 0.05.

Step 5: We are testing for independence, so we use the χ² test statistic.

Step 6: Use a table to calculate the values of E and the test statistic. Each Chi^2 entry is the cell's contribution (O - E)^2 / E.

                 Yes                   No                 Undecided
Category     O     E    Chi^2     O     E    Chi^2     O     E    Chi^2   Totals
Male        25  27.00    0.15    50  48.00    0.08    15  15.00    0.00       90
Female      20  18.00    0.22    30  32.00    0.13    10  10.00    0.00       60
Totals      45           0.37    80           0.21    25           0.00      150

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 0.579$$

The critical value of χ² = 5.991 is found from Table A-4 with α = 0.05 in the right tail and the number of degrees of freedom given by (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2.

Test Statistic < Critical Value: 0.579 < 5.991

Step 7: Because the test statistic does not fall in the critical region, there is not sufficient evidence to reject the null hypothesis.

Step 8: Conclusion: There is not sufficient evidence to reject the claim that the responses are independent of gender. (The data do not show that responses depend on gender.)

Example: The table below shows the age and favorite type of music of 668 randomly selected people. Use a 0.025 significance level (a 97.5% confidence level) to test the null hypothesis that age and preferred music type are independent.

Solution: Requirements are satisfied: randomly assigned to groups, frequency counts, expected frequencies are all at least 5.

Step 1: We are testing the claim that age and preferred music type are independent.

Step 2: The opposite of the claim is that age and preferred music type are dependent.

Step 3: The null hypothesis contains equality, therefore:
H0: Music type is independent of age.
H1: Music type and age are dependent.

Step 4: Significance level is 0.025. This is obtained from 97.5% = 0.975 and 1 - 0.975 = 0.025.

Step 5: We are testing for independence, so we use the χ² test statistic.

Step 6: Use a table to calculate the values of E and the test statistic. Each Chi^2 entry is the cell's contribution (O - E)^2 / E.

                 Rock                  Pop                Classical
Category     O     E    Chi^2     O     E    Chi^2     O     E    Chi^2   Totals
15-25       50  64.77    3.37    85  77.84    0.66    73  65.39    0.89      208
25-35       68  68.19    0.00    91  81.96    1.00    60  68.85    1.14      219
35-45       90  75.04    2.98    74  90.19    2.91    77  75.76    0.02      241
Totals     208           6.35   250           4.56   210           2.04      668

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 12.954$$

The critical value of χ² = 11.143 is found from Table A-4 with α = 0.025 in the right tail and the number of degrees of freedom given by (r - 1)(c - 1) = (3 - 1)(3 - 1) = 4.

Test Statistic > Critical Value: 12.954 > 11.143

Step 7: Because the test statistic falls in the critical region, there is sufficient evidence to reject the null hypothesis.

Step 8: Conclusion: There is sufficient evidence to reject the claim that music type is independent of age.
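As a cross-check on the hand calculation for the music example, the following Python sketch recomputes the expected frequencies, verifies the requirement that every expected frequency is at least 5, and compares the test statistic with the critical value quoted from Table A-4. It is an illustrative sketch only; the variable names are assumptions, and the critical value is copied from the text rather than computed.

```python
# Observed counts from the music example
# (rows: ages 15-25, 25-35, 35-45; columns: Rock, Pop, Classical).
observed = [
    [50, 85, 73],
    [68, 91, 60],
    [90, 74, 77],
]
critical_value = 11.143  # taken from the text: Table A-4, alpha = 0.025, df = 4

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
smallest_expected = float("inf")
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected frequency under independence.
        e = row_totals[i] * col_totals[j] / grand_total
        smallest_expected = min(smallest_expected, e)
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
requirement_met = smallest_expected >= 5
reject_null = chi_square > critical_value  # right-tailed test

print(f"chi-square = {chi_square:.3f}, df = {df}")
print(f"all expected frequencies >= 5: {requirement_met}")
print(f"reject H0: {reject_null}")
```

The decision rule is the same one used in Steps 6 and 7: reject the null hypothesis only when the statistic exceeds the right-tail critical value.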

Example: 160 students who were majoring in either math or English were asked a test question, and the researcher recorded whether they answered the question correctly. The sample results are given below. At the 0.10 significance level, test the claim that response and major are independent.

Solution: Requirements are satisfied: randomly assigned to groups, frequency counts, expected frequencies are all at least 5.

Example: Use the sample data below to test whether car color affects the likelihood of being in an accident. Use a significance level of 0.01.

Solution: Requirements are satisfied: randomly assigned to groups, frequency counts, expected frequencies are all at least 5.

Test of Homogeneity: In a test of homogeneity, we test the claim that different populations have the same proportions of some characteristics.

How to Distinguish Between a Test of Homogeneity and a Test of Independence: Were predetermined sample sizes used for different populations (test of homogeneity), or was one big sample drawn so that both row and column totals were determined randomly (test of independence)?

Procedure: A test of homogeneity uses the same notation, requirements, test statistic, critical value and procedures as a test of independence. However, instead of testing for independence, we are testing to determine whether the different populations have the same proportions.

Example: On sensitive issues, people tend to give acceptable rather than honest responses; their answers may depend on the gender or race of the interviewer. To support that claim, men were asked if they agreed with this statement: "Abortion is a private matter that should be left to the woman to decide without government intervention." Using a 0.05 significance level, test the claim that the proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women.

Solution: Requirements are satisfied: data are random, frequency counts in a two-way table, expected frequencies are all at least 5.

Step 1: We are testing the claim that the proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women.

Step 2: The opposite of the claim is that the proportions are different.

Step 3: The null hypothesis contains equality, therefore:
H0: The proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women.
H1: The proportions are different.

Step 4: Significance level is 0.05.

Step 5: We are testing for homogeneity, so we use the same χ² test statistic as in a test of independence.

Step 6: Use a table to calculate the values of E and the test statistic. Columns give the gender of the interviewer; each Chi^2 entry is the cell's contribution (O - E)^2 / E.

                  Man                     Woman
Category      O      E     Chi^2      O      E     Chi^2   Totals
Agree        560  578.67    0.60     308  289.33    1.20      868
Disagree     240  221.33    1.57      92  110.67    3.15      332
Totals       800            2.18     400            4.35     1200

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 6.529$$

The critical value of χ² = 3.841 is found from Table A-4 with α = 0.05 in the right tail and the number of degrees of freedom given by (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1.

Test Statistic > Critical Value: 6.529 > 3.841

Step 7: Because the test statistic falls in the critical region, there is sufficient evidence to reject the null hypothesis.

Step 8: Conclusion: There is sufficient evidence to warrant rejection of the claim that the proportions are the same.
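The notes mention that P-values are typically provided by software. For the interviewer example, a P-value can be sketched with the Python standard library alone, because a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so its right-tail area equals erfc(sqrt(χ²/2)). This shortcut is an assumption of the sketch that holds only for 2 x 2 tables (df = 1); it is not the method used in the notes, which rely on Table A-4.

```python
import math

# Observed counts from the interviewer example
# (rows: Agree, Disagree; columns: interviewed by a man, by a woman).
observed = [
    [560, 308],
    [240, 92],
]
alpha = 0.05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (O - E)^2 / E over all four cells.
chi_square = sum(
    (o - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i, row in enumerate(observed)
    for j, o in enumerate(row)
)

# Right-tail area for df = 1 only: P(Z^2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}")
print(f"P-value = {p_value:.4f}")
print(f"reject H0 at alpha = {alpha}: {p_value < alpha}")
```

A P-value below the significance level leads to the same decision as a test statistic that exceeds the critical value, so this check should agree with Steps 7 and 8.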

Example: At a high school debate tournament, half of the teams were asked to wear suits and ties and the rest were asked to wear jeans and t-shirts. The results are given in the table below. Test the hypothesis at the 0.05 level that the proportion of wins is the same for teams wearing suits as for teams wearing jeans.

Solution: Requirements are satisfied: data are random, frequency counts in a two-way table, expected frequencies are all at least 5.

Example: A researcher wishes to test the effectiveness of a flu vaccination. 150 people are vaccinated, 180 people are vaccinated with a placebo, and 100 people are not vaccinated. The number in each group who later caught the flu was recorded. The results are shown below. Use a 0.05 significance level to test the claim that the proportion of people catching the flu is the same in all three groups.

Solution: Requirements are satisfied: data are random, frequency counts in a two-way table, expected frequencies are all at least 5.