CHI SQUARE DISTRIBUTION

Similar documents
Is it statistically significant? The chi-square test

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

The Chi-Square Test. STAT E-50 Introduction to Statistics

Chi Square Distribution

Topic 8. Chi Square Tests

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Chi-square test Fisher s Exact test

Contingency Tables and the Chi Square Statistic. Interpreting Computer Printouts and Constructing Tables

Association Between Variables

Chapter 23. Two Categorical Variables: The Chi-Square Test

Nonparametric Tests. Chi-Square Test for Independence

Mind on Statistics. Chapter 4

Testing differences in proportions

Elementary Statistics

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

Problems for Chapter 9: Producing data: Experiments. STAT Fall 2015.

This chapter discusses some of the basic concepts in inferential statistics.

People like to clump things into categories. Virtually every research

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

Mind on Statistics. Chapter 15

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

Odds ratio, Odds ratio test for independence, chi-squared statistic.

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Elementary Statistics Sample Exam #3

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Statistical tests for SPSS

DESCRIPTIVE STATISTICS & DATA PRESENTATION*

Testing Research and Statistical Hypotheses

Crosstabulation & Chi Square

Simulating Chi-Square Test Using Excel

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Inferential Statistics. What are they? When would you use them?

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Section 12 Part 2. Chi-square test

Advanced Statistical Analysis of Mortality. Rhodes, Thomas E. and Freitas, Stephen A. MIB, Inc. 160 University Avenue. Westwood, MA 02090

DDBA 8438: The t Test for Independent Samples Video Podcast Transcript

Solutions to Homework 10 Statistics 302 Professor Larget

Categorical Data Analysis

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

p ˆ (sample mean and sample

First-year Statistics for Psychology Students Through Worked Examples

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

Research Methods & Experimental Design

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

One-Way Analysis of Variance (ANOVA) Example Problem

TABLE OF CONTENTS. About Chi Squares What is a CHI SQUARE? Chi Squares Hypothesis Testing with Chi Squares... 2

Final Exam Practice Problem Answers

Chapter 5 Analysis of variance SPSS Analysis of variance

Lesson 3: Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables

CHAPTER 12. Chi-Square Tests and Nonparametric Tests LEARNING OBJECTIVES. USING T.C. Resort Properties

Chi Square Tests. Chapter Introduction

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Study Guide for the Final Exam

UNDERSTANDING THE TWO-WAY ANOVA

The Effects of Alcohol Use on Academic Performance Among College Students

Adverse Impact Ratio for Females (0/ 1) = 0 (5/ 17) = Adverse impact as defined by the 4/5ths rule was not found in the above data.

Whitney Colbert Research Methods for the Social Sciences Trinity College Spring 2012

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Mind on Statistics. Chapter 12

Mendelian Genetics in Drosophila

Math 108 Exam 3 Solutions Spring 00

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University

International Journal of Technical Research and Applications e Dr. A. Jesu Kulandairaj ABSTRACT- In the present scenario of the competitive

Chapter VIII Customers Perception Regarding Health Insurance

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

The Partnership for a Drug-Free New Jersey. College Student Survey Results. November 2008

Statistik for MPH: september Risiko, relativ risiko, signifikanstest (Silva: ) Per Kragh Andersen

IBM SPSS Statistics for Beginners for Windows

11. Analysis of Case-control Studies Logistic Regression

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Weight of Evidence Module

Hypothesis Testing: Two Means, Paired Data, Two Proportions

Statistics 2014 Scoring Guidelines

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

Independent samples t-test. Dr. Tom Pierce Radford University

American Journal Of Business Education July/August 2012 Volume 5, Number 4

Using Stata for Categorical Data Analysis

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS

8 6 X 2 Test for a Variance or Standard Deviation

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

CHAPTER 5 COMPARISON OF DIFFERENT TYPE OF ONLINE ADVERTSIEMENTS. Table: 8 Perceived Usefulness of Different Advertisement Types

CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Chapter 6. Examples (details given in class) Who is Measured: Units, Subjects, Participants. Research Studies to Detect Relationships

Transcription:

CI SQUARE DISTRIBUTION 1

Introduction to the Chi Square Test of Independence This test is used to analyse the relationship between two sets of discrete data. Contingency tables are used to examine the relationship between subjects' scores on two or more qualitative or categorical variables. For example, consider the hypothetical experiment on the effect of smoking on divorce to find if there is any relationship between them. In a sample of 167, 73 from 85 of the smokers had been divorced. Of the non-smokers, only 43 of 8 had been divorced. These data are depicted in the contingency table shown above. The cell entries are cell frequencies. The top left cell with a "73" in it means: 73 subjects in the group who smoked were divorced. From the table, it looks as if the smokers of the group are more likely to divorce than the non-smokers. Thus, the column a subject is in (divorced or not divorced) is contingent upon (depends on) the row the subject is in (smoker or non-smoker). If the columns are not contingent on the rows, then the rows and column frequencies are independent. The test of whether the columns are contingent on the rows is called the chi square test of independence. The null hypothesis is the hypothesis that there is no relationship between row and column frequencies. Parameters and Symbols o the null hypothesis A or 1 the alternative hypothesis O observed count E expected count test critical chi-square test value chi-square value from the table table Divorced Not divorced Total Smoker 73 1 85 Non-smoker 43 39 8 Total 116 51 167 Formulae For each cell in the contingency table: E = ( row total) ( column total) ( grand total) To calculate the critical value: ( O E) test = Σ Degrees of freedom df = ( no. of rows 1) ( no. of columns 1) E

Steps To Follow to Perform the Chi-Squared Test: 1 State the null hypothesis. This states that the variables in the contingency table are independent (or the classification of one variable does not affect the classification of the other). eg o : Divorce is not related to being a smoker. State the alternative hypothesis. This states that the variables are dependent, or that the direction of one affects the direction of the other. eg A : Divorce is related to being a smoker. 3 Calculate the expected values. Record the experimental results (observed values, O) in a contingency table of r rows and c columns. Calculate the expected (E) values for each cell: Row i total Column j total Estimated or expected number for the cell in i th row and j th column Ε = ( R )( C ) n Grand total Record these values in the contingency table. Divorced Not divorced Observed Expected Observed Expected Total Smoker 73 85 116 1 = 59 167 85 51 85 = 6 167 Non Smoker 43 8 116 39 = 57 167 8 51 8 = 5 167 Total 116 51 167 3

Calculate the chi-square test value: sum for every cell in each row and each column test = ( Ο Ε ) Ε observed number of responses for cell in i th row and j th column = ( 73 59) ( 1 6) ( 43 57) ( 39 5) 59 6 57 5 ( = 3.3 7.54 3.44 7.84) =.14 NOTE: It is not necessary to write these individual calculations separately. It is quicker to put numbers in your calculator exactly as is. 4 Find the critical value from chi-squared tables. To use these tables, two pieces of information are required: the level of significance usually at the 5% level (written as α = 0. 05) degrees of freedom = df = (no. of rows minus 1) (no. of columns -1) = r 1 c 1 ( )( ) With 1 degree of freedom and α = 0. 05, from tables, = 3. 841 5 Compare the test statistic with table statistic. test > the critical value from tables table, then the null hypothesis is rejected and the alternative hypothesis is favoured. If the calculated test statistic ( ) Otherwise, we conclude that there is not enough evidence to reject o. In our example, test value of =. 14. This is greater than the table value of 3.841. ence we reject in favour of, that is that divorce and O A smoking are related. The chi-squared test is valid if all Ε 1 and no more than 0% of Ε 5; ie is most accurate when the expected frequency in each cell 5. 4

Example questions 1. EXAM QUESTION 001 Semester In a recent trial, researchers gave subjects either aspirin or a placebo (an inactive substance) to determine whether aspirin reduced the risk of a heart attack. The results of this study are summarised in the following table. eart Attack Yes No observed Expected Observed Expected Aspirin 104 10933 Placebo 189 10845 (a) State the alternative hypothesis for a chi-square test of this data. (1) (b) Calculate the expected values for this data. () (c) Calculate the Chi-square statistic for this data. (3) (d) At the α = 0.05 level of significance, draw an appropriate conclusion from this test. Also, write a sentence explaining your conclusion, so that it would be understood by someone with no previous training in statistics. (4) Test Summer School 004 Use the following information to answer Questions 8-1 The Director of Transportation of a large company is interested in the pattern of usage of her van pool. She considers her routes to be divided into local and nonlocal. She is particularly interested in learning if there is a difference in the proportion of males and females who use the local routes. She takes a sample of a day's riders and finds the following: Male Female Total Local 0 40 60 Non-local 30 10 40 Total 50 50 100 The Director uses this information to perform a Chi-Square test using a level of significance of 0.05. (8) Referring to the above Table, what are the degrees of freedom of the test? (9) Referring to the above Table, what is the expected cell frequency in the Male/Local cell? (10) Referring to the above Table, what is the calculated value of the Chi- Square test statistic? (11) What is the critical Chi-Square value for the above Table for a level of significance of 0.05? (1) What conclusion from the Chi-Square test for the above data can be made? 5

3. Test question A study was conducted to determine whether brand awareness of female TV viewers and the gender of the spokesperson are independent. Each, in a sample of 300 female TV viewers, was asked to identify a product advertised by a celebrity spokesperson. The gender of the spokesperson and whether or not the viewer could identify the product was recorded. The numbers in each category are given below. Male spokesperson Female spokesperson Identified product 41 61 Could not identify 109 89 a. Which of the following would be an appropriate alternative hypothesis? A Brand awareness of female TV viewers and the gender of spokesperson are independent. B Brand awareness of female TV viewers and the gender of spokesperson are not independent. C Brand awareness of female TV viewers and the products they identified are not independent. D The gender of the spokesperson and the products identified are related. b. What are he degrees of freedom of the test statistic? c. Referring to Table 1, what is the calculated test statistic? d. Referring to Table, at 5% level of significance, what is the critical value of the test statistic? Answers 1. eart Attack Yes No Total Observed Expected Observed Expected Aspirin 104 147 10933 10890 11037 Placebo 189 146 10845 10888 11034 Totals 93 1778 071 (a) : That taking aspirin affects the incidence of heart attack (b) As in table above. (c) Chi-square (test) = 5.58 (d) Chi-square (table) = 3.84 So since the test value is greater than the table value, we can reject the null hypothesis in favour of the alternative ie that taking aspirin will lessen your chances of having a heart attack.. (8) 1 (9) 30 (10) 16.67 (11) 3.841 (1) Since, calculated test statistic > critical value, reject the null hypothesis 3. a. B: Brand awareness of female TV viewers and the gender of spokesperson are not independent. b. 1 c. 5.9418 d. 3.8415 6

Further practice questions from past papers 1. The following is a printout produced from part of a study of the relationship between family income and choice of shopping mall. Shopping mall Income 1 3 All 1 60 5 14 99 65.74 5.0 8.09 99.00 66 3 9 107 71.05 7.4 8.71 107.00 3 17 40 8 175 116.1 44.55 14.4 175.00 All 53 97 31 381 53.00 97.00 31.00 381.00 CI-SQUARE = 10.9 WIT D.F. = 4 (a) For α = 0.05, what conclusion would you come to? (b) (c) Show how the result of 44.55 was calculated for shopping mall=, income=3. What contribution to the overall chi-square value do the shopping mall=, income=3 make?. A class of students was asked if they drank alcohol. Of those that did, a sample was asked more detail about their drinking. Each was asked for their preferred drink : [1 = wine, = beer, 3 = spirits] and for the amount they drank each week on a scale of 1 to 4:[1= a little to 4 = a lot ]. The table below shows the results. Amount Preference 1 3 4 ALL 1 8 4 1 0 13 5.7.60 1.8.86 13 5 3 3 9 0 8.80 4.00.80 4.40 0.00 3 9 3 3 17 7.48 3.40.38 3.74 17.00 ALL 10 7 11 50.00 10.00 7.00 11.00 50.00 CI-SQUARE = 1.933 with d. f. = 6 (a) What is the critical value for the chi-square statistic (α = 0.05)? (b) Give your conclusion: what does the chi-squared test suggest? (c) Show how the value of 7.48 [row 3, column 1] is calculated. (d) What contribution to the overall chi-squared value do the wine drinkers make? 3. A random sample of 300 students were asked their income (low to high in four categories 1 to 4) and whether they owned a car (Yes or No). The results obtained are in the table below: Income category Owns a car 1 3 4 Total No 65 48 5 4 1 60.19 ** 6.51 6.51 1 Yes 83 7 11 1 178 87.81 71.0 9.49 9.49 178 Total 148 10 16 16 300 (a) Given that the chi-squared test =.886 with d.f. = 3, at = 0.05, what is the correct conclusion to draw from this test? 7

(b) (c) What is the expected value for the cell corresponding to Does not own car, Income = (ie the missing value)? What is the contribution to the overall Chi-squared value of.886 from the cell corresponding to owns car, Income = 4? 4 At the 0.05 significance level, use the data in the following table to test the claim that the sentence is independent of the category of crime. Embezzlement Fraud Forgery Total Sent to jail Not sent to jail 57 8 130 146 0 5 17 8 Total 79 76 45 Answers: 1. a) = 9.488 from tables. Since this is less than test value, we reject the null hypothesis. That is, choice of shopping mall is dependent on family income. 175 97 E = = 44.55 381 40 44.55 44.55 b) c) Contribution = ( ) = 0. 4647. a) = 1. 59 b) Since test value of 1.933 is greater than table value, we can reject the null hypothesis, in favour of the alternative that amount of alcohol consumed does depend on the type preferred. 17 c) E 31 = = 7. 48 50 d) 8 5.5 4.60 1 1.8 0.86 contribution of wine drinkers : = =5.097 5.7.60 1.8.86 3. (a) from tables =7.815. Since test value of.886 is less than table value, there is not enough evidence to reject O. (b) 1 10 Expected value = = 48. 8 300 (c) Contribution from the cell corresponding to owns car, Income = 4 = ( 1 9.49) = 0.664 9.49 4 Observed Expected Observed Expected Observed Expected 17 79 17 76 17 55 = 34 130 = 119 0 = 4 8 79 8 76 8 45 57 = 45 146 130 119 0 4 119 4 0.67 3. 0.77 0.038 = = 157 57 45 45 5 = 6 5 6 6 from tables (with degrees of freedom and α = 0.05) = 5.991. Since test value is greater than table value, we reject sentence depends on type of crime. ( ) ( ) ( ) ( ) ( 34) ( ) ( ) ( ) ( 146 157) ( ) = 34 = 4. 1.0 9.9 157 O in favour of A - that type of