Chi-Square Test (χ²)

Chi Square Tests

Chi-Square Test (χ²)
Nonparametric test for nominal independent variables.
These variables, also called "attribute variables" or "categorical variables," classify observations into a small number of categories. Examples?
The dependent variable: the count of observations in each category of the nominal variable, e.g. the number of men versus women, or the number of accidents in dry weather versus in wet weather.

Two uses of χ²
Chi-Square Goodness of Fit: when we have one independent variable (e.g. weather), we compare the numbers of observations in the categories of this variable (e.g. wet and dry) to what we would expect if the variable did not make any difference (e.g. out of 100 accidents, 50 in dry weather and 50 in wet weather).
Chi-Square Test of Independence: when we have two or more independent variables, we compare the numbers of observations in each category of each variable to the numbers we would expect if the variables were independent of each other.

Chi-Square Goodness of Fit
The observed counts in each category are compared with the expected counts, which are calculated from a theoretical expectation, such as a 1:1 sex ratio.
An example: an area of shore has 59% of its area covered in sand, 28% in mud and 13% in rocks. If seagulls were standing in random places, your null hypothesis would be that 59% of the seagulls are standing on sand, 28% on mud and 13% on rocks.
The independent variable is type of shore; the dependent variable is the observed number of seagulls.
[Pie chart of shore cover: Sand 59%, Mud 28%, Rock 13%]

Tabulating Chi-Square Goodness of Fit
                                  Sand   Mud   Rocks
Observed seagulls (Total: 100)     35     8     57
Expected seagulls (Total: 100)     59    28     13

Calculating the test value
The test statistic is calculated by taking an observed number (O), subtracting the expected number (E), then squaring this difference. The larger the deviation from the null hypothesis, the larger the difference between observed and expected.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Each squared difference is divided by the expected number, and these standardized ratios are summed: the larger the differences between what you would expect and what you get, the bigger the number.

Calculate chi-square for the seagull data: (O − E)² divided by E.
Now check your answer on GraphPad: http://www.graphpad.com
Or on Social Science Statistics: http://www.socscistatistics.com/tests/default.aspx (this will also give you the steps of the calculation)
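The same calculation can be sketched in a few lines of Python as a cross-check on the hand calculation. This is only an illustration: the function name `chi_square` is made up here, and the observed counts are assumed to be listed in the order sand, mud, rocks, as in the table on the earlier slide.

```python
# Seagull counts, assumed to be in the order sand, mud, rocks.
observed = [35, 8, 57]
expected = [59, 28, 13]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.2f}")
```

Each term in the sum is one category's contribution, so printing the terms one by one shows which type of shore deviates most from the null hypothesis.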

Goodness of fit in SPSS
1. Create a variable column (Surface).
2. Create a frequency column.
3. Type the observed frequencies for each category of the independent variable.
4. From the Data menu, weight cases by frequency.
5. Go to Analyze -> Nonparametric -> One Sample -> Chi square.
6. Select Surface as the test field.
7. In Options, you can set the expected values to be equal percentages of the categories (33% here) or you can assign expected values.
8. Run the analysis.

How the test works
1. Identify population distribution and assumptions
a) Two populations: one whose distribution matches the expected outcomes and another whose distribution matches the observed outcomes.
b) Null hypothesis: the two distributions do not differ.
c) Comparison distribution is chi-square.

Characteristics of the comparison distribution
Degrees of freedom: number of categories − 1
[Figure: chi-square distribution for the goodness-of-fit test]

Chi-Square Goodness of Fit: Test Assumptions
1. Random and independent sampling.
2. Sample size must be sufficiently large (no more than 20% of cells should have an expected value of less than 5).
3. Values of the variable are mutually exclusive and exhaustive. Every subject must fall in only one category.
Note: If these assumptions are not met, the critical values in the chi-square table are not necessarily correct.
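The sample-size rule in assumption 2 is easy to check mechanically. The sketch below is illustrative only: the helper name `share_of_small_expected` is invented for this example, and it is applied to the expected seagull counts from the earlier slide.

```python
def share_of_small_expected(expected, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    small = sum(1 for e in expected if e < threshold)
    return small / len(expected)


# Expected seagull counts for sand, mud, rocks.
seagull_expected = [59, 28, 13]
share = share_of_small_expected(seagull_expected)

# The rule of thumb: no more than 20% of cells below 5.
assumption_ok = share <= 0.20
print(f"{share:.0%} of cells have an expected count below 5; OK: {assumption_ok}")
```

The check uses expected counts, not observed counts, so a category with few observations does not by itself violate the assumption.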

Exercise 1: Popularity
A Psychology course is offered by four different professors. The table shows the number of students enrolled in the course of each. Is one professor more popular than another, or are the different enrollment numbers due to chance? Complete the table and run the analysis.
            Prof A   Prof B   Prof C   Prof D
Observed      25       29       22       17
?????

Chi-Square Test of Independence
Two or more nominal variables.
We test the independence of the variables (whether they affect each other).

Chi-Square Test of Independence Example
A researcher wants to know if there is a significant difference in the frequencies with which males come from small, medium, or large cities as contrasted with females. The two variables are hometown size (small, medium, or large) and sex (male or female). Another way of putting our research question is: Is gender independent of size of hometown?
Contingency table for the data for 30 females and 6 males:
Frequency with which males and females come from small, medium, and large cities
          Small   Medium   Large   Totals
Female     10       14       6       30
Male        4        1       1        6
Totals     14       15       7       36

The formula for chi-square is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O is the observed frequency and E is the expected frequency.
The degrees of freedom for the 2-D chi-square statistic are: df = (Columns − 1) × (Rows − 1)

Computing Expected Frequencies
Expected frequency for each cell: the cell's column total × the cell's row total / grand total.
In our example (the contingency table above): column totals are 14 (small), 15 (medium), and 7 (large). Row totals are 30 (female) and 6 (male). Grand total is 36.

Computing Expected Frequencies
The expected frequencies:
1. Small female cell: 14 × 30 / 36 = 11.667
2. Medium female cell: 15 × 30 / 36 = 12.500
3. Large female cell: 7 × 30 / 36 = 5.833
4. Small male cell: 14 × 6 / 36 = 2.333
5. Medium male cell: 15 × 6 / 36 = 2.500
6. Large male cell: 7 × 6 / 36 = 1.167
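The six calculations follow one rule (row total × column total / grand total), so they can be generated from the observed table. The following Python sketch assumes the table is stored as a list of rows (Female, Male) with columns Small, Medium, Large; the function name `expected_frequencies` is illustrative.

```python
# Observed counts: rows are Female, Male; columns are Small, Medium, Large.
observed = [
    [10, 14, 6],
    [4, 1, 1],
]


def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(["Female", "Male"], expected):
    print(label, [round(e, 3) for e in row])
```

The expected counts in each row and each column add up to the same marginal totals as the observed counts, which is a quick sanity check on the arithmetic.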

Observed frequencies, expected frequencies, and (O − E)²/E for males and females from small, medium, and large cities
          Small                             Medium                            Large                             Totals
          Observed  Expected  (O − E)²/E    Observed  Expected  (O − E)²/E    Observed  Expected  (O − E)²/E
Female    10        11.667    0.238         14        12.500    0.180         6         5.833     0.005         30
Male      4         2.333     1.191         1         2.500     0.900         1         1.167     0.024         6
Totals    14                                15                                7                                 36

For practice: Check the accuracy of the hand calculations on Social Science Statistics. Are the two variables independent of each other?
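The hand calculations can also be checked with a short script. The Python sketch below sums (O − E)²/E over the six cells and computes the degrees of freedom. The p-value line relies on a shortcut that is not in the slides: for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(−χ²/2). For any other df a statistics library or a chi-square table would be needed.

```python
import math

# Observed counts: rows are Female, Male; columns are Small, Medium, Large.
observed = [
    [10, 14, 6],
    [4, 1, 1],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Add up (O - E)^2 / E over every cell of the table.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi2 += (o - e) ** 2 / e

df = (len(col_totals) - 1) * (len(row_totals) - 1)

# Closed-form upper-tail probability, valid only when df == 2.
if df != 2:
    raise ValueError("exp(-chi2 / 2) is the p-value only for df = 2")
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

Comparing the p-value with the chosen significance level answers the question about independence, with the caveat that several expected counts in this table are below 5.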

Fisher's Exact Test
The chi-square test is not accurate when we have a small number of observations (expected frequency of less than 5 in more than 20% of cells).
We can substitute Fisher's exact test in a 2 x 2 design.
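To show what the exact test computes, here is a minimal Python sketch for a 2 x 2 table. It is an illustration under stated assumptions: the function name `fisher_exact_two_sided` is invented, the example table is made-up toy data (not from the course files), and the two-sided p-value uses one common convention, summing the probabilities of all tables that are no more likely than the observed one while the row and column totals stay fixed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is as unlikely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical toy table, for illustration only.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p:.4f}")
```

Because the probabilities are computed exactly from the margins, no minimum expected count is required, which is why the test suits small samples.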

Exercise 2: Cycling (handlebar.sav)
Are the Dutch reckless cyclists (keeping only one hand on the handlebar)?
Variables: Nationality (English, Dutch); Hands on handlebar (One, Two)

Observed frequencies
              Dutch   English
One Handed     120      17
Two Handed     578     154
In SPSS:
1. Weight cases by frequency.
2. Go to Analyze -> Descriptive Statistics -> Crosstabs.
3. Choose Chi square from Statistics.
4. You can also choose Phi and Cramer's V (an effect size).
5. From Exact, you can choose Fisher's exact.
6. From Cells, choose the data you need (expected, possibly percentages).
7. Run.

Another way of running Chi-Square using SPSS: ResponseIncentive.sav
A researcher is interested in whether people are more likely to return survey questionnaires if the questionnaire offers an incentive. He sends out 100 questionnaires:
20 promise that the respondent will get the survey results
30 say the respondent will be entered in a prize draw
50 have no incentive
We have two categorical variables: Incentive (results vs. prize draw vs. none) and Response (questionnaire returned or not).

Steps
1. Go to Analyze -> Descriptive Statistics -> Crosstabs.
2. Choose rows and columns.
3. Click display bar charts.
4. Choose statistics (χ² and Cramer's V for effect size).
5. Choose which cells you want displayed (observed and expected).
6. Run the analysis.

SPSS Output

How big is the effect?
Cramer's V = .27 out of 1: a medium association between type of incentive and whether people return a questionnaire. It can be viewed like a correlation coefficient.
The significance level indicates it is unlikely that the observed pattern of data is due to chance.
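The slides do not show how Cramer's V is obtained. A common textbook definition is V = sqrt(χ² / (n × min(rows − 1, columns − 1))), and the Python sketch below applies it under these assumptions: χ² = 7.61 is the value reported in the write-up of this example, n = 100 is the number of questionnaires sent, and the table has 3 incentive categories by 2 response categories. The function name `cramers_v` is illustrative.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic, sample size and table shape."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Assumed inputs: chi-square as reported in the write-up, 100 questionnaires,
# 3 incentive categories by 2 response categories.
v = cramers_v(chi2=7.61, n=100, n_rows=3, n_cols=2)
print(f"Cramer's V = {v:.3f}")
```

With these inputs the result is close to the .27 shown in the SPSS output; any small difference in the last digit would come from rounding of the inputs.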

A more useful effect size: Odds Ratio
1. Odds that a questionnaire was returned given a promise of results:
Odds (responding to results) = number that responded to results / number that didn't respond = 9 / 11 = .82
2. Odds that a questionnaire was returned given a promise of a prize draw:
Odds (responding to draw) = number that responded to draw / number that didn't respond = 16 / 14 = 1.14
3. Odds ratio:
Odds (responding to draw) / Odds (responding to results) = 1.14 / .82 = 1.39
Odds ratios can be calculated for any pairs of categories.
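The three steps can be reproduced with a short Python sketch using the counts from the slide (9 returned and 11 not returned for the results incentive; 16 returned and 14 not returned for the prize draw). The helper name `odds` is illustrative.

```python
def odds(responded, did_not_respond):
    """Odds of responding: responders divided by non-responders."""
    return responded / did_not_respond


odds_results = odds(9, 11)   # promise of survey results
odds_draw = odds(16, 14)     # promise of a prize draw
odds_ratio = odds_draw / odds_results

print(f"odds (results) = {odds_results:.2f}")
print(f"odds (draw)    = {odds_draw:.2f}")
print(f"odds ratio     = {odds_ratio:.2f}")
```

Working from the unrounded odds gives a ratio of about 1.40; the 1.39 on the slide comes from dividing the already rounded odds, 1.14 / .82.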

Writing up the results
There was a significant association between the type of incentive and whether people returned the questionnaire, χ²(2) = 7.61, p = .022. People offered either type of incentive were more likely to respond than those not offered any incentive but, based on the odds ratio, the odds of returning the questionnaire were 1.39 times higher if people were promised a prize draw than if they were promised the results of the survey.

Exercise 4
The relationship between drug companies and medical researchers is under scrutiny because of possible conflicts of interest. The issue that started the controversy was a 1995 case-control study suggesting that the use of calcium-channel blockers to treat hypertension led to an increased risk of heart disease. This led to an intense debate. Researchers writing in the New England Journal of Medicine ("Conflict of Interest in the Debate over Calcium Channel Antagonists," January 8, 1998, p. 101) looked at the 70 research reports that appeared during 1996-1997, classifying them as favorable, neutral, or critical toward the drugs. The researchers then contacted the authors of the reports and questioned them about financial ties to drug companies.
Results in ResearchBribes.sav

Homework
Sonnentag(2012).sav: Is there an association between the amount of time pressure at work and whether we can relax when not working (SwitchOff)?
Births.sav: Are births equally distributed over the year, or are more babies born in some months than in others?

Make-up Homework
Does a negative example on TV make us more negative in our relationships? Eastenders.sav
Couples watched three types of TV programmes:
EastEnders (British soap opera with very miserable and mean people)
Friends (American sitcom with exaggeratedly nice and helpful people)
A neutral nature programme
After watching each programme, the couples were left alone for an hour and the number of sharp/nasty/unfriendly comments they made to each other was counted.
Choose the appropriate test, run it and report the results.