When to use a Chi-Square test:

Similar documents
Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Testing Research and Statistical Hypotheses

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Crosstabulation & Chi Square

Confidence intervals

Contingency Tables and the Chi Square Statistic. Interpreting Computer Printouts and Constructing Tables

One-Way Analysis of Variance

Is it statistically significant? The chi-square test

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Chi Square Distribution

Math 108 Exam 3 Solutions Spring 00

Week 4: Standard Error and Confidence Intervals

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

First-year Statistics for Psychology Students Through Worked Examples

The Kruskal-Wallis test:

TABLE OF CONTENTS. About Chi Squares What is a CHI SQUARE? Chi Squares Hypothesis Testing with Chi Squares... 2

Confidence Intervals for Cp

Section 13, Part 1 ANOVA. Analysis Of Variance

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Simple Linear Regression Inference

Chi-square test Fisher s Exact test

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Simulating Chi-Square Test Using Excel

One-Way Analysis of Variance (ANOVA) Example Problem

Study Guide for the Final Exam

2 GENETIC DATA ANALYSIS

Additional sources Compilation of sources:

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Chapter 23. Two Categorical Variables: The Chi-Square Test

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

November 08, S8.6_3 Testing a Claim About a Standard Deviation or Variance

Chapter 7. Comparing Means in SPSS (t-tests) Compare Means analyses. Specifically, we demonstrate procedures for running Dependent-Sample (or

5/31/ Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

Eight things you need to know about interpreting correlations:

First-year Statistics for Psychology Students Through Worked Examples. 3. Analysis of Variance

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Constructing and Interpreting Confidence Intervals

MBA 611 STATISTICS AND QUANTITATIVE METHODS

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University

Testing differences in proportions

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

Chi Square Tests. Chapter Introduction

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

6.4 Normal Distribution

Evaluation of Total Quality Management (TQM) Application in the Nigerian Telecommunication Industry: A Case study of Imo NITEL Owerri

Using Excel for inferential statistics

UNDERSTANDING THE TWO-WAY ANOVA

Final Exam Practice Problem Answers

The Chi-Square Test. STAT E-50 Introduction to Statistics

CS 147: Computer Systems Performance Analysis

Goodness of Fit. Proportional Model. Probability Models & Frequency Data

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

5 Cumulative Frequency Distributions, Area under the Curve & Probability Basics 5.1 Relative Frequencies, Cumulative Frequencies, and Ogives

DDBA 8438: The t Test for Independent Samples Video Podcast Transcript

Statistical estimation using confidence intervals

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Chapter 3 RANDOM VARIATE GENERATION

CALCULATIONS & STATISTICS

Chapter 8. Chi-Square Procedures for the Analysis of Categorical Frequency Data Part 1

Section 12 Part 2. Chi-square test

MULTIPLE REGRESSION WITH CATEGORICAL DATA

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

MATH 140 Lab 4: Probability and the Standard Normal Distribution

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

CHAPTER 5 COMPARISON OF DIFFERENT TYPE OF ONLINE ADVERTSIEMENTS. Table: 8 Perceived Usefulness of Different Advertisement Types

Variables Control Charts

Association Between Variables

4. Continuous Random Variables, the Pareto and Normal Distributions

Population Mean (Known Variance)

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Simple linear regression

Year 6 Mathematics - Student Portfolio Summary

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Descriptive Statistics

MATHS ACTIVITIES FOR REGISTRATION TIME

Standard Deviation Estimator

Stats Review Chapters 9-10

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

The Analysis of Variance ANOVA

This content downloaded on Tue, 19 Feb :28:43 PM All use subject to JSTOR Terms and Conditions

T-TESTS: There are two versions of the t-test:

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Roadmap to Data Analysis. Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U

Elementary Statistics Sample Exam #3

Independent samples t-test. Dr. Tom Pierce Radford University

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Chapter 4: Probability and Counting Rules

Transcription:

When to use a Chi-Square test: Usually in psychological research, we aim to obtain one or more scores from each participant. However, sometimes data consist merely of the frequencies with which certain categories of event occurred. Instead of each participant providing a score, participants contribute to a head count : they fall into one category or another. In these circumstances, where one has categorical data, one of the family of Chi-Square tests is the appropriate test to use. The Chi-Square test allows us to compare observed frequencies with what we might expect to find by chance (expected frequencies). Chi-Square Goodness of Fit test: This test is used when you want to compare an observed frequency distribution to a theoretical frequency distribution. The most common situation is where your data consist of the observed frequencies for a number of mutually exclusive categories, and you want to know if they all occur equally frequently (so that our theoretical frequency-distribution contains categories that all have the same number of individuals in them). Worked Example: We select a random sample of 100 A Level students, and ask them their favourite A- Level. We find that they are distributed across five subjects (fictional data!). Favourite A Level Subject Total Observed Frequency 40 35 5 10 10 100 Are all A-Levels equally popular, as far as students are concerned? Or do some A- Levels attract more students than others? Looking at the observed frequencies, it appears that Psychology and Maths are much more popular than the other A-Levels. We can perform a Chi-Square test on these data to find out if this is true, i.e. to see the pattern of preference deviates significantly from what we might expect to obtain if all A-Levels were equally popular with students. Patricia George / Coursework / Statistics / Chi-Square Test / Page 1 of 5

Formula for Chi-Square: χ 2 ( 0 ) = E 2 E Step 1: Calculate the expected frequencies (the frequencies which we would expect to obtain if there were no preferences): total number of instances Expected Frequency = number of categories We have 100 applications and 5 categories, so the Expected Frequency for each category is 20 (i.e. 100/5) Step 2: Subtract, from each observed frequency, its associated expected frequency (e.g. work out (O-E): O - E 20 15-15 -10-10 Step 3: Square each value of (O-E): (O E) 2 400 225 225 100 100 Step 4: Divide each of the values obtained in step 3, by its associated expected frequency: (O E) 2 E 20 11.25 11.25 5 5 Patricia George / Coursework / Statistics / Chi-Square Test / Page 2 of 5

Step 5: Add together all the values obtained in step 4, to get your value of Chi- Square: Χ 2 = 20 + 11.25 + 11.25 + 5 + 5 = 52.25 This is our obtained value of Chi-Square: It is a single number summary of the discrepancy between our obtained frequencies, and the frequencies, which we would expect if all the categories had equal frequencies. The bigger our Chi-Square, the greater the difference between the observed and the expected frequencies. How do we assess whether this value represents a real departure of our obtained frequencies from the expected frequencies? To do this we need to know how likely it is to obtain various values of Chi-Square by chance. Patricia George / Coursework / Statistics / Chi-Square Test / Page 3 of 5

Assessing the size of our obtained Chi-Square value: In a nutshell, you a) Find a table of critical values (at the back of most statistics books) b) Work out how many degrees of freedom (d.f.) you have. For this test it is simply the number of categories minus one. c) Decide on the probability level. (p<0.05) d) Find the critical value in the table (at the intersection of the appropriate d.f. row and probability column). If your obtained Chi-Square value is bigger than the one in the table, then you conclude that your obtained Chid-Square value is too large to have arisen by chance: it is more likely to seem from the fact that there were real differences between the observed and expected frequencies. If on the other hand, your obtained Chi-Square value is smaller than the one in the table, you conclude that there is no reason to think that the observed pattern of frequencies is not due simply to chance (i.e. we retain our initial assumption that the discrepancies between the observed and expected frequencies are due merely to random sampling variations, and hence we have no reason to believe that the categories did not occur with equal frequency. For our worked example (a) We have an obtained Chi-Square value of 52.5 (b) We have five categories, so there are 5-1 = 4 degrees of freedom. (c) Consult a table of critical values of Chi-Square. Here is a typical example: Probability level: d.f. 0.05 0.01 0.001 1 3.84 6.63 10.83 2 5.99 9.21 13.82 3 7.81 11.34 16.27 4 9.49 13.28 18.46 5 11.07 etc etc 6 12.49 (d) The values in each column are the critical values of Chi-Square. These values would be expected to occur by chance with the probability shown at the top of the column. We are interested in the 4.d.f. row, since our obtained Chi-Square has 4 d.f. Patricia George / Coursework / Statistics / Chi-Square Test / Page 4 of 5

With 4 d.f. obtained values of Chi-Square as large as 9.49 or more will occur by chance with a probability of 0.05: in other words, you would expect to get a Chi-Square value of 9.49 or more, only 5 times in a hundred Chi-Squared tests. Thus, values of Chi-Square that are this large do occur by chance, but not very often. Looking at the same row, we find that Chi-Square values of 13.28 are even more uncommon, they occur by chance once in a hundred times (0.01). (e) We compare our obtained Chi-Square to these tabulated valued. If our obtained Chi-Square is larger than a value in the table, it implies that our obtained value is even less likely to occur by chance than is the value in the table. Our obtained value of 52.5 is much bigger than the critical value having occurred by chance are less than one in a thousand (written formally, p < 0.001. We can therefore be relatively confident to conclude that our observed frequencies are significantly different from the frequencies that we would expect to obtain if all categories were equally popular. In other words, not all A-Levels are equally popular with A-Level students. Patricia George / Coursework / Statistics / Chi-Square Test / Page 5 of 5