Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980)



Similar documents
Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Is it statistically significant? The chi-square test

Chapter 23. Two Categorical Variables: The Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

Testing Research and Statistical Hypotheses

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Association Between Variables

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

CHAPTER 14 ORDINAL MEASURES OF CORRELATION: SPEARMAN'S RHO AND GAMMA

Testing differences in proportions

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

One-Way Analysis of Variance (ANOVA) Example Problem

Two Correlated Proportions (McNemar Test)

IMPLEMENTING BUSINESS CONTINUITY MANAGEMENT IN A DISTRIBUTED ORGANISATION: A CASE STUDY

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate

Chi Square Distribution

Chi-square test Fisher s Exact test

First-year Statistics for Psychology Students Through Worked Examples

Using Stata for Categorical Data Analysis

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

Contingency Tables and the Chi Square Statistic. Interpreting Computer Printouts and Constructing Tables

Normality Testing in Excel

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

Row vs. Column Percents. tab PRAYER DEGREE, row col

TABLE OF CONTENTS. About Chi Squares What is a CHI SQUARE? Chi Squares Hypothesis Testing with Chi Squares... 2

Friedman's Two-way Analysis of Variance by Ranks -- Analysis of k-within-group Data with a Quantitative Response Variable

Section 13, Part 1 ANOVA. Analysis Of Variance

Using Excel for inferential statistics

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

Study Guide for the Final Exam

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

UNDERSTANDING THE TWO-WAY ANOVA

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Variables Control Charts

Chi Square Tests. Chapter Introduction

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Factors affecting online sales

8 6 X 2 Test for a Variance or Standard Deviation

5.5. Solving linear systems by the elimination method

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

CALCULATIONS & STATISTICS

Goodness of Fit. Proportional Model. Probability Models & Frequency Data

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Evaluation of Total Quality Management (TQM) Application in the Nigerian Telecommunication Industry: A Case study of Imo NITEL Owerri

Chapter 5 Analysis of variance SPSS Analysis of variance

Solutions to Homework 10 Statistics 302 Professor Larget

Chapter 7. Comparing Means in SPSS (t-tests) Compare Means analyses. Specifically, we demonstrate procedures for running Dependent-Sample (or

In the past, the increase in the price of gasoline could be attributed to major national or global

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Crosstabulation & Chi Square

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

11. Analysis of Case-control Studies Logistic Regression

StatCrunch and Nonparametric Statistics

Additional sources Compilation of sources:

This chapter discusses some of the basic concepts in inferential statistics.

Probability Distributions

Final Exam Practice Problem Answers

Regression III: Advanced Methods

Topic 8. Chi Square Tests

The impact of question wording reversal on probabilistic estimates of defection / loyalty for a subscription product

Chapter 3 RANDOM VARIATE GENERATION

Independent samples t-test. Dr. Tom Pierce Radford University

Analysing Questionnaires using Minitab (for SPSS queries contact -)

Descriptive Analysis

Elementary Statistics Sample Exam #3

The Chi-Square Test. STAT E-50 Introduction to Statistics

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

SAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

PEARSON R CORRELATION COEFFICIENT

2 GENETIC DATA ANALYSIS

MULTIPLE REGRESSION WITH CATEGORICAL DATA

The ANOVA for 2x2 Independent Groups Factorial Design

Fairfield Public Schools

Poisson Models for Count Data

Nonparametric Tests. Chi-Square Test for Independence

Simple Linear Regression Inference

Adverse Impact Ratio for Females (0/ 1) = 0 (5/ 17) = Adverse impact as defined by the 4/5ths rule was not found in the above data.

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Statistical Models in R

CHAPTER 12. Chi-Square Tests and Nonparametric Tests LEARNING OBJECTIVES. USING T.C. Resort Properties

The Kruskal-Wallis test:

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Chapter 19 The Chi-Square Test

Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics

DDBA 8438: The t Test for Independent Samples Video Podcast Transcript

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Descriptive Statistics

CHAPTER 5 COMPARISON OF DIFFERENT TYPE OF ONLINE ADVERTSIEMENTS. Table: 8 Perceived Usefulness of Different Advertisement Types

University of Colorado Campus Box 470 Boulder, CO (303) Fax (303)

Transcription:

Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980) For the Driver Behavior Study, the Chi Square Analysis II is the appropriate analysis below. It also corresponds to a Two-Way Chi Square Analysis in Bruning and Kintz s Computational Handbook of Statistics (p. 297-304) and also corresponds to a Two-Way Chi Square Analysis in Linton and Gallo s The Practical Statistician. See page 74, as well as page 33 for this analysis, and page 62 for limitations regarding sample size and expected frequencies (e.g. f e must be 2, not 5). If you are interested, see page 88 (along with page 33) of the Linton and Gallo text for information about a Three-Way Chi Square Analysis, which is more complex but would be the most appropriate for the Driver Behavior Study, given the number of variables and levels involved. The information below regarding Chi Square Analysis I is for simpler research designs, and is provided for completeness, but particularly as a learning tool. It should be read first. Chi Square Analysis I (Simple Chi Square Analysis or Goodness of Fit Test) Chi Square is a statistic that is used to analyze frequencies and specifically to determine if the frequency of occurrence of two or more events differs significantly from that which would be expected by chance alone. The null hypothesis would be that the frequency of occurrence of the events could be explained by chance and not by some other factor. Consider, for example, the frequency of use of a right and left entrance door to a building. If individuals did not have a preference for the use of one door over the other, then the two doors should be used equally often or should differ in frequency of use to a degree that could be expected by chance alone. If you wanted to find out if individuals demonstrated a preference for one door over the other, your first task would be to count the number of times that individuals use the left versus the right door. Assume you observe 50 people entering the building, and of these 50 people you find that 35 have used the right door and 15 have used the left door. To determine if this preference for the right door is a real preference or instead is within chance variation, you could compute a chi square analysis to help make such an assessment. To calculate the

chi square you would follow the steps outlines in the following table, labled Chi Square Analysis I. In using the calculations presented in this table, remember that your observed frequencies of use of the doors are 35 for the right door and 15 for the left door. Also remember that the theoretical expected frequency of use of each door would be 25, since this is the expected frequency of use if individuals did not demonstrate a preference for one of the doors. Once you have made the calculations illustrated in this table you essentially have computed the chi square since χ 2 = (F o F e ) 2 / F e Chi Square Analysis I Door F o F e F o F e (F o F e ) 2 (F o F e ) 2 / Right 35 25 +10 100 100/25=4 Left 15 25-10 100 100/25=4 (50) (50) (0) (200) 8 = χ 2 F e F o = the observed frequency of use of a given door. F e = the theoretical or expected frequency of use of a given door. = sum Sums are shown at the bottom of the table, with the one at the right being the value of χ 2, i.e. χ 2 =8. To determine whether the χ 2 value of 8 is significant, the appropriate degrees of freedom (df) must be calculated. The df are calculated by subtracting one from the number of observed groups (k), i.e. df = k 1. Since there were two doors, and therefore two groups, k = 2, df = 2-1 or 1. To determine if the χ 2 of 8 with 1 df is significant turn to the chi square table at the end of the handout. This table is set for two significant α levels,.05 and.01, and for a range of df. You must first decide under which significance level you wish to operate. Assume you choose the.05 level. To determine if the χ 2 value of 8 is significant with 1 df you would look at the table under df

= 1 and α =.05. The tabulated value is 3.84. If our calculated χ 2 were less than the tabulated value, then the null hypothesis (that there is no difference between the observed and theoretical scores) would be retained. If the calculated value were larger than that tabulated, we could reject the null hypothesis and we could state that the observed and theoretical conditions are significantly different at the 95% level of confidence. Since the value we calculated equals 8, we can reject the null hypothesis. Note that in this case had we selected the 1% level of significance, we still would have rejected the null hypothesis. Chi Square Analysis II, or Complex Chi Square Analysis, or Test of Association, or Two- Way Chi Square Anaysis The preceding is the simplest form of the chi square statistic and it is suitable only where there is a single dimension of an observed variable, in this case, the location of doors. With a slightly more complicated arrangement, we can determine the dependence or independence of two dimensions of variables. This involves a Complex Chi-Square Analysis as opposed to the preceding Simple Chi-Square Analysis. Assume that you wanted to find out if prior hurricane experience was related to the decision to evacuate in the face of a subsequent hurricane. The two variables would be prior hurricane experience vs. no prior hurricane experience and decision to evacuate vs. decision not to evacuate in face of a subsequent hurricane. Let s say we go into a town where a hurricane was about to hit and divided the population into four categories: (a) those with prior hurricane experience who decided to evacuate before the current storm arrives; (b) those with no prior hurricane experience who decided to evacuate; (c) those with prior hurricane experience who decided not to evacuate; and (d) those with no prior hurricane experience who decided not to evacuate. It should be remembered in all chi square determinations that the subjects in the various cells are assumed to be independent and the same subjects cannot be in two cells. The matrix below shows the form and data for our hypothetical research:

Chi Square Analysis II Prior hurricane experience No prior hurricane experience Marginal Totals: Decision to evacuate Decision not to evacuate Marginal Totals a c b f oa = 10 f ea = 60 X 40 = 22. 85 f ob = 50 f eb = 60 X 65 = 37.14 d f oc = 30 f ec = 45 X 40 = 17.14 f od = 15 f ed = 45 X 65 = 27.85 f oa + f ob = 60 f oc + f od = 45 f oa + f oc = 40 f ob + f od = 65 f oa + f ob + f oc + f od = = Where a, b, c, and d = four different cells f o = observed frequency f e = expected frequency f e of a cell = Row s Proportion of observations Col s Proportion of observations Total # of observations, that is to say: f e = f o Row Χ f o Col Χ Total f o All Cells Total f o All cells Total f o All Cells This reduces down to : f e = f o Row f o Col Total f o All Cells In this method of computing the χ 2 statistic, the f e must be calculated for each cell by, as we have just seen, multiplying the f o row total by the f o column total corresponding to a given cell, and then dividing by the sum of the f o values for all four cells. The value of f e for cell A would be calculated as follows: f ea = (f oa + f ob ) (f oa + f oc ) = 60 40 = 2400 = 22.85 f oa + f ob + f oc + f od 10 + 50 + 30 + 15

Once the f e values have been calculated for all cells, then the additional necessary values for χ 2 can be calculated as follows: Cell f o f e (f o f e ) 2 (f o f e ) 2 f e a 10 22.85 = 12.85 165.12 165.12 / 22.85 = 7.22 b 50 37.14 = 12.86 165.38 165.38 / 37.14 = 4.45 c 30 17.14 = 12.86 165.38 165.38 / 17.14 = 9.65 d 15 27.85 = 12.85 165.12 165.12 / 27.85 = 5.93 Once these values are calculated you have computed all the necessary values for the determination of the χ 2 value. (f o f e ) 2 χ 2 = = ( 7.22 + 4.45 + 9.65 + 5.93 ) = 27.25 f e The df for this χ 2 is computed by calculating the following formula. df = ( C 1 ) ( R 1 ) C = number of columns, R = number of rows

Since the matrix we formed was a 2 row by 2 column matrix, df = (2 1) (2 1) = 1.With a χ 2 value of 27.25 and 1 df, we again go to the chi square table to determine if the χ 2 is signifcant. Going to the table of chi square and looking under the 5% significance level (α) for df = 1, we find the critical value 3.84. Since our calculated value is 27.25, we are able to reject the null hypothesis of no difference and state that there is a significant relationship between the two variable dimensions. Note: this is true at the.01 level as well as the.05 level. Table 1. Distribution of χ 2 * n ( df ) α =.05 α =.01 1 3.84 6.64 2 5.99 9.21 3 7.82 11.34 4 9.49 13.28 5 11.07 15.09 6 12.59 16.81 7 14.07 18.48 8 15.51 20.09 9 16.92 21.67 10 18.31 23.21 11 19.68 24.72 12 21.03 26.22 13 22.36 27.69 14 23.68 29.14 15 25.00 30.58 16 26.30 32.00 17 27.59 33.41 18 28.87 34.80 19 30.14 36.10 20 31.41 37.57 21 32.67 38.93 22 33.92 40.29 23 35.17 41.64 24 36.42 42.98 25 37.65 44.31 26 38.88 45.64 27 40.11 46.96 28 41.34 48.28 29 42.56 49.59 30 43.77 50.89 *Table 1 is taken from Table III of Fischer and Yates: Statistical Tables for Biological, Agricultural and Medical Research, published by Longman Group Ltd., London (previously published by Oliver and Boyd, Edinburgh), and by permission of the authors and publishers.