The Chi-Square Test. STAT E-50 Introduction to Statistics




The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed frequencies with expected frequencies. In a hypothesis test, the expected frequencies are those we would expect if the null hypothesis of our test is true.

The formula is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O represents the observed frequency and E represents the expected frequency. The value of df depends on the type of test you are performing.

The Chi-Square Distribution

The χ² distribution is nonnegative. It is not symmetrical; it is skewed to the right. The distributions form a family, with a separate distribution for each different number of degrees of freedom.

The Chi-square test is one-sided: the rejection region lies in the right tail, beyond the critical value χ²(df, α).

The Chi-Square Test for Goodness of Fit

The goodness-of-fit test compares the distribution of observed outcomes for a single categorical variable to the expected outcomes predicted by a probability model. This test involves one sample and one variable.

Assumptions and Conditions:
  Be sure that the data is counts, or frequencies
  Independence assumption
  Sample size assumption
    Expected cell frequency condition: each expected frequency is at least 5

Example: Automobile insurance is much more expensive for teenage drivers than for older drivers. To justify this cost difference, insurance companies claim that the younger drivers are much more likely to be involved in costly accidents. To test this claim, a researcher obtains information about registered drivers from the Department of Motor Vehicles and selects a sample of 300 accident reports from the police department. The DMV reports the percentage of registered drivers in each age category as reported below. The number of accident reports is also shown. Does this data indicate that accidents occur with the same distribution as the ages of the drivers?

H0: The distribution of the ages of drivers involved in accidents is the same as the distribution of the ages of registered drivers.
Ha: The distribution of the ages of drivers involved in accidents is not the same as the distribution of the ages of registered drivers.

To check the expected cell frequency condition, compute the expected count for each category as n times the registered proportion (for example, 300 × .16 = 48).

Age          Registered drivers   Observed    Expected
             (proportion)         accidents   accidents
Under 20     .16                  68          48
20-29        .28                  92          84
30 or over   .56                  140         168
Total                             n = 300     300

Note: Σ observed = Σ expected

Expected cell frequency condition: each expected frequency ≥ 5. The smallest expected frequency here is 48, so the condition is met.

Age          Proportion   O      E      O − E   (O − E)²   (O − E)²/E
Under 20     .16          68     48     20      400        8.33
20-29        .28          92     84     8       64         .76
30 or over   .56          140    168    −28     784        4.67
Total                     300    300    0                  13.76

Note: Σ(O − E) = 0

Specify the sampling distribution model and the test you will use.

$$\chi^2 = \sum \frac{(O - E)^2}{E}, \quad \text{with } df = k - 1 = 3 - 1 = 2$$

Since the conditions are met, we will use a Chi-square model with 2 degrees of freedom, and do a Chi-square goodness-of-fit test.

χ² = 13.76, df = 2
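The hand calculation above can be cross-checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original handout. It relies on the fact that, for df = 2 only, the upper-tail area of the chi-square distribution has the closed form exp(−χ²/2).

```python
import math

# Data from the worked example: DMV proportions and observed accident counts
# for the categories under 20, 20-29, and 30 or over.
proportions = [0.16, 0.28, 0.56]
observed = [68, 92, 140]

n = sum(observed)
expected = [p * n for p in proportions]

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Closed form for the upper-tail area, valid for df = 2 only.
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.4f}")
```

The computed p-value is about .001, which is consistent with the table-based bound P < .005.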

χ² = 13.76, df = 3 − 1 = 2
P-value: P < .005
Statistical conclusion: Since the p-value is small, reject the null hypothesis.
Conclusion in context: The data indicates that the distribution of ages of drivers involved in accidents is not the same as the distribution of ages of the drivers in the population.

Using SPSS for a Goodness of Fit Test

If you have the expected proportions:

1. Create a numeric variable with a width of 1 and no decimal places for the categories. Code the values of this variable as follows. In the Values column, click on the box with the three dots. You will then see the Value Labels dialog box. Since there are three categories of ages, enter the values 1, 2, and 3 as coding variables. Enter the value "1" and code it as "under 20". (You do not have to use quotation marks; they will be added by SPSS.) Then click on Add and you will see the results.

Continue adding all categories, one at a time, and then click on OK. You will see the results in the Values column in Variable View.

2. Create a numeric variable with no decimal places for the observed frequencies. Then, for each category, enter the coded value. As you enter each value you will see a drop-down box. If you click on it, you can choose from the list of labels. However, if you just move to the next column, you will see the category name associated with the coded value. You can then enter the observed frequency for that category. Repeat this until all observed frequencies have been entered.

3. Weight the cases using the observed frequencies.

4. Now select > Analyze > Nonparametric Tests > Legacy Dialogs > Chi-Square

5. Select the variable with the observed frequencies as the Test Variable. In the Expected Values box, select Values.

6. Enter the expected proportions (as decimals) one at a time, and click on Add until all have been entered.

7. After the last value has been entered, click on OK. You should see a table showing the observed and expected frequencies and a table with the results of the Chi-square test:

count
         Observed N   Expected N   Residual
68       68           48.0         20.0
92       92           84.0         8.0
140      140          168.0        −28.0
Total    300

Test Statistics
              count
Chi-Square    13.762 (a)
df            2
Asymp. Sig.   .001
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 48.0.

These results show that χ² = 13.76 and p = .001.
(Note that you also have the option to choose All categories equal if that is appropriate.)

The Chi-Square Test for Homogeneity

In a test for homogeneity, we compare observed distributions for several groups to see if there are differences among the respective populations. The central issue is whether the category proportions are the same for all the populations. The test involves several samples but only one variable.

Example: The article Relationship of Health Behaviors to Alcohol and Cigarette Use by College Students (J. of College Student Development (1992)) included data on drinking behavior for independently chosen random samples of male and female students similar to the data shown below. Does there appear to be a gender difference with respect to drinking behavior?

Drinking level      Male    Female
None                140     186
Low (1-7)           478     661
Moderate (8-24)     300     173
High (25 or more)   63      16

Assumptions and Conditions for the test for homogeneity:
  Be sure that the data is counts, or frequencies
  Independence assumption, needed if you want to generalize from the data to a population
  Sample size assumption
    Expected cell frequency condition: each expected frequency is at least 5

H0: The proportions of the four drinking levels are the same for males and for females.
Ha: The proportions of the four drinking levels are not the same for males and for females.

To check the expected cell frequency condition, compute each expected frequency as

$$E = \frac{(\text{row total})(\text{column total})}{n}$$

Specify the sampling distribution model and the test you will use.

$$\chi^2 = \sum \frac{(O - E)^2}{E}, \quad df = (R - 1)(C - 1) = (4 - 1)(2 - 1) = (3)(1) = 3$$

The conditions are met, so we will use a Chi-square model with 3 degrees of freedom, and do a Chi-square test of homogeneity.

Fill in the row and column totals.

Calculate the expected frequencies for each cell, using E = (row total)(column total) / n. For example, the expected count of males in the None row is (326)(981) / 2017 = 158.56. Expected counts are shown in parentheses.

Drinking level      Male            Female          Total
None                140 (158.56)    186 (167.44)    326
Low (1-7)           478 (553.97)    661 (585.03)    1139
Moderate (8-24)     300 (230.05)    173 (242.95)    473
High (25 or more)   63 (38.42)      16 (40.58)      79
Total               981             1036            2017

The test statistic is the sum of (O − E)² / E over all eight cells. The first two terms, from the None row, are 2.17 and 2.06.
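The expected counts in the table can be reproduced with a short Python sketch. It assumes the column order male, female and the row order None, Low, Moderate, High; the variable names are illustrative rather than taken from the handout.

```python
# Observed counts from the drinking-behavior example.
# Rows: None, Low, Moderate, High. Columns: male, female.
observed = [
    [140, 186],
    [478, 661],
    [300, 173],
    [63, 16],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: (row total)(column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])

# Expected cell frequency condition: every expected count is at least 5.
print("condition met:", min(min(row) for row in expected) >= 5)
```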

Summing the contributions from all eight cells:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.17 + 2.06 + 10.418 + 9.865 + 21.27 + 20.14 + 15.73 + 14.89 = 96.54$$

H0: The proportions of the four drinking levels are the same for males and females.
Ha: The proportions of the four drinking levels are not the same for males and females.

χ² = 96.54, df = 3
P-value: p < .005
Statistical conclusion: p is small, so the null hypothesis is rejected.
Conclusion in context: The data does indicate a gender difference with respect to drinking behavior.
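The table-based bound p < .005 can be illustrated as follows. This is a sketch only: the critical values are the usual upper-tail entries for df = 3 from a standard chi-square table, not values quoted in the handout, and the names `CRITICAL_DF3` and `bound` are hypothetical.

```python
# Observed counts (rows: None, Low, Moderate, High; columns: male, female).
observed = [[140, 186], [478, 661], [300, 173], [63, 16]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E with E = row * column / n.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumed table entries for df = 3: upper-tail area -> critical value.
CRITICAL_DF3 = {0.10: 6.251, 0.05: 7.815, 0.025: 9.348, 0.01: 11.345, 0.005: 12.838}

# Smallest tabled tail area whose critical value the statistic exceeds.
exceeded = [alpha for alpha, cv in CRITICAL_DF3.items() if chi_sq > cv]
bound = min(exceeded) if exceeded else None

print(f"chi-square = {chi_sq:.3f}, df = {df}")
print(f"p < {bound}" if bound is not None else "p > 0.10")
```

Working with unrounded expected counts gives 96.526 rather than the hand-computed 96.54; the small difference comes from rounding the expected frequencies to two decimals.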

Using SPSS for a Test for Homogeneity

1. Create a string variable for each of the categories, and a numeric variable for the observed frequencies. Be sure to make the columns wide enough ("columns" in Variable View). Then enter the values of these two variables.

2. Weight the cases using the observed frequencies. (> Data > Weight Cases)

3. Select > Analyze > Descriptive Statistics > Crosstabs
Select one variable as the row variable and the other as the column variable.
Click on Statistics and then on Chi-square.
Click on the Cells button, and select Observed and Expected in the Cell Display window. Then click on Continue.
Click on Display clustered bar charts to produce the graph shown in the results.
Click on Continue and then click on OK.

Your output should include a table showing the observed and expected frequencies:

gender * level Crosstabulation
                                   level
                                   high    low      moderate   none    Total
gender   female   Count            16      661      173        186     1036
                  Expected Count   40.6    585.0    242.9      167.4   1036.0
         male     Count            63      478      300        140     981
                  Expected Count   38.4    554.0    230.1      158.6   981.0
Total             Count            79      1139     473        326     2017
                  Expected Count   79.0    1139.0   473.0      326.0   2017.0

and a table with the results of your Chi-square test:

Chi-Square Tests
                       Value        df   Asymp. Sig. (2-sided)
Pearson Chi-Square     96.526 (a)   3    .000
Likelihood Ratio       98.966       3    .000
N of Valid Cases       2017
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 38.42.

These results show that χ² = 96.526 and p = .000 (that is, p < .001).

[Figure: clustered bar chart representing these results]

The Chi-Square Test for Independence

In a test for independence, we investigate association between two categorical variables in a single population. There is one sample, but there are two variables.

Assumptions and Conditions:
  Independence assumption, needed if you want to generalize from the data to a population
  Expected cell frequency condition

Example: The table shown below was constructed using data in the article Television Viewing and Physical Fitness in Adults (Research Quarterly for Exercise and Sport (1990)). The author hoped to determine whether time spent watching television is associated with cardiovascular fitness. Subjects were asked about their television viewing time (per day, rounded to the nearest hour) and were classified as physically fit if they scored in the excellent or very good category on a step test.

TV viewing (hours)   Fit    Not fit
0                    35     147
1-2                  101    629
3-4                  28     222
5 or more            4      34

H0: Fitness and TV viewing are independent.
Ha: Fitness and TV viewing are not independent.

Specify the sampling distribution model and the test you will use.

$$\chi^2 = \sum \frac{(O - E)^2}{E}, \quad df = (R - 1)(C - 1) = (4 - 1)(2 - 1) = (3)(1) = 3$$

Since the conditions are met, we will use a Chi-square model with 3 degrees of freedom, and do a Chi-square test for independence.

Find the row and column totals.

Calculate the expected frequency for each cell, using E = (row total)(column total) / n. Expected counts are shown in parentheses.

TV viewing (hours)   Fit             Not fit         Total
0                    35 (25.48)      147 (156.52)    182
1-2                  101 (102.20)    629 (627.80)    730
3-4                  28 (35.00)      222 (215.00)    250
5 or more            4 (5.32)        34 (32.68)      38
Total                168             1032            1200

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 3.557 + .579 + .014 + .002 + 1.4 + .228 + .328 + .053 = 6.161$$
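To see which cells drive the statistic, the per-cell contributions can be listed in Python. The sketch below is illustrative: the helper `chi2_sf_df3` is a hypothetical name, and it uses a closed-form expression for the upper-tail area that holds for 3 degrees of freedom only, namely erfc(√(x/2)) + √(2x/π)·exp(−x/2).

```python
import math

# Observed counts from the TV viewing example.
# Rows: 0, 1-2, 3-4, 5 or more hours. Columns: fit, not fit.
observed = [[35, 147], [101, 629], [28, 222], [4, 34]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell to the statistic: (O - E)^2 / E.
contributions = [
    [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
chi_sq = sum(sum(row) for row in contributions)


def chi2_sf_df3(x):
    """Upper-tail area of the chi-square distribution with 3 df (df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi_sq)

for row in contributions:
    print([round(c, 3) for c in row])
print(f"chi-square = {chi_sq:.3f}, p = {p_value:.3f}")
```

More than half of the statistic comes from the single cell for fit subjects who watch no television, yet the total is still too small to reach significance at the usual levels.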

χ² = 6.161, df = 3
P-value: p > .10
Statistical conclusion: Since the p-value is large, we cannot reject the null hypothesis.
Conclusion in context: There is not enough evidence to conclude that time spent watching television is associated with cardiovascular fitness.

Using SPSS for a Test for Independence

Follow the instructions for a Chi-Square test for homogeneity. You may define two string variables for the categories and one numeric variable for the counts, or you may choose to use coding for either or both of the variables representing the categories. Then enter the frequencies as before.

Weight the cases by counts, and then use > Analyze > Descriptive Statistics > Crosstabs
Select one variable as the row variable and the other as the column variable.
Click on Statistics and then on Chi-square.
Click on the Cells button, and select Observed and Expected in the Cell Display window.
Click on Display clustered bar charts to produce the graph shown in the results.
Then click on Continue and on OK.

SPSS output:

TVGroup * Fitness Crosstabulation
                                       Fitness
                                       Fit      Not Fit   Total
TVGroup   0           Count            35       147       182
                      Expected Count   25.5     156.5     182.0
          1-2         Count            101      629       730
                      Expected Count   102.2    627.8     730.0
          3-4         Count            28       222       250
                      Expected Count   35.0     215.0     250.0
          5 or more   Count            4        34        38
                      Expected Count   5.3      32.7      38.0
Total                 Count            168      1032      1200
                      Expected Count   168.0    1032.0    1200.0

Chi-Square Tests
                       Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square     6.161 (a)   3    .104
Likelihood Ratio       5.930       3    .115
N of Valid Cases       1200
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 5.32.

These results show that χ² = 6.161 and p = .104.

[Figure: clustered bar chart supporting these results]

Practice Questions

1. A health professional selected a random sample of 100 patients from each of four major hospital emergency rooms to see if the major reasons for emergency room visits (accident, illegal activity, illness, other) are the same in all four hospitals. This is an example of
a. A goodness-of-fit test
b. A test for homogeneity
c. A test for independence

2. An urban economist wants to determine whether the region of the United States a resident lives in is related to his level of education. He randomly selects 1800 US residents and asks them to report their level of education and the region of the US in which they live. The economist is using
a. A goodness-of-fit test
b. A test for homogeneity
c. A test for independence

3. As part of a class project, a student asked a random sample of students about their preferred soft drink: Pepsi, Coke, or 7-Up, to determine whether these three drinks were equally preferred by students. The student should use
a. A goodness-of-fit test
b. A test for homogeneity
c. A test for independence