Chi-Square Test. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodness-of-Fit


Chi-Square Tests
Chapter 15
Chapter contents: Chi-Square Test for Independence; Chi-Square Tests for Goodness-of-Fit; Uniform Goodness-of-Fit Test; Poisson Goodness-of-Fit Test; Normal Goodness-of-Fit Test; ECDF Tests (Optional)
McGraw-Hill/Irwin. Copyright 2009 by The McGraw-Hill Companies, Inc.

Contingency Tables
A contingency table is a cross-tabulation of n paired observations into categories. Each cell shows the count of observations that fall into the category defined by its row (r) and column (c) heading.
For example: [Table: example contingency table]

Chi-Square Test
In a test of independence for an r x c contingency table, the hypotheses are
H0: Variable A is independent of variable B
H1: Variable A is not independent of variable B
Use the chi-square test for independence to test these hypotheses. This non-parametric test is based on frequencies. The n data pairs are classified into c columns and r rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.
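As a rough illustration of the cross-tabulation described above, the following sketch counts paired observations into an r x c table. The category labels and the data are hypothetical, chosen only to show the mechanics.

```python
from collections import Counter

# Hypothetical paired observations (variable A, variable B); not from the slides.
pairs = [
    ("low", "yes"), ("low", "no"), ("high", "yes"),
    ("low", "yes"), ("high", "no"), ("high", "no"),
]

counts = Counter(pairs)                      # frequency of each (A, B) pair
rows = sorted({a for a, _ in pairs})         # row headings: categories of A
cols = sorted({b for _, b in pairs})         # column headings: categories of B

# table[j][k] is the observed frequency f_jk for row j and column k
table = [[counts[(r, c)] for c in cols] for r in rows]

print(rows, cols)
print(table)
```

Each inner list is one row of the contingency table, so row and column totals can be read off with simple sums.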

Chi-Square Distribution
The critical value comes from the chi-square probability distribution with ν degrees of freedom:
$\nu = (r - 1)(c - 1)$
where
r = number of rows in the table
c = number of columns in the table
Appendix E contains critical values for right-tail areas of the chi-square distribution. The mean of a chi-square distribution is ν and its variance is 2ν.
Consider the shape of the chi-square distribution: [Figure: chi-square distribution]

Expected Frequencies
Assuming that H0 is true, the expected frequency of row j and column k is:
$e_{jk} = R_j C_k / n$
where
$R_j$ = total for row j (j = 1, 2, ..., r)
$C_k$ = total for column k (k = 1, 2, ..., c)
n = sample size

Steps in Testing the Hypotheses
Step 1: State the Hypotheses
H0: Variable A is independent of variable B
H1: Variable A is not independent of variable B
Step 2: Specify the Decision Rule
Calculate $\nu = (r - 1)(c - 1)$. For a given α, look up the right-tail critical value $\chi^2_R$ from Appendix E or by using Excel. Reject H0 if the test statistic exceeds $\chi^2_R$.
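The expected-frequency formula and the degrees of freedom can be sketched in a few lines. The function name and the 2 x 3 table of counts are illustrative assumptions, not part of the slides.

```python
def expected_frequencies(observed):
    """Return (expected, dof) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]          # R_j
    col_totals = [sum(col) for col in zip(*observed)]    # C_k
    n = sum(row_totals)                                  # sample size
    # e_jk = R_j * C_k / n under the hypothesis of independence
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)  # (r - 1)(c - 1)
    return expected, dof


# Hypothetical 2 x 3 table (illustrative data only)
observed = [[20, 30, 50],
            [30, 20, 50]]
expected, dof = expected_frequencies(observed)
print(expected)
print(dof)
```

Here the row totals are both 100 and the column totals are 50, 50 and 100, so every expected count is a simple fraction of n = 200.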

Steps in Testing the Hypotheses
For example, for ν = 6 and α = .05, $\chi^2_{.05} = 12.59$.
Here is the rejection region: [Figure 15.3: rejection region]

Step 3: Calculate the Expected Frequencies
$e_{jk} = R_j C_k / n$
For example: [Figure: worked example of expected frequencies]

Step 4: Calculate the Test Statistic
The chi-square test statistic is
$\chi^2 = \sum_{j=1}^{r} \sum_{k=1}^{c} \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$

Step 5: Make the Decision
Reject H0 if the test statistic exceeds $\chi^2_R$ or if the p-value < α.
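Steps 4 and 5 might look like the following sketch. The table is hypothetical; the critical value 5.991 is the standard right-tail value for ν = 2 and α = .05, and the p-value line relies on the fact that for ν = 2 the right-tail area is exactly exp(-χ²/2). For other degrees of freedom a table or statistical software would be needed.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (f_jk - e_jk)^2 / e_jk over every cell of the table."""
    return sum((f - e) ** 2 / e
               for f_row, e_row in zip(observed, expected)
               for f, e in zip(f_row, e_row))


# Hypothetical 2 x 3 table and its expected frequencies (R_j * C_k / n)
observed = [[20, 30, 50], [30, 20, 50]]
expected = [[25.0, 25.0, 50.0], [25.0, 25.0, 50.0]]

stat = chi_square_statistic(observed, expected)
critical = 5.991                  # right-tail critical value, nu = 2, alpha = .05
p_value = math.exp(-stat / 2)     # exact right-tail area only when nu = 2
reject = stat > critical          # equivalent to p_value < .05

print(stat, p_value, reject)
```

With these counts the statistic falls short of the critical value, so independence is not rejected at the 5 percent level.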

Small Expected Frequencies
The chi-square test is unreliable if the expected frequencies are too small. Rules of thumb:
Cochran's Rule requires that $e_{jk} > 5$ for all cells.
Up to 20% of the cells may have $e_{jk} < 5$.
Most agree that a chi-square test is infeasible if $e_{jk} < 1$ in any cell.
If this happens, try combining adjacent rows or columns to enlarge the expected frequencies.

Cross-Tabulating Raw Data
Chi-square tests for independence can also be used to analyze quantitative variables by coding them into categories. For example, the variables Infant Deaths per 1,000 and Doctors per 100,000 can each be coded into various categories: [Figure: cross-tabulation of coded variables]

Why Do a Chi-Square Test on Numerical Data?
The researcher may believe there's a relationship between X and Y, but doesn't want to use regression.
There are outliers or anomalies that prevent us from assuming that the data came from a normal population.
The researcher has numerical data for one variable but not the other.

3-Way Tables and Higher
More than two variables can be compared using contingency tables. However, it is difficult to visualize a higher-order table. For example, you could visualize a cube as a stack of tiled 2-way contingency tables. Major computer packages permit 3-way tables.
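The rules of thumb for small expected frequencies can be checked mechanically. The sketch below is one possible reading of them: the dictionary keys and the way the 20% rule is combined with the "no cell below 1" rule are assumptions made for illustration.

```python
def check_expected(expected):
    """Apply the small-expected-frequency rules of thumb to an r x c table."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return {
        "cochran_ok": all(e > 5 for e in cells),    # every e_jk > 5
        "share_below_5": share_below_5,             # fraction of cells with e_jk < 5
        # Assumed combination: at most 20% of cells below 5 and none below 1
        "relaxed_ok": share_below_5 <= 0.20 and min(cells) >= 1,
        "infeasible": min(cells) < 1,               # some e_jk < 1
    }


# Hypothetical expected frequencies (illustrative only)
expected = [[12.0, 8.0, 4.0, 6.0],
            [18.0, 12.0, 6.0, 9.0]]
report = check_expected(expected)
print(report)
```

If the report flags a problem, the remedy named above is to combine adjacent rows or columns and recompute the expected frequencies.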

Purpose of the Test
The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population. The chi-square test will be used because it is versatile and easy to understand.

Multinomial GOF Test
A multinomial distribution is defined by any k probabilities $\pi_1, \pi_2, \ldots, \pi_k$ that sum to unity. For example, consider the following official proportions of M&M colors. [Table: official proportions of M&M colors]
The hypotheses are
H0: $\pi_1 = .30$, $\pi_2 = .20$, $\pi_3 = .10$, $\pi_4 = .10$, $\pi_5 = .10$, $\pi_6 = .20$
H1: At least one of the $\pi_j$ differs from the hypothesized value
No parameters are estimated (m = 0) and there are c = 6 classes, so the degrees of freedom are
$\nu = c - m - 1 = 6 - 0 - 1 = 5$

Hypotheses for GOF
The hypotheses are:
H0: The population follows a _____ distribution
H1: The population does not follow a _____ distribution
The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).
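Under the multinomial null hypothesis, the expected frequency in each class is the sample size times the hypothesized proportion. A minimal sketch, assuming a hypothetical sample of n = 200 candies (the sample size is not from the slides):

```python
# Hypothesized proportions from H0 (six color classes)
pi = [0.30, 0.20, 0.10, 0.10, 0.10, 0.20]
assert abs(sum(pi) - 1.0) < 1e-9     # multinomial probabilities must sum to unity

n = 200                              # hypothetical sample size
expected = [n * p for p in pi]       # e_j = n * pi_j

m = 0                                # no parameters estimated from the sample
nu = len(pi) - m - 1                 # degrees of freedom: c - m - 1

print(expected)
print(nu)
```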

Test Statistic and Degrees of Freedom for GOF
Assuming n observations, the observations are grouped into c classes and then the chi-square test statistic is found using:
$\chi^2 = \sum_{j=1}^{c} \frac{(f_j - e_j)^2}{e_j}$
where
$f_j$ = the observed frequency of observations in class j
$e_j$ = the expected frequency in class j if H0 were true
If the proposed distribution gives a good fit to the sample, the test statistic will be near zero. The test statistic follows the chi-square distribution with degrees of freedom
$\nu = c - m - 1$
where c is the number of classes used in the test and m is the number of parameters estimated.
With m = 0 parameters estimated: $\nu = c - 0 - 1 = c - 1$
With m = 1 parameter estimated: $\nu = c - 1 - 1 = c - 2$
With m = 2 parameters estimated: $\nu = c - 2 - 1 = c - 3$

Data-Generating Situations
Instead of "fishing" for a good-fitting model, visualize a priori the characteristics of the underlying data-generating process.

Mixtures: A Problem
Mixtures occur when more than one data-generating process is superimposed on top of one another.
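The GOF statistic and its degrees of freedom translate directly into code. In this sketch the observed counts are hypothetical, and 11.07 is the standard right-tail critical value for ν = 5 at α = .05.

```python
def gof_statistic(observed, expected):
    """Chi-square GOF statistic: sum of (f_j - e_j)^2 / e_j over the c classes."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


def gof_dof(c, m):
    """Degrees of freedom: c classes, m parameters estimated from the sample."""
    return c - m - 1


# Hypothetical counts for six classes (n = 200) and their expected frequencies
observed = [55, 45, 18, 22, 20, 40]
expected = [60, 40, 20, 20, 20, 40]

stat = gof_statistic(observed, expected)
nu = gof_dof(len(observed), m=0)
critical = 11.07                  # right-tail critical value, nu = 5, alpha = .05
reject = stat > critical

print(stat, nu, reject)
```

A statistic this close to zero indicates a good fit, in line with the remark above.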

Eyeball Tests
A simple "eyeball" inspection of the histogram or dot plot may suffice to rule out a hypothesized population.

Small Expected Frequencies
Goodness-of-fit tests may lack power in small samples. As a guideline, a chi-square goodness-of-fit test should be avoided if n < 25.

Uniform Distribution
The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence. The chi-square test for a uniform distribution compares all c groups simultaneously. The hypotheses are:
H0: $\pi_1 = \pi_2 = \cdots = \pi_c = 1/c$
H1: Not all $\pi_j$ are equal

Uniform GOF Test: Grouped Data
The test can be performed on data that are already tabulated into groups.
Calculate the expected frequency $e_j$ for each cell.
The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.
Obtain the critical value $\chi^2_\alpha$ from Appendix E for the desired level of significance α.
The p-value can be obtained from Excel. Reject H0 if p-value < α.

Uniform GOF Test: Raw Data
First form c bins of equal width and create a frequency distribution.
Calculate the observed frequency $f_j$ for each bin.
Define $e_j = n/c$.
Perform the chi-square calculations.
The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.
Obtain the critical value from Appendix E for a given significance level α and make the decision.
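For grouped data, the uniform GOF calculations reduce to a few lines. The counts below are hypothetical, and 7.815 is the standard right-tail critical value for ν = 3 at α = .05.

```python
def uniform_gof(observed):
    """Uniform GOF on tabulated counts: returns (e_j, chi-square statistic, dof)."""
    c = len(observed)
    n = sum(observed)
    e = n / c                                   # same expected frequency in every class
    stat = sum((f - e) ** 2 / e for f in observed)
    return e, stat, c - 1                       # no parameters estimated, so nu = c - 1


# Hypothetical counts in c = 4 groups (n = 40)
observed = [12, 8, 11, 9]
e, stat, nu = uniform_gof(observed)
critical = 7.815                                # nu = 3, alpha = .05
reject = stat > critical

print(e, stat, nu, reject)
```

For raw data the same function applies once the observations have been counted into c equal-width bins.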

Uniform GOF Test: Raw Data
Maximize the test's power by defining bin width as [formula missing from the transcription]. As a result, the expected frequencies will be as large as possible.
Calculate the mean and standard deviation of the uniform distribution as:
$\mu = (a + b)/2$
$\sigma = \sqrt{[(b - a + 1)^2 - 1]/12}$
If the data are not skewed and the sample size is large (n > 30), then the mean is approximately normally distributed. So, test the hypothesized uniform mean using [formula missing from the transcription].

Poisson Data-Generating Situations
In a Poisson distribution model, X represents the number of events per unit of time or space.
X is a discrete nonnegative integer (X = 0, 1, 2, ...).
Event arrivals must be independent of each other.
Sometimes called a model of rare events because X typically has a small mean.

Poisson Goodness-of-Fit Test
The mean λ is the only parameter. Assuming that λ is unknown and must be estimated from the sample, the steps are:
Step 1: Tally the observed frequency $f_j$ of each X-value.
Step 2: Estimate the mean λ from the sample.
Step 3: Use the estimated λ to find the Poisson probability P(X) for each value of X.
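The mean and standard deviation formulas above are those of a discrete uniform distribution on the integers a through b. The sketch below evaluates them for a six-sided die and then, as an assumption about what the slide intended, applies the usual large-sample z statistic for a mean; the sample mean and sample size are hypothetical.

```python
import math


def discrete_uniform_moments(a, b):
    """Mean and standard deviation of a uniform distribution on integers a..b."""
    mu = (a + b) / 2
    sigma = math.sqrt(((b - a + 1) ** 2 - 1) / 12)
    return mu, sigma


mu, sigma = discrete_uniform_moments(1, 6)     # e.g. a fair six-sided die

# Assumed test of the hypothesized mean: z = (x_bar - mu) / (sigma / sqrt(n))
x_bar, n = 3.8, 36                             # hypothetical sample mean and size
z = (x_bar - mu) / (sigma / math.sqrt(n))

print(mu, sigma, z)
```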

Poisson Goodness-of-Fit Test
Step 4: Multiply P(X) by the sample size n to get expected Poisson frequencies $e_j$.
Step 5: Perform the chi-square calculations.
Step 6: Make the decision.
You may need to combine classes until expected frequencies become large enough for the test (at least until $e_j > 2$).

Poisson GOF Test: Tabulated Data
Calculate the sample mean as:
$\hat{\lambda} = \frac{\sum_{j=1}^{c} x_j f_j}{n}$
Using this estimated mean, calculate the Poisson probabilities either by using the Poisson formula $P(x) = \lambda^x e^{-\lambda} / x!$ or Excel.
For c classes with m = 1 parameter estimated, the degrees of freedom are
$\nu = c - m - 1$
Obtain the critical value for a given α from Appendix E. Make the decision.

Normal Data-Generating Situations
Two parameters, µ and σ, fully describe the normal distribution. Unless µ and σ are known a priori, they must be estimated from a sample by using $\bar{x}$ and s. Using these statistics, the chi-square goodness-of-fit test can be used.
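The six Poisson steps can be sketched as follows. The tabulated counts are hypothetical, and two simplifications are assumed: no observation exceeds 3, and the top class is treated as open-ended ("3 or more") so that the expected frequencies sum to n.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability P(x) = lam^x * exp(-lam) / x!"""
    return lam ** x * math.exp(-lam) / math.factorial(x)


# Step 1: hypothetical tally of observed frequencies f_j for each X-value
x_values = [0, 1, 2, 3]
observed = [30, 40, 20, 10]
n = sum(observed)

# Step 2: estimate the mean from the tabulated data
lam_hat = sum(x * f for x, f in zip(x_values, observed)) / n

# Step 3: Poisson probabilities; the last class absorbs the upper tail (assumption)
probs = [poisson_pmf(x, lam_hat) for x in x_values[:-1]]
probs.append(1 - sum(probs))

# Step 4: expected frequencies e_j = n * P(x)
expected = [n * p for p in probs]

# Step 5: chi-square calculations, with m = 1 parameter estimated
stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
nu = len(observed) - 1 - 1

# Step 6: decision at alpha = .05 (critical value for nu = 2)
reject = stat > 5.991

print(lam_hat, expected, stat, nu, reject)
```

All expected frequencies here exceed 2, so no classes need to be combined before the test.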

Method 1: Standardizing the Data
Transform the sample observations $x_1, x_2, \ldots, x_n$ into standardized values.
Count the sample observations $f_j$ within intervals of the form $\bar{x} + ks$ and compare them with the known frequencies $e_j$ based on the normal distribution.
Advantage is a standardized scale.
Disadvantage is that data are no longer in the original units.
[Figure: standardized intervals under the normal curve]

Method 2: Equal Bin Widths
To obtain equal-width bins, divide the exact data range into c groups of equal width.
Step 1: Count the sample observations in each bin to get observed frequencies $f_j$.
Step 2: Convert the bin limits into standardized z-values by using the formula $z = (x - \bar{x})/s$.
Step 3: Find the normal area within each bin assuming a normal distribution.
Step 4: Find expected frequencies $e_j$ by multiplying each normal area by the sample size n.
Classes may need to be collapsed from the ends inward to enlarge expected frequencies.
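Steps 2 through 4 of Method 2 can be sketched with the standard library's `statistics.NormalDist`. The bin limits, sample mean, standard deviation and sample size below are hypothetical.

```python
from statistics import NormalDist


def normal_expected(bin_limits, mean, sd, n):
    """Expected frequencies for equal-width bins under a fitted normal distribution."""
    z = [(x - mean) / sd for x in bin_limits]             # Step 2: standardize limits
    cdf = [NormalDist().cdf(v) for v in z]
    areas = [hi - lo for lo, hi in zip(cdf, cdf[1:])]     # Step 3: normal area per bin
    return z, [n * a for a in areas]                      # Step 4: e_j = n * area


# Hypothetical sample statistics and bin limits (illustrative only)
limits = [20, 30, 40, 50, 60, 70, 80]
z, expected = normal_expected(limits, mean=50, sd=10, n=100)

print(z)
print([round(e, 2) for e in expected])
```

The outermost bins have small expected frequencies in this example, which is the situation where collapsing classes from the ends inward is suggested.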

Method 3: Equal Expected Frequencies
Define histogram bins in such a way that an equal number of observations would be expected within each bin under the null hypothesis.
Define bin limits so that $e_j = n/c$.
A normal area of 1/c in each of the c bins is desired.
The first and last classes must be open-ended for a normal distribution, so to define c bins, we need c - 1 cutpoints.
The upper limit of bin j can be found directly by using Excel.
Alternatively, find $z_j$ for bin j using Excel and then calculate the upper limit for bin j as $\bar{x} + z_j s$.
Once the bins are defined, count the observations $f_j$ within each bin and compare them with the expected frequencies $e_j = n/c$.
[Table: standard normal cutpoints for equal area bins]

Histograms
The fitted normal histogram gives visual clues as to the likely outcome of the GOF test.
Histograms reveal any outliers or other non-normality issues.
Further tests are needed since histograms vary.
[Figure 15.15]
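The slides obtain the cutpoints from Excel; as an alternative sketch, the inverse normal CDF in Python's standard library gives the same c - 1 standard normal cutpoints. The sample mean and standard deviation used here are hypothetical.

```python
from statistics import NormalDist


def equal_area_cutpoints(mean, sd, c):
    """Return (z cutpoints, bin upper limits) giving normal area 1/c per bin."""
    # z_j is the standard normal value with cumulative area j/c, j = 1..c-1
    z = [NormalDist().inv_cdf(j / c) for j in range(1, c)]
    limits = [mean + zj * sd for zj in z]          # upper limit of bin j: x_bar + z_j * s
    return z, limits


# Hypothetical sample statistics, c = 4 bins (three cutpoints)
z, limits = equal_area_cutpoints(mean=50, sd=10, c=4)

print([round(v, 4) for v in z])
print([round(v, 2) for v in limits])
```

With c = 4 the cutpoints are the quartiles of the fitted normal distribution, and each bin has expected frequency n/4.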

Critical Values for Normal GOF Test
Since two parameters, µ and σ, are estimated from the sample, the degrees of freedom are
$\nu = c - m - 1 = c - 2 - 1 = c - 3$
[Table: critical values for the normal GOF test]
At least 4 bins are needed to ensure 1 df.

ECDF Tests: Kolmogorov-Smirnov and Lilliefors Tests
There are many alternatives to the chi-square test based on the Empirical Cumulative Distribution Function (ECDF).
The Kolmogorov-Smirnov (K-S) test statistic D is the largest absolute difference between the actual and expected cumulative relative frequency of the n data values:
$D = \max |F_a - F_e|$
The K-S test is not recommended for grouped data.
$F_a$ is the actual cumulative frequency at observation i.
$F_e$ is the expected cumulative frequency at observation i under the assumption that the data came from the hypothesized distribution.
The K-S test assumes that no parameters are estimated. If parameters are estimated, use a Lilliefors test. Both of these tests are done by computer.
K-S test for uniformity: [Figure: K-S test for uniformity]
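A minimal sketch of the K-S statistic follows. It uses the common convention of checking the gap just before and just after each jump of the ECDF, which is an assumption about how the maximum is evaluated; the data and the hypothesized Uniform(0, 1) distribution are illustrative only.

```python
def ks_statistic(data, cdf):
    """Largest absolute gap between the ECDF of data and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fe = cdf(x)                       # expected cumulative frequency F_e
        # The ECDF jumps from (i - 1)/n to i/n at x, so check both sides
        d = max(d, i / n - fe, fe - (i - 1) / n)
    return d


# Hypothetical data tested against Uniform(0, 1), whose CDF is F(x) = x
D = ks_statistic([0.1, 0.4, 0.5, 0.9], lambda x: x)
print(D)
```

Judging D against a critical value still requires K-S tables or software, as the slides note.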

ECDF Tests: Kolmogorov-Smirnov and Lilliefors Tests
K-S test for normality: [Figure: K-S test for normality]

ECDF Tests: Anderson-Darling Tests
The Anderson-Darling (A-D) test is widely used for non-normality because of its power.
The A-D test is based on a probability plot.
When the data fit the hypothesized distribution closely, the probability plot will be close to a straight line.
The A-D test statistic measures the overall distance between the actual and the hypothesized distributions, using a weighted squared distance.
Anderson-Darling tests with MINITAB: [Figure: MINITAB Anderson-Darling output]

Applied Statistics in Business & Economics
End of Chapter 15

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Minitab Guide. This packet contains: A Friendly Guide to Minitab. Minitab Step-By-Step

Minitab Guide. This packet contains: A Friendly Guide to Minitab. Minitab Step-By-Step Minitab Guide This packet contains: A Friendly Guide to Minitab An introduction to Minitab; including basic Minitab functions, how to create sets of data, and how to create and edit graphs of different

More information

Computer Lab 3 Thursday, 24 February, 2011 DMS 106 4:00 5:15PM

Computer Lab 3 Thursday, 24 February, 2011 DMS 106 4:00 5:15PM Statistics: Continuous Methods STAT452/652, Spring 2011 Computer Lab 3 Thursday, 24 February, 2011 DMS 106 4:00 5:15PM Goodness of Fit tests: Chi-square, Kolmogorov-Smirnov, Anderson-Darling, Shapiro-Wilk

More information

Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

More information

Chapter Additional: Standard Deviation and Chi- Square

Chapter Additional: Standard Deviation and Chi- Square Chapter Additional: Standard Deviation and Chi- Square Chapter Outline: 6.4 Confidence Intervals for the Standard Deviation 7.5 Hypothesis testing for Standard Deviation Section 6.4 Objectives Interpret

More information

The Goodness-of-Fit Test

The Goodness-of-Fit Test on the Lecture 49 Section 14.3 Hampden-Sydney College Tue, Apr 21, 2009 Outline 1 on the 2 3 on the 4 5 Hypotheses on the (Steps 1 and 2) (1) H 0 : H 1 : H 0 is false. (2) α = 0.05. p 1 = 0.24 p 2 = 0.20

More information

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Comparing Multiple Proportions, Test of Independence and Goodness of Fit Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2

More information

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test... Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Statistics 641 - EXAM II - 1999 through 2003

Statistics 641 - EXAM II - 1999 through 2003 Statistics 641 - EXAM II - 1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

More information

3.6: General Hypothesis Tests

3.6: General Hypothesis Tests 3.6: General Hypothesis Tests The χ 2 goodness of fit tests which we introduced in the previous section were an example of a hypothesis test. In this section we now consider hypothesis tests more generally.

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

1 SAMPLE SIGN TEST. Non-Parametric Univariate Tests: 1 Sample Sign Test 1. A non-parametric equivalent of the 1 SAMPLE T-TEST.

1 SAMPLE SIGN TEST. Non-Parametric Univariate Tests: 1 Sample Sign Test 1. A non-parametric equivalent of the 1 SAMPLE T-TEST. Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

More information

Chapter 23. Two Categorical Variables: The Chi-Square Test

Chapter 23. Two Categorical Variables: The Chi-Square Test Chapter 23. Two Categorical Variables: The Chi-Square Test 1 Chapter 23. Two Categorical Variables: The Chi-Square Test Two-Way Tables Note. We quickly review two-way tables with an example. Example. Exercise

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

More information

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations.

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. 13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. Data is organized in a two way table Explanatory variable (Treatments)

More information

Chapter 9, Part A Hypothesis Tests. Learning objectives

Chapter 9, Part A Hypothesis Tests. Learning objectives Chapter 9, Part A Hypothesis Tests Slide 1 Learning objectives 1. Understand how to develop Null and Alternative Hypotheses 2. Understand Type I and Type II Errors 3. Able to do hypothesis test about population

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Chi Square for Contingency Tables

Chi Square for Contingency Tables 2 x 2 Case Chi Square for Contingency Tables A test for p 1 = p 2 We have learned a confidence interval for p 1 p 2, the difference in the population proportions. We want a hypothesis testing procedure

More information

The Chi Square Test. Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 23 of The Basic Practice of Statistics (6 th ed.

The Chi Square Test. Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 23 of The Basic Practice of Statistics (6 th ed. The Chi Square Test Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 23 of The Basic Practice of Statistics (6 th ed.) Concepts: Two-Way Tables The Problem of Multiple Comparisons Expected

More information

Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

More information

Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics

Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics Jan Krhovják Basic Idea Behind the Statistical Tests Generated random sequences properties as sample drawn from uniform/rectangular

More information

1-3 id id no. of respondents 101-300 4 respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank

1-3 id id no. of respondents 101-300 4 respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank Basic Data Analysis Graziadio School of Business and Management Data Preparation & Entry Editing: Inspection & Correction Field Edit: Immediate follow-up (complete? legible? comprehensible? consistent?

More information

Chi Square Tests. Chapter 10. 10.1 Introduction

Chi Square Tests. Chapter 10. 10.1 Introduction Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square

More information

2. DATA AND EXERCISES (Geos2911 students please read page 8)

2. DATA AND EXERCISES (Geos2911 students please read page 8) 2. DATA AND EXERCISES (Geos2911 students please read page 8) 2.1 Data set The data set available to you is an Excel spreadsheet file called cyclones.xls. The file consists of 3 sheets. Only the third is

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

More information

4) The goodness of fit test is always a one tail test with the rejection region in the upper tail. Answer: TRUE

4) The goodness of fit test is always a one tail test with the rejection region in the upper tail. Answer: TRUE Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 13 Goodness of Fit Tests and Contingency Analysis 1) A goodness of fit test can be used to determine whether a set of sample data comes from a specific

More information

Chi-Square Tests. In This Chapter BONUS CHAPTER

Chi-Square Tests. In This Chapter BONUS CHAPTER BONUS CHAPTER Chi-Square Tests In the previous chapters, we explored the wonderful world of hypothesis testing as we compared means and proportions of one, two, three, and more populations, making an educated

More information

First-year Statistics for Psychology Students Through Worked Examples

First-year Statistics for Psychology Students Through Worked Examples First-year Statistics for Psychology Students Through Worked Examples 1. THE CHI-SQUARE TEST A test of association between categorical variables by Charles McCreery, D.Phil Formerly Lecturer in Experimental

More information

Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests. The Applied Research Center Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

More information

November 08, 2010. 155S8.6_3 Testing a Claim About a Standard Deviation or Variance

November 08, 2010. 155S8.6_3 Testing a Claim About a Standard Deviation or Variance Chapter 8 Hypothesis Testing 8 1 Review and Preview 8 2 Basics of Hypothesis Testing 8 3 Testing a Claim about a Proportion 8 4 Testing a Claim About a Mean: σ Known 8 5 Testing a Claim About a Mean: σ

More information

Chi-square test Fisher s Exact test

Chi-square test Fisher s Exact test Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

Measuring the Power of a Test

Measuring the Power of a Test Textbook Reference: Chapter 9.5 Measuring the Power of a Test An economic problem motivates the statement of a null and alternative hypothesis. For a numeric data set, a decision rule can lead to the rejection

More information

MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES

CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES The chi-square distribution was discussed in Chapter 4. We now turn to some applications of this distribution. As previously discussed, chi-square is

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Variables Control Charts

Variables Control Charts MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. Variables

More information

NCSS Statistical Software. One-Sample T-Test

NCSS Statistical Software. One-Sample T-Test Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

More information

Random Uniform Clumped. 0 1 2 3 4 5 6 7 8 9 Number of Individuals per Sub-Quadrat. Number of Individuals per Sub-Quadrat

Random Uniform Clumped. 0 1 2 3 4 5 6 7 8 9 Number of Individuals per Sub-Quadrat. Number of Individuals per Sub-Quadrat 4-1 Population ecology Lab 4: Population dispersion patterns I. Introduction to population dispersion patterns The dispersion of individuals in a population describes their spacing relative to each other.

More information

The Chi-Square Test. STAT E-50 Introduction to Statistics

The Chi-Square Test. STAT E-50 Introduction to Statistics STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X_1, ..., X_p]′, where ′ denotes transpose.

MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo. 3 MT426 Notebook 3 3. 3.1 Definitions... 3. 3.2 Joint Discrete Distributions...

MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo © Copyright 2004-2012 by Jenny A. Baglivo. All Rights Reserved. Contents 3 MT426 Notebook 3 3.1 Definitions

Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

PASS Sample Size Software

Chapter 250 Introduction The Chi-square test is often used to test whether sets of frequencies or proportions follow certain patterns. The two most common instances are tests of goodness of fit using multinomial

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

START Selected Topics in Assurance

START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in an engineering setting.

Chapter 8 Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter problem: Does the MicroSort method of gender selection increase the likelihood that a baby will be a girl? MicroSort: a gender-selection method developed by Genetics

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

Chapter Five: Paired Samples Methods 1/38

Chapter Five: Paired Samples Methods 5.1 Introduction Paired data arise with some frequency in a variety of research contexts. Patients might have a particular type of laser surgery

Calculate Confidence Intervals Using the TI Graphing Calculator

Calculate Confidence Intervals Using the TI Graphing Calculator Confidence Interval for Population Proportion p Confidence Interval for Population μ (σ is known) 1 Select: STAT / TESTS / 1-PropZInt x: number

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X_1, X_2, ..., X_m. The second sample will be denoted

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

NPTEL STRUCTURAL RELIABILITY

NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

1. Comparing Two Means: Dependent Samples

1. Comparing Two Means: Dependent Samples In the preceding lectures we've considered how to test a difference of two means for independent samples. Now we look at how to do the same thing with dependent

3. Nonparametric methods

3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference that is frequently used concerns tests of hypotheses.

SPSS for Exploratory Data Analysis Data used in this guide: studentp.sav (http://people.ysu.edu/~gchang/stat/studentp.sav)

Data used in this guide: studentp.sav (http://people.ysu.edu/~gchang/stat/studentp.sav) Organize and Display One Quantitative Variable (Descriptive Statistics, Boxplot & Histogram) 1. Move the mouse pointer

7 Hypothesis testing - one sample tests

7 Hypothesis testing - one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X

THE KRUSKAL-WALLIS TEST

THE KRUSKAL-WALLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

Basic concepts and introduction to statistical inference

Basic concepts and introduction to statistical inference Anna Helga Jonsdottir Gunnar Stefansson Sigrun Helga Lund University of Iceland (UI) Basic concepts 1 / 19 A review of concepts Basic concepts Confidence

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes) Final exam is Aug 9. Review

Descriptive Statistics. Understanding Data: Categorical Variables. Descriptive Statistics. Dataset: Shellfish Contamination

Descriptive Statistics Understanding Data: Dataset: Shellfish Contamination Location Year Species Species2 Method Metals Cadmium (mg kg⁻¹) Chromium (mg kg⁻¹) Copper (mg kg⁻¹) Lead (mg kg⁻¹) Mercury

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 Descriptive Statistics 2.1 Frequency Tables 2.2 Measures of Central Tendencies

BIOSTATISTICS QUIZ ANSWERS

BIOSTATISTICS QUIZ ANSWERS 1. When you read scientific literature, do you know whether the statistical tests that were used were appropriate and why they were used? a. Always b. Mostly c. Rarely d. Never

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction 2 Testing distributional assumptions

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One-Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

TABLE OF CONTENTS. About Chi Squares... 1. What is a CHI SQUARE?... 1. Chi Squares... 1. Hypothesis Testing with Chi Squares... 2

About Chi Squares TABLE OF CONTENTS About Chi Squares; What is a CHI SQUARE?; Chi Squares; Goodness of fit test (One-way χ²); Test of Independence (Two-way χ²); Hypothesis Testing

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent

Descriptive Statistics

Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9

Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

MCQ TESTING OF HYPOTHESIS

MCQ TESTING OF HYPOTHESIS MCQ 13.1 A statement about a population developed for the purpose of testing is called: (a) Hypothesis (b) Hypothesis testing (c) Level of significance (d) Test-statistic MCQ

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

tests whether there is an association between the outcome variable and a predictor variable. In the Assistant, you can perform a Chi-Square Test for

This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. In practice, quality professionals sometimes

How Does My TI-84 Do That

How Does My TI-84 Do That A guide to using the TI-84 for statistics Austin Peay State University Clarksville, Tennessee Table of Contents

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

Two Correlated Proportions (McNemar Test)

Chapter 50 Two Correlated Proportions (McNemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through