How To Test For Fit

Tests for Goodness of Fit

General notion: we often wish to know whether a particular distribution fits a general definition.
Example: to use t tests, we must suppose that the population is normally distributed.
If a sample is drawn from, say, a normal distribution, the sample values should reflect the population distribution.
This allows us to state the number in the sample that should be in a particular range.
Example: 68% of a normal distribution is within +/- 1 standard deviation of the mean, so about 68% of the values in a sample from a normal distribution should be within +/- 1 standard deviation of the mean.
Comparison of actual and expected numbers is the province of the $\chi^2$ distribution.

Let $O_j$ be the number observed in the sample in range $j$.
Let $E_j$ be the number that would be expected if the population had a given distribution, such as uniform, Poisson, normal, etc. Then

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$

where $k$ is the number of categories.
Degrees of freedom = $k - 1 - m$, where $m$ is the number of parameter estimates used in the calculation.

Example: Are the answers to Dr. Dinwiddie's multiple-choice tests random?
If so, the answers should conform to a uniform distribution and P(A) = P(B) = P(C) = P(D) = 1/4. (For the uniform distribution P(E) = 1/n, where n is the number of possible values.)
On a recent exam there were sixty questions with correct answers: A-20, B-5, C-17, and D-18.
H0: the distribution of answers is uniform
H1: the distribution is not uniform

Correct Answer   Observed   Expected   Squared Difference
A                20         15          25
B                 5         15         100
C                17         15           4
D                18         15           9

Then

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j} = \frac{25 + 100 + 4 + 9}{15} = 9.20$$

and no parameters were estimated, so degrees of freedom = 4 - 1 = 3.

Excel and the chi-square distribution
CHIDIST(x value, df) returns the area in the right-hand tail of the chi-square distribution.
Goodness of fit tests are all upper one-tail tests, so CHIDIST gives the p-value of the test.
CHIINV(probability, df) gives the chi-square value for the upper tail of the probability entered. Use it to find the critical value for a chi-square test.
For the Dinwiddie problem: CHIDIST(9.20, 3) gives the p-value of the test.
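The Dinwiddie calculation can also be checked outside Excel. The following is a minimal Python sketch, assuming the observed counts are A-20, B-5, C-17, D-18 (sixty questions in all). The helper names `chi_square_statistic` and `chi2_upper_tail` are illustrative only and do not come from any library; `chi2_upper_tail` is intended to play the role of CHIDIST for whole-number degrees of freedom.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O_j - E_j)^2 / E_j over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_upper_tail(x, df):
    """Right-hand tail area of the chi-square distribution (integer df only).

    Starts from the closed forms for df = 1 and df = 2 and steps up
    two degrees of freedom at a time.
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # tail area for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # tail area for df = 1
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Assumed counts of correct answers A, B, C, D
observed = [20, 5, 17, 18]
n = sum(observed)
expected = [n / len(observed)] * len(observed)  # uniform: 15 per answer

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1 - 0                      # no parameters estimated
p_value = chi2_upper_tail(chi_sq, df)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.4f}")
```

A small p-value counts against H0, the hypothesis that the answers are uniformly distributed.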

EXAMPLE: Hamish suspects that the dice at Black Bart's are not fair, so he spirits one out of the casino one night. After rolling the stolen die 120 times, he has the following result:

No. of Dots   No. of Times
1             27
2             24
3             18
4             11
5             27
6             13

What are the null and alternative hypotheses? Is Hamish right to be suspicious of Black Bart?

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$
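One way to work the dice problem is to compare the statistic with a critical value, which is what CHIINV supplies in Excel. The sketch below assumes the counts 27, 24, 18, 11, 27, 13 for faces 1 to 6 (120 rolls) and a 5% significance level; the significance level is an assumption, since the problem does not state one. The helpers `chi2_upper_tail` and `chi2_critical_value` are illustrative names, and the critical value is found by simple bisection rather than by any library routine.

```python
import math

def chi2_upper_tail(x, df):
    """Right-hand tail area of the chi-square distribution (integer df only)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_critical_value(alpha, df):
    """Value whose upper-tail area is alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > alpha:  # widen until the tail is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Assumed counts for faces 1..6
observed = [27, 24, 18, 11, 27, 13]
rolls = sum(observed)
expected = [rolls / 6] * 6                  # fair die: equal expected counts

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                      # no parameters estimated
alpha = 0.05                                # assumed significance level
critical = chi2_critical_value(alpha, df)

print(f"chi-square = {chi_sq:.2f}, critical value = {critical:.2f}")
print("reject H0 (fair die)" if chi_sq > critical else "do not reject H0 (fair die)")
```

Here H0 is that every face has probability 1/6 and H1 is that at least one face does not.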

Testing for normality
Suppose that nationally auto insurance has a mean price of $700 with standard deviation $135.
We have a sample of 80 NC drivers, and we'd like to know whether their insurance bills are normally distributed with the national parameters.
How many would we expect in the range 700 to 835?
HINT: how many standard deviations is that? What proportion of a normal distribution is within that range of standard deviations?

Answer: on a normal distribution, 0.34 are between the mean and +1 st dev, so we'd expect to find 0.34 * 80 = 27.2 in that range.
Setting up a spreadsheet: use NORMSDIST.
NORMSDIST(-2) gives the proportion more than two standard deviations below the mean.
NORMSDIST(-1) - NORMSDIST(-2) would give the proportion between 1 and 2 st devs below the mean.

Continuing in that fashion, we'd have the following:

St Devs     Range      Prop.      Expected freq
< -2        < 430      0.02275     1.82
-2 to -1    430-565    0.1359     10.87
-1 to 0     565-700    0.3413     27.31
0 to 1      700-835    0.3413     27.31
1 to 2      835-970    0.1359     10.87
> 2         > 970      0.02275     1.82
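The expected frequencies for these six ranges can be reproduced without a spreadsheet. This is a small sketch using Python's standard-library `statistics.NormalDist`, whose `cdf` method does the same job as NORMSDIST for the standard normal. The variable names are illustrative; the mean of 700, standard deviation of 135, and sample size of 80 are taken from the problem.

```python
from statistics import NormalDist

mean, sd, n = 700, 135, 80
z_edges = [-2, -1, 0, 1, 2]                 # bin edges in standard deviations

std_normal = NormalDist()                   # mean 0, standard deviation 1
cumulative = [0.0] + [std_normal.cdf(z) for z in z_edges] + [1.0]

# Proportion in each bin is the difference of neighbouring cumulative values
proportions = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
expected = [p * n for p in proportions]

dollar_edges = [mean + z * sd for z in z_edges]
labels = ["< -2", "-2 to -1", "-1 to 0", "0 to 1", "1 to 2", "> 2"]

print("bin edges in dollars:", dollar_edges)
for label, p, e in zip(labels, proportions, expected):
    print(f"{label:>9}  prop = {p:.5f}  expected = {e:.2f}")
```

The two tail ranges come out with expected frequencies well under 5, which matters for the validity of the chi-square value.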

To find the observed values in the sample, use the HISTOGRAM tool.
An elaborated solution appears under Study Aids on my web site. Click on the link to normaltest.xls.
Issue: how many degrees of freedom does the statistic have?
df = k - 1 - m = 6 - 1 - 0 = 5

Alternate technique: determine whether the sample was drawn from a normal population.
First, calculate the sample mean and standard deviation and use those numbers in the calculation.
Issue: how many degrees of freedom does the statistic have? Two parameters were estimated from the sample, so m = 2.
df = k - 1 - m = 6 - 1 - 2 = 3

A problem and an alternate solution
Each cell should have an expected frequency of at least 5; otherwise the chi-square value is not correct.
One solution: choose ranges with equal expected frequencies.
Divide the data into, say, 10 ranges, each expected to contain 8 observations, so we define ranges that each contain 1/10 of the total.
Remember that NORMINV(probability, mean, standard deviation) displays the upper boundary of the given probability for the specified mean and standard deviation.
Example: NORMINV(.1, 300, 20) = 274.37, so 10% of this distribution is below 274.37.
NORMINV(1/10, X, s) will find the boundary of the lowest 10% of the distribution, NORMINV(4/10, X, s) finds the boundary of the lowest 40%, and so on, where X and s are the sample mean and standard deviation.
Look carefully at sheet 2 of the workbook normaltest.xls as posted.

The boundaries thus found are the bin range. Each bin will have an expected number equal to n/c, where n is the amount of data and c is the number of categories.
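The equal-frequency boundaries can be sketched in Python with `statistics.NormalDist`, whose `inv_cdf` method corresponds to NORMINV. The sample mean and standard deviation are not given in these notes, so this sketch assumes the national values of 700 and 135 in their place; with real data you would substitute the sample's own X and s.

```python
from statistics import NormalDist

# Check against the worked example: NORMINV(.1, 300, 20)
check = NormalDist(mu=300, sigma=20).inv_cdf(0.1)
print(f"10% of a normal(300, 20) distribution lies below {check:.2f}")

# Assumed parameters (national values stand in for the sample X and s)
dist = NormalDist(mu=700, sigma=135)
n, c = 80, 10                               # observations and number of bins

# Upper boundaries of the lowest 10%, 20%, ..., 90% of the distribution
boundaries = [dist.inv_cdf(i / c) for i in range(1, c)]
expected_per_bin = n / c

for i, b in enumerate(boundaries, start=1):
    print(f"lowest {10 * i}% ends at {b:.2f}")
print("expected observations per bin:", expected_per_bin)
```

Nine boundaries define ten bins; the first bin runs from the bottom of the distribution to the first boundary and the last bin runs upward from the ninth.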

Testing for conformity to an observed distribution
The national distribution of pets is as follows:

Number of Pets   Percentage of Households
0                55
1                25
2                10
3                 5
4                 3
5 or more         2

A marketing company wants to know whether Boone conforms to the national pattern. In a sample of 300 Boone households, they found the following:

No. of Pets   No. of Households   Expected No.   Squares
0             128
1              75
2              50
3              20
4              18
5 or more       9

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$
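The expected counts and the statistic for the Boone sample can be worked through as follows. This is a sketch only, assuming national percentages of 55, 25, 10, 5, 3, 2 and observed counts of 128, 75, 50, 20, 18, 9 for 0 through "5 or more" pets; the variable names are illustrative.

```python
# Assumed national percentages and Boone sample counts, by number of pets
labels = ["0", "1", "2", "3", "4", "5 or more"]
national_pct = [55, 25, 10, 5, 3, 2]
observed = [128, 75, 50, 20, 18, 9]

n = sum(observed)                           # households sampled
expected = [pct * n / 100 for pct in national_pct]

# Each category's contribution (O - E)^2 / E to the statistic
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(observed) - 1 - 0                  # percentages are given, none estimated

for label, o, e, t in zip(labels, observed, expected, contributions):
    print(f"{label:>9}  observed = {o:3d}  expected = {e:6.1f}  (O-E)^2/E = {t:.3f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("every expected count is at least 5:", min(expected) >= 5)
```

The resulting statistic can then be passed to CHIDIST with 5 degrees of freedom to get the p-value.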