CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES

Save this PDF as:

Size: px
Start display at page:

Transcription

1 CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES The chi-square distribution was discussed in Chapter 4. We now turn to some applications of this distribution. As previously discussed, chi-square is a continuous distribution, however, its application is not limited to continuous data. In fact it is the most important distribution used for the evaluation of discrete or categorical data, for example, the classification of experimental units as dead or alive, sick or healthy, white, green or blue, scores of 1 to 5, etc Goodness of fit for discrete distributions Goodness of fit involves a comparison of the frequency observed in the sample with the expected frequency based on some theoretical model. If the differences between the observed and the expected frequencies are so great that they are unlikely to be due to chance alone, we conclude that the sample is not taken from the population that was used to calculate the expected frequencies. Suppose we have k observed frequencies, O 1, O,..., O k and their corresponding expected frequencies, E 1, E,..., E k, then the expression k Σ ( ) / Ei i = 1 χ = 0 E i i is approximately χ distributed with k-1 degrees of freedom. The closer the agreement between the observed and expected values, the smaller will be the value of χ, and a value of zero indicates perfect agreement. Calculated values of χ exceeding those in Appendix Table A-5 indicate that, at the corresponding probability level, there is significant disagreement between observed and expected values. Chi-square, defined in terms of observed and expected frequencies, becomes a discrete variable and can only assume certain, but not all, non-negative values. As the number of classes (k) increases, χ as defined above approaches a continuous variable. When frequencies for only two classes are involved, a correction must be made for the non-continuity, and the observed χ is adjusted as, adj. χ ( 0i Ei 1/ ) = Σ Ei i = 1 with df = 1 In adjusted χ the absolute value of each of the differences between observed and expected frequencies is reduced by 1/ before being squared. The correction 1/ is applicable only in the case of one degree of freedom (k=). For more degrees of freedom the corrections are more complicated and are not generally used, but for 1 df the adjusted value of χ is always appropriate.

2 An example in the use of adjusted χ It is hypothesized that blood groups are inherited in a simple Mendelian manner so that a cross of parents both of whom have AB blood should give 3/4 type AB or AA children and 1/4 type BB children (theoretical model). Suppose that among 400 children from such parents, 9 are of type AB or AA. Does the observation conform to the theoretical hypothesis? adj. Blood Type: AB or AA BB observed frequency expected frequency ( ) ( ) = = = 0.75 χ Referring to Appendix Table A-5 with 1 df, we find that the observed χ is not significant (p < 0.10). Thus the observed frequencies of the sample support the hypothetical ratio. χ with 3 or more classes. In a cross between ivory and red snapdragons the following counts were observed in the F generation. Phenotype Number of plants Red 0 Pink 55 Ivory 5 On the basis of these data, can segregation be assumed to occur in the simple Mendelian ratio of 1::1? 100

3 Color Red Pink Ivory Observed frequency Expected frequency ( 0 5) (55 50) ( 5 5) = = = 1.5 χ Note since this χ has df, the adjustment for noncontinuity is not necessary. Again, the calculated χ is much smaller than the tabular χ at the 10% level. Therefore we conclude that the observed color frequencies could conform with Mendelian ratio. 11. Goodness of fit for continuous distributions It is often of interest to know whether a given set of data approximates a continuous distribution such as the normal or chi-square. To illustrate the procedure, we will use the % sucrose data presented in Table -1, summarized in Table -, and graphed in Figure -1. We will test to see if these data can be considered to be normally distributed. To compute the expected frequencies for each class interval, we need to determine the probability associated with each interval. This procedure is summarized in Table 11-1 and the computational steps follow the table.

4 Table Observed and expected frequencies to test the goodness of fit of percent sucrose values to a normal distribution. Mid- Y i End- O i Obs. Est.Z Y Y Cum. Int. E i Exp. Contribution Class point point freq. S prob. prob. freq. to χ Y = 1., S =.6, χ = with 9-3 = 6 df 1. Columns 1 through 4 are recorded from the frequency table, Table -.. Standardize the class interval end points, (Y - Y )/S, where Y = 1. and S =.6 as previously calculated. 3. Determine the cumulative probability for each standardized value from Appendix Table A-4. For example, p(z < -.54) = p(z >.54) = p(z < 0.35) = 1 p p(z > 0.35) = = p(z <.08) = 1 p p(z >.08) = = Calculate the probability for each class interval, e.g., for class, the interval probability = = The expected frequency for each class is calculated by multiplying the interval probability by the sample size, e.g., for class, the expected frequency = (0.0195) (100) = 1.95 ~ The contribution of each class to the overall is equal to (Observed frequency - expected frequency) = (O - E) expected frequency E 7. The calculated χ is the sum of each class contribution χ = = 11.99

5 8. The degrees of freedom for χ depends on the number of parameters that must be estimated for computing the expected frequencies. In this case, we have 9 classes, therefore there are 8 df for classes. This is further reduced by a degree of freedom for mean and a degree of freedom for standard deviation. Thus, there are 6 df for the calculated χ. For Appendix Table A-5, χ 0.05,6 = Although the calculated χ is not significant at the 5% level, it is nearly so. Therefore we cannot be too sure that the data of Figure -1 are normally distributed. However, we can conclude that the data are near enough to being normally distributed to have no effect on the AOV procedures we are using in the evaluation of this variable Contingency Tables A closely related application of a χ distribution is the test of independence, also known as Pearson's test for association. This test is very similar to the test of goodness of fit and some people prefer to treat them as the same test with minor variation. In this section, we are concerned with the hypothesis of statistical independence between two variables, each of which is classified into a number of categories or attributes. Suppose a group of persons or set of objects is classified according to two criteria of classification, one criterion being entered in rows and the other in column. This two-way table is called a contingency table. If there are j rows and k columns, the table is known as a j x k table. From such a table we are interested in determining whether a relationship exists between the two criteria of classification or if they are independent. For a j x k table there are (j-1)(k-1) degrees of freedom. A x contingency table The following data show the effect of a certain type of fumigation on fruit spoilage. Spoiled Unspoiled Totals Unfumigated Fumigated Totals Does the amount of fruit depend upon whether it has been fumigated? In any contingency table, we set up the hypothesis that the two criteria of classification are independent. The marginal totals are accepted as part of the hypothesis. For a population in which the distribution in the classes is shown by the marginal totals and the classes are independent, we are asking what proportion of a large series of samples similar to the one under consideration will deviate as much or more from the theoretical as the one observed. On the

6 basis of the marginal total the expected entry in the upper left-hand corner would be (4) (10)/40 = 6. After this entry or any other has been calculated, all the remaining entries can be obtained by subtraction from the marginal totals. Since only one value need be calculated, for a x table, there is only one degree of freedom. This checks with the (j - 1) (k - 1) degrees of freedom for a jxk table since in this case j = k =. The expected (theoretical) values are given below: Therefore, Spoiled Unspoiled Totals Unfumigated Fumigated Totals adjusted ( ) ( ) 6 18 ( ) ( ) (.) 15 (.) 15 (.) 15 (.) 15 = = = 1.5 χ = + Note the use of the correction for non-continuity for 1 df. From Appendix Table A-5, χ 0.05,1 = Since χ = 1.5 < 3.84, there is no reason to reject the null hypothesis of independence. It appears, therefore, that fumigation has had no significant effect in reducing spoilage. A x3 contingency table The following are tabulated data on 8 strains of oats divided into groups according to the presence or absence of awns, and into 3 groups according to yield. Do these data permit the conclusion that more of the awned strains occur in the highest yielding classes than do awnless strains? Yield Class (weight in grams) Awned Awnless

7 The χ is calculated as follows: Yield Class (expected values in parentheses) Total Awned 6(10) 7(11.6) 1(1.4) 34 Awnless 18(14) 1(16.4) 9(17.6) 48 Totals ( 6 10) ( 18 14) ( ) ( ) = ( ) ( ) = χ = = Since χ = > χ005., = 599., the observed values are not distributed as expected on the basis of the marginal totals, and the awned strains do occur in the highest yielding classes more frequently.

8 SUMMARY 1. The chi-square distribution can be used to test the goodness of fit of data to a hypothesized model. 0 Σ ( E ) E χ i i = i Where O i is the observed frequency and E i is the expected frequency from the hypothesized model. Assume observations are classified into k frequency classes, then a) χ has k-1 df if no parameter of the hypothesized model needs to be estimated from the data. b) χ has k-1-n df in n parameters of the model are to be estimated from the data.. The chi-square distribution can also be used to test the hypothesis of statistical independence between two variables, each of which is classified into a number of categories or attributes. The two-way table of the classified frequencies is called a contingency table. χ 0 = Σ ( ) ij Eij E ij where O ij is the observed frequency in the i th row and the j th column, with expected frequency E ij. If there are j rows and k columns, the χ has (j - 1) (k - 1) df. 3. In case the calculated has 1 df, the above formula needs to be modified as adj. χ ( 0i Ei 0. 5) = Σ E i

9 EXERCISES 1. Suppose 50 years rainfall data are summarized in the table below. Test whether the data approximate a normal distribution. Midpoint Class (inches) Frequency Endpoint Y = 165. and S = 54.. Of 64 offspring of a certain cross of guinea pigs, 34 are red, 10 are black and 0 are white. According to the genetic model, these numbers should be in the ratio of 9:3:4. Are the data consistent with the model? 3. In an experiment involving the crossing of two hybrids of a species of flower the results shown below are observed. Are these results consistent with the expected population 9:3:3:1? Magenta Flower Magenta Flower Red Flower Red Flower Green Stigma Red Stigma Green Stigma Red Stigma It was hypothesized that the F generation of a particular barley cultivar would show a 9 hooded to 3 long-awned to 4 short-awned ratio. The observed data showed 348 hooded to 115 long-awned to 157 short-awned individuals. Test the hypothesis. 5.Suppose in a particular orange growing area of California that on the average 5% of the oranges are graded as best grade, 40% are placed in the above average grade, 5% are graded as average and 10% are graded as poor. A random sample of 500 bushels from one orchard in the region yields 100 bushels of best grade, 300 bushels of above average

10 grade, 50 bushels of average grade and 50 bushels of poor grade oranges. Is the quality of oranges from this orchard representative of this area? (χ = 167.5) 6. A study was conducted to establish whether there is a difference in susceptibility to mildew between two types of pasture grass. The following data were obtained: Type Mildewed Not Mildewed Total A B Total Test the hypothesis that there is no difference between the two grasses in their susceptibility to mildew. (χ = ) 7. Two barley fields containing two different varieties of barley were examined for rust. The results are presented below. Variety Rust No Rust Total Total Is there evidence of heterogeneity in response to rust? (χ = 1.787) 8. During an epidemic of cholera the following data on the effectiveness of inoculation as a means of preventing the disease were obtained. Not Attacked Attacked Inoculated 19 4 Not inoculated Do these data indicate the effectiveness of the inoculation on the basis of the 1% level of significance? (Yes, since adj. χ = 35.79) 9. Twenty-two animals are suffering from a disease, the severity of which is about the same in each case. In order to test the therapeutic value of a serum, it is administered to 10 of the animals; 1 remain uninoculated as a control. The results are shown below. Recovered Died Inoculated 7 3 Not inoculated 3 9

11 Has inoculation been effective? (5% level) (No, since adj. χ =.8) 10. It is suspected that different combinations of temperature and humidity affect the number of defective articles produced in a certain workroom. Do the following data confirm this suspicion? (5% level) (Yes, since adj. χ = 5.95) Temperature Humidity Low High Low 10 5 High A plant breeder wants to know if the sterility of rice is a genetic problem. Samples were taken from a large field study of 400 plots and the sterility of each plot was rated as follows: Genotypes Sterility A B C D No problem Moderate Severe Test the hypothesis that the severity of sterility is independent of genetic make-up or genotype. (χ = ) 1. An alfalfa breeder conducted a study on inheritance of resistance to anthracnose. The following data were obtained: Growth Habit Resistant Not Resistant Total Standard Intermediate standard Intermediate alpha Alpha Total Can he conclude that resistance to crown rot and growth habit are independent? ( χ = )

12 13. A pastry chef wishes to determine whether the proportion of unsatisfactory bear claws is affected by oven temperature. He has baked batches of them at 350, 400, and 450, and has obtained the following results: Total Number satisfactory Number unsatisfactory Totals If the chef wishes to test the null hypothesis of identical proportions at an = 0.05 significance level, what conclusion should he reach? (χ = 13.71)

Chapter 11 Chi square Tests

11.1 Introduction 259 Chapter 11 Chi square Tests 11.1 Introduction In this chapter we will consider the use of chi square tests (χ 2 tests) to determine whether hypothesized models are consistent with

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2

1. Chi-Squared Tests

1. Chi-Squared Tests We'll now look at how to test statistical hypotheses concerning nominal data, and specifically when nominal data are summarized as tables of frequencies. The tests we will considered

Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

TECHNICAL WORKING PARTY ON AUTOMATION AND COMPUTER PROGRAMS. Twenty-Fifth Session Sibiu, Romania, September 3 to 6, 2007

E TWC/5/0 ORIGINAL: English DATE: August, 007 INTERNATIONAL UNION FOR THE PROTECTION OF NEW VARIETIES OF PLANTS GENEVA TECHNICAL WORKING PARTY ON AUTOMATION AND COMPUTER PROGRAMS Twenty-Fifth Session Sibiu,

Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

4. Sum the results of the calculation described in step 3 for all classes of progeny

F09 Biol 322 chi square notes 1. Before proceeding with the chi square calculation, clearly state the genetic hypothesis concerning the data. This hypothesis is an interpretation of the data that gives

PASS Sample Size Software

Chapter 250 Introduction The Chi-square test is often used to test whether sets of frequencies or proportions follow certain patterns. The two most common instances are tests of goodness of fit using multinomial

November 08, 2010. 155S8.6_3 Testing a Claim About a Standard Deviation or Variance

Chapter 8 Hypothesis Testing 8 1 Review and Preview 8 2 Basics of Hypothesis Testing 8 3 Testing a Claim about a Proportion 8 4 Testing a Claim About a Mean: σ Known 8 5 Testing a Claim About a Mean: σ

Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

INTRODUCTION TO GENETICS USING TOBACCO (Nicotiana tabacum) SEEDLINGS

INTRODUCTION TO GENETICS USING TOBACCO (Nicotiana tabacum) SEEDLINGS By Dr. Susan Petro Based on a lab by Dr. Elaine Winshell Nicotiana tabacum Objectives To apply Mendel s Law of Segregation To use Punnett

Chi Square for Contingency Tables

2 x 2 Case Chi Square for Contingency Tables A test for p 1 = p 2 We have learned a confidence interval for p 1 p 2, the difference in the population proportions. We want a hypothesis testing procedure

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980)

Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980) For the Driver Behavior Study, the Chi Square Analysis II is the appropriate analysis below.

2 GENETIC DATA ANALYSIS

2.1 Strategies for learning genetics 2 GENETIC DATA ANALYSIS We will begin this lecture by discussing some strategies for learning genetics. Genetics is different from most other biology courses you have

The Goodness-of-Fit Test

on the Lecture 49 Section 14.3 Hampden-Sydney College Tue, Apr 21, 2009 Outline 1 on the 2 3 on the 4 5 Hypotheses on the (Steps 1 and 2) (1) H 0 : H 1 : H 0 is false. (2) α = 0.05. p 1 = 0.24 p 2 = 0.20

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

Chapter Additional: Standard Deviation and Chi- Square

Chapter Additional: Standard Deviation and Chi- Square Chapter Outline: 6.4 Confidence Intervals for the Standard Deviation 7.5 Hypothesis testing for Standard Deviation Section 6.4 Objectives Interpret

Chi Square Tests. Chapter 10. 10.1 Introduction

Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS Hypothesis 1: People are resistant to the technological change in the security system of the organization. Hypothesis 2: information hacked and misused. Lack

Mendelian Genetics. I. Background

Mendelian Genetics Objectives 1. To understand the Principles of Segregation and Independent Assortment. 2. To understand how Mendel s principles can explain transmission of characters from one generation

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations.

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. Data is organized in a two way table Explanatory variable (Treatments)

Chi-square test Testing for independeny The r x c contingency tables square test

Chi-square test Testing for independeny The r x c contingency tables square test 1 The chi-square distribution HUSRB/0901/1/088 Teaching Mathematics and Statistics in Sciences: Modeling and Computer-aided

Week 7: Chi-squared tests

The chi-squared test for association Health Sciences M.Sc. Programme Applied Biostatistics Week 7: Chi-squared tests Table 1 shows the relationship between housing tenure for a sample of pregnant women

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394 1. Does vigorous exercise affect concentration? In general, the time needed for people to complete

Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

Dr. Young. Genetics Problems Set #1 Answer Key

BIOL276 Dr. Young Name Due Genetics Problems Set #1 Answer Key For problems in genetics, if no particular order is specified, you can assume that a specific order is not required. 1. What is the probability

Chi-square test Fisher s Exact test

Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

Chapter Five: Paired Samples Methods 1/38

Chapter Five: Paired Samples Methods 1/38 5.1 Introduction 2/38 Introduction Paired data arise with some frequency in a variety of research contexts. Patients might have a particular type of laser surgery

Two Correlated Proportions (McNemar Test)

Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

The Chi Square Test. Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 23 of The Basic Practice of Statistics (6 th ed.

The Chi Square Test Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 23 of The Basic Practice of Statistics (6 th ed.) Concepts: Two-Way Tables The Problem of Multiple Comparisons Expected

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

Summary sheet from last time: Confidence intervals Confidence intervals take on the usual form: parameter = statistic ± t crit SE(statistic) parameter SE a s e sqrt(1/n + m x 2 /ss xx ) b s e /sqrt(ss

Chi Square Distribution

17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

CATEGORICAL DATA Chi-Square Tests for Univariate Data

CATEGORICAL DATA Chi-Square Tests For Univariate Data 1 CATEGORICAL DATA Chi-Square Tests for Univariate Data Recall that a categorical variable is one in which the possible values are categories or groupings.

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

Chi Square Analysis. When do we use chi square?

Chi Square Analysis When do we use chi square? More often than not in psychological research, we find ourselves collecting scores from participants. These data are usually continuous measures, and might

Topic 8. Chi Square Tests

BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

Randomized Block Analysis of Variance

Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of

Is it statistically significant? The chi-square test

UAS Conference Series 2013/14 Is it statistically significant? The chi-square test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chi-square? Tests whether two categorical

NPTEL STRUCTURAL RELIABILITY

NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

Lecture 7: Binomial Test, Chisquare

Lecture 7: Binomial Test, Chisquare Test, and ANOVA May, 01 GENOME 560, Spring 01 Goals ANOVA Binomial test Chi square test Fisher s exact test Su In Lee, CSE & GS suinlee@uw.edu 1 Whirlwind Tour of One/Two

Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the ph in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Ms. Foglia Date AP: LAB 8: THE CHI-SQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

Crosstabulation & Chi Square

Crosstabulation & Chi Square Robert S Michael Chi-square as an Index of Association After examining the distribution of each of the variables, the researcher s next task is to look for relationships among

Chapter 4. The analysis of Segregation

Chapter 4. The analysis of Segregation 1 The law of segregation (Mendel s rst law) There are two alleles in the two homologous chromosomes at a certain marker. One of the two alleles received from one

When to use a Chi-Square test:

When to use a Chi-Square test: Usually in psychological research, we aim to obtain one or more scores from each participant. However, sometimes data consist merely of the frequencies with which certain

Measuring the Power of a Test

Textbook Reference: Chapter 9.5 Measuring the Power of a Test An economic problem motivates the statement of a null and alternative hypothesis. For a numeric data set, a decision rule can lead to the rejection

Topic 21: Goodness of Fit

Topic 21: December 5, 2011 A goodness of fit tests examine the case of a sequence of independent observations each of which can have 1 of k possible categories. For example, each of us has one of 4 possible

4) The goodness of fit test is always a one tail test with the rejection region in the upper tail. Answer: TRUE

Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 13 Goodness of Fit Tests and Contingency Analysis 1) A goodness of fit test can be used to determine whether a set of sample data comes from a specific

MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

The Chi-Square Test. STAT E-50 Introduction to Statistics

STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

Chi-square test. More types of inference for nominal variables

Chi-square test FPP 28 More types of inference for nominal variables Nominal data is categorical with more than two categories Compare observed frequencies of nominal variable to hypothesized probabilities

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS. Posc/Uapp 816 CONTINGENCY TABLES

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 CONTINGENCY TABLES I. AGENDA: A. Cross-classifications 1. Two-by-two and R by C tables 2. Statistical independence 3. The interpretation

Section 12 Part 2. Chi-square test

Section 12 Part 2 Chi-square test McNemar s Test Section 12 Part 2 Overview Section 12, Part 1 covered two inference methods for categorical data from 2 groups Confidence Intervals for the difference of

Factorial Analysis of Variance

Chapter 560 Factorial Analysis of Variance Introduction A common task in research is to compare the average response across levels of one or more factor variables. Examples of factor variables are income

Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

MATH 10: Elementary Statistics and Probability Chapter 11: The Chi-Square Distribution

MATH 10: Elementary Statistics and Probability Chapter 11: The Chi-Square Distribution Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides,

Chapter 8 Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter problem: Does the MicroSort method of gender selection increase the likelihood that a baby will be girl? MicroSort: a gender-selection method developed by Genetics

II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

Mendelian Inheritance & Probability

Mendelian Inheritance & Probability (CHAPTER 2- Brooker Text) January 31 & Feb 2, 2006 BIO 184 Dr. Tom Peavy Problem Solving TtYy x ttyy What is the expected phenotypic ratio among offspring? Tt RR x Tt

Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

CHI SQUARE DISTRIBUTION

CI SQUARE DISTRIBUTION 1 Introduction to the Chi Square Test of Independence This test is used to analyse the relationship between two sets of discrete data. Contingency tables are used to examine the

Chi-square (χ 2 ) Tests

Math 442 - Mathematical Statistics II May 5, 2008 Common Uses of the χ 2 test. 1. Testing Goodness-of-fit. Chi-square (χ 2 ) Tests 2. Testing Equality of Several Proportions. 3. Homogeneity Test. 4. Testing

CHAPTER NINE. Key Concepts. McNemar s test for matched pairs combining p-values likelihood ratio criterion

CHAPTER NINE Key Concepts chi-square distribution, chi-square test, degrees of freedom observed and expected values, goodness-of-fit tests contingency table, dependent (response) variables, independent

Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

Appendix A The Chi-Square (32) Test in Genetics

Appendix A The Chi-Square (32) Test in Genetics With infinitely large sample sizes, the ideal result of any particular genetic cross is exact conformation to the expected ratio. For example, a cross between

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

(pronounced kie-square ; text Ch. 15) She s the sweetheart of The page from the Book of Kells

Applications (pronounced kie-square ; text Ch. 15) She s the sweetheart of The page from the Book of Kells Tests for Independence is characteristic R the same or different in different populations? Tests

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

Hypothesis testing: Examples. AMS7, Spring 2012

Hypothesis testing: Examples AMS7, Spring 2012 Example 1: Testing a Claim about a Proportion Sect. 7.3, # 2: Survey of Drinking: In a Gallup survey, 1087 randomly selected adults were asked whether they

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

TEST NAME: Genetics unit test TEST ID: GRADE:07 SUBJECT:Life and Physical Sciences TEST CATEGORY: School Assessment

TEST NAME: Genetics unit test TEST ID: 437885 GRADE:07 SUBJECT:Life and Physical Sciences TEST CATEGORY: School Assessment Genetics unit test Page 1 of 12 Student: Class: Date: 1. There are four blood

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

BMS 617 Statistical Techniques for the Biomedical Sciences Lecture 11: Chi-Squared and Fisher's Exact Tests Chi Squared and Fisher's Exact Tests This lecture presents two similarly structured tests, Chi-squared

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Open book and note Calculator OK Multiple Choice 1 point each MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Find the mean for the given sample data.

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) 90. 35 (d) 20 (e) 25 (f) 80. Totals/Marginal 98 37 35 170

Work Sheet 2: Calculating a Chi Square Table 1: Substance Abuse Level by ation Total/Marginal 63 (a) 17 (b) 10 (c) 90 35 (d) 20 (e) 25 (f) 80 Totals/Marginal 98 37 35 170 Step 1: Label Your Table. Label

CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

Chi-Square Tests. In This Chapter BONUS CHAPTER

BONUS CHAPTER Chi-Square Tests In the previous chapters, we explored the wonderful world of hypothesis testing as we compared means and proportions of one, two, three, and more populations, making an educated

Chapter 23. Two Categorical Variables: The Chi-Square Test

Chapter 23. Two Categorical Variables: The Chi-Square Test 1 Chapter 23. Two Categorical Variables: The Chi-Square Test Two-Way Tables Note. We quickly review two-way tables with an example. Example. Exercise

Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

Pearson s 2x2 Chi-Square Test of Independence -- Analysis of the Relationship between two Qualitative

Pearson s 2x2 Chi-Square Test of Independence -- Analysis of the Relationship between two Qualitative or (Binary) Variables Analysis of 2-Between-Group Data with a Qualitative (Binary) Response Variable

1. Comparing Two Means: Dependent Samples

1. Comparing Two Means: ependent Samples In the preceding lectures we've considered how to test a difference of two means for independent samples. Now we look at how to do the same thing with dependent

Chi-Square Tests TEACHER NOTES MATH NSPIRED. Math Objectives. Vocabulary. About the Lesson

Math Objectives Students will recognize that chi-squared tests are for counts of categorical data. Students will identify the appropriate chi-squared test to use for a given situation: 2 Goodness of Fit

First-year Statistics for Psychology Students Through Worked Examples

First-year Statistics for Psychology Students Through Worked Examples 1. THE CHI-SQUARE TEST A test of association between categorical variables by Charles McCreery, D.Phil Formerly Lecturer in Experimental

χ 2 = (O i E i ) 2 E i

Chapter 24 Two-Way Tables and the Chi-Square Test We look at two-way tables to determine association of paired qualitative data. We look at marginal distributions, conditional distributions and bar graphs.