# CHI-Squared Test of Independence

**Minhaz Fahim Zibran**
Department of Computer Science, University of Calgary, Alberta, Canada

## Abstract

The chi-square ($\chi^2$) test is a nonparametric statistical method often used in experimental work where the data consist of frequencies or counts (for example, the number of boys and girls in a class having their tonsils out), as distinct from quantitative data obtained by measuring continuous variables such as temperature or height. The most common use of the test is to assess the probability of association or independence of facts [3]. This paper summarizes the chi-squared test. Beginning with the basics, it discusses where and how to apply the test, using descriptive examples. The purpose of the paper is to present a quick overview of the chi-square test, so that a reader without much knowledge of statistics may use it as a beginner's guide.

**Keywords:** chi-square, statistical test, qualitative data, test of independence, test of association

## 1 Introduction

Suppose we have a sample of boys and girls from the 5th, 8th, and 12th grades of a school. We may want to know whether there is an association between the gender of the students and their grade level. Or we may want to know whether the men and women in a liberal arts college differ in their selection of majors. These are the types of questions that the chi-squared ($\chi^2$) test is designed to answer. It was first introduced by the British statistician Karl Pearson in 1900 [1, 2].

Before going into the details of the chi-squared test, Section 2 presents the statistical preliminaries necessary to understand it. Section 3 then illustrates the chi-square test in detail with a simple example. Section 4 points out some limitations of the chi-square test. Finally, Section 5 concludes the paper and provides directions for further reading. The Appendix includes an abridged form of the standard chi-square table.

## 2 Preliminaries

**Population:** A population is an individual or group that represents all the members of a certain group or category of interest [4]. Populations do not have to include people. Suppose we want to know the average age of the cats in a city. The population in this study is made up of cats, not people. A population does not need to be large to count as a population. We may want to know the average height of the 3 kids in a family. In this study, the population comprises only 3 kids.

**Sample:** A sample is a subset of a given population. Samples are not necessarily good representations of the populations from which they are selected, but choosing representative samples is important for most experiments.

**Parameter:** A parameter is a value generated from, or applied to, a population.

**Variable:** A variable is anything that can be codified and can take a value from a set (domain) of more than one value. Variables may be quantitative or qualitative. A quantitative variable is one that is scored in such a way that its values indicate some sort of amount. For example, height is a quantitative variable. In contrast, a qualitative variable indicates some kind of category. A commonly used qualitative variable in social science research is the dichotomous variable, which has two different categories. For instance, gender has two categories: male and female. The chi-square test is applicable when we have qualitative variables classified into categories.

**Nominally scaled variable:** A nominally scaled variable is one in which the labels used to identify the different levels of the variable have no weight or numeric value [4]. For example, the sample may be divided into two groups labeled 0 and 1. In this case the value 1 does not indicate a higher score than the value 0. Rather, 0 and 1 are simply names, or labels, that have been assigned to each group.
**Contingency table:** When the members of a sample are doubly classified (i.e., classified in two separate ways), the results may be arranged in a rectangular table. Such a table is called a contingency table. For example, in 1956, the number of people on record who died of tuberculosis in England and Wales was 5375. Of these, 3804 were males and 1571 were females: 3534 males and 1319 females died of tuberculosis of the respiratory system, while the remainder died of other forms of tuberculosis. These data can be arranged in a contingency table as shown in Table 1 (this example and table are taken from [3]). This 2 × 2 contingency table (the members of the sample having been dichotomized in two different ways) is an example of the simplest form.
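The double classification described above can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary layout and the helper name `margins` are assumptions made here, not part of the original paper. The counts are the ones quoted in the text, with the "other forms" cells obtained by subtraction.

```python
# Observed deaths from tuberculosis, England and Wales, 1956 (figures quoted in the text).
# The "other" cells are the column totals minus the respiratory counts.
observed = {
    "respiratory": {"males": 3534, "females": 1319},
    "other": {"males": 3804 - 3534, "females": 1571 - 1319},
}


def margins(table):
    """Return (row_totals, column_totals, grand_total) for a nested-dict table.

    `margins` is a hypothetical helper name used only for this sketch.
    """
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    return row_totals, col_totals, sum(row_totals.values())


row_totals, col_totals, grand_total = margins(observed)
print(row_totals)   # {'respiratory': 4853, 'other': 522}
print(col_totals)   # {'males': 3804, 'females': 1571}
print(grand_total)  # 5375
```

The marginal totals are all that is needed later to work out the frequencies expected under independence.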

| | Males | Females | Total |
|---|---|---|---|
| Tuberculosis of respiratory system | 3534 | 1319 | 4853 |
| Other forms of tuberculosis | 270 | 252 | 522 |
| Tuberculosis (all forms) | 3804 | 1571 | 5375 |

Table 1: Observed frequencies of deaths from tuberculosis in England and Wales in 1956

| | Males | Females | Total |
|---|---|---|---|
| Tuberculosis of respiratory system | $E_1$ | $E_2$ | 4853 |
| Other forms of tuberculosis | $E_3$ | $E_4$ | 522 |
| Tuberculosis (all forms) | 3804 | 1571 | 5375 |

Table 2: Table 1 with cells replaced by letters

## 3 Chi-squared Test

The entries in the cells of a contingency table may be frequencies, as in Table 1, or the frequencies may be transformed into proportions or percentages. However, it is important to note that, in whatever form they are presented (frequencies, proportions, etc.), the entries are not continuous measurements. The chi-squared test can be applied only to discrete data [3]. For the purpose of the test, of course, continuous data can often be put into discrete form by the use of intervals on a continuous scale. For instance, age is a continuous variable, but if people are classified into different age groups, then the intervals of time corresponding to these groups can be treated as if they were discrete units.

The chi-square ($\chi^2$) test is a nonparametric statistical test to determine whether two or more classifications of the samples are independent. For explanation, let us consider the data presented in Table 1, where the people are classified according to two attributes: gender and type of tuberculosis. We may want to determine whether death caused by tuberculosis of the respiratory system, as opposed to other types of tuberculosis, depends on gender. To get the answer, we may apply the chi-square test. A. E. Maxwell [3] presented this example, with the data shown in Table 1, to describe the procedure of the chi-square test at length. In this paper I use the same example and illustrate the procedure in a concise form.
In order to carry out the chi-square test, it is helpful to rearrange Table 1, leaving the marginal frequencies as they are but replacing the frequencies in the cells of the body of the table by the letters $E_1$ to $E_4$. After this rearrangement we get Table 2. Looking at the column of marginal totals on the right of the table, we see that the proportion of deaths, males and females combined, due to tuberculosis of the respiratory system is

$$\frac{4853}{5375} \approx 0.9029 \tag{1}$$

| | Males | Females | Total |
|---|---|---|---|
| Tuberculosis of respiratory system | 3434.57 | 1418.43 | 4853 |
| Other forms of tuberculosis | 369.43 | 152.57 | 522 |
| Tuberculosis (all forms) | 3804 | 1571 | 5375 |

Table 3: Expected frequencies on the assumption of independent classification

Now, if the two classifications are independent, that is, if the form of tuberculosis from which people die is independent of gender, we would expect the proportion of males who died from tuberculosis of the respiratory system to equal the proportion of females who died from the same cause, and consequently to equal the proportion in the total group, 0.9029. The expected values $E_1$ and $E_2$ must then be chosen so that the following holds.

$$\frac{E_1}{3804} = \frac{E_2}{1571} = \frac{4853}{5375} \approx 0.9029 \tag{2}$$

Using equation 2 we find

$$E_1 = \frac{4853 \times 3804}{5375} \approx 3434.57 \tag{3}$$

Once the value of $E_1$ is known, the values of $E_2$, $E_3$, and $E_4$ can be deduced, since the following facts are true.

$$E_1 + E_2 = 4853 \tag{4}$$

$$E_3 + E_4 = 522 \tag{5}$$

$$E_1 + E_3 = 3804 \tag{6}$$

$$E_2 + E_4 = 1571 \tag{7}$$

Calculating the values of $E_1$, $E_2$, $E_3$, and $E_4$ and putting these expected values in place of the corresponding letters in Table 2, we get Table 3. These are the values that one would expect to find in the cells in the body of Table 1 were the two methods of classification independent. Though the number of people cannot be fractional, fractional values may appear in the table of expected frequencies. Especially when the sample size is small, we retain the fractional values to increase the accuracy of subsequent calculations.

Now, if we refer again to the data, we see that the observed frequencies in Table 1 differ considerably from the expected frequencies in Table 3. The question is whether this difference could have arisen from random sampling error alone, or whether it indicates a real difference between males and females. Let us define our null hypothesis as follows.

**Null hypothesis:** The numbers of men and women who died in 1956 from tuberculosis of the respiratory system and from other types of tuberculosis are independent of their sex.
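The expected frequencies can be cross-checked numerically. The sketch below assumes the general rule that each expected cell equals (row total × column total) / grand total, which reduces to equations 2 to 7 for this 2 × 2 table; the function name `expected_frequencies` is chosen here for illustration only.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence for a table given as a list of rows.

    Each expected value is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: respiratory, other forms. Columns: males, females (Table 1).
observed = [[3534, 1319], [270, 252]]
expected = expected_frequencies(observed)

for row in expected:
    print([round(e, 2) for e in row])
# [3434.57, 1418.43]
# [369.43, 152.57]
```

Because the marginal totals are preserved, each row and column of the expected table still sums to the corresponding margin of the observed table.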

The chi-square test, if properly applied, may give us the answer by rejecting the null hypothesis or failing to reject it. The test is based on the chi-square ($\chi^2$) distribution. To compare the observed and expected frequencies, we compute the chi-square ($\chi^2$) value using the formula in equation 8.

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \tag{8}$$

In equation 8, $O_i$ stands for the observed frequencies, $E_i$ stands for the expected frequencies, and $i$ runs over $1, 2, \ldots, n$, where $n$ is the number of cells in the contingency table. To perform the calculations it is useful to arrange the observed and expected frequencies as shown in Table 4.

| $O$ | $E$ | $(O - E)$ | $(O - E)^2$ | $(O - E)^2 / E$ |
|---|---|---|---|---|
| 3534 | 3434.57 | 99.43 | 9886.32 | 2.88 |
| 1319 | 1418.43 | −99.43 | 9886.32 | 6.97 |
| 270 | 369.43 | −99.43 | 9886.32 | 26.76 |
| 252 | 152.57 | 99.43 | 9886.32 | 64.80 |
| | | | | $\chi^2 \approx 101.41$ |

Table 4: Calculating chi-square ($\chi^2$) for the data in Table 1 and Table 3

To assess the significance of the calculated value of $\chi^2$, we refer to the standard chi-square table presented in the Appendix. This table contains the critical $\chi^2$ values for different degrees of freedom and levels of probability. Referring back to Table 3, we recall that once the value of any one of the $E_i$ ($i = 1, \ldots, 4$) had been determined, all the other $E_i$ could be deduced. In other words, when the marginal totals of a 2 × 2 contingency table are given, only one cell in the body of the table can be filled arbitrarily. This fact is expressed by saying that a 2 × 2 contingency table has only one degree of freedom. The degrees of freedom (df) of a contingency table with $r$ rows and $c$ columns are computed using the formula in equation 9.

$$df = (r - 1)(c - 1) \tag{9}$$

To assess the significance of our chi-square value, $\chi^2 \approx 101.41$, we enter the chi-square table of the Appendix with df = 1; that is, we look at the first row, which corresponds to one degree of freedom. The largest value in that row is 10.83, under the probability (P) level 0.001. A value of chi-square equal to or greater than 10.83 would be expected to occur by chance only once in a thousand times if the null hypothesis is true.
Since our chi-square value is much greater than 10.83, it would be expected to occur even less frequently. Hence our chi-square test rejects the null hypothesis. So we conclude that the proportion of males who died from tuberculosis of the respiratory system, namely 3534/3804 = 0.929, is significantly different from the proportion of females who died from the same cause, namely 1319/1571 = 0.840.
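The whole calculation (equations 8 and 9 and the comparison with the tabulated value) can be sketched in plain Python. The function name and the constant `CRITICAL_DF1_P001` are illustrative choices, not from the paper. The p-value line is an addition that relies on a property specific to one degree of freedom: a chi-square variable with df = 1 is the square of a standard normal variable, so its upper tail equals `erfc(sqrt(x / 2))`. It should not be reused for other values of df.

```python
import math


def chi_square_statistic(observed):
    """Return (statistic, df) for an r x c table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            statistic += (o - e) ** 2 / e                    # equation 8
    df = (len(observed) - 1) * (len(observed[0]) - 1)        # equation 9
    return statistic, df


observed = [[3534, 1319], [270, 252]]  # Table 1
statistic, df = chi_square_statistic(observed)

# Tabulated critical value for df = 1 at P = 0.001 (assumed constant for this sketch).
CRITICAL_DF1_P001 = 10.83
reject_null = statistic >= CRITICAL_DF1_P001

# Upper-tail probability, valid for df = 1 only.
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 2), df, reject_null)  # 101.41 1 True
```

For tables with more than one degree of freedom, the statistic and df are computed the same way, but the tail probability would have to come from a chi-square table or a statistics library.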

Before carrying out the test, we fix a level of significance, that is, the probability level (P) we are willing to accept. Once the $\chi^2$ value is computed and the number of degrees of freedom is determined, we go to the chi-square table and look at the row corresponding to the given degrees of freedom. If our $\chi^2$ value is less than the tabulated value for our chosen level (that is, it falls to the left of that column), we fail to reject the null hypothesis. If, on the other hand, our $\chi^2$ value equals the tabulated value or lies to its right, indicating a smaller probability that the difference occurred by chance, the chi-square test rejects the null hypothesis, and we conclude that the classifications of the population are dependent on each other.

## 4 Limitations of Chi-square Test

As mentioned before, the chi-square test cannot be applied to continuous data. It can be applied only to qualitative data classified into categories, or labeled using nominally scaled variables [3, 4].

The standard chi-square table presented in the Appendix applies to values computed with equation 8 on the assumption that the expected values (E) are large. Most statisticians warn against using the test when any of the expected values is less than 5. This warning implies that the use of the chi-square test is restricted to large samples. However, where small samples are concerned, the difficulty can be overcome in a number of ways. One way is to apply a correction for continuity, known as Yates' correction. Another way is to use Fisher's exact test. Cochran [5] recommends that Fisher's exact test be used when the total sample size in a 2 × 2 contingency table is less than 20, or when the sample size is less than 40 and one of the expected frequencies is less than 5. These methods, and the chi-square test itself, are described in more detail in [3].
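A rough sketch of the two small-sample ideas mentioned above follows. It assumes the usual textbook form of Yates' correction for a 2 × 2 table (subtract 0.5 from each |O − E| before squaring; the clamp at zero is an extra safeguard added here), and it encodes Cochran's recommendation exactly as quoted in the text. The function names and the small example table are hypothetical and serve only as an illustration.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def yates_chi_square(observed):
    """Chi-square for a 2 x 2 table with Yates' continuity correction."""
    expected = expected_frequencies(observed)
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            # Shrink each deviation by 0.5, never below zero (assumed safeguard).
            deviation = max(abs(o - e) - 0.5, 0.0)
            total += deviation ** 2 / e
    return total


def suggested_test(observed):
    """Apply Cochran's rule, as quoted in the text, to a 2 x 2 table."""
    n = sum(sum(row) for row in observed)
    smallest_expected = min(min(row) for row in expected_frequencies(observed))
    if n < 20 or (n < 40 and smallest_expected < 5):
        return "fisher"
    return "chi-square"


tuberculosis = [[3534, 1319], [270, 252]]  # Table 1
small_table = [[3, 1], [1, 3]]             # hypothetical small sample

print(round(yates_chi_square(tuberculosis), 2))
print(suggested_test(tuberculosis))  # chi-square
print(suggested_test(small_table))   # fisher
```

For the large tuberculosis sample the correction changes the statistic only slightly, which is consistent with the correction mattering mainly for small samples.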
## 5 Concluding Remarks

The chi-square test tells us whether the classifications of a given population are dependent on each other. However, it is important to stress that establishing a statistical association by means of chi-square does not necessarily imply any causal relationship between the attributes being compared, although it does indicate that the reason for the association is worth investigating [3]. For example, if further investigation were carried out on the deaths from tuberculosis, we might find that the reason more men than women died from tuberculosis of the respiratory system is that there are more smokers among men than among women.

More on the chi-square test may be found in the references given below. Maxwell, in his book Analysing Qualitative Data [3], describes the chi-square test and related topics at length. The book by Timothy C. Urdan [4] covers the necessary preliminary knowledge of statistics and illustrates the chi-square test with descriptive examples.

## References

[1] J. C. W. Rayner and D. J. Best. Smooth Tests of Goodness of Fit. Oxford University Press, Inc.

[2] William Mendenhall, Robert J. Beaver, and Barbara M. Beaver. Introduction to Probability and Statistics. Brooks/Cole, a division of Thomson Learning, Inc.

[3] A. E. Maxwell. Analysing Qualitative Data. 4th Edition. Chapman and Hall Ltd.

[4] Timothy C. Urdan. Statistics in Plain English. 2nd Edition. Lawrence Erlbaum Associates, Inc., London.

[5] W. G. Cochran. Some methods of strengthening the common $\chi^2$ tests. Biometrics, 10.

## Appendix

Percentage points of the chi-square distribution, tabulated by degrees of freedom (D.F.) against probability (P) level.

Part of the chi-square table is presented here. The table with more values may be found in books on statistics. The full table may be found in Table 8 of Biometrika Tables for Statisticians, vol. 1, Biometrika Trustees.

### Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

### CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

### Data Types. 1. Continuous 2. Discrete quantitative 3. Ordinal 4. Nominal. Figure 1

Data Types By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD Resource. Don t let the title scare you.

### Is it statistically significant? The chi-square test

UAS Conference Series 2013/14 Is it statistically significant? The chi-square test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chi-square? Tests whether two categorical

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Statistical Inference

Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

### Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

### PASS Sample Size Software

Chapter 250 Introduction The Chi-square test is often used to test whether sets of frequencies or proportions follow certain patterns. The two most common instances are tests of goodness of fit using multinomial

### Chi-square test Fisher s Exact test

Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

### Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980)

Chi Square (χ 2 ) Statistical Instructions EXP 3082L Jay Gould s Elaboration on Christensen and Evans (1980) For the Driver Behavior Study, the Chi Square Analysis II is the appropriate analysis below.

### Chapter 19 The Chi-Square Test

Tutorial for the integration of the software R with introductory statistics Copyright c Grethe Hystad Chapter 19 The Chi-Square Test In this chapter, we will discuss the following topics: We will plot

### UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

### Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

### Chi Square Distribution

17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

### Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) 90. 35 (d) 20 (e) 25 (f) 80. Totals/Marginal 98 37 35 170

Work Sheet 2: Calculating a Chi Square Table 1: Substance Abuse Level by ation Total/Marginal 63 (a) 17 (b) 10 (c) 90 35 (d) 20 (e) 25 (f) 80 Totals/Marginal 98 37 35 170 Step 1: Label Your Table. Label

### TABLE OF CONTENTS. About Chi Squares... 1. What is a CHI SQUARE?... 1. Chi Squares... 1. Hypothesis Testing with Chi Squares... 2

About Chi Squares TABLE OF CONTENTS About Chi Squares... 1 What is a CHI SQUARE?... 1 Chi Squares... 1 Goodness of fit test (One-way χ 2 )... 1 Test of Independence (Two-way χ 2 )... 2 Hypothesis Testing

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### The Chi-Square Test. STAT E-50 Introduction to Statistics

STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

### Stats for Strategy Exam 1 In-Class Practice Questions DIRECTIONS

Stats for Strategy Exam 1 In-Class Practice Questions DIRECTIONS Choose the single best answer for each question. Discuss questions with classmates, TAs and Professor Whitten. Raise your hand to check

### Chapter 23. Two Categorical Variables: The Chi-Square Test

Chapter 23. Two Categorical Variables: The Chi-Square Test 1 Chapter 23. Two Categorical Variables: The Chi-Square Test Two-Way Tables Note. We quickly review two-way tables with an example. Example. Exercise

### Testing differences in proportions

Testing differences in proportions Murray J Fisher RN, ITU Cert., DipAppSc, BHSc, MHPEd, PhD Senior Lecturer and Director Preregistration Programs Sydney Nursing School (MO2) University of Sydney NSW 2006

### CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V

CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V Chapters 13 and 14 introduced and explained the use of a set of statistical tools that researchers use to measure

### Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

### Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

ANALYSIS OF DISCRT VARIABLS / 5 CHAPTR FIV ANALYSIS OF DISCRT VARIABLS Discrete variables are those which can only assume certain fixed values. xamples include outcome variables with results such as live

### HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

### General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

### Section 12 Part 2. Chi-square test

Section 12 Part 2 Chi-square test McNemar s Test Section 12 Part 2 Overview Section 12, Part 1 covered two inference methods for categorical data from 2 groups Confidence Intervals for the difference of

### First-year Statistics for Psychology Students Through Worked Examples

First-year Statistics for Psychology Students Through Worked Examples 1. THE CHI-SQUARE TEST A test of association between categorical variables by Charles McCreery, D.Phil Formerly Lecturer in Experimental

### Contingency Tables and the Chi Square Statistic. Interpreting Computer Printouts and Constructing Tables

Contingency Tables and the Chi Square Statistic Interpreting Computer Printouts and Constructing Tables Contingency Tables/Chi Square Statistics What are they? A contingency table is a table that shows

### UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST A dependent-samples t test (a.k.a. matched or paired-samples, matched-pairs, samples, or subjects, simple repeated-measures or within-groups, or correlated groups)

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency

### Nonparametric Tests. Chi-Square Test for Independence

DDBA 8438: Nonparametric Statistics: The Chi-Square Test Video Podcast Transcript JENNIFER ANN MORROW: Welcome to "Nonparametric Statistics: The Chi-Square Test." My name is Dr. Jennifer Ann Morrow. In

### Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

### Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Chi-square Goodness of Fit Test The chi-square test is designed to test differences whether one frequency is different from another frequency. The chi-square test is designed for use with data on a nominal

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

### Northumberland Knowledge

Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

### Simulating Chi-Square Test Using Excel

Simulating Chi-Square Test Using Excel Leslie Chandrakantha John Jay College of Criminal Justice of CUNY Mathematics and Computer Science Department 524 West 59 th Street, New York, NY 10019 lchandra@jjay.cuny.edu

### Crosstabulation & Chi Square

Crosstabulation & Chi Square Robert S Michael Chi-square as an Index of Association After examining the distribution of each of the variables, the researcher s next task is to look for relationships among

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### IBM SPSS Statistics for Beginners for Windows

ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Statistical tests for SPSS

Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

### Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Mind on Statistics. Chapter 15

Mind on Statistics Chapter 15 Section 15.1 1. A student survey was done to study the relationship between class standing (freshman, sophomore, junior, or senior) and major subject (English, Biology, French,

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### In the past, the increase in the price of gasoline could be attributed to major national or global

Chapter 7 Testing Hypotheses Chapter Learning Objectives Understanding the assumptions of statistical hypothesis testing Defining and applying the components in hypothesis testing: the research and null

### Elementary Statistics

Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

### Solutions to Homework 10 Statistics 302 Professor Larget

s to Homework 10 Statistics 302 Professor Larget Textbook Exercises 7.14 Rock-Paper-Scissors (Graded for Accurateness) In Data 6.1 on page 367 we see a table, reproduced in the table below that shows the

### Introduction to Hypothesis Testing

I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

### Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

BMS 617 Statistical Techniques for the Biomedical Sciences Lecture 11: Chi-Squared and Fisher's Exact Tests Chi Squared and Fisher's Exact Tests This lecture presents two similarly structured tests, Chi-squared

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### DESCRIPTIVE STATISTICS & DATA PRESENTATION*

Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

### 5.1 Identifying the Target Parameter

University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

### Math 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2

Math 58. Rumbos Fall 2008 1 Solutions to Review Problems for Exam 2 1. For each of the following scenarios, determine whether the binomial distribution is the appropriate distribution for the random variable

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Elementary Statistics

Elementary Statistics, Chapter 10. M. Ghamsary, Ph.D. Chi-square Test for Goodness of fit and Contingency tables

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Chapter 23 Inferences About Means

Chapter 23 Inferences About Means. Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

### Confidence intervals

Confidence intervals Today, we're going to start talking about confidence intervals. We use confidence intervals as a tool in inferential statistics. What this means is that given some sample statistics,

### Chapter 8. Hypothesis Testing

Chapter 8 Hypothesis Testing Hypothesis In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing

### Introduction to Statistics

1 Introduction to Statistics LEARNING OBJECTIVES After reading this chapter, you should be able to: 1. Distinguish between descriptive and inferential statistics. 2. Explain how samples and populations,

### Chapter 11. Chapter 11 Overview. Chapter 11 Objectives 11/24/2015. Other Chi-Square Tests

11/24/2015 Chapter 11 Overview: Introduction; 11-1 Test for Goodness of Fit; 11-2 Tests Using Contingency Tables. Other Chi-Square Tests. McGraw-Hill, Bluman, 7th ed., Chapter 11

### This chapter discusses some of the basic concepts in inferential statistics.

Research Skills for Psychology Majors: Everything You Need to Know to Get Started Inferential Statistics: Basic Concepts This chapter discusses some of the basic concepts in inferential statistics. Details

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the pH in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

### Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics Course Text: Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business Statistics, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### The Dummy's Guide to Data Analysis Using SPSS

The Dummy's Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Reserved TABLE OF CONTENTS: Helpful Hints for All Tests; Tests

### WHAT IS A JOURNAL CLUB?

WHAT IS A JOURNAL CLUB? With its September 2002 issue, the American Journal of Critical Care debuts a new feature, the AJCC Journal Club. Each issue of the journal will now feature an AJCC Journal Club

### Some Critical Information about SOME Statistical Tests and Measures of Correlation/Association

Some Critical Information about SOME Statistical Tests and Measures of Correlation/Association This information is adapted from and draws heavily on: Sheskin, David J. 2000. Handbook of Parametric and

### Basic Concepts in Research and Data Analysis

Basic Concepts in Research and Data Analysis. Introduction: A Common Language for Researchers; Steps to Follow When Conducting Research; The Research Question; The Hypothesis; Defining the

### Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

### Graphing Data Presentation of Data in Visual Forms

Graphing Data Presentation of Data in Visual Forms Purpose of Graphing Data Audience Appeal Provides a visually appealing and succinct representation of data and summary statistics Provides a visually

### Statistics. Annex 7. Calculating rates

Annex 7 Statistics Calculating rates Rates are the most common way of measuring disease frequency in a population and are calculated as: number of new cases of disease in population at risk number of persons

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA. Contents: Strengths of Linear Relationships; A Note on Non-Linear Relationships; Multiple Linear Regression; Removal of Variables; Independent Samples

### Analyzing Research Data Using Excel

Analyzing Research Data Using Excel Fraser Health Authority, 2012 The Fraser Health Authority (FH) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools. Contents: Occam's razor; A look at data I

### 3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as $X_1, X_2, \ldots, X_m$. The second sample will be denoted

### Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

### S P S S Statistical Package for the Social Sciences

S P S S Statistical Package for the Social Sciences Data Entry Data Management Basic Descriptive Statistics Jamie Lynn Marincic Leanne Hicks Survey, Statistics, and Psychometrics Core Facility (SSP) July

### Statistical Foundations: Measurement Scales. Psychology 790 Lecture #1 8/24/2006

Statistical Foundations: Measurement Scales Psychology 790 Lecture #1 8/24/2006 Today's Lecture Measurement What is always assumed. What we can say when we assign numbers to phenomena. Implications for

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

### Two Correlated Proportions (McNemar Test)

Chapter 50 Two Correlated Proportions (McNemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

### Statistical Study: Comparing Fast Food Consumption to Body Mass Index

Statistical Study: Comparing Fast Food Consumption to Body Mass Index Janet Hiestan Biology major Jennifer Sanchez Biology major Jason Majirsky Biology major Introduction: Does fast food consumption influence

### 8-6 $\chi^2$ Test for a Variance or Standard Deviation

Section 8-6 $\chi^2$ Test for a Variance or Standard Deviation. This test uses the P-value method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion

### Mind on Statistics. Chapter 4

Mind on Statistics Chapter 4 Sections 4.1 Questions 1 to 4: The table below shows the counts by gender and highest degree attained for 498 respondents in the General Social Survey. Highest Degree Gender

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We'll skim over it in class but you should be sure to ask questions if you don't understand it. I. Overview A. We