Chapter 7. Categorical Data Analysis

In Chapter 5 we studied how to test hypotheses involving a single population, such as $H_0: \mu = 5$ vs. $H_a: \mu > 5$. In Chapter 6 we studied how to test hypotheses involving two or more populations, such as $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 > \mu_2$. In both chapters we dealt with quantitative variables, such as the height of a person or the lengths of alligators. In this chapter we will learn how to test hypotheses that involve qualitative, or categorical, variables.

Recall some examples of categorical variables: Gender, Religion, Race, Color. They are considered categorical because their values are not numeric. Even if we represent the values of a categorical variable with numbers, the variable is still categorical. For example, we can represent colors with numbers such as 1 for white, 2 for red, 3 for blue and so on, but that does not make the variable quantitative: you couldn't add 2 and 3 (red and blue) and hope that 5 represents purple. The number 5 probably already represents some other color. The numbers in this color example have no inherent numerical properties; they are simply labels.

When dealing with categorical variables, the closest thing to something numerical is the frequency data. For example, let's say I observe the color of the cars passing by my window (assuming I can see a road from my window). Suppose I record the colors of the first twelve cars that I see: White, Red, Red, Black, Blue, Green, Yellow, Red, White, Black, Blue and White. I can translate this data into a frequency table like this:

| Color of the vehicle | Frequency |
|---|---|
| White | 3 |
| Black | 2 |
| Blue | 2 |
| Yellow | 1 |
| Red | 3 |
| Green | 1 |

As the example shows, it is relatively easy to obtain such frequency tables for data involving categorical variables. Using frequency data, we can test a variety of new types of hypotheses that we have not seen in previous chapters.
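The tallying step above can also be sketched in a few lines of Python. This is only an illustration of how the frequency table is built from the twelve observations; the chapter itself uses Excel, and the variable names `cars` and `frequency` are assumptions made for this sketch.

```python
from collections import Counter

# The twelve observed car colors, in the order listed in the text.
cars = ["White", "Red", "Red", "Black", "Blue", "Green",
        "Yellow", "Red", "White", "Black", "Blue", "White"]

# Counter tallies how many times each label occurs.
frequency = Counter(cars)

# Print the frequency table, most common colors first.
for color, count in frequency.most_common():
    print(f"{color:<8}{count}")
```

Note that the colors are treated purely as labels: the only arithmetic performed is counting.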
For example, a favorite family game on a long interstate drive is for each contestant to pick a color, say white or red, and see who counts the most cars of that color before reaching the destination. The person who picked the color with the most cars wins this extremely delightful and colorful game. Suppose that on one long journey my family collected the following data while playing this game:

| Color of the vehicle | Frequency |
|---|---|
| White | 44 |
| Red | 36 |
| All the Rest | 120 |

Data such as this can be used to test hypotheses about proportions. For example, say I hypothesize that 25% of all cars produced are White, another 25% are Red, and the remaining 50% are all other colors combined. In symbols, this hypothesis can be written as:

$H_0: p_{white} = 0.25,\ p_{red} = 0.25,\ p_{other} = 0.50$

$H_a$: at least one of the proportions is different from the value specified in the null hypothesis

Hypotheses such as this cannot be tested using any of the methods that we have studied so far. We couldn't use the z test, the t test or the F test. Testing this type of hypothesis requires a new type of test called the chi-square test, where chi is pronounced as in the words kind or kite and not like chime. So the bad news is that we will have to learn a new type of Excel function (or, in the olden days, a new statistical table), but the good news is that the whole hypothesis testing procedure remains the same. We will still have a required significance level (alpha), a test statistic, a critical value, a rejection region, a p-value, a decision and a conclusion. The rules of rejection remain the same.

To obtain the test statistic value we use a formula, but first we need another column of values. To the above table of data we add a column for the frequency we would expect if the null hypothesis were true.

| Color of the Car | Observed Frequency (O) | Expected Frequency if the Null Hypothesis were true (E) |
|---|---|---|
| White | 44 | 50 |
| Red | 36 | 50 |
| All the Rest | 120 | 100 |
| Total | 200 | 200 |

Note that in this table the second column is now labeled Observed Frequency. Please verify that the new column has values that represent the hypothesized proportions. For example, since the hypothesized proportion of White cars was 25%, the expected frequency is 50, which is 25% of 200.

The formula for the test statistic for this type of hypothesis is:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

where O is the Observed Frequency and E is the Expected Frequency, and the sum runs over all categories.

How to calculate the Chi-Square test statistic?
We will use the above example to illustrate how to compute the test statistic:

| Color of the Car | (O) | (E) | (O-E) | (O-E)^2 | (O-E)^2/E |
|---|---|---|---|---|---|
| White | 44 | 50 | -6 | 36 | 0.72 |
| Red | 36 | 50 | -14 | 196 | 3.92 |
| All the Rest | 120 | 100 | 20 | 400 | 4.00 |
| Total | 200 | 200 | | | 8.64 |

The chi-square test statistic is 8.64.

How to obtain the critical value?

If we had a chi-square table, we could obtain the critical value from the table, but since we are getting so good at using Excel, we will obtain it using the Excel function =CHIINV(). This function takes two parameters: probability and degrees of freedom. The probability is your alpha value (typically 0.05), and the degrees of freedom is the number of groups (three in our example) minus one. So for our example, we obtain the critical value using the formula =CHIINV(0.05,2), which gives us 5.99.

So what is the rejection region? Any chi-square value greater than 5.99 falls in the rejection region.

How to get the p-value?

We get the p-value using the =CHIDIST() function. In our example, it is =CHIDIST(8.64,2) = 0.0133.

Decision Time: Since the chi-square test statistic value of 8.64 is greater than 5.99, it is in the rejection region, so we reject the null. The same decision is reached using the p-value of 0.0133, which is less than the alpha value of 0.05.

Conclusion: There is sufficient evidence, at significance level 0.05, that the proportions of white, red and other cars differ from 0.25, 0.25 and 0.50 respectively.

What if alpha was 0.01?

If alpha was 0.01, the critical value would be =CHIINV(0.01,2) = 9.21, so the rejection region would be $\chi^2 > 9.21$. The p-value would still be 0.0133. Using the critical value approach, we fail to reject the null hypothesis since 8.64 is not greater than 9.21. Using the p-value approach, we also fail to reject the null because 0.0133 is greater than 0.01. Note that the two approaches always lead to the same decision.

The =CHITEST() function.

Excel provides a function called =CHITEST(). Once you have generated the column of expected frequencies, you can use =CHITEST() to get the p-value for the test directly, without computing the test statistic and therefore without generating the columns needed for (O-E)^2/E. The following Excel screenshots illustrate the use of the =CHITEST() function for the above example:

[Excel screenshots: using =CHITEST() with the observed and expected frequency ranges]
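The car-color test worked through above can be cross-checked with a short Python sketch. It is not a replacement for Excel's =CHIINV() and =CHIDIST(): it relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), so the helper functions `chi2_sf_df2` and `chi2_critical_df2` (names invented for this sketch) are valid only for this three-category example.

```python
import math

# Observed counts from the car-color example in the text.
observed = {"White": 44, "Red": 36, "All the Rest": 120}
# Proportions stated by the null hypothesis.
hypothesized = {"White": 0.25, "Red": 0.25, "All the Rest": 0.50}

# Expected frequency = hypothesized proportion * total number of cars.
total = sum(observed.values())
expected = {k: hypothesized[k] * total for k in observed}

# Test statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1  # number of groups minus one


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Value whose upper-tail probability is alpha, for exactly 2 df."""
    return -2 * math.log(alpha)


p_value = chi2_sf_df2(chi_square)
for alpha in (0.05, 0.01):
    critical = chi2_critical_df2(alpha)
    decision = "reject H0" if chi_square > critical else "fail to reject H0"
    print(f"alpha={alpha}: critical={critical:.2f}, "
          f"chi2={chi_square:.2f}, p={p_value:.4f} -> {decision}")
```

For other degrees of freedom the closed form does not apply, and a statistical library or Excel's own functions would be needed.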

Please note that the CHITEST function needs two ranges: the actual frequency range, which is the same as the observed frequency range, and the expected frequency range. Please also note that the value thus obtained (0.0133) is the same value that we obtained earlier using =CHIDIST(8.64,2). Please run the above example in Excel yourself to get a better feel for how to use the CHITEST() function.

Two Categorical Variables

So far in this chapter, we have looked at hypotheses regarding the proportions of certain values of a single categorical random variable. In the example that we discussed, the random variable was the color of a vehicle and the hypothesis was about the proportions of vehicles with certain colors. In such hypotheses, we are looking at frequency data for one categorical variable.

What if we have frequency data on two categorical variables? For example, let us look at data showing the number of wins at home and away for a certain university in various sports over the past five years:

[Figure 1: Data for Sports vs. Home Field Advantage. Columns: Sport, Wins at Home, Wins Away. Rows: Football, Basketball, Baseball, Soccer.]

In this data there are two categorical variables: Wins (at home or away) and Sport. When we have data like this on two categorical variables, we can ask whether there is a relationship between the two variables or whether they are independent of each other. For example, we can ask whether home field advantage depends upon the sport. This type of test is called the test of independence.

Null Hypothesis: $H_0$: Home Field Advantage and Sport are independent of each other

Alternate Hypothesis: $H_a$: Home Field Advantage depends on the Sport

The chi-square test can be used to test for independence between two variables.

Test Statistic: The formula for the test statistic is the same for two variables as for one variable:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

where the sum now runs over every cell of the table. Just as in the case of one variable, we have to create expected frequencies (E). Generating expected frequencies for two categorical variables involves some extra work, which I will explain next.

How to obtain Expected Frequencies?

a. For each row, find the row sum.
b. For each column, find the column sum.
c. Find the grand sum, i.e. the sum of all the row sums (or the sum of all the column sums).
d. For the cell in the i-th row and j-th column, the expected frequency is (row sum of the i-th row * column sum of the j-th column) divided by the grand sum.

The row sums, column sums and the grand sum are shown in the table in Figure 2.

[Figure 2: Row Sums, Column Sums and Grand Sum. Columns: Sport, Wins at Home, Wins Away, Row Sums. Rows: Football, Basketball, Baseball, Soccer, Column Sums.]

Please verify the row sums, the column sums and the grand sum in Figure 2. The expected frequencies are given in the table in Figure 3, using the formula explained in step d above.

[Figure 3: Expected Frequencies (E). Columns: Sport, Wins at Home, Wins Away, Row Sums. Rows: Football, Basketball, Baseball, Soccer, Column Sums.]

I will explain a couple of the frequencies in Figure 3; you should verify the rest. The expected frequency for the cell Football and Wins at Home is computed as 40*110/200 = 22. The expected frequency for the cell Baseball and Wins Away is computed as 60*90/200 = 27.

So now we have the observed frequencies and the expected frequencies in Figures 1 and 3 respectively. Next we calculate the chi-square value. For each cell we need to compute (O-E)^2/E. The next table shows the values of (O-E)^2/E for each cell.

[Figure 4: (O-E)^2/E for each cell. Columns: Sport, Wins at Home, Wins Away. Rows: Football, Basketball, Baseball, Soccer.]

The sum of all these values gives the chi-square test statistic value = 4.51.

The remaining question is what to compare this test value with, in other words, what the critical value is.
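As an aside before turning to the critical value, steps a to d for the expected frequencies can be sketched in Python. The function names `expected_frequencies` and `chi_square_statistic` are invented for this sketch, and the 2x2 table `toy` is made-up illustration data, not the sports data from Figure 1. Only the two cells that the text works out by hand (22 and 27) are taken from the chapter.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence (steps a to d in the text)."""
    row_sums = [sum(row) for row in observed]        # step a
    col_sums = [sum(col) for col in zip(*observed)]  # step b
    grand = sum(row_sums)                            # step c
    # step d: row sum * column sum / grand sum, for every cell
    return [[r * c / grand for c in col_sums] for r in row_sums]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))


# Made-up 2x2 counts used only to exercise the functions (NOT the sports data).
toy = [[10, 20],
       [30, 40]]
toy_expected = expected_frequencies(toy)
toy_chi2 = chi_square_statistic(toy, toy_expected)
print(toy_expected, round(toy_chi2, 4))

# The two cells the text computes by hand, from the stated row sums (40, 60),
# column sums (110, 90) and grand sum (200).
football_home = 40 * 110 / 200
baseball_away = 60 * 90 / 200
print(football_home, baseball_away)
```

Passing a full table of observed counts, laid out like Figure 1, to `expected_frequencies` would produce a table laid out like Figure 3.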

The critical value can be determined from the Excel function =CHIINV(alpha, degrees of freedom). Suppose our alpha is 0.05. For a test of independence, the degrees of freedom are given by (r-1)*(c-1), where r is the number of rows (4 in our example) and c is the number of columns (2 in our example). So (4-1)*(2-1) = 3*1 = 3.

Critical Value: =CHIINV(0.05,3) = 7.815

Rejection region: $\chi^2 > 7.815$

p-value: given by the Excel function =CHIDIST(4.51,3), which is approximately 0.21

Decision using the critical value approach: We fail to reject the null hypothesis because 4.51 is less than 7.815.

Decision using the p-value approach: We fail to reject the null because the p-value of approximately 0.21 is higher than 0.05.

Conclusion: We did not find sufficient evidence, at a significance level of 0.05, that home field advantage depends on the sport.

Can we get the p-value directly using Excel?

Once we compute the expected frequencies (Figure 3), we can compute the p-value without having to calculate the numbers in Figure 4, so we can bypass the calculations of (O-E)^2/E. How? Using the function =CHITEST(). In this function we specify two ranges: the range of observed frequencies (Figure 1) and the range of expected frequencies (Figure 3). Suppose the observed frequencies of Figure 1 are in C5:D8 and the expected frequencies of Figure 3 are in C13:D16 (see Figure 5). Then =CHITEST(C5:D8,C13:D16) gives the same p-value that we got using =CHIDIST(4.51,3). Figure 6 shows the formulas used in Figure 5. Please try to recreate this example on your computer to get a better sense of how this chi-square test was performed.

[Figure 5: Excel Calculations of expected frequencies]
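The degrees of freedom and the p-value for this test of independence can be cross-checked with a small Python sketch. It uses only the test statistic 4.51 reported in the text, since the cell counts themselves are not needed at this step. The helper `chi2_sf_df3` is a name invented here; it uses a closed form for the chi-square upper-tail probability that holds only for exactly 3 degrees of freedom, so it is not a general substitute for =CHIDIST().

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


rows, cols = 4, 2              # four sports, two win locations
df = (rows - 1) * (cols - 1)   # degrees of freedom for a test of independence

chi_square = 4.51              # test statistic reported in the text
alpha = 0.05

p_value = chi2_sf_df3(chi_square)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"df={df}, p-value={p_value:.3f} -> {decision}")

# Sanity check on the critical value: the tail area beyond 7.815 is about 0.05.
print(f"tail area beyond 7.815: {chi2_sf_df3(7.815):.3f}")
```

Because the p-value is well above 0.05, the sketch reaches the same decision as the critical value approach.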

[Figure 6: Excel Formulas for the numbers in Figure 5]

Summary of the Chapter

When dealing with categorical variables, certain types of hypotheses can be tested. One type involves a single categorical variable: the hypothesis is about the proportions in which the variable takes its different values. Another type involves two categorical variables: the hypothesis is about whether the two variables are independent of each other or dependent. Hypothesis tests involving categorical variables use a chi-square test. The test statistic for a chi-square test measures how far the observed frequencies are from the frequencies that would be expected if the null hypothesis were true. The higher the value of the test statistic, the stronger the evidence in favor of the alternate hypothesis.


Exam Style Questions. Revision for this topic. Name: Ensure you have: Pencil, pen, ruler, protractor, pair of compasses and eraser Name: Exam Style Questions Ensure you have: Pencil, pen, ruler, protractor, pair of compasses and eraser You may use tracing paper if needed Guidance 1. Read each question carefully before you begin answering

More information

DESCRIPTIVE STATISTICS & DATA PRESENTATION*

DESCRIPTIVE STATISTICS & DATA PRESENTATION* Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

More information

First-year Statistics for Psychology Students Through Worked Examples

First-year Statistics for Psychology Students Through Worked Examples First-year Statistics for Psychology Students Through Worked Examples 1. THE CHI-SQUARE TEST A test of association between categorical variables by Charles McCreery, D.Phil Formerly Lecturer in Experimental

More information

Solutions to Homework 6 Statistics 302 Professor Larget

Solutions to Homework 6 Statistics 302 Professor Larget s to Homework 6 Statistics 302 Professor Larget Textbook Exercises 5.29 (Graded for Completeness) What Proportion Have College Degrees? According to the US Census Bureau, about 27.5% of US adults over

More information

Chi Square Distribution

Chi Square Distribution 17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

IBM SPSS Statistics for Beginners for Windows

IBM SPSS Statistics for Beginners for Windows ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

More information

Topic 8. Chi Square Tests

Topic 8. Chi Square Tests BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis One-Factor Experiments CS 147: Computer Systems Performance Analysis One-Factor Experiments 1 / 42 Overview Introduction Overview Overview Introduction Finding

More information

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table ANALYSIS OF DISCRT VARIABLS / 5 CHAPTR FIV ANALYSIS OF DISCRT VARIABLS Discrete variables are those which can only assume certain fixed values. xamples include outcome variables with results such as live

More information

Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Section 1.1 Exercises (Solutions)

Section 1.1 Exercises (Solutions) Section 1.1 Exercises (Solutions) HW: 1.14, 1.16, 1.19, 1.21, 1.24, 1.25*, 1.31*, 1.33, 1.34, 1.35, 1.38*, 1.39, 1.41* 1.14 Employee application data. The personnel department keeps records on all employees

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

Likelihood: Frequentist vs Bayesian Reasoning

Likelihood: Frequentist vs Bayesian Reasoning "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

The ANOVA for 2x2 Independent Groups Factorial Design

The ANOVA for 2x2 Independent Groups Factorial Design The ANOVA for 2x2 Independent Groups Factorial Design Please Note: In the analyses above I have tried to avoid using the terms "Independent Variable" and "Dependent Variable" (IV and DV) in order to emphasize

More information

A Comparative Analysis of Speech Recognition Platforms

A Comparative Analysis of Speech Recognition Platforms Communications of the IIMA Volume 9 Issue 3 Article 2 2009 A Comparative Analysis of Speech Recognition Platforms Ore A. Iona College Follow this and additional works at: http://scholarworks.lib.csusb.edu/ciima

More information

A magician showed a magic trick where he picked one card from a standard deck. Determine what the probability is that the card will be a queen card?

A magician showed a magic trick where he picked one card from a standard deck. Determine what the probability is that the card will be a queen card? Topic : Probability Word Problems- Worksheet 1 Jill is playing cards with her friend when she draws a card from a pack of 20 cards numbered from 1 to 20. What is the probability of drawing a number that

More information

Instructions Budget Sheets

Instructions Budget Sheets Instructions Budget Sheets Potential Sources of Revenue and Expenses REVENUE Parent Dues Tournament Revenue Fundraisers Sponsors Branch / Association EXPENSES Games / Practices Officiating Fees Rink /

More information

Topic : Probability of a Complement of an Event- Worksheet 1. Do the following:

Topic : Probability of a Complement of an Event- Worksheet 1. Do the following: Topic : Probability of a Complement of an Event- Worksheet 1 1. You roll a die. What is the probability that 2 will not appear 2. Two 6-sided dice are rolled. What is the 3. Ray and Shan are playing football.

More information

Lab 3 - DC Circuits and Ohm s Law

Lab 3 - DC Circuits and Ohm s Law Lab 3 DC Circuits and Ohm s Law L3-1 Name Date Partners Lab 3 - DC Circuits and Ohm s Law OBJECTIES To learn to apply the concept of potential difference (voltage) to explain the action of a battery in

More information

CBA Fractions Student Sheet 1

CBA Fractions Student Sheet 1 Student Sheet 1 1. If 3 people share 12 cookies equally, how many cookies does each person get? 2. Four people want to share 5 cakes equally. Show how much each person gets. Student Sheet 2 1. The candy

More information

The Taxman Game. Robert K. Moniot September 5, 2003

The Taxman Game. Robert K. Moniot September 5, 2003 The Taxman Game Robert K. Moniot September 5, 2003 1 Introduction Want to know how to beat the taxman? Legally, that is? Read on, and we will explore this cute little mathematical game. The taxman game

More information

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

More information

GRAPHS/TABLES. (line plots, bar graphs pictographs, line graphs)

GRAPHS/TABLES. (line plots, bar graphs pictographs, line graphs) GRAPHS/TABLES (line plots, bar graphs pictographs, line graphs) Standard: 3.D.1.2 Represent data using tables and graphs (e.g., line plots, bar graphs, pictographs, and line graphs). Concept Skill: Graphs

More information

Lab 11: Budgeting with Excel

Lab 11: Budgeting with Excel Lab 11: Budgeting with Excel This lab exercise will have you track credit card bills over a period of three months. You will determine those months in which a budget was met for various categories. You

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Ch. 4 Discrete Probability Distributions 4.1 Probability Distributions 1 Decide if a Random Variable is Discrete or Continuous 1) State whether the variable is discrete or continuous. The number of cups

More information

Decision Analysis. Here is the statement of the problem:

Decision Analysis. Here is the statement of the problem: Decision Analysis Formal decision analysis is often used when a decision must be made under conditions of significant uncertainty. SmartDrill can assist management with any of a variety of decision analysis

More information

PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables

PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables 3 Stacked Bar Graph PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD To explore for a relationship between the categories of two discrete variables 3.1 Introduction to the Stacked Bar Graph «As with the simple

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Mind on Statistics. Chapter 12

Mind on Statistics. Chapter 12 Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference

More information

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

More information

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test. HYPOTHESIS TESTING Learning Objectives Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test. Know how to perform a hypothesis test

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information