Chapter 7. Categorical Data Analysis
- Clare Heath
- 7 years ago
In Chapter 5 we studied how to test hypotheses involving a single population, such as $H_0: \mu = 5$ vs. $H_a: \mu > 5$. In Chapter 6, we studied how to test hypotheses involving two or more populations, such as $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 > \mu_2$. In both these chapters we were dealing with quantitative variables such as the height of a person or the lengths of alligators. In this chapter we will learn how to test hypotheses that involve qualitative or categorical variables.

Recall some examples of qualitative or categorical variables: Gender, Religion, Race, Color and so on. They are considered categorical variables because their values are not numeric. Even when we represent the values of a categorical variable with numbers, it is still a categorical variable. For example, we can represent colors using numbers such as 1 for white, 2 for red, 3 for blue and so on, but that does not make the variable quantitative, because you couldn't add 2 and 3 (red and blue) and hope that 5 represents purple. The number 5 probably already represents some other color. The numbers in this color example have no inherent numerical properties; they are simply labels.

When dealing with categorical variables, the closest thing to something numerical is the frequency data. For example, let's say I observe the color of the cars passing by my window (assuming I can see a road from my window with cars passing by). Suppose I collect the following data on the first twelve cars that I see: White, Red, Red, Black, Blue, Green, Yellow, Red, White, Black, Blue and White. I can translate this data into a frequency table like this:

| Color of the vehicle | Frequency |
|---|---|
| White | 3 |
| Black | 2 |
| Blue | 2 |
| Yellow | 1 |
| Red | 3 |
| Green | 1 |

It is relatively easy to obtain such frequency tables for data involving categorical variables, as you can see in the above example. Using frequency data, we can test a variety of new types of hypotheses that we have not seen in previous chapters.
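The tally above can also be produced programmatically. The following is a minimal Python sketch, not part of the original chapter, that counts the twelve observed colors; the variable names `cars` and `frequency` are illustrative choices.

```python
from collections import Counter

# The twelve observations listed in the text, in the order they were seen.
cars = [
    "White", "Red", "Red", "Black", "Blue", "Green",
    "Yellow", "Red", "White", "Black", "Blue", "White",
]

# Counter tallies how many times each label occurs.
frequency = Counter(cars)

for color, count in frequency.items():
    print(f"{color:<8}{count}")
```

The counts add up to the number of observations, which is a quick check that no car was missed.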
For example, a favorite family game on a long drive on an interstate highway is for each contestant to pick a color, say white or red, and see who counts the most cars of that color before you reach your destination. The person who picked the color with the most cars wins this extremely delightful and colorful game. Suppose on one long journey while playing this game my family collected the following data:

| Color of the vehicle | Frequency |
|---|---|
| White | 44 |
| Red | 36 |
| All the Rest | 120 |

Data such as in the above table can be used to test hypotheses about proportions. For example, say I hypothesize that 25% of all cars produced are White, another 25% are Red, and the remaining 50% are all other colors combined. In symbols, this hypothesis can be written as:
$H_0$: $p_{white} = 0.25$, $p_{red} = 0.25$, $p_{other} = 0.50$
$H_a$: at least one of the proportions is different from the value specified in the null hypothesis

Hypotheses such as this cannot be tested using any of the methods that we have studied so far. For example, we couldn't use the z test, the t test or the F test to test such a hypothesis. Testing this type of hypothesis requires a new type of test called the chi-square test, where chi is pronounced as in the words kind or kite and not like chime. So the bad news is that we will have to learn a new type of Excel function (or in the olden days, a new statistical table), but the good news is that the whole hypothesis testing procedure remains the same. We will still have a required or desired significance level (alpha), a test statistic, a critical value, a rejection region, a p-value, a decision and a conclusion. The rules of rejection remain the same.

To obtain the test statistic value, we use a formula which I will tell you shortly. But before I give you the formula, I must tell you that we will need another column of values. To the above table of data, we will add a column for the expected frequency if the null hypothesis were true.

| Color of the Car | Observed Frequency (O) | Expected Frequency if the Null Hypothesis were true (E) |
|---|---|---|
| White | 44 | 50 |
| Red | 36 | 50 |
| All the Rest | 120 | 100 |
| Total | 200 | 200 |

Note that in this table, we have changed the label of the second column to Observed Frequency. Please verify that the new column has values that represent the hypothesized proportions. For example, since the hypothesized proportion of White cars was 25%, the expected frequency is 50, which is 25% of 200.

Now let me give you the formula for the test statistic value for this type of hypothesis:

$$\text{Chi-Square Value} = \sum \frac{(O - E)^2}{E}$$

where O is the Observed Frequency and E is the Expected Frequency.

How to calculate the Chi-Square test statistic?
We will use the above example to illustrate how to compute the test statistic:

| Color of the Car | (O) | (E) | (O-E) | (O-E)^2 | (O-E)^2/E |
|---|---|---|---|---|---|
| White | 44 | 50 | -6 | 36 | 0.72 |
| Red | 36 | 50 | -14 | 196 | 3.92 |
| All the Rest | 120 | 100 | 20 | 400 | 4.00 |
| Total | 200 | 200 | | | 8.64 |

The chi-square test statistic is 8.64.

How to obtain the critical value?
If we had a chi-square table, we could obtain the critical value from the table, but since we are getting so good at using Excel, we will obtain it using the Excel function =CHIINV(). This function takes two parameters: probability and degrees of freedom. The probability is your alpha value (typically 0.05), and the degrees of freedom is the number of groups (three in our example) minus one. So for our example, we obtain the critical value using the formula =CHIINV(0.05,2), which gives us 5.99.

So what is the rejection region?
Any chi-square value greater than 5.99 falls in the rejection region.

How to get the p-value?
We get the p-value using the =CHIDIST() function. In our example, it is =CHIDIST(8.64,2) = 0.0133.

Decision Time: The chi-square test statistic value of 8.64 is greater than the critical value of 5.99, so it falls in the rejection region and we reject the null. The same decision is reached using the p-value of 0.0133, which is less than the alpha value of 0.05.

Conclusion: There is sufficient evidence, at significance level 0.05, that the proportions of white, red and other cars are other than 0.25, 0.25 and 0.50 respectively.

What if alpha was 0.01?
If alpha was 0.01, the critical value would be =CHIINV(0.01,2) = 9.21. So the rejection region would be χ² > 9.21. The p-value would still be the same at 0.0133. Using the critical value approach, we fail to reject the null hypothesis since 8.64 is not greater than 9.21. Using the p-value, we also fail to reject the null because 0.0133 is greater than 0.01. Note that the two approaches always lead to the same decision.

The =CHITEST() function.
Excel provides a function called =CHITEST(). Once you have generated the column for the expected frequency, you can use the =CHITEST() function to get the p-value for the test without having to generate the test statistic value, which requires the columns necessary to compute (O-E)²/E. In the original chapter, Excel screenshots illustrate the use of the =CHITEST() function for the above example.
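The one-variable test can also be retraced outside Excel. The sketch below is an illustration rather than the chapter's own method: it uses only Python's standard library and relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(-x/2). The helper names `upper_tail_df2` and `critical_value_df2` are made up for this example and stand in for =CHIDIST(x,2) and =CHIINV(alpha,2).

```python
import math

observed = {"White": 44, "Red": 36, "All the Rest": 120}
hypothesized = {"White": 0.25, "Red": 0.25, "All the Rest": 0.50}

# Expected frequency = hypothesized proportion * total number of cars.
total = sum(observed.values())
expected = {color: p * total for color, p in hypothesized.items()}

# Test statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)
df = len(observed) - 1  # number of groups minus one


def upper_tail_df2(x):
    """P(X > x) for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


def critical_value_df2(alpha):
    """Value whose upper-tail probability is alpha, for 2 degrees of freedom."""
    return -2 * math.log(alpha)


p_value = upper_tail_df2(chi_square)

for alpha in (0.05, 0.01):
    critical = critical_value_df2(alpha)
    decision = "reject" if chi_square > critical else "fail to reject"
    print(f"alpha={alpha}: critical={critical:.2f}, p={p_value:.4f}, {decision} H0")
```

The closed form only holds for 2 degrees of freedom; for other values a statistics library or a table would be needed.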
Please note that the CHITEST function needs two ranges: the actual frequency range, which is the same as the observed frequency range, and the expected frequency range. Please also note that the value thus obtained (0.0133) is the same value that we obtained earlier using the =CHIDIST(8.64,2) function. Please run the above example in Excel yourself to get a better feel for how to use the CHITEST() function.

Two Categorical Variables

So far in this chapter, we have looked at hypotheses regarding the proportions of certain values of a categorical random variable. In the example that we discussed, the random variable was the color of a vehicle and the hypothesis was about the proportions of vehicles with certain colors. In such hypotheses, we are looking at frequency data of one categorical variable (color of vehicle in our example). What if we have frequency data on two categorical variables? For example, let us look at the following data that shows the number of wins at home and away for a certain university in various sports in the past five years:

| Sport | Wins at Home | Wins Away |
|---|---|---|
| Football | | |
| Basketball | | |
| Baseball | | |
| Soccer | | |

Figure 1: Data for Sports vs. Home Field Advantage

In the above data, there are two categorical variables: Wins (at home or away) and Sport. When we have data like this on two categorical variables, we can ask whether there is a relationship between the two variables or whether they are independent of each other. For example, we can ask whether home field advantage depends upon the sport or not. This type of test is called the test of independence.

Null Hypothesis: $H_0$: Home Field Advantage and Sport are independent of each other
Alternate Hypothesis: $H_a$: Home Field Advantage depends on the Sport

The chi-square test can be used to test for independence between two variables.
Test Statistic: The formula for the test statistic is the same for two variables as for one variable. It is

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Just like in the case of one variable, we will have to create expected frequencies (E). Generating expected frequencies for two categorical variables involves some extra work, which I will explain next.

How to obtain Expected Frequencies?
a. For each row, find the row sum.
b. For each column, find the column sum.
c. Find the grand sum, i.e. the sum of all the row sums (or the sum of all the column sums).
d. For the i-th row and j-th column, the expected frequency is (row sum of the i-th row) * (column sum of the j-th column) divided by the grand sum.

The row sums, column sums and the grand sum are shown in the table in Figure 2.

| Sport | Wins at Home | Wins Away | Row Sums |
|---|---|---|---|
| Football | | | 40 |
| Basketball | | | |
| Baseball | | | 60 |
| Soccer | | | |
| Column Sums | 110 | 90 | 200 |

Figure 2: Row Sums, Column Sums and Grand Sum

Please verify the row sums, the column sums and the grand sum in Figure 2. The expected frequencies are given in the table in Figure 3, using the formula explained in step d above.

| Sport | Wins at Home | Wins Away | Row Sums |
|---|---|---|---|
| Football | 22 | | 40 |
| Basketball | | | |
| Baseball | | 27 | 60 |
| Soccer | | | |
| Column Sums | 110 | 90 | 200 |

Figure 3: Expected Frequencies (E)

I will explain a couple of these frequencies in Figure 3. You should verify all the rest. The expected frequency for the cell Football and Wins at Home is computed as 40*110/200 = 22. The expected frequency for the cell Baseball and Wins Away is computed as 60*90/200 = 27.

So now we have the observed frequencies and the expected frequencies in Figures 1 and 3 respectively. Next we calculate the chi-square value. For each cell we need to compute (O-E)²/E. The next table shows the values of (O-E)²/E for each cell.

| Sport | Wins at Home | Wins Away |
|---|---|---|
| Football | | |
| Basketball | | |
| Baseball | | |
| Soccer | | |

Figure 4: (O-E)²/E for each cell

The sum of all these values gives the chi-square test statistic value = 4.51.

So what should we compare this test value with? In other words, what is the critical value?
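Before turning to the critical value, steps a through d above can be sketched in Python. This is an illustrative sketch, not the chapter's Excel workflow. The function names are made up for this example, and because the observed counts of Figure 1 are not available in this text, the 2x2 table `hypothetical_observed` is invented data used only to exercise the function. The two single-cell calls use the sums quoted in the text.

```python
def expected_frequency(row_sum, column_sum, grand_sum):
    """Step d: row sum * column sum / grand sum."""
    return row_sum * column_sum / grand_sum


def expected_table(observed):
    """Apply steps a-d to a table given as a list of rows."""
    row_sums = [sum(row) for row in observed]             # step a
    column_sums = [sum(col) for col in zip(*observed)]    # step b
    grand_sum = sum(row_sums)                             # step c
    return [
        [expected_frequency(r, c, grand_sum) for c in column_sums]
        for r in row_sums
    ]


# Cells worked out in the text.
print(expected_frequency(40, 110, 200))  # Football, Wins at Home
print(expected_frequency(60, 90, 200))   # Baseball, Wins Away

# Hypothetical counts, NOT the chapter's data.
hypothetical_observed = [[30, 10], [20, 40]]
print(expected_table(hypothetical_observed))
```

Each row of the expected table sums to the same row sum as the observed table, which is a useful check when verifying Figure 3 by hand.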
The critical value can be determined from the Excel function =CHIINV(alpha, degrees of freedom). Suppose our alpha is 0.05. For a test of independence, the degrees of freedom is given by (r - 1)*(c - 1), where r is the number of rows (4 in our example) and c is the number of columns (2 in our example). So (4 - 1)*(2 - 1) = 3*1 = 3.

Critical Value: =CHIINV(0.05,3) = 7.81
Rejection region: χ² > 7.81
p-value: given by the Excel function =CHIDIST(4.51,3), which is approximately 0.21

Decision using the critical value approach: We fail to reject the null hypothesis because 4.51 is less than 7.81.
Decision using the p-value approach: We fail to reject the null because the p-value of about 0.21 is higher than 0.05.
Conclusion: We did not find sufficient evidence, at a significance level of 0.05, that home field advantage depends on the sport.

Can we get the p-value directly using Excel?
Once we compute the expected frequencies (Figure 3), we can compute the p-value without having to calculate the numbers in Figure 4. So we can bypass the calculations of (O-E)²/E. How? Using the function =CHITEST(). In this function, we specify two ranges: the range for observed frequencies (Figure 1) and the range for expected frequencies (Figure 3). Suppose the range of data values in Figure 1 is C5:D8 and the range of expected frequencies in Figure 3 is C13:D16 (see Figure 5). Then =CHITEST(C5:D8,C13:D16) will give approximately 0.21, which is the same p-value we got using =CHIDIST(4.51,3). Figure 6 shows the formulas used in Figure 5. Please try to recreate this example on your computer to get a better sense of how this chi-square test was performed.

Figure 5: Excel Calculations of expected frequencies
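For readers without Excel, the critical value and p-value for 3 degrees of freedom can be approximated with Python's standard library. The sketch below is an assumption-laden illustration: it uses the closed form of the chi-square upper tail that holds for exactly 3 degrees of freedom, and finds the critical value by bisection. The function names are invented here and play the roles of =CHIDIST(x,3) and =CHIINV(alpha,3).

```python
import math


def degrees_of_freedom(rows, columns):
    """(r - 1) * (c - 1) for a test of independence."""
    return (rows - 1) * (columns - 1)


def upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def critical_value_df3(alpha, low=0.0, high=100.0):
    """Bisection search: the upper tail decreases as x grows."""
    for _ in range(100):
        mid = (low + high) / 2
        if upper_tail_df3(mid) > alpha:
            low = mid   # tail still too large, move right
        else:
            high = mid  # tail small enough, move left
    return (low + high) / 2


df = degrees_of_freedom(4, 2)
chi_square = 4.51  # test statistic reported in the text
alpha = 0.05

critical = critical_value_df3(alpha)
p_value = upper_tail_df3(chi_square)
decision = "reject" if chi_square > critical else "fail to reject"
print(f"df={df}, critical={critical:.2f}, p={p_value:.4f}, {decision} H0")
```

If SciPy is installed, `scipy.stats.chi2.sf(x, df)` and `scipy.stats.chi2.isf(alpha, df)` give the same quantities for any degrees of freedom.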
Figure 6: Excel Formulas for the numbers in Figure 5.

Summary of the Chapter

When dealing with categorical variables, certain types of hypotheses can be made. One type of hypothesis involves a single categorical variable. The hypothesis is about the proportions in which the variable is distributed across its different values. Another type of hypothesis involves two categorical variables. The hypothesis is about whether the two variables are independent of or dependent on each other. The test of hypothesis involving categorical variables uses a chi-square test. The test statistic for a chi-square test is a measure of how far the actual frequencies are from the frequencies that would be expected if the null hypothesis were true. The higher the value of the test statistic, the stronger the evidence in favor of the alternate hypothesis.
More informationLikelihood: Frequentist vs Bayesian Reasoning
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationNormality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com
More informationOpgaven Onderzoeksmethoden, Onderdeel Statistiek
Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week
More informationThe ANOVA for 2x2 Independent Groups Factorial Design
The ANOVA for 2x2 Independent Groups Factorial Design Please Note: In the analyses above I have tried to avoid using the terms "Independent Variable" and "Dependent Variable" (IV and DV) in order to emphasize
More informationA Comparative Analysis of Speech Recognition Platforms
Communications of the IIMA Volume 9 Issue 3 Article 2 2009 A Comparative Analysis of Speech Recognition Platforms Ore A. Iona College Follow this and additional works at: http://scholarworks.lib.csusb.edu/ciima
More informationA magician showed a magic trick where he picked one card from a standard deck. Determine what the probability is that the card will be a queen card?
Topic : Probability Word Problems- Worksheet 1 Jill is playing cards with her friend when she draws a card from a pack of 20 cards numbered from 1 to 20. What is the probability of drawing a number that
More informationInstructions Budget Sheets
Instructions Budget Sheets Potential Sources of Revenue and Expenses REVENUE Parent Dues Tournament Revenue Fundraisers Sponsors Branch / Association EXPENSES Games / Practices Officiating Fees Rink /
More informationTopic : Probability of a Complement of an Event- Worksheet 1. Do the following:
Topic : Probability of a Complement of an Event- Worksheet 1 1. You roll a die. What is the probability that 2 will not appear 2. Two 6-sided dice are rolled. What is the 3. Ray and Shan are playing football.
More informationLab 3 - DC Circuits and Ohm s Law
Lab 3 DC Circuits and Ohm s Law L3-1 Name Date Partners Lab 3 - DC Circuits and Ohm s Law OBJECTIES To learn to apply the concept of potential difference (voltage) to explain the action of a battery in
More informationCBA Fractions Student Sheet 1
Student Sheet 1 1. If 3 people share 12 cookies equally, how many cookies does each person get? 2. Four people want to share 5 cakes equally. Show how much each person gets. Student Sheet 2 1. The candy
More informationThe Taxman Game. Robert K. Moniot September 5, 2003
The Taxman Game Robert K. Moniot September 5, 2003 1 Introduction Want to know how to beat the taxman? Legally, that is? Read on, and we will explore this cute little mathematical game. The taxman game
More informationSTA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance
Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis
More informationGRAPHS/TABLES. (line plots, bar graphs pictographs, line graphs)
GRAPHS/TABLES (line plots, bar graphs pictographs, line graphs) Standard: 3.D.1.2 Represent data using tables and graphs (e.g., line plots, bar graphs, pictographs, and line graphs). Concept Skill: Graphs
More informationLab 11: Budgeting with Excel
Lab 11: Budgeting with Excel This lab exercise will have you track credit card bills over a period of three months. You will determine those months in which a budget was met for various categories. You
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Ch. 4 Discrete Probability Distributions 4.1 Probability Distributions 1 Decide if a Random Variable is Discrete or Continuous 1) State whether the variable is discrete or continuous. The number of cups
More informationDecision Analysis. Here is the statement of the problem:
Decision Analysis Formal decision analysis is often used when a decision must be made under conditions of significant uncertainty. SmartDrill can assist management with any of a variety of decision analysis
More informationPURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables
3 Stacked Bar Graph PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD To explore for a relationship between the categories of two discrete variables 3.1 Introduction to the Stacked Bar Graph «As with the simple
More informationExperimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test
Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely
More informationMind on Statistics. Chapter 12
Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference
More informationElementary Statistics Sample Exam #3
Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to
More informationUnderstand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.
HYPOTHESIS TESTING Learning Objectives Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test. Know how to perform a hypothesis test
More informationProbability Distributions
CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationChapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
More informationOne-Way Analysis of Variance
One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We
More information