Objectives. 9.1, 9.2 Inference for two-way tables. The hypothesis: no association. Expected cell counts. The chi-square test.
- Dina Jenkins
- 7 years ago
1 Objectives 9.1, 9.2 Inference for two-way tables The hypothesis: no association Expected cell counts The chi-square test Using software Further reading:
2 Independence/Association: Sample and Population
In the previous section we defined independence and dependence (also called association) using two-way contingency tables. Recall that two variables are independent if the probability of one variable conditioned on the other is the same as its marginal probability.
Example: In Chapter 13 we gave an example where the gender of a person did not change the chance of a pass or fail. This means that gender and passing are independent variables.
If the marginal probabilities and the conditional probabilities are not the same, then there is an association between the variables.
Example: In Chapter 13 we gave an example where the gender of a person changed (dramatically) the chance of them wearing a dress (or not) to the Oscars. This means there is an association between dress wearing and gender. Knowledge of one variable (i.e. the person is female) changes the chance of wearing a dress or not.
In reality we never observe the whole population, and the numbers in a two-way table are a sample from a population. In such a case, even if the variables are independent, sampling variation means that the marginal proportions in the sample will not be exactly the same as the conditional proportions.
3 Example: The number of males and females in higher education is known to be equal. However, in a given class the numbers of males and females are likely to be different. Thus we need to 'test for independence' between the variables given the data set. We do this by 'predicting' what the numbers in the table would be under independence and comparing them with what is actually observed in the data. This is best explained through several examples.
4 But in reality these are only samples from the entire population. For a sample we cannot expect, even in the case of independence (no association), that the marginal proportions and the conditional proportions will be exactly the same. They will differ due to random variation in the sample. As in everything we have done so far, we are not interested in the sample but in the population itself. What can we infer about the population based on the sample? We are interested in whether there is an association between the two variables in the population.
When we have a two-by-two table, for example the hair and Minidoxil example, we can test whether the conditional probabilities (the proportion who saw an improvement using Minidoxil vs. placebo) are the same by using the test on two proportions. However, this method cannot be extended to larger tables such as the student smoking example. Instead we take a slightly different approach: we calculate what we expect to see if there is no dependence and compare it to what we do observe. We formalize this on the next slides.
5 Hypothesis test for association
When we have two categorical variables it is often of interest to determine whether they are associated. As always, a firm decision can only be made by rejecting a null hypothesis using an appropriate test procedure. This is because we need to know whether the apparent differences among sample proportions are likely to have occurred just by chance due to random sampling.
The hypotheses are
H0: the variables are not associated vs. Ha: the variables are associated.
We will use the chi-square (χ²) test to assess the null hypothesis, based on how well the data fit what H0 predicts the counts of a two-way table to be.
6 Expected cell counts
To test the null hypothesis of no association between the variables, we must compare the actual (observed) counts from the sample data with the expected counts. The null hypothesis predicts that the cell proportions of the column variable within each row are the same as their overall proportions for the whole table. Specifically, the expected count in any cell of a two-way table when H0 is true is:
$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$
Do not round the expected count; it usually is not a whole number. The expected count is a mean, not a value we would actually see.
7 Example 1: Oscars and dresses
Let us return to the Oscars data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e. the number who wore dresses, the number who did not wear dresses, the number of females and the number of males.

[Table of observed counts: cell values not preserved in this transcript.]

Marginal totals:
            Male                     Female                   Total
Dress                                                         215 (215/420 = 0.512)
No Dress                                                      205 (205/420 = 0.488)
Total       200 (200/420 = 0.476)    220 (220/420 = 0.524)    420

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?
Answer: Only if dress and gender are independent variables. In this case the conditional and marginal probabilities are the same. If there is no dependence between gender and whether a person wears a dress or not, then we can use the marginal probabilities to predict the numbers.

Expected counts under independence:
            Male                           Female                           Total
Dress       51.2% of 200 males = 102.38    51.2% of 220 females = 112.62    215 (0.512)
No Dress    48.8% of 200 males = 97.62     48.8% of 220 females = 107.38    205 (0.488)
Total       200 (0.476)                    220 (0.524)                      420

Comparing what is observed with what is expected under independence, the numbers are completely different!
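The expected counts for the Oscars margins can be reproduced with a few lines of Python. This is only a sketch: the function name `expected_counts` is made up for illustration, and the only inputs are the row and column totals quoted on the slide.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under H0 (no association).

    Each cell is (row total * column total) / table total.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same table total")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Margins from the Oscars example: rows = Dress / No Dress, columns = Male / Female.
oscars_expected = expected_counts(row_totals=[215, 205], col_totals=[200, 220])
for label, row in zip(["Dress", "No Dress"], oscars_expected):
    print(label, [round(x, 2) for x in row])
```

The expected counts are deliberately left unrounded inside the function; rounding is applied only when printing.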
8 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \sum_{\text{four cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
[The four numerical terms and their total are not preserved in this transcript.]
This difference is huge! It means that our predictions are completely wrong. This is because we made the predictions under the assumption that dress and gender are completely independent. Clearly they are not, which is why there is such a big difference. Observe that the p-value is very small. This tells us to reject the null hypothesis, which is that there is no dependence between gender and whether a person wears a dress or not.
9 Example 2: Gender and grades
Let us return to the gender and grade data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e. the number who passed, the number who did not pass, the number of females and the number of males.

[Table of observed counts: cell values not preserved in this transcript.]

Marginal totals:
         Male                     Female                   Total
Pass                                                       288 (288/320 = 0.90)
Fail                                                       32 (32/320 = 0.10)
Total    120 (120/320 = 0.375)    200 (200/320 = 0.625)    320

We 'fill in' the middle of the table with what we expect to see if there is no dependence between gender and grades. If there is no dependence between gender and grades, then we can use the marginal probabilities to predict the numbers.

Expected counts under independence:
         Male                      Female                      Total
Pass     90% of 120 males = 108    90% of 200 females = 180    288 (0.90)
Fail     10% of 120 males = 12     10% of 200 females = 20     32 (0.10)
Total    120 (0.375)               200 (0.625)                 320

Comparing what is observed with what is expected under independence, the numbers are exactly the same.
10 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \frac{(108-108)^2}{108} + \frac{(180-180)^2}{180} + \frac{(12-12)^2}{12} + \frac{(20-20)^2}{20} = 0$$
There is no difference at all; the observed counts are exactly what is expected under independence of grade and gender. Look at the p-value: it is one. This tells us we cannot reject the null hypothesis that there is no dependence between gender and grades.
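A minimal sketch of the statistic used here: sum (observed − expected)² / expected over the cells of the table. The function name is illustrative, and the observed counts below are assumed to equal the expected counts, which is what a statistic of exactly zero implies.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells (margins excluded)."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Example 2: rows = Pass / Fail, columns = Male / Female.
expected = [[108.0, 180.0], [12.0, 20.0]]
# Assumption: observed counts equal the expected counts, as a statistic of 0 implies.
observed = [[108, 180], [12, 20]]

print(chi_square_statistic(observed, expected))
```

Any departure of an observed count from its expected count makes the statistic strictly positive, because every term is a square divided by a positive number.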
11 Example 3: Minidoxil and hair
Let us return to the Minidoxil data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e. the number who saw an improvement, the number who did not, and the numbers in both groups.

[Table of observed counts: cell values not preserved in this transcript.]

Marginal totals:
                  Minidoxil                Placebo                  Subtotals
Improvement                                                         159 (159/612 = 26%)
No improvement                                                      453 (453/612 = 74%)
Total             310 (310/612 = 50.7%)    302 (302/612 = 49.3%)    612

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?
Again, we can only use the marginals if there is no dependence/association between the treatment and hair growth. If this is the case, we obtain the expected counts below.

Expected counts under independence:
                  Minidoxil              Placebo                Subtotals
Improvement       26% of 310 = 80.54     26% of 302 = 78.46     159 (26%)
No improvement    74% of 310 = 229.46    74% of 302 = 223.54    453 (74%)
Total             310 (50.7%)            302 (49.3%)            612

Unlike the previous two examples, the observed and expected numbers neither match exactly nor are completely different. How should we interpret these differences?
12 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \sum_{\text{four cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = 11.58$$
This difference is neither zero (as in Example 2) nor huge (as in Example 1). Can we explain this difference by sampling variation alone, when in fact Minidoxil is no different from the placebo? It is hard to judge from the value 11.58 by itself; we need to know the distribution associated with it and from there deduce the p-value. For 11.58 with 1 degree of freedom the p-value is about 0.07% (0.0007); in the next few slides we explain where this comes from.
13 The chi-squared distribution
[Figure: density curve of a chi-squared distribution.]
This is what a chi-squared distribution looks like. It tells us that if there is no association between the two variables (for example, gender and binge drinking), the chi-square statistic is likely to be quite small. In fact, the chances of it being large are quite slim. These chances can be obtained using the critical values for the chi-squared distribution given in the chi-squared tables. Looking up the chi-squared table with 1 df, we see that there is a 25% chance the chi-value will be more than 1.32 and a 5% chance it will be more than 3.84. We apply this to our chi-value.
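The table lookups for 1 degree of freedom can be checked numerically. The sketch below relies on a standard fact that is not stated on the slide: a chi-square variable with 1 df is the square of a standard normal Z, so its right-tail area is P(|Z| > √x), which Python's `math.erfc` gives directly. The function name is illustrative.

```python
import math


def chi2_right_tail_df1(x):
    """P(chi-square with 1 df > x), computed as P(|Z| > sqrt(x)) for standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


# 1.32 and 3.84 are the table values quoted on the slide; 11.58 is the
# chi-value from the Minidoxil example.
for value in (1.32, 3.84, 11.58):
    print(value, round(chi2_right_tail_df1(value), 4))
```

The first two values come out close to 25% and 5%, and the tail area beyond 11.58 is far smaller than either.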
14 For the chi-square test, H0 states that there is no association between the row and column variables in a two-way table. The alternative is that these variables are related. If H0 is true, the chi-square test statistic has approximately a chi-square distribution with (r − 1)(c − 1) degrees of freedom. Use the chi-square table (Table F) to get the P-value. The P-value for the chi-square test is the area to the right of the test statistic χ² under this distribution.
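As a software alternative to Table F, the right-tail area can be computed for any whole number of degrees of freedom. The sketch below uses the standard finite-series forms of the chi-square tail probability for integer df; the function names are illustrative, and the values 5.99 and 7.81 are the usual 5% critical values for 2 and 3 df, used here only as a sanity check.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1)(c - 1) for an r x c two-way table."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_right_tail(x, df):
    """Area to the right of x under a chi-square density with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^(j-1) / (1*3*...*(2j-1))
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


df = degrees_of_freedom(3, 2)  # a table with 3 rows and 2 columns
print(df, round(chi2_right_tail(5.99, df), 3))
print(3, round(chi2_right_tail(7.81, 3), 3))
```

Both printed tail areas should be close to 0.05.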
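15 Table F: df = (r − 1)(c − 1)
[Excerpt of Table F: upper-tail probabilities p by degrees of freedom df. The worked lookup for df = 2 (the value of χ² and the two bracketing P-values) is not preserved in this transcript.]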
16 The chi-value for the Minidoxil example is 11.58. We see that it is far in the right tail of the chi-squared distribution. The area to the RIGHT of 11.58 is about 0.07% (0.0007). This matches the StatCrunch output. It can also be deduced from the tables: since 11.58 > 3.84 (which corresponds to 5% in the chi table with 1 df), it is clear that the p-value corresponding to 11.58 is a lot less than 5%.
Interpretation: If there is no association between treatment and hair growth, then only about 0.07% of the time will we observe a chi-value in the data of 11.58 or more.
17 Example 4: Treating cocaine addiction
[Figure: Observed % of No Relapse by Treatment.]
74 patients addicted to cocaine were assigned at random to one of three possible treatments. The observed variable is whether or not they relapsed into addiction after their treatment. We test whether the chance of relapse is related to treatment with α = 0.05.
[Figure: Expected % of No Relapse by Treatment: 35.14% in each of the three groups.]
Overall 26/74 = 35.14% did not relapse. If this occurrence does not depend on the treatment, then we would expect 35.14% of each group not to relapse. This is what a null hypothesis of no association would predict.
18 Treating cocaine addiction, cont.
We have noted that, assuming treatment has no effect on the relapse response, each treatment group should be 35.14% no relapse and 64.86% relapse. The expected counts are computed using the margin totals from the observed (actual) counts.

Observed counts
               No    Yes    Total
Desipramine    15    10     25
Lithium         7    19     26
Placebo         4    19     23
Total          26    48     74

Expected counts
               No                     Yes                     Total
Desipramine    (25 × 26)/74 = 8.78    (25 × 48)/74 = 16.22    25
Lithium        (26 × 26)/74 = 9.14    (26 × 48)/74 = 16.86    26
Placebo        (23 × 26)/74 = 8.08    (23 × 48)/74 = 14.92    23
Total          26                     48                      74

Row totals are the same for both tables. Column totals are the same for both tables.
Comparing the observed and expected counts, we can see that the two do not agree very well. But to say whether this is significant, we need to compute a test statistic and a P-value.
19 Treating cocaine addiction, cont.
Now we compute the χ² test statistic from the observed and expected counts:
$$\chi^2 = \sum \frac{(\text{obs.} - \text{exp.})^2}{\text{exp.}}$$
$$\chi^2 = \frac{(15-8.78)^2}{8.78} + \frac{(10-16.22)^2}{16.22} + \frac{(7-9.14)^2}{9.14} + \frac{(19-16.86)^2}{16.86} + \frac{(4-8.08)^2}{8.08} + \frac{(19-14.92)^2}{14.92} \approx 10.7$$
The degrees of freedom are (3 − 1)(2 − 1) = 2. From Table F, we find that the area to the right of 10.7 is between 0.0025 and 0.005.
The P-value is less than α = 0.05, so we conclude that the chance of relapse is related to the treatment. A causal effect is also indicated, because the treatment was applied and then the response (relapse/no relapse) was observed. For many data sets we cannot say there is a causal effect; see HW9, Q4.
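The whole calculation for this example can be sketched end to end in Python. The observed counts are an assumption, inferred from the margin totals (25, 26, 23 and 26, 48) and the no-relapse figures quoted in the slides (60% of 25, then 7 and 4). The P-value step uses the fact that, for exactly 2 degrees of freedom, the right-tail area of the chi-square distribution has the closed form exp(−χ²/2).

```python
import math

# Assumed observed counts; rows = Desipramine, Lithium, Placebo; columns = No relapse, Relapse.
observed = [[15, 10], [7, 19], [4, 19]]

row_totals = [sum(row) for row in observed]        # treatment group sizes
col_totals = [sum(col) for col in zip(*observed)]  # no relapse / relapse totals
n = sum(row_totals)                                # total sample size

# Expected count = row total * column total / total sample size (not rounded).
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form for the right-tail area, valid only because df == 2 here.
p_value = math.exp(-chi2 / 2.0)

print(round(chi2, 2), df, round(p_value, 4))
```

Because the expected counts are kept unrounded, the statistic can differ in the second decimal place from a hand calculation that uses expected counts rounded to two decimals.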
20 Cocaine addiction, cont.
[Figure: Observed % of no relapse.]
From StatCrunch: use Stat-Tables-Contingency-with summary. The counts are to be provided in a table format. The P-value is below 0.005, which is very significant. We reject the null hypothesis of no association and conclude there is a relationship between the treatment and the outcome (relapse or not).
21 Meaning of conditional probabilities: Cocaine addiction, cont.
Since the outcome (relapse or not) is a response variable, it is sensible to ask about its conditional distribution, given each treatment, and then to compare across treatments. For example, 60% of addicts treated with Desipramine did not relapse and 27% of addicts treated with Lithium did not relapse, while only 17% of addicts treated with the placebo did not relapse.
But it is not sensible to look at the conditional proportions for treatment, given the outcome, because treatment is not a response variable in this study: it was applied to the patients (subjects). In fact, the row totals are the sizes of three samples and are not random.
P(No | Placebo) = 4/23 = 17.39% has meaning.
P(Placebo | No) = 4/26 = 15.38% has no interpretation.
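The conditional proportions of no relapse given treatment are just row percentages. A small sketch, assuming the observed counts implied by the percentages and totals quoted in the slides (15 of 25, 7 of 26 and 4 of 23 with no relapse); the variable names are illustrative.

```python
# Assumed observed counts by treatment group.
observed = {
    "Desipramine": {"No": 15, "Yes": 10},
    "Lithium": {"No": 7, "Yes": 19},
    "Placebo": {"No": 4, "Yes": 19},
}

# Condition on the treatment (the row): divide by the row total, not the column total.
no_relapse_rate = {
    treatment: counts["No"] / sum(counts.values())
    for treatment, counts in observed.items()
}

for treatment, rate in no_relapse_rate.items():
    print(f"P(No | {treatment}) = {rate:.2%}")
```

Dividing by the column total instead would give quantities such as P(Placebo | No), which the slide argues have no useful interpretation in this design.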
22 Cocaine addiction, cont.
Using the confidence interval formula for a single proportion, we can estimate the individual proportions of no relapse for each treatment (95% confidence, z* = 1.96).
For Desipramine: $0.600 \pm 1.96\sqrt{0.600(0.400)/25} = 0.600 \pm 0.192 = (0.408, 0.792)$.
For Lithium: $0.269 \pm 1.96\sqrt{0.269(0.731)/26} = 0.269 \pm 0.170 = (0.099, 0.439)$.
For Placebo: $0.174 \pm 1.96\sqrt{0.174(0.826)/23} = 0.174 \pm 0.155 = (0.019, 0.329)$.
We can also compare two groups, say Desipramine and Lithium:
$$(0.600 - 0.269) \pm 1.96\sqrt{\frac{0.600(0.400)}{25} + \frac{0.269(0.731)}{26}} = 0.331 \pm 0.257 = (0.074, 0.588).$$
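These intervals can be recomputed with a short script. It is a sketch under two assumptions: the multiplier is the usual 95% value z* = 1.96, and the success counts are 15 of 25, 7 of 26 and 4 of 23, as implied by the proportions on the slide. The function names are illustrative.

```python
import math

Z_95 = 1.96  # assumed critical value for 95% confidence


def proportion_ci(successes, n, z=Z_95):
    """Large-sample confidence interval for a single proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def difference_ci(s1, n1, s2, n2, z=Z_95):
    """Large-sample confidence interval for the difference of two proportions."""
    p1, p2 = s1 / n1, s2 / n2
    margin = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


for name, successes, n in [("Desipramine", 15, 25), ("Lithium", 7, 26), ("Placebo", 4, 23)]:
    low, high = proportion_ci(successes, n)
    print(f"{name}: ({low:.3f}, {high:.3f})")

low, high = difference_ci(15, 25, 7, 26)
print(f"Desipramine - Lithium: ({low:.3f}, {high:.3f})")
```

Small differences in the third decimal place relative to the slide come from the slide rounding the proportions before computing the margin.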
23 Example 5: Left-handed students
Does right/left-handedness of students vary between genders? We will test H0: there is no relationship between handedness and gender vs. Ha: there is some relationship, at significance level α. From the class survey, we get the summary in the table at right. In StatCrunch use: Stat-Tables-Contingency-with data. The results of a chi-square test show that the P-value is 4.33%. Since this is bigger than α, we do not reject H0. There is insufficient evidence to conclude that handedness differs by gender. Note: only 407 students responded to both questions.
24 Example 6: Parental smoking Does parental smoking influence the smoking habits of their high school children? High school students were asked whether they smoke and whether their parents smoke. In StatCrunch, request row percents (if the column variable is a response) and column percents (if the row variable is a response). The proportion of students who smoke, among those for whom both parents smoke, is 400/1780 = 22.47%. The proportion of students for whom both parents smoke, among those who smoke, is 400/1004 = 39.84%. The percent of students who smoke is greatest when both parents smoke and least when neither parent smokes (22% vs. 14%). If a student smokes, it is more likely that both parents do (40%) than that neither parent does (19%).
25 Experimental designs for two-way tables
The chi-square test is an overall technique for determining evidence of a relationship between two categorical variables. There are two cases.
- Compare category proportions for several populations. A simple random sample is selected from each population and a single categorical variable is observed. The cocaine addiction study is an example of this. The populations are the 3 treatments. An SRS was obtained for each treatment. One response was observed.
- Test independence of two categorical variables in a single population. A single random sample is obtained from the population and each individual is classified according to the two categorical variables. The handedness survey is an example of this. There was one random sample from the population and each student responded to the two categorical questions.
We use the χ² test to test the null hypothesis of no relationship in both cases.
26 Review: The chi-square test statistic
The chi-square statistic (χ²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. Summing over all r × c cells in the table (r and c are the number of rows and columns), the formula for the χ² statistic is
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
Beware: the denominators are expected counts. Do not include the margins in the sum.
Large values of χ² represent strong deviations from the distribution expected under H0 and provide evidence against H0. How large a χ² value is required for statistical significance depends on its degrees of freedom, df = (r − 1)(c − 1).
27 Summary of testing for association
This test pertains to association between two categorical variables only. The hypotheses are:
H0: the variables are not associated vs. Ha: the variables are associated.
The data are summarized in an r × c table with cells containing the observed counts for each combination of categories of the two variables. r is the number of rows (categories of the 1st variable) and c is the number of columns (categories of the 2nd variable).
The expected count for each cell in the table is computed by
$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$
The test statistic is
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
The P-value is computed from the chi-square distribution as the area to the right of the χ² statistic, with df = (r − 1)(c − 1).
28 When is it safe to use a χ² test?
Like the z-tests for proportions, the chi-square test is based on an approximation. We can safely use the test when:
- The samples are simple random samples (SRS).
- All but one or two individual observed counts are 1 or more.
- All expected counts are 5 or more, except perhaps one, which should be at least 1.
- For a 2 × 2 table, all four expected counts should be 5 or more.
If the approximation is not appropriate, a statistician should be consulted for the proper procedure.
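The expected-count conditions above can be turned into a quick check. The sketch below covers only the two rules about expected counts (it does not check the sampling design or the observed-count rule), and the function name `expected_counts_ok` is made up for illustration. The example table holds the unrounded expected counts from the cocaine addiction study.

```python
def expected_counts_ok(expected):
    """Check the expected-count rules of thumb for the chi-square approximation.

    2 x 2 table: all four expected counts must be 5 or more.
    Larger tables: all expected counts 5 or more, except perhaps one,
    which must still be at least 1.
    """
    cells = [e for row in expected for e in row]
    if len(expected) == 2 and len(expected[0]) == 2:
        return all(e >= 5 for e in cells)
    small = [e for e in cells if e < 5]
    return len(small) == 0 or (len(small) == 1 and small[0] >= 1)


# Expected counts from the cocaine addiction example (3 treatments x 2 outcomes).
cocaine_expected = [[25 * 26 / 74, 25 * 48 / 74],
                    [26 * 26 / 74, 26 * 48 / 74],
                    [23 * 26 / 74, 23 * 48 / 74]]
print(expected_counts_ok(cocaine_expected))
```

When the check fails, the slide's advice applies: a different procedure is needed rather than the chi-square approximation.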
29 Accompanying problems associated with this Chapter Quiz Homework 9
CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency
More informationIntroduction. Hypothesis Testing. Hypothesis Testing. Significance Testing
Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters
More informationData Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools
Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................
More informationStatistical tests for SPSS
Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly
More information12.5: CHI-SQUARE GOODNESS OF FIT TESTS
125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability
More informationIndependent samples t-test. Dr. Tom Pierce Radford University
Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of
More informationIndependent t- Test (Comparing Two Means)
Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent
More informationMind on Statistics. Chapter 12
Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference
More informationp ˆ (sample mean and sample
Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics
More informationStudy Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Sample Practice problems - chapter 12-1 and 2 proportions for inference - Z Distributions Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
More information8 6 X 2 Test for a Variance or Standard Deviation
Section 8 6 x 2 Test for a Variance or Standard Deviation 437 This test uses the P-value method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion
More informationABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
More informationHYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationHypothesis testing - Steps
Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationExperimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test
Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely
More informationC. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.
Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample
More informationChapter 2. Hypothesis testing in one population
Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance
More informationSTAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013
STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico Fall 2013 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN. Conditions for Inference about mean
More informationProjects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
More informationGood luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:
Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours
More informationNon-Parametric Tests (I)
Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent
More informationindividualdifferences
1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,
More informationPoint and Interval Estimates
Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number
More informationTwo-Sample T-Tests Allowing Unequal Variance (Enter Difference)
Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption
More informationComparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples
Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The
More informationUsing Stata for Categorical Data Analysis
Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,
More informationLesson 3: Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables
Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables Classwork Example 1 Students at Rufus King High School were discussing some of the challenges of finding space for
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationAnalysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
More informationCHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS
CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More informationChapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used
More informationCOMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
More informationComparing Means in Two Populations
Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we
More informationTwo-sample inference: Continuous data
Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As
More informationSection 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
More informationGeneral Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.
General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n
More informationWeek 4: Standard Error and Confidence Intervals
Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.
More informationSCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES
SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR
More information"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1
BASIC STATISTICAL THEORY / 3 CHAPTER ONE BASIC STATISTICAL THEORY "Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1 Medicine
More information