7.1 Inference for comparing means of two populations

1 Objectives: 7.1 Inference for comparing means of two populations. Matched pair t confidence interval. Matched pair t hypothesis test.

2 Overview of what is to come. We have covered the difficult bits of the course: the part where we need to do extensive calculations. In terms of inference, what have we done? We constructed confidence intervals for locating the population mean, and we carried out statistical tests. From now on we will do much the same thing. The only difference is that the data sets will appear more complex, but we will still be constructing confidence intervals and testing hypotheses. What changes is that we need to identify the appropriate methodology/procedure given the data and how it was collected. The calculations become more difficult, but we don't do them! We make the computer do them instead. Our role is to understand every single part of the computer output and the assumptions used to do all the calculations.

3 As before, the standard error is a vital ingredient and will be used to measure uncertainty. To construct CIs, just do as before. To do tests via the t-transform, just do as before. Recall $t = \frac{\text{estimate} - \text{mean in the null hypothesis}}{\text{standard error}}$. The t-transform measures the number of standard errors between the estimated mean and the null mean. It is a measure of distance: the greater the distance, the more infeasible the null. Important: continue to make plots of the normal and t-distribution, using the results from the output. This will help you to check that what you are doing makes sense.
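The two recipes recalled on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers passed to them are hypothetical, and the critical value is assumed to be read from a t-table or from software output.

```python
def t_transform(estimate, null_mean, standard_error):
    """Number of standard errors between the estimate and the null mean."""
    return (estimate - null_mean) / standard_error


def confidence_interval(estimate, critical_value, standard_error):
    """Estimate plus or minus (critical value times standard error)."""
    margin = critical_value * standard_error
    return (estimate - margin, estimate + margin)


# Hypothetical numbers, chosen only to show the arithmetic.
t = t_transform(estimate=5.0, null_mean=3.0, standard_error=0.8)
low, high = confidence_interval(estimate=5.0, critical_value=2.0, standard_error=0.8)
print(t, low, high)
```

The same two functions apply unchanged to matched pair data once the estimate is the mean of the differences.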

4 New types of data: Comparative inference. In most statistical procedures, the objective is to make comparisons. For example: Does tuition lead to higher grades? Does eating healthy food lead to longer life expectancy? Etc. How does one design the experiment in order to test such hypotheses? In matched pair studies, subjects are matched and comparisons are made within each pairing. Examples: A patient before and after a treatment. Assessing whether a diet worked by weighing a person before and after. Studies involving twins, with each twin given a different regime. Using matched studies to make comparisons is an extremely useful method for reducing confounding in a design. Twin in space: -.VRriQGR4qlw

5 Matched paired data. We are given a data set. If there is a clear matching in the data, then we need to use a matched pair procedure. We determine whether there is a matching by understanding how the data was collected. Once we determine there is matching, we need to understand the statistical output of the matched pair procedure. Examples of matched data we will consider in this chapter: The effect red wine has on polyphenol levels (wait a minute, we already considered this data). The influence that a full moon has on certain patients. The differences between running at high and low altitude. Whether Friday the 13th changes behavior. The weight of baby calves. The questions asked above are answered by collecting matched data. In an exam you will be asked to identify matched data.

6 Example 1: Red wine and polyphenol levels in blood. We have already come across this example in Chapters 6 and 7. We used a one-sample method to analyze this data, but only after processing: the data we used were the differences between the levels before and after taking red wine. The 'raw' data is simply the polyphenol levels before taking red wine and after taking red wine. It is very natural to consider the difference, as it gives the increase/decrease in polyphenol after the treatment. The matched pair methods we discuss in this chapter are identical to the one-sample methods discussed in Chapters 6 and 7, applied after the differences are taken. Our job is to understand that differences need to be taken. In Chapter 10 we will consider data where there is no natural pairing; we need to be sure we do not confuse the two different types of data.
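To make the point concrete, here is a minimal sketch showing that a matched pair t-statistic is just the one-sample t-statistic computed on the differences. The before/after numbers are hypothetical (they are not the polyphenol data), and the helper names are made up for illustration.

```python
import math
import statistics


def one_sample_t(values, null_mean=0.0):
    """t-transform of the sample mean against a null mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - null_mean) / standard_error


def paired_t(after, before):
    """Matched pair t: take the differences, then do a one-sample t."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, null_mean=0.0)


# Hypothetical measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [13, 15, 10, 15, 16]

differences = [a - b for a, b in zip(after, before)]
t_paired = paired_t(after, before)
t_on_differences = one_sample_t(differences)
print(differences, t_paired, t_on_differences)
```

Both calls give the same t-value, which mirrors what the two software outputs on the next slide show.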

7 The statistical output: CI and tests. Above are the 95% CI and tests for the mean. The top output is produced by telling the computer that the data is paired (see demo). The lower output is produced by manually taking differences and then constructing a CI and test on the differences. We observe that the outputs are identical. In the next slide we review what the output is telling us.

8 Review of polyphenol output. The output on the left is the CI for where we believe the mean difference after taking red wine should lie. We have 95% confidence it lies in [2.6, 5.99]. This CI is far above zero, suggesting that mean levels increase. We formally test this in the next output. The output on the right tests $H_0: \mu_A - \mu_B \le 0$ against $H_A: \mu_A - \mu_B > 0$ (i.e., taking red wine increases polyphenol levels). The p-value is very small, less than 0.1%. As this is far smaller than 5%, there is strong evidence that the mean level of polyphenol increases with wine consumption. Observe that from this one-sided test we can also deduce that if we were to test $H_0: \mu_A - \mu_B \ge 0$ against $H_A: \mu_A - \mu_B < 0$ (i.e., taking red wine decreases polyphenol levels), then the p-value would be greater than 99.9%; thus there is no evidence that polyphenol decreases with wine consumption.

9 Example 2: Does a full moon have an influence on behavior? We want to investigate whether aggressive dementia patients tend to be more aggressive when there is a full moon. The behavior of 15 disruptive dementia patients was studied to see if there is any evidence of this. For each patient, the average number of disruptive events on full moon days and on other days was recorded. The data is on the right. The raw numbers do not contain the information on being 'more disruptive'. Instead one should consider the difference within each of the pairs. It is the difference that actually contains the information on whether the individuals are more or less disruptive. In addition, by taking differences we are 'factoring out' some of the natural variability in aggressive behavior between patients.

10 The hypothesis of interest and output. We want to test whether the full moon made people more disruptive. First we set notation: let $\mu_N$ = the mean number of disruptive events (no full moon) and let $\mu_F$ = the mean number of disruptive events under a full moon. We conjecture that there are more disruptive events during a full moon, so we are testing $H_0: \mu_F - \mu_N \le 0$ against $H_A: \mu_F - \mu_N > 0$. We see that the p-value is extremely small, thus there is strong evidence to reject the null. Understanding the output: the t-value is calculated as $t = \frac{\bar{x}_{\text{diff}} - 0}{\text{standard error}} = 6.71$. Using the same output we can calculate the 95% CI: $[2.3 \pm 0.72] = [1.58, 3.02]$.

11 Lab Practice I (moon example). Load the moon data into StatCrunch. We can look at the differences by going to Data -> Compute Data -> Expression -> Build (now compute the expression by clicking on the correct expression). Just by looking at the differences it is clear that, for this data set, patients are more disruptive during the full moon. Now we want to see whether we can infer that in general aggressive dementia patients are more disruptive during the full moon. To do this we see how likely it is to get these mainly positive differences by random chance alone. This is the p-value. If it turns out to be large, the data is consistent with these numbers arising by random chance and there is no evidence of an actual difference in behavior. To do the test and construct CIs, go to Stat -> T-statistics -> Paired and select either the test of interest or the desired CI.
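The idea of "mainly positive differences by random chance alone" can be illustrated with a small simulation. Note that this is a sign-flip randomization sketch, not the t-test that StatCrunch runs: if there were truly no difference, each paired difference would be equally likely to be positive or negative, so we flip signs at random and count how often the mean is at least as large as the observed one. The differences below are hypothetical, not the moon data.

```python
import random


def sign_flip_p_value(differences, reps=10_000, seed=0):
    """Approximate one-sided p-value by randomly flipping the signs of the differences."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(differences)
    observed_mean = sum(differences) / n
    at_least_as_large = 0
    for _ in range(reps):
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        # Small tolerance guards against floating-point ties.
        if sum(flipped) / n >= observed_mean - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / reps


# Hypothetical, mainly positive differences.
diffs = [2.1, 3.0, 1.2, 2.8, -0.4, 3.5, 1.9, 2.6]
p = sign_flip_p_value(diffs)
print(p)
```

A small value says that mainly positive differences of this size rarely arise from chance alone, which is the same reading we give to the p-value in the paired t output.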

12 Checking reliability of output. Now we want to see whether the calculations are reliable. We always assume the sample is a simple random sample. In this example, this assumption is a bit dubious, as the patients selected were the ones who were the most disruptive. Therefore we can only draw inference on the population of disruptive patients. As the sample size is quite small (18), we should check that the differences do not deviate too much from normality. A histogram and QQ plot are given below.

13 Observations: The data is numerical continuous (average number of disruptive events per person). The histogram of the differences does not look very bell shaped and the points on the QQ plot don't fall on the line. The data is not very close to normal, but there isn't any clear skew, which is the main reason the CLT can fail to kick in even for relatively large sample sizes. The sample size of 18 is below the rule of thumb of 30. However, the simulation of the sampling distribution using the applet shows that the sample mean based on 18 observations will be quite close to normal.

14 The above observations imply that the sample mean will be approximately normal. Therefore, the p-value that was calculated using the t-distribution (remember we only use the t because the standard deviation is unknown) is relatively close to the truth. Regardless of normality of the sample mean, the sample mean is 6.71 standard errors from the null (which is a huge difference); the t-transform is so large that the p-value is very small regardless of what the actual distribution is. Based on this there is overwhelming evidence that the behavior on full moon days will be different from other days. Recommendations: As a consequence of this study, the nursing home may want to bring in more staff on full moon days. As the average number of additional disruptive events on a full moon is in [1.58, 3.02], the nursing home may want to use this to calculate the number of extra staff to bring on duty during the full moon.

15 Example 3: Running at different altitudes. It is usually believed that people's running times at high altitude are worse than their running times at sea level. We want to check this assertion. Data collection: 12 runners are asked to run the same distance at both sea level and high altitude, and their running times are recorded. The data is given on the right. Since the same runner is used for both the high and low altitude, there is clear matching in the data. Also observe that most of the differences are positive: for most of the runners we see an increase in time.

16 The output and using it for testing. The 95% CI for the difference is given below. We are interested in understanding whether running at a high altitude increases running time, so the hypothesis of interest is $H_0: \mu_H - \mu_L \le 0$ against $H_A: \mu_H - \mu_L > 0$. Suppose we do the test at the 1% level. The t-transform is $t = \frac{\bar{x}_{\text{diff}} - 0}{\text{standard error}} = 3.88$. The p-value is the area to the RIGHT of 3.88. Looking up the tables, we see that the area to the right of 3.88 is at most 0.25%. Thus the p-value is less than 0.25%, which is below the 1% level. The p-value can be calculated exactly by doing the test. The results from three hypothesis test outputs are on the next slide.

17 One and two sided tests: The output

18 The output corresponding to the hypothesis test of interest is the last one and gives the p-value 0.13%. However, for the purpose of an examination you should be able to deduce the correct p-value from any of the three outputs. The first output corresponds to the opposite hypothesis, where we are interested in seeing if there is evidence to suggest that at high altitude we run faster. The p-value for this test is 99.87%; this is the area to the left of 3.88. Therefore the p-value we are interested in is the area to the right of 3.88, which is 100% - 99.87% = 0.13%. The middle output is the two-sided test, that is, that the mean running times at sea level and high altitude are different. The p-value is 0.25%, which is (up to rounding) double our p-value.
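The deductions on this slide are simple arithmetic on tail areas, sketched below. The function name is hypothetical; the only input taken from the slide is the 99.87% left-tail p-value. The two-sided value comes out as 0.26% here rather than the reported 0.25%, because the software doubles the unrounded one-sided p-value.

```python
def other_tail_p_values(left_tail_p):
    """Given the area to the left of t, return (right-tail p, two-sided p)."""
    right_tail_p = 1.0 - left_tail_p
    two_sided_p = 2.0 * min(left_tail_p, right_tail_p)
    return right_tail_p, two_sided_p


# Left-tail p-value from the first output (99.87%).
right, two_sided = other_tail_p_values(0.9987)
print(f"right-tail p = {right:.4f}, two-sided p = {two_sided:.4f}")
```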

19 Example 4: Does Friday the 13th increase accidents? To answer this question, the number of accidents on 6 consecutive Friday the 13ths (during the early 1990s) was collected. A comparison is required in which all the factors are the same except for the 13th. Thus the data is compared with the number of accidents that happened on the previous Friday, the 6th. It is not immediately obvious, but there is matching in this data. This is because Friday the 6th and the following Friday the 13th share similar factors apart from the date (for example, more accidents tend to happen during July/August, which would increase both values). There is a dependence between them, driven by these common factors.

20 Hypothesis of interest and the test. To see whether there is evidence that accidents have increased, our hypothesis of interest is $H_0: \mu_{13} - \mu_6 \le 0$ against $H_A: \mu_{13} - \mu_6 > 0$. We see that the p-value is about 2.11%. This is less than 5%, so we can reject the null at the 5% level and conclude that Friday the 13th tends to increase accidents. Notes of caution: The data is clearly not normally distributed (it is numerical discrete) and the sample size is very small (n = 6). Therefore the p-value is unlikely to be that reliable. As it is relatively close to the boundary of 5%, we need to be cautious in interpreting the full significance of this result, i.e., the true p-value may be over 5%.

21 More examples: Compare the weights of calves at different weeks. Load the calf data into StatCrunch and compare their weights at different weeks. The data is clearly matched, because the same calf is followed over a few weeks. Moreover, in the scatterplot of week 0.5 against week 1 we see a clear linear trend. This shows that there is a clear matching between the weights (notice that calves that are heavier at week 0.5 also tend to be heavier at week 1). Based on the above, if we want to compare the weights at different weeks we need to use a matched pair procedure. Do this!

22 Summary: matched pair procedures. Sometimes we want to compare treatments or conditions at the individual level. These situations produce two samples that are not independent; they are related to each other. The subjects of one sample are identical to, or matched (paired) with, the subjects of the other sample. Example: Pre-test and post-test studies look at data collected on the same subjects before and after some treatment is performed. Example: Twin studies often try to sort out the influence of genetic factors by comparing a variable between sets of twins. Example: Using people matched for age, sex, and education in social studies helps to cancel out the effects of these potentially relevant variables. Except for pre/post studies, subjects should be randomized, that is, assigned to the samples at random within each pair; this is not possible in observational studies.

23 For data from a matched pair design, we use the observed differences $X_{\text{difference}} = X_1 - X_2$ to test the difference in the two population means. The hypotheses can then be expressed as $H_0: \mu_{\text{difference}} = 0$; $H_a: \mu_{\text{difference}} > 0$ (or $< 0$, or $\ne 0$). You will need to decide what test to apply to the data. In Chapter 10 we will cover the independent sample t-test. This tests the same hypothesis, but there is no matching in the data, so a different procedure is used. Based on how the data was collected, you should be able to decide which test to use when.

24 Calculation practice: Does no caffeine increase depression? Individuals diagnosed as caffeine-dependent were deprived of caffeine-rich foods and assigned pills for 10 days. Sometimes the pills contained caffeine and other times they contained a placebo. A depression score was determined separately for the caffeine pills (as a whole) and for the placebo pills. There are 2 data points for each subject, but we only look at the difference. We calculate that $\bar{x}_{\text{diff}} = 7.36$, $s_{\text{diff}} = 6.92$, df = 10 (so n = 11). We test $H_0: \mu_{\text{difference}} = 0$ against $H_a: \mu_{\text{difference}} > 0$ at significance level $\alpha$. Why is a one-sided test ok? $t = \frac{\bar{x}_{\text{diff}} - 0}{s_{\text{diff}}/\sqrt{n}} = \frac{7.36}{6.92/\sqrt{11}} \approx 3.53$. From the t-distribution: P-value = .0027, which is quite small, in fact smaller than $\alpha$. [Data table; columns: Subject, Depression with Caffeine, Depression with Placebo, Placebo - Caffeine.] Depression is greater with the placebo than with the caffeine pills, on average.
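The t-value in this practice problem can be checked directly from the summary statistics. The sketch below uses only the numbers given on the slide (mean difference 7.36, standard deviation 6.92, df = 10); the variable names are illustrative, and the p-value itself would still come from a t-table or from software.

```python
import math

mean_diff = 7.36   # mean of the (placebo - caffeine) differences
sd_diff = 6.92     # standard deviation of the differences
df = 10            # degrees of freedom given on the slide
n = df + 1         # for a one-sample t on differences, df = n - 1

standard_error = sd_diff / math.sqrt(n)
t_value = (mean_diff - 0) / standard_error

print(f"standard error = {standard_error:.3f}, t = {t_value:.2f}")
```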

25 Accompanying problems associated with this Chapter Quiz 13 Homework 6 (Q3) Homework 8 (Q1)


More information

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance 2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

Unit 27: Comparing Two Means

Unit 27: Comparing Two Means Unit 27: Comparing Two Means Prerequisites Students should have experience with one-sample t-procedures before they begin this unit. That material is covered in Unit 26, Small Sample Inference for One

More information

Independent samples t-test. Dr. Tom Pierce Radford University

Independent samples t-test. Dr. Tom Pierce Radford University Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

Hypothesis Testing. Steps for a hypothesis test:

Hypothesis Testing. Steps for a hypothesis test: Hypothesis Testing Steps for a hypothesis test: 1. State the claim H 0 and the alternative, H a 2. Choose a significance level or use the given one. 3. Draw the sampling distribution based on the assumption

More information

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice! Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!) Part A - Multiple Choice Indicate the best choice

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test The t-test Outline Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test - Dependent (related) groups t-test - Independent (unrelated) groups t-test Comparing means Correlation

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

6: Introduction to Hypothesis Testing

6: Introduction to Hypothesis Testing 6: Introduction to Hypothesis Testing Significance testing is used to help make a judgment about a claim by addressing the question, Can the observed difference be attributed to chance? We break up significance

More information

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate

More information

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so: Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

More information

Two Related Samples t Test

Two Related Samples t Test Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Online 12 - Sections 9.1 and 9.2-Doug Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

More information

AP STATISTICS (Warm-Up Exercises)

AP STATISTICS (Warm-Up Exercises) AP STATISTICS (Warm-Up Exercises) 1. Describe the distribution of ages in a city: 2. Graph a box plot on your calculator for the following test scores: {90, 80, 96, 54, 80, 95, 100, 75, 87, 62, 65, 85,

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Principles of Hypothesis Testing for Public Health

Principles of Hypothesis Testing for Public Health Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Normal distribution. ) 2 /2σ. 2π σ

Normal distribution. ) 2 /2σ. 2π σ Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a

More information

Midterm Review Problems

Midterm Review Problems Midterm Review Problems October 19, 2013 1. Consider the following research title: Cooperation among nursery school children under two types of instruction. In this study, what is the independent variable?

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Unit 31: One-Way ANOVA

Unit 31: One-Way ANOVA Unit 31: One-Way ANOVA Summary of Video A vase filled with coins takes center stage as the video begins. Students will be taking part in an experiment organized by psychology professor John Kelly in which

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

Lecture 2. Summarizing the Sample

Lecture 2. Summarizing the Sample Lecture 2 Summarizing the Sample WARNING: Today s lecture may bore some of you It s (sort of) not my fault I m required to teach you about what we re going to cover today. I ll try to make it as exciting

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

How To Compare Birds To Other Birds

How To Compare Birds To Other Birds STT 430/630/ES 760 Lecture Notes: Chapter 7: Two-Sample Inference 1 February 27, 2009 Chapter 7: Two Sample Inference Chapter 6 introduced hypothesis testing in the one-sample setting: one sample is obtained

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

The Wilcoxon Rank-Sum Test

The Wilcoxon Rank-Sum Test 1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing OPRE 6301 Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1 Hypothesis testing So far, we ve talked about inference from the point of estimation. We ve tried to answer questions like What is a good estimate for a typical value? or How much variability is there

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information