Confidence intervals



Similar documents
Two Related Samples t Test

Hypothesis testing - Steps

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Confidence Intervals for the Difference Between Two Means

Math 251, Review Questions for Test 3 Rough Answers

Two-sample inference: Continuous data

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Simple Regression Theory II 2010 Samuel L. Baker

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Two-sample hypothesis testing, II /16/2004

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

HYPOTHESIS TESTING: POWER OF THE TEST

Coefficient of Determination

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

4. Continuous Random Variables, the Pareto and Normal Distributions

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Chapter 7 - Practice Problems 1

Unit 26 Estimation with Confidence Intervals

AP STATISTICS (Warm-Up Exercises)

Using Excel for inferential statistics

5.1 Identifying the Target Parameter

Point and Interval Estimates

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Estimation of σ 2, the variance of ɛ

Lesson 9 Hypothesis Testing

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

The Normal distribution

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

Odds ratio, Odds ratio test for independence, chi-squared statistic.

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Normal distribution. ) 2 /2σ. 2π σ

Lecture Notes Module 1

Objectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)

Estimation and Confidence Intervals

1. How different is the t distribution from the normal?

Testing Research and Statistical Hypotheses

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Section 13, Part 1 ANOVA. Analysis Of Variance

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Vieta s Formulas and the Identity Theorem

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Difference of Means and ANOVA Problems

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

Describing Populations Statistically: The Mean, Variance, and Standard Deviation

3.4 Statistical inference for 2 populations based on two samples

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Stats Review Chapters 9-10

Lesson 7 Z-Scores and Probability

p ˆ (sample mean and sample

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

Chi Square Tests. Chapter Introduction

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

Simple Linear Regression

6.4 Normal Distribution

The normal approximation to the binomial

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

Math 108 Exam 3 Solutions Spring 00

Probability Distributions

November 08, S8.6_3 Testing a Claim About a Standard Deviation or Variance

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Descriptive Statistics

Chapter 7 Section 1 Homework Set A

Name: (b) Find the minimum sample size you should use in order for your estimate to be within 0.03 of p when the confidence level is 95%.

Simple Inventory Management

1-3 id id no. of respondents respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank

Chapter 7 - Practice Problems 2

Recall this chart that showed how most of our course would be organized:

The normal approximation to the binomial

Lecture 8. Confidence intervals and the central limit theorem

MATH 13150: Freshman Seminar Unit 10

Easy Casino Profits. Congratulations!!

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Statistics Review PSY379

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

The Dummy s Guide to Data Analysis Using SPSS

Independent samples t-test. Dr. Tom Pierce Radford University

Association Between Variables

1) The table lists the smoking habits of a group of college students. Answer: 0.218

Chapter Study Guide. Chapter 11 Confidence Intervals and Hypothesis Testing for Means

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Introduction to Hypothesis Testing OPRE 6301

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Hypothesis Testing: Two Means, Paired Data, Two Proportions

7 Confidence Intervals

Week 3&4: Z tables and the Sampling Distribution of X

6.2 Normal distribution. Standard Normal Distribution:

Need for Sampling. Very large populations Destructive testing Continuous production process

Transcription:

Confidence intervals Today, we re going to start talking about confidence intervals. We use confidence intervals as a tool in inferential statistics. What this means is that given some sample statistics, we can say whether or not the actual population parameter is likely to be a certain value. For example, if we have data from a survey that says 52% of voters prefer Al Gore and 47% of voters prefer George Bush, we can use confidence intervals to decide whether or not Al Gore is in fact more likely to win an election (i.e. if the true population parameter is greater than 50%). Confidence Intervals (corrected) 1

Sample statistics are point estimates To reinforce this idea, let s think about this hypothetical survey. From the survey we know that 52% of the sample respondents prefer Al Gore. This is our best available point estimate of the population parameter. However, we know that a sample doesn t include the whole population at best, we have several thousand out of around 200 million possible voters. It s possible, by sheer chance, that we have surveyed a sample that is atypical of the population, even if we have a completely random sample. So, in addition to our point estimates, we need to say how confident we are that we have a representative sample. Confidence Intervals (corrected) 2

What this is about (continued) That, in a nutshell, is what confidence intervals are about. We re mainly going to talk about the sample and population means, but this also works for other population parameters (such as the standard deviation). Confidence Intervals (corrected) 3

Confidence intervals for a population If we know the standard deviation of the population, confidence intervals for a sample are pretty easy to calculate. CI = x ± (Z α/2 )( σ n ) This is the confidence interval for the population mean of x when we know the population standard deviation (σ). There are four important parts of the formula; we ll talk about each of them in turn. Confidence Intervals (corrected) 4

Standard error of the mean One important part is σ/ n, which we call the standard error of the mean. The standard error of the mean is a measurement of how much error results from the size of our sample. The bigger our sample, the less likely we are to accidentally end up with an unrepresentative sample, and therefore the standard error of the mean will be smaller. Conversely, if we have a small sample, we expect the chances of us having a bad sample to be higher, so the standard error will be bigger. Confidence Intervals (corrected) 5

Standard error example For example, if σ = 1.3 and n = 144, the standard error of the mean will be: σ = 1.3 = 1.3 n 144 12 = 1.0833 Confidence Intervals (corrected) 6

Alpha and confidence levels The next part of the formula we need to discuss is the confidence level. The confidence level represents how willing we are to accidentally report a mistake. This is represented by Z α/2 in the equation. The Greek letter α actually represents something called the alpha level. Normally, we use values of α = 0.05 or α = 0.01, which mean we are willing to be wrong five out of every hundred times, or one out of every hundred times, respectively. Confidence Intervals (corrected) 7

Confidence levels Normally, we instead of talking about α directly, we state our confidence level in terms of our chances of not making a mistake. α =.05 corresponds to a 95% confidence level, while α =.01 corresponds to a 99% confidence level. Note that you can use other confidence levels, but people will probably look at you funny if you do. Anything lower than 95% is generally considered suspect, and above 99% is generally considered showing off. Confidence Intervals (corrected) 8

Z scores for confidence levels Now, the letter Z doesn t appear here by coincidence. In fact, we use the same Z tables that we learned how to use back in Chapter 6. However, since we normally only care about 95% and 99% confidence levels, we only have to worry about two Z values (we call them critical values ): Z.05/2 = Z.025 = 1.96 and Z 0.01/2 = Z.005 = 2.58 You can look these up using the small part column in Appendix 1 if you like. Confidence Intervals (corrected) 9

Confidence limits The confidence limits are simply the product of the standard error of the mean and the Z score for a given confidence level.more formally: e = (Z α/2 )( σ n ) e represents the confidence limit for a given α, σ and n. Confidence Intervals (corrected) 10

Confidence limits (example) From our example above, we can figure out the confidence limits for the sample mean, given n = 144 and σ = 1.3, as before, for α =.05: (Z α/2 )( σ n ) = Z.025 1.3 144 = 1.96 1.3 12 = 1.96 1.0833 = 0.21333 Confidence Intervals (corrected) 11

Confidence limits (example - cont d) With α =.01: (Z α/2 )( σ n ) = Z.005 1.3 144 = 2.58 1.3 12 = 2.58 1.0833 = 0.2795 Note that the interval is bigger with the smaller alpha. Confidence Intervals (corrected) 12

On to the confidence interval Once we have our confidence limits, all we need to do is figure out our confidence interval: CI = x ± (Z α/2 )( σ n ) Confidence Intervals (corrected) 13

Examples If x = 0.20, we can find the confidence intervals for α =.05: CI = x ± (Z α/2 )( σ n ) = 0.20 ± 0.168 = (0.032 < µ < 0.368) When α =.01: CI = x ± (Z α/2 )( σ n ) = 0.20 ± 0.2795 = ( 0.0795 < µ < 0.4795) Confidence Intervals (corrected) 14

So what does this mean? What this means is that the population mean is somewhere between 0.032 and 0.368, with a 95% confidence level, and between 0.0795 and 0.4795, with a 99% confidence level.to put it another way, we believe, with the given margin of error, that the mean of the population is somewhere between those values, and that we d get a result between those values if we looked at the entire population 95% (or 99%) of the time. By the way, margin of error is another way to say confidence limit. So when you see on the news that President Bush s approval rating is 68%, with a 4% margin of error, this means that they are confident that Bush s true approval rating is between 64% and 72%. Confidence Intervals (corrected) 15

Minimum sample sizes From the confidence limit formula, we can derive the minimum n (sample size) required to have a given confidence level. That is, how many items out of the population must we look at to be confident that the population parameter is close to the sample statistic? To do this, we rearrange the confidence limit formula and get: n = ( (Z α/2)σ ) 2 e Confidence Intervals (corrected) 16

Example So, for example, if we want to know the average age of Ole Miss students within a month (1/12 of a year, or.00833 years), with a 95% confidence level (α =.05), and we know that the standard deviation σ = 1.8: n = ( (Z α/2)σ ) 2 1.96 1.8 = ( e 0.0833 )2 = ( 3.528 0.0833 )2 = (32.566) 2 = 1060.6 This means we would need to sample 1061 students to determine this information with a 95% confidence level. (Note that sometimes n > N!). Confidence Intervals (corrected) 17

What if we don t know the population std. deviation? Most of the time, we don t already know the population standard deviation (σ); in fact, to figure it out, we would have collected enough information to get the population mean µ as well. However, there is an analogous formula we can use with the sample standard deviation, s: CI = x ± (t α/2 )( s n ) Note that as well as changing σ to s, we ve also changed Z into t. This means we have to learn a new distribution function. Confidence Intervals (corrected) 18

Student s t distribution The new distribution is called the Student s t distribution. (It was actually invented by William Gosset and employed to monitor the production of Guinness.) Student s t distribution is a lot like the standard normal curve, except that it is designed specifically to deal with the problem of small samples. (For large samples, where n > 200 or so, the t distribution and Z distribution become the same.) Confidence Intervals (corrected) 19

A new table; degrees of freedom The t distribution is illustrated in Appendix 2 of Schacht. Note that this table is different than the table in Appendix 1 in two ways: only certain alpha levels for t are shown across the top, and down the left-hand column are a set of values called df or degrees of freedom. Degrees of freedom have to do with what mathematicians call a free parameters problem (basically, it has to do with how many of the parameters we can estimate based on the data). Since we re not mathematicians, all we need to really worry about is that df = n 1 for the t distribution. Confidence Intervals (corrected) 20

Finding t values So, for any given α and df (or n), we can find the correct t value. For example, let s find the t value for α =.05 and 18 degrees of freedom. (Look under the two-tailed test column.) t crit = 2.101. Note that if the number of degrees of freedom doesn t appear in the table, use the next-lowest number that appears. So, for example, the critical value for α =.01 and df = 53 is 2.704. Of course, if you have more than about 200 degrees of freedom, you should use the row, which has the Z scores for the appropriate alpha levels. Confidence Intervals (corrected) 21

Let s work an example Now that we know the t distribution, we can figure out some confidence intervals. Let s assume we have a sample of 20 political science undergrads at Ole Miss. Their mean IQ x = 107 with a standard deviation s = 17. What is the confidence interval for all political science undergrads at Ole Miss, with a 95% confidence level? Since n = 20, we have n 1 = 19 degrees of freedom. The critical value for t with α =.05 and df = 19 is 2.093. Now, all we have to do is substitute into the formula: Confidence Intervals (corrected) 22

Plug and chug CI = x ± (t α/2 )( s n ) = 107 ± (2.093)( 17 20 ) = 107 ± (2.093) 17 4.47 = 107 ± (2.093)(3.081) = 107 ± 7.956 = 99.04 < µ < 114.96 Confidence Intervals (corrected) 23

Homework Answer questions 1 and 3 on pages 136 7. Prepare for a quiz next Wednesday on the material from chapters 6 8. Confidence Intervals (corrected) 24