11. CONFIDENCE INTERVALS FOR THE MEAN; KNOWN VARIANCE


We assume here that the population variance σ² is known. This is an unrealistic assumption, but it allows us to give a simplified presentation which reveals many of the important issues, and prepares us to solve the real problem, where σ² is unknown (next handout).

We want a confidence interval for the population mean, μ, based on n observations. There is some small probability (α) that the method of constructing confidence intervals will fail. We can control this probability; that is, we can select any value for α we want. Traditionally, α is taken to be either 0.05 or 0.01. We refer to 1 − α as the confidence level. This is the proportion of the time that the confidence interval contains the parameter, in repeated sampling.

Let z_{α/2} denote the z value such that the area to its right under the standard normal curve is α/2.

Eg: For α = 0.05 we get z_{α/2} = z_{0.025} = 1.96. So, 95% of the time a normal random variable will fall in the range (μ − 1.96σ, μ + 1.96σ). In the Empirical Rule, we approximated the 1.96 by 2. For α = 0.01 we get z_{α/2} = z_{0.005} = 2.576.

The interval X̄ ± z_{α/2} σ_X̄ is a CI for μ with confidence level 1 − α. Note that the interval extends from X̄ − z_{α/2} σ_X̄ to X̄ + z_{α/2} σ_X̄.
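The quantile z_{α/2} can be computed directly rather than read from a table. Here is a minimal sketch using Python's standard library (the function name is our own, not from the handout):

```python
from statistics import NormalDist

# z_{alpha/2} is the point with area alpha/2 to its right under the
# standard normal curve, i.e. the (1 - alpha/2) quantile.
def z_alpha_over_2(alpha):
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_alpha_over_2(0.05), 3))  # 1.96
print(round(z_alpha_over_2(0.01), 3))  # 2.576
```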

Here's why the formula above works. First, we suppose that the Central Limit Theorem allows us to assume that X̄ is normal. Let's convert X̄ into its own z-score:

    Z = (X̄ − μ) / σ_X̄

We have standardized X̄ by subtracting its own mean, μ_X̄ = μ, and dividing by its own standard error, σ_X̄ = σ/√n. Since X̄ is normal, Z is standard normal. Z measures how many standard errors X̄ falls from μ. The probability that X̄ falls more than z_{α/2} standard errors from μ is

    Prob{Z < −z_{α/2}} + Prob{Z > z_{α/2}} = α/2 + α/2 = α.

Therefore,

    1 − α = Prob{X̄ falls within z_{α/2} standard errors of μ}
          = Prob{μ is within z_{α/2} σ_X̄ of X̄}
          = Prob{μ is between X̄ − z_{α/2} σ_X̄ and X̄ + z_{α/2} σ_X̄}.

So X̄ ± z_{α/2} σ_X̄ is a CI for μ with confidence level 1 − α.

Interpretation of Confidence Intervals

As stated earlier, the confidence interval X̄ ± z_{α/2} σ_X̄ will cover μ with probability 1 − α. There is a subtle difficulty in the practical interpretation of the results, however, as demonstrated in the following example.

Eg 1: In the Pepsi example, we had n = 100, σ = 0.05, and x̄ = 1.985. The 95% confidence interval estimator for μ is X̄ ± 1.96 σ/√n, that is, X̄ ± 0.01. Plugging in x̄ = 1.985, we obtain the 95% confidence interval estimate (1.975, 1.995), rounded to three decimal places.
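The arithmetic in the Pepsi example can be checked in a few lines (variable names are ours):

```python
import math
from statistics import NormalDist

n, sigma, xbar = 100, 0.05, 1.985   # values from the Pepsi example
z = NormalDist().inv_cdf(0.975)     # z_{0.025}, approximately 1.96
se = sigma / math.sqrt(n)           # standard error of X-bar = 0.005
half_width = z * se                 # about 0.0098, rounded to 0.01 in the text
lo, hi = xbar - half_width, xbar + half_width
print(round(lo, 3), round(hi, 3))   # 1.975 1.995
```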

Which of the following statements is true?

a) There is a 95% chance that μ is between 1.975 and 1.995.
b) μ will be between 1.975 and 1.995 95% of the time.
c) In 95% of all future samples, x̄ will be between 1.975 and 1.995.
d) μ is between 1.975 and 1.995.
e) None of the above.

Warning: The practical interpretation of confidence intervals is extremely tricky. The difficulty has to do with the distinction between an estimator and an estimate. An estimator is a random variable whose value depends on a sample not yet taken (eg: X̄). An estimate is the value actually taken by the estimator for a given sample (eg: x̄ = 1.985).

The CI is an interval estimator. It has random endpoints. After the sample is taken, its endpoints take on specific values, yielding an interval estimate. Since μ is an (unknown) constant, and since the endpoints of the CI estimate are fixed numbers (eg: 1.975, 1.995), it makes no sense to talk about the probability that the CI estimate contains μ. Either it does or it doesn't, and we may never find out which of these events has occurred.

Instead, it is the CI estimator which contains μ with probability 1 − α. The estimator has random endpoints, X̄ − z_{α/2} σ_X̄ and X̄ + z_{α/2} σ_X̄. The word "probability" refers to the long-run proportion of the time that these random endpoints will contain the true mean μ, assuming a large number of repetitions of the experiment of collecting a random sample and constructing the CI. Thus the confidence level 1 − α refers to the process of constructing confidence intervals, not to the particular CI estimate obtained from the given sample.

A correct (but not very satisfying) answer to the multiple choice problem is:

f) We can't really say anything about the particular interval estimate we got for this sample, (1.975, 1.995), but the confidence interval estimator X̄ ± 0.01 will cover μ in 95% of all random samples which can be collected.

Unfortunately, in practice we have only one sample. So what good is a probability statement referring to all samples which might have been taken?

1) You can think of 1 − α as an overall success rate. If you compute many 95% confidence intervals over your lifetime, and if the required assumptions are satisfied for each one, then approximately 95% of these confidence intervals will contain their respective population means. Unfortunately, you may never know which ones were wrong.

2) Even though we can't talk about the probability that the given CI estimate contains μ, if we had to bet on it, we would say that the odds are 19 to 1 that our given 95% CI estimate covers μ. In the long run, in a lifetime of gambling on confidence intervals, you would win 19 out of 20 bets that the CI covered μ.

See the Web demo on confidence intervals based on repeated samples at: http://www.amstat.org/publications/jse/v6n3/applets/confidenceinterval.html Try changing the value of alpha and watch the CIs widen or narrow.

Remember: In real life, we have only one sample, so only one CI. It might be one of the 5% of all CIs which fail to contain the mean. We don't know. But we have 95% confidence in the statistical procedure, since only 1 out of 20 such intervals will fail in the long run.
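The long-run success rate in 1) can be demonstrated by simulation. The sketch below treats the Pepsi parameters as the (normally unknown) truth, draws many samples, builds a 95% CI from each, and counts how often the interval covers μ:

```python
import math
import random

# Treat mu and sigma as known truth (illustrative values from the Pepsi
# example), then repeatedly sample and build 95% CIs.
random.seed(1)
mu, sigma, n = 1.985, 0.05, 100
z = 1.96
se = sigma / math.sqrt(n)

trials = 10_000
covered = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    # Does this sample's CI contain the true mean?
    if xbar - z * se <= mu <= xbar + z * se:
        covered += 1

print(covered / trials)  # close to 0.95
```

About 5% of the intervals miss μ, and nothing about any individual interval tells you whether it is one of the misses.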

Why Don't We Use 100% Confidence Intervals?

Wouldn't it be better to be right 100% of the time rather than 95% of the time? Not necessarily, when it comes to confidence intervals. The problem is that (for given n and σ) the smaller we make α, the wider the CI becomes. It is possible to construct a 100% confidence interval, but it is infinitely wide, and therefore tells us nothing.
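To see how quickly the interval widens, we can tabulate the half-width z_{α/2} σ/√n for increasing confidence levels (σ and n below reuse the Pepsi values for illustration):

```python
import math
from statistics import NormalDist

# As the confidence level approaches 100%, z_{alpha/2} -> infinity,
# so the half-width of the CI grows without bound.
sigma, n = 0.05, 100
se = sigma / math.sqrt(n)

widths = []
for conf in (0.90, 0.95, 0.99, 0.999, 0.999999):
    z = NormalDist().inv_cdf((1 + conf) / 2)  # z_{alpha/2} for alpha = 1 - conf
    widths.append(z * se)
    print(f"{conf:8.6f} confidence: half-width = {z * se:.4f}")
```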