Lecture 8. Confidence intervals and the central limit theorem




Mathematical Statistics and Discrete Mathematics, November 25th, 2015

Central limit theorem

Let X_1, X_2, ..., X_n be a random sample of size n from a distribution of X with mean µ and variance σ². Then, for large n,

  ∑_{i=1}^n X_i ≈ N(nµ, nσ²),
  X̄ ≈ N(µ, σ²/n),
  (X̄ − µ)/(σ/√n) ≈ N(0, 1).

Here X ≈ Y means that X and Y have approximately the same distribution.

Note that the central limit theorem is valid for any random variable X with mean µ and variance σ². In particular, X can be discrete, and the theorem says that the sample means for large sample sizes are well approximated by the continuous normal distribution. Note that if X is normal, then we have exact, not approximate, equalities.

Central limit theorem

Figure: A comparison of the PDFs of sums of n independent uniform random variables on (0, 1), for n = 1, 2, 3, 4.
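The convergence toward the bell curve can also be checked numerically. A minimal simulation sketch (hypothetical sample sizes, Python standard library only): draw many sums of n uniform variables on (0, 1) and compare the empirical mean and variance of the sums with the values nµ = n/2 and nσ² = n/12 predicted by the theorem.

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed for reproducibility

n = 4            # number of uniform variables per sum (hypothetical choice)
trials = 100_000

# Each sample is a sum of n independent Uniform(0, 1) variables.
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

# The CLT predicts approximately N(n*mu, n*sigma^2), with mu = 1/2 and
# sigma^2 = 1/12 for the uniform distribution on (0, 1).
print(round(mean(sums), 2))      # close to n/2 = 2.0
print(round(variance(sums), 2))  # close to n/12 = 0.33...
```

A histogram of `sums` would reproduce the rightmost panel of the figure.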

Central limit theorem

Figure: A comparison of the PDFs of Binom(n, p) with n = 20, 30, 40 and p = 0.5 on the left, and n = 60, 90, 120 and p = 0.1 on the right. One can see that the shape of the PDF approaches the bell curve of the normal distribution.

Note that the number of variables n required for a good approximation by a normal distribution depends on the distribution of a single variable.

Central limit theorem

We toss a fair coin 400 times. Let X be the total number of heads. We want to know P(190 ≤ X ≤ 210). We have

  X ~ Binom(400, 1/2),
  µ = E[X] = 400 · 1/2 = 200,
  σ² = Var[X] = 400 · 1/2 · (1 − 1/2) = 100.

By the central limit theorem, (X − 200)/10 is approximately distributed like a standard normal variable Z, and hence

  P(190 ≤ X ≤ 210) = P(X ≤ 210) − P(X ≤ 189)
    = P(X − 200 ≤ 10) − P(X − 200 ≤ −11)
    = P((X − 200)/10 ≤ 1) − P((X − 200)/10 ≤ −1.1)
    ≈ F_Z(1) − F_Z(−1.1) = 0.8413 − 0.1357 = 0.7056.
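The coin-toss calculation can be checked without tables. A short sketch using only the standard library: the standard normal CDF is expressed through `math.erf`, and the exact binomial probability is summed directly for comparison.

```python
from math import erf, sqrt, comb

def phi(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Normal approximation from the slide: P(190 <= X <= 210) ~ F_Z(1) - F_Z(-1.1)
approx = phi(1.0) - phi(-1.1)

# Exact binomial probability for comparison: X ~ Binom(400, 1/2)
exact = sum(comb(400, k) for k in range(190, 211)) / 2**400

print(round(approx, 4))  # agrees with the table value 0.7056 to three decimals
print(round(exact, 4))
```

The exact value differs from the approximation only in the third decimal place, which illustrates how good the normal approximation already is for n = 400.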

Confidence intervals for µ with arbitrary data and σ² known

Let X be an arbitrary random variable with known variance σ², and let X_1, X_2, ..., X_n be a random sample of large size n from the distribution of X. Let Z ~ N(0, 1) be a standard normal variable, and let z_{α/2} > 0 be such that

  F_Z(−z_{α/2}) = α/2.

Then the random interval [L, R], where

  L = X̄ − z_{α/2} σ/√n  and  R = X̄ + z_{α/2} σ/√n,

is a confidence interval for the true mean µ with confidence level 1 − α, that is,

  P(L ≤ µ ≤ R) ≈ 1 − α.
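The interval above is a one-line computation. A minimal sketch, with z_{0.025} = 1.96 taken from the tables and a hypothetical data set (the sample mean, σ, and n below are invented for illustration):

```python
from math import sqrt

def z_interval(xbar, sigma, n, z=1.96):
    """Confidence interval for mu when sigma is known.

    The default z = 1.96 corresponds to alpha = 0.05 (95% confidence),
    since F_Z(-1.96) = 0.025 in the standard normal tables.
    """
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half

# Hypothetical example: n = 100 observations with sample mean 50.0,
# drawn from a population with known sigma = 5.0.
lo, hi = z_interval(50.0, 5.0, 100)
print(lo, hi)  # a 95% confidence interval of half-width 1.96 * 5 / 10 = 0.98
```

Note that L and R are random (they depend on the sample), while µ is fixed; the probability statement is about the interval covering µ, not about µ falling into a fixed interval.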

Chi-squared and t-distribution

If Z_1, Z_2, ..., Z_n is a random sample of size n from the standard normal distribution, then we say that the random variable

  Q = ∑_{i=1}^n Z_i²

has the chi-squared distribution with n degrees of freedom. We denote this by writing Q ~ χ²(n).

If Z and Q are independent random variables such that Z is a standard normal variable and Q has the chi-squared distribution with n degrees of freedom, then we say that the random variable

  T = Z/√(Q/n)

has the t-distribution with n degrees of freedom.

These are very important distributions, and numerical values of their CDFs are found in all mathematical tables.
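The definition of Q can be checked by simulation. A small sketch (hypothetical trial count, standard library only): build Q as a sum of squared standard normals and compare the empirical mean and variance with the known values n and 2n for χ²(n).

```python
import random
from statistics import mean, variance

random.seed(1)  # fixed seed for reproducibility

n = 5            # degrees of freedom
trials = 100_000

# Q = Z_1^2 + ... + Z_n^2 with Z_i standard normal, so Q ~ chi^2(n).
qs = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(trials)]

# A chi-squared variable with n degrees of freedom has mean n and variance 2n.
print(round(mean(qs), 1))      # close to n = 5
print(round(variance(qs), 1))  # close to 2n = 10
```

The same approach, dividing an independent standard normal by √(Q/n), would produce samples from the t-distribution.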

Chi-squared and t-distribution

Figure: The PDFs of the chi-squared distribution with 5, 10, 20, and 30 degrees of freedom.

Chi-squared and t-distribution

Figure: A comparison of the PDFs of the t-distribution with 1, 3, 10, and 30 degrees of freedom (orange) and the standard normal distribution (blue).

Chi-squared and t-distribution

If X_1, X_2, ..., X_n is a sample from the normal distribution N(µ, σ²), X̄ is the sample mean, and S² is the sample variance, then

  (n − 1)S²/σ² has the chi-squared distribution with n − 1 degrees of freedom, and
  (X̄ − µ)/(S/√n) has the t-distribution with n − 1 degrees of freedom.

Proof. The proof is outside the scope of the course. Partial arguments can be found in the book.

Confidence intervals for µ with normal data and σ² unknown

Let X be a normal random variable with unknown variance, and let X_1, X_2, ..., X_n be a random sample of size n from the distribution of X. Let T_{n−1} be a random variable that has the t-distribution with n − 1 degrees of freedom, and let t_{α/2} > 0 be such that

  F_{T_{n−1}}(t_{α/2}) = 1 − α/2.

Then the random interval [L, R], where

  L = X̄ − t_{α/2} S/√n  and  R = X̄ + t_{α/2} S/√n,

is a confidence interval for the true mean µ of X with confidence level 1 − α, that is,

  P(L ≤ µ ≤ R) = 1 − α.

Proof. The proof is analogous to the one for σ² known. We use the fact that

  (X̄ − µ)/(S/√n) ~ T_{n−1}.

The manufacturer claims that their mix of nuts and fruits contains 33 g of fruit per 100 g. We want to check this claim. We buy 5 packages and weigh the fruit content. We obtain the following numbers:

  31.84, 32.35, 31.20, 32.89, 32.80.

We find

  x̄ = (1/5) ∑_{i=1}^5 x_i = 32.22,  and  s² = (1/4) (∑_{i=1}^5 x_i² − 5x̄²) = 0.50.

We assume that the sample comes from a normal distribution. In the tables (for 4 degrees of freedom) we find that t_{0.025} = 2.776. The 95% confidence interval is then

  [l, r] = [x̄ − t_{0.025} s/√5, x̄ + t_{0.025} s/√5]
         = [32.22 − 2.776 · √0.50/√5, 32.22 + 2.776 · √0.50/√5]
         = [31.34, 33.10].
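The arithmetic in this example is easy to reproduce. A short sketch using the `statistics` module (whose `variance` divides by n − 1, matching the sample variance S² above), with the t quantile taken from the tables as in the slide:

```python
from math import sqrt
from statistics import mean, variance

# Fruit weights (g) from the 5 packages in the example.
data = [31.84, 32.35, 31.20, 32.89, 32.80]
n = len(data)

xbar = mean(data)      # sample mean
s2 = variance(data)    # sample variance (divides by n - 1)

# t quantile for 4 degrees of freedom and alpha = 0.05, from the tables.
t = 2.776

half = t * sqrt(s2) / sqrt(n)
lo, hi = xbar - half, xbar + half
print(round(xbar, 2), round(s2, 2))  # -> 32.22 0.5
print(round(lo, 2), round(hi, 2))
```

Working with unrounded intermediate values gives an upper endpoint of about 33.09 rather than the 33.10 obtained from the rounded x̄ and s²; either way, the claimed 33 g lies inside the interval, so the data do not contradict the manufacturer.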

Confidence intervals for σ² with normal data

Let X be a normal random variable with unknown variance, and let X_1, X_2, ..., X_n be a random sample of size n from the distribution of X. Let χ²_{n−1} be a random variable that has the chi-squared distribution with n − 1 degrees of freedom, and let χ²_{α/2}, χ²_{1−α/2} > 0 be numbers such that

  F_{χ²_{n−1}}(χ²_{α/2}) = 1 − α/2  and  F_{χ²_{n−1}}(χ²_{1−α/2}) = α/2.

Then the random interval [L, R], where

  L = (n − 1)S²/χ²_{α/2}  and  R = (n − 1)S²/χ²_{1−α/2},

is a confidence interval for the true variance σ² of X with confidence level 1 − α, that is,

  P(L ≤ σ² ≤ R) = 1 − α.

Confidence intervals for σ² with normal data

Proof. We will use the fact that (n − 1)S²/σ² ~ χ²_{n−1}. By the definition of χ²_{α/2}, χ²_{1−α/2} > 0, we have

  1 − α = P(χ²_{1−α/2} ≤ (n − 1)S²/σ² ≤ χ²_{α/2})
        = P(χ²_{1−α/2}/((n − 1)S²) ≤ 1/σ² ≤ χ²_{α/2}/((n − 1)S²))
        = P((n − 1)S²/χ²_{α/2} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2}).

Let us find a 95% confidence interval for σ² in the fruit mix example. We have s² = 0.50, n − 1 = 4, α/2 = 0.025. We find in the tables that χ²_{1−α/2} = χ²_{0.975} = 0.484 and χ²_{α/2} = χ²_{0.025} = 11.1. Hence, the confidence interval is

  [l, r] = [4 · 0.50/11.1, 4 · 0.50/0.484] = [0.18, 4.13].
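This computation can be sketched in a few lines, with the chi-squared quantiles for 4 degrees of freedom hardcoded from the tables as in the slide:

```python
# 95% confidence interval for sigma^2 in the fruit-mix example.
# Chi-squared quantiles for 4 degrees of freedom, from the tables:
# F(0.484) = 0.025 and F(11.1) = 0.975.
n = 5
s2 = 0.50
chi2_lower = 0.484   # chi^2_{0.975}, the lower quantile
chi2_upper = 11.1    # chi^2_{0.025}, the upper quantile

lo = (n - 1) * s2 / chi2_upper
hi = (n - 1) * s2 / chi2_lower
print(round(lo, 2), round(hi, 2))  # -> 0.18 4.13
```

Note how wide the interval is: with only 5 observations, the sample variance pins down σ² very loosely, and the interval is also markedly asymmetric around s² = 0.50, reflecting the skewness of the chi-squared distribution.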