Let m denote the margin of error. Then

Similar documents
5.1 Identifying the Target Parameter

4. Continuous Random Variables, the Pareto and Normal Distributions

Simple Regression Theory II 2010 Samuel L. Baker

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Week 4: Standard Error and Confidence Intervals

Chapter 7 Section 7.1: Inference for the Mean of a Population

MEASURES OF VARIATION

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

3.2 Measures of Spread

The Normal distribution

5/31/ Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Review. March 21, S7.1 2_3 Estimating a Population Proportion. Chapter 7 Estimates and Sample Sizes. Test 2 (Chapters 4, 5, & 6) Results

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Standard Deviation Estimator

7. Normal Distributions

Chapter 7 - Practice Problems 1

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Unbeknownst to us, the entire population consists of 5 cloned sheep with ages 10, 11, 12, 13, 14 months.

AP Physics 1 and 2 Lab Investigations

p ˆ (sample mean and sample

TImath.com. F Distributions. Statistics

MBA 611 STATISTICS AND QUANTITATIVE METHODS

Study Guide for the Final Exam

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

8. THE NORMAL DISTRIBUTION

Chapter Study Guide. Chapter 11 Confidence Intervals and Hypothesis Testing for Means

2 ESTIMATION. Objectives. 2.0 Introduction

3: Summary Statistics

Lecture Notes Module 1

How To Check For Differences In The One Way Anova

Unit 26 Estimation with Confidence Intervals

Recall this chart that showed how most of our course would be organized:

Week 3&4: Z tables and the Sampling Distribution of X

1. How different is the t distribution from the normal?

Confidence Intervals for Cp

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS

Problem sets for BUEC 333 Part 1: Probability and Statistics

6.4 Normal Distribution

Section 1.3 Exercises (Solutions)

Chapter 4. Probability and Probability Distributions

Chapter 3 RANDOM VARIATE GENERATION

Objectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)

Confidence Intervals for One Standard Deviation Using Standard Deviation

Statistics 151 Practice Midterm 1 Mike Kowalski

Coefficient of Determination

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

6 3 The Standard Normal Distribution

How To Calculate Confidence Intervals In A Population Mean

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

Means, standard deviations and. and standard errors

Estimation and Confidence Intervals

SOLUTIONS TO BIOSTATISTICS PRACTICE PROBLEMS

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Population Mean (Known Variance)

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Solutions to Homework 3 Statistics 302 Professor Larget

Content Sheet 7-1: Overview of Quality Control for Quantitative Tests

3.4 The Normal Distribution

Constructing and Interpreting Confidence Intervals

Lecture 19: Chapter 8, Section 1 Sampling Distributions: Proportions

TEACHER NOTES MATH NSPIRED

SKEWNESS. Measure of Dispersion tells us about the variation of the data set. Skewness tells us about the direction of variation of the data set.

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

Name: Date: Use the following to answer questions 3-4:

CALCULATIONS & STATISTICS

Mind on Statistics. Chapter 2

How To Write A Data Analysis

Chapter 1: Exploring Data

An Introduction to Basic Statistics and Probability

REPEATED TRIALS. The probability of winning those k chosen times and losing the other times is then p k q n k.

Hypothesis testing - Steps

Linear Equations and Inequalities

Math 201: Statistics November 30, 2006

Descriptive Statistics

Summarizing and Displaying Categorical Data

Fairfield Public Schools

Simple linear regression

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Descriptive Statistics

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

P(every one of the seven intervals covers the true mean yield at its location) = 3.

Chapter 7. One-way ANOVA

Confidence Intervals for the Difference Between Two Means

Confidence Intervals in Public Health

Statistics 104: Section 6!

MTH 140 Statistics Videos

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Confidence Intervals for Cpk

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Permutation Tests for Comparing Two Populations

SAMPLING DISTRIBUTIONS

7 Confidence Intervals

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Example 1: Suppose the demand function is p = 50 2q, and the supply function is p = q. a) Find the equilibrium point b) Sketch a graph

Transcription:

S:105 Statistical Methods and Computing Sample size for confidence intervals with σ known t Intervals Lecture 13 Mar. 6, 009 Kate Cowles 374 SH, 335-077 kcowles@stat.uiowa.edu 1 The margin of error The margin of error is the value that we add onto x and subtract from x to get the endpoints of a confidence interval. For confidence intervals for the mean of a normal population with σ known, this is m = z σ n Equivalently, the margin of error is one half the width of the c.i. The margin of error depends on the level of confidence desired the population standard deviation (which we can t control!) the sample size (not the population size) Sample size for a study involving a confidence interval Suppose a group of obstetricians wish to carry out a study to estimate µ, the mean birthweight in the population of infants born at UIHC. Suppose the obstetricians believe that the population standard deviation of birthweights of infants born at UIHC is the same as that of infants overall in the US. σ = 15 oz The obstetricians would like a 95% confidence interval for µ that is no wider than 4 ounces. That is, they want a margin of error ounce. How many infants do they need in their study? 3 4 Let m denote the margin of error. Then m = z n = z n = z σ n σ m σ m n = 1.96 15 = 16.09 A sample size must always be rounded up, so they need 17 infants in their study.

Sample size continued What makes a sample size large? n = z σ m 5 6 Caveats regarding our formula for computing confidence intervals for population means The data must be a simple random sample from the population. We are not in too big trouble if the data can plausibly be thought of as observations taken at random from the population. There is no correct method for inference from data haphazardly collected with bias of unknown size. Fancy formulas cannot rescue badly produced data. * Watch out for outliers in your dataset, because they can have a large effect on both the point estimate of µ and the confidence interval. If outliers are not data errors, and if there is no subject-matter reason for deleting them, get help from a statistician on computing measures of center and intervals that are not sensitive to outliers. Check your data for skewness and other signs that the population they came from may not be normal. If the sample size is large (i.e. n 30) the central limit theorem says the approach is valid. If the sample size is small, the confidence level will not be correct. The formula x ± z σ n requires that we know the exact value of the population standard deviation σ, which we never do. Moore, David S. (000) The Basic Practice of Statistics, nd ed., W.H. Freeman and Co. 7 8 What to do when we believe the population is normal but we don t know σ Assumptions behind this method The data are a simple random sample from the population of interest. Values in the population follow a normal distribution with mean µ and standard deviation σ. Both µ and σ are unknown. The sample mean x is still our point estimate of the unknown population mean µ. x still comes from a normal distribution with mean µ and standard deviation σ n.

We will estimate σ by the sample standard deviation s. Then we estimate the standard deviation of x by s n 9 10 t intervals When we claimed to know σ, we computed confidence intervals for µ as x ± z σ n Standard errors When we use the data to estimate the standard deviation of a statistic, the result is called the standard error of the statistic. The standard error of the sample mean x is s n. where z was the appropriate cutoff value from a standard normal distribution. When we don t know σ, we will compute confidence intervals for µ as When we are estimating σ with s, we need to make our confidence interval wider to account for the uncertainty in estimation. (What if we had gotten a sample that happened to give a sample standard deviation s that was much smaller than the population standard deviation σ?) We do this by multiplying s n by something bigger than z. 11 1 The t distribution There is a different t distribution for every sample size. We identify different t distributions by their degrees of freedom, n 1. The density curve for t distributions is symmetric around 0 bell-shaped (and has only one mode) The spread of t distributions is greater than the spread of the standard normal distribution. The smaller the degrees of freedom, the more spread out the t distribution is. The larger the degrees of freedom, the closer the density curve for a t distribution is to a standard normal curve.

This makes sense because the larger the sample size, the better an estimate s is likely to be for σ (i.e., the less extra uncertainty is introduced by estimating σ instead of knowing its value) More on the t distribution If x is the sample mean of a simple random sample of size n value from a normal population with mean µ and standard deviation σ, then the random quantity follows a t distribution t = x µ s/ n 13 14 Constructing confidence intervals for µ when σ is unknown To construct a level C confidence interval for µ Draw a simple random sample of size n from the population. The population is assumed to be normal. Compute the sample mean x and the sample standard deviation s. Then the level C confidence interval for µ is where t cuts off the upper 1 C area under the density curve for a t distribution with n 1 degrees of freedom. Use Table A. at the back of your textbook to find t. Example We have data on a simple random sample of 10 birthweights of infants born at Boston City Hospital. We wish to estimate the mean µ of of birthweights in the population served by this hospital. This population may be different from the population of all US birthweights, so we cannot assume that we know either µ or σ. 15 16 Our data values are: Infant Birthweight in ounces 1 97 117 3 140 4 78 5 99 6 148 7 108 8 135 9 16 10 11

First calculate x = 116.90 s = 1.70 The degrees of freedom are 10-1 = 9. For a 95% confidence interval, we need the value of t that cuts off an area of.05 in the upper tail. 17 18 The interval is so wide because of the relatively small sample size the relatively large variation between birthweights (large s) From Table C, we find t =.6. Our confidence interval is = 116.90 ±.6 1.70 10 = 116.90 ± 15. = (101.38, 13.4)