Confidence Intervals




A confidence interval is an interval whose purpose is to estimate a parameter (a number that could, in theory, be calculated from the population, if measurements were available for the whole population). A confidence interval has three elements. First, there is the interval itself, something like (123, 456). Second is the confidence level, something like 95%. Third is the parameter being estimated, something like the population mean, $\mu$, or the population proportion, $p$. In order to have a meaningful statement, you need all three elements: (123, 456) is a 95% confidence interval for $\mu$.

Formulas:

General formula for confidence intervals: estimate ± margin of error

$z^*$ is 1.645 for 90% confidence, 1.96 for 95% confidence, and 2.576 for 99% confidence.

CI for a population mean ($\sigma$ is known and $n > 30$ or the variable is normally distributed in the population):
$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ (TI-83: STAT TESTS 7:ZInterval)

CI for a population mean ($\sigma$ is unknown and $n > 30$ or the variable is normally distributed in the population):
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ (TI-83: STAT TESTS 8:TInterval)

CI for a population proportion (when $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$), where $\hat{p} = x/n$:
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ (TI-83: STAT TESTS A:1-PropZInt)

If you don't know $\hat{p}$, use $\hat{p} = 1/2$ (conservative approach).

Minimum required sample size for a desired margin of error $m$ and confidence level:
When it is a mean problem: $n = \left(\frac{z^* \sigma}{m}\right)^2$
When it is a proportion problem: $n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1 - \hat{p})$
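The critical values quoted in the formula list can be recovered from the standard normal distribution. The following is a minimal sketch, assuming Python's standard library is available; the function names `z_critical` and `z_interval_mean` and the sample inputs are illustrative, not part of the handout.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    # A central area of `confidence` leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def z_interval_mean(xbar, sigma, n, confidence):
    """Interval xbar ± z* · sigma / sqrt(n), for a known population sigma."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Critical values for the three confidence levels listed in the formulas.
for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))

# Hypothetical inputs, chosen only to exercise the function.
print(z_interval_mean(xbar=50.0, sigma=10.0, n=100, confidence=0.95))
```

Rounded to three decimals, the loop reproduces 1.645, 1.96 and 2.576. The t-interval is left out here because the standard library has no t quantile function; its critical value still has to come from a table or a calculator.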

Examples:

1. You wish to estimate, with 95% confidence, the proportion of computers that need repairs or have problems by the time the product is three years old. Your estimate must be accurate within 3% of the true proportion.

a. If no preliminary estimate is available, find the minimum sample size required.

If no preliminary estimate is available, use the conservative choice: $\hat{p} = 0.5$. The margin of error is $m = 3\% = 0.03$. Using $z^* \approx 2$ for 95% confidence:

$n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{2}{0.03}\right)^2 (0.5)(1 - 0.5) = 1111.11$

Thus we need at least 1112 computers to sample. (Remember: ALWAYS round up!)

b. Now suppose a prior study involving fewer than 100 computers found that 19% of these computers needed repairs or had problems by the time the product was three years old. Find the minimum sample size needed.

Now $\hat{p} = 0.19$.

$n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{2}{0.03}\right)^2 (0.19)(1 - 0.19) = 684$

This is a whole number, thus the minimum sample size we need is 684.

Both calculations round $z^*$ to 2. With $z^* = 1.96$, the same formula gives 1067.11 (so 1068 computers) in part a and 656.91 (so 657 computers) in part b.

2. A college administrator would like to determine how much time students spend on homework assignments during a typical week. A questionnaire is sent to a sample of $n = 100$ students and their response indicates a mean of 7.4 hours per week and standard deviation of 3 hours.

(a) What is the point estimate of the mean amount of homework for the entire student population (i.e., what is the point estimate for $\mu$, the unknown population mean)?

The point estimate for the population mean is the sample mean. In this case it's 7.4 hours.

(b) Now make an interval estimate of the population mean so that you are 95% confident that the true mean is in your interval (i.e., compute the 95% confidence interval).

Conditions: random sample? We don't really know. $n > 30$, so we can assume by the CLT that the shape of the sampling distribution of the sample means is approximately normal. $\bar{x} = 7.4$ hours, and $s = 3$ hours. The population s.d. is unknown; we only know the sample s.d., so we need to use the t-interval.
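The t-interval arithmetic for these summary statistics can be sketched as follows. This is a minimal illustration, assuming the critical value is looked up by hand; the function name `t_interval_from_summary` is hypothetical, and $t^* = 1.987$ is the table value the handout uses for this sample.

```python
from math import sqrt


def t_interval_from_summary(xbar, s, n, t_star):
    """Return (lower, upper) for xbar ± t* · s / sqrt(n)."""
    # t_star must be looked up for n - 1 degrees of freedom;
    # Python's standard library has no t quantile function.
    margin = t_star * s / sqrt(n)
    return xbar - margin, xbar + margin


# Summary statistics from part (b): n = 100, mean 7.4 hours, s = 3 hours.
lower, upper = t_interval_from_summary(xbar=7.4, s=3.0, n=100, t_star=1.987)
print(f"({lower:.1f}, {upper:.1f})")  # prints (6.8, 8.0)
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies and mirrors the table lookup a student would do by hand.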

Using $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ ($t^* = 1.987$) or the calculator (8:TInterval), the 95% confidence interval is (6.8, 8.0). That means we are 95% confident that the mean time ALL students spend on homework assignments during a typical week is between 6.8 hours and 8.0 hours.

(c) Now compute the 99% confidence interval.

Repeating part (b) with $t^* = 2.632$, we get (6.6, 8.2). That means we are 99% confident that the mean time ALL students spend on homework assignments during a typical week is between 6.6 hours and 8.2 hours.

(d) Compare your answers to (b) and (c). Which confidence interval is wider, and why? How is the width of the confidence interval related to the percentage/degree of confidence?

The 99% confidence interval is wider. If you widen the confidence interval of plausible values, you're more sure that the real parameter is in there somewhere.

(e) Now compute the 95% confidence interval again, but assume that $n = 50$.

Since $n$ is still larger than 30, we can use the t-interval again ($t^* = 2.014$). The 95% confidence interval with $n = 50$ is (6.5, 8.3).

(f) Compare your answers to (b) and (e). Which confidence interval is wider, and why? How is the width of the confidence interval related to the size of the sample?

The sample size of 100 gives a narrower confidence interval than the sample of size 50. The larger your sample size, the more sure you can be that the answers truly reflect the population. This indicates that, for a given confidence level, the larger your sample size, the narrower your confidence interval. However, the relationship is not linear: doubling the sample size does not halve the width of the confidence interval. Because the margin of error is proportional to $1/\sqrt{n}$, the sample size must be quadrupled (times 4) to roughly halve the width.

3. In Roosevelt National Forest, the rangers took random samples of live aspen trees and measured the base circumference of each tree. Assume that the circumferences of the trees are normally distributed.

a. The first sample had 30 trees with a mean circumference of 15.71 inches, and standard deviation of 4.63 inches. Find a 95% confidence interval for the mean circumference of aspen trees from these data.
Conditions: random sample (checked), $\sigma$ is unknown, $n = 30$ and the circumferences are normally distributed, so we can use the t-interval. $\bar{x} = 15.71$, $s = 4.63$, $n = 30$.

Using $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ ($t^* = 2.045$) or the calculator (8:TInterval), the 95% t-interval is (13.98, 17.44). This means that we are 95% confident that the mean circumference of ALL live aspen trees in Roosevelt National Forest is between 13.98 inches and 17.44 inches. That is, based on this sample, if we could measure the circumference of ALL of the live aspen trees there, then we are 95% confident that the mean of all the measurements would be between 13.98 inches and 17.44 inches. Also, it means that if we took many, many samples of size 30 of live aspen trees and calculated a 95% confidence interval for each sample, about 95% of them would contain the real, actual mean circumference and about 5% would miss it. But, of course, we don't know which 5% would miss it.

b. The next sample had 100 trees with a mean of 15.58 inches. Again find a 95% confidence interval for the mean circumference of aspen trees from these data.

Conditions: $\sigma$ is unknown, $n > 30$ and the circumferences are normally distributed, so we can use the t-interval. The calculation keeps $\bar{x} = 15.71$ and $s = 4.63$ from part a and changes only the sample size: $n = 100$.

Using $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ ($t^* = 1.984$) or the calculator (8:TInterval), the 95% t-interval is (14.79, 16.63). This means that we are 95% confident that the mean circumference of ALL live aspen trees in Roosevelt National Forest is between 14.79 inches and 16.63 inches. That is, based on this sample, if we could measure the circumference of ALL the live aspen trees there, then we are 95% confident that the mean of all the measurements would be between 14.79 inches and 16.63 inches.

c. The last sample had 300 trees with a mean of 15.59 inches. Find a 95% confidence interval from these data.

Conditions: $\sigma$ is unknown, $n > 30$ and the circumferences are normally distributed, so we can use the t-interval. Again the calculation keeps $\bar{x} = 15.71$ and $s = 4.63$, now with $n = 300$.

Using $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ ($t^* = 1.96$) or the calculator (8:TInterval), the 95% t-interval is (15.18, 16.24).

This means that we are 95% confident that the mean circumference of ALL live aspen trees in Roosevelt National Forest is between 15.18 inches and 16.24 inches. That is, based on this sample, if we could measure the circumference of ALL the live aspen trees there, then we are 95% confident that the mean of all the measurements would be between 15.18 inches and 16.24 inches.

d. Find the length of each interval of parts (a), (b) and (c). Comment on how these lengths change as the sample size increases.

The length of the CI with $n = 30$ is 17.44 - 13.98 = 3.46.
The length of the CI with $n = 100$ is 16.63 - 14.79 = 1.84.
The length of the CI with $n = 300$ is 16.24 - 15.18 = 1.06.
The length of the interval gets smaller as the sample size increases.

4. In an article exploring blood serum levels of vitamins and lung cancer risks (The New England Journal of Medicine), the mean serum level of vitamin E in the control group was 11.9 mg/liter. There were 196 patients in the control group. (These patients were free of all cancer, except possible skin cancer, in the subsequent 8 years.) Assume that the standard deviation is $\sigma = 4.30$ mg/liter.

a. Find a 95% confidence interval for the mean serum level of vitamin E in all persons similar to the control group.

Conditions: Random sample? We don't really know, but let's assume they picked the subjects randomly. $\sigma$ is known, so we can use the z-interval. $\bar{x} = 11.9$, $\sigma = 4.30$, $n = 196$.

Using either $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ ($z^* = 1.96$) or the calculator (7:ZInterval), the 95% z-interval is (11.3, 12.5). This means that we are 95% confident that the mean serum level of vitamin E in ALL cancer-free patients is between 11.3 mg/liter and 12.5 mg/liter. That is, based on this sample, if we could measure the serum level of vitamin E in ALL cancer-free patients (except possible skin cancer in the subsequent 8 years), then we are 95% confident that the mean of all the measurements would be between 11.3 mg/liter and 12.5 mg/liter.

b. If you wanted to estimate the mean serum level of vitamin E with 90% confidence and a margin of error of no more than 0.25 mg/liter, how large a sample would you need?
For the minimum sample size we need, we can use the formula: $n = \left(\frac{z^* \sigma}{m}\right)^2$
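This sample-size formula, together with its counterpart for proportions, can be sketched in a few lines. The sketch assumes Python's standard library; the function names are illustrative, and the critical value is computed exactly rather than read from the rounded list (1.645, 1.96, 2.576), so intermediate values differ slightly from hand calculations.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.90."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_mean(sigma, margin, confidence):
    """Smallest n with z* · sigma / sqrt(n) <= margin (always rounded up)."""
    return ceil((z_critical(confidence) * sigma / margin) ** 2)


def sample_size_proportion(margin, confidence, p_hat=0.5):
    """Smallest n with z* · sqrt(p_hat(1 - p_hat)/n) <= margin.

    p_hat defaults to 0.5, the conservative choice when no estimate exists.
    """
    return ceil((z_critical(confidence) / margin) ** 2 * p_hat * (1 - p_hat))


# Vitamin E question: sigma = 4.30 mg/liter, margin 0.25 mg/liter, 90% confidence.
print(sample_size_mean(sigma=4.30, margin=0.25, confidence=0.90))
# Proportion with no prior estimate: margin 0.03, 95% confidence.
print(sample_size_proportion(margin=0.03, confidence=0.95))
```

`ceil` implements the "always round up" rule: rounding down would leave the margin of error slightly larger than requested.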

$n = \left(\frac{z^* \sigma}{m}\right)^2 = \left(\frac{1.645 \times 4.30}{0.25}\right)^2 = 800.55$

Thus, we would need at least 801 cancer-free patients in our sample.

5. Suppose in a state with a large number of voters that 56 out of 100 randomly surveyed voters favored Proposition 1. This is just a small sample of all the voters. Do you think Proposition 1 passed? YES, but I am not very sure; I would like more information.

a. Give a range of plausible values for the proportion of all voters who favored Proposition 1. (That is, find a 95% confidence interval.)

Our goal is to estimate the proportion of ALL voters who favored Proposition 1 ($p$). In our sample, 56 out of 100 favored the proposition, that is, $\hat{p} = 56/100 = 0.56 = 56\%$. $x = 56$, $n = 100$, $\hat{p} = 0.56$.

Checking conditions for CI: random sample, $n\hat{p} = 56 > 10$ and $n(1 - \hat{p}) = 100(1 - 0.56) = 44 > 10$. Conditions are satisfied.

We use: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

Thus, using the formula above (with $z^* = 1.96$), or using the A:1-PropZInt menu on the calculator, we get (0.463, 0.657). That is, we are 95% confident that the proportion of ALL voters who favored Proposition 1 is between 46.3% and 65.7%. Other samples of 100 voters would yield other 95% confidence intervals. Most of these confidence intervals (about 95% of them) would capture $p$, but a few of them (about 5%) would not.

b. The 95% confidence interval we just computed is rather wide and does not pinpoint $p$ to any great extent. (In fact, we cannot even tell whether a majority voted for Proposition 1.) Our next example shows that we can obtain a narrower confidence interval by taking a larger sample. Suppose in a state with a large number of voters that 560 out of 1000 randomly surveyed voters favored Proposition 1. Give a range of plausible values for the proportion of all voters who favored Proposition 1.

Our goal is to estimate the proportion of ALL voters who favored Proposition 1 ($p$). In our sample, 560 out of 1000 favored the proposition, that is, $\hat{p} = 560/1000 = 0.56 = 56\%$. $x = 560$, $n = 1000$, $\hat{p} = 0.56$.
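The condition check and the interval for a proportion can be sketched together, here applied to both samples in this example (56 of 100 and 560 of 1000). This is a minimal illustration assuming Python's standard library; the function name `one_prop_z_interval` is hypothetical and only loosely mirrors the calculator's 1-PropZInt menu.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence=0.95):
    """Interval p_hat ± z* · sqrt(p_hat(1 - p_hat)/n) for x successes in n trials."""
    # Condition from the formula list: n·p_hat >= 10 and n·(1 - p_hat) >= 10,
    # i.e. at least 10 successes and at least 10 failures in the sample.
    if x < 10 or n - x < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Same sample proportion (0.56), two different sample sizes.
for x, n in [(56, 100), (560, 1000)]:
    lower, upper = one_prop_z_interval(x, n)
    print(n, round(lower, 3), round(upper, 3))
```

Running both cases side by side shows the effect discussed in part b: the sample proportion is the same, but the tenfold larger sample shrinks the margin of error by a factor of about $\sqrt{10}$.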

Checking conditions for CI: random sample, $n\hat{p} = 560 > 10$ and $n(1 - \hat{p}) = 1000(1 - 0.56) = 440 > 10$. Conditions are satisfied.

We use: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

Thus, using the formula above (with $z^* = 1.96$), or using the A:1-PropZInt menu on the calculator, we get (0.529, 0.591). That is, based on the results from our sample of size 1000, we are 95% confident that the proportion of ALL voters who favored Proposition 1 is between 52.9% and 59.1%.

Notice that the sample size of 1000 gives a much narrower confidence interval than the sample size of 100. In fact, with the larger sample, we can be quite confident (about 95% of the time anyway) that a majority of the voters favored Proposition 1, since the smaller endpoint of the sample's 95% confidence interval, 0.529, is greater than one-half. Bear in mind, however, that the larger sample may be more costly and time consuming than the smaller one. Now, how confident are you that Proposition 1 passed or failed? I'd bet a small amount of money that I am right.

c. Forget the previous parts now. Assume that you didn't take any samples yet. What sample size do you need to use if you want the margin of error to be at most 3% with 95% confidence but you have no estimate of $p$?

Because you don't have an estimate of $p$, use $\hat{p} = 0.5$. We want the margin of error to be at most 3%, that is, $m = 0.03$.

$n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{1.96}{0.03}\right)^2 (0.5)(1 - 0.5) = 1067.11$

Thus, to get a margin of error of at most 3%, we need at least 1068 voters in our sample.

d. Now let's assume you did a pilot sample, in which 56 out of 100 voters said they favor Proposition 1. What sample size do you need to use if you want the margin of error to be at most 3% with 95% confidence now?

Now we have an estimate of $p$ from the pilot study, so we use $\hat{p} = 0.56$. We want the margin of error to be at most 3%, that is, $m = 0.03$.

$n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1 - \hat{p}) = \left(\frac{1.96}{0.03}\right)^2 (0.56)(1 - 0.56) = 1051.74$

Thus, to get a margin of error of at most 3%, we need at least 1052 voters in our sample.

6. Sometimes a 95% confidence interval is not enough.
For example, in testing new medical drugs or procedures, a 99% confidence interval may be required before the new drug or procedure is approved for general use. For example, a new drug for migraines might induce insomnia (difficulty falling asleep) in some patients. If this side effect happens in too many patients, the drug might not be approved. More precisely, if it could happen in more than 5% of all the patients, it won't be approved. In a random sample of 632 migraine patients who took the new pill, 19 of them experienced insomnia. Based on this sample result, what would be your recommendation: should the new drug be approved or not?

We want to estimate the proportion of ALL migraine patients who would experience insomnia. The sample proportion, $\hat{p}$, is 19/632 = 0.03 = 3%. We want to calculate the 99% confidence interval based on this sample result. Let's check the conditions first: random sample, $n\hat{p} = 19 > 10$ and $n(1 - \hat{p}) = 613 > 10$. Conditions are satisfied.

We use: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

Thus, using the formula above (with $z^* = 2.575$), or using the A:1-PropZInt menu on the calculator, we get (0.0126, 0.0476). Thus, based on this sample result, we are 99% confident that if we could test every migraine patient who would take this pill, the proportion of them who would experience insomnia would be between about 1.26% and 4.76%. Since the entire interval lies below 5%, we can recommend the approval of the new drug.

7. The Gallup Poll survey organization conducted telephone interviews with a randomly selected national sample of 1,003 adults, 18 years and older, on Mar. 3-5, 2003. In the survey they found that 281 adults said that the nation's energy situation is very serious. Find 95% and 99% confidence intervals for the unknown proportion of Americans who felt that the nation's energy situation is very serious.

This is a proportion problem. $\hat{p} = \frac{x}{n} = \frac{281}{1003}$

Conditions: random sample (checked), $n\hat{p} = 1003 \cdot \frac{281}{1003} = 281 > 10$, $n(1 - \hat{p}) = 1003\left(1 - \frac{281}{1003}\right) = 722 > 10$.

95% confidence interval: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ ($z^* = 1.96$). Or using the calculator: STAT TESTS A:1-PropZInt, $x = 281$, $n = 1003$, C-Level: 0.95.

The 95% confidence interval is (0.252, 0.308). We are 95% confident that the proportion of ALL adults in the U.S. who feel that the nation's energy situation is very serious is somewhere between 25.2% and 30.8%. That is, if we could ask EVERY adult in the U.S. what they think about the nation's energy situation, we are 95% confident that 25.2% to 30.8% of them would think that the energy situation is very serious.

99% confidence interval: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ ($z^* = 2.575$). Or using the calculator: STAT TESTS A:1-PropZInt, $x = 281$, $n = 1003$, C-Level: 0.99.

The 99% confidence interval is (0.244, 0.317). We are 99% confident that the proportion of ALL adults in the U.S. who feel that the nation's energy situation is very serious is somewhere between 24.4% and 31.7%. That is, if we could ask EVERY adult in the U.S. what they think about the nation's energy situation, we are 99% confident that 24.4% to 31.7% of them would think that the energy situation is very serious. Again, as it should be, the 99% confidence interval is wider.

8. The dataset "Normal Body Temperature, Gender, and Heart Rate" contains 130 observations of body temperature, along with the gender of each individual and his or her heart rate. MINITAB provides the following information:

Descriptive Statistics

Variable    N      Mean     Median   Tr Mean   StDev   SE Mean
TEMP        130    98.249   98.300   98.253    0.733   0.064

Variable    Min      Max       Q1       Q3
TEMP        96.300   100.800   97.800   98.700

Based on these results, construct and interpret a 95% confidence interval for the mean body temperature. According to these results, is the usual assumed normal body temperature of 98.6 degrees Fahrenheit within the 95% confidence interval for the mean?

This is a mean problem. Conditions: random sample: we don't know; there is no information about that. $n > 30$. Since we don't know $\sigma$, the population's standard deviation, we need to use the t-interval. The sample mean is 98.249, and the sample standard deviation is 0.733 (both are provided above). Use $t^* = 1.984$.

The 95% confidence interval: $\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 98.249 \pm 1.984 \cdot \frac{0.733}{\sqrt{130}} = (98.121, 98.377)$

Or using the calculator: STAT TESTS 8:TInterval: highlight Stats, and enter 98.249 for the mean, 0.733 for Sx, and 130 for n.

We are 95% confident that the mean body temperature for ALL people is between 98.121 and 98.377 degrees Fahrenheit. The usual assumed normal body temperature of 98.6 degrees Fahrenheit is not within the 95% confidence interval for the mean.
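The interpretation used throughout these examples (if many samples were taken, about 95% of the resulting 95% intervals would capture the parameter) can be illustrated with a small simulation. The sketch below is an illustration only: the population values `mu` and `sigma` are made up, the population is assumed normal with known $\sigma$ so that the z-interval applies, and the function name `coverage_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)  # identical for every sample: sigma is known
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the assumed normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


# Hypothetical population; these parameter values are invented for the demo.
rate = coverage_rate(mu=98.2, sigma=0.7, n=30, confidence=0.95, trials=2000)
print(f"{rate:.3f} of the simulated 95% intervals captured mu")
```

The printed fraction should land close to 0.95 but will rarely equal it exactly, which is the point: the confidence level describes the long-run behavior of the procedure, not any single interval.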