Estimating the Value of a Parameter Using Confidence Intervals




Overview

We apply the results about the sample mean to the problem of estimation.
Estimation is the process of using sample data to estimate the value of a population parameter.
We will quantify the accuracy of our estimation process.

The Logic in Constructing Confidence Intervals about a Population Mean when the Population Standard Deviation is Known

Learning Objectives

- Compute a point estimate of the population mean
- Construct and interpret a confidence interval about the population mean (assuming the population standard deviation is known)
- Understand the role of margin of error in constructing a confidence interval
- Determine the sample size necessary for estimating the population mean within a specified margin of error

Estimation

The environment of our problem is that we want to estimate the value of an unknown population mean.
The process that we use is called estimation.
This is one of the most common goals of statistics.

Point Estimate

Estimation involves two steps:
Step 1: obtain a specific numeric estimate; this is called the point estimate.
Step 2: quantify the accuracy and precision of the point estimate.
The first step is relatively easy. The second step is why we need statistics.

Examples of Point Estimate

Some examples of point estimates are:
- The sample mean to estimate the population mean
- The sample standard deviation to estimate the population standard deviation
- The sample proportion to estimate the population proportion
- The sample median to estimate the population median

Precision of Point Estimate

The most obvious point estimate for the population mean is the sample mean.
Now we will use the material on the sampling distribution of the sample mean to quantify the accuracy and precision of this point estimate.

Example

An example of what we want to quantify:
- We want to estimate the miles per gallon for a certain car
- We test some number of cars
- We calculate the sample mean; it is 27
- 27 miles per gallon would be our best guess

Example (continued)

How sure are we that the gas economy is 27 and not 28.1, or 25.2?
We would like to make a statement such as "We think that the mileage is 27 mpg and we're pretty sure that we're not too far off."

Interval Estimation

A confidence interval for an unknown parameter is an interval of numbers.
Compare this to a point estimate, which is just one number, not an interval (a range) of numbers.
The level of confidence represents the expected proportion of intervals that will contain the parameter if a large number of different samples is obtained.
The confidence interval quantifies the accuracy and precision of the point estimate.

Interpret Confidence Level

What does the level of confidence represent?
Suppose we obtain a series of 50 random samples from a population of interest and, from each of the sample means, follow a process for calculating a confidence interval for the population mean with a 90% level of confidence.
Then we would expect that 90% of those 50 confidence intervals (or about 45) would contain the population mean.
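The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with a hypothetical mean of 27 and a known standard deviation of 6, each sample has 40 values, and each interval is the sample mean plus or minus a normal critical value times sigma/sqrt(n). The function name `coverage_fraction` and all default values are made up for this example.

```python
import random
from statistics import NormalDist, mean

def coverage_fraction(mu=27.0, sigma=6.0, n=40, level=0.90, trials=50, seed=1):
    """Fraction of simulated known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-sided normal critical value for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage_fraction())             # 50 intervals, as in the slide
print(coverage_fraction(trials=5000))  # long-run fraction, close to 0.90
```

With only 50 intervals the observed fraction varies from run to run; "about 45 of 50" is an expectation, not a guarantee.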

Confidence Level

The level of confidence is always expressed as a percent.
The level of confidence is described by a parameter α (alpha).
The level of confidence is (1 − α) · 100%.
When α = .05, then (1 − α) = .95, and we have a 95% level of confidence.
When α = .01, then (1 − α) = .99, and we have a 99% level of confidence.

Confidence Interval

If we expect that a method would create intervals that contain the population mean 90% of the time, we call those intervals 90% confidence intervals.
If we have a method for intervals that contain the population mean 95% of the time, those are 95% confidence intervals.
And so forth.

Summary

To tie the definitions together:
- We are using the sample mean to estimate the population mean (point estimate)
- With each specific sample, we can construct, for instance, a 95% confidence interval to estimate the population mean (interval estimate)
- A 95% confidence interval tells you that if we take samples repeatedly, we expect that 95% of these intervals would contain the population mean

Example

Back to our 27 miles per gallon car: "We think that the mileage is 27 mpg and we're pretty sure that we're not too far off."
Putting in numbers (quantifying the accuracy): "We estimate the gas mileage is 27 mpg and we are 90% confident that the real mileage of this model of car is between 25 and 29 miles per gallon."

Example (continued)

- "We estimate the gas mileage is 27 mpg": this is our point estimate
- "and we are 90% confident that": our confidence level is 90% (which is 1 − α, i.e. α = 0.10)
- "the real mileage of this model of car": the population mean
- "is between 25 and 29 miles per gallon": our confidence interval is (25, 29)

Known Population Standard Deviation

First, we assume that we know the standard deviation of the population (σ).
This is not very realistic, but we need it for right now to introduce how to construct a confidence interval.
We'll solve this problem in a better way (where we don't know what σ is) later, but first we'll do this one.

Assumption

To estimate the mean µ with a known σ, we need a normal distribution assumption for the sampling distribution of the mean. The assumption is satisfied by:
1. Knowing that the sampled population is normally distributed, or
2. Using a large enough random sample (CLT)
Note: The CLT may be applied to smaller samples (for example n = 15) when there is evidence to suggest a unimodal distribution that is approximately symmetric. If there is evidence of skewness, the sample size needs to be much larger.

Sampling Distribution of Means

By the central limit theorem, we know that if the sample size is large enough, i.e. $n \ge 30$, we can assume that the sample means have a normal distribution with standard deviation $\sigma/\sqrt{n}$.
We look up a standard normal distribution: 95% of the values in a standard normal are between −1.96 and 1.96, in other words within ±1.96. (Note: we'll use the more accurate figures −1.96 and 1.96 instead of −2 and 2 from the empirical rule.)
We now use this for a general normal variable.

Sampling Distribution of Means

The values of a general normal random variable are within 1.96 times (or about 2 times, according to the empirical rule) its standard deviation away from its mean 95% of the time.
Thus the sample mean is within $\pm 1.96\,\sigma_{\bar{x}}$ of the population mean 95% of the time.
Here, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.

Interval for Sample Mean

Because the sample mean has an approximately normal distribution, it is in the interval $\mu \pm 1.96\,\sigma/\sqrt{n}$ around the (unknown) population mean 95% of the time. In other words, the interval will cover 95% of possible sample means when you take samples from the population repeatedly.
Since $\bar{X} = \mu \pm 1.96\,\sigma/\sqrt{n}$, we can flip the equation around between µ and $\bar{X}$ to solve for the population mean µ.

Interval for Population Mean

After we solve for the population mean µ, we find that µ is within the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ around the (known) sample mean 95% of the time.
This isn't exactly true in the mathematical sense, as the population mean is not a random variable; that's why we call this a confidence instead of a probability.

Confidence Interval

Thus a 95% confidence interval for the population mean is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.
This is in the form: point estimate ± margin of error.
The margin of error here is $1.96\,\sigma/\sqrt{n}$.
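The step of "flipping" the statement about the sample mean into a statement about µ can be written out line by line. The following is a sketch of the algebra, assuming the standard error $\sigma/\sqrt{n}$ used above; each line holds exactly when the previous one does, so all four describe the same event and have the same 95% probability.

```latex
\begin{align*}
& \mu - 1.96\,\frac{\sigma}{\sqrt{n}} \;\le\; \bar{X} \;\le\; \mu + 1.96\,\frac{\sigma}{\sqrt{n}} \\
\iff\; & -1.96\,\frac{\sigma}{\sqrt{n}} \;\le\; \bar{X} - \mu \;\le\; 1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(subtract } \mu \text{)} \\
\iff\; & -1.96\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu - \bar{X} \;\le\; 1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(multiply by } -1 \text{; the bounds are symmetric)} \\
\iff\; & \bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(add } \bar{X} \text{)}
\end{align*}
```

The randomness sits in $\bar{X}$, not in µ: it is the interval endpoints that vary from sample to sample.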

Example

For our car mileage example:
- Assume that the sample mean was 27 mpg
- Assume that we tested a sample of 40 cars
- Assume that we knew that the population standard deviation was 6 mpg
Then our 95% confidence interval estimate for the true (population) mean mileage would be $27 \pm 1.96 \cdot \frac{6}{\sqrt{40}}$, or 27 ± 1.9.

Critical Value

If we wanted to compute a 90% confidence interval, or a 99% confidence interval, etc., we would just need to find the right standard normal value (instead of 1.96 for a 95% confidence interval), called the critical value.
Frequently used confidence levels, and their critical values, are:
- 90% corresponds to 1.645
- 95% corresponds to 1.960
- 99% corresponds to 2.575

Critical Value

The numbers 1.645, 1.960, and 2.575 are written in the form $z_\alpha$, where α is the area to the right of the z value:
- $z_{0.05} = 1.645$, $P(Z \ge 1.645) = .05$ [TI calculator: invNorm(0.95,0,1) = 1.645]
- $z_{0.025} = 1.960$, $P(Z \ge 1.960) = .025$ [invNorm(0.975,0,1) = 1.960]
- $z_{0.005} = 2.575$, $P(Z \ge 2.575) = .005$ [invNorm(0.995,0,1) = 2.575]
where Z is a standard normal random variable.

How to Determine the Critical Value?

Why do we use $z_{0.025}$ for 95% confidence? To be within something 95% of the time:
- We can be too low 2.5% of the time
- We can be too high 2.5% of the time
Thus the 5% confidence that we don't have is split as 2.5% being too high and 2.5% being too low.

Critical Value $z_{\alpha/2}$ for Confidence Level 1 − α

In general, for a (1 − α) · 100% confidence interval, we need to find $z_{\alpha/2}$, the critical Z-value.
$z_{\alpha/2}$ is the value such that $P(Z \ge z_{\alpha/2}) = \alpha/2$.

Critical Value $z_{\alpha/2}$ for 1 − α Confidence Level

Once we know these critical values for the normal distribution, we can construct confidence intervals for the population mean: from $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ to $\bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$.
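The critical values and the known-sigma interval can also be computed directly instead of being read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `z_critical` and `z_interval` are made up for illustration, and the car figures (sample mean 27 mpg, sigma 6 mpg, 40 cars) are taken as the assumed inputs of the mileage example.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided standard normal critical value z_{alpha/2} for a confidence level."""
    alpha = 1 - level
    # inv_cdf takes the area to the LEFT, so we pass 1 - alpha/2
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    margin = z_critical(level) * sigma / n ** 0.5
    return xbar - margin, xbar + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} -> z = {z_critical(level):.3f}")

lower, upper = z_interval(27, 6, 40)
print(f"95% CI for mean mileage: ({lower:.1f}, {upper:.1f})")
```

The 99% value computed this way is about 2.576; tables often list it as 2.575, and the difference is immaterial for hand calculations.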

Example

The weights of full boxes of a certain kind of cereal are normally distributed with a standard deviation of 0.27 oz. A sample of 18 randomly selected boxes produced a mean weight of 9.87 oz. Find a 95% confidence interval for the true mean weight of a box of this cereal.

Solution: Follow the process below to solve.
1. Describe the population parameter of concern
   The mean, µ, weight of all boxes of this cereal
2. Specify the confidence interval criteria
   a. Check the assumptions: the weights are normally distributed, so the distribution of $\bar{X}$ is normal
   b. Identify the probability distribution and formula to be used: use a z-interval with σ = 0.27
   c. Determine the level of confidence, 1 − α: the question asks for 95% confidence, so 1 − α = 0.95
3. Collect and present information
   The sample information is given in the statement of the problem. Given: n = 18, $\bar{x} = 9.87$

Example (continued)

4. Determine the confidence interval
   a. Determine the critical value, either from a z-table or a TI graphing calculator: invNorm(1 − α/2,0,1) = invNorm(0.975,0,1) = 1.96
   b. Find the margin of error of the estimate: $z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.27}{\sqrt{18}} = 0.1247$
   c. Find the lower and upper confidence limits: $\bar{x}$ ± margin of error = 9.87 ± 0.1247, that is, 9.75 to 10.00
5. State the confidence interval and interpret it
   9.75 to 10.00 is a 95% confidence interval for the true mean weight, µ, of cereal boxes. This means that if we conduct the experiment over and over, and construct lots of confidence intervals, then 95% of the confidence intervals will contain the true mean value µ.
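The arithmetic in steps 4a to 4c can be reproduced in a few lines. This is a sketch under the assumption that the example's inputs are sigma = 0.27 oz, n = 18 and a sample mean of 9.87 oz; the variable names are illustrative only.

```python
from statistics import NormalDist

sigma = 0.27   # assumed known population standard deviation (oz)
n = 18         # number of boxes sampled
xbar = 9.87    # sample mean weight (oz)
level = 0.95   # confidence level, 1 - alpha

# Step 4a: critical value z_{alpha/2}
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
# Step 4b: margin of error z * sigma / sqrt(n)
margin = z * sigma / n ** 0.5
# Step 4c: lower and upper confidence limits
lower, upper = xbar - margin, xbar + margin

print(f"critical value  = {z:.2f}")
print(f"margin of error = {margin:.4f}")
print(f"95% CI          = ({lower:.3f}, {upper:.3f})")
```

To three decimals the limits come out near 9.745 and 9.995, which the worked example reports as 9.75 and 10.00.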
Margin of Error

Understand the role of margin of error in constructing a confidence interval.
If we write the confidence interval as 27 ± 2, then we would call the number 2 (after the ±) the size of the margin of error.
So we have three ways of writing confidence intervals:
- (25, 29)
- 27 ± 2
- 27 with a margin of error of 2

Margin of Error

The margins of error would be:
- $1.645\,\sigma/\sqrt{n}$ for 90% confidence intervals
- $1.960\,\sigma/\sqrt{n}$ for 95% confidence intervals
- $2.575\,\sigma/\sqrt{n}$ for 99% confidence intervals
Once we know the margin of error, we can state the confidence interval as sample mean ± margin of error.

Margin of Error

The margin of error, which is half of the length of a confidence interval, depends on three factors:
- The level of confidence (1 − α)
- The sample size (n)
- The standard deviation of the population (σ)

Notice that:
- The higher the confidence level, the longer the confidence interval. That is, a 99% confidence interval will be longer than a 90% confidence interval, because a wider interval has a better chance of covering the population mean.
- The larger the sample size, the shorter the confidence interval. This is because the larger the sample size, the smaller the standard error of the sample mean, which means the margin of error of the estimate is smaller.
- The larger the standard deviation of the population, the longer the confidence interval. So, if the value of the variable varies very much, the margin of error of the estimate increases.

Sample Size Determination

Determine the sample size necessary for estimating the population mean within a specified margin of error.
Often we have the reverse problem, where we want an experiment to achieve a particular accuracy of the estimation. That is, we want to make sure the population mean can be estimated within a target margin of error from a sample mean.
Since the sample size will affect the margin of error, we want to find the sample size (n) needed to achieve a particular size of margin of error in estimation.
Sample size determination is needed in designing an experimental investigation before the data collection.

Example

For our car miles per gallon, we had σ = 6.
If we wanted our margin of error to be 1 for a 95% confidence interval, then we would need $1.96 \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{6}{\sqrt{n}} = 1$.
Solving for n gives $n = (1.96 \cdot 6)^2 \approx 138.3$, so, rounding up, n = 139 cars would be needed.

Sample Size Determination

We can write this as a formula. The sample size n needed to result in a margin of error E for (1 − α) · 100% confidence is

$n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2$

Usually we don't get an integer for n, so we would need to take the next higher number (the one lower wouldn't be large enough).

Summary

- We can construct a confidence interval around a point estimator if we know the population standard deviation σ
- The margin of error is calculated using σ, the sample size n, and the appropriate Z-value
- We can also calculate the sample size needed to obtain a target margin of error

Confidence Intervals about a Population Mean in Practice where the Population Standard Deviation is Unknown
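The sample-size formula, including the "round up to the next integer" rule, can be sketched as a small function. The name `sample_size_for_mean` is made up for illustration; the inputs in the usage line are the car example's sigma = 6 and target margin of error 1 at 95% confidence.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, level=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin (sigma assumed known)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    raw = (z * sigma / margin) ** 2   # usually not an integer
    return math.ceil(raw)             # round UP; rounding down would miss the target

print(sample_size_for_mean(sigma=6, margin=1))   # 139
```

The unrounded value is about 138.3, so 138 cars would leave the margin of error slightly above 1; rounding up to 139 meets the target.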

Learning Objectives

- Know the properties of the t-distribution
- Determine t-values
- Construct and interpret a confidence interval about a population mean

Know the properties of the t-distribution

Unknown Population Standard Deviation

So far we assumed that we knew the population standard deviation σ.
But this assumption is not realistic, because if we knew the population standard deviation, we probably would know the population mean as well. Then there would be no need to estimate the population mean using a sample mean.
So, it is more realistic to construct confidence intervals in the case where we do not know the population standard deviation.

Replacing σ with s

If we don't know the population standard deviation, we obviously can't use the formula margin of error = $1.96\,\sigma/\sqrt{n}$, because we have no number to use for σ.
However, just as we can use the sample mean to approximate the population mean, we can also use the sample standard deviation to approximate the population standard deviation.

Student's t-distribution

Because we've changed our formula (by using s instead of σ), we can't use the normal distribution any more.
Instead of the normal distribution, we use the Student's t-distribution.
This distribution was developed specifically for the situation when σ is not known.

Properties of the t-distribution

Several properties of the Student's t-distribution are familiar:
- Just like the normal distribution, it is centered at 0 and symmetric about 0
- Just like the normal curve, the total area under the Student's t curve is 1, the area to the left of 0 is ½, and the area to the right of 0 is also ½
- Just like the normal curve, as t increases, the Student's t curve gets close to, but never reaches, 0

Difference between Z and t

So what's different? Unlike the normal, there are many different standard t-distributions:
- There is a standard one with 1 degree of freedom
- There is a standard one with 2 degrees of freedom
- There is a standard one with 3 degrees of freedom
- Etc.
The number of degrees of freedom is crucial for the t-distributions.

t-statistic

When σ is known, the z-score $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ follows a standard normal distribution.
When σ is not known, the t-statistic $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows a t-distribution with n − 1 (sample size minus 1) degrees of freedom.

t-distribution

[Figure: comparing three curves: the standard normal curve, the t curve with 14 degrees of freedom, and the t curve with 4 degrees of freedom]

Determine t-values

Calculation of t-distribution

The calculation of t-distribution values $t_\alpha$ can be done in similar ways as the calculation of normal values $z_\alpha$:
- Using tables
- Using technology (TI graphing calculator)

Use a t-table as shown below to find a critical value, or use a TI graphing calculator. For instance:
- $t_{0.05}$ with df = 3: invT(0.95,3) = 2.3534
- $t_{0.01}$ with df = 11: invT(0.99,11) ≈ 2.718

Upper critical values of Student's t distribution with ν degrees of freedom

        Probability of exceeding the critical value
  ν     0.10    0.05    0.025    0.01    0.005    0.001
  1    3.078   6.314   12.706  31.821   63.657  318.313
  2    1.886   2.920    4.303   6.965    9.925   22.327
  3    1.638   2.353    3.182   4.541    5.841   10.215
  4    1.533   2.132    2.776   3.747    4.604    7.173
  5    1.476   2.015    2.571   3.365    4.032    5.893
  6    1.440   1.943    2.447   3.143    3.707    5.208
  7    1.415   1.895    2.365   2.998    3.499    4.782
  8    1.397   1.860    2.306   2.896    3.355    4.499
  9    1.383   1.833    2.262   2.821    3.250    4.296
 10    1.372   1.812    2.228   2.764    3.169    4.143
 11    1.363   1.796    2.201   2.718    3.106    4.024
 12    1.356   1.782    2.179   2.681    3.055    3.929
 13    1.350   1.771    2.160   2.650    3.012    3.852
 14    1.345   1.761    2.145   2.624    2.977    3.787
 15    1.341   1.753    2.131   2.602    2.947    3.733
 16    1.337   1.746    2.120   2.583    2.921    3.686
 17    1.333   1.740    2.110   2.567    2.898    3.646
 18    1.330   1.734    2.101   2.552    2.878    3.610
 19    1.328   1.729    2.093   2.539    2.861    3.579
 20    1.325   1.725    2.086   2.528    2.845    3.552
 21    1.323   1.721    2.080   2.518    2.831    3.527
 22    1.321   1.717    2.074   2.508    2.819    3.505
 23    1.319   1.714    2.069   2.500    2.807    3.485
 24    1.318   1.711    2.064   2.492    2.797    3.467
 25    1.316   1.708    2.060   2.485    2.787    3.450
 26    1.315   1.706    2.056   2.479    2.779    3.435
 27    1.314   1.703    2.052   2.473    2.771    3.421
 28    1.313   1.701    2.048   2.467    2.763    3.408
 29    1.311   1.699    2.045   2.462    2.756    3.396
 30    1.310   1.697    2.042   2.457    2.750    3.385

Critical values t

Critical values for various degrees of freedom for the t-distribution are (compared to the normal):

Sample size n          6      16      31     101    1001    Normal
Degrees of freedom     5      15      30     100    1000    Infinite
t_{0.025}          2.571   2.131   2.042   1.984   1.962    1.960

Note: When the sample size is large, a t distribution is close to a z distribution.

Construct and interpret a t-confidence interval about a population mean

z-score and t-score

The difference between the two formulas

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ and $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

is that the sample standard deviation s is used to approximate the population standard deviation σ.
The z-score has a normal distribution; the t-statistic (or t-score) has a t-distribution.

95% Confidence Interval for the Mean with Unknown σ

A 95% confidence interval, with σ unknown, is from $\bar{x} - t_{0.025}\,\frac{s}{\sqrt{n}}$ to $\bar{x} + t_{0.025}\,\frac{s}{\sqrt{n}}$, where $t_{0.025}$ is the critical value for the t-distribution with (n − 1) degrees of freedom.
Note: Compare it to the 95% confidence interval with a known σ: from $\bar{x} - z_{0.025}\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{0.025}\,\frac{\sigma}{\sqrt{n}}$.

Critical Value $t_{\alpha/2}$ corresponding to Confidence Level 1 − α

The different 95% confidence intervals with $t_{0.025}$ would be:
- For n = 6, the sample mean ± 2.571 s/√6
- For n = 16, the sample mean ± 2.131 s/√16
- For n = 31, the sample mean ± 2.042 s/√31
- For n = 101, the sample mean ± 1.984 s/√101
- For n = 1001, the sample mean ± 1.962 s/√1001
- When σ is known, the sample mean ± 1.960 σ/√n

Confidence Interval for the Mean with Unknown σ

In general, the (1 − α) · 100% confidence interval, when σ is unknown, is from $\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}}$ to $\bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$, where $t_{\alpha/2}$ is the critical value for the t-distribution with (n − 1) degrees of freedom.

Approximate t with z

As the sample size gets large, there is less and less of a difference between the critical values for the normal and the critical values for the t-distribution.
Although the t-critical value and the z-critical value may be close to each other when the sample size is large, we still recommend using a t-distribution when σ is not known, to obtain a more accurate answer.
When doing a rough assessment by hand, the normal critical values can be used, particularly when n is large, for example if n is 30 or more.

Example 1

Assume that we want to estimate the average weight of a particular type of very rare fish.
- We are only able to borrow 7 specimens of this fish
- The average weight of these was 1.38 kg (the sample mean)
- The standard deviation of these 7 specimens was 0.29 kg (a sample standard deviation)
What is a 95% confidence interval for the true mean weight?

Example 1 (continued)

With n = 7, the critical value $t_{0.025}$ for 6 degrees of freedom is 2.447.
Our confidence interval thus is

$1.38 - 2.447 \cdot \frac{0.29}{\sqrt{7}} = 1.11$ to $1.38 + 2.447 \cdot \frac{0.29}{\sqrt{7}} = 1.65$

or (1.11, 1.65).

Example 2

Suppose you do a study of acupuncture to determine how effective it is in relieving pain. You measure sensory rates for 15 subjects with the results given below. Use the sample data to construct a 95% confidence interval for the mean sensory rate for the population (assumed normal) from which you took the data.

8.6; 9.4; 7.9; 6.8; 8.3; 7.3; 9.2; 9.6; 8.7; 11.4; 10.3; 5.4; 8.1; 5.5; 6.9

Solution

To find the confidence interval, first we need to find the sample mean. Since the population standard deviation is not given and we have the sample data to calculate the sample standard deviation, we can construct a t-confidence interval for estimating the mean.
Using a TI calculator, enter the data and obtain one-variable statistics. We obtain $\bar{x} = 8.2267$ and s = 1.6722, where n = 15.
The critical value is $t_{0.025;\,df=14} = 2.145$.
The 95% confidence interval is $8.2267 \pm 2.145 \cdot \frac{1.6722}{\sqrt{15}}$, that is, between 7.30 and 9.15.

Check the underlying distribution

When applying a t-interval, we need to make sure the underlying population is approximately normally distributed.
When the sample size is small, an outlier in the data will have a major effect on the data set, because outliers will affect the calculation of the sample mean and sample standard deviation.
So what can we do?
- For a small sample, we always must check to see that the outlier is a legitimate data value (and not just a typo)
- We can collect more data, for example increase n to be over 30. Applying the central limit theorem, we can then use a z-interval to approximate a t-interval.

Summary

- We used values from the normal distribution when we knew the value of the population standard deviation σ
- When we do not know σ, we estimate σ using the sample standard deviation s
- We use values from the t-distribution when we use s instead of σ, i.e. when we don't know the population standard deviation
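The t-interval for the acupuncture data can be reproduced with the standard library alone, provided the critical value is looked up in a t-table. The sketch below makes two assumptions that should be read as such: the seventh observation is taken to be 9.2, and the critical value 2.145 is the tabled $t_{0.025}$ for 14 degrees of freedom. The function name `t_interval` is made up for illustration.

```python
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Confidence interval for a mean with sigma unknown.

    t_crit is the t critical value for len(data) - 1 degrees of freedom,
    supplied by the caller (for example, read from a t-table).
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                    # sample standard deviation (divides by n - 1)
    margin = t_crit * s / n ** 0.5
    return xbar - margin, xbar + margin

# Sensory rates for the 15 subjects (seventh value assumed to be 9.2)
rates = [8.6, 9.4, 7.9, 6.8, 8.3, 7.3, 9.2, 9.6, 8.7,
         11.4, 10.3, 5.4, 8.1, 5.5, 6.9]
T_CRIT_DF14 = 2.145                    # t_{0.025}, 14 degrees of freedom, from the table

lower, upper = t_interval(rates, T_CRIT_DF14)
print(f"mean = {mean(rates):.4f}, s = {stdev(rates):.4f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; a statistics package that provides the inverse t-distribution could supply it instead.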

Confidence Intervals about a Population Proportion

Learning Objectives

- Obtain a point estimate for the population proportion
- Construct and interpret a confidence interval for the population proportion
- Determine the sample size necessary for estimating a population proportion within a specified margin of error

Obtain a point estimate for the population proportion

Mean & Proportion

So far, we learned to calculate confidence intervals for the population mean when we knew σ.
We also learned to calculate confidence intervals for the mean when we did not know σ.
Here, we'll learn how to construct confidence intervals for situations when we are analyzing a population proportion.
The issues and methods are quite similar.

Sample Proportion

When we analyze the population mean, we use the sample mean as the point estimate. The sample mean is our best guess for the population mean.
When we analyze the population proportion, we use the sample proportion as the point estimate. The sample proportion is our best guess for the population proportion.

Proportion Point Estimate

Using the sample proportion is the natural choice for the point estimate.
If we are doing a poll, and 68% of the respondents said yes to our question, then we would estimate that 68% of the population would say yes to our question also.
The sample proportion is written as $\hat{p}$.

Construct and interpret a confidence interval for the population proportion

Confidence Interval for Mean versus Proportion

Confidence intervals for the population mean are:
- Centered at the sample mean
- Plus and minus $z_{\alpha/2}$ times the standard deviation of the sample mean (the standard error from the sampling distribution)
Similarly, confidence intervals for the population proportion will be:
- Centered at the sample proportion
- Plus and minus $z_{\alpha/2}$ times the standard deviation of the sample proportion

Sampling Distribution of Proportion

We have already studied that the distribution of the sample proportion $\hat{p}$ is approximately normal with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ under most conditions.
We use this to construct confidence intervals for the population proportion.

Confidence Interval for Population Proportion

The (1 − α) · 100% confidence interval for the population proportion is from

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ to $\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

where $z_{\alpha/2}$ is the critical value for the normal distribution.
Note: That is, sample proportion ± $z_{\alpha/2}$ × standard error of the sample proportion.

Margin of Error

As for confidence intervals for population means, the quantity $z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ is called the margin of error.

Example

We polled n = 500 voters (this is a sample of voters).
When asked about a ballot question, $\hat{p}$ = 47% of them were in favor.
Obtain a 99% confidence interval for the population proportion in favor of this ballot question (α = 0.01, so α/2 = 0.005).

Example (continued)

The critical value $z_{0.005} = 2.575$, so

$0.47 - 2.575\sqrt{\frac{0.47 \cdot 0.53}{500}} = 0.41$
$0.47 + 2.575\sqrt{\frac{0.47 \cdot 0.53}{500}} = 0.53$

or (0.41, 0.53) is a 99% confidence interval for the population proportion.

Determine the sample size necessary for estimating a population proportion within a specified margin of error

Sample Size Determination

We often want to know the minimum sample size needed to obtain a target margin of error for estimating the population proportion.
A common use of this calculation is in polling: how many people need to be polled for the result to have a certain margin of error?
News stories often say "the latest polls show that so-and-so will receive X% of the votes with an E% margin of error."

Example 1

For our polling example, how many people need to be polled so that we are within 1 percentage point with 99% confidence?
The margin of error is $z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, which must be 0.01.
We have a problem, though: what is $\hat{p}$?

Example 1 (continued): Two choices of $\hat{p}$

If we try to figure out the sample size in the experimental design stage before collecting data, then we do not have sample data to calculate $\hat{p}$.
- A way around this is that using $\hat{p}$ = 0.5 will always yield a sample size that is large enough.
- We can also use an estimate of $\hat{p}$ from a previous study (historic data) to calculate the sample size.
In our case, if we use $\hat{p}$ = 0.5, then we have

$2.575\sqrt{\frac{0.5 \cdot 0.5}{n}} = 0.01$, so $n = 0.25\left(\frac{2.575}{0.01}\right)^2$ and n = 16,577.
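The ballot-question interval can be checked numerically. This is a minimal sketch using the standard library; `proportion_interval` is a name made up for illustration, and the inputs are the example's sample proportion 0.47, n = 500 voters and 99% confidence.

```python
from statistics import NormalDist

def proportion_interval(p_hat, n, level=0.95):
    """Large-sample z-interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    std_err = (p_hat * (1 - p_hat) / n) ** 0.5      # estimated standard error of p-hat
    margin = z * std_err
    return p_hat - margin, p_hat + margin

lower, upper = proportion_interval(0.47, 500, level=0.99)
print(f"99% CI = ({lower:.2f}, {upper:.2f})")
```

The interval straddles 0.50, so at 99% confidence this poll alone does not settle whether a majority is in favor.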

Example 1 (continued)

We understand now why political polls often have a 3 or 4 percentage point margin of error.
Since it takes a large sample (n = 16,577) to be 99% confident within 1 percentage point, the 3 or 4 percentage point margin of error targets are good compromises between accuracy and cost effectiveness.

Sample Size Determination

We can write this as a formula. The sample size n needed to result in a margin of error E for (1 − α) · 100% confidence for a population proportion is

$n = \frac{(z_{\alpha/2})^2\,\hat{p}(1-\hat{p})}{E^2}$

Usually we don't get an integer for n, so we would need to take the next higher number (the one lower wouldn't be large enough).

Example 2

Determine the sample size necessary to estimate the true proportion of laboratory mice with a certain genetic defect. We would like the estimate to be within 0.015 with 95% confidence.

Solution:
1. Level of confidence: 1 − α = 0.95, $z_{\alpha/2}$ = 1.96
2. Desired maximum error is E = 0.015
3. No estimate of p is given, so use $\hat{p}$ = 0.5
4. Use the formula for n:

$n = \frac{(z_{\alpha/2})^2\,\hat{p}(1-\hat{p})}{E^2} = \frac{(1.96)^2 (0.5)(0.5)}{(0.015)^2} = 4268.44 \Rightarrow 4269$

Example 2 (continued)

Suppose we know the genetic defect occurs in approximately 1 of 80 animals.
Use $\hat{p}$ = 1/80 = 0.0125:

$n = \frac{(z_{\alpha/2})^2\,\hat{p}(1-\hat{p})}{E^2} = \frac{(1.96)^2 (0.0125)(0.9875)}{(0.015)^2} = 210.75 \Rightarrow 211$

Note: As illustrated here, it is an advantage to have some indication of the value expected for p, especially as p becomes increasingly further from 0.5.

Summary

- We can construct confidence intervals for population proportions in much the same way as for population means
- We need to use the formula for the standard deviation of the sample proportion
- We can also compute the minimum sample size needed for a desired level of accuracy

Which Procedure Do I Use?
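The sample-size formula for a proportion can be sketched as a function that takes the critical value as an argument, so that the tabled values used in the examples (2.575 and 1.96) reproduce the worked results exactly. The function name and parameter names are made up for illustration; `p_guess` defaults to 0.5, the conservative choice when no prior estimate is available.

```python
import math

def sample_size_for_proportion(margin, z, p_guess=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin, rounded up.

    margin  : target margin of error as a decimal (0.01 = 1 percentage point)
    z       : critical value z_{alpha/2}, e.g. 1.96 for 95% confidence
    p_guess : planning value for the proportion; 0.5 gives the largest n
    """
    raw = z ** 2 * p_guess * (1 - p_guess) / margin ** 2
    return math.ceil(raw)

print(sample_size_for_proportion(0.01, 2.575))          # polling example: 16577
print(sample_size_for_proportion(0.015, 1.96))          # mice, no prior estimate: 4269
print(sample_size_for_proportion(0.015, 1.96, 1 / 80))  # mice, p about 1/80: 211
```

The last two lines show how much a prior estimate far from 0.5 can reduce the required sample size. Using a more precise 99% critical value (about 2.576) would give a slightly larger answer for the polling case.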

Overview

There are three different confidence interval calculations covered in this unit.
It can be confusing which one is appropriate for which situation: "I should use the normal ... no, the t ... no, the ???"

Which Parameter?

The one main question right at the beginning: which parameter are we trying to estimate?
- A mean?
- A proportion?
This is the single most important question.

z-interval or t-interval?

In analyzing population means: is the population variance known?
- If so, then we can use the normal distribution
- If the population variance is not known:
  - If we have enough data (30 or more values), we still can use the normal distribution
  - If we don't have enough data (29 or fewer values), we should use the Student's t-distribution
We don't have to ask this question in the analysis of proportions.

z-interval for mean

For the analysis of a population mean, if
- the data is OK (reasonably normal), and
- the variance is known,
then we can use the normal distribution with a confidence interval from $\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$.

t-interval for mean

For the analysis of a population mean, if
- the data is OK (reasonably normal), and
- the variance is NOT known,
then we can use the Student's t-distribution with a confidence interval from $\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}}$ to $\bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$.

z-interval for Proportion

For the analysis of a population proportion, if the sample size is large enough, then we can use the proportions method with a confidence interval from $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ to $\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
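The decision rules on this page can be summarized as a small lookup function. This is only a sketch of the rules as stated here: the function name, argument names and return strings are made up for illustration, and the threshold of 30 is the rough cutoff quoted above (the t-interval remains the more accurate choice whenever the standard deviation is unknown).

```python
def choose_interval(parameter, sigma_known=False, n=None):
    """Return 'z' or 't' following the decision rules summarized above.

    parameter   : "mean" or "proportion"
    sigma_known : whether the population standard deviation is known (means only)
    n           : sample size, needed when estimating a mean with sigma unknown
    """
    if parameter == "proportion":
        return "z"                      # proportions always use the normal method
    if parameter == "mean":
        if sigma_known:
            return "z"
        if n is None:
            raise ValueError("n is required when sigma is unknown")
        # Rough rule: with 30 or more values the normal is an acceptable approximation
        return "z" if n >= 30 else "t"
    raise ValueError("parameter must be 'mean' or 'proportion'")

print(choose_interval("mean", sigma_known=True))          # z
print(choose_interval("mean", sigma_known=False, n=15))   # t
print(choose_interval("proportion"))                      # z
```

The function only encodes which distribution to use; checking that the data look reasonably normal, or that the sample is large enough for a proportion, still has to be done separately.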

Summary

The main questions that determine the confidence interval to use:
- Is it a population mean or a population proportion?
- In the case of a population mean, we need to determine: Is the population variance known? Does the data look reasonably normal?

Estimating the Value of a Parameter Using Confidence Intervals: Summary

- We can use a sample {mean, proportion} to estimate the population {mean, proportion}
- In each case, we can use the appropriate sampling distribution of the sample statistic to construct a confidence interval around our estimate
- The confidence interval expresses the confidence we have that our calculated interval contains the true parameter