Chapter 19: Confidence Intervals for Proportions

When we made our probability calculations back in chapter 18, we were holding on to one last thing that kept us from reality: knowing the value of $p$. Since that's a parameter, we ought not to know its value! In this chapter, we'll deal with that one last thread and return fully to procedures that work in the real world.

A Point Estimate for $p$

A point estimate is a single number that estimates a parameter. In this chapter, that parameter is $p$, the population proportion. If you don't know the proportion for the population, what are you going to do? Take a sample, of course! The statistic that you get from that sample is a point estimate for the parameter. Thus, $\hat{p}$ is a point estimator of $p$.

Developing a Better Method

The Problem

The problem with point estimates is that they are almost always wrong. Try it: flip a coin a few times (say, 20). Did you get exactly half of those tosses to be heads? Probably not (you have the tools to calculate the probability that exactly 10 of 20 coin tosses come up heads). Point estimates rarely give the answer that we're looking for.

The Solution

There must be a better method: some way of saying "I think that the parameter is here, and I feel confident that it really is there." The solution is to add a margin of error: a region around our point estimate where we are pretty sure that the parameter lies. How wide should that region be? I'm 100% confident that the proportion is between 0 and 1, but that's not a very useful answer.

The Theory

The key is to use what we know about sampling distributions. In particular, the fact that $\hat{p}$ is approximately normal, with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, is of immense help. Now, can you describe a region where approximately 95% of $\hat{p}$ values lie? Of course you can, if you remember the Empirical Rule! The interval $(p - 2\sigma_{\hat{p}},\ p + 2\sigma_{\hat{p}})$ should contain about 95% of all $\hat{p}$ values. Put another way, 95% of samples will produce a value of $\hat{p}$ that is within two $\sigma_{\hat{p}}$ of $p$.

Now for a little word play: if I am standing within two meters of you, are you standing within two meters of me? Of course you are! Now, apply that same logic to the last statistics statement I made.
HOLLOMAN'S AP STATISTICS BVD CHAPTER 19, PAGE 1 OF 7.

95% of samples will produce a value of $\hat{p}$ where $p$ is within two $\sigma_{\hat{p}}$ of $\hat{p}$. Again, with more symbols: 95% of samples will produce a value of $\hat{p}$ so that $p$ lies in the interval $(\hat{p} - 2\sigma_{\hat{p}},\ \hat{p} + 2\sigma_{\hat{p}})$. Aha! There it is. I now have a little piece that, when added to my original point estimate, produces a region (an interval) where I am pretty sure (in this case, about 95% sure) that the parameter lies.

Wait, we're still using $p$! Indeed we are: since $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, we're still hanging on to the unrealistic idea that we know $p$. Well, if you don't have a $p$ to put in there, what are you going to do? Hint: quitting or crying are not options. Maybe we should use a number that's a good estimate for $p$... if only we knew a point estimate for $p$... $\hat{p}$, of course!

You might be wondering if that replacement messes up the theory (or calculations) so far. Fortunately, no. Remember that $\mu_{\hat{p}} = p$: the center of all possible values of $\hat{p}$ is $p$. When that happens (when the center of the sampling distribution equals the parameter that we are trying to estimate) we say that the statistic is unbiased. The result of that is that our calculations still hold, but we can't call it $\sigma_{\hat{p}}$ anymore, so instead we call it the standard error of $\hat{p}$: $SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.

Critical Values

So, there are still two issues. First, what if I want to be more sure than 95%? What if I want to be 99% sure? The Empirical Rule doesn't have a number with a middle area of 99%. 68%, 95%, 99.7%: the Empirical Rule will only let me find intervals to be that sure. Second, the Empirical Rule is only approximate. Surely there is a more exact way? And of course there is.

The key is in realizing that those numbers (I used 2 for my theory talk a little earlier) are really z-scores. About 95% of the data in a normal distribution lies between $z = -2$ and $z = 2$. Thus, the issue is to find a value of $z$ where the area between $-z$ and $z$ is a certain amount (like 99%). Wait, didn't we do problems like that? Of course we did!

The value of $z$ that has a particular amount of area to one side is called a critical value. For our intervals, the given area is in the middle, but most people define critical values in terms of the left or right hand areas. Let's say our area in the middle is $C$; that makes the area above $z$ equal to $\frac{1-C}{2}$. The notation for a critical value is $z^*$. The critical value helps determine how wide your interval should be, so that you can be suitably confident of catching the parameter.

Conditions

Alas, we're still making some assumptions. Relax: it's impossible to get anything done without some assumptions. We just need to make sure that we understand what they are and, if possible, check to see if those assumptions hold up.

First of all, our work with proportions is based on what we know about binomial random variables. What is required to make a random variable binomial?

Success and failure: check. We're still doing that by only counting those individuals that have some quality of interest.

Fixed probability of success: well, that one turns out to be hard. We're going to assume that this is true without ever mentioning it, because working in a situation where there isn't a fixed probability of success forces you to start using Bayesian methods, and we're not going there.

Fixed number of trials: check. We'll definitely have a fixed sample size.

Independent trials: another hard one. There are two things that we can do to try to ensure that the trials are independent: obtain a random sample, and make sure that the sample is not too large relative to the population. We'll definitely mention that we need a random sample; in fact, almost every procedure that we develop will require a random sample.

As for the "not too large" issue: the problem is that when we sample from our population, we do so without replacement; we take a fixed number of individuals from a finite population. When we do this, we are actually in a hypergeometric situation. But relax, there's an escape clause! As long as the sample is not too large relative to the population (a good rule of thumb is that the sample is less than 10% of the population), the hypergeometric and the binomial get really close to one another. Thus, if our sample is smaller than 10% of the population, we can continue using the methods we've developed.
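The 10% rule of thumb can be checked numerically. The sketch below is an illustration only: the function names, the sample size of 50, the success proportion of 0.3, and the two population sizes are all assumptions chosen for the demonstration, not values from these notes. It measures the largest gap between the binomial and hypergeometric probabilities for the same sample.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, n, successes, population):
    """P(exactly k successes when n individuals are drawn without replacement)."""
    return (
        comb(successes, k)
        * comb(population - successes, n - k)
        / comb(population, n)
    )


def max_pmf_gap(n, population, p=0.3):
    """Largest absolute difference between the two models over k = 0..n.

    p = 0.3 is a hypothetical population proportion used only for illustration.
    """
    successes = round(p * population)
    p_exact = successes / population
    return max(
        abs(binomial_pmf(k, n, p_exact) - hypergeometric_pmf(k, n, successes, population))
        for k in range(n + 1)
    )


# Sample of 50 from a population of 10,000: the sample is 0.5% of the population.
small = max_pmf_gap(n=50, population=10_000)
# Sample of 50 from a population of 100: the sample is 50% of the population.
large = max_pmf_gap(n=50, population=100)
print(f"sample is 0.5% of population: max gap = {small:.5f}")
print(f"sample is 50% of population:  max gap = {large:.5f}")
```

Under these assumed values the gap should be negligible for the small sampling fraction and clearly visible for the large one, which is the behavior the rule of thumb describes.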
But really, if you've got 10% of the population in your sample, there's a good chance that you could have just measured the whole population and avoided all of this mess in the first place! Thus, this 10% condition is not terribly important for our work. It is an issue, and you should keep it in the back of your mind, but we're not going to regularly state that this needs to be true.

That's it for the binomial, but there was one more thing we did to develop our interval: we used the fact that the binomial looked approximately normal. Do you remember the requirements for a binomial to be approximately normal? We used $p$ in that check previously, but we don't know $p$ anymore. I wonder what we should do? (Keep reading to find out.)

Summary: An Interval Estimate for $p$

A level $C$ confidence interval for $p$ is $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z^*$ is the upper $\frac{1-C}{2}$ critical value from the Standard Normal Distribution (a normal where $\mu = 0$ and $\sigma = 1$). Constructing this interval requires that the sample was obtained randomly, and that both $n\hat{p}$ and $n(1-\hat{p})$ are at least 10.

Examples

[1.] A Harris Poll from June 2000 reported that 79% of U.S. citizens (based on a random sample of 2000 people) thought that elected officials should be subjected to random drug tests. Let's construct a 90% confidence interval for the true population proportion that agree with this idea.

To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'm told that the sample was obtained randomly. $n\hat{p}$ is 1580 and $n(1-\hat{p})$ is 420; each of these is at least 10, so we may proceed. 90% confidence gives $z^* = 1.645$. The interval is

$0.79 \pm 1.645\sqrt{\frac{(0.79)(0.21)}{2000}} = 0.79 \pm 0.0150 = (0.7750,\ 0.8050)$.

I am 90% confident that the true proportion of U.S. citizens that agree with this statement is between 77.5% and 80.5%.

[2.] Researchers are testing a new drug to help patients with narcolepsy. Of the 323 participants, 27 reported nausea as a side effect of the drug. Construct a 99% confidence interval for the proportion of patients that can expect to experience nausea while using this drug.

To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'm not told that this sample was obtained randomly; I'll have to assume that this is the case. $n\hat{p} = 27$ and $n(1-\hat{p}) = 296$; since each of these is at least 10, I can proceed. For 99% confidence, $z^* = 2.5758$. The interval is

$0.0836 \pm 2.5758\sqrt{\frac{(0.0836)(0.9164)}{323}} = (0.0439,\ 0.1233)$.

I am 99% confident that the population proportion of users of this drug that will experience nausea is between 4.39% and 12.33%.

[3.] A study of 5302 people aged 60 or older in the United States found 127 with rheumatoid arthritis. Construct a 90% confidence interval for the actual proportion of all people aged 60 and older who have rheumatoid arthritis.

To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'll have to assume that the sample was obtained randomly. $n\hat{p} = 127$ and $n(1-\hat{p}) = 5175$; since each of these is at least 10, I can continue. 90% confidence makes $z^* = 1.645$. The interval is

$0.024 \pm 1.645\sqrt{\frac{(0.024)(0.976)}{5302}} = (0.0205,\ 0.0274)$.

I am 90% confident that the proportion of all adults aged 60 and over who suffer from rheumatoid arthritis is between 2.05% and 2.74%.

Interpreting Confidence

You saw, in my examples above, how I finished with a statement like "I am 90% confident that..." This interprets the interval, and you must do this, but sometimes you'll be asked to interpret what "90% confident" means. For this, you must be cautious. I've said it clearly and correctly, but your attempts to say that in your own words will probably backfire. Here is a template for saying it correctly; I've left markers to indicate spots where you have to fill in some details.

<C%> confident means that if we took many samples, and constructed an interval from each sample, then about <C%> of those intervals ought to contain the true population proportion of <give some context>.

Examples

[4.] From example 1: what do we mean when we say we are 90% confident that the true population proportion of those who think that elected officials should be subjected to drug tests is contained within this interval?

If I took many samples, and constructed an interval from each sample, then about 90% of those intervals ought to contain the population proportion of people who think that elected officials should be subjected to drug tests.

[5.] From example 2: what do we mean when we say we are 99% confident that the true population proportion of users of this drug that will experience nausea is contained within this interval?

If I took many samples, and constructed an interval from each of those samples, then about 99% of those intervals ought to contain the population proportion of users of this drug that will experience nausea as a side effect.
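Both the interval formula and the "many samples" interpretation can be tried out in a few lines of Python. This is a minimal sketch, not part of the original notes: the function names, the simulated sample size of 500, the 1000 repetitions, and the seed are assumptions made for illustration. The critical value comes from the standard library's `NormalDist`, so it carries more decimal places than the rounded $z^*$ values used in the examples.

```python
import random
from math import sqrt
from statistics import NormalDist


def critical_value(confidence):
    """z* with middle area `confidence`, i.e. upper-tail area (1 - C) / 2."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def proportion_interval(successes, n, confidence):
    """Interval p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    # Condition from the summary: at least 10 successes and 10 failures.
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    margin = critical_value(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, confidence, repetitions=1000, seed=19):
    """Fraction of simulated samples whose interval contains true_p.

    true_p is known here only because this is a simulation; the repetition
    count and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_interval(successes, n, confidence)
        hits += low <= true_p <= high
    return hits / repetitions


print(proportion_interval(1580, 2000, 0.90))  # data from example 1
print(proportion_interval(27, 323, 0.99))     # data from example 2
print(coverage(true_p=0.79, n=500, confidence=0.90))
```

The first two calls should agree with examples 1 and 2 to about four decimal places. The last call pretends the true proportion is 0.79 and reports the share of simulated intervals that capture it, which should land near the stated 90%.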

More About the Margin of Error

An Issue

It should be fairly obvious that we want the margin of error to be small: a small region where we think the population proportion lies is much more interesting and useful than a very wide region. The part of the formula that represents margin of error is $z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$. What values will affect the margin of error?

The Effects of the Numbers

First of all, there is the critical value, which comes from our level of confidence. A larger critical value means a larger margin of error, so we want a smaller critical value. The critical value comes from the choice of confidence level ($C$), so what kinds of confidence levels will result in smaller critical values, and thus smaller margins of error? Here's an easy way to look at it: I am 0% confident in my point estimate (which has a small margin of error: zero), and I am 100% confident that the population proportion is a real number (which has a large margin of error: infinity). Do you see the relationship between confidence level and margin of error? We typically want very high confidence levels (between 90% and 99%), so that doesn't leave much room for changing the margin of error.

The sample proportion has an effect, but we don't know that value until after we're done, so it isn't as useful for planning purposes.

That leaves the sample size, over which we have quite a bit of control. What kinds of sample sizes will result in smaller margins of error? Look at that formula again, notice that $n$ is in the denominator, and think... We have control over the sample size, but [1] there is an upper limit (about 10% of the population) and [2] it is often expensive to obtain very large samples, and larger samples mean more work!

Here's the thought that statisticians have: if I decide ahead of time what margin of error I want (and what level of confidence I want), what's the smallest sample size that should produce the desired margin of error?

Solving for Sample Size

In the equation $m = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, we'll know $m$ and $z^*$, and we want to solve for $n$. That just leaves one thing: what will we use for the sample proportion?
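It may help to see the algebra written out first. The rearrangement below is a sketch added for clarity; it uses only the symbols $m$, $z^*$, $\hat{p}$, and $n$ already defined. Squaring both sides of the margin-of-error equation and isolating $n$ gives:

```latex
m = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\quad\Longrightarrow\quad
m^2 = (z^*)^2 \, \frac{\hat{p}(1-\hat{p})}{n}
\quad\Longrightarrow\quad
n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1-\hat{p})
```

Because $m$ is squared in the denominator, cutting the margin of error in half takes four times the sample size. The value of $\hat{p}$ is the only input still missing.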
There are two possibilities. First, you may have some guess about the value of $\hat{p}$ from a prior study. If so, use that. In this course, that would be a value given somewhere in the problem; in reality, it means you did a small trial run to establish that initial value.

The other possibility is for when you have no idea what to use. In that case, it turns out that the best choice is 0.5. If the actual sample proportion turns out to be 0.5 then your sample size will have been just right; any other value of $\hat{p}$ will result in a smaller margin of error. Thus, using 0.5 produces the largest $n$ that is needed; it gives a sample size that should guarantee that the margin of error is no bigger than the one you desired.

In both cases, the result of your calculation will probably not be an integer, in which case you should round up to the next integer (even if the decimal part is something like 0.001).

Examples

[6.] A previous study has suggested that about 19.3% of teens (aged 12-19) are obese. How large of a sample will be needed in order to estimate the true proportion of obese teens with 95% confidence and a margin of error of no more than 1%?

95% confidence makes $z^* = 1.96$. I have $0.01 = 1.96\sqrt{\frac{(0.193)(0.807)}{n}}$, and I need to solve for $n$. That comes out to $n = \left(\frac{z^*}{0.01}\right)^2 (0.193)(0.807) \approx 5983.11$ when the unrounded critical value is carried through (the rounded 1.96 gives about 5983.33; either way the next integer is the same). Thus, a sample size of 5984 ought to do.

[7.] I want to construct a 99% confidence interval for the proportion of Americans who think that the government has placed too many regulations on businesses, and I want a margin of error of no more than 3%. How large of a sample will this require?

99% confidence makes $z^* = 2.5758$. I don't have any prior value for $\hat{p}$, so I'll use 0.5. That gives me $0.03 = 2.5758\sqrt{\frac{(0.5)(0.5)}{n}}$, which solves to $n = \left(\frac{z^*}{0.03}\right)^2 (0.5)(0.5) \approx 1843.027$ when the unrounded critical value ($z^* \approx 2.575829$) is carried through. Thus, a sample size of 1844 ought to do.
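The sample-size calculation is easy to script. The following is a minimal sketch under stated assumptions: the function name `sample_size` and its parameter names are made up for illustration, and the critical value is taken from the standard library's `NormalDist` rather than from a rounded table value.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence, p_guess=0.5):
    """Smallest n whose margin of error should not exceed `margin`.

    p_guess defaults to 0.5, the conservative choice when no prior
    estimate of the proportion is available.
    """
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    # Always round up, even when the decimal part is tiny.
    return ceil(n_exact)


print(sample_size(0.01, 0.95, p_guess=0.193))  # setup of example 6
print(sample_size(0.03, 0.99))                 # setup of example 7
```

These two calls should reproduce the answers of 5984 and 1844. Trying other values of `p_guess` with the same margin and confidence level shows that none of them asks for a larger sample than 0.5 does.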