Chapter 19: Confidence Intervals for Proportions

When we made our probability calculations back in chapter 18, we were holding on to one last thing that kept us from reality: knowing the value of $p$. Since that's a parameter, we ought not to know its value! In this chapter, we'll deal with that one last thread and return fully to procedures that work in the real world.

A Point Estimate for $p$

A point estimate is a single number that estimates a parameter. In this chapter, that parameter is $p$, the population proportion. If you don't know the proportion for the population, what are you going to do? Take a sample, of course! The statistic that you get from that sample is a point estimate for the parameter. Thus, $\hat{p}$ is a point estimator of $p$.

Developing a Better Method

The Problem

The problem with point estimates is that they are almost always wrong. Try it: flip a coin a few times (say, 20). Did you get exactly half of those tosses to be heads? Probably not (you have the tools to calculate the probability that exactly 10 of 20 coin tosses come up heads). Point estimates rarely give the answer that we're looking for.

The Solution

There must be a better method: some way of saying "I think that the parameter is here, and feel confident that it really is there." The solution is to add a margin of error: a region around our point estimate where we are pretty sure that the parameter lies. How wide should that region be? I'm 100% confident that the proportion is between 0 and 1, but that's not a very useful answer.

The Theory

The key is to use what we know about sampling distributions; in particular, the fact that $\hat{p}$ is approximately normal, with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, is of immense help. Now, can you describe a region where approximately 95% of $\hat{p}$ values lie? Of course you can, if you remember the Empirical Rule! The interval $(p - 2\sigma_{\hat{p}},\ p + 2\sigma_{\hat{p}})$ should contain about 95% of all $\hat{p}$ values. Put another way, 95% of samples will produce a value of $\hat{p}$ that is within two $\sigma_{\hat{p}}$ of $p$.

Now for a little word play: if I am standing within two meters of you, are you standing within two meters of me? Of course you are! Now, apply that same logic to the last statistics statement I made.
HOLLOMAN'S AP STATISTICS BVD CHAPTER 19, PAGE 1 OF 7.
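The two numerical claims above (how rarely a coin lands heads exactly half the time, and how often $\hat{p}$ falls within two standard deviations of $p$) can be checked directly. The following is only a minimal sketch, not part of the original notes: it assumes a fair coin flipped 20 times, and the helper name `binomial_pmf` is made up for illustration.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumed setup: a fair coin (p = 0.5) flipped n = 20 times.
n, p = 20, 0.5

# Chance that the point estimate is exactly right (10 heads out of 20).
prob_exactly_half = binomial_pmf(10, n, p)

# Standard deviation of the sample proportion, sqrt(p(1 - p)/n).
sigma = sqrt(p * (1 - p) / n)

# Exact chance that p-hat = k/n lands within two standard deviations of p.
prob_within_two_sigma = sum(
    binomial_pmf(k, n, p) for k in range(n + 1) if abs(k / n - p) <= 2 * sigma
)

print(f"P(exactly 10 heads)        = {prob_exactly_half:.4f}")
print(f"P(p-hat within 2 sigma of p) = {prob_within_two_sigma:.4f}")
```

Under these assumptions the first probability is about 0.176 and the second about 0.959. The second is close to, but not exactly, 95%, because the Empirical Rule is an approximation and the binomial is discrete.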
95% of samples will produce a value of $\hat{p}$ where $p$ is within two $\sigma_{\hat{p}}$ of $\hat{p}$. Again, with more symbols: 95% of samples will produce a value of $\hat{p}$ so that $p$ lies in the interval $(\hat{p} - 2\sigma_{\hat{p}},\ \hat{p} + 2\sigma_{\hat{p}})$. Aha! There it is. I now have a little piece that, when added to my original point estimate, produces a region (an interval) where I am pretty sure (in this case, about 95% sure) that the parameter lies.

Wait: we're still using $p$! Indeed we are; since $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, we're still hanging on to the unrealistic idea that we know $p$. Well, if you don't have a $p$ to put in there, what are you going to do? Hint: quitting or crying are not options. Maybe we should use a number that's a good estimate for $p$; if only we knew a point estimate for $p$... $\hat{p}$, of course!

You might be wondering if that replacement messes up the theory (or calculations) so far; fortunately, no. Remember that $\mu_{\hat{p}} = p$: the center of all possible values of $\hat{p}$ is $p$. When that happens (when the center of the sampling distribution equals the parameter that we are trying to estimate) we say that the statistic is unbiased. The result of that is that our calculations still hold, but we can't call it $\sigma_{\hat{p}}$ anymore, so instead we call it the standard error of $\hat{p}$: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.

Critical Values

So, there are still two issues. First, what if I want to be more sure than 95%; what if I want to be 99% sure? The Empirical Rule doesn't have a number with a middle area of 99%. 68% or 99.7%: the Empirical Rule will let me find intervals to be that sure, but not 99%. Second, the Empirical Rule is only approximate; surely there is a more exact way? And of course there is.

The key is in realizing that those numbers (I used 2 for my theory talk a little earlier) are really z-scores. About 95% of the data in a normal distribution lies between $z = -2$ and $z = 2$. Thus, the issue is to find a value of $z$ where the area between $-z$ and $z$ is a certain amount (like 99%). Wait, didn't we do problems like that? Of course we did! The value of $z$ that has a particular amount of area to one side is called a critical value. For our intervals, the given area is in the middle, but most people define critical values in terms of
the left or right hand areas. Let's say our area in the middle is $C$; that makes the area above $z$ equal to $\frac{1-C}{2}$. The notation for a critical value is $z^*$. The critical value helps determine how wide your interval should be, so that you can be as confident as you want to be of catching the parameter.

Conditions

Alas, we're still making some assumptions. Relax: it's impossible to get anything done without some assumptions. We just need to make sure that we understand what they are and, if possible, check to see if those assumptions hold up.

First of all, our work with proportions is based on what we know about binomial random variables. What is required to make a random variable binomial?

Success and failure: check; we're still doing that by only counting those individuals that have some quality of interest.

Fixed probability of success: well, that one turns out to be hard. We're going to assume that this is true without ever mentioning it, because working in a situation where there isn't a fixed probability of success forces you to start using Bayesian methods, and we're not going there.

Fixed number of trials: check. We'll definitely have a fixed sample size.

Independent trials: another hard one. There are two things that we can do to try to ensure that the trials are independent: obtain a random sample, and make sure that the sample is not too large, relative to the population. We'll definitely mention that we need a random sample; in fact, almost every procedure that we develop will require a random sample. As for the "not too large" issue, the problem is that when we sample from our population, we do so without replacement: we take a fixed number of individuals from a finite population. When we do this, we are actually in a hypergeometric situation. But relax, there's an escape clause! As long as the sample is not too large relative to the population (a good rule of thumb is that the sample is less than 10% of the population), the hypergeometric and the binomial get really close to one another. Thus, if our sample is smaller than 10% of the population, we can continue using the methods we've developed.
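To see how close the hypergeometric and the binomial get under the 10% rule of thumb, here is a small illustrative sketch. Everything specific in it is an assumption chosen for the illustration and not taken from the notes: the population of 1000 with 300 successes, the two sample sizes, and the helper names. It measures closeness with the total variation distance (half the sum of the absolute differences between the two sets of probabilities).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Sampling WITH replacement: k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, n, pop_successes, pop_size):
    """Sampling WITHOUT replacement from a finite population."""
    return (
        comb(pop_successes, k)
        * comb(pop_size - pop_successes, n - k)
        / comb(pop_size, n)
    )


def total_variation(n, pop_successes, pop_size):
    """Half the summed absolute gap between the two distributions."""
    p = pop_successes / pop_size
    return 0.5 * sum(
        abs(hypergeometric_pmf(k, n, pop_successes, pop_size) - binomial_pmf(k, n, p))
        for k in range(n + 1)
    )


# Hypothetical population: 1000 individuals, 300 of them "successes".
POP_SIZE, POP_SUCCESSES = 1000, 300

gap_small_sample = total_variation(50, POP_SUCCESSES, POP_SIZE)   # 5% of population
gap_large_sample = total_variation(500, POP_SUCCESSES, POP_SIZE)  # 50% of population

print(f"sample is  5% of population: distance = {gap_small_sample:.4f}")
print(f"sample is 50% of population: distance = {gap_large_sample:.4f}")
```

With the sample at 5% of the population the two distributions are nearly indistinguishable; at 50% the gap is far larger, which is the reason for the rule of thumb.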
But really, if you've got 10% of the population in your sample, there's a good chance that you could have just measured the whole population and avoided all of this mess in the first place! Thus, this 10% condition is not terribly important for our work. It is an issue, and you should keep it in the back of your mind, but we're not going to regularly state that this needs to be true.

That's it for the binomial, but there was one more thing we did to develop our interval: we used the fact that the binomial looked approximately normal. Do you remember the requirements for a binomial to be approximately normal? We used $p$ in that check previously, but we don't know $p$ anymore. I wonder what we should do? (Keep reading to find out.)

Summary: An Interval Estimate for $p$
A level $C$ confidence interval for $p$ is $\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z^*$ is the upper $\frac{1-C}{2}$ critical value from the Standard Normal Distribution (a normal where $\mu = 0$ and $\sigma = 1$). Constructing this interval requires that the sample was obtained randomly, and that both $n\hat{p}$ and $n(1-\hat{p})$ are at least 10.

Examples

[1.] A Harris Poll from June 2000 reported that 79% of U.S. citizens (based on a random sample of 2000 people) thought that elected officials should be subjected to random drug tests. Let's construct a 90% confidence interval for the true population proportion that agree with this idea.

To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'm told that the sample was obtained randomly. $n\hat{p}$ is 1580 and $n(1-\hat{p})$ is 420; each of these is at least 10, so we may proceed. 90% confidence gives $z^* = 1.645$. The interval is $0.79 \pm 1.645\sqrt{\frac{(0.79)(0.21)}{2000}} = 0.79 \pm 0.0149 = (0.7750, 0.8049)$. I am 90% confident that the true proportion of U.S. citizens that agree with this statement is between 77.5% and 80.5%.

[2.] Researchers are testing a new drug to help patients with narcolepsy. Of the 323 participants, 27 reported nausea as a side effect of the drug. Construct a 99% confidence interval for the proportion of patients that can expect to experience nausea while using this drug.

To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'm not told that this sample was obtained randomly; I'll have to assume that this is the case. $n\hat{p} = 27$ and $n(1-\hat{p}) = 296$; since each of these is at least 10, I can proceed. For 99% confidence, $z^* = 2.5758$. The interval is $0.0836 \pm 2.5758\sqrt{\frac{(0.0836)(0.9164)}{323}} = (0.0439, 0.1233)$. I am 99% confident that the population proportion of users of this drug that will experience nausea is between 4.39% and 12.33%.

[3.] A study of 5302 people aged 60 or older in the United States found 124 with rheumatoid arthritis. Construct a 90% confidence interval for the actual proportion of all people aged 60 and older who have rheumatoid arthritis.
To construct this interval, I need to know that the sample was obtained randomly, and that each of $n\hat{p}$ and $n(1-\hat{p})$ is at least 10. I'll have to assume that the sample was obtained randomly. $n\hat{p} = 124$ and $n(1-\hat{p}) = 5178$; since each of these is at least 10, I can continue. 90% confidence makes $z^* = 1.645$. With $\hat{p} = \frac{124}{5302} \approx 0.0234$, the interval is $0.0234 \pm 1.645\sqrt{\frac{(0.0234)(0.9766)}{5302}} = (0.0200, 0.0268)$. I am 90% confident that the proportion of adults aged 60 and over who suffer from rheumatoid arthritis is between 2.00% and 2.68%.

Interpreting Confidence

You saw, in my examples above, how I finished with a statement like "I am 90% confident that..." This interprets the interval, and you must do this, but sometimes you'll be asked to interpret what "90% confident" means. For this, you must be cautious. I've said it clearly and correctly, but your attempts to say that in your own words will probably backfire. Here is a template for saying it correctly; I've left markers to indicate spots where you have to fill in some details.

<C%> confident means that if we took many samples, and constructed an interval from each sample, then about <C%> of those intervals ought to contain the true population proportion of <give some context>.

Examples

[4.] From example 1: what do we mean when we say we are 90% confident that the true population proportion of those who think that elected officials should be subjected to drug tests is contained within this interval?

If I took many samples, and constructed an interval from each sample, then about 90% of those intervals ought to contain the population proportion of people who think that elected officials should be subjected to drug tests.

[5.] From example 2: what do we mean when we say we are 99% confident that the true population proportion of users of this drug that will experience nausea is contained within this interval?

If I took many samples, and constructed an interval from each of those samples, then about 99% of those intervals ought to contain the population proportion of users of this drug that will experience nausea as a side effect.
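The interval formula and the "many samples" interpretation can both be tried out numerically. What follows is a hedged sketch, not the notes' own method: the function names `critical_value` and `proportion_ci` are invented for illustration, and the simulation has to pretend that the population proportion is known (here 0.79, an assumption made only so that coverage can be counted). It uses the standard library's `statistics.NormalDist` for the critical value.

```python
import random
from math import sqrt
from statistics import NormalDist


def critical_value(confidence):
    """z* such that `confidence` of the standard normal lies in (-z*, z*)."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def proportion_ci(successes, n, confidence):
    """Interval p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    if successes < 10 or n - successes < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    margin = critical_value(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# A sample of 2000 with 1580 successes (p_hat = 0.79), at 90% confidence.
low, high = proportion_ci(1580, 2000, 0.90)
print(f"90% interval: ({low:.4f}, {high:.4f})")

# Coverage check. TRUE_P is an assumption made only for this simulation:
# draw many samples, build an interval from each one, and count how many
# of those intervals contain TRUE_P.
random.seed(19)
TRUE_P, SAMPLE_SIZE, REPETITIONS = 0.79, 500, 2000
hits = 0
for _ in range(REPETITIONS):
    successes = sum(random.random() < TRUE_P for _ in range(SAMPLE_SIZE))
    lo, hi = proportion_ci(successes, SAMPLE_SIZE, 0.90)
    hits += lo <= TRUE_P <= hi
coverage = hits / REPETITIONS
print(f"fraction of intervals containing p: {coverage:.3f}")
```

The first line of output should be roughly 0.775 to 0.805, and the fraction of simulated intervals that contain the assumed proportion should land near 0.90. That is what "90% confident" describes: the long-run success rate of the method, not the chance that any one interval is right.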
More About the Margin of Error

An Issue

It should be fairly obvious that we want the margin of error to be small; a small region where we think the population proportion lies is much more interesting and useful than a very wide region. The part of the formula that represents margin of error is $z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$; what values will affect the margin of error?

The Effects of the Numbers

First of all, there is the critical value, which comes from our level of confidence. A larger critical value means a larger margin of error, so we want a smaller critical value. The critical value comes from the choice of confidence level ($C$), so what kinds of confidence levels will result in smaller critical values, and thus smaller margins of error? Here's an easy way to look at it: I am 0% confident in my point estimate (which has a small margin of error: zero), and I am 100% confident that the population proportion is a real number (which has a large margin of error: infinity). Do you see the relationship between confidence level and margin of error? We typically want very high confidence levels, between 90% and 99%, so that doesn't leave much room for changing the margin of error.

The sample proportion has an effect, but we don't know that value until after we're done, so it isn't as useful for planning purposes.

That leaves the sample size, over which we have quite a bit of control. What kinds of sample sizes will result in smaller margins of error? Look at that formula again, notice that $n$ is in the denominator, and think... We have control over the sample size, but [1] there is an upper limit (about 10% of the population) and [2] it is often expensive to obtain very large samples, and larger samples mean more work! Here's the thought that statisticians have: if I decide ahead of time what margin of error I want (and what level of confidence I want), what's the smallest sample size that should produce the desired margin of error?

Solving for Sample Size

In the equation $m = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, we'll know $m$ and $z^*$, and we want to solve for $n$. That just leaves one thing: what will we use for the sample proportion?
There are two possibilities. First, you may have some guess about the value of $\hat{p}$ from a prior study. If so, use that. In this course, that would be a value given somewhere in the problem; in reality, it means you did a small trial run to establish that initial value. The other possibility is for when you have no idea what to use. In that case, it turns out that the best choice is 0.5. If the actual sample proportion turns out to be 0.5 then your sample
size will have been just right; any other value of $\hat{p}$ will result in a smaller margin of error. Thus, using 0.5 produces the largest $n$ that is needed; it gives a sample size that should guarantee that the margin of error is no bigger than the one you desired. In both cases, the result of your calculation will probably not be an integer, in which case you should round up to the next integer (even if the decimal part is something like 0.001).

Examples

[6.] A previous study has suggested that about 19.3% of teens (aged 12 to 19) are obese. How large of a sample will be needed in order to estimate the true proportion of obese teens with 95% confidence and a margin of error of no more than 1%?

95% confidence makes $z^* = 1.96$. I have $0.01 = 1.96\sqrt{\frac{(0.193)(0.807)}{n}}$, and I need to solve for $n$. That comes out to $n = \left(\frac{1.96}{0.01}\right)^2(0.193)(0.807) = 5983.111$ (carrying the unrounded critical value, $z^* \approx 1.95996$, through the calculation). Thus, a sample size of 5984 ought to do.

[7.] I want to construct a 99% confidence interval for the proportion of Americans who think that the government has placed too many regulations on businesses, and I want a margin of error of no more than 3%. How large of a sample will this require?

99% confidence makes $z^* = 2.5758$. I don't have any prior value for $\hat{p}$, so I'll use 0.5. That gives me $0.03 = 2.5758\sqrt{\frac{(0.5)(0.5)}{n}}$, which solves to $n = \left(\frac{2.5758}{0.03}\right)^2(0.5)(0.5) = 1843.027$. Thus, a sample size of 1844 ought to do.
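The sample-size calculation in the two examples above amounts to solving $m = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ for $n$, which gives $n = \left(\frac{z^*}{m}\right)^2\hat{p}(1-\hat{p})$, and then rounding up. As a minimal sketch (the function name `required_sample_size` and its parameter names are illustrative, not from the notes), using the unrounded critical value from Python's standard library:

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, confidence, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1 - p)/n) <= margin, for a guessed p.

    p_guess defaults to 0.5, the most conservative choice, because
    p(1 - p) is largest there.
    """
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    n_exact = z_star**2 * p_guess * (1 - p_guess) / margin**2
    return ceil(n_exact)  # always round up, even for a tiny decimal part


# Example 6: prior estimate of 19.3%, 95% confidence, margin of 1%.
n_teens = required_sample_size(0.01, 0.95, p_guess=0.193)

# Example 7: no prior estimate, 99% confidence, margin of 3%.
n_regulations = required_sample_size(0.03, 0.99)

print(n_teens, n_regulations)
```

This reproduces the sample sizes of 5984 and 1844 found above. Any guess other than 0.5 gives a required sample size no larger than the one found with 0.5, which is why 0.5 is the safe default.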