Stat 104: Quantitative Methods for Economists
Class 22: Hypothesis Testing - Part I (verbiage)

Hypothesis Testing (intro)
We have discussed two methods of making inferences about parameters in a population based on a random sample (in English: how do we figure out the true mean or proportion?).
- Point estimates give us a single guess.
- Confidence intervals give us a region.
Alternatively, we might be interested in using the information in a sample to test a hypothesis about parameters in the population.

What is a hypothesis?
A hypothesis (or claim) is simply a belief (usually based on some relevant information). Some examples of hypotheses might be:
- Assigning weekly homework in a statistics class increases scores on course exams.
- Taking Cold-Eeze reduces the duration of your cold.
- The Atkins diet is better than the South Beach diet.

Business Examples
- A company that has a 10% market share launches a marketing campaign. At the end of the campaign period the company conducts a survey in order to assess whether its market share has increased.
- A bottling machine is set to automatically fill each bottle with 355 ml of soft drink. To check whether the machine needs to be readjusted, a quality control inspector examines a random sample of newly filled bottles.

Recap: Confidence Intervals
Allow us to use sample data to estimate a population value, like the true mean or the true proportion, and then put bounds on our estimate.
Example: Give a 95% confidence interval for the true average amount students spend weekly on alcohol.

Recap: Hypothesis Testing
Allows us to use sample data to test a claim about a population, such as testing whether a population proportion or population mean equals some number.
Example: Is the true average amount that students spend weekly on alcohol equal to $20?
Parameter Identification
Hypothesis tests can be carried out on all the population parameters (the Greeks), such as the population median or variance. But in this section, we will conduct tests of hypothesis only regarding the population mean µ or the population proportion p.

Parameter Identification (cont)
In the Market Share example: A company that has a 10% market share launches a marketing campaign. At the end of the campaign period the company conducts a survey in order to assess whether its market share has increased.
The parameter of interest may be defined as
p = the true proportion of customers who would purchase the company's products at the conclusion of the marketing campaign.

Parameter Identification (cont)
In the Bottling Machine example: A bottling machine is set to automatically fill each bottle with 355 ml of soft drink. To check whether the machine needs to be readjusted, a quality control inspector examines a random sample of newly filled bottles.
The parameter of interest may be defined as
µ = the true mean volume (in ml) per bottle.

Naming the hypotheses
We have special names for the hypotheses under consideration.
We call the conventional belief, the status quo or prevailing viewpoint, the null hypothesis and denote it by $H_0$.
The competing belief is referred to as the alternative hypothesis and is denoted by $H_a$.

Subtle but important point: Null versus Alternative
The null hypothesis is generally a statement that there is nothing happening, no difference, or no change in the population.
The alternative hypothesis is an alternative to the null hypothesis (!!). It is the statement that the researcher hopes is true, namely, the change in the population that the researcher is looking for.
Let's look at some examples.

The Null Hypothesis
The null hypothesis, which is denoted $H_0$, specifies a specific value for the population parameter.
In the Market Share example (has the 10% market share increased after the campaign?):
$H_0: p = 10\%$
In the Bottling Machine example (is the machine still filling 355 ml per bottle on average?):
$H_0: \mu = 355$ ml
The Alternative Hypothesis
If the null hypothesis is not true then something else must be true. We call this the alternative hypothesis and denote it by $H_a$.
In the marketing example the alternative hypothesis is called one sided because we are testing against the alternative that the market share is now greater than 10%:
$H_0: p = 10\%$
$H_a: p > 10\%$

The Alternative Hypothesis (cont)
For the bottling machine example, we want to test the null hypothesis that the mean is equal to 355 ml against the alternative that the mean is either larger than or smaller than 355 ml. This is called a two sided alternative.
$H_0: \mu = 355$ ml
$H_a: \mu \neq 355$ ml

Example
Suppose you work for a company that produces cooking pots with an average life span of 7 years. To gain a competitive advantage you suggest using a new material that claims to extend the life span of the pots. You want to test the hypothesis that the average life span of the cooking pots made with this new material increases.
$H_0$:
$H_a$:

General framework
Now that we know how to set up the hypotheses, we can go over the testing process.
- Start with some claim ($H_0$ and $H_a$).
- Collect evidence (data).
- In statistics, we always assume the null hypothesis is true (like assuming the defendant is innocent until proven guilty).
- Then, make a decision based on the available evidence.
  - If there is sufficient evidence ("beyond reasonable doubt"), reject the null hypothesis. (Behave as if the defendant is guilty.)
  - If there is not enough evidence, do not reject the null hypothesis. (Behave as if the defendant is not guilty.)

Philosophical Issue
Note that the two possible outcomes of any hypothesis test are:
- reject the null hypothesis
- do not reject the null hypothesis
Accepting the null hypothesis is not an option. This is because it is easier to disprove something than to prove it:
$H_0$: all dogs have four legs
A single dog without four legs disproves this claim, but no number of four-legged dogs proves it. This is why the alternative hypothesis is always the conclusion we are trying to prove.

Reminder: You can never prove $H_0$ correct.
When we perform a hypothesis test, our conclusion is either
- we reject $H_0$ and hence accept $H_a$, or
- we fail to reject $H_0$.
Note that we should not say we accept $H_0$. We say "fail to reject $H_0$" because the experiment failed to produce evidence that $H_0$ is incorrect.
What can go wrong?
Because the conclusion we will be making is based on sample data, the possibility of making an error always exists.

                               IF $H_0$ is true               IF $H_0$ is false
and we claim $H_0$ is true     Correct decision (no error)    Type II error
and we claim $H_0$ is false    Type I error                   Correct decision (no error)

Types of Errors
Type I error: The null hypothesis is rejected when it is true.
Type II error: The null hypothesis is not rejected when it is false.
There is always a chance of making one of these errors. But we will want to minimize the chance of doing so!

Example of Errors
$H_0$: Defendant is not guilty.
$H_a$: Defendant is guilty.
What is the Type I error? What is the Type II error? Which error is more important?

The Milemaster tire company has decided that their new tire must last more than 45,000 miles or they won't market it.
$H_0$: tire lasts 45,000 miles (or less)
$H_a$: tire lasts more than 45,000 miles
What is the Type I error? The Type II error? What is the cost of a Type I error here? What is the cost of a Type II error?

The Significance Level
We define $\alpha = \text{Prob(Type I error)} = P(\text{reject } H_0 \mid H_0 \text{ true})$.
We normally use $\alpha = .05$; this is called the significance level.
This is a pretty standard value to use, but it is not set in stone. Usually the greater the cost of a Type I error, the smaller this number is.

Actually Testing Hypotheses
As you can see, there is a lot of terminology we had to go through before we get to test any hypotheses.
There are 2 methods for actually doing the hypothesis test:
- Test statistic method (classical)
- P-values (using the computer)
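As a recap of the two error types defined earlier on this page, the decision table can be written as a small lookup. This is only an illustrative sketch: the function name `classify_outcome` and its arguments are hypothetical and do not come from the course materials.

```python
def classify_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Label one cell of the decision table (truth about H0 vs. our claim)."""
    if h0_is_true and reject_h0:
        return "Type I error"       # rejected a true null hypothesis
    if not h0_is_true and not reject_h0:
        return "Type II error"      # failed to reject a false null hypothesis
    return "correct decision"


# Print all four cells of the table.
for h0_is_true in (True, False):
    for reject_h0 in (False, True):
        truth = "H0 true " if h0_is_true else "H0 false"
        claim = "reject H0        " if reject_h0 else "fail to reject H0"
        print(f"{truth} | {claim} -> {classify_outcome(h0_is_true, reject_h0)}")
```

In the courtroom example, with the null hypothesis "defendant is not guilty", `classify_outcome(True, True)` corresponds to convicting an innocent defendant, and `classify_outcome(False, False)` corresponds to letting a guilty defendant go.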
Concentrate on One-Sided Today
Today we will only focus on the one sided hypothesis test:
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$

Mechanics of Hypothesis Testing
So as not to get lost in the details, it will be helpful to have a concrete example to refer back to.
Suppose that in an effort to reduce traffic, Atlanta city planners have encouraged more people to take public transit and to join car pools. This was done via advertising campaigns and financial incentives for car poolers.
The initiative is two years old and the planners are now interested to see if their campaign is successful.

Mechanics (cont)
Two years ago, before the planners began their campaign, the average speed of vehicles in the downtown core during rush hour was calculated to be 7.5 mph.
Assume the population standard deviation hasn't changed and is 4.4 mph.

The Atlanta Hypothesis
Atlanta wants proof that their campaign is effective, so that will be the alternative hypothesis.
The idea is that Atlanta should provide reasonable evidence that will let us reject the null hypothesis of no improvement. The hypotheses are
$H_0: \mu = 7.5$
$H_a: \mu > 7.5$
Now we are asking a question about µ, but of course we don't observe it. Luckily the sample mean is a good proxy for it.
Before we formally establish the hypothesis test, let's check our instincts about what is going on.

Our Instincts
Clearly the traffic campaign is not working if the new average traffic speed is less than 7.5 mph.
Suppose we measured the speed of a random sample of 40 cars travelling through the downtown connector and found that the average speed is 2 mph. We would clearly fail to reject $H_0$.
If the sample average was 7 mph, we would also fail to reject $H_0$. It is possible that if the true mean is 7.5, one could obtain an average of 7 mph, but we don't want to reject the null hypothesis unless we see fundamentally strong evidence that the traffic flow is now much better.

Common Sense
Consider the following hypotheses, and then ask yourself, when is there evidence in favor of the alternative?
$H_0: \mu = 7.5$
$H_a: \mu > 7.5$
Clearly there is evidence in favor of the alternative when the sample mean is larger than 7.5. But how much larger?
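The Decision Rule
We want to construct a decision rule that says
Reject $H_0$ if $\bar{x} > c$
That is, reject the null hypothesis for large values of the observed sample mean.
The value c is called the critical value or cutoff value. It is determined so that we have the correct Type I error.

How to find the cut-off
A Type I error of 5% says that there is a 5% chance we reject the null when it is true. So we need to solve the following problem.
Find the value c so that
$P(\bar{X} > c \text{ when } \mu = \mu_0) = .05$

Some z score math
We want to find c so that
$P(\bar{X} > c \mid \mu = \mu_0) = .05$
$P(\bar{X} - \mu_0 > c - \mu_0) = .05$
$P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > \frac{c - \mu_0}{\sigma/\sqrt{n}}\right) = .05$
$P\left(Z_{stat} > \frac{c - \mu_0}{\sigma/\sqrt{n}}\right) = .05$
It's called an inverse look up problem. We know that $P(Z > 1.64) = .05$ (approximately).

Finding the cut-off
So if we desire
$P\left(Z_{stat} > \frac{c - \mu_0}{\sigma/\sqrt{n}}\right) = .05$
we need to solve (since $P(Z > 1.64) = .05$)
$\frac{c - \mu_0}{\sigma/\sqrt{n}} = 1.64$ or $c = \mu_0 + 1.64\,\frac{\sigma}{\sqrt{n}}$

The Decision Rule
For a 5% level test, accept the one-sided alternative hypothesis $H_a: \mu > \mu_0$ if
$\bar{x} > \mu_0 + 1.64\,\frac{\sigma}{\sqrt{n}}$
Some people prefer working on the standard normal scale, so the equivalent statement is: accept the one-sided alternative if
$z_{stat} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > 1.64$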
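The inverse look up and the cutoff formula above can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `critical_value` is an assumption made for illustration, and the inputs are the Atlanta traffic values from the example (hypothesized mean 7.5 mph, population standard deviation 4.4 mph, sample size 40).

```python
from math import sqrt
from statistics import NormalDist


def critical_value(mu0, sigma, n, alpha=0.05):
    """Cutoff c such that P(sample mean > c | mu = mu0) = alpha, with sigma known."""
    # Inverse look up: the z value with upper-tail area alpha.
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return mu0 + z_alpha * sigma / sqrt(n)


z_05 = NormalDist().inv_cdf(0.95)              # exact value that the slides round to 1.64
c = critical_value(mu0=7.5, sigma=4.4, n=40)   # Atlanta traffic example

print(f"z cutoff for alpha = .05: {z_05:.3f}")  # prints 1.645
print(f"critical value c: {c:.2f} mph")         # prints 8.64
```

With these inputs the rule reads: reject the null hypothesis if the sample mean speed of the 40 cars exceeds roughly 8.64 mph. Using the rounded 1.64 instead of the exact quantile gives the same cutoff to two decimals.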
The Process: The Five Steps of Hypothesis Testing
Step 1: State the null and alternative hypotheses.
Step 2: Choose a significance level α (usually 5%).
Step 3: Choose a test statistic and use the significance level to establish a decision rule.
Step 4: Compute the value of the test statistic.
Step 5: Apply the decision rule and make your decision.

Atlanta Traffic Again
To measure the effectiveness of the campaign, the speeds of 40 motor vehicles were measured during rush hour in the downtown core (often referred to as the downtown connector). Their average speed was 9.3 mph.
Two years ago, before the planners began their campaign, the average speed of vehicles in the downtown core during rush hour was calculated to be 7.5 mph. Assume the population standard deviation hasn't changed and is 4.4 mph.

Follow the process
For the given data, the value of the test statistic is
$z_{stat} = \frac{9.3 - 7.5}{4.4/\sqrt{40}} \approx 2.59$
Since $z_{stat} \approx 2.59$ is greater than 1.64, we reject the null hypothesis.
Conclusion: At the 5% level of significance, we did find sufficient evidence to conclude that there is an improvement in traffic flow since the campaign began.

Example
The oil additive Slick 50 is supposed to not only reduce engine wear but also increase gas mileage. A random sample of 50 cars that have a mean gas mileage of 24 mpg is selected. When the new oil is put into the cars, each is driven until the gas tank is dry and the mileage is recorded. The mean of the 50 observations is 26.3 mpg. Assume a population standard deviation of 6.6 mpg.
At the 5% significance level can we conclude that Slick 50 is effective in increasing gas mileage?

The Data and Testing Process
Step 1: State the null and alternative hypotheses.
$H_0: \mu = 24$
$H_a: \mu > 24$
Step 2: Significance level .05.
Steps 3, 4: The decision rule says: reject the null hypothesis if
$z_{stat} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > 1.64$

Plug in Our Data and Decide
Based on our sample data, our test statistic is
$z_{stat} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{26.3 - 24}{6.6/\sqrt{50}} = 2.46$
Since $z_{stat} = 2.46$ is greater than 1.64, we reject the null hypothesis.
Our conclusion: at the 5% level of significance, we did find sufficient evidence to conclude that Slick 50 increases gas mileage.
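Steps 4 and 5 for both worked examples above can be reproduced with a few lines of code. This is a sketch under the same assumptions as the slides (known population standard deviation, one-sided alternative, cutoff 1.64 at the 5% level); the function name `one_sided_z_test` and the dictionary layout are hypothetical choices made for illustration.

```python
from math import sqrt


def one_sided_z_test(xbar, mu0, sigma, n, z_crit=1.64):
    """Upper-tailed z test with known sigma; returns (z_stat, reject_h0)."""
    z_stat = (xbar - mu0) / (sigma / sqrt(n))  # Step 4: compute the test statistic
    return z_stat, z_stat > z_crit             # Step 5: apply the decision rule


# Inputs taken from the two worked examples.
examples = {
    "Atlanta traffic": dict(xbar=9.3, mu0=7.5, sigma=4.4, n=40),
    "Slick 50": dict(xbar=26.3, mu0=24.0, sigma=6.6, n=50),
}

results = {}
for name, data in examples.items():
    z_stat, reject = one_sided_z_test(**data)
    results[name] = (z_stat, reject)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{name}: z_stat = {z_stat:.2f} -> {decision}")
```

Both examples lead to rejecting the null hypothesis. Passing a sample mean below the hypothesized value, such as `one_sided_z_test(7.0, 7.5, 4.4, 40)`, gives a negative statistic and a "fail to reject" decision, which matches the common-sense discussion of the Atlanta data.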
Recap
The test statistic:
$z_{stat} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
The decision rule, for testing
$H_0: \mu = \mu_0$
$H_a: \mu > \mu_0$
is: if $z_{stat} > 1.64$, reject $H_0$.
Basically, we favor the alternative when we observe a sample mean far to the right of the hypothesized value.

Things you should know
- Null and alternative hypotheses
- Type I and Type II errors
- Decision rules for testing a mean
Readings from the book: Sections 4.3 and 4.5. Just skim them right now; after all the hypothesis testing lectures (there are three of them) it will make more sense.