Statistics 111 - Lecture 14
Introduction to Inference: Hypothesis Tests
Mar. 3, 2016 Stat 111 - Lecture 14 - Hyp.Test.

Administrative Notes
- Spring Break! No lectures on Tuesday, March 8th and Thursday, March 10th
- Extended Spring Break! There is no Stat 111 recitation this Friday, March 4th
- Midterm will be graded and available in recitation on Friday, March 18th
- Homework 4 due in recitation on Friday, March 18th

Last Class: Confidence Intervals
- We used the sample mean $\bar{X}$ as our best estimate of the population mean $\mu$, but we realized that our sample mean will vary between different samples
- Our solution was to use our sample mean as the center of an entire confidence interval of likely values for our population mean $\mu$
- 95% confidence intervals are most common, but we can calculate an interval for any confidence level
- We also did confidence intervals for a population proportion $p$
- Formulas for confidence intervals are based on results about the sampling distribution of the sample mean and sample proportion (chapter 5)

This Class: Hypothesis Testing
- Today, we will again use our sampling distribution results for a different type of inference: testing a specific hypothesis
- In some problems, we are not interested in calculating a confidence interval, but rather we want to see whether our data confirm a specific hypothesis
- This type of inference is sometimes called statistical decision making, but the more common term is hypothesis testing

Example: Blackout Baby Boom
- New York City experienced a major blackout on November 9, 1965
  - Many people were trapped for hours in the dark and on subways, in elevators, etc.
- Nine months afterwards (August 10, 1966), the NY Times claimed that the number of births was way up
- They attributed the increased births to the blackout, and this has since become urban legend!
- Do the data actually support the claim of the NY Times?
- Using data, we will test the hypothesis that the birth rate in August 1966 was different than the usual birth rate

Number of Births in NYC, August 1966

Sun  Mon  Tue  Wed  Thu  Fri  Sat
     452  470  431  448  467  377
344  449  440  457  471  463  405
377  453  499  461  442  444  415
356  470  519  443  449  418  394
399  451  468  432

- First two weeks: $\bar{X} = 433.6$, $s = 39.4$, $n = 14$
- We want to test this data against the usual birth rate in NYC, which is 430 births/day

Steps for Hypothesis Testing
1. Formulate your hypotheses: Need a Null Hypothesis and an Alternative Hypothesis
2. Calculate the test statistic: The test statistic summarizes the difference between the data and your null hypothesis
3. Find the p-value for the test statistic: How probable is your data if the null hypothesis is true?

Null and Alternative Hypotheses
- Null Hypothesis ($H_0$) is (usually) an assumption that there is no effect or no change in the population
- Alternative hypothesis ($H_a$) states that there is a real difference or real change in the population
- If the null hypothesis is true, there should be little discrepancy between the observed data and the null hypothesis
- If we find there is a large discrepancy, then we will reject the null hypothesis
- Both hypotheses are expressed in terms of different values for population parameters

Example: NYC blackout and birth rates
- Let $\mu$ be the mean birth rate in August 1966
- Null Hypothesis: Blackout has no effect on birth rate, so August 1966 should be the same as any other month
  $H_0: \mu = 430$ (usual birth rate)
- Alternative Hypothesis: Blackout did have an effect on the birth rate
  $H_a: \mu \neq 430$
- This is a two-sided alternative, which means that we are considering a change in either direction
- We could instead use a one-sided alternative that only considers changes in one direction
  - E.g. the only alternative is an increase in birth rate: $H_a: \mu > 430$

Test Statistic
- Now that we have a null hypothesis, we can calculate a test statistic
- The test statistic measures the difference between the observed data and the null hypothesis
- Specifically, the test statistic answers the question: How many standard deviations is our observed sample value from the hypothesized value?
- For our birth rate dataset, the observed sample mean is 433.6 and our hypothesized mean is 430
- To calculate the test statistic, we need the standard deviation of our sample mean

Test Statistic for Sample Mean
- The sample mean has a standard deviation of $\sigma/\sqrt{n}$, so our test statistic $Z$ is:
  $$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
- $Z$ is the number of standard deviations between our sample mean and the hypothesized mean
- $\mu_0$ is the notation we use for our hypothesized mean
- To calculate our test statistic $Z$, we need to know the population standard deviation $\sigma$
- For now we will make the assumption that $\sigma$ is the same as our sample standard deviation $s$
- Later, we will correct this assumption!

Test Statistic for Birth Rate Example
- For our NYC births/day example, we have a sample mean of 433.6, a hypothesized mean of 430 and a sample standard deviation of 39.4
- Our test statistic is:
  $$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{433.6 - 430}{39.4/\sqrt{14}} = 0.342$$
- So, our sample mean is 0.342 standard deviations different from what it should be if there was no blackout effect
- Is this difference statistically significant?
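
The test statistic calculation above can be reproduced in a few lines. This is only a minimal sketch: the function name `z_statistic` is made up for illustration, and, as the lecture assumes for now, the sample standard deviation $s$ stands in for $\sigma$.

```python
import math


def z_statistic(sample_mean, hypothesized_mean, sd, n):
    """Number of standard deviations (of the sample mean) between
    the observed sample mean and the hypothesized mean."""
    standard_error = sd / math.sqrt(n)  # SD of the sample mean: sigma / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Birth-rate example: X-bar = 433.6, mu_0 = 430, s = 39.4 (used in place of sigma), n = 14
z_births = z_statistic(sample_mean=433.6, hypothesized_mean=430, sd=39.4, n=14)
print(round(z_births, 3))  # about 0.342, in line with the slide
```

A positive value means the sample mean is above the hypothesized mean; a negative value means it is below.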

Probability values (p-values)
- Assuming the null hypothesis is true, the p-value is the probability we get a value at least as far from the hypothesized value as our observed sample value
- The smaller the p-value is, the more unrealistic our null hypothesis appears
- For our NYC birth-rate example, $Z = 0.342$
- Assuming our population mean really is 430, what is the probability that we get a test statistic of 0.342 or greater?

p-value for NYC dataset
- To calculate the p-value, we use the fact that the sample mean has a normal distribution

[Figure: standard normal curve with a tail area of 0.367 below Z = -0.342 and a tail area of 0.367 above Z = 0.342]

- If our alternative hypothesis was one-sided ($H_a: \mu > 430$), then our p-value would be 0.367
- Since our alternative hypothesis was two-sided, our p-value is the sum of both tail probabilities
  p-value = 0.367 + 0.367 = 0.734

Statistical Significance
- If the p-value is smaller than $\alpha$, we say the data are statistically significant at level $\alpha$
- The most common $\alpha$-level to use is $\alpha = 0.05$
  - Later, we will see this relates to 95% confidence intervals!
- The $\alpha$-level is used as a threshold for rejecting the null hypothesis
- If the p-value < $\alpha$, we reject the null hypothesis that there is no change or difference

Conclusions for NYC birth-rate data
- The p-value = 0.734 for the NYC birth-rate data, so we cannot reject the null hypothesis at an $\alpha$-level of 0.05
- Another way of saying this is that the difference between the null hypothesis and our data is not statistically significant
- So, we conclude that the data do not support the idea that there was a different birth rate than usual for the first two weeks of August, 1966. No blackout baby boom effect!

Tests and Intervals
- There is a close connection between confidence intervals and two-sided hypothesis tests
- A $100 \cdot C$% confidence interval contains likely values for a population parameter, like the pop. mean $\mu$
  - Interval is centered around the sample mean $\bar{X}$
  - Width of interval is a multiple of $SD(\bar{X}) = \sigma/\sqrt{n}$
- An $\alpha$-level hypothesis test rejects the null hypothesis that $\mu = \mu_0$ if the test statistic $Z$ has a p-value less than $\alpha$
  $$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

Tests and Intervals
- If our confidence level $C$ is equal to $1 - \alpha$, where $\alpha$ is the level of the hypothesis test, then we have the following connection between tests and intervals:
- A two-sided hypothesis test rejects the null hypothesis ($\mu = \mu_0$) if our hypothesized value $\mu_0$ falls outside the confidence interval for $\mu$
- So, if we have already calculated a confidence interval for $\mu$, then we can test any hypothesized value $\mu_0$ just by seeing whether or not $\mu_0$ is in the interval!
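
The p-value calculation and the connection between tests and intervals can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a normal table. The helper names are illustrative only, and $s$ is again treated as if it were $\sigma$. Because the code does not round $Z$ before looking up the probability, its p-value for the birth data comes out near 0.73 and may differ from the slide's table-based 0.734 in the third decimal.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_p_value(z):
    """Probability that a standard normal value is at least as far from 0 as z."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def z_confidence_interval(sample_mean, sd, n, alpha):
    """100*(1 - alpha)% confidence interval for the mean, treating sd as sigma."""
    z_star = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    margin = z_star * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


def rejects_null(sample_mean, mu_0, sd, n, alpha):
    """Two-sided alpha-level test of the null hypothesis mu = mu_0."""
    z = (sample_mean - mu_0) / (sd / math.sqrt(n))
    return two_sided_p_value(z) < alpha


alpha = 0.05
low, high = z_confidence_interval(433.6, 39.4, 14, alpha)

# Compare the two routes to a decision for two hypothesized means.
for mu_0 in (430, 410):
    by_test = rejects_null(433.6, mu_0, 39.4, 14, alpha)
    by_interval = not (low <= mu_0 <= high)
    print(f"mu_0 = {mu_0}: reject by test = {by_test}, outside interval = {by_interval}")
```

For each hypothesized mean, the test and the interval check give the same decision, which is the connection the slide describes.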

Example: NYC blackout baby boom
- Births per day from two weeks in August 1966: $\bar{X} = 433.6$, $s = 39.4$, $n = 14$
- The difference between our sample mean and the hypothesized population mean $\mu_0 = 430$ had a p-value of 0.734, so we did not reject the null hypothesis at an $\alpha$-level of 0.05
- Could have calculated a $100(1-\alpha)$% = 95% confidence interval:
  $$\left(\bar{X} - Z^* \frac{\sigma}{\sqrt{n}},\ \bar{X} + Z^* \frac{\sigma}{\sqrt{n}}\right) = \left(433.6 - 1.96 \cdot \frac{39.4}{\sqrt{14}},\ 433.6 + 1.96 \cdot \frac{39.4}{\sqrt{14}}\right) = (413.0,\ 454.2)$$
- Since our hypothesized $\mu_0 = 430$ is within our interval of likely values, we do not reject the null hypothesis
- If the hypothesis was $\mu_0 = 410$, then we would reject it!

Another Example: Calcium in the Diet
- Calcium is a crucial element in the body. Recommended daily allowance (RDA) for adults is 850 mg/day
- Random sample of 18 people below poverty level: $\bar{X} = 747.4$ mg, $n = 18$
- Do the data support the claim that people below the poverty level have a different calcium intake than the recommended daily allowance?

Hypothesis Test for Calcium
- Let $\mu$ be the mean calcium intake for people below the poverty line
- Null hypothesis is that calcium intake for people below the poverty line is not different from the RDA: $H_0: \mu = \mu_0 = 850$ mg/day
- Two-sided alternative hypothesis: $H_a: \mu \neq 850$ mg/day
- To calculate the test statistic, we need to know the population standard deviation of daily calcium intake. From a previous study, we know $\sigma = 188$ mg
  $$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{747 - 850}{188/\sqrt{18}} = -2.32$$
- Need p-value: if $\mu = 850$, what is the probability we get a sample mean as extreme as (or more extreme than) 747?

p-value for Calcium
- We have a two-sided alternative, so the p-value includes standard normal probabilities on both sides:

[Figure: standard normal curve with a tail area of 0.010 below Z = -2.32 and a tail area of 0.010 above Z = 2.32]

- Looking up the probability in the table, we see that the two-sided p-value is 0.010 + 0.010 = 0.02
- Since the p-value is less than 0.05, we can reject the null hypothesis
- Conclusion: people below the poverty line have significantly (at an $\alpha = 0.05$ level) lower calcium intake than the RDA

Confidence Interval for Calcium
- Alternatively, we calculate a confidence interval for the calcium intake of people below the poverty line
- Use confidence level $100 \cdot C = 100(1-\alpha) = 95$%
- 95% confidence level means critical value $Z^* = 1.96$
  $$\left(\bar{X} - Z^* \frac{\sigma}{\sqrt{n}},\ \bar{X} + Z^* \frac{\sigma}{\sqrt{n}}\right) = \left(747 - 1.96 \cdot \frac{188}{\sqrt{18}},\ 747 + 1.96 \cdot \frac{188}{\sqrt{18}}\right) = (660.1,\ 833.9)$$
- Since our hypothesized value $\mu_0 = 850$ mg is not in the 95% confidence interval, we can reject that hypothesis right away!

Cautions about Hypothesis Tests
- Statistical significance does not necessarily mean real significance
  - If the sample size is large, even very small differences can have a low p-value
- Lack of significance does not necessarily mean that the null hypothesis is true
  - If the sample size is small, there could be a real difference, but we are not able to detect it
- Many assumptions went into our hypothesis tests
  - Presence of outliers, low sample sizes, etc. make our assumptions less realistic
- We will try to address some of these problems next class
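
The first two cautions above can be illustrated with a short calculation. This is a hypothetical sketch, not something from the lecture: it keeps the observed mean and standard deviation from the birth-rate example, but the larger sample size of 1400 is invented purely to show how the p-value reacts to sample size. The function name is illustrative as well.

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu_0, sd, n):
    """Two-sided p-value for the Z test of mu = mu_0, treating sd as sigma."""
    z = (sample_mean - mu_0) / (sd / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Numbers from the birth-rate example: a 3.6 births/day difference with n = 14.
p_small_sample = two_sided_p_value(433.6, 430, 39.4, n=14)

# HYPOTHETICAL: the same mean and standard deviation, but pretend n = 1400.
p_large_sample = two_sided_p_value(433.6, 430, 39.4, n=1400)

print(f"n = 14:   p-value = {p_small_sample:.3f}")
print(f"n = 1400: p-value = {p_large_sample:.4f}")
```

The same 3.6 births/day difference is far from significant with 14 observations, but would fall well below 0.05 with the made-up larger sample. A small p-value therefore does not mean the difference is large, and a large p-value from a small sample does not prove there is no difference.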

Next Class - Lecture 15
- Practice Problems in Chapter 6
- Enjoy Spring Break!