Hypothesis Testing I

The testing process:

1. An assumption about the population parameter(s) is made, called the null hypothesis, denoted $H_0$.
2. Then the alternative is chosen (often just the negation of the null hypothesis), called the alternative hypothesis, denoted $H_1$ or $H_a$.
3. After the random sample $X_1, \dots, X_n$ is collected, the test statistic is calculated:
   $$T = T(X_1, \dots, X_n) \tag{1.1}$$
4. The researcher decides on the rejection region $R$, the set of values of $T$ that would warrant the rejection of $H_0$. If the test statistic is in that region, i.e.,
   $$T \in R \tag{1.2}$$
   then we reject the null hypothesis. (Step 4 is generally done before step 3.)
5. If the hypothesis is not rejected, meaning that (1.2) is false, then we say that the empirical data do not support the rejection of the null hypothesis. Depending on the research, we might also consider that situation as reaffirming the null hypothesis, though that is rare.

1.1 Testing Proportion

Note: In this portion of testing hypotheses all the samples are large. This means we shall use the Central Limit Theorem in our work.

Example 1. A claim has been made by a political action committee that a certain candidate in the local elections has the support of 55% of the voters in the district (this is the null hypothesis). We suspect that this is not true and that his support is less than that (this is the alternative hypothesis). We shall poll 200 voters to examine the claim. The sample mean (proportion) is our test statistic. If the support in our poll is below 50% we shall reject the claim (i.e., the rejection region for the test statistic is anything between 0 and 50%). Let us denote by $p$ the proportion of the population in the district that supports the candidate.

$$H_0: p = .55$$
$$H_1: p < .55$$
$$T = \bar{X} = \frac{1}{200} \sum_{i=1}^{200} X_i$$
($X_i$ is 1 if the $i$-th voter supports the candidate, 0 if not)
$$R = [0, .5)$$

There are three types of alternative hypotheses. The test with the alternative above is called a left-tailed test. If we used $H_1: p > .55$, the test would be a right-tailed test. Lastly, if we used just $H_1: p \ne .55$, the test would be called a two-tailed test.

Since the random sample used for testing is not a census, there is a possibility of making two types of error:

Type I: rejecting the null hypothesis when it is in fact true
Type II: not rejecting the null hypothesis when it is in fact false

The probability of a Type I error is called the level of significance of the test and is often written $\alpha$. The probability of a Type II error is often written $\beta$. The complementary probability, $1 - \beta$, is called the power of the test: the probability that the null hypothesis is rejected when it is false.

Back to Example 1:

(a) $$\alpha = P\left(\sum_{i=1}^{200} X_i < 100 \;\middle|\; p = .55\right)$$
(to calculate that we need to calculate all the binomial probabilities; check with me after class if you want to know how)

(b) (optional)
$$\beta = P\left(\sum_{i=1}^{200} X_i \ge 100 \;\middle|\; p\right) = \sum_{k=100}^{200} \binom{200}{k} p^k (1-p)^{200-k}$$

Thus the power of the test depends on the actual (true) value of the population parameter $p$. In this regard we write the power of the test, $K(p) = 1 - \beta(p)$, as a function of $p$. For example, if the true value of $p$ is .5 then $\beta = .528$, and the power of the test is .472. If $p$ is .45 then $\beta = .0887$, and the power of the test is .9113. The further the true value of $p$ is from the claimed (null hypothesis) value, the more power our test has, which means it is more likely that we will reject the null hypothesis. That is certainly not surprising.

According to the previous example, a Type II error can be quite likely and in fact happens very often. That is the reason why statisticians would rather use the term "not rejecting" than "accepting" in the definition of Type II error.

Our approach now is that we first assume the population parameter is as in the null hypothesis and then establish the acceptable Type I error. We shall also use the normal approximation to the binomial via the Central Limit Theorem. Assume that the null hypothesis is
$$H_0: p = p_0 \tag{1.3}$$
Then $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ is approximately normal (assuming large $n$, such as 200 in our example). The mean of this random variable is $p_0$ and the standard deviation is $\sqrt{p_0 q_0 / n}$, where $q_0 = 1 - p_0$. Therefore, in case the alternative hypothesis is
$$H_1: p < p_0 \tag{1.4}$$
we have to consider the following (assuming the null hypothesis holds), according to the CLT:
$$P\left(\frac{\bar{X} - p_0}{\sqrt{p_0 q_0 / n}} < -z_\alpha\right) \approx \alpha \tag{1.5}$$
where $z_\alpha$ is the value that a standard normal variable exceeds with probability $\alpha$.
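The exact binomial values quoted for Example 1 can be checked numerically before relying on the normal approximation. The following is a minimal Python sketch, assuming the poll of 200 voters, the claimed value $p_0 = .55$, and the rule "reject when fewer than 100 polled voters support the candidate"; the names `binom_cdf` and `power` are illustrative and not part of the notes.

```python
from math import comb

N = 200        # assumed sample size in Example 1
CUTOFF = 100   # assumed rule: reject H0 when fewer than 100 voters support the candidate
P0 = 0.55      # proportion claimed by the null hypothesis


def binom_cdf(k, n, p):
    """P(S <= k) for S ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def power(p, n=N, cutoff=CUTOFF):
    """K(p): probability that the test rejects H0 when the true proportion is p."""
    return binom_cdf(cutoff - 1, n, p)


# The significance level is the rejection probability when H0 is true.
alpha = power(P0)
print(f"alpha = {alpha:.4f}")

# Type II error and power for a few alternative values of p.
for p in (0.50, 0.45):
    k = power(p)
    print(f"p = {p:.2f}: beta = {1 - k:.4f}, power = {k:.4f}")
```

Under these assumptions the loop should reproduce, up to rounding, the values of $\beta$ and of the power given in the text for $p = .5$ and $p = .45$.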

Based on (1.5), for the test statistic $\hat{p} = \bar{X}$ we choose the rejection region (left-tailed test)
$$\frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} < -z_\alpha \tag{1.6}$$
given the significance level of the test $\alpha$. In a similar manner, given the significance level $\alpha$, we should consider

(right-tailed test)
$$H_1: p > p_0: \qquad \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} > z_\alpha \tag{1.7}$$
(two-tailed test)
$$H_1: p \ne p_0: \qquad \frac{|\hat{p} - p_0|}{\sqrt{p_0 q_0 / n}} > z_{\alpha/2} \tag{1.8}$$
as the rejection regions for the respective alternative hypotheses.

Example 1 (continued). In the actual sample of 200 voters, 98 of the voters supported the candidate. Assuming significance of the test at 5%, we see that
$$\frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{.49 - .55}{\sqrt{.55 \cdot .45 / 200}} \approx -1.71 < -1.645 = -z_{.05}$$
This means that at the significance level of 5% the claim by the political action committee should be rejected: the candidate's support among voters is not 55% but likely below that.
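The three rejection regions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `z_statistic` and `reject_null` and the `tail` labels are hypothetical, the critical values come from the standard library's `statistics.NormalDist`, and the sample values (98 supporters among 200 voters, $p_0 = .55$, $\alpha = .05$) are those of Example 1.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(successes, n, p0):
    """Standardized sample proportion (p_hat - p0) / sqrt(p0 * q0 / n)."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def reject_null(z, alpha, tail):
    """Return True when z falls in the rejection region for the given tail."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return z < -std_normal.inv_cdf(1 - alpha)       # z < -z_alpha
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)        # z > z_alpha
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Example 1 (continued): 98 supporters among 200 polled voters, H0: p = .55.
z = z_statistic(98, 200, 0.55)
print(round(z, 2), reject_null(z, 0.05, "left"))
```

Note that the same statistic would not be rejected by a two-tailed test at 5%, since the two-tailed critical value $z_{.025}$ is larger than $z_{.05}$.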

Important Note: The chance of getting an even more extreme test statistic, i.e., the chance that we get anything even smaller than $-1.71$, is called the p-value of the test (this is not to be confused with the power of the test), and in our case it is 4.4%. The rule of thumb is that whenever the p-value is smaller than the significance level of the test, the null hypothesis should be rejected.

Example 2. We are testing commercially manufactured dice. The dice should not have been impacted by the indentation or the manufacturing process. We rolled a single die 10,000 times in the test and got 6 precisely 1721 times. Is the die manufactured correctly? Let $p$ be the probability of the die rolling 6. The null hypothesis is that the die is manufactured correctly, i.e.,
$$H_0: p = \frac{1}{6}$$
and the alternative hypothesis is that the side with 6 is lighter, which implies the right-tailed alternative
$$H_1: p > \frac{1}{6}$$
If we use a significance level of 10% and the right-tailed test with 10,000 rolls, then rejection of the null hypothesis would follow if
$$\frac{\hat{p} - \frac{1}{6}}{\sqrt{\frac{1}{6} \cdot \frac{5}{6} / 10{,}000}} > z_{.10} = 1.282$$
With $\hat{p} = .1721$ the left-hand side is about 1.46, so this is true and the initial hypothesis should be rejected. Therefore the manufactured die is likely faulty. The p-value of the test here is .072.
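The p-value rule can be illustrated with a short Python sketch that uses the normal approximation. The function name `p_value` and the `tail` labels are hypothetical, and the sample counts are assumptions read from the examples: 98 supporters among 200 voters in Example 1, and 1721 sixes in 10,000 rolls in Example 2.

```python
from math import sqrt
from statistics import NormalDist


def p_value(successes, n, p0, tail):
    """Normal-approximation p-value for a one-sample proportion test."""
    z = (successes / n - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if tail == "left":
        return cdf(z)                  # chance of a statistic below z
    if tail == "right":
        return 1 - cdf(z)              # chance of a statistic above z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))   # chance of a statistic beyond |z| on either side
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Example 1 (assumed data): left-tailed test of H0: p = .55 at the 5% level.
p1 = p_value(98, 200, 0.55, "left")
print(f"Example 1: p-value = {p1:.3f}, reject = {p1 < 0.05}")

# Example 2 (assumed data): right-tailed test of H0: p = 1/6 at the 10% level.
p2 = p_value(1721, 10_000, 1 / 6, "right")
print(f"Example 2: p-value = {p2:.3f}, reject = {p2 < 0.10}")
```

Comparing the p-value with the significance level gives the same decision as checking whether the test statistic lies in the rejection region; the p-value additionally shows how close the decision was.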

Example 3. In a recent publication by the automotive industry it was claimed that only 16% of New Yorkers drive a pickup truck (similar percentage nation-wide is 2%). We would like to examine that claim using 5% significance. Since we have no inclination to claim either more or less, we shall use a two-tailed test. Our sample is made of 90 vehicles in the parking lot of Walmart, Commack Commons, 12 of which are pickup trucks. The hypotheses and the test statistic are
$$H_0: p = .16$$
$$H_1: p \ne .16$$
$$\frac{|\hat{p} - p_0|}{\sqrt{p_0 q_0 / n}} = \frac{|12/90 - .16|}{\sqrt{.16 \cdot .84 / 90}} \approx .69 < 1.96 = z_{.025}$$
Obviously we cannot reject the claim. Once again, pay attention to the wording of your final conclusion: not rejecting does not necessarily mean accepting!

Example 4. It has been claimed that the Superbowl is watched by 1 out of 2 male Americans. In a poll conducted about that, 128 out of 273 male respondents answered that they watched the last Superbowl. Should we reject the hypothesis that 50% of males watch the Superbowl at the given significance level $\alpha$? The data we got from the sample indicate a lower percentage than what is claimed. Thus we are setting up a one-sided left-tailed test:
$$H_0: p = .5$$
$$H_1: p < .5$$
$$\frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{128/273 - .5}{\sqrt{.5 \cdot .5 / 273}} \approx -1.03 > -z_\alpha$$
At this significance level we should not reject the claim. By the way, the p-value of this test is .1518.

Homework: Check online.

