Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1
5.2 Overview of Distribution-Free Tests Most inferential statistics assume normal distributions. Although these statistical tests work well even if the assumption of normality is violated, extreme deviations from normality can distort the results. Usually the effect of violating the assumption of normality is to decrease the Type I error rate. Although this may sound like a good thing, it often is accompanied by a substantial decrease in power. Slide 3
There is a collection of tests called distribution-free tests that do not make any assumptions about the distribution from which the numbers were sampled; thus the name distribution-free. The main advantage of distribution-free tests is that they provide more power than traditional tests when the samples are from highly-skewed distributions. Since alternative means of dealing with skewed distributions such as taking logarithms or square roots of the data are available, distribution-free tests have not achieved a high level of popularity. Slide 4
Distribution-free tests are nonetheless a worthwhile alternative. Moreover, computer analysis has made possible new and promising variations of these tests. Distribution-free tests are sometimes referred to as non-parametric tests because, strictly speaking, they do not test hypotheses about population parameters. Nonetheless, they do provide a basis for making inferences about populations and are therefore classified as inferential statistics. Slide 5
5.3 Median Test for Two Independent Samples The Median test investigates whether two independent samples come from populations with the same median. The null hypothesis, $H_0$, is that the medians are the same; it is tested against the alternative hypothesis, $H_1$, that they are different. Slide 6
Example 5.1 Suppose there is a new way of teaching spelling in grade schools. A teacher has two classes of students, and the students were assigned randomly to the two classes. She teaches one class using the new method, A, and the other class using her usual method, B. Then she gives both classes the same spelling test. These scores are obtained:
Method A: 10 10 10 12 15 17 17 19 20 22 25 26
Method B: 6 7 8 8 12 16 19 19 22
Assuming that the students have comparable spelling ability, is the new method better? Slide 7
Solution: To determine whether Method A is better, we check whether a larger proportion of students score above the overall median with Method A than with Method B. If the two methods are equally effective, we would expect the same proportion of students in each class to be above the common median. List the 21 scores in increasing order to determine the overall median:
6 7 8 8 10 10 10 12 12 15 16 17 17 19 19 19 20 22 22 25 26
The median of the combined set is the 11th score, which is 16. Slide 8
For Method A: 7 out of the 12 values are above the median, so $P_A = \frac{7}{12}$. For Method B: 3 out of the 9 values are above the median, so $P_B = \frac{3}{9} = \frac{1}{3}$. To test: $H_0: P_A - P_B = 0$ (the methods are equally effective) vs $H_1: P_A - P_B > 0$ (Method A is better) at the α = 5% level. Slide 9
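Under $H_0$,
$$Z_{cal} = \frac{\frac{7}{12} - \frac{1}{3}}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{12} + \frac{1}{9}\right)}} = 1.135, \qquad \text{where } \hat{p} = \frac{7 + 3}{12 + 9} = \frac{10}{21} \text{ and } \hat{q} = 1 - \hat{p} = \frac{11}{21}.$$
We reject $H_0$ if $Z_{cal} > 1.645$. Since 1.135 < 1.645, at the α = 5% level we do not reject $H_0$ and conclude that there is insufficient evidence to suggest that Method A is better. Slide 10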
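The arithmetic of Example 5.1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `median_test_z` and its return layout are illustrative choices, not part of the lecture, and the statistic is the pooled two-proportion z used on the slide.

```python
import math
from statistics import median

method_a = [10, 10, 10, 12, 15, 17, 17, 19, 20, 22, 25, 26]
method_b = [6, 7, 8, 8, 12, 16, 19, 19, 22]


def median_test_z(sample_1, sample_2):
    """Pooled two-proportion z statistic for the median test (illustrative helper)."""
    grand_median = median(sample_1 + sample_2)
    # Count the scores strictly above the overall median in each sample.
    above_1 = sum(x > grand_median for x in sample_1)
    above_2 = sum(x > grand_median for x in sample_2)
    n_1, n_2 = len(sample_1), len(sample_2)
    p_1, p_2 = above_1 / n_1, above_2 / n_2
    # Pooled proportion under H0 (equal proportions above the median).
    p_pooled = (above_1 + above_2) / (n_1 + n_2)
    std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_1 + 1 / n_2))
    return grand_median, above_1, above_2, (p_1 - p_2) / std_error


grand_median, above_a, above_b, z_cal = median_test_z(method_a, method_b)
print(grand_median, above_a, above_b, round(z_cal, 3))  # 16 7 3 1.135
```

The printed z value is compared with the one-sided 5% critical value 1.645, as on the slide.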
5.4 Rank-Sum (Mann-Whitney) Test for Two Independent Samples When the populations being compared are not normal, this test is used in place of a two-sample t test. It requires independent random samples of sizes $n_1$ and $n_2$. The test is simple: combine the two samples into one sample of size $n_1 + n_2$, sort the result, assign ranks to the sorted values (giving the average rank to any tied observations), and then let T be the sum of the ranks for the observations in the first sample. Slide 11
If the two populations have the same distribution, then the average rank in the first sample should be close to the average rank in the second sample; with equal sample sizes, the two rank sums should be close as well. Slide 12
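The ranking procedure described above (pool, sort, average the ranks of ties, sum the ranks of the first sample) can be sketched as follows. The helper names `average_ranks` and `rank_sum` are hypothetical, and the two small samples are made-up numbers chosen only to show how a three-way tie is handled.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def rank_sum(first_sample, second_sample):
    """Sum of the pooled ranks that belong to the first sample (the statistic T)."""
    ranks = average_ranks(list(first_sample) + list(second_sample))
    return sum(ranks[: len(first_sample)])


# Hypothetical data: the value 5 appears three times and shares rank (3 + 4 + 5) / 3 = 4.
first = [3, 5, 5]
second = [1, 5, 7]
print(average_ranks(first + second))  # [2.0, 4.0, 4.0, 1.0, 4.0, 6.0]
print(rank_sum(first, second))        # 10.0
```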
Example 5.1 To compare the running speed of first-grade boys and girls, the following information is collected.
Rank  1  2  3  4  5  6  7  8  9  10  11
Sex   G  B  G  G  G  G  B  B  B  G   B
Is there a difference between the running speed of boys and girls? Slide 13
Solution: Analyze the data according to sex:
Boys: 5 observations, ranks 2, 7, 8, 9, 11; rank sum 37
Girls: 6 observations, ranks 1, 3, 4, 5, 6, 10; rank sum 29
Slide 14
To test: $H_0$: There is no difference in the running speed of boys and girls vs $H_1$: There is a difference in the running speed of boys and girls (2-tailed test) at the α = 5% level. Slide 15
R = sum of ranks in the smaller sample = 37. From TABLE 5 with $n_1 = 5$ and $n_2 = 6$, the critical values are (19, 41). Since R = 37 lies between 19 and 41, we do not reject $H_0$ and conclude that there is insufficient evidence of a difference in the running speed of boys and girls. Slide 16
Approximate z test (just for illustration)
$$\mu = \tfrac{1}{2} \cdot 5 \cdot (5 + 6 + 1) = 30, \qquad \sigma^2 = \frac{5 \cdot 6 \cdot (5 + 6 + 1)}{12} = 30$$
$$z_{cal} = \frac{37 - 30}{\sqrt{30}} = 1.278 < 1.96$$
So, at the 5% level, we do not reject $H_0$ and conclude that there is insufficient evidence of a difference in the running speed of boys and girls. Note: The approximation should be applied only when both sample sizes are more than 10. Slide 17
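The rank sums and the approximate z statistic for the running-speed example can be reproduced with a few lines. This is a sketch under the slide's own formulas; variable names such as `finish_order` are illustrative.

```python
import math

# Sex of the child holding rank 1, 2, ..., 11, as given in the example.
finish_order = "G B G G G G B B B G B".split()

boys_ranks = [rank for rank, sex in enumerate(finish_order, start=1) if sex == "B"]
girls_ranks = [rank for rank, sex in enumerate(finish_order, start=1) if sex == "G"]
n_1, n_2 = len(boys_ranks), len(girls_ranks)  # boys are the smaller sample
r_boys, r_girls = sum(boys_ranks), sum(girls_ranks)

# Mean and variance of the rank sum of the smaller sample under H0.
mean_r = n_1 * (n_1 + n_2 + 1) / 2
var_r = n_1 * n_2 * (n_1 + n_2 + 1) / 12
z_cal = (r_boys - mean_r) / math.sqrt(var_r)

print(r_boys, r_girls, mean_r, var_r, round(z_cal, 3))  # 37 29 30.0 30.0 1.278
```

With samples this small the table lookup is the appropriate test; the z value is shown only to mirror the illustration on the slide.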
5.5 The Sign Test for Matched Data Given n pairs of data, the sign test tests the hypothesis that the median of the differences in the pairs is zero. The test statistic is the number of positive differences. If the null hypothesis is true, then the numbers of positive and negative differences should be approximately the same. Let X be the number of positive differences. Under $H_0$, $X \sim \text{Bin}(n, \tfrac{1}{2})$. Slide 18
Example 5.2 In a study, the average number of seeds in two pods was recorded at both the top and the bottom of 10 plants. The objective of the study was to determine whether the position on the plant affected the number of seeds in the pods.
Plant     1    2    3    4    5    6    7    8    9    10
Top      4.0  5.2  5.7  4.2  4.8  3.9  4.1  3.0  4.6  6.8
Bottom   4.4  3.7  4.7  2.8  4.2  4.3  3.5  3.7  3.1  1.9
Slide 19
Solution:
Plant                   1    2    3    4    5    6    7    8    9    10
Top                    4.0  5.2  5.7  4.2  4.8  3.9  4.1  3.0  4.6  6.8
Bottom                 4.4  3.7  4.7  2.8  4.2  4.3  3.5  3.7  3.1  1.9
Sign of Top - Bottom    -    +    +    +    +    -    +    -    +    +
Number of +, X = 7. Slide 20
To test: $H_0$: the position on the plant does NOT affect the number of seeds in the pods vs $H_1$: the position on the plant does affect the number of seeds in the pods, at the α = 5% level. Slide 21
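Under $H_0$, $X \sim \text{Bin}(10, \tfrac{1}{2}) \approx N(5, 1.5811^2)$, so
$$Z_{cal} = \frac{7 - 5}{1.5811} = 1.2649 < 1.96,$$
using the two-sided critical value because $H_1$ is two-sided (the one-sided value 1.645 gives the same decision). Therefore, we do not reject $H_0$ and conclude that there is insufficient evidence to suggest that the position on the plant affects the number of seeds in the pods. Slide 22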
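The sign test for the seed data can be sketched in code. Besides the normal approximation used in the example, the script also computes an exact two-sided binomial p-value; that exact calculation is an addition for comparison and is not part of the lecture. Dropping zero differences is a common convention and is assumed here, although no pair in this data set is tied.

```python
import math

top = [4.0, 5.2, 5.7, 4.2, 4.8, 3.9, 4.1, 3.0, 4.6, 6.8]
bottom = [4.4, 3.7, 4.7, 2.8, 4.2, 4.3, 3.5, 3.7, 3.1, 1.9]

differences = [t - b for t, b in zip(top, bottom)]
n = sum(d != 0 for d in differences)     # pairs with a non-zero difference
x_pos = sum(d > 0 for d in differences)  # X, the number of positive differences

# Normal approximation: X ~ Bin(n, 1/2) has mean n/2 and variance n/4.
z_cal = (x_pos - n / 2) / math.sqrt(n / 4)

# Exact two-sided p-value: double the smaller binomial tail, capped at 1.
upper_tail = sum(math.comb(n, k) for k in range(x_pos, n + 1)) / 2**n
lower_tail = sum(math.comb(n, k) for k in range(0, x_pos + 1)) / 2**n
p_two_sided = min(1.0, 2 * min(upper_tail, lower_tail))

print(x_pos, round(z_cal, 4), p_two_sided)  # 7 1.2649 0.34375
```

Both routes lead to the same decision at the 5% level: the null hypothesis is not rejected.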
5.6 The Wilcoxon Test for Matched Pairs This test is similar to the sign test in that it tests whether the median difference in paired data is zero. The test consists of sorting the absolute values of the differences from smallest to largest, assigning ranks to the absolute values (rank 1 to the smallest, rank 2 to the next smallest, and so on, with tied values sharing the average rank) and then finding the sum of the ranks of the positive differences. Slide 23
If the null hypothesis is true, the sum of the ranks of the positive differences should be about the same as the sum of the ranks of the negative differences. Slide 24
Example 5.3 Consider the data in Example 5.2:
Plant     1    2    3    4    5    6    7    8    9    10
Top      4.0  5.2  5.7  4.2  4.8  3.9  4.1  3.0  4.6  6.8
Bottom   4.4  3.7  4.7  2.8  4.2  4.3  3.5  3.7  3.1  1.9
Slide 25
Solution:
Plant             1     2     3     4     5     6     7     8     9     10
Top               4.0   5.2   5.7   4.2   4.8   3.9   4.1   3.0   4.6   6.8
Bottom            4.4   3.7   4.7   2.8   4.2   4.3   3.5   3.7   3.1   1.9
Difference d_i   -0.4   1.5   1.0   1.4   0.6  -0.4   0.6  -0.7   1.5   4.9
Rank of |d_i|     1.5   8.5   6     7     3.5   1.5   3.5   5     8.5   10
The sum of the positive ranks is $R^+ = 47$, while the sum of the negative ranks is $R^- = 8$. So, $T = \min(R^+, R^-) = 8$. Slide 26
To test: $H_0$: the position on the plant does NOT affect the number of seeds in the pods vs $H_1$: the position on the plant does affect the number of seeds in the pods, at the α = 5% level. Slide 27
From TABLE 6 (Critical Values for T in the Wilcoxon Matched-Pairs Signed-Rank Test), the critical value for N = 10 at α = 5% is 10. Since T = 8 < 10, we reject $H_0$ and conclude that there is sufficient evidence to suggest that the position on the plant affects the number of seeds in the pods. Slide 28
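The signed-rank calculation in Example 5.3 can be reproduced as follows. This is a minimal sketch: the helper `average_ranks` is a hypothetical name, and rounding the differences to one decimal place is an assumption made here so that values such as 4.0 - 4.4 and 3.9 - 4.3 compare as exact ties in floating point.

```python
top = [4.0, 5.2, 5.7, 4.2, 4.8, 3.9, 4.1, 3.0, 4.6, 6.8]
bottom = [4.4, 3.7, 4.7, 2.8, 4.2, 4.3, 3.5, 3.7, 3.1, 1.9]


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    # A value first seen at 0-based index i and repeated c times occupies
    # ranks i+1 .. i+c, whose mean is i + 1 + (c - 1) / 2.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


# Round to one decimal (the precision of the data) and drop zero differences.
differences = [round(t - b, 1) for t, b in zip(top, bottom)]
differences = [d for d in differences if d != 0]

ranks = average_ranks([abs(d) for d in differences])
r_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
r_minus = sum(r for r, d in zip(ranks, differences) if d < 0)
t_stat = min(r_plus, r_minus)

print(ranks)                    # [1.5, 8.5, 6.0, 7.0, 3.5, 1.5, 3.5, 5.0, 8.5, 10.0]
print(r_plus, r_minus, t_stat)  # 47.0 8.0 8.0
```

As a check, the two rank sums add up to 10 * 11 / 2 = 55, the total of the ranks 1 to 10.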
Sign Test versus Wilcoxon Signed-Ranks Test
The Sign Test: This is used when the direction of the difference between matched pairs of data can be determined, but the magnitude cannot. The test computes the difference between the two variables for all cases and classifies the differences as either positive, negative, or tied. Slide 29
The Wilcoxon Signed-Ranks Matched Pair Test: This is used when both the direction and the magnitude of the difference between matched pairs of data can be determined. This test considers information about both the sign of the differences and the magnitude of the differences between pairs.
Conclusion: If both the Sign test and the Wilcoxon test can be performed, the Wilcoxon test is the better choice, as it incorporates more information about the data. Slide 30