Wilcoxon Rank Sum or Mann-Whitney Test Chapter 7.11




STAT - Nonparametric tests

Here's a summary of the tests we will look at:

  Setting                   Normal test          Nonparametric test
  One sample                One-sample t-test    Sign test; Wilcoxon signed-rank test
  Matched pairs             (apply a one-sample test to the differences within pairs)
  Two independent samples   Two-sample t-test    Wilcoxon rank sum test

Wilcoxon Rank Sum or Mann-Whitney Test (Chapter 7.11)

Nonparametric comparison of two groups.

Main Idea: If the two groups come from the same distribution, and the labels have just been randomly assigned, then the values in the two groups should be roughly evenly interleaved when the samples are pooled.

  Group A: X_1, ..., X_{n1} ~ F_A
  Group B: Y_1, ..., Y_{n2} ~ F_B

  Null hypothesis H_0: F_A = F_B

Artificial example:

  gpA: 4.3  3.2         (n1 = 2)
  gpB: 1.9  0.3  3.3    (n2 = 3)

Order all observations in the combined sample (largest first) and assign ranks (* marks the gpA data):

  Order:         4.3*  3.3   3.2*  1.9   0.3
  Assign ranks:   1     2     3     4     5

Test statistic: R = sum of ranks attached to group A = 1 + 3 = 4.

Under H_0, each 2-subset of the ranks {1, 2, 3, 4, 5} is equally likely to occur as the ranks of X_1, X_2. The possible ranks for X_1, X_2, with their sums:

  subset   {1,2} {1,3} {1,4} {1,5} {2,3} {2,4} {2,5} {3,4} {3,5} {4,5}
  sum = R    3     4     5     6     5     6     7     7     8     9

Hence the distribution of R under H_0 is given by

  r             3     4     5     6     7     8     9
  P_H0(R = r)  1/10  1/10  2/10  2/10  2/10  1/10  1/10

In our toy example R = 4, so the one-sided P-value is

  P = P_H0(R <= 4) = P(seeing a value as small as or smaller than observed) = 2/10.

Note: the value of R, and hence the P-value, does not depend on the exact values of X_1, X_2, only on their ranks in the combined sample.
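The exhaustive enumeration in the toy example is easy to check by brute force. A minimal sketch in Python (the notes otherwise use R) that rebuilds the null distribution of R for n1 = 2, n2 = 3:

```python
from itertools import combinations

# Exact null distribution of R (rank sum of group A) for n1 = 2, n2 = 3:
# under H0 every 2-subset of the ranks {1,...,5} is equally likely.
ranks = range(1, 6)
sums = [sum(s) for s in combinations(ranks, 2)]          # 10 equally likely subsets
dist = {r: sums.count(r) / len(sums) for r in sorted(set(sums))}

print(dist)          # {3: 0.1, 4: 0.1, 5: 0.2, 6: 0.2, 7: 0.2, 8: 0.1, 9: 0.1}
p_one_sided = sum(p for r, p in dist.items() if r <= 4)  # observed R = 4
print(p_one_sided)   # 0.2
```

The same enumeration works for any n1, n2, which is exactly what printed Wilcoxon tables (and R's `pwilcox`) precompute.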

The distribution of R under H_0 does not depend on the distribution of X or Y: it is a fixed distribution (which does, however, depend on n1 and n2). For this reason such methods are called distribution-free, or nonparametric.

Ranks can be sensitive to rounding. If we had an additional data point 3.31 > 3.3, the new point would take rank 2 and 3.3 would move down to rank 3. If we rounded, however, we would have a tie, because we would have two points at 3.3. We would then split the two ranks these points occupy (2 & 3) between them, so each would get rank 2.5.

More generally: X_1, ..., X_{n1} in group A, distribution F_A; Y_1, ..., Y_{n2} in group B, distribution F_B.

1. Combine the samples into one sample of W_i's. Order the data in the combined sample: W_(1) <= W_(2) <= ... <= W_(n1+n2).
2. Assign rank i to the i-th smallest observation (in the case of ties, assign the average rank to each tied observation).
3. Let R_obs = sum of ranks attached to observations in sample 1.
4. K_1 = R_obs - n1(n1+1)/2.
5. U_s^obs = max(K_1, n1*n2 - K_1).
6. Find the distribution of U_s under H_0. Reject if P(U_s >= U_s^obs) <= alpha. More precisely, U_alpha[n1, n2] is the largest u such that P(U_s >= u) <= alpha.

A couple of notes:

- If R_2 = sum of ranks attached to observations in sample 2, then R_1 + R_2 = sum_{i=1}^{n1+n2} i = (n1+n2)(n1+n2+1)/2, so knowing R_1 is equivalent to knowing R_2.
- Equivalently, you can find K_1 = sum_{j=1}^{n1} #{Y's < X_j} and K_2 = n1*n2 - K_1 = sum_{i=1}^{n2} #{X's < Y_i}, as is done in the book. [One can show algebraically that these give the same values.] Then U_s = max(K_1, K_2).
- By definition, U_s <= n1*n2. Also, n1(n1+1)/2 is the lowest possible value of R_1 and n1*n2 + n1(n1+1)/2 is the largest, so U_s = max(R_1 - n1(n1+1)/2, n1*n2 + n1(n1+1)/2 - R_1).
- Traditionally one takes R for the smaller group (a convenience when working by hand), but one can show U_s comes out the same whichever group you choose.
- Unlike that of R, the distribution of U_s is not symmetric (it is defined as a maximum).
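The general procedure (average ranks for ties, then K_1 and U_s) can be sketched in a few lines. Python for illustration, since the notes' own code is R; the function names here are my own:

```python
# Sketch of the rank-sum procedure: average ranks for ties, then
# R_obs, K1 = R_obs - n1(n1+1)/2, and U_s = max(K1, n1*n2 - K1).
def avg_ranks(values):
    """Rank all values, giving tied observations the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                            # extend over the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j + 2) / 2  # average of 1-based ranks i+1..j+1
        i = j + 1
    return ranks

def rank_sum_U(x, y):
    n1, n2 = len(x), len(y)
    ranks = avg_ranks(list(x) + list(y))
    R_obs = sum(ranks[:n1])                   # ranks attached to sample 1
    K1 = R_obs - n1 * (n1 + 1) / 2
    return R_obs, max(K1, n1 * n2 - K1)       # (R_obs, U_s)

# Toy example from above (ascending ranks; U_s is the same either way):
print(rank_sum_U([4.3, 3.2], [1.9, 0.3, 3.3]))   # (8.0, 5.0)
```

Note that with ascending instead of descending ranks the toy example gives R = 8 rather than 4, but U_s = 5 either way, illustrating the invariance claimed above.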

From the previous example (n1 = 2, n2 = 3, so K_1 = R - 3 and U_s = max(K_1, 6 - K_1)):

  R     3  4  5  6  7  8  9
  U_s   6  5  4  3  4  5  6

  u              3     4     5     6
  P_H0(U_s = u)  2/10  4/10  2/10  2/10

Alternative hypotheses

One-sided alternatives:
- For H_1: F_A > F_B, expect R (and hence U_s) to be larger than under H_0.
- For H_1: F_A < F_B, expect R to be smaller than under H_0, but U_s will still be larger, so we are still in the right tail.

Two-sided alternative: for H_1: F_A != F_B, we need to allow for both the left and right tails of R.

It is possible to show that, under the null hypothesis,

  E_H0 K_1 = n1*n2/2,    Var_H0 K_1 = n1*n2*(n1+n2+1)/12,

  Z = (K_1 - n1*n2/2) / sqrt(n1*n2*(n1+n2+1)/12)  ->  N(0, 1)  as n1, n2 -> infinity.

Example: Hollander & Wolfe (1973), p. 69f. Permeability constants of the human chorioamnion (a placental membrane) at term (x) and between 12 and 26 weeks gestational age (y). The alternative of interest is greater permeability of the human chorioamnion for the term pregnancy.

> term <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
> mid <- c(1.15, 0.88, 0.90, 0.74, 1.21)
> rank(c(term, mid))
 [1]  3  4 14  7 11 10 15 13  1 12  8  5  6  2  9
> sum(rank(c(term, mid))[1:10])
[1] 90
> sum(rank(c(term, mid))[1:10]) - (10*11/2)
[1] 35
> 1 - pwilcox(34, 10, 5)   # Beware: this is a discrete random variable...
[1] 0.1272

Output from the test:

> wilcox.test(term, mid, alternative = "g")   # greater

        Wilcoxon rank sum test

data:  term and mid
W = 35, p-value = 0.1272
alternative hypothesis: true mu is greater than 0

Paired Samples

Paired samples (X_i, Y_i), with differences D_i = Y_i - X_i, i = 1, ..., n.

Sign Test (Chapter 9)

Main Idea: If there is no difference between the pairs (i.e. the true difference is 0), then we are equally likely to observe differences on either side of 0.
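As a check on the normal approximation for K_1, the sketch below (Python for illustration; the permeability data as given in the Hollander & Wolfe example above) computes K_1 and the corresponding Z-score:

```python
import math

# Normal approximation for K1, using the permeability data from the
# Hollander & Wolfe (1973) example above. No ties here, so plain ranks suffice.
term = [0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46]
mid  = [1.15, 0.88, 0.90, 0.74, 1.21]
n1, n2 = len(term), len(mid)

combined = sorted(term + mid)
R = sum(combined.index(v) + 1 for v in term)   # rank sum of the term sample
K1 = R - n1 * (n1 + 1) / 2

z = (K1 - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
print(R, K1, round(z, 3))   # 90 35.0 1.225
```

With samples this small the exact tail (0.1272 from `pwilcox`) is the one to report; the Z-score merely shows the approximation machinery on familiar numbers.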

Equally likely to be on either side means the number of positive differences is Bin(0.5, n).

Wilcoxon signed rank test

Main Idea: If neither condition has an effect, then not only should the differences be equally distributed on either side of 0, but how far the differences are from 0 should also be the same on either side.

1. Let R_i be the rank of |D_i| (the absolute value of the difference).
2. Restore the signs of the D_i to the ranks, giving signed ranks.
3. Calculate either W+ = sum of the ranks with positive signs or W- = sum of the ranks with negative signs.

Idea:
- If the distribution of X (F_X) is the same as that of Y (F_Y), then each D_i is equally likely to be positive as negative. So about half the ranks are positive, and half are negative.
- If F_Y is larger than F_X, then we expect most ranks to have positive signs, and so W+ will be large.

More precisely, H_0: the distribution of the D_i is symmetric about 0. One can show (if there are no ties)

  E_H0 W+ = n(n+1)/4,    Var_H0 W+ = n(n+1)(2n+1)/24,

  Z = (W+ - n(n+1)/4) / sqrt(n(n+1)(2n+1)/24)  approx.  N(0, 1).

Notes:
- W+ + W- = sum_{i=1}^n i = n(n+1)/2, so either one of W+, W- determines the other.
- Differences D_i = 0 are usually omitted.
- If there are ties in the |D_i|, average the corresponding ranks in step 1 (as in the rank-sum test). If there are many ties, the variance needs adjusting; see books on nonparametrics such as Hollander & Wolfe (1973) or Lehmann (1975).
- The null distribution of W+ is tabulated for n <= 20; for larger n, the normal approximation can be used.
- The test can also be used as a one-sample test for a location theta, if you assume the distribution is symmetric around theta: subtract theta from each data value, then perform the same test.

Example: measuring mercury levels in fish (from Rice); see the calculation below of W+ = 95.5 and W- = 204.5. Here n = 24 (one D_i = 0 is discarded). Hence:

  n(n+1)/2 = 300;   E_H0 W+ = n(n+1)/4 = 150;   Var_H0 W+ = n(n+1)(2n+1)/24 = 1225.
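The three steps for W+ and W- can be sketched as follows (Python for illustration; the example differences are hypothetical, not the mercury data):

```python
def signed_rank_sums(diffs):
    """W+ and W- for the Wilcoxon signed rank test: drop zero differences,
    rank |D_i| (average ranks for ties), then sum ranks by the sign of D_i."""
    d = [x for x in diffs if x != 0]                 # zeros are omitted
    absd = sorted((abs(x), i) for i, x in enumerate(d))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(absd):
        j = i
        while j + 1 < len(absd) and absd[j + 1][0] == absd[i][0]:
            j += 1                                   # tied block in |D_i|
        for k in range(i, j + 1):
            ranks[absd[k][1]] = (i + j + 2) / 2      # average of 1-based ranks
        i = j + 1
    w_plus  = sum(r for r, x in zip(ranks, d) if x > 0)
    w_minus = sum(r for r, x in zip(ranks, d) if x < 0)
    return w_plus, w_minus

# Hypothetical differences, including a zero and a tie in |D_i|:
wp, wm = signed_rank_sums([0.07, -0.04, 0.0, 0.04, 0.12])
print(wp, wm)   # 8.5 1.5  (and W+ + W- = n(n+1)/2 = 10 with n = 4)
```

Note how the zero difference is dropped before ranking, and the tied pair |−0.04| = |0.04| shares the average rank 1.5, exactly as in the notes above.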

Two ways to measure mercury levels in fish (ppm of mercury in juvenile black marlin): 25 fish were each measured by the selective reduction and the permanganate methods.

[Data table: columns Sel Red, Permang, Diff., Signed Rank, Pos Ranks, Neg Ranks; 25 rows. One fish has difference exactly 0 (note!) and is discarded; ties in |D_i| receive average ranks.]

Hg <- scan()    # the 50 measurements, both methods stacked
way <- factor(rep(1:2, c(25, 25)))

Based on T_s = min(W-, W+) = W+ = 95.5 here [only need to tabulate the left-hand tail].

Normal approximation, 1-sided P-value:

  P(W+ <= 95.5) = P( (W+ - 150)/35 <= (95.5 - 150)/35 ) = P(Z <= -1.56) = 0.06.

[Compare with the t-test, which gives a 2-sided P = 0.09 - somewhat different, but neither is statistically significant evidence for a difference.]

Compare t-test with Wilcoxon:

Paired t-test:
  mean difference d-bar, SD of the differences s_d, SE_d = s_d / sqrt(25),
  t_s = d-bar / SE_d = -1.74, df = 24

Wilcoxon signed rank sums:
  W+ = 95.5, W- = 204.5, n = 24
  normal approximation: mean = 150, sd of W+ = sqrt(1225) = 35,
  Z-score = (95.5 - 150)/35 = -1.56

> pt(-1.74, 24)
[1] 0.0469
so the 2-sided P is about 0.09.

> t.test(Hg ~ way, paired = T)

        Paired t-test

data:  Hg by way
t = -1.74, df = 24, p-value = 0.0938
alternative hypothesis: true difference in means is not equal to 0

> pnorm(-1.56)
[1] 0.0594
so the 2-sided P for the Wilcoxon normal approximation is about 0.12.

wilcox.test(Hg ~ way, paired = T) reports the same analysis (the Wilcoxon signed rank test with continuity correction, against the two-sided alternative that true mu is not equal to 0).

Sign Test: N- = 10, N+ = 14. So for the 2-tailed test we have

  P(10 or fewer successes) + P(14 or more successes)
    = pbinom(10, prob = .5, size = 24) + 1 - pbinom(13, prob = .5, size = 24)
    = 0.2707 + 0.2707 = 0.54
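The sign-test calculation needs only a binomial CDF. A stdlib-only sketch in Python (the counts 10 and 14 follow the mercury example above, with the single zero difference discarded so n = 24):

```python
from math import comb

# Two-sided sign test via the exact Binomial(n, 1/2) distribution.
def sign_test_two_sided(n_neg, n_pos):
    n = n_neg + n_pos
    k = min(n_neg, n_pos)
    p_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n   # P(X <= k)
    return 2 * p_tail                     # Bin(n, 1/2) is symmetric, so double it

p = sign_test_two_sided(10, 14)           # 10 negative, 14 positive differences
print(round(p, 4))   # 0.5413
```

This matches the `pbinom(10, ...) + 1 - pbinom(13, ...)` computation above, since each tail of Bin(24, 1/2) contributes about 0.27.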