# Permutation Tests for Comparing Two Populations

Size: px
Start display at page:

Transcription

1 Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of the test statistic and minimal assumptions. The Wilcoxon sum rank test is more powerful than a t test statistic for moderate and large sample sizes for heavier tailed distributions. Using a Resampling Stats, this test is easy to implement and a significance level is exact when calculating all possible permutations. The approximate significance level can be used when the numbers of permutations are very large. Introduction Suppose a researcher wants to know whether a new experimental drug relieves symptoms attributable to the common cold. The response variable may be the time until the cold symptoms go away. If we let μ 1 be the mean time until cold symptoms go away for individuals who take the drug and μ 2 be the mean time until symptoms go away for individuals who take placebo, then the hypothesis could be H 0 : μ 1 = μ 2 and the alternatives could be H a : μ 1 < μ 2 or H a : μ 1 > μ 2 or H a : μ 1 μ 2. The first alternative means that a drug is effective since the mean time until cold symptoms go away is less than for individuals who take the drug than for those who do not take the drug. In parametric setting, there are several assumptions for this test to be valid. First, the two samples come from populations with normal density. Second, the samples must be independent. If both population variances are known then the test statistics is given by ( x1 x2 ) ( μ1 μ 2 ) Z =. 2 2 σ 1 σ 2 + n1 n2 When population variances are unknown but the sample sizes for each population are greater than 30 then one usually uses Welch's approximate t which is a t test statistics that can be calculated similar to Z test statistics above by replacing S 2 1 and S 2 2 for σ 2 1 and σ 2 2. Another assumption is when both variances are unknown but they are equal. Since the variances are equal and we wish to test both population means are equal, then the natural way to estimate the variance is to combine the sample, called pooled variance. The pooled variance is computed by finding a weighted average of the samples variances, i.e, S 2 p=(n+m-2) -1 ((n-1)s 2 1 +(m-1)s 2 2 ). And the test statistic is just replaced Z with t and σ 2 1 and σ 2 2 with S 2 p. In practice we do not know whether the variances are actually equal, thus usually we have to check this by testing equality of the variances using F test statistic. We can imagine that there is no guarantee that all assumptions above are satisfied. If these assumptions are not Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 19

2 valid then we can still test similar hypothesis with nonparametric test. In the literature, there are many different methods for testing equality of two populations. In this paper, we will use a permutation method. Hypothesis Testing for Equal Treatment Effect In a nonparametric statistic, there is no parameter to be tested. Let F 1 (x) be the cumulative distribution function (cdf) of population 1 and F 2 (x) be the cdf of population 2. Then the null hypothesis is H 0 : F 1 (x) = F 2 (x). In this case the two distributions are identical under null hypothesis. Here it does not say the means are equal but the variances are different. When the two treatments are the same under the null hypothesis, meaning that the distribution of the observations is the same. The alternative hypothesis is given by H a : F 1 (x) F 2 (x) with at least one x for strict inequality. The statement above is the same as H a : μ 1 > μ 2 in parametric case, since observations for treatment 1 tend to be larger than observations for treatment 2. We can also have the alternative hypothesis H a : F 1 (x) F 2 (x). For two sided alternative the alternative hypothesis is H a : F 1 (x) F 2 (x) or, F 1 (x) F 2 (x) with strict inequality for at least one x. Permutation Test Permutation tests also known as randomization tests. It is widely used in nonparametric statistics where a parametric form of the underlying distribution is not specified. Consider sample of m observations from treatment 1 and n observations from treatment 2. Assume that under the null hypothesis there is no difference between the effect of treatment 1 and treatment 2. Then any permutation of the observations between the two treatments has the same chance to occur as any other permutation. The steps for a two-treatment permutation test: Compute the difference between the mean of observed data, called it D obs. Create a vector of m+n observations. Select at random experimental units to one of the two treatments with m units assigned to treatment 1 and n units assigned to treatment 2. Permute the m+n observations between the two treatments so that there are m observations for treatment 1 and n observations for treatment 2. m + n ( m + n)! The number of possibilities are =. n m! n! For small sample sizes, obtain all possible permutations of the observations; for large sample sizes, obtain a random sample of predetermined, R, permutations. For each permutation of the data, calculate the difference between the mean of treatment 1 and mean of treatment 2, called it, D. Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 20

3 For the upper tailed test, compute p-value as the proportion of D no. of D' s Dobs greater than or equal D obs i.e., P value = m + n n If p-value less than or equal to the predetermined level of significance α then we reject H 0. This permutation test is very flexible. One can choose a test statistic suited to the task at hand. Instead of using the difference between means of treatment 1 and mean of treatment 2 as a test statistic, one can also use the sum of either treatment 1 or 2. If there is an outlier in the data set one can use difference between median as a statistic instead of the difference between mean. One may also use trimmed mean as a statistic by deleting equal numbers of the smallest observations and the largest observations. (Higgins, 2004). How to decide which one of the mean, median, or trimmed mean to employ? It depends on the knowledge about the population from which data come from. In practice, one can use the difference between means when the data is approximately normal density; use the difference between medians, if the distributions of observations are asymmetric; and if the distribution is symmetric with some unusual large and small observations, then one can utilizes a trimmed mean as a statistic. These nonparametric methods do not require analytical derivation of test statistic under the null hypothesis. Again there is a relaxation in choosing the test statistic. With this relaxation, this permutation test has advantageous over a parametric test. Permutation tests can be applied to continuous, ordinal, or categorical data, to values of normal or non-normal density. Whenever a parametric test works, a permutation test also works. These permutation methods have wide range of applications. Permutation methods can be applied whenever parametric statistical methods fail (Good, 1994). Now instead of permuting the original observations, one can permute the ranks of the observations. Let X 1,X 2,...,X N (N=n+m) be the combined observations. The rank of X i among the N observations, R(X i ), is R(X i ) = number of X j 's X i. If no two observations have the same value then let 1 be the rank of the smallest observation, 2 be the rank of the next smallest observation, and so on. For example, let the observations to be: ; and their ranks are: In case of tie observations, one can adjust the rank. For example the observations 3,3,4,5,5,5,5. Their ranks are 1.5, 1.5, 3, 5.5, 5.5, 5.5, 5.5. The first two ranks are computed by averaging rank of 1 and 2. If one uses the sum of ranks for one treatment the test statistic is called the Wilcoxon rank-sum test, W. To calculate a permutation test, combine the m+n ranks. Permute the ranks among the two treatments in which m ranks are assigned to treatment 1 and n ranks are assigned to treatment 2. For a small sample sizes m and n, obtain all possible permutations of the ranks, for large sample size obtain a random sample of R permutations. For each permutation of the ranks, compute the sum of the ranks, W, for one treatment. The p-value is a fraction of the sum of the ranks for one treatment, W, greater than or equal to sum of the observed ranks, W obs. Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 21

4 Properties of the Tests The only assumption we have is the distribution of treatment 1 under the null hypothesis is the same as the distribution of treatment 2. In practice the fewer assumptions and limitations, the broader the applications. The question is how powerful is this test? Is this test more powerful than the parametric counterpart? How robust is the test? That is if there is a violation of the underlying distribution how sensitive is it? What are the effects of outliers or extreme values, especially for a small sample sizes (Good, 1994). In selecting a statistical method, a statistician will pay very close attention to the significance level and the power of the test. The significance level of a test, denoted by α, is the probability of making type I error, that is, the probability of deciding erroneously on the alternative hypothesis when, in fact, the null hypothesis is true. Probability of type II error is the probability of failing to reject the null hypothesis when, in fact, the null hypothesis is false. The power of a test denoted by β, is the complement of probability of type II error, that is, the probability of deciding the alternative hypothesis when the alternative hypothesis is true. We would like the significance level to be very small or close to zero, and the power of the test to be very big or close to one. In practice we fix significance level, for example, and then maximize the power of the test. At the same significance level α, a test is said to be a most powerful test if a test, at specified significance level, is more powerful against a specific alternative than all other tests. A test is said to be a uniformly most powerful test, if a test at specified significance level, is more powerful against all alternatives than all other tests. In practice, it is unusual to know the distribution of the variable(s) or its variance. A test is said to be exact, if probability of making type I error is exactly a significance level α. For a test to be exact, a sufficient condition for the combined observations can be permuted is exchangeability. The observations x 1,x 2,...,x N are exchangeable if, for example, probability of any particular joint outcome, say x 1 +x 5 +x 7 =x, is the same regardless of the order of the observations considered. Therefore independent identically distributed, sampling with replacement, and a dependent normal with constant variance and constant covariance are exchangeable. Example: Consider the illustration mentioned in section 1. Let the alternative hypothesis be the mean time until cold symptoms go away is less than for individuals who take the drug than for those who do not take the drug. The observations for two treatments are as follows: Table 1 Hours until the symptoms go away Drug Dale (36) Kenneth (60) Mathar (39 Placebo Butar (37) Honjo (55) Park (70) The combined samples are If there is no difference between the two treatments one can expect that the total observations for the treatment 1 Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 22

5 is the same as for the treatment 2. If the total observations for the treatment 1 is smaller than for the treatment 2, on the average, then the alternative hypothesis is true since less time for the treatment 1 to recover from the cold. Table 2 Permutation distribution of Treatment 1 and Treatment 2 Number Drug (Rank) (1 2 3) 2* (1 2 4) (1 2 5) (1 2 6) (1 3 4) (1 3 5) (1 3 6) (1 4 5) (1 4 6) (1 5 6) (2 3 4) (2 3 5) (2 3 6) (2 4 5) ( (2 5 6) (3 4 5) (3 4 6) (3 5 6) (4 5 6) Placebo (Rank) (4 5 6) (3 5 6) (3 4 6) (3 4 5) (2 5 6) (2 5 6) (2 4 5) (2 3 6) (2 3 5) (2 3 4) (1 5 6) (1 4 6) (1 4 5) (1 3 6) (1 3 5) (1 3 4) (1 2 6) (1 2 5) (1 2 4) (1 2 3) Difference Bet. means Sum of drug (Sum of Ranks) Diff bet. medians (6) (7) (8) (9) (8) (9) (10) (10) (11) (12) (9) (10) (11) (11) ( ( (12) ( (14) (15) Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 23

6 In table 2, there are 20 possible two-sample set from the list of permutations distribution. The second data set with asterisk from table 2 above is the observed data from the example. It has a difference of means of -21. The p-value for the lower-tail is the probability of observing a difference of means of -21 or less under the assumption that the treatments do not differ. In this case there are only two numbers less than or equal to -21 out of 20 difference means ( that is -21 and ), therefore its p-value is 2/20 = If one uses the sum of the treatment 1 (sum of rank of treatment 1) as a statistic, the p-value is the proportion of number less than or equal to 135 (7). In this case is 2/20=0.10. This p-value is exact since we calculate it from a permutation distribution, hence it is the exact significance level, not an approximation. Testing for Deviances The test explained above is design to distinguish between the effects of two treatments, whether observations from one treatment tend to be larger than observations from another treatment or vice versa. In this section we are interested in the variability of the observations for the two treatments. This variability is very important in quality control. Even though the data have the correct mean or median, it is possible to have an excessive variability within the data. The engineer has to fix this to reduce the variability and the data should be closed to the center. Suppose we are testing that there is no variability between treatment 1 and treatment 2, or H 0 : σ 1 = σ 2. If we assume that both samples come from normal distribution then the usual test is F=S 2 1 /S 2 2, which follows an F distribution with m-1 and n-1 degrees of freedom. This test is not valid if the underlying distribution is not normal. Higgins (2004) suggested to find the deviances of each treatment i.e., dev i1 = X i μ 1 and dev j2 = X j - μ 2. The test m dev / i= 1 i1 m statistic is the ratio of mean deviance (RMD) as RMD =. n dev / n If the means are unknown, then replace the means with the medians of the samples. The steps for the permutation test on deviances are similar to the permutation test for two treatment effect. First, compute the test statistic for the original observations, then permute (rearrange) the m+n observations between treatments so that m observations for treatment 1 and n observations for treatment 2. Find the medians for each treatment and estimated deviances then compute the statistic from the sample just permuted. Compare this value to the value obtained for the observed data. If the computed value from the permutation is greater than or equal to the statistic from the original observations, then count 1. Repeat permutation and computation for a number of times. If the sample sizes are small obtain all possible permutations and if sample sizes are large obtain permutations by randomly selecting from all possible permutations. Reject the null hypothesis if the proportion of 1 from the above is less than or equal to predetermined significance level. Higgins (2004) calculated medians only once from the original observations. He then permuted Journal of Mathematical Sciences & Mathematics Education Vol. 3 No j= 1 j2

7 deviances before calculating statistic, RMD. Our's is different than the Higgins'. Before calculating any statistic from a permutation, we permuted the observations, and then find medians from the permuted data in order to finally calculate deviances and then use a formula RMD above. Example: This example is from McClave et. al. (1997) textbook page 390. Tests of product can be completely automated or they can be conducted using human inspectors or human inspectors aided by mechanical devices. Although human inspection is frequently the most economical alternative, it can lead to serious inspection error problems. To evaluate the performance of inspectors in a new company, a quality manager had sample 12 novice inspectors evaluate 200 finished products. The same 200 items were evaluated by 12 experienced inspectors. The following table lists the number of inspection errors made by each inspector. The manager believe that the variability of inspection errors was lower for experienced inspectors than for novice inspectors. Table 3: Novice Inspectors Experienced Inspectors If we calculate all possible permutations then we have 24 = 2,704, arrangements. Keller-McNulty and Higgins (1987) concluded that there is enough to resample R=1,600, since nothing to be gained if taking sample more than We use R=1500 in our permutation in order to get an approximation of p-value. We use a software Resampling Stats (2000). Resampling Stats is a simple, but powerful software language that can solve complex problems both in probability and statistics. Since means of the two treatments are unknown, we use the sample medians which are 32, and 19.5, respectively. Calculate the absolute value of the observed minus the median for each treatment. The ratio of median deviance for the original observations R Mˆ D is The 1 st, 2.5 th, 5 th, 10 th, and 20 th percentiles of the permutation distribution based on 1,000 randomly selected permutations are found to be.5,.5471,.5913,.6789 and Thus the statistic is significant at the 10% level but not at the 5% level. And the approximate p-value is So there is no variability of inspection errors for experienced inspectors and novice inspectors. The histogram of R Mˆ D shown below is the distribution of the ratio of deviation of errors by experienced inspectors and novice inspectors. Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 25

8 Power Functions of the Tests In this section we will compare the power of test for Wilcoxon's rank sum test with t-distribution using the permutation method under various nonnormal distributions. We will consider uniform, exponential, log-normal, Poisson, Pareto, and Weibul distributions. We will use small (m=5, n=5), moderate (m=n=20) and large sample sizes (m=n=30). The simulation is conducted as follows: 1) Two independent samples of size m and n are randomly selected from probability distribution above. 2) We add a constant to each observation for treatment 1, for example, so that μ 1 > μ 2. 3) Calculate the observed value of t, and the Wilcoxon statistic. 4) Use 1000 resamples of the data to determine a p-value of the permutation test. 5) If a p-value less than 0.05 then reject null hypothesis using a test at the 5 % level. 6) Repeat steps 1-5 for a number of times, say ) The power is the proportion of times in the 2000 that are rejected. Again, complete steps 1-7 by increasing the value of a constant in step 2 until you get a wide range of power functions. Figure 1-5 are the graphs of the power functions. It is been known when the distributions of the data are normal with unknown, but equal variances, under a t-test for difference between two samples, t is unbiased test. That is, t has the greatest power and correct of probability of a Type I error. If the underlying distributions are not normal then the power of t is not optimal. For our simulation, we compare the Wilcoxon's sum rank test with the t-test. Results for small samples (see figure 1-5): t-test is uniformly better than the Wilcoxon's rank sum test for uniform, exponential, and Weibull distributions. Even though the t-test has advantage over the Wilcoxon's, the difference between a power is not significant. There is no clear choice between t-test and Wilcoxon's for Lognormal, and Pareto. For Poisson, most of the time the power of the Wilcoxon's is above of t-test except in two points. For moderate sample sizes the results are as follows: The uniform is the only distribution in which t-test is more powerful than the Wilcoxon's, again the difference is not very significant. There is no difference between t-test and the Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 26

9 Wilcoxon for Weibull distribution. The Wilcoxon's rank sum test is better than t- test for exponential, lognormal, Pareto, and Poisson. The Wilcoxon's test replaces the original observations by ranks. However, the observations that are unusually large compared to the rest of the data can affect the t-test. Exponential, lognormal, and Pareto are heavy-tailed distribution that can have unusually extreme observations. Whenever the Wilcoxon rank-sum test is uniformly better than t-test, the power of the test is very significant. For the Pareto, for example, let the difference between the means is.50, then the power of Pareto under the Wilcoxon is 86.95% while the power of t-test is 40.70%. Results for large samples: Some practitioners mistakenly assumed under central limit theorem a power of t-test is uniformly better than the Wilcoxon. However, in our simulation only uniform distribution has the edge for t-test, with no difference for a Weibull. The Wilcoxon rank sum statistic has bigger power for Pareto, lognormal, exponential, and Poisson distributions than that of t-statistic. Based on our simulation we conclude that for light-tailed distribution such as uniform distribution, the t-test is better than the Wilcoxon's for small, moderate, and large samples. For small sample size, symmetric and light-tailed, t-test is better most of the times. The Wilcoxon sum rank test is more powerful than t- test for moderate and large sample sizes most of the time, when the distributions of the data come from heavy-tailed distributions. Figure I One-tailed Power Functions of the Two Independent Mean Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 27

10 Figure II One-tailed Power Functions of the Two Independent Mean Figure III One-tailed Power Functions of the Two Independent Mean Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 28

11 Figure IV One-tailed Power Functions of the Two Independent Mean Figure V One-tailed Power Functions of the Two Independent Mean Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 29

12 Ferry Butar Butar, Ph.D., Sam Houston State University, Huntsville, Texas Jae-Wan Park, Sam Houston State University, Huntsville, Texas References Blair, R.C., and Higgins, J.J. (1980). A comparison of the power of Wilcoxon's rank-sum statistic to that of student's t statistic under various nonnormal distributions, Journal of Educational Statistics, 5, Conover, W. J. (1999). Practical Nonparametric Statistics, third ed., John Wiley & Sons, New York. Good, P. (1994). Permutation Tests: A practical guide to resampling methods for testing hypothesis, Springer, New York. Hicks, C.R., Turner, K.V. (1999). Fundamental Concepts in the Design of Experiments, Fifth ed., Oxford University Press, New York. Higgins, J.J. (2004). Introduction to Modern Nonparametric Statistics, Duxbury. Resampling Stats User's Guide. (2000). Resampling Stats, Inc.", Vol 5.0.2, Arlington, VA. Siegel, S. and Castellan, N.J. (1998). Nonparametric Statistics for the Behavioral Sciences", second ed. McGraw Hill, Boston. Journal of Mathematical Sciences & Mathematics Education Vol. 3 No. 2 30

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### How To Check For Differences In The One Way Anova

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

### Nonparametric Statistics

Nonparametric Statistics References Some good references for the topics in this course are 1. Higgins, James (2004), Introduction to Nonparametric Statistics 2. Hollander and Wolfe, (1999), Nonparametric

### Stat 5102 Notes: Nonparametric Tests and. confidence interval

Stat 510 Notes: Nonparametric Tests and Confidence Intervals Charles J. Geyer April 13, 003 This handout gives a brief introduction to nonparametrics, which is what you do when you don t believe the assumptions

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

### Non-Inferiority Tests for Two Means using Differences

Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### The Wilcoxon Rank-Sum Test

1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We

### Parametric and Nonparametric: Demystifying the Terms

Parametric and Nonparametric: Demystifying the Terms By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD

### Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

### Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

### Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

### HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Permutation & Non-Parametric Tests

Permutation & Non-Parametric Tests Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate

### Rank-Based Non-Parametric Tests

Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs

### Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### NCSS Statistical Software. One-Sample T-Test

Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

### Nonparametric Statistics

Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

### 4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

### Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

### Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

### Non-Parametric Tests (I)

Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

### Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

### Lecture Notes Module 1

Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

### Chapter G08 Nonparametric Statistics

G08 Nonparametric Statistics Chapter G08 Nonparametric Statistics Contents 1 Scope of the Chapter 2 2 Background to the Problems 2 2.1 Parametric and Nonparametric Hypothesis Testing......................

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### Variables Control Charts

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. Variables

### 3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

### Nonparametric tests these test hypotheses that are not statements about population parameters (e.g.,

CHAPTER 13 Nonparametric and Distribution-Free Statistics Nonparametric tests these test hypotheses that are not statements about population parameters (e.g., 2 tests for goodness of fit and independence).

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### How To Compare Birds To Other Birds

STT 430/630/ES 760 Lecture Notes: Chapter 7: Two-Sample Inference 1 February 27, 2009 Chapter 7: Two Sample Inference Chapter 6 introduced hypothesis testing in the one-sample setting: one sample is obtained

### individualdifferences

1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

### Error Type, Power, Assumptions. Parametric Tests. Parametric vs. Nonparametric Tests

Error Type, Power, Assumptions Parametric vs. Nonparametric tests Type-I & -II Error Power Revisited Meeting the Normality Assumption - Outliers, Winsorizing, Trimming - Data Transformation 1 Parametric

### How To Test For Significance On A Data Set

Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

### QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

### Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Section 7.1 Introduction to Hypothesis Testing Schrodinger s cat quantum mechanics thought experiment (1935) Statistical Hypotheses A statistical hypothesis is a claim about a population. Null hypothesis

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

### Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on

### NAG C Library Chapter Introduction. g08 Nonparametric Statistics

g08 Nonparametric Statistics Introduction g08 NAG C Library Chapter Introduction g08 Nonparametric Statistics Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric

### Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions

Chapter 208 Introduction This procedure provides several reports for making inference about the difference between two population means based on a paired sample. These reports include confidence intervals

### Quantitative Methods for Finance

Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

### CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Inference for two Population Means

Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### MEASURES OF LOCATION AND SPREAD

Paper TU04 An Overview of Non-parametric Tests in SAS : When, Why, and How Paul A. Pappas and Venita DePuy Durham, North Carolina, USA ABSTRACT Most commonly used statistical procedures are based on the

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Tests for Two Proportions

Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics

### START Selected Topics in Assurance

START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson

### Two-sample inference: Continuous data

Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

### Difference tests (2): nonparametric

NST 1B Experimental Psychology Statistics practical 3 Difference tests (): nonparametric Rudolf Cardinal & Mike Aitken 10 / 11 February 005; Department of Experimental Psychology University of Cambridge

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Impact of Skewness on Statistical Power

Modern Applied Science; Vol. 7, No. 8; 013 ISSN 1913-1844 E-ISSN 1913-185 Published by Canadian Center of Science and Education Impact of Skewness on Statistical Power Ötüken Senger 1 1 Kafkas University,

### Terminating Sequential Delphi Survey Data Collection

A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to

### t-test Statistics Overview of Statistical Tests Assumptions

t-test Statistics Overview of Statistical Tests Assumption: Testing for Normality The Student s t-distribution Inference about one mean (one sample t-test) Inference about two means (two sample t-test)

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### 6.4 Normal Distribution

Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

### T test as a parametric statistic

KJA Statistical Round pissn 2005-619 eissn 2005-7563 T test as a parametric statistic Korean Journal of Anesthesiology Department of Anesthesia and Pain Medicine, Pusan National University School of Medicine,

### HYPOTHESIS TESTING WITH SPSS:

HYPOTHESIS TESTING WITH SPSS: A NON-STATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER

### Standard Deviation Estimator

CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

### Principles of Hypothesis Testing for Public Health

Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

### Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

### MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST

MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST Zahayu Md Yusof, Nurul Hanis Harun, Sharipah Sooad Syed Yahaya & Suhaida Abdullah School of Quantitative Sciences College of Arts and

### Statistics 2014 Scoring Guidelines

AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

### Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

### 1 Nonparametric Statistics

1 Nonparametric Statistics When finding confidence intervals or conducting tests so far, we always described the population with a model, which includes a set of parameters. Then we could make decisions

### Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Statistics for Sports Medicine

Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

### Hypothesis testing - Steps

Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

### PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION Chin-Diew Lai, Department of Statistics, Massey University, New Zealand John C W Rayner, School of Mathematics and Applied Statistics,

### Tests for One Proportion

Chapter 100 Tests for One Proportion Introduction The One-Sample Proportion Test is used to assess whether a population proportion (P1) is significantly different from a hypothesized value (P0). This is

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

### Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

### Nonparametric statistics and model selection

Chapter 5 Nonparametric statistics and model selection In Chapter, we learned about the t-test and its variations. These were designed to compare sample means, and relied heavily on assumptions of normality.

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### Hypothesis Testing --- One Mean

Hypothesis Testing --- One Mean A hypothesis is simply a statement that something is true. Typically, there are two hypotheses in a hypothesis test: the null, and the alternative. Null Hypothesis The hypothesis