The Mann-Whitney U test. Peter Shaw

Introduction: We meet our first inferential test. Do not be put off by the messy-looking formulae; the test is usually run on a PC anyway. The important bit is to understand the philosophy of the test.

Imagine... that you have acquired a set of measurements from 2 different sites. Maybe one is alleged to be polluted and the other clean, and you measure residues in the soil. Maybe these are questionnaire returns from students identified as M or F. You want to know whether these 2 sets of measurements genuinely differ. The issue here is that you need to rule out the possibility that the results are just random noise.

The formal procedure: This involves the creation of two competing explanations for the data recorded.
Idea 1: These are pattern-less random data. Any observed patterns are due to chance. This is the null hypothesis, H0.
Idea 2: There is a defined pattern in the data. This is the alternative hypothesis, H1.
Without the statement of the competing hypotheses, no meaningful test can be run.

Occam's razor: If competing explanations exist, choose the simpler unless there is good reason to reject it. Here, you must assume H0 to be true until you can reject it. In point of fact you can never ABSOLUTELY prove that your observations are non-random. Any pattern could arise in random noise, by chance. Instead you work out how likely it is that a pattern like yours would turn up if H0 were true.

Example: You conduct a questionnaire survey of homes in the Heathrow flight path, and also of a control population of homes in South West London. Responses to the question "How intrusive is plane noise in your daily life?" are tabulated.
Noise complaints (1 = no complaint, 5 = very unhappy)
Homes near airport     Control site
5                      3
4                      2
4                      4
3                      1
5                      2
4                      1
5

Stage 1: Eyeball the data! These data are ordinal and not normally distributed (allowable scores are 1, 2, 3, 4 or 5), so use non-parametric statistics. It does look as though people are less happy under the flight path, but recall that we must state our hypotheses H0 and H1.
H0: There is no difference in attitudes to plane noise between the two areas; any observed differences are due to chance.
H1: Responses to the question differed between the two areas.

Now we assess how likely it is that this pattern could occur by chance: This is done by performing a calculation. Don't worry yet about what the calculation entails. What matters is that the calculation gives an answer (a test statistic) whose likelihood can be looked up in tables. Thus by means of this tool, the test statistic, we can estimate the probability that the observed pattern could occur by chance in random data.

One philosophical hurdle to go: The test statistic generates a probability, a number from 0 to 1, which is, loosely, the probability of H0 being true (strictly, it is the probability of seeing a pattern at least this strong if H0 is true). If p = 0, H0 is certainly false. (Actually this is over-simple, but a good approximation.) If p is large, say p = 0.8, H0 must be accepted. But how about p = 0.1, p = 0.01?

Significance: We have to define a threshold, a boundary, and say that if p is below this threshold H0 is rejected; otherwise H0 is accepted. This boundary is called the significance level. By convention it is set at p = 0.05 (1 in 20), but you can choose any other number, as long as you specify it in the write-up of your analyses. WARNING!! This means that if you analyse 100 sets of random data, the expectation (long-term average) is that 5 will generate a significant test.
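The warning can be made concrete with a little arithmetic. The sketch below assumes that H0 is true for every data set and that the 100 tests are independent of one another; both assumptions, and the variable names, are for illustration only.

```python
# Sketch: how many "significant" results to expect from purely random data.
# Assumes H0 is true for every test and that the tests are independent.
alpha = 0.05      # significance level
n_tests = 100     # number of random data sets analysed

# Long-term average number of tests that come out "significant"
expected_false_positives = alpha * n_tests

# Chance that at least one of the tests comes out "significant"
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive): {p_at_least_one:.3f}")
```

So a handful of significant results among many tests on random data is exactly what the significance level allows for.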

The procedure:
1: Set up H0 and H1.
2: Decide the significance level (p = 0.05).
3: Collect the data (5 3 4 2 4 4 3 1 5 2 4 1 5).
4: Calculate the test statistic (e.g. U = 15.5).
5: Find the probability of H0 being true (e.g. p = 0.03).
6: Is p above the critical level? Yes: accept H0. No: reject H0.

This particular test: The Mann-Whitney U test is a non-parametric test which examines whether 2 columns of data could have come from the same population (i.e. should be the same). It generates a test statistic called U (no idea why it's U). By hand we look U up in tables; PCs give you an exact probability. It requires 2 sets of data. These need not be paired, nor need they be normally distributed, nor need there be equal numbers in each set.

How to do it:
1: Rank all data into ascending order, then re-code the data set, replacing raw data with ranks.
2: Harmonize ranks where the same value occurs more than once: each tied value gets the average of the positions it occupies.
In the table, # is the position in the sorted list and the figure after = is the harmonized rank.
Homes near airport     Control site
5  #13 = 12            3  #5 = 5.5
4  #10 = 8.5           2  #4 = 3.5
4  #9  = 8.5           4  #7 = 8.5
3  #6  = 5.5           1  #2 = 1.5
5  #12 = 12            2  #3 = 3.5
4  #8  = 8.5           1  #1 = 1.5
5  #11 = 12
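The two ranking steps can be sketched in a few lines of Python. This is a minimal illustration, not the method any particular package uses; the function name average_ranks is made up, and the split of the scores into two columns assumes the original table reads row by row.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of equal values that starts at sorted position pos
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 1-based, so this run occupies positions pos+1 .. end+1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


airport = [5, 4, 4, 3, 5, 4, 5]   # homes near airport
control = [3, 2, 4, 1, 2, 1]      # control site

# Rank the pooled data, then split the ranks back into the two columns
ranks = average_ranks(airport + control)
print("Airport ranks:", ranks[:len(airport)])
print("Control ranks:", ranks[len(airport):])
```

The key design point is that both columns are ranked together as one pooled list, and only afterwards separated again.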

Once data are ranked: Add up the ranks for each column; call these Rx and Ry. (Optional but a good check: Rx + Ry = n^2/2 + n/2, where n = Nx + Ny is the total number of observations, or you have an error.)
Calculate
Ux = Nx*Ny + Nx(Nx+1)/2 - Rx
Uy = Nx*Ny + Ny(Ny+1)/2 - Ry
Take the smaller of these 2 values and look it up in tables. If U is LESS than the critical value, reject H0.
NB This test is unusual in one feature: here low values of the test statistic are significant. In most other tests it is high values that are significant.
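A second check follows from the two formulae. The rank sums must add up to n(n+1)/2, which is the same as n^2/2 + n/2, with n = N_x + N_y. The short derivation below is a sketch using the same symbols as the formulae.

```latex
\begin{aligned}
U_x + U_y &= 2N_xN_y + \frac{N_x(N_x+1)}{2} + \frac{N_y(N_y+1)}{2} - (R_x + R_y) \\
R_x + R_y &= \frac{(N_x+N_y)(N_x+N_y+1)}{2}
           = N_xN_y + \frac{N_x(N_x+1)}{2} + \frac{N_y(N_y+1)}{2} \\
\Rightarrow\quad U_x + U_y &= N_xN_y
\end{aligned}
```

So for samples of 7 and 6 the two U values must add up to 42, whatever the data.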

In this case: The ranks for homes near the airport (12, 8.5, 8.5, 5.5, 12, 8.5, 12) sum to Rx = 67, and those for the control site (5.5, 3.5, 8.5, 1.5, 3.5, 1.5) sum to Ry = 24.
Check: Rx + Ry = 91, and 13*13/2 + 13/2 = 91. CHECK.
Ux = 6*7 + 7*8/2 - 67 = 3
Uy = 6*7 + 6*7/2 - 24 = 39
The lowest U value is 3. The critical value of U (7, 6) = 4 at p = 0.01. Calculated U is < tabulated U, so reject H0. At p = 0.01 these two sets of data differ.
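The arithmetic above can be checked with a short script. This is a sketch only: the function name u_statistics is made up for illustration, and the ranks are typed in directly from the worked example.

```python
def u_statistics(rank_sum_x, rank_sum_y, n_x, n_y):
    """Return (Ux, Uy) from the two rank sums and the two sample sizes."""
    u_x = n_x * n_y + n_x * (n_x + 1) / 2 - rank_sum_x
    u_y = n_x * n_y + n_y * (n_y + 1) / 2 - rank_sum_y
    return u_x, u_y


ranks_x = [12, 8.5, 8.5, 5.5, 12, 8.5, 12]   # homes near airport
ranks_y = [5.5, 3.5, 8.5, 1.5, 3.5, 1.5]     # control site

n_x, n_y = len(ranks_x), len(ranks_y)
n = n_x + n_y
r_x, r_y = sum(ranks_x), sum(ranks_y)

# Check 1: the rank sums must add up to n(n+1)/2
assert r_x + r_y == n * (n + 1) / 2

u_x, u_y = u_statistics(r_x, r_y, n_x, n_y)

# Check 2: the two U values must add up to Nx*Ny
assert u_x + u_y == n_x * n_y

u = min(u_x, u_y)   # the value that is looked up in tables
print(f"Rx = {r_x}, Ry = {r_y}, Ux = {u_x}, Uy = {u_y}, U = {u}")
```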

Tails: Generally use 2-tailed tests.
2-tailed test: These populations DIFFER, so a result in either the lower tail or the upper tail of the distribution counts.
1-tailed test: Population X is greater than Y (or less than Y), so only one tail of the distribution counts.
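The difference between the tails can be shown on the airport data without tables. The sketch below uses a permutation approach rather than the table lookup: it assumes that under H0 every way of dealing the 13 ranks into groups of 7 and 6 is equally likely, and counts how many deals are at least as extreme as the one observed. The variable names are made up for illustration.

```python
from itertools import combinations

ranks_x = [12, 8.5, 8.5, 5.5, 12, 8.5, 12]   # homes near airport
ranks_y = [5.5, 3.5, 8.5, 1.5, 3.5, 1.5]     # control site
all_ranks = ranks_x + ranks_y
n_y = len(ranks_y)

observed = sum(ranks_y)                       # control rank sum (low in this example)
expected = n_y * (len(all_ranks) + 1) / 2     # its average value under H0

lower = upper = total = 0
for idx in combinations(range(len(all_ranks)), n_y):
    s = sum(all_ranks[i] for i in idx)
    total += 1
    if s <= observed:
        lower += 1    # lower tail: as low as observed, or lower
    if s - expected >= expected - observed:
        upper += 1    # upper tail: at least as far above the average

p_one_tailed = lower / total             # H1: control scores are LOWER
p_two_tailed = (lower + upper) / total   # H1: the two areas DIFFER

print(f"{lower} lower-tail and {upper} upper-tail deals out of {total}")
print(f"1-tailed p = {p_one_tailed:.4f}, 2-tailed p = {p_two_tailed:.4f}")
```

On these data both probabilities come out below 0.01, in line with the table lookup. The 2-tailed value is the larger of the two because it counts extreme deals in both directions.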

Kruskal-Wallis: The U test's big cousin. When we have 2 groups to compare (M/F, site 1/site 2, etc.) the U test is correct, applicable and safe. How to handle cases with 3 or more groups? The simple answer is to run the Kruskal-Wallis test. This is run on a PC, but behaves very much like the M-W U. It will give one significance value, which simply tells you whether at least one group differs from at least one other.
Two groups (males, females): do males differ from females? Use the U test.
Three groups (site 1, site 2, site 3): do results differ between these sites? Use Kruskal-Wallis.

Your coursework: I will give each of you a sheet with data collected from 3 sites. (Don't try copying: each one is different and I know who gets which dataset!) I want you to show me your data processing skills as follows:
1: Produce a boxplot of these data, showing how values differ between the categories.
2: Run 3 separate Mann-Whitney U tests on them, comparing 1-2, 1-3 and 2-3. Only call the result significant if the p value is < 0.01.
3: Run a Kruskal-Wallis ANOVA on the three groups combined, and comment on your results.