Nonparametric Statistics



Similar documents
Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

SPSS Tests for Versions 9 to 13

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST

Projects Involving Statistics (& SPSS)

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

Research Methods & Experimental Design

Independent t- Test (Comparing Two Means)

Rank-Based Non-Parametric Tests

Nonparametric tests these test hypotheses that are not statements about population parameters (e.g.,

THE KRUSKAL WALLLIS TEST

Analysing Questionnaires using Minitab (for SPSS queries contact -)

Intro to Parametric & Nonparametric Statistics

Descriptive Statistics

1 Nonparametric Statistics

Statistical tests for SPSS

Study Guide for the Final Exam

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Statistics for Sports Medicine

Chapter 12 Nonparametric Tests. Chapter Table of Contents

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

Permutation Tests for Comparing Two Populations

Skewed Data and Non-parametric Methods

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

TABLE OF CONTENTS. About Chi Squares What is a CHI SQUARE? Chi Squares Hypothesis Testing with Chi Squares... 2

NCSS Statistical Software

The Chi-Square Test. STAT E-50 Introduction to Statistics

Research Methodology: Tools

Parametric and non-parametric statistical methods for the life sciences - Session I

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

The Dummy s Guide to Data Analysis Using SPSS

II. DISTRIBUTIONS distribution normal distribution. standard scores

Chapter G08 Nonparametric Statistics

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011

Difference tests (2): nonparametric

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

The Statistics Tutor s Quick Guide to

Basic Statistical and Modeling Procedures Using SAS

Chi-square test Fisher s Exact test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

SPSS 3: COMPARING MEANS

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Types of Data, Descriptive Statistics, and Statistical Tests for Nominal Data. Patrick F. Smith, Pharm.D. University at Buffalo Buffalo, New York

SPSS Explore procedure

UNIVERSITY OF NAIROBI

Is it statistically significant? The chi-square test

Additional sources Compilation of sources:

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

An introduction to IBM SPSS Statistics

Non-parametric Tests Using SPSS

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U

Analyzing Research Data Using Excel

HYPOTHESIS TESTING WITH SPSS:

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Come scegliere un test statistico

MEASURES OF LOCATION AND SPREAD

NAG C Library Chapter Introduction. g08 Nonparametric Statistics

Section 12 Part 2. Chi-square test

How To Test For Significance On A Data Set

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

3. Analysis of Qualitative Data

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

SAS/STAT. 9.2 User s Guide. Introduction to. Nonparametric Analysis. (Book Excerpt) SAS Documentation

Analysis of Questionnaires and Qualitative Data Non-parametric Tests

Testing differences in proportions

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Nonparametric Statistics

Comparing Means in Two Populations

Non-Parametric Tests (I)

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Chapter 2 Probability Topics SPSS T tests

Principles of Hypothesis Testing for Public Health

Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY

SPSS TUTORIAL & EXERCISE BOOK

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Introduction to Statistics Used in Nursing Research

ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Analysis of categorical data: Course quiz instructions for SPSS

Terminating Sequential Delphi Survey Data Collection

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Statistiek I. Proportions aka Sign Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen.

Stats Review Chapters 9-10

Parametric and Nonparametric: Demystifying the Terms

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Testing for differences I exercises with SPSS

CHAPTER 14 ORDINAL MEASURES OF CORRELATION: SPEARMAN'S RHO AND GAMMA

UNDERSTANDING THE TWO-WAY ANOVA

SPSS Guide How-to, Tips, Tricks & Statistical Techniques

Introduction to Statistics with GraphPad Prism (5.01) Version 1.1

T-test & factor analysis

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Introduction to Statistics and Quantitative Research Methods

Transcription:

Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics July 3, 2006

2

Parametric vs. Nonparametric Hypothesis Testing Procedures Parametric Nonparametric z-test t-test One-Way ANOVA Wilcoxon Mann Whitney Test Kruskal-Wallis Test Many More Tests Exist! 3

Learning Objectives: 1. Differentiate nonparametric from parametric statistics 2. Discuss the advantages and disadvantages of nonparametric statistics 3. Enumerate and differentiate the different nonparametric tests 4. Apply some commonly used nonparametric test to hypothesis testing problems 4

Parametric Test Procedures 1. Involve Population Parameters Example: Population Mean t X μ 0 (S n) x 2. Have Stringent Assumptions Example: Normal Distribution 3. Require Interval Scale or Ratio Scale Whole Numbers or Fractions Example: Height in Inches (72, 60.5, 54.7) 4. Examples: z-test, t-test, ANOVA μ 5

Nonparametric Test Procedures 1. Do Not Involve Population Parameters 2. No Stringent Distribution Assumptions Distribution-free 3. Data Measured on Any Scale Ratio or Interval Ordinal Example: Good-Better-Best Nominal Example: Male-Female 4. Example: Wilcoxon-Mann-Whitney Test 6

Advantages of Nonparametric Tests 1. Used With All Scales 2. Easier to Compute Developed Originally Before Wide Computer Use 3. Make Fewer Assumptions 4. Need Not Involve Population Parameters 5. Results May Be as Exact as Parametric Procedures 7

Disadvantages of Nonparametric Tests 1. May Waste Information If Data Permit Using Parametric Procedures Example: Converting Data From Ratio to Ordinal Scale 2. Require a larger sample size than the corresponding parametric test in order to achieve the same power 3. Difficult to Compute by Hand for Large Samples 4. Stat tables are not readily available 8

Summary Table of Statistical Tests Level of Measurement Sample Characteristics Correlation 1 Sample 2 Sample K Sample (i.e., >2) Independent Dependent Independent Dependent Categorical or Nominal Rank or Ordinal Χ 2 Run s test Kolmogorov Smirnov Χ 2 Fisher Exact Median Test Binomial Wilcoxon- Mann- Whitney Mc Nemar s Χ 2 Cochran s Q Sign Test Wilcoxon Signed Ranks Kruskal-Wallis Friendman s ANOVA Kappa Agreement Test Spearman s rho Kendall Rank Parametric (Interval & Ratio) z-test or t-test t- test between groups Paired t-test 1 way ANOVA between groups 1 way ANOVA (within or repeated measure) Pearson s r Factorial ANOVA 9

Application of Commonly Used Nonparametric Statistics 10

Frequency 7 6 5 4 3 2 1 0 8 6 4 2 0 20 40 60 80 x1 30 50 70 90 x2 => nonparametric statistics 11

Wilcoxon-Mann-Whitney Test

Wilcoxon-Mann-Whitney Test Also known as Wilcoxon-test, Wilcoxon rank sum test, U-test, Mann-Whitney-U-test Tests Two Independent Population - compare medians Corresponds to t-test for 2 Independent Means Assumptions Independent, Random Samples Populations Are Continuous Can Use Normal Approximation If n i 20 13

Wilcoxon-Mann-Whitney Test Procedure 1. Assign Ranks, r i, to the n 1 + n 2 Sample Observations If unequal sample sizes, let n 1 refer to smaller-sized sample Smallest Value = 1 Average Ties 2. Sum the Ranks, R i, for Each Sample 3. Test Statistic Null hypothesis: both samples come from the same underlying distribution n ( n 1) U n n R 2 1 1 1 2 1 1 where R is the sum of ranks in sample1 Compute also (n 1 n 2 U). If this expression is larger than the above U, then this becomes the final U statistic. ** At N>=20, U begins to approximate t, so the test stat changes to a t-value. 14

Wilcoxon-Mann-Whitney Test Example You re an agriculturist. You want to determine if there was a difference in the biomass of male and female Juniper trees. Randomly select 6 trees of each gender from the field. Dry them to constant moisture and weigh in kg. Male trees Data: 71, 73, 78, 75, 72, 74 Female trees Data: 80, 73, 83, 84, 82, 79 15

Wilcoxon-Mann-Whitney Test Solution H0: Ha: = n1 = n2 = Critical Value: Test Statistic: Decision: Conclusion: 16

Wilcoxon-Mann-Whitney Test Computation Table Male Trees Female Trees Weight Rank Weight Rank 71 73 78 75 72 74 Rank Sum 80 73 83 84 82 79 17

Wilcoxon-Mann-Whitney Test Solution H0: Median weights = Ha: Medians not = =0.05 n1 = 6 n2 = 6 Critical Value: U table = 31 * Depends on the U table you are using Test Statistic: U = (6)(6) + (6)(7)/2 24.5 = 32.5 n1n2 - U = (6)(6) 32.5 = 3.5 *!!! final U Calc = 32.5 Decision: Reject H 0 at =.05 since U calc > U table Conclusion: There is evidence that the medians are not equal. 18

NPar Tests Mann-Whitney Test Wilcoxon-Mann-Whitney Test Solution (SPSS Output) Ranks Weight in kg Gender of Tree Male Female Total N Mean Rank Sum of Ranks 6 4.08 24.50 6 8.92 53.50 12 Mann-Whitney U Wilcoxon W Z Test Statistics b Asymp. Sig. (2-tailed) Exact Sig. [2*(1-tailed Sig.)] a. Not corrected for ties. Weight in kg 3.500 24.500-2.326.020.015 a b. Grouping Variable: Gender of Tree p < ; Reject Ho 19

Sign Test

Sign Test 1. Tests One Population Median, (eta) 2. Corresponds to t-test for 1 Mean 3. Assumes Population Is Continuous 4. Small Sample Test Statistic: # Sample Values Above (or Below) Median 5. Can Use Normal Approximation If n 10 21

Sign Test Example You re a marketing analyst for Chefs-R-Us. You ve asked 8 people to rate a new ravioli on a 5-point Likert scale 1 = terrible to 5 = excellent The ratings are: 2 4 1 2 1 1 2 1 At the.05 level, is there evidence that the median rating is at least 3? 22

Sign Test Solution H0: Ha: = Test Statistic: P-Value: Decision: Conclusion: 23

30% 20% Sign Test Uses P-Value to Make Decision P(X) Binomial: n = 8 p = 0.5 10% 0%.219.273.219.109.109.004.031.031.004 0 1 2 3 4 5 6 7 8 X P-Value is the probability of getting an observation at Least as extreme as we got. If 7 of 8 Observations Favor H a, Then P-Value P = P(x 7) =.031 +.004 =.035. If =.05, Then Reject H 0 Since P-Value P. 24

Sign Test Solution H0: = 3 Ha: < 3 =.05 Test Statistic: S = 7 (Ratings 1 & 2 are less than = 3: 2 4 1 2 1 1 2 1 ) P-Value: P(x 7) =.031 +.004 =.035 (Binomial Table, n = 8, p = 0.50) n! x n X P( X ) (1 ) X!( n X )! Decision: Reject H 0 at =.05 Conclusion: There is evidence that the median is less than 3 25

Sign Test (R Output) R version 2.3.1 (BSDA and ev10 package must be loaded) > x <- c(2,4,1,2,1,1,2,1) > sign.test(x,md=3, alternative="less") One-sample Sign-Test data: x s = 1, p-value = 0.03516 alternative hypothesis: true median is less than 3 95 percent confidence interval: -Inf 2 sample estimates: median of x 1.5 $Confidence.Intervals Conf.Level L.E.pt U.E.pt Lower Achieved CI 0.8555 -Inf 2 Interpolated CI 0.9500 -Inf 2 Upper Achieved CI 0.9648 -Inf 2 Warning message: multi-argument returns are deprecated in: return(rval, Confidence.Intervals) 26

Mc Nemar Change Test

McNemar Change Test (MCT) Typical examples: Testing the shift in the proportion of abnormal responses from before and after treatment in the same group of patients Comparing two ocular treatments when both are given to each patient, one in each eye 28

Data Layout: MCT Condition 1 No. of Responders No. of Non- Responders No. of Responders Condition 2 No. of Non- Responders TOTAL A B A + B C D C + D TOTAL A + C B + D N= A +B +C +D 29

McNemar Change Test (MCT) The hypothesis of interest is the equality of response proportions, p 1 and p 2, under conditions 1 and 2, respectively. The test statistic is based on the difference in the discordant cell frequencies (B, C). 30

Test Summary: MCT Null hypothesis: H 0 : p 1 = p 2 Alternative hypothesis: H a : p 1 p 2 Test Statistic: χ 2 χ 2 α 1 where 1 is the critical value from the chi-square table with significance level and 1 degree of freedom Decision Rule:Reject H 0 if χ 2 α 2 χ C B C B 2 31

Example: MCT Bilirubin Abnormalities Following Drug Treatment 86 patients were treated with an experimental drug for 3 months. Pre-post study clinical laboratory results showed abnormally high total bilirubin values. Is there evidence of a change in the preto post-treatment rates of abnormalities? 32

Solution: MCT Let p 1 and p 2 represent the proportions of patients with abnormally high bilirubin values ( Y ) before and after treatment, respectively. POST- Treatment N Y TOTAL PRE- N 60 14 74 Treatment Y 6 6 12 TOTAL 66 20 86 Y = T.Bilirubin above upper limit of normal range 33

Test Summary: MCT Null hypothesis: H 0 : p 1 = p 2 Alternative hypothesis: H a : p 1 p 2 Test Statistic: 2 2 2 B C 14 6 64 χ 3.20 B C 14 6 20 Decision: Do not reject H 0 since χ 2 < χ 2 α 0.05 3.841 1 Conclusion: There is no sufficient evidence to conclude at 0.05 level of significance that a shift in abnormality rates occurs with treatment. 34

SAS Computer Output for MCT McNemar's Test Example: Bilirubin Abnormalities Following Drug Treatment TABLE OF PRE BY PST PRE(PRE) PST(PST) Frequency Percent Row Pct Col Pct N Y Total ƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ N 60 14 74 69.77 16.28 86.05 81.08 18.92 90.91 70.00 ƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ Y 6 6 12 6.98 6.98 13.95 50.00 50.00 9.09 30.00 ƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ Total 66 20 86 76.74 23.26 100.00 STATISTICS FOR TABLE OF PRE BY PST McNemar's Test -------------- Statistic = 3.200 DF = 1 Prob = 0.074 35

References Dawson-Saunders, B. and Trapp, R.G. Basic and Clinical Biostatistics. Connecticut: Prentice Hall International, Inc. (1990). McGill, J.J. Chap. 14: Nonparametric Statistics Presentation. (May 2006). Siegel S. and Castellan N.J. Nonparametric Statistics for the Behavioral Sciences (2nd edition). New York: McGraw Hill (1988). Walker G.A. Common Statistical Methods for Clinical Research with SAS Examples, SAS Institute, Inc, Cary NC (1997). 36