Parametric and non-parametric statistical methods for the life sciences - Session I




Liesbeth Bruckers, Geert Molenberghs
Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-Biostat), Universiteit Hasselt
June 7, 2011

Table of contents

1. Why nonparametric methods
   - Introductory example
   - Nonparametric test of hypotheses
2. What test to use?
   - Two independent samples
   - More than two independent samples
   - Two dependent samples
   - More than two dependent samples
   - Ordered hypotheses
3. Rank Tests
   - Wilcoxon Rank Sum Test
   - Kruskal-Wallis Test
   - Friedman Statistic
   - Sign Test
   - Jonckheere-Terpstra Test

Why nonparametric methods?

Introductory Example

The paper "Hypertension in Terminal Renal Failure, Observations Pre and Post Bilateral Nephrectomy" (J. Chronic Diseases (1973): 471-501) gave blood pressure readings for five terminal renal patients before and 2 months after surgery (removal of kidney).

Patient          1    2    3    4    5
Before surgery  107  102   95  106  112
After surgery    87   97  101  113   80

Question: Does the mean blood pressure before surgery exceed the mean blood pressure two months after surgery?

Classical Approach: Paired t-test

Patient          1    2    3    4    5
Before surgery  107  102   95  106  112
After surgery    87   97  101  113   80
Difference D_i   20    5   -6   -7   32

Hypotheses: H_0: µ_d = 0 versus H_1: µ_d > 0, where µ_d is the mean difference in blood pressure.

Test statistic:

    t = \bar{D} / \sqrt{ \sum_i (D_i - \bar{D})^2 / (n(n-1)) }

which follows a t distribution with n-1 d.f. under H_0.
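As a quick check on the arithmetic, the paired t statistic for these five differences can be computed directly. This is a minimal sketch in Python; the variable names are illustrative, not from the slides.

```python
from math import sqrt

# blood pressure differences (before - after) for the five patients
d = [20, 5, -6, -7, 32]
n = len(d)

d_bar = sum(d) / n                     # mean difference: 8.8
ss = sum((x - d_bar) ** 2 for x in d)  # sum of squared deviations

# t = D-bar / sqrt( sum (D_i - D-bar)^2 / (n(n-1)) ), as on the slide
t = d_bar / sqrt(ss / (n * (n - 1)))
print(round(t, 3))  # → 1.162
```

With n-1 = 4 d.f. this t is well below the one-sided 5% critical value of about 2.132, so the paired t-test would not reject H_0 here.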

Assumptions

- The statistic follows a t-distribution if the differences are normally distributed: the t-test is a parametric method.
- Observations are independent: selection of one patient does not influence the chance of any other patient being included.
- (Two-sample t-test) The populations must have the same variance.
- Variables must be measured on an interval scale for the results to be interpretable.

These assumptions are often not tested, but simply accepted.

Normal probability plot

Normality is questionable!

Nonparametric Test of Hypotheses

Nonparametric tests follow the same general procedure as parametric tests:

- State the null and alternative hypotheses.
- Calculate the value of the appropriate test statistic (the choice is based on the design of the study).
- Decision rule: reject or do not reject depending on the magnitude of the statistic: P_{H_0}(T >= c) = ?
  - Exact distribution
  - Approximation to the exact distribution

When to use what test

What test to use?

The choice of the appropriate test statistic depends on the design of the study:
- How many groups?
- Independent or dependent samples?
- Ordered alternative hypothesis?

Two Independent Samples

Permeability constants of the human chorioamnion (a placental membrane) for at-term (X) and 12-26 weeks gestational age (Y) pregnancies are given in the table below. Investigate the alternative of interest: that the permeability of the human chorioamnion for an at-term pregnancy is greater than for a 12-26 weeks gestational age pregnancy.

X (at term)    0.83  1.89  1.04  1.45  1.38  1.91  1.64  1.46
Y (12-26 wks)  1.15  0.88  0.90  0.74  1.21

Statistical methods: t-test; Wilcoxon Rank Sum Test

More Than Two Independent Samples

Protoporphyrin levels were determined for three groups of people: a control group of normal workers, a group of alcoholics with sideroblasts in their bone marrow, and a group of alcoholics without sideroblasts. The data are shown below. Do the data suggest that normal workers and alcoholics with and without sideroblasts differ with respect to protoporphyrin level?

Group                            Protoporphyrin level (mg)
Normal                           22   27   47   30   38   78   28   58   72   56
Alcoholics with sideroblasts     78  172  286   82  453  513  174  915   84  153
Alcoholics without sideroblasts  37   28   38   45   47   29   34   20   68   12

Statistical methods: ANOVA; Kruskal-Wallis Test

Two Dependent Samples

Twelve adult males were put on a liquid diet in a weight-reducing plan. Weights were recorded before and after the diet. The data are shown in the table below.

Subject   1    2    3    4    5    6    7    8    9   10   11   12
Before   186  171  177  168  191  172  177  191  170  171  188  187
After    188  177  176  169  196  172  165  190  165  180  181  172

Statistical methods: Paired t-test; Sign test; Signed-rank test

Randomized Block Design

Effect of hypnosis: emotions of fear, happiness, depression and calmness were requested (in random order) from 8 subjects during hypnosis. Response: skin potential (in millivolts).

Subject      1     2     3     4     5     6     7     8
Fear        23.1  57.6  10.5  23.6  11.9  54.6  21.0  20.3
Happiness   22.7  53.2   9.7  19.6  13.8  47.1  13.6  23.6
Depression  22.5  53.7  10.8  21.1  13.7  39.2  13.7  16.3
Calmness    22.6  53.1   8.3  21.6  13.3  37.0  14.8  14.8

Statistical methods: Mixed models; Friedman test

Ordered Treatments

Patients were treated with a drug at four dose levels (100mg, 200mg, 300mg and 400mg) and then monitored for toxicity.

               Drug Toxicity
Dose    Mild  Moderate  Severe  Drug Death
100mg   100      1         0        0
200mg    18      1         1        0
300mg    50      1         1        0
400mg    50      1         1        1

Statistical methods: Regression; Jonckheere-Terpstra Test

Wilcoxon Rank Sum Test

Detailed example. Data: GAF scores

Control    25  10  35
Treatment  36  26  40

Does treatment improve functioning?

Parametric Approach: t-test

The two-sample t-test assesses whether the means of two normally distributed populations are equal:

    H_0: µ_1 = µ_0 versus H_1: µ_1 ≠ µ_0 (one-sided test: H_1: µ_1 > µ_0)

Test statistic:

    t = (\bar{X}_1 - \bar{X}_0) / S_{\bar{X}_1 - \bar{X}_0}, where S_{\bar{X}_1 - \bar{X}_0} = \sqrt{ s_1^2 / n_1 + s_0^2 / n_0 }

Assumptions: equal sample sizes; the two distributions have the same variance.

Results: \bar{X}_1 = 34.00, \bar{X}_0 = 23.33, S_{X_1} = 7.21, S_{X_0} = 12.58, so t = 1.27 and P_{H_0}(t >= 1.27) = 0.1358.
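The t statistic for the GAF data can be reproduced in a few lines of Python, using the unpooled standard error exactly as in the formula above. This is a sketch, not the authors' code; the helper names are illustrative.

```python
from math import sqrt

control = [25, 10, 35]    # X_0
treatment = [36, 26, 40]  # X_1

def mean(xs):
    return sum(xs) / len(xs)

def svar(xs):
    # unbiased sample variance
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# standard error of the difference in means
se = sqrt(svar(treatment) / len(treatment) + svar(control) / len(control))
t = (mean(treatment) - mean(control)) / se
print(round(t, 2))  # → 1.27
```

The sample means and standard deviations agree with the slide (34.00 vs 23.33, and 7.21 vs 12.58), giving t = 1.27.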

Wilcoxon Rank Sum Test

Control    25  10  35
Treatment  36  26  40

Order the data: what is the position (rank) of the patients on treatment compared with the position of the patients in the control arm?

Treatment is effective if treated patients rank sufficiently high in the combined ranking of all patients. We need a test statistic such that:

- treatment ranks are high  =>  value of the test statistic is high
- treatment ranks are low   =>  value of the test statistic is low

W_S = S_1 + S_2 + ... + S_n, the sum of the treatment ranks (n = 3, the number of patients in the treatment arm).

Ranks:

Control    2 (25)   1 (10)   4 (35)
Treatment  5 (36)   3 (26)   6 (40)

W_S = 5 + 3 + 6 = 14

Reject the null hypothesis when W_S is sufficiently large: W_S >= c, with P_{H_0}(W_S >= c) = α (α = 0.05).

What is the distribution of W_S under H_0? Suppose there is no treatment effect (H_0):
- the rank is solely determined by the patient's health status
- the rank is independent of receiving treatment or placebo
- the rank could be assigned to the patient before randomisation

Random selection of patients for treatment = random selection of 3 ranks out of 6. Randomisation divides the ranks (1, 2, ..., 6) into two groups! Number of possible combinations:

    \binom{N}{n} = \frac{N!}{n!(N-n)!}

All possibilities (each has probability 1/20 under H_0):

treatment ranks  (4,5,6) (3,5,6) (3,4,6) (3,4,5) (2,5,6)
w_s                 15      14      13      12      13
treatment ranks  (2,4,6) (2,4,5) (2,3,6) (2,3,5) (2,3,4)
w_s                 12      11      11      10       9
treatment ranks  (1,5,6) (1,4,6) (1,4,5) (1,3,6) (1,3,5)
w_s                 12      11      10      10       9
treatment ranks  (1,3,4) (1,2,6) (1,2,5) (1,2,4) (1,2,3)
w_s                  8       9       8       7       6

Distribution of W_S under the null hypothesis:

w                6     7     8     9    10    11    12    13    14    15
P_{H0}(W_S=w)  1/20  1/20  2/20  3/20  3/20  3/20  3/20  2/20  1/20  1/20
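The null distribution above can be generated by brute force: enumerate all C(6,3) = 20 ways to pick the treatment ranks and tally the rank sums. An illustrative Python sketch, stdlib only:

```python
from itertools import combinations
from collections import Counter

N, n = 6, 3
# every choice of n treatment ranks out of 1..N is equally likely under H0
dist = Counter(sum(ranks) for ranks in combinations(range(1, N + 1), n))

total = sum(dist.values())          # 20 configurations
print(dict(sorted(dist.items())))   # {6: 1, 7: 1, 8: 2, 9: 3, 10: 3, 11: 3, 12: 3, 13: 2, 14: 1, 15: 1}

# one-sided p-value for the observed W_S = 14
p = sum(c for w, c in dist.items() if w >= 14) / total
print(p)  # → 0.1
```

Only the configurations (3,5,6) and (4,5,6) give W_S >= 14, hence p = 2/20 = 0.1, matching the table.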

P_{H_0}(W_S >= 14) = 2/20 = 0.1 > 0.05, so do not reject H_0.

Conclusion: there is no evidence that treatment increases the GAF scores. (What is the power of this study?)

Large-Sample Case

\binom{N}{n} increases rapidly with N and n: \binom{12}{6} = 924, \binom{20}{10} = 184756.

Asymptotic null distribution via the Central Limit Theorem: the sum T of a large number of independent random variables is approximately normally distributed:

    P\left( \frac{T - E(T)}{\sqrt{Var(T)}} \leq a \right) \approx \Phi(a)

where \Phi(a) is the area to the left of a under the standard normal curve.

If both n and m are sufficiently large:

    W_S \approx N(E(W_S), Var(W_S))

with

    E(W_S) = \frac{1}{2} n (N+1),    Var(W_S) = \frac{1}{12} n m (N+1)
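For the toy example (n = m = 3, N = 6) the sample is far too small for this approximation to be trusted, but the moments and the resulting normal tail are easy to sketch. Φ is computed via math.erf; this is illustrative only, not part of the original slides.

```python
from math import sqrt, erf

n, m = 3, 3
N = n + m

E = n * (N + 1) / 2        # E(W_S)   = 10.5
V = n * m * (N + 1) / 12   # Var(W_S) = 5.25

z = (14 - E) / sqrt(V)     # standardised observed W_S = 14

def phi(a):
    # standard normal CDF
    return 0.5 * (1 + erf(a / sqrt(2)))

p_approx = 1 - phi(z)
print(round(z, 3), round(p_approx, 3))  # → 1.528 0.063
```

Compare with the exact P_{H_0}(W_S >= 14) = 0.1: at these sample sizes the normal approximation (about 0.06) is noticeably off, which is why the exact distribution is used.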

Kruskal-Wallis Test

Kruskal-Wallis Test

Example: The following data represent corn yields per acre from three different fields where different farming methods were used.

Method 1  Method 2  Method 3
   92        94       101
   91        90       100
   84        81        93
   89                 102

Question: do the yields differ for the three methods?

Parametric Approach: One-way ANOVA

A statistical test of whether or not the means of several groups are all equal.

Assumptions:
- Independence of cases
- The distributions of the residuals are normal: ε_i ~ N(0, σ²)
- Homoscedasticity

    F = variance between groups / variance within groups = MSTR / MSE

The statistic follows an F distribution with s-1 and n-s d.f.

[Figures illustrating a small F versus a large F]

One-Way ANOVA results

\bar{X}_1 = 89, \bar{X}_2 = 88.33, \bar{X}_3 = 99
s_1 = 3.56, s_2 = 6.65, s_3 = 4.08
MSTR = 135.03, MSE = 22.08
F = 6.11, P_{H_0}(F >= 6.11) = 0.0245
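These ANOVA quantities can be recomputed from the raw yields with a short hand-rolled sketch (names like `mstr` are illustrative; this is not the slides' code):

```python
groups = [
    [92, 91, 84, 89],     # Method 1
    [94, 90, 81],         # Method 2
    [101, 100, 93, 102],  # Method 3
]

n_tot = sum(len(g) for g in groups)
s = len(groups)
grand = sum(sum(g) for g in groups) / n_tot

# between-group (treatment) and within-group (error) sums of squares
sstr = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

mstr = sstr / (s - 1)    # mean square for treatments
mse = sse / (n_tot - s)  # mean square error
F = mstr / mse
print(round(mstr, 2), round(mse, 2), round(F, 2))  # → 135.03 22.08 6.11
```

The result matches the slide: F = 6.11 on (2, 8) d.f.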

Ranks:

Method 1  Method 2  Method 3
    6         8        10
    5         4         9
    1         2         7
    3                   11

\bar{R}_{i.}:  3.75     4.67      9.25

Hypotheses:
H_0: no difference between the treatments
H_1: some difference between the treatments

If the treatments do not differ (H_0): the \bar{R}_{i.} are close to each other, and hence close to \bar{R}_{..}
If the treatments do differ (H_1): the \bar{R}_{i.} differ substantially, and are not close to \bar{R}_{..}

Evaluate the null hypothesis by investigating:

    K = \frac{12}{N(N+1)} \sum_{i=1}^{s} n_i (\bar{R}_{i.} - \bar{R}_{..})^2,    P_{H_0}(K >= c) = ?

Exact distribution of K under H_0: the ranks are determined before assignment to treatment; under random assignment all possibilities have the same chance of being observed. The number of possible combinations is the multinomial coefficient:

    \binom{N}{n_1, n_2, \ldots, n_s} = \binom{N}{n_1} \binom{N-n_1}{n_2} \cdots \binom{N-n_1-\cdots-n_{s-1}}{n_s}

Here: \binom{11}{4,3,4} = \binom{11}{4} \binom{7}{3} \binom{4}{4} = 11550.

A few possible configurations:

Method 1    Method 2  Method 3       K
(1,2,3,4)   (5,6,7)   (8,9,10,11)  8.91
(1,2,3,5)   (4,6,7)   (8,9,10,11)  8.32
(1,2,3,6)   (4,5,7)   (8,9,10,11)  7.84
(1,2,3,7)   (4,5,6)   (8,9,10,11)  7.48
...
(1,3,5,6)   (2,4,8)   (7,9,10,11)  6.16
...

Each configuration has probability 1/11550 under H_0.

From the exact distribution of K: P_{H_0}(K >= 6.16) = 0.0306.

Conclusion: reject H_0; there is a difference between the farming methods.

Large-sample approximation: χ² distribution with s-1 d.f.
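The exact p-value can be reproduced by enumerating all 11550 rank configurations: choose 4 of the 11 ranks for Method 1, then 3 of the remainder for Method 2. A brute-force Python sketch (illustrative, not the authors' code):

```python
from itertools import combinations

ranks = set(range(1, 12))  # ranks 1..11
N = 11
r_bar = (N + 1) / 2        # overall mean rank = 6

def k_stat(gs):
    # Kruskal-Wallis statistic from a list of rank groups
    return 12 / (N * (N + 1)) * sum(
        len(g) * (sum(g) / len(g) - r_bar) ** 2 for g in gs)

# observed ranks from the slide
k_obs = k_stat([(6, 5, 1, 3), (8, 4, 2), (10, 9, 7, 11)])

count = total = 0
for g1 in combinations(sorted(ranks), 4):
    rest = ranks - set(g1)
    for g2 in combinations(sorted(rest), 3):
        g3 = tuple(rest - set(g2))
        total += 1
        if k_stat([g1, g2, g3]) >= k_obs - 1e-9:
            count += 1

print(round(k_obs, 2), total, round(count / total, 4))
```

The observed K is about 6.17, the enumeration visits exactly 11550 configurations, and the tail probability agrees with the exact p-value of about 0.03 quoted above.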

Friedman Test

Friedman Statistic

Setting 1, complete randomization (Kruskal-Wallis test): p-value = 0.8611. The treatment effect is blurred by the variability between subjects.
Setting 2, randomisation within age groups: p-value = 0.0411. Conclusion: reject H_0.

Procedure

- Divide the subjects into homogeneous subgroups (blocks)
- Compare subjects within the blocks w.r.t. treatment effects

(A generalisation of the paired-comparison design.)

Example Data

                        Age group
Treatment  20-30 y  30-40 y  40-50 y  50-60 y
A            19       21       43       46
B            17       20       37       44
C            23       22       39       42

Rank the subjects within each block:

Treatment  20-30 y  30-40 y  40-50 y  50-60 y
A             2        2        3        3
B             1        1        1        2
C             3        3        2        1

Mean of the ranks for each treatment:
- treatment A: \bar{R}_{A.} = 10/4 = 2.5
- treatment B: \bar{R}_{B.} = 5/4 = 1.25
- treatment C: \bar{R}_{C.} = 9/4 = 2.25

If these mean ranks are different: reject H_0. If these mean ranks are close: do not reject H_0.

Measure of closeness of the mean ranks: if the \bar{R}_{i.} are all close to each other, then they are close to the overall mean \bar{R}_{..} and (\bar{R}_{i.} - \bar{R}_{..})^2 will be close to zero.

Friedman statistic:

    Q = \frac{12N}{s(s+1)} \sum_{i=1}^{s} (\bar{R}_{i.} - \bar{R}_{..})^2

P_{H_0}(Q >= c) = ? Exact distribution of Q under H_0. A few possible configurations:

                     Age group
Treatment  20-30 y  30-40 y  40-50 y  50-60 y    Q
A             1        1        1        1       8
B             2        2        2        2
C             3        3        3        3

A             3        3        3        3       8
B             2        2        2        2
C             1        1        1        1

A             1        3        1        3       0
B             2        2        2        2
C             3        1        3        1
...
A             2        2        3        3      3.5
B             1        1        1        2
C             3        3        2        1

Exact distribution of Q:

  Q     P_{H0}(Q = q)
 0.0       0.0694
 0.5       0.2778
 1.5       0.2222
 2.0       0.1574
 3.5       0.1481
 4.5       0.0556
 6.0       0.0278
 6.5       0.0370
 8.0       0.0046

Number of possibilities for the rank combinations: within the age group 20-30 years there are 3! = 6 possible rankings; the age groups are independent, so the total number of possible combinations is (3!)^4 = 1296. Under the null these are all equally likely: each has probability 1/1296 = 1/(s!)^N, with s the number of treatment groups and N the number of blocks.

P_{H_0}(Q >= 3.5) = 0.2731: do not reject H_0.
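Because (3!)^4 = 1296 is tiny, the exact Friedman distribution can be enumerated directly: every block independently contributes one of the 6 permutations of the ranks (1, 2, 3). A Python sketch, stdlib only:

```python
from itertools import permutations, product

s, N = 3, 4          # treatments, blocks
r_bar = (s + 1) / 2  # overall mean rank = 2

def q_stat(blocks):
    # blocks: N tuples, each a permutation of (1, 2, 3);
    # blocks[j][i] is the rank of treatment i in block j
    means = [sum(b[i] for b in blocks) / N for i in range(s)]
    return 12 * N / (s * (s + 1)) * sum((m - r_bar) ** 2 for m in means)

# observed within-block ranks for treatments (A, B, C)
obs = [(2, 1, 3), (2, 1, 3), (3, 1, 2), (3, 2, 1)]
q_obs = q_stat(obs)  # = 3.5

configs = list(product(permutations(range(1, s + 1)), repeat=N))
p = sum(q_stat(c) >= q_obs - 1e-9 for c in configs) / len(configs)
print(q_obs, len(configs), round(p, 4))  # → 3.5 1296 0.2731
```

This reproduces both the observed Q = 3.5 and the exact tail probability 0.2731 from the table.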

Sign Test

Sign Test

A special case of the Friedman test: blocks of size 2.
- subjects matched on e.g. age, gender, ...
- twins
- the two eyes (or hands) of a person
- subject serves as own control: e.g. blood pressure before and after treatment

Example: pain scores for lower back pain, before and after acupuncture.

Patient  Before  After  Sign      Patient  Before  After  Sign
   1        5      6     -           9        6      5     +
   2        6      7     -          10        5      7     -
   3        7      6     +          11        8      6     +
   4        9      4     +          12        8      4     +
   5        6      7     -          13        7      3     +
   6        5      4     +          14        8      5     +
   7        4      8     -          15        6      7     -
   8        7      6     +

In 9 pairs out of 15 the treatment comes out ahead (a reduction in pain score).

Sign test: S_N = 9. P_{H_0}(S_N >= 9) = ?

The exact distribution of S_N under H_0 is binomial with N trials (N = number of pairs) and success probability 1/2:

    P_{H_0}(S_N = a) = \binom{N}{a} \frac{1}{2^N}

    P_{H_0}(S_N \geq 9) = \left[ \binom{15}{9} + \binom{15}{10} + \ldots + \binom{15}{15} \right] \frac{1}{2^{15}} = 0.30
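This binomial tail is a one-liner with math.comb (available from Python 3.8); a quick check of the calculation above:

```python
from math import comb

N, s_obs = 15, 9  # number of pairs, observed number of '+' signs

# P(S_N >= 9) under the binomial(N, 1/2) null distribution
p = sum(comb(N, a) for a in range(s_obs, N + 1)) / 2 ** N
print(round(p, 4))  # → 0.3036
```

Since p is about 0.30, far above 0.05, the sign test gives no evidence of a treatment effect in these data.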

Jonckheere-Terpstra Test

Jonckheere-Terpstra Test

To be used when H_1 is ordered: ordinal responses and an ordering in the treatments/groups.

Example: three diets for rats. Response: growth. H_1: growth rate decreases from A to C (A >= B >= C).

A  133  139  149  160  184
B  111  125  143  148  157
C   99  114  116  127  146

Parametric Approach: Regression

Models the relationship between a dependent and an independent variable:

    y_i = β_0 + β_1 x_i + ε_i

Assumptions:
- ε_i ~ N(0, σ²), and the ε_i are independent
- homoscedasticity
- x_i is measured without error

β_0 = 169, p-value < 0.0001
β_1 = -16, p-value = 0.0133
R-square = 0.3866

Jonckheere-Terpstra Test

Based on Mann-Whitney statistics for two treatments: compare the treatment groups two by two.

- if W_BA is large: growth A > growth B (W_BA = 18)
- if W_BC is large: growth B > growth C (W_BC = 18)
- if W_CA is large: growth A > growth C (W_CA = 23)

JT statistic: W = \sum_{i<j} W_{ij}. Reject H_0 when W is sufficiently large.

W = 18 + 18 + 23 = 59, and P_{H_0}(W >= 59) = 0.0120.

Compare with the result of a Kruskal-Wallis test: p-value = 0.072. The distribution of W is approximately normal for large samples.
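The pairwise Mann-Whitney counts and the JT statistic follow directly from the diet data: for each ordered pair of groups, count how often an observation from the hypothesised "higher" group exceeds one from the "lower" group. A sketch (the helper name `mw` is illustrative):

```python
A = [133, 139, 149, 160, 184]
B = [111, 125, 143, 148, 157]
C = [99, 114, 116, 127, 146]

def mw(lo, hi):
    # number of pairs in which an observation from `hi` exceeds one from `lo`
    return sum(h > l for h in hi for l in lo)

w_ba = mw(B, A)  # A vs B
w_bc = mw(C, B)  # B vs C
w_ca = mw(C, A)  # A vs C

W = w_ba + w_bc + w_ca
print(w_ba, w_bc, w_ca, W)  # → 18 18 23 59
```

All three pairwise counts and the total W = 59 match the slide.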

Parametric versus Nonparametric Tests

Parametric tests:
- make assumptions about the distribution in the population
- these conditions are often not tested
- the test depends on the validity of the assumptions
- the most powerful test if all assumptions are met

Nonparametric tests:
- make fewer assumptions about the distribution in the population
- for small sample sizes, often the only alternative (unless the nature of the population distribution is known exactly)
- less sensitive to measurement error (they use ranks)
- can be used for data which are inherently ranks, even for data measured on a nominal scale
- easier to learn