6 Comparison of differences between 2 groups: Student's T-test, Mann-Whitney U-Test, Paired Samples T-test and Wilcoxon Test


Having finally arrived at the bottom of our decision tree, we are now going to learn how to look for differences between two groups. The four tests covered in this course are very commonly used in publications. As always, other statistical tests are available and may also be correct to use (it is your own responsibility as a scientist to understand the underlying assumption criteria if you decide to use a different test).

The Null Hypothesis statement for any test looking at differences between groups is: There are no differences between the groups.

p < 0.05 = there is a significant difference between the groups!

Note that, in contrast to all tests we have covered so far, we are now looking for a p-value < 0.05 to be able to state that we have a significant difference between groups.

Experimental data (for Exercises 6.1-6.8): We have collected serum from male and female subjects at two different timepoints and have measured the concentration of a protein (in pg/ml) in this serum in the laboratory.

Student's T-test

Exercise 6.1:
1. In the Excel spreadsheet Excercises_2, select the tab Student's T-test I.
2. Copy and paste the raw data into the SPSS Data View tab.
3. Label the data appropriately in the Variable View tab.
4. In the menu bar, go to Analyse -> Compare Means -> Independent samples T-test (see screen shot).
5. As soon as you select Independent samples T-test, a new small window called Independent samples T-test opens up, with several selection options.
6. Using the top blue arrow, transfer Serum Protein over into the box called Test Variable(s) (see screen shot).
7. Using the bottom blue arrow, transfer Gender over into the box called Grouping Variable (see screen shot).
8. Select the blue button Define Groups underneath the Grouping Variable box.
9. As soon as you click the button, a new small window called Define Groups opens up.
10. Type the number 1 in the box for Group 1 and type the number 2 in the box for Group 2 (see screen shot). These are the numbers that we have defined in Gender.
11. Select Continue and Ok.
12. The SPSS Output Window contains 2 tables. The top table shows Group Statistics and tells you which groups you have compared (here, the serum protein concentrations of male and female subjects) (see screen shot). It also gives you useful information such as the sample size (N), mean and Standard Deviation. This is similar to the Summary Statistics we learned about at the beginning of the course.
13. The second table shows the results of the Student's t-test (Independent Samples Test). The p-value can be found under the heading Sig. (2-tailed) (see screen shot). The p-value is <0.05, which means that we have a significant difference between the serum protein concentrations of the two genders.
14. In the table, you can also see a section called Levene's Test for Equality of Variances. This is exactly the same test that we have learned previously. Since this test is so important, SPSS automatically performs it for you and you don't have to do it yourself every time. However, you can also see from the table that SPSS returns two rows and even two p-values for the t-test. The first row is Equal variances assumed, and in this row you find the result of the Levene test (see screen shot).
15. The second row is Equal variances not assumed. This is a different test from the Student's t-test (it is Welch's t-test, which does not pool the two variances), and in this course you should ignore all results in this row (especially the p-value of the T-test, even if it is significant!). If the Levene test shows a p-value <0.05 in the top row, you should also ignore the t-test results of the top row and follow our decision tree to the Mann-Whitney U test.

Exercise 6.2:
1. In the Excel spreadsheet Excercises_2, select the tab Student's T-test II.
2. Repeat Exercise 6.1 with this new dataset.

Solution Exercise 6.2: The p-value for the Levene test is highly significant (p<0.001), meaning there is no homogeneity of variance between the two genders. Therefore, we cannot take the results of the T-test as valid and have to continue with the Mann-Whitney U test.

Mann-Whitney U test

Exercise 6.3:
1. In the Excel spreadsheet Excercises_2, select the tab Mann-Whitney-U I.
2. Copy and paste the raw data into the SPSS Data View tab.
3. Label the data appropriately in the Variable View tab.
4. In the menu bar, go to Analyse -> Nonparametric Tests -> Legacy dialogs -> 2 Independent Samples (see screen shot).
5. As soon as you select 2 Independent Samples, a new small window called Two Independent Samples opens up, with several selection options.
6. Using the top blue arrow, transfer Serum Protein over into the box called Test Variable List.
7. Using the bottom blue arrow, transfer Gender over into the box called Grouping Variable.
8. Select the blue button Define Groups underneath the Grouping Variable box.
9. Define the groups (1 and 2) as described in Exercise 6.1.
10. Select Continue and Ok.
11. The SPSS Output Window again contains 2 tables. The first table (Ranks) gives an overview of the group comparison (unlike the T-test output, it does not contain the summary statistics, see screen shot).
12. The second table contains the Test Statistics. You can find the relevant p-value for the Mann-Whitney U test in the row called Asymp. Sig. (2-tailed) (Asymptotic Significance), see screen shot. The p-value is <0.05, meaning that there are differences in the serum protein concentrations between the two genders.
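To make the numbers in the two SPSS output tables less of a black box, the two independent-samples statistics can be sketched in plain Python. This is only a minimal sketch under stated assumptions: the function names and the serum values are hypothetical, the t-test part returns only t and the degrees of freedom (no p-value), and the Mann-Whitney part uses the normal approximation without tie or continuity correction, so its p-value may differ slightly from the Asymp. Sig. that SPSS reports when the data contain ties.

```python
import math
from statistics import NormalDist, mean, variance


def student_t(group1, group2):
    """Return (t, df) for Student's t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / std_error, df


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U, asymptotic two-sided p); no tie or continuity correction."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller of the two U values
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical serum protein values in pg/ml (not the course data).
male = [10.0, 12.0, 14.0, 16.0]
female = [20.0, 22.0, 24.0, 26.0]
print(student_t(male, female))
print(mann_whitney_u(male, female))
```

The t-test works on the means and the pooled variance, whereas the Mann-Whitney U test only looks at the ranks of the values, which is why it does not depend on the homogeneity of variance assumption in the same way.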

Exercise 6.4:
1. In the Excel spreadsheet Excercises_2, select the tab Mann-Whitney U II.
2. Repeat Exercise 6.3 with this new dataset.

Solution Exercise 6.4: After completing this dataset, you will find in the Test Statistics table, in addition to Asymp. Sig., another p-value called Exact Sig. (Exact Significance). Whenever this p-value appears, you should use it instead of the Asymp. Sig. The Asymptotic p-value is to be used with larger sample sizes, the Exact p-value with small sample sizes. We will discuss later what counts as a small and what counts as a large sample size. The p-value for the Mann-Whitney U test is not significant (p>0.05), meaning there is no difference in serum protein concentrations between the two genders.
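The difference between an exact and an asymptotic p-value can be illustrated with a short sketch. With small samples, every possible way of assigning the ranks to the first group can simply be enumerated, so no approximation is needed. The function name and the example values below are hypothetical, the sketch assumes that there are no tied values, and it doubles the one-tailed probability to obtain a two-sided p-value; it is an illustration of the idea and not necessarily the exact routine SPSS uses.

```python
from itertools import combinations
from math import comb


def exact_mann_whitney_p(group1, group2):
    """Exact two-sided Mann-Whitney p-value, assuming no tied values."""
    n1, n2 = len(group1), len(group2)
    combined = sorted(list(group1) + list(group2))
    if len(set(combined)) != len(combined):
        raise ValueError("this sketch assumes no tied values")
    rank_of = {value: rank for rank, value in enumerate(combined, start=1)}
    offset = n1 * (n1 + 1) // 2
    u1 = sum(rank_of[v] for v in group1) - offset
    u_obs = min(u1, n1 * n2 - u1)
    # Under the Null Hypothesis every choice of n1 ranks is equally likely.
    as_small = sum(
        1
        for ranks in combinations(range(1, n1 + n2 + 1), n1)
        if sum(ranks) - offset <= u_obs
    )
    one_tailed = as_small / comb(n1 + n2, n1)
    return min(1.0, 2 * one_tailed)


# Hypothetical serum protein values in pg/ml (not the course data).
print(exact_mann_whitney_p([10.0, 12.0, 14.0, 16.0], [20.0, 22.0, 24.0, 26.0]))
```

The number of rank assignments grows very quickly with the sample size, which is one reason why the asymptotic p-value is used for larger samples.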

Paired samples T-test

Exercise 6.5:
1. In the Excel spreadsheet Excercises_2, select the tab Paired T-test I.
2. Copy and paste the raw data into the SPSS Data View tab.
3. Label the data appropriately in the Variable View tab.
4. In the menu bar, go to Analyse -> Compare Means -> Paired-Samples T Test (see screen shot).
5. As soon as you select Paired-Samples T Test, a new small window called Paired-Samples T test opens up, with several selection options.
6. Using the blue arrow, transfer Serum Protein Month 0 and Serum Protein Month 3 over into the box called Paired Variables (see screen shot).
7. Note that the variables still stay in the left-hand box. This allows you to do several comparisons with multiple pairs in one go.
8. Select Ok.
9. The SPSS Output Window contains 3 tables. The first table (Paired Samples Statistics) again shows summary statistics (see screen shot).
10. The 3rd table contains the result of our Paired samples T-test (we can ignore the 2nd table). The p-value can be found at the very end of the table under the heading Sig. (2-tailed), see screen shot. It is highly significant (p<0.001), meaning that there is a significant difference in serum protein concentrations between the timepoint Month 0 and the timepoint Month 3.
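What the Paired samples T-test does with the two columns can be sketched as follows: it takes the difference within each pair and tests whether the mean of these differences is zero. The function names and the example values are hypothetical. The p-value is obtained here by numerically integrating the density of the t distribution, which is a simple stand-in for the routine a statistics package would use, and the sketch assumes that the differences are not all identical.

```python
import math
from statistics import mean, stdev


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: integrate the t density with Simpson's rule."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area_zero_to_t = total * h / 3
    return max(0.0, 1 - 2 * area_zero_to_t)


def paired_t(first, second):
    """Return (t, df, p) for the paired differences first - second."""
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1, t_two_sided_p(t, n - 1)


# Hypothetical serum protein values in pg/ml (not the course data).
month0 = [10.0, 12.0, 14.0, 16.0, 18.0]
month3 = [11.0, 14.0, 15.0, 19.0, 20.0]
print(paired_t(month0, month3))
```

The sign of t only depends on which timepoint is subtracted from which; the two-sided p-value is the same either way.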

Exercise 6.6:
1. In the Excel spreadsheet Excercises_2, select the tab Paired T-test II.
2. Repeat Exercise 6.5 with this new dataset.

Solution Exercise 6.6: The p-value is highly significant (p<0.001), meaning that there is a significant difference in serum protein concentrations between the timepoint Month 0 and the timepoint Month 3.

Wilcoxon test

Exercise 6.7:
1. In the Excel spreadsheet Excercises_2, select the tab Wilcoxon test I.
2. Copy and paste the raw data into the SPSS Data View tab.
3. Label the data appropriately in the Variable View tab.
4. In the menu bar, go to Analyse -> Nonparametric Tests -> Legacy dialogs -> 2 Related Samples (see screen shot).
5. As soon as you select 2 Related Samples, a new small window called Two Related Samples Tests opens up, with several selection options.
6. Using the blue arrow, transfer Serum Protein Month 0 and Serum Protein Month 3 over into the box called Test Pairs (see screen shot).
7. Note that, similar to the paired samples t-test, the variables still stay in the left-hand box. This allows you to do several comparisons with multiple pairs in one go.
8. Select Ok.
9. The SPSS Output Window looks similar to the Output of the Mann-Whitney U test. The p-value of the Wilcoxon test can be found in the bottom row of the second table (see screen shot). It is not significant (p>0.05), meaning that there is no difference in serum protein concentrations between the timepoints Month 0 and Month 3.
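The Wilcoxon test is the rank-based counterpart of the paired samples t-test, and its logic can be sketched in a few lines: take the paired differences, drop the zeros, rank the absolute differences and compare the rank sums of the positive and negative differences. The function names and the example values are hypothetical, and the sketch uses the normal approximation without tie correction, so its p-value may differ slightly from the SPSS output when the data contain ties.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_signed_rank(first, second):
    """Return (W, asymptotic two-sided p); zero differences are dropped."""
    diffs = [x - y for x, y in zip(first, second) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma  # z <= 0 because w is the smaller rank sum
    return w, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical serum protein values in pg/ml (not the course data).
month0 = [10.0, 12.0, 14.0, 16.0, 18.0]
month3 = [11.0, 14.0, 15.0, 19.0, 20.0]
print(wilcoxon_signed_rank(month0, month3))
```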

Exercise 6.8:
1. In the Excel spreadsheet Excercises_2, select the tab Wilcoxon test II.
2. Repeat Exercise 6.7 with this new dataset.

Solution Exercise 6.8: The p-value is highly significant (p<0.001), meaning that there is a significant difference in serum protein concentrations between the timepoint Month 0 and the timepoint Month 3.

We have now completed our decision tree for comparisons between two groups.

By following the decision tree, you have learned how to decide, depending on the type of your dataset, which test to use to compare two groups for differences between them. However, many experiments have more than two groups, and I will now introduce a more complex decision tree for comparing differences between more than two groups (in theory, as many groups/conditions as you like).

The beginning and end of this decision tree are the same as in the decision tree for comparisons of 2 groups. First we have decisions about the type of dataset we have, followed by decisions on which assumption criteria our dataset meets. At the end of the decision tree we have our standard 2-group comparison tests: the Student's T-test, Mann-Whitney U-Test, Paired Samples T-test and Wilcoxon Test. You can ignore the bit about Bonferroni correction for now; we will cover this later.

Above the 2-group tests we have four new tests, which are used to detect differences between more than 2 groups. On the left-hand side of the tree (i.e. the independent samples side), we have the Oneway ANOVA and the Kruskal-Wallis test. On the right-hand side of the tree (i.e. the dependent samples side), we have the Repeated Measurement ANOVA and the Friedman test. These four tests can be used to test simultaneously whether there are differences between more than 2 groups. The tests can be very powerful and sometimes detect differences when a normal 2-group test doesn't.

However, as you will see later on, these tests only answer the question of whether there are differences somewhere between any of the tested groups. They will not tell you where these differences are, and you need to do a 2-group test to find out exactly this. This is so-called post-hoc testing and we will learn more about it later.

Also, you should note that because these tests are very powerful, if they don't find a difference between the groups, none of the post-hoc tests will find one. So if you have, for example, 100 groups, you do not have to do thousands of post-hoc tests if one of the big multiple group tests does not find any differences. Try as you might, you will not find any differences with the 2-group tests. Therefore, if you do have multiple groups, it is good practice to always do a multiple group test first before the 2-group tests, as this may save you a lot of time.
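The structure of the decision tree described above can be written down as a small lookup function. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, and all assumption checks of the tree are collapsed into a single yes/no flag (assumptions_met) that sends you to the parametric test when true and to the rank-based test when false. The second helper shows how quickly the number of pairwise post-hoc comparisons grows with the number of groups.

```python
from math import comb


def choose_test(n_groups, paired, assumptions_met):
    """Pick a test name from the decision tree described in the text.

    n_groups: number of groups or conditions (2 or more)
    paired: True for dependent samples, False for independent samples
    assumptions_met: True if the assumption criteria of the parametric
        test hold (a simplification of the individual checks)
    """
    if n_groups < 2:
        raise ValueError("at least two groups are needed")
    if n_groups == 2:
        if paired:
            return "Paired Samples T-test" if assumptions_met else "Wilcoxon Test"
        return "Student's T-test" if assumptions_met else "Mann-Whitney U-Test"
    if paired:
        return "Repeated Measurement ANOVA" if assumptions_met else "Friedman test"
    return "Oneway ANOVA" if assumptions_met else "Kruskal-Wallis test"


def pairwise_comparisons(n_groups):
    """Number of distinct 2-group comparisons among n_groups groups."""
    return comb(n_groups, 2)


print(choose_test(2, paired=False, assumptions_met=False))
print(choose_test(4, paired=True, assumptions_met=True))
print(pairwise_comparisons(100))
```

For 100 groups there are 4950 possible pairs, which is why running one multiple group test first can save a lot of work.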