SCHOOL OF HEALTH AND HUMAN SCIENCES

Using SPSS

Topics addressed today:
1. Differences between groups
2. Graphing

Use the s4data.sav file for the first part of this session.

DON'T FORGET TO RECODE YOUR MISSING VALUES

T-tests

There are a number of different types of t-test available in SPSS. The one we will discuss here is the independent samples t-test, used when you want to compare the mean scores of two different groups of people or conditions.

Independent samples t-test

For example, a research question may be: Is there a significant difference in the mean GHQ score between men and women? Let's see:

Analyze > Compare Means > Independent Samples T-Test
- Move the dependent (continuous) variable (ghqscale) into the Test Variable box
- Move the independent (categorical) variable (sex) into the Grouping Variable box
- Click on Define Groups and type in the numbers used in the data set to code each group. In the data file men = 1 and women = 2, so type 1 in the Group 1 box and 2 in the Group 2 box.

T-Test output

Group Statistics (GHQSCALE by sex)

  sex      N    Mean     Std. Deviation  Std. Error Mean
  male     108  22.5093  5.36081         .51584
  female   141  22.5248  4.36149         .36730

Independent Samples Test (GHQSCALE)

  Levene's Test for Equality of Variances: F = .783, Sig. = .377

  t-test for Equality of Means:
                                t      df       Sig. (2-tailed)  Mean Diff.  Std. Error  95% CI of the Difference
  Equal variances assumed      -.025   247      .980             -.0156      .61633      -1.22950 to 1.19838
  Equal variances not assumed  -.025   203.102  .980             -.0156      .63325      -1.26415 to 1.23303

Interpretation of the output

In the Group Statistics box, SPSS gives you the mean and standard deviation for each of your groups, along with the number of people in each group. Always check these values first: do they seem right?

The first section of the Independent Samples Test output box gives you the results of Levene's test for equality of variances. This tests whether the variance of the scores is the same for the two groups. The outcome of this test determines which of the t-values that SPSS provides is the correct one to use:

- If the significance level (Sig.) of Levene's test is larger than .05 (e.g. .07, .10), use the t-test in the first line of the table, which refers to Equal variances assumed.
- If it is .05 or less (e.g. .01, .001), the variances for the two groups are not the same, so your data violate the assumption of equal variance. SPSS provides an alternative t-value which compensates for this; use the information in the second line of the table, which refers to Equal variances not assumed.

If the value in the Sig. (2-tailed) column is equal to or less than .05, there is a significant difference in the mean scores on your dependent variable for the two groups. If the value is above .05, there is no significant difference between the two groups, as in this case.

See if there is a difference in the mean value on the ghqscale variable between those who are married and those who are divorced. (Hint: adjust the Define Groups boxes.)

Paired samples t-test

There is another common t-test: the paired samples t-test, used when you want to compare the mean scores of the same group of people on two different occasions, or when you have matched pairs. If you wish to see how this works, download s4data_b.sav and have a go at looking at the two GHQ scores. You can leave this until the end of the session if you wish!

Paired t-tests (also referred to as repeated measures) are used when you have only one group of people and you collect data from them on two different occasions, or under two different conditions. Pre-test and post-test experimental designs are an example of the type of situation where this technique is appropriate: you assess each person on some continuous measure at Time 1 and then again at Time 2, after exposing them to some experimental manipulation or intervention.

This approach is also used when you have matched pairs of subjects (that is, each person is matched with another on specific criteria such as age, sex, etc.). One of the pair is exposed to Intervention 1 and the other is exposed to Intervention 2, and scores on a continuous measure are then compared for each pair. Paired samples t-tests can also be used when you measure the same person's responses to two different questions; in this case, both questions should be rated on the same scale.

A word on null hypotheses

Hypotheses are in the form of either a substantive hypothesis, which, as has been pointed out, represents the predicted association between variables, or a null hypothesis, which is a statistical artifice and always predicts the absence of a relationship between the variables. Hypothesis testing is based on the logic that the substantive hypothesis is tested by assuming that the null hypothesis is true.
Testing the null hypothesis involves calculating how likely the results would have been to occur (the probability) if there really was no difference. Thus the onus of proof rests with the substantive hypothesis that there is a change or difference. The null hypothesis is compared with the research observations, and statistical tests are used to estimate the probability of the observations occurring by chance (Bowling, 2002, p. 169).

Which brings us to P values...

All the statistics we have calculated (phi, chi-squared, etc.) are tested to determine whether they are statistically significant. This is usually done by comparing their value to a point on an appropriate distribution, determined by the statistic and the degrees of freedom. For example, the t distribution is a family of curves (in the same way as the normal curve is), and the shape of each curve is determined by the degrees of freedom. The value of the statistic is plotted (by SPSS!) against the relevant curve to determine the P value for that statistic.

The most commonly used cut-off is a P value below 0.05 (or 5%). This means that there is less than a 5% chance of a false positive result.

So, in the case of the independent t-test example above, we test the null hypothesis that there is no difference between the mean GHQ scores for men and women. From the output we see that the t statistic is -0.025 with a P value of 0.98. As we are looking for evidence to reject the null hypothesis, we are looking for a P value of 0.05 or less. In this case the P value is well above 0.05, so we retain the null hypothesis of no difference.
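The paired samples t-test and the P-value decision rule described above can be sketched with SciPy; the pre/post scores here are hypothetical, not the s4data_b.sav values:

```python
from scipy import stats

# Hypothetical GHQ scores for the same ten people at Time 1 (pre) and
# Time 2 (post) - illustrative values, not the s4data_b.sav data.
time1 = [25, 28, 22, 30, 26, 24, 27, 23, 29, 25]
time2 = [22, 25, 21, 27, 24, 22, 25, 21, 26, 23]

# Paired (repeated measures) t-test on the Time 1 / Time 2 differences.
t_stat, p_value = stats.ttest_rel(time1, time2)

# The decision rule from the handout: reject the null hypothesis of
# no difference only when the two-tailed P value is 0.05 or less.
decision = "reject" if p_value <= 0.05 else "retain"
print(f"t = {t_stat:.2f}, P = {p_value:.4f}: {decision} the null hypothesis")
```

Here every person scores lower at Time 2, so the P value is small and the null hypothesis of no difference is rejected.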

One-way ANOVA

Use this test for comparing the means of 3 or more groups, to avoid performing multiple t-tests. If you have 3 groups to compare (1, 2, 3), you would need 3 separate t-tests (comparing 1 with 2, 1 with 3, and 2 with 3). If you had seven groups you would need 21 separate t-tests. This would be time-consuming but, more importantly, it would be flawed, because in each t-test we usually accept a 5% chance of our conclusion being wrong (we test for p < 0.05). So, across 21 tests you would expect about one test to give you a false result. ANOVA overcomes this problem by enabling you to detect significant differences between the treatments as a whole: you do a single test to see if there are differences between the means at your chosen probability level. The test statistic is F.

To run a one-way ANOVA:

Analyze > Compare Means > One-Way ANOVA
- Move the dependent (interval) variable into the Dependent List box
- Move the independent (categorical) variable into the Factor box
- Click on Options to specify Descriptives: this will produce the means of the dependent variable for each of the groups in the factor variable.

Use the one-way ANOVA to compare the mean GHQ score between categories of marital status.

Non-parametric tests

Why use non-parametric tests? The parametric tests (t-tests and one-way analysis of variance) make assumptions about the population that the sample has been drawn from, often including assumptions about the shape of the population distribution. The assumptions required by non-parametric tests are less restrictive than those of fully parametric models. For example:

- Most parametric procedures require knowledge of, or a strong enough belief in, a distributional form for the measured outcome in the population studied; non-parametric methods are valid for most distributions.
- An interval level variable is usually required for parametric inference, whereas most non-parametric methods will work with ordinal level data, and some of the techniques will hold with nominal level data.
Non-parametric methods are also often easier to compute.

Another factor that often limits the applicability of parametric tests (which assume that the sampling distribution is normal) is the size of the sample available for analysis (sample size, n). We can assume that the sampling distribution is normal even if we are not sure that the distribution of the variable in the population is normal, as long as our sample is large enough (e.g. 100 or more observations). However, if our sample is very small, then those tests can be used only if we are sure that the variable is normally distributed, and there is no way to test this assumption when the sample is small.

Despite being less fussy, non-parametric tests do have their disadvantages: they tend to be less sensitive than their parametric cousins, and therefore may fail to detect differences between groups that actually do exist.
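The one-way ANOVA described earlier has a direct SciPy equivalent. The three marital-status groups below are invented for illustration:

```python
from scipy import stats

# Hypothetical GHQ scores for three marital-status groups
# (invented numbers for illustration only).
married  = [20, 22, 19, 23, 21, 20, 24]
single   = [25, 23, 26, 24, 27, 25, 24]
divorced = [22, 24, 21, 23, 25, 22, 23]

# A single F-test across all three groups at once, instead of three
# pairwise t-tests with an inflated false-positive risk.
f_stat, p_value = stats.f_oneway(married, single, divorced)
print(f"F = {f_stat:.2f}, P = {p_value:.4f}")
```

Because the invented group means differ clearly relative to the within-group spread, the F statistic is large and P falls below 0.05.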

If you have the right sort of data, it is always better to use a parametric test if you can. If in doubt, do both parametric and non-parametric tests: do they say anything different? If you are sure that a non-parametric test is the most appropriate, then use that.

Mann-Whitney U-test

The Mann-Whitney U-test is used in place of the t-test when the normality assumption for the two samples is questionable. This test can also be applied when the observations in a sample of data are ranks, that is, ordinal data rather than direct measurements. Instead of comparing the means of the two groups of interest, as in the case of the t-test, the Mann-Whitney U-test compares the medians. It converts the scores on the continuous variable to ranks across the two groups, and then evaluates whether the ranks for the two groups differ significantly. As the scores are converted to ranks, the actual distribution of the scores does not matter.

To run a Mann-Whitney U-test:

Analyze > Nonparametric Tests > 2 Independent Samples
- Move the dependent variable into the Test Variable box
- Move the independent variable into the Grouping Variable box
- Click on Define Groups and type in the numbers used in the data set to code each group. For example, if men = 1 and women = 2, type 1 in the Group 1 box and 2 in the Group 2 box.

Kruskal-Wallis test

This is a non-parametric test used to compare three or more samples. It tests the null hypothesis that all populations have identical distribution functions, against the alternative hypothesis that at least two of the samples differ, only with respect to location (median), if at all. It is the analogue of the F-test used in analysis of variance: while analysis of variance depends on the assumption that all populations under comparison are normally distributed, the Kruskal-Wallis test places no such restriction on the comparison. It is a logical extension of the Mann-Whitney U-test.
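The Mann-Whitney U-test described above can be sketched with SciPy. The ordinal ratings below are invented for illustration:

```python
from scipy import stats

# Hypothetical ordinal ratings (1-7) for two independent groups
# (invented data for illustration).
group1 = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
group2 = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]

# Scores are converted to ranks across both groups, so only their
# ordering matters, not their distribution.
u_stat, p_value = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(f"U = {u_stat:.1f}, P = {p_value:.4f}")
```

Since almost every rating in group2 outranks those in group1, the two-sided P value comes out below 0.05.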
To run a Kruskal-Wallis test:

Analyze > Nonparametric Tests > K Independent Samples
- Move the dependent variable into the Test Variable List box
- Move the independent variable into the Grouping Variable box
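The Kruskal-Wallis test is also available in SciPy; a minimal sketch with three invented groups:

```python
from scipy import stats

# Hypothetical scores for three independent groups
# (invented data for illustration).
group_a = [12, 15, 11, 14, 13]
group_b = [18, 21, 19, 22, 20]
group_c = [13, 16, 14, 15, 12]

# The rank-based analogue of the one-way ANOVA F-test.
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, P = {p_value:.4f}")
```

Every score in group_b outranks the other two groups, so the test reports a significant difference in location.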

Use the s4data_c.sav file for the rest of this session.

1. Make a bar graph showing the mean life expectancy of men in the different regions of the world.
2. Make a bar graph showing all the African countries arranged by literacy rate.
3. Use Crosstabs to look at the distribution of religions in OECD and Latin American countries.
4. Make a pie chart showing the relative populations of Brazil, Argentina, Uruguay and Chile.
5. Imagine that you have to write a report on suicide around the world. What can you say about suicide in different countries using this data? Why might social scientists raise questions about the data on suicide from different countries?
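For exercise 3, a crosstab is simply a count of each combination of two categorical variables. A minimal sketch with made-up country records (the real exercise uses the region and religion variables in s4data_c.sav):

```python
from collections import Counter

# Made-up (region, religion) records standing in for the
# s4data_c.sav variables used in exercise 3.
records = [
    ("OECD", "Protestant"), ("OECD", "Catholic"), ("OECD", "Catholic"),
    ("Latin America", "Catholic"), ("Latin America", "Catholic"),
    ("Latin America", "Protestant"), ("OECD", "Other"),
]

# A crosstab is just the count of each (row, column) combination.
counts = Counter(records)
regions = sorted({region for region, _ in records})
religions = sorted({religion for _, religion in records})

# Print one row per region, with a count for each religion.
for region in regions:
    row = {rel: counts[(region, rel)] for rel in religions}
    print(region, row)
```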