SCHOOL OF HEALTH AND HUMAN SCIENCES
Using SPSS

Topics addressed today:
1. Differences between groups
2. Graphing

Use the s4data.sav file for the first part of this session. DON'T FORGET TO RECODE YOUR MISSING VALUES.

T-tests

There are a number of different types of t-test available in SPSS. The one we will discuss is the independent samples t-test, used when you want to compare the mean scores of two different groups of people or conditions.

Independent samples t-test

For example, a research question may be: Is there a significant difference in the mean GHQ score between men and women? Let's see:

Analyze > Compare Means > Independent Samples T-Test
- Move the dependent (continuous) variable (ghqscale) into the Test Variable box
- Move the independent (categorical) variable (sex) into the Grouping Variable box
- Click on Define Groups and type in the numbers used in the data set to code each group. In the data file men = 1 and women = 2, so type 1 in the Group 1 box and 2 in the Group 2 box.
T-Test

Group Statistics (GHQSCALE by SEX)

  Group      N     Mean      Std. Deviation   Std. Error Mean
  1 male     108   22.5093   5.36081          .51584
  2 female   141   22.5248   4.36149          .36730

Independent Samples Test (GHQSCALE)

  Levene's Test for Equality of Variances:  F = .783, Sig. = .377

  t-test for Equality of Means:
                                 t      df        Sig.        Mean        Std. Error   95% CI of the Difference
                                                  (2-tailed)  Difference  Difference   Lower       Upper
  Equal variances assumed       -.025   247       .980        -.0156      .61633       -1.22950    1.19838
  Equal variances not assumed   -.025   203.102   .980        -.0156      .63325       -1.26415    1.23303

Interpretation of the output

In the Group Statistics box, SPSS gives you the mean and SD for each of your groups, together with the number of people in each group. Always check these values first: do they seem right?

The first section of the Independent Samples Test output box gives you the results of Levene's test for equality of variances. This tests whether the variance of the scores for the two groups is the same; its outcome determines which of the t-values SPSS provides is the correct one to use.

If the significance level (Sig.) of Levene's test is larger than .05 (e.g. .07, .10), you should use the t-test in the first line of the table, which refers to Equal variances assumed.

If it is P = .05 or less (e.g. .01, .001), the variances for the two groups are not the same, so your data violate the assumption of equal variance. SPSS provides an alternative t-value which compensates for the fact that your variances are not the same: use the information in the second line of the table, which refers to Equal variances not assumed.

If the value in the Sig. (2-tailed) column is equal to or less than .05, there is a significant difference in the mean scores on your dependent variable for the two groups. If the value is above .05, there is no significant difference between the two groups, as in this case.
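If you want to check the logic outside SPSS, the same analysis can be sketched in Python with scipy. This is purely illustrative: the two score lists below are made-up stand-ins, not the s4data.sav values, and the function names are scipy's, not SPSS's.

```python
from scipy import stats

# Made-up GHQ-style scores for two groups (illustrative data only)
men   = [20, 25, 23, 19, 27, 22, 24, 21]
women = [22, 24, 23, 21, 25, 22, 23, 20]

# Levene's test for equality of variances -- the check SPSS reports first
lev_stat, lev_p = stats.levene(men, women)

# Follow the handout's rule: equal variances assumed if Levene's p > .05,
# otherwise use the "equal variances not assumed" (Welch) t-test
equal_var = lev_p > 0.05
t_stat, p_value = stats.ttest_ind(men, women, equal_var=equal_var)

print(f"Levene p = {lev_p:.3f}, t = {t_stat:.3f}, Sig. (2-tailed) = {p_value:.3f}")
```

As with the SPSS output, you would report the t and two-tailed significance from whichever line (equal variances assumed or not) Levene's test points you to.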
See if there is a difference in the mean value on the ghqscale variable between those who are married and those who are divorced. (Hint: adjust the Define Groups boxes.)

Paired samples t-test

There is another common t-test: the paired samples t-test, used when you want to compare the mean scores of the same group of people on two different occasions, or when you have matched pairs. If you wish to see how this works, download s4data_b.sav and have a go at looking at the two GHQ scores. You can leave this to the end of the session if you wish!

Paired t-tests (also referred to as repeated measures) are used when you have only one group of people and you collect data from them on two different occasions, or under two different conditions. Pre-test and post-test experimental designs are an example of the type of situation where this technique is appropriate: you assess each person on some continuous measure at Time 1 and then again at Time 2, after exposing them to some experimental manipulation or intervention.

This approach is also used when you have matched pairs of subjects (that is, each person is matched with another on specific criteria such as age, sex etc.). One of the pair is exposed to Intervention 1 and the other to Intervention 2, and scores on a continuous measure are then compared for each pair. Paired samples t-tests can also be used when you measure the same person's response on two different questions; in this case, both dimensions should be rated on the same scale.

A word on null hypotheses

Hypotheses take the form of either a substantive hypothesis, which, as has been pointed out, represents the predicted association between variables, or a null hypothesis, which is a statistical artifice and always predicts the absence of a relationship between the variables. Hypothesis testing is based on the logic that the substantive hypothesis is tested by assuming that the null hypothesis is true.
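The paired samples design described above can also be sketched in Python with scipy. Again this is illustrative only: the pre/post scores are made up, standing in for the two GHQ scores in s4data_b.sav.

```python
from scipy import stats

# Made-up scores for the same eight people at Time 1 and Time 2
time1 = [24, 28, 22, 30, 26, 25, 27, 23]
time2 = [21, 26, 20, 27, 25, 22, 24, 22]

# Paired (repeated measures) t-test: tests whether the mean of the
# per-person differences is zero
t_stat, p_value = stats.ttest_rel(time1, time2)
print(f"t = {t_stat:.3f}, Sig. (2-tailed) = {p_value:.3f}")
```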
Testing the null hypothesis involves calculating how likely (the probability) the results would have been to occur if there really were no differences. Thus the onus of proof rests with the substantive hypothesis that there is a change or difference. The null hypothesis is compared with the research observations, and statistical tests are used to estimate the probability of the observations occurring by chance (Bowling, 2002, p. 169).

Which brings us to P values…

All the statistics we have calculated (phi, chi-squared etc.) are tested to determine whether they are statistically significant. This is usually done by comparing their value to a point on an appropriate distribution, determined by the statistic and the degrees of freedom. For example, the t distribution is a family of curves (in the same way as the normal curve is) whose shape is determined by the degrees of freedom. The value of the statistic is plotted (by SPSS!) against the relevant curve to determine the P value for that statistic.

The most commonly used cut-off is P below 0.05 (or 5%). This means that there is less than a 5% chance of seeing a result this extreme if the null hypothesis were true (a false positive).

So, in the case of the independent t-test example above, we test the null hypothesis that there is no difference between the mean GHQ scores for men and women. From the output we see that the t statistic is -0.025 with a P value of 0.98. As we are looking for evidence to reject the null hypothesis, we are looking for a P value of 0.05 or less. In this case the P value is well above 0.05, so we cannot reject the null hypothesis of no difference.
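The P value lookup described above can be reproduced by hand. Taking the t statistic and degrees of freedom from the independent t-test output (t = -0.025, df = 247), a two-tailed P value is twice the tail area of the t distribution beyond |t|, sketched here in Python with scipy:

```python
from scipy import stats

# Values from the independent samples t-test output above
t_value, df = -0.025, 247

# Two-tailed P value: the area in both tails beyond |t|
p = 2 * stats.t.sf(abs(t_value), df)
print(f"P = {p:.2f}")  # matches the 0.98 in the SPSS output
```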
One-way ANOVA

Use this test for comparing the means of 3 or more groups, to avoid performing multiple t-tests. If you have 3 groups to compare (1, 2, 3), you would need 3 separate t-tests (comparing 1 with 2, 1 with 3, and 2 with 3). If you had seven groups you would need 21 separate t-tests. This would be time-consuming but, more importantly, it would be flawed, because in each t-test we usually accept a 5% chance of our conclusion being wrong (testing for p < 0.05). So, across 21 tests you would expect about one test to give you a false result.

ANOVA overcomes this problem by enabling you to detect significant differences between the treatments as a whole: you do a single test to see if there are differences between the means at your chosen probability level. The test statistic is F.

To run a one-way ANOVA:

Analyze > Compare Means > One-Way ANOVA
- Move the dependent (interval) variable into the Dependent List box
- Move the independent (categorical) variable into the Factor box
- Click on Options and select Descriptives: this will produce the means of the dependent variable for each of the groups in the factor variable

Use the one-way ANOVA to compare the means of the GHQ score between categories of marital status.

Non-parametric tests

Why use non-parametric tests? The parametric tests (t-tests and one-way analysis of variance) make assumptions about the population that the sample has been drawn from, often including assumptions about the shape of the population distribution. The assumptions required by non-parametric tests are less restrictive. For example:
- Most parametric procedures require knowledge of, or a strong enough belief in, a distributional form for the measured outcome in the population studied.
- An interval level variable is usually required for parametric inference; most non-parametric methods will work with ordinal level data, and some techniques will hold with nominal level data.
- Non-parametric methods are valid for most distributions.
- Non-parametric methods are often easier to compute.

Another factor that often limits the applicability of parametric tests based on the assumption of a normal sampling distribution is the size of the sample available for the analysis (sample size, n). We can assume that the sampling distribution is normal, even if we are not sure that the distribution of the variable in the population is normal, as long as our sample is large enough (e.g. 100 or more observations). However, if our sample is very small, then those tests can be used only if we are sure that the variable is normally distributed, and there is no way to test this assumption if the sample is small.

Despite being less fussy, non-parametric tests do have their disadvantages: they tend to be less sensitive than their parametric cousins, and may therefore fail to detect differences between groups that actually do exist.
If you have the right sort of data, it is always better to use a parametric test if you can. If in doubt, run both parametric and non-parametric tests: do they say anything different? If you are sure that a non-parametric test is the most appropriate, then use that.

Mann-Whitney U-test

The Mann-Whitney U test is used in place of the t-test when the normality assumption (for the differences between the two samples) is questionable. This test can also be applied when the observations in a sample of data are ranks, that is, ordinal data rather than direct measurements.

Instead of comparing the means of the two groups of interest, as in the case of the t-test, the Mann-Whitney U test compares their medians. It converts the scores on the continuous variable to ranks across the two groups, then evaluates whether the ranks for the two groups differ significantly. As the scores are converted to ranks, the actual distribution of the scores does not matter.

To run a Mann-Whitney U-test:

Analyze > Nonparametric Tests > 2 Independent Samples
- Move the dependent variable into the Test Variable box
- Move the independent variable into the Grouping Variable box
- Click on Define Groups and type in the numbers used in the data set to code each group. For example, if men = 1 and women = 2, type 1 in the Group 1 box and 2 in the Group 2 box.

Kruskal-Wallis test

This is a non-parametric test used to compare three or more samples. It tests the null hypothesis that all populations have identical distribution functions against the alternative hypothesis that at least two of the samples differ only with respect to location (median), if at all. It is the analogue of the F-test used in analysis of variance: while analysis of variance tests depend on the assumption that all populations under comparison are normally distributed, the Kruskal-Wallis test places no such restriction on the comparison. It is a logical extension of the Mann-Whitney U-test.
To run a Kruskal-Wallis test:

Analyze > Nonparametric Tests > K Independent Samples
- Move the dependent variable into the Test Variable List box
- Move the independent variable into the Grouping Variable box
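Following the earlier advice to run both parametric and non-parametric tests and compare them, here is an illustrative Python/scipy sketch of the Mann-Whitney U and Kruskal-Wallis tests alongside their parametric counterparts. All the scores are made up for the example.

```python
from scipy import stats

# Made-up scores for two groups (illustrative data only)
men   = [12, 15, 11, 18, 14, 13]
women = [16, 19, 17, 20, 15, 18]

# Mann-Whitney U: rank-based alternative to the independent t-test
u_stat, u_p = stats.mannwhitneyu(men, women, alternative="two-sided")
t_stat, t_p = stats.ttest_ind(men, women)        # parametric counterpart

# Kruskal-Wallis: rank-based alternative to one-way ANOVA (3+ groups)
third = [10, 12, 11, 13, 9, 12]
h_stat, h_p = stats.kruskal(men, women, third)
f_stat, f_p = stats.f_oneway(men, women, third)  # parametric counterpart

print(f"Mann-Whitney U = {u_stat}, p = {u_p:.3f} (t-test p = {t_p:.3f})")
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {h_p:.3f} (ANOVA p = {f_p:.3f})")
```

If the parametric and non-parametric results disagree, that is usually a sign that the parametric assumptions are doing real work and deserve a closer look.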
Use the s4data_c.sav file for the rest of this session.

1. Make a bar graph showing the mean life expectancy of men in the different regions of the world.
2. Make a bar graph showing all the African countries arranged by literacy rate.
3. Use Crosstabs to look at the distribution of religions in OECD and Latin American countries.
4. Make a pie chart showing the relative populations of Brazil, Argentina, Uruguay & Chile.
5. Imagine that you have to write a report on suicide around the world. What can you say about suicide in different countries using this data? Why might social scientists raise questions about the data on suicide from different countries?
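If you later want to reproduce a graph like the one in exercise 1 outside SPSS, a minimal matplotlib sketch looks like this. The region names and values are invented placeholders, not the s4data_c.sav figures.

```python
import matplotlib
matplotlib.use("Agg")   # draw off-screen; no display needed
import matplotlib.pyplot as plt

# Invented placeholder values -- the real means come from s4data_c.sav
regions = ["OECD", "East Europe", "Latin America", "Africa", "Asia"]
mean_le = [74.0, 66.0, 66.5, 52.0, 64.5]

fig, ax = plt.subplots()
ax.bar(regions, mean_le)
ax.set_ylabel("Mean male life expectancy (years)")
ax.set_title("Mean life expectancy of men by region")
fig.savefig("life_expectancy.png")
```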