Non-parametric Tests Using SPSS


Non-parametric Tests Using SPSS
Statistical Package for Social Sciences

Jinlin Fu
January 2016

Contact: Medical Research Consultancy Studio, Australia

Contents

1 INTRODUCTION
UNIVARIATE LOGISTIC REGRESSION
ASSUMPTIONS AND DATA REQUIREMENTS
TESTING FOR NORMALITY
MULTIPLE LOGISTIC REGRESSION
TESTING FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES
FISHER'S EXACT TEST FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES
TESTING A THEORETICAL MODEL (GOODNESS OF FIT) USING CHI-SQUARE
FURTHER ANALYSIS USING CHI-SQUARE
BINOMIAL TESTS USING CHI-SQUARE
RUNS TEST
ASSUMPTIONS AND DATA REQUIREMENTS
CONFIRMATION OF APPROPRIATE CUT POINTS
RUNS TEST
ONE-SAMPLE TEST
ASSUMPTIONS AND DATA REQUIREMENTS
ONE-SAMPLE KOLMOGOROV-SMIRNOV TEST
TWO-INDEPENDENT-SAMPLE TESTS
ASSUMPTIONS AND DATA REQUIREMENTS
TWO-INDEPENDENT-SAMPLE MANN-WHITNEY AND WILCOXON TESTS
TWO-INDEPENDENT-SAMPLE KOLMOGOROV-SMIRNOV TEST
MULTI-INDEPENDENT-SAMPLE TESTS
ASSUMPTIONS AND DATA REQUIREMENTS
KRUSKAL-WALLIS TEST
THE MEDIAN TEST
POST HOC TESTS
TWO-RELATED-SAMPLE TESTS
ASSUMPTIONS AND DATA REQUIREMENTS
WILCOXON SIGNED-RANKS TEST
THE SIGN TEST
MCNEMAR TEST
MULTIPLE-RELATED-SAMPLE TESTS
ASSUMPTIONS AND DATA REQUIREMENTS
FRIEDMAN TEST
KENDALL'S W TEST
COCHRAN'S Q TEST
NONPARAMETRIC CORRELATIONS ... Error! Bookmark not defined.
10.1 ASSUMPTIONS AND DATA REQUIREMENTS
SPEARMAN'S RANK CORRELATION
KENDALL'S TAU RANK CORRELATION

1 INTRODUCTION

Regression methods have become an integral component of any data analysis concerned with describing the relationship between a response (or outcome or dependent) variable and one or more explanatory (predictor or independent) variables, or covariates. It is often the case that the outcome variable is discrete, taking on two or more possible values. Over the last two decades logistic regression has become, in many fields, the standard method of analysis in this situation.

Logistic regression allows one to predict a discrete outcome such as group membership from a set of variables that may be continuous, discrete, dichotomous, or a mix. Because of its popularity in the health sciences, the discrete outcome in logistic regression is often disease/no disease. For example, can presence or absence of hay fever be diagnosed from geographic area, season, degree of nasal stuffiness, and body temperature?

Logistic regression makes no assumptions about the distributions of the predictor variables; the predictors do not have to be normally distributed, linearly related, or of equal variance within each group. Unlike multiway frequency analysis, the predictors do not need to be discrete; they can be any mix of continuous, discrete and dichotomous variables. Unlike multiple regression analysis, which also has distributional requirements for predictors, logistic regression cannot produce negative predicted probabilities. There may be two or more outcomes (groups) in logistic regression. If there are more than two outcomes, they may or may not have order (e.g., no hay fever, moderate hay fever, severe hay fever).

Logistic regression emphasizes the probability of a particular outcome for each case. For example, it evaluates the probability that a given person has hay fever, given that person's pattern of responses to questions about geographic area, season, nasal stuffiness, and temperature. Logistic regression analysis is especially useful when the distribution of responses on the dependent variable is expected to be nonlinear with one or more of the independent variables.

Because the model produced by logistic regression is nonlinear, the equations used to describe the outcomes are slightly more complex than those for multiple regression. The outcome variable, Ŷ, is the probability of having one outcome or another based on a nonlinear function of the best linear combination of predictors; with two outcomes:

Ŷi = e^u / (1 + e^u)

where Ŷi is the estimated probability that the ith case (i = 1, ..., n) is in one of the categories and u is the usual linear regression equation:

u = C + β1X1 + β2X2 + ... + βkXk

with constant C, coefficients βj, and predictors Xj for k predictors (j = 1, 2, ..., k). This linear regression equation creates the logit or log of the odds:

ln(Ŷ / (1 − Ŷ)) = C + β1X1 + β2X2 + ... + βkXk

That is, the linear regression equation is the natural log (ln) of the probability of being in one group divided by the probability of being in the other group; for example, the natural log of the probability of being in the disease group divided by the probability of being in the non-disease group.
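
To make these formulas concrete, the short Python sketch below evaluates the logistic function and the logit for a single hypothetical case; the constant, coefficients and predictor values are invented for illustration and are not taken from any data described in this document.

```python
import math

# Hypothetical fitted values chosen for illustration only (not from any data in this text)
C = -2.0                 # constant
betas = [0.8, 1.5]       # coefficients beta_1, beta_2
x = [1.2, 0.0]           # predictor values X_1, X_2 for one case

# u is the usual linear regression equation: u = C + sum(beta_j * X_j)
u = C + sum(b * xj for b, xj in zip(betas, x))

# Logistic transform: estimated probability of being in one of the two groups
y_hat = math.exp(u) / (1 + math.exp(u))

# Taking the log of the odds (the logit) recovers the linear predictor u
logit = math.log(y_hat / (1 - y_hat))

print(f"u = {u:.3f}, predicted probability = {y_hat:.3f}, logit = {logit:.3f}")
```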

Logistic regression can also be used to fit and compare models. The simplest (and worst-fitting) model includes only the constant and none of the predictors. The most complex (and "best"-fitting) model includes the constant, all predictors, and, perhaps, interactions among predictors. Often, however, not all predictors (and interactions) are related to the outcome. The researcher uses goodness-of-fit tests to choose the model that does the best job of prediction with the fewest predictors.

2 UNIVARIATE LOGISTIC REGRESSION

Logistic regression is useful for situations in which you want to be able to predict the presence or absence of a characteristic or outcome based on values of a set of predictor variables. It is similar to a linear regression model but is suited to models where the dependent variable is dichotomous. Logistic regression coefficients can be used to estimate odds ratios for each of the independent variables in the model. Logistic regression is applicable to a broader range of research situations than discriminant analysis.

2.1 ASSUMPTIONS AND DATA REQUIREMENTS

ASSUMPTIONS: Logistic regression does not rely on distributional assumptions. However, the solution may be more stable if the selected predictors have a multivariate normal distribution. Additionally, as with other forms of regression, multicollinearity among the predictors can lead to biased estimates and inflated standard errors. The procedure is most effective when group membership is a truly categorical variable; if group membership is based on values of a continuous variable (for example, "high IQ" versus "low IQ"), you should consider using linear regression to take advantage of the richer information offered by the continuous variable itself.

DATA: The dependent variable should be dichotomous. Independent variables can be interval level or categorical; if categorical, they should be dummy or indicator coded (there is an option in the procedure to recode categorical variables automatically).

The questions we want to answer are: What is the average level of weight in the study sample? To what extent does the weight measurement vary? We will now examine the variable Weight. From the toolbar menus, select Analyse > Descriptive Statistics > Explore. Add Weight to the Dependent List using the arrow button. Your screen should look like the one below:

Click OK. Your output screen should look like the output on the next page:

Interpreting the output

First table (Case Processing Summary): This provides the total number and percentage of observations and any missing values.

Second table (Descriptives): This table lists all the statistics used to describe the variable Weight. We can see that the mean weight of patients is 60.9 kg, with a standard deviation of 14.4 kg and a 95% confidence interval of 56.8 to 65.0 kg. We can also see a median weight of 59.6 kg with an IQR of 11.0 kg, the minimum (43.0 kg), the maximum (136.4 kg) and the range (93.4 kg).

The second table also includes two statistics, Skewness and Kurtosis, which are used to examine the shape of the distribution curve. The skewness and kurtosis values reported in this table are far from 0, which is strong evidence that the distribution of Weight cannot be regarded as a normal distribution.

Definition of Skewness: A measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of 0. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a guideline, a skewness value more than twice its standard error is taken to indicate a departure from symmetry.

Definition of Kurtosis: A measure of the extent to which observations cluster around a central point. For a normal distribution, the value of the kurtosis statistic is zero. Positive kurtosis indicates that the observations cluster more and have longer tails than those in the normal distribution, and negative kurtosis indicates that the observations cluster less and have shorter tails.

2.2 TESTING FOR NORMALITY

It is a good habit to examine the distributions of the continuous variables of interest before we start analysing them; doing so helps us choose the right statistical methods for the analyses we want to perform. Many statistical tests require that one or more variables be normally distributed. If a variable is normally distributed, then parametric tests apply. If that is not the case, it is inappropriate to employ parametric tests, because violating the normality assumption may produce misleading results. So, if a variable is non-normally distributed, nonparametric tests, which are as valuable as parametric ones such as the t-test and ANOVA, apply.

The question we want to answer is: Is the variable Weight normally distributed? To check this, select Analyse > Descriptive Statistics > Explore. This time we still select Weight into the Dependent List and then click on Plots. Check that Stem-and-leaf is not selected and that Histogram is selected, and then select Normality plots with tests.

Click Continue and OK. Your output screen should look like the output below:

Interpreting the output

First table (Descriptives): This table is exactly the same as the one in Section 2.1.

Second table (Tests of Normality): The table contains two formal tests for normality: the Kolmogorov-Smirnov and Shapiro-Wilk tests. The Kolmogorov-Smirnov test is only used for datasets with a large number of observations (e.g. > 5000). The Shapiro-Wilk significance level (p-value, labelled Sig.) for Weight is less than 0.001, which is significant, and the histogram (below) looks non-normally distributed. From this output we can see that the variable Weight is non-normally distributed.
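
For readers who want to reproduce this kind of normality check outside SPSS, here is a minimal Python sketch using scipy; the weight values are invented placeholders rather than the study data, and the plain Kolmogorov-Smirnov call shown here does not apply the Lilliefors correction that SPSS uses when parameters are estimated from the sample.

```python
import numpy as np
from scipy import stats

# Invented weights (kg) standing in for the variable Weight; replace with real data
weight = np.array([52.1, 58.4, 61.0, 47.3, 66.8, 55.0, 59.6, 72.4, 49.9, 63.2,
                   57.1, 60.5, 54.8, 68.9, 51.0, 62.3, 58.0, 70.2, 56.4, 95.0])

# Shapiro-Wilk test: the usual choice for small-to-moderate samples
sw_stat, sw_p = stats.shapiro(weight)
print(f"Shapiro-Wilk: W = {sw_stat:.3f}, p = {sw_p:.3f}")

# Kolmogorov-Smirnov test against a normal distribution; estimating the mean and SD
# from the same sample makes this p-value too lenient (hence SPSS's Lilliefors correction)
ks_stat, ks_p = stats.kstest(weight, "norm", args=(weight.mean(), weight.std(ddof=1)))
print(f"Kolmogorov-Smirnov: D = {ks_stat:.3f}, p = {ks_p:.3f}")

# The shape statistics discussed above
print(f"Skewness = {stats.skew(weight):.2f}, Kurtosis = {stats.kurtosis(weight):.2f}")
```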

First graph (Histogram): This graph is a visual summary of the distribution of values. The overlay of the normal curve helps you to assess the skewness and kurtosis. The histogram below shows that the distribution is not symmetric but positively (right) skewed, with a long right tail.

Second graph (Box Plot): The second graph is a box plot. Outliers are identified with a star sign (*). The variable has two outlying values, labelled 27 and 42. The label refers to the row number in the Data Editor where that observation is found.

3 ANALYSING CATEGORICAL DATA: HYPOTHESIS TESTING USING CHI-SQUARE

3.1 TESTING FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES

The Chi-square analysis can be used to determine whether there is a dependency between two categorical variables. The question we want to answer is: Is the level of exposure independent of gender?

Go to Analyse > Descriptive Statistics > Crosstabs. Put the variable Exposure into the Row box and Gender into the Column box. Click the Statistics button, select Chi-square, and then click Continue.

Click the Cells button and select Percentages: Row. Click Continue and OK. Your output screen should look like the one below:

The results of the Chi-square test do not depend on whether you place Gender or Exposure in the rows or columns; these can be switched around. However, your interpretation of the table tends to depend on the variables you have designated to be the rows and the columns. By custom, the variable you are interested in is designated to the rows. So in this example, our interest is in the level of exposure and we are investigating level of exposure by gender. Exposure is therefore designated as the row variable.

Interpreting the output

First table (Case Processing Summary): This provides the total number and percentage of observations and any missing values.

Second table (Exposure * Gender Cross tabulation): This is a cross-tabulation displaying the two variables of interest, in this case Exposure by Gender. Both the observed values and the row percentages are presented. This table is of benefit when the two variables are dependent, to interpret what the dependency between the two variables is.

Third table (Chi-Square Tests): This shows the results of the Pearson Chi-square test. The p-value that answers the question of independence is in the third column (Asymp. Sig (2-sided)) in the top row; for this test it was not significant, indicating that the two variables (Exposure and Gender) are independent.

Warning: The Chi-square test is not appropriate if the expected values are too small. SPSS will issue a warning below the third table if any of the cells have an expected value of <5. A guideline that is often used is that we should not have any cells with expected values less than 1 and at most one or two cells with expected values less than 5. Essentially you do not have enough data to reliably perform the Chi-square test, given the number of rows and columns in the table, and you do not have enough data upon which to make any reliable conclusions. In these instances you either (i) increase your sample size, (ii) reduce the number of rows and/or columns, or (iii) use a Fisher's exact test (as follows) if there are only two categories for each variable. It is important to note that columns and/or rows can only be reduced if it is theoretically valid to do so. If you require a Fisher's exact test with more than two categories in either of the variables, please contact DM&A.

3.2 FISHER'S EXACT TEST FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES

SPSS outputs the results of Fisher's exact test within the Chi-square output (Section 3.1) when each variable only has two categories. If this is the case, and the expected values of each cell are too small, use the Fisher's exact p-value instead of the Pearson Chi-Square. In the example below, the Fisher's exact p-value (use the 2-sided value) is not significant, indicating that the two variables (Insurance status and Gender) are independent.
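
The same independence check can be sketched outside SPSS; the Python example below uses scipy on an invented 2x2 table of Exposure by Gender counts, so the numbers are placeholders rather than the study data.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2x2 table of counts: rows = Exposure (no/yes), columns = Gender (male/female)
table = np.array([[12,  9],
                  [14, 15]])

# Pearson Chi-square test of independence (correction=False matches the "Pearson Chi-Square" row)
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"Chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")
print("Expected counts:\n", expected.round(2))   # check the "expected count < 5" warning

# Fisher's exact test: preferred when expected counts are small (scipy handles 2x2 tables only)
odds_ratio, p_exact = fisher_exact(table, alternative="two-sided")
print(f"Fisher's exact p (2-sided) = {p_exact:.3f}")
```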

3.3 TESTING A THEORETICAL MODEL (GOODNESS OF FIT) USING CHI-SQUARE

The question we want to answer is: Are there equal numbers of males and females? To perform this test in SPSS, go to Analyse > Nonparametric Tests > Chi-Square. Put Gender in the right-hand Test Variable List box. Your screen should look like the one below:

In the Expected Values section, All categories equal is already selected. Click OK. Your output screen should look like the one below:

Interpreting the output

First table (Gender): Provides the values expected under the assumption of equal numbers by gender. The residual values are calculated as observed - expected.

Second table (Test Statistics): This provides the Pearson Chi-square statistic, the degrees of freedom and the p-value (labelled Asymp. Sig.); here the p-value is not significant, indicating that there are equal numbers of males and females.

Now suppose that a claim has been made that there are twice as many male patients as female patients. We can use the same procedure as before, this time choosing our own expected values. If there are twice as many males as females, then the ratio of males to females is 2:1, so one-third of the patients are females and two-thirds are males. Given that there are 50 patients in total, we would then expect 50/3 ≈ 16.7 females and 16.7 x 2 ≈ 33.3 males. We enter these as expected values, entering 33.3 first since males correspond to the value 0.

Go to Analyse > Nonparametric Tests > Chi-Square and select Values. Type 33.3 and click Add, then type 16.7 and click Add. Note that the expected values must be in the same order as the categories. Your screen should look like the one below:

Click OK. Your output screen should look like the one below:

Interpreting the output

First table (Gender): Provides the values expected under the assumption of twice as many males as females. The residual values are calculated as observed - expected.

Second table (Test Statistics): This provides the Pearson Chi-square statistic, the degrees of freedom and the p-value (labelled Asymptotic Significance); here the p-value is significant, indicating that the numbers of males and females do not match the expected values that we entered.

Note: You can do a Goodness of Fit test for several variables (e.g. Gender, Exposure and Group) in the same procedure by selecting all the variables of interest, as shown in the following window. A scripted sketch of the same goodness-of-fit idea is given below.
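
As a rough cross-check of the goodness-of-fit idea, the sketch below runs the same two comparisons in Python with scipy; the observed counts of 30 males and 20 females are invented for illustration, while the 33.3/16.7 expected split follows the 2:1 claim discussed above.

```python
from scipy.stats import chisquare

# Observed counts of males and females (hypothetical values for 50 patients)
observed = [30, 20]          # males, females

# (a) All categories equal: expected counts default to 25/25
print(chisquare(f_obs=observed))

# (b) Claimed ratio of 2 males : 1 female -> expected 33.3 males and 16.7 females
expected = [50 * 2 / 3, 50 * 1 / 3]
stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"Chi-square = {stat:.3f}, p = {p:.3f}")
```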

Click OK. Your output screen should look like the one below:

3.4 FURTHER ANALYSIS USING CHI-SQUARE

We have already seen how to test for independence between two categorical variables using Chi-square (Section 3.1). If we are asked, "Does insurance status change the situation; is the level of exposure still independent of gender?", we have to further cross-classify by whether patients had public or private insurance. Go to Analyse > Descriptive Statistics > Crosstabs. Put the variable Exposure into the Row box and Gender into the Column box, and this time also put Insurance into the Layer 1 of 1 box. Make the other selections as you did in Section 3.1, and then click OK.

Your output screen should look like the one below:

Interpreting the output

First table (Exposure * Gender * Insurance Cross tabulation): This is a cross-tabulation displaying the two variables of interest, in this case Exposure by Gender at different Insurance levels. Both the observed values and the row percentages are presented. This table is of benefit when the two variables are dependent, to interpret what the dependency between the two variables is at each level of Insurance.

Second table (Chi-Square Tests): This shows the results of the Pearson Chi-square test. The p-values that answer the question of independence are in the third column (Asymp. Sig (2-sided)) in the top row; for this test the result was not significant for public insurance holders and significant for private insurance holders. This indicates that the two variables (Exposure and Gender) are independent for public, but not for private, insurance holders.

Warning: Because 83.3% of cells have an expected count less than 5, you either (i) increase your sample size, (ii) reduce the number of rows and/or columns (noting that columns and/or rows can only be reduced if it is theoretically valid to do so), or (iii) use a Fisher's exact test if there are only two categories for each variable, if you are still using SPSS (version 15 or earlier). If you require a Fisher's exact test with more than two categories in either of the variables, you can go to the IBM website for a solution (IBM has issued an SPSS Exact Tests program on its website), or contact DM&A.

3.5 BINOMIAL TESTS USING SPSS

The Binomial Test procedure compares the observed frequencies of the two categories of a dichotomous variable to the frequencies that are expected under a binomial distribution with a specified probability parameter. A dichotomous variable is a variable that can take only two possible values: yes or no, true or false, 0 or 1, and so on. If a variable is not dichotomous, you must specify a cut point. The cut point assigns cases with values that are greater than the cut point to one group and assigns the rest of the cases to another group.

In the general population, the prevalence of a disease is about 21%, and we are going to test whether our sample patients have a higher rate of the disease. The question we want to answer is: Is the disease prevalence in the study sample the same as that in the general population?

Go to Analyse > Nonparametric Tests > Binomial, and then put Disease into the Test Variable List box. By default, the probability parameter for both groups is 0.5, although this may be changed. To change the probability, you enter a Test proportion for the first group (for this exercise we enter 0.21). The probability for the second group is equal to 1 minus the probability for the first group (1 - 0.21 = 0.79). Click on Options and then tick Descriptive under Statistics. Click Continue and then OK to run the procedure. The outputs are then displayed in the Output window as shown on the next page.

Interpreting the output

First table (Descriptive Statistics): This table displays basic information about the variable Disease. It gives the number of cases (N = 50), the proportion of disease in the sample patients (Mean = 0.28), the standard deviation of the proportion (Std. Deviation = 0.454), and the smallest and largest possible values of the proportion (Minimum = 0 and Maximum = 1).

Second table (Binomial Test): This shows the results of the Binomial test. The first three columns show the category, number of cases and observed proportion at each level of Disease. The fourth column gives the test or reference proportion that you have entered. The p-value that answers the question of whether the prevalence of disease in the sample patients (0.28 or 28%) is different from that of the population (0.21 or 21%) is in the fifth column [Asymp. Sig. (1-tailed)] in the top row; for this test it was not significant, indicating that there is no statistically significant difference in proportions between the sample patients and the population.

We can also use this test to check the proportions at each level of another variable. Say, for the above test, we want to know whether this still holds for males and females separately. Go to Data > Split File, click Compare groups and Sort the file by grouping variables. Put Gender into the Groups Based on box as shown on the next page and then click OK.

Repeat the procedure as we did above for the Binomial Test. Your output screen should look like the one below:

Interpreting the output

First table (Descriptive Statistics): This table displays the basic information for Disease by Gender. The proportions of disease in male and female patients were 0.26 and 0.30, or 26% and 30%, respectively.

Second table (Binomial Test): This shows the results of the Binomial test by Gender. The p-values that answer the question of whether the prevalence of disease in the sample patients (0.26 in males and 0.30 in females) is different from that of the population (0.21 for both males and females) are in the fifth [Exact Sig. (1-tailed)] or sixth column [Asymp. Sig. (1-tailed)] in the top row; for this test neither was significant, indicating that there is no statistically significant difference in proportions between the sample patients and the population, for either males or females.
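
For comparison, the exact binomial calculation can be sketched in Python; the counts below (14 diseased patients out of 50, i.e. the 28% reported above) follow the figures quoted in the text, and the one-tailed alternative mirrors the one-tailed significance reported by SPSS.

```python
from scipy.stats import binomtest

# 14 of 50 patients with disease (28%), tested against a population prevalence of 21%
result = binomtest(k=14, n=50, p=0.21, alternative="greater")
print(f"Observed proportion = {14 / 50:.2f}, exact one-tailed p = {result.pvalue:.3f}")
```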

When the number of cases in one category of gender is less than 30, the Exact Sig. (exact significance level) will be displayed instead of the Asymp. Sig. for that category.

Most often we also want to know whether the patients with disease are heavier, that is, whether the patients were more obese in terms of BMI, than those without disease. So the question is: do those who had the disease tend to be above or below the cut-off value of 25.0 for overweight and obesity?

First, we split our data: go to Data > Split File, click Compare groups and Sort the file by grouping variables. Put Disease into the Groups Based on box and then click OK.

Next, we employ the Binomial test to answer the above question. Go to Analyse > Nonparametric Tests > Binomial, and then put BMI into the Test Variable List box. Enter 24.99 in the Cut Point box and keep the Test Proportion box at its default. (I will explain later, in the output section, why I entered 24.99 as the cut point instead of 25.) Click Options to select Quartiles, and then click Continue. Click OK to run the test. Your output screen should look like the one below:

Interpreting the output

First table (Descriptive Statistics): This table displays the basic information for BMI by Disease. It gives the number of cases (N) and the percentiles [including the 25th, 50th (median) and 75th percentiles] in each category of Disease (Yes and No).

Second table (Binomial Test): This shows the results of the Binomial test. BMI was classified into two subgroups based on the cut-off value (25.0), as shown in the first column: Group 1 contains all cases with BMI less than 25.0 and Group 2 all cases with BMI equal to or greater than 25.0. (If we had entered 25 as the cut point in the Binomial Test window, Group 2 would exclude the value of 25.0. By entering 24.99, the smallest value in Group 2 must be greater than 24.99, which in this data set means it is equal to 25.0, because the next value below 25.0 is 24.9, which is smaller than 24.99. So, with a cut point of 24.99, all BMI values in Group 2 are equal to or greater than 25.0, as we intended.) This table also shows the number of cases (N) and the proportions (Observed Prop.) in each group for each Disease status.

The proportions of cases with BMI equal to or greater than 25.0 (Group 2) in the Disease (Yes) and non-disease (No) groups are 0.21 and 0.44, respectively, but this alone does not tell us whether the difference is statistically significant. The p-values are shown in the fifth [Asymp. Sig. (1-tailed)] and sixth [Exact Sig. (1-tailed)] columns; they indicate whether the proportions of cases in the two BMI groups differ for each Disease status. The outputs show that the proportions of cases on either side of the BMI cut-off of 25.0 were not significantly different, either in diseased cases (p = 0.608) or in non-diseased cases (p = 0.057); that is, based on the current data, there is no evidence that the diseased patients are more overweight or obese.
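
The cut-point rule described above (cases greater than the cut point go to Group 2, the rest to Group 1) can be illustrated with a few lines of Python; the BMI values are invented, except that they deliberately include 24.9 and 25.0 to show why 24.99 is entered rather than 25.

```python
import numpy as np

# Invented BMI values; the text notes the data contain 24.9 and nothing between 24.9 and 25.0
bmi = np.array([21.4, 23.0, 24.9, 25.0, 26.7, 29.3, 31.8])

def split_by_cut_point(values, cut):
    """Group 1: values <= cut point; Group 2: values > cut point (the rule the text describes)."""
    return values[values <= cut], values[values > cut]

for cut in (25.0, 24.99):
    g1, g2 = split_by_cut_point(bmi, cut)
    print(f"cut point {cut}: group 1 = {list(g1)}, group 2 = {list(g2)}")

# With cut = 25.0 the value 25.0 falls into group 1; with cut = 24.99 it moves to group 2,
# so group 2 becomes "BMI >= 25.0", matching the overweight/obesity definition.
```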

4 RUNS TEST

The Runs Test procedure tests whether the order of occurrence of two values of a variable is random. A run is a sequence of like observations. A sample with too many or too few runs suggests that the sample is not random.

4.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: Nonparametric tests do not require assumptions about the shape of the underlying distribution. Use samples from continuous probability distributions.

Data: The variables must be numeric. To convert string variables to numeric variables, use the Automatic Recode procedure, which is available on the Transform menu.

Many statistical tests assume that the observations in a sample are independent; in other words, that the order in which the data were collected is irrelevant. If the order does matter, then the sample is not random, and you cannot draw accurate conclusions about the population from which the sample was drawn. Therefore, it is prudent to check the data for a violation of this important assumption. You can use the Runs Test procedure to test whether the order of values of a variable is random. The procedure first classifies each value of the variable as falling above or below a cut point and then tests whether there is any order to the resulting sequence.

4.2 CONFIRMATION OF APPROPRIATE CUT POINTS

The cut point is based either on a measure of central tendency (mean, median, or mode) or a custom value. You can obtain descriptive statistics and/or quartiles of the test variable. Go to Graphs > Chart Builder, and then in the Choose from box select the Histogram gallery and choose the first, Simple Bar. Select the variable Status as the x axis and click OK. The bar chart of the test variable appears in the output window as shown below. The scale for Status theoretically ranges from 0 to 20, where 0 = very poor health and 20 = very good health. The actual range of scores is narrower, running from a low of 6 to a high of 14. The histogram shows that Status is non-normally distributed, so we choose the median as the cut point.

[Histogram of Status: Mean = 9.52, Std. Dev. = 2.332, N = 50]

4.3 RUNS TEST

The question we want to answer is: Is the order of values of Status random?

Go to Analyse > Nonparametric Tests > Runs. The median is selected by default, so keep it as it is. Put Status into the Test Variable List box and then click Options. Select Descriptive and Quartiles, and then click Continue. Back in the Runs Test dialog box, click OK to run the test. Your output screen should appear like the one below:

Please note: before you run the procedure, make sure that your records are sorted by their study order (study ID, the order in which the participants actually entered the study).

Interpreting the output

First table (Descriptive Statistics): The statistics table will help you understand more about the distribution of Status by displaying its basic information. While the default table is very wide, you can easily pivot it to column format by following the steps below:

Double-click the table to activate it. From the Viewer menus choose: Pivot > Transpose Rows and Columns. The table is then transposed from rows into columns, as shown below:

Second table (Runs Test): This shows the results of the Runs test. The test value is used as a cut point to dichotomize the sample; in this table, the cut point is the sample median. Of the 50 patients, 21 scored below the median (Cases < Test Value). Think of them as the "negative" cases. The remaining 29 patients (Cases >= Test Value) scored at or above the median. Think of them as the "positive" cases.

The next statistic is a count of the observed runs (Number of Runs) in the test variable. A run is defined as a sequence of cases on the same side of the cut point. If the order of Status were purely random with respect to the median value, you would expect about 26 runs across these 50 cases. Because fewer runs than this were observed, the Z statistic is negative. The 2-tailed significance value [Asymp. Sig. (2-tailed)] is the probability of obtaining a Z statistic as extreme as, or more extreme than (in absolute value), the obtained value, if the order of Status above and below the median is purely random. In other words, the 2-tailed significance value (here p = 0.201) means you cannot reject the null hypothesis that the order of Status is random with respect to the median value of 9. That is to say, the order of Status is random with a cut point of 9.
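
The Runs Test itself is simple enough to sketch by hand; the Python function below implements the large-sample Z approximation around the median (the Wald-Wolfowitz runs test) on invented health-status scores, so the numbers will not match the SPSS output described above.

```python
import numpy as np
from scipy.stats import norm

def runs_test(x, cut=None):
    """Wald-Wolfowitz runs test around a cut point (default: the sample median)."""
    x = np.asarray(x, dtype=float)
    if cut is None:
        cut = np.median(x)
    above = x >= cut                                   # SPSS counts "Cases >= Test Value" as one side
    n1, n2 = int(above.sum()), int((~above).sum())
    runs = 1 + int(np.sum(above[1:] != above[:-1]))    # number of runs in the observed sequence
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / np.sqrt(variance)
    p = 2 * (1 - norm.cdf(abs(z)))                     # two-tailed p-value
    return runs, expected, z, p

# Invented health-status scores in the order participants entered the study
status = [8, 9, 11, 7, 12, 10, 6, 13, 9, 8, 11, 10, 7, 12, 9, 14, 8, 10, 11, 9]
runs, expected, z, p = runs_test(status)
print(f"runs = {runs}, expected = {expected:.1f}, Z = {z:.2f}, p = {p:.3f}")
```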

5 ONE-SAMPLE TEST

The One-Sample Kolmogorov-Smirnov Test procedure compares the observed cumulative distribution function for a variable with a specified theoretical distribution, which may be normal, uniform, Poisson, or exponential. The Kolmogorov-Smirnov Z is computed from the largest difference (in absolute value) between the observed and theoretical cumulative distribution functions. This goodness-of-fit test tests whether the observations could reasonably have come from the specified distribution.

5.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: The Kolmogorov-Smirnov test assumes that the parameters of the test distribution are specified in advance. This procedure, however, estimates the parameters from the sample: the sample mean and sample standard deviation are the parameters for a normal distribution, the sample minimum and maximum values define the range of the uniform distribution, the sample mean is the parameter for the Poisson distribution, and the sample mean is the parameter for the exponential distribution. Because the parameters are estimated from the data, the power of the test to detect departures from the hypothesized distribution may be seriously diminished. For testing against a normal distribution with estimated parameters, consider the adjusted K-S Lilliefors test (available in the Explore procedure).

Data: Use quantitative variables (interval or ratio level of measurement).

5.2 ONE-SAMPLE KOLMOGOROV-SMIRNOV TEST

The question we want to answer is: Is the variable Status normally distributed, or does it follow one of the other particular distributions? Let us take Status as an example. Go to Analyse > Nonparametric Tests > 1-Sample K-S, and then move Status into the Test Variable List box. Tick all four options in the Test Distribution box and then click Options. Tick Descriptive and Quartiles and then click Continue. Finally, click OK to run the procedure.

Your output screen should look like the two outputs below:

Interpreting the outputs

First table (Descriptive Statistics): This table will help you understand more about the distribution of Status in these data by displaying its basic information. The information includes the number of cases (N), Mean, Standard Deviation, Minimum, Maximum and Percentiles (25th, 50th and 75th).

Second table (One-Sample Kolmogorov-Smirnov Test): This is the default test, the test against a normal distribution. The table shows that the Normal distribution is indexed by two parameters, the mean and standard deviation. The average Status score of the sample is about 9.52, with an SD of about 2.33.

The next three rows fall under the general category Most Extreme Differences. The differences referred to are the largest positive and negative points of divergence between the empirical and theoretical cumulative distribution functions (CDFs). The first difference value, labelled Absolute, is the absolute value of the larger of the two difference values printed directly below it. This value will be required to calculate the test statistic. The Positive difference is the point at which the empirical CDF exceeds the theoretical CDF by the greatest amount.

At the opposite end of the continuum, the Negative difference is the point at which the theoretical CDF exceeds the empirical CDF by the greatest amount. The Z test statistic is the product of the square root of the sample size and the largest absolute difference between the empirical and theoretical CDFs. Unlike much statistical testing, a significant result here is bad news: it means the hypothesized distribution does not fit the data.

The probability of the Z statistic is above 0.05, meaning that the Normal distribution with parameters 9.52 ± 2.33 is a good fit for Status.

Third and fourth tables (One-Sample Kolmogorov-Smirnov Test 2 and 3): These two tables convey similar messages to the second table, for the Uniform and Poisson distributions.

Fifth table (One-Sample Kolmogorov-Smirnov Test 4): This table shows that the probabilities of the Z statistic are all below 0.05, meaning that the Exponential distribution with a parameter of 9.52 (the mean) is not a good fit for Status.

The above outputs indicate that Status is well fitted by the Normal, Uniform and Poisson distributions, but not by the Exponential distribution.
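
The same four one-sample comparisons can be sketched with scipy's kstest; the status scores below are invented, and the parameters are estimated from the sample exactly as the SPSS procedure does (so the same caveat about reduced power applies).

```python
import numpy as np
from scipy import stats

# Invented scores standing in for Status
status = np.array([8, 9, 11, 7, 12, 10, 6, 13, 9, 8, 11, 10, 7, 12, 9, 14, 8, 10, 11, 9],
                  dtype=float)
mean, sd = status.mean(), status.std(ddof=1)

# Normal distribution with the sample mean and SD as parameters
print("Normal:     ", stats.kstest(status, "norm", args=(mean, sd)))

# Uniform distribution over the sample range: scipy uses (loc, scale) = (min, max - min)
print("Uniform:    ", stats.kstest(status, "uniform",
                                   args=(status.min(), status.max() - status.min())))

# Poisson with the sample mean; the K-S test is really meant for continuous distributions,
# so this comparison (like SPSS's) is only approximate
print("Poisson:    ", stats.kstest(status, stats.poisson(mean).cdf))

# Exponential with the sample mean as the scale parameter
print("Exponential:", stats.kstest(status, "expon", args=(0, mean)))
```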

6 TWO-INDEPENDENT-SAMPLE TESTS

The Two-Independent-Samples Tests procedure compares two groups of cases on one variable. The nonparametric tests for two independent samples are useful for determining whether or not the values of a particular variable differ between two groups. This is especially true when the assumptions of the t test are not met. Suppose we want to know whether height differs between private and public patients. In other words: does the categorical (independent) variable Insurance affect the continuous (dependent) variable Height?

6.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: Use independent, random samples. The Mann-Whitney U test requires that the two tested samples be similar in shape; that is, the variable you are testing should be at least ordinal and its distribution should be similar in both groups.

Data: Use numeric variables that can be ordered. We will assume that the independence assumption is met by the design of the experiment.

6.2 TWO-INDEPENDENT-SAMPLE MANN-WHITNEY AND WILCOXON TESTS

The Mann-Whitney and Wilcoxon statistics can be used to test the null hypothesis that two independent samples come from the same population. Their advantage over the independent-samples t test is that Mann-Whitney and Wilcoxon do not assume normality and can be used to test ordinal variables. Go to Analyse > Nonparametric Tests > 2 Independent Samples. In the dialogue box, add the dependent variable Height to the Test Variable List box. Add the independent variable Insurance to the Grouping Variable box. Click on Define Groups and in Group 1 type 1 and in Group 2 type 2.

Click Continue and OK. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value. First, each case is ranked without regard to group membership. Cases tied on a particular value receive the average rank for that value (Mean Rank). After ranking the cases, the ranks are summed within groups (Sum of Ranks).

Second table (Test Statistics): presents the results of the Mann-Whitney U test (otherwise known as the Mann-Whitney-Wilcoxon test or the Wilcoxon rank-sum test). The p-value [Asymp. Sig. (2-tailed)] is 0.041, implying that insurance does have an effect on height.

Note: you can do this test for several variables in the same run by adding them all to the same box as Height. Outputs will be displayed in different panels by variable name.
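
Outside SPSS, the same comparison can be sketched with scipy's Mann-Whitney U test; the heights below are invented values for private and public patients, not the study data.

```python
from scipy.stats import mannwhitneyu

# Invented heights (cm) for private (group 1) and public (group 2) patients
private = [172.0, 168.5, 181.2, 165.0, 177.3, 170.1, 174.8]
public  = [160.2, 166.4, 158.9, 171.0, 163.5, 167.8, 162.2, 169.0]

u_stat, p = mannwhitneyu(private, public, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, two-tailed p = {p:.4f}")
```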

6.3 TWO-SAMPLE KOLMOGOROV-SMIRNOV TEST

The two-sample Kolmogorov-Smirnov test tests the null hypothesis that two samples have the same distribution. It is a very flexible test because no specific shape is assumed for the underlying distribution. However, because the test makes no assumptions, it is sensitive to differences in both location and scale. You may want to center the test variable if you are not interested in location differences; additionally, you may want to standardize the test variable to remove both location and scale differences.

Go to Analyse > Nonparametric Tests > 2 Independent Samples. In the dialogue box, add the dependent variable Height to the Test Variable List box. Add the independent variable Gender to the Grouping Variable box. Do the same as you did in Section 6.2. Deselect Mann-Whitney U, and select Kolmogorov-Smirnov Z. Then click OK to proceed. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value.

Second table (Test Statistics): presents the results of the Kolmogorov-Smirnov test. The p-value [Asymp. Sig. (2-tailed)] is 0.208, well above 0.05, implying that the distributions of height for the two genders are not significantly different from each other by that standard.
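
A two-sample Kolmogorov-Smirnov comparison is equally short in Python; again the male and female heights below are invented placeholders.

```python
from scipy.stats import ks_2samp

# Invented heights (cm) for male and female patients
males   = [172.0, 168.5, 181.2, 165.0, 177.3, 170.1, 174.8, 179.5]
females = [160.2, 166.4, 158.9, 171.0, 163.5, 167.8, 162.2, 169.0]

d_stat, p = ks_2samp(males, females)
print(f"Kolmogorov-Smirnov D = {d_stat:.3f}, p = {p:.3f}")
```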

7 MULTI-INDEPENDENT-SAMPLE TESTS

The Tests for Several Independent Samples procedure compares two or more groups of cases on one variable. The nonparametric tests for multiple independent samples are useful for determining whether or not the values of a particular variable differ between two or more groups. This is especially true when the assumptions of ANOVA are not met. Suppose we want to know whether BMI differs depending on which year patients were born in. In other words: does the categorical (independent) variable Year_born affect the continuous (dependent) variable BMI?

7.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: Use independent, random samples. The Kruskal-Wallis H test requires that the tested samples be similar in shape.

Data: Use numeric variables that can be ordered. We will assume that the independence assumption is met by the design of the experiment.

7.2 KRUSKAL-WALLIS TEST

The Kruskal-Wallis test is a one-way analysis of variance by ranks. It tests the null hypothesis that multiple independent samples come from the same population. Unlike standard ANOVA, it does not assume normality, and it can be used to test ordinal variables. Go to Analyse > Nonparametric Tests > K Independent Samples. In the dialogue box, add the dependent variable BMI to the Test Variable List box. Add the independent variable Year_born to the Grouping Variable box. Click on Define Range and type 1997 for Minimum and 2000 for Maximum.

Click Continue, keep the Test Type as it is (default = Kruskal-Wallis H) and click OK. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value.

Second table (Test Statistics): presents the results of the Kruskal-Wallis test. The p-value (Asymp. Sig.) is 0.038, implying that the year of birth does have an effect on BMI.
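
A Kruskal-Wallis comparison across the four birth years can be sketched as follows; the BMI values are invented and grouped only to illustrate the call.

```python
from scipy.stats import kruskal

# Invented BMI values grouped by year of birth (1997-2000)
bmi_1997 = [22.1, 24.5, 26.0, 23.3, 25.2]
bmi_1998 = [21.0, 22.8, 20.5, 23.9, 22.2]
bmi_1999 = [27.4, 25.9, 28.8, 26.1, 24.7]
bmi_2000 = [23.5, 22.0, 24.9, 21.8, 23.1]

h_stat, p = kruskal(bmi_1997, bmi_1998, bmi_1999, bmi_2000)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p:.3f}")
```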

7.3 THE MEDIAN TEST

The median test tests the null hypothesis that two or more independent samples have the same median. It assumes nothing about the distribution of the test variable, making it a good choice when you suspect that the distribution varies by group. Go to Analyse > Nonparametric Tests > K Independent Samples. Do the same as you did in Section 7.2, but this time deselect Kruskal-Wallis H, and select Median as the test type. Click Options and select Quartiles in the Statistics group. Click Continue to go back to the Tests for Several Independent Samples dialog box and then click OK to run the analysis. Your output screen should look like the one below:

Interpreting the output

First table (Descriptive Statistics): presents the number of cases for each of the variables of interest and their percentiles.

Second table (Frequencies): presents the figures used to calculate the p-value, by Year_born and the cut point (median) of BMI.

Third table (Test Statistics): presents the results of the Median test. The p-value (Asymp. Sig.) is 0.024, implying that BMI differs among the years of birth.

7.4 POST HOC TESTS

SPSS does not have a convenient tool for non-parametric post hoc testing. To find where the differences are, use multiple Mann-Whitney tests to compare each pair of categories. Because there are multiple comparisons here, the significance level is no longer 0.05, but rather 0.05 divided by the number of possible pairs. In the example above, there are six possible pairs, so the significance level should be 0.05 / 6 ≈ 0.0083. A sketch of this pairwise comparison approach is given below.
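
The median test and the Bonferroni-adjusted pairwise Mann-Whitney follow-up described above can be sketched together in Python; the groups reuse the invented BMI values from the Kruskal-Wallis sketch, and scipy's median_test stands in for SPSS's Median test.

```python
from itertools import combinations
from scipy.stats import median_test, mannwhitneyu

# Invented BMI values by year of birth (see the Kruskal-Wallis sketch above)
groups = {
    1997: [22.1, 24.5, 26.0, 23.3, 25.2],
    1998: [21.0, 22.8, 20.5, 23.9, 22.2],
    1999: [27.4, 25.9, 28.8, 26.1, 24.7],
    2000: [23.5, 22.0, 24.9, 21.8, 23.1],
}

# Median test across all four groups
stat, p, grand_median, table = median_test(*groups.values())
print(f"Median test: chi-square = {stat:.2f}, p = {p:.3f}, grand median = {grand_median}")

# Post hoc: pairwise Mann-Whitney tests with a Bonferroni-adjusted alpha
pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)                 # 6 pairs -> 0.05 / 6 ~ 0.0083
for a, b in pairs:
    _, p_pair = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
    flag = "significant" if p_pair < alpha else "not significant"
    print(f"{a} vs {b}: p = {p_pair:.4f} ({flag} at alpha = {alpha:.4f})")
```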

8 TWO-RELATED-SAMPLE TESTS

The Two-Related-Samples Tests procedure compares the distributions of two variables. The nonparametric tests for two related samples allow you to test for differences between paired scores when you cannot (or would rather not) make the assumptions required by the paired-samples t test. Procedures are available for testing nominal, ordinal, or scale variables.

8.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: Although no particular distributions are assumed for the two variables, the population distribution of the paired differences is assumed to be symmetric.

Data: Use numeric variables that can be ordered.

8.2 THE WILCOXON SIGNED-RANKS TEST

The Wilcoxon signed-ranks test is a non-parametric version of the paired-samples t-test. It is used when you have non-parametric data for one group of people measured over two time periods, or under two different conditions, where a dependency exists between the two measures and the test must account for this dependency. The question we want to answer is: Is there a difference in BMI measurements before and after exposure? Go to Analyse > Nonparametric Tests > 2 Related Samples. Click on BMI and BMI2 in the list on the left, then click the arrow button to add them to the box on the right (Paired Variables).

Click OK. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value.

Second table (Test Statistics): The results of the non-parametric paired-samples test are displayed here. The p-value (Asymp. Sig.) is < 0.001, implying that there is a difference in BMI depending on exposure.

8.3 THE SIGN TEST

The sign test, like the Wilcoxon signed-ranks test, is a nonparametric statistic that can be used with an ordinally (or above) scaled dependent variable when the independent variable has two levels and the participants have been matched or the samples are correlated. Thus, it is useful when a t-test cannot be employed because its assumptions have been violated. The sign test uses only directional information, while the Wilcoxon test uses both direction and magnitude information. Thus the Wilcoxon test is statistically more powerful than the sign test. However, the Wilcoxon test assumes that the difference between pairs of scores is ordinally scaled, and this assumption is difficult to test.

We repeat the test in Section 8.2 using the sign test. Go to Analyse > Nonparametric Tests > 2 Related Samples. Do the same as you did in Section 8.2, but this time deselect Wilcoxon, and select Sign as the test type, as shown below. Click OK to proceed. Your output screen should look like the one below:

Interpreting the output

First table (Frequencies): presents the figures used to calculate the p-value.

Second table (Test Statistics): The results of the non-parametric paired-samples test are displayed here. The p-value (Asymp. Sig.) is 0.010, implying that there is a difference in BMI depending on exposure.

8.4 THE MCNEMAR TEST

The McNemar test tests the null hypothesis that binary responses are unchanged. As with the Wilcoxon test, the data may be from a single sample measured twice or from two matched samples. The McNemar test is particularly appropriate with nominal or ordinal test variables. The question we want to answer is: Is there a difference in re-admission before and after intervention? Go to Analyse > Nonparametric Tests > 2 Related Samples. Click on Before and After in the list on the left. Deselect Wilcoxon, select McNemar as the test type and then click OK. Your output screen should look like the one below:

Interpreting the output

First table (Re-admission before intervention & Re-admission after intervention): presents the figures used to calculate the p-value.

Second table (Test Statistics): The results of the non-parametric paired-samples test are displayed here. The p-value (Asymp. Sig.) is 0.541, implying that there is no difference in re-admission depending on intervention; that is, re-admission has not changed after the intervention.
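
The three related-samples procedures in this chapter can all be sketched briefly in Python; the paired BMI values and the re-admission table below are invented, the sign test is obtained as a binomial test on the signs of the paired differences, and the McNemar test uses statsmodels.

```python
import numpy as np
from scipy.stats import wilcoxon, binomtest
from statsmodels.stats.contingency_tables import mcnemar

# Invented paired BMI measurements before and after exposure
bmi_before = np.array([24.1, 27.3, 22.8, 30.2, 26.5, 23.9, 28.4, 25.0, 29.1, 21.7])
bmi_after  = np.array([23.0, 26.1, 22.9, 28.5, 25.0, 23.1, 27.0, 24.2, 27.8, 21.5])

# Wilcoxon signed-ranks test (uses direction and magnitude of the paired differences)
w_stat, w_p = wilcoxon(bmi_before, bmi_after)
print(f"Wilcoxon signed-ranks: W = {w_stat:.1f}, p = {w_p:.4f}")

# Sign test (uses direction only): binomial test on the number of positive differences
diffs = bmi_before - bmi_after
n_pos, n_nonzero = int((diffs > 0).sum()), int((diffs != 0).sum())
print(f"Sign test p = {binomtest(n_pos, n_nonzero, p=0.5).pvalue:.4f}")

# McNemar test for paired binary responses (re-admission before vs after intervention)
# 2x2 table of counts: rows = before (no/yes), columns = after (no/yes); invented counts
readmission = [[20, 6],
               [9, 15]]
res = mcnemar(readmission, exact=True)
print(f"McNemar exact p = {res.pvalue:.4f}")
```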

9 MULTIPLE-RELATED-SAMPLE TESTS

The Tests for Several Related Samples procedure compares the distributions of two or more variables. The nonparametric tests for multiple related samples are useful alternatives to a repeated-measures analysis of variance. They are especially appropriate for small samples and can be used with nominal or ordinal test variables.

9.1 ASSUMPTIONS AND DATA REQUIREMENTS

Assumptions: Nonparametric tests do not require assumptions about the shape of the underlying distribution. Use dependent, random samples.

Data: Use numeric variables that can be ordered.

9.2 FRIEDMAN TEST

The Friedman procedure tests the null hypothesis that multiple ordinal responses come from the same population. As with the Wilcoxon test for two related samples, the data may come from repeated measures of a single sample or from the same measure for multiple matched samples. An insurance group is evaluating four health care plans for customers. The fifty patients are asked to rank the plans by how much they would prefer to accept them. The question we want to answer is: Is there a difference in preference for the four health care plans? Go to Analyse > Nonparametric Tests > K Related Samples. Click on Plan 1-4 in the list on the left, and then click the arrow button to add them to the box on the right (Test Variables), as shown below. Click OK. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value.

Second table (Test Statistics): The results of the non-parametric Friedman test are displayed here. The p-value (Asymp. Sig.) is < 0.001, implying that there is a difference in preference for the health care plans; that is, the fifty patients do not have equal preference for all four health care plans.

9.3 KENDALL'S W TEST

Kendall's W test is a normalization of the Friedman statistic. Kendall's W is used to assess the degree of agreement among the respondents. Kendall's W ranges from 0 to 1: a value of 1 indicates complete agreement among the raters, and a value of 0 indicates no agreement. Go to Analyse > Nonparametric Tests > K Related Samples. Click on Plan 1-4 in the list on the left, and then click the arrow button to add them to the box on the right (Test Variables). This time deselect Friedman, and select Kendall's W instead. Click OK to proceed. Your output screen should look like the one below:

Interpreting the output

First table (Ranks): presents the figures used to calculate the p-value.

Second table (Test Statistics): The results of the non-parametric Kendall's W test are displayed here. Kendall's Coefficient of Concordance is 0.281, with the corresponding Chi-square statistic on 3 degrees of freedom. The p-value (Asymp. Sig.) is < 0.001, so the test rejects the null hypothesis of no agreement: the agreement among the patients' rankings is statistically significant (p < 0.001). That is to say, the levels of preference for the four health care plans differ among the 50 patients.

9.4 COCHRAN'S Q TEST

The Cochran Q procedure tests the null hypothesis that multiple related proportions are the same; that is, it is used for dichotomous variables that share the same coding. The Cochran test is a multivariate extension of the McNemar test used for two related samples. Fifty patients are asked to perform five tasks on the site, all of which are designed to be equally easy. The question we want to answer is: Is there a difference in the success rates of the 5 tasks?

Go to Analyse > Nonparametric Tests > K Related Samples. Click on Task1 through Task5 in the list on the left and then click the arrow button to add all five of them to the box on the right (Test Variables). Deselect Friedman, select Cochran's Q as the test type and then click Statistics. Tick Descriptive and then click Continue. Click OK to proceed. Your output screen should look like the one below:

Interpreting the output

First table (Descriptive Statistics): presents basic statistics for the 5 tasks. The means here stand for the proportions of users who succeeded at each task.

Second table (Frequencies): presents the figures used to calculate the p-value.

Third table (Test Statistics): The results of the non-parametric Cochran's Q test are displayed here. Cochran's Q is 0.985, with 4 degrees of freedom. The p-value (Asymp. Sig.) is 0.912, implying that all tasks have an equal number of successes; that is, to answer our question, there is no significant difference in the success rates among the five tasks completed by the fifty patients.
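
Finally, the Friedman test, Kendall's W and Cochran's Q can be sketched together; the preference ranks and task-success indicators below are invented, Kendall's W is obtained by normalising the Friedman chi-square as W = chi-square / (n(k - 1)), and Cochran's Q is computed directly from its textbook formula rather than via an SPSS procedure.

```python
import numpy as np
from scipy.stats import friedmanchisquare, chi2

# Invented preference ranks: rows = patients, columns = the four health care plans
ranks = np.array([
    [1, 3, 2, 4],
    [2, 3, 1, 4],
    [1, 4, 2, 3],
    [1, 3, 2, 4],
    [2, 4, 1, 3],
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 3, 1, 4],
])
n, k = ranks.shape

# Friedman test on the k related columns
f_stat, f_p = friedmanchisquare(*ranks.T)
print(f"Friedman chi-square = {f_stat:.2f}, p = {f_p:.4f}")

# Kendall's W is the Friedman statistic normalised to the 0-1 range
w = f_stat / (n * (k - 1))
print(f"Kendall's W = {w:.3f}")

# Cochran's Q for k related dichotomous variables (invented task-success data: 1 = success)
tasks = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
])
col_sums, row_sums = tasks.sum(axis=0), tasks.sum(axis=1)
k_tasks, grand = tasks.shape[1], tasks.sum()
q = (k_tasks - 1) * (k_tasks * np.sum(col_sums ** 2) - grand ** 2) / (k_tasks * grand - np.sum(row_sums ** 2))
q_p = chi2.sf(q, df=k_tasks - 1)
print(f"Cochran's Q = {q:.2f}, df = {k_tasks - 1}, p = {q_p:.4f}")
```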


More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

SPSS: AN OVERVIEW. Seema Jaggi and and P.K.Batra I.A.S.R.I., Library Avenue, New Delhi-110 012

SPSS: AN OVERVIEW. Seema Jaggi and and P.K.Batra I.A.S.R.I., Library Avenue, New Delhi-110 012 SPSS: AN OVERVIEW Seema Jaggi and and P.K.Batra I.A.S.R.I., Library Avenue, New Delhi-110 012 The abbreviation SPSS stands for Statistical Package for the Social Sciences and is a comprehensive system

More information

UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

More information

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate 1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

More information

Chapter G08 Nonparametric Statistics

Chapter G08 Nonparametric Statistics G08 Nonparametric Statistics Chapter G08 Nonparametric Statistics Contents 1 Scope of the Chapter 2 2 Background to the Problems 2 2.1 Parametric and Nonparametric Hypothesis Testing......................

More information

Using SPSS, Chapter 2: Descriptive Statistics

Using SPSS, Chapter 2: Descriptive Statistics 1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

More information

An SPSS companion book. Basic Practice of Statistics

An SPSS companion book. Basic Practice of Statistics An SPSS companion book to Basic Practice of Statistics SPSS is owned by IBM. 6 th Edition. Basic Practice of Statistics 6 th Edition by David S. Moore, William I. Notz, Michael A. Flinger. Published by

More information

Difference tests (2): nonparametric

Difference tests (2): nonparametric NST 1B Experimental Psychology Statistics practical 3 Difference tests (): nonparametric Rudolf Cardinal & Mike Aitken 10 / 11 February 005; Department of Experimental Psychology University of Cambridge

More information

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

More information

Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

More information

Comparing Means in Two Populations

Comparing Means in Two Populations Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

More information

Descriptive and Inferential Statistics

Descriptive and Inferential Statistics General Sir John Kotelawala Defence University Workshop on Descriptive and Inferential Statistics Faculty of Research and Development 14 th May 2013 1. Introduction to Statistics 1.1 What is Statistics?

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

More information

Statistics for Sports Medicine

Statistics for Sports Medicine Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

The Chi-Square Test. STAT E-50 Introduction to Statistics

The Chi-Square Test. STAT E-50 Introduction to Statistics STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

NCSS Statistical Software. One-Sample T-Test

NCSS Statistical Software. One-Sample T-Test Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

More information

MEASURES OF LOCATION AND SPREAD

MEASURES OF LOCATION AND SPREAD Paper TU04 An Overview of Non-parametric Tests in SAS : When, Why, and How Paul A. Pappas and Venita DePuy Durham, North Carolina, USA ABSTRACT Most commonly used statistical procedures are based on the

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Rank-Based Non-Parametric Tests

Rank-Based Non-Parametric Tests Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs

More information

Analysis of categorical data: Course quiz instructions for SPSS

Analysis of categorical data: Course quiz instructions for SPSS Analysis of categorical data: Course quiz instructions for SPSS The dataset Please download the Online sales dataset from the Download pod in the Course quiz resources screen. The filename is smr_bus_acd_clo_quiz_online_250.xls.

More information

SPSS Notes (SPSS version 15.0)

SPSS Notes (SPSS version 15.0) SPSS Notes (SPSS version 15.0) Annie Herbert Salford Royal Hospitals NHS Trust July 2008 Contents Page Getting Started 1 1 Opening SPSS 1 2 Layout of SPSS 2 2.1 Windows 2 2.2 Saving Files 3 3 Creating

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

More information

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

More information

StatCrunch and Nonparametric Statistics

StatCrunch and Nonparametric Statistics StatCrunch and Nonparametric Statistics You can use StatCrunch to calculate the values of nonparametric statistics. It may not be obvious how to enter the data in StatCrunch for various data sets that

More information

Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics. Measurement. Scales of Measurement 7/18/2012 Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

More information

Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions

Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions Chapter 208 Introduction This procedure provides several reports for making inference about the difference between two population means based on a paired sample. These reports include confidence intervals

More information

ADD-INS: ENHANCING EXCEL

ADD-INS: ENHANCING EXCEL CHAPTER 9 ADD-INS: ENHANCING EXCEL This chapter discusses the following topics: WHAT CAN AN ADD-IN DO? WHY USE AN ADD-IN (AND NOT JUST EXCEL MACROS/PROGRAMS)? ADD INS INSTALLED WITH EXCEL OTHER ADD-INS

More information

Table of Contents. Preface

Table of Contents. Preface Table of Contents Preface Chapter 1: Introduction 1-1 Opening an SPSS Data File... 2 1-2 Viewing the SPSS Screens... 3 o Data View o Variable View o Output View 1-3 Reading Non-SPSS Files... 6 o Convert

More information

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS About Omega Statistics Private practice consultancy based in Southern California, Medical and Clinical

More information

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences. 1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

More information

SPSS Guide How-to, Tips, Tricks & Statistical Techniques

SPSS Guide How-to, Tips, Tricks & Statistical Techniques SPSS Guide How-to, Tips, Tricks & Statistical Techniques Support for the course Research Methodology for IB Also useful for your BSc or MSc thesis March 2014 Dr. Marijke Leliveld Jacob Wiebenga, MSc CONTENT

More information

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

How To Test For Significance On A Data Set

How To Test For Significance On A Data Set Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

Come scegliere un test statistico

Come scegliere un test statistico Come scegliere un test statistico Estratto dal Capitolo 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc. (disponibile in Iinternet) Table

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

Data Analysis for Marketing Research - Using SPSS

Data Analysis for Marketing Research - Using SPSS North South University, School of Business MKT 63 Marketing Research Instructor: Mahmood Hussain, PhD Data Analysis for Marketing Research - Using SPSS Introduction In this part of the class, we will learn

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information

IBM SPSS Statistics 20 Part 1: Descriptive Statistics

IBM SPSS Statistics 20 Part 1: Descriptive Statistics CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 1: Descriptive Statistics Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

NAG C Library Chapter Introduction. g08 Nonparametric Statistics

NAG C Library Chapter Introduction. g08 Nonparametric Statistics g08 Nonparametric Statistics Introduction g08 NAG C Library Chapter Introduction g08 Nonparametric Statistics Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate 1 Chapter 13 Chi-Square This section covers the steps for running and interpreting chi-square analyses using the SPSS Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running

More information

Introduction to Statistics with SPSS (15.0) Version 2.3 (public)

Introduction to Statistics with SPSS (15.0) Version 2.3 (public) Babraham Bioinformatics Introduction to Statistics with SPSS (15.0) Version 2.3 (public) Introduction to Statistics with SPSS 2 Table of contents Introduction... 3 Chapter 1: Opening SPSS for the first

More information

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem) NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

SPSS Introduction. Yi Li

SPSS Introduction. Yi Li SPSS Introduction Yi Li Note: The report is based on the websites below http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf http://academic.udayton.edu/gregelvers/psy216/spss http://www.nursing.ucdenver.edu/pdf/factoranalysishowto.pdf

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information