UNIVERSITY OF NAIROBI
MASTERS IN PROJECT PLANNING AND MANAGEMENT
NAME: SARU CAROLYNN ELIZABETH
REGISTRATION NO: L50/61646/2013
COURSE CODE: LDP 603
COURSE TITLE: RESEARCH METHODS
LECTURER: GAKUU CHRISTOPHER
ASSIGNMENT TOPIC: DIFFERENTIATE BETWEEN PARAMETRIC AND NON-PARAMETRIC TOOLS OF ANALYSIS
THIS GROUP WORK ASSIGNMENT IS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD OF A MASTER OF ARTS DEGREE IN PROJECT PLANNING AND MANAGEMENT.
INTRODUCTION

In the literal meaning of the terms, a parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one's data are drawn, while a non-parametric test is one that makes no such assumptions. In this strict sense, "non-parametric" is essentially a null category, since virtually all statistical tests assume one thing or another about the properties of the source population(s). A potential source of confusion in working out which statistics to use in analyzing data is whether your data allow for parametric or non-parametric statistics. The importance of this issue cannot be overstated. The basic distinction between parametric and non-parametric is this: if your measurement scale is nominal or ordinal, you use non-parametric statistics; if you are using interval or ratio scales, you use parametric statistics. There are four basic types of data you can use: nominal, ordinal, interval and ratio. Depending on the experiment, these can be increasingly difficult to collect, but they give increasing rewards in what may be concluded. Interval and ratio data are parametric, and are used with parametric tools in which distributions are predictable (and often normal). Nominal and ordinal data are non-parametric, and do not assume any particular distribution; they are used with non-parametric tools such as the histogram.
Parametric Statistics

Parametric statistics assume more about the quality of the data, but in return they can tell us more about what is going on with those data. The most common parametric statistics assume the General Linear Model; that is, they assume that the true, underlying distribution of the data can be described by a straight line (or one of its variants). The two major ones are correlation and analysis of variance. Correlation is the statistical tool which most clearly expresses the general linear model. To perform a correlation, you must have observations of two characteristics for each case you wish to include, and the observations must both be measured on interval scales. You must further be willing to assume that the distribution underlying the observations is normal, or balanced about the mean. Analysis of variance applies the general linear model to situations in which one of the variables is measured on an interval scale, but the other variable (the x, or causal, variable) is membership in a group. For example, a neighborhood group might complain that they are not getting their fair share of the city's park and recreation money; ANOVA will allow you to determine whether there is merit to such a claim. An advantage of ANOVA over correlation is that no assumption need be made that the relationship between the two variables is a straight line: analysis of variance will work with U-shaped or other curvilinear relationships.

Non-parametric Statistics

Non-parametric statistics is a statistical approach in which the data are not required to fit a normal distribution. Nonparametric statistics often use ordinal data, meaning the analysis does not rely on the numbers themselves but rather on a ranking or order of sorts. For example, a survey recording consumer preferences ranging from "like" to "dislike" would be considered ordinal data. Spearman's rho is a measure of the monotonic relationship between two variables.
It assesses how well the relationship between two variables can be described using a monotonic function. If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other. Kendall's tau is a statistic used to measure the association between two measured quantities; a tau test is a non-parametric hypothesis test for statistical dependence based on the tau coefficient. The Chi-square distribution is the distribution of a sum of the squares of k independent standard normal random variables, and it is one of the most widely used probability distributions in inferential statistics.
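The contrast between Pearson's r (parametric, linear) and Spearman's rho (non-parametric, rank-based) can be seen in a short pure-Python sketch. The formulas are the standard textbook ones, and the data values are invented for illustration. On a monotone but non-linear data set, Spearman's rho is exactly 1 while Pearson's r falls below 1:

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of
    their standard deviations (uses means and deviations from the mean)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho for data without ties:
    1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
    between the ranks of paired values."""
    n = len(x)
    def ranks(v):
        order = sorted(range(n), key=lambda i: v[i])
        r = [0] * n
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

x = [1, 2, 3, 4]
y = [1, 8, 27, 64]            # y = x**3: monotone but not linear
print(spearman_rho(x, y))     # 1.0 (perfect monotone relationship)
print(pearson_r(x, y))        # below 1.0 (relationship is not a straight line)
```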
Parametric tests and analogous nonparametric procedures

It is sometimes easier to list examples of each type of procedure than to define the terms. The table below names several statistical procedures and categorizes each one as parametric or nonparametric. All of the parametric procedures listed rely on an assumption of approximate normality.

Analysis type: Compare means between two distinct/independent groups
Example: Is the mean systolic blood pressure (at baseline) for patients assigned to placebo different from the mean for patients assigned to the treatment group?
Parametric procedure: Two-sample t-test
Nonparametric procedure: Wilcoxon rank-sum test

Analysis type: Compare two quantitative measurements taken from the same individual
Example: Was there a significant change in systolic blood pressure between baseline and the six-month follow-up measurement in the treatment group?
Parametric procedure: Paired t-test
Nonparametric procedure: Wilcoxon signed-rank test

Analysis type: Compare means between three or more distinct/independent groups
Example: If our experiment had three groups (e.g., placebo, new drug #1, new drug #2), did the mean systolic blood pressure at baseline differ among the three groups?
Parametric procedure: Analysis of variance (ANOVA)
Nonparametric procedure: Kruskal-Wallis test

Analysis type: Estimate the degree of association between two quantitative variables
Example: Is systolic blood pressure associated with the patient's age?
Parametric procedure: Pearson coefficient of correlation
Nonparametric procedure: Spearman's rank correlation

Differences between parametric and non-parametric tools of analysis

In summary, the differences between parametric and non-parametric tools of analysis are as follows. Generally, parametric statistics (summaries, tests and models) involve more assumptions than nonparametric statistics. These assumptions most often concern the distribution of the data (that a variable is normally distributed, or that the relationship between two variables is linear, to give two examples). Frequently, the statistician must decide whether there is sufficient evidence to conclude that two summaries (average SATs for two high schools, median household income of two counties, the proportion of medical patients experiencing relief from drug X versus those given a placebo, etc.) are different. There is a wide variety of parametric statistics for deciding such things, but they all share a process of making assumptions about data distributions and testing against those theoretical distributions. Such questions might be resolved in nonparametric statistics, for instance, by performing a bootstrap analysis, which does not involve such assumptions. Parametric statistical tests assume that the data belong to some type of probability distribution; the normal distribution is probably the most common, meaning that, when graphed, the data follow a "bell-shaped curve". On the other hand, non-parametric statistical tests are often called distribution-free tests, since they do not make any assumptions about the distribution of the data. They are often used in place of parametric tests when one feels that the assumptions have been violated, for example when the data are skewed.
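As a distribution-free illustration of the first row of the table above, the Wilcoxon rank-sum test is equivalent to the Mann-Whitney U statistic, which simply counts, over all pairs, how often a value in one group exceeds a value in the other (ties count one half). This pure-Python sketch computes only the statistic, not the p-value, and the blood-pressure readings are invented for illustration:

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic (equivalent to the Wilcoxon rank-sum test):
    count, over all (x, y) pairs, how often an x value exceeds a y value,
    with ties counting 0.5. No distributional assumption is needed."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1
            elif a == b:
                u += 0.5
    return u

placebo   = [120, 132, 128, 140]   # hypothetical systolic readings
treatment = [118, 121, 125, 130]
print(mann_whitney_u(placebo, treatment))  # 12.0
```

The statistic would then be compared against the null distribution of U (tabulated, or normal-approximated for larger samples) to obtain a p-value.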
Non-parametric statistical procedures are less powerful because they use less information in their calculation. For example, a parametric correlation uses information about the mean and deviation from the mean, while a non-parametric correlation uses only the ordinal position of pairs of scores. The parametric assumption of normality is particularly worrisome for small sample sizes (n < 30), and nonparametric tests are often a good option for such data. For the same sample size, nonparametric procedures generally have less power than the corresponding parametric procedure if the data are truly normal, and interpretation of nonparametric procedures can also be more difficult than for parametric procedures. Thus the power of nonparametric testing is lower than that of parametric testing. However, the decision maker or researcher should not misinterpret this as meaning that nonparametric statistics are inferior to parametric methods; each method is designed for a specific use, related to the type of data it can handle. The accuracy of nonparametric statistics can be increased by enlarging the sample, although, as we know, adding to the number of samples also increases the cost, time and other demands of a survey. In parametric tests, hypotheses are usually made about numerical values, especially the mean, while in non-parametric tests, hypotheses are posed regarding ranks, medians or frequencies of the data.
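The contrast between hypotheses about means and hypotheses about medians can be illustrated with a small pure-Python sketch (the data values are invented): a single skewed observation moves the mean substantially but leaves the median untouched, which is one reason median- and rank-based nonparametric tests tolerate skewed data.

```python
def mean(v):
    """Arithmetic mean: sensitive to every value, including outliers."""
    return sum(v) / len(v)

def median(v):
    """Median: depends only on the ordinal position of values."""
    s = sorted(v)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

data = [10, 12, 11, 13, 12]
with_outlier = data + [95]                       # one skewed observation
print(mean(data), median(data))                  # 11.6 12
print(mean(with_outlier), median(with_outlier))  # 25.5 12.0
```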