Statistical skills example sheet: Spearman s Rank



Similar documents
CHAPTER 14 ORDINAL MEASURES OF CORRELATION: SPEARMAN'S RHO AND GAMMA

Using Excel for inferential statistics

PEARSON R CORRELATION COEFFICIENT

1.6 The Order of Operations

CHAPTER 12 TESTING DIFFERENCES WITH ORDINAL DATA: MANN WHITNEY U

Section 3 Part 1. Relationships between two numerical variables

Rank-Based Non-Parametric Tests

Descriptive Statistics

The correlation coefficient

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Valor Christian High School Mrs. Bogar Biology Graphing Fun with a Paper Towel Lab

Least-Squares Intersection of Lines

Technical Information

CALCULATIONS & STATISTICS

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Directions for using SPSS

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Testing Research and Statistical Hypotheses

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

MoorLIFE: Active Blanket Bog Restoration in the South Pennines Moors Monitoring Programme Mid-Term Report. November 2013

Study Guide for the Final Exam

Association Between Variables

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

CHAPTER 14 NONPARAMETRIC TESTS

Weight of Evidence Module

Introduction to Quantitative Methods

Difference tests (2): nonparametric

Non-Parametric Tests (I)

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Calculate Highest Common Factors(HCFs) & Least Common Multiples(LCMs) NA1

Example: Boats and Manatees

ABSORBENCY OF PAPER TOWELS

Projects Involving Statistics (& SPSS)

Hypothesis testing - Steps

Correlation key concepts:

Section 13, Part 1 ANOVA. Analysis Of Variance

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Session 7 Bivariate Data and Analysis

DDBA 8438: The t Test for Independent Samples Video Podcast Transcript

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Zero: If P is a polynomial and if c is a number such that P (c) = 0 then c is a zero of P.

Statistical tests for SPSS

Statistics Revision Sheet Question 6 of Paper 2

Chapter 3 RANDOM VARIATE GENERATION

11. Analysis of Case-control Studies Logistic Regression

Simple Regression Theory II 2010 Samuel L. Baker

Nonparametric Tests. Chi-Square Test for Independence

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Assignment #1: Spreadsheets and Basic Data Visualization Sample Solution

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Chapter 7. Comparing Means in SPSS (t-tests) Compare Means analyses. Specifically, we demonstrate procedures for running Dependent-Sample (or

Tom wants to find two real numbers, a and b, that have a sum of 10 and have a product of 10. He makes this table.

The Dummy s Guide to Data Analysis Using SPSS

Working with whole numbers

Correlation. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

DATA INTERPRETATION AND STATISTICS

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

Independent samples t-test. Dr. Tom Pierce Radford University

Linear Models in STATA and ANOVA

THE INFLUENCE OF MARKETING INTELLIGENCE ON PERFORMANCES OF ROMANIAN RETAILERS. Adrian MICU 1 Angela-Eliza MICU 2 Nicoleta CRISTACHE 3 Edit LUKACS 4

Module 5: Statistical Analysis

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

There are probably many good ways to describe the goals of science. One might. Correlation: Measuring Relations CHAPTER 12. Relations Among Variables

Eight things you need to know about interpreting correlations:

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

Friedman's Two-way Analysis of Variance by Ranks -- Analysis of k-within-group Data with a Quantitative Response Variable

Procedure for Graphing Polynomial Functions

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Chapter 23. Two Categorical Variables: The Chi-Square Test

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

6.4 Normal Distribution

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

Using a Graphing Calculator With Cramer s Rule

Using Excel for Statistical Analysis

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Coins, Presidents, and Justices: Normal Distributions and z-scores

Name Date Class Period. How can you use the box method to factor a quadratic trinomial?

EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

This is a square root. The number under the radical is 9. (An asterisk * means multiply.)

Ecology and Simpson s Diversity Index

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

SOLVING QUADRATIC EQUATIONS BY THE DIAGONAL SUM METHOD

Distribution is a χ 2 value on the χ 2 axis that is the vertical boundary separating the area in one tail of the graph from the remaining area.

Section 6.1 Factoring Expressions

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Chapter 7: Simple linear regression Learning Objectives

Factoring. Factoring Polynomial Equations. Special Factoring Patterns. Factoring. Special Factoring Patterns. Special Factoring Patterns

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

UNIVERSITY OF NAIROBI

II. DISTRIBUTIONS distribution normal distribution. standard scores

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

1.3 Algebraic Expressions

Comparing Means in Two Populations

Normality Testing in Excel

Rockefeller College University at Albany

Transcription:

Statistical skills example sheet: Spearman s Rank Spearman s rank correlation is a statistical test that is carried out in order to assess the degree of association between different measurements from the same sample. That is, if you are looking for a positive or negative correlation between two variables. Positive correlation No correlation Negative correlation When the points on a graph clearly fit onto a line of best fit it is easy to determine whether a correlation exists. However, as the points become further placed from each other it is hard to make an accurate judgement. This is where statistics is used; to clarify how confident we are that a correlation exists. Example: During monitoring for the MoorLIFE project, an ecologist collected data from an area of moorland that had been restored in 2003. In order to assess whether a correlation existed between the two plant species she studied or whether they were growing independently of one another, she carried out the following statistical test.

Quadrat % Cover of Bilberry % Cover Common Heather 1 5 0 2 40 0 3 50 5 4 5 0 5 10 0 6 25 0 7 0 1 8 4 0 9 0 0 10 0 1 11 10 6 12 2 0.5 First of all, form a null hypothesis. This hypothesis assumes there is no relationship between the two variables e.g. There is no correlation between percentage cover of Bilberry and percentage cover of Common Heather. Step one: Rank the data For each set of data assign ranks from lowest to highest. The lowest value in a column will be given the rank of 1, the next smallest number will be given a 2 etc. If there are tied scores each of those will share the ranks and be given the average (mean) rank. For example, there are 7 quadrats with a percentage cover of Heather value of 0. As these are the lowest values in that data set, they share the ranks of 1, 2, 3, 4, 5, 6 and 7 (a total of 28) giving each value an average rank of 4 (28 7). The next rank that can be given is therefore an 8 as ranks 1 to 7 have already been assigned. Similarly in the same data set there are two values for Heather cover of 1%, these two then share ranks of 8 and 9 (8 +9 = 17, 17 2=9.5). Each place is assigned average rank of 9.5 and next rank to be given out is that of 10.

Quadrat % Cover of Bilberry Rank variable 1 % Cover Heather Rank variable 2 1 5 6.5 0 4 2 40 11 0 4 3 50 12 5 11 4 5 6.5 0 4 5 10 8.5 0 4 6 25 10 0 4 7 0 2 1 9.5 8 4 5 0 4 9 0 2 0 4 10 0 2 1 9.5 11 10 8.5 6 12 12 2 4 0.5 8 Step two: Find the differences between rank 1 and rank 2 for each value in the table. Rank variable 1 Rank variable 2 6.5 4 2.5 11 4 7 12 11 1 6.5 4 2.5 8.5 4 4.5 10 4 6 2 9.5-7.5 5 4 1 2 4-2 2 9.5-7.5 8.5 12-3.5 4 8-4 Difference d

Step three: Square the differences to get rid of negative numbers then find the sum of d 2 Rank Rank Difference d 2 variable 1 variable 2 d 6.5 4 2.5 6.25 11 4 7 49 12 11 1 1 6.5 4 2.5 6.25 8.5 4 4.5 20.25 10 4 6 36 2 9.5-7.5 56.25 5 4 1 1 2 4-2 4 2 9.5-7.5 56.25 8.5 12-3.5 12.25 4 8-4 16 d 2 = 264.5

Step four: Calculate Spearman s Rank using the formula below, where n is the number of samples in each category. Rank (R) = 1-6 d 2 n(n 2-1) R = 1-6 x 264.5 12 (144-1) R = 1 0.925 R = 0.075

Step five: work out the meaning of the R value The closer R is to 1 or -1 the more likely the correlation. A perfect positive correlation has an R value of 1, a perfect negative correlation has a value of -1. +1 0-1 Perfect positive correlation No correlation Perfect negative correlation If the value lies between -1 and 1 we need to carry out a test for significance. The significance level is calculated by finding the degrees of freedom for your test. This is the number of pairs in your sample minus 2. In our example this would be: 12-2 = 10 degrees of freedom

In Biology we work to a significance level of 0.05, that is, at this level we are 95% confident that the results are due to a correlation and not due to chance. Looking at the table below we can see that at a significant level of 0.05, for 10 degrees of freedom the R value must be 0.564 or above for it to be considered a statistically significant correlation. Our result does not reach this value, therefore the probability of our result being due to chance alone is greater than 5% and we must accept our null hypothesis, the two variables are not correlated. We can conclude that the percentage cover of Bilberry and Common Heather are not correlated (positively nor negatively) and the two plants therefore grow independently of one another. Degrees of Significance level Freedom 0.05 0.01 4 1.000 5 0.900 1.000 6 0.829 0.943 7 0.714 0.893 8 0.643 0.833 9 0.600 0.783 10 0.564 0.745 11 0.523 0.736 12 0.497 0.703 13 0.475 0.673 14 0.457 0.646 15 0.441 0.623 16 0.425 0.601 17 0.301 0.582 18 0.399 0.564 19 0.388 0.549 20 0.377 0.534

Exercise: Results analysis Following restoration of some Blanket Bog sites on Bleaklow by the MoorLIFE team, several areas of Blanket Bog were monitored in order to measure the success of the process. The table shows an extract from the results table for an area of Blanket Bog, (Joseph s Patch), which was restored in 2003. Quadrat_ID Percentage cover of D. flexuosa Rank 1 Percentage cover of Agrostis sp JP001 30 15 JP003 35 1 JP004 15 5 JP005 15 10 JP006 15 2 JP007 10 10 JP007b 3 3 JP008 5 1 JP009 10 5 JP010 25 0.25 JP011 35 30 JP012 50 20 JP013 50 10 JP014 30 15 JP015 40 15 JP016 50 5 JP017 30 4 JP018 15 10 JP019 40 7 JP020b 35 15 JP021 30 30 JP022 50 5 JP023 15 10 JP024 40 10 JP025 40 12 JP026 10 4 Rank 2

1. The team ecologist wanted to find out if there was any relationship between D.flexuosa and the Agrostis species. The null hypothesis was that there is no correlation between the percentage cover of D.flexuosa and Agrostis species. Complete the table by ranking the two sets of data. 2. The data above has an R value of 0.41. Work out how many degrees of freedom there are for this data set and then use the significance table to comment upon the significance of the results. Remember to include the terms chance and probability in your answer.