Statistics for Sports Medicine

Statistics for Sports Medicine. Suzanne Hecht, MD, University of Minnesota (suzanne.hecht@gmail.com). Fellows' Research Conference, July 2012, Philadelphia.

GOALS: Try not to bore you to death! Try to teach you something useful. Introduce concepts. Give you a stats reference guide. Encourage sports medicine research.

QUIZ: What is the appropriate stats test to apply? 50 soccer players wore head gear and 40 did not. Players were followed for diagnosis of concussion over one season. (1) Paired two-tailed t-test; (2) ANOVA; (3) Chi-square analysis; (4) McNemar test.

MY TOP 10 STATS TIP LIST

OVERVIEW: Introduction; Variables; Normal distribution; Hypothesis testing; Comparing means; Measuring association; Scatterplots & correlation; Regression.

PURPOSE: Stats is just a tool to analyze the data you collect. Learn the basics and add to your foundation over time. There are lots of names of tests, just like in sports medicine! You wouldn't talk about a Jobe's test during a knee exam.

PURPOSE: Infer something about a population based on information from a sample of that population. Use probability concepts. Describe how reliable the conclusions are, i.e., you have all this data: is it useful in some way?

Variables. Discrete: examples are gender (m/f) and fracture (y/n); may be nominal or ordinal. Nominal: a set of categories with no ordering, e.g., m/f. Ordinal: ordered, but differences in scores have no meaning, e.g., comparing 1st and 2nd place finishers (ranking) without using actual times. Continuous: examples are weight and race time; differences between values have meaning.

USE FOR FUTURE REFERENCE
Variable      Summary statistics    Comparing 2 groups           Measuring association
Nominal       Mode                  Chi-square                   Contingency coefficient
Ordinal       Median                Chi-square, nonparametric    Kappa, Spearman r, Kendall's tau
Continuous    Mean, median & SD     t-test, nonparametric        Spearman r, Pearson r

SAMPLE SIZE & POWER: Important to calculate, and to do so before the study, to avoid wasted expense, time, and resources. Calculations are available in stats software. They let you know whether you have enough subjects to detect a meaningful change.

HYPOTHESIS TESTING. Null hypothesis (H0): no difference between groups (groups are the same). Alternative hypothesis (H1): there is a difference between groups. Type I error: saying groups are different when they aren't. Type II error: saying groups are the same when they are different.

Normal Distribution: Applies to continuous variables. Mean = median = mode. Many stats tests assume a normal distribution (t-test, ANOVA, regression), and there are ways to test whether data are normally distributed. If they are not, use nonparametric tests or transform the data (e.g., log). Methods that assume a normal distribution are robust to moderate departures from that assumption if n is large enough!

Normal Distribution: Symmetrical about the mean. BLUE = 68.2% of values within 1 SD. BLUE + BROWN = 95.4% of values within 2 SD. BLUE + BROWN + GREEN = 99.7% of values within 3 SD.
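The coverage figures for 1, 2, and 3 SD can be checked from the error function. The following is a minimal Python sketch, not part of the original slides; the function name `coverage_within` is made up for illustration.

```python
import math


def coverage_within(k_sd: float) -> float:
    """Fraction of a normal distribution within k_sd standard deviations of the mean."""
    return math.erf(k_sd / math.sqrt(2))


# Roughly 68%, 95%, and 99.7% for 1, 2, and 3 SD.
for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.4f}")
```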

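P-Value: the probability of obtaining results at least as extreme as those observed by chance alone, i.e., if the null hypothesis is true. p = 0.05 means a 5% chance. It may not tell the whole story: statistically significant is not the same as clinically significant, and the result depends on whether n is small or large. With small n, beware of Type II error. Give both the p-value and the CI.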

Comparing 2 groups or treatments (rxs), by type of outcome
Continuous outcome, normal distribution (parametric): paired: paired t-test; unpaired: unpaired t-test.
Continuous outcome, not a normal distribution (nonparametric): paired: sign test or signed-rank test; unpaired: Wilcoxon rank sum test.
Binary (y/n) outcome, paired: McNemar's test.
Binary (y/n) outcome, unpaired: large sample size: chi-squared; otherwise: Fisher's exact test.

Comparing 3 or more groups, by type of outcome
Continuous outcome, normal distribution (parametric): ANOVA.
Continuous outcome, not a normal distribution (nonparametric): Kruskal-Wallis test.
Binary (y/n) outcome: frequency tables, chi-squared methods.

Comparing 2 groups or treatments (rxs), continuous outcome. Normal distribution (parametric): paired: paired t-test; unpaired: unpaired t-test. Not a normal distribution (nonparametric): paired: sign test or signed-rank test; unpaired: Wilcoxon rank sum test.

Comparing Group Means: t-test and ANOVA. Assumptions: data are continuous and normally distributed. Methods: 2 independent samples: 2-sample t-test; paired data: paired t-test; more than 2 independent samples: ANOVA. Includes confidence intervals and hypothesis testing.

t-tests: 3 types. (1) 2-sample t-test, also called Student's t-test or independent samples t-test. (2) Paired samples t-test: for paired data, i.e., 2 measurements on the same subject or test unit. (3) One-sample t-test: compare to a known (norm) value.

t-tests: One-tailed vs two-tailed. Almost always use two-tailed: results could be higher or lower, not just one way.

95% CI (Confidence Intervals): We are 95% confident that the true value falls in the interval. A wide CI suggests uncertainty about the estimate. Does the CI contain a value that implies no change or no effect (difference in means: 0; odds ratio: 1)? Does the confidence interval lie partly or entirely within a range of clinical indifference?

Example: Confidence Intervals. Survey 19 millionaires. Mean income donation = 15% +/- 2 SD. CI: +/- 2.4%. Interpretation: we are 95% confident that millionaires donate between 12.6% and 17.4% of their income.

SIGN TEST: Nonparametric test for data that are not normally distributed. Alternative to the paired t-test. Good for small sample sizes. Tests the difference for matched pairs on before & after data. Method: calculate the differences; throw out zero differences; test the number of + differences. If H1 is true, the median difference does not equal 0.
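The sign test method can be sketched in a few lines of Python. This is an illustrative sketch only: the before/after scores are hypothetical, the function name `sign_test` is made up, and the p-value is the exact two-sided binomial probability with p = 0.5.

```python
from math import comb


def sign_test(before, after):
    """Return (# of + differences, # of nonzero differences, two-sided exact p-value)."""
    diffs = [a - b for b, a in zip(before, after)]
    nonzero = [d for d in diffs if d != 0]       # throw out zero differences
    n = len(nonzero)
    n_pos = sum(d > 0 for d in nonzero)          # count the + differences
    k = min(n_pos, n - n_pos)
    # Under H0 each nonzero difference is + or - with probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n, min(1.0, 2 * tail)


# Hypothetical pain scores for 8 athletes before and after a treatment.
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [10, 14, 11, 12, 12, 13, 13, 12]
n_pos, n, p_value = sign_test(before, after)
print(n_pos, n, p_value)
```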

WILCOXON SIGNED-RANK TEST: Same application as the sign test. Uses the ranks and the signs of the differences. More powerful than the sign test. Method: calculate the differences in pairs; throw away zero differences; rank from smallest to largest difference without regard to +/-. Test: sum of ranks of the + differences.
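The ranking step is easy to get wrong by hand, so here is a minimal Python sketch of the statistic only (no p-value). The data are hypothetical and the function name `signed_rank_sums` is made up; tied absolute differences are given their average rank, which is a common convention and an assumption here.

```python
def signed_rank_sums(before, after):
    """Return (sum of ranks of + differences, sum of ranks of - differences)."""
    # Calculate differences in pairs and throw away zero differences.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Rank from smallest to largest without regard to sign.
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2       # ties share the average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical scores for 8 athletes before and after a treatment.
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [10, 14, 11, 12, 12, 13, 13, 12]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)
```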

Wilcoxon Rank Sum Test: Also known as the Mann-Whitney U test. Compares 2 independent samples that are not normally distributed. Good for detecting changes in medians. Method: combine data from the 2 groups; rank smallest to largest; add the ranks in the group with the smaller sample size; add the ranks in the group with the larger N. Test: sum of ranks for the smaller group compared to the larger group.

EXAMPLE: Rank-Sum Test. Team Cheetah (TC): 5 team members. Team Impala (TI): 7 team members. Results TC: 3, 4, 7, 12, 13 (min). Results TI: 2, 5, 6, 8, 9, 10, 11 (min). Combine data and then rank:
Times: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
Ranks: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
Sum ranks of smaller group (TC): 2 + 3 + 6 + 11 + 12 = 34. Test if the sum of ranks of the smaller group is the same as or different from the other group.
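The Team Cheetah / Team Impala arithmetic can be reproduced in Python. This is a sketch of the rank sum only, using the times from the slide; the variable names are illustrative, and the Mann-Whitney U value and the expected rank sum under the null hypothesis are standard formulas added here for context rather than taken from the slides.

```python
cheetah = [3, 4, 7, 12, 13]            # smaller group (minutes)
impala = [2, 5, 6, 8, 9, 10, 11]       # larger group (minutes)

# Combine the data and rank from smallest to largest (no ties in this example).
combined = sorted(cheetah + impala)
rank_of = {value: rank for rank, value in enumerate(combined, start=1)}

# Add the ranks in the group with the smaller sample size.
rank_sum_small = sum(rank_of[t] for t in cheetah)

n1, n2 = len(cheetah), len(impala)
expected_rank_sum = n1 * (n1 + n2 + 1) / 2       # expected value if the groups do not differ
u_small = rank_sum_small - n1 * (n1 + 1) / 2     # equivalent Mann-Whitney U statistic

print(rank_sum_small, expected_rank_sum, u_small)
```

The observed rank sum is then compared with its null distribution (tables or stats software) to get a p-value.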

ANOVA (analysis of variance): Compares means of more than 2 groups. Assumes: continuous data; normal distribution; same variance within each group. Benefits compared to t-tests: efficiency; avoids the multiple testing problem. Problem: a significant F test tells you that at least 2 groups are different, but not which ones!

ANOVA Problem: Multiple Comparisons Procedures. Used to tell which groups differ. Apply stricter levels for accepting/rejecting that the means are the same. 4 methods: Bonferroni, Tukey, Newman-Keuls, Scheffé.
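Of the four methods, Bonferroni is simple enough to sketch: the overall significance level is divided by the number of pairwise comparisons. The Python below is a minimal illustration; the function name `bonferroni_alpha` and the 4-group example are assumptions, not from the slides.

```python
from math import comb


def bonferroni_alpha(n_groups: int, alpha: float = 0.05):
    """Return (# of pairwise comparisons, stricter per-comparison alpha)."""
    n_comparisons = comb(n_groups, 2)    # every pair of groups is compared once
    return n_comparisons, alpha / n_comparisons


# Hypothetical study with 4 groups: 6 pairwise comparisons.
n_comparisons, per_test_alpha = bonferroni_alpha(4)
print(n_comparisons, round(per_test_alpha, 4))
```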

Kruskal-Wallis Test: Nonparametric test. Use for comparing 3 or more independent groups. Think of it as a nonparametric ANOVA. Good for detecting changes in the median.

Comparing 2 groups or treatments (rxs), binary (y/n) outcome. Paired: McNemar's test. Unpaired: large sample size: chi-squared; otherwise: Fisher's exact test.

Comparing Frequency Data: Binary outcome (yes/no). Paired method: McNemar's test. Non-paired methods: Pearson's chi-square; Fisher's exact test.

Pearson's Chi-square. Assumes random samples from 2 groups. Compares expected with observed frequencies. All sample sizes must be large enough: all expected frequencies must be > 5.
2x2 table:
                 Standard Helmet            New Helmet
Concussion       18                         6
No Concussion    7                          13
TOTAL            n1 = 25                    n2 = 19
Proportion       p1 = 18/25 = 0.72 (72%)    p2 = 6/19 = 0.32 (32%)

Pearson's Chi-square
OBSERVED         Standard Helmet       New Helmet            TOTAL
Concussion       18                    6                     24
No Concussion    7                     13                    20
TOTAL            n1 = 25               n2 = 19               44
EXPECTED (if not different)
                 Standard Helmet       New Helmet
Concussion       24/44 x 25 = 13.64    24/44 x 19 = 10.36
No Concussion    20/44 x 25 = 11.36    20/44 x 19 = 8.64
X^2 = 7.1 (p = 0.0077)
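The expected counts and the chi-square statistic for the helmet table can be recomputed with a short Python sketch. It uses the observed counts from the slide and no continuity correction; the p-value line relies on the fact that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(X^2 / 2)). Variable names are illustrative.

```python
import math

# Rows: concussion, no concussion. Columns: standard helmet, new helmet.
observed = [[18, 6],
            [7, 13]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count if helmet type made no difference: row total x column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Upper-tail p-value for a 2x2 table (1 degree of freedom).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2), round(p_value, 4))
```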

Fisher's Exact Test: Use this test when 1 or more of the expected frequencies is < 5.
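For small tables the exact p-value can be computed directly from the hypergeometric distribution. The Python below is a sketch under stated assumptions: the 2x2 counts are hypothetical, the function name `fisher_exact_two_sided` is made up, and "two-sided" here means summing the probabilities of all tables no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability of x in the top-left cell with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so equally likely tables are included despite rounding.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small study: 1 of 10 injured in one group, 6 of 10 in the other.
p_value = fisher_exact_two_sided(1, 9, 6, 4)
print(round(p_value, 4))
```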

McNemar's Test: Use for paired binary data, e.g., the same subject before & after treatment, or a cross-over study.
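McNemar's test looks only at the discordant pairs, i.e., subjects whose yes/no outcome changed between the two measurements. The Python sketch below shows both the usual chi-square statistic and an exact binomial p-value; the counts are hypothetical and the function name `mcnemar` is made up for illustration.

```python
from math import comb


def mcnemar(b: int, c: int):
    """b = pairs that went yes -> no, c = pairs that went no -> yes.

    Returns (chi-square statistic without continuity correction,
             exact two-sided binomial p-value).
    """
    n = b + c
    chi2 = (b - c) ** 2 / n
    k = min(b, c)
    # Under H0 a discordant pair is equally likely to go either way.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return chi2, min(1.0, 2 * tail)


# Hypothetical: 2 athletes changed one way and 10 changed the other way.
chi2, p_value = mcnemar(2, 10)
print(round(chi2, 2), round(p_value, 4))
```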

MY TOP 10 STATS TIP LIST http://statpages.org www.theresearchassistant.com www.ats.ucla.edu/stat/

RISK. Risk difference: absolute difference in risk proportions; can be difficult to interpret. Relative Risk (RR): also known as the risk ratio; risk in one group / risk in the other group. Odds Ratio (OR): compares the odds (rather than the probability) of an event; OR = odds in exposed group / odds in control group; OR = 1 means no difference.

RELATIVE RISK: Relative risk (RR) is the risk of an event relative to exposure. Risk of having a boy if mom took testosterone during pregnancy = 75/100 = 75%. Baseline risk (probability) of having a boy = 51/100 = 51%. Risk ratio = 0.75/0.51 = 1.5. Easier to understand: risk ratio = 0.5 means the risk is half; risk ratio = 2 means the risk is double.

CALCULATING ODDS: Odds of an event = # of events / # of nonevents. 51 boys born for every 100 births: odds of any randomly chosen delivery being a boy = 51/(100 - 51) = 1.04. Odds > 1: event is more likely to happen than not. Odds of a certain event = infinity. Odds < 1: event is less likely to happen than not. Odds of an impossible event = 0.

ODDS RATIO: Testosterone example. OR = [75/(100 - 75)] / [51/(100 - 51)] = 3/1.04 = 2.9. The odds of having a boy are 2.9 times as high in moms using testosterone as in moms not using testosterone.
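The risk, odds, risk ratio, and odds ratio for the testosterone example can be worked through in Python. This sketch uses the counts from the slides; the helper names `risk` and `odds` are made up for illustration.

```python
def risk(events: int, total: int) -> float:
    """Risk (probability) = events / total."""
    return events / total


def odds(events: int, total: int) -> float:
    """Odds = events / nonevents."""
    return events / (total - events)


# 75 boys per 100 births with testosterone; 51 boys per 100 births without.
risk_ratio = risk(75, 100) / risk(51, 100)
odds_exposed = odds(75, 100)
odds_unexposed = odds(51, 100)
odds_ratio = odds_exposed / odds_unexposed

print(round(risk_ratio, 2), round(odds_exposed, 2),
      round(odds_unexposed, 2), round(odds_ratio, 2))
```

Note how the odds ratio (about 2.9) is much larger than the risk ratio (about 1.5) here, because the event is common rather than rare.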

ODDS RATIO: Benefits. No upper limit (the range of RR varies depending on baseline prevalence). When events are rare (rare disease), OR approximates RR. OR is OK to use with case-control studies; don't use RR with case-control studies.

Calculating OR: Cross Product
                        Group 1    Group 2
Factor (Event)          a          b
No Factor (No Event)    c          d
OR = (a/c) / (b/d) = (a x d) / (b x c)
                        Standard Helmet    New Helmet
Concussion              18                 6
No Concussion           7                  13
OR = (18 x 13) / (6 x 7) = 5.57
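The cross-product calculation for the helmet table takes one line of Python. The sketch below also adds an approximate 95% confidence interval using the standard log odds ratio (Woolf) method; that interval is an addition for illustration and is not shown on the slides. The function name `odds_ratio_with_ci` is made up.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Cross-product odds ratio for [[a, b], [c, d]] with an approximate 95% CI."""
    estimate = (a * d) / (b * c)
    # Standard error of ln(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Concussion / no concussion by standard helmet / new helmet.
helmet_or, ci_lower, ci_upper = odds_ratio_with_ci(18, 6, 7, 13)
print(round(helmet_or, 2), round(ci_lower, 2), round(ci_upper, 2))
```

The interval excludes 1 (the no-difference value) but is wide, which reflects the small sample.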

SCATTERPLOT: Can help answer the following. Are variables X & Y related? Are X & Y linearly related? Are X & Y non-linearly related? Does the variation in Y change depending on X? Are there outliers? Example plot: (1) linear relationship; (2) small scatter (strong correlation); (3) + slope (+ correlation).

SCATTERPLOTS: First plot: no relationship. Second plot: (1) linear; (2) small scatter (strong correlation); (3) - slope (negative correlation).

SCATTERPLOTS: Example plots showing an outlier and a non-linear relationship.

CORRELATION: PEARSON. Measures the strength of (linear) association between 2 variables. Ranges from -1 to 1: 1 = perfect + correlation; -1 = perfect - correlation; 0 = no correlation. Examples: r = 0.8, strong + correlation; r = 0.3, weak + correlation; r = -0.7, moderate - correlation.
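Pearson's r can be computed from first principles as the covariance of X and Y scaled by their spreads. The Python below is a minimal sketch; the body fat and finishing time values are hypothetical, and `pearson_r` is an illustrative name rather than a library function.

```python
import math


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: body fat percentage and finishing time (minutes).
body_fat = [10, 12, 15, 18, 20, 25]
finish_time = [95, 100, 104, 112, 115, 130]
r = pearson_r(body_fat, finish_time)
print(round(r, 3))
```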

REGRESSION: A straight line that describes the dependence of one variable on another is called a regression line. Y = response variable, e.g., finishing time. X = explanatory variable, e.g., body fat percentage. Is finishing time predicted by body fat percentage?

REGRESSION TYPES. Linear: data have a normal distribution; simple or multiple. Logistic: data are binary (y/n); simple or multiple. Multiple regression models allow estimation of the independent effect of each X after controlling for the other variables in the model.

Simple LINEAR REGRESSION: Use to predict Y given X. Determine the best-fitting equation. Test whether there is a relationship between X & Y.

Linear Regression: R^2 value = % of variance in Y explained by X. If R^2 = 1, then X predicts Y perfectly. F test for significance: if p > 0.05, then no significant relationship between X & Y has been shown (cannot reject that the slope of the line is zero).
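A least-squares line and its R^2 can be fitted by hand in a few lines of Python. This is a sketch under stated assumptions: the body fat and finishing time values are hypothetical, `fit_line` is a made-up name, and the F test itself is left to stats software.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x. Returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation in Y).
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_residual / ss_total


# Hypothetical data: X = body fat percentage, Y = finishing time (minutes).
body_fat = [10, 12, 15, 18, 20, 25]
finish_time = [95, 100, 104, 112, 115, 130]
slope, intercept, r_squared = fit_line(body_fat, finish_time)
print(round(slope, 2), round(intercept, 2), round(r_squared, 3))
```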

Multiple Linear Regression: A model that explains how a single dependent variable (Y) relates to several independent variables (X). Example: test if age, gender, body fat %, prior triathlon competitions, & occupation predict finishing time.

Multiple Linear Regression: How many variables to use? Recommend that you have 10-20 cases for each variable tested. If you test lots of variables, you increase the chance of finding statistical significance by random chance alone, and the model becomes unstable.

Multiple Linear Regression, example continued: The model predicts 90% of the variance in performance. Now test which variable or combination of variables is most predictive: body fat %: 15%; age: 10%; gender: 30%; body fat & gender: 35%; occupation: 0%; prior triathlon: 40%.

QUIZ: What is the appropriate stats test to apply? 50 soccer players wore head gear and 40 did not. Players were followed for diagnosis of concussion over one season. (1) Paired two-tailed t-test; (2) ANOVA; (3) Chi-square analysis; (4) McNemar test.

OTHER TIPS: Stats support at universities: usually charge per hour; MS cheaper than PhD. Authorship (International Committee of Medical Journal Editors (ICMJE) guidelines) if the stats person is willing to: help design the study; analyze data; format tables, graphs, etc.; write a portion of the article. May be able to get a small grant to cover the cost of stats analysis. On-line support.