Statistics in Medicine Research Lecture Series CSMC Fall 2014

1 Catherine Bresee, MS, Senior Biostatistician, Biostatistics & Bioinformatics Research Institute. Statistics in Medicine Research Lecture Series, CSMC, Fall 2014

2 Overview: Review the concept of statistical power. Factors influencing power. Examples of sample size calculations (test of means, test of proportions). Writing a sample size justification. Software tutorial.

3 Power in Statistical Testing Metaphorically, the concept of statistical power is like a magnifying glass. The more powerful a magnifying glass is, the greater the ability to show greater detail. A more powerful study can better reveal a significant result.

4 Why do a Power Analysis? Optimize sample size to economize on costs. Determine the sample size for a clinically relevant and statistically significant difference. Determine the minimally detectable difference for a fixed sample size. Determine if the study is justifiably worth doing. Required for NIH grant proposals and IRB / IACUC protocols.

5 Hypothesis Testing

6 Hypothesis Testing. Typically statistical testing is used to disprove a falsehood. Goal: proving two groups are different. It is more challenging to prove equality! (Tests of equivalence require a different approach than typical statistical testing and a much larger sample size.) Similar to "presumed innocent until proven guilty": we assume treatment groups are not different until statistically proven different.

7 The Truth in Nature vs. What Your Experiment Observes

| What your experiment observes | Truth in nature: true null hypothesis (no difference between groups) | Truth in nature: true alternative hypothesis (difference between groups) |
| No difference between groups (fail to reject H0) | Correct outcome! True negative | False negative, Type II error (β) |
| Difference between groups (reject H0) | False positive, Type I error (α) | Correct outcome! True positive |

8 Type I Error. Type I Error: rejecting a null hypothesis when in fact it is true (false positive). Probability of a false positive = α; α = significance level. P value: the probability of observing results at least as extreme as those actually observed, by chance alone, if the null hypothesis is true. The smaller the p value, the stronger the evidence against the null. If the computed p value < α, then reject the null hypothesis.
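The decision rule on this slide can be sketched in a few lines. This is a minimal illustration that assumes a test statistic following a standard normal distribution; the function name `two_sided_p_value` and the observed statistic of 2.2 are hypothetical and do not come from the lecture.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided p value for a test statistic that is standard normal under the null."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


alpha = 0.05        # significance level (Type I error rate)
z_observed = 2.2    # hypothetical test statistic, for illustration only

p_value = two_sided_p_value(z_observed)
reject_null = p_value < alpha

print(f"p = {p_value:.4f}; reject null hypothesis: {reject_null}")
```

A statistic of exactly 1.96 sits right at the conventional two-sided 0.05 boundary, which is why that value recurs in the sample size formulas.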

9 Type II Error. Type II Error: when an experiment concludes that you cannot reject a null hypothesis when in fact the null hypothesis is incorrect (false negative). Probability of a false negative = β. 1 − β = POWER. Power: probability of a true positive result. Power = 1 − P(Type II Error) = 1 − β. It is a measure of a statistical test's ability to correctly reject a false null hypothesis. Type I Error is often considered more serious than Type II Error.
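To make the relationship between α, β, and power concrete, here is a rough sketch that uses a normal approximation for a two-group comparison of means. The function `power_two_sample_z` and the input values are illustrative assumptions; a real t-test calculation for small groups would give somewhat lower power.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(diff: float, sd: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Normal (z) approximation with a common SD and equal group sizes.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)        # two-sided critical value
    std_err = sd * sqrt(2.0 / n_per_group)       # SE of the difference in means
    shift = abs(diff) / std_err                  # true difference in SE units
    # Probability that the test statistic falls beyond either critical value.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


power = power_two_sample_z(diff=5.0, sd=3.0, n_per_group=6)   # illustrative inputs
beta = 1.0 - power                                            # Type II error rate

print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true difference is zero, the same function returns α, which is the Type I error rate: the only way to "detect" a difference is a false positive.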

10 Null Hypothesis: There is no wolf chasing the sheep. Type I Error (False Positive): the shepherd cries out "Wolf, wolf!" when there is no wolf. Cost assessment: help will not arrive the next time the shepherd cries wolf, since no one believes him. Type II Error (False Negative): there is a wolf, but the shepherd is too distracted to notice. Cost assessment: one of the sheep might get eaten.

11 Null Hypothesis: New drug is not better than old drug. Scenario for Type I Error (False Positive): the experimental group treated with the new drug just happens to do better than the group treated with the old drug, but the new drug really does not work that well. Cost assessment: the drug goes on to treat more patients, who don't get better and die. Scenario for Type II Error (False Negative): the experimental group treated with the new drug does not do better than the group treated with the old drug, but the new drug really does work better than the old drug. Cost assessment: the study stops and the new, better drug is abandoned without further testing.

12 Motivating Example. Mice treated with a known carcinogen develop colon tumors 6 months later. The experiment will test a new transgenic knockout mouse to see if the gene of interest is involved in the pathway for colon cancer. The new transgenic mouse should have more severe colon cancer. How many mice to use per group?

13 There is no magic number, no perfect sample size! The chosen sample size is a function of many factors.

14 Six Components to a Statistical Power Analysis. Typically set 5 items as fixed and solve for the 6th.

15 1. Choose the Type of Statistical Test. What is the data (probably) going to look like? What statistical test will you use for your primary hypothesis? Continuous & normally distributed: Student's t-test, ANOVA, linear regression, correlation. Continuous & skewed: Wilcoxon rank-sum test (Mann-Whitney U test). Counts & proportions: chi-square, Fisher's exact test. Survival time: Kaplan-Meier analysis, log-rank test. Rates: Poisson regression.

16 Parametric tests (e.g., t-test) generally have more power than non-parametric tests (e.g., rank-sum test). Continuous data has more power than categorical data. Avoid cut points or dichotomization of continuous data. Altman, Douglas G., and Patrick Royston. "The cost of dichotomising continuous variables." BMJ 332 (2006). MacCallum, Robert C., et al. "On the practice of dichotomization of quantitative variables." Psychological Methods 7 (2002): 19.

17 2. Choose your Significance Level. Set your α level. What test level is required? One-sided or two-sided? The conventional choice is two-sided α = 0.05. Any value can be selected, but justification is needed. Pilot projects sometimes select a higher α.

18 Two-Sided Test. H0: μ1 = μ2; HA: μ1 ≠ μ2. [Figure: two normal distributions, Mean=0 SD=0.5 and Mean=1 SD=0.25]

19 One-Sided Test. H0: μ1 ≥ μ2; HA: μ1 < μ2. [Figure: two normal distributions, Mean=0 SD=0.5 and Mean=1 SD=0.25] In general, most research is done with a two-sided test (the more conservative assumption). If a one-sided test is chosen, then cut alpha in half and set it to 0.025.
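The advice to halve alpha for a one-sided test can be checked numerically. This small sketch assumes a normally distributed test statistic; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()

# Critical values for a standard normal test statistic.
two_sided_005 = z.inv_cdf(1.0 - 0.05 / 2.0)   # two-sided test at alpha = 0.05
one_sided_005 = z.inv_cdf(1.0 - 0.05)         # one-sided test at alpha = 0.05
one_sided_0025 = z.inv_cdf(1.0 - 0.025)       # one-sided test at alpha = 0.025

print(f"two-sided, alpha 0.05:  {two_sided_005:.3f}")
print(f"one-sided, alpha 0.05:  {one_sided_005:.3f}")
print(f"one-sided, alpha 0.025: {one_sided_0025:.3f}")
```

A one-sided test at 0.025 uses the same cutoff as a two-sided test at 0.05, so halving alpha keeps the evidence threshold in the hypothesized direction unchanged.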

20 3. Expected Variation. Standard deviation (SD) can be estimated from: previous published research; the observed SD from a control group in another experiment; the coefficient of variation, i.e. SD as a percent of the mean; or the estimated range (min, max), as SD ≈ (max − min) / C, where typically C = 4, based on convention. Range is a poor substitute for having an estimated SD; use it only as a last resort.
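The two fallback estimates on this slide are simple arithmetic. The sketch below assumes the range rule takes the form SD ≈ (max − min) / C with C = 4; the function names and the example inputs are hypothetical.

```python
def sd_from_range(minimum: float, maximum: float, c: float = 4.0) -> float:
    """Rough SD estimate from a range, using the conventional divisor C = 4."""
    return (maximum - minimum) / c


def sd_from_cv(mean: float, cv_percent: float) -> float:
    """SD implied by a coefficient of variation given as a percent of the mean."""
    return mean * cv_percent / 100.0


# Hypothetical inputs, for illustration only.
sd_range = sd_from_range(minimum=0.0, maximum=12.0)
sd_cv = sd_from_cv(mean=5.0, cv_percent=58.0)

print(f"SD from range: {sd_range:.2f}")
print(f"SD from CV:    {sd_cv:.2f}")
```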

21 Variation: Normal Distribution. Large amounts of variation can affect precision in estimating the difference between group means. A larger sample size will be required with larger variation. [Figure panels: 2 ±1 vs 2 ±2 (n=6, n=6); 2 ±3 vs 2 ±3 (n=6, n=6); 2 ±3 vs 2 ±3 (n=10, n=10)]

22 Variation: Non-normal Distribution. Categorical data: the amount of variation depends on the size of the sample and the expected percent for each category. Skewed data with the possibility of extreme outliers: need the range (min, max) and median (50th percentile). Survival data: need the median survival time and the percent of censored observations (still alive at end of study).

23 4. Minimum Detectable Difference. The difference between group means or group proportions. The larger the difference between groups, the smaller the required sample size. Typically, the more groups there are to compare, the larger the required sample size per group.

24 What is Effect Size? Measurement of the magnitude of difference between groups as a function of the amount of variation. 2 groups effect size: >2 groups effect size: In ANOVA Small effect size = 0.1 Large effect size = for the regression model
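One widely used two-group effect size that matches the definition above (difference scaled by variation) is Cohen's d, the difference in means divided by the pooled standard deviation. That this is the formula the slide showed is an assumption; the function names and input values below are illustrative.

```python
from math import sqrt


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1: float, mean2: float, sd_pooled: float) -> float:
    """Standardized difference between two group means."""
    return (mean1 - mean2) / sd_pooled


# Hypothetical group summaries, for illustration only.
sd_p = pooled_sd(sd1=3.0, n1=6, sd2=3.0, n2=6)
d = cohens_d(mean1=10.0, mean2=5.0, sd_pooled=sd_p)

print(f"pooled SD = {sd_p:.2f}, Cohen's d = {d:.2f}")
```

The same raw difference yields a smaller effect size when the data are noisier, which is why the required sample size grows with variation.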

25 5. Power. Power is the probability that an experiment will find a true difference between groups if it exists. Power of 80% or greater is generally acceptable. Typically set at 80% or 90% when a fixed value is used in computing sample size. Higher power requires a larger sample size.

26 6. Sample Size. What is cost effective and reasonable to work with? Unbalanced sample size is acceptable: the treatment group might have more variation, requiring a larger sample size than the control group, or one group might not be as available. Should use more careful estimates to calculate true power.

27 Factors Affecting Power. Increase power: parametric tests; increased magnitude of difference between groups; bigger sample size. Decrease power: multiple groups to compare; increased variation in the sample; smaller p value required for statistical significance.

28 Writing a Power Analysis Must include all 6 elements: 1. Type of statistical test to be used 2. Significance level (one sided or two sided) 3. Estimated standard deviation 4. Minimum detectable difference 5. Expected Power 6. Sample size per treatment group

29 Motivating Example. Mice treated with a known carcinogen develop colon tumors 6 months later. The experiment will test new transgenic knockout mice to determine if the gene of interest is involved in the cancer pathway. Transgenic mice should get sicker than control mice if this gene is important. PI's question: How many mice to use per group? Is there any published data on control animals?

32 Example 100% of the mice treated with the carcinogen develop colon tumors. Mean ± SD number of tumors per mouse = 5.0 ± 2.9 Mean ± SD tumor size per tumor = 3.5 ± 0.13 mm PI would prefer to work with no more than 10 mice per group for cost. New transgenic mouse to be studied should have worse colon cancer if gene knocked out is part of colon cancer pathway.

33 Null Hypothesis: Transgenic mouse has colon cancer just as bad as control group. Scenario for Type I Error (False Positive): the transgenic mouse just happens to show worse colon cancer in this experiment, but the new mouse strain really is not any different from the regular mouse. Cost assessment: results get published, but are retracted later when future studies show futility. Scenario for Type II Error (False Negative): transgenic mice do not get any sicker than control mice in our experiment, but the transgenic mouse really does have a higher rate of colon cancer. Cost assessment: the study stops and no new experiments are planned with the new mouse breed; other genes are studied.

34 Null Hypothesis: Transgenic mouse has colon cancer just as bad as control group. Risk of the Type I Error (False Positive) scenario = α, set to 5%. Risk of the Type II Error (False Negative) scenario = β, set to 20%.

37 Final Power Analysis for NIH Grant. For Experiment 2.3: With this treatment (AOM 10 mg/kg i.p., once a week for 4 weeks) mice usually develop multiple colon tumors with 100% penetrance. When mice are sacrificed we will count the number of tumors and the average tumor size per animal. According to published data, mice develop ~5 (sd=3) tumors/mouse with an average size of 3.5 mm (sd=0.13). 32 We will sacrifice these mice 5 weeks after the first injection. We expect that Car1/GH/IRES/GFP+/+ mice will develop more tumors and that the tumors will be bigger. We have therefore chosen 6 animals per group to achieve 80% power to detect an average increase of 5 tumors per mouse and a 0.2 mm increase in average size in a two-sided two-sample t-test at the 0.05 significance level.

38 Example, part 2. Test of a new drug to potentially stop colon cancer using the same mouse model. 100% of mice treated with a known carcinogen develop colon tumors 6 months later. How many mice do we need to test a decrease in the colon cancer incidence rate?

40 Power Analysis Results: Based on the assumption that our control group of mice will have an incidence rate of ~100% in developing colon cancer, a future study of the new drug with 10 animals in each group will have 80% power in a two-sided Fisher's Exact Test at the 0.05 significance level, assuming the treatment group will have an incidence level of 40% or lower.
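The power statement above can be approximately reproduced by enumerating every possible outcome of the two groups and applying a two-sided Fisher's exact test to each. This is a sketch, not the software used in the lecture: the helper names are hypothetical, the control incidence is taken as exactly 100%, and the two-sided p value sums the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_two_sided_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(table_prob(x) for x in range(low, high + 1)
               if table_prob(x) <= p_observed * (1 + 1e-9))


def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def fisher_power(n_per_group: int, p_control: float, p_treated: float,
                 alpha: float = 0.05) -> float:
    """Exact power: total probability of all outcomes with Fisher p < alpha."""
    power = 0.0
    for x1 in range(n_per_group + 1):          # events in control group
        for x2 in range(n_per_group + 1):      # events in treated group
            weight = binomial_pmf(x1, n_per_group, p_control) * \
                     binomial_pmf(x2, n_per_group, p_treated)
            if weight == 0.0:
                continue
            p = fisher_two_sided_p(x1, n_per_group - x1, x2, n_per_group - x2)
            if p < alpha:
                power += weight
    return power


power = fisher_power(n_per_group=10, p_control=1.0, p_treated=0.4)
print(f"power = {power:.3f}")
```

Under these assumptions the test rejects whenever 5 or fewer of the 10 treated mice develop tumors, which gives power a little above the 80% target.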

41 All NIH grants and IRB & IACUC protocols are reviewed by a biostatistician who knows what a properly written sample size justification should look like.

42 Some Common Mistakes. Common misconception: p<0.05 means the probability that the null hypothesis is true is less than 0.05. Correct interpretation: a smaller p value indicates that the weight of the evidence against the null hypothesis is stronger, and that the result is not simply random chance. It is NOT "more statistically significant." Differences are either statistically significant or they are not.

43 Using Pilot Data Inappropriately. Pilot data should be an independent collection of data analyzed in planning a future study. It should not be a small sample that failed to demonstrate significant differences and to which more cases will be added. If the plan is to increase the sample size of an initial, smaller failed experiment, that decision should be made a priori, with a plan for interim analysis, alpha spending rules, etc.

44 More Common Mistakes. Common misconception: doubling the sample cuts the standard deviation in half. One-sided versus two-sided. Non-random sampling. Dichotomizing or classifying data extracted from a continuous variable.
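A quick calculation shows why the first misconception is wrong. Sample size does not change the standard deviation of the data; it changes the standard error of the mean, and only by a factor of the square root of n. The SD value of 2.9 below is used purely for illustration.

```python
from math import sqrt


def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean."""
    return sd / sqrt(n)


sd = 2.9  # illustrative standard deviation

se_10 = standard_error(sd, 10)
se_20 = standard_error(sd, 20)   # doubled sample
se_40 = standard_error(sd, 40)   # quadrupled sample

print(f"n=10: SE = {se_10:.3f}")
print(f"n=20: SE = {se_20:.3f} (ratio {se_20 / se_10:.3f})")
print(f"n=40: SE = {se_40:.3f} (ratio {se_40 / se_10:.3f})")
```

Doubling the sample shrinks the standard error by a factor of about 0.71; it takes four times the sample to cut it in half.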

45 Cost of Dichotomization

46 The Error of Post Hoc Power Analysis. When a study has negative results, it is inappropriate to calculate how much power the study had. All statistical tests that have p>0.05 will have poor observed power using the data at hand. The implied conclusion of post hoc power is that the effect observed could be real but the sample size was too small. However, the observed mean will vary with each trial. Instead, the confidence interval should be computed for the observed difference between groups and interpreted as such. Hoenig, John M., and Dennis M. Heisey. "The abuse of power." The American Statistician 55.1 (2001).

47 Simple Formula for Difference Between Two Means (not appropriate for small sample sizes)

$$n = \frac{2\sigma^{2}\,\left(Z_{\beta} + Z_{\alpha/2}\right)^{2}}{\text{difference}^{2}}$$

where $n$ is the sample size in each group (assumes equal sized groups), $\sigma$ is the standard deviation of the outcome variable, $Z_{\beta}$ represents the desired power (typically 0.84 for 80% power), $Z_{\alpha/2}$ represents the desired level of statistical significance (typically 1.96), and the difference is the effect size (the difference in means).
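As a worked sketch of this formula (sample size per group equals twice the variance times the squared sum of the two Z values, divided by the squared difference), the function below plugs in the tumor-count figures from the grant example (difference 5, SD 3) and the inputs from the PS demo (difference 4, SD 2.9). The function name is hypothetical, and because this is a normal approximation it will tend to run slightly below what t-test based software reports for small groups.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(diff: float, sd: float, alpha: float = 0.05,
                      power: float = 0.80) -> float:
    """Per-group sample size for comparing two means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)   # about 1.96 for two-sided alpha = 0.05
    z_beta = z.inv_cdf(power)                # about 0.84 for 80% power
    return 2.0 * sd ** 2 * (z_alpha + z_beta) ** 2 / diff ** 2


# Tumor count example: detect an increase of 5 tumors per mouse, SD = 3.
n_tumor_count = ceil(n_per_group_means(diff=5.0, sd=3.0))

# PS demo inputs: difference in group means = 4, SD = 2.9.
n_ps_demo = ceil(n_per_group_means(diff=4.0, sd=2.9))

print(f"tumor count example: {n_tumor_count} mice per group")
print(f"PS demo inputs:      {n_ps_demo} mice per group")
```

Rounding up is deliberate: a fractional animal cannot be enrolled, and rounding down would leave the study short of the target power.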

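48 Simple Formula for Difference Between Two Proportions (not appropriate for small sample sizes)

$$n = \frac{2\,\bar{p}\,(1-\bar{p})\,\left(Z_{\beta} + Z_{\alpha/2}\right)^{2}}{(p_{1} - p_{2})^{2}}$$

where $n$ is the sample size in each group (assumes equal sized groups), $\bar{p}(1-\bar{p})$ is a measure of variability (similar to standard deviation) with $\bar{p}$ the average of the two proportions, $Z_{\beta}$ represents the desired power (typically 0.84 for 80% power), $Z_{\alpha/2}$ represents the desired level of statistical significance (typically 1.96), and $p_{1} - p_{2}$ is the effect size (the difference in proportions).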
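The proportions formula can be sketched the same way. The example plugs in the incidence rates from the drug example (about 100% in controls versus 40% in treated mice); the function name is hypothetical, the average proportion is taken as the simple mean of the two rates, and the result is only a large-sample approximation to what an exact test would require.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p1: float, p2: float, alpha: float = 0.05,
                            power: float = 0.80) -> float:
    """Per-group sample size for comparing two proportions (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2.0                  # average of the two proportions
    return 2.0 * p_bar * (1.0 - p_bar) * (z_alpha + z_beta) ** 2 / (p1 - p2) ** 2


# Drug example: incidence of 100% in controls versus 40% in treated mice.
n_incidence = ceil(n_per_group_proportions(p1=1.0, p2=0.4))

print(f"{n_incidence} mice per group")
```

Smaller differences in incidence drive the denominator toward zero, so the required sample size climbs quickly as the expected treatment effect shrinks.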

49 Online Calculators

50 Calculations are based on formulas for the 1 sample Z test

53 PS: Power and Sample Size Demo Free PS software (v 3.1.2) available at: plesize Dupont WD, Plummer WD: 'Power and Sample Size Calculations: A Review and Computer Program', Controlled Clinical Trials 1990; 11:

54 Step 1: Choose your type of statistical test

55 Step 2: Choose what to solve for. Step 3: Input parameters: 2-group t-test; significance level = 0.05; standard deviation = 2.9; difference in group means = 4; power = 0.80; m = ratio of samples in each group.

56 Auto generated sample size justification paragraph!!

57 G*Power Demo Free software (v ) available at: Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39,

58 Excellent G*Power Tutorial:

59 G*Power: Mann-Whitney (rank test). Non-parametric test of two group means.

62 G*Power ANOVA. ANOVA: 3 groups, 10 per group, alpha = 0.05, power = 0.80.

63 Minimum detectable effect size = 0.6

64 Absence of Evidence is not Evidence of Absence

65

Sample Size Planning, Calculation, and Justification

Sample Size Planning, Calculation, and Justification Sample Size Planning, Calculation, and Justification Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa

More information

Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities. Consider a study in which How many subjects? The importance of sample size calculations Office of Research Protections Brown Bag Series KB Boomer, Ph.D. Director, boomer@stat.psu.edu A researcher conducts

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

More information

Study Design and Statistical Analysis

Study Design and Statistical Analysis Study Design and Statistical Analysis Anny H Xiang, PhD Department of Preventive Medicine University of Southern California Outline Designing Clinical Research Studies Statistical Data Analysis Designing

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

Version 4.0. Statistics Guide. Statistical analyses for laboratory and clinical researchers. Harvey Motulsky

Version 4.0. Statistics Guide. Statistical analyses for laboratory and clinical researchers. Harvey Motulsky Version 4.0 Statistics Guide Statistical analyses for laboratory and clinical researchers Harvey Motulsky 1999-2005 GraphPad Software, Inc. All rights reserved. Third printing February 2005 GraphPad Prism

More information

Principles of Hypothesis Testing for Public Health

Principles of Hypothesis Testing for Public Health Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions

More information

Correlational Research

Correlational Research Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

More information

Research Methods & Experimental Design

Research Methods & Experimental Design Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

UNIVERSITY OF NAIROBI

UNIVERSITY OF NAIROBI UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing. Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191.

for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191. Example of a Statistical Power Calculation (p. 206) Photocopiable This example uses the statistical power software package G*Power 3. I am grateful to the creators of the software for giving their permission

More information

Point Biserial Correlation Tests

Point Biserial Correlation Tests Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

BIOM611 Biological Data Analysis

BIOM611 Biological Data Analysis BIOM611 Biological Data Analysis Spring, 2015 Tentative Syllabus Introduction BIOMED611 is a ½ unit course required for all 1 st year BGS students (except GCB students). It will provide an introduction

More information

1-3 id id no. of respondents 101-300 4 respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank

1-3 id id no. of respondents 101-300 4 respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank Basic Data Analysis Graziadio School of Business and Management Data Preparation & Entry Editing: Inspection & Correction Field Edit: Immediate follow-up (complete? legible? comprehensible? consistent?

More information

Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

More information

Data Analysis, Research Study Design and the IRB

Data Analysis, Research Study Design and the IRB Minding the p-values p and Quartiles: Data Analysis, Research Study Design and the IRB Don Allensworth-Davies, MSc Research Manager, Data Coordinating Center Boston University School of Public Health IRB

More information

2 Precision-based sample size calculations

2 Precision-based sample size calculations Statistics: An introduction to sample size calculations Rosie Cornish. 2006. 1 Introduction One crucial aspect of study design is deciding how big your sample should be. If you increase your sample size

More information

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) 1 Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship) I. Authors should report effect sizes in the manuscript and tables when reporting

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information
