Statistics in Medicine Research Lecture Series CSMC Fall 2014


1 Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014
2 Overview Review concept of statistical power Factors influencing power Examples of sample size calculations(test of means, test of proportions) Writing a sample size justification Software tutorial
3 Power in Statistical Testing Metaphorically, the concept of statistical power is like a magnifying glass. The more powerful a magnifying glass is, the greater the ability to show greater detail. A more powerful study can better reveal a significant result.
4 Why do a Power Analysis? Optimize sample size to economize on costs. Determine the sample size for a clinically relevant and statistically significant difference. Determine the minimally detectable difference for a fixed sample size. Determine if the study is justifiably worth doing. Required for NIH Grant Proposals IRB / IACUC Protocols
5 Hypothesis Testing
6 Hypothesis Testing Typically statistical testing is to disprove a falsehood. Goal: Proving two groups are different. More challenging to prove equality! (Tests of equivalence require a different approach than typical statistical testing and much larger sample size.) Similar to Presumed innocent until proven guilty We assume treatment groups are not different until statistically proven different.
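7 THE TRUTH IN NATURE versus What Your Experiment Observes
True Null Hypothesis (No Difference Between Groups), experiment observes No Difference Between Groups (fail to reject Ho): Correct Outcome! True Negative
True Null Hypothesis (No Difference Between Groups), experiment observes Difference Between Groups (reject Ho): False Positive, Type I Error, α
True Alternative Hypothesis (Difference Between Groups), experiment observes No Difference Between Groups (fail to reject Ho): False Negative, Type II Error, β
True Alternative Hypothesis (Difference Between Groups), experiment observes Difference Between Groups (reject Ho): Correct Outcome! True Positive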
8 Type I Error Type I Error: reject a null hypothesis when in fact it is true (False Positive) Probability of False Positive = α α = significance level P value: Probability of observing results as extreme as, or more extreme than, those actually observed, by chance alone, if the null hypothesis is true. The smaller the p value, the stronger the evidence against the null. If computed p value < α then reject null hypothesis
9 Type II Error Type II Error: when an experiment concludes that you cannot reject a null hypothesis when in fact the null hypothesis is incorrect (False Negative) Probability of False Negative = β 1 − β = POWER Power: Probability of a true positive result. Power = 1 − P(Type II Error) = 1 − β Measure of a statistical test's ability to correctly reject a false null hypothesis. Type I Error is often considered more serious than Type II Error
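The relationship between α, β, and power can be sketched numerically. The following is a minimal illustration, assuming a two-sided two-sample z test with a known, common standard deviation (a large-sample approximation, not the exact t test). The function name and the example inputs are hypothetical and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z test.

    Assumes equal group sizes and a known common SD (illustrative only).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # 1.96 when alpha = 0.05
    shift = abs(diff) / (sd * sqrt(2 / n_per_group))
    # Probability that the test statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical inputs: true difference of 1 SD, 16 subjects per group
power = power_two_sample_z(diff=1.0, sd=1.0, n_per_group=16)
beta = 1 - power                                  # Type II error rate

# When the null hypothesis is true (diff = 0), the rejection rate equals alpha
false_positive_rate = power_two_sample_z(diff=0.0, sd=1.0, n_per_group=16)
print(f"power={power:.3f}, beta={beta:.3f}, alpha={false_positive_rate:.3f}")
```

With a true difference of zero the same function returns α, which is the Type I error rate by construction.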
10 Null Hypothesis There is no wolf chasing the sheep. Type I Error (False Positive) Shepherd cries out Wolf, Wolf when there is no wolf. Type II Error (False Negative) There is a wolf, but shepherd is too distracted to notice. Cost Assessment Help will not arrive next time shepherd cries wolf since no one believes him. One of the sheep might get eaten.
11 Null Hypothesis New drug is not better than old drug. Scenario for Type I Error (False Positive) Experimental group treated with new drug just happens to do better than the group treated with old drug, but the new drug really does not work that well. Scenario for Type II Error (False Negative) Experimental group treated with new drug does not do better than the group treated with old drug, but the new drug really does work better than old drug. Cost Assessment Drug goes on to treat more patients that don't get better and die. Study stops and the new better drug is abandoned without further testing.
12 Motivating Example Mice treated with a known carcinogen develop colon tumors 6 months later. Experiment will test a new transgenic knockout mouse to see if gene of interest is involved in the pathway for colon cancer. New transgenic mouse should have more severe colon cancer. How many mice to use per group?
13 There is no magic number, no perfect sample size! The chosen sample size is a function of many factors.
14 Six Components to a Statistical Power Analysis Typically set 5 items as fixed and solve for the 6th.
15 1. Choose the type of Statistical Test What is the data (probably) going to look like? What statistical test for your primary hypothesis? Continuous & normally distributed -> Student's t test, ANOVA, Linear Regression, Correlation. Continuous & skewed -> Wilcoxon Rank Sum Test, Mann-Whitney U test. Counts & proportions -> Chi-square, Fisher's Exact test. Survival time -> Kaplan-Meier test, Log-rank test. Rates -> Poisson regression.
16 Parametric tests (e.g., t test) have more power than non parametric tests (rank sum test). Continuous data has more power than categorical. Avoid cut points or dichotomization of continuous data. Altman, Douglas G., and Patrick Royston. "The cost of dichotomising continuous variables." BMJ 332 (2006): MacCallum, Robert C., et al. "On the practice of dichotomization of quantitative variables." Psychological Methods 7(2002): 19.
17 2. Choose your Significance Level Set your α level. What test level is required? One sided or two sided? Conventional choice is two sided α = 0.05 Any value can be selected, justification needed. Pilot projects sometimes select higher α
18 Two Sided Test Ho: μ1 = μ2 Ha: μ1 ≠ μ2 [Figure: two distributions, Mean=0 SD=0.5 and Mean=1 SD=0.25]
19 One Sided Test Ho: μ1 ≥ μ2 Ha: μ1 < μ2 [Figure: two distributions, Mean=0 SD=0.5 and Mean=1 SD=0.25] In general, most research is done with a two sided test (more conservative assumption). If one sided is chosen, then cut alpha in half and set to 0.025
20 3. Expected Variation Standard deviation (SD) can be estimated by: Previous published research. Observed SD from a control group in another experiment. Coefficient of Variation, i.e. SD as a percent of the mean. Estimated range: SD ≈ (max − min) / C, typically C=4, based on convention. Range is a poor substitute for having an estimated SD; use only as a last resort.
21 Variation Normal Distribution Large amounts of variation can affect precision in estimating the difference between group means. Larger sample size will be required with larger variation. [Figure panels: 2 ±1 vs 2 ±2, n=6 per group; 2 ±3 vs 2 ±3, n=6 per group; 2 ±3 vs 2 ±3, n=10 per group]
22 Variation Non normal Distribution Categorical data: the amount of variation is dependent on the size of the sample and the expected percent for each category. Skewed data: where extreme outliers are possible, need the range (min, max) and median (50th percentile). Survival data: need median survival time and percent of censored observations (still alive at end of study).
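The two fallback estimates on this slide can be written out as a short sketch. It assumes the conventional range rule SD ≈ (max − min) / C with C = 4, and reads the coefficient of variation as SD expressed as a percent of the mean. The function names and example numbers are hypothetical.

```python
def sd_from_range(minimum, maximum, c=4.0):
    """Rough SD estimate from an expected range; C=4 is a convention, not a law."""
    return (maximum - minimum) / c


def sd_from_cv(mean, cv_percent):
    """SD implied by a coefficient of variation given as a percent of the mean."""
    return mean * cv_percent / 100.0


# Hypothetical planning values: tumor counts expected to span 0 to 12,
# or a mean of 5 tumors with a 60% coefficient of variation.
print(sd_from_range(0, 12))   # 3.0
print(sd_from_cv(5.0, 60))    # 3.0
```

Both are last-resort estimates; a published or pilot SD is preferable when one is available.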
23 4. Minimum Detectable Difference Difference between group means, group proportions Larger the difference between groups, smaller the required sample size. Typically, the more groups to compare, larger the required sample size per group.
24 What is Effect Size? Measurement of the magnitude of difference between groups as a function of the amount of variation. 2 groups effect size: >2 groups effect size: In ANOVA Small effect size = 0.1 Large effect size = for the regression model
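The formulas on this slide did not survive transcription. As a hedged illustration only, one common two-group effect size is the standardized mean difference (Cohen's d), the difference in means divided by the pooled standard deviation. Whether this is the exact formula shown on the slide is an assumption, and the function names and example values below are hypothetical.

```python
from math import sqrt


def pooled_sd(sd1, sd2, n1, n2):
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1, mean2, sd):
    """Standardized mean difference: difference in means per unit of SD."""
    return (mean1 - mean2) / sd


# Hypothetical example: group means of 9 and 5 tumors, SD of 2.9 in both groups
sd = pooled_sd(2.9, 2.9, 10, 10)
d = cohens_d(9.0, 5.0, sd)
print(f"pooled SD = {sd:.2f}, d = {d:.2f}")
```

The same difference in means gives a smaller d when the variation is larger, which is why more variable outcomes need larger samples.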
25 5. Power Power is the probability an experiment will find a true difference between groups if it exists. Anything greater than 80% generally acceptable. Typically set at 80% or 90% when fixed value used in computing sample size. Higher power requires larger sample size
26 6. Sample Size What is cost effective, reasonable to work with? Unbalanced sample size is acceptable. Treatment group might have more variation requiring larger sample size than control group One group might not be as available. Should use more careful estimates to calculate true power.
27 Factors Affecting Power Parametric tests Multiple groups to compare Increased magnitude of difference between groups Increased variation in the sample Bigger sample size Smaller p value required for statistical significance
28 Writing a Power Analysis Must include all 6 elements: 1. Type of statistical test to be used 2. Significance level (one sided or two sided) 3. Estimated standard deviation 4. Minimum detectable difference 5. Expected Power 6. Sample size per treatment group
29 Motivating Example Mice treated with a known carcinogen develop colon tumors 6 months later. Experiment will test new transgenic knockout mice to determine if the gene of interest is involved in the cancer pathway. Transgenic mice should get sicker than control mice if this gene is important. PI's question: How many mice to use per group? Is there any published data on control animals?
32 Example 100% of the mice treated with the carcinogen develop colon tumors. Mean ± SD number of tumors per mouse = 5.0 ± 2.9 Mean ± SD tumor size per tumor = 3.5 ± 0.13 mm PI would prefer to work with no more than 10 mice per group for cost. New transgenic mouse to be studied should have worse colon cancer if gene knocked out is part of colon cancer pathway.
33 Null Hypothesis Transgenic mouse has colon cancer just as bad as control group. Scenario for Type I Error (False Positive) Transgenic mouse just happens to show worse colon cancer in this experiment, but the new mouse strain really is not any different from the regular mouse. Cost Assessment: Results get published, but then retracted later when future studies show futility. Scenario for Type II Error (False Negative) Transgenic mice do not get any sicker than control mice in our experiment, but the transgenic mouse really does have a higher rate of colon cancer. Cost Assessment: Study stops and no new experiments are planned with the new mouse breed; other genes are studied.
34 Null Hypothesis Transgenic mouse has colon cancer just as bad as control group. Scenario for Type I Error (False Positive) Transgenic mouse just happens to show worse colon cancer in this experiment, but the new mouse strain really is not any different from the regular mouse. Cost Assessment: Results get published, but then retracted later when future studies show futility. Risk of this Scenario = α Set to 5% Scenario for Type II Error (False Negative) Transgenic mice do not get any sicker than control mice in our experiment, but the transgenic mouse really does have a higher rate of colon cancer. Cost Assessment: Study stops and no new experiments are planned with the new mouse breed; other genes are studied. Risk of this Scenario = β Set to 20%
37 Final Power Analysis for NIH Grant For Experiment 2.3: With this treatment (AOM 10 mg/kg i.p., once a week for 4 weeks) mice usually develop multiple colon tumors with 100% penetrance. When mice are sacrificed we will count the number of tumors and the average size of the tumors per animal. According to published data mice develop ~5 (sd=3) tumors/mouse with the average size 3.5mm (sd=0.13). 32 We will sacrifice these mice 5 weeks after the first injection. We expect that Car1/GH/IRES/GFP+/+ mice develop more tumors and the size of tumors will be bigger. We have therefore chosen 6 animals per group to achieve 80% power to detect an average increase of 5 tumors per mouse and 0.2mm increase in average size in a two-sided two-sample t test at the 0.05 significance level.
38 Example, part 2 Test of new drug to potentially stop colon cancer using same mouse model. 100% of mice treated with a known carcinogen develop colon tumors 6 months later. How many mice do we need to test a decrease of colon cancer incidence rate?
40 Power Analysis Results: Based on the assumption that our control group of mice will have an incidence rate at ~100% in developing colon cancer, a future study of the new drug with 10 animals in each group will have 80% power in a two sided Fisher's Exact Test at the 0.05 significance level, assuming the treatment group will have an incidence level of 40% or lower.
41 All NIH grants, IRB & IACUC protocols are reviewed by a biostatistician who knows what a properly written sample size justification should look like.
42 Some Common Mistakes Common misconception: p<0.05 means the probability that the null hypothesis is true is less than 0.05. Correct interpretation: a smaller p value indicates that the weight of the evidence against the null hypothesis is stronger, and that the result is not simply due to random chance. It is NOT more statistically significant. Differences are either statistically significant or they are not.
43 Using Pilot Data Inappropriately Pilot data should be an independent collection of data analyzed to plan a future study. It should not be a small sample that failed to demonstrate significant differences and to which more cases will then be added. If the plan is to increase the sample size of an initial, smaller failed experiment, that decision should be made a priori, with a plan for interim analysis, alpha spending rules, etc.
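The power statement on this slide can be checked by exact enumeration. The sketch below assumes the control incidence is exactly 100% (the slide says ~100%), a treated incidence of exactly 40%, and the usual two-sided Fisher p value that sums all tables no more probable than the observed one. The function names are illustrative, and this is not the software used in the lecture.

```python
from math import comb


def fisher_two_sided_p(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_power(n_per_group, p_control, p_treated, alpha=0.05):
    """Exact power: total probability of outcomes where Fisher's test rejects."""
    n = n_per_group
    power = 0.0
    for x1 in range(n + 1):              # tumor-bearing mice in control group
        for x2 in range(n + 1):          # tumor-bearing mice in treated group
            weight = binom_pmf(x1, n, p_control) * binom_pmf(x2, n, p_treated)
            if weight > 0 and fisher_two_sided_p(x1, n - x1, x2, n - x2) < alpha:
                power += weight
    return power


print(f"{fisher_power(10, 1.0, 0.4):.3f}")
```

Under these assumptions the computed power is slightly above 80%, in line with the slide's statement for 10 animals per group.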
44 More Common Mistakes Common misconception: Doubling the sample cuts the standard deviation in half One sided versus two sided Non random sampling Dichotomizing or classifying data extracted from a continuous variable
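The first misconception on this slide can be made concrete with a small sketch. Sample size does not change the standard deviation of the data. It changes the standard error of the mean, which shrinks with the square root of n, so doubling the sample does not halve it. The function name and numbers are illustrative.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of a sample mean; the SD itself does not depend on n."""
    return sd / sqrt(n)


sd = 3.0
for n in (10, 20, 40):
    print(n, round(standard_error(sd, n), 3))
# Doubling n (10 -> 20) divides the standard error by sqrt(2), about 1.41;
# the sample has to be quadrupled (10 -> 40) to cut it in half.
```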
45 Cost of Dichotomization
46 The Error of Post Hoc Power Analysis When a study has negative results it is inappropriate to calculate how much power the study had. All statistical tests that have p>0.05 will have poor observed power using the data at hand. The implied conclusion of post hoc power is that the effect observed could be real but the sample size was too small. However, the observed mean will vary with each trial. Instead, the confidence interval should be computed for the observed difference between groups and interpreted as such. Hoenig, John M., and Dennis M. Heisey. "The abuse of power." The American Statistician 55.1 (2001).
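The recommended alternative, a confidence interval for the observed difference, can be sketched as follows. This is a minimal illustration using a normal (z) approximation; with small groups a t-based interval would be wider. The function name and the example data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Approximate confidence interval for mean1 - mean2 (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Hypothetical negative study: 6.5 vs 5.0 tumors per mouse, SD 2.9, 10 per group
lower, upper = diff_ci(6.5, 2.9, 10, 5.0, 2.9, 10)
print(f"difference = 1.5, 95% CI ({lower:.2f}, {upper:.2f})")
```

An interval that includes zero but also includes large differences says the study was inconclusive, which is more informative than an observed-power figure.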
47 Simple Formula for Difference Between two Means (not appropriate for small sample sizes)
$$n = \frac{2\sigma^{2}\,(Z_{\beta} + Z_{\alpha/2})^{2}}{\text{difference}^{2}}$$
$n$: sample size in each group (assumes equal sized groups). $\sigma$: standard deviation of the outcome variable. $Z_{\beta}$: represents the desired power (typically 0.84 for 80% power). $Z_{\alpha/2}$: represents the desired level of statistical significance (typically 1.96). difference: effect size (the difference in means).
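A minimal sketch of this large-sample formula, n = 2σ²(Zβ + Zα/2)² / difference², is given here as an illustration. The function name is hypothetical. The example inputs (SD 3 with a difference of 5, and SD 2.9 with a difference of 4) reuse values that appear elsewhere in this lecture. Because the formula is a normal approximation, exact t test software can return slightly larger numbers for small groups.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(sd, difference, alpha=0.05, power=0.80):
    """Sample size per group for comparing two means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # about 1.96 for two-sided alpha = 0.05
    z_beta = z.inv_cdf(power)            # about 0.84 for 80% power
    n = 2 * sd**2 * (z_alpha + z_beta) ** 2 / difference**2
    return ceil(n)                       # round up to whole animals


print(n_per_group_means(sd=3.0, difference=5.0))              # 6
print(n_per_group_means(sd=2.9, difference=4.0))              # 9
print(n_per_group_means(sd=2.9, difference=4.0, power=0.90))  # 12
```

Raising the power from 80% to 90% with everything else fixed increases the required sample size.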
48 Simple Formula for Difference Between two Proportions (not appropriate for small sample sizes)
$$n = \frac{2\,\bar{p}\,(1-\bar{p})\,(Z_{\beta} + Z_{\alpha/2})^{2}}{(p_{1} - p_{2})^{2}}$$
$n$: sample size in each group (assumes equal sized groups). $\bar{p}(1-\bar{p})$: a measure of variability (similar to standard deviation). $Z_{\beta}$: represents the desired power (typically 0.84 for 80% power). $Z_{\alpha/2}$: represents the desired level of statistical significance (typically 1.96). $p_{1} - p_{2}$: effect size (the difference in proportions).
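The proportions formula can be sketched the same way. This illustration assumes that p̄ is the simple average of the two group proportions, which is the usual convention but is not spelled out on the slide. The function name and the example proportions (50% versus 30%) are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Sample size per group for comparing two proportions (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2                # assumed: average of the two proportions
    n = 2 * p_bar * (1 - p_bar) * (z_alpha + z_beta) ** 2 / (p1 - p2) ** 2
    return ceil(n)


print(n_per_group_proportions(0.5, 0.3))   # 95
```

As the slide notes, this approximation is not appropriate for small samples; for group sizes like those in the mouse example an exact test is the safer basis.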
49 Online Calculators
50 Calculations are based on formulas for the 1 sample Z test
51 Calculations are based on formulas for the 1 sample Z test
52 Online Calculators
53 PS: Power and Sample Size Demo Free PS software (v 3.1.2) available at: plesize Dupont WD, Plummer WD: 'Power and Sample Size Calculations: A Review and Computer Program', Controlled Clinical Trials 1990; 11:
54 Step 1: Choose your type of statistical test
55 Step 2: Choose what to solve for Step 3: Input parameters 2 group t test significance level = 0.05 standard deviation = 2.9 difference in group means = 4 power = 0.80 m= ratio of samples in each group
56 Auto generated sample size justification paragraph!!
57 G*Power Demo Free software (v ) available at: Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39,
58 Excellent G*Power Tutorial:
59 G*Power Mann Whitney (rank test) Non parametric test of two group means
61 Non parametric test of two group means
62 G*Power ANOVA ANOVA 3 groups 10 per group Alpha = 0.05 Power = 0.80
63 Minimum detectable effect size = 0.6
64 Absence of Evidence is not Evidence of Absence
More informationSome Essential Statistics The Lure of Statistics
Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived
More informationQUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS
QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.
More informationParametric and Nonparametric: Demystifying the Terms
Parametric and Nonparametric: Demystifying the Terms By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD
More informationExample: Boats and Manatees
Figure 96 Example: Boats and Manatees Slide 1 Given the sample data in Table 91, find the value of the linear correlation coefficient r, then refer to Table A6 to determine whether there is a significant
More informationHYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationLikelihood Approaches for Trial Designs in Early Phase Oncology
Likelihood Approaches for Trial Designs in Early Phase Oncology Clinical Trials Elizabeth GarrettMayer, PhD Cody Chiuzan, PhD Hollings Cancer Center Department of Public Health Sciences Medical University
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationTests for Two Survival Curves Using Cox s Proportional Hazards Model
Chapter 730 Tests for Two Survival Curves Using Cox s Proportional Hazards Model Introduction A clinical trial is often employed to test the equality of survival distributions of two treatment groups.
More informationChi Squared and Fisher's Exact Tests. Observed vs Expected Distributions
BMS 617 Statistical Techniques for the Biomedical Sciences Lecture 11: ChiSquared and Fisher's Exact Tests Chi Squared and Fisher's Exact Tests This lecture presents two similarly structured tests, Chisquared
More informationSkewed Data and Nonparametric Methods
0 2 4 6 8 10 12 14 Skewed Data and Nonparametric Methods Comparing two groups: ttest assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted
More information1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
More informationChapter 7 Section 1 Homework Set A
Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the
More informationProjects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
More informationHYPOTHESIS TESTING WITH SPSS:
HYPOTHESIS TESTING WITH SPSS: A NONSTATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER
More informationStatistical issues in the analysis of microarray data
Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data
More informationPearson's Correlation Tests
Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation
More informationNonInferiority Tests for One Mean
Chapter 45 NonInferiority ests for One Mean Introduction his module computes power and sample size for noninferiority tests in onesample designs in which the outcome is distributed as a normal random
More informationIntroduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters  they must be estimated. However, we do have hypotheses about what the true
More informationData Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools
Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................
More informationAnalysis and Interpretation of Clinical Trials. How to conclude?
www.eurordis.org Analysis and Interpretation of Clinical Trials How to conclude? Statistical Issues Dr Ferran Torres Unitat de Suport en Estadística i Metodología  USEM Statistics and Methodology Support
More informationNCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, twosample ttests, the ztest, the
More informationPermutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. JaeWan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
More informationSTA201TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance
Principles of Statistics STA201TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis
More informationGuide to Biostatistics
MedPage Tools Guide to Biostatistics Study Designs Here is a compilation of important epidemiologic and common biostatistical terms used in medical research. You can use it as a reference guide when reading
More informationTesting Hypotheses About Proportions
Chapter 11 Testing Hypotheses About Proportions Hypothesis testing method: uses data from a sample to judge whether or not a statement about a population may be true. Steps in Any Hypothesis Test 1. Determine
More informationIntroduction to Hypothesis Testing OPRE 6301
Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about
More informationwww.rmsolutions.net R&M Solutons
Ahmed Hassouna, MD Professor of cardiovascular surgery, AinShams University, EGYPT. Diploma of medical statistics and clinical trial, Paris 6 university, Paris. 1A Choose the best answer The duration
More informationCONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
More informationSample size estimation is an important concern
Sample size and power calculations made simple Evie McCrumGardner Background: Sample size estimation is an important concern for researchers as guidelines must be adhered to for ethics committees, grant
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationDesign and Analysis of Phase III Clinical Trials
Cancer Biostatistics Center, Biostatistics Shared Resource, Vanderbilt University School of Medicine June 19, 2008 Outline 1 Phases of Clinical Trials 2 3 4 5 6 Phase I Trials: Safety, Dosage Range, and
More informationSample Size Determination in Clinical Trials HRM733 CLass Notes
Sample Size Determination in Clinical Trials HRM733 CLass Notes Lehana Thabane, BSc, MSc, PhD Biostatistician Center for Evaluation of Medicines St. Joseph s Heathcare 105 Main Street East, Level P1 Hamilton
More information