www.eurordis.org Analysis and Interpretation of Clinical Trials How to conclude? Statistical Issues Dr Ferran Torres Unitat de Suport en Estadística i Metodología - USEM Statistics and Methodology Support Unit Hospital Clínic Barcelona Ferran.Torres@uab.es Barcelona, June 15, 2009
Documentation: all the information will be available at the following link: http://ferran.torres.name/docencia/eurordis (presentation in PowerPoint and PDF). Additional documentation: scientific recommendations; regulatory recommendations.
Outline: Why statistics? Population and samples. p-value. Statistical vs clinical significance. Statistical errors (α and β). Sample size. Estimation of treatment effect: confidence intervals.
Why Statistics? Variation!!!!
Variability
Why Statistics? Medicine is a quantitative science, but not an exact one: it is not like physics or chemistry. Variation characterises much of medicine, and statistics is about handling and quantifying variation and uncertainty. Humans differ in their response to harmful exposures (example: not every smoker dies of lung cancer, and some non-smokers die of lung cancer). Humans differ in their response to treatment (example: penicillin does not cure all infections). Humans differ in disease symptoms (example: sometimes cough and sometimes wheeze is the presenting feature of asthma).
Why Statistics Are Necessary. Statistics can tell us whether events could have happened by chance, and help us to make decisions. We need to use statistics because of the variability in our data. Generalise: can what we know help to predict what will happen in new and different situations?
Memorable quotes: "50% of what you learn about therapy in the next 5 years is wrong. (The trouble is we don't know which 50%.)" (Anon). "In this world there is nothing certain but death and taxes." Benjamin Franklin (1706-1790) (also said by Woody Allen). "86% of all statistics are invented on the spot" (Huff, How to Lie with Statistics). "There are lies, damn lies, and statistics." Benjamin Disraeli (1804-1881).
Error? Classification of measurements: unreliable; reliable but not valid; reliable and valid.
Random versus Systematic error. Example: Systolic Blood Pressure (mm Hg). [Figure: two panels, Random and Systematic (Bias), each plotting measurements 01 to 05 on an axis from 130 to 170 mm Hg with the True Value marked.]
Random versus Systematic error. [Figure: two panels, Random error against Sample size and Bias against Sample size.]
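The contrast between random and systematic error can be illustrated with a small simulation. This is a minimal sketch, not taken from the slides: the true value of 150 mm Hg, the noise SD of 10 and the bias of +8 mm Hg are all assumed for illustration. It suggests that averaging more readings shrinks random error, while a bias stays the same size however large the sample is.

```python
import random
import statistics


def mean_error(n, true_value=150.0, noise_sd=10.0, bias=0.0, seed=0):
    """Absolute gap between the mean of n simulated readings and the true value.

    All numbers are hypothetical: true_value, noise_sd and bias are
    illustration values, not data from the presentation.
    """
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return abs(statistics.fmean(readings) - true_value)


for n in (10, 100, 10_000):
    random_only = mean_error(n)            # random error only
    with_bias = mean_error(n, bias=8.0)    # random error plus systematic error
    print(f"n={n:>6}  random only: {random_only:5.2f}  with bias: {with_bias:5.2f}")
```

With a large sample the first column approaches 0, while the second stays close to the assumed bias of 8 mm Hg.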
Types of Statistics: Descriptive Statistics. They enumerate, organise, summarise and categorise the data, and represent them graphically; this type of statistics describes the data. Examples: means and frequencies of outcomes; charts and graphs.
Types of Statistics: Inferential Statistics. They draw conclusions from incomplete information: they make predictions about a larger population given a smaller sample. These are what is usually thought of as statistical tests. Examples: 95% CI, t-test, chi-square test, ANOVA, regression.
Population and Samples. [Figure: Sample; Population of the Study; Target Population.]
Extrapolation. [Figure: Sample (Study Results); Inferential analysis (Statistical Tests, Confidence Intervals); Population (Conclusions).]
Statistical Inference: Statistical Tests => p-value; Confidence Intervals.
Valid samples? [Figure labels: Population; Likely to occur; Unlikely to occur; Invalid Sample and Conclusions.]
P-value: the p-value is a tool to answer the question: could the observed results have occurred by chance*? p < 0.05: statistically significant. Remember: the decision is made given the observed results in a SAMPLE, extrapolating the results to the POPULATION. *: accounts exclusively for random error, not bias.
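One way to make "could the observed results have occurred by chance?" concrete is a permutation test. The sketch below is an illustration under stated assumptions, not the method used in the presentation: the two groups of systolic blood pressure readings are invented, and `permutation_p_value` is a hypothetical helper name. It shuffles the group labels many times and counts how often chance alone produces a difference at least as large as the observed one.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            at_least_as_extreme += 1
    # The +1 terms count the observed arrangement itself, so p is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical systolic blood pressure readings (mm Hg), invented for illustration.
treatment_a = [128, 135, 131, 140, 126, 133, 129, 137]
treatment_b = [142, 150, 139, 147, 152, 144, 149, 141]

p = permutation_p_value(treatment_a, treatment_b)
print(f"p = {p:.4f}", "(statistically significant)" if p < 0.05 else "(not significant)")
```

As the footnote on the slide says, this only addresses random error: a biased measurement or allocation would not be detected by the shuffling.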
RCT from a statistical point of view. [Figure: Randomisation splits 1 homogeneous population into Treatment A and Treatment B (control), giving 2 distinct populations.]
RCT. [Figure: Sample; Population.]
Statistical significance/confidence. What does "A>B, p<0.05" mean? I can conclude that the higher values observed with treatment A vs treatment B are linked to the treatment rather than to chance, with a risk of error of less than 5% (that is, if A and B were truly equal, a difference at least this large would arise by chance in fewer than 5% of trials).
Factors influencing statistical significance: Signal (Difference); Noise, or background (Variance, SD); Quantity (Quantity of data).
P-value: a very low p-value does NOT imply clinical relevance (NO!!!) or the magnitude of the treatment effect (NO!!). With larger n or smaller variability, p gets smaller. Please never compare p-values!! (NO!!!)
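The dependence of p on signal, noise and quantity of data can be sketched with a two-sample z-test. This is a simplified illustration, assuming a known common SD and equal group sizes; the difference of 1 unit and SD of 10 are invented values, and `two_sided_p` is a hypothetical helper. The same small difference moves from "not significant" to "highly significant" purely by enrolling more participants, which is why a low p-value says nothing about clinical relevance.

```python
import math
from statistics import NormalDist


def two_sided_p(difference, sd, n_per_group):
    """Two-sided p-value of a z-test comparing two means (known, common SD)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)  # noise shrinks as n grows
    z = difference / standard_error                      # signal-to-noise ratio
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Same (assumed) treatment difference and SD, increasing sample size.
for n in (20, 200, 2_000, 20_000):
    print(f"n per group = {n:>6}  p = {two_sided_p(1.0, 10.0, n):.4g}")
```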
Type I & II Error & Power

                        Reality (Population)
Conclusion (sample)     A=B                 A≠B
A=B (p>0.05)            OK                  Type II error (β)
A≠B (p<0.05)            Type I error (α)    OK
Type I & II Error & Power. Type I Error (α): false positive. Rejecting the null hypothesis when in fact it is true. Standard: α=0.05. In words: the chance of finding statistical significance when in fact there truly was no effect. Type II Error (β): false negative. Failing to reject the null hypothesis when in fact the alternative is true. Standard: β=0.20 or 0.10. In words: the chance of not finding statistical significance when in fact there was an effect.
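Both error rates can be estimated by simulating many trials. The following is a minimal sketch under assumptions that are not in the slides: a two-sided z-test with known SD of 10, 50 participants per group, and a true difference of 5.6 chosen so that power is roughly 80%. `rejection_rate` is a hypothetical helper name.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_difference, sd=10.0, n_per_group=50,
                   alpha=0.05, n_trials=2_000, seed=0):
    """Fraction of simulated trials in which the z-test gives p < alpha."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    rejections = 0
    for _ in range(n_trials):
        group_a = [rng.gauss(true_difference, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / standard_error
        if abs(z) > z_critical:
            rejections += 1
    return rejections / n_trials


type_i_rate = rejection_rate(true_difference=0.0)  # A=B in reality: false positives
power = rejection_rate(true_difference=5.6)        # A≠B in reality: true positives
print(f"Type I error rate ≈ {type_i_rate:.3f} (nominal α = 0.05)")
print(f"Power ≈ {power:.3f}, so Type II error rate ≈ {1 - power:.3f}")
```

Under these assumptions the first figure should land near 0.05 and the second near 0.80, matching the conventional α and β quoted above.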
Sample Size: the planned number of participants is calculated on the basis of: the expected effect of treatment(s) (larger effect, smaller number); the variability of the chosen endpoint (larger variability, larger number); the accepted risks in the conclusion (smaller risk, larger number).
Sample Size. [Figure: three histograms of height (ALTURA) against frequency, each with N = 2000: mean 165.1, SD 25.54; mean 165.0, SD 26.94; mean 165.1, SD 32.27.]
Sample Size: accepted risks in conclusion

                        Reality (Population)
Conclusion (sample)     A=B                 A≠B
A=B (p>0.05)            OK                  Type II error (β)
A≠B (p<0.05)            Type I error (α)    POWER (1 - β)
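The three ingredients (expected effect, variability, accepted risks) combine in the usual normal-approximation formula for comparing two means, n per group = 2 (z for α + z for power)² × SD² / effect². The sketch below assumes that formula and a two-sided test; the effect of 5 and SD of 10 are invented inputs, and `sample_size_per_group` is a hypothetical helper, not something defined in the presentation.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Participants per group for a two-sided comparison of two means
    (normal approximation, equal group sizes, common SD)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # accepted Type I risk
    z_beta = NormalDist().inv_cdf(power)               # power = 1 - Type II risk
    n = 2.0 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)


print(sample_size_per_group(effect=5.0, sd=10.0))              # baseline
print(sample_size_per_group(effect=10.0, sd=10.0))             # larger effect
print(sample_size_per_group(effect=5.0, sd=20.0))              # larger variability
print(sample_size_per_group(effect=5.0, sd=10.0, power=0.90))  # smaller β risk
```

A larger expected effect lowers the number, while more variability or a smaller accepted risk raises it.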
95% CI: better than p-values. It uses the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate. A CI is a range of values within which the true treatment effect is believed to be found, with a given level of confidence. If the trial were repeated many times, 95% of the 95% CIs calculated this way would contain the true treatment effect. Generally, a 95% CI is calculated as Sample Estimate ± 1.96 × Standard Error.
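The rule "Sample Estimate ± 1.96 × Standard Error" can be applied to a difference in means as follows. This is a sketch on invented blood pressure readings, using the normal approximation from the slide (with samples this small a t-distribution multiplier would normally replace 1.96); `ci_for_mean_difference` is a hypothetical helper name.

```python
import math
from statistics import NormalDist, fmean, stdev


def ci_for_mean_difference(group_a, group_b, confidence=0.95):
    """Return (estimate, lower, upper) for mean(group_a) - mean(group_b)."""
    estimate = fmean(group_a) - fmean(group_b)
    standard_error = math.sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return estimate, estimate - z * standard_error, estimate + z * standard_error


# Hypothetical systolic blood pressure readings (mm Hg), invented for illustration.
test_group = [128, 135, 131, 140, 126, 133, 129, 137]
control_group = [142, 150, 139, 147, 152, 144, 149, 141]

estimate, lower, upper = ci_for_mean_difference(test_group, control_group)
print(f"difference = {estimate:.1f} mm Hg, 95% CI {lower:.1f} to {upper:.1f}")
```

Unlike a bare p-value, the interval shows both the size of the estimated effect and how precisely it has been estimated.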
Interval Estimation: a probability that the population parameter falls somewhere within the interval. [Figure: Confidence interval from Confidence limit (lower) to Confidence limit (upper), with the Sample statistic (point estimate) inside.]
Superiority study. [Figure: 95% CI on an axis of the difference d. d < 0: negative effect (Control better); d = 0: no difference; d > 0: positive effect (Test better).]
[Figure: axis of Treatment - Control centred at 0 (left: treatment less effective; right: treatment more effective), with a lower and an upper equivalence boundary. Cases shown: statistical and clinical superiority; statistical superiority; non-inferiority; equivalence; inferiority.]
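The cases named in the figure can be read off from where the 95% CI sits relative to 0 and to the equivalence boundaries. The sketch below encodes one common reading of such a figure; the decision rules, the symmetric `margin` and the function name `supported_conclusions` are assumptions made for illustration and are not stated on the slide. It assumes the interval is for Treatment - Control and that higher values are better.

```python
def supported_conclusions(lower, upper, margin):
    """List the conclusions a CI (lower, upper) for Treatment - Control supports.

    Assumed rules (not taken from the slide): the equivalence boundaries are
    -margin and +margin, and higher values favour the treatment.
    """
    conclusions = []
    if lower > margin:
        conclusions.append("statistical and clinical superiority")
    if lower > 0:
        conclusions.append("statistical superiority")
    if lower > -margin:
        conclusions.append("non-inferiority")
    if lower > -margin and upper < margin:
        conclusions.append("equivalence")
    if upper < 0:
        conclusions.append("inferiority")
    return conclusions


# Hypothetical intervals with an assumed margin of 1.0.
for interval in [(1.5, 3.0), (0.2, 2.0), (-0.5, 0.5), (-3.0, -1.5), (-2.0, 2.0)]:
    print(interval, supported_conclusions(*interval, margin=1.0) or ["inconclusive"])
```

The categories are not mutually exclusive, which is why the function returns a list: an interval lying entirely above 0 also lies above the lower boundary, so it supports non-inferiority as well as superiority.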
http://ferran.torres.name/docencia/eurordis