Big data: size isn't enough! Irene Petersen, PhD, Primary Care & Population Health


Introduction Reader (Statistics and Epidemiology) Research team epidemiologists/statisticians/PhD students Primary care databases THIN and CPRD - 70+ studies Research topics Prescribed medicine in pregnancy Mental health Methodological questions Missing data Regression Discontinuity Design (RDD) Confounding (by indication) Cardiovascular diseases Infectious diseases http://www.ucl.ac.uk/pcph/research-groups-themes/thin-pub/ Or just google THIN UCL

Today Big data - Electronic Health Records Safety and efficacy of medicine in real life Confounding and selection bias Potential solutions

Big Data - Electronic Health Records Primary Care Databases THIN, CPRD, QRESEARCH Administrative Databases Insurance claims databases Hospital Episodes Statistics Population registers Scandinavian registers

UK primary care databases (1) THIN, CPRD, Qresearch Anonymised records million years of patient data Medical diagnoses and symptoms, preventative measures, test results and immunisations, prescriptions, referrals to secondary care and free text information

UK primary care databases (2) Demographic information e.g. year of birth, sex, social deprivation (Townsend score) Broadly representative of the UK population (sex, age, size of practice and geographic distribution) For clinical management NOT for research

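Electronic Health Records versus Randomised Controlled Trials
Data collection: Electronic Health Records - clinical sessions; Randomised Controlled Trials - at fixed time points
Data: Electronic Health Records - coded clinical records (Read or ICD-10), measurements; Randomised Controlled Trials - interviews, questionnaires, measurements
Missing data: Electronic Health Records - well people have less data; Randomised Controlled Trials - random(?)
Size: Electronic Health Records - millions; Randomised Controlled Trials - from hundreds to thousands
Treatment: Electronic Health Records - selective; Randomised Controlled Trials - randomised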

Recording of weight by age and gender in THIN (figure: x-axis age in years, 16 to 100; separate lines for males and females)

Recording of weight in diabetics and nondiabetics (figure: weight measurement recorded, by year of measurement 1995 to 2011, for patients registered in 1995, 2000, 2005 and 2010; solid lines: diabetes, dashed lines: no diabetes)

Strength of Electronic Health Records Large sample sizes Population that is often difficult to follow up by other means Very elderly Pregnant women People with severe mental illnesses Real life data Long follow-up

Provide some fantastic opportunities! Explore effectiveness in populations not covered by RCTs Very elderly, pregnant, people with severe mental illnesses Examine adverse (drug) effects Bridge from RCT to real life

Before we get too excited. Let's look at some of the challenges and pitfalls

Case-control or Cohort study? Drug A is commonly prescribed Outcome B is rare Use electronic health records to address the question

88,125 cases and 362,254 matched controls. Cases between 30 and 100 years. 5 matched controls, each required to be alive at the date the case got cancer. Adjusted analysis for potential confounders such as smoking and social deprivation and some specific diseases

Prolonged use of statin (more than 4 years) Associated with significantly increased risk of: Colorectal cancer (OR 1.23, 95%CI 1.10 to 1.38) Bladder cancer (OR 1.29, 95%CI 1.08 to 1.54) Lung cancer (OR 1.18, 95%CI 1.05 to 1.34)

Hang on. Let's have a look at this again. Cases were those who got cancer. Controls were a random sample of those who were alive at the time the cases got cancer. What if...

Those who receive prolonged statin treatment More likely to die from other causes than cancer (e.g. cardiovascular diseases)? Then they would be less likely to enter the control group

Case-control study with live controls Estimates sensitive to difference in survival rates between exposed and unexposed
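The concern on this slide can be illustrated with a small calculation. This is only a sketch under simplifying assumptions that are not in the original study: constant hazards, the same cancer rate in exposed and unexposed people, and controls drawn from everyone still alive and cancer-free at the end of follow-up. All rates below are hypothetical.

```python
import math

def survivor_control_odds_ratio(cancer_rate, death_rate_exposed,
                                death_rate_unexposed, years):
    """Exposure odds ratio when cases are all cancers during follow-up and
    controls are people still alive and cancer-free at the end.

    The cancer rate is identical in both exposure groups, so any odds ratio
    different from 1 is produced by differential survival alone.
    """
    def cases_and_survivors(death_rate):
        total_rate = cancer_rate + death_rate
        still_at_risk = math.exp(-total_rate * years)   # alive, no cancer
        cases = cancer_rate / total_rate * (1.0 - still_at_risk)
        return cases, still_at_risk

    cases_exp, surv_exp = cases_and_survivors(death_rate_exposed)
    cases_unexp, surv_unexp = cases_and_survivors(death_rate_unexposed)
    return (cases_exp / cases_unexp) / (surv_exp / surv_unexp)

# Hypothetical rates per person-year; exposure never changes cancer risk.
equal_mortality = survivor_control_odds_ratio(0.005, 0.01, 0.01, 10)
higher_mortality = survivor_control_odds_ratio(0.005, 0.03, 0.01, 10)
print(f"same mortality in both groups:   OR = {equal_mortality:.2f}")
print(f"higher mortality in the exposed: OR = {higher_mortality:.2f}")
```

With equal mortality the odds ratio is 1; when the exposed die faster from other causes, fewer of them remain available as controls and the odds ratio rises above 1 even though the drug has no effect on cancer.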

Interpretation of case-control study versus cohort study. Case-control study: Of the survivors, what is the chance that those with cancer were exposed to statin? Cohort study: Of those exposed to statin, what is the risk of developing cancer compared to those not exposed?

Cohort study found no associations between statins and cancers Smeeth et al. British Journal of Clinical Pharmacology

Let's look at a few other examples

High dose antidepressants increase self harm in young people? Miller et al. JAMA Internal Medicine, April 2014

High dose antidepressants increase self harm in young people? Propensity score matched cohort study. Health care utilization data from 162 625 US residents with depression, aged 10 to 64 years, who initiated antidepressant therapy. Standard (modal) versus higher doses

Propensity score methods. Propensity Scores (PS) estimate the predicted probability (propensity) of use of a given drug. The PS is based on the patient's characteristics at the time the treatment is chosen. Stata-style commands: logistic treatment sex age x1 x2 x3 x4 x5 x6, followed by predict. The predicted value (between 0 and 1) is the propensity score
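The two steps on this slide (fit a logistic model for treatment, then predict) can be sketched in plain Python. This is a minimal illustration, not the analysis used in any of the cited studies: the patients are simulated, the covariates (sex and a standardised age) and the coefficients used to generate them are assumptions, and the model is fitted with simple gradient ascent rather than a statistics package.

```python
import math
import random

def predict(weights, covariates):
    """Predicted probability of treatment; weights[0] is the intercept."""
    linear = weights[0] + sum(w * x for w, x in zip(weights[1:], covariates))
    return 1.0 / (1.0 + math.exp(-linear))

def fit_logistic(rows, treated, steps=1000, rate=0.5):
    """Fit a logistic regression of treatment on covariates by gradient ascent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for covariates, t in zip(rows, treated):
            residual = t - predict(weights, covariates)
            gradient[0] += residual
            for j, x in enumerate(covariates):
                gradient[j + 1] += residual * x
        weights = [w + rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights

# Simulated (hypothetical) patients: older patients are treated more often.
random.seed(1)
rows, treated = [], []
for _ in range(200):
    sex = random.randint(0, 1)
    age = random.gauss(0.0, 1.0)            # standardised age
    true_probability = predict([-0.5, 0.3, 1.0], [sex, age])
    rows.append([sex, age])
    treated.append(1 if random.random() < true_probability else 0)

weights = fit_logistic(rows, treated)
scores = [predict(weights, row) for row in rows]   # the propensity scores

mean_treated = sum(s for s, t in zip(scores, treated) if t) / sum(treated)
mean_untreated = (sum(s for s, t in zip(scores, treated) if not t)
                  / (len(treated) - sum(treated)))
print(f"mean propensity score, treated:   {mean_treated:.2f}")
print(f"mean propensity score, untreated: {mean_untreated:.2f}")
```

The score only summarises the covariates that were put into the model; anything that was not recorded cannot be balanced by it.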

Propensity scores: Stürmer et al. J Clin Epidemiol. 2006 May; 59(5): 437-447. Williamson et al. Statistical Methods in Medical Research 2011 21(3) 273-293

High dose antidepressants increase self harm in young people? High versus modal dose in young people: hazard ratio [HR], 2.2 [95%CI, 1.6-3.0]. 1 additional event for every 150 such patients treated with high-dose (instead of modal-dose) therapy. High versus modal dose in adults (25-64 years): HR, 1.2 [95%CI, 0.8-1.9]

Conclusions and implication: Children and young adults ... at high-therapeutic (rather than modal-therapeutic) doses seem to be at heightened risk of deliberate self-harm. Our findings offer clinicians an additional incentive to avoid initiating pharmacotherapy at high-therapeutic doses.

What is the problem?

Health care data versus Randomised Controlled Trial: propensity score methods may balance measured characteristics, but they may NOT balance unmeasured characteristics

Not just random allocation. Clinicians make a treatment decision. Acute presentation of psychiatric illness. Severely ill individuals may receive higher doses. This information is NOT captured in the health care databases

Confounding by indication Acute and severe psychiatric problem High dose antidepressant self harm


High dose antidepressant A mere marker of acute severe psychiatric problems Underlying condition associated with increased self-harm? Too early to conclude that high doses of antidepressants are unsafe Leave young people without treatment!?
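A worked example may help show how a dose that does nothing can still look harmful. In the sketch below every number is hypothetical: severity of illness is assumed to drive both the choice of a high dose and the risk of self-harm, while the dose itself has no effect.

```python
def risk_by_dose(p_severe, p_high_if_severe, p_high_if_mild,
                 risk_if_severe, risk_if_mild):
    """Expected self-harm risk in the high-dose and modal-dose groups when
    risk depends only on severity, never on dose."""
    high_severe = p_severe * p_high_if_severe
    high_mild = (1 - p_severe) * p_high_if_mild
    modal_severe = p_severe * (1 - p_high_if_severe)
    modal_mild = (1 - p_severe) * (1 - p_high_if_mild)

    risk_high = ((high_severe * risk_if_severe + high_mild * risk_if_mild)
                 / (high_severe + high_mild))
    risk_modal = ((modal_severe * risk_if_severe + modal_mild * risk_if_mild)
                  / (modal_severe + modal_mild))
    return risk_high, risk_modal

# Hypothetical inputs: 30% severely ill; severe patients are far more likely
# to start on a high dose and have five times the self-harm risk.
risk_high, risk_modal = risk_by_dose(
    p_severe=0.3, p_high_if_severe=0.6, p_high_if_mild=0.1,
    risk_if_severe=0.05, risk_if_mild=0.01)

crude_risk_ratio = risk_high / risk_modal
print(f"crude risk ratio, high versus modal dose: {crude_risk_ratio:.2f}")
print("risk ratio within each severity stratum: 1.00 (by construction)")
```

If severity is not recorded, no adjustment or propensity score can recover the stratum-specific ratio of 1, and the crude association is all that the database shows.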

Efficacy of heart failure treatment in real life? An example from our own work

Spironolactone treatment for heart failure. Spironolactone improves survival in people with heart failure in Randomised Controlled Trials. What about in real life?

RALES trial - NEJM September 1999 The trial was discontinued early, after 24 months Spironolactone was efficacious 386 deaths in the placebo group (46%) 284 in the spironolactone group (35%) Relative risk of death: Hazard Ratio: 0.70 (95% CI, 0.60 to 0.82; P<0.001)
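The percentages on this slide give a rough feel for the size of the effect. The arithmetic below uses only the rounded death proportions quoted above; the crude risk ratio it produces is not the same quantity as the reported hazard ratio, which also accounts for time to death.

```python
# Rounded proportions of deaths quoted on the slide.
risk_placebo = 0.46
risk_spironolactone = 0.35

crude_risk_ratio = risk_spironolactone / risk_placebo
absolute_risk_reduction = risk_placebo - risk_spironolactone
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"crude risk ratio:        {crude_risk_ratio:.2f}")
print(f"absolute risk reduction: {absolute_risk_reduction:.2f}")
print(f"number needed to treat:  {number_needed_to_treat:.1f}")
```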

Bridge from RCT to real life Identified people with severe heart failure in THIN Propensity score for spironolactone treatment Matched individuals with and without treatment Estimated relative survival using a Cox model
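The matching step can be sketched as follows. This is one common approach (greedy nearest-neighbour matching within a caliper), shown as an assumption rather than as the procedure used in the study; the patient identifiers, scores and caliper width are hypothetical. The survival comparison with a Cox model is not shown.

```python
def match_on_propensity(treated, untreated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated, untreated: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, untreated_id) pairs. Treated patients
    with no untreated patient within the caliper stay unmatched.
    """
    available = dict(untreated)
    pairs = []
    for treated_id, score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        closest = min(available, key=lambda pid: abs(available[pid] - score))
        if abs(available[closest] - score) <= caliper:
            pairs.append((treated_id, closest))
            del available[closest]          # match without replacement
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
untreated = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

pairs = match_on_propensity(treated, untreated)
print(pairs)
```

Patient t3 has a very high score and no comparable untreated patient, so it is dropped; matched analyses therefore describe only the patients for whom a comparison exists.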

What did we find? People treated with spironolactone had the same chance of survival after 24 months as those treated in the RALES trial. People with heart failure, but NOT treated, had an even better survival!

Relative risk of mortality in people treated with spironolactone. RALES: Hazard Ratio: 0.70 (95% CI 0.60 to 0.82). Our study: Hazard Ratio: 1.32 (95% CI 1.18 to 1.47). Freemantle et al. BMJ 2013

Similar problem as before. Treatment is NOT randomly allocated Doctors make a choice Spironolactone given to those with worse prognosis More severe heart failure Acute situation? Reason for prescribing was NOT recorded Propensity score cannot solve the issue of confounding by indication

Bridge from Randomised Controlled Trials to real life - Statin example Propensity score matched sample This time, they were able to replicate trial results!

Why was it possible in this situation? Statin was used as primary prevention, rather than treatment of illness Potential confounders were captured in data Weight, blood pressure, sex, deprivation etc.

Propensity scores and other regression adjustments: they may work when treatment is given as prevention and the relevant data are recorded in the database (e.g. statin), or when the decision to treat is unrelated to the prognosis of the outcome (unexpected effects). They may NOT work when treatment is given in response to acute clinical need or to people who are more frail (e.g. antidepressants, spironolactone, hypnotics)

The future for analysis and design of BIG data studies. Cohort rather than case-control studies. Still need to think design through carefully. Cohort studies including: Several comparison groups; Active comparisons e.g. drug versus drug; Sequential simulated trials; Self-controlled case series (SCCS); Instrumental Variables; Regression Discontinuity Design

The future for analysis and design of BIG data studies (2) Sequential Simulated trials Run series of trials in database Trial 1: those initiated on treatment in 2001 versus those not initiated in 2001 Trial 2: those initiated on treatment in 2002 versus those not initiated in 2002 Danaei et al. 2013 Toh & Manson 2013

The future for analysis and design of BIG data studies (3) Sequential Simulated trials. Avoid selection based on the future. Less likely to have a healthy user or healthy non-user effect. Makes it easy to define the start of follow-up for the non-exposed
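The construction of the trial series can be sketched as below. The patient records, the field names and the eligibility rule (registered by the trial year and not treated before it) are assumptions made for illustration, not the rules used by Danaei et al. or Toh & Manson.

```python
def sequential_trials(first_treatment_year, registration_year, trial_years):
    """Build one simulated trial per calendar year.

    first_treatment_year: dict patient -> year treatment started (None if never).
    registration_year: dict patient -> year the patient joined the database.
    Returns dict year -> (initiators, non_initiators).
    """
    trials = {}
    for year in trial_years:
        initiators, non_initiators = [], []
        for patient, start in first_treatment_year.items():
            if registration_year[patient] > year:
                continue                    # not yet in the database
            if start is not None and start < year:
                continue                    # already treated: not eligible
            if start == year:
                initiators.append(patient)
            else:
                non_initiators.append(patient)   # uses no future information
        trials[year] = (initiators, non_initiators)
    return trials

# Hypothetical patients.
first_treatment_year = {"A": 2001, "B": 2002, "C": None}
registration_year = {"A": 2000, "B": 2000, "C": 2000}

trials = sequential_trials(first_treatment_year, registration_year, [2001, 2002])
for year, (initiators, non_initiators) in trials.items():
    print(year, "initiators:", initiators, "non-initiators:", non_initiators)
```

Patient B counts as a non-initiator in the 2001 trial and as an initiator in the 2002 trial, because each trial classifies people only on what was known at its own start. Follow-up for both arms begins at the trial year, which gives the non-exposed a clear start date.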

The future for analysis and design of BIG data studies (4) Self-controlled case series (SCCS) methodology Use individuals as their own control See http://statistics.open.ac.uk/sccs
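A minimal sketch of the idea, assuming the simplest possible SCCS model: one risk window per person, a constant baseline rate, and no age or season effects. The data are hypothetical, and real analyses would normally use dedicated software such as that linked above. Each person contributes only a comparison of their own risk-window and baseline time, so fixed characteristics of the individual cancel out.

```python
import math

def sccs_rate_ratio(people):
    """Conditional maximum-likelihood estimate of the incidence rate ratio.

    people: list of (events_in_risk_window, events_in_baseline,
                     risk_days, baseline_days), cases only.
    Needs at least one event in a risk window and one in baseline time
    overall, otherwise the estimate is not finite.
    """
    def score(beta):
        # Derivative of the conditional log-likelihood with respect to beta.
        total = 0.0
        for in_risk, in_baseline, risk_days, baseline_days in people:
            weighted_risk = math.exp(beta) * risk_days
            expected = ((in_risk + in_baseline) * weighted_risk
                        / (weighted_risk + baseline_days))
            total += in_risk - expected
        return total

    low, high = -10.0, 10.0        # search range for the log rate ratio
    for _ in range(100):           # bisection: the score decreases in beta
        middle = (low + high) / 2
        if score(middle) > 0:
            low = middle
        else:
            high = middle
    return math.exp((low + high) / 2)

# Hypothetical cases: 30 days in the risk window, 335 baseline days each.
people = [(1, 0, 30, 335), (0, 2, 30, 335), (1, 1, 30, 335), (1, 2, 30, 335)]
rate_ratio = sccs_rate_ratio(people)
print(f"incidence rate ratio, risk window versus baseline: {rate_ratio:.1f}")
```

When everyone has the same observation pattern, as here, the estimate reduces to the events per day in the risk window divided by the events per day at baseline.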

The future for analysis and design of BIG data studies (5) Instrumental variables. Use variation among general practices?
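One way the practice-variation idea could be used is the simple Wald estimator sketched below. This is an illustration under strong assumptions that the slide does not claim to hold: the instrument (whether a patient's practice is a high or low prescriber) is hypothetical, and it must affect the outcome only through treatment and be unrelated to the patients' prognosis. The records are invented.

```python
def wald_estimate(records):
    """Instrumental-variable (Wald) estimate of the treatment effect.

    records: list of (high_prescribing_practice, treated, outcome), each 0 or 1.
    Returns the difference in outcome risk between practice types divided by
    the difference in the proportion treated.
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    treated_high = mean(t for z, t, y in records if z == 1)
    treated_low = mean(t for z, t, y in records if z == 0)
    outcome_high = mean(y for z, t, y in records if z == 1)
    outcome_low = mean(y for z, t, y in records if z == 0)
    return (outcome_high - outcome_low) / (treated_high - treated_low)

# Hypothetical records: 10 patients in high-prescribing practices
# (8 treated, 4 events) and 10 in low-prescribing practices
# (2 treated, 5 events).
records = (
    [(1, 1, 0)] * 5 + [(1, 1, 1)] * 3 + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1
    + [(0, 1, 1)] * 1 + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 4 + [(0, 0, 0)] * 4
)

estimate = wald_estimate(records)
print(f"estimated risk difference due to treatment: {estimate:.3f}")
```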

Summary. Electronic Health Records reflect real life clinical practice; they are not Randomised Controlled Trials. Confounding (by indication) and selection bias are still major issues! Think about the study design. Interpretation of results: offer more than one interpretation; perhaps the drug is just a marker? Accept the limitations of BIG data

i.petersen@ucl.ac.uk

References
1. Freemantle N, Marston L, Walters K, Wood J, Reynolds MR, Petersen I. Making inferences on treatment effects from real world data: propensity scores, confounding by indication, and other perils for the unwary in observational research. BMJ. 2013 Nov 11;347(nov11 3):f6409.
2. Williamson E, Morley R, Lucas A, Carpenter J. Propensity scores: From naive enthusiasm to intuitive understanding. Stat Methods Med Res. 2012 Jun 1;21(3):273-93.
3. Stürmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. J Clin Epidemiol. 2006 May;59(5):437-47.
4. Rosenbaum PR, Rubin DB. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika. 1983 Apr;70(1):41.
5. Miller M, Swanson SA, Azrael D, Pate V, Stürmer T. Antidepressant Dose, Age, and the Risk of Deliberate Self-harm. JAMA Intern Med [Internet]. 2014 Apr 28 [cited 2014 May 1]; Available from: http://archinte.jamanetwork.com/article.aspx?doi=10.1001/jamainternmed.2014.1053
6. Smeeth L, Douglas I, Hall AJ, Hubbard R, Evans S. Effect of statins on a wide range of health outcomes: a cohort study validated by comparison with randomized trials. Br J Clin Pharmacol. 2009 Jan;67(1):99-109.
7. Toh S, Manson JE. An Analytic Framework for Aligning Observational and Randomized Trial Data: Application to Postmenopausal Hormone Therapy and Coronary Heart Disease. Stat Biosci. 2013 Nov;5(2):344-60.
8. Whitaker HJ, Paddy Farrington C, Spiessens B, Musonda P. Tutorial in biostatistics: the self-controlled case series method. Stat Med. 2006;25(10):1768-97.