Big data in health research Professor Tony Blakely




Big data in health research
Professor Tony Blakely
Burden of Disease Epidemiology, Equity and Cost Effectiveness Programme

Structure
  Added value of big data. Examples:
    Linked census health data
    Linked health administrative data
    Longitudinal data
    Genetic data and epidemiology
  Challenges
  Opportunities

NZCMS: method in one slide
  1991 census cohort (0-74 year olds)
  Anonymous and probabilistic record linkage
  Deaths
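The slide names probabilistic record linkage without describing it. As a rough illustration only, the sketch below scores one candidate census-death pair with textbook Fellegi-Sunter agreement weights. The field names and the m- and u-probabilities are hypothetical, and nothing here is taken from the actual NZCMS linkage.

```python
import math

# Hypothetical linkage fields (assumptions for illustration, not NZCMS values).
# m = P(field agrees | records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELDS = {
    "sex": (0.99, 0.50),
    "date_of_birth": (0.95, 0.01),
    "area_code": (0.90, 0.10),
}


def field_weight(m, u, agrees):
    """Log2 likelihood-ratio weight for one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(agreement):
    """Sum the field weights for one candidate record pair.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    """
    return sum(
        field_weight(*FIELDS[name], agrees) for name, agrees in agreement.items()
    )


all_agree = match_weight({"sex": True, "date_of_birth": True, "area_code": True})
dob_differs = match_weight({"sex": True, "date_of_birth": False, "area_code": True})
print(f"all fields agree:     {all_agree:.2f}")
print(f"date of birth differs: {dob_differs:.2f}")
```

Pairs scoring above a chosen threshold would be accepted as links. Where that threshold sits trades false links against missed links, which is a design decision outside this sketch.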

Life expectancy
[Figure: life expectancy (axis 45 to 85 years) by year, 1941 to 2011, for Non-Māori Male, Non-Māori Female, Māori Male and Māori Female]
Without linkage of census and mortality data, we would have overestimated Māori LE from the 1980s to the mid 1990s.

NZCMS and CancerTrends (Incidence) findings
[Figure panels: Breast cancer incidence rates by ethnicity; Breast cancer mortality rates by ethnicity]
Suggestion: survival gaps widening faster than incidence gaps

Rate ratios of 45-74 year old mortality for nil cf. post-school education, before and after adjusting for smoking
[Figure: bar chart of RR (axis 1.0 to 1.5), age and ethnicity adjusted vs. additionally adjusted for smoking, for Females 1981-84, Males 1981-84, Females 1996-99 and Males 1996-99]
Reduction in excess RR (i.e. RR-1) due to adjusting for smoking, in the order the bars are labelled: 3%, 16%, 11%, 21%
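The "reduction in excess RR" on this slide is the share of the excess rate ratio (RR-1) that disappears after adjustment. A minimal sketch of that arithmetic follows; the input rate ratios are made up for illustration, because the slide reports only the resulting percentages.

```python
def excess_rr_reduction(rr_before: float, rr_after: float) -> float:
    """Proportion of the excess rate ratio (RR - 1) removed by adjustment.

    rr_before: rate ratio adjusted for age and ethnicity only
    rr_after:  rate ratio additionally adjusted for smoking
    """
    if rr_before <= 1.0:
        raise ValueError("rr_before must exceed 1 for an excess RR to exist")
    return (rr_before - rr_after) / (rr_before - 1.0)


# Hypothetical values, not read off the slide:
# an RR of 1.40 falling to 1.32 removes 0.08 of a 0.40 excess.
reduction = excess_rr_reduction(1.40, 1.32)
print(f"{reduction:.0%} of the excess RR is attributable to adjusting for smoking")
```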

HealthTracker linked health data
  hospital costs paid by the Ministry or DHBs (case mix cost weights)
  outpatient costs (contracted purchase units)
  GP visits (average capitation cost only)
  general medical subsidy for GP visits outside enrolled PHO
  emergency department: triage level contracted purchase unit cost for event
  community pharmacy, and more recently hospital pharmacy costs (excluding non subs medications)
  lab tests funded by Vote:Health

HealthTracker colon cancer costs
Females age 62.5 yrs by time pre/post diagnosis
[Figure: cost per person month (axis $0 to $20,000) in three phases: Pre-Diagnosis (6-11 mth, 1-5 mth, <1 mth); Post-Diagnosis, & not within yr of death (<1 mth, 1-5 mth, 6-11 mth, 12-23 mth, 24+ mth); Pre-Death from cancer (6-11 mth, 1-5 mth, <1 mth)]

Economic evaluation: system costs
[Figure: flow diagram centred on INTERVENTION and the resulting change in health]
  Cost of intervention: Health sector (C1); Other sectors (C2); Patient/family (C3); Productivity losses (C4)
  Consequences: DALYs averted
  Downstream costs averted/incurred: Health sector (S1); Other sectors (S2); Patient/family (S3); Productivity gains (S4)
New Zealand, with NHI linked health datasets, has a wonderful tool for calculating these health system costs: HealthTracker

HPV vaccination: Cost effectiveness plane
[Figure: cost (axis $0 to $30,000,000) against HALYs gained (axis 0 to 700) for four programmes: Girls&Boys intensified school-only prog (2G+B); Girls&Boys current prog (1G+B); Girls only intensified school-only prog (1G); Girls only current prog (1G)]
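Points on a cost-effectiveness plane are usually compared through the incremental cost-effectiveness ratio (ICER): the extra cost divided by the extra health gain between two options. The sketch below shows that calculation. The costs and HALYs are hypothetical and are not the values plotted on the slide.

```python
def icer(cost_new: float, halys_new: float,
         cost_ref: float, halys_ref: float) -> float:
    """Incremental cost per HALY gained for a programme vs. a comparator."""
    delta_halys = halys_new - halys_ref
    if delta_halys == 0:
        raise ValueError("no difference in HALYs; the ICER is undefined")
    return (cost_new - cost_ref) / delta_halys


# Hypothetical programmes (assumed numbers for illustration only):
# the comparator costs $5m for 300 HALYs, the alternative $20m for 600 HALYs.
extra_cost_per_haly = icer(20_000_000, 600, 5_000_000, 300)
print(f"${extra_cost_per_haly:,.0f} per additional HALY gained")
```

Whether a given ICER counts as good value depends on a willingness-to-pay threshold, which the slide does not state.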

Longitudinal Causal Inference
[Figure: causal diagram with repeated measures H1, H2, H3 and L1, L2, L3, and a variable Z]
Does a change in H cause a change in L, or vice versa?

Does change in income predict change in self-rated health? Longitudinal data: SoFIE-Health

Does change in income predict change in self-rated health? No

Variables                                         Odds ratio   95% confidence interval
Amalgamated conditional logit regression model
  Household annual income*                        1.009        0.995 to 1.023
Hybrid proportional odds model
  Household annual income*                        1.006        0.997 to 1.015

Supported by international literature
Imlach Gunasekara, Soc Sci Med 2011
But counter to most people's expectations

HDL and myocardial infarct: an example of big gene data internationally
  HDL is accepted as a (causal) risk factor for IHD
  So much so that HDL is a target for pharmaceutical companies: big $$$
  But really?

Mendelian randomisation = genetic variation as an instrumental variable
  Z: Instrument = genes that predict HDL
  X: Exposure = HDL
  Y: Outcome = Myocardial infarction
  U: Unmeasured confounders
Usually requires massive datasets: in this case 20,000 MI cases and 100,000 controls with blood
But incredibly good for causal inference
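A small simulation can show why an instrument helps when U is unmeasured. This is a toy sketch under stated assumptions, not the analysis behind the slide: the outcome is treated as continuous rather than as MI case/control status, all effect sizes are invented, and the true causal effect of X on Y is set to zero. The estimator is the simple Wald ratio, cov(Z, Y) / cov(Z, X).

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate(n=20000, causal_effect=0.0, seed=1):
    """Simulate instrument Z, exposure X and outcome Y sharing a confounder U.

    All parameters are hypothetical. Z affects Y only through X, and U
    affects both X and Y but is unrelated to Z.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0-2
        ui = rng.gauss(0.0, 1.0)                                # unmeasured confounder
        xi = 1.0 * zi + ui + rng.gauss(0.0, 1.0)                # exposure
        yi = causal_effect * xi + ui + rng.gauss(0.0, 1.0)      # outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
naive_slope = covariance(x, y) / covariance(x, x)  # confounded by U
wald_ratio = covariance(z, y) / covariance(z, x)   # instrumental-variable estimate
print(f"naive regression slope of Y on X: {naive_slope:.3f}")
print(f"Wald ratio using Z as instrument: {wald_ratio:.3f}")
```

In this setup the naive slope is clearly positive even though X has no effect on Y, while the Wald ratio stays close to zero. Weak instruments make the ratio noisy, which is one reason such studies need very large samples.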

No association of HDL with MI: major body of knowledge overturned

Major implications
  Drug discovery
  Pharmaceutical business
  CVD risk calculators (although HDL may still be useful for prediction, emphasizing that prediction is not the same as causation)
And how did we miss this in observational epidemiology?
  Measurement error and residual confounding

Structure
  Added value of big data. Examples.
  Challenges
    Need vision
    Need champions
    Need capacity to use big data to add value
    Need funding
  Opportunities

Vision: Cancer Collections Framework
Hewlett Packard Report for NZHIS and NSU, 2006
[Figure: "Cancer Collections: 5 Year Vision", a roadmap from the current state through Year 1, Year 2, Year 3 and Year 5 across four streams: System, Information/Reporting, Data Collection/Structure, and Data Access. Milestones legible in the extracted text include: NCMD business case developed and approved; first phase of NCMD implemented (2 cancer specialities/tumour sites); a national view facilitated by using NCMD with links from NMDS, NZCR and Mortality; links created between Mortality and NZCR, between NCSP and NZCR, and between BSA and NZCR; links between NZCR and Mortality automated; standard reports set for key answers using existing reporting tools; front-end reporting tools developed, implemented and rolled out within existing NZHIS reporting channels; trial of an Information Laboratory for users to access linked data from cancer collections and related datasets; data from new sources explored (e.g. private clinics, community services); clinical information added from NNPAC.]

Vision: Cancer Collections Framework
Hewlett Packard Report for NZHIS and NSU, 2006
[Figure: framework diagram with the axes Data Access, Information/Reporting and Data Collection/Structure. Labels, in extraction order:]
  Widespread Authorised Data Access
  (Authorised) on-line access through to base data (that meets standard alignment requirements)
  On-line access to filtered/pre-formatted data and pre-structured reports (real-time/recent data)
  On-line access to standard reports only
  Reports available to answer key critical questions based on data 2+ years old
  Set of standard reports readily available (recent data) with limited ability to generate ad-hoc reports
  Data Laboratory: able to undertake complex ad-hoc analysis in a supported or facilitated, real or virtual environment
  User-friendly interface for information search and reporting
  No change to current data structures; some additional data collected
  Data held in national collections linked, with the ability to add data fields where extraction or linking is not onerous and is value-adding
  On-line guide to answering key questions with links to particular reports
  Front end search and compilation tool with links to key databases
  Uniform Data Collection/Structure
  All information linked by NHI through cancer continuum for individuals diagnosed with cancer
  Data warehouse for high-speed integrated analysis and sophisticated research
  Loosely linked system with search capability

Making vision happen: challenging
  Still not there, many committees later
  Researcher/clinician/manager enthusiasm hits the reality of:
    Data dictionaries and definitions
    Systems of collecting the data: Who? How? When?
    Reliability and validity of data
    Fitting it in with existing data
    Cost
    Linking it with biological samples and trial networks
    Demonstrating value
    Privacy, confidentiality and ethics

Challenge: Champions
  Census mortality and census cancer linkage would not have happened (as soon) without:
    Vision and championing of the then Government Statistician
    An emerging researcher looking for a PhD
  HealthTracker would not have happened (as soon) without the drive of staff within the Ministry
  An order of magnitude up is the whole of cancer collections, which requires not only champion(s) but also coordination, leadership, resources, etc. This is not easy.

Challenge: Capacity
  Capacity needed to assemble and maintain big data, but also to make good use of it:
    Provision to likely users
    Users capable of using it well, e.g.:
      Longitudinal data analyses
      Comparative effectiveness research, econometric and epidemiological skills
    Funding

Challenge: Cost
  New Zealand is a small country:
    It may cost just as much to run a birth cohort study in New Zealand as in Australia to achieve internal validity (e.g. sample size). Put another way, New Zealand does not have economies of scale.
    Are we even able to get to the table for new drug trials?
      Numbers
      Registries
      Tissue samples

Structure
  Added value of big data. Examples.
  Challenges
  Opportunities

Opportunities
  E.g. HealthTracker, virtual access to data, joining in clinical data, etc.

Opportunities: Use what we have well
  Examples:
    NHI (VIEW/PREDICT, HealthTracker, etc)
    Growing Up
  Synthesis to answer research and policy questions through modelling
  Contributing data to international collaborations

Opportunities: Internet & social media
  Mountains of data:
    Twitter
    Facebook
    Websites
  How do we use machine learning and other methods to ask questions of, follow up and retrieve data from free-living humans?
  Texting health messages is just scratching the surface
  Innovation needed (e.g. monitoring how social media discussions alter as a result of health promotion campaigns)

Big data in health research
Professor Tony Blakely
  Examples of added value of big data
  Challenges
  Opportunities
Burden of Disease Epidemiology, Equity and Cost Effectiveness Programme