Smarter Healthcare@IBM Research
Joseph M. Jasinski, Ph.D.
Distinguished Engineer, IBM Research
Our researchers work on a wide spectrum of topics
Range: Basic Science; Industry-specific innovation
Topics: Nanotechnology, IT Infrastructure, Optimized Systems, Industry Solutions, Analytics, Data Storage & Transport, High-Performance Computing, Collaboration, Multimodal Analytics, Nanomedicine, Data Privacy & Security, Green Data Centers, Workflow Management, Real-time Analytics, DNA Transistor, Cloud Computing, Deep Q&A, Workforce Management, Big Data, Predictive Analytics
2014 International Business Machines Corporation
The World Wide Health Informatics Research Team
Labs: Almaden, Watson, Austin, Dublin, Zurich, Haifa, India, China, Tokyo, Brazil, Africa, Melbourne
Focus areas shown on the map: Data, Medical Imaging, Public Health Analytics, Visualization, Wellness; Social Care Analytics, Medical Imaging, Translational Medicine; Clinical decision support system, Vulnerable population; Analytics, Machine Learning, Medical Imaging, Optimization, Patient similarity, Drug efficacy, Care pathway analytics, Genomics/microbiomics, Wellness and Mobile Technologies, Risk assessment
Brazil: Population Health
Africa: Mobile Health, Health Systems Analysis
Melbourne: Analytics, Medical Imaging, Health Systems Research
Global team, global reach
Generation and Delivery of Evidence and Insights
From population averages to insights for the individual patient!
Knowledge-Driven Method: Published Knowledge (scientific papers, books, guidelines)
Data-Driven Method: Observational Data (longitudinal records; claims, Rx, labs; patient-reported data)
Closing the translational knowledge gap
Personalized insights from institutional data
Major Areas of Innovation in Data-Driven Analytics
Patient Similarity Analytics: advanced data-driven analytics on longitudinal patient data
Medical Sieve: cognitive radiology assistant with advanced visual and textual reasoning
Genomics Analytics: cognitive system for mapping sequencing results to literature
Together: a Personalized System of Insights
EuResist: Data and Deep Analytics to Improve HIV/AIDS Therapy
EuResist EU Programme: predict the in-vivo efficacy of anti-retroviral drug regimens against a given HIV, based on viral genotype data integrated with treatment response
IBM hosts and manages the largest HIV clinical genomics database in the world (EuResist DB2 v9), which is used to train the prediction systems
Contributing cohorts: Arevir, Karolinska, ARCA
Freely available on-line clinical support system
Contract with GIE, a not-for-profit (NOP) European Economic Interest Group
Patient Similarity Analysis
Query patient: $x^{Q} = (x_1^{Q}, x_2^{Q}, \ldots, x_N^{Q})$
Patient population: $x^{k} = (x_1^{k}, x_2^{k}, \ldots, x_N^{k})$ for $k = 1, \ldots, K$
Similarity analysis finds the patients in the population who are clinically similar to the query patient; similarity is assessed in clinical factor/feature space.
Questions for the query patient: Best treatment? Prognosis? Diagnosis?
Applications: Outcomes Analysis, Treatment Comparison, Disease Progression
Overcoming various analytical challenges:
High dimensionality: often >10,000 dimensions about patients
Similarity is context-dependent: relevant factors differ per morbidity
Missing data and matrix sparsity: especially when considering longitudinal data
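The similarity assessment described on this slide can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the method used by the IBM team: the function names `similarity` and `most_similar` are hypothetical, the measure is a weighted Euclidean distance mapped into (0, 1], missing values are represented as `None` and skipped, and the per-feature weights stand in for the context dependence of similarity (a different weight vector per morbidity).

```python
import math
from typing import Dict, List, Optional, Sequence

# A patient is a fixed-length feature vector; None marks a missing value.
Vector = Sequence[Optional[float]]


def similarity(query: Vector, patient: Vector, weights: Sequence[float]) -> float:
    """Weighted similarity in [0, 1], using only features observed in both patients.

    Hypothetical measure: 1 / (1 + weighted root-mean-square difference).
    Returns 0.0 when the two patients share no observed feature.
    """
    weighted_sq_diff = 0.0
    weight_sum = 0.0
    for q, p, w in zip(query, patient, weights):
        if q is None or p is None:
            continue  # skip features missing in either record
        weighted_sq_diff += w * (q - p) ** 2
        weight_sum += w
    if weight_sum == 0.0:
        return 0.0
    return 1.0 / (1.0 + math.sqrt(weighted_sq_diff / weight_sum))


def most_similar(
    query: Vector,
    population: Dict[str, Vector],
    weights: Sequence[float],
    k: int = 2,
) -> List[str]:
    """Return the ids of the k patients most similar to the query patient."""
    scored = [(similarity(query, vec, weights), pid) for pid, vec in population.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in scored[:k]]
```

Calling `most_similar(query, population, weights)` with a weight vector chosen for one condition and then another is one simple way to express that the relevant factors differ per morbidity. A real system with more than 10,000 dimensions would need dimensionality reduction or learned metrics, which this sketch does not attempt.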
Data-Driven Analytics for Disease Prediction and Interception
Data from Health Ecosystem: HR, Affinity (Retail), Diagnosis, Genomic data, Medication, Diet, Social Interaction, Diagnostics, Lab, Clinical notes, Physical Activities, Vital Signs
Data Driven Analytics: Patient Similarity, Care Pathway Analytics, Predictive Modeling, Risk Stratification, Interactive Cohort Analysis, Disease Modeling, Outcome Analysis
Solutions: Patient Engagement, Point of Care Decision Support, Practice Management, Care Mgmt. & Coordination, Wellness Management, Real World Evidence
Also shown: Visual Analytics Workbench, Health Behavior Research, Translational Medicine
Congestive Heart Failure: PHR-based early detection
Goal: Build a model for predicting heart failure onset x months before its diagnosis
Data: Longitudinal patient records
Structured data: demographics, diagnoses, problem lists, vitals, medications, labs
Unstructured text: encounter notes
Challenges faced by our clinical partners:
Collect and evaluate many weak and non-specific indicators
Identify the indicators that, when combined, are truly predictive
Combine knowledge-based and data-driven predictors
[Figure: AUC vs. prediction window (0 to 900 days), curves for HFpEF and HFrEF]
[Figure: AUC vs. observation window (30 days to all available data), curves for HFpEF and HFrEF]
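The two window parameters on this slide can be made concrete with a short sketch. It assumes a common convention that the slide does not spell out: the observation window ends where the prediction window begins, and the prediction window ends at the index date (the diagnosis date for a case patient). The function name `observation_events` and the half-open boundary choice are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

Event = Tuple[date, str]  # (event date, event code such as a diagnosis or lab)


def observation_events(
    events: List[Event],
    index_date: date,
    prediction_window_days: int,
    observation_window_days: int,
) -> List[Event]:
    """Return the events that fall inside the observation window.

    Assumed layout on the timeline:
      [window_start, window_end) = observation window (features come from here)
      [window_end, index_date)   = prediction window (data here is not used)
    """
    window_end = index_date - timedelta(days=prediction_window_days)
    window_start = window_end - timedelta(days=observation_window_days)
    return [(day, code) for day, code in events if window_start <= day < window_end]
```

With a prediction window of 180 days and an observation window of 360 days, only records dated between 540 and 180 days before the index date would feed the model. Sweeping the two parameters is what produces curves like the ones in the figures above.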
Congestive Heart Failure: Onset prediction results
[Figure: AUC (0.5 to 0.8) vs. number of features (0 to 600); annotated points: +Hypertension, +diabetes, CAD, all knowledge features, +50, +100, +150, +200]
4,644 case patients, 45,981 control patients
Over 20k features of different types (diagnoses, demographics, Framingham symptoms, lab results, medications, vitals)
Novel feature selection algorithm enabling integration of knowledge-driven and data-driven risk factors
Investigation of different observation windows (30 to 900 days) and prediction windows (1 to 720 days)
Investigation of multiple classification models (logistic regression, random forest, kNN, Cox regression)
AUC improves significantly as complementary data-driven risk factors are added to the existing knowledge-based risk factors. A significant AUC increase occurs when the first 50 data-driven features are added.
NIH grant on early CHF detection, 2013
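The idea of starting from knowledge-based risk factors and adding data-driven features while tracking AUC can be sketched as follows. This is not the novel feature selection algorithm mentioned on the slide, whose details are not given here. It is a plain greedy forward selection with a hypothetical unweighted risk score (the sum of the selected feature values), and all function names are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a random case
    outscores a random control (ties count one half)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_score(row: Dict[str, float], features: Sequence[str]) -> float:
    """Hypothetical unweighted score: sum of the selected feature values."""
    return sum(row[f] for f in features)


def greedy_add(
    rows: List[Dict[str, float]],
    labels: Sequence[int],
    knowledge_features: Sequence[str],
    candidates: Sequence[str],
    max_new: int,
) -> Tuple[List[str], List[float]]:
    """Add up to max_new candidate features, one at a time, each time
    choosing the one that raises AUC the most; stop when none helps."""
    selected = list(knowledge_features)
    remaining = list(candidates)
    history = [auc(labels, [risk_score(r, selected) for r in rows])]
    for _ in range(max_new):
        best_feature, best_auc = None, history[-1]
        for f in remaining:
            trial = auc(labels, [risk_score(r, selected + [f]) for r in rows])
            if trial > best_auc:
                best_feature, best_auc = f, trial
        if best_feature is None:
            break
        selected.append(best_feature)
        remaining.remove(best_feature)
        history.append(best_auc)
    return selected, history
```

The returned `history` list plays the role of the curve in the figure: its first entry is the AUC with knowledge features only, and each later entry is the AUC after one more data-driven feature. In practice the AUC should be measured on held-out data rather than on the rows used for selection.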
Use Case: Prevention by care coordination
Schizophrenia
Seriously and Persistently Mentally Ill (SPMI) individuals have one of the following diagnoses: bipolar disorder, schizophrenic disorder, major depression
People who are lost in the system
The dataset includes data on an SPMI population of 29,558 individuals.
Four primary negative outcomes of mental health crisis: rehospitalization, reincarceration, homelessness, suicide
A predictive model was constructed using elastic-net regularized logistic regression. It considered age, past arrests, mental health diagnosis, and the use of a jail diversion program and of outpatient, medical, and case management services. The model predicted the probability of reincarceration with AUC = 0.67.
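For readers unfamiliar with the technique named on this slide, here is a minimal, dependency-free sketch of elastic-net regularized logistic regression fitted by proximal gradient descent. It is an illustration under stated assumptions, not the model built on the SPMI data: the function names, the hyperparameters `alpha` and `l1_ratio`, and the optimizer are all hypothetical choices. The objective minimized is the mean log-loss plus `alpha * (l1_ratio * |w|_1 + (1 - l1_ratio) / 2 * |w|_2^2)`, with the intercept left unpenalized.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value: float, threshold: float) -> float:
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logistic(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    alpha: float = 0.1,
    l1_ratio: float = 0.5,
    learning_rate: float = 0.1,
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Return (weights, intercept) fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= learning_rate * grad_b
        for j in range(d):
            # Gradient step on the log-loss and the L2 part of the penalty.
            step = w[j] - learning_rate * (grad_w[j] + alpha * (1.0 - l1_ratio) * w[j])
            # Proximal step for the L1 part; this is what zeroes weak features.
            w[j] = soft_threshold(step, learning_rate * alpha * l1_ratio)
    return w, b


def predict_proba(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Predicted probability of the positive outcome for one individual."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
```

The L1 part of the penalty drives the weights of uninformative predictors to exactly zero, while the L2 part keeps correlated predictors (for example, overlapping service-use variables) from destabilizing each other. That combination is the usual reason for choosing elastic net over either penalty alone.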
The majority of our health- and wellness-related data is exogenous
Exogenous data (behavior, socio-economic, environmental, ...): 60% of determinants of health; 1,100 terabytes generated per lifetime; Volume, Variety, Velocity, Veracity
Genomics data: 30% of determinants of health; 6 TB per lifetime; Volume
Clinical data: 10% of determinants of health; 0.4 TB per lifetime; Variety
Source: "The Relative Contribution of Multiple Determinants to Health Outcomes", Laura McGovern et al., Health Affairs, 33, no. 2 (2014)
Leveraging Exogenous Data for Chronic Care (Type 2 Diabetes; Primary & Secondary Prevention)
Clinical and exogenous data: Glucose Monitoring, Calorie Intake, Physical Activity, Stress Levels, Sleep Pattern, Other vital signs, Social Interaction, Affinity (retail), Employer (HR), Genetics, Medications, Lab Results
Disease management and treatment: calibration of blood glucose readings; medication/dosage management; co-morbidity management
Behavioral change and prevention: context-aware profile generation; behavior models to develop recommendation services
Combine health and HR data points to create new hypotheses for predicting adherence behavior
Employee Health Benefits: Medical Claims Extracts, Prescription Benefits Extracts, Benefits Eligibility Extract
Employee Job Characteristics: Employee Job Roles and Performance, Employee Travel Expense Data
Employee Wellness Programs: Health Risk Assessments, Healthy Living Rebates Data, Care Coordination Program Data
120M+ records harmonized; 8 data sources aggregated; 5 years of historical data
Adherence Data Models
Data Mining and Predictive Analytics: Propensity to Adhere
Collaboration with a national not-for-profit wellness initiative
What's the Way to Wellville?
The Way to Wellville is a five-community, five-year challenge to produce visible improvements in five measures of health and economic vitality. The challenge is sponsored by HICCup (Health Initiative Coordinating Council), a nonprofit founded by angel investor Esther Dyson to encourage new models and markets for the production of health.
The Wellville Five: Clatsop County, OR; Greater Muskegon, MI; Lake County, CA; Niagara Falls, NY; Spartanburg, SC