1 Modeling and Prediction With ICU Electronic Health Records Data
Department of Computer Science, University of Massachusetts Amherst
January 21, 2011
2 The Big Picture: History of Medical Records
The first use of centralized, paper medical charts began in ...
The first electronic medical records systems were introduced in the 1970s.
Today, an increasing number of hospitals are adopting electronic records.
3 The Big Picture: EHR Adoption Rates
[Figure: Percent of non-federal acute care hospitals with adoption of at least a Basic EHR]
4 The Big Picture: Meaningful Use
As the amount of electronic health records data increases, there is a growing opportunity to leverage this data to improve the quality of care that patients receive:
- Prediction of length of stay, diagnosis, and mortality outcomes, leading to improved clinical decisions.
- Better ways of accessing data, such as physiology-based patient similarity search.
5 Cognitive Prosthesis
8 Outline
- Introduction
- Electronic Health Records Data
- Modeling, Learning & Inference
- Discussion
9 EHRs: What's in them?
The data contained in EHRs is collected during the course of routine treatment. An EHR includes information like:
- Diagnosis and some outcome information (e.g., died while in hospital)
- Medications ordered and administered (including time stamps and dose)
- Laboratory analysis and radiologic information
- Clinical notes with assessments
- In the case of ICU EHRs, physiologic data
10 EHRs: Physiological Data
The physiological data in an EHR is a multivariate time series that starts when the patient is admitted and ends when the patient is discharged.
[Figure: HR and RR time series from admission to discharge]
11 EHRs: The vpicu Dataset
The vpicu data set contains measurements of 13 variables from over 10,500 episodes.

Variable                               Msmts per day
Pulse Oximetric Saturation (SpO2)
Heart Rate (HR)
Respiratory Rate (RR)
Systolic Blood Pressure (SBP)
Diastolic Blood Pressure (DBP)
End-tidal Carbon Dioxide (ETCO2)
Temperature (Temp)
Total Glasgow Coma Score (TGCS)
Capillary Refill Rate (CRR)
Urine Output (UO)                      9.50
Fraction Inspired Oxygen (FIO2)        5.17
Glucose (Gluc)                         2.06
pH                                     1.50
12 EHRs: The vpicu Dataset
The length of stay per patient episode varies from a few hours to a few years.
[Figure: Length of Stay (days)]
13 EHRs: The vpicu Dataset
The time series for different patients are usually not aligned in any meaningful way.
[Figure: HR time series for two patients, each from admission to discharge]
14 EHRs: The vpicu Dataset
The data are sparsely and irregularly sampled in time, and the measurement times are not aligned at all.
[Figure: HR and RR time series from admission to discharge]
15 EHRs: The vpicu Dataset Different variables have vastly different sampling frequencies from an average of once an hour to an average of once per day.
16 EHRs: The vpicu Dataset Different patients have vastly different sampling frequencies for the same variable.
17 EHRs: The vpicu Dataset
Some variables may be subject to sample selection bias for a variety of reasons, including observation of abnormal values.
[Figure: HR and RR time series from admission to discharge]
18 EHRs: The vpicu Dataset
Some variables may be subject to nonrandom missingness for a variety of reasons, including diagnosis hypotheses.
[Figure: HR and ETC time series from admission to discharge]
19 EHRs: The vpicu Dataset Some variables may be subject to nonrandom missingness.
20 EHRs: The vpicu Dataset
The application of interventions alters the underlying physiology and needs to be taken into account.
[Figure: HR and RR time series from admission to discharge, with interventions I1, I2 and I3 marked]
25 Clustering: Simplifications
- We dealt with irregular sampling issues using temporal discretization (binning).
- We dealt with the resulting missing observations by assuming they are missing at random (MAR).
- We ignored temporal alignment issues.
- We dealt with interventions and length of stay variations by modeling 24 hours of data only.
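The binning step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `bin_series`, the one-hour bin width, and the choice to average samples within a bin are all assumptions. Bins with no measurement are left as NaN, which is where the MAR assumption would later apply.

```python
import numpy as np

def bin_series(times_h, values, n_bins=24, bin_width_h=1.0):
    """Average irregularly timed samples into fixed-width bins.

    times_h: measurement times in hours since admission (hypothetical layout).
    Bins without any sample are returned as NaN (missing).
    """
    times_h = np.asarray(times_h, dtype=float)
    values = np.asarray(values, dtype=float)
    idx = np.floor(times_h / bin_width_h).astype(int)
    keep = (idx >= 0) & (idx < n_bins)      # drop samples outside the window
    sums = np.zeros(n_bins)
    counts = np.zeros(n_bins)
    np.add.at(sums, idx[keep], values[keep])
    np.add.at(counts, idx[keep], 1)
    binned = np.full(n_bins, np.nan)
    observed = counts > 0
    binned[observed] = sums[observed] / counts[observed]
    return binned

# Made-up heart rate samples; the one at 30 h falls outside the 24 h window.
hr_times = [0.2, 0.7, 3.5, 30.0]
hr_values = [120.0, 124.0, 110.0, 95.0]
hr_binned = bin_series(hr_times, hr_values)
```

With hourly bins, a variable measured about once per day leaves most bins empty, which is the sparsity the clustering model then has to handle.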
26 Clustering: Simplifications
27 Clustering: Dealing with Sparsity ML: MAP:
28 Clustering The model is wrong, but is it useful? Are the clusters associated with recognizable physiologic and diagnostic patterns that have prognostic significance?
29 Clustering: Cluster Analysis
30 Clustering: Cluster Analysis
- Low blood pressure
- Prolonged capillary refill
- High heart rate
- High respiratory rate
- Low SaO2
- Low pH
- Low TGCS
Shock and depressed cognitive function
31 Clustering: 24H Mortality Prediction
32 Patient-Patient Similarity: Simplifications
- We're dealing with irregular sampling directly using continuous-time models.
- We're still treating totally missing data as MAR and ignoring temporal alignment, sample selection bias and interventions.
33 Patient-Patient Similarity: Gaussian Processes
34 Patient-Patient Similarity: Gaussian Processes
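As a rough illustration of how a Gaussian process handles irregular sampling in continuous time, the sketch below computes a GP regression posterior at arbitrary query times. The squared-exponential kernel, the constant mean set to the sample average, and every hyperparameter value are assumptions made for this example. The slides do not specify the model actually used.

```python
import numpy as np

def sq_exp_kernel(t1, t2, variance=25.0, length_scale=2.0):
    """Squared-exponential covariance between two sets of times (hours)."""
    d = np.subtract.outer(t1, t2)
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(t_obs, y_obs, t_query, noise_var=1.0):
    """Posterior mean and variance of the latent signal at t_query."""
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_query = np.asarray(t_query, dtype=float)
    mean_y = y_obs.mean()                   # assumed constant prior mean
    K = sq_exp_kernel(t_obs, t_obs) + noise_var * np.eye(len(t_obs))
    K_s = sq_exp_kernel(t_query, t_obs)
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs - mean_y))
    mean = mean_y + K_s @ alpha
    v = np.linalg.solve(chol, K_s.T)
    var = np.diag(sq_exp_kernel(t_query, t_query)) - np.sum(v ** 2, axis=0)
    return mean, var

# Made-up heart rate readings at irregular times during the first 24 hours.
t_obs = [0.5, 1.0, 4.0, 9.5]
y_obs = [120.0, 118.0, 110.0, 112.0]
t_query = np.linspace(0.0, 24.0, 49)
post_mean, post_var = gp_posterior(t_obs, y_obs, t_query)
```

The posterior variance is small near the observations and returns to the prior variance far from them, so stretches with no measurements are represented as uncertain rather than imputed.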
35 Patient-Patient Similarity: Bhattacharyya
We can measure the similarity between two Gaussian process posteriors at any time point t by computing the Bhattacharyya coefficient between their marginal distributions p_1 and p_2 at t:
BC(t) = \int p_1(x)^{1/2} \, p_2(x)^{1/2} \, dx
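For a single variable, the posterior marginal of a Gaussian process at time t is a univariate Gaussian, and the Bhattacharyya coefficient between two univariate Gaussians has a standard closed form. The sketch below assumes that one-dimensional case. The function name is hypothetical, and the slides may have used the multivariate form instead.

```python
import math

def bhattacharyya_gauss(mean1, var1, mean2, var2):
    """Bhattacharyya coefficient between N(mean1, var1) and N(mean2, var2).

    BC = sqrt(2*s1*s2 / (s1^2 + s2^2)) * exp(-(m1 - m2)^2 / (4*(s1^2 + s2^2)))
    where s1, s2 are standard deviations. Returns a value in (0, 1].
    """
    var_sum = var1 + var2
    scale = math.sqrt(2.0 * math.sqrt(var1 * var2) / var_sum)
    return scale * math.exp(-((mean1 - mean2) ** 2) / (4.0 * var_sum))

# Two patients' posterior marginals at the same time point (made-up numbers).
same = bhattacharyya_gauss(110.0, 4.0, 110.0, 4.0)
apart = bhattacharyya_gauss(110.0, 4.0, 140.0, 4.0)
```

The coefficient is 1 for identical distributions and falls toward 0 as the means separate. It also drops when the variances differ, so a confident posterior and an uncertain one are not treated as identical even if their means agree.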
36 Patient-Patient Similarity: Bhattacharyya
37 Patient-Patient Similarity: Examples
38 Preliminary Tests: Spectral Embedding
40 Discussion
There is a hierarchy of increasingly sophisticated models that could be applied to this data. The direct similarity approach can incorporate many possible (and necessary) extensions:
- Alignment
- Selection bias
- Not missing at random (NMAR) data
- Interventions?
- Additional data sources impacting similarity
41 Discussion
An advantage of the direct similarity approach is that it can be plugged into many machine learning methods:
- Dimensionality reduction
- Clustering
- Classification
- Regression
We can use these tools to drive data visualization and information retrieval as well as prediction.
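As one example of the dimensionality reduction use, a patient-patient similarity matrix can be fed directly into a spectral embedding. The sketch below uses eigenvectors of the symmetric normalized graph Laplacian, which is one common variant and an assumption here. The slides do not say which formulation was used, and the similarity values are made up.

```python
import numpy as np

def spectral_embedding(similarity, n_components=2):
    """Embed items using eigenvectors of the normalized graph Laplacian.

    similarity: symmetric, non-negative matrix with positive row sums.
    Returns an (n_items, n_components) array of coordinates.
    """
    S = np.asarray(similarity, dtype=float)
    S = 0.5 * (S + S.T)                     # enforce symmetry
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    laplacian = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)   # ascending eigenvalues
    # Skip the first eigenvector, which carries no grouping information.
    return eigvecs[:, 1:n_components + 1]

# Four hypothetical patients forming two similar pairs.
S = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])
embedding = spectral_embedding(S, n_components=1)
```

In this toy case the first embedding coordinate has one sign for patients 0 and 1 and the opposite sign for patients 2 and 3, so the two groups separate along a single axis. The same matrix could equally serve as a kernel for clustering or nearest-neighbor retrieval.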
42 Collaborators and Students
- Dr. Randall Wetzel: Professor of Pediatrics and Anesthesiology; The Anne O'M. Wilson Professor of Critical Care Medicine; Director, Critical Care Medicine, Childrens Hospital Los Angeles; Director, The Laura P. and Leland K. Whittier Virtual Pediatric ICU.
- Dr. Robinder Khemani: Anesthesiology and Critical Care Medicine, Childrens Hospital Los Angeles and The Laura P. and Leland K. Whittier Virtual Pediatric ICU.
- David Kale: The Laura P. and Leland K. Whittier Virtual Pediatric ICU.
- Steve Li: UMass CS PhD student
- Gregory W. Koch: UMass REU student / Austin College