Big Data and Analytics in Healthcare Philadelphia 2013 Health Information Exchanges and Big Data: Challenges and Opportunities Hadi Kharrazi, MHI, MD, PhD Johns Hopkins University hkharrazi@jhsph.edu
The presentation in a nutshell Introduction Big Data and Healthcare HIE History, Architecture and Services HIE Examples (Indiana HIE, CRISP) HIE Big Data Exploration (translating CPGs into population health data) HIE and Population Health IT Framework JHU Center for Population Health IT (CPHIT) 2
Introduction Hadi Kharrazi, MHI, MD, PhD o o o Assistant Professor, Johns Hopkins School of Public Health Department of Health Policy and Management Joint Appointment, Johns Hopkins School of Medicine Division of Health Sciences Informatics Assistant Director Johns Hopkins Center for Population Health IT (CPHIT) o o o E: hkharrazi@jhsph.edu L: http://www.linkedin.com/in/kharrazi/ W: http://jhsph.edu/cphit o Research interests: Population DSS, HIE & ACO Analytics 3
Big Data and Healthcare Statistics: o Data size: 2012 = 500 petabytes 2020 = 25,000 petabytes o Cost: 2009 = 17.6% GDP ($2.9t) 2025 = 25.0% GDP o Big Data Saving: ~$300b/yr Definition: o Big Data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. History in Healthcare o Last time: We pushed the data boundary in the Human Genome Project (1990) o This time: Massive roll out of EHRs (Meaningful-Use) and Integration of various source of data (genomic, mobile, patient-generated, exchanges) 4
Big Data and Healthcare (cont.) Big Data Specs data driven 4Vs: o Volume quantity (size) o Variety type (structure, standardized, ontology) o Veracity quality (meaning, completeness, accuracy) o Velocity time (real-time, timeliness) a b c d e Volume Variety Veracity Velocity? 5
Big Data and Healthcare (cont.) Basic Research Biomedical informatics methods, techniques, and theories Bioinformatics Population HIT Public Health Informatics Applied Research Imaging Informatics Clinical Informatics Consumer Health Informatics Molecular Research Health Research Biomedical informatics as a basic science 6
Big Data and Healthcare (cont.) Copyright H.Kharrazi Big Data Specs clinical driven 5Ms: o Measure size, type, quality (QMs, digitization, meaningful use) o Mapping integration, interoperability (information exchanges, EDWs) o Methods analytics (exploration, visualization, predictive, replacing RCTs) o Meanings knowledge (EBM, CPG, meaningful use) o Matching outcomes (triple aims, ACA) Mandates CQMs Devices Mandates Business Standards Analytics Modeling Simulation Application Knowledge Outcome EHRs EDWs Expl/Visu CPG ACA Genomics Bio Signal Consumer Measure ACOs HIEs Mapping Statistical Data Mining Methods CDSS EBM Meaning HITECH MU1,2,3 Matching Triple Aim Public NwHIN Machine Learning Theories Other policies????????????? 7
Big Data and Healthcare (cont.) Copyright H.Kharrazi Johns Hopkins JH Community Physicians JH Hospitals JH Health Care Epic Measure - JH ACO Mapping Methods Meaning -????? JCHIP / JCASE (CMMI) Matching?? 8
HIE History Copyright HiMSS 9
HIE History (cont.) > ONC ONC relationship with DHHS Copyright HiMSS 10
HIE History (cont.) > Health Information Organization (HIO) HIE (verb): The electronic movement of health-related information among disparate organizations according to nationally recognized standards in an authorized and secure manner. HIO (noun): An organization that oversees and governs the exchange activities of health-related information among independent stakeholders and disparate organizations according to nationally recognized standards in an authorized and secure manner. An HIO can be described by many acronyms, including: o o o o o State Level Health Information Exchange (SLHIE) Regional Health Information Exchange (RHIO) Regional Health Information Network (RHIN) Health Information Exchange Networks (HIE[N]) Others: Integrated Delivery Systems (IDNs); Physician practices HIEs; Payerled HIEs; and, Disease-specific HIEs. Copyright HiMSS 11
HIE Architecture > Centralized EHRs PHRs Claims Data Repository Registries PACS LABs Centralized HIE Copyright Regenstrief Institute 12
HIE Architecture (cont.) > Federated EHRs PHRs Claims Registries PACS LABs Federated Inconsistent HIE Copyright Regenstrief Institute 13
HIE Architecture (cont.) > Federated EHRs PHRs Claims MPI Registries PACS LABs Federated Consistent HIE Copyright Regenstrief Institute 14
HIE Architecture (cont.) > Hybrid EHRs PHRs Data Repository Claims EHRs LABs Claims MPI Registries PACS LABs Hybrid HIE Copyright Regenstrief Institute 15
HIE Architecture (cont.) > Switch EHRs PHRs Claims Data Standardization Machine Registries PACS LABs Switch HIE Copyright Regenstrief Institute 16
HIE Architecture (cont.) > Patient Centric EHRs Other Claims Registries PACS LABs Patient Centric HIE (e.g., PHR controlled) Copyright Regenstrief Institute 17
HIE Architecture (cont.) Health Information Organizations / Regional HIOs Copyright Regenstrief Institute 18
HIE Architecture (cont.) > NwHIN Connecting RHIOs / NwHIN Copyright Regenstrief Institute 19
HIE Architecture (cont.) > NwHIN NHII (2004) NHIN (2010) NwHIN (2011) Nationwide Health Information Network (NwHIN) Copyright HiMSS 20
HIE Services > Core Services Presentation Services: login, patient look-up, request patient records, view data Business Application Services: e-prescribing, EMR, lab, radiology, eligibility checking, problem list/visit history Data Management Services: data persistence/access, value/code sets, key manage. Data Storage Services: message logs, XML Schemas, Provider/User Directory Integration Services: message translation/transport, HL7 mapping, EMR adapter System Management Services: system config, audit/logging, exception handling Measure Mapping Methods Meaning Matching - -??? Copyright HiMSS 21
HIE Services > Data Services by Constituency Measure? Hospitals: o Clinical messaging o Medication reconciliation Public Health: o Needs assessment o Biosurveilance Mapping Methods???? o Shared EHR o Reportable conditions Meaning? o Eligibility checking Physicians: o Results delivery Consumers: Matching - o Result reporting o Secure document sharing o Shared EHR o Clinical decision support o Eligibility checking Laboratory: o Clinical messaging o Orders o Personal Health Records Researchers: o De-identified longitudinal clinical data Payers: o Quality measure o Claims adjustment o Secure document transfer Copyright HiMSS 22
HIE Services > Emerging Services Next Generation Analytics o Data warehouse, data analytics and business intelligence o Quality reporting support Measure Mapping Methods - -??? o Performance management Meaning??? o Fraud and abuse identification and prevention Matching?? o Care gap identification o Care and disease management o Public health monitoring and analysis o Population monitoring and predictive profiling Copyright HiMSS 23
HIE Examples > IHIE The Indiana HIE (IHIE) includes (as of mid-2011): Federated Consistent Databases 22 hospital systems ~70 hospitals 5 large medical groups and clinics & 5 payors Several free-standing labs and imaging centers State and local public health agencies 10.75 million unique patients 20 million registration events 3 billion coded results 38 million dictated reports 9 million radiology reports 12 million drug orders 577,000 EKG tracings 120 million radiology images Copyright Regenstrief Institute 24
HIE Examples (cont.) > CRISP (Chesapeake Regional Info. Sys. for our Patients) Focus Areas: o Query Portal Growth o Direct Secure Messaging o Encounter Notification System (ENS) o Encounter Reporting System (ERS) o Health Benefits Exchange integration Progress Metric Organizations Live Result Hospitals (Total 48) 48 Hospital Clinical Data Feeds 86 (Total 143 - Lab, Radiology, Clinical Docs) National Labs 2 Radiology Centers (Non-Hospital) 5 Identities and Queries Master Patient Index (MPI) Identities ~4M Opt-Outs ~1500 Queries (Past 30 Days) ~3500 Data Feeds Available Lab Results ~16M Radiology Reports ~5M Clinical Documents ~2M Copyright CRISP 25
HIE Big Data Exploration > Translating CPGs into Population Health Data Clinical Practice Guidelines Computerized AHRQ s Computerized NGC Popup CPGs (guideline.gov) alert (Know.Rep.Lang.) DSS Computerized CPG (CARE, Arden syntax) Manual Programming for each CPG One patient at the time Relational DB Arden mysql_connect( localhost, Syntax: root, pass, ehr'); *Blood Patient Pressure, has diabetes Weight/Body and has Mass not Index had an (BMI) - Every visit. For adults: Blood pressure mysql_query( SELECT eye logic: exam in two years. patient_id Based FROM on patient target goal <130/80 mm Hg; BMI (body mass WHERE guideline if (creatinine patient.fname xxx is you not may number) = peter want or AND to (calcium consider is index) <25 kg/m2. patient.lname asking not number) for an or= eye Johnson ); consultation. (phosphate is not number) then For children: Blood pressure target goal <90th $results conclude = mysql_fetch(); false; percentile adjusted for age, height, and gender; endif; BMI-for-age <85th percentile. $patient_id product := $results[ patient_id ]; calcium * phosphate; ----------------------------------------------------- Foot Exam (for adults) - Thorough visual CARE mysql_query Language: ( SELECT * FROM lab WHERE inspection every diabetes care visit; pedal patient_id = $patient_id and lab_term = pulses, neurological exam yearly. BEGIN HBA1c ); BLOCK CHOLESTEROL Dilated Eye Exam (by trained expert) - Type 1: DEFINE $results "MALE_PT" = mysql_fetch(); {VALUE} AS "SEX" IS = 'M' Five years post diagnosis, then every year. DEFINE "CHOLESTEROL LAST" AS LAST "CHOL TESTS" $lab_result [BEFORE = $results[ lab_term ]; "DATE"] EXIST Depression - Probe for emotional/physical IF "CHOLESTEROL LAST" WAS AFTER 5 YEARS factors linked to depression yearly; treat AGO IF ($lab_result > 9){ aggressively with counseling, medication, THEN ECHO EXIT Patient CHOLESTEROL has a high HBA1c level ; and/or referral. ELSE } CONTINUE 26
HIE Big Data Exploration (cont.) Clinical Practice Guidelines NQF QM emeasures Computerized CPG (CARE, Arden syntax) CDSS SQL n Relational DB n Manual Programming for each CPG Automated for all CPGs CDSS SQL Relational DB One patient at the time All patient at the time CDSS SQL 2 Relational DB 2 Relational DB 1 CDSS SQL 1 Population Health Informatics 27
HIE Big Data Exploration (cont.) CPG NQF CDSS SQL n Relational DB n CPG Consortium CARE Arden XML Lexical Integration Syntactic Mapping Semantic Integration XSLT trans. Renewable CDSS SQL CDSS SQL 2 CDSS SQL 1 Relational DB Relational DB 2 Relational DB 1 Generating Renewable CDSS Pragmatic Integration 28
HIE Big Data Exploration (cont.) Language Specific Language Free 29
HIE Big Data Exploration (cont.) Rules Population One patient* CD4_1.sql 59 0.18 CHOLESTEROL_1.sql 180 0.11 CHOLESTEROL_2.sql 1 0.12 DIABETES_1.sql 541 0.56 DIABETES_2.sql 1561 0.47 DIABETES_3.sql 182 0.40 DIABETES_4.sql 121 0.40 DIABETES_5.sql 1620 1.51. TCL_2.sql 67 0.12 TCL_3.sql 69 0.18 TCL_4.sql 61 0.17 TETANUS_SHOT_1.sql 1 0.07 WHOLE_1.sql 127 0.16 WHOLE_2.sql 129 0.13 Average (sec) 1249 0.73 Runtime per second * each row shows average of 20 individual runs 30
MEDICINE_1.sql HIV_PT_1.sql TCL_3.sql PREVENT_STOOL_1.sql CD4_1.sql WHOLE_1.sql TCL_4.sql IDC_REM_1.sql OBFORM_6.sql 31 PREVENT_MAM_1.sql HIE Big Data Exploration (cont.) Institute 1: All CDS Hits (regardless of missed proportion) 70000 60000 50000 40000 30000 20000 10000 0 K_1.sql FLU_SHOT_5.sql DIABETES_5.sql TCL_1.sql WHOLE_2.sql PREVENT_STOOL_4.sql DIABETES_1.sql K_2.sql PREVENT_MAM_3.sql FLU_SHOT_1.sql TCL_2.sql LIPID_MANAGEMENT_1.sql TETANUS_SHOT_1.sql PEDS_9.sql PEDS_8.sql PEDS_7.sql PEDS_1.sql PEDS_11.sql PEDS_10.sql FLU_SHOT_3.sql HEPVAX_B_1.sql FLU_SHOT_4.sql PNVX2_1.sql PREVENT_MAM_2.sql CHOLESTEROL_1.sql CHOLESTEROL_2.sql PREVENT_STOOL_2.sql PREVENT_PAP_1.sql MEDICINE_2.sql OBFORM_5.sql OBFORM_2.sql OBFORM_3.sql OBFORM_1.sql applicable CARE rules Applying all applicable CARE rules for Institute-1 Some rules are never hit for institute 1 (not shown in this diagram)
HIE Big Data Exploration (cont.) rules institutes Institute versus Rules (Raw Numbers) (zero rules / zero institutes are removed) 32
%13.49 Avg Percent HEPVAX_B_1.sql PREVENT_MAM_2.sql PREVENT_MAM_3.sql PEDS_7.sql PEDS_1.sql 33 HIE Big Data Exploration (cont.) Institute 1: Percentage of Hit / Miss Rules 30.00% 25.00% 20.00% 15.00% 10.00% 5.00% 0.00% TCL_1.sql CD4_1.sql K_2.sql OBFORM_3.sql FLU_SHOT_1.sql PEDS_9.sql IDC_REM_1.sql PEDS_11.sql LIPID_MANAGEMENT_1.sql PREVENT_STOOL_4.sql WHOLE_2.sql CHOLESTEROL_1.sql OBFORM_2.sql OBFORM_6.sql PNVX2_1.sql PEDS_10.sql OBFORM_1.sql DIABETES_5.sql PREVENT_PAP_1.sql ~12.3% PREVENT_STOOL_1.sql FLU_SHOT_3.sql CHOLESTEROL_2.sql PREVENT_STOOL_2.sql OBFORM_5.sql K_1.sql FLU_SHOT_4.sql TCL_4.sql MEDICINE_2.sql TETANUS_SHOT_1.sql TCL_2.sql FLU_SHOT_5.sql PEDS_8.sql DIABETES_1.sql WHOLE_1.sql TCL_3.sql PREVENT_MAM_1.sql MEDICINE_1.sql HIV_PT_1.sql applicable CARE rules Missed applicable CARE rules for institute-1 (CDSS-generated QM fingerprint) Manual process of deciphering the rules (hit-means-miss OR hit-means-hit-only) The lower the better (less missed)
Percent %3.47 Avg PREVENT_MAM_2.sql PREVENT_STOOL_2.sql PREVENT_STOOL_4.sql HEPVAX_B_1.sql 34 HIE Big Data Exploration (cont.) Institute 2: Inverse Percentage of Hit / Miss Rules 30.00% 25.00% 20.00% 15.00% 10.00% 5.00% 0.00% LIPID_MANAGEMENT_1.sql TCL_2.sql OBFORM_5.sql PNVX2_1.sql PEDS_10.sql OBFORM_1.sql ~5.5% PREVENT_MAM_3.sql K_1.sql K_2.sql FLU_SHOT_3.sql CHOLESTEROL_2.sql CHOLESTEROL_1.sql MEDICINE_2.sql OBFORM_3.sql PEDS_11.sql TETANUS_SHOT_1.sql OBFORM_2.sql PEDS_8.sql PEDS_1.sql FLU_SHOT_4.sql PREVENT_PAP_1.sql WHOLE_2.sql TCL_4.sql PEDS_9.sql PEDS_7.sql OBFORM_6.sql IDC_REM_1.sql HIV_PT_1.sql FLU_SHOT_1.sql CD4_1.sql applicable CARE rules Missed applicable CARE rules for institute-2 (CDSS-generated QM fingerprint) Manual process of deciphering the rules (hit-means-miss OR hit-means-hit-only) The lower the better (less missed)
HIE Big Data Exploration (cont.) Rules Missed % institutes rules Missed rate by institute The effect size of the missed rules depends on the population size. 35
HIE and Population HIT > Population Health Informatics Population Health Info. Population Health Analytics Population Health Data-warehouse PubHI to CI CI to PubHI Clinical Informatics Personal Health Records EHRs 1 n Insurance Data HIE Rx Data Admin data repositories A Public Health Informatics B C Public Health CI to PubHI PubHI to CI 36
HIE and Population HIT > CDS Continuum (cont.) Population-based HIE EHRs PHRs Claims MPI Registries PACS LABs Population-based CDSS across databases 37
CPHIT The Johns Hopkins Center for Population Health Information Technology (CPHIT, or see-fit ) The mission of this innovative, multi-disciplinary R&D center is to improve the health and well-being of populations by advancing the state-of-the-art of Health IT across public and private health organizations and systems. CPHIT focuses on the application of electronic health records (EHRs), mobile health (m-health) and other e-health and HIT tools targeted at communities and populations. Director: Dr. J. Weiner www.jhsph.edu/cphit Copyright Dr. J. Weiner 38
CPHIT (cont.) External PH/IDS Orgs. JH Health System JH Healthcare Solutions, LLC CPHIT JHU Academic Departments and other R&D centers Industry Foundations Government Business Partners CPHIT Organizational Linkages Copyright Dr. J. Weiner 39
CPHIT (cont.) CPHIT s R&D Priorities Development and testing of measures created from EHRs and other HIT systems to quantify quality, outcome and need within target populations and communities. Use and advancement of computing methodologies including natural language processing (NLP) and pattern recognition tools to improve the application of nonstructured EHR data for population-based interventions. Initiation of effective approaches for linking provider-centric EHR systems with consumer-centric internet and mobile-based e-health applications. Development of EHR-based tools and decision support applications to identify and help manage high risk populations for preventive and/or chronic care and to coordinate care within IDS/ACO. Strategic approaches for creating an interoperable community of EHR networks and integrating them with the current functions of public health agencies (e.g., surveillance and vital records). Creation of legal/ethical and policy frameworks to support the development of secondary use of EHRs for public health programs and research. Copyright Dr. J. Weiner 40
CPHIT (cont.) > Research Pipeline Developing and Testing "e-acgs": Using EHR / HIT Data to Enhance Johns Hopkins ACGs' Ability to Measure Risk and Health Status Real-time, Cross-Provider High Readmission Risk Detection and Notification System Evaluate the economic impact of the Shared Patient Virtual Health Record (SPVHR) in Inpatient Settings Prescription Drug Monitoring Program (PDMP) Evaluation Developing and evaluating an Inter-Provider Health Screening and Quality Measure System (IPHSQMS) Identifying high risk pregnancies by linking claims and clinical data using natural language processing tools www.jhsph.edu/cphit 41
Thank you Q & A www.jhsph.edu/cphit 42