Big Data Integration and Governance Considerations for Healthcare



White Paper Big Data Integration and Governance Considerations for Healthcare by Sunil Soares, Founder & Managing Partner, Information Asset, LLC

There is a lot of discussion in the press about Big Data. Big Data is traditionally defined in terms of the three Vs: Volume, Velocity, and Variety. In other words, Big Data is often characterized as high-volume, streaming, and including semi-structured and unstructured formats. Healthcare organizations have produced enormous volumes of unstructured data, such as the notes by physicians and nurses in electronic medical records (EMRs). In addition, healthcare organizations produce streaming data, such as readings from patient monitoring devices. Now, thanks to emerging technologies such as Hadoop and streams, healthcare organizations are in a position to harness this Big Data to reduce costs and improve patient outcomes. However, this Big Data has profound implications from an Information Governance perspective. In this white paper, we discuss Big Data Governance from the standpoint of three case studies.

Governance of Electronic Medical Records for Predictive Analytics

A large hospital system offered a broad range of services, including emergency care. A significant portion of the patient population was indigent. The hospital implemented a pilot program to leverage big data analytics aimed at reducing the readmission rate of patients with congestive heart failure. The objectives of the study were two-fold:

1. Reduce costs that will not be reimbursed by insurance. The United States Medicare and Medicaid programs are moving to approaches that reduce or eliminate payments for care to patients who are readmitted for the same disease.
2. Increase the quality of patient care by proactively implementing early intervention to prevent the progression of disease.

Because the hospital system had limited funds for programs such as smoking cessation and home health care, it wanted these programs to be targeted at patients who were more likely to be readmitted within 30 days. For example, if smoking was a key predictor of readmission within 30 days, then the hospital system wanted to target smokers with smoking cessation programs.

The analytics department built a predictive model in IBM SPSS based on 150 variables and 20,000 patient encounters over five years. This data was sourced from a variety of applications, including electronic medical records, the admissions system, and the cost accounting database. From a Big Data Governance perspective, the hospital had to establish a number of policies.
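Before any of those policies can be evaluated, the outcome the model predicts has to be derived from encounter records. The following is a minimal sketch, not the hospital's SPSS model: it labels each encounter by whether the same patient was admitted again within 30 days of discharge. The record layout, field names, and function name are assumptions made for illustration.

```python
from datetime import date


def flag_30_day_readmissions(encounters, window_days=30):
    """Return the ids of encounters followed by a readmission within the window.

    `encounters` is a list of dicts with hypothetical keys:
    encounter_id, patient_id, admit (date), discharge (date).
    """
    # Group encounters by patient so stays can be compared in date order.
    by_patient = {}
    for enc in encounters:
        by_patient.setdefault(enc["patient_id"], []).append(enc)

    flagged = set()
    for stays in by_patient.values():
        stays.sort(key=lambda e: e["admit"])
        # Compare each stay with the same patient's next stay.
        for current, following in zip(stays, stays[1:]):
            gap = (following["admit"] - current["discharge"]).days
            if 0 <= gap <= window_days:
                flagged.add(current["encounter_id"])
    return flagged


# Made-up sample encounters, for illustration only.
encounters = [
    {"encounter_id": "E1", "patient_id": "P1",
     "admit": date(2012, 1, 2), "discharge": date(2012, 1, 6)},
    {"encounter_id": "E2", "patient_id": "P1",
     "admit": date(2012, 1, 20), "discharge": date(2012, 1, 25)},
    {"encounter_id": "E3", "patient_id": "P1",
     "admit": date(2012, 6, 1), "discharge": date(2012, 6, 3)},
    {"encounter_id": "E4", "patient_id": "P2",
     "admit": date(2012, 3, 1), "discharge": date(2012, 3, 4)},
]
readmitted = flag_30_day_readmissions(encounters)
```

In this sample only E1 is flagged, because the same patient was admitted again 14 days after discharge. A flag of this kind could serve as the target variable for a predictive model, and it only works if every encounter for a patient carries the same patient identifier.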

Data Quality

The analytics team determined that a number of variables were significant predictors of a patient's readmission rate. The team used text analytics to improve the quality of sparsely populated structured data. In this situation, IBM InfoSphere BigInsights provides strong text analytics capabilities. We discuss four of these variables below:

1. Smoking status. Smoking status is a significant factor associated with heart disease. Surprisingly, the hospital did not have a complete history of patient smoking status, including years of smoking and frequency. At the outset, only 25 percent of the structured data around smoking status was populated with binary yes/no answers. However, by using content analytics, the analytics team was able to raise the population rate for smoking status to 85 percent of patient encounters. The content analytics team was also able to unlock additional information, such as smoking duration and frequency. There were a number of reasons for this discrepancy. For example, some patients indicated that they were non-smokers, but the text analytics revealed the following from the doctor's notes:
   "Patient is restless and asked for a smoking break"
   "Patient quit smoking yesterday"
   "Quit"
2. Drug and alcohol abuse. The clinical team knew from experience that drug abuse and alcohol abuse were significant predictors of hospital readmission rates. Only 20 percent of the patients checked off the box at admission to indicate whether they were addicted to drugs and alcohol. However, the analytics team used unstructured data sources to identify a total of 76 percent of the encounters where patients were abusing drugs and alcohol.
3. Assisted living facility. The clinical team knew from experience that patients in assisted living facilities were more likely to take their medications than patients who lived alone. However, the hospital system was not capturing this patient status information in a formalized manner. The business intelligence team analyzed the text within discharge summaries, echocardiograms, patient histories, doctors' notes, and physicals to find that 25 percent of the patients resided in an assisted living facility. The analysis confirmed that residence in an assisted living facility did indeed reduce the likelihood that a patient would be readmitted within 30 days.
4. Pharmacology compliance indicator. Information about pharmacology compliance was critical to clinicians and case managers because it indicated the degree to which patients were taking their medications as part of a treatment plan. The business intelligence team analyzed doctors' notes and electronic medical records to populate this data.
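The way free-text notes can fill in, or contradict, a sparsely populated structured field can be sketched with a few regular expressions. This is a toy illustration and not how IBM InfoSphere BigInsights text analytics works: the patterns, function names, and return values are assumptions, and negation (for example "denies smoking") is deliberately not handled.

```python
import re

# Hypothetical phrase patterns chosen to match the example notes in the text.
SMOKER_EVIDENCE = re.compile(r"\bsmoking break\b|\bcurrent smoker\b", re.IGNORECASE)
QUIT_EVIDENCE = re.compile(r"\bquit smoking\b", re.IGNORECASE)


def smoking_status(structured_value, notes):
    """Return (status, source); evidence in the notes outranks the checkbox.

    structured_value: True, False, or None when the field was left blank.
    notes: list of free-text note strings for the encounter.
    """
    text = " ".join(notes)
    if SMOKER_EVIDENCE.search(text):
        return ("smoker", "notes")
    if QUIT_EVIDENCE.search(text):
        return ("recent quitter", "notes")
    if structured_value is not None:
        return ("smoker" if structured_value else "non-smoker", "structured")
    return ("unknown", None)


def population_rate(statuses):
    """Share of encounters whose smoking status is known."""
    known = sum(1 for status, _ in statuses if status != "unknown")
    return known / len(statuses)


# Sample input: the first two notes are quoted from the text, the last is made up.
encounters = [
    (False, ["Patient is restless and asked for a smoking break"]),
    (None, ["Patient quit smoking yesterday"]),
    (True, []),
    (None, ["No acute distress"]),
]
results = [smoking_status(value, notes) for value, notes in encounters]
```

Returning the source alongside the status keeps a record of which values came from the structured field and which were inferred from notes, which matters when a clinician later questions a value.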

Metadata

The analytics team had to derive consistent definitions for key business terms. In this situation, IBM InfoSphere Business Glossary can provide the foundation for sound governance and stewardship of business terms. In our case study, the term "readmission" had at least three different definitions:

1. Clinical perspective: 30 days, all causes. The patient was readmitted to the hospital whether or not the condition was related to congestive heart failure.
2. Clinical perspective: 30 days, same diagnosis. The patient was readmitted to the hospital with a dominant ICD-9 diagnosis code related to heart failure.
3. Finance perspective: quarterly and annually. Finance had definitions of readmissions that were based on longer periods, including six to nine months.

Master Data Management

The analytics team also struggled with the lack of consistent patient data within the hospital system, caused by the proliferation of identification numbers for each patient. As a result, hospital personnel were not able to track medical events for the same patient across different facilities. As a workaround, the hospital system instituted a lengthy manual process to reconcile the medical events that related to the same patient. Consequently, the team lost significant time in retrieving a patient's medical history when he or she was readmitted to another hospital in the same system within a very short period. This could adversely affect decisions about treatment plans and clinical outcomes. In this scenario, an Enterprise Master Patient Index based on IBM InfoSphere Master Data Management can provide significant value.

Reference Data Management

ICD-9 reference data is a well-defined database with great granularity. For example, ICD-9 assigns code 428 for heart failure. Different ICD-9 codes describe details of heart failure conditions, such as 428.1 for left heart failure and 428.2 for systolic heart failure. Medical researchers at the hospital tried to analyze comorbidity (the presence of one or more diseases or disorders associated with congestive heart failure) using ICD-9 data. Being able to categorize ICD-9 codes into similar diseases helped to manage the results more effectively. The analytics team collaborated with clinicians to categorize over 21,000 ICD-9 codes into 20 disease groups. Based on this exercise, the analytics team was able to minimize the noise in their analysis and yield better clinical insights. IBM InfoSphere Master Data Management Reference Data Management Hub can support complex mappings of reference data sets in healthcare, such as ICD-9 and CPT codes.
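Rolling detailed ICD-9 codes up into a small number of disease groups can be sketched as a lookup on the three-digit category. This is a simplified sketch under stated assumptions: the group labels and the choice of categories are illustrative and are not the hospital's 20 groups, and a real mapping would be curated with clinicians rather than keyed on the category prefix alone.

```python
from collections import Counter

# Hypothetical mapping from three-digit ICD-9 categories to disease groups.
CATEGORY_TO_GROUP = {
    "428": "Heart failure",
    "401": "Hypertension",
    "250": "Diabetes",
    "496": "Chronic lung disease",
}


def disease_group(icd9_code):
    """Map a code such as '428.1' to its group via the part before the dot."""
    category = icd9_code.split(".")[0]
    return CATEGORY_TO_GROUP.get(category, "Other")


def comorbidity_counts(diagnoses_per_encounter):
    """Count encounters per disease group, each group at most once per encounter."""
    counts = Counter()
    for codes in diagnoses_per_encounter:
        counts.update({disease_group(code) for code in codes})
    return counts


# Made-up diagnosis lists for three encounters.
sample = [
    ["428.1", "250.00"],
    ["428.2", "428.1", "401.9"],
    ["496"],
]
counts = comorbidity_counts(sample)
```

Counting at the group level means that 428.1 and 428.2 both contribute to a single "Heart failure" total instead of splitting the signal across separate codes, which is the noise reduction the grouping exercise aims for.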

Governance of Time Series Data in a Neonatal Intensive Care Unit

A hospital leveraged IBM InfoSphere Streams to monitor the health of newborn babies in its neonatal intensive care unit. Using IBM InfoSphere Streams, the hospital was able to predict the onset of nosocomial (hospital-acquired) infection a full 24 hours earlier by detecting very slight early symptoms. From a Big Data Governance perspective, the hospital had to establish multiple policies.

Data Quality

The application depended on large volumes of time series data. However, readings were sometimes missing because a patient moved and caused a lead (a monitor attached to the baby's skin) to disengage and stop recording. In these situations, IBM InfoSphere Streams applied linear and polynomial regressions to historical readings to fill in the gaps in the time series data.

Information Lifecycle Management

The hospital also tagged all time series data that had been modified by IBM InfoSphere Streams. In the event of a lawsuit or medical inquiry, the hospital would be able to produce both the original and the modified readings. In this situation, IBM InfoSphere Optim Data Growth Solution could potentially reduce data storage costs through archival and compression techniques.

Privacy

The hospital also established policies around safeguarding protected health information (PHI) pertaining to the time series data. In this scenario, IBM InfoSphere Optim Data Masking can mask PHI within non-production environments such as development and test. In addition, IBM InfoSphere Guardium can potentially monitor access to PHI by privileged users, such as database administrators.

Improving Confidence in Predictive Pathways for Disease

The United States Centers for Disease Control and Prevention (CDC) estimates that nearly 8 percent of Americans have diabetes and another 60 million have prediabetes. As a result, medical intervention has become an imperative rather than an option for health insurers. However, nearly one-third of those who meet the criteria for diabetes do not know they have the disease.

Health plans are now able to uncover such insights with rapid and accurate patient scoring for diabetes based on multivariate analysis of very large datasets. Provider, facility, pharmacy, and enrollment data can be analyzed simultaneously for multiple variants, yielding a highly accurate and detail-rich statistical array with billions of rows of data. New member records can be matched against the disease profiles generated by the comprehensive analytics to help wellness companies reach out to patients as early as possible in the disease progression.

This solution relies on a highly scalable platform based on IBM PureData System for Analytics. It also depends on consistent master data for members, providers, and pharmacies based on IBM InfoSphere Master Data Management. Finally, IBM InfoSphere Business Glossary offers a strong foundation of consistent business terms.

IBM InfoSphere provides a robust platform for Big Data Integration and Governance. For more information, visit www.ibm.com/software/data/infosphere.

About the Author

Sunil Soares is the founder and managing partner of Information Asset, LLC, a consulting firm that specializes in helping organizations build out their Data Governance programs. Prior to this role, Sunil was the Director of Information Governance at IBM, where he worked with clients across six continents and multiple industries. Sunil has written four books about Information Governance, including The IBM Data Governance Unified Process, Selling Information Governance to the Business, Big Data Governance, and IBM InfoSphere: A Platform for Big Data Governance and Process Data Governance. The second and third books have dedicated chapters on healthcare.

The following terms are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both: IBM, BigInsights, Guardium, InfoSphere, Optim, PureData, and SPSS. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml.