Data Driven Approaches to Prescription Medication Outcomes Analysis Using EMR




Nathan Manwaring, University of Utah
Master's Project Presentation, April 2012

Equation Consulting: Who we are
Equation Consulting is a 45-person consulting firm based in Salt Lake City, UT, with a core focus on data-driven solutions to improve physician economics within hospital, private, or academic settings. Our engagements are almost exclusively limited to hospital and physician billing data, but because of the Meaningful Use initiative of HITECH, more clients are asking about EMR data, and Equation is seeking to increase its exposure to clinical data.

Project Goals
Equation's goals were to:
1. Understand what information is available and discover possible applications in future products and projects that could be offered to Equation's clients. Measuring the effectiveness of anti-hypertension medication was chosen as the first-look analysis.
2. Protect privacy by determining how to create meaningful datasets for research that meet HIPAA requirements for de-identified data sets.
The researcher's goals were to:
1. Improve understanding of statistical techniques in data analysis, including t-tests, regression, and data-mining methods.
2. Work with Dr. Sheng to develop tools to identify potential Adverse Drug Events from EMR data.

Background: What is EMR?
Electronic Medical Records (EMR/EHR) are computerized medical records kept in a hospital or physician's office. Major benefits over paper charts include more efficient and accurate data sharing, lower cost than storing cumbersome paper files, and opportunities to prompt physicians with automated warnings before orders are performed.

Background: The HITECH Act and the EMR Explosion
Before Obamacare, Congress passed the Health Information Technology for Economic and Clinical Health Act of 2009 (HITECH), creating $34 billion in financial incentives for hospitals and physicians for their meaningful use of certified electronic health records (EHRs). The law also includes substantial payment reductions for providers that are not meaningful users of health IT after 2015. As of late 2011, physician EMR adoption ranged from 39% for solo practitioners to 77% for large multi-specialty practices.

Background: What to do with all of this new data?

Background: How to protect patient privacy?
A Safe Harbor outlined by HIPAA for sharing patient data with researchers is to remove all PHI, creating a de-identified dataset. A de-identified dataset is created by removing all 18 elements that could be used to identify the individual or the individual's relatives, employers, or household members. There must also be no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify the individual who is the subject of the information. It is the goal of Equation and its clients to determine how to generate meaningful data extracts while still falling within these requirements for de-identification.
The 18 elements are:
1. Names.
2. All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP Code, and their equivalent geographical codes, except for the initial three digits of a ZIP Code.
3. All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, and date of death, and all ages over 89, except that such ages may be aggregated into a single category of age 90 or older.
4. Telephone numbers.
5. Facsimile numbers.
6. Electronic mail addresses.
7. Social security numbers.
8. Medical record numbers.
9. Health plan beneficiary numbers.
10. Account numbers.
11. Certificate/license numbers.
12. Vehicle identifiers and serial numbers, including license plate numbers.
13. Device identifiers and serial numbers.
14. Web universal resource locators (URLs).
15. Internet protocol (IP) address numbers.
16. Biometric identifiers, including fingerprints and voiceprints.
17. Full-face photographic images and any comparable images.
18. Any other unique identifying number, characteristic, or code, unless otherwise permitted by the Privacy Rule for re-identification.
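As a rough illustration of how a few of these rules translate into an extract step, the sketch below drops direct identifiers, truncates the ZIP code to three digits, reduces dates to the year, and aggregates ages over 89. The field names (`name`, `mrn`, `zip`, and so on) are hypothetical and not taken from the project's data. The sketch also ignores finer points of the rule, for example that a birth year can itself reveal an age over 89, so it should be read as a starting point rather than a compliant implementation.

```python
from datetime import date

# Hypothetical field names for direct identifiers; a real extract would map
# these to the columns of the source system.
DIRECT_IDENTIFIERS = {
    "name", "street_address", "city", "county", "phone", "fax", "email",
    "ssn", "mrn", "health_plan_id", "account_number",
}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` with Safe Harbor style generalizations applied."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the first three digits of the ZIP code.
    if "zip" in out:
        out["zip3"] = str(out.pop("zip"))[:3]

    # Reduce every date value to its year.
    for key in [k for k, v in out.items() if isinstance(v, date)]:
        out[key + "_year"] = out.pop(key).year

    # Aggregate ages over 89 into a single "90 or older" category.
    if "age" in out and out["age"] > 89:
        out["age"] = "90+"

    return out


example = deidentify({
    "name": "Jane Doe", "zip": "84112", "age": 93,
    "admit_date": date(2011, 5, 17), "systolic": 142,
})
```

Clinical measurements such as `systolic` pass through unchanged, which is the point of the exercise: the analysis variables survive while the identifying fields do not.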

Background: Why Hypertension?
Data Perspective
o Blood pressure is one of the most basic and universal attributes collected by physicians using EMR.
o Hypertension has a well-defined list of diagnosis codes that can easily be used to identify the primary patient population we wish to study.
o Data collection doesn't require significant behavior change from providers, increasing data accuracy and availability.
Clinical Perspective
"...Even small improvements in blood pressure control can have major public health impact. A 1990 systematic review of 14 randomized treatment trials for hypertensive patients showed that lowering diastolic blood pressure (DBP) by 5 to 6 points reduced stroke rates by 42%. Another recent study showed that lowering DBP by only 2 points could result in a 6% reduction in the risk of coronary heart disease, along with a 15% reduction in the risk of stroke and one type of heart attack..."
http://www.ahrq.gov/qual/hypertengap.htm

Step 1: Preparing the Data
The Data
The source of this data is a subset of Epic's Clarity data warehouse. The subset contains data spread across 149 unique tables. Items included in the analysis are patient demographics, medications, diagnosis codes, procedure codes, provider information, location and department information, dates of service, medication orders, etc.
The Process
o The first goal was to generate a clean list of patients that would be the starting universe for potential analysis. The criterion for inclusion is an encounter with a primary diagnosis of hypertension.
o The second goal was to generate a dataset that ties together medications, encounters, BP readings, and physician encounters in an ordered way that allows consecutive items to be compared on a single line.
o The grain of the dataset is a unique Patient/PrescriptionDate combination. Each row must include all medications prescribed as well as detail on previous and future encounters and BP measurements from that time period.
o Over 2,000 lines of SQL code were written to generate the desired tables.
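To make the "single line per Patient/PrescriptionDate" idea concrete, here is a minimal Python sketch of the windowing logic: for one patient and one order date, pick the last few BP readings before the order and the first few after it. The tuple layout and the function name `readings_around` are assumptions for illustration; the default window of four on each side mirrors the Prev04 to Next04 column names in the SQL listing.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def readings_around(order_date, readings, n=4):
    """Split BP readings into the last `n` before and first `n` after an order.

    `readings` is a list of (date, systolic, diastolic) tuples for one patient.
    Readings taken on the order date itself are returned separately.
    """
    readings = sorted(readings)
    dates = [r[0] for r in readings]
    lo = bisect_left(dates, order_date)   # first reading on or after the order
    hi = bisect_right(dates, order_date)  # first reading strictly after it
    return {
        "prev": readings[max(0, lo - n):lo],
        "curr": readings[lo:hi],
        "next": readings[hi:hi + n],
    }


# Made-up readings for one patient.
sample = [
    (date(2011, 1, 1), 150, 95),
    (date(2011, 3, 1), 148, 92),
    (date(2011, 2, 1), 152, 96),
    (date(2011, 4, 1), 140, 88),
]
window = readings_around(date(2011, 3, 1), sample, n=1)
```

Flattening `prev`, `curr`, and `next` into fixed columns then gives one wide row per order, which is what allows consecutive measurements to be compared side by side.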

Step 1: Preparing the Data

----------------------------------------------------------------------
--08 Create Marker for the top 50 Drugs (by Simple Generic Code) that can reside as a separate column
----------------------------------------------------------------------
-- OBJECT_ID returns NULL when the table does not exist
IF OBJECT_ID('TEMP_TOP_DRUGS') IS NOT NULL
    DROP TABLE TEMP_TOP_DRUGS
GO

-- Rank the 50 most frequently ordered generic codes
SELECT TOP 50
    Order_MedicationSimpleGenericCode,
    COUNT(*) AS Cnt,
    ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) AS RowNum
INTO TEMP_TOP_DRUGS
FROM MedicationOrder
WHERE Order_MedicationSimpleGenericCode IS NOT NULL
GROUP BY Order_MedicationSimpleGenericCode
ORDER BY COUNT(*) DESC
GO

DECLARE @SimpleGenericCode VARCHAR(255)
DECLARE @ColumnName VARCHAR(255)
DECLARE @SQLCode1 NVARCHAR(MAX)
DECLARE @SQLCode2 NVARCHAR(MAX)
DECLARE @RowNum INT
DECLARE @MAXRowNum INT

SET @RowNum = 1
SET @MAXRowNum = (SELECT MAX(RowNum) FROM TEMP_TOP_DRUGS)

-- For each top drug, add a 0/1 marker column to PatientOrder and populate it
WHILE @RowNum <= @MAXRowNum
BEGIN
    SET @SimpleGenericCode = (SELECT Order_MedicationSimpleGenericCode
                              FROM TEMP_TOP_DRUGS
                              WHERE RowNum = @RowNum)

    SET @SQLCode1 = 'ALTER TABLE PatientOrder ADD Drug_'+@SimpleGenericCode+'_mrkr TINYINT'
    EXECUTE sp_executesql @SQLCode1
    --SELECT @SQLCode1

    -- The pattern expects a space on both sides of the code in MedicationList
    SET @SQLCode2 = 'UPDATE PatientOrder SET Drug_'+@SimpleGenericCode+'_mrkr = CASE WHEN MedicationList LIKE ''% '+@SimpleGenericCode+' %'' THEN 1 ELSE 0 END'
    EXECUTE sp_executesql @SQLCode2
    --SELECT @SQLCode2

    SET @RowNum = @RowNum+1
END
GO

----------------------------------------------------------------------
--08 Create Active Medication List for All Drug Orders
----------------------------------------------------------------------
GO
IF OBJECT_ID('TEMP_All_CSN_ID') IS NOT NULL
    DROP TABLE TEMP_All_CSN_ID

-- Collect every encounter ID referenced by an order (4 previous, current, 4 next)
SELECT Order_BP_EncounterPrev04_CSN_ID AS CSN_ID
INTO TEMP_All_CSN_ID
FROM (
    SELECT DISTINCT Order_BP_EncounterPrev04_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterPrev03_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterPrev02_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterPrev01_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterCurr00_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterNext01_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterNext02_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterNext03_CSN_ID FROM PatientOrder
    UNION SELECT DISTINCT Order_BP_EncounterNext04_CSN_ID FROM PatientOrder
) a
GO

ALTER TABLE TEMP_All_CSN_ID ADD MedList VARCHAR(500)
GO

DECLARE @Line INT
DECLARE @Max_Line INT

SET @Line = 1
SET @Max_Line = (SELECT MAX(LINE)
                 FROM db_0175_08_ods..pat_enc_curr_meds a
                 INNER JOIN TEMP_All_CSN_ID b
                     ON a.pat_enc_csn_id = b.csn_id
                     AND a.is_active_yn = 'Y')

-- Append one active medication line per pass, skipping codes already in the list
WHILE @Line <= @Max_Line
BEGIN
    UPDATE TEMP_All_CSN_ID
    SET MedList = ISNULL(a.MedList,'') +' '+ d.simple_generic_c
    FROM TEMP_All_CSN_ID a
    INNER JOIN db_0175_08_ods..pat_enc_curr_meds b
        ON a.csn_id = b.pat_enc_csn_id
        AND b.line = @Line
        AND b.is_active_yn = 'Y'
    INNER JOIN db_0175_08_ods..clarity_medication d
        ON b.medication_id = d.medication_id
    WHERE CHARINDEX(d.SIMPLE_GENERIC_C,ISNULL(a.MedList,''),1) = 0

    SET @Line = @Line + 1
END
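For readers less familiar with dynamic SQL, the marker-column step can be sketched in a few lines of Python. This is an illustrative re-expression, not the project's code: it assumes each order carries a `med_list` string of space-separated generic codes (a hypothetical layout), and it ranks codes by how often they appear in those lists rather than in a separate order table. It also matches whole tokens, which avoids a substring test treating code 12 as present when only 123 is.

```python
from collections import Counter


def add_drug_markers(orders, top_n=50):
    """Add a 0/1 marker per top drug code to each order, in place.

    `orders` is a list of dicts, each with a "med_list" string of
    space-separated generic codes (a hypothetical layout).
    Returns the list of top codes, most frequent first.
    """
    counts = Counter(code for o in orders for code in o["med_list"].split())
    top_codes = [code for code, _ in counts.most_common(top_n)]
    for o in orders:
        active = set(o["med_list"].split())  # whole-token match, not substring
        for code in top_codes:
            o[f"drug_{code}_mrkr"] = 1 if code in active else 0
    return top_codes


# Made-up orders with made-up generic codes.
orders = [{"med_list": "12 345"}, {"med_list": "123"}, {"med_list": "12"}]
top = add_drug_markers(orders, top_n=2)
```

The resulting 0/1 columns are the inputs that the regression and classification steps later rely on.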


Step 2: Understanding the Data
High-Level Review of Data
Patient Data
o 28,178 patients
o 29 demographics reviewed: median age 60 (range 40-80), 57% female, median weight 190 lbs, average of 7 BP measurements per patient
Medication Information
o 6,818 drugs analyzed
o 30,590 discontinued orders (potential adverse events)
Prescription Data
o 1.9 million unique prescriptions written
o 20,568 unique order dates
o Top 50 drugs prescribed 19,151 times to primary population

Step 2: Data Analysis
Core Drug Comparison
Most common anti-hypertensive medications:
1. Lisinopril (LIS)
2. Hydrochlorothiazide (HCTZ)
3. Lisinopril-Hydrochlorothiazide combination drug
4. Furosemide
Exclude core drugs # 1 & 4 due to unique patient populations making comparison less helpful.
Next Steps: Analysis
o Compare all drugs: data mining and classification
o Analyze LIS and HCTZ: t-test comparisons, industry research, demographics analysis, outcomes analysis, detail

Multiple Linear Regression: Introduction
Linear regression is a modeling approach that determines the closest linear relationship between one or more independent variables and a single dependent variable. Regression modeling is facilitated through binary (0/1) columns that indicate the presence of a particular independent variable, such as a marker column for each medication.
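A small worked example may help show what a regression on binary marker columns estimates. The sketch below fits ordinary least squares by solving the normal equations in plain Python; the data are made up so that drug A lowers diastolic BP by exactly 5 points and drug B by exactly 3, and the function name `fit_ols` is an assumption for illustration. The project's actual modeling tool is not named in the slides.

```python
def fit_ols(rows, y):
    """Least-squares fit of y on the columns of `rows` plus an intercept.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear columns)")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up data: change in diastolic BP for patients on drug A and/or drug B.
markers = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 0), (0, 1)]
bp_change = [0.0, -5.0, -3.0, -8.0, -5.0, -3.0]
intercept, effect_a, effect_b = fit_ols(markers, bp_change)
```

Each coefficient is the estimated change in BP associated with a marker being 1 while the other markers are held fixed. With real EMR data the fit is never exact, and a coefficient can be misleading when the patients who receive a drug differ systematically from those who do not.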

Analysis: Multiple Linear Regression
We initially attempted to model the relationship between many (14) medications and their impact on blood pressure. While this method did successfully identify several drugs known to increase or decrease BP, it also flagged the drug Furosemide as increasing diastolic BP. Additional data scrubbing would show that Furosemide is not associated with increased BP.
Conclusion
While we gain some useful information from linear regression modeling, it does not appear to be the most appropriate modeling method for this dataset.

Data Mining Classification: Introduction
Classification uses data mining techniques (decision trees, neural networks) to classify an object into one of a set of pre-defined object classes.
o Example: Based on specific demographics, can we classify patients as churn or not?
o Example: Based on customer attributes, classify customers as will or won't purchase.

Analysis: Data Mining - Classification
We tried two data mining models, a J48 decision tree and Naive Bayes, to determine which patients would experience a favorable drop in their category of hypertension.
Variables given: Gender, Age, HypertensionDX_mrkr, DiagnosisGroup, DaysFromOrder, Weight, BMI, Temp, Pulse, DiastolicCategory, medication markers.
The J48 decision tree came closest, with only 65% of instances correctly classified. This is definitely an area to continue exploring in the future.
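J48 is Weka's implementation of the C4.5 decision tree, which chooses each split by how much it reduces the entropy of the class labels. The sketch below computes plain information gain for one categorical attribute; C4.5 additionally normalizes this into a gain ratio, which is omitted here for brevity. The attribute values and labels are made up, and the function names are assumptions for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy from splitting on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(v, []).append(lab)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Made-up example: did the patient's hypertension category drop?
outcome = ["drop", "drop", "none", "none"]
perfect_split = information_gain(["M", "M", "F", "F"], outcome)
useless_split = information_gain(["M", "F", "M", "F"], outcome)
```

An attribute that separates the classes completely gains the full entropy of the labels, while one that leaves each group as mixed as the whole gains nothing. An accuracy near 65% suggests that none of the available variables comes close to the first case.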

Analysis: Lisinopril vs. Hydrochlorothiazide
Preliminary analysis shows comparable results from LIS and HCTZ, especially among the combined Category 2-3 populations. LIS has slightly better performance overall.

Lisinopril vs. Hydrochlorothiazide: T-test
The t-test is commonly used to assess the significance of an observed difference between the means of two samples. The null hypothesis is that there is no difference between the two population means. The t-test result (the p-value) is the probability of observing a difference at least as large as the one measured if the null hypothesis were true; it is not the probability that the null hypothesis itself is true.
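The slides do not say which variant of the t-test was used. As one possibility, the sketch below computes Welch's two-sample t statistic, which does not assume equal variances in the two groups. Because the Python standard library has no t distribution, the p-value here uses a normal approximation that is only reasonable for large samples; the sample values are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic, degrees of freedom, and approximate p."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Normal approximation to the two-sided p-value; reasonable only for
    # large samples, where the t distribution is close to the normal.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Made-up BP drops (points) for two small groups of patients.
t_stat, dof, p_value = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

With groups this small the exact p-value should come from the t distribution with `dof` degrees of freedom, for example from a statistics package, rather than from the approximation.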


Analysis: Industry Research, Lisinopril vs. Hydrochlorothiazide
Two studies were found comparing the effectiveness of LIS and HCTZ at lowering blood pressure. Both found that LIS was 6-7 points more effective than HCTZ. The EMR analysis shows only 2 points of difference between the impacts of the two drugs. We need to examine potential causes of the difference between the EMR findings and the research.
http://www.springerlink.com/content/r01716462m4864u8/
http://journals.lww.com/cardiovascularpharm/abstract/1987/00003/controlled_multicenter_study_of_the.10.aspx

Analysis: Difference 1 - Age Demographics
The study population significantly under-represented the >64 age category compared with the actual patient population.
http://www.springerlink.com/content/r01716462m4864u8/
http://journals.lww.com/cardiovascularpharm/abstract/1987/00003/controlled_multicenter_study_of_the.10.aspx

Analysis: Difference 2 - Gender
The study population includes significantly more males than the actual patient population.
http://www.springerlink.com/content/r01716462m4864u8/
http://journals.lww.com/cardiovascularpharm/abstract/1987/00003/controlled_multicenter_study_of_the.10.aspx

Analysis: Impact of Demographic Variation is Important
Splitting the group comparison by gender reveals that males have less favorable outcomes from HCTZ than females. T-test results suggest a high degree of similarity between the two medications for females, but not for males. If the EMR data had demographics similar to the study's, the results of the two comparisons would be comparable.
http://www.springerlink.com/content/r01716462m4864u8/
http://journals.lww.com/cardiovascularpharm/abstract/1987/00003/controlled_multicenter_study_of_the.10.aspx
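The "what if the demographics matched" argument is a form of direct standardization: compute the outcome within each subgroup, then average the subgroups using a different population mix. The sketch below illustrates the arithmetic. The per-gender BP drops and the study's gender mix are made-up numbers; only the 57% female share of the EMR population comes from the data review earlier in the deck.

```python
def standardized_mean(stratum_means, weights):
    """Weighted average of per-stratum means using the given population mix."""
    total = sum(weights.values())
    return sum(stratum_means[s] * w for s, w in weights.items()) / total


# Made-up mean drops in diastolic BP (points) by gender for one drug.
hctz_drop = {"F": 6.0, "M": 3.0}
emr_mix = {"F": 0.57, "M": 0.43}    # 57% female, as in the EMR population
study_mix = {"F": 0.30, "M": 0.70}  # hypothetical male-heavy study mix

emr_average = standardized_mean(hctz_drop, emr_mix)
study_like_average = standardized_mean(hctz_drop, study_mix)
```

With these illustrative numbers the same per-gender results give an average drop of about 4.7 points under the EMR mix and 3.9 under the male-heavy mix, which is how a difference in demographics alone can widen or narrow the apparent gap between two drugs.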

Conclusion: Real World vs. Research
EMR research is not meant to replace clinical studies, nor does it compete with them. EMR data tells the physician a different story:
o EMR data story: What actually happens to my patient's BP when I prescribe this medication?
o Clinical trial story: What happens in a highly controlled environment when a specific population of patients is given this medication?

Conclusion: Privacy Data Integrity Protection
After the initial SQL data scrubbing, none of the 18 Safe Harbor fields were required to complete the analysis, and very little value is lost by removing dates. It is possible to construct a data extraction script that creates the Patient Summary table and retains the order of, and time between, encounters and measurements without including any actual dates. Clients would control the execution of the extract and would certify that the end result is free from PHI. By constructing special data extraction tools for clients, it's possible to both guarantee patient privacy and achieve rich data sets with high research value. Furthermore, unless a specific query is created for a particular data set, it is unlikely that a hospital would be able to provide PHI-free data that retains this information.
[Figure: original data vs. de-identified data]
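One way to keep ordering and spacing without keeping dates is to replace every date with the number of days since that patient's first recorded event. The sketch below shows the idea; the tuple layout, the function name `to_relative_days`, and the sample events are assumptions for illustration, and the patient key is assumed to already be a non-identifying surrogate rather than a medical record number.

```python
from datetime import date


def to_relative_days(events):
    """Replace each patient's event dates with day offsets from their first event.

    `events` is a list of (patient_key, date, payload) tuples; `patient_key`
    is assumed to already be a non-identifying surrogate.
    """
    first = {}
    for key, d, _ in events:
        first[key] = min(d, first.get(key, d))
    shifted = [(key, (d - first[key]).days, payload)
               for key, d, payload in events]
    # Order by patient, then by offset, so sequences read chronologically.
    return sorted(shifted, key=lambda e: (e[0], e[1]))


# Made-up events for two patients.
events = [
    ("p1", date(2011, 3, 1), "BP 140/90"),
    ("p1", date(2011, 1, 30), "Rx LIS"),
    ("p2", date(2010, 6, 1), "BP 150/95"),
]
relative = to_relative_days(events)
```

In the output, patient p1's prescription sits at day 0 and the follow-up reading at day 30, so the before/after comparison is still possible even though no calendar date leaves the client's system.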

Future Research Ideas
Framingham Study:
1. What are predictors of heart attack, stroke, diabetes, etc.?
2. What medications seem to have the best track record for reducing heart attack risk?
3. What physicians seem to be the most efficient providers (best outcomes relative to patient cost)?
Adverse Events and Discontinued Medication:
1. What is the relationship between adverse events and discontinued medication codes?
2. Can we predict a decrease in medication effectiveness in key populations with higher adverse events reported?

Future Research: Adverse Events and Discontinued Medication
One possible factor influencing the comparison of LIS and HCTZ in large populations of patients is the frequency of side effects, which affects whether patients are able and willing to finish the prescription.
[Charts: Lisinopril; Hydrochlorothiazide]

Future Research: Adverse Events and Discontinued Medication
EMR data captures information when a medication is discontinued due to dose adjustment, patient preference, allergic reactions, etc. By analyzing this data, we can start to compare the picture to findings from the DrugInformer.com database. The user of the tool simply sets the weighting criteria, and the Excel sheet then displays the top-ranked drugs by severity based on the weighting chosen by the user.
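The tool described above is an Excel sheet, and its exact formula is not given in the slides. As a rough illustration of the weighting idea only, the Python sketch below scores each drug by a weighted sum of its event counts and returns the highest scores first. The event categories, counts, and weights are all made up, and the function name `rank_drugs` is hypothetical.

```python
def rank_drugs(event_counts, weights, top=10):
    """Rank drugs by a weighted sum of their adverse-event counts.

    `event_counts` maps drug -> {category: count}; `weights` maps
    category -> user-chosen weight. Categories without a weight count as zero.
    """
    scores = {
        drug: sum(weights.get(cat, 0) * n for cat, n in counts.items())
        for drug, counts in event_counts.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Made-up counts and weights for illustration.
counts = {
    "LIS": {"cough": 10, "allergy": 1},
    "HCTZ": {"cough": 1, "allergy": 3},
}
ranking = rank_drugs(counts, {"cough": 1, "allergy": 5})
```

Changing the weights changes the order, which is the design intent: the user decides how much each kind of event should matter before reading the ranking.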