ezdi s semantics-enhanced linguistic, NLP, and ML approach for health informatics



Similar documents
All-in-one, Integrated HIM Workflow Solution

The What, When, Where and How of Natural Language Processing

CLINICAL DOCUMENTATION DRIVING PERFORMANCE IN THE NEW WORLD OF HEALTHCARE

I n t e r S y S t e m S W h I t e P a P e r F O R H E A L T H C A R E IT E X E C U T I V E S. In accountable care

The course will run simultaneously with the MSCI students.

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Extreme Makeover - ICD-10 Code Edition: Demystifying the Conversion Toolkit

Find the signal in the noise

Value of. Clinical and Business Data Analytics for. Healthcare Payers NOUS INFOSYSTEMS LEVERAGING INTELLECT

How To Handle Big Data With A Data Scientist

THE FUTURE OF CODING IS NOW

White Paper. Prepared by: Neil Shah Director, Product Management March, 2014 Version: 1. Copyright 2014, ezdi, LLC.

IBM Watson and Medical Records Text Analytics HIMSS Presentation

Big Data Analytics in Health Care

Problem-Centered Care Delivery

TopBraid Insight for Life Sciences

ELECTRONIC MEDICAL RECORDS. Selecting and Utilizing an Electronic Medical Records Solution. A WHITE PAPER by CureMD.

Preparing for ICD- 10:

Not all NLP is Created Equal:

How To Make Sense Of Data With Altilia

How To Use Predictive Analytics To Improve Health Care

An Essential Ingredient for a Successful ACO: The Clinical Knowledge Exchange

Business Intelligence & Data Warehouse Consulting

Meaningful use. Meaningful data. Meaningful care. The 3M Healthcare Data Dictionary (HDD): Implemented with a data warehouse

User Needs and Requirements Analysis for Big Data Healthcare Applications

Exploration and Visualization of Post-Market Data

MODERN TRENDS AND TECHNIQUES ON MEDICAL RECORDS

5 Key Trends in Connected Health

Defense Healthcare Management Systems

Industry 4.0 and Big Data

From Fishing to Attracting Chicks

M*MODAL FLUENCY DIRECT CLOSED-LOOP CLINICAL DOCUMENTATION WITH A HIGH-PERFORMANCE FRONT-END SPEECH SOLUTION

Transforming CliniCal Trials: The ability to aggregate and Visualize Data Efficiently to make impactful Decisions

5 PLACES IN YOUR HOSPITAL WHERE ENTERPRISE CONTENT MANAGEMENT CAN HELP

Natural Language to Relational Query by Using Parsing Compiler

SAP/PHEMI Big Data Warehouse and the Transformation to Value-Based Health Care

Uncovering Value in Healthcare Data with Cognitive Analytics. Christine Livingston, Perficient Ken Dugan, IBM

The Do s and Don ts of Medical Device integration

SOLUTION BRIEF. SAP/PHEMI Big Data Warehouse and the Transformation to Value-Based Health Care

KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore

Putting IBM Watson to Work In Healthcare

Harnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service

Preventive Treatment for the Provider s Back-office

EXPERTISE FOR WHERE YOU RE HEADED

Technology Enablement

TRUVEN HEALTH UNIFY. Population Health Management Enterprise Solution

Semantically Steered Clinical Decision Support Systems

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

Natural Language Processing in the EHR Lifecycle

Big Data 101: Harvest Real Value & Avoid Hollow Hype

Meaningful Use Stage 2 Certification: A Guide for EHR Product Managers

IBM Tivoli Netcool network management solutions for enterprise

The Six A s. for Population Health Management. Suzanne Cogan, VP North American Sales, Orion Health

Meaningful use. Meaningful data. Meaningful care. The 3M Healthcare Data Dictionary: Standardizing lab data to LOINC for meaningful use

Building Successful Big Data Solutions

Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning

Sentiment Analysis on Big Data

THE CHALLENGE OF COORDINATING EMR

Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences

WHITE PAPER. QualityAnalytics. Bridging Clinical Documentation and Quality of Care

Collaborative Intelligence: Unlocking the Power of Narrative Documentation

i-care Integrated Hospital Information System

Data Integrity in an Era of EHRs, HIEs, and HIPAA: A Health Information Management Perspective

Computer-assisted coding and documentation for a changing healthcare landscape

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

High Availability of VistA EHR in Cloud. ViSolve Inc. White Paper February

Scalable End-User Access to Big Data HELLENIC REPUBLIC National and Kapodistrian University of Athens

Transformational Data-Driven Solutions for Healthcare

Healthcare Solutions. Nuance Speakers Bureau

MED 2400 MEDICAL INFORMATICS FUNDAMENTALS

Meaningful Use, ICD-10 and HIPAA 5010 Overview, talking points and FAQs

Semantic Search in Portals using Ontologies

Creating A Functional Bridge With The EMR

TRUVEN HEALTH UNIFY. Population Health Management Enterprise Solution

Big Data Analytics and Healthcare

IT services for analyses of various data samples

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks

Healthcare Measurement Analysis Using Data mining Techniques

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

Capture. Structure. Access. Any EHR.

Clintegrity 360 QualityAnalytics

Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases

Benefits of Image-Enabling the EHR

Transcription:

ezdi s semantics-enhanced linguistic, NLP, and ML approach for health informatics Raxit Goswami*, Neil Shah* and Amit Sheth*, ** ezdi Inc, Louisville, KY and Ahmedabad, India. ** Kno.e.sis-Wright State Univ. Abstract ezdi uses large and extensive knowledge graph to enhance linguistics, NLP and ML techniques to improve structured data extraction from millions of EMR records. It then normalizes it, and maps it with various computer-processable nomenclature such as SNOMED-CT, RxNorm, ICD- 9, ICD-10, CPT, and LOINC. Furthermore, it applies advanced reasoning that exploited domainspecific and hierarchical relationships among entities in the knowledge graph to make the data actionable. These capabilities are part of its highly scalable AWS deployed heath intelligence platform that support healthcare informatics applications, including Computer Assisted Coding (CAC), Computerized Document Improvement (CDI), compliance and audit, and core measures and utilization, as well as support improved decision making that involve identification of patients at risk, patterns in diseases, outcome prediction, etc. This paper focuses on the key role of its semantic approach and techniques. Keywords: knowledge graph, graph reasoning, health informatics applications, semanticsenhanced NLP Introduction and Background There is a radical shift in healthcare with healthcare reforms. The pressure to deliver transparent quality of care using evidence-based medicine has created even more challenges for healthcare providers to meet government and legislative demands to reduce healthcare costs. Health systems and providers are buried under enormous amounts of data generated by a variety of healthcare platforms such as Electronic Health Record systems, patient portals, medical device data, and much more data that is extremely difficult to classify and manage, let alone analyze. Traditional data analytics tools such as Data Warehousing and Business Intelligence software are limited to work only on structured data. Furthermore, the frustration mounts while investment in new technologies like EHR or analytics products, structured around 20% of the healthcare s data, fail to provide comprehensive, credible, and actionable information combining data from disparate systems (structured and unstructured) to meet with hospitals strategic goals for better, safer care at lower costs. So, where is the solution to make clinical data easily accessible, meaningful, and actionable with real-time interventions to support value-based care and actually improving patient outcomes? ezdi, Inc. [ezdi] was founded with a mission to improve patient care and reduce overall healthcare cost using the latest research and technologies. We believe the key is to abstract only meaningful and actionable clinical data from an ever-increasingly unstructured healthcare dataset, and to make it easily accessible and useful for healthcare professionals.

This means developing the technology that can convert the unstructured data that makes up significant portions of clinical records into structured data/knowledge, normalize it, and map it with various computer-processable nomenclature such as SNOMED-CT, RxNorm, ICD-9, ICD- 10, CPT, and LOINC, and then apply advanced reasoning to make the data actionable. This involves use of various clinical informatics applications, including Computer Assisted Coding (CAC), Computerized Document Improvement (CDI), compliance and audit, core measures and utilization, identification of patients at risk and patterns in diseases, outcome prediction, etc. At ezdi, it begins by extracting meaningful and actionable data from tens of millions of patient datasets at scale. The technology that has traditionally been used for this is that of natural language processing (NLP). While NLP does generate clinical insights, the quality of extraction in converting unstructured data into structured data/knowledge and its ability to reason, leading to deeper insights and decision support, is very limited. That is why ezdi developed Knowledge Graphs that can provide meaning to the abstracted data. The scale is equally important for processing large amounts of data and providing real-time insights; ezdi built on AWS HIPAA Compliant Cloud platform and makes integration and scalability a no brainer for health systems resulting in faster deployment at significantly lower costs than traditional legacy platforms. We also understand that extensibility is the key to responding to rapidly changing business priorities. The ezdi health intelligence platform provides comprehensive APIs to enable integration with a variety of platforms such as healthcare content service providers. Tackling the healthcare data storm gets harder everyday; the challenges of providing greater quality of care in a more efficient and cost-effective model is a common goal across all healthcare delivery organizations. Our ability to provide real-time clinical insights at scale to help healthcare providers make better clinical decisions is revolutionary and a stepping stone in the Healthcare Data Mining and Analytics space. We successfully demonstrated our abilities to improve overall patient care and reduce healthcare costs, aiding major hospitals from a 500-bed academic university hospital to a 100-bed critical access hospital, and are being promoted to more than 500,000 healthcare providers by key healthcare alliances in Texas, New York, and North Carolina. Challenge: Healthcare data size and complexity Healthcare data is rapidly evolving and is locked in disparate systems with different permissions and in different formats. More than 80% of medical data is unstructured in terms of dictationtranscription reports, such as office visits, discharge summaries, progress notes, scanned reports, and much more. This data and the information it contains is difficult to classify and manage, let alone analyze. Traditional data analytics software works reasonably well with structured data. The processes to abstract meaningful and actionable data from unstructured data is tedious, retrospective, resource-intensive, and time-consuming, and as a result, very expensive. Furthermore, the pressure to deliver transparent, quality healthcare using evidence-based medicine has created additional challenges for healthcare providers to meet government and

legislative demands to reduce healthcare costs. Hospitals and providers are buried under the enormous amounts of data generated to meet these initiatives, and making sense of the data is a monumental task. Step forward in organizing medical data to improve outcome We believe the key to solving healthcare data challenges is to organize and abstract meaningful and actionable medical data while weeding out the noise. This means developing the technology that can convert unstructured data into structured data, normalize it, and map it with various computer-processable nomenclature such as SNOMED-CT, RxNorm, ICD-9, ICD-10, CPT, LOINC, and much more. The complexity of the involved tasks has led ezdi industry leading solutions consisting of: Development of extensive background knowledge, termed ezkg (for ezdi s very large knowledge graph), that uses existing ontologies, medical knowledge, as well as techniques for semi-automatic knowledge extraction and curation to continue expansion and refreshing of extensive knowledge used for subsequent semantics-enhanced tasks. Exploiting state-of-the-art linguistic, NLP, and machine learning (ML) techniques, and using the semantic approach of employing extensive domain knowledge to improve them and to enable natural language understanding (NLU) and structured semantic data extraction. Using rule-based, statistical, and graph-based reasoning to support advanced analysis and decision support. Modular applications and solutions developed ground up for scalable, cloud-based deployment to meet the market needs.

Figure 1 shows software architecture of AWS deployed ezdi s health intelligence platform. ezkb is part of Graph Database, and Semantic Computing is generically captured as part of Graph Computing to minimize education required for most enterprise customers. As part of the presentations (and a live demo, if desired), we will focus on two objectives: (a) illustrating the unique role of the semantic approach in ezdi solutions, (b) our ability to scale computations in which semantic techniques for extraction and reasoning plays a key role in supporting the needs of ezdi s typical customers who are a few hundred-bed hospitals. Key examples include: a. How does ezdi leverage the semantics of the large amount of EMR data and the existing knowledge graph to continue to expand and keep ezkg updated [Perera et al 2012, Perera et al 2014] b. How can we improve upon the current state-of-the-art in NLP for the quality of core tasks such as named entity recognition, coreference resolution, and implicit entity recognition [Perera et al 2015], to ultimately improve the quality of clinical text understanding and NLU, with the help of a large domain-specific knowledge (embodied in ezkg) [Perera et al 2013] c. What have been pros and cons of using RDF in our technology d. How have we exploited domain-specific and hierarchical relationships among entities in advanced graph-based reasoning tasks, such as identification and ranking of ICD-10- PCS codes, for achieving high quality for CAC and CDI. This is crucial since ICD-10- PCS represents over an order of magnitude higher complexity than the current ICD-9- CM coding. In the process, we will share insights on why non-semantic solutions, such as current approaches for ICD-9 CAC based on rule-based NLP or statistical machine learning are unlikely to give the accuracy needed as evidenced in part from the failure of competing solutions in the market. A key take away is that use of a very comprehensive medical knowledge base helps to avoid very large training and deployment times needed by current solutions that do not use such a knowledge base. Conclusion We conclude with the following observations: There is unprecedented market need to exploit unstructured data in healthcare as part of applications such as CAC and CDI. This requires deep clinical text understanding including extraction of structured knowledge. State-of-the-art rule-based NLP and Machine Learning technologies do not give high quality solutions, take long deployment times and do not scale. ezdi has found use of semantics supported by large and growing medical knowledge indispensable to enhance the state-of-the-art technologies, and has also been able to deploy the solution entirely on a cloud platform to make it highly scalable.

Customers find it easier to understand and appreciate terms graph processing and knowledge graph, which is what we used for communication instead of semantics, semantic web, and reasoning. Meeting the needs of hospitals with hundreds of beds is now a reality, not a future. About the Authors: Raxit is the lead semantic engineer, Neil is the VP, Product Engineering, and Amit is the primary research & technology advisor for ezdi. Kno.e.sis center, led by Prof. Sheth is a research collaborator of ezdi since the founding of ezdi. Acknowledgements: Many ezdi researchers/developers and collaborating researchers deserve credit for extensive technology covered in this paper. Special mentions go to Sujan Perera, Parth Pathak and Suhas Nair. References [ezdi] ezdi Healthcare Data Intelligence Solutions, including ezcac, ezdci and ezaudit, accessible at http://www.ezdi.us/ [Perera et al 2012] Sujan Perera, Cory Henson, Krishnaprasad Thirunarayan, Amit Sheth, Suhas Nair, 'Data driven knowledge acquisition method for domain knowledge enrichment in the healthcare', IEEE International Conference Bioinformatics and Biomedicine 2012 (BIBM '12), pp.1-8, 4-7 Oct. 2012 [Perera et al 2013] Sujan Perera, Amit Sheth, Krishnaprasad Thirunarayan, Suhas Nair and Neil Shah, 'Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help', International Workshop on Data management & Analytics for healthcare (DARE) at ACM Conference of Information and Knowledge Management (CIKM), pp. 21-26, Burlingame, USA, Nov 1, 2013, DOI: 10.1145/2512410.2512427 [Perera et al 2014] Sujan Perera, Cory Henson, Krishnaprasad Thirunarayan, Amit Sheth, Suhas Nair, 'Semantics Driven Approach for Knowledge Acquisition From EMRs', IEEE Journal of Biomedical and Health Informatics, vol.18, no.2, pp.515-524, March 2014, doi: 10.1109/JBHI.2013.2282125, PMID: 24058038 [Perera et al 2015] Sujan Perera, Pablo Mendes, Amit Sheth, Krishnaprasad Thirunarayan, Adarsh Alex, Christopher Heid, Greg Mott, 'Implicit Entity Recognition in Clinical Documents', In Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics (*SEM), June 2015.