How can you unlock the value in real-world data? A novel approach to predictive analytics could make the difference.


What if you could diagnose patients sooner, start treatment earlier, and prevent symptoms from worsening? The secrets to developing more effective treatments and enabling better outcomes likely reside in the volumes of data captured from sources such as patient registries, administrative claims databases, patient and provider surveys, and electronic medical records. This is what the life sciences industry terms real-world data: data used for decision making that are not collected in conventional randomized controlled trials (RCTs) [1]. Many companies, though, struggle to derive clear and useful insight from what is effectively a massive chopped salad of information that resides in disparate locations and formats.

In our experience, two things can accelerate analysis of complex real-world data and development of more effective treatment approaches:

1. Analytic tools that have the necessary power to identify the complex variable interactions that are predictive of diagnosis and treatment efficacy
2. A discovery-driven research approach that uses data analytics to reveal answers to challenging questions, as opposed to traditional hypothesis-based approaches that test pre-defined theories

This article describes, at a high level, how this is possible.

[1] Using Real World Data for Coverage and Payment Decisions: The ISPOR Real World Data Task Force Report, 2007.

Getting from real-world data to actionable insight

Maximizing a therapy's potential requires answers to many questions: What predictors or markers could speed accurate diagnoses and avoid the possibility of misdiagnosis and years of improper treatment? How much earlier could diagnosis occur? Which patients are most likely to require drug treatment? Which patients are most likely to respond to a treatment? Which patients are at risk of negative reaction to a drug treatment? Why do patients go off therapy?

Typically, the answers are far from straightforward. They depend on complex and interacting combinations of variables, some obvious and many not so obvious. To find answers, life sciences organizations frequently turn to real-world data (see sidebar) available through sources such as Truven Health Analytics MarketScan Databases or Symphony Health Solutions Integrated Dataverse™. Despite an increasing focus on real-world data, many life sciences organizations still struggle to derive clear and useful conclusions (known as real-world evidence), limiting their ability to answer questions such as those above.

Why is this the case? The biggest challenge is often identifying predictive information within the massive amount of collected and stored real-world data. Determining the best way to organize and analyze years of data from doctor visits, hospital stays, walk-in clinics, lab results, insurance claims, and pharmacies to find meaningful combinations of variables is a major roadblock for most companies. Studies that do incorporate real-world data often consider hundreds or thousands of potential indicators across tens of thousands of patients, necessitating powerful data management and cleansing processes prior to analysis.

Moreover, real-world data collection frequently is not well organized, includes many instances of missing data, and is not readily processed with standard techniques. Quite simply, working with real-world data is messy and time consuming. How can you navigate the path from a messy mix of real-world data to real-world evidence and insights? One way is by rethinking your approach to analyzing real-world data.

Sidebar: What is real-world data?

Most organizations' definitions of real-world data center on the premise that it includes data captured without the biases traditionally involved in clinical trials. For example:

An ISPOR (International Society for Pharmacoeconomics and Outcomes Research) task force defined real-world data as data used for decision making that are not collected in conventional randomized controlled trials (RCTs) [1]. Rather, these data come from patient registries, administrative claims databases, patient and provider surveys, and electronic medical records, as well as large simple trials and supplemental data collected alongside randomized clinical trials.

[1] ISPOR task force: Using Real-World Data for Coverage and Payment Decisions: The ISPOR Real-World Data Task Force Report; 2007
[2] ABPI: The Vision for Real World Data: Harnessing the Opportunities in the UK; 2011

Use the most effective predictive analytics tools

Effective analysis of real-world data requires sophisticated analytic tools and predictive modeling techniques that can overcome common data challenges and identify complex interactions among a myriad of potential variables. Despite the existence of many techniques to perform these analyses, many predictive models cannot easily identify the hidden relationships among the numerous variables whose interactions frequently underlie disease risk, the onset of an exacerbation event, or the likelihood of a positive treatment response. Even when these techniques are able to identify important relationships among predictor variables, it is often difficult to bridge the gap between the results of these analyses and actionable business or medical insights. The result is suboptimal use of real-world data.

HyperCube™, a unique analytic tool (see sidebar), overcomes many of these challenges to reveal important relationships among large numbers of predictor variables and generate insights in a human-readable format that allows for easy interpretation by both technical experts and business users. We have seen HyperCube work particularly well in the life sciences environment.

While HyperCube's predictive capabilities are quite powerful, it still faces one of the biggest challenges in predictive modeling: the need to organize, prepare, and cleanse datasets prior to analysis. As with any predictive modeling of real-world data, a substantial amount of data cleansing and transformation is necessary to prepare data for proper analytic processing. Thus, before we analyze data with HyperCube, we employ in-depth data cleaning and preparation processes, including SQL integration work, ETL (extract-transform-load) processes, and big data frameworks such as Apache™ Hadoop.
These data preparation steps can account for a significant portion of the work involved in complex real-world data analysis, but they are essential to success.

Sidebar: HyperCube

HyperCube is a proprietary predictive modeling algorithm and software package that can predict disease/disorder/treatment outcomes from analysis of data sets that include hundreds to tens of thousands of potential predictor variables. HyperCube offers predictive capabilities similar to those of many standard modeling techniques, including regression modeling and random forest analysis, but it also has features and capabilities that go beyond them, including rule mining, variable selection, a bagging algorithm, cross-validation, and visualization.

HyperCube is able to take large and complex datasets and derive a group of predictive business rules that identify rare cases in a highly targeted fashion. These rules are human readable and can be understood by both business and technical users. For example, a typical output from HyperCube might read like this: If the patient is male, has blood sugar levels between X and Y, and is positive for biomarker 123c, then the likelihood of developing Disease A is 20 times greater than average.

Importantly, analysts can optimize HyperCube to generate more precise sets of predictive rules (e.g., for risky and expensive treatments) or wider sets of predictive rules that cover broad events (e.g., for general screening of patients likely to need a flu vaccine).
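To make the "20 times greater than average" wording concrete, the sketch below shows one common way such a figure can be computed for an if/then rule: the outcome rate among patients who match the rule, divided by the outcome rate in the whole population (often called lift). This is an illustration only and not HyperCube's algorithm, which is proprietary; the `Patient` fields, the blood sugar range, and the toy data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Patient:
    # Hypothetical fields chosen to mirror the example rule in the text.
    sex: str
    blood_sugar: float
    biomarker_positive: bool
    has_disease: bool


def rule_matches(p: Patient, low: float, high: float) -> bool:
    """IF male AND blood sugar within [low, high] AND biomarker positive."""
    return p.sex == "M" and low <= p.blood_sugar <= high and p.biomarker_positive


def rule_lift(patients: List[Patient], low: float, high: float) -> Optional[float]:
    """Disease rate among rule matches divided by the overall disease rate.

    Returns None when the ratio is undefined (no patients, no matches,
    or nobody in the population has the disease).
    """
    if not patients:
        return None
    matched = [p for p in patients if rule_matches(p, low, high)]
    base_rate = sum(p.has_disease for p in patients) / len(patients)
    if not matched or base_rate == 0:
        return None
    rule_rate = sum(p.has_disease for p in matched) / len(matched)
    return rule_rate / base_rate


# Toy data, invented for illustration: 8 patients, 2 with the disease.
PATIENTS = [
    Patient("M", 6.1, True, True),
    Patient("M", 6.4, True, True),
    Patient("M", 5.0, True, False),   # blood sugar outside the range
    Patient("F", 6.2, True, False),   # not male
    Patient("M", 6.3, False, False),  # biomarker negative
    Patient("F", 5.1, False, False),
    Patient("M", 4.9, False, False),
    Patient("F", 6.0, False, False),
]

print(rule_lift(PATIENTS, 6.0, 6.5))  # prints 4.0
```

Here the rule matches two patients, both of whom have the disease (rate 1.0), against an overall rate of 0.25, so the rule reads as "4 times greater than average". A rule covering only two patients would of course need validation on far more data before anyone acted on it.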

Rather than test hypotheses, let data drive discovery

The scientific method guides standard approaches to data analysis. These approaches begin with hypotheses formulated from existing research results, intuition, and judgment. The hypotheses are very specific predictions of an answer to the research question, and analyses then test whether or not that single answer is accurate.

For example, if our goal is to understand the characteristics of patients who respond better to a certain drug treatment for a chronic condition, we might hypothesize that some patients respond better because they 1) are more likely to take the medication as prescribed and 2) attended a seminar that explained the importance of adherence to therapy. So we test that specific hypothesis by collecting and analyzing relevant data about drug use by individuals who did and did not attend seminars. We use the findings to draw a conclusion, from which we refine the hypothesis or develop a new one (see visual at right), and then repeat the analysis cycle to refine our understanding or test other possible explanatory hypotheses. While this process is rigorous and logically sound, it can be particularly time consuming and frequently produces results that are not easily actionable or interpretable.

We have seen life sciences companies accelerate data analysis by taking a different approach, one that aims to discover variables for further testing by letting real-world data act as the guide. Rather than employing a sequence of projects to test and refine a single hypothesis, this discovery-driven approach (see visual at right) uses predictive analytic capabilities to reveal unique characteristics of a group under study: for example, the distinctive combination of variables common to people likely to choose one drug versus its competitor. This method removes the dependence on pre-determined hypotheses, and it allows the data to drive an understanding of predictor variables.
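The contrast with hypothesis testing can be sketched in a few lines. Instead of checking one pre-chosen combination of variables, the hypothetical `discover_rules` function below enumerates every small combination of binary flags and ranks them by how strongly they are associated with the outcome. This is a deliberately naive exhaustive search used for illustration, not the method HyperCube uses; the flag names, the outcome name, and the data are invented.

```python
from itertools import combinations


def discover_rules(records, outcome, max_size=2, min_support=2):
    """Rank conjunctions of binary flags by lift against the outcome.

    records: list of dicts mapping flag name -> bool, including the outcome.
    Returns (flags, support, lift) tuples sorted by lift, highest first.
    """
    if not records:
        return []
    base_rate = sum(r[outcome] for r in records) / len(records)
    if base_rate == 0:
        return []
    features = sorted(k for k in records[0] if k != outcome)
    rules = []
    for size in range(1, max_size + 1):
        for combo in combinations(features, size):
            matched = [r for r in records if all(r[f] for f in combo)]
            if len(matched) < min_support:
                continue  # too few patients to say anything about this rule
            rate = sum(r[outcome] for r in matched) / len(matched)
            rules.append((combo, len(matched), rate / base_rate))
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules


# Toy data, invented for illustration.
# Columns: biomarker_a, over_60, prior_hospitalization, responded
_ROWS = [
    (True, False, True, True),
    (True, True, True, True),
    (False, False, True, False),
    (True, True, False, False),
    (False, True, False, False),
    (False, False, False, False),
    (False, True, True, False),
    (True, False, False, False),
]
_NAMES = ("biomarker_a", "over_60", "prior_hospitalization", "responded")
RECORDS = [dict(zip(_NAMES, row)) for row in _ROWS]

for flags, support, lift in discover_rules(RECORDS, "responded"):
    print(" AND ".join(flags), "| patients:", support, "| lift:", lift)
```

In this toy data neither `biomarker_a` nor `prior_hospitalization` is very informative alone (lift 2.0 each), but the pair together ranks first with lift 4.0, which is the kind of interaction a single pre-formed hypothesis could easily miss. Two cautions apply to any search of this kind: the number of combinations grows very quickly with the number of variables, so exhaustive enumeration is not practical at the scale of thousands of columns, and rules found by searching many combinations should be checked on data that was not used to find them.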

Why is this approach beneficial?

In biological applications, such as rare diseases, cause is typically not determined by a discrete combination of factors identified in a formulated hypothesis (for example, A+B+C causes rare disease X). Rather, it is more nuanced; for example, for someone of a particular age and ethnicity, a certain combination of biomarkers in conjunction with environmental exposure can predict the likelihood of contracting a disease or reacting to a particular therapy. When viewed this way, it becomes clear why traditional hypothesis-based approaches for identifying predictor variables would have been much slower to identify those combinations, if they could identify them at all.

Used early in a research project, a discovery-driven approach can pinpoint variables and research avenues that researchers may not previously have considered. It also can reinforce and expand upon existing hypotheses in meaningful ways.

HyperCube and a discovery-based approach in action

In one case, we used HyperCube to support clinical evaluation of patients with a well-known chronic disease. The study's goal was to identify indicators of patients who were likely to experience a deleterious and dangerous disease-state event, as opposed to patients whose conditions remained stable. In this way, it would be possible to advise physicians to screen for high-risk patients and take preventative action. Rather than starting with a hypothesis about the factors that caused some patients to be unstable over a period of years, our analysis with HyperCube examined 100+ variables across populations of both stable and unstable patients to allow the data to drive our understanding of important predictor variables. By doing so, the research team quickly isolated a combination of five variables among the many studied that were associated with patient disease events during therapy.
This discovery-based analysis led to identification of previously unconsidered variable combinations and a faster conclusion.

In a different case, we used HyperCube to identify a combination of characteristics indicative of an often-undiagnosed rare disease, enabling the pharmaceutical company to educate physicians to test for this rare disease when they encountered that specific combination of characteristics. This study included a massive amount of data: more than 10 million rows of patient data representing five years of insurance claims data, and more than 12,000 columns of potential predictor variables representing all ICD-9 diagnoses occurring before those patients were diagnosed with the rare disease. The analysis considered both single diagnosis codes and aggregate variables representing certain combinations of ICD-9 codes as defined by previously published studies. Through this analysis, the research team was able to confirm the predictive strength of the previously published variable set and also identify additional variables that increased predictive coverage and strength.
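The data layout described for the second study (one flag per ICD-9 diagnosis recorded before the rare-disease diagnosis, plus aggregate flags for groups of codes) might be assembled along the following lines. This is a sketch under stated assumptions rather than the preparation pipeline actually used: the claim tuple layout, the `build_features` function, and the `CODE_GROUPS` definitions are hypothetical, the ICD-9 codes appear only as placeholders, an aggregate is assumed to mean "any code in the group is present", and patients with no diagnosis date are assumed to keep all of their claims.

```python
from collections import defaultdict
from datetime import date

# Hypothetical aggregate variables; these groupings are placeholders and
# are not taken from the study or from any published definition.
CODE_GROUPS = {
    "agg_group_a": {"250.00", "401.9"},
    "agg_group_b": {"786.50"},
}


def build_features(claims, index_dates, code_groups):
    """Turn claim rows into one sparse set of true flags per patient.

    claims: iterable of (patient_id, icd9_code, service_date) tuples.
    index_dates: patient_id -> date of the rare-disease diagnosis; claims
        on or after that date are dropped. Patients missing from this
        mapping keep all claims (an assumption made for this sketch).
    """
    codes_by_patient = defaultdict(set)
    for patient_id, code, service_date in claims:
        cutoff = index_dates.get(patient_id)
        if cutoff is None or service_date < cutoff:
            codes_by_patient[patient_id].add(code)

    features = {}
    for patient_id, codes in codes_by_patient.items():
        # Storing only the flags that are true keeps a table with
        # thousands of possible columns compact.
        row = {"dx_" + code for code in codes}
        for name, members in code_groups.items():
            if codes & members:  # assumed meaning: any member code present
                row.add(name)
        features[patient_id] = row
    return features


# Toy claims, invented for illustration.
CLAIMS = [
    ("p1", "250.00", date(2010, 3, 1)),
    ("p1", "786.50", date(2012, 6, 1)),  # after p1's diagnosis, so dropped
    ("p2", "401.9", date(2011, 1, 15)),
    ("p2", "786.50", date(2011, 2, 1)),
]
INDEX_DATES = {"p1": date(2011, 1, 1)}  # p2 has no rare-disease diagnosis

FEATURES = build_features(CLAIMS, INDEX_DATES, CODE_GROUPS)
for patient_id in sorted(FEATURES):
    print(patient_id, sorted(FEATURES[patient_id]))
```

Restricting each patient to claims dated before the diagnosis matters because diagnoses recorded afterwards could reflect the disease or its treatment rather than predict it.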

In both cases, HyperCube's unique discovery capabilities were important, but so, too, was its human-readable prescriptive output, which provides direction for taking action. For the studies described above, HyperCube produced specific business rules: if/then statements that allow experts such as doctors and researchers to use the results to advance studies and act on insights gained.

Better clinical outcomes, greater commercial success

A holistic and robust data-science platform with the right tools, approaches, and expertise is more critical than ever for problem solving and successful therapy. As outlined above, a discovery-driven approach can help improve and focus research and analysis, particularly when used early in a project and combined with the right analytic capabilities for dealing with the complexity and volume of real-world data.

Of course, regardless of the sophistication of analytics at hand, analysis does not replace the need for critical thought. Analysts and researchers should always apply a common-sense filter when looking at the results. But taking a discovery-driven approach that challenges common sense can help speed up analysis, and speed can make a big difference. The ability to home in quickly on the populations most affected by a particular therapy can create significant value, both in clinical outcomes and in commercial success.

For more information about applying HyperCube to unlock the potential of real-world data, please contact Jim Bedford at jbedford@westmonroepartners.com or 312.980.9393.