Ernestina Menasalvas Universidad Politécnica de Madrid




Ernestina Menasalvas, Universidad Politécnica de Madrid
EECA Cluster networking event, RITA, 12th November 2014, Baku

Sectors/Domains | Big Data Value | Source
Public administration | EUR 150 billion to EUR 300 billion in new value (considering the EU's 23 larger governments) | OECD, 2013
Healthcare & Social Care | EUR 90 billion, considering only the reduction of national healthcare expenditure in the EU | McKinsey Global Institute, 2011
Utilities | Reduce CO2 emissions by more than 2 gigatonnes, equivalent to EUR 79 billion (global figure) | OECD, 2013
Transport and logistics | USD 500 billion in value worldwide in the form of time and fuel savings, or 380 megatonnes of CO2 emissions saved | OECD, 2013
Retail & Trade | 60% potential increase in retailers' operating margins possible with Big Data | McKinsey Global Institute, 2011
Geospatial | USD 800 billion in revenue to service providers and value to consumer and business end users | McKinsey Global Institute, 2011
Applications & Services | USD 51 billion worldwide directly associated to the Big Data market (services and applications) | Various

Motivation
- In 2012, worldwide digital healthcare data was estimated at 500 petabytes; it is expected to reach 25,000 petabytes in 2020.
- Can we learn from the past to become better in the future?
- Healthcare data is becoming more complex.
The problem:
- Millions of reports, tasks, incidents, events, images, DNA
- Complete availability
- Lack of protocols and structure
- Organization-oriented processes
- Need for patient-oriented processes and information
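As a rough illustration of the growth quoted on this slide, the implied yearly growth rate can be worked out from the two figures given (500 petabytes in 2012, 25,000 petabytes in 2020). The function name is illustrative, and the calculation assumes smooth exponential growth, which the slide does not claim.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Figures from the slide: 500 PB (2012) to 25,000 PB (2020), 8 years apart.
rate = compound_annual_growth(500, 25_000, 2020 - 2012)
print(f"Implied growth: {rate:.0%} per year")
```

Under that assumption, a fifty-fold increase over eight years corresponds to growth of a little over 60% per year.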

From McKinsey: big data in health report, 2013
From physicians' judgment to evidence-based medicine
- Standard medical practice is moving from relatively ad hoc and subjective decision making to evidence-based healthcare.
- Is the health-care industry prepared to capture big data's full potential, or are there roadblocks that will hamper its use?
- Holistic, patient-centered approach to value, one that focuses equally on health-care spending and treatment outcomes.

EHR adoption
http://www.accenture.com/sitecollectiondocuments/pdf/accenture_emr_markets_whitepaper_vfinal.pdf

BIG DATA IN THE HEALTH DOMAIN

The average hospital (300 beds)
- 500,000 patients (reference population)
- 1,300 users (250 physicians, 900 nurses and technicians, 150 administrative staff)
Monthly activity:
- 20,000 consultations
- 1,300 admissions
- 800 interventions
- 10,000 emergencies
- 75,000 annotations
- 25,000 reports
- 90,000 interdepartmental orders
- 450,000 lab results (analytical)
- 13,000 image analyses
- 24,000 pharmacological prescriptions

Hospital Management
They require solutions for:
- Cost-reduction policies
- Efficiency procedures
- Establishing shared-risk policies
- Alarms
- Early prognosis and diagnosis
- Integration of environmental and sensor data
- Use of cloud data and services to compare with data from other hospitals, countries, etc., in support of efficiency policies

Government
- Support for cost-reduction policies
- Analysis of early detection of chronic diseases
- Analysis of diseases and the elderly
- Prediction of the evolution of diseases depending on clinical and societal factors
- Sentiment analysis (user satisfaction) of policies and health care
- Impact of environmental factors on the evolution and prevalence of diseases
- Impact of people's socio-economic situation on disease evolution and on health costs
- Cloud-based services for analysis of all the data generated in different hospitals

Clinicians: evidence-based medicine
- Correlations and associations of symptoms, family history, habits, diseases
- Impact of certain biomedical factors (genome structure, clinical variables) on the evolution of certain diseases
- Automatic classification of images (prioritization of X-ray images to help diagnosis)
- Automatic annotation of images
- Natural language (Google-style) diagnosis aid tools


ESCUELA TÉCNICA SUPERIOR DE INGENIEROS INFORMÁTICOS
Process

Data Acquisition
- Data silos
- Standardization
- Privacy
- Structured data:
  - Diverse numeric scales across different labs
  - Missing data
  - Clinical and demographic data (ICD): medium recall and medium precision for characterizing patients
- Non-structured data:
  - Images
  - Clinical reports
Data processing
Modelling
Image annotation
NLP
Integration
Deep analysis
Visualization
Validation
Apply
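The two structured-data issues named on this slide, diverse numeric scales across labs and missing data, can be illustrated with a minimal sketch. It standardises each lab's values to z-scores and leaves missing values untouched. Z-scoring is only one simple option chosen here for illustration; the slides do not prescribe a method, and real harmonisation would normally rely on unit conversion and reference ranges. The lab names and readings are made up.

```python
from statistics import mean, pstdev


def standardise_per_lab(readings):
    """Convert each lab's values to z-scores so differently scaled labs are comparable.

    readings: dict mapping a lab name to a list of values; None marks a missing value.
    Missing values stay None (no imputation). Each lab needs at least one known value.
    """
    result = {}
    for lab, values in readings.items():
        known = [v for v in values if v is not None]
        mu = mean(known)
        sigma = pstdev(known) or 1.0  # constant series: avoid dividing by zero
        result[lab] = [None if v is None else (v - mu) / sigma for v in values]
    return result


# Hypothetical glucose readings: lab_a reports mg/dL, lab_b reports mmol/L.
readings = {
    "lab_a": [90.0, None, 110.0, 100.0],
    "lab_b": [5.0, 5.5, None, 6.0],
}
scaled = standardise_per_lab(readings)
print(scaled)
```

After scaling, the lowest reading of each lab maps to the same z-score even though the raw units differ, and the gaps remain visible as None so that a later step can decide how to handle them.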

By 2015, the average hospital will have two-thirds of a petabyte of patient data, 80% of which will be unstructured image data like CT scans and X-rays.
http://medcitynews.com/2013/03/the-body-in-bytesmedical-images-as-a-source-of-healthcare-big-datainfographic/

Most frequent image types: Computed Tomography (CT), X-ray, Positron Emission Tomography (PET)
- The main challenge with image data is that it is not only huge, but also high-dimensional and complex.
- Extraction of the important and relevant features is a daunting task.

Methodology for image processing
Overall process of image mining:
- Data preprocessing
- Extracting multidimensional feature vectors
- Mining of the vectors to acquire high-level knowledge
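The three steps listed above can be sketched on toy data. This is a minimal illustration under stated assumptions, not the method used in the talk: images are tiny nested lists of 8-bit grey levels, the feature vector is a normalised intensity histogram, and the "mining" step is a nearest-neighbour lookup. All function names and sample images are hypothetical.

```python
import math


def preprocess(image):
    """Step 1: rescale 8-bit grey levels to the range [0, 1]."""
    return [[pixel / 255 for pixel in row] for row in image]


def feature_vector(image, bins=4):
    """Step 2: intensity histogram, normalised so the entries sum to 1."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[min(int(pixel * bins), bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def nearest(query, labelled):
    """Step 3: return the label of the stored vector closest to `query`."""
    return min(labelled, key=lambda item: math.dist(query, item[1]))[0]


# Tiny made-up 2x2 "images".
dark = [[10, 20], [30, 40]]
bright = [[220, 230], [240, 250]]
library = [
    (name, feature_vector(preprocess(img)))
    for name, img in [("dark", dark), ("bright", bright)]
]

query = feature_vector(preprocess([[200, 210], [235, 255]]))
label = nearest(query, library)
print(label)
```

Real medical images need far richer features than a four-bin histogram; the point is only to show how preprocessing, feature extraction and mining chain together.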

NLP applied to EHR
Analysis of free-text input from clinical reports and patients' histories would improve healthcare. There are several English-centric tools and terminologies working towards that goal:
- Mayo's cTAKES
- SNOMED-CT
- MetaMap
- UMLS
- MedLEE
- LOINC
- HITEx

Natural Language Processing
- Sentence Detector
- Tokenizer
- Part of Speech
- Chunker
- Named Entity
- Negation Detection
- Negation
- Hypothesis
- Historical Event
- Subject Recognition
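A few of the pipeline stages listed above (sentence detection, tokenization and negation detection) can be sketched with regular expressions. This is a toy illustration, not how cTAKES or any of the tools named in the talk work: the negation cue list, the three-token window and the concept set are assumptions made for the example.

```python
import re

# Illustrative cue words only; not a clinical negation lexicon.
NEGATION_CUES = {"no", "not", "denies", "without"}


def split_sentences(text):
    """Naive sentence detector: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Lower-case word tokenizer."""
    return re.findall(r"[a-z]+", sentence.lower())


def find_concepts(text, concepts, window=3):
    """Return (concept, negated) pairs.

    A concept counts as negated when a cue word occurs within `window`
    tokens before it in the same sentence.
    """
    found = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token in concepts:
                preceding = tokens[max(0, i - window):i]
                found.append((token, any(t in NEGATION_CUES for t in preceding)))
    return found


report = "Patient denies fever. Persistent cough since Monday."
results = find_concepts(report, {"fever", "cough"})
print(results)
```

Because negation is checked sentence by sentence, the cue "denies" in the first sentence marks "fever" as negated without affecting "cough" in the second.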

NESSI CPPP: BIG DATA VALUE

Goal of the cPPP
- Ensure Europe's leading role in the data-driven world, addressing competitiveness, innovation, and society
- Cover the dimensions of Big Data Value: data, skills, legal, technical, application, business, social

Multiple views of Big Data

Technical and non-technical aspects
- Data: Data is at the centre of the Big Data Value activities, as is making data sets and assets accessible
  - Private and open data sources: ensure their availability, integrity, and confidentiality
  - Data ownership
- Technology: Technologies and tools needed to support data-driven applications
  - Non-structured data
  - Algorithms for text and images
  - Anonymization
- Legal, Policy and Privacy: European-wide legislation and regulation
- Social: Acquiring early insights into the social impact of new technologies and data-driven applications, and how they will change the behaviour of individuals

Technical issues
- Harmonization across different sources: standardized modelling, integration of heterogeneous data sources
- Low-latency and real-time data processing
- Advanced data mining: predictive analytics, graph mining, semantic analysis
- Image and text processing
- Data protection and privacy technologies
- Advanced visualization, user experience and usability

Technical priorities: Data Management
Define, interoperate, openly share, access, transform, link, syndicate, and manage data:
- Annotation: Data needs to be semantically annotated in digital formats, without imposing extra effort on data producers
- Unstructured data
- Semantic Interoperability: Data silos have to be unlocked
- Legal Frameworks: Technical means have to be backed by legal frameworks to ensure the transparent sharing and exchange of data
- Quality

Technical priorities: Deep analytics
- Event Space: Move beyond the limited samples used so far in statistical analytics to samples covering the whole or the largest part of an event space
- Model Accuracy: Improve the accuracy of statistical models by enabling fast non-linear approximations in very large datasets
- Event Discovery: Discover rare events that are hard to identify because they have a small probability of occurrence but great significance (such as rare diseases and treatments)
- Real Time: Enable real-time analytics capable of analysing large amounts of data-in-motion and data-at-rest by updating the analysis results as the information content changes
- Semantic Analysis: Deep learning, contextualization based on AI, machine learning, semantic analysis in near-real time, graph mining
- Unstructured Data: Processing of unstructured data (multimedia, text). Linking and cross-analysis algorithms to deliver cross-domain and cross-sector intelligence
- Canonical Forms: Provide canonical paths so that data can be aggregated and shared easily without dependency on technicians or domain experts, and provide a path for the smart analysis of data across and within domains
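The "Real Time" and "Event Discovery" priorities can be illustrated together with a small sketch: a running mean and variance that are updated as each value arrives (Welford's online algorithm), plus a check that flags values far from what has been seen so far. The algorithm choice, the three-standard-deviation threshold and the sample stream are assumptions made for this example; the slides do not name a specific technique.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance updated one value at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0

    def is_rare(self, x, threshold=3.0):
        """Flag x if it lies more than `threshold` standard deviations from the mean so far."""
        std = self.variance ** 0.5
        return std > 0 and abs(x - self.mean) > threshold * std


stats = RunningStats()
flags = []
for value in [10, 11, 10, 9, 12, 10, 50]:  # made-up stream with one outlier
    flags.append(stats.is_rare(value))  # judge the value against history first
    stats.update(value)                 # then fold it into the running result
print(flags)
```

Each value is tested before it is folded into the statistics, so the analysis result is refreshed as the data changes without storing or rescanning the stream.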

Technical priorities: Privacy and Anonymisation Mechanisms
- Cloud Data Protection: Protect the cloud infrastructure, analytics applications, and the data from leakage and threats
- Data Minimisation: Methods for secure deletion of data and data minimisation
- Algorithms: Robust anonymisation algorithms
- Reversibility: Risk assessment tools to evaluate the reversibility of the anonymisation mechanisms
- Mining Algorithms: Develop privacy-preserving data mining algorithms
- Privacy Preservation: Mechanisms for privacy-preserving data publishing and data computations
- Pattern Hiding: Design of mechanisms for pattern hiding, so that data is transformed in a way that certain patterns cannot be derived (via mining) while others can
- Multiparty Mining: Secure multiparty mining mechanisms over distributed datasets
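The "Reversibility" item asks for tools that assess how easily an anonymisation could be undone, but names no method. One common and simple measure is k-anonymity, used here purely as an assumption for illustration: the smaller the smallest group of records sharing the same quasi-identifier values, the easier re-identification becomes. The records and field names are made up.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    values for all quasi-identifiers. Expects a non-empty list of dicts."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Made-up, already generalised records (age band, truncated postcode).
records = [
    {"age": "30-39", "zip": "280**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "280**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "280**", "diagnosis": "asthma"},
]

k = k_anonymity(records, ["age", "zip"])
print(k)
```

A result of 1 means at least one person is unique on age band and postcode, so that record could be singled out by anyone who knows those two attributes.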

Technical priorities: Advanced Visualisation and User Experience
- End User Centric: Adaptation to the needs of end users rather than predefined visualization and analytics. User feedback
- Scale: Handle extremely large volumes of data. Interaction techniques should aggregate data at different scales and enable easy transitions from one scale or form of aggregation to another, while supporting aggregation and comparisons among different scales
- Clusters: Dynamic clustering of information based on similarity or relatedness to the problem rather than on individual categories
- Geospatial: New visualisation for data with geo-locations, distances, and space/time correlations (e.g. sensor data, event data)
- Interrelated Data: Rather than data islands, visual interfaces must take account of spatial and semantic relationships, such as positions, distances, and space/time correlations
- Qualitative Analysis
- Time
- Plug and Play

Priority | Year 1 | Year 2 | Year 3 | Year 4 | Year 5

Data Management
- Mechanisms for integration of heterogeneous data sources
- Semantic-based data and content interoperability
- Generalisation of secure remote data access techniques
- Collaborative tools and techniques for data quality (including integrity and veracity check)
- Harmonized description format for metadata and for data reduction
- Methodology, models and tools for data lifecycle management
- Data management as a service

Deep analytics
- Improved statistical models by enabling fast non-linear approximations in very large datasets
- Real-time analytics
- Predictive modelling and graph mining techniques applied on extremely large graphs
- Semantic analysis in near-real-time
- Algorithms for multimedia data mining
- Descriptive language for deep analytics
- Deep learning techniques

Privacy and Anonymisation
- Complete data protection framework
- Method for deletion of data and data minimization
- Robust anonymisation algorithms

Advanced Visualisation and User Experience
- End-user centric data search and solutions paradigms
- Semantic-driven data visualisation
- Integration of analytics and visualization
- Contextualisation
- Collaborative real-time, dynamic 3D solutions

Mechanisms
In order to implement the research and innovation strategy and to align technical and non-technical aspects, four major kinds of mechanisms are recommended:
- Innovation Spaces (i-spaces): Cross-organisational and cross-sector environments will allow challenges to be addressed in an interdisciplinary way and will serve as a hub for other research and innovation activities.
- Lighthouse Projects: These will help raise awareness about the opportunities offered by Big Data and the value of data-driven applications for different sectors, and they will be an incubator for data-driven ecosystems.
- Technical Projects: These will take up specific Big Data issues, addressing targeted aspects of the technical priorities.
- Non-technical Projects: These projects will foster international cooperation for efficient information exchange and coordination of activities.

Main components and research priorities of the cPPP
- Innovation Spaces: serve as hubs for bringing the technology and application developments together, and cater for the development of skills, competence, and best practices
- Improving understanding of data by deep analytics (e.g. predictive modelling, graph mining, ...)
- Architectures for analysing data, including real-time data (e.g. recommendation engines, ...)
- Visualization and user experience (e.g. user-adaptive systems, search capabilities, ...)
- Lighthouse Projects: large-scale demonstrations focusing on certain sectors and domains
- Data management engineering (e.g. data integration, data integrity, ...)
- Privacy and anonymisation mechanisms

Implementation Timeline

THANKS!
Ernestina Menasalvas, Universidad Politécnica de Madrid