Using Big Data in Healthcare

First Plenary Session: THE USE OF "BIG DATA" - WHERE ARE WE AND WHAT DOES THE FUTURE HOLD?
Speaker: David R. Holmes III, PhD, Mayo Clinic College of Medicine, Rochester, MN, USA

Using Big Data in Healthcare: Graph Databases and Graph Analytic Approaches
David R. Holmes III
ISPOR 19th Annual Meeting, June 2nd, 2014
2014 MFMER

Teamwork

- Special Purpose Processor Development Group: Barry Gilbert, Ph.D.; Robert Techentin
- Center for Science of Healthcare Delivery: Jeanne Huddleston, M.D.; Nilay Shah, Ph.D.
- Rochester Epidemiology Project: Jennifer St. Sauver, Ph.D.
- YarcData: Steve Reinhardt
- Biomedical Imaging Resource

[Photo: Will and Charlie Mayo, The Mayo Brothers]

Graph Analytics

What is a graph?

- Node 1 and Node 2 are related.
- Node 1 is forward related to Node 2.
- Node 1 is forward related to Node 2 and Node 3.
- Node 1 is forward related to Node 2 via Edge A. Node 1 is forward related to Node 3 via Edge B.

Example (Smoking "Correlates" Coffee Drinking; Smoking "Causes" Heart Attack):

- Smoking is correlated with coffee drinking.
- Smoking may cause heart attacks.
- Smoking is a confounding variable.

Semantic Graphs / Databases

- A semantic graph is a node-typed, edge-typed, directed graph.
- Using the Resource Description Framework (RDF), we can describe each piece of information in the graph as a triple: <Subject> <Predicate> <Object>
    <Smoking> <corr. with> <Coffee Drinking>
    <Coffee Drinking> <corr. with> <Smoking>
    <Smoking> <causes> <Heart Attacks>
- A semantic database is referred to as a triple-store (i.e., a collection of triples).
- Semantic databases are queried using SPARQL (the semantic equivalent of SQL).
- Inferential rules and ontologies can be applied dynamically to the data to further enrich the dataset.
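The triple idea above can be sketched in a few lines of Python. This is a toy in-memory store, not an RDF library or a SPARQL engine; the `match` helper and its wildcard convention are assumptions made for illustration, loosely mimicking how a SPARQL pattern leaves some positions of a triple unbound.

```python
# Toy triple store holding the three triples from the slide.
# Illustrative only: not an RDF/SPARQL implementation.
triples = {
    ("Smoking", "corr. with", "Coffee Drinking"),
    ("Coffee Drinking", "corr. with", "Smoking"),
    ("Smoking", "causes", "Heart Attacks"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


# "What does smoking cause?" leaves only the object position unbound.
caused_by_smoking = [o for (_, _, o) in match(triples, "Smoking", "causes")]

# "What is smoking correlated with?"
correlated_with_smoking = [o for (_, _, o) in match(triples, "Smoking", "corr. with")]
```

A real triple-store would express the same questions as SPARQL queries over RDF data; the point here is only that every question reduces to matching patterns against triples.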

Origins of Semantic Databases in Healthcare

- Mishelevich, David J. "MEANINGEX: a computer-based semantic parse approach to the analysis of meaning." (1971) and "Semantic analysis of medical records." (1972): initial notion of an ontology and semantic (i.e., noun phrase) representation of medical data.
- Schmid, Hans Albrecht, and J. Richard Swenson. "On the semantics of the relational data model." (1975): formalizing the graph-like nature of semantic data models.
- Timeline: 1970s, 1980s, 1990s, 2000s...
- Lenz, Richard, Mario Beyer, and Klaus A. Kuhn. "Semantic integration in healthcare networks." (2007)

Benefits of Semantic Databases

Semantic databases center on the user's need to collect and interrogate heterogeneous data.

- Flexible schema: new variables can be added to the data model easily.
- Data type agnostic: new variables are added with indifference to variables already in the data model.
- Expressibility: the database can be queried flexibly, without regard for the specific data model.
- Inferential rules and ontologies can be applied dynamically.
- Whole-graph algorithms can be applied to find unique relationships between variables.
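Two of the benefits listed above, the flexible schema and dynamically applied inferential rules, can be illustrated with a small sketch. The patient triple and the `apply_symmetric_rule` helper are hypothetical, invented here for illustration; real semantic databases apply rules through ontology languages and reasoners rather than hand-written functions.

```python
def apply_symmetric_rule(store, predicate):
    """Return the store plus an inferred (o, predicate, s) for every (s, predicate, o)."""
    inferred = {(o, p, s) for (s, p, o) in store if p == predicate}
    return store | inferred


store = {
    ("Smoking", "corr. with", "Coffee Drinking"),
    ("Smoking", "causes", "Heart Attacks"),
}

# Flexible schema: a new kind of variable is just another triple.
# No table has to be altered. (Hypothetical example data.)
store.add(("Patient 1", "has risk factor", "Smoking"))

# Inference: correlation is symmetric, causation is not, so the rule is
# applied to the "corr. with" predicate only.
enriched = apply_symmetric_rule(store, "corr. with")
```

After the rule runs, the reverse correlation triple exists even though it was never stored explicitly, while the causal triple is left one-directional.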

Healthcare Semantification at Mayo

Rochester Epidemiology Project (population-based)
- Goal: leverage the stable population to track health over time
- 500K individuals, 40-year duration
- 2M healthcare records

Bedside Patient Rescue (in-hospital)
- Goal: Early Warning Systems (EWS) for patient events
- 115K patient encounters, 2-year duration
- 38M records (labs, nursing evals, etc.)

Rochester Epidemiology Project

Whole Graph Algorithm: Diffusion Algorithm

- A diffusion algorithm can find hidden relationships by exploiting connections in the semantic graph.
- Initial values are attached to specific seed nodes.
- Values propagate over graph edges and accumulate in different parts of the graph.
- Sometimes results are unexpected.
- With a functioning graph diffusion algorithm, many possible searches can be performed.
- For the REP, we can identify a representative example of cohort features and label the graph.
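The seed-and-propagate idea above can be sketched as a generic spreading-activation loop. This is not the algorithm used on the REP data, whose details the slides do not give; the `decay` factor, the equal split among neighbours, and the node names are all assumptions chosen for illustration.

```python
from collections import defaultdict


def diffuse(edges, seeds, steps=2, decay=0.5):
    """Spread seed values along undirected edges and accumulate them per node."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    total = defaultdict(float, seeds)  # value accumulated at each node
    frontier = dict(seeds)             # value still travelling through the graph
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            if not neighbors[node]:
                continue
            # Each node passes on a decayed share, split equally among neighbours.
            share = decay * value / len(neighbors[node])
            for nb in neighbors[node]:
                next_frontier[nb] += share
        for node, value in next_frontier.items():
            total[node] += value
        frontier = next_frontier
    return dict(total)


# Hypothetical graph: two patients share a diagnosis.
edges = [
    ("patient1", "diabetes"),
    ("patient2", "diabetes"),
    ("patient2", "hypertension"),
]
scores = diffuse(edges, seeds={"patient1": 1.0}, steps=2)
```

With two steps, value seeded at `patient1` reaches `patient2` through the shared diagnosis but has not yet reached `hypertension`; nodes with a non-zero score are the ones the seed is indirectly connected to.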

Bedside Patient Rescue

Just one algorithm? No.

There are many whole-graph algorithms which could be applied to healthcare data:

- PageRank: Google-developed algorithm that uses the link structure of a graph to emphasize its important nodes.
- Peer-pressure clustering: graph-based clustering algorithm to find groups based on both node and edge data.
- Betweenness centrality: algorithm to determine key nodes in a graph, i.e., those through which the most connections between other nodes pass.
- Clique detection: methods to find fully connected sub-graphs in a graph.
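As one concrete instance of the whole-graph algorithms listed above, here is a minimal power-iteration PageRank in plain Python. The three-node graph is hypothetical, and the damping factor of 0.85 is the conventional default rather than anything stated in the slides; a production system would use a graph library or a graph appliance instead.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [nodes it links to]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank


# Hypothetical graph: A and B both point at C, and C points back at A.
links = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
top_node = max(ranks, key=ranks.get)
```

Node C ends up with the highest rank because two nodes link to it, while B, which nothing links to, keeps only the baseline share.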

Why doesn't everyone use Semantic Databases?

- Migrating relational databases to semantic databases can be tricky.
- Graph databases suffer from missing data and noisy data, just like relational databases.
- Graph databases are large, and graph algorithms are complex.

Migrating Relational Databases

- Relational DBs, by definition, are an efficient tabular storage of information.
- Care must be taken in developing a semantic model to ensure semantic richness:
  - Data must be promoted correctly to subjects/objects.
  - Predicates must be semantically meaningful.
  - Standard nomenclature must be used to be compatible.
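The promotion of tabular data to subjects, predicates, and objects can be sketched as follows. The table, the column names, the `patient:` prefix, and the predicate names are all hypothetical; a real migration would map columns to terms from a standard vocabulary, as the last point above requires.

```python
def rows_to_triples(rows, key_column, predicates):
    """Promote each row's key to a subject and each mapped column to a predicate."""
    triples = []
    for row in rows:
        subject = f"patient:{row[key_column]}"
        for column, predicate in predicates.items():
            value = row.get(column)
            if value is not None:  # an empty cell simply yields no triple
                triples.append((subject, predicate, value))
    return triples


# Hypothetical relational rows, one dict per table row.
rows = [
    {"patient_id": 1, "sex": "F", "diagnosis": "hypertension"},
    {"patient_id": 2, "sex": "M", "diagnosis": None},
]

# Hypothetical column-to-predicate mapping.
predicates = {"sex": "hasSex", "diagnosis": "hasDiagnosis"}

triples = rows_to_triples(rows, "patient_id", predicates)
```

The design choice worth noticing is the explicit mapping table: deciding which column becomes which predicate is where the semantic modelling effort goes, and it is kept separate from the mechanical conversion.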

Missing and Noisy Data

- Missing data is just that: missing.
- Graph algorithms need to be smarter about missing data, for example by:
  - building latent variables into the data
  - using a priori models to address missing data
- Healthcare data is notoriously noisy. Moreover, there is a lot of it.
  - Algorithms must be robust to noise and oversampling.
  - While pre-processing can address this, some useful information can be lost.
  - Algorithms need to intelligently weight the data to draw meaningful conclusions.

[Figure: Connecting Two BPR Encounters]

Graph Data is Large and Complex

- For decades, the community didn't have the computational resources to deal with semantic data efficiently.
  - Technology developers were unable to pack enough memory into a computer to hold the data.
  - Networks were too slow.
  - As a result, CPUs were data starved.
- New technologies address this issue specifically:
  - Hadoop clusters
  - Graph computers (8192 threads, 2 TB memory)
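Returning to the earlier point about oversampling and weighting: one simple way to keep a heavily sampled encounter from dominating a summary statistic is to average within each encounter first. This is only a sketch of that one idea, with hypothetical encounter IDs and readings; it is not a method described in the slides.

```python
from collections import defaultdict


def encounter_weighted_mean(readings):
    """Average readings so that each encounter counts once, however often it was sampled."""
    by_encounter = defaultdict(list)
    for encounter_id, value in readings:
        if value is not None:  # skip missing values rather than guessing them
            by_encounter[encounter_id].append(value)
    per_encounter = [sum(values) / len(values) for values in by_encounter.values()]
    return sum(per_encounter) / len(per_encounter)


# Hypothetical readings: encounter e1 was sampled four times, e2 once (plus one missing value).
readings = [
    ("e1", 100.0), ("e1", 110.0), ("e1", 120.0), ("e1", 110.0),
    ("e2", 80.0), ("e2", None),
]

present = [value for _, value in readings if value is not None]
naive_mean = sum(present) / len(present)
weighted_mean = encounter_weighted_mean(readings)
```

The naive mean is pulled toward the oversampled encounter, while the weighted mean treats the two encounters equally. The function assumes at least one non-missing reading is present.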

[Figure: Progressively complex queries using graph computer vs. standard SQL database]

Final Thoughts

[Image: The Jerry Springer Show]

- Graph databases for healthcare were proposed in the 1970s.
- Over time, the conceptual model of graph databases / algorithms matured.
- Technology has finally caught up.
- The technical community is now prepared to accept massive amounts of healthcare data and store it semantically.
- Semantic graph databases change the way that we look at data.
- Graph analytics will yield new insights into existing and soon-to-be collected datasets.
- There are still challenges in data migration and data quality to be addressed.
- Harass your favorite computer scientist / informaticist to make progress in these areas.


More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.

More information

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer Alejandro Vaisman Esteban Zimanyi Data Warehouse Systems Design and Implementation ^ Springer Contents Part I Fundamental Concepts 1 Introduction 3 1.1 A Historical Overview of Data Warehousing 4 1.2 Spatial

More information

Big Data and the Data Lake. February 2015

Big Data and the Data Lake. February 2015 Big Data and the Data Lake February 2015 My Vision: Our Mission Data Intelligence is a broad term that describes the real, meaningful insights that can be extracted from your data truths that you can act

More information

Unisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise

Unisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise Unisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise Introducing Unisys All in One software based weather platform designed to reduce server space, streamline operations, consolidate

More information

A.I. Tech Company profile A.I. Tech designs and develops intelligent audio and video analysis systems; We help operators to identify and give a meanin

A.I. Tech Company profile A.I. Tech designs and develops intelligent audio and video analysis systems; We help operators to identify and give a meanin EUCISE 2020 Industry Day Brussels 23.Sept.2015 Pierluigi Ritrovato A.I. Tech Artificial Intelligence Tech Technologies nologies and Solutions A spinspin-off company of the University of Salerno The Vision

More information

Blazent IT Data Intelligence Technology:

Blazent IT Data Intelligence Technology: Blazent IT Data Intelligence Technology: From Disparate Data Sources to Tangible Business Value White Paper The phrase garbage in, garbage out (GIGO) has been used by computer scientists since the earliest

More information

DEMYSTIFYING BIG DATA. What it is, what it isn t, and what it can do for you.

DEMYSTIFYING BIG DATA. What it is, what it isn t, and what it can do for you. DEMYSTIFYING BIG DATA What it is, what it isn t, and what it can do for you. JAMES LUCK BIO James Luck is a Data Scientist with AT&T Consulting. He has 25+ years of experience in data analytics, in addition

More information

2 Linked Data, Non-relational Databases and Cloud Computing

2 Linked Data, Non-relational Databases and Cloud Computing Distributed RDF Graph Keyword Search 15 2 Linked Data, Non-relational Databases and Cloud Computing 2.1.Linked Data The World Wide Web has allowed an unprecedented amount of information to be published

More information

Big Data and Semantic Web in Manufacturing. Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India

Big Data and Semantic Web in Manufacturing. Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India Big Data and Semantic Web in Manufacturing Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India Outline Big data in Manufacturing Big data Analytics Semantic web technologies Case

More information

Handling the Complexity of RDF Data: Combining List and Graph Visualization

Handling the Complexity of RDF Data: Combining List and Graph Visualization Handling the Complexity of RDF Data: Combining List and Graph Visualization Philipp Heim and Jürgen Ziegler (University of Duisburg-Essen, Germany philipp.heim, juergen.ziegler@uni-due.de) Abstract: An

More information

Semantic tagging for crowd computing

Semantic tagging for crowd computing Semantic tagging for crowd computing Roberto Mirizzi 1, Azzurra Ragone 1,2, Tommaso Di Noia 1, and Eugenio Di Sciascio 1 1 Politecnico di Bari Via Orabona, 4, 70125 Bari, Italy mirizzi@deemail.poliba.it,

More information

> Semantic Web Use Cases and Case Studies

> Semantic Web Use Cases and Case Studies > Semantic Web Use Cases and Case Studies Case Study: Applied Semantic Knowledgebase for Detection of Patients at Risk of Organ Failure through Immune Rejection Robert Stanley 1, Bruce McManus 2, Raymond

More information

Parallel Data Warehouse

Parallel Data Warehouse MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability

More information

Information Discovery on Electronic Medical Records

Information Discovery on Electronic Medical Records Information Discovery on Electronic Medical Records Vagelis Hristidis, Fernando Farfán, Redmond P. Burke, MD Anthony F. Rossi, MD Jeffrey A. White, FIU FIU Miami Children s Hospital Miami Children s Hospital

More information

Chronon: A modern alternative to Log Files

Chronon: A modern alternative to Log Files Chronon: A modern alternative to Log Files A. The 5 fundamental flows of Log Files Log files, Old School, are a relic from the 1970s, however even today in 2012, IT infrastructure monitoring relies on

More information

Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices

Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices September 10-13, 2012 Orlando, Florida Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices Vishwanath Belur, Product Manager, SAP Predictive Analysis Learning

More information

Medical Big Data Workshop 12:30-5pm Star Conference Room. #MedBigData15

Medical Big Data Workshop 12:30-5pm Star Conference Room. #MedBigData15 Medical Big Data Workshop 12:30-5pm Star Conference Room #MedBigData15 Welcome! Today s Goals: Introduce you to the Big Data @ CSAIL Introduce you to the popular MIMIC II Dataset Overview of Database Technologies

More information

Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics

Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics Paper 1828-2014 Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics John Cunningham, Teradata Corporation, Danville, CA ABSTRACT SAS High Performance Analytics (HPA) is a

More information

Data Discovery, Analytics, and the Enterprise Data Hub

Data Discovery, Analytics, and the Enterprise Data Hub Data Discovery, Analytics, and the Enterprise Data Hub Version: 101 Table of Contents Summary 3 Used Data and Limitations of Legacy Analytic Architecture 3 The Meaning of Data Discovery & Analytics 4 Machine

More information

Opportunities and Challenges in Big Data Neuroscience

Opportunities and Challenges in Big Data Neuroscience Opportunities and Challenges in Big Data Neuroscience Joshua T. Vogelstein {BME, ICM, CIS, IDIES}@JHU Co-founder and Director of the Open Connectome Project e: jovo@jhu.edu, w: http://ocp.me Why is it

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS.

HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to

More information

ISSN:2321-1156 International Journal of Innovative Research in Technology & Science(IJIRTS)

ISSN:2321-1156 International Journal of Innovative Research in Technology & Science(IJIRTS) Nguyễn Thị Thúy Hoài, College of technology _ Danang University Abstract The threading development of IT has been bringing more challenges for administrators to collect, store and analyze massive amounts

More information