The Best Way to Get BIG DATA is By Starting Small

1 The Best Way to Get BIG DATA is By Starting Small Dr. Brand Niemann Director and Senior Data Scientist Semantic Community for Johns Hopkins University School of Medicine and Modus Operandi December 12,

2 BIG DATA
The new Digital Government Strategy is "treating all content as data." So big data = all your content, but use just a small sample to start a pilot.
There are many Big Data technologies to choose from, and many early adopters are finding them more expensive than expected: use open source and free trials to pilot.
There are many Big Data problems to solve that could "boil the ocean": use a data scientist to help build a team and community for a fast, inexpensive, and small semantic data science pilot.

3 Subcommittee on Networking and Information Technology Research and Development (NITRD Subcommittee)
These three activities fostered Semantic Medline on the YarcData Graph Appliance for the White House Big Data Initiative.

4 Data Science Team Example: Chief Data Science Officer
Chief Data Science Officer: Dr. George Strawn, Director, White House OSTP NITRD/NCO: Semantic Medline could be the killer Semantic Web application for the US Federal Government
Data Science Team:
  Dr. Brand Niemann, Lead
  Dr. Tom Rindflesch, NLM Semantic Medline Creator
  Professor Kirk Borne, George Mason University, Federal Big Data Senior Steering WG Workforce Training Initiative
  Tim White, Director, YarcData Federal Global Head
  Aaron Bossett, YarcData Federal Solution Architect
  Dr. Eric Little, Modus Operandi Chief Scientist

5 Generic Problems
How to get Big Data: Unstructured (Natural Language Processing to Graph-RDF Triples) and Structured (Relational-RDF Triples)
Where to store Big Data: Graph-RDF Triples and Relational
What to show about Big Data: Statistics, Visualizations, and Network Graphs
Note: RDF Triples make Big Data smaller, smarter, and integrated!
Semantic Medline on the YarcData Graph Appliance is an example of the best content on the best graph data store with the best visualization results so far (in my humble opinion)!
Our Semantic Data Science Team delivered this for the recent White House Big Data Event: See Making the Most of Big Data
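The "Structured (Relational-RDF Triples)" path on this slide can be illustrated with a minimal sketch. It is not the team's actual pipeline: the function name `rows_to_triples`, the column names, and the sample rows are all hypothetical, and plain Python tuples stand in for real RDF terms.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def rows_to_triples(rows: List[Dict[str, str]], key: str) -> List[Triple]:
    """Turn relational rows into (subject, predicate, object) triples.

    The value in column `key` becomes the subject; every other column
    in the row becomes one predicate/object pair.
    """
    triples: List[Triple] = []
    for row in rows:
        subject = row[key]
        for column, value in row.items():
            if column != key:
                triples.append((subject, column, value))
    return triples


# Hypothetical sample rows, invented for illustration only.
rows = [
    {"drug": "haloperidol", "targets": "dopamine receptor", "treats": "schizophrenia"},
    {"drug": "clozapine", "treats": "schizophrenia"},
]

triples = rows_to_triples(rows, key="drug")
for triple in triples:
    print(triple)
```

Note that the second row simply omits the `targets` column: in a triple representation a missing fact is a missing triple, not a NULL in a fixed schema.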

6 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Work Flow

7 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Semantic Medline Database Application
See More Information:

8 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Visualization and Linking to Original Text

9 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Bioinformatics Publication
My Note: MySQL database for non-commercial use.

10 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Semantic Medline at NIH-NLM
Current: Web-based research tool.
Transition: Current systems re-engineered to leverage Urika (less than 5 days).
Purpose: Build a platform for users to perform increasingly complex analysis.
Immediate Requirement: Replicate current capability.
Future: Allow for increasingly complex analysis. Ability to capture and share analytics in addition to sharing data. Tailor Urika to less complex queries.

11 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: Graphs and Traditional Technologies
Square peg, round hole:
  Current technology does not support efficient representation, storage, and interaction with complex graph structures
  Traditional relational models only add to an already complex structure
  Traditional hardware approaches do not support efficient access to highly interconnected graphs
You don't know what you don't know:
  Efficient relational schemas require prior knowledge of the relationships between database fields
  Updating and modifying schemas frequently introduces delays and errors
Problems in partitioning the problem:
  Distributed computing solutions are good if your problem can be easily partitioned
  Graphs are not predictable; accessing graph nodes across large clusters can be unwieldy at best and does not work at scale
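The partitioning point on this slide can be made concrete with a small sketch that counts how many edges cross machine boundaries when graph nodes are spread over a cluster. Everything here is an assumption for illustration: the function name `cut_fraction`, the toy ring graph, and the modulo placement rule (a stand-in for any placement that ignores graph structure).

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def cut_fraction(edges: List[Edge], num_partitions: int) -> float:
    """Fraction of edges whose endpoints land on different partitions.

    Nodes are integers placed on partition `node % num_partitions`,
    i.e. a placement that knows nothing about the graph's structure.
    """
    crossing = sum(
        1 for u, v in edges if u % num_partitions != v % num_partitions
    )
    return crossing / len(edges)


# Hypothetical toy graph: a ring of 8 nodes, 0-1-2-...-7-0.
edges = [(i, (i + 1) % 8) for i in range(8)]

single_machine = cut_fraction(edges, 1)
four_machines = cut_fraction(edges, 4)
print(single_machine, four_machines)
```

On one machine no edge is remote; with four partitions every edge of this ring crosses a boundary, so each hop in a traversal would become a network request. Real graphs and smarter partitioners will give different numbers.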

12 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: The YarcData Approach
Business Challenge:?
Large Shared Memory Architecture: Up to 512 TB
XMT2 Massively Multi-Threaded Processors: 128 Threads
Scalable IO: Up to 350 TB per Hour
Real-time, Interactive Analytics on Large Graph Problems

13 Semantic Medline YarcData Graph Appliance Application for Federal Big Data Senior Steering WG: New Use Cases
Schizophrenia
  Current therapies target dopamine receptors
  Not entirely effective
  Side effects
  Basic research is exploring glutamate and its NMDA receptor
  Goal: can we use Semantic MEDLINE to discover that research trend in the scientific literature?
Cancer
  With some exceptions, therapy is not effective
  Has not progressed significantly in 60 years
  Scientific basis
    Traditionally cancer cells
    More recently non-cancer cells (immune system)
  Immune system and cancer
    Connection noted in 1863 (Virchow)
    But not exploited until recently
  Goal: look for trends in cancer immunotherapy
Note: See Two YouTube Video Demos: Schizo (7 minutes) and Cancer (21 minutes)
Discovery Browsing Method for Exploiting Semantic MEDLINE
  Cooperative reciprocity between system and human
  Issue query
  Inspect graph for interesting concept
  Use selected concept to seed another query
  Iterate until satisfied
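The discovery browsing steps above (issue query, inspect graph, seed the next query, iterate) can be sketched as a loop. This is only an illustration of the loop's shape, not Semantic MEDLINE's interface: the triples, the predicate names, and the `scripted_choice` function that stands in for the human analyst are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy triples, invented for illustration; not Semantic MEDLINE output.
TRIPLES: List[Triple] = [
    ("schizophrenia", "ASSOCIATED_WITH", "dopamine receptor"),
    ("schizophrenia", "ASSOCIATED_WITH", "glutamate"),
    ("glutamate", "INTERACTS_WITH", "NMDA receptor"),
]


def query(concept: str) -> List[Triple]:
    """Issue query: return every triple that mentions the concept."""
    return [t for t in TRIPLES if concept in (t[0], t[2])]


def discovery_browse(
    seed: str,
    choose: Callable[[List[str]], Optional[str]],
    max_steps: int = 10,
) -> List[str]:
    """Iterate query -> inspect -> reseed until `choose` returns None."""
    path = [seed]
    seen = {seed}
    concept = seed
    for _ in range(max_steps):
        graph = query(concept)
        # Inspect graph: concepts in the result not yet visited.
        candidates = sorted({c for s, _, o in graph for c in (s, o)} - seen)
        chosen = choose(candidates)
        if chosen is None:  # the analyst is satisfied
            break
        seen.add(chosen)
        path.append(chosen)
        concept = chosen  # use the selected concept to seed another query
    return path


def scripted_choice(candidates: List[str]) -> Optional[str]:
    """Stand-in for the human analyst: pick the first preferred concept."""
    for preferred in ("glutamate", "NMDA receptor"):
        if preferred in candidates:
            return preferred
    return None


path = discovery_browse("schizophrenia", scripted_choice)
print(path)
```

In the method as described, the choice step belongs to the human; the `choose` callback is kept separate so that a person, rather than a script, can decide what counts as interesting.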

14 Modus Operandi: Mantra, Performance, and Vision
Mantra: Speeding the Discovery, Integration, and Fusion of Information
Performance:
  SBIR Phase Three Successes: Wave Exploitation Framework (EF)
  Wave EF: Government-off-the-shelf (GOTS) technology for intelligence applications that tackles the difficult problem of processing unstructured and semi-structured data
  C4ISR Government Customers: U.S. Air Force, U.S. Army, U.S. Marine Corps, U.S. Navy, DARPA, DTRA, Missile Defense Agency, and Intelligence Agencies
Vision:
  Wave All-Source Semantic Fusion Engine: In development to support individual medical researchers/intelligence analysts to work with big data
  Semedy (former Ontoprise founders): Reasoner and Triple Store

15 Modus Operandi: Finding the Right Needle in the Right Haystack
Dyson said. "So a lot of what we're doing is enabling that by making the data sources accessible and searchable. Our specialization is what we call semantic technology, which is just a way of making the data smarter. We enrich the data with various tags to make it easier to find." The software also provides what McNeight called "data provenance," which has to do with the traceability back to the source of the data, the really important aspect for intelligence personnel. "We don't make decisions," McNeight explained. "We just help (the analyst) to make decisions and to find the right data. He may only be interested in a certain person in a certain location at a certain time. We can bring that back to him across multiple databases."
Source:

16 Data Science Team Example: President of Modus Operandi
President of Modus Operandi: Richard McNeight, President, Master's Degree in Artificial Intelligence & Computer Science, Board of Regents, Florida Institute of Technology University, Recognized for Entrepreneurial Leadership, and Recipient of Florida County Economic Development Grant for Big Medical Data
Data Science Team:
  Lee Watkins, Director of Bioinformatics & IT JHMI, and Dr. Brand Niemann, Semantic Community, Co-Leads
  Dr. Eric Little, Modus Operandi Chief Scientist, Ontology and Wave All-Source Semantic Fusion Engine Development
  Bryan Thompson and Michael Personick, SYSTAP Principals, Bigdata Platform
  Tim Barr, YarcData Medical Informatics, and Aaron Bossett, YarcData Federal Solution Architect
  Others to be added as needed
Advisors:
  Dr. Tom Rindflesch, NIH/NLM Semantic Medline Creator
  Dr. Richard Ford and Dr. Marco Carvalho, Florida Institute of Technology

17 Wave and the vmdc (virtual metadata catalog, which is a query translator for non-semantic queries)
An engine that can ingest any kind of data, transform that data into RDF graphs, then do a lot of semantic coolness with those graphs.
Diagram labels: Structured, Semistructured, Unstructured Data; Batch Data; Streaming Data; Wave Ingest; Trust/Provenance Algorithms; Semantic Reasoner; Generated Semantic Graph (RDF); vmdc; Accumulo DB; High Performance Triple Store (Rya)

18 How Wave Drives the BLADE Semantic Wiki and Other Kinds of Analytic Visualizations
The wiki is just a way to view the entities in the model, make changes, and see related content without having to type any SPARQL code or really know anything about the backend model structure: just point and click at the content you want to see.
Diagram labels: BLADE 2.0 Wiki; Apps and Visualizations

19 Possible Scenario
For medicine, the BLADE 2.0 Semantic Wiki would allow different researchers to view the data collectively from within their areas of expertise, but connect them to other areas effortlessly. This means scientist 1 could be looking up information on a given receptor on a cell, while scientist 2 is looking at proteomic information (perhaps not even knowing it is the underlying substance of that cell/receptor). Scientist 3 could add some new information about a given compound that shows reactions at the receptor site scientist 1 is studying. Upon entering that information, scientist 1 would see a new linked piece of data about their receptor related to the compound, and the cool part is that scientist 2 would also see information about the connection between their protein structure and that compound. Scientist 3 would see the information about the protein related to their compound as well (even though they were only looking at the receptor-compound connection). All 3 would basically have new linked information available to pursue if they wanted. Now imagine being able to do those kinds of joins in near-real-time with a simple tool across the entire corpus of the Semantic Medline data set. Kaboom!
Source: Dr. Eric Little, Chief Scientist and Ontologist
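The scenario above can be sketched with a tiny shared triple store. This is a hedged illustration, not how BLADE or Wave work internally: the entity names (`receptor_R`, `protein_P`, `compound_C`), the predicates, and the helper functions are hypothetical, and a two-hop neighborhood stands in for the "join" the slide describes.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def neighbors(store: Set[Triple], entity: str) -> Set[str]:
    """Entities directly linked to `entity`, ignoring edge direction."""
    linked = set()
    for subject, _, obj in store:
        if subject == entity:
            linked.add(obj)
        elif obj == entity:
            linked.add(subject)
    return linked


def within_two_hops(store: Set[Triple], entity: str) -> Set[str]:
    """Entities reachable from `entity` in one or two links."""
    first = neighbors(store, entity)
    second = set()
    for middle in first:
        second |= neighbors(store, middle)
    return (first | second) - {entity}


# The model already links scientist 1's receptor to scientist 2's protein.
store: Set[Triple] = {("receptor_R", "composed_of", "protein_P")}
before = within_two_hops(store, "protein_P")

# Scientist 3 enters a single new fact about a compound and the receptor.
store.add(("compound_C", "reacts_at", "receptor_R"))
after = within_two_hops(store, "protein_P")

print(sorted(before), sorted(after))
```

Scientist 3 adds one triple about the receptor only, yet scientist 2's view of the protein gains the compound, because both facts share `receptor_R` as a node.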

20 Knowledge Base: Modus Operandi Web Intelligence in MindTouch
Practical Example of How to Get BIG DATA By Starting Small with Structured & Unstructured Data as Relational & RDF Triples Stored in Excel and Visualized in Spotfire.

21 Big Data in Memory: Innovation Story
Met Jef Sharp, President, Panève:
  Amazingly fast access and massive storage
  Big Data Supercomputer on My Mobile Device
  Johns Hopkins University
  Blackbook (CIA Cloud)
I suggested:
  Greylock Partners - #2 Data Scientist in the World (DJ Patil, Entrepreneur-in-Residence who built the first formal data science team at LinkedIn)
  Works for In-Q-Tel (Robert Ames, Senior VP for Technology, In-Q-Tel)
  Works for CIA (Gus Hunt, CTO, CIA)
  Who Wants Big Data Supercomputer on Mobile Devices

22 Future: Possibility
Panève's ZettaLeaf & ZettaTree Products: Scalable single level storage
Panève's scalable single level storage model collapses the server, network, and storage by removing software and replacing it with memory system primitives. This eliminates all network and network-processing overhead associated with accessing storage and delivers a 10,000X increase in raw performance

Cray: Enabling Real-Time Discovery in Big Data

Cray: Enabling Real-Time Discovery in Big Data Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects

More information

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 James Maltby, Ph.D 1 Outline of Presentation Semantic Graph Analytics Database Architectures In-memory Semantic Database Formulation

More information

bigdata Managing Scale in Ontological Systems

bigdata Managing Scale in Ontological Systems Managing Scale in Ontological Systems 1 This presentation offers a brief look scale in ontological (semantic) systems, tradeoffs in expressivity and data scale, and both information and systems architectural

More information

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies Semantic Data Management Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies 1 Enterprise Information Challenge Source: Oracle customer 2 Vision of Semantically Linked Data The Network of Collaborative

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

How To Build A Cloud Based Intelligence System

How To Build A Cloud Based Intelligence System Semantic Technology and Cloud Computing Applied to Tactical Intelligence Domain Steve Hamby Chief Technology Officer Orbis Technologies, Inc. shamby@orbistechnologies.com 678.346.6386 1 Abstract The tactical

More information

Supercomputing and Big Data: Where are the Real Boundaries and Opportunities for Synergy?

Supercomputing and Big Data: Where are the Real Boundaries and Opportunities for Synergy? HPC2012 Workshop Cetraro, Italy Supercomputing and Big Data: Where are the Real Boundaries and Opportunities for Synergy? Bill Blake CTO Cray, Inc. The Big Data Challenge Supercomputing minimizes data

More information

urika! Unlocking the Power of Big Data at PSC

urika! Unlocking the Power of Big Data at PSC urika! Unlocking the Power of Big Data at PSC Nick Nystrom Director, Strategic Applications Pittsburgh Supercomputing Center February 1, 2013 nystrom@psc.edu 2013 Pittsburgh Supercomputing Center Big Data

More information

Smart Financial Data: Semantic Web technology transforms Big Data into Smart Data

Smart Financial Data: Semantic Web technology transforms Big Data into Smart Data Smart Financial Data: Semantic Web technology transforms Big Data into Smart Data Insurance Data and Analytics Summit 2013 18 April 2013 David Saul, Senior Vice President & Chief Scientist State Street

More information

www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach

www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach Nic Caine NoSQL Matters, April 2013 Overview The Problem Current Big Data Analytics Relationship Analytics Leveraging

More information

Using Big Data in Healthcare

Using Big Data in Healthcare Speaker First Plenary Session THE USE OF "BIG DATA" - WHERE ARE WE AND WHAT DOES THE FUTURE HOLD? David R. Holmes III, PhD Mayo Clinic College of Medicine Rochester, MN, USA Using Big Data in Healthcare

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Data-intensive HPC: opportunities and challenges. Patrick Valduriez Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,

More information

Industry 4.0 and Big Data

Industry 4.0 and Big Data Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and

More information

How To Make Sense Of Data With Altilia

How To Make Sense Of Data With Altilia HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to

More information

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following

More information

Objectivity positions graph database as relational complement to InfiniteGraph 3.0

Objectivity positions graph database as relational complement to InfiniteGraph 3.0 Objectivity positions graph database as relational complement to InfiniteGraph 3.0 Analyst: Matt Aslett 1 Oct, 2012 Objectivity Inc has launched version 3.0 of its InfiniteGraph graph database, improving

More information

HadoopTM Analytics DDN

HadoopTM Analytics DDN DDN Solution Brief Accelerate> HadoopTM Analytics with the SFA Big Data Platform Organizations that need to extract value from all data can leverage the award winning SFA platform to really accelerate

More information

Understanding the Value of In-Memory in the IT Landscape

Understanding the Value of In-Memory in the IT Landscape February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

Application of OASIS Integrated Collaboration Object Model (ICOM) with Oracle Database 11g Semantic Technologies

Application of OASIS Integrated Collaboration Object Model (ICOM) with Oracle Database 11g Semantic Technologies Application of OASIS Integrated Collaboration Object Model (ICOM) with Oracle Database 11g Semantic Technologies Zhe Wu Ramesh Vasudevan Eric S. Chan Oracle Deirdre Lee, Laura Dragan DERI A Presentation

More information

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015 Mastering Big Data Steve Hoskin, VP and Chief Architect INFORMATICA MDM October 2015 Agenda About Big Data MDM and Big Data The Importance of Relationships Big Data Use Cases About Big Data Big Data is

More information

The Fusion of Supercomputing and Big Data. Peter Ungaro President & CEO

The Fusion of Supercomputing and Big Data. Peter Ungaro President & CEO The Fusion of Supercomputing and Big Data Peter Ungaro President & CEO The Supercomputing Company Supercomputing Big Data Because some great things never change One other thing that hasn t changed. Cray

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

Challenges and Solutions for Big Data in the Public Sector:

Challenges and Solutions for Big Data in the Public Sector: Challenges and Solutions for Big Data in the Public Sector: Digital Government Institute s Annual Big Data Conference, October 9, Washington, DC Reagan Building Dr. Brand Niemann Director and Senior Data

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

MarkLogic Enterprise Data Layer

MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer September 2011 September 2011 September 2011 Table of Contents Executive Summary... 3 An Enterprise Data

More information

Big Data Challenges and Opportunities

Big Data Challenges and Opportunities Big Data Challenges and Opportunities Ira A. (Gus) Hunt Chief Technology Officer Our Mission We are the nation's first line of defense. We accomplish what others cannot accomplish and go where others cannot

More information

TopQuadrant-Syngenta Webcast July 10, 2014 Semantic Data Virtualization: Extracting More Value from Data Silos

TopQuadrant-Syngenta Webcast July 10, 2014 Semantic Data Virtualization: Extracting More Value from Data Silos TopQuadrant-Syngenta Webcast July 10, 2014 Semantic Data Virtualization: Extracting More Value from Data Silos Featuring Syngenta's report on its successful pilot Webcast Agenda Overview of Problem and

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

Bigdata : Enabling the Semantic Web at Web Scale

Bigdata : Enabling the Semantic Web at Web Scale Bigdata : Enabling the Semantic Web at Web Scale Presentation outline What is big data? Bigdata Architecture Bigdata RDF Database Performance Roadmap What is big data? Big data is a new way of thinking

More information

Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My

Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My Presented by: Aaron Bossert, Cray Inc. Network Security Analytics, HPC Platforms, Hadoop, and Graphs Oh, My The Proverbial Needle In A Haystack Problem The Nuclear Option Problem Statement and Proposed

More information

Pentaho & MongoDB Partner to Solve Government Big Data Challenges

Pentaho & MongoDB Partner to Solve Government Big Data Challenges Pentaho & MongoDB Partner to Solve Government Big Data Challenges December 2013 Bob Gourley Publisher, CTOvision.com Will LaForest Director of Federal, MongoDB Dave Henry SVP Enterprise Solutions, Pentaho

More information

Big Data Processing: Past, Present and Future

Big Data Processing: Past, Present and Future Big Data Processing: Past, Present and Future Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM

More information

Introduction to Epinomy TEXT TABLES TRIPLES. Big Data Semantics

Introduction to Epinomy TEXT TABLES TRIPLES. Big Data Semantics Introduction to Epinomy TEXT TABLES TRIPLES Big Data Semantics The Promise of Big Data The application of big data in industrial settings is driving a productivity revolution. - Jeff Immelt, CEO/GE Companies

More information

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013 SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

BIG DATA THE NEW OPPORTUNITY

BIG DATA THE NEW OPPORTUNITY Feature Biswajit Mohapatra is an IBM Certified Consultant and a global integrated delivery leader for IBM s AMS business application modernization (BAM) practice. He is IBM India s competency head for

More information

Introduction to urika. Multithreading. urika Appliance. SPARQL Database. Use Cases

Introduction to urika. Multithreading. urika Appliance. SPARQL Database. Use Cases 1 Introduction to urika Multithreading urika Appliance SPARQL Database Use Cases 2 Gain business insight by discovering unknown relationships in big data Graph analytics warehouse supports ad hoc queries,

More information

SQL Server 2012 Performance White Paper

SQL Server 2012 Performance White Paper Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.

More information

BIG Data Analytics Move to Competitive Advantage

BIG Data Analytics Move to Competitive Advantage BIG Data Analytics Move to Competitive Advantage where is technology heading today Standardization Open Source Automation Scalability Cloud Computing Mobility Smartphones/ tablets Internet of Things Wireless

More information

From Distributed Computing to Distributed Artificial Intelligence

From Distributed Computing to Distributed Artificial Intelligence From Distributed Computing to Distributed Artificial Intelligence Dr. Christos Filippidis, NCSR Demokritos Dr. George Giannakopoulos, NCSR Demokritos Big Data and the Fourth Paradigm The two dominant paradigms

More information

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

With DDN Big Data Storage

With DDN Big Data Storage DDN Solution Brief Accelerate > ISR With DDN Big Data Storage The Way to Capture and Analyze the Growing Amount of Data Created by New Technologies 2012 DataDirect Networks. All Rights Reserved. The Big

More information

Big Data and the Data Lake. February 2015

Big Data and the Data Lake. February 2015 Big Data and the Data Lake February 2015 My Vision: Our Mission Data Intelligence is a broad term that describes the real, meaningful insights that can be extracted from your data truths that you can act

More information

Semantic Web Success Story

Semantic Web Success Story Semantic Web Success Story Practical Integration of Semantic Web Technology Chris Chaulk, Software Architect EMC Corporation 1 Who is this guy? Software Architect at EMC 12 years, Storage Management Software

More information

How To Make A Mobile Bridge Work For You

How To Make A Mobile Bridge Work For You MobileBridge ALLOWING BRANDS TO ENGAGE EXISTING AND POTENTIAL NEW AUDIENCES CUSTOMER SUCCESS STORY MobileBridge used Clustrix to grow beyond MySQL on its high-end AWS instance, which was struggling with

More information

Designing a Cloud Storage System

Designing a Cloud Storage System Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes

More information

WINDOWS AZURE DATA MANAGEMENT

WINDOWS AZURE DATA MANAGEMENT David Chappell October 2012 WINDOWS AZURE DATA MANAGEMENT CHOOSING THE RIGHT TECHNOLOGY Sponsored by Microsoft Corporation Copyright 2012 Chappell & Associates Contents Windows Azure Data Management: A

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

Emerging Geospatial Trends The Convergence of Technologies. Jim Steiner Vice President, Product Management

Emerging Geospatial Trends The Convergence of Technologies. Jim Steiner Vice President, Product Management Emerging Geospatial Trends The Convergence of Technologies Jim Steiner Vice President, Product Management United Nation Analysis Initiative on Global GeoSpatial Information Management Future Trends Technology

More information

IBM PureData System for Operational Analytics

IBM PureData System for Operational Analytics IBM PureData System for Operational Analytics An integrated, high-performance data system for operational analytics Highlights Provides an integrated, optimized, ready-to-use system with built-in expertise

More information

Improve Cooperation in R&D. Catalyze Drug Repositioning. Optimize Clinical Trials. Respect Information Governance and Security

Improve Cooperation in R&D. Catalyze Drug Repositioning. Optimize Clinical Trials. Respect Information Governance and Security SINEQUA FOR LIFE SCIENCES DRIVE INNOVATION. ACCELERATE RESEARCH. SHORTEN TIME-TO-MARKET. 6 Ways to Leverage Big Data Search & Content Analytics for a Pharmaceutical Company Improve Cooperation in R&D Catalyze

More information

Big Data. George O. Strawn NITRD

Big Data. George O. Strawn NITRD Big Data George O. Strawn NITRD Caveat auditor The opinions expressed in this talk are those of the speaker, not the U.S. government Outline What is Big Data? NITRD's Big Data Research Initiative Big Data

More information

Big Data Web Analytics Platform on AWS for Yottaa

Big Data Web Analytics Platform on AWS for Yottaa Big Data Web Analytics Platform on AWS for Yottaa Background Yottaa is a young, innovative company, providing a website acceleration platform to optimize Web and mobile applications and maximize user experience,

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

The 4 Pillars of Technosoft's Big Data Practice

beyond possible. Big Use: End-user applications. Big Analytics: Visualisation tools. Big Analytical tools. Big management systems. Overview: Businesses have long managed

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Sneha D. Borkar 1, Prof. Chaitali S. Surtakar 2. Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com. Assistant Professor, Information

Autonomy Consolidated Archive

Dennis Wild, Director SME, Information Governance and Archiving. POWER PROTECT PROMOTE. Meaning-Based Governance: Files, IM, Audio, Email, Social, Video, SharePoint. Archiving = Gain

Big Data and Natural Language: Extracting Insight From Text

An Oracle White Paper, October 2012. Table of Contents: Executive Overview; Introduction; Oracle Big Data Appliance; Synthesys

Data Wrangling: From the Wild to the Lake

Ignacio Terrizzano, Peter Schwarz, Mary Roth, John Colino, IBM Research - Almaden. 48 hours of video is uploaded to YouTube every minute. Walmart processes million transactions

Big Data Analytics Best Practices

Marshall Presser, Federal Field CTO, Greenplum. Big Data Makes the Mainstream. WHAT DOES IT TAKE? 1. New Applications: MADlib. 2. New Skill Sets -- Data Science

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Table of Contents: Summary; Data Virtualization & the Logical Data Warehouse; What is a Logical Data Warehouse?

Partner Camp 2016. High-Performance Log Management for Physical, Virtual, and Cloud-Based Environments. Tomas Baublys 25.04.

vRealize Log Insight. Tomas Baublys, 25.04.2016. 2014 VMware Inc.

White Paper SAP HANA for Scientific Computing in Aerospace & Defense

Aerospace & Defense Sector. We make it happen. Better. May 2013. Aerospace and Defense Capability: Better Analytics. Contents: Objective

Six Days in the Network Security Trenches at SC14. A Cray Graph Analytics Case Study

WP-NetworkSecurity-0315 www.cray.com Table of Contents: Introduction; Analytics Mission and Source Data; Analytics

LinkZoo: A linked data platform for collaborative management of heterogeneous resources

Marios Meimaris, George Alexiou, George Papastefanatos, Institute for the Management of Information Systems, Research

Make the Most of Big Data to Drive Innovation Through Research

White Paper. Bob Burwell, NetApp, November 2012, WP-7172. Abstract: Monumental data growth is a fact of life in research universities. The ability

White Paper: Evaluating Big Data Analytical Capabilities For Government Use

CTOlabs.com, March 2012. A White Paper providing context and guidance you can use. Inside: The Big Data Tool Landscape; Big Data

CitusDB Architecture for Real-Time Big Data

CitusDB Highlights: Empowers real-time Big Data using PostgreSQL. Scales out PostgreSQL to support up to hundreds of terabytes of data. Fast parallel processing

Cloud Computing for Research Roger Barga Cloud Computing Futures, Microsoft Research

Trends: Data on an Exponential Scale. Scientific data doubles every year. Combination of inexpensive sensors + exponentially

Parallel Data Warehouse

MICROSOFT'S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE. Stefan Cronjaeger, Microsoft, May 2013. AGENDA: PDW overview; Columnstore and Big Data; Business Intelligence; Project Ability

Primary Key Associates Limited

Primary Key Associates Limited. [...] is at the core of Primary Key Associates' work. Our approach to analytics: in this paper Andrew Lea, our Technical Director in charge of [...], describes some of the paradigms, models, and techniques we have developed

The Fusion of Supercomputing and Big Data: The Role of Global Memory Architectures in Future Large Scale Data Analytics

HPC 2014, High Performance Computing: From Clouds and Big Data to Exascale and Beyond. An International Advanced Workshop, July 7-11, 2014, Cetraro, Italy. Session III: Emerging Systems and Solutions. The Fusion

The various steps in the solution approach are presented below.

From Web 1.0 to 3.0: Is RDF access to RDB enough? Vipul Kashyap, Senior Medical Informatician, Partners Healthcare System, vkashyap1@partners.org. Martin Flanagan, CTO, InSilico Discovery, mflanagan@insilicodiscovery.com

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Turban, Aronson, and Liang. Decision Support Systems and Intelligent Systems, Seventh Edition.

Next. Beyond Big Data: Riding the Technology Wave. Ira A. (Gus) Hunt Chief Technology Officer

Profound Change is under way. Social, Mobile, Cloud. Altered the Flow of Information. Nano, Bio, Sensors

INTRODUCTION TO CASSANDRA

This ebook provides a high-level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open

VIEWPOINT. High Performance Analytics. Industry Context and Trends

In the digital age of social media and connected devices, enterprises have a plethora of data that they can mine, to discover hidden correlations

Oracle Big Data Handbook

Oracle Press. Tom Plunkett, Brian Macdonald, Bruce Nelson, Helen Sun, Khader Mohiuddin, Debra L. Harding, David Segleau, Gokula Mishra, Mark F. Hornick, Robert Stackowiak, Keith Laker

How To Build A Cloud Storage System

Reference Architectures for Digital Libraries. Keith Rajecki, Education Solutions Architect, Sun Microsystems, Inc. Agenda: Challenges; Digital Library Solution Architectures > Open Storage/Open Archive >

YarcData urika Technical White Paper

2012 Cray Inc. All rights reserved. Specifications subject to change without notice. Cray is a registered trademark, YarcData, urika and Threadstorm are trademarks

Introduction to Epinomy. Big Data Semantics

The Promise and Challenge of Big Data. "The application of big data in industrial settings is driving a productivity revolution." - Jeff Immelt, CEO/GE. Companies

BIG DATA-AS-A-SERVICE

White Paper. What Big Data is about. What service providers can do with Big Data. What EMC can do to help. EMC Solutions Group. Abstract: This white paper looks at what service providers

How Do We Handle Challenges in the Area of Big Data? Jan Östling Enterprise Technologies Intel Corporation, NER

Legal Disclaimers: All products, computer systems, dates, and figures specified are preliminary

PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP

Your business is swimming in data, and your business analysts want to use it to answer the questions of today and tomorrow. YOU LOOK TO

Using Tableau Software with Hortonworks Data Platform

September 2013. 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

Leveraging Big Data Technologies to Support Research in Unstructured Data Analytics

BY FRANÇOYS LABONTÉ, GENERAL MANAGER, JUNE 16, 2015. Main financial partner. WWW.CRIM.CA. ABOUT CRIM: Applied research

Department of Defense. Enterprise Information Warehouse/Web (EIW) Using standards to Federate and Integrate Domains at DOD

Department of Defense Human Resources - Enterprise Information Warehouse/Web (EIW). Federation Defined: Members of a federation agree to certain standards

How To Use Hp Vertica Ondemand

Data sheet: HP Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization. Vertica OnDemand: Organizations today are experiencing a greater

Big Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases

Introduction: The world is awash in data and turning that data into actionable

Big Data Trends A Basis for Personalized Medicine

Dr. Hellmuth Broda, Principal Technology Architect. eMedication: Prescription, Support Processes & Logistics, June 5, 2013, Inselspital Bern. Over 150,000 Employees

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Gokula Mishra, Premjith Balakrishnan, Business Analytics Product Group, September 29, 2014. Copyright 2014, Oracle and/or its affiliates. All

Successfully Deploying Alternative Storage Architectures for Hadoop Gus Horn Iyer Venkatesan NetApp

Agenda: Hadoop and storage; Alternative storage architecture for Hadoop; Use cases and customer examples

HDP Hadoop From concept to deployment.

Ankur Gupta, Senior Solutions Engineer, Rackspace: Page 41. 27th Jan 2015. Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
