ON DEMAND ACCESS TO BIG DATA. Peter Haase fluid Operations AG



Similar documents
Federated Query Processing over Linked Data

Linked Statistical Data Analysis

BIG DATA AGGREGATOR STASINOS KONSTANTOPOULOS NCSR DEMOKRITOS, GREECE. Big Data Europe

LinkZoo: A linked data platform for collaborative management of heterogeneous resources

RDF Dataset Management Framework for Data.go.th

XpoLog Center Suite Data Sheet

DataOps: Seamless End-to-end Anything-to-RDF Data Integration

How To Build A Cloud Based Intelligence System

Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud: Active Hiring a Use Case Study

Amit Sheth & Ajith Ranabahu, Presented by Mohammad Hossein Danesh

Scalable End-User Access to Big Data HELLENIC REPUBLIC National and Kapodistrian University of Athens

SAP BusinessObjects BI Clients

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Data Publishing with DaPaaS

LDIF - Linked Data Integration Framework

Open Data collection using mobile phones based on CKAN platform

TopBraid Insight for Life Sciences

Enabling Digitization with Next Generation Cloud

Big Data Analytics Nokia

Semantic SharePoint. Technical Briefing. Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company

Scope. Cognescent SBI Semantic Business Intelligence

Fraunhofer FOKUS. Fraunhofer Institute for Open Communication Systems Kaiserin-Augusta-Allee Berlin, Germany.

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

BUSINESS VALUE OF SEMANTIC TECHNOLOGY

Intelligence. Productivity. Mobility. Unified Service. Predictive analytics: Offline mobile: Self, assisted & field service

Semantic Interoperability

Linked Open Data Infrastructure for Public Sector Information: Example from Serbia

Principles and Foundations of Web Services: An Holistic View (Technologies, Business Drivers, Models, Architectures and Standards)

Portal Version 1 - User Manual

A Generic Platform for Enterprise Gamification

Lightweight Data Integration using the WebComposition Data Grid Service

Successful Platform-as-a-Service Requires a Supporting Ecosystem for HR Applications

RSA Via Lifecycle and Governance 101. Getting Started with a Solid Foundation

Landscape as a Service as Enhancement to Infrastructure as a Service

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

MarkLogic Enterprise Data Layer

INTRAFOCUS. DATA VISUALISATION An Intrafocus Guide

ARIS 9 Highlights and Outlook

DISCOVERING RESUME INFORMATION USING LINKED DATA

Data collection architecture for Big Data

Visual Analysis of Statistical Data on Maps using Linked Open Data

Independent process platform

Build. an Amazon-like experience for Cloud Services. Key Challenges. you click it. you see it. you got it. October

Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco

Semantic Technologies for Enterprise Cloud Management

Platforming Open Source

FUJITSU Software Interstage Business Operations Platform: A Foundation for Smart Process Applications

GS Big Data Platform

Microsoft Office SharePoint Server (MOSS) 2007 Overview

JBoss Enterprise Middleware

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Big Data and Semantic Web in Manufacturing. Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India

ARIS 9ARIS 9.6 map and Future Directions Die nächste Generation des Geschäftsprozessmanagements

Guiding SOA Evolution through Governance From SOA 101 to Virtualization to Cloud Computing

Industry 4.0 and Big Data

SERVICE-ORIENTED MODELING FRAMEWORK (SOMF ) SERVICE-ORIENTED SOFTWARE ARCHITECTURE MODEL LANGUAGE SPECIFICATIONS

XpoLog Competitive Comparison Sheet

An Enhanced Visualization Service based on Geospatial and Statistical Linked Open Data

Copyright 2014, Oracle and/or its affiliates. All rights reserved.

SAP SE - Legal Requirements and Requirements

Comparison of Enterprise Reporting Tools

Introduction to TIBCO MDM

Empower Individuals and Teams with Agile Data Visualizations in the Cloud

Understanding the SAP BI Strategy

How To Make Money From Cloud Computing

Armanino McKenna LLP Welcomes You To Today s Webinar:

Structured Content: the Key to Agile. Web Experience Management. Introduction

Oracle Reference Architecture and Oracle Cloud

Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint

Cloud application services (SaaS) Multi-Tenant Data Architecture Shailesh Paliwal Infosys Technologies Limited

Serendipity a platform to discover and visualize Open OER Data from OpenCourseWare repositories Abstract Keywords Introduction

Taking control of the virtual image lifecycle process

Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery

SAP Business Objects BO BI 4.1

CASRAI, eurocris, Lattes, and VIVO: Four Perspectives on Research Information Standards

Big Data for Investment Research Management

IAAS CLOUD EXCHANGE WHITEPAPER

Open Data Integration Using SPARQL and SPIN

Portal for ArcGIS. Satish Sankaran Robert Kircher

Service Oriented Architecture

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data

The Concept of Big Data Reference Model

On-demand Provisioning of Workflow Middleware and Services An Overview

Ignite Your Creative Ideas with Fast and Engaging Data Discovery

Transcription:

ON DEMAND ACCESS TO BIG DATA THROUGHSEMANTIC TECHNOLOGIES Peter Haase fluid Operations AG

fluid Operations (fluidops) Linked Data & SemanticTechnologies Enterprise Cloud Computing Software company founded Q1/2008 by team of serial entrepreneurs, privately held, VC funded Headquarters in Walldorf / Germany, SAP Partner Port Currently 45 employees Named Cool Vendor by Gartner Mar 2010 Global reseller agreement with EMC focus large enterprise customers Apr 2010 NetApp Advantage Alliance Partner Oct 2010

Outline Big Data Challenges: Beyond Volume Semantic Technologies for Big Data Challenges On Demand Data Access in a Self service Process

Big Data Big data consists of datasets that grow so large that they become awkward to work with using on hand database management tools. (Wikipedia) 12 terabytes of Tweets created daily 30 terabytes of telescope data each night 350 billion meter readings...

Optique Case Study: Statoil Exploration Experts in geology and geophysics develop stratigraphic models of unexplored areas. Based on production and exploration data from nearby locations Analytics on: 1,000 TB of relational data using diverse schemata spread over 2,000 tables spread over multiple individual data bases 900 experts in Statoil Exploration up to 4 days for new data access queries assistance from IT experts required

LOD as an Example of Horizontal Big Data LOD as an Example of Horizontal Big Data

Semantic Technologies for Horizontal Big Data Linked Data Set of standards, principles for publishing, sharing and interrelating structured data: RDF as data model, SPARQL for querying Graph based data model for achieving higher degree of variety Semantically interlink data scattered among different information spaces: from data silos to a Web of Data Ontologies For describing the semantics of the data As conceptual models for end user oriented access For the integration of heterogeneoussources For (light weight) reasoning 8

On Demand Access to Big Data Enabling on demand data access 1. discovery of relevant data sources 2. automated integration and interlinking of sources, and 3. interactive ti exploration and ad hoc analysisof dt data => Linked Data as a Service

Everything as a Service Abstract from physical implementation details and location of resources Regardless of geographic or organizational separation of provider and consumer In the cloud Data as a Service Web based Virtualized Software as a Service On demand Self service Platform as a Service Scalable Pay as you go Infrastructure as a Service

Linked Data as a Service Like all members of the "as a Service family, DaaS is based on the concept that the product, data in this case, can be provided on demand to the user regardless of geographic or organizational separation of provider and consumer. Source: Wikipedia Data virtualization supported by Linked Data principles 1. Use URIs as names for things 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards: RDF, SPARQL 4. Include links to other URIs, to discover more things. Linked Data as abstraction layer for virtualized data access across data spaces Enables data portability across current data silos Platform independent data access Basis for enabling automation of discovery, composition, ii and use of datasets 11

Information Workbench Linked Data Platform Information ato Workbench: Semantics & Linked Data based integration of private and public data sources Intelligent Data Access and Analytics Visual Exploration Semantic Search Dashboarding and Reporting Semantic Web Data Collaboration and knowledge management platform Wiki based curation & authoring of data Collaborative workflows 12

Enabling Data Discovery: Metadata about Data Sets Metadata about data sources essential for dynamic discovery Based on metadata vocabularies (VoID, DCAT) Access to data registered at global registries, e.g. ckan.org, data.gov, Sort/filter data sets by topic, license, size and many more facets to identify relevant data Visually explore data sets

FedX Federated Query Processing 1) Involveonlyrelevant only sources inthe evaluation Problem: Subqueries are sent to all sources, although potentially irrelevant 2) Compute joins close to the data Problem: All joins are executed locally in a nested loop fashion 3) Reduce remote communication Problem: Nested loop joincauses manyremote requests 14

Enabling Data Composition: Fd Federation of Virtualized Data Sources Application Layer Virtualization Layer Data Layer SPARQL Endpoint SPARQL SPARQL SPARQL Endpoint Endpoint Endpoint Data Source Data Source Data Source Data Source Metadata Registry See also: FedX: Optimization Techniques for Federated Query Processing on Linked Data (ISWC2011)

Enabling On Demand Use: Self service Linked Data Frontend Semantic Wiki as user frontend Declarative specification of the UI based onavailable poolofwidgetsof and declarative wiki based syntax Widgets have direct access to the DB Type based template mechanism Ad hocdata exploration, visualization, analytics, dashboards,... Wiki Page in Edit Mode and Displayed Result Page

Information Workbench Linked Data as a Service Application Areas Knowledge Management in the Life Sciences Digital Libraries, Media and Content Management Intelligent Data Center Management

Example: Linked Data in Pharma Search, Interrogate and Reason Visualize, Analyze and Explore Capture and Augment Knowledge Integrated data graph over all data sources Integ Main Use Cases Integratedatafrom company internal data silos Augment companyinternal data with Linked Open Data Collaborative knowledge management Support of internal processes (drug development) Private Data Sources Public Data Sources

Example: Data Center Management Support collaborative operations management in the data center Link business data to technical data Technical Documentation Analytics and Reporting Performance and Capacity Monitoring Responsibility Management Resource Management Change Management Technical Ticketing System 19

Example: A Cloud Portal for Access to Open Data with the Information Workbench Goal Collect meta data from global data markets (LOD Cloud, WorldBank, CKAN, ) Allow integrated search and ad hoc integration of data sources from different repositories Linkdata with private/internaldata sources, ifdesired Support semi automated linking between data sets Provide visualization, exploration, and analytics functionality on top of integrated data sources... using the fluid Operations Technology Stack Realization Currently running project with the Hasso Plattner Institute (Potsdam, Germany) Create local repository containing data market metadata Use self service technology to make services publicly available lbl + Information Workbench for analytics

Information Workbench: Linked Dt Data as a Service in a Cloud Platform Architecture t Application Layer (SaaS) Pr rovisioning, Mon nitoring and Ma anagement Netw. Att. Storage Virtualization Layer Infrastructure Layer (IaaS) Network Computing Resources Enterprise DataSources Data Layer (DaaS) Open DataSources

Provis ioning, Monitori ing and Manage ement Application Layer (SaaS) Virtualization Layer Infrastructure Layer (IaaS) Data Layer (DaaS) Netw. Att. Storage Network Computing Resources Enterprise Data Sources Open Data Sources Self service Deployment Data Discovery Data Integration & Federation Self service UI & Analytics Self service deployment of the Information Workbench in the cloud Pay per use Scalability on demand On demand access to private and public data sources Dynamic Discovery Virtualized data access Dynamic integration & federation of data sources Living UI, composed from semantics aware widgets Adhoc data exploration, visualization, analytics

Summary Big Data means more than volume and vertical scale Semantic Technologies for Big Data management Linked Data as adequate data model Ontologies as conceptual models to access big data Integration of diverse, heterogeneus data sources Linked Data as a Service for enabling on demand data access 1. discovery of relevant data sources 2. automated integration and interlinking of sources, and 3. interactive exploration and ad hoc analysis of data

CONTACT: fluid Operations Altrottstr. 31 Walldorf, Germany Email: peter.haase@fluidops.com website: www.fluidops.com Tel.: +49 6227 3846 527