A Big Picture for Big Data



Similar documents
Agile Analytics on Extreme-Size Earth Science Data

Agile Retrieval of Big Data with. EarthServer. ECMWF Visualization Week, Reading, 2015-sep-29

Handling Heterogeneous EO Datasets via the Web Coverage Processing Service

Standard Big Data Architecture and Infrastructure

WCS as a Download Service for Big (and Small) Data

On the Efficient Evaluation of Array Joins

Standardization Requirements Analysis on Big Data in Public Sector based on Potential Business Models

Databases & Web Applications Lab Big Data Project A

Adding Big Earth Data Analytics to GEOSS

RDA PROPOSAL FOR Array Database Working Group (AD-WG) Peter Baumann, Jacobs University

The InterNational Committee for Information Technology Standards INCITS Big Data

ISO/IEC JTC 1 Information technology. Big data

ISO/IEC JTC 1/WG 10 Working Group on Internet of Things. Sangkeun YOO, Convenor

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Geospatial Platforms For Enabling Workflows

Information Services for Smart Grids

NIST Big Data PWG & RDA Big Data Infrastructure WG: Implementation Strategy: Best Practice Guideline for Big Data Application Development

ISO/IEC JTC1 SC32. Next Generation Analytics Study Group

LDIF - Linked Data Integration Framework

XML for Manufacturing Systems Integration

Smart Cities require Geospatial Data Providing services to citizens, enterprises, visitors...

The Ontological Approach for SIEM Data Repository

NoSQL storage and management of geospatial data with emphasis on serving geospatial data using standard geospatial web services

M2M Communications and Internet of Things for Smart Cities. Soumya Kanti Datta Mobile Communications Dept.

Geospatial Technology Innovations and Convergence

Geospatial Platforms For Enabling Workflows

Databases for 3D Data Management: From Point Cloud to City Model

Big Data and Semantic Web in Manufacturing. Nitesh Khilwani, PhD Chief Engineer, Samsung Research Institute Noida, India

Cloud and Big Data Standardisation

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Cloud Computing Standards: Overview and ITU-T positioning

Standards for Big Data in the Cloud

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Big Data Systems and Interoperability

Big Data Standardisation in Industry and Research

Data interchange between Web client based task controllers and management information systems using ISO and OGC standards

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Introduction to Service Oriented Architectures (SOA)

Industry 4.0 and Big Data

How To Build A Cloud Based Intelligence System

Introduction to Geospatial Web Services

Department of Defense. Enterprise Information Warehouse/Web (EIW) Using standards to Federate and Integrate Domains at DOD

Standardised SLAs: how far can we go? DIHC, Euro-Par 2013, Aachan John Kennedy Intel Labs Europe

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

Where is... How do I get to...

Comparative Analysis of SOA and Cloud Computing Architectures using Fact Based Modeling

Standards for Identity & Authentication. Catherine J. Tilton 17 September 2014

Achieving Business Value through Big Data Analytics Philip Russom

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Towards an On board Personal Data Mining Framework For P4 Medicine

GetLOD - Linked Open Data and Spatial Data Infrastructures

UK Location Programme

W3C Meeting ISO/IEC/IEEE P

Leveraging Big Data Technologies to Support Research in Unstructured Data Analytics

EL Program: Smart Manufacturing Systems Design and Analysis

Applying Semantic Web Technologies in Service-Oriented Architectures

ISO JTC 1 SGBD Mtg and ACM Workshop

ISO and OGC Service Architecture

Secure and Semantic Web of Automation

Big Data Volume & velocity data management with ERDAS APOLLO. Alain Kabamba Hexagon Geospatial

City Data Pipeline. A System for Making Open Data Useful for Cities. stefan.bischof@tuwien.ac.at

Business Performance Management Standards

DISCOVERING RESUME INFORMATION USING LINKED DATA

The use of Semantic Web Technologies in Spatial Decision Support Systems

Oracle Big Data SQL Technical Update

The Nordic way to International standardization ISO/TC 211

INPE Spatial Data Infrastructure for Big Spatiotemporal Data Sets. Karine Reis Ferreira (INPE-Brazil)

Application of ontologies for the integration of network monitoring platforms

Core Enterprise Services, SOA, and Semantic Technologies: Supporting Semantic Interoperability

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Siemens Future HANNOVER MESSE Internet of Things and Services Guido Stephan

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management

Transcription:

Supported by EU FP7 SCIDIP-ES, EU FP7 EarthServer A Big Picture for Big Data FOSS4G-Europe, Bremen, 2014-07-15 Peter Baumann Jacobs University rasdaman GmbH p.baumann@jacobs-university.de

Our Stds Involvement ISO: member, SC32 / WG3 SQL; SC32 Big Data Study Group; OGC liaison, TC211 Open Geospatial Consortium: co-chair, BigData.DWG, WCS.SWG, Coverages.DWG; co-founder, Temporal.DWG Charter Member, OSGeo Research Data Alliance: co-chair, Big Data Interest Group and Geospatial Interest Group member, ERCIM Expert Group Big Data member, Belmont Forum, WP 3 Harmonization of global environmental data infrastructure council member, CGI / IUGS founding member and secretary, CODATA Germany...

Big Data Market Forecast [ISO/IEC JTC1 SC32 N0065]

Oracle Reference Architecture [oracle.com]

Think Big Analytics RA [thinkbiganalytics.com]

Ulitzer New Enterprise Reference Architecture [ulitzer.com]

Apache Big Data Framework [http://semanticommunity.info]

NIST Big Data Reference Architecture* [DRAFT NIST Big Data Interoperability Framework: Vol. 6, Reference Architecture, April 23, 2014]

Reference Architectures? Static Mixing concept, architecture, interfaces, tools often 3-tier variants what s new?

My Reference Architecture ;-)...for Sensor, Image, Simulation, Statistics Data = spatio-temporal data cubes machine 2 machine [src: OGC & own]

A Data Structure Centric View Stock trading: 1-D sequences (i.e., arrays) Social networks: large, homogeneous graphs Ontologies: small, heterogeneous graphs Climate modelling: 4D/5D arrays Satellite imagery: 2D/3D arrays (+irregularity) Genome: long string arrays Particle physics: sets of events analyzing domain standards (like OGC WCS) can help Bio taxonomies: hierarchies (such as XML) Documents: key/value stores: sets of unique identifiers + whatever etc.

Embracing Variety Observation: (Big) Data means many things, at least: Heterogeneous data types Heterogeneous ways to access (download, extract, analyze, mix,...) Proposition: Query languages as common c/s interfaces Sets: SQL Arrays: SQL/MDA Graphs: SparQL, graph QLs Hierarchies: XQuery/XPath + glue

Sample Domain-Specific Queries Arrays: data cube paradigm SELECT la.scene.red la.scene.nir FROM LandsatArchive as la WHERE avg_cells( la.green > 17 ) Hierarchies: XPath /book/chapter[ @title = Introduction ] Graphs: Graph QLS (CYPHER): MATCH (n:crew)-[r:knows*]-m WHERE n.name='neo' RETURN n AS Neo,r,m SPARQL: SELECT?name?email WHERE {?person rdf:type foaf:person.?person foaf:name?name.?person foaf:mbox?email. } 14

Sample Integrated Queries Arrays + sets (SQL) select encode( scene.nir, image/tiff ) from LandsatScenes where acquired between 1990-06-01 and 1990-06-30 Arrays + hierarchies (XQuery) for $c in doc( WCPS )//coverage/[ some( $c.nir > $c.red )] return <id> { $c/@id } </id> <area> { $c/boundedby } </area> 15

Major Consortia for Big Data Standardization ISO/IEC JTC 1/SC 32: Data management & interchange SC32 Big Data Analytics Study Group ITU-T SG13: Cloud computing for Big Data W3C: Web & Semantic related standards for markup, structure, query, semantics, & interchange Open Geospatial Consortium (OGC): Geospatial related standards for the specification, structure, query, and processing of location related data BigData.DWG, http://external.opengeospatial.org/twiki_public/bigdatadwg/ Organization for the Advancement of Structured Information Standards (OASIS): Information access and exchange. structured data content; security, Cloud computing, SOA, Web services, the Smart Grid, electronic publishing, emergency management,...

Major Consortia for Big Data Standardization /2 Research Data Alliance (RDA) Big Data Analytics Interest Group: use case based Geospatial Interest Group Transaction Processing Performance Council Benchmarks for Big Data Systems; Specification of TPC Express, BenchmarkTM for Hadoop TM Forum Enable enterprises, service providers and suppliers continuously transform to succeed in the digital economy Share experiences to solve critical business challenges including IT transformation, business process optimization, big data analytics, cloud management, and cyber security.

Potential Gaps in Standardization Big Data use cases, definitions, vocabulary, reference architectures Specifications & standardization of metadata including provenance Application models (batch, streaming, etc.) Query languages for diverse data types (XML, RDF, JSON, multimedia, etc.) and Big Data operations (e.g. matrix operations) Domain-specific languages Semantics of eventual consistency efficient network protocols ontologies & taxonomies Security, privacy access controls Analytics, resource discovery, mining Data sharing & exchange Data storage human consumption (visualization!) Energy measurement for Big Data [ISO/IEC JTC1 SC32 N0065]

Conclusion Query language concept successful......if & when data model adequate Query language as general c/s paradigm Client can hide behind visual interface Server can optimize Agnostic of low-level protocol: KVP, SOAP, REST,... More options for future Security, uncertainty, etc. Open research: next-gen mediators as glue Issues: model integration, cross-model optimization,...