Agile Analytics on Extreme-Size Earth Science Data




Agile Analytics on Extreme-Size Earth Science Data
COPERNICUS Big Data Workshop, Brussels / BE, 2014-mar-14
Peter Baumann, Jacobs University, rasdaman GmbH
p.baumann@jacobs-university.de

What are the Big Data in Geo?
  Roughly: spatio-temporal sensor, image, simulation, and statistics data
  Often n-d raster data = arrays
  "Massive arrays & graphs: next great challenges" [major DBMS vendors]
  [img: OGC & own]

EarthServer: Big Earth Data Analytics
  Scalable On-Demand Processing for the Earth Sciences
  EU FP7-INFRA, Sep 2011 to Aug 2014, ~6 MEUR
  Platform: pioneer Array Database technology, rasdaman
  Integrated filtering & processing on metadata, regular/irregular grids, point clouds, ...
  11 partners (3 SMEs): CP-CSA
  3 activity pillars: RTD, Services, Networking

Service Activity
  6 Lighthouse Applications covering all Earth Sciences
  Established data centers adding technology to portfolio
  Summer 2014: ~300 TB operational Earth & Planetary science data

RTD Activity: Overview
  Big Geo Data engine development
  Based on rasdaman Array Database
  Strictly open standards (OGC WCS, WMS, WCPS)
  Regular & irregular grids, point clouds, meshes
  Coupling: Hadoop, R, MATLAB, MapServer, ...
  Data/metadata search integration
  Scalability: distributed processing
  Visual 1D/2D/3D client toolkit

rasdaman: Agile Array Analytics
  "raster data manager": SQL + n-d raster objects

    select img.green[x0:x1,y0:y1] > 130
    from   LandsatArchive as img
    where  avg_cells( img.nir ) < 17

  Scalable parallel "tile streaming" architecture
  In operational use for many years
  OGC WCS Core Reference Implementation
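To make the semantics of the query on this slide concrete, here is a minimal pure-Python sketch of what it computes. It is not how rasdaman evaluates the query; the data layout (`Image` as a dict of band grids), the function `green_mask`, and the sample archive are all hypothetical, and the inclusive treatment of the subset bounds `x0:x1` is an assumption.

```python
from typing import Dict, List

Band = List[List[float]]   # band[x][y], a 2-D grid of cell values
Image = Dict[str, Band]    # band name -> grid, e.g. {"green": ..., "nir": ...}


def avg_cells(band: Band) -> float:
    """Mean over all cells of a band (plays the role of avg_cells in the query)."""
    cells = [v for row in band for v in row]
    return sum(cells) / len(cells)


def green_mask(archive: List[Image], x0: int, x1: int, y0: int, y1: int,
               green_min: float = 130, nir_max: float = 17) -> List[List[List[bool]]]:
    """For every image whose mean NIR is below nir_max, cut out the window
    green[x0:x1, y0:y1] and compare each cell against green_min."""
    results = []
    for img in archive:
        if avg_cells(img["nir"]) < nir_max:            # the "where" clause
            # Assumption: subset bounds are inclusive on both ends.
            window = [row[y0:y1 + 1] for row in img["green"][x0:x1 + 1]]
            results.append([[v > green_min for v in row] for row in window])
    return results


# Hypothetical two-image archive.
archive = [
    {"green": [[120, 140], [135, 100]], "nir": [[10, 12], [14, 16]]},  # mean NIR 13: kept
    {"green": [[200, 200], [200, 200]], "nir": [[30, 30], [30, 30]]},  # mean NIR 30: filtered out
]
masks = green_mask(archive, 0, 1, 0, 1)
print(masks)
```

The point of the declarative form on the slide is that the loop, the windowing, and the per-cell comparison are left to the database, which is then free to parallelize them.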

Tiling: Tuning Data for Applications
  Tiling strategies as service tuning [Furtado]: regular, directional, area of interest
  Chunks [Sarawagi, DeWitt, ...]
  rasdaman storage layout language:

    insert into MyCollection values ...
    tiling area of interest [0:20,0:40], [45:80,80:85]
    tile size 1000000
    index d_index
    storage array compression zlib
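Why tile shape counts as service tuning can be illustrated with a small sketch that counts how many tiles, and how many cells, a query has to read under two layouts. This is a toy model under stated assumptions, not rasdaman's storage manager: the helper names, the 100 x 100 array, and the query box are all hypothetical.

```python
from itertools import product
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x_lo, x_hi, y_lo, y_hi), bounds inclusive


def regular_tiles(width: int, height: int, tile_w: int, tile_h: int) -> List[Box]:
    """Cut a width x height array into a grid of tiles of at most tile_w x tile_h cells."""
    tiles = []
    for x, y in product(range(0, width, tile_w), range(0, height, tile_h)):
        tiles.append((x, min(x + tile_w, width) - 1, y, min(y + tile_h, height) - 1))
    return tiles


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one cell."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def cells(box: Box) -> int:
    """Number of cells in a box."""
    return (box[1] - box[0] + 1) * (box[3] - box[2] + 1)


def read_cost(tiles: List[Box], query: Box) -> Tuple[int, int]:
    """(tiles touched, cells read) when whole tiles must be fetched for a query."""
    hit = [t for t in tiles if overlaps(t, query)]
    return len(hit), sum(cells(t) for t in hit)


query = (0, 99, 40, 44)                          # long in x, thin in y: 500 cells wanted
square = regular_tiles(100, 100, 10, 10)         # regular tiling: 100 square tiles
directional = regular_tiles(100, 100, 100, 5)    # directional tiling: 20 strips along x

print(read_cost(square, query))        # tiles touched and cells read, square layout
print(read_cost(directional, query))   # same query, strip layout
```

With these numbers the square layout touches ten tiles and reads twice as many cells as requested, while the strip layout answers the same query from a single tile. Which layout wins depends entirely on the access pattern, which is why the slide treats tiling as a per-application tuning knob.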

rasdaman Federation Scenarios

    select max((A.nir - A.red) / (A.nir + A.red))
         - max((B.nir - B.red) / (B.nir + B.red))
    from   A, B

  [diagram: federated datasets A, B, C, D]
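The federated query above compares the peak vegetation index of two datasets. A minimal sketch of one plausible evaluation strategy follows: each site reduces its own dataset to a single number, and only those two scalars are combined. The slide does not state how rasdaman splits the work, so this split is an assumption, and the dataset layout and sample values are hypothetical.

```python
from typing import Dict, List

Dataset = Dict[str, List[float]]   # band name -> flat list of cell values


def max_ndvi(ds: Dataset) -> float:
    """Largest (nir - red) / (nir + red) over all cells of one dataset.
    In a federation this reduction could run at the site holding the data."""
    return max((n - r) / (n + r) for n, r in zip(ds["nir"], ds["red"]))


def federated_difference(a: Dataset, b: Dataset) -> float:
    """Combine the two per-site maxima; only two scalars need to travel."""
    return max_ndvi(a) - max_ndvi(b)


# Hypothetical sample data for datasets A and B.
A = {"nir": [60.0, 80.0], "red": [20.0, 20.0]}   # per-cell index: 0.5, 0.6
B = {"nir": [50.0, 30.0], "red": [50.0, 10.0]}   # per-cell index: 0.0, 0.5

diff = federated_difference(A, B)
print(round(diff, 3))
```

Pushing the `max` down to each site is what keeps a query like this cheap across a federation: the arrays stay where they are stored and only aggregates cross the network.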

Next: On-Board Query Intelligence
  Democratize direct data access
  [imagery courtesy ESA, NASA]

Inset: Array Databases

Inset: OGC Big Data Coverage Service Portfolio
  OGC standards cover the full range of Big Data coverage services, from data-intensive to processing-intensive:

  WCS     data access           Web Coverage Service: easy data access & extraction
  WCPS    ad-hoc analytics      Web Coverage Processing Service: agile analytics, enabling automatic parallelization
  WPS     predefined process    Web Processing Service: predefined processes of arbitrary complexity as black boxes
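As a rough illustration of the "ad-hoc analytics" tier, the sketch below packages a WCPS query into a key-value request URL. Everything specific here is an assumption rather than something taken from the slides: the endpoint `https://example.org/rasdaman/ows` and the coverage name `LandsatScene` are hypothetical, and the parameter names (`service`, `version`, `request=ProcessCoverages`, `query`) follow the convention used by rasdaman-style servers as far as I know, so check them against the target server's documentation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and coverage name, for illustration only.
ENDPOINT = "https://example.org/rasdaman/ows"

# A WCPS query: iterate over a coverage, derive a vegetation index, encode the result.
WCPS_QUERY = (
    'for c in (LandsatScene) '
    'return encode((c.nir - c.red) / (c.nir + c.red), "image/tiff")'
)


def wcps_request_url(endpoint: str, query: str) -> str:
    """Build a GET URL carrying a WCPS query (parameter names are an assumption)."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": query,
    }
    return endpoint + "?" + urlencode(params)   # urlencode percent-escapes the query text


url = wcps_request_url(ENDPOINT, WCPS_QUERY)
decoded = parse_qs(urlsplit(url).query)          # round-trip check of the encoding
print(url)
```

The contrast with the other two tiers is in what the client sends: a WCS request names a coverage and a subset, a WPS request names a predefined process, and a WCPS request carries the whole expression, which the server can then analyze and parallelize.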

Networking Activity: Outreach (select)
  Making the standards
    Open Geospatial Consortium (OGC): chairing 4 WGs, editor of WCS standards suite
    ISO: new work item "Array SQL"
    Research Data Alliance: Big Data Analytics IG
  Conferences
    Initiated ESA conference series "Big Data From Space"
    FOSS4G-Europe ("free & open-source 4 geospatial"), July 2014
    Sessions at EGU
  TV documentaries for German ZDF & arte
  Innovation awards
    Ex: Geospatial World Forum

Take Home Messages
  Big Data Analytics requires high-level languages
    Problem-adjusted
  Array Databases like rasdaman: agile analytics on spatio-temporal Big Geo Data
    Flexibility + scalability + integration
    pictures actionable data
    www.earthserver.eu
  Impact on science, industry, business
  Next-gen service standards: OGC, ISO, RDA

Conference Announcements
  www.foss4g-e.org
  congrexprojects.com/2014-events/bigdatafromspace