Adding Big Earth Data Analytics to GEOSS


GEO IX Plenary, Foz do Iguacu, 2012-nov-20
Peter Baumann (Jacobs University, Germany), Stefano Nativi (CNR, Italy)
EarthServer: European Scalable Earth Science Service Environment
Research funded through EU FP7 283610

Features & Coverages
The basis of all: geographic feature = abstraction of a real-world phenomenon, associated with a location relative to Earth [OGC, ISO]
Special kind of feature: coverage
Typical representative: raster image... but there is more!
Typically, Big Data are coverages

Big Data: The 4 Vs
Volume, Velocity, Variety, Veracity [M. Stonebraker and IBM]

Raster Data Volume
Social networks: incidence matrix of size 10^8 x 10^8... now do linear algebra!
Satellite imagery: ngEO planning: 10^12 images under ESA custody
HPC: "Even with multi-terabyte local disk sub-systems and multi-petabyte archives, I/O can become a bottleneck in HPC." -- Jeanette Jenness, LLNL, ASCI Project, 1998
"Users download 10x more data than needed" -- Kerstin Kleese van Dam, 2002

Raster Data Velocity
NASA MODIS instrument on board AQUA & TERRA: ~1 TB per day
LOFAR: distributed sensor array farms for radio astronomy; 3 GB per second per station sustained, consolidated into 2-3 PB per year
M. Stonebraker: "drinking from the firehose"

Raster Data Variety
Sensor, image, model, & statistics data
Life Science: pharma/chem, healthcare / bio research, bio statistics, genetics,...
Geo: geodesy, geology, hydrology, oceanography, meteorology, earth system,...
Engineering & research: simulation & experimental data in automotive/shipbuilding/aerospace industry, turbines, process industry, astronomy, high energy physics,...
Management/Controlling: decision support, OLAP, data warehousing, census, statistics in industry and public administration,...
Multimedia: e-learning, distance learning, prepress,...
"80% of all data have some spatial connotation" [C&P Hane, 1992]

Raster Data Variety: Coverages
Coverage = n-d "space/time-varying phenomenon" [ISO 19123, OGC 09-146r2]
[Diagram: «FeatureType» Abstract Coverage and the coverage types Grid Coverage, MultiSolid Coverage, MultiSurface Coverage, MultiCurve Coverage, MultiPoint Coverage, Referenceable GridCoverage, Rectified GridCoverage]

Raster Data Veracity
Both measured and computed data need to carry quality information as part of provenance
Sometimes established (costly!) procedures for error estimation, sometimes not
Ex: satellite image processing, from L0 to L2
Many quality criteria determined, but hardwired error propagation by far not always customary
What to do with this information? Complicates life of data consumer dramatically!
[l2gen, bitmask for ocean color]
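To illustrate what a data consumer has to do with per-pixel quality bitmasks, here is a minimal Python sketch. The flag names and bit positions are hypothetical and chosen only for illustration; real products such as l2gen output define their own bit assignments, which must be taken from the product documentation.

```python
# Hypothetical flag layout, for illustration only (not the l2gen layout).
FLAGS = {
    0: "ATMOSPHERIC_CORRECTION_FAILED",
    1: "LAND",
    2: "CLOUD_OR_ICE",
    3: "HIGH_SUN_GLINT",
}


def decode_flags(mask):
    """Return the names of all flags whose bit is set in mask."""
    return [name for bit, name in sorted(FLAGS.items()) if mask & (1 << bit)]


def is_usable(mask, reject_bits=(0, 1, 2)):
    """A pixel counts as usable if none of the rejecting bits is set."""
    reject = sum(1 << bit for bit in reject_bits)
    return (mask & reject) == 0


# Four example pixels, each carrying one quality bitmask.
pixels = [0b0000, 0b0100, 0b1000, 0b0011]
usable = [p for p in pixels if is_usable(p)]
```

Which bits disqualify a pixel is an application decision, which is why `reject_bits` is a parameter in this sketch rather than a constant.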

Let's Take a Closer Look...
Remember? "Users download 10x more data than needed" [Kerstin Kleese van Dam, 2002]
Divergent access patterns for ingest and retrieval
Server must mediate between access patterns

Use Case: Satellite Image Time Series [Diedrich et al 2001]

The rasdaman Raster Analytics Server
Raster DBMS for massive n-d raster data, www.rasdaman.org
rasql = SQL with integrated raster processing:
    select img.green[x0:x1,y0:y1] > 130
    from LandsatArchive as img
Tile-based architecture: n-d array stored as a set of n-d tiles
Extensive optimization, hw/sw parallelization
In operational use: dozen-terabyte objects
Analytics queries in 50 ms on laptop
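The effect of storing an n-d array as a set of n-d tiles can be sketched in a few lines of Python. This is not rasdaman code: it assumes a regular tiling with a fixed tile shape and a hypothetical helper `tiles_for_box`, and only shows that a subset query needs to touch the tiles its bounding box intersects.

```python
from itertools import product


def tiles_for_box(tile_shape, lo, hi):
    """Return the indices of all tiles intersecting the box [lo, hi] (inclusive)."""
    ranges = [
        range(low // size, high // size + 1)
        for size, low, high in zip(tile_shape, lo, hi)
    ]
    return list(product(*ranges))


# Assumed layout: a 3-D cube (x, y, t) cut into tiles of 100 x 100 x 10 cells.
tile_shape = (100, 100, 10)

# A 50 x 50 pixel map window at a single time step ...
map_tiles = tiles_for_box(tile_shape, (120, 130, 7), (169, 179, 7))

# ... versus a time series of 365 steps at a single pixel.
series_tiles = tiles_for_box(tile_shape, (120, 130, 0), (120, 130, 364))
```

With this tile shape the map window falls into one tile, while the time series touches 37 tiles. The tile shape therefore decides which access pattern is cheap, which is one way a server can mediate between ingest and retrieval patterns.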

Query Processing in a Federation
Heterogeneous federation / cloud
Can optimize for data location, transport volume, node load,...
Work in progress [Owonibi 2012]
Query over A and B:
    select encode( ( (A.nir - A.red) / (A.nir + A.red) - (B.nir - B.red) / (B.nir + B.red) ), HDF5 ) from A, B
Array A:
    select encode( (A.nir - A.red) / (A.nir + A.red), array-compressed ) from A
Array B:
    select encode( (B.nir - B.red) / (B.nir + B.red), array-compressed ) from B
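The split of the query into per-array parts can be mimicked in plain Python. The following is a sketch under simplifying assumptions: arrays are short lists, each "node" is a dictionary of bands, and the function names are hypothetical. It shows why evaluating the band ratio where the data lives reduces transport volume: each node ships one derived band instead of two raw ones.

```python
def ndvi(nir, red):
    """Per-pixel normalized difference: (nir - red) / (nir + red)."""
    return [(n - r) / (n + r) for n, r in zip(nir, red)]


def node_query(bands):
    """Runs on the node holding the array; returns one derived band."""
    return ndvi(bands["nir"], bands["red"])


def federated_difference(node_a, node_b):
    """Coordinator: subtract the two partial results pixel by pixel."""
    return [a - b for a, b in zip(node_query(node_a), node_query(node_b))]


# Toy data standing in for arrays A and B on two different nodes.
A = {"nir": [6.0, 3.0], "red": [2.0, 1.0]}
B = {"nir": [3.0, 1.0], "red": [1.0, 1.0]}
diff = federated_difference(A, B)
```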

What Raster Analytics Offers
Raster Query Language: ad-hoc navigation, extraction, aggregation, analytics
Time series
Image processing
Summary data
Sensor fusion & pattern mining

Ex: Climate Data Service [MEEO 2012]

3D Clients: Experiments
Problem: coupling DB / visualization
Approach: deliver RGBA image to X3D client, transparency as height
Feed directly into client GPU
    select encode( { red:   (char) s.b7[x0:x1,x0:x1],
                     green: (char) s.b5[x0:x1,x0:x1],
                     blue:  (char) s.b0[x0:x1,x0:x1],
                     alpha: (char) scale( d, 20 ) }, "png" )
    from SatImage as s, DEM as d
[JacobsU, Fraunhofer 2012]
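The idea of carrying height in the alpha channel can be sketched in Python as follows. This is an illustration and not the pipeline used in the experiment: it assumes the elevation values are already resampled to the image grid, and `to_rgba` and `max_height` are hypothetical names.

```python
def to_rgba(red, green, blue, height, max_height):
    """Combine three color bands and an elevation band into RGBA pixels.

    The alpha channel carries height scaled to 0..255 instead of transparency.
    """
    pixels = []
    for r, g, b, h in zip(red, green, blue, height):
        clamped = min(max(h, 0.0), max_height)
        alpha = round(255 * clamped / max_height)
        pixels.append((r, g, b, alpha))
    return pixels


# Three pixels: sea level, the assumed maximum height, and an outlier above it.
rgba = to_rgba(
    [10, 20, 30], [40, 50, 60], [70, 80, 90],
    [0.0, 1000.0, 2000.0],
    max_height=1000.0,
)
```

A client that receives such pixels can read the alpha value as a displacement, so color and terrain arrive in a single image.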

EarthServer: Big Earth Data Analytics
Scalable On-Demand Analytics & Fusion for all Earth Sciences
11 partners (lead: JacobsU), 7 m US$ budget, 2011-sep-01 to 2014-aug-31
6 * 100+ TB databases for all Earth sciences + planetary science
www.earthserver.eu
Advisory board: OGC, ESA, IEEE

Web Coverage Service (WCS)
Core: simple access to multi-dimensional coverages; subset = trim | slice
WCS Extensions for additional functionality facets: encodings, band extraction, scaling, reprojection, interpolation, query language, data upload,...
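As a sketch of how trim and slice look in a WCS 2.0 key-value GetCoverage request, the Python snippet below builds a request URL. The endpoint, the coverage identifier and the axis names (Lat, Long, t) are assumptions made for illustration; a real service advertises its own in GetCapabilities and DescribeCoverage.

```python
from urllib.parse import urlencode


def subset_param(axis, low, high=None):
    """Trim if two bounds are given, slice if only one point is given."""
    if high is None:
        return f"{axis}({low})"        # slice: the axis disappears from the result
    return f"{axis}({low},{high})"     # trim: the axis is kept but shortened


def get_coverage_url(endpoint, coverage_id, subsets):
    """Assemble a WCS 2.0 KVP GetCoverage URL with repeated subset parameters."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
    ]
    params += [("subset", s) for s in subsets]
    # Keep parentheses and commas readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe="(),")


url = get_coverage_url(
    "http://example.org/wcs",          # hypothetical endpoint
    "MyCoverage",                      # hypothetical coverage id
    [
        subset_param("Lat", 40, 50),   # trim
        subset_param("Long", 10, 20),  # trim
        subset_param("t", 42),         # slice
    ],
)
```

Trimming both spatial axes and slicing the time axis of a 3-D coverage in this way yields a 2-D result.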

Integration of OGC WCS and SWE
SWE O&M and SOS (+ friends): specialized for sensor acquisition, some complexity
GMLCOV and WCS (+WCPS): simple, uniform schema for all coverages; scalable; versatile processing
[Diagram labels: upstream acquisition, O&M + SensorML; coverage server; downstream services, GMLCOV + WCS; Semantic Web]

Conclusion: Agile Analytics
Propose EarthServer platform, rasdaman, as contribution to GCI
Flexible ad-hoc processing & filtering
Working in situ on existing archives; no copying!
Integrated n-d coverage data / metadata search
Smooth integration with GEOSS Broker
Scalable n-d interfaces using OGC standards: WMS, WCS suite including WCPS, WPS
n-d visual coverage client toolkit: 1D diagrams, 2D maps, 3D data cubes, 3D timeseries sets,...; dynamically composed from query results