German Copernicus Data Access and Exploitation Platform, BiDS 16, Tenerife, Spain




DLR.de Chart 1: German Copernicus Data Access and Exploitation Platform
BiDS 16, Tenerife, Spain, 2016-03-16
Christoph Reck, Gina Campuzano, Klaus Dengler, Torsten Heinen, Mario Winkler
German Aerospace Center (DLR), Earth Observation Center (EOC), DLR Oberpfaffenhofen

COPERNICUS: Initial constellation complete
Sentinel-1A, Sentinel-2A, Sentinel-3A
[Figure: Sentinel-3 first image, released 02/03/2016 2:21 pm. Copyright Copernicus data (2016)]

DLR.de Chart 3: ... and more data to come
[Figure: monthly timeline, 2014 to 2016]
ESA Data Hub, all Sentinels user products:

                              2014   2015   2016   2017   2018   2019   2020
Yearly volume estimate [TB]    180    966  4,490  6,591  7,250  7,469  8,127
Average data rate [Mbit/s]     194    257  1,194  1,753  1,928  1,987  2,162

February 2016: after 14 months of Sentinel-1 PAC operations, the EOC long-term data archive holds 1 petabyte in more than 750,000 data sets (ranging from data sets and Level-0 products to value-added Level-2 products). This amount, stored since December 2014 for S1A alone, is equivalent to all the radar data that the Envisat satellite generated during its mission lifetime of over ten years.
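The relation between the two table rows can be checked with a short calculation. The sketch below is not from the slides: it assumes binary prefixes (1 TB = 2**40 bytes, 1 Mbit = 2**20 bits) and a 365-day year, and the function name is hypothetical. Under those assumptions the computed rates for 2015 to 2020 land within about 1 Mbit/s of the slide's figures. The 2014 value does not follow from the same formula, possibly because operations covered only part of that year.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mbit_s(volume_tb: float) -> float:
    """Average data rate implied by a yearly data volume.

    Assumption (not stated on the slide): binary prefixes,
    i.e. 1 TB = 2**40 bytes and 1 Mbit = 2**20 bits.
    """
    bits_per_year = volume_tb * 2**40 * 8
    return bits_per_year / SECONDS_PER_YEAR / 2**20


# Yearly volume estimates [TB] taken from the table above.
yearly_volume_tb = {
    2015: 966,
    2016: 4490,
    2017: 6591,
    2018: 7250,
    2019: 7469,
    2020: 8127,
}

for year, volume in yearly_volume_tb.items():
    print(year, round(average_rate_mbit_s(volume), 1), "Mbit/s")
```

With decimal prefixes instead, the results come out roughly 5 % lower, which is why the binary interpretation is assumed here.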

DLR.de Chart 4: German Satellite Data Archive (Deutsches Satellitendatenarchiv)
Sentinel PAC Archive
50 (+33) petabyte storage capacity
~ 1.5 petabyte of product data per year per Sentinel

DLR.de Chart 5: OGC-Compliant Spatial Data Services
Interoperable data discovery, viewing, and download
Clients for data discovery, viewing, and download: Google Earth, apps on mobile devices, EOWEB GeoPortal, community portals (Geoportal.DE, WDC-RSAT, GEOSS), desktop GIS
Data access services: Discovery (CSW-ISO), Viewing (WMS), Download (WFS/WCS), Processing (WPS)
Data preparation and storage: EOC Geoservice
Long-term archiving: Data Catalog, EOC Earth observation data library
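As an illustration of the viewing service listed above, a WMS 1.3.0 GetMap request is a plain HTTP GET with a fixed set of query parameters. The following is a minimal sketch, not taken from the slides: the endpoint URL and the layer name are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_y, min_x, max_y, max_x) for EPSG:4326, because
    WMS 1.3.0 uses latitude/longitude axis order for that CRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name; rough bounding box around Germany.
url = wms_getmap_url(
    "https://example.org/wms",
    "hypothetical_layer",
    (47.0, 5.0, 55.0, 16.0),
    800,
    600,
)
print(url)
```

The available layers, styles, and coordinate reference systems of a real service are advertised in its GetCapabilities response, so a client would normally read that first.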

DLR.de Chart 6: Big Data Computing at EOC
GeoFarm:
in-house private cloud for internal large-scale EO computing
open for projects with partners (no public cloud)
platform for demonstration of cloud technologies for EO
DLR precursor for larger installations at the envisaged Copernicus Center
GeoFarm Extension 2016: > 4300 cores, > 33 TB RAM, > 1.9 PB storage (HDD & SSD)
Copernicus Sentinel-PACs @ DLR: > 2.7 PB Sentinel product data per year *, > 2 * 10 Gbit network connection, > 50 PB long-term archive capacity **

DLR.de Chart 7: German Copernicus Data and Exploitation Platform, Objectives
The objective of CODE-DE (Copernicus Data and Exploitation Platform Deutschland) is to establish, configure, and operate the software systems and infrastructure needed to exploit the continuous stream of free, full and open Copernicus Sentinel data and service information. It covers the following elements:
Ingestion and Archive
Search and Access
Processing
Value Added Products
Portal and User Management
Monitoring and Reporting

DLR.de Chart 8: CODE-DE: Data and Exploitation Infrastructure
Data sources: Copernicus products, national missions, long-term archive
Copernicus & value-added products
Financing:
TBytes / year:

         Global    EU    DE
S1          900    75     3
S2          900    75     3
S3        1,200   100     4
Sum       3,000   250    10

Components: online archive; processing (orchestration, processors, toolboxes); access (search, visualization, download); portal
Interfaces: CSW, OpenSearch, W*S, http(s), email

DLR.de Chart 10: Key Functional Aspects
Fast (> 100 MByte/s) and parallel access (> 20 applications)
Based on best practices and OGC standards
Avoid data duplication
Provide processing capacity co-located with the data
Portal functions (FAQ, Service Marketplace, Catalog Client, etc.)
Follow usability best practices
User management (with quotas, priorities, and external federations)
Security (avoid compromising data and systems)
Reporting usage

DLR.de Chart 11: Architecture
[Figure: architecture diagram of the Copernicus Data Access and Exploitation Platform. Labeled elements:]
Users: Internet (via DFN), public processing cloud(s), data access with UFTP clientd
Portal (DMZ): Catalogue Client, Service Marketplace, EIP, EGP Client, Drupal; reached over HTTP and HTTPS
Discovery Services: HMA CSW, FEDEO, ECSW, EOWEB, OpenSearch; register metadata
Ingestion & Eviction Service (pull + push); Metadata Extraction (trigger, write, pull)
Visualisation Services: WMS
Download Service: WFS, WCS; HTTPS via NGINX with access control
Distribution Service: IP multicast with UFTP
Storage: 150 TB, GPFS
Processing Cloud: WPS, SSH, HTTP; PaaS on vSphere (hosted processing) with VM servers; Hadoop cluster
EOC production interfaces: Long-Term Archive (psm delivery); Data Hub Relay (DHuS, HTTP(S), Apache); storage 150 TB
Cross-cutting: Governance, User Management, Monitoring, Reporting
Legend: functional element; SW component; toolbox or plug-in (to be developed)

DLR.de Chart 12: Hardware Infrastructure
GPFS servers (several identical nodes): 2 x 12 cores (E5-2680v3), 128 GB RAM each, hosting a GPFS server VM and client VM(s); clustered NFS; 2 x 10 Gbit/s; links drawn as x2
HPC cluster (several identical nodes): 2 x Xeon 6-core 2.6 GHz, 48 GB RAM each
Storage: storage head with dual controller and storage extensions 1 to 3; up to 16 x 8 Gbit/sec FC connections; up to 2 x 16 x 6 Gbit/sec SAS connections; up to 336 3.5in disks; > 1 PByte

DLR.de Chart 13: Prototype Performance Measurements
[Figure: Reading files in parallel. Throughput in KBytes/s (axis about 700 to 2,300) against the number of parallel transfers (10 to 50).]
Command: dd if=$filename of=/dev/null bs=1g
The upward error bars depict the maximum value of 40 runs, whereas the lower error bar shows the standard deviation from the average.
[Figure: DHuS HTTPS remote access over 10 GBit/sec network. Throughput in KBytes/sec (axis 0 to 1,200) against the number of parallel transfers (0 to 50).]
Command: wget -q "https://.../products('$uuid')/\$value" > /dev/null
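A measurement of this kind can be sketched in a few lines. The code below is an illustration only, not the script behind the slide: it reads a set of files concurrently in threads, repeats the run a number of times, and reports the mean, maximum, and standard deviation of the per-transfer throughput. The function names, the chunk size, and the choice to record one rate per transfer are assumptions.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # read in 1 MiB chunks (arbitrary choice)


def read_throughput_kbytes_s(path):
    """Read one file to the end and return its throughput in KBytes/s."""
    start = time.perf_counter()
    total_bytes = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(CHUNK_SIZE):
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    return total_bytes / 1024 / max(elapsed, 1e-9)


def measure_parallel_reads(paths, runs=40):
    """Read all paths concurrently, `runs` times, and summarise the rates."""
    rates = []
    for _ in range(runs):
        with ThreadPoolExecutor(max_workers=len(paths)) as pool:
            rates.extend(pool.map(read_throughput_kbytes_s, paths))
    return {
        "mean": statistics.mean(rates),
        "max": max(rates),
        "stdev": statistics.stdev(rates) if len(rates) > 1 else 0.0,
    }
```

Calling `measure_parallel_reads(list_of_files, runs=40)` with 10 to 50 files would correspond to one point on the plot. Repeated reads of the same files are likely to be served from the operating system's page cache, so a realistic benchmark would use distinct files per run or drop the cache in between.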

DLR.de Chart 14: Catalog Client
https://github.com/eox-a/eoxclient

DLR.de Chart 15: Processing Platform
Calvalus (Brockmann Consult)
Access by users and data consumers
Hadoop cluster: computing and EO data cluster for systematic and hosted processing, with HDFS distributed storage
VM servers: virtual machines for internal services, hosted services, and projects
I/O host: access to data hub
SAN: storage for VMs, EO data, data access

Temporal Feature Extraction Germany
Example: Landsat 8, 2014-2015
[Figure: RGB composite of NDVI max, mean, and min, 2014-2015]
[Figure: RGB composite of NDBI max, NDVI max, and NDWI mean, 2014-2015]
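The max-mean-min composite can be illustrated for a single pixel. The sketch below is an assumption-laden example, not the processing chain used for the slides: it computes NDVI = (NIR - Red) / (NIR + Red) for each acquisition, reduces the time series to its maximum, mean, and minimum, and scales the three values to 0..255. Mapping max, mean, and min to the red, green, and blue channels in that order is inferred from the figure label. Pure Python is used for readability; a real workflow would operate on whole raster arrays and mask clouds first.

```python
from statistics import mean


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one observation."""
    total = nir + red
    # Convention chosen here: return 0.0 where the index is undefined.
    return (nir - red) / total if total != 0 else 0.0


def temporal_features(series):
    """Reduce a pixel's (nir, red) time series to NDVI max, mean, min."""
    values = [ndvi(nir, red) for nir, red in series]
    return {"max": max(values), "mean": mean(values), "min": min(values)}


def to_rgb(features, low=-1.0, high=1.0):
    """Scale max, mean, min (in that order) to 0..255 channel values."""
    def scale(value):
        return round(255 * (value - low) / (high - low))

    return (scale(features["max"]), scale(features["mean"]), scale(features["min"]))


# Hypothetical reflectances for one pixel over three acquisition dates.
pixel_series = [(0.5, 0.1), (0.3, 0.3), (0.2, 0.4)]
features = temporal_features(pixel_series)
print(features, to_rgb(features))
```

A pixel with strong seasonal vegetation change gets a high maximum and a low minimum, so it stands out in the composite against surfaces whose NDVI stays constant through the year.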

Temporal Feature Extraction Germany
Base Products: Thematic Masks

DLR.de Chart 21: Conclusion
Features:
Fast catalogue with HMA CSW and OGC OpenSearch interfaces
Flexible dataset browsing with OGC Web Map Service (WMS)
High-performance data access using the HTTP protocol
Advanced data access using OGC Web Coverage Service (WCS)
Parallel file system on an online storage area network (SAN)
Hadoop and Docker processing infrastructure
Option for historical data access from the long-term archive
Cross-cutting services such as Governance, User Management, Monitoring and Reporting, and the network infrastructure complement the architecture.
Together these form a streamlined, simple, scalable, and performant architecture with standardized interfaces from discovery through visualization to download, for web users and machine-to-machine applications.

DLR.de Chart 22 Thank you for your attention!