Integration strategy



Similar documents
XSEDE Service Provider Software and Services Baseline. September 24, 2015 Version 1.2

Concepts and Architecture of the Grid. Summary of Grid 2, Chapter 4

Data Grids. Lidan Wang April 5, 2007

THE CCLRC DATA PORTAL

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

GRID COMPUTING Techniques and Applications BARRY WILKINSON

Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Deploying Business Virtual Appliances on Open Source Cloud Computing

3rd Annual Earth System Grid Federation and Ultrascale Visualization Climate Data Analysis Tools Face-to-Face Meeting Report December 2013

Cluster, Grid, Cloud Concepts

Cloud Computing. Lecture 5 Grid Case Studies

DataNet Flexible Metadata Overlay over File Resources

The ORIENTGATE data platform

Service-Oriented Architecture and Software Engineering

An approach to grid scheduling by using Condor-G Matchmaking mechanism

Abstract. 1. Introduction. Ohio State University Columbus, OH

The THREDDS Data Repository: for Long Term Data Storage and Access

CMIP6 Data Management at DKRZ

CMIP5 Data Management CAS2K13

Anwendungsintegration und Workflows mit UNICORE 6

The CMS analysis chain in a distributed environment

Policy Policy--driven Distributed driven Distributed Data Management (irods) Richard M arciano Marciano marciano@un

EUDAT. Towards a pan-european Collaborative Data Infrastructure. Willem Elbers

Status and Integration of AP2 Monitoring and Online Steering

Functional Requirements for Digital Asset Management Project version /30/2006

NASA s Big Data Challenges in Climate Science

GridFTP: A Data Transfer Protocol for the Grid

StreamServe Persuasion SP5 Document Broker Plus

DESIGN OF A PLATFORM OF VIRTUAL SERVICE CONTAINERS FOR SERVICE ORIENTED CLOUD COMPUTING. Carlos de Alfonso Andrés García Vicente Hernández

Using the MyProxy Online Credential Repository

AHE Server Deployment and Hosting Applications. Stefan Zasada University College London

A Survey Study on Monitoring Service for Grid

16th International Conference on Control Systems and Computer Science (CSCS16 07)

Recommendations for Static Firewall Configuration in D-Grid

DRIVER Providing value-added services on top of Open Access institutional repositories

Resource Management on Computational Grids

CDI/THREDDS Interoperability: the SeaDataNet developments. P. Mazzetti 1,2, S. Nativi 1,2, 1. CNR-IMAA; 2. PIN-UNIFI

TOG & JOSH: Grid scheduling with Grid Engine & Globus

GridWay: Open Source Meta-scheduling Technology for Grid Computing

Big Data Services at DKRZ

GT 6.0 GSI C Security: Key Concepts

PASS4TEST 専 門 IT 認 証 試 験 問 題 集 提 供 者

Globus Research Data Management: Introduction and Service Overview

Building integration environment based on OAI-PMH protocol. Novytskyi Oleksandr Institute of Software Systems NAS Ukraine

HPC and Grid Concepts

NASA's Strategy and Activities in Server Side Analytics

Cloud and Virtualization to Support Grid Infrastructures

An Introduction to Virtualization and Cloud Technologies to Support Grid Computing

European Data Infrastructure - EUDAT Data Services & Tools

Leveraging the Eclipse TPTP* Agent Infrastructure

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

CHAPTER 6: TECHNOLOGY

STREAM ANALYTIX. Industry s only Multi-Engine Streaming Analytics Platform

Metastorm BPM Interwoven Integration. Process Mapping solutions. Metastorm BPM Interwoven Integration. Introduction. The solution

Figure 1: Illustration of service management conceptual framework

Fast Innovation requires Fast IT

Grid Sun Carlo Nardone. Technical Systems Ambassador GSO Client Solutions

Analyses on functional capabilities of BizTalk Server, Oracle BPEL Process Manger and WebSphere Process Server for applications in Grid middleware

Building Test-Sites with Simware

SOA REFERENCE ARCHITECTURE: WEB TIER

Towards an E-Governance Grid for India (E-GGI): An Architectural Framework for Citizen Services Delivery

Deployment Guide: Unidesk and Hyper- V

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire 25th

Australian Synchrotron, Storage Gateway

Mobile Software Agents: an Overview

Customer Bank Account Management System Technical Specification Document

Oracle Communications WebRTC Session Controller: Basic Admin. Student Guide

GeoNetwork, The Open Source Solution for the interoperable management of geospatial metadata

IMAN: DATA INTEGRATION MADE SIMPLE YOUR SOLUTION FOR SEAMLESS, AGILE DATA INTEGRATION IMAN TECHNICAL SHEET

Transcription:

C3-INAD and ESGF: Integration strategy C3-INAD Middleware Team: Stephan Kindermann, Carsten Ehbrecht [DKRZ] Bernadette Fritzsch [AWI] Maik Jorra, Florian Schintke, Stefan Plantikov [ZUSE Institute] Markus Kemmerling, Christian i Grimme [TU-DO] Stephan Kindermann, DKRZ GO-ESSP 2011, Asheville

C3Grid-INAD (08.2010 07.2013) (Climate community specific funding) follow up of C3-Grid (09.2005 03.2009) (D-Grid funding) Key characteristics of C3-Grid: National Climate Data Center Integration (world data centers, research institutes, universities) ISO 19139 Metadata, uniform data access interface, Portal Grid based distributed workspace with data/job co-scheduling middleware C3-INAD = C3-Grid + Production level, sustainability plan, support center, new scientific workflows ESGF data node integration Stephan Kindermann, DKRZ GO-ESSP 2011, Asheville

Objective: establish C3-INAD as an alternative to download and process at home approach for ESGF data node hosted data C3-INAD develops CMIP5 multi-model-multi-ensemble WFs, e.g. for means, variances, percentiles, statistical means (different multi-model weighting tools) as well as model verification/validation WFs. Redesign of C3-Grid infrastructure necessary (no globus WSRF, towards REST architecture) including security architecture (-> opportunity to align developments with ESGF)

C3-Grid / C3-INAD infrastructure Status / generic architecture Data management / job management infrastructure C3-Grid / ESGF integration Data management / (Security) Metadata t infrastructure t

Data Providers deploy a uniform data access interface, which is part of a distributed data management middleware OAI-PMH C3Grid Portal OAI-PMH Data Providers provide ISO 19139 data discovery metadata, via OAI-PMH Workflow- Management Data Management The portal submits JSDL work flow descriptions to the workflow management system. A workflow consists of data staging and computing tasks ISO 19139 MD Word Data Centres Climate / Mare / RSAT Universities / Research institutes ISO 19139 MD Security: X509 (proxy) certificates + (GT4) delegation The data management infrastructure cares for data staging, data replication, data life- time etc. Data management and job management components interact via a co-scheduling protocol

Generation N Data Management System (GNDMS): http://gndms.zib.de Key characteristics: Based on data staging and co-scheduling Data integration layer (logical names, data transfers, workspace management) Support for GT4 transition to REST based architecture

Features: Co-scheduling support for all data management activities Failure recovery for all data management activities persistent storage of whole critical data management status Highly configurable and extensible generic data management tasks Different Roles C3Grid DMS Provider Overall C3Grid DMS coordinator Provide data in C3Grid via staging g System management components: Runtime configuration and monitoring Uses Globus/WSRF, GSI and GridFTP / New REST based implementation ongoing

DSpace data space / workspace manager slice per job, lifetime, auto delete GORFX generic offer request factory offers services depending on site role Available Services: SliceStageIn Subspace X Slice (Data,Type, Permissions ) Slice staging at DMS: provider selection and trigger remote ProviderStageIn ProviderStageIn Subspace Y Slice Slice Subspace Z Slice DSpace Slice staging into slice at data provider InterSliceTransfer copying of files between slices FileTransfer import/export files via GridFTP GNDMS negotiates contracts with clients about task execution A GNDMS negotiates contracts with clients about task execution. A contract specifies what is to be done, and optionally when and where it is to be done by GNDMS on behalf of the client. Contracts are accepted Offers

Workspace Workspace Workspace Workspace Data Transfer Data Staging Data Publishing DSpace GORFX GNDMS Services and Components GNDMS Core Globus Toolkit and Security Infrastructure Relational l Database

IS-ENES portal search Metafor CIM Repository C3Grid Portal OAI-PMH ISO 19139 Exporter CIM Data Object Exporter Thredds Harvester DB Gateway Workflow- Management Data Management DKRZ Data Node ESGF data federation Word Data Centres Climate / Mare / RSAT Universities / Research institutes C3Grid legacy data centres :ESGF/C3 C3-INAD data stager

Metadata: Dataset metadata available in central DB (Thredds harvest sink + CERA) Thredds to ISO 19139 MD generation harvesting into C3 Portal Data: a C3-ESGF data staging request: queries central DB for corresponding files and generates access script(s) gets delegated certificate and stages files directly from ESGF data nodes Sub-selects requested spatio-temporal subset(s), Generates provenance metadata in ISO 19139 for result Security: MyProxy based delegation O-AUTH is currently being evaluated

C3 ESGF portal integration i and data access prototype planed for fall 2011 Multi-model multi-ensemble wflow deployment in 2012 Under discussion: ESGF data node sw stack as default way to add new data providers po desto C3-INAD Developments and roadmap can be followed at http://wiki.c3grid.de (registration necessary) There you will also find info on reusable components like Thredds harvester and DB ingester + ISO generator GNDMS middleware

Old C3 Grid: Data center integration, portal, middleware development New C3 INAD: Climate scientist, use case driven (C3-INAD workflow developers ) ESGF data node integration new development of middleware components necessary opportunity to align developments with ESGF Stephan Kindermann, DKRZ GO-ESSP 2011, Asheville