Software Engineering for Distributed Data Processing Infrastructures (DCIs)

Etienne URBAH urbah@lal.in2p3.fr
Oleg LODYGENSKY lodygens@lal.in2p3.fr
LAL, Univ Paris-Sud, IN2P3/CNRS, Orsay, France
The EDGI project receives Community research funding under grant agreement 261556

Summary (v1.4)
1) Distributed Data Processing
2) Best practices, Software engineering
3) DCI = Federation of Administrative Domains
4) Middleware Stacks for Distributed Data Processing
5) Standards for Distributed Data Processing

1) Distributed Data Processing

1.1) Scope of Data Processing
Data processing is much more than computing:
- Data acquisition
- Instrument management
- Data and metadata management
- Access rights to data
- Computing
- Filtering, sorting
- Graphical display
- Long-term curation, ...

1.2) Data Processing on a User Device
On any single user device (PC, mobile phone, ...), access rights are very simple, but storage and computing power are limited.

1.3) Data Processing needs to be distributed
Depending on the data processing workflow, some infrastructure types are more appropriate than others, with a tradeoff between development effort, performance and cost:
- Costly supercomputers are mandatory for workflows requiring huge core memory and/or highly coupled parallelism. Development effort is high for efficient highly coupled parallelism.
- DCIs such as EGI and OSG are optimal for secure data sharing, and for workflows using large data sets with medium core memory and small parallelism. Development effort is low for single applications.
- Desktop grids are optimal for workflows requiring heavy uncoupled parallelism, moderate core memory and moderate data sets (< 100 MB). In the near future, desktop grids will also permit secure storage of data sets of any size, but the transfer rate will stay limited. Development effort for single applications depends heavily on the grid middleware (high for BOINC, low for XtremWeb).
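
The tradeoff above can be read as a rough decision rule. The following Python sketch is only an illustration under stated assumptions: the `Workflow` fields, the category labels and the function name are hypothetical, and only the 100 MB limit for desktop grids comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Hypothetical summary of a workflow's needs (labels are illustrative)."""
    core_memory: str            # "moderate", "medium" or "huge"
    parallelism: str            # "small", "uncoupled" or "coupled"
    dataset_mb: float           # data set size per task, in MB
    needs_secure_sharing: bool = False


def suggest_infrastructure(w: Workflow) -> str:
    """Return the infrastructure type that the slide describes as the best fit."""
    # Huge core memory or highly coupled parallelism: supercomputer.
    if w.core_memory == "huge" or w.parallelism == "coupled":
        return "supercomputer"
    # Heavy uncoupled parallelism with moderate data sets (< 100 MB): desktop grid.
    if (w.parallelism == "uncoupled" and w.dataset_mb < 100
            and not w.needs_secure_sharing):
        return "desktop grid"
    # Secure sharing or large data sets with small parallelism: DCI.
    return "DCI (e.g. EGI, OSG)"


if __name__ == "__main__":
    print(suggest_infrastructure(Workflow("moderate", "uncoupled", 50)))
```

A real choice would also weigh cost and development effort, which this sketch leaves out.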

2) Best practices, Software engineering
WHO? WHAT? WHEN? WHERE? HOW?

2.1) Bad practices to avoid
In order to harness the complexity of distributed data processing:
- Beware of buzzwords
- Avoid meaningless pretty pictures
- Beware of Clouds
- Beware of Virtualization
Passive sentences should also be avoided: each of us is responsible for avoiding passive sentences.

2.2) Best practices, Software engineering
In order to harness the complexity of distributed data processing, best practices and software engineering already exist:
- UML http://www.uml.org/
- CMMI-DEV http://www.sei.cmu.edu/library/abstracts/reports/10tr033.cfm
- ITIL http://www.itil-officialsite.com/aboutitil/whatisitil.aspx
- etc.

2.3) Best practices: Simple questions

2.4) Software engineering: UML

2.5) Software engineering: First steps
Using best practices and software engineering to harness the complexity of distributed data processing:
- Clearly distinguish concepts and describe the relationships between them.
  - An extensive glossary is available at http://www.ogf.org/documents/gfd.181.pdf
  - A use case collection is available at http://www.ogf.org/documents/gfd.180.pdf
- Robustness is an intrinsic property of a resource.
- Virtualization permits flexible on-demand management of underlying resources, but only if these underlying resources are already robust, and at the cost of the virtualization layer, which may itself take months to become robust.
- Administrative domains contain a repository permitting easy authentication and authorization of users.
- Grid computing is a federation of administrative domains, based on mutual trust, some common operational setup (in particular the trust anchor for authentication and authorization) and specialized interoperable middleware stacks. It permits secure data sharing.
- Cloud computing is the on-demand management (allocation, instantiation and usage) of internal or external computing resources inside a given administrative domain. It currently does not span multiple administrative domains.
- Interoperability between clouds does not exist yet, as it requires some common cloud operational setup and interoperable cloud middleware stacks. Work on federation of clouds is under way, for instance in http://contrail-project.eu

2.6) Software engineering: Users
In order to assess user needs, take care to track all users:
- Users having ICT knowledge are able to express, criticize and validate sound needs and requirements.
- End users of DCIs are scientists, with varying ICT, grid and cloud knowledge. For example:
  - Application developers
  - Experienced application users
  - Scientists with no ICT knowledge using a scientific portal, ...
- Direct users of DCIs are various:
  - Developers of scientific applications
  - Integrators of scientific applications for grids
  - Providers of scientific workflow engines
  - Providers of scientific portals
  - Providers of SAGA
  - Site administrators
  - VO administrators

3) DCI = Federation of Administrative Domains

3.1) Administrative Domain: Principle
Each organization (University, Research Institute, Administration, Enterprise, ...) has in general an Administrative Domain managed by an operational team and protected by a firewall.
This Administrative Domain allows each registered user to use any shared resource of the Administrative Domain according to defined access rights.

3.2) Payload and Support Functionalities
The end user is primarily interested in 'payload' functionalities:
- Data acquisition
- Instrument management
- Data and metadata management
- Computing
But management of the shared resources of an Administrative Domain also requires 'support' functionalities:
- Security
- Information (publication of resource capabilities so that they can be dynamically discovered)
- Monitoring
- Logging and bookkeeping
- Accounting
These functionalities are provided more or less completely and consistently by access and sharing middleware.

3.3) Administrative Domain: Access and sharing methods
Inside an Administrative Domain, any access or sharing middleware may be used:
- BitDew
- CORBA
- File server
- Database server
- Remote login (SSH, RDP)
- Batch system
- Applicative portal
- Private grid (Condor, BOINC, XtremWeb, ...)
- Private cloud
- ...
Standardization is optional.

3.4) Clouds provide on-demand resources
[Figure: Private Cloud and Commercial Cloud]
- A cloud provides on-demand resources with a rich and consistent set of data processing functionalities, and a powerful and consistent interface, but limited to ONE cloud provider (no interoperability).
- It permits flexible extension of an Administrative Domain.
- The cost of a commercial cloud is affordable for permanent data storage and computing, but very high for the transfer of large amounts of data.
- Currently, clouds do NOT manage federation of Administrative Domains. Work on federation of clouds is under way, for instance in http://contrail-project.eu

3.5) DCI = Federation of Administrative Domains
- The goal is sharing of data.
- The principle is mutual trust.
- The method is agreement on common middleware stacks and operational procedures.
- Resources stay locally managed.
- The middleware provides consistent access rights to all users according to their DCI credentials.

3.5) DCI = Federation of Administrative Domains
Some scientific, academic and research organizations, which already own data-related resources, increasingly need to share these resources securely with others.
In order to federate their private resources into a production infrastructure, these organizations have had to establish mutual trust and adopt compatible middleware stacks and procedures, integrated through operations teams, to bring their resources together into a Distributed Computing Infrastructure (DCI).
The recurring feature of the various DCIs that offer production resources (e.g. EGI, DEISA, PRACE, etc.) is that each one integrates multiple locally managed administrative domains into a usable environment.
The middleware deployed by each DCI provides its users, according to their DCI credentials, with consistent access rights to all resources managed by that DCI.
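
As a minimal sketch of this principle (resources stay locally managed, while users see consistent access rights derived from their DCI credentials), the following Python model may help. All class, field and domain names are hypothetical, and a credential is reduced here to the name of a virtual organization (VO), which is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AdministrativeDomain:
    """A locally managed domain that keeps its own access policy."""
    name: str
    # Local policy: VO name -> resources of this domain that the VO may use.
    local_policy: dict[str, set[str]] = field(default_factory=dict)

    def allowed_resources(self, vo: str) -> set[str]:
        return self.local_policy.get(vo, set())


@dataclass
class Federation:
    """A DCI seen as a federation: it only aggregates the local decisions."""
    domains: list[AdministrativeDomain]

    def access_rights(self, credential_vo: str) -> dict[str, set[str]]:
        """Map each domain name to the resources granted to the credential."""
        rights = {}
        for domain in self.domains:
            resources = domain.allowed_resources(credential_vo)
            if resources:
                rights[domain.name] = resources
        return rights


if __name__ == "__main__":
    site_a = AdministrativeDomain("site-a", {"biomed": {"cluster-1"}})
    site_b = AdministrativeDomain("site-b", {"biomed": {"storage-1"}})
    print(Federation([site_a, site_b]).access_rights("biomed"))
```

The federation object holds no policy of its own: each domain answers for its own resources, which mirrors the statement that resources stay locally managed.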

4) Middleware Stacks for Distributed Data Processing

4.1) Various DCI middleware stacks

4.2) DCI middleware stacks: Poor interoperability
Each area in white is a grid infrastructure, now called DCI. Interoperability, as shown in green and red, is desired, but it is not achieved yet.
This diagram was created by Morris RIEDEL (Jülich Supercomputing Centre).

5) Standards for Distributed Data Processing

5.1) Identity of a User on a DCI
- Each DCI spreads across several Administrative Domains, so (login, password) does NOT work anymore.
- Each user must be identified by a certificate. Each middleware component using (login, password) must be adapted.
- In the USA, the InCommon federation provides SSL certificates.
- X509 certificates are the most robust:
  - Such certificates are provided to a user, after approval of his/her organization, by a Certificate Authority (CA).
  - Each Administrative Domain must maintain a copy of the list of Certificate Authorities.
  - The list of CAs published by IGTF (International Grid Trust Federation) is a de facto standard.
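
The role of the locally maintained CA list can be sketched as follows. This is a deliberately simplified Python model, not a real X509 implementation: the `Certificate` class, the distinguished names and the function name are hypothetical, and signature verification, validity dates and revocation are all left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """Toy stand-in for an X509 certificate (no cryptography involved)."""
    subject: str   # distinguished name of the certificate holder
    issuer: str    # distinguished name of the CA that issued it


def is_accepted(cert: Certificate, trusted_cas: frozenset[str]) -> bool:
    """Accept a user only if the issuing CA is in the domain's local CA list."""
    return cert.issuer in trusted_cas


if __name__ == "__main__":
    # Hypothetical local copy of a CA list, as each Administrative Domain keeps.
    local_ca_list = frozenset({"CN=Example Grid CA"})
    alice = Certificate("CN=alice", "CN=Example Grid CA")
    print(is_accepted(alice, local_ca_list))
```

Because every domain checks against its own copy of the list, a shared published list is what makes the same certificate acceptable across the whole DCI.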

5.2) Security Layer of DCI Middleware Stacks
If you are lucky, all software components using (login, password) are maintained in a consistent security layer. This security layer must now handle X509 certificates.
This permits any user, depending on his/her access rights, to use shared resources or to share his/her own. In particular, he/she can submit jobs (tasks, activities) on remote resources.
But a job executed on a remote resource often has to process data in another location, with the access rights of the job submitter. This is a complicated problem called delegation, which the security layer also has to solve.
Each instance of this security layer must work consistently in each of the various Administrative Domains of a given DCI, and this requires very careful standardization.
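
To make the delegation problem more concrete, here is a minimal Python sketch of a delegation chain: the job receives a derived credential, and a remote data service walks the chain back to the original submitter. This is a toy model with hypothetical names and no cryptography. It is loosely inspired by the idea behind X.509 proxy certificates (RFC 3820), but it does not implement that profile.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Toy credential: who holds it, who issued it, and what it derives from."""
    subject: str
    issuer: str
    parent: Optional["Credential"] = None


def delegate(parent: Credential, holder: str) -> Credential:
    """Derive a credential for `holder`, issued by the parent's subject."""
    return Credential(
        subject=f"{parent.subject}/proxy:{holder}",
        issuer=parent.subject,
        parent=parent,
    )


def effective_identity(cred: Credential) -> str:
    """Walk the chain and return the subject of the original credential.

    Raises ValueError if a link was not issued by the credential above it.
    """
    while cred.parent is not None:
        if cred.issuer != cred.parent.subject:
            raise ValueError("broken delegation chain")
        cred = cred.parent
    return cred.subject


if __name__ == "__main__":
    user = Credential("CN=alice", "CN=Example Grid CA")
    job = delegate(user, "job-42")          # job running on a remote resource
    transfer = delegate(job, "transfer")    # data access done on the job's behalf
    print(effective_identity(transfer))
```

The data service never sees the user's own credential, only a derived one, yet it can still apply the access rights of the job submitter.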

5.3) GLUE 2.0 (OGF) permits the description of DCI entities
- An Endpoint accepts an Activity (Job or Data) following the AccessPolicy defined for the UserDomain (VO, VRC) containing the user who submitted the Activity.
- This Activity is allocated to the Share defined by the MappingPolicy for the UserDomain, and transmitted to a Manager (batch system) for execution on a Resource (consistent set of physical machines).
- A Service, managed inside an AdminDomain, aggregates a consistent set of Endpoints, Shares and Managers.
- The Service should keep persistent ActivityRecords (logging and accounting), but this is not well standardized yet.
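
The relationships between these entities can be sketched with a few Python classes. The class names follow the entity names used on the slide, but the fields and the `submit` method are hypothetical simplifications and do not reproduce the attributes of the GLUE 2.0 specification.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Manager:
    """Local resource manager, e.g. a batch system driving a Resource."""
    name: str
    resource: str                           # the set of machines it drives
    queue: List[str] = field(default_factory=list)


@dataclass
class Share:
    name: str
    mapping_policy: Set[str]                # UserDomains mapped to this Share
    manager: Manager


@dataclass
class Endpoint:
    url: str
    access_policy: Set[str]                 # UserDomains allowed on this Endpoint
    shares: List[Share]

    def submit(self, activity: str, user_domain: str) -> Share:
        """Accept an Activity, allocate it to a Share, pass it to the Manager."""
        if user_domain not in self.access_policy:
            raise PermissionError(f"{user_domain} is not allowed on {self.url}")
        for share in self.shares:
            if user_domain in share.mapping_policy:
                share.manager.queue.append(activity)
                return share
        raise LookupError(f"no Share is mapped to {user_domain}")


@dataclass
class Service:
    """Aggregates Endpoints (and through them Shares and Managers)."""
    admin_domain: str
    endpoints: List[Endpoint]
```

For example, an `Endpoint` whose `access_policy` contains a VO named "biomed" and whose only `Share` maps "biomed" would place a submitted activity in that share's manager queue, and would reject a submission from any other user domain.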

5.4) Standardization of Support Functionalities
- Security: As shown in a previous slide, there are a lot of standards. These should be consolidated.
- Information: GLUE 2.0 (OGF) is an official standard. It should be used as the foundation.
- Monitoring: 'OCCI Req. Table 7' (OGF) is a specification from which a standard could be defined.
- VM Image: OVF = Open Virtualization Format (DMTF) is an official standard.
- Log: 'Activity Instance Documents Specification' (OGF) is a proposal for a standard.
- Accounting: UR (OGF) is an official standard, but I do not know to what extent it is implemented.

5.5) Standardization of Payload Functionalities
- Data: GridFTP, ByteIO, SRM, DMI (OGF), CDMI (SNIA)
- Instrument: DORII (not a standard yet)
- Cloud: OCCI (OGF)
- Job: JSDL, OGSA-BES, HPC profiles (OGF)
These standards exist, but they are consistent neither with each other nor with the support standards. They should be consolidated using GLUE 2.0.

5.6) Useful official and de facto standards for DCIs

5.7) Problem of the backend interfaces
For each functionality, it is necessary to standardize the client interface. But interoperability between different DCIs requires the use of backend interfaces permitting access to resources, so these backend interfaces also have to be standardized.
For each functionality, I see only two solutions:
- The standard describing the client interface also describes the backend interface (this is heavy and monolithic).
- We completely separate the definition of the data flowing through the interface from the definition of the dialog protocol, and/or define a specific standard for the backend interface (this generates a lot of standards to create and maintain).
See next slide for the concept. See http://forge.gridforum.org/sf/go/doc15980?nav=1 for the complexity.
Could anybody propose a solution to this problem?
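
The second option, separating the definition of the data from the definition of the dialog protocol, can be sketched in Python as follows. This is one possible reading, not the authors' design: `JobDescription`, the JSON encoding and both dialog classes are hypothetical. The point is only that the client side and the backend side share one data definition while their dialogs stay independent.

```python
import json
from dataclasses import dataclass, asdict
from typing import Protocol, Tuple


@dataclass(frozen=True)
class JobDescription:
    """Data definition shared by the client and backend interfaces."""
    executable: str
    arguments: Tuple[str, ...]


def encode(job: JobDescription) -> str:
    """Serialize the data independently of any dialog protocol."""
    return json.dumps(asdict(job), sort_keys=True)


def decode(text: str) -> JobDescription:
    data = json.loads(text)
    return JobDescription(data["executable"], tuple(data["arguments"]))


class Dialog(Protocol):
    """Dialog protocol: how a payload is exchanged, not what it contains."""
    def send(self, payload: str) -> str: ...


class BatchBackendDialog:
    """Hypothetical backend-side dialog that hands the job to a batch system."""
    def send(self, payload: str) -> str:
        job = decode(payload)
        return f"queued {job.executable}"


class ClientDialog:
    """Hypothetical client-side dialog that forwards payloads to a backend."""
    def __init__(self, backend: Dialog) -> None:
        self.backend = backend

    def send(self, payload: str) -> str:
        return self.backend.send(payload)


if __name__ == "__main__":
    job = JobDescription("/bin/echo", ("hello",))
    print(ClientDialog(BatchBackendDialog()).send(encode(job)))
```

Replacing `BatchBackendDialog` with another backend does not touch `JobDescription`, which is the benefit of the separation. The cost noted on the slide remains: each dialog is one more thing to specify and maintain.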

5.7) Problem of the backend interfaces
[Figure: concept diagram of the backend interfaces]