Processing big data by WS-PGRADE/gUSE and Data Avenue
1 Processing big data by WS-PGRADE/gUSE and Data Avenue
Peter Kacsuk, Zoltan Farkas, Krisztian Karoczkai, Istvan Marton, Akos Hajnal, Tamas Pinter
MTA SZTAKI
SCI-BUS is supported by the FP7 Capacities Programme under contract nr RI
2 Processing big data by workflows
- Processing big data often requires a set of activities that can be combined into a scientific workflow, so that the activities can be repeated automatically for a large set of data components.
- Therefore scientific workflows that can run in Globus-based DCIs and access large data storages are crucial for processing big data.
3 The SCI-BUS approach
- Integrate workflows with the Data Avenue services
- Run these workflows in an environment that allows the nodes of a workflow to run in many different types of DCIs, to achieve:
  - Highly parallel and distributed workflow execution
  - Workflow-level interoperability among DCIs and data storages
- The environment offered by SCI-BUS is the WS-PGRADE/gUSE gateway framework
4 WS-PGRADE/gUSE
- Generic-purpose gateway framework
  - Based on Liferay
- General-purpose, workflow-oriented gateway framework
  - Supports the development and execution of workflow-based applications
- Supports the fast development of domain-specific gateways by a customization technology
- Most important design aspects are flexibility and robustness
5 Flexibility in exploiting parallelism
- Multiple instances of the same workflow with different data files
- Parallel execution inside a workflow node
- Parallel execution among workflow nodes
- Parameter study execution of the workflow
- Multiple jobs run in parallel
- Each job can be a parallel program
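The parameter-study idea on this slide (the same job repeated over many input combinations, with several instances running at once) can be illustrated with a small sketch. This is not WS-PGRADE/gUSE code: `run_job` and `parameter_study` are hypothetical names, and a local thread pool stands in for job submission to a DCI.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_job(params):
    """Hypothetical stand-in for one workflow job.

    A real gateway would submit the job to a DCI; here we only
    build a label from the parameter combination.
    """
    name, value = params
    return f"{name}@{value}"


def parameter_study(names, values, max_workers=4):
    """Run one job per parameter combination, several at a time."""
    combos = list(product(names, values))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even though
        # the jobs themselves run concurrently.
        return list(pool.map(run_job, combos))


if __name__ == "__main__":
    print(parameter_study(["a", "b"], [1, 2]))
```

The cross product of the inputs plays the role of the parameter space; each element becomes an independent job, which is what makes this kind of execution easy to parallelize.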
6 Flexibility of using various DCIs
- Flexible management of security:
  - Individual user certificates
  - Robot certificates
- Flexible access to various types of DCIs:
  - Clusters (PBS, LSF, MOAB, SGE)
  - Cluster grids (ARC, gLite, GT2, GT4, GT5, UNICORE)
  - Supercomputers (e.g. via UNICORE)
  - Desktop grids (BOINC)
  - Clouds
7 Using IGE Globus resources in the DRIHM gateway
start.sh $modelname $jobid
8 Flexibility in data storage access
- Use the Data Avenue Blacktop service:
  - To access data storages in different DCIs
  - To transfer files among the storages of different DCIs
  - To upload/download files to/from the storages of different DCIs
- Data Avenue Liferay portlet to access the data transfer services of Data Avenue Blacktop
- See details:
- Currently supported protocols: http, https, ftp, gsiftp, srm, S3 (iRODS in beta phase)
- Protocols coming soon: LFC, further cloud storage protocols
9 Data Avenue services
[Figure: architecture diagram. Labels: Data SZTAKI, Data XY, Data Avenue Portlet, WS-PGRADE gateway, Data Avenue Blacktop service, storages FS1, FS2, FS3, ..., FSn, OpenStack, Amazon, gLite, GT5]
10 Use cases to be supported
[Figure: Data Avenue between users and several storage services. Labels: Browse, download, upload; Create dir, Remove item, ...; Produce data; Use data; Storage Service]
EGI Community Forum 2014, Helsinki, Finland
11 Data Avenue services
- Data Avenue Blacktop:
  - Core service accessible through SOAP (Java API provided)
  - Hides access details of storage services
- Data Avenue Portlet:
  - User-friendly interface to manage data, upload and download files, ...
  - Can be deployed onto any Liferay-based portal
- Data Avenue in WS-PGRADE/gUSE:
  - Integration in a science gateway enabling easy data usage from workflows
12 Data Avenue Blacktop
- Core service accessible through SOAP
  - File transfers
  - Directory operations
- Easy to add new protocols using the Adaptor interface
  - HTTP(S), SFTP, SRM, GSIFTP, S3
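The slide does not show the Adaptor interface itself, so the following is only a sketch of the general pattern it suggests: one adaptor per protocol, selected by the URI scheme, so that adding a protocol means registering one more class. All names here (`StorageAdaptor`, `InMemoryAdaptor`, `AdaptorRegistry`, the `mem` scheme) are hypothetical and are not the Data Avenue Java API.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class StorageAdaptor(ABC):
    """Hypothetical protocol adaptor: one subclass per storage protocol."""

    schemes = ()  # URI schemes this adaptor handles

    @abstractmethod
    def list_dir(self, uri):
        """Return the entry names found under the directory URI."""


class InMemoryAdaptor(StorageAdaptor):
    """Toy adaptor backed by a dict, standing in for a real protocol."""

    schemes = ("mem",)

    def __init__(self, entries):
        self._entries = entries  # maps directory path -> list of names

    def list_dir(self, uri):
        return sorted(self._entries.get(urlparse(uri).path, []))


class AdaptorRegistry:
    """Dispatches each request to the adaptor registered for its scheme."""

    def __init__(self):
        self._by_scheme = {}

    def register(self, adaptor):
        for scheme in adaptor.schemes:
            self._by_scheme[scheme] = adaptor

    def adaptor_for(self, uri):
        scheme = urlparse(uri).scheme.lower()
        if scheme not in self._by_scheme:
            raise ValueError(f"no adaptor registered for protocol {scheme!r}")
        return self._by_scheme[scheme]

    def list_dir(self, uri):
        return self.adaptor_for(uri).list_dir(uri)


if __name__ == "__main__":
    registry = AdaptorRegistry()
    registry.register(InMemoryAdaptor({"/data/": ["b.txt", "a.txt"]}))
    print(registry.list_dir("mem://store/data/"))
```

Callers only ever see the registry, which is one way a core service can hide the access details of each storage type.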
13 Data Avenue Blacktop API
- Java API available:
- Designed from the start with a focus on ease of use
14 Data Avenue Blacktop usage
- An API or portlet ticket must be requested:
- The ticket is used to identify DA Blacktop clients
15 Data Avenue Portlet
- Try it for yourself:
- Also available as a JSR-286 portlet (can be deployed over e.g. Liferay)
- Included in WS-PGRADE releases
- Two-panel layout
- Data upload and download
- Copy/move
- Favorites
- Progress monitoring
16 Data SZTAKI
17-20 Data Avenue Liferay portlet
21 Generic data transfer among WS-PGRADE workflow nodes
[Figure: a WS-PGRADE workflow whose jobs J1 to J5 run in DCI1 and DCI2 and exchange data through file storages FS1, FS2, FS3, FS5]
- The Data Avenue Blacktop services are available not only through the Data Avenue portlet but also to the nodes of a WS-PGRADE workflow
- J: job
- FS: file storage system, e.g. gsiftp, iRODS, SRM
22 Data Avenue in WS-PGRADE/gUSE
- Data sources and destinations of jobs can be selected
- gUSE automatically manages data transfers using Blacktop
- The actual transfer is delegated to the worker node wherever possible (two-phase upload and download), bypassing the Blacktop service if the middleware is capable of handling the protocol
- To be released before summer
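One reading of the bypass rule on this slide is a simple routing decision made per file: if the target middleware can speak the storage protocol itself, let it fetch the file directly; otherwise route the transfer through Blacktop. The sketch below illustrates only that decision. The function name, the return labels and the protocol sets are hypothetical and do not come from gUSE.

```python
from urllib.parse import urlparse


def plan_transfer(source_uri, middleware_protocols):
    """Decide how a job input should reach the worker node.

    middleware_protocols: set of URI schemes the target middleware
    can handle natively (an assumption supplied by the caller).
    Returns "direct" when the middleware can fetch the file itself,
    and "via-blacktop" when the transfer must go through the service.
    """
    scheme = urlparse(source_uri).scheme.lower()
    if scheme in middleware_protocols:
        return "direct"
    return "via-blacktop"


if __name__ == "__main__":
    grid_protocols = {"gsiftp", "srm"}
    print(plan_transfer("gsiftp://host.example/data/in.dat", grid_protocols))
    print(plan_transfer("s3://bucket/in.dat", grid_protocols))
```

Keeping the decision per file means a single workflow node can mix inputs that bypass the service with inputs that need it.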
23 Comparison with Globus Online
1. Globus Online is excellent inside a Globus grid
2. But it supports only the Globus storage protocols
3. It cannot be used inside a workflow
4. Data Avenue is a generalization of Globus Online
5. It enables access to many different types of storages, even in a workflow that runs through several kinds of DCIs
6. This technology enables the easy integration of Globus and cloud resources at workflow level
24 Flexibility for collaboration among community members
[Figure: two gUSE portals, each with its own gUSE WF repository, exchanging workflows through the SHIWA Repository (WF upload, WF download) and running on Cloud 1 (OpenNebula), Cloud 2 (Amazon), ..., Cloud n (OpenStack)]
25 Flexibility in using different workflow systems
[Figure: layered diagram. Gateways: Bio1, Bio2, ..., BioN. WF systems: WS-PGRADE, Taverna, Galaxy, Kepler. Infrastructures: EMI grids, Globus, cloud]
- Combining SCI-BUS and SHIWA technologies (supported by ER-Flow), users can access and use many WFs and many infrastructures in an interoperable way, no matter which is their home WF system
26 Flexibility of gateway types and user views
1. Generic purpose gateways for clouds (workflow view)
   - Core WS-PGRADE/gUSE (e.g. Greek NGI)
2. Generic purpose gateway for specific technologies (workflow view)
   - SHIWA gateway for workflow sharing and interoperation
3. Domain-specific science gateway instance
   - Autodock gateway (end-user view)
   - Swiss proteomics portal (customized GUI using ASM API)
   - VisIVO Mobile (use of Remote API)
27 Some examples of SCI-BUS domain-specific gateways
28 The DRIHM project's gateway
[Figure label: Other data sources]
29 gUSE-based gateways
- More than 100 deployments worldwide
- More than downloads from 75 countries on SourceForge
30 Conclusions
Join SCI-BUS as an associated member
Why select WS-PGRADE/gUSE and join the SCI-BUS community?
1. Robustness: a large number of gateways are already used in production
2. Sustainability: guaranteed by the SCI-BUS project and its sustainability and commercialization plan
3. Functionalities: rich functionality that keeps growing according to the needs of the SCI-BUS and SourceForge communities
4. How easy is it to adapt to the needs of a new user community? A large number of gateways have already been customized from gUSE/WS-PGRADE
5. You can influence the progress of WS-PGRADE/gUSE
31 Where to find further information?
- SCI-BUS web page:
- gUSE/WS-PGRADE:
- gUSE on SourceForge
