Resource Management on Computational Grids

Similar documents

A Taxonomy and Survey of Grid Resource Planning and Reservation Systems for Grid Enabled Analysis Environment

A Survey Study on Monitoring Service for Grid

Concepts and Architecture of the Grid. Summary of Grid 2, Chapter 4

MapCenter: An Open Grid Status Visualization Tool

Grid Scheduling Dictionary of Terms and Keywords

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

Web Service Based Data Management for Grid Applications

DiPerF: automated DIstributed PERformance testing Framework

Collaborative & Integrated Network & Systems Management: Management Using Grid Technologies

G-Monitor: Gridbus web portal for monitoring and steering application execution on global grids

KNOWLEDGE GRID An Architecture for Distributed Knowledge Discovery

Grid Technology and Information Management for Command and Control

Service Oriented Distributed Manager for Grid System

2. Create (if required) 3. Register. 4.Get policy files for policy enforced by the container or middleware eg: Gridmap file

An approach to grid scheduling by using Condor-G Matchmaking mechanism

CSF4:A WSRF Compliant Meta-Scheduler

Abstract. 1. Introduction. Ohio State University Columbus, OH

How To Monitor A Grid System

Cluster, Grid, Cloud Concepts

Monitoring Clusters and Grids

Sun Grid Engine, a new scheduler for EGEE

Grid Computing Vs. Cloud Computing

Grid Scheduling Architectures with Globus GridWay and Sun Grid Engine

Gridscape: A Tool for the Creation of Interactive and Dynamic Grid Testbed Web Portals

IBM Solutions Grid for Business Partners Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand

GridSolve: : A Seamless Bridge Between the Standard Programming Interfaces and Remote Resources

Use of Agent-Based Service Discovery for Resource Management in Metacomputing Environment

HPC and Grid Concepts

Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Integration of the OCM-G Monitoring System into the MonALISA Infrastructure

Compute Power Market: Towards a Market-Oriented Grid

GridFTP: A Data Transfer Protocol for the Grid

Distributed Systems and Recent Innovations: Challenges and Benefits

Grid Computing With FreeBSD

A P2P SERVICE DISCOVERY STRATEGY BASED ON CONTENT

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

THE CCLRC DATA PORTAL

X.500 and LDAP Page 1 of 8

A Service Platform for On-Line Games

Classic Grid Architecture

Roberto Barbera. Centralized bookkeeping and monitoring in ALICE

Xweb: A Framework for Application Network Deployment in a Programmable Internet Service Infrastructure

An Online Credential Repository for the Grid: MyProxy

A System for Monitoring and Management of Computational Grids

Multilingual Interface for Grid Market Directory Services: An Experience with Supporting Tamil

e-science Technologies in Synchrotron Radiation Beamline - Remote Access and Automation (A Case Study for High Throughput Protein Crystallography)

LSKA 2010 Survey Report Job Scheduler

Condor-G: A Computation Management Agent for Multi-Institutional Grids

BSC vision on Big Data and extreme scale computing

GSiB: PSE Infrastructure for Dynamic Service-oriented Grid Applications

GAMoSe: An Accurate Monitoring Service For Grid Applications

Introduction. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

CLEVER: a CLoud-Enabled Virtual EnviRonment

QoS Adaptation in Service-Oriented Grids

Grid Security : Authentication and Authorization

Resource Cost Optimization for Dynamic Load Balancing on Web Server System

Transcription:

Univeristà Ca Foscari, Venezia http://www.dsi.unive.it Resource Management on Computational Grids Paolo Palmerini Dottorato di ricerca di Informatica (anno I, ciclo II) email: palmeri@dsi.unive.it 1/29

Outline Introduction to Grid Technology What is a Grid Tools and projects The Globus toolkit Box Globus services Resource Information Infrastructure The globus MDS GRIS & GIIS A Grid Resource Broker State of the Art Architecture of a broker 2/29

General Motivation HPC applications are becoming complex and multidisciplinar High Energy Physics Aircraft design Resources are inherently distributed Need for new models and approaches to HPC, distributed computing, resource management. 3/29

What is a Grid? I. Foster and C. Kasselman, The Grid: blueprint for a future infrastructure, Morgan Kaufman, 1999 Starting from the analogy with electric power grid Gathering diverse and large scale distributed resources in a unique and coherent view Based on the concept of Virtual Organizations 4/29

Grid Architecture Fabric layer Computational resources Storage resources Network resources Code repositories Catalogs Connectivity layer Signle sign on Delegation Integration with local services 5/29

Grid Architecture (cont d) Resource layer Information Protocols Management protocols Collective layer Directory Service Scheduling, co-allocation Monitoring and diagnosing Grid-enabled programming systems Workload management systems Community authorization servers Commodity accounting 6/29

Grid Middleware Globus http://www.globus.org Legion 7/29 http://www.legion.org Condor http://www.cs.wisc.edu/condor Harness http://www.epm.ornl.gov/harness

The Globus Toolkit A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications Offer a modular bag of technologies Enable incremental development of grid-enabled tools and applications Implement standard Grid protocols and APIs Make available under liberal open source license 8/29

Basic Globus Services Security Communication Fault Detection Information Infrastructure Resource Management Portability Data Management 9/29

Globus Layered Architecture 10/29

Globus Resource Management 11/29

Resource Specification Language Example: 3 stages pipe + ( &(resourcemanagercontact="prosecco.cnuce.cnr.it") (count= 1) (label="subjob 0") (environment=(globus_duroc_subjob_index 0)) (arguments= "pipe.conf" "128" "128" "20" "10") (directory="/home/prosecco/palmeri/asi-pqe/mpi/src") (executable="/home/prosecco/palmeri/asi-pqe/mpi/src/pipe-g") ) ( &(resourcemanagercontact="barbera.cnuce.cnr.it") (count= 4) (label="subjob 1") (environment=(globus_duroc_subjob_index 1)) (arguments= "pipe.conf" "128" "128" "20" "10") (directory="/home/prosecco/palmeri/asi-pqe/mpi/src") (executable="/home/prosecco/palmeri/asi-pqe/mpi/src/pipe-g") ) ( &(resourcemanagercontact="prosecco.cnuce.cnr.it") (count= 1) (label="subjob 5") (environment=(globus_duroc_subjob_index 2)) (arguments= "pipe.conf" "128" "128" "20" "10") (directory="/home/prosecco/palmeri/asi-pqe/mpi/src") (executable="/home/prosecco/palmeri/asi-pqe/mpi/src/pipe-g") ) 12/29

First problem: which RSL level? ground RSL is too complex: ground RSL ( &(resourcemanagercontact="barbera") (count= 4) (label="subjob 1") (environment=(globus_duroc_subjob_index 1)) (arguments= "pipe.conf" "128" "10") (directory=".") (executable="./pipe-g") ) High level request run pipe-g on 6 nodes with 128 MBytes memory each We need an entity that translate high level requests to ground RSL. 13/29

Metacomputing Diretory Services 14/29 Basic elements are Virtual Organizations (VO) Scalability achieved through a distributed model Soft-state registration mechanism

MDS Implementation Grid Information Service (GRIS) Provides resource description Modular content gateway Grid Index Information Service (GIIS) Provides aggregate directory Hierarchical groups of resources Lightweight Dir. Access Protocol (LDAP) Standard with many client implementations Used for resource inquiry and registration 15/29

Lightway Directory Access Protocol Structural information Resource hierarchy maps to objects Named positions in LDAP DIT 16/29 Merged information Some parents join child data Simplifies common query patterns Auxiliary information Uniform representation of leaf/ parent data Uses LDAP auxiliary objectclasses

Second Problem: resource discovery Which resources do we need? Where do we find them? How do we use them? 17/29

Second Problem: resource discovery Which resources do we need? Where do we find them? How do we use them? 17/29 The entity should: Understands our needs Looks for the best resources available Executes the application on them

Nimrod/G: a Grid Resource Broker A specialized system for the execution of parametric jobs on distributed resources. Uses a simple language to express the experiment and provides machinery that automates the task of formulating, running and monitoring the jobs, and collecting the results. 18/29 Uses Globus Services Four phases scheduling algorithm Discovery. Based on MDS Allocation. Evaluate a cost foreach job. Monitoring Refinement

Limitations of Nimrod/G Narrow application domain. Parametric problems are only a part of Grid Applications. No communications among components. Network influence is not properly considered. Cost model assumes constant job behavior. In many cases the cost of a job cannot be known in advance (e.g. datamining). 19/29

Sun Grid Engine Software Revenue from grid computing solutions is expected to grow from $1.5 billion (2001) $4 billion 2003; A set of tools for managing distributed resources; Aimed at optimizing resource usage within an organization; Based on a centralized queue where jobs are submitted and scheduled to the most adequate resource; Jobs are described in terms of resource requirements, (hardware and software licenses) and priorities; It s an open source project! http://www.sun.com/gridware http://www.gridengine.sunsource.net 20/29

Resource Management of Structured Applications We want to manage the configuration and execution of structured parallel applications on computational grids. Describe the application and its requirement Discover the resources that match the requirements Schedule the components onto the discovered nodes (co-allocation) Important issues heterogeneous architecure executable staging maintain up-to-date information on grid status 21/29

Describing Structured Applications The application can be described by means of a configuration file, built in a semi-automatic way. static info: application structure, component names and implementations dynamic info: parallelism degree, mapping of components onto machines. Coded in XML, but Conceptually equivalent to ground RSL. <global_config> <structure> </structure> <granularity> </granularity> <mapping> </mapping> </global_config> 22/29

Configuration file Components of the XML file structure Specify the general structure of the application, by means of a generic graph. Component names and mutual dataflow dependencies. Also the name and location of the modules (i.e. files) that implements the components. granularity The parallelism degree and the replication of each component. For example the number of workers in a farm, or the number of processes for a parallel module. mapping The physical machines the components will run on. 23/29

Resource Discovery Varius approaches are possible: Sol. 1. Query a statically known list of other GRIS or GIIS. Many queries during configuration. Sol. 2. Build an ad-hoc root GRIS that covers a statically known list of other GRIS or GIIS. Queries are resolved locally. Use the globus mechanism for updating information. Sol. 3. Search engine for grid resources Off-line spidering of all the GRIS and indexing them On-line query engine that finds resources that matche user queries. 24/29

Scheduling Once the application is configured (in its granularity and mapping), we have the low level RSL specification and the application could be started. 25/29

Scheduling Once the application is configured (in its granularity and mapping), we have the low level RSL specification and the application could be started. But the grid is a highly heterogeneous environment, whose status rapidly changes during time. 25/29

Scheduling Once the application is configured (in its granularity and mapping), we have the low level RSL specification and the application could be started. But the grid is a highly heterogeneous environment, whose status rapidly changes during time. We can introduce a centralized scheduler that is in charge of managing the queue of requests for the computational grid. 25/29

Scheduling The Grid Scheduler should: Co-schedule application s component, when possible. Optimize resource utilization (i.e. overlapping communication and computation). Apply application specific (hence more accurate) cost models. Implement trading and negotiation for resource contention. 26/29

A Grid Resource Broker for Structured Parallel Applications 27/29

References [1] I. Foster and C. Kasselman, The Grid: blueprint for a future infrastructure, Morgan Kaufman, 1999 [2] I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International J. Supercomputer Applications, 15(3), 2001. [3] I. Foster, C. Kesselman, Globus: A Metacomputing Infrastructure Toolkit,. Intl J. Supercomputer Applications, 11(2):115-128, 1997. [4] K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, Grid Information Services for Distributed Resource Sharing, Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE Press, August 2001. [5] K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, S. Tuecke, A Resource Management Architecture for Metacomputing Systems, Proc. IPPS/SPDP 98 Workshop on Job Scheduling Strategies for Parallel Processing, po. 62-82, 1998. [6] LDAP, http://www.openldap.org [7] Abramson, D., Giddy, J., Foster, I., and Kotler, L., High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?, in 2000 International Parallel and Distributed Processing Symposium Cancun, Mexico. 28/29

[8] R. Baraglia, R. Ferrini, D. Laforenza, S. Orlando, P. Palmerini, R. Perego, Sistemi di Calcolo ad Alte Prestazioni per Applicazioni di Osservazio ne della Terra, Deliverable 1,2,3,4,5, WP5, ASI-PQE [9] S. Orlando, P. Palmerini, R. Perego, F. Silvestri, Knowledge Grid scheduling, (submitted work) 29/29