Sun Grid Engine, a new scheduler for EGEE

Sun Grid Engine, a new scheduler for EGEE
G. Borges, M. David, J. Gomes, J. Lopez, P. Rey, A. Simon, C. Fernandez, D. Kant, K. M. Sephton
IBERGRID Conference, Santiago de Compostela, Spain, 14-16 May 2007

Outline
- EGEE back to basics: the EGEE project, the infrastructure and the gLite middleware
- Local Resource Management Systems (LRMS) in EGEE: LSF, Torque/Maui, Condor and Sun Grid Engine
- Sun Grid Engine gLite integration (for the lcg-CE): JobManager, accounting, information plug-in, YAIM integration
- Conclusions and future work
goncalo@lip.pt, IBERGRID Conference, Santiago de Compostela

Enabling Grids for E-sciencE
- The fundamental goal of the EGEE project is the deployment of a Grid infrastructure for all fields of science.
- In the EGEE infrastructure, resources are glued together by a set of agreed services provided and supported by the EGEE community.
- EGEE proposes gLite as the appropriate middleware to support the necessary grid services for multi-science applications.
- EGEE services are divided into two different sets:
  - Core services: installed only at some RCs but used by all users (Resource Broker, Top-BDII, File Catalogues, VOMS servers, ...)
  - Local services: deployed and maintained by each participating site (Computing Element, Storage Element, MonBox, User Interface, ...)

The Computing Element
- The CE is THE service representing the computing resources.
- It may be used by a generic client: an end user who interacts with it directly, or the Workload Manager (RB), which submits a given job to it after going through the whole matchmaking process.
- Its responsibilities:
  - Authentication and authorization
  - Interaction with the Local Resource Management System for job management (job submission, job control, job cancellation, ...)
  - Providing information describing itself; this information is published in the Information Service and used by the matchmaking engine, which matches available resources to queued jobs

Computing Element and the LRMS
- The CE (Grid Gate node) is the entry point of a Grid job into the LRMS:
  - Gatekeeper service for authentication, authorization and Globus job submission
  - GRIS service (InfoService) for publishing local resource usage and characteristics
- gLite must implement proper tools (virtual layers) to:
  - Use LRMS-specific commands for job management (translate RSL requests; feed the L&B service)
  - Query resource usage (feed the CE GRIS service)
  - Process the accounting information generated by the LRMS and feed it to the central accounting registry

LRMS
- The Local Resource Management System (LRMS) is the cluster component which:
  - Manages the execution of user applications
  - Allows cluster resource usage to be optimized
  - Enables a broad range of usage policies to be fulfilled
  - Eases cluster administration tasks
- Each EGEE cluster admin should be allowed to choose the LRMS he thinks is best for his needs; most of the time, EGEE clusters are shared with local farms.
- However, only Torque/Maui and LSF are fully supported in EGEE. gLite should be able to cope with a much wider set of LRMSs:
  - It eases the integration of clusters already in operation
  - Better interoperability
  - The wider the gLite offer, the more appealing it becomes

Basic LRMS Comparison

LSF
  Pros: flexible job scheduling policies; advanced resource management; checkpointing and job migration; load balancing; good graphical interfaces to monitor cluster functionality; integrable with Grids.
  Cons: expensive commercial product; not suitable for small computing clusters.

Torque/Maui
  Pros: good integration of parallel libraries; able to start parallel jobs using LRMS services; full control of parallel processes; flexible job scheduling policies (fair share policies, backfilling, resource reservations).
  Cons: configuration done through the command line; a non user friendly GUI; software development uncertain; bad documentation.

Condor
  Pros: CPU harvesting; special ClassAds language; dynamic checkpointing and migration; mechanisms for a Globus interface; coherent with glite MD.
  Cons: not optimal for parallel applications; checkpointing only works for batch jobs; complex configuration.

Sun Grid Engine
- SGE is an open source job management system supported by Sun.
- Queues are located on server nodes and have attributes which characterize the properties of the different servers.
- A user may request certain execution features at submission time: memory, execution speed, available software licences, etc.
- Submitted jobs wait in a holding area where their requirements/priorities are determined; a job only runs if there are queues (servers) matching its requests.
- Some important features:
  - Supports checkpointing and migration, although some additional programming may be needed
  - Tight integration of parallel libraries, supported through an SGE-specific version of rsh called qrsh
  - Flexible scheduling policies; implements calendars and fluctuating resources
  - Intuitive graphical interface, used by users to manage jobs and by admins to configure and monitor their cluster
  - Good documentation, still a work in progress; observed flaws may be addressed by dedicated teams, and support is assured by dedicated staff
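As a sketch of how such submission-time requests look in practice, the job script below embeds resource requests as SGE `#$` directives. The job name and the `h_vmem`/`h_rt` values are illustrative, not taken from the presentation; since `#$` lines are plain comments to the shell, the script also runs standalone.

```shell
#!/bin/sh
# Hypothetical SGE job script: the #$ lines are directives read by qsub
# at submission time, but ordinary comments to the shell itself.
#$ -N sge_demo                  # job name (example value)
#$ -l h_vmem=1G,h_rt=00:10:00   # request 1 GB of memory and 10 min of runtime
#$ -cwd                         # run in the submission directory
echo "job running on host: $(hostname)"
```

Handed to SGE with `qsub job.sh`, the job waits in the holding area until a queue satisfying the `-l` requests becomes available.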

LCGSGE JobManager (1)
- The JobManager (JM) is the core service of the Globus GRAM service:
  - It submits jobs to SGE, based on Globus requests, through a jobwrapper script
  - It is the intermediary used to query the status of jobs and to cancel them
- The SGE client command tools (qstat, qsub, qdel) have to be available on the CE, even if the Qmaster is installed on another machine.
- Shared home directories are not required, but home dirs must have the same path on the CE and the WNs.
- The SGE JM is based on the LCGPBS JM and requires XML::Simple.pm.
[Diagram: on the Grid Gate node, the gatekeeper receives Globus RSL requests and hands them to the lcgsge JobManager (configured via lcgsge.conf), which drives the SGE Qmaster with SGE commands (qstat, qsub, ...) and feeds the L&B service.]

LCGSGE JobManager (2)
The SGE JM re-implements the following functions:
- submit: checks the Globus RSL arguments, returning a Globus error if the arguments are not valid or if there are no resources
- submit_to_batch_system: submits jobs to SGE, after building the jobwrapper script, getting the necessary information from the RSL variables
- poll: links the present status of jobs running in SGE to the appropriate Globus message
- poll_batch_system: reports the status of running jobs by parsing the SGE qstat output
- cancel_in_batch_system: cancels jobs running in SGE using qdel
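The presentation does not show the poll code; the snippet below is only a minimal sketch of the kind of parsing `poll_batch_system` performs, turning SGE state letters from a fabricated `qstat` listing into Globus-style job states. The state mapping used here (qw to PENDING, r/t to ACTIVE, any E state to FAILED) is an assumption, not the actual lcgsge implementation.

```shell
#!/bin/sh
# Sketch: translate SGE qstat state codes into Globus-style job states.
# The qstat output below is a fabricated sample, not real cluster data.
qstat_sample='101 0.55 job_a user1 r   05/14/2007 10:00:01 all.q@wn01
102 0.50 job_b user2 qw  05/14/2007 10:01:10
103 0.50 job_c user1 Eqw 05/14/2007 10:02:30'

echo "$qstat_sample" | awk '{
    state = $5
    if (state == "r" || state == "t")  globus = "ACTIVE"
    else if (state == "qw")            globus = "PENDING"
    else if (state ~ /E/)              globus = "FAILED"
    else                               globus = "UNKNOWN"
    printf "job %s: sge=%s globus=%s\n", $1, state, globus
}'
```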

SGE Information Plugin (1)
- The solution implemented for SGE does not currently use the generic EGEE lcg-info scripts.
- info-dynamic-sge is a standalone information plugin script that examines the SGE queuing system state.
- The information expected to be reported is based on queues, but SGE does not assign a job to a queue until execution time, so ``virtual queues'' are used.
- The info reporter reads:
  - A copy of a static LDIF file with details of all virtual queues
  - Config files (cluster.state, info-reporter.conf, vqueues.conf) specifying how virtual queues map into a list of resource requirements
[Diagram: lcg-info-dynamic-sge on the Grid Gate node queries the SGE Qmaster with SGE commands (qstat, qsub, ...) and feeds the InfoSystem / site BDII via LDAP on port 2135.]

SGE Information Plugin (2)
- The dynamic information is gathered with a single call to SGE's qstat.
- The system determines which virtual queues each job should be associated with.
- For each virtual queue it counts:
  - The number of job slots
  - The number of pending/running jobs
  - The total amount of runtime left on all of the jobs, assuming that they will run for their maximum duration
- The state of the batch queues can change quite fast, so there is an option to capture a copy of all information provider input data, which can be replayed to the information provider.
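The counting step above can be sketched as follows. This is not the real plugin: it tallies running jobs per queue instance and leaves pending jobs unassigned from a fabricated `qstat` listing, whereas the actual plugin maps jobs onto virtual queues using the vqueues.conf requirements.

```shell
#!/bin/sh
# Sketch: tally running and pending jobs from a fabricated qstat listing.
# Pending (qw) jobs have no queue instance yet; the real plugin assigns
# them to virtual queues based on their resource requirements.
sample='101 0.55 a u1 r  05/14/2007 10:00 all.q@wn01
102 0.50 b u2 r  05/14/2007 10:01 all.q@wn02
103 0.50 c u1 qw 05/14/2007 10:02'

echo "$sample" | awk '
    $5 == "r"  { running[$8]++ }
    $5 == "qw" { pending++ }
    END {
        for (q in running) printf "%s: running=%d\n", q, running[q]
        printf "pending (unassigned): %d\n", pending + 0
    }'
```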

SGE Accounting Plugin
- The APEL SGE plug-in is a log processing application used to produce CPU job accounting records; it interprets gatekeeper and batch system logs.
- It requires the JM to add ``gridinfo'' records to the log file: standard Globus JMs do not log them, but LCG JMs do.
- apel-sge-log-parser parses the SGE accounting log file; this information, joined with the gridinfo mappings from the JobManager, forms the accounting records.
- Records are published using R-GMA to an accounting database.
[Diagram: apel-sge-log-parser on the Grid Gate node reads the gatekeeper log file and the SGE accounting file and publishes to the MON box via R-GMA.]
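The SGE accounting log is a colon-separated file with one line per finished job. As an illustration of the kind of extraction the log parser performs (not its actual code), the sketch below pulls the owner, job number, exit status and wallclock from a fabricated accounting line, assuming the field layout documented in the SGE accounting(5) man page.

```shell
#!/bin/sh
# Fabricated accounting line; fields follow the accounting(5) layout:
# qname:hostname:group:owner:job_name:job_number:account:priority:
# qsub_time:start_time:end_time:failed:exit_status:ru_wallclock:...
line='all.q:wn01:users:user1:job_a:101:sge:0:1179130000:1179130100:1179133700:0:0:3600'

echo "$line" | awk -F: '{
    printf "owner=%s job=%s wallclock=%ss exit=%s\n", $4, $6, $14, $13
}'
# -> owner=user1 job=101 wallclock=3600s exit=0
```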

SGE YAIM Integration (1)
- YAIM (Yet Another Installation Method):
  - Separates the installation process from the configuration one
  - Based on a library of bash functions called by a configuration script
  - The functions needed by each node type are defined in the node-info.def file
  - The grid site topology is totally encapsulated in the site-info.def file
- Development of two integration rpms:
  - lcgce-yaimtosge-0.0.0-2.i386.rpm
  - glitewn-yaimtosge-0.0.0-2.i386.rpm
- Requirements:
  - SGE installed (we presently made SGE rpms to install it)
  - lcg-ce and glite-wn
  - glite-yaim (>= 3.0.0-34), perl-xml-simple (>= 2.14-2.2), openmotif (>= 2.2.3-5) and xorg-x11-xauth (>= 6.8.2-1)

SGE YAIM Integration (2)
- The $SGE_ROOT software dir must be set to /usr/local/sge/pro; it may become changeable by the site admin in a future release.
- The SGE Qmaster can only be installed on the CE; it may be installable on another machine in a future release.
- Three new variables must be set in site-info.def: SGE_QMASTER, DEFAULT_DOMAIN, ADMIN_EMAIL.
- The integration rpms:
  - Change the node-info.def file to include two new node types, CE_sge and WN_sge
  - Run the same functions as the CE and WN nodes, plus config_sge_server and config_sge_client at the end
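The slides only name the three new variables; a site-info.def fragment might look like the following. All values are placeholder examples, and since site-info.def is plain bash, YAIM simply sources it.

```shell
# Hypothetical site-info.def additions for the SGE integration;
# the hostnames, domain and address below are placeholders.
SGE_QMASTER=ce01.example.org        # host running the SGE Qmaster (the CE)
DEFAULT_DOMAIN=example.org          # default DNS domain of the site
ADMIN_EMAIL=grid-admin@example.org  # contact address for SGE notifications
```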

SGE YAIM Integration (3)
- config_sge_server uses an auxiliary perl script (configure_sge_server.pm) which:
  - Builds all the default SGE directory structure
  - Configures the environment setting files; sets the global SGE configuration file, the SGE scheduler configuration file and the SGE complex attributes
  - Defines one cluster queue for each VO
  - Deploys the lcgsge JM and builds its configuration files
  - Deploys the SGE information plug-in and builds its configuration files
  - Accounting is not properly integrated yet, but will be soon
- config_sge_client uses an auxiliary perl script (configure_sge_client.pm) which builds all the default SGE directory structure on the client.

SGE YAIM Integration (4)
/opt/glite/yaim/bin/yaim -c -s site-info.def -n CE_sge

Conclusions and Future Work
- SGE is working on an lcg-CE, although additional work is required.
- YAIM SGE integration: make it more flexible, allowing site admins to dynamically set a broader range of options; separate the Qmaster from the CE; fully integrate the SGE accounting.
- The SGE information provider needs to improve its flexibility and take overlapping cluster queue / virtual queue definitions into account.
- Work has started on integrating support for BLAHP, running on the glite-CE:
  - It will be used within the glite-CE and CREAM to interface with the LRMS
  - It is expected to share the configuration files and the concept of virtual queues with the information provider
  - Other local middleware elements (GIIS, YAIM) remain basically unchanged for this glite-CE flavour
- GridICE sensors for SGE are still missing.