Grid Engine Training Jordi Blasco (jordi.blasco@xrqtc.org) 26-03-2012
Agenda
1. How it works?
2. History, current status, future, about the Grid Engine version of this training, documentation
3. Grid Engine internals: what it provides, features, key terms & concepts, tools
4.
5.
How it works?

A Batch Queue System (BQS) is a software application in charge of unattended background executions, commonly known for historical reasons as batch processing. (Source: Wikipedia)

What is a queue?
- A queue offers a set of resources for similar jobs.
- Queues usually have limits so that computational resources are used efficiently.
- Only a few BQS can control consumable resources, such as a limit on concurrent licenses.
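The idea of a consumable resource can be sketched with a toy counter. This is not Grid Engine code: the class name `ConsumablePool` and the license count of 2 are hypothetical, and the sketch only illustrates how a limit on concurrent licenses decides whether a job may start.

```python
class ConsumablePool:
    """Toy counter for a consumable resource such as concurrent licenses."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0

    def try_acquire(self, amount):
        """Reserve `amount` units if they are free; return True on success."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.in_use + amount > self.total:
            return False
        self.in_use += amount
        return True

    def release(self, amount):
        """Give units back when a job finishes."""
        if amount <= 0 or amount > self.in_use:
            raise ValueError("cannot release more than is in use")
        self.in_use -= amount


def demo():
    """Three jobs compete for two licenses (hypothetical numbers)."""
    licenses = ConsumablePool(total=2)
    results = []
    results.append(licenses.try_acquire(1))  # job A starts
    results.append(licenses.try_acquire(1))  # job B starts
    results.append(licenses.try_acquire(1))  # job C has to wait
    licenses.release(1)                      # job A finishes
    results.append(licenses.try_acquire(1))  # job C can start now
    return results


if __name__ == "__main__":
    print(demo())
```

A real batch system keeps this kind of counter per resource and per scope, and a job that cannot acquire what it asked for simply stays pending.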
How it works?
- Users submit their jobs with qsub, detailing the resources they need (memory, CPU time, disk, number of cores, licenses, ...).
- The manager registers the job.
- When all the resources are available, the manager sends the job to the execution nodes, following complex allocation rules (priority, urgency, etc.).
- Some BQS need a more complex scheduler layer, such as Maui or Moab Cluster Suite.
- Users can view the job status with qstat.
- Users can delete their jobs with qdel.
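The submit, monitor and delete cycle above can be sketched as a few command lines. The Python helpers below only build the argument lists and print them, so no cluster is needed to try them. The helper names, the script name `job.sh`, the user `alice`, the job ID 42 and the parallel environment name `smp` are all assumptions for illustration; parallel environment names in particular are defined by each site.

```python
import shlex


def build_qsub(script, name, runtime, memory, slots=1, pe="smp"):
    """Build a qsub argument list with runtime, memory and slot requests."""
    cmd = [
        "qsub",
        "-N", name,                # job name
        "-cwd",                    # run in the current working directory
        "-l", f"h_rt={runtime}",   # wall-clock limit, e.g. 01:00:00
        "-l", f"h_vmem={memory}",  # memory limit, e.g. 2G
    ]
    if slots > 1:
        # "smp" is an assumed parallel environment name; sites define their own.
        cmd += ["-pe", pe, str(slots)]
    cmd.append(script)
    return cmd


def build_qstat(user):
    """Build a qstat call listing the jobs of one user."""
    return ["qstat", "-u", user]


def build_qdel(job_id):
    """Build a qdel call deleting one job by its numeric ID."""
    return ["qdel", str(job_id)]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(build_qsub("job.sh", "test", "01:00:00", "2G", slots=4)))
    print(shlex.join(build_qstat("alice")))
    print(shlex.join(build_qdel(42)))
```

On a submit host, the printed lines could be pasted into a shell, or the lists passed to `subprocess.run`, once the resource names and the parallel environment match the local configuration.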
History
- Previously known as CODINE (COmputing in DIstributed Networked Environments).
- In 2000, Sun acquired Gridware, Inc.
- In 2001, Sun made the source code available.
- In 2010, Oracle acquired Sun.
- In December 2010, Oracle announced that Grid Engine would no longer be freely available as an open-source product.
- In response, the Grid Engine community started to develop forked versions of Grid Engine.
Grid Engine Forks

Open Source and Free Grid Engine Forks
- OGS: Open Grid Scheduler
- SGE: Son of Grid Engine

Commercially Supported and Licensed Grid Engine Forks
- UGE: Univa Grid Engine
- OGS: Scalable Logic (Open Grid Scheduler)
- OGE: Oracle Grid Engine
Grid Engine Forks

Comparing Grid Engine at 1Q'12

                  UGE         OGE         SGE          OGS
Base Core         6.2U5       6.2U7       6.2U5 (a)    6.2U5
Current Version   8.0.1       6.2U5p2     8.0.0d       GE-2011.11
License           Commercial  Commercial  SISSL        SISSL
Support level     Enterprise  Enterprise  Community    Community* (b)
Activity          high        medium      medium       high
Source Access     yes*        yes*        yes          yes
Public Roadmap    yes         no          no           no

(a) Since 2011-09-29, SGE uses the Univa Grid Engine 8.0.0 source code as its core.
(b) In November 2011, Scalable Logic announced its intent to provide commercial support and consulting.

We suggest reading the article "Which Grid Engine" by Chris Dagdigian, consultant at BioTeam:
http://www.bio-itworld.com/2012/02/15/which-grid-engine.html
What now?

Which flavor of Grid Engine will we use for this training?
We will use Open Grid Scheduler to develop this training, because:
- It's 100% Open Source.
- It keeps the same license policy as Sun (SISSL).
- If you need help, commercial support is available.
- The enterprise and the community versions are the same.
- The developers are active members of the mailing list.
- And... we have to choose only one :-)
Docs & Info
- http://arc.liv.ac.uk/sge/
- http://gridscheduler.sourceforge.net/documentation.html
- http://bioteam.net/2009/09/sge-training-slides/
- http://www.hpckp.org (coming soon)
What does Grid Engine provide?
- The BQS and the scheduler come in the same package, so no extra software is needed for scheduling
- Detailed job accounting
- Fine-grained allocation of computing resources to users
- Jobs can be suspended, resumed and migrated
- Checkpointing integration
- Parallel environment integration
- Job arrays
- Awesome policies to share resources
- APIs that make it easy to develop third-party software
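Job arrays from the list above are typically used to run the same script over many inputs. A minimal sketch, assuming a job submitted with `qsub -t 1-3`: each task reads the `SGE_TASK_ID` environment variable and works on its own share of the inputs. The function names and the round-robin split are illustrative choices, not part of Grid Engine.

```python
import os


def current_task_id(environ=os.environ):
    """Read SGE_TASK_ID; treat a missing or non-numeric value as task 1."""
    raw = environ.get("SGE_TASK_ID", "1")
    return int(raw) if raw.isdigit() else 1


def task_slice(items, task_id, n_tasks):
    """Return the items handled by 1-based task `task_id` out of `n_tasks`."""
    if not 1 <= task_id <= n_tasks:
        raise ValueError("task_id must be between 1 and n_tasks")
    # Round-robin split: task 1 takes items 0, n, 2n, ...; task 2 takes 1, n+1, ...
    return items[task_id - 1::n_tasks]


if __name__ == "__main__":
    # Hypothetical input names; a real job would list files or parameters here.
    inputs = [f"input_{i}.dat" for i in range(10)]
    n_tasks = 3  # must match the range given to qsub -t
    for name in task_slice(inputs, current_task_id(), n_tasks):
        print("processing", name)
```

Run outside the batch system, the script behaves as task 1, which makes it easy to test before submitting the whole array.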
Features

Most awesome features
- The BQS and the scheduler are integrated
- Job preemption
- Portable Hardware Locality library (hwloc) support
- GPU support
- ARM Linux port available
- Linux kernel 3.0 support
Key Terms & Concepts

Key Terms
- Cluster
- Execution host
- Master (Shadow)
- Submit host
- Admin host
- Daemons (sge_qmaster, sge_execd)
- Queue instance
- Slots
- Jobs
Key Terms & Concepts

Key Concepts
- Users describe the resources they need, and GE looks for the best host and queue instance.
- You don't have to send jobs to a particular queue.
- Each node can have one or more queue instances.
Source: www.bioteam.net
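The key concept that users describe resources and the system picks a queue instance can be illustrated with a toy matcher. This is a sketch under strong assumptions: the `QueueInstance` fields, the queue and host names, and the "most free slots wins" rule are invented for illustration, and the real Grid Engine scheduler weighs more criteria than this.

```python
from dataclasses import dataclass


@dataclass
class QueueInstance:
    """A queue as it exists on one host, written queue@host."""
    queue: str
    host: str
    free_slots: int
    free_mem_gb: float

    @property
    def name(self):
        return f"{self.queue}@{self.host}"


def pick_instance(instances, slots, mem_gb):
    """Return the fitting instance with the most free slots, or None."""
    fitting = [
        qi for qi in instances
        if qi.free_slots >= slots and qi.free_mem_gb >= mem_gb
    ]
    if not fitting:
        return None  # in a real system the job would stay pending
    return max(fitting, key=lambda qi: qi.free_slots)


if __name__ == "__main__":
    # One node carries two queue instances, another node carries one.
    cluster = [
        QueueInstance("short.q", "node01", free_slots=2, free_mem_gb=8),
        QueueInstance("long.q", "node01", free_slots=4, free_mem_gb=8),
        QueueInstance("short.q", "node02", free_slots=8, free_mem_gb=32),
    ]
    chosen = pick_instance(cluster, slots=4, mem_gb=16)
    print(chosen.name if chosen else "job stays pending")
```

Note that the user never names a queue here: the request is only slots and memory, and the matcher decides where the job lands.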
Setting up Grid Engine

GE is set up with qmon (GUI) and qconf (CLI).
Core user commands
- qstat & qhost: tools for monitoring.
- qsub: job submission tool.

Core admin commands
- qconf: admin tool for adding, changing and configuring the Grid Engine system.
- qstat & qhost: tools for monitoring.
- qmod: modify or disable an existing queue, clear error states, etc.
- qalter: change the attributes of a pending job.
Missing features in the main project code
- Accounting web interface (S-GAE - RDLab)
- Monitoring web interface (PHPQstat - XRQTC)
- GPU integration (Jose Alcantara scripts - XRQTC)
- Power control (CLUES - GRyCAP)
- Efficiency control (new XRQTC/HPCKP contribution)
- Dynamic quota (new XRQTC/HPCKP contribution)
References
- Sun Grid Engine Installation Guide
- Sun Grid Engine Administrator Guide
- Sun Grid Engine User Guide
- http://bioteam.net
- http://www.univa.com
- https://arc.liv.ac.uk/trac/sge
- http://gridengine.org
- http://gridscheduler.sourceforge.net
- http://gridengine.info
Let's go to Hands-On 1 (Install)