Virtualization of a Cluster Batch System
Christian Baun, Volker Büge, Benjamin Klein, Jens Mielke, Oliver Oberst and Armin Scheurer

Cluster Batch System
- The batch system accepts computational jobs from a central access point (head node).
- It distributes individual jobs to one or more nodes of the cluster, optionally following a scheduling scheme provided by a scheduler.

Benjamin Klein, Institut für Experimentelle Kernphysik, 09.09.08

Partitioning of a High Performance Cluster
- Historically, different user groups have had their own independent clusters.
- One common cluster for different groups reduces costs:
  - infrastructure
  - maintenance
  - discount prices for hardware
- Different user groups need different software environments.
- Static partitioning of the cluster leads to workload peaks and idle times.
- Desirable: dynamic partitioning
  - load balancing
  - reduced idle times of the hardware
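
The difference between static and dynamic partitioning can be illustrated with a small calculation. The following is a minimal sketch, not taken from the talk: the function names and the example numbers (two partitions of 8 cores each, demands of 12 and 2 cores) are assumptions chosen for illustration.

```python
from typing import Sequence


def idle_cores_static(partition_sizes: Sequence[int], demands: Sequence[int]) -> int:
    """Cores left idle when each group may only use its own fixed partition."""
    return sum(max(size - demand, 0) for size, demand in zip(partition_sizes, demands))


def waiting_cores_static(partition_sizes: Sequence[int], demands: Sequence[int]) -> int:
    """Requested cores that cannot be served inside the group's own partition."""
    return sum(max(demand - size, 0) for size, demand in zip(partition_sizes, demands))


def idle_cores_dynamic(total_cores: int, demands: Sequence[int]) -> int:
    """Cores left idle when all groups draw from one shared pool."""
    return max(total_cores - sum(demands), 0)


# Hypothetical example: two groups, 8 cores each, with unequal demand.
partitions = [8, 8]
demands = [12, 2]
print(idle_cores_static(partitions, demands))     # idle cores with fixed partitions
print(waiting_cores_static(partitions, demands))  # demand that has to wait
print(idle_cores_dynamic(sum(partitions), demands))
```

With fixed partitions, one group waits for cores while the other group's cores sit idle; with a shared pool the same demand fits and fewer cores remain unused.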

New Tier-3 Cluster UNI-KARLSRUHE
- Computer cluster shared between 8 different institutes of the University of Karlsruhe
- 200 worker nodes, each with 2x quad-core Xeon 2.66 GHz (1600 cores in total)
- 16 GB RAM per node
- 6 portal machines with 32 GB RAM
- 350 TB storage
- The Institut für Experimentelle Kernphysik holds a share of about 1/3 of the cluster.

Different User Groups
- 7 institutes: mainly local, fully parallelized batch jobs (MPI)
- Institut für Experimentelle Kernphysik: local jobs + grid jobs
- The cluster is maintained by the Computing Centre of the University.
  - Operating system: SUSE Enterprise Linux 10
- The gLite middleware strongly relies on Scientific Linux CERN Edition.
- The software framework of the LHC experiments is also developed under Scientific Linux.

Solution: Virtualization
- Prepare virtual machines with Scientific Linux that host the different grid services.
- Execute grid jobs inside Scientific Linux virtual machines.
- The software area is isolated from local users (security).
- Local users can use SUSE Linux.

A Wrap Job for the Preparation of Virtual Machines
- Grid jobs: a wrap job prepares the virtual machine.
  - The batch system selects one or more nodes.
  - The batch system does not execute the actual job but a wrap job.
  - The wrap-job script prepares a virtual machine and executes the actual computational job inside it.
  - After the execution, the virtual machine is disposed of.
- Local jobs are executed natively on the operating system of the physical host.
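
The wrap-job flow described above can be sketched as follows. This is only an illustration under stated assumptions: `SimulatedVM`, `wrap_job`, the image name `scientific-linux` and the job names are hypothetical, and the virtual machine is simulated by a class that merely records its actions. The actual wrap-job script and hypervisor commands used on the cluster are not given in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SimulatedVM:
    """Hypothetical stand-in for a real virtual machine; it only records actions."""
    image: str
    running: bool = False
    log: List[str] = field(default_factory=list)

    def start(self) -> None:
        self.running = True
        self.log.append(f"start {self.image}")

    def run(self, command: str) -> None:
        if not self.running:
            raise RuntimeError("VM is not running")
        self.log.append(f"run {command}")

    def dispose(self) -> None:
        self.running = False
        self.log.append("dispose")


def wrap_job(job_command: str, job_type: str, image: str = "scientific-linux") -> List[str]:
    """Return the list of actions taken for one job handed over by the batch system."""
    if job_type == "local":
        # Local jobs bypass virtualization and run on the host operating system.
        return [f"run natively {job_command}"]
    vm = SimulatedVM(image)
    vm.start()
    try:
        vm.run(job_command)  # the actual computational job runs inside the VM
    finally:
        vm.dispose()  # the VM is discarded even if the job fails
    return vm.log


print(wrap_job("analysis.sh", "grid"))
print(wrap_job("mpi_simulation", "local"))
```

Disposing of the virtual machine in a `finally` block is a design choice of this sketch: it keeps a failed job from leaving a machine behind on the node.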

Dynamic Partitioning of the Cluster
- Virtual machines are distributed dynamically over the nodes according to the current load on the cluster (load balancing).
- This is also possible for local jobs: every user group can work in a customized software environment.
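
One simple way to distribute virtual machines according to the current load is a greedy least-loaded placement. The sketch below is an assumption for illustration, not the scheduling scheme used on the cluster: `place_vms`, the node and VM names, and the limit of 8 virtual machines per node (one per core of a 2x quad-core node) are all hypothetical.

```python
from typing import Dict, List


def place_vms(node_load: Dict[str, int], vm_requests: List[str],
              slots_per_node: int = 8) -> Dict[str, str]:
    """Assign each requested VM to the currently least-loaded node."""
    load = dict(node_load)  # work on a copy so the caller's data is unchanged
    placement: Dict[str, str] = {}
    for vm in vm_requests:
        # Iterating over sorted names makes ties deterministic (first name wins).
        node = min(sorted(load), key=lambda name: load[name])
        if load[node] >= slots_per_node:
            raise RuntimeError("no free slot left on any node")
        placement[vm] = node
        load[node] += 1
    return placement


# Hypothetical cluster state: number of busy slots per node.
current_load = {"node01": 6, "node02": 2, "node03": 2}
print(place_vms(current_load, ["vm-a", "vm-b", "vm-c"]))
```

In this example the new virtual machines land on the two lightly loaded nodes while the busy node is left alone.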

Conclusion
Virtualization offers the possibility for:
- dynamic partitioning of a cluster
- utilization of the same hardware by different user groups
  - reduced costs
- providing different user groups with customized software environments
- load balancing
- low performance losses, even for HPC purposes