elab and the FVG grid. Stefano Cozzini, CNR-INFM DEMOCRITOS and SISSA (elab, Trieste)
Agenda/Aims: present elab and its computational infrastructure; GRID-FVG: structure, basic requirements, technical choices, open questions; discussion.
what is elab?
elab Mission: maintaining a cutting-edge computational infrastructure for research groups at SISSA and DEMOCRITOS; planning and setting up advanced solutions for scientific and technical computing: software engineering, parallel computing (platforms & algorithms), cluster computing (high-performance and distributed), grid computing. Enabling e-science at SISSA/DEMOCRITOS!
elab main partner: DEMOCRITOS, the CNR/INFM National Simulation Center for condensed matter. Research areas: computational nano-science and materials engineering; computational biochemistry, biotechnology, and drug design; hard and soft condensed matter: superconductivity and disordered systems; IT for computer simulations (software engineering; parallel, cluster, and grid computing).
elab computational environment
elab in numbers: established in 2007; 10 people involved (3 permanent staff); ~200 servers managed (~1000 CPUs) across HPC and GRID; more than 2 million hours computed so far (November 2008): ~300k hours; budget: ~1 million Euro; strong collaboration with Trieste scientific institutions: ELETTRA/INAF/ICTP.
elab HPC resources: HG1, the main cluster. It includes hardware from the CUBENET project, the GRID-FVG project, and funds from SISSA and DEMOCRITOS researchers. Built using v0.1 of EPICO (WIP), the Enhanced Package for Installation and Configuration: a suite of software tools to install and manage HPC and Grid infrastructures, developed by elab. A heterogeneous platform.
elab computational infrastructure
elab training events: Advanced School in High Performance Computing Tools for e-science, a joint DEMOCRITOS/INFM-eLab/SISSA-ICTP activity, 5-16 March 2007 (Trieste, Italy), ~100 participants (~50 from the Trieste area); Advanced School in High Performance and Grid Computing, a joint DEMOCRITOS/INFM-eLab/SISSA-ICTP activity, 3-14 November 2008 (Trieste, Italy).
HG1 hardware, computational nodes. ZEBRA partition: 160 cores; 20 Supermicro machines, 2 Intel quad-core CPUs (E5420 @ 2.50 GHz) and 16 GB RAM each, 20 Gb/s InfiniBand card. BLADE partition: 352 cores; 40(c) + 48(m) Eurotech blade system, 2 dual-core Opteron 280 CPUs and 8 GB RAM each, 10 Gb/s InfiniBand card, diskless.
HG1 hardware, computational nodes. MYRI partition: 48 cores; 12 v20z machines, 2 dual-core AMD Opteron 275 CPUs and 8 GB RAM each, Myrinet card. SMP partition: ~40 servers; 2/4/8-way SMP machines, mixed dual-core/single-core Opteron/Intel CPUs (Opteron 280), dedicated to SMP computations.
GRID infrastructure at elab: a small grid site within the EGEE infrastructure, active in several VOs: ce-01.grid.sissa.it, se-01.grid.sissa.it, lfc.grid.sissa.it (a JIT security mechanism was implemented for the StoRM SE); the GRID-FVG infrastructure, based on HPC: hg1 plus a 200-core HPC platform located in Amaro (Mercurio headquarters); the GridSeed VM tool.
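As a hedged illustration (not taken from the slides; the VO name is a placeholder), the resources such a site publishes can be listed from any gLite UI with the standard information-system clients:

# Hedged sketch (placeholder VO "myvo"): list the resources a VO sees from a gLite UI.
voms-proxy-init --voms myvo     # obtain a VOMS proxy first
lcg-infosites --vo myvo ce      # computing elements (ce-01.grid.sissa.it should appear here)
lcg-infosites --vo myvo se      # storage elements (se-01.grid.sissa.it)
lcg-infosites --vo myvo lfc     # file catalogues (lfc.grid.sissa.it)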
The GRID-FVG project: SISSA/elab is the scientific partner of Eurotech for the supply of an HPC grid system to Mercurio FVG, to be offered to industry and local public bodies. Initial hardware resources of the project: at SISSA, 200 cores + services, integrated into hg1; at Mercurio FVG, 200 cores + services.
Requirements of GRID-FVG: (1) HPC resources should be integrated seamlessly into the GRID environment; (2) HPC resources should be used and exploited as HPC resources (heterogeneity as an added value, not as a problem); (3) computational resources should be available both through the grid infrastructure and as local resources.
Technical deployment (requirement 1): gLite adoption: central grid services at SISSA/eLab; the HPC systems act as LCG Computing Elements; the HPC systems are installed using the standard elab procedures: NO gLite installation procedures on the HPC infrastructure => the ENEA SPAGO solution (see later). Status: satisfactory. Future development: integration with CREAM.
Technical deployment (requirement 2): FULL MPI support via grid job submission: the mpi-start approach; tricks on the WN to load the appropriate module; appropriate tags for the information system. Status: unsatisfactory: lack of features in JDL; lack of info in the GLUE 2.0 schema. Future development: we are part of the EGEE MPI working group.
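A minimal sketch of what such a submission could look like from the UI, assuming the usual EGEE mpi-start pattern; the wrapper script, application, core count and tag choices are placeholders rather than the actual elab setup:

#!/bin/bash
# Hedged sketch of an MPI submission from the UI following the EGEE
# mpi-start pattern; mpi-start-wrapper.sh and my_mpi_app are placeholders.
cat > mpi-job.jdl <<'EOF'
JobType       = "Normal";
CpuNumber     = 8;
Executable    = "mpi-start-wrapper.sh";
Arguments     = "my_mpi_app OPENMPI";
InputSandbox  = {"mpi-start-wrapper.sh", "my_mpi_app.c"};
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
// Match only sites publishing the mpi-start and OpenMPI tags:
Requirements  = Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment)
             && Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment);
EOF
glite-wms-job-submit -a -o jobids.txt mpi-job.jdl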
Technical deployment (requirement 3): not yet completed: local users have separate management; dedicated parallel filesystems. Issues: authentication/authorization (should be the same): not really complicated, I guess; resource management: easy for CPUs, the LRMS does it for you if appropriately configured; data management: complicated, under study...
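For the CPU side, one purely illustrative way to let the LRMS arbitrate between grid and local users is a pair of Torque/PBS queues feeding the same nodes; the group names and limits below are assumptions, not the actual HG1 configuration:

#!/bin/bash
# Hedged sketch: Torque/PBS queues sharing the same nodes between grid pool
# accounts and local users; group names and limits are illustrative only.
qmgr -c "create queue grid queue_type=execution"
qmgr -c "set queue grid acl_group_enable = true"
qmgr -c "set queue grid acl_groups = gridusers"      # grid pool-account group
qmgr -c "set queue grid resources_max.walltime = 72:00:00"
qmgr -c "set queue grid enabled = true"
qmgr -c "set queue grid started = true"
qmgr -c "create queue local queue_type=execution"
qmgr -c "set queue local acl_group_enable = true"
qmgr -c "set queue local acl_groups = localusers"    # local SISSA/DEMOCRITOS users
qmgr -c "set queue local enabled = true"
qmgr -c "set queue local started = true"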
gLite standard architecture [diagram]: grid users reach a UI; central services (VOMS server, Resource Broker, MyProxy server) sit on the WAN; the site's CE, SE and batch server sit in the DMZ; the worker nodes (WN) sit on the LAN.
gLite / SPAGO architecture [diagram]: central gLite services (VOMS server, MyProxy server, Resource Broker) on the WAN; site gLite services (UI, CE, SE) in the DMZ, used by the grid users; behind them, a local master node with a PBS/LSF batch server receives both grid jobs and local jobs from local users; the local resources (heterogeneous HW/SW, the local cluster facilities) on the LAN run NO gLite middleware!
SPAGO implementation [diagram]: grid users are MAPPED to home directories (/home/grid001, /home/grid002, ...) and software areas (/opt/something) shared via NFS, AFS, ...; the gLite computing element submits to the local PBS/LSF batch server; the cluster nodes themselves run no gLite; one node acts as PROXY WORKER NODE: every node delegates all grid commands concerning transfers to/from the grid to the proxy worker node.
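A minimal sketch of how such delegation could be wired, assuming a shared home filesystem and passwordless ssh to the proxy worker node; the host name and the wrapped command are illustrative and this is not the actual SPAGO code:

#!/bin/bash
# Hedged sketch of the proxy-worker-node idea, NOT the actual SPAGO code:
# on a cluster node without gLite, a thin wrapper (installed e.g. as
# /usr/local/bin/lcg-cp) forwards data-transfer commands over ssh to the
# one node that has the gLite clients, here called "proxy-wn" (placeholder).
PROXY_NODE="proxy-wn"
# Assumes shared home directories (NFS/AFS) and passwordless ssh, as in the
# SPAGO layout, so that files and the user proxy are visible on both sides.
exec ssh -o BatchMode=yes "$PROXY_NODE" \
    "cd '$PWD' && X509_USER_PROXY='$X509_USER_PROXY' lcg-cp $(printf '%q ' "$@")"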
SPAGO MPI support. MPI-start is the interface that hides the complexity of the local cluster (different MPI vendors, different batch servers). MPI-start is installed on the CE; NOTHING is installed on the cluster compute nodes. The cluster nodes must be able to: load the same environment as a standard gLite WN (MPI_* variables); execute the MPI-start scripts (fake mpirun, openmpi.mpi, ...). When the cluster supports module loading, a mechanism should map the MPI tags published in the CE_RUNTIMEENV into a call to a specific module: $ lcg-info --list-ce --attrs Tag --query 'CE=grid2.mercuriofvg.it:2119/jobmanager-lcgpbs-mercurio' ... MPI-START MPI_SHARED_HOME OPENMPI-1.3 MPICH2-1.2p1 ... => module load openmpi/1.3-intel/... ; export PATH ... ; export LD_LIBRARY_PATH ...
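A hedged sketch of such a mapping, assuming the environment-modules package is available on the nodes; apart from the openmpi/1.3-intel module shown above, the names are assumptions:

#!/bin/bash
# Hedged sketch: map an MPI tag from the CE_RUNTIMEENV to a "module load" call.
load_mpi_flavour () {
    case "$1" in
        OPENMPI-1.3)  module load openmpi/1.3-intel ;;
        MPICH2-1.2p1) module load mpich2/1.2p1      ;;   # assumed module name
        *) echo "unknown MPI tag: $1" >&2; return 1 ;;
    esac
    # After loading, PATH and LD_LIBRARY_PATH point at the selected MPI
    # stack, which is the environment the mpi-start scripts expect.
}
load_mpi_flavour "OPENMPI-1.3"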
SPAGO MPI test: Intel MPI Benchmarks run on the compute nodes via a submission FROM the User Interface: OPENMPI-1.3 (GNU wrapper) with InfiniBand support; MVAPICH2 (Intel wrapper); LAM (Intel wrapper) over TCP. Access is to non-standard gLite platforms: compilation of the source code can't be targeted.
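One hedged way to cope with this, assuming the standard mpi-start hook mechanism, is to compile on the worker node just before the run; the file names below are placeholders:

#!/bin/bash
# Hedged sketch of an mpi-start pre-run hook (referenced through
# I2G_MPI_PRE_RUN_HOOK) that compiles on the worker node itself, so the
# binary always matches the local platform; "my_benchmark.c" stands in
# for the real benchmark sources.
pre_run_hook () {
    echo "compiling on $(hostname) with $(which mpicc)"
    mpicc -O2 -o my_benchmark my_benchmark.c || return 1
    return 0    # mpi-start then launches the application as usual
}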
IMB results
elab.sissa.it/gridseed
GridSeed as of today: central services, NOT gLite-dependent: CA (Certification Authority) service + DNS + NTP + ...
User Interfaces in GridSeed: UI-1, a standard UI based on the gLite UI; UI-2, a clean Linux box plus the MILU 3.1 package (MILU: Miramare Lightweight User interface).
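A hedged smoke test, usable from either UI once a VOMS proxy is in place (the JDL is a generic placeholder, not part of the GridSeed tutorials):

# Hedged sketch: check that the UI can reach the workload management system.
cat > hello.jdl <<'EOF'
Executable    = "/bin/hostname";
StdOutput     = "hello.out";
StdError      = "hello.err";
OutputSandbox = {"hello.out", "hello.err"};
EOF
glite-wms-job-list-match -a hello.jdl    # list the CEs that match the job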
GridSeed typical site grids [diagram]: CE-?: Computing Element (LFC + CE + TORQUE) with worker nodes CExWN1 and CExWN2, 2 CPUs each; SE-?: Storage Element, a StoRM SRM v2.2 server.
Tutorials/exercises (elab.sissa.it/gridseed). Basic gLite tutorials: Getting Started with GridSeed and the gLite middleware; Basic Data Management. Advanced gLite User Tutorials (all the examples for this section are available in /opt/examples on the GridSeed UI): advanced job-submission mechanisms; MPI job submission. Specific eLab Tool User Tutorials (these tutorials are specific to elab and EU-IndiaGrid tools developed to help grid users run their applications): automatic thread optimization on the GRID using GOTO BLAS and Reser; run the Quantum ESPRESSO pw.x code using SMP resources on the GRID; a simple client-server Python tool for the GRID.
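As a hedged illustration of the thread-optimization idea (executable invocation, file names and thread count are placeholders, not the tutorial's actual settings):

#!/bin/bash
# Hedged sketch: run Quantum ESPRESSO's pw.x on an SMP worker node with a
# threaded GotoBLAS; thread count and file names are placeholders.
NTHREADS=${NTHREADS:-8}             # e.g. one thread per core on the SMP node
export GOTO_NUM_THREADS=$NTHREADS   # threads used by GotoBLAS
export OMP_NUM_THREADS=$NTHREADS    # threads used by any OpenMP regions
pw.x < pw.in > pw.out 2>&1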
Future development: automatic/semi-automatic updating procedure for the gLite software; compacting/reducing the size of the central services: more services on one VM, all central services on a single DVD; a client/server mechanism to automatically add more grid sites to a basic configuration; more grid services (DB elements, ...) with respect to the gLite software, and so on...