Grid @ Forschungszentrum Karlsruhe: GridKa and GGUS
Holger Marten (for the GridKa and GGUS teams)
Forschungszentrum Karlsruhe GmbH, Institute for Scientific Computing, P.O. Box 3640, D-76021 Karlsruhe, Germany
http://www.gridka.de
The Grid Computing Centre Karlsruhe (GridKa)
Requested as a Regional Data and Computing Centre by 41 HEP user groups in 19 German universities and research institutions. Founded in 2001 as part of the computing centre of Forschungszentrum Karlsruhe.
Main goals:
- test environment for LHC (ALICE, ATLAS, CMS, LHCb), LHC Tier-1 in 2007+
- production environment for non-LHC experiments (BaBar, CDF, D0, Compass)
- environment for Grid R&D (CrossGrid, LCG, EGEE, ...)
- user support, services, education & training
- grid environment for other sciences (astrophysics, bio-informatics, ...)
High Energy Physics experiments served by GridKa
- LHC experiments (CERN): ALICE, ATLAS, CMS, LHCb; committed to Grid computing
- Non-LHC experiments: BaBar (SLAC, USA), CDF and D0 (FNAL, USA), Compass (CERN); have real data already today
- Other sciences later
GridKa Project Organization
- Technical Advisory Board (TAB): ALICE, ATLAS, CMS, LHCb, BaBar, CDF, D0, Compass, DESY, physics committees
- Overview Board: BMBF, physics committees, HEP experiments, LCG, FZK management, head of the FZK Computing Centre, chairman of the TAB, project leader
- Project Leader GridKa: planning, development, technical realization, operation
German Users of GridKa: 22 institutions, 44 user groups, 350 scientists
Aachen (4), Bielefeld (2), Bochum (2), Bonn (3), Darmstadt (1), Dortmund (1), Dresden (2), Erlangen (1), Frankfurt (1), Freiburg (2), Hamburg (1), Heidelberg (1+6), Karlsruhe (2), Mainz (3), Mannheim (1), München (1+5), Münster (1), Rostock (1), Siegen (1), Wuppertal (2)
GridKa in the network of international Tier-1 centres
- Canada: TRIUMF, Vancouver
- France: IN2P3, Lyon
- Germany: Forschungszentrum Karlsruhe
- Italy: CNAF, Bologna
- North Europe: NDGF, Nordic DataGrid Facility
- Spain: PIC, Barcelona
- Taiwan: Academia Sinica, Taipei
- The Netherlands: NIKHEF/SARA, Amsterdam
- UK: Rutherford Laboratory, Chilton
- USA: Fermi Laboratory, Batavia, IL
- USA: BNL
The LHC Computing Model (simplified!!), the fifth LHC subproject
[Map: Tier-0 at CERN, Tier-1 centres (FZK, IN2P3, RAL, CNAF, PIC, TRIUMF, BNL, FNAL, Taipei, NIKHEF, ...) and associated Tier-2 centres, down to desktops and portables.]
Tier-0, the accelerator centre:
- Filter → raw data
- Reconstruction → summary data (ESD)
- Record raw data and ESD
- Distribute raw data and ESD to the Tier-1 centres
Tier-1:
- Permanent storage and management of raw, ESD, calibration data, meta-data, analysis data and databases → grid-enabled data service
- Data-heavy analysis
- Re-processing raw → ESD
- National, regional support
- Online to the data acquisition process: high availability (24h x 7d), managed mass storage, long-term commitment
- Resources: 50 % of the average Tier-1
(Les Robertson, GDB, May 2004)
Tier-2:
- Well-managed, grid-enabled disk storage
- Simulation
- End-user analysis, batch and interactive
- High-performance parallel analysis (PROOF?)
Each Tier-2 is associated with a Tier-1 that
- serves as the primary data source,
- takes responsibility for long-term storage and management of all of the data generated at the Tier-2 (grid-enables mass storage),
- may also provide other support services (grid expertise, software distribution, maintenance, ...).
(Les Robertson, GDB, May 2004)
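To make the division of labour concrete, here is a minimal sketch of the Tier-0 → Tier-1 flow described on these two slides (plain Python; all class and method names are illustrative, not part of any LCG software):

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str   # e.g. "run-001-raw"
    kind: str   # "raw" or "ESD"

@dataclass
class Tier1:
    name: str
    archive: list = field(default_factory=list)  # managed mass storage

    def store(self, ds: Dataset):
        # Tier-1: permanent storage and management of raw and ESD data
        self.archive.append(ds)

@dataclass
class Tier0:
    tier1s: list

    def record_and_distribute(self, raw: Dataset):
        # Tier-0: reconstruction produces summary data (ESD) from raw data,
        # then raw and ESD are distributed to the Tier-1 centres.
        esd = Dataset(raw.name.replace("raw", "esd"), "ESD")
        for t1 in self.tier1s:
            t1.store(raw)
            t1.store(esd)

fzk = Tier1("FZK")
cern = Tier0([fzk])
cern.record_and_distribute(Dataset("run-001-raw", "raw"))
print([d.name for d in fzk.archive])  # ['run-001-raw', 'run-001-esd']
```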
Stepwise extension of GridKa resources

                          Oct 2004   Apr 2005   Oct 2005   % of 2008
  Processors              1070       1280       1550       40 %
  Compute power [kSI2k]   920        1200       1500       18 %
  Disk [TB]               220        270        310        8 %
  Tape [TB]               375        475        500        11 %
  Internet [Gb/s]         10         10         10         50 %

- Extension of resources every 6 months (2001-2005)
- Heterogeneous environment: PIII @ 1.26 GHz, ..., PIV @ 3.06 GHz, Opteron 246
- 310 TB disk is usable disk space, ~2550 disk drives installed
- Installed with SLC 3.0.4 and LCG 2_6_0, available via Grid together with ~140 other installations in Europe (EGEE)
Requested resources at GridKa (LHC + non-LHC)

               2005   2006   2007   2008   2009
  CPU [kSI2k]  1440   2020   3080   8300   12780
  Disk [TB]    310    640    1390   3860   5880
  Tape [TB]    500    960    1830   4460   8700
  WAN [Gb/s]   10     10     10     20     20

Timeline: 2005: SC2, SC3; 2006: cosmics; 2007: first beams, first physics; 2008: full physics run. (SC = Service Challenge, between Tier-0/1/2)
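As a quick cross-check, the "% of 2008" column of the previous slide matches the 2008 requests above for every row that appears in both tables (a few lines of Python to verify the arithmetic):

```python
# Oct 2005 installation vs. 2008 request (numbers from the two tables above)
checks = {
    "CPU [kSI2k]": (1500, 8300),  # ~18 %
    "Disk [TB]":   (310,  3860),  # ~8 %
    "Tape [TB]":   (500,  4460),  # ~11 %
    "WAN [Gb/s]":  (10,   20),    # 50 %
}
for name, (now, plan) in checks.items():
    print(f"{name}: {100 * now / plan:.0f} % of the 2008 request")
```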
Distribution of planned resources at GridKa (Sep 2005)
[Three bar charts: LHC vs. non-LHC shares of CPU, disk and tape, 2004-2009.]
Significant contributions to non-LHC!
- BaBar Tier-A
- D0, CDF Regional Centre
Usage of GridKa in 2004
- LHC 34 %, non-LHC 66 %
- Processor usage: 4,183,000 hours
- Number of jobs: 1,442,000
Work done at GridKa as a computing centre
- Infrastructure planning (floor space, electricity, cooling)
- Resource planning, procurement, installation, exchange
- Development of tools (installation, monitoring, inventory DB, ...)
- Software installation & upgrades (OS, microcode, ...)
- Connection of disk & tape systems (TSM, dCache, ...)
- Batch system management
- Security
- User support
- Access & throughput optimization
- LCG installations and Service Challenges
... plus a huge effort for grid operation
Global (Grid) operation
[Diagram: resource centres and their services distributed across sites, grouped into Virtual Organisations: VO ALICE, VO ATLAS.]
Global (Grid) operation
[Same diagram, annotated with the global challenges:]
- User management
- User support
- Reliability
- Interoperability
This thing should be global, dynamic and virtual.
Centres in D/CH contributing to the EGEE infrastructure
DESY, HU Berlin, U Wuppertal, RWTH Aachen, FhG SCAI, FhG ITWM (Kaiserslautern), TU Karlsruhe (IEKP), GSI, FZK, LRZ München, CSCS (Manno)
[Map legend: distributed ROC; RC; RC to be certified]
Resource allocation policy in D/CH
- HEP sites: CSCS (20 CPUs), DESY (88), FZK (1550, also ROC management), GSI (336)
- FhG SCAI (30 CPUs) and FhG ITWM (48 CPUs) serve other disciplines
- Further supported disciplines: Bio Medicine, Earth Science, Computational Chemistry (on demand), Astrophysics (MAGIC), others (Synchrotron, XFEL)
- 19 supported global and regional VOs; about 97 % of the resources for HEP
Tasks for Grid Operations in the federation
Development of policies for
- user registration
- registration and certification of new sites
- resource allocation
- grid security & incident response
Development of methods and tools to
- implement the above policies and refine them
- middleware roll-out
- site functional tests
- resource monitoring
User & operations support; weekly operations meetings
Health monitoring of sites
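The screenshots themselves are not reproduced here. Purely as an illustration of the idea, a minimal health probe in the spirit of the site functional tests might check that a site's standard service ports answer; the host names below are placeholders, and the real tests submit jobs through the middleware rather than just opening TCP connections:

```python
import socket

# Typical LCG service ports: Globus gatekeeper (2119), GridFTP (2811).
# Host names are placeholders, not real GridKa machines.
SERVICES = {
    "gatekeeper": ("ce.example-site.de", 2119),
    "gridftp":    ("se.example-site.de", 2811),
}

def is_up(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, (host, port) in SERVICES.items():
    status = "OK" if is_up(host, port) else "FAILED"
    print(f"{name:10s} {host}:{port}  {status}")
```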
GermanGrid CA (GridKa-CA)
Delivers X.509 certificates for German users, hosts & applications
- 2001: supported / connected to DataGrid
- 2002: continued within CrossGrid
- 2004: contributes to EGEE; became member of the EUGridPMA
EUGridPMA, the European Grid Policy Management Authority
- coordinates the European Public Key Infrastructure (PKI) for use with grid authentication middleware
- develops PKI and CA policies; provides root CAs
- >35 member CAs in Europe
- collaborates with APGridPMA (Asia-Pacific) and TAGPMA (America)
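For illustration, a user can check the subject, issuer and validity of a certificate issued by the GridKa-CA with standard OpenSSL options; the file name below is a placeholder for the user's own certificate:

```python
import subprocess

# "usercert.pem" is a placeholder for the user's grid certificate file.
# Standard `openssl x509` options: print subject, issuer and validity period.
out = subprocess.run(
    ["openssl", "x509", "-in", "usercert.pem",
     "-noout", "-subject", "-issuer", "-dates"],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
# For a GridKa-CA certificate the issuer should be the GermanGrid CA
# (the exact DN depends on the CA configuration).
```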
GGUS (Global Grid User Support), www.ggus.org
Combine this with regional and VO-specific support units.
User support: classical communication model
"I need help! I send e-mail to the GGUS team."
[Diagram: the GGUS supporter on duty forwards each request for help by separate mails: to the ROC support units (ROC DECH for Germany-Switzerland, ROC France), to the resource-centre admins (RC 1, RC 2), to VO support (BioMed, ASTRO), to tools support (Quattor, dCache) and to middleware support (gLite, FTS); finally a mail goes back to the user: "Solution ready!"]
User support: GGUS application model
"I need help! I send e-mail to support@ggus.org."
- The e-mail is automatically converted to a GGUS ticket; the problem is registered in the central DB.
- Central support application: the central GGUS application, a web portal plus workflow engine with problem and solution DB, operated by the GGUS supporters on duty.
- Tickets are assigned to the regional support applications (ROC DECH, ROC France) with automatic ticket conversion; reassignments are registered centrally.
- From there tickets can be assigned to the RC admins (RC 1, RC 2) or reassigned to VO support (BioMed, ASTRO), tools support (Quattor, dCache), middleware support (gLite, FTS) or e-mail lists; the assignee is notified of the ticket assignment.
- The solution is stored in the central DB, the ticket is registered as closed, and the user receives a mail with the solution and the ticket ID.
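A minimal sketch of the central step, the automatic conversion of an incoming mail into a centrally registered, assignable ticket; all names and the in-memory "DB" are illustrative stand-ins for the real GGUS portal and workflow engine described on the architecture slide below:

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)

@dataclass
class Ticket:
    subject: str
    body: str
    ticket_id: int = field(default_factory=lambda: next(_ids))
    assigned_to: str = "GGUS support"  # first-level supporter on duty
    status: str = "open"
    solution: str = ""

CENTRAL_DB = {}  # ticket_id -> Ticket (stand-in for the central problem DB)

def mail_to_ticket(subject: str, body: str) -> Ticket:
    """E-mail automatically converted to a GGUS ticket, registered centrally."""
    t = Ticket(subject, body)
    CENTRAL_DB[t.ticket_id] = t
    return t

def assign(t: Ticket, unit: str):
    """Assign/reassign to ROC, VO, tools or middleware support; logged centrally."""
    t.assigned_to = unit

def close(t: Ticket, solution: str):
    """Store the solution in the central DB, close the ticket, notify the user."""
    t.solution, t.status = solution, "closed"
    print(f"Mail to user: ticket {t.ticket_id} solved: {solution}")

t = mail_to_ticket("Job submission fails", "My gLite job stays in 'Waiting'...")
assign(t, "ROC DECH")
close(t, "Misconfigured CE queue fixed by the site admins.")
```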
Advantages and challenges of the GGUS application model
Advantages
- All steps of the support process are registered centrally.
- The user and all involved units can monitor the support process; communication is always synchronised.
- The workflow can be optimised (automated).
- Problems and solutions are stored for re-use (solution DB).
Challenges
- to define the support units (find the people)
- to develop and implement the workflow
- to develop the central GGUS application
- to develop interfaces between the GGUS application and those applications used by the ROCs, RCs, VOs, ...
GGUS application system architecture
- User side: web interface for user interaction; web service interface; resources and status; central user DB; VO DB
- Core: GGUS helpdesk application (Remedy) with workflow engine and reporting & analysis; a universal database for GRID-related problems
- Support side: web interface & Java client for support interaction; middleware & interfaces; web service interface to ESUS, DS, GOC/ROC
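The web service interfaces on both ends essentially exchange ticket records between the central application and the regional, VO and operations systems. A hedged sketch of what such a payload could look like; the field names are invented for illustration and do not reflect the actual Remedy schema:

```python
import json

# Illustrative ticket-exchange payload between the central GGUS application
# and a regional support application (all field names are invented).
payload = {
    "ticket_id": 4711,
    "origin": "GGUS-central",
    "assigned_unit": "ROC DECH",
    "vo": "atlas",
    "status": "assigned",
    "history": [
        {"action": "created", "by": "support@ggus.org"},
        {"action": "assigned", "by": "GGUS supporter on duty"},
    ],
}
print(json.dumps(payload, indent=2))  # what the web service would transmit
```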
User support: GGUS application model, VO path
"I need help! I send e-mail to vo-support@ggus.org."
The e-mail is again converted automatically to a GGUS ticket and registered in the central DB, but assigned directly to the responsible VO support unit; as before, the solution is stored in the central DB, the ticket is registered as closed, and the user is notified with the solution and the ticket ID.
User support: alternative path
"I need help! I use the portal of my ROC."
The regional support applications (ROC DECH, ROC France) and the central GGUS application are connected through interfaces among the applications, so a ticket entered at a ROC portal reaches the same support units (RC admins, VO, tools and middleware support, e-mail lists) and is tracked in the same central workflow.
User view: secure access via grid certificate (http://dech-support.fzk.de)
User view: submitting a ticket (http://dech-support.fzk.de)
Concluding remarks
[Diagram: middleware virtualises distributed resources (PC clusters, supercomputers, storage, archives, analysis facilities) for the user.]
Apart from middleware development, there is a lot of work needed to make this a user-friendly application environment.
Concluding remarks
Thank you!
EGEE is funded by the European Community under grant IST-2002-508833. We appreciate the continuous interest and support by the Federal Ministry of Education and Research, BMBF.