Grid Computing in Aachen
III. Physikalisches Institut B
Berichtswoche des Graduiertenkollegs, Bad Honnef, 05.09.2008
Concept of Grid Computing
A Computing Grid works like the power grid, but for computing and storage resources.
Key features of a Computing Grid:
- full resource availability at every single client computer
- grid sites can be distributed around the world
- standardised protocols, data formats and environments
- advantages: scalability, low costs, reliability
- disadvantages: troubleshooting, maintenance
Concept of Grid Computing
[Diagram: users running analysis applications access distributed resources - CPU clusters, supercomputers, disk and tape storage.]
Grid Infrastructure for the LHC
[Diagram: the tiered LHC grid structure - Tier-1 centres (GridKa, IN2P3, TRIUMF, BNL, ASCC, Nordic, FNAL, CNAF, SARA, PIC, RAL), each serving several Tier-2 sites.]
Infrastructure of a typical Tier-2 site
[Diagram: a Computing Element (CE) fronts the worker nodes (WN); a Storage Element (SE) holds the data. The site publishes site information: CE status, close SEs (not necessarily at the same site), supported protocols, file statistics, and network information between this and other sites.]
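The published site information can be pictured as a simple record. The sketch below is purely illustrative: the field and host names are hypothetical and do not follow the real information-system schema, they only mirror the items listed on the slide.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the information a Tier-2 site publishes.
# Field names are illustrative, not the real information-system schema.
@dataclass
class StorageElement:
    name: str
    supported_protocols: list   # e.g. ["gsiftp", "dcap"]
    free_space_tb: float        # file statistics / capacity

@dataclass
class SiteInfo:
    computing_element: str      # CE that fronts the worker nodes
    worker_nodes: int
    # close SEs, not necessarily at the same site
    close_ses: list = field(default_factory=list)

site = SiteInfo(
    computing_element="ce.example-t2.de",   # hypothetical host name
    worker_nodes=253,
    close_ses=[StorageElement("se.example-t2.de", ["gsiftp", "dcap"], 530.0)],
)
print(site.computing_element)
```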
Job Submission
[Sequence of diagrams: a job's life cycle through the grid workload management system.]
Components: User Interface; Resource Broker Node (Network Server, Workload Manager, MatchMaker/Broker, Job Adapter, Job Controller (CondorG), Log Monitor, RB Storage); Replica Catalog; Information Service; Logging & Bookkeeping; Computing Element (CE); Storage Element (SE).
Job status progression:
- submitted: the job and its Input Sandbox are sent from the User Interface to the Network Server
- waiting: the MatchMaker/Broker queries the Replica Catalog and the Information Service to find a suitable CE
- ready: the Workload Manager and the Job Adapter prepare the job for submission
- scheduled: the Job Controller (CondorG) hands the job to the CE
- running: the job executes at the CE, with data transfers/accesses to the SE
- done: the job has finished; the Output Sandbox is stored on the RB Storage
- cleared: the user has retrieved the Output Sandbox via the User Interface
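The life cycle shown in the diagrams above can be modelled as a simple linear state machine. This is a deliberate simplification: real jobs can also abort or be resubmitted at several stages, which is not modelled here.

```python
# The normal (error-free) job life cycle as a linear state machine.
# Simplification: aborted/resubmitted jobs are not modelled.
STATES = ["submitted", "waiting", "ready", "scheduled",
          "running", "done", "cleared"]

def next_state(state):
    """Return the next status in the normal life cycle, or None at the end."""
    i = STATES.index(state)
    if i == len(STATES) - 1:
        return None  # "cleared" is terminal
    return STATES[i + 1]

status = "submitted"
while status is not None:
    print(status)
    status = next_state(status)
```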
CMS Grid Computing
CMS data structure:
- RAW: detector data and Level-1 information, 1.5 MB/evt, 2 copies, 5 PB/y
- RECO: REConstructed objects, 250 kB/evt, 3 copies, 2.1 PB/y
- AOD: Analysis Object Data, 50 kB/evt, 1 copy per Tier-1, 2.6 PB/y
- TAG: high-level physics objects and run info (event directory), ~10 kB/evt
[Diagram: CMS trigger and DAQ chain - 40 MHz collision rate, Level-1 trigger at 100 kHz, 16 million channels, 3 Gigacell buffers, 1 MB event data, 1 Terabit/s over 50k data channels into 500 readout memories, 500 Gigabit/s switch network event builder, 5 TeraFLOP event filter reducing to 150 Hz filtered events, petabyte archive.]
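A back-of-the-envelope check shows how the quoted yearly RAW volume follows from the event size and the 150 Hz filtered-event rate, assuming roughly 1e7 seconds of data taking per year (that live-time figure is an assumption, not from the slide):

```python
# Rough check of the RAW data volume: event size x rate x live time x copies.
# Assumption (not from the slide): ~1e7 seconds of data taking per year.
EVENTS_PER_YEAR = 150 * 1e7   # 150 Hz filtered-event rate -> 1.5e9 events/y

def yearly_volume_pb(event_size_mb, copies):
    """Yearly volume in petabytes for one data tier (1 PB = 1e9 MB here)."""
    return EVENTS_PER_YEAR * event_size_mb * copies / 1e9

print(yearly_volume_pb(1.5, 2))   # RAW: 4.5 PB/y, close to the quoted 5 PB/y
```

The remaining gap to the quoted 5 PB/y plausibly comes from rounding or additional overheads; the derived data tiers also include simulated events, so their quoted volumes are larger than this simple formula gives.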
Federated CMS-T2 RWTH & DESY
[Pie chart: supported Virtual Organisations (VOs), 90% / 5% / 5%, including an observatory VO.]
- Monte Carlo production mainly in Aachen
- DESY offers large tape storage and disk space
- host space for QCD, JetMET, SUSY, Top, Tracker, FWD-Physics
- GridKa in Karlsruhe as associated Tier-1
Aachen's Grid Structure
Aachen's Grid Deployment Team: Walter Bender, Achim Burdziak, Manuel Giffels, Carsten Hof, Sergey Kalinin, Thomas Kreß, Andreas Nowack, Peter Schiffer, Daiske Tornier, Oleg Tsigenov, Clemens Zeidler
- shift crew (one person per week): monitors hardware, transfers, production, storage, network, ...; problems are communicated to the experts
- weekly strategy meetings for overall discussions and organisational matters
- ticket system for task structuring and documentation purposes
Site Administration
[Screenshot: HP Onboard Administration (Integrated Lights-Out 2) remote management interface.]
Site Monitoring
[Screenshots: hardware infrastructure monitoring; PhEDEx transfers.]
Site Monitoring
[Screenshots: Monte Carlo production; disk storage.]
Site Monitoring & Services
Grid services at RWTH-T2:
- Computing Element
- dCache (disk storage)
- gLite 3.1 (middleware)
- CMS software framework
- CRABserver (job management)
- DBS (database for datasets)
- PhEDEx (transfers)
- Auger software
- external status monitoring
Tier-2 in Aachen: Prototype System
Prototype system until March 2008:
- located in the Physics Center
- air-conditioned room with 30 kW cooling capacity
- 37 worker nodes with a total of ~100 CPU cores
- 30 TB disk space
- 2 GBit/s WAN link speed
- 1 GBit/s interconnection speed
[Photo: Sep. 2007]
Tier-2 in Aachen: Production System
Production system since April 2008:
- located in the RWTH IT-Center
- installed in water-cooled racks with 160 kW cooling capacity
- homogeneous hardware from HP
- 253 worker nodes with a total of 2024 CPU cores
- 530 TB disk storage
- 2 x 10 GBit/s WAN link speed
- 1-4 GBit/s interconnection speed
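The jump from the prototype to the production system is roughly an order of magnitude in every dimension, which the numbers quoted on the two slides make easy to verify:

```python
# Scale-up factors from the prototype (until March 2008) to the
# production system (since April 2008), using the quoted numbers.
prototype  = {"cpu_cores": 100,  "disk_tb": 30,  "wan_gbit_s": 2}
production = {"cpu_cores": 2024, "disk_tb": 530, "wan_gbit_s": 20}  # 2 x 10 GBit/s

for key in prototype:
    factor = production[key] / prototype[key]
    print(f"{key}: x{factor:.1f}")
```

Note that the prototype core count is quoted as "~100", so the CPU factor of about 20 is itself approximate.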
Tier-2 in Aachen: Production System
[Photos: racks and hardware of the production system.]
Aachen's Computing Power
[Chart: CPU power of CMS Tier-2 sites, with DESY-HH and RWTH-Aachen highlighted in the Tier-2 area.]
Summary
- Grid Computing: the key technology for LHC physics analyses
- extremely large computing and storage resources
- transparent access for the end-users
- high effort needed to run a Tier-2
- Aachen's Tier-2 got a major hardware upgrade this spring
- the production system (hard- & software) runs stably
- we are ready for the LHC start-up! ;-)