US CMS Tier1 Facility Network at Fermilab
Andrey Bobyshev, Fermilab, Computing Division
Winter 2010 ESCC/Internet2 Joint Techs, Salt Lake City, Utah, January 31 - February 4, 2010
Outline of the talk:
- A little bit of history
- USCMS Tier1 Facility: resources, data model/requirements, the current status
- LAN technologies deployed: Nexus 7000, HW redundancy, vpc, ISSU, QoS, GLBP, Rapid-STP, SLA, tracking objects, SLB, network-wide remote SPAN
- Circuits / CHIMAN / USLHCNet
- Graphs and snapshots of the live network monitor (if time permits)
CMS Tier1 Facility Network staff: Phil Demar, Andrey Bobyshev, Mark Bowden, Maxim Grigoriev, Vytautas Grigaliunas, Wenji Wu
- Fermilab's General Site Networking staff
- USLHCNet
- ESNet, Internet2
- Chicago MAN (joint effort of ESNet, FNAL, ANL)
- CMS T1 computing staff and users community
Tier1 Facilities for the CMS Experiment
US CMS Tier1 is one of 7 Tier1 centers for the CMS experiment, and the biggest one.
[Bar chart: tape, disk and CPU capacity per site; data as in the table below]

Site (Country)         Tape (TB)   Disk (TB)   CPU (kHS06)
Tier0 (Switzerland)    10000       2200        44000
FZK (Germany)           2000       1060         7200
PIC (Spain)              974        630         4232
CCIN2P3 (France)        1650       1067         7296
CNAF (Italy)             804        516         4760
ASGC (Taiwan)           2100       3100         7800
RAL (UK)                1887        774         7208
FNAL (US)               7100       2600        20400

FNAL facility: 3 buildings; ~1600 worker nodes; ~200 CMSStor nodes with 2x1GE; ~150 servers; 2 x Nexus 7000; 8 x C6509.
A model of USCMS-T1 Network data traffic
[Diagram: data flows between the Tier1 storage and processing systems]
- WAN: ~2.2 Gbps to/from T0, ~3.2 Gbps to/from Tier2s/Tier1s
- EnStore tape robots and cmsstor/dCache nodes forming a federated file system (QoS applied)
- CMS-LPC/SLB clusters (QoS applied); BlueArc NAS at 10-20 Gbps
- Interactive users: ~1 Gbps
- Data processing on ~1600 worker nodes: 30-80 Gbps
January 2009 US CMS Tier1 Network
[Network diagram]
- WAN connectivity: LHCOPN/LHCNet, ESNet-MAN, General Internet via the Fermilab site network; ESNet/Internet2 SDN/DCN end-to-end circuits to Purdue, UTK, MIT, UNL, UCSD, UWisc, CALTECH, DE-KIT, UFL, PIC and RAL
- Two hub switches (HUB1, HUB2) with L2/L3 redundancy (GLBP), interconnected at 40 Gbps, with 2x20G toward the WAN and 30G + 30G downlinks to the worker-node switches
- SRM/dCache servers and load-balanced servers attached with bonded 2x1GE Ethernet connected to different switches; BlueArc NAS at 2x20G; tape robot
- 330 worker nodes per c6509 switch
USCMS Tier1: Data Analysis Traffic, 02/05-02/06/09
[Weathermap snapshot (https://fngrey.fnal.gov/wm/uscms) showing links running at or near capacity across r-cms-fcc2, r-cms-fcc2-2, r-cms-fcc2-3, r-cms-n7k-gcc-b and the s-cms-gcc-1..6 switches, e.g. 29.8 of 30G, 18 of 20G, 30 of 40G, 65 of 80G, 20 of 20G, 25 of 30G]
US CMS Tier1 Network, January 2010
[Network diagram, FCC and GCC buildings]
- FCC: r-s-hub-fcc with 20GE to the General Internet and ~20GE toward the Site Network, ESNet-MAN/LHCOPN and ESNet/I2 SDN/DCN; STK tape robot (s-exp-gcc-robot); CMS BlueArc
- Two Cisco Nexus 7000 core switches, interconnected at 20 Gbps, with 2x20GE and 80GE vpc port channels
- CMSStor nodes with 2x1GE bonded to different switches
- GCC: satellite c6509 switches attached via 40GE vpc, 330 worker nodes per c6509 switch
Allocation of LAN Bandwidth per CMS Traffic Classes

Traffic class                       BW %   On 80GE     On 40GE
Interactive, LPC                    2      1.6 Gbps    0.8 Gbps
Monitoring, DB                      2      1.6 Gbps    0.8 Gbps
Real Time, Transactions Traffic     2      1.6 Gbps    0.8 Gbps
Critical (Store Nodes, EnStore)     34     27.2 Gbps   13.6 Gbps
NAS                                 10     8 Gbps      4 Gbps
Best Efforts                        50     40 Gbps     20 Gbps

(The per-class figures follow directly from the percentages and the aggregate link speed; see the sketch below.)
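Since each class's guarantee is just its percentage of the aggregate uplink, the table can be reproduced with a few lines of arithmetic. The sketch below does exactly that; Python is used purely for illustration, and only the class names and percentages from the table above are assumed.

```python
# Recompute the per-class bandwidth figures from the QoS percentages.
# Class names and shares come from the table above; the 80 GE and 40 GE
# aggregates correspond to the uplink bundles described in the talk.

QOS_CLASSES = {                               # class name -> allocated share (%)
    "Interactive, LPC": 2,
    "Monitoring, DB": 2,
    "Real Time, Transactions Traffic": 2,
    "Critical (Store Nodes, EnStore)": 34,
    "NAS": 10,
    "Best Efforts": 50,
}

def allocation_gbps(aggregate_gbps: float) -> dict:
    """Return the guaranteed bandwidth per class for a given aggregate link speed."""
    return {name: aggregate_gbps * pct / 100 for name, pct in QOS_CLASSES.items()}

if __name__ == "__main__":
    assert sum(QOS_CLASSES.values()) == 100   # the shares must cover the full link
    for aggregate in (80, 40):                # 80 GE and 40 GE uplink bundles
        print(f"--- {aggregate} GE aggregate ---")
        for name, gbps in allocation_gbps(aggregate).items():
            print(f"{name:35s} {gbps:5.1f} Gbps")
```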
End-to-End Circuits
USCMS-T1 has a long history of using USLHCNet, ESNet/SDN and Internet2 DCN circuits.

Circuit             Country       Affiliation   BW
LHCOPN              Switzerland   T0            8.5G
LHCOPN Secondary    Switzerland   T0            8.5G
LHCOPN Backup       Switzerland   T0            3.5G
DE-KIT              Germany       T1            1G
IN2P3               France        T1            2x1G
ASNet/ASGC          Taiwan        T1            2.5G
TIFR                India         T3            1G
McGill              Canada        CDF/D0        1G
Cesnet, Prague      Czech         D0            1G
US Tier2/Tier3 circuits: CALTECH, Purdue, UWISC, UFL, UNL, MIT, UCSD, UTK

SLA monitoring and IOS tracking objects are used to automatically fail traffic over if a circuit goes down (sketched below).
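The failover itself happens in the routers (IP SLA probes plus tracking objects acting on routes), but the decision logic is simple. The following is a rough host-level Python analogue, with a hypothetical next-hop address and probe interval, meant only to illustrate the logic, not the IOS configuration actually deployed.

```python
# Rough analogue of the SLA-probe + tracking-object failover described above.
# The real deployment does this inside Cisco IOS; the addresses, path names and
# interval here are hypothetical and purely illustrative. Uses Linux ping flags.
import subprocess
import time

CIRCUIT_NEXT_HOP = "192.0.2.1"     # hypothetical far end of the E2E circuit
CIRCUIT_PATH = "end-to-end circuit"
FALLBACK_PATH = "routed path via general IP connectivity"
PROBE_INTERVAL = 10                # seconds between probes

def circuit_is_up(next_hop: str) -> bool:
    """Probe the circuit next hop, much like an IP SLA icmp-echo operation."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", next_hop],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def main() -> None:
    active = None
    while True:
        desired = CIRCUIT_PATH if circuit_is_up(CIRCUIT_NEXT_HOP) else FALLBACK_PATH
        if desired != active:
            # In IOS this is where the tracked static route is withdrawn or installed.
            print(f"switching traffic to: {desired}")
            active = desired
        time.sleep(PROBE_INTERVAL)

if __name__ == "__main__":
    main()
```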
PerfSonar Monitoring
- Monitoring status of circuits: alert on a change of link status, utilization
- PingER RTT measurements
- PerfSonar-BUOY active measurements: BWCTL and OWAMP (see the sketch below)
- Two NPToolkit boxes
- Two LHCOPN/MDM monitoring boxes, also based on the NPToolkit
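For a sense of what those active measurements involve, the sketch below drives the same kind of OWAMP and BWCTL tests from a script. It assumes the standard owping and bwctl command-line clients are installed and that psonar.example.org is a willing measurement peer (both are assumptions, not part of the deployment described above); exact options may differ by version.

```python
# Minimal sketch of running the active measurements perfSONAR-BUOY schedules:
# one-way latency/loss via owping (OWAMP) and a throughput test via bwctl.
# Peer hostname is hypothetical; the owping/bwctl clients must be installed.
import subprocess

PEER = "psonar.example.org"   # hypothetical perfSONAR measurement peer

def run(cmd):
    """Run a measurement command and print its raw output."""
    print(f"$ {' '.join(cmd)}")
    out = subprocess.run(cmd, capture_output=True, text=True)
    print(out.stdout or out.stderr)

# One-way delay and loss in both directions (OWAMP)
run(["owping", PEER])

# A short throughput test toward the peer (BWCTL); -t sets the test duration in seconds
run(["bwctl", "-c", PEER, "-t", "10"])
```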
USCMS Tier1 Network in FY10-11
[Planned network diagram]
- r-s-bdr border router (end-to-end circuits) and site-core1/site-core2, with 20GE and 40GE uplinks
- Nexus 7000 pair with an 80-160GE L3 data path and a 20GE vpc peer link
- Nexus 2000 (N2K) fabric extenders for cmsstor nodes with 2x1GE: ~12 units, ~250 nodes / 500 x 1GE ports
- Satellite switches s-cms-gcc-1 ... s-cms-gcc-n attached via 80GE vpc; ~288 nodes x 9 c6509 = 2592 worker nodes
- CMS VLANs 187, 191, 207 are data-center wide; 2 x 80GE uplinks separate L2 VLAN traffic between switches; 160 Gbps ECMP for L3 traffic (illustrated below)
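The 160 Gbps L3 figure relies on equal-cost multipath: flows are hashed onto parallel links, so aggregate capacity scales with the number of members while each flow stays on one link and avoids reordering. Below is a conceptual sketch of that flow-hashing idea; the eight 20GE member links and the sample flows are assumptions made purely for illustration, not the actual link layout.

```python
# Conceptual sketch of ECMP flow hashing: each flow's 5-tuple is hashed onto one of
# several equal-cost member links, so L3 capacity adds up across the members while
# packets of a single flow always take the same link.
# The 8 x 20GE bundle and the sample flows below are hypothetical.
import hashlib

LINKS = [f"20GE-link-{i}" for i in range(8)]   # e.g. 8 x 20GE ~ 160G aggregate

def pick_link(src_ip, dst_ip, proto, sport, dport):
    """Deterministically map a flow 5-tuple to one member link."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha1(key).digest()
    return LINKS[int.from_bytes(digest[:4], "big") % len(LINKS)]

flows = [
    ("192.0.2.10", "198.51.100.20", "tcp", 41000, 2811),
    ("192.0.2.11", "198.51.100.21", "tcp", 41001, 2811),
    ("192.0.2.12", "198.51.100.22", "tcp", 41002, 2811),
]
for flow in flows:
    print(flow, "->", pick_link(*flow))
```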
Summary of the current status
- Two core Nexus 7000 switches for aggregation, currently interconnected at 20 Gbps, 160 Gbps in the future
- Two uplinks to the Site Network (read/write data to tapes)
- Uplink to the Border Router (non-US Tier2s, other LHC-related traffic)
- 20 Gbps toward ESNET CHIMAN and USLHCNET, SDN/DCN/E2E circuits
- ~200 dCache nodes with 2x1GE, ~1600 worker nodes with 1GE, ~150 various servers
- 2 x 20G for BlueArc NAS storage
- Satellite c6509 switches connected by 40G (30 Gbps + 10 Gbps)
- Redundancy/load sharing at L2 (vpc) and L3 (GLBP; illustrated below)
- IOS-based Server Load Balancing
- ~12 SDN/DCN end-to-end circuits
- Virtual port channeling (vpc)
- QoS (5 major classes of traffic)
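As a closing illustration of the L3 load-sharing piece: GLBP presents a single virtual gateway IP but answers ARP requests with different virtual MACs, so hosts spread their outbound traffic across the redundant forwarders without any client-side changes. The sketch below simulates that round-robin assignment; the gateway address, virtual MACs and host names are hypothetical.

```python
# Conceptual sketch of how GLBP shares load at L3: the Active Virtual Gateway (AVG)
# answers ARP requests for the one virtual gateway IP with different virtual MACs,
# each owned by an Active Virtual Forwarder (AVF), so hosts on the VLAN are spread
# across the redundant routers. Addresses and MACs below are hypothetical.
from itertools import cycle

VIRTUAL_GATEWAY_IP = "192.0.2.1"                 # the single gateway IP hosts configure
FORWARDER_VMACS = ["0007.b400.0101",             # AVF on hub router 1 (hypothetical)
                   "0007.b400.0102"]             # AVF on hub router 2 (hypothetical)

class ActiveVirtualGateway:
    """Round-robin assignment of virtual MACs to ARP requests, as the GLBP AVG does."""
    def __init__(self, vmacs):
        self._next = cycle(vmacs)

    def arp_reply(self, requesting_host: str) -> str:
        vmac = next(self._next)
        print(f"{requesting_host} asks who-has {VIRTUAL_GATEWAY_IP} -> reply {vmac}")
        return vmac

avg = ActiveVirtualGateway(FORWARDER_VMACS)
for host in ("worker-node-001", "worker-node-002", "worker-node-003", "worker-node-004"):
    avg.arp_reply(host)
```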