US CMS Tier1 Facility Network at Fermilab




US CMS Tier1 Facility Network at Fermilab
Andrey Bobyshev, Fermilab, Computing Division
Winter 2010 ESCC/Internet2 Joint Techs, Salt Lake City, Utah, January 31 - February 4, 2010

Outline of the talk:
- A little bit of history
- USCMS Tier1 Facility resources
- Data model/requirements
- The current status of the LAN (technologies deployed: Nexus 7000, HW redundancy, vPC, ISSU, QoS, GLBP, Rapid-STP, SLA, tracking objects, SLB, network-wide remote SPAN)
- Circuits / CHIMAN / USLHCNet
- Graphs and snapshots of the live network monitor (if time permits)

CMS Tier1 Facility Network staff:
- Phil DeMar, Andrey Bobyshev, Mark Bowden, Maxim Grigoriev, Vytautas Grigaliunas, Wenji Wu
- Fermilab's general site networking staff
- USLHCNet, ESnet, Internet2
- Chicago MAN (joint effort of ESnet, FNAL, ANL)
- CMS T1 computing staff and user community

Tier1 Facilities for the CMS Experiment
US CMS Tier1 is one of 7 Tier1 centers for the CMS experiment, and the biggest one. The facility spans 3 buildings, built around 2 x Nexus 7000 and 8 x C6509 switches, with worker nodes, CMSStor servers attached with 2x1GE, and other servers.

[Bar chart comparing tape, disk and CPU capacity across the Tier1 sites; the underlying numbers:]

Site     Country      Tape (TB)  Disk (TB)  CPU (kHS06)
Tier0    Switzerland  10000      2200       44000
FZK      Germany      2000       1060       7200
PIC      Spain        974        630        4232
CCIN2P3  France       1650       1067       7296
CNAF     Italy        804        516        4760
ASGC     Taiwan       2100       3100       7800
RAL      UK           1887       774        7208
FNAL     US           7100       2600       20400

A model of USCMS-T1 network data traffic
[Diagram: data flows between the T0 (2.2 Gbps), Tier2s/Tier1s (3.2 Gbps), the EnStore tape robots, the cmsstor/dCache nodes (federated file system), the CMS-LPC/SLB clusters (QoS), BlueArc NAS (10-20 Gbps), interactive users (1 Gbps), and data processing on ~1600 worker nodes (30-80 Gbps).]

January 2009 US CMS Tier1 Network
[Diagram: remote sites (Purdue, UTK, MIT, UNL, UCSD, UWisc, Caltech, UFL, DE-KIT, PIC, RAL) reached via LHCOPN/LHCNet, the ESnet MAN, the general Internet, and ESnet/Internet2 SDN/DCN end-to-end circuits; two hubs (HUB1, HUB2) interconnected at 40 Gbps, with 2x20G uplinks, 30G links to the worker-node switches, 20G to the tape robot, and 1G links; SRM/dCache servers and load-balanced servers attached with bonded 2x1GE Ethernet connected to different switches; BlueArc NAS on 2x1G Ethernet; L2/L3 redundancy via GLBP; 330 worker nodes per each C6509 switch.]
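The GLBP first-hop redundancy shown in the diagram lets hosts load-share across both hub routers instead of idling one of them. A minimal Cisco IOS sketch of one side of such a pairing (the VLAN, group number and addresses are illustrative placeholders, not the actual FNAL configuration):

```
! Hypothetical GLBP group on one of the two redundant routers.
interface Vlan207
 ip address 10.20.7.2 255.255.255.0
 glbp 7 ip 10.20.7.1                 ! virtual gateway shared by both routers
 glbp 7 priority 110                 ! prefer this router as the AVG while it is up
 glbp 7 preempt
 glbp 7 load-balancing round-robin   ! hand out alternating virtual MACs to hosts
```

Unlike HSRP/VRRP, GLBP answers ARP for the virtual IP with different virtual MAC addresses, so both routers forward traffic in normal operation while either one can take over the whole load on failure.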

USCMS Tier1: Data Analysis Traffic, 02/05 - 02/06/09
[Utilization snapshot: r-cms-fcc2-3 at 29.8 of 30G; r-cms-fcc2-2 at 18 of 20G; r-cms-fcc2 at 30 of 40G and 65 of 80G; r-cms-n7k-gcc-b at 18 of 30G, 20 of 20G and 20 of 30G; s-cms-gcc-1/3/4/5/6 at up to 25 of 30G. Source: https://fngrey.fnal.gov/wm/uscms]

US CMS Tier1 Network, January 2010
[Diagram: two Cisco Nexus 7000 cores interconnected by a 2x20GE vPC; 80GE and 40GE vPC downlinks to the satellite switches (330 worker nodes per each C6509 switch); ~20GE links toward the general Internet, the site network, the ESnet MAN/LHCOPN, and ESnet/I2 SDN/DCN circuits; the STK tape robot (s-exp-gcc-robot) behind r-s-hub-fcc over 20GE; CMS BlueArc on 2x1GE; CMSStor nodes bonded to different switches; equipment split across the FCC and GCC buildings.]
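The CMSStor/dCache nodes above bond two 1GE NICs that land on different switches. Since no single switch sees both links as one port channel, active-backup bonding is the usual Linux choice for this topology; a hedged sketch in Red Hat ifcfg style (device names and addresses are illustrative, not the FNAL host configuration):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 (illustrative)
# active-backup mode: only one slave carries traffic at a time, which is
# what works when the two slaves terminate on different switches.
DEVICE=bond0
IPADDR=10.0.0.10            # placeholder address
NETMASK=255.255.255.0
ONBOOT=yes
BONDING_OPTS="mode=active-backup miimon=100"
```

The two physical interfaces are then enslaved with `MASTER=bond0` and `SLAVE=yes` in their own ifcfg files; `miimon=100` polls link state every 100 ms so failover to the surviving switch is quick.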

Allocation of LAN Bandwidth per CMS Traffic Classes

Class                              BW%   At 80GE    At 40GE
Interactive, LPC                   2     1.6 Gbps   0.8 Gbps
Monitoring, DB                     2     1.6 Gbps   0.8 Gbps
Real Time, Transactions Traffic    2     1.6 Gbps   0.8 Gbps
Critical (Store Nodes, EnStore)    34    27.2 Gbps  13.6 Gbps
NAS                                10    8 Gbps     4 Gbps
Best Efforts                       50    40 Gbps    20 Gbps
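The percentages in the table map naturally onto a class-based queueing policy. A hedged MQC-style sketch of what such a policy could look like (class names are illustrative, the match criteria are omitted, and the actual Nexus 7000 QoS syntax differs in detail):

```
! Illustrative egress queueing policy mirroring the table above.
policy-map CMS-LAN-EGRESS
 class CMS-INTERACTIVE          ! interactive, LPC
  bandwidth percent 2
 class CMS-MONITORING           ! monitoring, DB
  bandwidth percent 2
 class CMS-REALTIME             ! real-time, transactions
  bandwidth percent 2
 class CMS-CRITICAL             ! store nodes, EnStore
  bandwidth percent 34
 class CMS-NAS
  bandwidth percent 10
 class class-default            ! best efforts
  bandwidth percent 50
```

The percentages sum to 100, so under congestion on an 80GE path the critical class is guaranteed 27.2 Gbps and best-effort traffic 40 Gbps; when there is no contention, any class may borrow unused bandwidth.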

End-to-End Circuits
USCMS-T1 has a long history of using USLHCNet, ESnet/SDN and Internet2 DCN circuits.

Circuit           Country         Affiliation  BW
LHCOPN            Switzerland     T0           8.5G
LHCOPN Secondary  Switzerland     T0           8.5G
LHCOPN Backup     Switzerland     T0           3.5G
DE-KIT            Germany         T1           1G
IN2P3             France          T1           2x1G
ASNet/ASGC        Taiwan          T1           2.5G
TIFR              India           T3           1G
McGill            Canada          CDF/D0       1G
Cesnet, Prague    Czech Republic  D0           1G

plus circuits to the US Tier2/Tier3 sites Caltech, Purdue, UWisc, UFL, UNL, MIT, UCSD and UTK.

SLA monitor with IOS track objects automatically fails traffic over if a circuit goes down.
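The automatic failover mentioned above combines IP SLA probes with tracked static routes: when the probe across a circuit stops answering, the tracked route is withdrawn and a floating static route over the backup path takes over. An illustrative Cisco IOS sketch (all addresses, prefixes and interface names are placeholders):

```
! Probe the far end of the circuit every 10 seconds.
ip sla 10
 icmp-echo 192.0.2.1 source-interface TenGigabitEthernet1/1
 frequency 10
ip sla schedule 10 life forever start-time now

! Track object follows reachability of the probe target.
track 10 ip sla 10 reachability

! Primary route over the circuit, installed only while track 10 is up;
! floating static (administrative distance 250) over the general path.
ip route 198.51.100.0 255.255.255.0 192.0.2.1 track 10
ip route 198.51.100.0 255.255.255.0 203.0.113.1 250
```

When the circuit recovers and the probe succeeds again, the tracked route is reinstalled and traffic moves back automatically.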

perfSONAR Monitoring
- Monitoring status of circuits: alert on a change of link status; utilization
- PingER RTT measurements
- perfSONAR-BUOY active measurements (BWCTL and OWAMP)
- Two NPToolkit boxes
- Two LHCOPN/MDM monitoring boxes, also based on the NPToolkit

USCMS Tier1 Network in FY10-11
[Diagram: border router r-s-bdr with end-to-end circuits; site-core1/site-core2 attached over 20GE links; the Nexus 7000 pair with an 80-160GE L3 data path and vPC peer link; Nexus 2000 fabric extenders (~12 units, ~250 cmsstor nodes / 500 1GE ports, 2x1GE per node) on 80GE vPCs carrying VLANs 191 and 207; satellite switches s-cms-gcc-1 ... s-cms-gcc-n on 80GE vPC uplinks; ~288 worker nodes x 9 C6509 = 2592 nodes.]
- CMS VLANs 187, 191 and 207 are data-center wide
- 2 x 80GE uplinks to separate L2 VLAN traffic between switches
- 160 Gbps ECMP for L3 traffic
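The 80GE vPC links in this design rely on the Nexus virtual port channel feature, which lets a downstream switch run a single port channel across both cores with no spanning-tree blocked uplink. A hedged NX-OS sketch of the core-side configuration (domain number, port channels and keepalive addresses are illustrative):

```
! On each Nexus 7000 core (mirror-image configuration on the peer).
feature vpc
vpc domain 1
  peer-keepalive destination 10.1.1.2 source 10.1.1.1 vrf management

interface port-channel1
  switchport mode trunk
  vpc peer-link              ! the peer link between the two cores

interface port-channel20
  switchport mode trunk
  vpc 20                     ! downstream C6509 sees one logical port channel
```

Both cores then forward for the same downstream port channel, so all uplink capacity is active at L2 while GLBP and ECMP provide the L3 redundancy.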

Summary of the current status:
- Two core Nexus 7000 switches for aggregation, interconnected at 20 Gbps currently, 160 Gbps in the future
- 2x20 Gbps to the site network (read/write data to tapes)
- Uplink to the border router (non-US Tier2s, other LHC-related traffic)
- 20 Gbps toward ESnet CHIMAN and USLHCNet, SDN/DCN/E2E circuits
- ~200 dCache nodes with 2x1GE
- ~1600 worker nodes with 1GE
- ~150 various servers
- 2x20G for BlueArc NAS storage
- Satellite C6509 switches connected by 40G (30 Gbps + 10 Gbps)
- Redundancy/load sharing at L2 (vPC) and L3 (GLBP)
- IOS-based server load balancing
- ~12 SDN/DCN end-to-end circuits
- Virtual port channeling (vPC)
- QoS (5 major classes of traffic)