
Intel Ethernet Switch
Ethernet Interconnects in ATCA Systems Enabled by Low Latency Switching
White Paper
September 2005

Legal

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. Intel may make changes to specifications and product descriptions at any time, without notice. Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights. The Controller may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.

Copyright 2011, Intel Corporation. All Rights Reserved.

Table of Contents

Overview
ATCA
    Driving Toward an Architecture Standard
    Unintentional Fragmentation
Ethernet as a Fabric Interconnect
    The Case for Ethernet
    Congestion Management in Ethernet
    Enabling Effective PAUSE-Based Congestion Management
Conclusion

Overview

High performance embedded systems form the basis for telecommunications and networking equipment as well as for enterprise server applications, storage and high performance computing platforms. Independently designed and optimized for their respective applications, these embedded systems have historically been expensive. The common drive toward cost reduction across all these markets has led toward standardization in the physical and electrical aspects of system design. One of the leading examples of this movement is the ATCA (Advanced Telecom Computing Architecture) specification developed by PICMG (the PCI Industrial Computer Manufacturers Group). In the ATCA specification, however, the backplane fabric interconnect remains open to a number of standards, a situation that has led to a splintering of the ecosystem, in conflict with the basic principle of driving toward commoditization through uniformity. This fragmentation has inhibited the growth of the ATCA industry.

This white paper makes the case that, among the candidate fabric technologies, Ethernet, with its continually evolving combination of features and performance, is the most consistent with the goals of the ATCA specification and will emerge as the one right interconnect. The unique attributes of Ethernet switch chips soon to enter the market from Intel will speed the adoption of Ethernet by improving congestion management, its Achilles' heel.

ATCA

Driving Toward an Architecture Standard

In the continuing effort to lower the cost of communications equipment, Telecommunications Equipment Manufacturers (TEMs) are migrating away from proprietary designs toward the use of standards-based designs. Standardization can be implemented at several levels in a system's design while preserving the ability to differentiate through software and services. Standards at the level of physical layer protocols, silicon, blades, chassis mechanical characteristics, power, system management and software have all been leveraged to lower the costs of system design, manufacture, maintenance and application development. The next logical step has been taken by the PICMG group in the development of the ATCA standard, perhaps better described as a unified set of standards, forming the basis for an entire system platform. Targeted primarily at carrier grade applications, PICMG 3.0 is the foundational specification for systems featuring scalable capacity, five-nines reliability, manageability, high availability, modularity and serviceability. A well thought out standard attracts a multitude of component vendors, creating a rich ecosystem of building blocks from which system designers may choose. Competition among vendors and economies of scale in manufacturing cause price reductions to ensue quickly.

The PICMG 3.0 specification for ATCA systems addresses numerous aspects of an open architecture modular platform: mechanical design, shelf management, power distribution, thermal management, and data transport. Figure 1 gives a physical overview of a basic system. In order to achieve broad market applicability, elements of the architecture have not been rigidly specified, but left flexible enough to accommodate a variety of usage models. One key aspect of the architecture on which the specification remains agnostic is the data interconnect, or fabric, technology.

PICMG 3.0 defines the electrical signaling and physical layout parameters for a shelf of up to 16 slots containing node boards and hub (switch) boards. The specification provides for four kinds of data transport among boards. A base interface is used primarily for the exchange of management information, although it may also be used for data transport. An update interface is used for redundancy between boards or for specialized fabric bypass. A clock synchronization interface can be used for sharing synchronous clock information throughout the shelf. The data transport fabric provides for the exchange of high volumes of data and ultimately defines the system capacity. It is this last interconnect type that remains open to a number of different data transmission protocols.

Figure 1. Elements of the ATCA Modular Architecture

Unintentional Fragmentation

Ideally, one interconnect technology would have all the attributes required for use across the wide range of applications targeted by ATCA systems - equipment spanning edge, core, transport and data center applications.

While the base system fabric is defined to be 10/100/1000BASE-T Ethernet, five subsidiary specifications, PICMG 3.1 through PICMG 3.5, define the use of Ethernet/Fibre Channel, InfiniBand, StarFabric, Advanced Switching and Serial RapidIO protocols, respectively, for the critical data fabric interconnect. Moreover, there are potential additions to this list as further proposals to PICMG are received. Each protocol has a set of advantages and disadvantages associated with it, and the PICMG 3.0 specification allows designers to choose from among the available options the one that best matches their particular set of requirements. Switching elements and fabric interface chips (FICs) on node and hub cards implement the interconnect protocol, and the cards are therefore not interchangeable with systems using different protocols. While the existence of numerous interconnect protocols does address the issue of broad applicability, it also has the undesirable effect of fragmenting the market for ATCA systems, thereby inhibiting the development of a robust component ecosystem, slowing market acceptance and increasing cost. Clearly, a single interconnect with all the required functionality would accelerate the adoption of the ATCA specification.

Ethernet as a Fabric Interconnect

The Case for Ethernet

Due largely to its ubiquity, Ethernet offers compelling features that are consistent with the drive toward low cost system architectures. Initially developed as a LAN/WAN networking protocol, it has been successfully applied as a box-to-box interconnect and as a backplane interconnect, bringing it to a pre-eminent position in terms of the number of installed network connections. This adaptability is a direct result of the formation of working groups within the IEEE 802 LAN/MAN Standards Committee for the purpose of creating extensions to the standard. Its very large installed base has resulted in the highest level of software compatibility, the lowest implementation and support cost, and the broadest applicability of all the candidate interconnect technologies. If Ethernet adequately addressed all the critical functional requirements of ATCA switch fabrics, it would be the obvious choice to standardize upon, but there is one area where Ethernet is perceived to be deficient.

Congestion Management in Ethernet

ATCA-based systems are being designed for demanding compute and communication applications, requiring blade-to-blade movement of vast amounts of data. Data transmission through the backplane switch fabric must support carrier-class performance, in turn requiring lossless transmission, flow control and congestion management - essential elements in enabling Quality of Service (QoS). It is in this related set of requirements where Ethernet is seen as deficient.

Presently, Ethernet's fundamental mechanism for providing congestion management (other than frame discard, which is unacceptable) is the PAUSE function provided by IEEE 802.3x. This mechanism provides XON/XOFF functionality via the transmission of MAC control frames in the upstream direction, from a receiving port experiencing congestion to the transmitting partner originating the congesting traffic. The reception of a PAUSE control frame causes the transmitting device to halt transmission for a time period specified within the frame - enough time for the congested device to relieve its buffer overflow. PAUSE was conceived as a link layer flow control mechanism and can be effective when applied to a pair of directly connected devices and to transitory congestion, where peak traffic conditions exist for short periods and can be buffered. However, IEEE 802.3x flow control as currently defined does not distinguish between multiple data flows that use the same path, meaning all data flows through a path are paused when only one flow is congested. Because of this, when applied to more complex data paths - for example, traffic from multiple end points to multiple end points through a switch - this mechanism may cause head-of-line (HOL) blocking, ultimately resulting in congestion spreading. Whereas 802.3x PAUSE operation can be used effectively at the link level, it is not deemed effective for end-to-end congestion management. For this reason, 802.3x flow control has not been widely implemented.

Class-based congestion management mechanisms - those that can distinguish between multiple data flows sharing a path - are capable of mitigating higher levels of congestion before congestion spreading becomes an issue. They can flow control specific data flows without slowing other data flows sharing the same path. A Task Force within the IEEE 802.3 Working Group is now working to add increased congestion management capabilities to Ethernet that will resolve these issues, and efforts outside the task force may result in more immediate ad hoc solutions.

Enabling Effective PAUSE-Based Congestion Management

Nevertheless, when considering a model for general ATCA data transmission from an NPU on one card to an NPU on another card (or traffic manager to traffic manager), there are certain attributes, present in Intel Ethernet switches, that enable effective PAUSE-based end-to-end congestion management. Straightforward enhancements to the basic operation of 802.3x PAUSE, coupled with very low latency through the switch, form the basis for this operation.
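As background for the flow described below, the XON/XOFF signaling uses a short MAC Control frame whose layout is fixed by IEEE 802.3 (Annex 31B). The following C sketch is for illustration only: the struct and function names are hypothetical, the field values follow the standard, and the one departure noted in the comments (carrying a flow's logical source address rather than the port address) is the enhancement this paper describes.

```c
/*
 * Minimal sketch of an IEEE 802.3x PAUSE (MAC Control) frame.
 * Field layout per IEEE 802.3 Annex 31B; names are illustrative.
 */
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons */

#pragma pack(push, 1)
typedef struct {
    uint8_t  dst[6];      /* 01-80-C2-00-00-01: reserved MAC Control multicast */
    uint8_t  src[6];      /* normally the sending port's address; in the scheme
                             described below, the logical SA of the flow to throttle */
    uint16_t ethertype;   /* 0x8808 = MAC Control */
    uint16_t opcode;      /* 0x0001 = PAUSE */
    uint16_t pause_time;  /* duration in quanta of 512 bit times; 0 = resume (XON) */
    uint8_t  pad[42];     /* zero padding to the 64-byte minimum (FCS appended by the MAC) */
} pause_frame_t;
#pragma pack(pop)

/* Build a PAUSE frame asking the peer to stop sending for 'quanta' quanta. */
void build_pause(pause_frame_t *f, const uint8_t flow_sa[6], uint16_t quanta)
{
    static const uint8_t ctrl_mcast[6] = { 0x01, 0x80, 0xC2, 0x00, 0x00, 0x01 };

    memset(f, 0, sizeof(*f));
    memcpy(f->dst, ctrl_mcast, sizeof(ctrl_mcast));
    memcpy(f->src, flow_sa, 6);
    f->ethertype  = htons(0x8808);
    f->opcode     = htons(0x0001);
    f->pause_time = htons(quanta);
}
```

Each pause quantum is 512 bit times, so at 10 Gbps one quantum is 51.2 ns and the maximum pause_time of 0xFFFF corresponds to roughly 3.4 ms of transmitter silence.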

A generalized data flow from an NPU on line card 1 to another NPU on line card 2 through an Ethernet switch fabric is illustrated in Figure 2. In this scenario, two data flows traverse the backplane from NPU1 to NPU2 through the switch - a higher priority flow (green) and a lower priority flow (red). The different data flows constituting the 10 Gbps traffic are distinguished by the different MAC addresses assigned to them. Detecting congestion, NPU2 chooses to flow control the low priority data flow, accomplished by the following steps as numbered in Figure 2.

Figure 2. Data Flows Across an ATCA Backplane

1. NPU2 detects a congested condition.
2. NPU2 issues a PAUSE control frame toward NPU1 bearing the source address (SA) of the low priority data flow. [NPUs have the ability to assign and manage multiple MAC SAs for data flows using the same physical interface. In this way the NPUs are able to communicate between themselves as a collection of logical end points.]
3. The switch fabric chip, with its flexible port logic, may optionally respond to the PAUSE frame.
4. The switch fabric chip, again through the use of its flexible port logic, passes the PAUSE frame on toward NPU1.
5. NPU1 responds to the PAUSE by throttling the queue for the low priority data flow without affecting the throughput of the higher priority flow.

NPU2 must have enough queue buffer capacity to absorb all the frames it receives until NPU1 throttles the low priority data flow. Frames partially transmitted, partially received and in flight must be stored in NPU2's input buffers during the PAUSE operation's latency. The switch latency, as a contributor to the overall latency of the PAUSE mechanism, must therefore be kept to an absolute minimum. Low latency switching ensures that end point buffer requirements are minimized, and Intel's industry leading latency performance is a key enabling factor in this congestion management solution.
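To see why the switch's contribution to latency matters, the worst-case buffering NPU2 needs is roughly the link rate multiplied by the total PAUSE reaction time (congestion detection, PAUSE frame transmission, switch forwarding latency, and NPU1's response), plus any frame already committed to the wire. The sketch below works through that arithmetic; the latency figures are illustrative assumptions, not characterized values for any particular device, and the variable names are hypothetical.

```c
/*
 * Illustrative back-of-the-envelope sizing of the receive buffering NPU2
 * needs while a PAUSE takes effect. All latency values are assumptions
 * chosen only to show the arithmetic.
 */
#include <stdio.h>

int main(void)
{
    const double link_rate_bps     = 10e9;     /* 10 Gbps fabric link                         */
    const double detect_ns         = 500.0;    /* NPU2 congestion detection (assumed)         */
    const double pause_tx_ns       = 67.2;     /* 64-byte PAUSE + preamble/IFG at 10 Gbps     */
    const double switch_latency_ns = 200.0;    /* cut-through fabric switch (assumed)         */
    const double npu1_react_ns     = 500.0;    /* NPU1 parses PAUSE, halts the queue (assumed)*/
    const double in_flight_bits    = 1518 * 8; /* one maximum-size frame already committed    */

    const double reaction_ns = detect_ns + pause_tx_ns + switch_latency_ns + npu1_react_ns;
    const double buffer_bits = link_rate_bps * reaction_ns * 1e-9 + in_flight_bits;

    /* Roughly: 10 Gbps * ~1.27 us + 1518 B, i.e. about 1.6 KB + 1.5 KB ~ 3 KB per paused flow. */
    printf("PAUSE reaction time: %.1f ns\n", reaction_ns);
    printf("Required buffering : %.0f bytes\n", buffer_bits / 8.0);
    return 0;
}
```

Under these assumptions, every additional microsecond of switch latency adds about 1.25 KB of required buffering per 10 Gbps flow, which is why a low latency switch fabric directly reduces end point buffer requirements.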

Conclusion

Ethernet's long history of adaptability has led to its enormous installed base. Its ubiquity as a network connection has contributed to its unparalleled low cost and ease of use. For these reasons, as well as its specified use as the base fabric, Ethernet should be considered for the high capacity data fabric in ATCA systems. Its weak point - congestion management - is overcome in the near term by switch fabric products from Intel that feature flexible port logic and industry leading low latency. In the long term it will be overcome by class-based flow control to be defined by the IEEE and designed into future Intel products.
