Data Center Switch Fabric Competitive Analysis




Introduction

This paper analyzes the Infinetics data center network architecture in the context of the best solutions available today from leading vendors such as Cisco, Juniper Networks, Arista Networks, and Force 10 Networks. The target audience is designers of network infrastructure hardware and control software used in large-scale data centers. The document is organized as follows:

- Overview of the Infinetics architecture
- Analysis of leading industry solutions
- Analysis of the current architecture of choice
- Comparison of the Infinetics architecture
- Conclusion

Overview of Infinetics Architecture

Infinetics has developed a new way of connecting large numbers of nodes, each consisting of some combination of computation and data storage, with behaviors and features that are hard or impossible to achieve using current methods. We have developed software that runs on standard data center switches and hypervisors and supports any network topology. We have also determined that specific new topologies work far better than all those discovered to date, and we have tuned our initial implementation to support one of them.

The essential difference between the Infinetics approach and all existing solutions is the flexible, practically unlimited radix of the networks that can be constructed. Although some switches can be upgraded from an initial configuration with a smaller radix to a configuration with a higher radix, the maximum radix is fixed in advance at no more than a few hundred to a few thousand ports. Further, the radix-multiplier switching fabric for the maximum configuration is hardwired into the switch design. For example, a typical commercial switch such as the Arista Networks 7500 can be expanded to 384 ports by adding 1-8 line cards, each providing 48 ports; but the switching fabric gluing the 8 separate 48-port switches into one 384-port switch is rigidly fixed by the design, and it is included even in the basic unit.

In contrast, the Infinetics architecture places no upper limit, either in advance or later, on the maximum number of ports it can provide. For any given type of switch with radix R, the upper limit for simple expansion without performance penalty is 2^(R-1) component switches. Since a typical R is at least 48, even this conditional limit of 2^47 ≈ 1.4 × 10^14 on the radix expansion is already far larger than the number of ports in the entire Internet, let alone in any existing or contemplated data center.

The Flexible Radix Switch does not require the very expensive core and fabric switches usually needed to control broadcast flooding and other adverse behaviors of large data center networks. Instead, it can be configured to run on basic commodity switches. It can also run on more powerful switches, but it does not require complicated configuration. Additionally, the architecture provides a significant performance edge over any existing or proposed data center Layer 2 network.
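As a quick sanity check on the figure above, the short sketch below (illustrative only, not Infinetics code) evaluates the quoted 2^(R-1) expansion limit for a component switch of radix R = 48.

```python
# Minimal sketch: evaluates the radix-expansion limit 2^(R-1) quoted in the text.

def expansion_limit(radix: int) -> int:
    """Upper limit on component switches for simple expansion without
    performance penalty, per the 2^(R-1) figure in the paper."""
    return 2 ** (radix - 1)

if __name__ == "__main__":
    R = 48  # typical switch radix assumed in the paper
    print(f"R = {R}: limit = 2^{R - 1} ≈ {expansion_limit(R):.3e} component switches")
    # Prints roughly 1.4e+14, matching the 2^47 ≈ 1.4 x 10^14 figure above.
```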

Infinetics' new network architecture provides:

a) A nearly limitless number of nodes.
b) Throughput that scales nearly linearly with the number of nodes, without bottlenecks or throughput restrictions.
c) Simple incremental expansion: increasing the number of nodes requires only a proportional increase in the number of switching components, while maintaining the throughput per node.
d) Maximized parallel multipath use of the available node interconnection paths to increase node-to-node bandwidth.
e) Long-hop topology enhancements that simultaneously minimize latency (average and maximum path lengths) and maximize throughput at any given number of nodes.
f) A fully unified and scalable control and management plane.
g) Very simple connectivity: nodes connected to the interconnection fabric do not need any knowledge of topology or connection patterns.
h) Streamlined interconnection paths: all dense interconnections have regular wiring patterns and use very short cables; physically distant nodes have sparse connections, resulting in very simple and economical interconnection and wiring.

Leading Industry Solutions

Many leading network hardware vendors have developed products that can be configured to provide the very high throughput required by modern data centers with large numbers of physical and virtual servers. While these solutions are a great improvement over traditional switch hardware, they carry inherently high costs that result from the fundamental characteristics of the hardware and firmware within the switches. Vendors emphasize various attributes of their networks, sometimes focusing on the improvement over traditional networks, and sometimes on the relative merits of one vendor's solution over another's. The following analysis of four vendors' hardware sets the scene for a direct, fact-based comparison with the behavior of the Infinetics network architecture. This approach removes the bias introduced by vendor self-promotion and focuses solely on what is physically possible based on each product's publicly disclosed operating characteristics.

Cisco Analysis

The analysis was performed on the Cisco Nexus 7018 in its FabricPath configuration, with a trunking factor of 32, meaning that each of the links from a top-layer switch to a bottom-layer switch uses 32 ports on each switch for interconnection. This results in a network that:

- Uses 6 Nexus 7018 switches
- Costs $3,300K (or $3,223 per port)
- Consumes 40 Watts per port
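The per-port figure quoted above follows directly from the total cost and the port count. The snippet below is an illustrative check only, assuming the 1,024-port build-out used as the common basis for comparison later in this paper.

```python
# Illustrative check of the quoted Cisco per-port cost, assuming 1,024 available ports.
total_cost_usd = 3_300_000   # $3,300K for 6 Nexus 7018 switches (from the text)
ports = 1024                 # assumed available-port count used in this comparison

print(f"Cost per port: ${total_cost_usd / ports:,.0f}")  # ~ $3,223 per port
```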

Arista Networks Analysis

The analysis was performed on the Arista 7500 series, with a trunking factor of 72. This results in a network that:

- Uses 8 Arista 7500 switches
- Costs $2,880K (or $2,813 per port)
- Consumes 40 Watts per port

Juniper Networks Analysis

The analysis was performed on the Juniper EX8216, with a trunking factor of 8. This results in a network that:

- Uses 24 Juniper EX8216 switches
- Costs $10,440K (or $10,195 per port)
- Consumes 141 Watts per port

Force 10 Networks Analysis

The analysis was performed on the Force 10 E1200i, with a trunking factor of 8. This results in a network that:

- Uses 24 Force 10 E1200i switches
- Costs $8,544K (or $8,344 per port)
- Consumes 110 Watts per port

Cost and Power Savings with the Infinetics Approach

The Infinetics architecture that provides the equivalent of 1,024 available ports and bandwidth equal to the industry vendor solutions described above:

- Uses 64 PICA8 Pronto 3780 switches
- Has an oversubscription ratio of 1.057 (effectively equal to the 1.0 of the other vendor networks)
- Costs $768K
- Consumes 22 Watts per port

Table 1 shows the cost and power of each vendor network relative to the Infinetics network.

                 Cisco    Arista   Juniper   Force 10
Relative cost    4.3x     3.8x     13.5x     11x
Relative power   1.8x     1.8x     6.4x      5x

Table 1. Vendor network cost and power relative to the Infinetics network.
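The ratios in Table 1 can be reproduced directly from the per-network totals quoted above. The sketch below is illustrative only; the inputs are the figures stated in this paper.

```python
# Illustrative reproduction of Table 1 from the totals quoted in the text.
# Ratios are relative to the Infinetics configuration ($768K total, 22 W/port).

networks = {                # (total cost in $K, Watts per port) from the text
    "Cisco":    (3300, 40),
    "Arista":   (2880, 40),
    "Juniper":  (10440, 141),
    "Force 10": (8544, 110),
}
infinetics_cost_k, infinetics_w_per_port = 768, 22

for name, (cost_k, w_per_port) in networks.items():
    print(f"{name:9s} relative cost {cost_k / infinetics_cost_k:4.1f}x, "
          f"relative power {w_per_port / infinetics_w_per_port:3.1f}x")
# Yields roughly 4.3x / 3.8x / 13.6x / 11.1x cost and 1.8x / 1.8x / 6.4x / 5.0x power,
# matching Table 1 (the paper rounds 13.6x to 13.5x and 11.1x to 11x).
```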

Current Architecture of Choice

First, some background. Folded Clos is a family of network topologies, typically controlled by three parameters that select a specific member from a very wide range of possibilities. The Fat Tree is parametrized as FT(h, m, w), where h is the number of layers, each non-leaf node has m children, and each child has w parents. The simple tree is obtained by setting the number of parents w to 1. The equivalent of a hypercube (of dimension d = 2h) is obtained by FT(h, 4, 2h). Many other topologies are possible.

One partially scalable Folded Clos subclass (SFC) is of particular interest because of its use in the emerging TRILL standard and in Cisco's FabricPath-based networks. In this two-layer network, the top layer is called the "spine" and has no servers connected, and the bottom layer is called the "leaf" layer, where servers connect. This network is only partially scalable, since the maximum number of external ports (hence servers) is only P = R^2 / 2, where R is the radix (number of ports) of the switch used as the building block. A truly scalable network does not limit the maximum number of attached servers. In Figure 1 below, the B switches are the spine layer, the A switches are the leaf layer, Q is the trunking factor, and M is the number of switches in the spine layer.

Figure 1. Two-layer SFC topology: spine switches B, leaf switches A, trunking factor Q, M spine switches.

SFC becomes scalable if an arbitrary number of layers is used instead of the usual 2. With H layers, the number of external ports is P = 2 * (R/2)^H. Hence, for any fixed component-switch radix R, the number of ports P can grow arbitrarily large. The present generation of solutions offered by the major switch vendors achieves the behavior of SFC only with large-radix component switches. This scheme locks users into chasing an ever larger radix R as the network grows, while making previous switches with smaller R obsolete. Therefore, we consider below only the SFC topology, since it applies to the commercially available FabricPath, QFabric, Fulcrum's FocalPoint hardware implementation, and other solutions.
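The port-count formulas above are easy to evaluate. The sketch below is illustrative, assuming a component-switch radix of R = 48 (a value the paper cites as typical, not one tied to a specific vendor table).

```python
# Illustrative evaluation of the SFC port-count formulas quoted above.

def sfc_two_layer_ports(radix: int) -> int:
    """Maximum external ports of a two-layer SFC: P = R^2 / 2."""
    return radix ** 2 // 2

def sfc_multilayer_ports(radix: int, layers: int) -> int:
    """Maximum external ports of an H-layer SFC: P = 2 * (R/2)^H."""
    return 2 * (radix // 2) ** layers

R = 48
print(f"Two-layer SFC, R={R}: {sfc_two_layer_ports(R):,} ports")       # 1,152
for H in (2, 3, 4):
    print(f"{H}-layer SFC, R={R}: {sfc_multilayer_ports(R, H):,} ports")
# H=2 gives 1,152; H=3 gives 27,648; H=4 gives 663,552 - growing as (R/2)^H.
```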

The chief feature of SFC is that it produces a non-blocking network with the maximum bisection for any given radix and number of component switches N (where N must be divisible by 3). Since the cost of such a network is N times the switch cost, this implies that no other network using the same component switches and having the same bisection can cost less than SFC: to reach the matching bisection, a would-be competitor of SFC would have to use at least as many switches of the given radix R as SFC, which means it would cost at least as much.

However, SFC pays a hidden price for this desirable feature of maximum bisection: it is over-optimized for worst-case traffic, the case in which each source sends only to the farthest destination in the network, i.e., the singular case in which all paths are of precisely the maximum length. As a result, SFC suffers a large throughput penalty on the remaining 99.99+ percent of all possible traffic patterns. The worst-case pattern is an exponentially small fraction of all traffic patterns; the magnitude of the resulting penalty is shown in Figure 2 below.

Figure 2. Buffering latency versus offered traffic load for SFC, hypercube, and flattened butterfly topologies.

The vertical axis expresses latency due to buffering of frames that cannot be forwarded, i.e., an indirect measure of network overload. The horizontal axis shows the traffic load relative to the "all pipes full" capacity. At exactly 50% of the maximum load, the throughput of SFC tops out and all extra frames must be buffered indefinitely. In contrast, the hypercube and flattened butterfly topologies reach the same overload point only when the traffic load reaches the actual full capacity, i.e., at twice the load of the SFC overload point.

Thus, for the sake of achieving the maximum bisection possible for a given number of switches of a given radix R, SFC squanders half of its switching capacity. For 99.99+ percent of traffic patterns it is heavily underutilized, with a maximum of 50% utilization. As a result, its cost per Gb/s of throughput, which is dominated by the throughput on the non-worst-case traffic patterns, is double that of the regular hypercube or flattened butterfly.
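The doubling of cost per Gb/s follows from the saturation points described above. The sketch below models that argument with assumed numbers (the $1M cost and 10 Tb/s installed capacity are placeholders, not figures from this paper); only the 50% versus ~100% sustainable-utilization split comes from the text.

```python
# Illustrative model of the cost-per-throughput argument, with assumed inputs.
# SFC saturates at 50% of installed capacity on typical traffic; hypercube and
# flattened butterfly saturate only near full capacity.

def cost_per_gbps(network_cost_usd: float, installed_gbps: float,
                  max_utilization: float) -> float:
    """Cost per Gb/s of usable throughput at the topology's saturation point."""
    return network_cost_usd / (installed_gbps * max_utilization)

cost, capacity = 1_000_000, 10_000   # assumed: $1M network, 10 Tb/s installed
print("SFC:       $%.0f per Gb/s" % cost_per_gbps(cost, capacity, 0.5))
print("Hypercube: $%.0f per Gb/s" % cost_per_gbps(cost, capacity, 1.0))
# The SFC figure comes out exactly 2x the hypercube figure, which is the
# doubling of cost per Gb/s claimed in the text.
```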

Comparison of Infinetics Architecture

The Infinetics long-hop hypercube augments the bisection to almost the level of the folded Clos (mathematically the maximum possible), while simultaneously shortening the average path lengths by a factor of 2 or more compared to a plain hypercube, as illustrated in Table 2 below. Capacity is likewise boosted by the same factors over the plain hypercube for average traffic (random, all-to-all, etc.).

Table 2. Path-length and bisection comparison of the long-hop hypercube versus a plain hypercube.

Therefore, the Infinetics long-hop hypercube augmentation achieves the best of both worlds: it handles worst-case traffic (bottleneck capacity) nearly as well as the best that SFC can achieve, while simultaneously improving on the most common traffic case by a factor of 2 or more over a plain hypercube.
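Since Table 2 itself is not reproduced in this transcription, the sketch below computes only the plain-hypercube baseline by brute force; the long-hop claim in the text is that the average path is shortened to roughly half of this or less. The choice of dimensions is an assumption for illustration.

```python
# Baseline for the long-hop comparison: average shortest-path length in a plain
# d-dimensional hypercube, computed over all distinct node pairs.

from itertools import combinations

def avg_hypercube_path(d: int) -> float:
    """Average Hamming distance (= shortest-path hops) over distinct node pairs."""
    nodes = range(2 ** d)
    dists = [bin(a ^ b).count("1") for a, b in combinations(nodes, 2)]
    return sum(dists) / len(dists)

for d in (6, 8, 10):   # assumed small dimensions, just to show the ~d/2 baseline
    print(f"d={d}: average path ≈ {avg_hypercube_path(d):.2f} hops (diameter {d})")
# A 10-dimensional hypercube (1,024 nodes) averages about 5 hops; per the text,
# long-hop augmentation would cut that average by a factor of 2 or more.
```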

Conclusion

The basic conclusions on how Infinetics compares against the Fat Tree, which is used by Cisco's FabricPath, Juniper's QFabric backbone, and other upcoming TRILL-enabled networks, are clear and simple:

1. With the same component switches as used in the competing topologies, the Infinetics cost per Gb/s of network throughput will be less than half the cost of a Fat Tree based topology.

2. Using common off-the-shelf commodity switches with very low per-port cost, Infinetics can build a network with approximately 10 times lower cost per port at equivalent available bandwidth, with no penalties imposed by the traffic patterns encountered in real-world data center usage scenarios.

3. The Infinetics bottleneck capacity for worst-case traffic patterns (which also defines the oversubscription figure) will be practically identical to the mathematically best possible value, as shown in the Cisco example above, where the Infinetics oversubscription ratio is 1.057 compared to exactly 1 for the Fat Tree.

4. Unlike SFC, which limits the size of the flat Layer 2 network to R^2/2 ports for a component switch of radix R, the upper limit of the Infinetics flat Layer 2 network is exponential in R, which even with 48-port COTS switches is for all practical purposes unlimited. In contrast, any Fat Tree architecture requires the largest available switches even to reach a flat Layer 2 size of a few thousand 10GbE ports.

In other words, Conclusions (1) and (2) above describe the best that a Fat Tree can do in the limited context of fairly small data center networks. To support larger networks, the Fat Tree approach has to rely on current data center expand-up schemes, which vastly increase the cost per Gb/s.

Infinetics Technologies, Inc.  www.infinetics.com  info@infinetics.com  T: 877-438-1010

Notice: Infinetics and the Infinetics logo are trademarks of Infinetics Technologies, Incorporated. All other trademarks are the property of their respective owners.