Server Consolidation and Remote Disaster Recovery: The Path to Lower TCO and Higher Reliability

Executive Summary

To minimize TCO for server/data center consolidation and optimize disaster recovery and backup operations, enterprises must carefully consider the real costs of using older generations of switches compared to the economies and new functionalities enabled by next generation switch/routers designed for high-performance Ethernet applications:

1. Next generation switch/routers with high-capacity Ethernet services will substantially reduce network complexity and cost.
2. Feature rich, line rate performance simplifies network design, security, and control, eliminating the need for complex traffic engineering analysis and management.
3. Next generation equipment incorporating multiple levels of high-availability drastically reduces the likelihood of "single-component" catastrophic failures, providing better protection than previously afforded by existing equipment.
4. 10 Gigabit Ethernet for data center/disaster recovery/backup interconnect using dark fiber or individual wavelengths (lambdas) enables high-speed transport services at significant cost savings to the enterprise today.

Server Consolidation and Remote Disaster Recovery

Today, several enterprises plan to reduce the number of data centers they operate, resulting in significant reductions in total cost of ownership (TCO). Replacing multiple outdated servers with fewer high-performance servers saves on space and power as well as on operations and maintenance. Too often, enterprise consolidation planning examines the effects of server consolidation by looking only at the price per port of a network interface. This paper examines how the hidden costs of interconnecting and managing these solutions can amount to many times their purchase cost.
Without a holistic approach that includes the networking infrastructure, TCO gains made by consolidating servers alone can be negated by other factors, including:

  Network equipment failure
  Human error
  Rogue user activities
  Natural and unnatural disasters

To ensure the near 100% uptime required by enterprises, the overall consolidation design must incorporate additional high availability, disaster recovery, and data backup considerations at every level, starting with the networking infrastructure and extending through the data transport system.

To minimize TCO for server/data center consolidation and optimize disaster recovery and backup operations, enterprises must carefully consider the real costs of using older generations of switches compared to the economies and new functionalities enabled by next generation switch/routers designed for high-performance Ethernet applications.
Higher-Density and Higher-Capacity Decreases TCO

For a given size server cluster, the complexity of the network aggregating the server connections directly relates to the number of line rate ports supported in a single chassis. The most cost-effective network is built with the highest density switch/routers. When the size of the cluster grows beyond the capacity of a single switch/router, retaining non-blocking connectivity between servers becomes a very large problem: the size of the aggregation network increases non-linearly and its cost rises dramatically. The reason is simple: ports that used to connect to servers must now be dedicated to interconnecting the switches.

Figure 1 compares the older generation networking equipment to next generation switch/routers. Older equipment supporting one half the port density leads to a six-fold increase in the number of switch/routers and a three-fold increase in the number of ports required to support an equivalent number of servers. With these increases come reliability concerns and additional management complexity.

As enterprises pursue consolidation, replacing older servers and their 10/100 Ethernet connections with a smaller number of new higher-performance servers with Gigabit Ethernet (GbE) connectivity is the clear choice. Consolidation of these servers into fewer data centers further reduces TCO, as stated earlier. The result is a high concentration of GbE attached servers in the remaining data center locations. These changes dramatically alter the demands put on the network infrastructure.

The last generation of Ethernet switches is very limited in its ability to meet these demands because it lacks three critical elements: 1) capacity, 2) line rate GbE and 10 Gigabit Ethernet (10 GbE) performance, and 3) high-availability design. Because of its unmatched GbE and 10 GbE port density, the Force10 Networks E-Series is a capable choice for building scalable, cost-effective clusters.
Keeping the cluster in one box as long as possible is clearly the most cost-effective choice. Unlike past generations of Ethernet switches, the E-Series enables GbE attached clusters of up to 624 nodes (including uplink connections). As the cluster grows, the E-Series, with its capacity of 56.25 Gbps per slot, has enough headroom to double its port density without requiring a forklift upgrade.

[Figure 1: Twice the capacity at 1/6 the complexity. Eight server nodes connected by one 8-port switch, compared with the same eight nodes connected by six 4-port switches.]

Consider a switch with eight GbE ports. For an eight-node server cluster, all eight of these ports are available to connect to the servers. If switches with four ports were used instead, only half of the ports on each switch could be devoted to server connections, because the other half would now be needed for switch connections. Thus, in this example of non-blocking interconnection, six times as many 4-port switches and three times as many ports are needed to connect the same eight server nodes.
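The arithmetic behind the Figure 1 comparison can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name nonblocking_fabric and the two-tier layout it assumes (edge switches that split their ports evenly between servers and uplinks, plus interconnect switches that carry only uplinks) are assumptions introduced here to reproduce the paper's eight-node example.

```python
import math


def nonblocking_fabric(servers: int, ports_per_switch: int) -> tuple[int, int]:
    """Return (switch_count, total_ports) for a non-blocking server network.

    Assumed model (not taken from any vendor tool): if the servers fit in one
    switch, one switch is used. Otherwise a two-tier design is built in which
    each edge switch gives half its ports to servers and half to uplinks, and
    interconnect switches terminate the uplinks.
    """
    if ports_per_switch < 2 or ports_per_switch % 2:
        raise ValueError("ports_per_switch must be an even number >= 2")
    if servers <= ports_per_switch:
        return 1, ports_per_switch

    server_ports = ports_per_switch // 2           # per edge switch
    if servers > server_ports * ports_per_switch:  # two tiers cannot reach further
        raise ValueError("cluster too large for a two-tier design")

    edge = math.ceil(servers / server_ports)
    uplinks = edge * (ports_per_switch - server_ports)
    interconnect = math.ceil(uplinks / ports_per_switch)
    switches = edge + interconnect
    return switches, switches * ports_per_switch


print(nonblocking_fabric(8, 8))  # (1, 8)
print(nonblocking_fabric(8, 4))  # (6, 24)
```

For the eight-node example this gives one switch and 8 ports with 8-port equipment, against six switches (four edge, two interconnect) and 24 ports with 4-port equipment: six times the switches and three times the ports.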
Feature Rich Line Rate Performance Simplifies Network Design

Vendors of previous generation switches address their lack of capacity by increasing port densities without providing proportional increases in total system bandwidth or performance. This divides the already limited capacity of the switch across more ports, resulting in ports that operate well below line rate. On the surface, these over-subscribed or blocking ports address the explosion of boxes outlined above, but they introduce new burdens.

Even though the majority of servers available cannot on average fill a GbE link, bursts of traffic can. Traffic patterns depend on a number of variables: application, number of users, time of day, degree of contention for storage or database resources, and so on. With a blocking consolidated network, the IT manager needs to consider these variables in relation to the levels of service required in the network. Additionally, the IT manager must consider the evolution of the server cluster to ensure that enough flexibility is designed into the network to meet future bandwidth needs. In the implementation phase, managers use traffic prioritization and Quality of Service (QoS) capabilities to fine-tune the network to avoid choke points, and incorporate Access Control Lists (ACLs) for network security and control.

All of the tasks mentioned above add significant delay to the consolidated environment and increase recurring operational costs. However, enterprises can avoid these costs completely by using network equipment that delivers line rate performance on all ports regardless of the type of security or control service enabled on those ports or systems. Figure 2 illustrates the Force10 Networks E-Series architecture.
Force10 ASICs have line rate support for filtering, statistics collection, QoS, rate policing, and limiting. Force10 ASICs also deliver protocol-specific hardware support at line rate for L2 switching and L3 routing. The E-Series line rate high-touch features also include:

  Filtering with standard and extended ACLs
  QoS (DiffServ, IEEE 802.1p)
  Rate policing and limiting
  L2 switching features
    Source address learning and limiting
    Link Aggregation
    VLAN stacking
  L3 routing features
    ECMP
    Inter-VLAN routing
    IP multicast
  Packet over SONET/SDH
  Statistics collection

[Figure 2: E-Series Architecture]
Multiple Levels of High Availability

A direct result of server consolidation is fewer server clusters running more applications and serving more users. Here, the impact of any downtime, scheduled or not, is magnified. This multiplies the importance of network availability for server access. High availability becomes essential at two levels: the individual switch/router and the overall network. Server availability is addressed by providing redundant chassis and giving each server redundant GbE connections.

[Figure 3: Network redundancy. Eight dual-homed server nodes connected by two 8-port switches with VRRP, compared with the same nodes connected by twelve 4-port switches with ECMP.]

With a design point of any-to-any connectivity between servers, dual homing requires twice the number of chassis. Here, going from an 8-port switch to a 4-port switch adds ten more switches to the solution. For TCO reasons, it is impractical to deploy this many network elements per server cluster.

Earlier we addressed many of the issues arising from low-capacity and low-performance network elements. Building redundancy into a server-cluster network multiplies these issues dramatically. As the example in Figure 3 makes clear, it simply is not practical to build highly available, dual-interface server clusters with low-port-density switches.

Aside from "soft" high-availability features such as redundant protocols, switch/routers must also deliver "hardened" network-availability elements previously obtainable only on service-provider grade equipment.
These features include:

  Redundancy of all critical elements
  Stateful fail-over of control modules
  In-service software upgrades and maintenance
  Protected memory systems
  Hot-swap and online insertion/removal (OIR) of all components
  Clean separation of control and data planes

The E-Series maximizes network uptime by supporting extensive redundancy, availability, and serviceability features including:

  1+1 redundant Route Processor Modules (RPM)
  8:1 redundant Switch Fabric Modules (SFM)
  Redundant power and cooling
  Passive copper backplane
  Hot swap of all key components
  All memory systems ECC/parity protected
  Clean separation of control and data planes
  System-wide environmental monitoring
  Persistent configuration synchronization between RPMs
  Virtual Router Redundancy Protocol (VRRP)
  Cable management and front-side serviceability

To quickly restore forwarding stability in the event of a failure, the network must allow fast convergence. For this reason, the E-Series RPM provides innovative methods of filtering and rate limiting control traffic as well as dedicated 100 Mbps switched control links to every line card.
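Returning to the Figure 3 comparison earlier in this section, the switch counts for dual-homed clusters can be checked with a short Python sketch. It assumes, purely for illustration, that dual homing is built as two independent copies of the single-homed network and that any network too large for one switch uses a two-tier layout with edge-switch ports split evenly between servers and uplinks. The function name switches_needed and its homes parameter are hypothetical.

```python
import math


def switches_needed(servers: int, ports_per_switch: int, homes: int = 1) -> int:
    """Count switches when every server has `homes` links, one per network copy.

    Assumed model: each copy is either a single switch or a two-tier
    non-blocking network whose edge switches use half their ports for servers.
    """
    if homes < 1:
        raise ValueError("homes must be at least 1")
    if servers <= ports_per_switch:
        per_copy = 1
    else:
        server_ports = ports_per_switch // 2
        if server_ports == 0 or servers > server_ports * ports_per_switch:
            raise ValueError("cluster too large for a two-tier design")
        edge = math.ceil(servers / server_ports)
        uplinks = edge * (ports_per_switch - server_ports)
        per_copy = edge + math.ceil(uplinks / ports_per_switch)
    return homes * per_copy


big = switches_needed(8, 8, homes=2)    # two 8-port switches
small = switches_needed(8, 4, homes=2)  # twelve 4-port switches
print(big, small, small - big)          # 2 12 10
```

Under these assumptions the sketch reproduces the figures quoted above: two 8-port switches against twelve 4-port switches, a difference of ten.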
10 Gigabit Ethernet for Remote Disaster Recovery & Backup

Using the same networking equipment that provides the foundation for server consolidation to also provide high-speed transport between data centers for disaster recovery and remote backups drives total costs down even further. The E-Series chassis supports native 10 GbE services, unlike past generation products in which 10 GbE functionality exists simply as a "bolt on" technology. Force10 views 10 GbE LAN and WAN connectivity as crucial building-block elements when assembling comprehensive consolidation strategies that include contingencies for disaster recovery and remote backup functionality.

High-speed interconnect for disaster recovery and data backup operations used to be the exclusive realm of GbE (individually or grouped together using Link Aggregation) or of some form of SONET based service. Today, 10 Gigabit Ethernet allows IT managers to back up and mirror data faster while placing mirrored sites farther apart. Force10 10 GbE WAN PHY (physical layer) ports connect directly to existing SONET Add/Drop Multiplexers (ADMs) and Dense Wavelength Division Multiplexing (DWDM) devices at 1/10 the cost of traditional (and bandwidth equivalent) OC-192 Packet over SONET (POS) interfaces. In addition, many optical transport vendors now offer native 10 GbE DWDM ports.

Using the same E-Series 10 GbE ports within the LAN and throughout the Metropolitan Area Network (MAN) or WAN transport service further simplifies sparing and reduces operating expenses (OPEX). As illustrated in Figure 4, the Force10 E-Series provides high-speed transport services within and throughout the consolidated environment.
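To make the "back up and mirror data faster" claim concrete, the following Python sketch estimates how long a bulk copy takes at a given link speed. Everything in it is illustrative: the 10 TB data set is a hypothetical figure, the nominal 1 Gbps and 10 Gbps rates ignore framing and protocol overhead, and the efficiency parameter is a placeholder that a reader would set from measurements on their own links.

```python
def transfer_hours(data_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_bytes over a link of link_gbps (10^9 bits/s).

    efficiency is the assumed fraction of the nominal rate that is usable
    for payload (1.0 means the full nominal rate).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = (data_bytes * 8) / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


DATA_SET_BYTES = 10e12  # hypothetical 10 TB backup set

for name, gbps in (("GbE", 1), ("10 GbE", 10)):
    print(f"{name}: {transfer_hours(DATA_SET_BYTES, gbps):.1f} hours")
```

At nominal rates the same data set moves in one tenth of the time over 10 GbE. Achieved throughput on long-distance links also depends on latency and transport protocol behavior, which this sketch does not model.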
The Switch/Router Choice for Consolidation

Before you finalize your consolidation vendor choice, be sure to ask these critical questions:

1) What is the ultimate capacity of the switch/router? Is there enough internal capacity in the chassis and chassis slots today, or will I be forced into upgrading or replacing my existing products to accommodate my changing requirements?
2) Does the system offer line rate performance when implementing all the QoS, control, and security features, or must I implement "vendor specific" wiring configurations to overcome line card limitations? That is, can the line cards actually support all the bandwidth introduced by the ports, or am I limited to "local switching" capacities only?
3) What happens if a critical system element fails? Can it be hot swapped? Does the chassis reboot? How long does it take the system to return to full operation?
4) Does the vendor support both the LAN and WAN ports for 10 GbE in the products that I'm purchasing today? Will they operate at full line rate?

[Figure 4: 10 Gig Backbone for Metro and Disaster Recovery Sites]
Summary

Today, the Force10 Networks E-Series provides the needed capacity, functionality, and reliability for the largest and most advanced networking implementations. The E-Series:

  Is a next generation switch/router that provides high-capacity Ethernet services to substantially reduce network complexity and cost.
  Delivers feature rich, line rate performance that simplifies network design, security, and control, eliminating the need for complex traffic engineering analysis and management.
  Incorporates multiple levels of high-availability to drastically reduce the likelihood of "single-component" catastrophic failures, thereby providing better protection than previously afforded by existing equipment.
  Supports both 10 Gigabit Ethernet LAN and WAN PHYs for data center/disaster recovery/backup interconnect, enabling high-speed transport services at significant cost savings to the enterprise today.

The Force10 E-Series provides massive capacity, line rate support, and "built from the ground up" high availability.

E-Series: High-Performance Ethernet for Consolidation.

Force10 Networks, Inc.
350 Holger Way
San Jose, CA 95134 USA
www.force10networks.com
408-571-3500 PHONE
408-571-3550 FACSIMILE

© 2007 Force10 Networks, Inc. All rights reserved. Force10 and the Force10 logo are registered trademarks, and EtherScale, FTOS, SFTOS, and TeraScale are trademarks of Force10 Networks, Inc. All other brand and product names are trademarks or registered trademarks of their respective holders. Information in this document is subject to change without notice. Certain features may not yet be generally available. Force10 Networks, Inc. assumes no responsibility for any errors that may appear in this document.

AN01 607 v1.8