EMC SCALEIO NETWORKING BEST PRACTICES




ABSTRACT

This document describes the core concepts, best practices, and validation methods for architecting a ScaleIO network.

November 2015

WHITE PAPER

To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller, visit www.emc.com, or explore and compare products in the EMC Store.

Copyright 2015 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided "as is". EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number H14708

TABLE OF CONTENTS

EXECUTIVE SUMMARY
AUDIENCE
SCALEIO OVERVIEW
NETWORK INFRASTRUCTURE
    Network Topology
    Leaf-Spine Network Topology
    Flat Network Topology
    IPv4 and IPv6
NETWORK PERFORMANCE
    Network Latency
    Network Speed
    Infiniband
    NICs
    Two NICs vs. Four NICs and Other Configurations
    Dedicated Access Port
    NIC Pooling
    Jumbo Frames
    Flow Control
    Link Aggregation
    High Availability
        Active High Availability
        Passive High Availability
SOFTWARE-DEFINED SAN VS. HYPERCONVERGED
VMWARE IMPLEMENTATIONS
    SDC
    VM-Kernel Port
    Virtual Machine Port Group
    VMware Advantages over a Standard Switch
    Hyper-converged Considerations
VALIDATION METHODS
    Internal SIO Tools
    SDS Network Test
    SDS Network Latency Meter Test
    Iperf and NetPerf
    Network Monitoring
    Network Troubleshooting 101
REVISION HISTORY
REFERENCES

EXECUTIVE SUMMARY

EMC's ScaleIO software-defined storage platform presents limitless opportunities to create powerful storage systems from commodity hardware. ScaleIO's success is driven not only by the hardware it operates on, but also by properly tuned operating system platforms and networks. Further, it empowers users with the ability to size networks properly from a redundancy and performance perspective.

Please note that this guide does not cover every networking best practice for ScaleIO, but attempts to cover a minimum set of network best practices. It is very likely that a ScaleIO technical expert could recommend more comprehensive, or sometimes different, best practices than those covered in this guide.

This guide is intended to provide details on:

- Network topology choices
- Network performance
- Software-defined SAN and hyper-converged considerations
- ScaleIO implementations within a VMware environment
- Validation methods
- Monitoring recommendations

AUDIENCE

This white paper is intended as a guide to advise end users regarding best practices for ScaleIO networking, while creating a better understanding of the options available for the successful operation of ScaleIO from a networking perspective.

SCALEIO OVERVIEW

The management of large-scale, rapidly growing infrastructures is a constant challenge for many data center operations teams, and it is not surprising that data storage is at the heart of these challenges. A traditional dedicated SAN with dedicated workloads cannot always provide the scale and flexibility needed. A storage array cannot borrow capacity from another SAN if demand increases, which can lead to data bottlenecks and a single point of failure. When delivering Infrastructure-as-a-Service (IaaS) or high-performance applications, delays in response are simply not acceptable to customers or users.

EMC ScaleIO is software that creates a server-based SAN from local application server storage (local or network storage devices). ScaleIO delivers flexible, scalable performance and capacity on demand. ScaleIO integrates storage and compute resources, scaling to thousands of servers (also called nodes). As an alternative to traditional SAN infrastructures, ScaleIO combines hard disk drives (HDD), solid state disks (SSD), and Peripheral Component Interconnect Express (PCIe) flash cards to create a virtual pool of block storage with varying performance tiers. ScaleIO is hardware-agnostic, supports physical and/or virtual application servers, and has been proven to deliver significant TCO savings vs. traditional SAN.

Massive Scale - ScaleIO is designed to massively scale from three to thousands of nodes. Unlike most traditional storage systems, as the number of storage devices grows, so do throughput and IOPS. The scalability of performance is linear with regard to the growth of the deployment. Whenever the need arises, additional storage and compute resources (i.e., additional servers and/or drives) can be added modularly, so that resources can grow individually or together to maintain balance.

Extreme Performance - Every server in the ScaleIO cluster is used in the processing of I/O operations, making all I/O and throughput accessible to any application within the cluster. Such massive I/O parallelism eliminates bottlenecks. Throughput and IOPS scale in direct proportion to the number of servers and local storage devices added to the system, improving cost/performance rates with growth. Performance optimization is automatic; whenever rebuilds and rebalances are needed, they occur in the background with minimal or no impact to applications and users. The ScaleIO system autonomously manages performance hot spots and data layout.

Compelling Economics - As opposed to traditional Fibre Channel SANs, ScaleIO has no requirement for a Fibre Channel fabric between the servers and the storage, and no dedicated components like HBAs. There are no forklift upgrades for end-of-life hardware; you simply remove failed disks or outdated servers from the cluster. It creates a software-defined storage environment that allows users to exploit the unused local storage capacity in any server. Thus ScaleIO can reduce the cost and complexity of the solution, resulting in typically greater than 60 percent TCO savings vs. traditional SAN.

Unparalleled Flexibility - ScaleIO provides flexible deployment options. With ScaleIO, you are provided with two deployment options. The first option, called two-layer, is when the application and storage are installed on separate servers in the ScaleIO cluster. This provides efficient parallelism and no single points of failure. The second option, called hyper-converged, is when the application and storage are installed on the same servers in the ScaleIO cluster. This creates a single-layer architecture and provides the lowest footprint and cost profile. ScaleIO provides unmatched choice among these deployment options.

ScaleIO is infrastructure-agnostic, making it a true software-defined storage product. It can be used with mixed server brands, operating systems (physical and virtual), and storage media types (HDDs, SSDs, and PCIe flash cards). In addition, customers can also use OpenStack commodity hardware for storage and compute nodes.

Supreme Elasticity - With ScaleIO, storage and compute resources can be increased or decreased whenever the need arises. The system automatically rebalances data on the fly with no downtime. Additions and removals can be done in small or large increments. No capacity planning or complex reconfiguration due to interoperability constraints is required, which reduces complexity and cost.

Based on the requirements, ScaleIO can be used either in a two-layer architecture, also known as SAN.NEXT, or in a single-layer architecture, also known as Infrastructure.NEXT.

Two-layer or SAN.NEXT: ScaleIO allows the customer to move toward a software-defined scale-out SAN infrastructure using commodity hardware. Customers can keep running their applications the way they do today on separate servers, and transform the way they handle SAN storage.

Single-layer or Infrastructure.NEXT: ScaleIO also allows customers to run the application on the same storage server, bringing compute and storage together in a single architecture. In a nutshell, ScaleIO brings together the storage, compute, and application in a single layer, which makes the management of the infrastructure much simpler.

NETWORK INFRASTRUCTURE

Network Topology

There are two network topologies that we will be discussing for ScaleIO deployments: (1) Leaf-Spine network, and (2) Flat network. The primary considerations when determining your network topology are:

1. What is the number of ScaleIO nodes planned for your deployment?
   - High: a number of nodes that don't fit into a single switch
   - Small: a number of nodes (<10) that fit into a single switch
   - Port density/number of available ports on the switch
2. What is your deployment plan: hyper-converged or software-defined SAN?
   - If you plan on running hyper-converged, you may need additional network capacity from a bandwidth perspective to accommodate additional storage and applications
3. Are you extending an existing network to accommodate a small number of nodes?
   - Small number of nodes
   - Want to leverage existing port capacity
4. Network redundancy
   - Network redundancy enables you to have the same amount of bandwidth across the network infrastructure
   - You should have enough connections between switches to have end-to-end capacity if there is a device failure
   - No single point of failure; highly available
5. Security
   - If you are connecting to untrusted SDCs, you may want or need to separate the SDC network from the SDS network

Leaf-Spine Network Topology

A Leaf-Spine topology is a two-tier architecture, and is an alternative to the classic three-layer network design. It consists of Leaf switches and Spine switches. In this design, each Leaf switch is attached to all Spine switches, but the Leaves and Spines are not connected to each other. The Leaf switches control the flow of traffic between servers, and the Spine switches move traffic between nodes at Layer 2.

In most instances, we recommend leveraging a Leaf-Spine network topology design for a ScaleIO implementation. This is because:

- ScaleIO can scale upwards to hundreds of nodes
- A Leaf-Spine architecture facilitates scale-out deployments without having to re-architect the network (future-proofing)
- When designed correctly to allow for maximum bandwidth, a Leaf-Spine topology will be non-blocking
- All connections have equal access to bandwidth
- Predictable latency
- Highest availability and performance

Flat Network Topology

A Flat network design is less costly and easier to maintain from an administration perspective. A Flat network topology is easier to implement, and may be the preferred choice if an existing network is being extended or if the network does not scale beyond four switches. If you expand beyond four switches, you will need so many more cross-link ports that it would likely be cost-prohibitive to remain in a Flat network topology. A Flat network is the fastest, simplest way to get your ScaleIO deployment up and running.

The primary use cases for a Flat network topology are:

- Small deployment, not extending beyond four switches
- Remote Office/Back Office
- Small Business

IPv4 and IPv6

While the current version of ScaleIO (1.32.2) supports Internet Protocol version 4 (IPv4) addressing, IPv6 will be supported in an upcoming product release.

NETWORK PERFORMANCE

NOTE: It is recommended that all specialty network configurations be disabled when deploying ScaleIO. This includes Jumbo Frames, Flow Control, and Link Aggregation. Following the successful deployment of your infrastructure, you can begin to tune the environment and add these layers to achieve the best performance.

Network Latency

Network latency is important to account for when designing your network. Minimizing the amount of network latency will provide improved performance. For best performance, latency for all SDS and MDM communication should not exceed 1 ms round-trip time. This can be easily verified by pinging, and more extensively by the SDS Network Latency Meter Test. Please note that ScaleIO is not designed to extend outside the data center; leveraging wide area networks to operate ScaleIO is discouraged.

Network Speed

Network speed is a critical component when designing your ScaleIO implementation. In order to determine your network speed, the following considerations should be examined:

- Rebuild time (the amount of time it takes for a failed node to rebuild)
- Rebalance time (the amount of time it takes to redistribute data in the event of proactive device removal, or of uneven data distribution in the event of a node failure)
- Drive capability/performance (the amount of data a drive is capable of delivering from an I/O perspective)
- Performance expectations (IOPS, bandwidth, latency)

While ScaleIO can be deployed on a 1 Gbps network, storage performance will be bottlenecked by the network capacity. At a minimum, we recommend leveraging 10 Gbps network technology. Using the following formula, you can calculate the number of NICs that are suggested for optimal network throughput, based on the size and type of drives you will be using in a ScaleIO node.

Note: The following calculation can be used as a guide for optimizing throughput in order to minimize ScaleIO rebuild and rebalance times.

(Number of drives per server * average sequential drive performance in MBps) is approximately equal to ((Number of 10G NICs * 10000) / 8)

EXAMPLE 1: 10 SATA 7200 RPM drives per box at an average of 100 MByte/s sequential. Aggregate drive throughput is 10 * 100 = 1000 MB/s, while a single 10G NIC delivers roughly 10000 / 8 = 1250 MB/s, so one 10G NIC is sufficient for throughput (a second is still recommended for redundancy).

EXAMPLE 2: 6 SATA SSD drives per box at a maximum of 450 MByte/s sequential. Aggregate drive throughput is 6 * 450 = 2700 MB/s, which requires (2700 * 8) / 10000 = 2.16, or three 10G NICs rounded up.
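As a quick way to apply the formula above, the following shell sketch (our own illustration; the script name and variables are not part of ScaleIO) computes the suggested 10G NIC count for a node:

    #!/bin/sh
    # nic_count.sh - suggested 10G NIC count from the sizing formula above
    # Usage: ./nic_count.sh <drives_per_server> <avg_sequential_MBps_per_drive>
    DRIVES=$1
    MBPS=$2
    TOTAL=$((DRIVES * MBPS))                 # aggregate drive throughput in MB/s
    NICS=$(( (TOTAL * 8 + 9999) / 10000 ))   # one 10G NIC carries ~10000/8 = 1250 MB/s; round up
    echo "Aggregate drive throughput: ${TOTAL} MB/s; suggested 10G NICs: ${NICS}"

Running "./nic_count.sh 10 100" reproduces Example 1 (one NIC), and "./nic_count.sh 6 450" reproduces Example 2 (three NICs).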

Infiniband

While Infiniband is not a required technology for running ScaleIO, IP-over-Infiniband is supported for front-end and back-end ScaleIO storage platforms, since ScaleIO is a TCP/IP-based storage technology. If you are leveraging Infiniband combined with Ethernet technology, it is recommended that an MTU size of 4092 be utilized across both networks. This does present a potential disadvantage depending on your network topology.

NICs

ScaleIO supports single and multiple NIC configurations, but as a best practice we recommend having a minimum of two (2) 10 Gbps Ethernet NICs per ScaleIO node as an initial configuration. The following items should be considered when determining the NIC configuration:

Redundancy - Ensure your NIC configuration will be fault-tolerant in the event of a switch or single NIC card failure. We always recommend leveraging active/active or active/passive network configurations for the best network stability and performance.

Performance - Network speed should be considered to ensure the capability to perform storage operations consistent with the needs of the specific use case. Ensure you are using a sufficient number of NICs to meet your IOPS and bandwidth requirements.

Ease of use - We recommend leveraging active/active Ethernet networks that will consistently deliver the required amount of capacity and performance for the environment.

Capacity - Network sizing directly corresponds to the rebuild time of your disks and nodes in a software-defined SAN ScaleIO deployment. If you are leveraging a hyper-converged solution, it is also important to consider the bandwidth needs of operations taking place outside of the storage infrastructure.

Two NICs vs. Four NICs and Other Configurations

As a baseline for system design, every ScaleIO SDS node should contain a minimum of two network interfaces for redundancy. Additional network capacity may be required for a variety of reasons, as outlined above in Network Speed. ScaleIO allows for the scaling of network resources through the addition of network interfaces. Although not required, there may be situations where isolating front-end and back-end traffic for the storage network is ideal. In all cases we recommend multiple interfaces for redundancy, capacity, and speed. The primary driver to segment front-end and back-end network traffic is to guarantee the performance of storage and application-related network traffic.

Dedicated Access Port

For optimal performance, it is recommended that switch ports remain in their factory default state. It is not recommended to enable VLAN tagging for ScaleIO traffic.

NIC Pooling

ScaleIO has the ability to use multiple IPs to manage storage traffic, and these IPs can be tied to separate NICs. As network redundancy is a primary concern for most organizations, we recommend setting up the ScaleIO data network with multiple separate IPs for each ScaleIO implementation. Refer to the network section in the ScaleIO User Guide for more information.
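To illustrate the multiple-IP approach on a Linux SDS node, the following commands (interface names and addresses are hypothetical) assign a data IP to each of two separate NICs and verify them:

    # Assign one ScaleIO data IP per physical NIC
    ip addr add 192.168.10.21/24 dev eth0    # data network 1
    ip addr add 192.168.20.21/24 dev eth1    # data network 2
    # Verify the assignments
    ip addr show dev eth0
    ip addr show dev eth1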

Jumbo Frames

While ScaleIO does support Jumbo Frames, leveraging Jumbo Frames can be challenging depending on your infrastructure. There may be networking technology limitations that prevent the use of Jumbo Frames. Because of the potential constraints, we recommend leaving Jumbo Frames disabled initially, and enabling them only after you have confirmed that your infrastructure can support their use.

There are several situations where Jumbo Frames do not drive performance, and can have an adverse effect. In the following scenarios, Jumbo Frames should not be utilized:

- Jumbo Frames are not supported by the network infrastructure
- Jumbo Frames are not supported by the clients
- Most reads and/or writes are not expected to exceed 1500 bytes

However, if Jumbo Frames are supported by your network technology, there are benefits to enabling them. Enabling Jumbo Frames will allow blocks which are being written to the filesystem to be passed in a single Ethernet frame, decreasing interrupts and maximizing performance. For example, if you will be writing 4k or 8k blocks to a filesystem, the number of packets for such writes can be significantly reduced. If you determine that your environment does support Jumbo Frames, and your writes typically exceed 1500 bytes, enabling Jumbo Frames allows for improved performance.

Flow Control

Flow control is a signaling method that allows two connected network devices to notify each other that they cannot accept more packets at this time due to buffers being full; it signals the other side to briefly pause sending more packets before resuming traffic. This prevents packets from being dropped, and by doing so may prevent unwanted TCP/IP congestion control algorithms from activating.

When to use flow control:

- Global flow control (applies to all traffic on a NIC): when there is a dedicated network configuration for just ScaleIO, or when Priority Flow Control (PFC) is not supported or available on the NICs or switches
- Data Center Bridging/PFC: enable if the switches and NICs can support it. This is like flow control, but applies only to a specific type of traffic, and no packets get lost

When NOT to use flow control:

- When you can only use global flow control and you are hyper-converged (i.e., not only ScaleIO traffic is being passed along the NIC)
- With either PFC or global flow control: if you have a dedicated physical network only for storage (however, there may still be benefits)
- Any network type that guarantees 100% TCP packet delivery with no loss would not need flow control; however, this is an unlikely configuration
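If you later confirm that your infrastructure supports these features, the following Linux commands (a minimal sketch; interface names and addresses are hypothetical, and matching switch-side configuration is also required) show how to enable and verify them on a node:

    # Enable jumbo frames by raising the MTU (must match end-to-end, including switches)
    ip link set dev eth0 mtu 9000
    # Verify: 8972-byte payloads (9000 minus 28 bytes of IP/ICMP headers) must pass unfragmented
    ping -M do -s 8972 -c 3 192.168.10.22
    # Show current pause-frame (global flow control) settings, then enable them
    ethtool -a eth0
    ethtool -A eth0 rx on tx on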

Link Aggregation

In order to optimize network performance, enabling Link Aggregation is recommended if your network technology has the ability to support it. Link Aggregation increases network throughput by using all available network paths simultaneously, exceeding the performance of a single network connection. It also provides a level of redundancy, in that a single link can fail and network traffic will continue to pass over the remaining trunked connections.

High Availability

Active High Availability - Link aggregation with two or more NICs, where all NICs are active and traffic is divided across them. Use Link Aggregation if your switch supports teaming/bonding across two or more switches; configuration is needed on both sides. Be aware that the configuration must use active link detection (no static LACP configuration) and should be configured with short timers to allow fast failover to take place (see the bonding sketch at the end of this section).

Passive High Availability - Link aggregation with two or more NICs, where only a single link/NIC is ever active. Use this if your switch does not support teaming/bonding across two or more switches; the server chooses the active NIC based on the NIC interface link status.

SOFTWARE-DEFINED SAN VS. HYPERCONVERGED

ScaleIO has the ability to run hyper-converged, using the local storage of each of the ScaleIO servers and combining it into a shared storage pool. There are no significant network implications when running a hyper-converged instance of ScaleIO; however, you should consider the bandwidth utilization of ScaleIO and any implication that it may have on production applications.

VMWARE IMPLEMENTATIONS

VMware-based networking provides all the options of a physical switch, but gives more flexibility within network design. To make use of virtual networking, virtual network configurations must be consistent with the physical network devices connected to the virtual structure. Just as would be necessary without network virtualization, uplinks to physical switches must take into account redundancy and bandwidth. We recommend determining traffic patterns in advance, in order to prevent bandwidth starvation.

SDC

While the ScaleIO SDC is not integrated into the ESX kernel, there is a kernel driver for ESX that implements the ScaleIO client module.

VM-Kernel Port

A VM-kernel port is often used for vMotion, the storage network, fault tolerance, and management. We recommend giving this traffic a higher priority than virtual machine traffic, for example virtual machine port groups or user-level traffic.

Virtual Machine Port Group

Virtual machine port groups can be separate or joined. For example, you can have three virtual machine port groups on the same VLAN. They can be segregated onto separate VLANs, or, depending on the number of NICs, they can be on different networks.

VMware Advantages over a Standard Switch

NetIOC provides the ability to prioritize certain types of traffic; for example, storage traffic can be given a higher priority than other types of traffic. This only works with VMware distributed switches, not standard switches. This can also be done with Data Center Bridging (also known as Priority Flow Control), and could be configured with standard QoS; however, not all switches support these features.

Hyper-converged Considerations

With a VMware-based ScaleIO implementation running in a hyper-converged environment, you should have a storage management network and two (2) separate data networks, with at least three VM port groups, defined in advance of installing the environment.
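For the Active High Availability option described above, a Linux LACP bond with short timers might look like the following (a minimal sketch using the kernel bonding driver; interface and bond names are hypothetical, and the switch ports must be configured for dynamic LACP as well):

    # Create an 802.3ad (LACP) bond with fast LACPDU timers for quick failover
    modprobe bonding
    ip link add bond0 type bond mode 802.3ad lacp_rate fast miimon 100
    ip link set dev eth0 down && ip link set dev eth0 master bond0
    ip link set dev eth1 down && ip link set dev eth1 master bond0
    ip link set dev bond0 up
    # Verify LACP negotiation and per-slave link state
    cat /proc/net/bonding/bond0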

VALIDATION METHODS

Internal SIO Tools

There are two main built-in tools that monitor network performance:

- SDS Network Test
- SDS Network Latency Meter Test

SDS Network Test

The first test is the SDS network test; please refer to start_sds_network_test in the ScaleIO User Manual. Once this test has completed, you can fetch the results with query_sds_network_test_results.

The parallel_messages and network_test_size_gb options should be set so that the test size is at least 2x larger than the maximum network bandwidth. This is to ensure that you will saturate the maximum bandwidth available in your system. For example: 1x 10Gb NIC = 1250 MB * 2 = 2500 MB, or 3 GB rounded up. In this case you should run the command with --network_test_size_gb 3. This will ensure that you are sending enough bandwidth out on the network to have a consistent test result, accounting for variability in the system as a whole. The parallel message count should be equal to the total number of cores in your system, with a maximum of 16.

Example output:

scli --start_sds_network_test --sds_ip 10.248.0.23 --network_test_size_gb 8 --parallel_messages 8
Network testing successfully started.

scli --query_sds_network_test_results --sds_ip 10.248.0.23
SDS with IP 10.248.0.23 returned information on 7 SDSs
SDS 6bfc235100000000 10.248.0.24 bandwidth 2.4 GB (2474 MB) per-second
SDS 6bfc235200000001 10.248.0.25 bandwidth 3.5 GB (3592 MB) per-second
SDS 6bfc235400000003 10.248.0.26 bandwidth 2.5 GB (2592 MB) per-second
SDS 6bfc235500000004 10.248.0.28 bandwidth 3.0 GB (3045 MB) per-second
SDS 6bfc235600000005 10.248.0.30 bandwidth 3.2 GB (3316 MB) per-second
SDS 6bfc235700000006 10.248.0.27 bandwidth 3.0 GB (3056 MB) per-second
SDS 6bfc235800000007 10.248.0.29 bandwidth 2.6 GB (2617 MB) per-second

In the example above, you can see the network performance from the SDS you are testing from to every other SDS in the network. Ensure that the speed per second is close to the expected performance of your network configuration.
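The 2x rule above can also be scripted so the test size scales with your NIC configuration (a sketch with our own variable names, not a ScaleIO tool):

    # Derive --network_test_size_gb from the number of 10Gb NICs per node
    NICS=1                                       # 10Gb NICs per node
    MAX_MBPS=$((NICS * 10000 / 8))               # maximum bandwidth in MB/s (1250 per NIC)
    TEST_GB=$(( (MAX_MBPS * 2 + 999) / 1000 ))   # double it, convert to GB, round up
    echo "--network_test_size_gb ${TEST_GB}"     # prints 3 for a single NIC, as in the example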

SDS Network Latency Meter Test

There is also query_network_latency_meters (for writes only), which can be run at any time and will show the average network latency for each SDS when it communicates with other SDSs. Here, we are just making sure that there are no outliers on the latency side and that latency stays low. Note that this can and should be run from each SDS to the other SDSs.

Example output:

scli --query_network_latency_meters --sds_ip 10.248.0.23
SDS with IP 10.248.0.23 returned information on 7 SDSs
SDS 10.248.0.24 Average IO size: 8.0 KB (8192 Bytes) Average latency (microseconds): 231
SDS 10.248.0.25 Average IO size: 40.0 KB (40960 Bytes) Average latency (microseconds): 368
SDS 10.248.0.26 Average IO size: 38.0 KB (38912 Bytes) Average latency (microseconds): 315
SDS 10.248.0.28 Average IO size: 5.0 KB (5120 Bytes) Average latency (microseconds): 250
SDS 10.248.0.30 Average IO size: 1.0 KB (1024 Bytes) Average latency (microseconds): 211
SDS 10.248.0.27 Average IO size: 9.0 KB (9216 Bytes) Average latency (microseconds): 252
SDS 10.248.0.29 Average IO size: 66.0 KB (67584 Bytes) Average latency (microseconds): 418

Iperf and NetPerf

NOTE: Iperf and NetPerf should be used to validate your network before configuring ScaleIO. If you identify issues with Iperf or NetPerf, there may be network issues that need to be investigated. If you do not see issues with Iperf/NetPerf, use the ScaleIO internal validation tools for additional and more accurate validation.

Iperf

Iperf is a traffic generation tool, which can be used to measure the maximum possible bandwidth on IP networks. The Iperf feature set allows for tuning of various parameters, and reports on bandwidth, loss, and other measurements.

NetPerf

Netperf is a benchmark that can be used to measure the performance of many different types of networking. It provides tests for both unidirectional throughput and end-to-end latency.

Network Monitoring

It is important to monitor the health of your network to identify any issues that are preventing your network from operating at optimal capacity, and to safeguard against network performance degradation. There are a number of network monitoring tools available on the market, which offer many different feature sets. We recommend monitoring the following areas:

- Input and output traffic
- Errors, discards, overruns
- Port status

Network Troubleshooting 101

- Ping connectivity end-to-end between SDSs and SDCs
- Test traffic between devices in both directions
- Check for round-trip latency between devices
- Check for port errors/discards/overruns on the host and switch side
- Check MTU across all switches and servers
- Check to make sure LACP/MLAG is disabled
- Check SIO test output
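As a concrete starting point for the validation and troubleshooting steps above, the following commands sketch a basic point-to-point check between two nodes (the addresses are hypothetical, and Iperf must be installed on both nodes):

    # On the receiving node: start an Iperf server
    iperf -s
    # On the sending node: run 4 parallel streams for 30 seconds toward the receiver
    iperf -c 10.248.0.24 -P 4 -t 30
    # Check round-trip latency between SDS nodes (should stay below 1 ms)
    ping -c 100 10.248.0.24 | tail -2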

REVISION HISTORY

Date       Version   Author   Change Summary
Nov 2015   1.0       EMC      Initial Document

REFERENCES

- ScaleIO User Guide
- ScaleIO Installation Guide
- ScaleIO ECN community
- VMware vSphere 5.5 Documentation Center
- EMC ScaleIO for VMware Environment