Extreme Networks: Big Data A SOLUTIONS GUIDE




Copyright 2014 Extreme Networks, Inc. All Rights Reserved. AccessAdapt, Alpine, Altitude, BlackDiamond, Direct Attach, EPICenter, ExtremeWorks Essentials, Ethernet Everywhere, Extreme Enabled, Extreme Ethernet Everywhere, Extreme Networks, Extreme Standby Router Protocol, Extreme Turbodrive, Extreme Velocity, ExtremeWare, ExtremeWorks, ExtremeXOS, Go Purple Extreme Solution, ExtremeXOS ScreenPlay, ReachNXT, Ridgeline, Sentriant, ServiceWatch, Summit, SummitStack, Triumph, Unified Access Architecture, Unified Access RF Manager, UniStack, XNV, the Extreme Networks logo, the Alpine logo, the BlackDiamond logo, the Extreme Turbodrive logo, the Summit logos, and the Powered by ExtremeXOS logo are trademarks or registered trademarks of Extreme Networks, Inc. or its subsidiaries in the United States and/or other countries. sFlow is the property of InMon Corporation. Specifications are subject to change without notice. All other registered trademarks, trademarks, and service marks are property of their respective owners. For additional information on Extreme Networks trademarks, see www.extremenetworks.com/company/legal/trademarks.

Table of Contents

Big Data Requirements
Architecture
    LAYER 3 AND ECMP
    LAYER 2 ALTERNATIVES
Scalability
Performance
    THROUGHPUT
    CONGESTION MANAGEMENT
    JUMBO FRAMES
    LATENCY
High Availability
    LOAD-SHARING
    REDUNDANCY
    RECOVERY
    BFD
Manageability
    MANAGING THE CLUSTER WITH NETSIGHT
    RACK AWARENESS
    AUTO-PROVISIONING
    FLOW ANALYSIS
    PROACTIVE TECH SUPPORT
    CLI SCRIPTING

Security
    LIMITING UNAUTHORIZED HOSTS
    DENIAL OF SERVICE (DOS) PROTECTION
Appendix
    LEAF SWITCH SAMPLE CONFIGURATION
    SPINE SWITCH SAMPLE CONFIGURATION
    RACK AWARENESS SAMPLE SCRIPT

BIG DATA SOLUTIONS GUIDE

Abstract: Big Data analytics are fueling the multi-billion dollar investment to integrate Big Data applications into business operations, and they need a solid infrastructure to run on. Extreme Networks delivers a Big Data solution with a high-performing Ethernet-based architecture. Best-in-class switches and operating system offer a simple and affordable solution that is highly resilient. This paper discusses how Extreme Networks' Big Data solution meets the five pillars of Big Data: Scalability, Performance, High Availability, Manageability, and Security.

Big Data Requirements

Big Data analytics are fueling the multi-billion dollar investment to integrate Big Data applications into business operations, and they need a solid infrastructure to run on. That's why Extreme Networks delivers a Big Data solution with a high-performance Ethernet-based architecture. Extreme Networks best-in-class switches and operating system offer a simple and affordable solution that is also highly resilient.

With the release of Hadoop 2.0 by the Apache open source community, Hadoop YARN provides the resource management and pluggable architecture for enabling a wide variety of data access methods. Along with the technologies provided by Cloudera, Hortonworks, Pivotal, DataStax, MongoDB and other NoSQL applications, YARN also supports emerging use cases for search and streaming with Apache Solr, Spark and Storm. Big Data applications are evolving rapidly, and enterprise architects can confidently deploy Extreme Networks best-in-class solution knowing that it will handle current and future diverse workloads. Extreme Networks hardware and software are designed to support various data processing requirements:

1. Online NoSQL: HBase and Cassandra provide databases with support for massive tables (billions of rows, millions of columns) with fast random access.
2. Batch SQL: Technologies such as Hive are designed for batch queries on Hadoop and are used primarily for queries on very large data sets and large ETL jobs.
3. Analytic SQL: Impala, HAWQ, Drill and Stinger provide interactive query capabilities to enable traditional business intelligence and analytics on Hadoop-scale databases.
4. In-Memory Machine Learning and Stream Processing: Shark is a newer project that brings in-memory computing to Hadoop while retaining full Hive compatibility, providing up to 100x faster queries than Hive.

The growing Big Data application ecosystem is accelerating Big Data deployments; they are no longer isolated to just the research community or top enterprises. Big Data is becoming just data that enterprises are integrating into mission-critical operations in order to manage all the enterprise data for new insights and to gain a competitive advantage.

The network architecture has to meet the same requirements as the whole solution. Next-generation clusters demand high performance from a wire-speed architecture that provides high throughput and low latency. Big Data processes massive amounts of data (both structured and unstructured) in distributed systems, and because these distributed data systems generate east-west traffic flows between nodes, the network fabric that interconnects them is part of the critical path for these clusters. This paper discusses how Extreme Networks best-in-class Big Data solution meets the five pillars of Big Data:

1. Scalability
2. Performance
3. High Availability
4. Manageability
5. Security

Architecture

Big Data requires a scalable way to store and process an ever-growing massive amount of data, which inspired the open-source development of Apache Hadoop. The Hadoop Distributed File System (HDFS) is designed to span a large cluster of servers that pool data in any format into a single scalable and reliable data storage file system. Hadoop's MapReduce is a distributed parallel processing framework that is used to retrieve data from the Hadoop cluster. HDFS and MapReduce work in tandem to process data with locality in consideration, executing the compute processing at the data location. By distributing the storage and processing across many servers, the cluster can stretch and scale with the demand.

With data distributed across multiple servers, the same data is often replicated across multiple nodes and intentionally across multiple racks for redundancy. During normal MapReduce operations, a high volume of data is shuffled between datanodes and then reported back to the user. In addition, HDFS was designed with the assumption that hardware failures are the norm rather than the exception, which is more probable in environments where there may be hundreds or thousands of datanodes. A datanode failure can result in Hadoop redistributing the task workload across the still-functioning datanodes, and can also result in spikes of network bandwidth usage to replicate the data blocks that had been stored on the failed datanode. Applications must detect and address failures quickly, and the infrastructure should do the same.

The high bandwidth and high density requirements of Big Data can be achieved with a Layer 3 Ethernet-based topology that is optimized for the east-west traffic patterns of the typical workloads between the datanodes within the cluster. The Layer 3 Equal Cost Multi Path (ECMP) fabric interconnects the Layer 3 enabled leaf switches to a Layer 3 spine, as shown below. The topology is a flat two-stage Clos switch fabric comprising:

1. First stage: top-of-rack leaf switches that connect to compute resources
2. Second stage: core spine switches that interconnect the leaf switches

The leaf switches attached to any given rack are not directly connected to leaf switches belonging to other racks, because this would require full mesh connectivity between the leafs. Instead, the datapath between the leafs is via uplinks to the spines, which requires each leaf to be connected to every spine switch instead of every leaf. This modern architecture has low oversubscription, which is ideal for Big Data workloads.

The Extreme Networks Big Data architecture easily scales with the size of the Big Data cluster. Smaller clusters can start with a lower number of leaf switches and add on as the cluster grows. The larger clusters can easily scale out the leaf switches and accommodate the ever-growing densities within each rack: dozens of datanodes, hundreds of cores, and exabytes of internal storage. In turn, this puts higher and higher demands on the network, since larger clusters will drive port densities and network bandwidth utilization.

Extreme Networks Big Data solution is also based on the most popular transport technology, Ethernet, which makes it easy to manage and integrate into existing Ethernet networks. Extreme Networks already delivers 100Gbps, 40Gbps, 10Gbps and 1Gbps solutions, and Ethernet remains ubiquitous and competitive in its flexibility to provide performant solutions, for example with Data Center Bridging (DCB) and future industry advancements like 400Gbps, RoCE (RDMA over Converged Ethernet) and iWARP (Internet Wide Area RDMA Protocol), which help reduce the Ethernet processing overhead on datanodes.

LAYER 3 AND ECMP

For IP reachability between the datanodes in the cluster, the network fabric is simple in that it uses Equal Cost Multi Path (ECMP) and Open Shortest Path First (OSPF) for routing. Any rack is equidistant to any other rack with a maximum traversal of three switches (leaf - spine - leaf), achieving low and predictable latency, and there are multiple paths to load-share across, improving link utilization and providing redundancy. Leaf switches and spine switches participate in OSPF and perform Layer 3 forwarding. Leaf switches make forwarding decisions based on Layer 3 lookups of the IP address of the destination datanode, so they do not need Layer 2 adjacencies with one another, just IP reachability. The routing tables on the Extreme Networks switches can easily scale to support the number of routes in the network plus the rack subnets. Thus the network topology with OSPF as a routing protocol scales to support small to large clusters while maintaining performance, and allows users to easily add/remove compute resources.

LAYER 2 ALTERNATIVES

Extreme Networks also offers a pure Layer 2 solution where the network may be configured completely without Layer 3 by leveraging Multi-Switch Link Aggregation (MLAG) in the network fabric between leaf switches and spine switches. MLAG provides device and network level resiliency, but then the whole network is in a single broadcast domain, so it is a suitable configuration only for small deployments.

Scalability

Big Data HDFS-based applications leverage a distributed environment to process data on multiple servers in parallel and can run multiple jobs at the same time. Generally speaking, job completion times can be decreased as the datanodes scale up, so as new racks are added to the cluster the Extreme Networks architecture maintains optimized performance and bandwidth utilization between the racks. Customers can start with a small number of servers and then expand the infrastructure as the need grows. Extreme Networks best-in-class ExtremeXOS software is a single operating system that runs across the switch portfolio and provides flexibility to deploy various Extreme Networks switches depending on the requirements. This paper focuses on the Summit X670 Series switches as leaf switches and Summit X770 Series switches as spine switches, where each leaf switch has 10GbE links down to each server and 40GbE links up to each spine. The Summit X770 provides 32 40GbE ports and, for the above architecture, 4 ports connect to each pair of Summit X670 leaf switches, supporting 8 pairs of X670 switches (i.e., 8 compute racks). Each pair of X670 switches can connect to 40 servers (assuming 8x10Gbps ports are for the ISC and 4x40Gbps ports are uplinks to the spine switches), so this architecture will scale out to 40x8 = 320 servers. Higher-density Big Data deployments can also leverage the Summit X770 Series switches as leaf switches and the BlackDiamond X8 at the spine. Lower-density deployments can leverage any of the other Extreme Networks Summit switches, such as the Summit X460. Extreme Networks offers the flexibility for those lower-density deployments to leverage either the Layer 3 design which is the focus of this paper, or the Layer 2 MLAG-based design mentioned earlier; a brief sketch of that Layer 2 fabric follows.
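As an illustration of the Layer 2 alternative, below is a minimal sketch of the MLAG peering on one spine of a spine pair. The port numbers, VLAN tag and IP addresses are hypothetical, the command forms follow the leaf sample configuration in the Appendix, and a complete deployment would repeat the mirror-image configuration on the second spine plus a LAG on each leaf.

# On Spine 1: ISC LAG and VLAN toward Spine 2, then declare Spine 2 as the MLAG peer
enable sharing 31 grouping 31-32 algorithm address-based L3_L4 lacp
create vlan isc
configure vlan isc tag 2
configure vlan isc add ports 31 tagged
configure vlan isc ipaddress 1.1.2.1 255.255.255.0
create mlag peer spine_2
configure mlag peer spine_2 ipaddress 1.1.2.2 vr VR-Default
# Each leaf's uplink LAG lands on both spines and is presented to the leaf as one MLAG port
enable mlag port 1 peer spine_2 id 101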

Performance

THROUGHPUT

HDFS brings data to the compute nodes and moves a massive amount of data around the network, and server or rack failures will cause a lot of data to be replicated elsewhere on other servers. The time it takes to process tasks depends on how fast the network can move that data. Extreme Networks provides higher speeds and feeds than anyone else in the industry with 10Gbps/40Gbps/100Gbps Ethernet at higher density per slot. The Extreme Networks portfolio includes 10Gbps down to datanodes to support emerging 10Gbps-enabled servers, especially with LAN-on-Motherboard (LOM) capabilities that are driving down the cost of 10Gbps networking, and helps optimize new server deployments with higher bandwidth uplinks. Extreme Networks also offers 1Gbps down to datanodes. Big Data clusters need to move data as quickly as possible between datanodes, and the Extreme Big Data solution supports line rate forwarding for Layer 2 and Layer 3 traffic. See the independent Lippis report shown below:

Open Industry Network Performance & Power Test for Private/Public Data Center Cloud Computing Ethernet Fabrics Report: Evaluating 10 GbE Switches - Extreme Networks Summit X670V Top-of-Rack Switch

The Extreme Networks X670V demonstrated 100% throughput as a percentage of line rate across all 48 10GbE and 4 40GbE ports. In other words, not a single packet was dropped while the Extreme Networks X670V was presented with enough traffic to populate its 48 10GbE and 4 40GbE ports at line rate simultaneously for both L2 and L3 traffic flows.

[Chart: Extreme Networks Summit X670V RFC 2544 L2 & L3 Throughput Test - throughput as a percentage of line rate for Layer 2 and Layer 3 flows across frame sizes from 64 to 9216 bytes.]

Two congestion tests were conducted using the same methodology. A 10GbE and 40GbE congestion test stressed the X670V ToR switch's congestion management attributes for both 10GbE and 40GbE. The Extreme Networks X670V demonstrated 100% of aggregated forwarding rate as a percentage of line rate during congestion conditions for both the 10GbE and 40GbE tests. A single 10GbE port was flooded at 150% of line rate. In addition, a single 40GbE port was flooded at 150% of line rate. The Extreme Networks X670V did not use HOL blocking, which means that as the 10GbE and 40GbE ports on the X670V became congested, it did not impact the performance of other ports. Back pressure was detected. The X670V did send flow control frames to the Ixia test gear signaling it to slow down the rate of incoming traffic flow, which is common in ToR switches.

[Table: Extreme Networks Summit X670V Congestion Test - 150% of line rate into a single 10GbE port and into a single 40GbE port; aggregated forwarding rate (% of line rate), head-of-line blocking, back pressure and aggregate flow control frames per frame size for Layer 2 and Layer 3.]

Lippis Enterprises, Inc. 2011. Evaluation conducted at Ixia's iSimCity Santa Clara Lab on Ixia test equipment. www.lippisreport.com

CONGESTION MANAGEMENT

Bursty traffic patterns can happen in Big Data clusters, such as during the shuffle phase of a MapReduce job when multiple mappers terminate and transmit data to reducers. Other periods of transient congestion can occur when multiple traffic flows are trying to go to the same output port on a switch. To manage these situations, Extreme Networks switches leverage a Smart Buffer technology that transparently self-tunes and dynamically allows buffers to be utilized as needed during periods of congestion. Extreme Networks Smart Buffer technology provides a dynamic and adaptive on-chip buffer allocation scheme that is superior to static per-port allocation schemes and avoids latency incurred by off-chip buffers. Ports have dedicated buffers and in addition can get extra buffer allocation from a shared pool as needed, thereby demonstrating an effective management of and tolerance for microbursts. In contrast, arbitrarily large off-chip buffers can exacerbate congestion or can increase latency and jitter, which leads to less deterministic Big Data job performance, especially when chaining jobs.

While the Extreme Networks hardware maximizes burst absorption capability and addresses temporary congestion, it also maintains fairness. Since Extreme Networks Smart Buffer technology is adaptive in the shared buffering allocations, uncongested ports do not get starved of access to the shared buffer pool and they are not throttled by congestion on other ports, while still allowing congested ports to get more of the buffers to address the traffic burst.

Extreme Networks Smart Buffer technology is dynamic and adaptive in its default configuration; Extreme Networks also supports fine-tuning the Smart Buffer allocations from the dedicated and shared pools. Users can configure the percentage of the shared buffers that each port can consume.

configure ports <port> shared-packet-buffer <percent>

In addition, knowing specifically which Big Data applications and traffic flows may be used for high bandwidth data transfers or are congestion risks, users can configure an access-list to match on specific criteria in traffic flows (e.g. match on a TCP port), assign them to a QoS Profile (QP), and then set a percentage of the dedicated buffers to that QP (an illustrative policy is sketched at the end of this section).

configure qosprofile <QP> maxbuffer <percentage> port <port>

Users should understand the consequences of modifying the default behavior. For more details on Extreme Networks' Smart Buffer technologies, please refer to the Congestion Management and Buffering in Data Center Networks white paper: http://learn.extremenetworks.com/rs/extreme/images/congestion-managementand-buffering-wp.pdf
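As an illustration of steering a known high-bandwidth flow into a QoS profile, below is a minimal, hypothetical policy that matches HDFS DataNode block transfer traffic (TCP port 50010 by default in Hadoop) and assigns it to QoS profile QP3, followed by the commands to apply it. The policy name, port list and buffer percentage are examples only.

# Contents of a policy file such as hdfs_qos.pol (hypothetical)
entry hdfs_block_transfer {
    if match all {
        protocol tcp ;
        destination-port 50010 ;
    } then {
        qosprofile qp3 ;
    }
}

# Create the QoS profile, apply the policy on the server-facing ports,
# and give QP3 a larger share of the dedicated buffers on those ports
create qosprofile qp3
configure access-list hdfs_qos ports 1-5 ingress
configure qosprofile qp3 maxbuffer 90 port 1-5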

JUMBO FRAMES

Distributed filesystems like HDFS have high network I/O requirements because they move massive amounts of data around the cluster in the form of blocks. One-to-many writes for block replication may occur at the same time as many-to-one reads, which results in periods of high network I/O demanding high throughput. These blocks can account for a majority of the bandwidth utilization in the network fabric. The default block size is 64 MB but it can be configured to be several times larger than that. To avoid the overhead of small packet sizes and to further improve the performance of the network for Big Data applications, jumbo frame support and an increased IP MTU can be advantageous, and the Extreme Networks X670 and X770 switches can perform at line rate with jumbo frames enabled.

enable jumbo-frame ports all
configure ip-mtu 9194 vlan <vlan name>

LATENCY

Extreme Networks Big Data solution provides the lowest port-to-port latency in the I/O fabric, jumbo-frame support, and short-reach optics and cables. The X770 is the first to support 104 wire-speed 10GbE ports, or 32x40GbE ports, or a mix of 10GbE and 40GbE in 1RU for switch interconnections, and is capable of forwarding up to 1904 million packets per second. It has less than 600 nanoseconds latency with 64-byte packets using cut-through forwarding. 10G switches down to the servers provide lower latency than 1G switches and are recommended for optimal performance. Cut-through forwarding commands are as follows:

configure forwarding switching-mode cut-through
show forwarding configuration

High Availability

LOAD-SHARING

Parallel Layer 3 links between leaf switches and spine switches provide multiple equal cost paths, enhancing load sharing and providing a backup in case of link failure. Which link a traffic flow will use is determined by a hashing algorithm that takes into account the source and destination IP addresses of the servers. In the Big Data environment, the administrator assigns subnets to each rack of datanodes, so the routing table will have a non-diverse set of IP address subnets representing each rack. If the Layer 3 source and destination IP addresses and the Layer 4 source and destination ports do not vary much across traffic flows, then a sub-optimal hashing algorithm (especially one that assumes diverse Internet addresses) may unevenly distribute the traffic flows across the available equal cost paths. This may cause some links to be underutilized or perhaps not used at all, and other links to be overutilized and exceed capacity. ECMP uses hash algorithms based on source and destination parameters of the traffic (L3, or L3 and L4), regardless of current utilization, so suboptimal algorithms may also result in congestion. Figure 3 compares two load sharing configurations and the variance in the link utilization. In both cases the same amount of traffic was delivered, but the variance in the link utilization was much smaller with the better load-sharing algorithm and much larger with the suboptimal load-sharing algorithm.

In order to achieve optimal load sharing of multiple traffic flows across the available paths in Big Data's non-diverse routing table, administrators can increase the number of ECMP gateways from its default of four and configure the optimal hashing algorithm. ExtremeXOS provides the cluster administrator different options for fine-tuning the load sharing algorithms. Here is an example of what could be configured in this environment, noting that it can be adapted to the number of paths and IP address variance in the network:

configure forwarding sharing L3_L4
configure iproute sharing hash-algorithm crc upper
configure iproute sharing max-gateways 8

REDUNDANCY

For the topology presented here, each server has two 10GbE NICs (which can be negotiated down if needed) connected to a pair of leaf switches. The server NICs are bonded together to form a virtual 20GbE pipe providing increased bandwidth and redundancy to ensure high availability. Each leaf switch pair then creates the perception of a common link aggregated group so that the server doesn't see its NICs connected to two different leaf switches. The server thinks it has two NICs connected to the same leaf switch and it doesn't see anything different from a link aggregation perspective, even though the link aggregated ports are now distributed across Leaf 1 and Leaf 2, thereby leading to the term Multi-Switch Link Aggregation (MLAG). In this Big Data design, each pair of leaf switches is configured as MLAG peers down to the servers. A Layer 3 VLAN exists between the MLAG peers for control communication, called the ISC, and multiple physical links should be used and configured as a single logical LAG link to protect the MLAG ISC from loss of a single link. The ports down to the servers can be configured for Link Aggregation Control Protocol (LACP), which dynamically maintains LAG state information. Please see the Appendix for a sample configuration.

In this design VRRP (Virtual Router Redundancy Protocol) allows pairs of leaf switches to provide redundant routing services to the racks of servers. VRRP must be enabled on the server VLANs, allowing multiple switches to provide redundant

routing services to the servers. This ensures high availability by eliminating the single point of failure associated with a default gateway failure. VRRP in Active/Active mode allows both leaf switches to simultaneously act as the default gateway for the subnet and can be configured with a policy that blocks the VRRP multicast address on the input (i.e. ingress) of both sides of the MLAG ISC.

entry v4active {
    if match all {
        destination-address 224.0.0.18/32 ;
    } then {
        deny ;
    }
}

RECOVERY

The Extreme Networks Big Data architecture with ECMP is resilient: it can dynamically react to failures, reroute traffic intelligently, and recover quickly. Consider the following topology where bi-directional traffic between the racks is load-shared across the network fabric.

If a power hit results in Leaf 1 rebooting, then because each leaf node is part of an MLAG/VRRP pair (Leaf 1 and Leaf 2; Leaf 3 and Leaf 4) the network stability is maintained. On the south side, the rack is still connected to the other leaf switch of the pair, Leaf 2, and on the north side Layer 3 reachability is still maintained through OSPF. The spine switch will send traffic through Leaf 2 until the original Leaf 1 boots and OSPF converges. See the figure below, taken from the reporting mechanisms available in Extreme Networks OneView [1]. The graphs below show links to leaf pair Leaf 1 and Leaf 2, and leaf pair Leaf 3 and Leaf 4, and show three incidents where Leaf 1 is rebooted. When it is rebooted, the traffic traverses Leaf 2 until Leaf 1 recovers, at which point the traffic resumes load sharing between the two. Because of the network's resiliency, the applications and the MapReduce jobs are unaware of the network failure.

[1] OneView data is sourced from SNMP polling of the switch interfaces every minute and can provide bandwidth utilization of interfaces, among other things. However, because of the polling interval there is some variance in the data and it will not necessarily be precise.

BFD

OSPF ECMP enables rapid recovery from outages, and convergence times can be improved using Bidirectional Forwarding Detection (BFD), which more quickly detects failures and allows OSPF to react faster. BFD is a hello protocol that provides rapid detection of failures in the path and informs the clients (routing protocols) to initiate route convergence. It is independent of media, routing protocols, and data protocols. BFD separates the detection of forwarding plane connectivity from control plane connectivity. Different routing protocol hello mechanisms operate at variable rates of detection, but BFD detects forwarding path failures at a uniform rate, thus allowing for easier network profiling and planning, and consistent and predictable re-convergence time.
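After a failover event such as the one above, a few show commands can confirm that the fabric has re-converged. This is an illustrative sketch; the exact command options and output depend on the ExtremeXOS version in use.

# Confirm OSPF adjacencies, ECMP routes, BFD sessions and MLAG/VRRP state (illustrative)
show ospf neighbor
show iproute
show bfd session
show mlag peer
show vrrp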

Manageability

MANAGING THE CLUSTER WITH NETSIGHT

Extreme Networks NetSight provides a rich set of integrated management capabilities for centralized visibility and highly efficient anytime, anywhere control of enterprise wired and wireless network resources. NetSight's granularity reaches beyond ports, VLANs, and SSIDs down to individual users, applications, and protocols. NetSight increases efficiency, enabling IT staff to avoid time-consuming manual device-by-device configuration tasks. NetSight fills the functionality gap between traditional element managers that offer limited vendor-specific device control, and expensive, complex enterprise management applications. The NetSight management application provides centralized visibility and granular management of the cluster, including:

Managing inventory
Monitoring performance
Use of syslog
Configuring and upgrading devices

Instead of managing the cluster through separate management applications, NetSight OneView provides a single management view of the entire cluster, in this case including the leaf switches and spine switches plus the datanodes.

Interface status is also reported through the same view, as shown below. Furthermore, NetSight's reporting and visualizations can graphically show bandwidth utilization on different interfaces. NetSight OneFabric Control Center also provides comprehensive network inventory and change management information, tracks and reports changes to network configuration, and simplifies configuration backups, firmware upgrades and capacity planning. For more information on NetSight, please visit: http://www.extremenetworks.com/product/netsight/

RACK AWARENESS

Hadoop's rack awareness capability enables intelligent decisions about which datanodes blocks of data should be replicated to, which helps with replication of blocks to different racks for improved fault tolerance. The rack awareness is achieved through topology information that maps datanode IP addresses to rack names. Normally this is a manual and time-consuming administrative task that needs to be done repeatedly as datanodes are added or removed from the cluster, and any user errors would result in less than optimal block replication locations. The Extreme Networks Big Data solution provides a seamless method to dynamically generate the topology data. The user can configure all the leaf switches for authentication on the ports connecting to the datanodes as shown in the CLI below. Note that the CLI uses the keyword vm-tracking but it applies to physical servers as well.

configure vm-tracking nms primary server <NAC IP address> 1812 client-ip <switch management IP address> shared-secret encrypted Gt}xolg5 vr VR-Mgmt
enable vm-tracking
configure vm-tracking authentication database-order nms
enable vm-tracking ports <port list>

When the switch detects traffic from a new datanode or end system, the switch authenticates the datanode with the Network Management System (NMS). The NMS can then respond to the notification by calling a mapping program that maps the datanode IP address to its given rack. The figure below shows the NMS configuration for the Hadoop Rack Awareness notification, which calls the mapping program with arguments of the datanode IP address and the switch IP address.

A sample mapping program is provided in the Appendix. The output of the mapping program as shown below becomes the input to the Hadoop application, specifically the topology file, which is the enabler for the rack awareness.

192.168.10.101 /rack1
192.168.10.102 /rack1
192.168.10.103 /rack1
192.168.10.104 /rack1
192.168.11.105 /rack2
192.168.11.106 /rack2
192.168.11.107 /rack2
192.168.11.108 /rack2

Users can validate the rack information in Cloudera or manually through HDFS on the cluster:

$ hdfs dfsadmin -printTopology
Rack: /default/rack1
   192.168.10.101:50010 (dn1-10g)
   192.168.10.102:50010 (dn2-10g)
   192.168.10.103:50010 (dn3-10g)
   192.168.10.104:50010 (dn4-10g)
Rack: /default/rack2
   192.168.11.105:50010 (dn5-10g)
   192.168.11.106:50010 (dn6-10g)
   192.168.11.107:50010 (dn7-10g)
   192.168.11.108:50010 (dn8-10g)

AUTO-PROVISIONING

Big Data clusters need to be simple to manage and to deploy new hardware as the cluster scales out. To reduce the time needed to configure the switches that connect server racks into the Big Data cluster, ExtremeXOS has an auto-provisioning feature. This feature automates the (usually) manual configuration operations and allows the configuration to be more modularly defined. The auto-provisioning feature enables a port on the switch to be connected to a domain with a DHCP server, receive its IP address, and then the switch can receive a configuration file to be loaded and optionally send an SNMP trap to inform the user of the successful auto-provision. The auto-provisioning feature can be enabled with the command:

enable auto-provision

FLOW ANALYSIS

Varied traffic patterns from different applications may be traversing the Big Data cluster, and ExtremeXOS supports an industry-standard technology for monitoring that traffic through statistical sampling of packets. This technology, sFlow, enables clients to analyze traffic patterns to better understand cluster utilization and to be used for capacity planning (a sample sFlow configuration is sketched at the end of this section).

PROACTIVE TECH SUPPORT

Extreme Networks offers an automated support mechanism for customers who want to easily engage the Extreme Networks Technical Assistance Center (TAC) team. There is a loadable application called Proactive Tech Support that can be upgraded independently at any time without restarting the switch, and it enables system information to be automatically pushed into a cloud-hosted collector where the Extreme Networks TAC team can analyze the data and identify problems to provide solutions.
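Referring back to the flow analysis discussion above, below is a minimal sketch of enabling sFlow sampling toward an external collector. The agent address, collector address, sample rate and port list are hypothetical, and exact command options may vary by ExtremeXOS version.

# Illustrative sFlow configuration: sample the server-facing ports toward a collector
configure sflow agent ipaddress 10.6.117.7
configure sflow collector ipaddress 10.65.0.50 port 6343
configure sflow sample-rate 4096
enable sflow ports 1-5
enable sflow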

CLI SCRIPTING

To streamline deployment and administration of the network, users can leverage ExtremeXOS automated switch management capabilities. The CLI-based scripting, with TCL and Python support, allows users to significantly automate switch management through support of variables and functions that users customize for handling special events. ExtremeXOS has a flexible framework that can enable selected trigger events to activate dynamic profiles, such as when a user or device connects to a switch port. These profiles contain script commands that cause dynamic changes to the switch configuration, and can be used for general manageability of the network or to enforce policies.

Security

Big Data applications like Hadoop have some capabilities that address organizational concerns for securing the Big Data environment. Even clusters that may be deployed in private networks need to consider security measures against insider threats. Securing the cluster at the network layer is as important as securing it at the application layer, ensuring that traffic, and even access to the switch hardware itself, is authorized. ExtremeXOS supports a variety of security features that protect the network.

LIMITING UNAUTHORIZED HOSTS

The Big Data cluster may or may not have physical isolation, and it's important to prevent a user from connecting their own device to a switch and gaining direct access to the cluster. Extreme Networks provides users flexibility with two options, so that even if an unintended user gains physical access and connects their own device to the cluster, they would be unable to do anything on it (illustrative commands follow this section).

1. The ExtremeXOS feature called MAC address lockdown causes all dynamic FDB entries associated with a given port to be converted to locked static entries. It also sets the MAC learning limit to 0 so that no new entries can be learned and all new source MAC addresses are blackholed.
2. The OneFabric Connect API in NetSight can be used in conjunction with other Big Data management tools to import a list of authorized servers in the network. The NetSight network autoconfiguration system then uses this list of authorized servers to ensure that only authorized nodes access the network.

DENIAL OF SERVICE (DOS) PROTECTION

If the Big Data cluster is not isolated from the main IT infrastructure, there is an increased risk for untrusted users to issue a Denial of Service (DoS) attack. DoS attacks on the network can overwhelm the switch CPU with packets that require costly processing, thereby degrading performance. DoS protection helps prevent this degraded performance by attempting to characterize the problem and filter out the offending traffic so that other functions can continue. When a flood of CPU-bound packets reaches the switch, DoS Protection will temporarily create a hardware access control list (ACL) to limit the flow of these packets to the switch CPU until the attack ends.
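As an illustration of the two protections above, below is a minimal sketch of locking down learned MAC addresses on the server-facing ports and enabling DoS protection. The port list and VLAN name are taken from the leaf sample configuration in the Appendix, and exact command options may vary by ExtremeXOS version.

# Convert learned MAC addresses on the server ports to locked static entries (MAC address lockdown)
configure ports 1-5 vlan rack1 lock-learning
# Enable DoS protection so that floods of CPU-bound packets are rate-limited with a temporary ACL
enable dos-protect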

Appendix

LEAF SWITCH SAMPLE CONFIGURATION

Below are snippets of a leaf switch configuration:

enable jumbo-frame ports all
create vlan rack1
configure vlan rack1 tag 192
create vlan isc
configure vlan isc tag 2
create vlan nvlan
create vlan vl1
configure vlan vl1 tag 101
create vlan vl2
configure vlan vl2 tag 102
create vlan vl21
configure vlan vl21 tag 121
create vlan vl22
configure vlan vl22 tag 122

# Ports to servers
disable flow-control rx-pause port 1
configure ports 1 display-string nn1_eth3
configure ports 1 auto off speed 10000 duplex full
disable flow-control rx-pause port 2
configure ports 2 display-string dn1_eth3
configure ports 2 auto off speed 10000 duplex full
disable flow-control rx-pause port 3
configure ports 3 display-string dn2_eth3
configure ports 3 auto off speed 10000 duplex full
disable flow-control rx-pause port 4
configure ports 4 display-string dn3_eth3
configure ports 4 auto off speed 10000 duplex full
disable flow-control rx-pause port 5
configure ports 5 display-string dn4_eth3
configure ports 5 auto off speed 10000 duplex full

# ISC ports
disable flow-control rx-pause port 41
configure ports 41 display-string L1-41
configure ports 41 auto off speed 10000 duplex full
disable flow-control rx-pause port 42
configure ports 42 display-string L1-42
configure ports 42 auto off speed 10000 duplex full
disable flow-control rx-pause port 43
configure ports 43 display-string L1-43
configure ports 43 auto off speed 10000 duplex full
disable flow-control rx-pause port 44
configure ports 44 display-string L1-44
configure ports 44 auto off speed 10000 duplex full
disable flow-control rx-pause port 45
configure ports 45 display-string L1-45
configure ports 45 auto off speed 10000 duplex full
disable flow-control rx-pause port 46
configure ports 46 display-string L1-46
configure ports 46 auto off speed 10000 duplex full
disable flow-control rx-pause port 47
configure ports 47 display-string L1-47
configure ports 47 auto off speed 10000 duplex full
disable flow-control rx-pause port 48
configure ports 48 display-string L1-48
configure ports 48 auto off speed 10000 duplex full

# Uplink ports to spine switches
disable flow-control rx-pause port 49
configure ports 49 display-string S1-v1
disable flow-control rx-pause port 50
disable flow-control rx-pause port 51
disable flow-control rx-pause port 52
disable flow-control rx-pause port 53
configure ports 53 display-string S1-v21
disable flow-control rx-pause port 54
disable flow-control rx-pause port 55
disable flow-control rx-pause port 56
disable flow-control rx-pause port 57
configure ports 57 display-string S2-v2
disable flow-control rx-pause port 58
disable flow-control rx-pause port 59
disable flow-control rx-pause port 60
disable flow-control rx-pause port 61
configure ports 61 display-string S2-v22
disable flow-control rx-pause port 62
disable flow-control rx-pause port 63
disable flow-control rx-pause port 64

enable sharing 1 grouping 1 algorithm address-based L3 lacp
enable sharing 2 grouping 2 algorithm address-based L3 lacp
enable sharing 3 grouping 3 algorithm address-based L3 lacp
enable sharing 4 grouping 4 algorithm address-based L3 lacp
enable sharing 5 grouping 5 algorithm address-based L3 lacp
enable sharing 41 grouping 41-48 algorithm address-based L3_L4 lacp

configure vlan rack1 add ports 41 tagged
configure vlan rack1 add ports 1-6 untagged
configure vlan isc add ports 41 tagged
configure vlan vl1 add ports 49 tagged
configure vlan vl2 add ports 57 tagged
configure vlan vl21 add ports 53 tagged
configure vlan vl22 add ports 61 tagged

configure vlan Mgmt ipaddress 10.6.117.7 255.255.255.0
configure vlan rack1 ipaddress 192.168.10.2 255.255.255.0
enable ipforwarding vlan rack1
configure ip-mtu 9194 vlan rack1
configure vlan isc ipaddress 1.1.1.2 255.255.255.0
configure ip-mtu 9194 vlan isc
configure vlan vl1 ipaddress 10.10.1.61 255.255.255.0
enable ipforwarding vlan vl1
configure ip-mtu 9194 vlan vl1
configure vlan vl2 ipaddress 10.10.2.61 255.255.255.0
enable ipforwarding vlan vl2
configure ip-mtu 9194 vlan vl2
configure vlan vl21 ipaddress 10.10.21.61 255.255.255.0
enable ipforwarding vlan vl21
configure ip-mtu 9194 vlan vl21
configure vlan vl22 ipaddress 10.10.22.61 255.255.255.0
enable ipforwarding vlan vl22
configure ip-mtu 9194 vlan vl22

enable iproute sharing vr VR-Default
configure iproute add default 10.6.117.1 vr VR-Mgmt
configure iproute add default 10.6.117.1

# NMS
configure vm-tracking nms primary server 10.65.0.11 1812 client-ip 10.6.117.7 shared-secret encrypted Gt}xolg5 vr VR-Mgmt

# For Active/Active VRRP
configure access-list Block_VRRP ports 41 ingress

#
configure iproute sharing max-gateways 8
configure forwarding switching-mode cut-through

# OSPF
configure ospf routerid 10.11.11.61
enable ospf
configure ospf vlan corp priority 0
configure ospf add vlan rack1 area 0.0.0.0 passive
configure ospf vlan rack1 priority 0
configure ospf vlan isc priority 0
configure ospf vlan rtr priority 0
configure ospf add vlan vl1 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl1 priority 0
configure ospf vlan vl1 bfd on
configure ospf add vlan vl2 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl2 priority 0
configure ospf vlan vl2 bfd on
configure ospf add vlan vl21 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl21 priority 0
configure ospf vlan vl21 bfd on
configure ospf add vlan vl22 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl22 priority 0
configure ospf vlan vl22 bfd on

# SNMP
configure snmpv3 add target-addr v1v2cnotifytaddr1 param v1v2cnotifyparam1 ipaddress 10.65.0.69 transport-port 10550 from 10.6.117.7 tag-list defaultnotify
configure snmpv3 add target-params v1v2cnotifyparam1 user v1v2cnotifyuser1 mp-model snmpv2c sec-model snmpv2c sec-level noauth

#
enable vm-tracking
configure vm-tracking authentication database-order nms
enable vm-tracking ports 1-5

# VRRP
create vrrp vlan rack1 vrid 1
configure vrrp vlan rack1 vrid 1 priority 150
configure vrrp vlan rack1 vrid 1 preempt delay 1
configure vrrp vlan rack1 vrid 1 add 192.168.10.1
enable vrrp vlan rack1 vrid 1

# MLAG
create mlag peer x670_leaf_2
configure mlag peer x670_leaf_2 ipaddress 1.1.1.1 vr VR-Default
enable mlag port 1 peer x670_leaf_2 id 1
enable mlag port 2 peer x670_leaf_2 id 2
enable mlag port 3 peer x670_leaf_2 id 3
enable mlag port 4 peer x670_leaf_2 id 4
enable mlag port 5 peer x670_leaf_2 id 5

Here is the policy for VRRP Active/Active:

x670_leaf_1.10 # show policy Block_VRRP
Policies at Policy Server:
Policy: Block_VRRP
entry v4active {
    if match all {
        destination-address 224.0.0.18/32 ;
    } then {
        deny ;
    }
}
Number of clients bound to policy: 1
Client: acl bound once

SPINE SWITCH SAMPLE CONFIGURATION

Below are relevant snippets of a spine switch configuration:

enable jumbo-frame ports all
create vlan vl1
configure vlan vl1 tag 101
create vlan vl21
configure vlan vl21 tag 121
create vlan vl23
configure vlan vl23 tag 123
create vlan vl25
configure vlan vl25 tag 125
create vlan vl27
configure vlan vl27 tag 127
create vlan vl3
configure vlan vl3 tag 103
create vlan vl5
configure vlan vl5 tag 105
create vlan vl7
configure vlan vl7 tag 107

# Ports to leaf switches
configure ports 1 display-string L1-v1
configure ports 5 display-string L2-v3
configure ports 9 display-string L3-v5
configure ports 13 display-string L4-v7
configure ports 17 display-string L1-v21
configure ports 21 display-string L2-v23
configure ports 25 display-string L3-v25
configure ports 29 display-string L4-v27

configure vlan vl1 add ports 1 tagged
configure vlan vl21 add ports 17 tagged
configure vlan vl23 add ports 21 tagged
configure vlan vl25 add ports 25 tagged
configure vlan vl27 add ports 29 tagged
configure vlan vl3 add ports 5 tagged
configure vlan vl5 add ports 9 tagged
configure vlan vl7 add ports 13 tagged

configure vlan Mgmt ipaddress 10.6.117.35 255.255.255.0
configure vlan vl1 ipaddress 10.10.1.71 255.255.255.0
enable ipforwarding vlan vl1
configure ip-mtu 9194 vlan vl1
configure vlan vl3 ipaddress 10.10.3.71 255.255.255.0
enable ipforwarding vlan vl3
configure ip-mtu 9194 vlan vl3
configure vlan vl5 ipaddress 10.10.5.71 255.255.255.0
enable ipforwarding vlan vl5
configure ip-mtu 9194 vlan vl5
configure vlan vl7 ipaddress 10.10.7.71 255.255.255.0
enable ipforwarding vlan vl7
configure ip-mtu 9194 vlan vl7
configure vlan rtr ipaddress 10.7.70.1 255.255.255.255
configure vlan vl21 ipaddress 10.10.21.71 255.255.255.0
enable ipforwarding vlan vl21
configure ip-mtu 9194 vlan vl21
configure vlan vl23 ipaddress 10.10.23.71 255.255.255.0
enable ipforwarding vlan vl23
configure ip-mtu 9194 vlan vl23
configure vlan vl25 ipaddress 10.10.25.71 255.255.255.0
enable ipforwarding vlan vl25
configure ip-mtu 9194 vlan vl25
configure vlan vl27 ipaddress 10.10.27.71 255.255.255.0
enable ipforwarding vlan vl27
configure ip-mtu 9194 vlan vl27

enable iproute sharing vr VR-Default
configure iproute add default 10.6.117.1 vr VR-Mgmt
configure iproute sharing max-gateways 8
configure forwarding switching-mode cut-through

# OSPF
configure ospf routerid 10.11.11.71
enable ospf
configure ospf add vlan vl1 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl1 bfd on
configure ospf add vlan vl21 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl21 bfd on
configure ospf add vlan vl23 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl23 bfd on
configure ospf add vlan vl25 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl25 bfd on
configure ospf add vlan vl27 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl27 bfd on
configure ospf add vlan vl3 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl3 bfd on
configure ospf add vlan vl5 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl5 bfd on
configure ospf add vlan vl7 area 0.0.0.0 link-type point-to-point
configure ospf vlan vl7 bfd on

RACK AWARENESS SAMPLE SCRIPT

Below is the script that auto-generates the topology file as servers are added or removed:

#!/usr/bin/python
# Overview: this script maps the datanode server IP address to a rack.
# It generates the topology file that is input into the Hadoop cluster.
# 1. Load this script on the NMS in a new directory:
#    /usr/local/extreme_networks/netsight/rackawareness
# 2. This script needs the prerequisite file (db_rack_switches.csv) that maps
#    switches to racks, because the server is associated to a switch and the
#    switch is associated to a rack. Example line: 10.1.1.1 /rack1
# 3. Configure the leaf switch to authenticate on the ports connected to the
#    datanodes, which will send the server info to the NAC.

# 4. Configure the NAC to respond to the authentication request by calling this
#    script with the server IP address and switch IP address.

import sys
import csv
import subprocess
import shlex
import re
import os.path

switch_mapping_filename = "db_rack_switches.csv"
server_mapping_filename = "topology.data"
server_mapping_filename_new = server_mapping_filename

# The NMS calls this script with the datanode IP address and the switch IP address
server_ip = sys.argv[1]
switch_ip = sys.argv[2]

# build switch_db from the switch-to-rack mapping file
switch_db = {}
if os.path.isfile(switch_mapping_filename):
    switch_mapping_fh = open(switch_mapping_filename, "r")
    for line in switch_mapping_fh:
        line = line.strip().split()
        switch = line[0]
        rack = line[1]
        switch_db[switch] = rack
    switch_mapping_fh.close()

# build server_db from the existing topology file, if any
server_db = {}
if os.path.isfile(server_mapping_filename):
    server_mapping_fh = open(server_mapping_filename, "r")
    for line in server_mapping_fh:
        line = line.strip().split()
        server = line[0]
        rack = line[1]
        server_db[server] = rack
    server_mapping_fh.close()

# get rack from switch
rack_name = switch_db[switch_ip]
server_db[server_ip] = rack_name

# get hostname (optional: resolve the server hostname via SNMP sysName)
skip_get_hostname = 1
if skip_get_hostname == 0:
    cmd = "snmpgetnext -v 2c -c public %s 1.3.6.1.2.1.1.5" % (server_ip)
    print cmd
    args = shlex.split(cmd)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    response, error = proc.communicate()
    result = re.search("STRING: \"*(.*?)\.", response)
    if result:
        server_hostname = result.group(1)
    else:
        print "Server name not found from %s" % (server_ip)
        exit(1)
    print server_hostname
    server_db[server_hostname] = rack_name

# write new server mapping file
fh = open(server_mapping_filename_new, "w")
for server in server_db.keys():
    fh.write("%s\t%s\n" % (server, server_db[server]))
fh.close()

http://www.extremenetworks.com/contact Phone +1-408-579-2800

© 2014 Extreme Networks, Inc. All rights reserved. Extreme Networks and the Extreme Networks logo are trademarks or registered trademarks of Extreme Networks, Inc. in the United States and/or other countries. All other names are the property of their respective owners. For additional information on Extreme Networks Trademarks please see http://www.extremenetworks.com/company/legal/trademarks/. Specifications and product availability are subject to change without notice. 8625-0713

WWW.EXTREMENETWORKS.COM

# get hostname skip_get_hostname = 1 if skip_get_hostname == 0: cmd = snmpgetnext -v 2c -c public %s 1.3.6.1.2.1.1.5 %(server_ip) print cmd args = shlex.split(cmd) proc = subprocess.popen(args, stdout=subprocess.pipe) response, error = proc.communicate() result = re.search( STRING: \ *(.*?)\., response) if result: server_hostname = result.group(1) else: print Server name t found from %s %(server_ip) exit(1) print server_hostname server_db[server_hostname] = rack_name # write new server mapping file fh = open(server_mapping_filename_new, w ) for server in server_db.keys(): fh.write ( %s\t%s\n %(server,server_db[server])) fh.close() http://www.extremenetworks.com/contact Phone +1-408-579-2800 2014 Extreme Networks, Inc. All rights reserved. Extreme Networks and the Extreme Networks logo are trademarks or registered trademarks of Extreme Networks, Inc. in the United States and/or other countries. All other names are the property of their respective owners. For additional information on Extreme Networks Trademarks please see http://www.extremenetworks.com/company/legal/trademarks/. Specifications and product availability are subject to change without tice. 8625-0713 WWW.EXTREMENETWORKS.COM Big Data Solutions Guide 32