How To Monitor Infiniband Network Data From A Network On A Leaf Switch (Wired) On A Microsoft Powerbook (Wired Or Microsoft) On An Ipa (Wired/Wired) Or Ipa V2 (Wired V2)



Similar documents
INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool

Introduction to Infiniband. Hussein N. Harake, Performance U! Winter School

RDMA over Ethernet - A Preliminary Study

InfiniBand Software and Protocols Enable Seamless Off-the-shelf Applications Deployment

Scaling 10Gb/s Clustering at Wire-Speed

Performance Evaluation of InfiniBand with PCI Express

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

LS DYNA Performance Benchmarks and Profiling. January 2009

Mellanox Academy Online Training (E-learning)

Lustre Networking BY PETER J. BRAAM

Why Compromise? A discussion on RDMA versus Send/Receive and the difference between interconnect and application semantics

High Speed I/O Server Computing with InfiniBand

ECLIPSE Performance Benchmarks and Profiling. January 2009

Cisco s Massively Scalable Data Center

A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks

InfiniBand Clustering

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

Mellanox Academy Course Catalog. Empower your organization with a new world of educational possibilities

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

Lustre & Cluster. - monitoring the whole thing Erich Focht

High Performance Data-Transfers in Grid Environment using GridFTP over InfiniBand

Performance Evaluation of the RDMA over Ethernet (RoCE) Standard in Enterprise Data Centers Infrastructure. Abstract:

Advancing Applications Performance With InfiniBand

Building a Scalable Storage with InfiniBand

Network Contention and Congestion Control: Lustre FineGrained Routing

SMB Advanced Networking for Fault Tolerance and Performance. Jose Barreto Principal Program Managers Microsoft Corporation

Architecting Low Latency Cloud Networks

OFA Training Program. Writing Application Programs for RDMA using OFA Software. Author: Rupert Dance Date: 11/15/

Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct

Switching Architectures for Cloud Network Designs

Quantum StorNext. Product Brief: Distributed LAN Client

Cray DVS: Data Virtualization Service

Chapter 16: Distributed Operating Systems

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014

SDN CENTRALIZED NETWORK COMMAND AND CONTROL

Networking in the Hadoop Cluster

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

Storage at a Distance; Using RoCE as a WAN Transport

High Throughput File Servers with SMB Direct, Using the 3 Flavors of RDMA network adapters

Bandwidth consumption: Adaptive Defense and Adaptive Defense 360

A Tour of the Linux OpenFabrics Stack

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

Isilon IQ Network Configuration Guide

Sun Constellation System: The Open Petascale Computing Architecture

Configuring IPS High Bandwidth Using EtherChannel Load Balancing

BRIDGING EMC ISILON NAS ON IP TO INFINIBAND NETWORKS WITH MELLANOX SWITCHX

Understanding PCI Bus, PCI-Express and In finiband Architecture

Performance Monitoring of Parallel Scientific Applications

VoIP network planning guide

Infiniband Monitoring of HPC Clusters using LDMS (Lightweight Distributed Monitoring System)

Know your Cluster Bottlenecks and Maximize Performance

The IntelliMagic White Paper on: Storage Performance Analysis for an IBM San Volume Controller (SVC) (IBM V7000)

Scalability and Classifications

Advanced Computer Networks. Datacenter Network Fabric

Ethernet: THE Converged Network Ethernet Alliance Demonstration as SC 09

Ultra Low Latency Data Center Switches and iwarp Network Interface Cards

Hadoop on the Gordon Data Intensive Cluster

Mellanox Technical Training and Certification Program

D1.2 Network Load Balancing

SMB Direct for SQL Server and Private Cloud

Accelerating Spark with RDMA for Big Data Processing: Early Experiences

An Oracle White Paper July Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide

The proliferation of the raw processing

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer

InfiniBand Switch System Family. Highest Levels of Scalability, Simplified Network Manageability, Maximum System Productivity

Hari Subramoni. Education: Employment: Research Interests: Projects:

Current Status of FEFS for the K computer

Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003

Ethernet-based Software Defined Network (SDN) Cloud Computing Research Center for Mobile Applications (CCMA), ITRI 雲 端 運 算 行 動 應 用 研 究 中 心

Microsoft SMB Running Over RDMA in Windows Server 8

Hadoop Cluster Applications

SR-IOV In High Performance Computing

Interoperability Testing and iwarp Performance. Whitepaper

Performance of RDMA-Capable Storage Protocols on Wide-Area Network

Converging Data Center Applications onto a Single 10Gb/s Ethernet Network

Sun in HPC. Update for IDC HPC User Forum Tucson, AZ, Sept 2008

Module 15: Network Structures

SR-IOV: Performance Benefits for Virtualized Interconnects!

FLOW-3D Performance Benchmark and Profiling. September 2012

Cloud Infrastructure Planning. Chapter Six

Routing in packet-switching networks

ConnectX -3 Pro: Solving the NVGRE Performance Challenge

Monitoring PostgreSQL database with Verax NMS

Optimizing Data Center Networks for Cloud Computing

RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vsphere 5 R E S E A R C H N O T E

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Data Center Convergence. Ahmad Zamer, Brocade

Advanced Computer Networks. High Performance Networking I

Designing Efficient FTP Mechanisms for High Performance Data-Transfer over InfiniBand

5 Performance Management for Web Services. Rolf Stadler School of Electrical Engineering KTH Royal Institute of Technology.

Transcription:

INFINIBAND NETWORK ANALYSIS AND MONITORING USING OPENSM N. Dandapanthula 1, H. Subramoni 1, J. Vienne 1, K. Kandalla 1, S. Sur 1, D. K. Panda 1, and R. Brightwell 2 Presented By Xavier Besseron 1 Date: 08/30/2011 1 Network-Based Computing Laboratory, The Ohio State University 2 Sandia National Laboratories

Outline Introduction InfiniBand OpenSM Problem Statement INAM Scalable InfiniBand Network Analysis & monitoring tool Experimental Analysis Conclusions and Future work

InfiniBand An industry standard for low latency, high bandwidth System Area Networks 41.20% of the top 500 most powerful supercomputers in the world are based on the InfiniBand interconnects (JUNE 2011) Pleiades 111,104 cores - NASA Road Runner 122,400 cores - LANL Red Sky 42,440 cores - Sandia National Labs Ranger 62,976 cores - TACC Multiple Virtual Lanes (VL) supported by IB Logical channel under the same physical link Separate buffer and flow control Service Differentiation

OpenSM InfiniBand Subnet Manager (IBA Specifications) Part of OFED software package Open Fabrics Enterprise Distribution Open source software for RDMA and kernel bypass applications Needed by the HPC community for applications which need low latency and high efficiency and fast I/O Scans, Initiates and Monitors the InfiniBand Fabric Performance Counters and Subnet Management Attributes (Not supported at VL granularity) Subnet Manager (SM), Subnet Management Agent (SMA) At least one instance required per Subnet Usage of Virtual Lanes

Existing Monitoring Tools Nagios [Agent Based] + Easily Integratable & Configurable + Supports multiple interconnects - No discovery process - Involves more overhead - No Layer 2, Switch Dependent Ganglia [Agent Based] + Portable and Scalable + Distributed Modules provide higher sampling rates + Supports multiple interconnects - Use of Daemons (gmond) involves more overhead - Metric measurements in compiled code - Adding custom metrics can be a bit complicated

Existing Monitoring Tools (Contd) Fabric IT [Agent Less] + Good Sampling Rates + Agent less + Integrated into the Subnet Manager - Proprietary by Mellanox, Specific for IB - Does not show communication patterns or Link usage pertaining to a Job - No long term data storage

InfiniBand Network Analysis and Monitoring Tool Can an InfiniBand network monitoring tool be designed such that: Shows the various performance counters and attributes Is Agentless Has low overhead Depicts the communication matrix of target applications Shows the link usage statistics

Outline Introduction INAM Scalable InfiniBand Network Analysis & monitoring tool Framework & Design Network Monitoring Link Utilization & Communication Pattern Experimental Analysis Conclusions and Future work

INAM-Framework 789 *++,-./0-1#& 7-B5 8'(?1(C/#.' D8>E-4(/(-'& "#$%&'(&)*++,-./0-1#2.-'#0-&0& "#0'(+(-&' *++,-./0-1#& "#0'(+(-&' D-$$,'F/(' >G*D >#?-#-;/#$G'0F1(H :'4;/&'$<-&6/,-=/0-1#>#0'(?/.' @:<>A >#?-#-;/#$G'0F1(HI6'(J-#B 2'(K-.'@>GI2A L+'#2D 3142.5'$6,'(

INAM-Framework (Contd) $%-, $%&' "#$ 34562*4 "*1'*47*4 ()*+', &8*49:+;<./0/ =5>>*?0:5+./0/1/2*

INAM Network Monitoring Network Monitoring Query the SMAs on the host nodes to obtain the performance counters and Subnet Management attributes and SM info Temporary Database. Real time monitoring with visualization. Permanent Database Keeps track of events in the subnet Stores them for the time period mentioned by the user Query this database to obtain the behavior of network traffic over a period of time Modify rate of data collection (Sampling rate) as per user input Modify rate of display as per user input

INAM Network Monitoring Monitors the following in real time Performance Counters Subnet Management Attributes Subnet Manager information in real time. Selecting Performance Counters to monitor

Monitoring Performance Counters Comparing Transmitted and Received Data on a Port

INAM Network Monitoring Subnet Manager Information VL Attributes Link Attributes Monitoring Subnet Management Attributes

INAM Link Utilization Link Utilization Attributes Used XmtWait attribute The number of units of time a packet waits to be transmitted from a port Used for determining Link overutilization Received Packets, Sent Packets, Link Speed Used for determining data exchange Based on the host file provided by the user, obtain all possible paths between every source & destination pairs Color variation of the links dependent on the amount of data transferred Keep track of how many times each link is traversed and the amount of data flowing through it

INAM Link Utilization Screenshot Showing Link Utilization

Outline Introduction INAM Scalable InfiniBand Network Analysis & monitoring tool Experimental Analysis Conclusions and Future work

Experimental Analysis Experimental Setup 6 Leaf Switches, 6 Spine Switches with 24 ports per switch 35 nodes Experiments Communication pattern analysis for 16 processes and 64 processes Communication pattern analysis for MPI_Bcast Operation with 16KB and 1 MB and with 6 processes. One process on each leaf switch Communication Pattern for LU benchmark from Spec MPI Suite

Network Traffic Pattern for 16 processes P2P communication with 8 processes on switch 84 and 4 processes each on switch 78 and 66

Network Traffic Pattern for 64 processes P2P communication with 32 processes on switch 84 and 16 processes each on switch 78 and 66

Link Utilization of Binomial Algorithm MPI_BCAST 16 KB 6 nodes 1 node / switch

Link Utilization of Scatter Allgather Algorithm MPI_BCAST 1 MB 6 nodes 1 node / switch

Communication Pattern of LU Benchmark 128 processes 8 nodes / switch 8 processes / node

INAM - Overhead IMB alltoall 8 cores / node Overhead less then 0.5 % as we increase the system size

Outline Introduction INAM Scalable InfiniBand Network Analysis & monitoring tool Experimental Analysis Conclusions and Future work

Conclusion & Future Work Conclusion INAM - a scalable network monitoring and visualization tool for InfiniBand networks Low Overhead Agent less Link Utilization Communication Pattern Future Work Time line graphical pattern display which shows the entire cluster s traffic at every instant. Scalability Studies On line analysis of the Communication patterns on a cluster Incorporate support for counters per virtual lane