Performance Analysis and Tuning in Windows HPC Server 2008
Xavier Pillons, Program Manager, Microsoft Corp. (xpillons@microsoft.com)

Transcription

1 Performance Analysis and Tuning in Windows HPC Server 2008
Xavier Pillons, Program Manager, Microsoft Corp.

2 Introduction
- How to monitor performance on Windows?
- What to look for?
- How to tune the system?
- How to trace MS-MPI?

3 MEASURING PERFORMANCE

4 Performance Analysis
Cluster-wide
- Built-in diagnostics
- The Heatmap
Local
- Perfmon
- xperf

5 Built-in Network Diagnostics
MPI Ping-Pong (mpipingpong.exe)
Launchable via HPC Admin Console Diagnostics
- Pros: easy, data is auto-stored for historical comparison
- Cons: no choice of network, no intermediate results
Launchable via command line
Command-line features
- Tournament mode, ring mode, serial mode
- Output progress to XML, stderr, stdout
- Histogram, per-node, and per-cluster data
- Test throughput, latency, or both
Remember: usually you want only 1 rank per node
Additional diagnostics and extensibility in v3

6 Network diagnostics

7 Basic Network Troubleshooting
Know expected bandwidths and latencies:

Network | Bandwidth | Latency
IB QDR (ConnectX PCI-E 2.0) | 2400 MB/s | 2 µs
IB DDR (ConnectX PCI-E 2.0) | 1500 MB/s | 2 µs
IB DDR (ConnectX PCI-E 1.0) | 1400 MB/s | 2.8 µs
IB DDR / ND | 1400 MB/s | 5 µs
IB SDR / ND | 950 MB/s | 6 µs
IB / IPoIB | MB/s | 30 µs
GigE | 105 MB/s | 40-70 µs

Make sure drivers and firmware are up to date
Use the product diagnostics to confirm
- Or Pallas PingPong, etc.
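
The expected figures on this slide can be turned into a quick sanity check for a measured ping-pong result. The following is a minimal Python sketch, not part of the original deck: the function name `check_link`, the 80% bandwidth floor, and the 1.5x latency ceiling are illustrative assumptions, and measured values are passed in by hand. The IPoIB row is left out because its bandwidth figure is missing from the transcript.

```python
# Expected figures copied from the slide: (bandwidth in MB/s, latency in µs).
EXPECTED = {
    "IB QDR (ConnectX PCI-E 2.0)": (2400, 2.0),
    "IB DDR (ConnectX PCI-E 2.0)": (1500, 2.0),
    "IB DDR (ConnectX PCI-E 1.0)": (1400, 2.8),
    "IB DDR / ND": (1400, 5.0),
    "IB SDR / ND": (950, 6.0),
    "GigE": (105, 70.0),  # slide gives 40-70 µs; the upper end is used here
}


def check_link(network, measured_mb_s, measured_us,
               bw_fraction=0.8, lat_factor=1.5):
    """Return a list of warnings; an empty list means 'within tolerance'.

    bw_fraction and lat_factor are assumed tolerances, not slide values.
    """
    exp_bw, exp_lat = EXPECTED[network]
    warnings = []
    if measured_mb_s < bw_fraction * exp_bw:
        warnings.append(
            f"bandwidth {measured_mb_s} MB/s is below "
            f"{bw_fraction:.0%} of the expected {exp_bw} MB/s")
    if measured_us > lat_factor * exp_lat:
        warnings.append(
            f"latency {measured_us} µs is above "
            f"{lat_factor}x the expected {exp_lat} µs")
    return warnings


if __name__ == "__main__":
    for line in check_link("GigE", 60, 120):
        print("WARNING:", line)
```

A link that fails this check is a prompt to look at drivers and firmware first, as the slide suggests.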

8 Cluster Sanity Checks
HPC Toolpack can help too

9 The Heatmap

10 Basic Tools - Perfmon

Counter | Tolerance | Used for
Processor / % CPU Time | 95% | User-mode bottleneck
Processor / % Kernel Time | 10% | Kernel issues
Processor / % DPC Time | 5% | RSS, affinity
Processor / % Interrupt Time | 5% | Misbehaving drivers
Network / Output Queue Length | 1 | Network bottleneck
Disk / Average Queue Length | 1 per platter | Disk bottleneck
Memory / Pages per Sec | 1 | Hard faults
System / Context Switches per Sec | 20,000 | Locks, wasting processing
System / System Calls per Sec | 100,000 | Excessive transitions
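
As an illustration of how the tolerance table might be applied, here is a small Python sketch that flags counters exceeding their tolerance. It assumes the samples have already been collected (for example, exported from Perfmon) into a dictionary keyed by the counter names used on the slide; `over_tolerance` and the `platters` parameter are hypothetical names introduced for this example.

```python
# Tolerances copied from the slide. Percent counters are in percent,
# the others in their natural units.
TOLERANCES = {
    "Processor / % CPU Time": 95,
    "Processor / % Kernel Time": 10,
    "Processor / % DPC Time": 5,
    "Processor / % Interrupt Time": 5,
    "Network / Output Queue Length": 1,
    "Disk / Average Queue Length": 1,  # per platter, scaled below
    "Memory / Pages per Sec": 1,
    "System / Context Switches per Sec": 20_000,
    "System / System Calls per Sec": 100_000,
}


def over_tolerance(samples, platters=1):
    """Return the sorted names of counters whose sample exceeds tolerance.

    samples maps counter name -> measured value; unknown names are ignored.
    """
    flagged = []
    for name, value in samples.items():
        limit = TOLERANCES.get(name)
        if limit is None:
            continue
        if name == "Disk / Average Queue Length":
            limit *= platters  # the slide allows 1 queued request per platter
        if value > limit:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    sample = {
        "Processor / % CPU Time": 97,
        "Processor / % DPC Time": 2,
        "System / Context Switches per Sec": 35_000,
    }
    print(over_tolerance(sample))
```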

11 Perfmon In Use

12 Windows Performance Toolkit
Official performance analysis tools from Windows
- Used to optimize Windows itself
Wide support range
- Cross-platform: Vista, Server 2008/R2, Win7
- Cross-architecture: x86, x64, ia64
Very low overhead, live capture on production systems
- Less than 2% processor overhead for a sustained rate of 10,000 events/second on a 2 GHz processor
The only tool that lets you correlate most of the fundamental system activity
- All processes and threads, both user and kernel mode
- DPCs and ISRs, thread scheduling, disk and file I/O, memory usage, graphics subsystem, etc.
Available externally: part of the Windows 7 SDK
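
The overhead claim on this slide implies a per-event cost budget. A short worked calculation in Python (the function name is introduced here for illustration): 2% of a 2 GHz processor spread over 10,000 events per second leaves at most about 4,000 cycles per traced event.

```python
def cycles_per_event(overhead_percent, clock_hz, events_per_sec):
    """Upper bound on CPU cycles spent per event for a given overhead."""
    return clock_hz * overhead_percent / 100 / events_per_sec


if __name__ == "__main__":
    # Figures from the slide: < 2% overhead, 10,000 events/s, 2 GHz.
    print(cycles_per_event(2, 2_000_000_000, 10_000))  # 4000.0
```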

13 Performance Analysis

14 TUNING

15 Kernel By-Pass: NetworkDirect
A new RDMA networking interface built for speed and stability
- Verbs-based design for a close fit with native, high-performance networking interfaces
- Equal to hardware-optimized stacks for MPI micro-benchmarks
NetworkDirect drivers for key high-performance fabrics:
- InfiniBand [available now!]
- 10 Gigabit Ethernet (iWARP-enabled) [available now!]
- Myrinet [available soon]
MS-MPI v2 has 4 networking paths:
- Shared memory between processors on a motherboard
- TCP/IP stack ("normal" Ethernet)
- Winsock Direct for sockets-based RDMA
- New NetworkDirect interface
[Diagram: networking stack. Labels: socket-based app; MPI app; MS-MPI; Windows Sockets (Winsock + WSD); TCP, IP, NDIS, mini-port driver (TCP/Ethernet networking); Winsock Direct provider; NetworkDirect provider; user-mode access layer; hardware driver; networking hardware (RDMA networking). Legend: (ISV) app, HPCS2008 component, OS component, IHV component; user mode, kernel mode.]

16 MS-MPI Fine Tuning
Lots of MPI parameters (mpiexec -help3):
- MPICH_PROGRESS_SPIN_LIMIT
  - 0 is adaptive, otherwise 1-64K
- SHM / SOCK / ND eager limit
  - Switchover point for eager / rendezvous behaviour
- ND ZCOPY threshold
  - Sets the switchover point between bcopy and zcopy
  - Buffer reuse and registration cost affect this (registration ~= 32K bcopy)
- Affinity
  - Definitely used for NUMA systems

17 Reducing OS Jitter
- Track hard faults with xperf
- Disable unused services (up to 42+)
- Delete Windows scheduled tasks
- Change the Group Policy (GP) update interval (90 min by default)

18 Tuning Memory Access
- Effective memory use is rule #1
- Processor affinity is key here
- Need to know the processor architecture
- Use STREAM to measure memory bandwidth

19 Process Placement
Node groups, job templates, filters, affinity
- Application aware: an ISV application (requires nodes where the application is installed)
- Capacity aware: a multi-threaded application (requires a machine with many cores); a big model (requires large-memory machines)
- NUMA aware: a 4-way structural analysis MPI job
[Diagram: cluster with GigE, 10 GigE, and InfiniBand networks; a blade chassis; 8-core, 16-core, and 32-core servers; quad-core and 32-core node layouts showing cores (C0-C3), memory (M), and IO.]

20 MPI Process Placement
Request resources with JOB:
- /numnodes:N
- /numsockets:N
- /numcores:N
- /exclusive
Control placement with MPIEXEC:
- -cores X
- -n X
- -affinity
Examples (each illustrated with a compute node diagram):
- job submit /numcores:4 mpiexec foo.exe
- job submit /numnodes:2 mpiexec -c 2 -affinity foo.exe

21 Force Affinity
- mpiexec -affinity
- start /wait /b /affinity <mask> app.exe
- Windows API
  - SetProcessAffinityMask
  - SetThreadAffinityMask
- With Task Manager or procexp.exe

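22 Core and Affinity Mask for Woodcrest

Processor | Processor affinity mask | L2 cache affinity mask | Core | Core affinity mask
Processor 1 | 0x0F | 0x03 | Core 0 | 0x01
Processor 1 | 0x0F | 0x03 | Core 1 | 0x02
Processor 1 | 0x0F | 0x0C | Core 2 | 0x04
Processor 1 | 0x0F | 0x0C | Core 3 | 0x08
Processor 2 | 0xF0 | 0x30 | Core 4 | 0x10
Processor 2 | 0xF0 | 0x30 | Core 5 | 0x20
Processor 2 | 0xF0 | 0xC0 | Core 6 | 0x40
Processor 2 | 0xF0 | 0xC0 | Core 7 | 0x80

[Diagram: each pair of cores shares an L2 cache with its own bus interface; the bus interfaces connect to the system bus.]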
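
The masks on this slide follow directly from setting one bit per core. A minimal Python sketch that derives them, assuming the layout shown here (8 cores, 2 cores per L2 cache, 4 cores per processor); the helper names are introduced for illustration only. The last helper formats the mask the way the `start /affinity` command on the previous slide takes it, as a hexadecimal number without the `0x` prefix.

```python
def core_mask(core):
    """Affinity mask with only the bit for one core set."""
    return 1 << core


def group_mask(cores):
    """Affinity mask covering every core in the iterable."""
    mask = 0
    for core in cores:
        mask |= core_mask(core)
    return mask


def masks_by_group(total_cores, group_size):
    """Masks for consecutive groups of cores (e.g. per L2 cache or socket)."""
    return [group_mask(range(first, first + group_size))
            for first in range(0, total_cores, group_size)]


def start_command(mask, app):
    """Build a 'start /affinity' command line for the given mask."""
    return f"start /wait /b /affinity {mask:X} {app}"


if __name__ == "__main__":
    print([hex(core_mask(c)) for c in range(8)])   # per-core masks
    print([hex(m) for m in masks_by_group(8, 2)])  # per-L2-cache masks
    print([hex(m) for m in masks_by_group(8, 4)])  # per-processor masks
    print(start_command(core_mask(7), "app.exe"))
```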

23 Finer control of affinity to overcome hyperthreading on NH
mpiexec setaff.cmd
setaff.cmd (set affinity based on MPI rank; only one mask value survives in the transcript):
  if "%PMI_SMPD_KEY%" == "7" set
  if "%PMI_SMPD_KEY%" == "1" set
  if "%PMI_SMPD_KEY%" == "5" set
  if "%PMI_SMPD_KEY%" == "3" set
  if "%PMI_SMPD_KEY%" == "4" set
  if "%PMI_SMPD_KEY%" == "2" set
  if "%PMI_SMPD_KEY%" == "6" set
  if "%PMI_SMPD_KEY%" == "0" set AFFINITY=80
  start /wait /b /affinity %AFFINITY% %*

24 MS-MPI TRACING

25 Devs can't tune what they can't see
MS-MPI tracing: a single, time-correlated log of MPI events on all nodes
Dual purpose:
- Performance analysis
- Application troubleshooting
Trace data display:
- Vampir (TU Dresden)
- Intel Trace Analyzer
- MPICH Jumpshot (Argonne NL)
- Windows ETW tools
- Text

26 MS-MPI Tracing Overview
MS-MPI includes built-in tracing
- Low overhead
- Based on Event Tracing for Windows (ETW)
- No need to recompile your application
Three-step process
- Trace: mpiexec -trace [event category] MyApp.exe
- Sync: clocks across nodes (mpicsync.exe)
- Convert: to viewing format
Explained in excruciating detail in: Tracing MPI Apps with Windows HPC Server 2008
Traces can also be triggered via any ETW mechanism (xperf, etc.)

27 Step 1: Tracing and Filtering
- mpiexec -trace MyApp.exe
- mpiexec -trace (PT2PT,ICND) MyApp.exe
  - PT2PT: point-to-point communication
  - ICND: NetworkDirect interconnect communication
- These event groups are defined in the file mpitrace.mof, which resides in the %CCP_HOME%\bin\ folder
- Log files are written on each node in %userprofile%: mpi_trace_{jobid}.{taskid}.{taskinstanceid}.etl
- The trace filename can be overridden with the -tracefile argument

28 Step 2: Clock Synchronisation
- Use mpiexec and mpicsync to correct trace file timestamps for each node used in a job: mpiexec -cores 1 mpicsync mpi_trace_ etl
- mpicsync uses only the trace (.etl) file data to calculate CPU clock corrections
- mpicsync must be run as an MPI program
- mpiexec -cores 1 -wdir %%USERPROFILE%% mpicsync mpi_trace_%CCP_JOBID%.%CCP_TASKID%.%CCP_TASKINSTANCEID%.etl

29 Step 3: Format the Binary .etl File for Viewing
- Format to TEXT, OTF, or CLOG2 with tracefmt, etl2otf, and etl2clog
- Format the event log and apply clock corrections
- Leverage the power of your cluster by using mpiexec to translate all your .etl files simultaneously on the compute nodes used for your trace job: mpiexec -cores 1 -wdir %%USERPROFILE%% etl2otf mpi_trace_ etl
- Finally, collect the trace files from all nodes in a single location
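
The three tracing steps (trace, sync, convert) can be summarised as a sequence of command lines. The Python sketch below only assembles the strings shown on the slides and does not run anything; the function names and the example job, task, and task-instance identifiers are made up for illustration, and the exact mpiexec syntax should be checked against the tracing whitepaper before use.

```python
def trace_file(job_id, task_id, task_instance_id):
    """Default trace file name, following the pattern given in Step 1."""
    return f"mpi_trace_{job_id}.{task_id}.{task_instance_id}.etl"


def tracing_commands(app, job_id, task_id, task_instance_id,
                     categories=("PT2PT", "ICND")):
    """Return the trace, sync, and convert command lines in order."""
    etl = trace_file(job_id, task_id, task_instance_id)
    # One rank per node, working directory set to the user's profile,
    # as in the Step 2 and Step 3 examples.
    per_node = "mpiexec -cores 1 -wdir %%USERPROFILE%%"
    return [
        f"mpiexec -trace ({','.join(categories)}) {app}",  # Step 1: trace
        f"{per_node} mpicsync {etl}",                      # Step 2: sync clocks
        f"{per_node} etl2otf {etl}",                       # Step 3: convert to OTF
    ]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    for command in tracing_commands("MyApp.exe", 42, 1, 0):
        print(command)
```

Swapping `etl2otf` for `etl2clog` or `tracefmt` in the last step would target the other viewing formats named on the slide.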

30 Helper Script
TraceMyMPI.cmd
- Provided as part of the tracing whitepaper
- Executes all the required steps
- Starts mpiexec for you

31 MS-MPI Tracing and Viewing

32 QUESTIONS?

33 Resources
- The Windows Performance Toolkit is here: tools.mspx
- The Windows Internals series is very good
- Basic Windows Server tuning is here: _tun_srv.mspx
- Process Affinity in HPC Server 2008 SP1: 10/01/process-affinity-and-windows-hpc-server sp1.aspx


More information

Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003

Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003 Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks An Oracle White Paper April 2003 Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building

More information

Mark Bennett. Search and the Virtual Machine

Mark Bennett. Search and the Virtual Machine Mark Bennett Search and the Virtual Machine Agenda Intro / Business Drivers What to do with Search + Virtual What Makes Search Fast (or Slow!) Virtual Platforms Test Results Trends / Wrap Up / Q & A Business

More information

JBoss Seam Performance and Scalability on Dell PowerEdge 1855 Blade Servers

JBoss Seam Performance and Scalability on Dell PowerEdge 1855 Blade Servers JBoss Seam Performance and Scalability on Dell PowerEdge 1855 Blade Servers Dave Jaffe, PhD, Dell Inc. Michael Yuan, PhD, JBoss / RedHat June 14th, 2006 JBoss Inc. 2006 About us Dave Jaffe Works for Dell

More information

SAN Conceptual and Design Basics

SAN Conceptual and Design Basics TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer

More information

Building Clusters for Gromacs and other HPC applications

Building Clusters for Gromacs and other HPC applications Building Clusters for Gromacs and other HPC applications Erik Lindahl lindahl@cbr.su.se CBR Outline: Clusters Clusters vs. small networks of machines Why do YOU need a cluster? Computer hardware Network

More information

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology 3. The Lagopus SDN Software Switch Here we explain the capabilities of the new Lagopus software switch in detail, starting with the basics of SDN and OpenFlow. 3.1 SDN and OpenFlow Those engaged in network-related

More information

Headline in Arial Bold 30pt. The Need For Speed. Rick Reid Principal Engineer SGI

Headline in Arial Bold 30pt. The Need For Speed. Rick Reid Principal Engineer SGI Headline in Arial Bold 30pt The Need For Speed Rick Reid Principal Engineer SGI Commodity Systems Linux Red Hat SUSE SE-Linux X86-64 Intel Xeon AMD Scalable Programming Model MPI Global Data Access NFS

More information

Red Hat Summit 2009 Bryan Che

Red Hat Summit 2009 Bryan Che 1 Red Hat Enterprise MRG Update Bryan Che Product Manager, Red Hat September 2, 2009 2 About Red Hat Enterprise MRG Integrated platform for high performance distributed computing High speed, interoperable,

More information

Hardware Performance Optimization and Tuning. Presenter: Tom Arakelian Assistant: Guy Ingalls

Hardware Performance Optimization and Tuning. Presenter: Tom Arakelian Assistant: Guy Ingalls Hardware Performance Optimization and Tuning Presenter: Tom Arakelian Assistant: Guy Ingalls Agenda Server Performance Server Reliability Why we need Performance Monitoring How to optimize server performance

More information

Using Multipathing Technology to Achieve a High Availability Solution

Using Multipathing Technology to Achieve a High Availability Solution Using Multipathing Technology to Achieve a High Availability Solution Table of Contents Introduction...3 Multipathing Technology...3 Multipathing I/O Implementations...5 Storage Redundancy...5 Infortrend

More information

How To Understand And Understand The Power Of An Ipad Ios 2.5 (Ios 2) (I2) (Ipad 2) And Ipad 2.2 (Ipa) (Io) (Powergen) (Oper

How To Understand And Understand The Power Of An Ipad Ios 2.5 (Ios 2) (I2) (Ipad 2) And Ipad 2.2 (Ipa) (Io) (Powergen) (Oper Cisco Unified Computing System (UCS) A Complete Reference Guide to the Data Center Visualization Server Architecture Silvano Gai Tommi Salli Roger Andersson Cisco Press 800 East 96th Street Indianapolis,

More information

FIGURE 33.5. Selecting properties for the event log.

FIGURE 33.5. Selecting properties for the event log. 1358 CHAPTER 33 Logging and Debugging Customizing the Event Log The properties of an event log can be configured. In Event Viewer, the properties of a log are defined by general characteristics: log path,

More information

MLNX_VPI for Windows Installation Guide

MLNX_VPI for Windows Installation Guide MLNX_VPI for Windows Installation Guide www.mellanox.com NOTE: THIS INFORMATION IS PROVIDED BY MELLANOX FOR INFORMATIONAL PURPOSES ONLY AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIM- ITED

More information

NEC Corporation of America Intro to High Availability / Fault Tolerant Solutions

NEC Corporation of America Intro to High Availability / Fault Tolerant Solutions NEC Corporation of America Intro to High Availability / Fault Tolerant Solutions 1 NEC Corporation Technology solutions leader for 100+ years Established 1899, headquartered in Tokyo First Japanese joint

More information

PADS GPFS Filesystem: Crash Root Cause Analysis. Computation Institute

PADS GPFS Filesystem: Crash Root Cause Analysis. Computation Institute PADS GPFS Filesystem: Crash Root Cause Analysis Computation Institute Argonne National Laboratory Table of Contents Purpose 1 Terminology 2 Infrastructure 4 Timeline of Events 5 Background 5 Corruption

More information

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes

More information

Also on the Performance tab, you will find a button labeled Resource Monitor. You can invoke Resource Monitor for additional analysis of the system.

Also on the Performance tab, you will find a button labeled Resource Monitor. You can invoke Resource Monitor for additional analysis of the system. 1348 CHAPTER 33 Logging and Debugging Monitoring Performance The Performance tab enables you to view the CPU and physical memory usage in graphical form. This information is especially useful when you

More information

Post-production Video Editing Solution Guide with Microsoft SMB 3 File Serving AssuredSAN 4000

Post-production Video Editing Solution Guide with Microsoft SMB 3 File Serving AssuredSAN 4000 Post-production Video Editing Solution Guide with Microsoft SMB 3 File Serving AssuredSAN 4000 Dot Hill Systems introduction 1 INTRODUCTION Dot Hill Systems offers high performance network storage products

More information

EDUCATION. PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation

EDUCATION. PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies

More information

Oracle Database Scalability in VMware ESX VMware ESX 3.5

Oracle Database Scalability in VMware ESX VMware ESX 3.5 Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises

More information

Comparison of Novell, Polyserve, and Microsoft s Clustering Performance

Comparison of Novell, Polyserve, and Microsoft s Clustering Performance Technical White Paper Comparison of Novell, Polyserve, and Microsoft s Clustering Performance J Wolfgang Goerlich Written December 2006 Business Abstract Performance is a priority in highly utilized, asynchronous

More information

PCI Express Impact on Storage Architectures and Future Data Centers. Ron Emerick, Oracle Corporation

PCI Express Impact on Storage Architectures and Future Data Centers. Ron Emerick, Oracle Corporation PCI Express Impact on Storage Architectures and Future Data Centers Ron Emerick, Oracle Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies

More information

Improving High Performance Networking Technologies. for Data Center Clusters

Improving High Performance Networking Technologies. for Data Center Clusters Improving High Performance Networking Technologies for Data Center Clusters by Ryan Eric Grant A thesis submitted to the Department of Electrical and Computer Engineering In conformity with the requirements

More information

DPDK Summit 2014 DPDK in a Virtual World

DPDK Summit 2014 DPDK in a Virtual World DPDK Summit 2014 DPDK in a Virtual World Bhavesh Davda (Sr. Staff Engineer, CTO Office, ware) Rashmin Patel (DPDK Virtualization Engineer, Intel) Agenda Data Plane Virtualization Trends DPDK Virtualization

More information

Performance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper

Performance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper : Moving Storage to The Memory Bus A Technical Whitepaper By Stephen Foskett April 2014 2 Introduction In the quest to eliminate bottlenecks and improve system performance, the state of the art has continually

More information

Practical Performance Understanding the Performance of Your Application

Practical Performance Understanding the Performance of Your Application Neil Masson IBM Java Service Technical Lead 25 th September 2012 Practical Performance Understanding the Performance of Your Application 1 WebSphere User Group: Practical Performance Understand the Performance

More information

Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012

Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 1 Market Trends Big Data Growing technology deployments are creating an exponential increase in the volume

More information

Distribution One Server Requirements

Distribution One Server Requirements Distribution One Server Requirements Introduction Welcome to the Hardware Configuration Guide. The goal of this guide is to provide a practical approach to sizing your Distribution One application and

More information

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies

More information

From Ethernet Ubiquity to Ethernet Convergence: The Emergence of the Converged Network Interface Controller

From Ethernet Ubiquity to Ethernet Convergence: The Emergence of the Converged Network Interface Controller White Paper From Ethernet Ubiquity to Ethernet Convergence: The Emergence of the Converged Network Interface Controller The focus of this paper is on the emergence of the converged network interface controller

More information

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates High Performance Computing (HPC) CAEA elearning Series Jonathan G. Dudley, Ph.D. 06/09/2015 2015 CAE Associates Agenda Introduction HPC Background Why HPC SMP vs. DMP Licensing HPC Terminology Types of

More information

Windows Server Performance Monitoring

Windows Server Performance Monitoring Spot server problems before they are noticed The system s really slow today! How often have you heard that? Finding the solution isn t so easy. The obvious questions to ask are why is it running slowly

More information