Considering Middleware Options in High-Performance Computing Clusters

Middleware is a critical component for the development and porting of parallel-processing applications in distributed high-performance computing (HPC) cluster infrastructures. This article describes the evolution of the Message Passing Interface (MPI) standard specification as well as both open source and commercial MPI implementations that can be used to enhance Dell HPC cluster environments.

BY RINKU GUPTA, MONICA KASHYAP, YUNG-CHIN FANG, AND SAEED IQBAL, PH.D.

High-performance computing (HPC) clusters, a popular platform for hosting distributed parallel-processing applications, comprise multiple standards-based servers connected to each other via network interconnects. A typical HPC cluster has a layered architecture, beginning at the hardware level and concluding with the application level, as shown in Figure 1. Servers reside at the lowest level of the architecture, and each server contributes computational power to the cluster. Servers are connected to each other by a network infrastructure, which may be based on standard Ethernet technologies (such as Fast Ethernet or Gigabit Ethernet[1]) or proprietary high-speed technologies (such as Myricom Myrinet or InfiniBand). On top of the hardware level sit the operating system (OS) and the communication protocol libraries required by the specific interconnect, for example, TCP/IP for Ethernet or GM for Myrinet. This infrastructure helps provide the computational power of a supercomputer for parallel-processing applications.

Figure 1. HPC cluster layered architecture (layers, from top to bottom: application, middleware, operating system, communication protocol, hardware)

To enhance this distributed infrastructure and ease the development and porting of parallel-processing applications, a layer of middleware is required. With the growth of parallel-processing application development, two programming models have evolved to provide middleware capabilities: the shared-memory programming model and the message-passing programming model. The shared-memory programming model is based on the concept of a shared address space in which data exchange is achieved by writing to the shared space. The message-passing programming model is based on the concept of a distributed address space in which data exchange is achieved through explicit message passing. Message Passing Interface (MPI) is the de facto message-passing standard today. This article focuses on the growth of the MPI standard as well as the open source and commercial implementations of MPI available for use on Dell HPC clusters.

Evolution of middleware libraries

As massively parallel processing (MPP) systems and clusters have gained popularity, organizations have developed middleware libraries for use with these powerful systems. Parallel Virtual Machine (PVM)[2] was one of the first full-fledged middleware software libraries. PVM is designed to allow a network of heterogeneous machines to appear logically to the user as a single, large parallel machine. PVM was initially developed in 1989 as a joint research effort between the University of Tennessee, Oak Ridge National Laboratory, and Emory University. In addition to providing an implementation for sending and receiving messages, PVM implemented resource management, signal handling, and fault tolerance to help build a user environment for parallel processing. Because PVM was one of the first parallel-processing systems that provided portability across heterogeneous networks, the library was widely adopted by developers of parallel-processing applications. Both the popularity and the shortcomings of PVM provided a great impetus for the development of the MPI specification.

Emergence of the MPI specification

The MPI standard[3] specification was developed in 1993 by a diverse group of computer vendors, computer scientists, and software programmers who formed the MPI Forum. During the early 1990s, various vendors were developing their own middleware. The MPI Forum set out to develop a practical, portable, efficient, and flexible standard for communication among nodes and for running parallel-processing applications on distributed memory architectures. MPI allows data to be moved between the nodes in a cluster by sending and receiving the data as messages. This sending and receiving of messages also allows all the nodes in the cluster to be synchronized.

Note: The MPI specification is not a language. The specification comprises collections of subroutine application programming interfaces (APIs) that can be called by C and FORTRAN programs.

Wide acceptance of MPI has led to multiple implementations of the MPI specification for a variety of distributed memory based clusters. For parallel-processing nodes with specialized networking hardware, native MPI implementations can enhance performance. The various implementations have allowed parallel MPI applications to be ported across a wide range of architectures. MPI implementations can be fine-tuned for a specific architecture and the interconnects on which they run, helping to optimize efficiency and provide high performance.

MPI 1.1 and 1.2 standard specifications

The MPI 1.1 and MPI 1.2 specifications introduced many subroutine APIs, which greatly eased application writing. These APIs included primitives for point-to-point communications and collective operations, and for creating process topologies and process groups. Point-to-point communications comprise communications between two nodes, for example, synchronous or asynchronous sending and receiving of messages. Collective operations comprise global communications between groups of nodes, for example, barriers that bring about synchronization between groups of nodes; broadcasts for sending messages from one node to many nodes; and reduce, scatter, and gather operations. Figure 2 shows some of the subroutine primitives defined within the MPI 1.2 specification. A vendor can implement a subroutine as long as the subroutine primitive provided in the implementation conforms to the specification both syntactically and semantically.

Figure 2. Examples of MPI subroutine primitives
    Send data (blocking) to a node: MPI_Send
    Receive data (blocking) from a node: MPI_Recv
    Broadcast data from one node to many nodes: MPI_Bcast

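To make the Figure 2 primitives concrete, the short C program below (not part of the original article, but using standard MPI calls) sends a blocking message from one process to another with MPI_Send and MPI_Recv, then broadcasts a value to all processes with MPI_Bcast. It would typically be compiled with an MPI compiler wrapper such as mpicc and launched across the cluster with the implementation's job launcher.

    /* Minimal illustration of the MPI 1.2 primitives listed in Figure 2. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, token = 0, value = 0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's identifier */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

        /* Point-to-point: process 0 sends a blocking message to process 1 */
        if (rank == 0 && size > 1) {
            token = 99;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("Process 1 received token %d from process 0\n", token);
        }

        /* Collective: process 0 broadcasts the same value to every process */
        if (rank == 0)
            value = 42;
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("Process %d holds broadcast value %d\n", rank, value);

        MPI_Finalize();
        return 0;
    }
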
MPI 2.0 standard specification

Released after MPI 1.2 had been widely accepted, MPI 2.0 made major changes to the MPI 1.2 specification. Some of the most significant enhancements offered by MPI 2.0 include the following:

Dynamic process management: Process management allows processes to be dynamically added and deleted. MPI 2.0 supports dynamic process management because many emerging message-passing applications (such as applications that require runtime assessment of the number and type of processes needed) require process control. By contrast, MPI 1.2 based applications are static; that is, no processes can be added to or deleted from an application after the application has been started.

One-sided communication operations: MPI 2.0 provides support for one-sided communication operations such as put and get. The put operation transfers data directly from the sender node's memory to the receiver, or target, node's memory; the get operation transfers data from the target node's memory to the caller node's memory.

Other enhancements relate to extending collective communication operations and defining new nonblocking operations.[4]

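The following hedged C sketch illustrates the put operation described above using the standard MPI 2.0 one-sided calls: each process exposes a small window of its memory, and process 0 writes directly into process 1's window with MPI_Put, with no matching receive on the target. It is a generic example of the MPI 2.0 API rather than code from any particular implementation discussed in this article.

    /* One-sided (put) data transfer using MPI 2.0 window operations. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, local = 0, value = 123;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each process exposes one integer as a remotely accessible window */
        MPI_Win_create(&local, sizeof(int), sizeof(int),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);                /* open the access epoch */
        if (rank == 0 && size > 1) {
            /* Write the value directly into the target (process 1) memory */
            MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        }
        MPI_Win_fence(0, win);                /* complete the epoch */

        if (rank == 1)
            printf("Process 1's window now contains %d\n", local);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }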

Open source MPI implementations

MPICH,[5] which is currently maintained by Argonne National Laboratory and Mississippi State University, is a freely available, portable implementation of MPI. The development of MPICH began in parallel with the development of the MPI specification so that the specification could address problems that would be faced by its implementers. Thus, a complete, portable, and efficient MPICH implementation was available when the MPI specification was formally released, allowing developers of parallel-processing applications to experiment with MPI almost immediately.

MPICH functionality

MPICH was designed with the following goals:

Maximum portability and reuse of code: In any implementation, a large amount of code is system independent. MPICH was designed to allow complex communication operations to be specified portably in terms of low-level primitives. The developers' intention was to maximize the amount of code that can be shared without compromising performance.

Fast porting to new architectures: Another design goal was to create a structure whereby MPICH could be ported to a new platform quickly and then gradually tuned for that platform by replacing parts of the shared code with platform-specific code.

To achieve these goals, the MPICH implementation follows a layered architecture, as shown in Figure 3.

Figure 3. MPICH layered architecture (layers, from top to bottom: MPI collective operations; MPI point-to-point communications; the abstract device interface, or ADI; the channel interface; and implementations of the ADI and channel interface)

At the top level of the hierarchy are primitives for the MPI collective operations. An example of a collective operation is broadcast (MPI_Bcast), wherein one node (the source) can send the same data to multiple nodes within a group of nodes. These collective operations are implemented in MPICH by calling MPI point-to-point primitives such as send (MPI_Send) and receive (MPI_Recv). These point-to-point primitives call various other functions specified at lower levels of the hierarchy to carry out the actual sending and receiving using the communication protocol.

One of the lowest layers in the architecture is the abstract device interface (ADI), which is a mechanism designed to help achieve the goals of portability and performance. The ADI contains the communication protocol dependent code. All the MPI functions are implemented using the functions and macros defined at the ADI layer. Hence, functions defined at levels higher than the ADI layer are portable. Having multiple implementations of the ADI helps provide portability and ease of implementation.

Below the ADI layer is an additional low-level layer called the channel interface. The channel interface is designed to provide a mechanism to quickly port MPICH to new environments. The channel interface comprises functions that provide the basic capability of sending data from one process to another.

MPICH thus offers an incremental approach to trading portability for performance. A vendor can start the porting process by creating a channel interface implementation. The implementation can then be expanded to include additional, specialized ADI functionality. Moving upward in the MPICH architecture hierarchy increases the performance benefits of the implementation but decreases the portability of the same code for future implementations.

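As a simplified sketch of this layering idea, and assuming nothing about MPICH's actual internal code, the C function below expresses a broadcast purely in terms of the point-to-point primitives MPI_Send and MPI_Recv; production implementations instead use tree-based algorithms and route the data movement through the ADI and channel layers.

    /* A naive broadcast built only from point-to-point calls (illustrative,
     * not MPICH source): the root sends the buffer to every other process. */
    #include <mpi.h>

    static int naive_bcast(void *buf, int count, MPI_Datatype type,
                           int root, MPI_Comm comm)
    {
        int rank, size, i;
        MPI_Status status;

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        if (rank == root) {
            for (i = 0; i < size; i++)
                if (i != root)
                    MPI_Send(buf, count, type, i, 0, comm);
        } else {
            MPI_Recv(buf, count, type, root, 0, comm, &status);
        }
        return MPI_SUCCESS;
    }

Keeping such protocol-independent logic in the shared upper layers, and replacing only the lowest layers, is what allows interconnect-specific ports of MPICH, such as the MPICH-GM and MVAPICH variations described below, to reuse the code above the ADI.
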
The current releases of MPICH are based on the MPI 1.2 standard. MPICH2,[6] which is now under development, is an all-new implementation of MPI that is intended to support research into the implementation of both MPI 1.2 and MPI 2.0.

MPICH variations

MPICH has been widely adopted by various vendors and has been the basis for MPI-related research projects at various universities and research institutions. The following sections discuss popular MPICH variations adapted for commonly used high-speed interconnects.

MPICH-GM (MPI on Myrinet). Myricom Myrinet[7] is a high-speed, low-latency, high-bandwidth interconnect used in HPC clusters. The GM[8] protocol is a low-level message-passing communication protocol designed for Myrinet networks. Myrinet is theoretically capable of providing unidirectional throughput of up to 2 Gbps with low latency. Low latency is critical for communication-intensive applications because less time is spent on communication overhead, leaving more time for computation. This high performance is achievable on Myrinet networks because GM is a user-level protocol, which bypasses the OS while sending and receiving messages once the initial connection has been established.

MPICH-GM,[9] a port of MPICH on top of GM, is the MPI implementation for GM-based Myrinet networks. The port is accomplished by creating a new GM device at the ADI and channel interface levels of MPICH. In this way, MPICH-GM offers a portable, efficient implementation of MPI that applications can use to take advantage of the performance offered by the low-level Myrinet hardware and the GM protocol. MPICH-GM works on a variety of operating systems, including Linux, Solaris, FreeBSD, and Mac OS X, is supported on many architectures, including IA-32 and IA-64, and is fully supported by Myricom.

MVAPICH (MPI on InfiniBand). The InfiniBand architecture is a standard that defines a high-speed network for interprocess communication and storage I/O nodes. Its low-latency, high-bandwidth capabilities and remote direct memory access (RDMA) features accelerate applications running in HPC and enterprise environments. MVAPICH is an open source MPI 1.2 implementation developed by The Ohio State University and is based on the Verbs API (VAPI) implementation by Mellanox Technologies. MVAPICH is a port of MPICH onto the VAPI layer; this port is carried out by creating a VAPI device at the ADI level of MPICH.

Other open source MPI implementations

In addition to the MPICH implementations, other implementations of the MPI standard exist. Local Area Multicomputer (LAM)[10] is an open source implementation of the MPI standard. LAM originated at the Ohio Supercomputing Center and is now maintained by the Open Systems Laboratory at Indiana University. Like other MPI implementations, LAM/MPI provides high performance on many platforms, even on heterogeneous clusters of workstations.

Commercial MPI implementations

Commercial MPI implementations are available for a wide range of hardware and are produced by many vendors. The following sections briefly discuss some of the popular commercial MPI implementations that enterprises can run on Dell HPC clusters.

Verari MPI/Pro. MPI/Pro[11] is a proprietary, commercially supported MPI 1.2 implementation developed by Verari Systems. It is one of the most popular commercial implementations and is supported on both Microsoft Windows and Red Hat Linux operating systems. MPI/Pro features include low CPU overhead and thread safety. The implementation supports TCP, symmetric multiprocessing (SMP), and Myrinet and InfiniBand drivers for Windows and Linux.

Verari ChaMPIon/Pro. ChaMPIon/Pro[12] is a full MPI 2.0 implementation available for Linux. ChaMPIon/Pro supports Myrinet, InfiniBand, and Quadrics network interconnects as well as TCP/IP. This MPI implementation supports major MPI 2.0 enhancements, including extended collective operations, dynamic process management, and one-sided communication APIs.

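The dynamic process management capability mentioned for ChaMPIon/Pro corresponds to MPI 2.0 calls such as MPI_Comm_spawn, which let a running job start additional processes at run time. The sketch below is a generic, hedged example of that API, not vendor-specific code; the "worker" executable name is hypothetical and used only for illustration.

    /* Spawning additional MPI processes at run time (MPI 2.0). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        MPI_Comm workers;      /* intercommunicator to the spawned processes */
        int errcodes[4];

        MPI_Init(&argc, &argv);

        /* Start 4 new processes running a hypothetical "worker" binary;
         * rank 0 of MPI_COMM_WORLD acts as the root of the spawn. */
        MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &workers, errcodes);

        printf("Spawned 4 worker processes\n");
        /* The parent and the workers can now exchange messages over the
         * "workers" intercommunicator, for example with MPI_Send/MPI_Recv. */

        MPI_Comm_free(&workers);
        MPI_Finalize();
        return 0;
    }
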
Scali MPI Connect. Scali offers an MPI implementation called Scali MPI Connect.[13] Scali's integrated architecture enables third-party applications to be compiled once to run on the various leading interconnect technologies. The implementation is designed to allow binary programs that are linked with Scali MPI Connect to run on any of the supported interconnects, Gigabit Ethernet, Myrinet, Dolphin Interconnect scalable coherent interface (SCI), or InfiniBand, without recompilation or relinking. Whether the cluster is built using one of these interconnects or a combination thereof, applications and users interact only with Scali MPI Connect.

Middleware: A key component for HPC cluster performance

Middleware implementations, both commercial and open source, are significant components in HPC cluster configurations. The widely accepted MPI standard has enabled a diverse set of implementations that are designed to enhance performance and ease the development and porting of parallel-processing applications in distributed computing infrastructures.

Rinku Gupta is a systems engineer and advisor in the Scalable Systems Group at Dell. Her current research interests are middleware libraries, parallel processing, performance, and interconnect benchmarking. Rinku has a B.E. in Computer Engineering from Mumbai University in India and an M.S. in Computer Information Science from The Ohio State University.

Monica Kashyap is a senior systems engineer in the Scalable Systems Group at Dell. Her current interests and responsibilities include in-band and out-of-band cluster management, cluster computing packages, and product development. She has a B.S. in Applied Science and Computer Engineering from the University of North Carolina at Chapel Hill.

Yung-Chin Fang is a senior consultant in the Scalable Systems Group at Dell. He specializes in cyberinfrastructure resource management and high-performance computing. He also participates in open source groups and standards organizations as a Dell representative. Yung-Chin has a B.S. in Computer Science from Tamkang University and an M.S. in Computer Science from Utah State University.

Saeed Iqbal, Ph.D., is a systems engineer and advisor in the Scalable Systems Group at Dell. His current work involves evaluation of resource managers and job schedulers used for commodity clusters. Saeed is also involved in performance analysis and system design of clusters. He has a Ph.D. in Computer Engineering from The University of Texas at Austin, and an M.S. in Computer Engineering and a B.S. in Electrical Engineering from the University of Engineering and Technology in Lahore, Pakistan.

Notes

[1] This term does not connote an actual operating speed of 1 Gbps. For high-speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.
[2] For more information about PVM, visit
[3] For more information about the MPI standard, visit www-unix.mcs.anl.gov/mpi.
[4] A detailed discussion of the MPI specifications is beyond the scope of this article. For more information, refer to the MPI specifications at www-unix.mcs.anl.gov/mpi.
[5] For MPICH papers and implementation details, visit www-unix.mcs.anl.gov/mpi/mpich.
[6] For more information about MPICH2, visit www-unix.mcs.anl.gov/mpi/mpich2.
[7, 8, 9] For more information about Myrinet, GM, and MPICH-GM, visit
[10] For more information about LAM/MPI, visit
[11, 12] For more information about Verari Systems MPI/Pro and ChaMPIon/Pro, visit
[13] For more information about Scali MPI Connect, visit

Reprinted from Dell Power Solutions, February 2005. Copyright 2005 Dell Inc. All rights reserved.
