Bottleneck Detection in Parallel File Systems with Trace-Based Performance Monitoring



Julian M. Kunkel - Euro-Par 2008 1/33
Bottleneck Detection in Parallel File Systems with Trace-Based Performance Monitoring
Julian M. Kunkel, Thomas Ludwig
Institute for Computer Science, Parallel and Distributed Systems, Ruprecht-Karls-Universität Heidelberg
Euro-Par 2008

Julian M. Kunkel - Euro-Par 2008 2/33 Outline 1 Introduction 2 System overview 3 Performance Metrics and Statistics 4 Evaluation 5 Summary

Julian M. Kunkel - Euro-Par 2008 4/33 Motivation
- A parallel file system should utilize all resources
- Existing distributed/parallel file systems distribute data/metadata
- Load imbalance leads to degraded performance
- Several factors lead to variation in I/O performance:
  - Hardware capability
  - Access pattern of clients
  - Degraded RAID arrays
- Efficiency of optimizations could vary
- Throughput of a component could depend on the order of requests

Julian M. Kunkel - Euro-Par 2008 5/33 Questions Regarding Load Imbalance
- How can the system or its users detect load imbalance?
- Which hardware/software causes the load imbalance?
- Is the application's access pattern the reason for the load imbalance?
- How can the user figure out the application behavior leading to the imbalance?

Julian M. Kunkel - Euro-Par 2008 6/33 Precondition
It is important to monitor the system's behavior.
Proposed solution:
- Integrate meaningful metrics into the parallel file system
- Allow these metrics to be queried online
- Visualize the metrics for the users and relate the behavior to the application
Long-term goal:
- Maybe the user can modify the code to increase the efficiency of the system:
  - Access pattern
  - Data layout on the servers
  - Give hints to the file system
- Maybe the file system can rebalance access

Julian M. Kunkel - Euro-Par 2008 8/33 PVFS Architecture
[Architecture diagram: the client runs the application on top of a user-level interface (MPI-IO, kernel VFS, ...) and the system interface with acache/ncache, job flow, and BMI; requests travel over the network to the server, which consists of Main, Scheduler, Job flow, BMI, and Trove (disk), with an attached performance monitor.]

Julian M. Kunkel - Euro-Par 2008 9/33 PVFS-Server Architecture
Server layers: Main, Scheduler, Job flow, BMI, Trove (disk), and the performance monitor.
Description of the layers/components:
- Main: accepts new requests; contains/processes state machines
- Scheduler: allows concurrent access of non-interfering requests
- Performance monitor:
  - Orthogonal layer
  - Stores statistics of the internal layers
  - Can be queried remotely
  - Updated in fixed intervals
  - Keeps a history of the statistics
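The interval-based history that the performance monitor keeps can be pictured as a small ring buffer: one row of counters per interval, overwriting the oldest row once full. This is a minimal C sketch, not the actual PVFS code; all names (pm_history, pm_update, pm_query) are illustrative.

```c
#include <assert.h>
#include <string.h>

#define PM_HISTORY 8  /* number of intervals kept in the history */
#define PM_METRICS 4  /* e.g. request load, flow load, I/O load, idle time */

typedef struct {
    double history[PM_HISTORY][PM_METRICS]; /* one row per interval */
    int    head;    /* index of the most recent interval */
    int    filled;  /* number of valid rows (<= PM_HISTORY) */
} pm_history;

/* Called once per interval: store the current counter snapshot,
 * overwriting the oldest row once the buffer is full. */
static void pm_update(pm_history *pm, const double sample[PM_METRICS])
{
    pm->head = (pm->head + 1) % PM_HISTORY;
    memcpy(pm->history[pm->head], sample, sizeof(double) * PM_METRICS);
    if (pm->filled < PM_HISTORY)
        pm->filled++;
}

/* A remote query copies out the n most recent intervals, newest first,
 * and returns how many were actually available. */
static int pm_query(const pm_history *pm, int n, double out[][PM_METRICS])
{
    if (n > pm->filled)
        n = pm->filled;
    for (int i = 0; i < n; i++) {
        int idx = (pm->head - i + PM_HISTORY) % PM_HISTORY;
        memcpy(out[i], pm->history[idx], sizeof(double) * PM_METRICS);
    }
    return n;
}
```

A bounded history like this keeps the monitor's memory footprint constant while still letting a remote client reconstruct the recent behavior of each layer.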

Julian M. Kunkel - Euro-Par 2008 10/33 Used Software Environment
[Stack diagram: MPICH-2 with MPE2 wrappers and the Slog2SDK; ROMIO with its ADIO layer (ad_nfs, ad_pvfs2, ad_...); Jumpshot for visualization.]

Julian M. Kunkel - Euro-Par 2008 11/33 PIOviz
- Attempt to visualize client and server activities together
- Developed at the University of Heidelberg
- Uses MPE to generate client and server traces
- Adds a unique ID to PVFS requests in ROMIO
- Provides tools to work on these trace files:
  - Merge
  - Adjust time
  - Correlate client and server activities
- Modifications to Jumpshot:
  - Provide more details on events
  - Render state heights proportional to their value
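The correlation step can be pictured as matching trace events through the unique request ID injected in ROMIO. A minimal sketch with made-up structures, not the actual PIOviz code:

```c
#include <assert.h>

/* Illustrative trace-event record: a request ID plus a time span. */
typedef struct {
    long   req_id;      /* unique ID carried by the PVFS request */
    double start, end;  /* event time span in seconds */
} trace_event;

/* Find the server event carrying the same request ID as a client
 * event. Returns its index, or -1 if the ID never reached the server. */
static int correlate(const trace_event *server_ev, int n, long client_req_id)
{
    for (int i = 0; i < n; i++)
        if (server_ev[i].req_id == client_req_id)
            return i;
    return -1;
}
```

With such a mapping, a client-side MPI-IO call can be drawn directly above the server-side activity it triggered, which is what makes the merged Jumpshot view useful.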

Julian M. Kunkel - Euro-Par 2008 13/33 Problem It is not easy to define meaningful metrics!

Julian M. Kunkel - Euro-Par 2008 14/33 Absolute metrics
- Measure observable usage or performance, e.g. throughput of network or disk, processed requests (of a given type), number of bytes read/written, ...
- Is a value of 50 MiB/s a high value for network throughput?
  - One could relate the value to the maximum network throughput
  - A good value for a server if clients access data only half the time
  - A bad value if the server does not manage to process all requests
- Is there congestion in the network?
- Are servers not utilized by client requests?
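Relating an absolute value to the component's capacity, as suggested above, is a one-line computation; the capacities from the test environment later in the talk (117 MiB/s Ethernet, 45 MiB/s disk) show why the same 50 MiB/s reads differently per component. A hedged sketch, with utilization as an illustrative name:

```c
#include <assert.h>

/* Fraction of a component's capacity used in the last interval.
 * 50 MiB/s is about 43 % of a 117 MiB/s Ethernet link, but would
 * exceed the 45 MiB/s a single disk sustains sequentially. */
static double utilization(double observed_mib_s, double capacity_mib_s)
{
    if (capacity_mib_s <= 0.0)
        return 0.0;
    return observed_mib_s / capacity_mib_s;
}
```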

Julian M. Kunkel - Euro-Par 2008 15/33 Relative metrics
- Relate usage/performance to the actual demand
- Idle time of a component in percent within an interval
  - 0 % - Wasted processing capability
  - 0 % - Does the component benefit from concurrent operations?
- Average number of pending jobs within an interval (load-index)
  - The more complex a job, the longer it is processed
  - The faster a component operates, the shorter the queue
  - e.g. Linux kernel 60-second system load
  - e.g. drive queue depth
  - Does the component share its resources among pending operations?
  - Is only a subset of operations serviced at a given time?
  - What if a set of long-running jobs is serviced before short jobs? ...
- Expected behavior of a load-index:
  - Load should be accurate for long-running jobs
  - Load average over longer periods should be accurate (even if there are short jobs)
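The load-index described above, the average number of pending jobs within an interval, can be computed as a time-weighted mean over queue-length changes. This is a sketch under the assumption that every change of the queue depth is logged with a timestamp; the structure names are illustrative, not PVFS internals.

```c
#include <assert.h>

/* One logged change of the request queue: from 'time' on,
 * 'pending' jobs are waiting. */
typedef struct {
    double time;     /* seconds since the start of the interval */
    int    pending;  /* queue depth from this point on */
} queue_event;

/* Load-index: time-weighted average queue depth over [0, interval_end).
 * 'ev' must be sorted by time, with ev[0].time == 0. Each queue depth
 * is weighted by how long it was held. */
static double load_index(const queue_event *ev, int n, double interval_end)
{
    double area = 0.0;

    for (int i = 0; i < n; i++) {
        double until = (i + 1 < n) ? ev[i + 1].time : interval_end;
        area += ev[i].pending * (until - ev[i].time);
    }
    return area / interval_end;
}
```

Because each depth is weighted by its duration, a long-running job contributes to the index for its whole service time, which matches the expected behavior listed above.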

Julian M. Kunkel - Euro-Par 2008 16/33 Added Statistics (1)
- We added plenty of new statistics to PVFS
- Relation of several statistics could reveal the component causing imbalance
Internal PVFS statistics:
- Request average load-index
- Flow average load-index
- I/O subsystem average load-index
- I/O subsystem idle time [percent]
- Network average load-index

Julian M. Kunkel - Euro-Par 2008 17/33 Added Statistics/Metrics (2)
Kernel statistics:
- Average kernel load for one minute
- Memory used for I/O caches [Bytes]
- CPU usage [percent]
- Data received from the network [Bytes]
- Data sent to the network [Bytes]
- Data read from the I/O subsystem [Bytes]
- Data written by the I/O subsystem [Bytes]
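On Linux, several of these kernel statistics are exposed under /proc; the one-minute load average, for instance, is the first field of /proc/loadavg. A small sketch (the function name is illustrative; how PVFS actually samples these values is not shown in the slides):

```c
#include <assert.h>
#include <stdio.h>

/* Read the one-minute load average, the first field of /proc/loadavg.
 * Returns -1.0 if the file cannot be read (e.g. non-Linux systems). */
static double kernel_load_1min(void)
{
    FILE *f = fopen("/proc/loadavg", "r");
    double load = -1.0;

    if (f != NULL) {
        if (fscanf(f, "%lf", &load) != 1)
            load = -1.0;
        fclose(f);
    }
    return load;
}
```

The byte counters for network and disk traffic are available in a similar fashion from /proc/net/dev and /proc/diskstats, which makes sampling them in the monitor's fixed intervals cheap.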

Julian M. Kunkel - Euro-Par 2008 19/33 Test Environment
10-node cluster, each node equipped with:
- Two Intel Xeon 2 GHz CPUs
- 1000 MiB memory
- 1000-MBit Ethernet (throughput: 117 MiB/s)
- IBM hard disk (sequential throughput: 45 MiB/s)
- RAID controller with two disks (90 MiB/s)
Test program:
- Allows selecting a level of access ((non-)contiguous, (non-)collective)
- Clients write a fixed amount of data
- Barrier
- Clients read (their) data
Setup: disjoint client and server partitions; five clients, one metadata server, and three data servers

Julian M. Kunkel - Euro-Par 2008 20/33 Experiments
- Independent contiguous I/O (each client accesses 200 x 10 MiB)
- Collective, non-contiguous I/O (each client accesses 4 x 500 MiB)
- Note: the size of ROMIO's collective I/O buffer is 4 MiB
- Both cases were also measured on an inhomogeneous I/O subsystem: one server uses its system disk instead of the RAID system

Julian M. Kunkel - Euro-Par 2008 21/33 Collective, non-contiguous I/O with homogeneous hardware

Julian M. Kunkel - Euro-Par 2008 23/33 Collective, non-contiguous I/O with inhomogeneous hardware

Julian M. Kunkel - Euro-Par 2008 25/33 Independent, contiguous I/O

Julian M. Kunkel - Euro-Par 2008 29/33 Independent, contiguous I/O with inhomogeneous hardware

Julian M. Kunkel - Euro-Par 2008 31/33 Server statistics per experiment (three values per cell: one per data server)

Experiment              | Trove idleness [%] | Trove load  | BMI load       | Request load
ind. contig. (lvl. 0)   | 7.8  5.1  5.5      | 2.1 2.1 2.1 | 36.7 17.9 20.6 |  5.4  9.7  9.8
ind. coll. (lvl. 1)     | 5.1  5.0  5.1      | 1.7 1.7 1.7 |  9.2  9.0  9.2 | 26.9 27.6 27.3
noncont. (lvl. 2)       | 8.7  8.8  8.2      | 2.2 2.3 2.2 | 54.4 54.9 49.9 |  3.4  2.1  4.4
noncont. coll. (lvl. 3) | 2.9  3.0  2.9      | 1.2 1.2 1.2 |  4.4  4.8  4.6 | 56.7 54.0 56.3
level 0 - inh. I/O      | 8.4  2.3  2.2      | 1.2 1.3 1.3 | 44.1  5.0  4.6 |  2.6 40.6 42.1
level 1 - inh. I/O      | 5.8  3.5  3.6      | 1.2 1.3 1.3 | 11.9  6.0  6.2 | 10.1 49.5 49.2
level 2 - inh. I/O      | 9.2  4.1  4.1      | 2.6 1.1 1.1 | 65.2 25.6 25.0 |  1.6 39.1 39.8
level 3 - inh. I/O      | 3.5  2.3  2.3      | 0.9 1.0 0.9 |  7.2  3.5  3.5 | 40.4 63.7 64.3

Julian M. Kunkel - Euro-Par 2008 33/33 Summary
- It is important to monitor the components' performance
- Relative metrics allow assessing delivered performance against the actual demand
- Introduced an environment which allows relating server statistics to (MPI) client activities
- Traces can assist the user in detecting inefficient MPI-I/O calls and their potential reasons
- Relating several metrics allows localizing the component causing the load imbalance
- More work is needed to assess the observed behavior