Storage benchmarking cookbook

Storage benchmarking cookbook: how to perform solid storage performance measurements. Stijn Eeckhaut; Stijn De Smet, Brecht Vermeulen, Piet Demeester.

The situation today: storage systems can be very complex. Example stack, from clients down to disks, with the data unit handled at each level:
Clients (local file system + I/O stack) -- files
IP network
File system cluster nodes -- I/O blocks
FC network (SAN)
Storage controllers -- logical segments
FC-AL loops (disk connection network)
Hard disks -- physical sectors
(Picture provided by Luc Andries, VRT)

Complexity can impede correct measurement of the storage system. We need a storage measurement methodology that guarantees realistic measurements, measurements that predict the production behavior of the storage system.

In this cookbook: a description of a number of storage peculiarities, and what a solid storage measurement should look like.

Storage peculiarities: individual disk throughput depends on the applied load.
Best case: 1 partition on the outer tracks of the device, 1 sequential access pattern (small disk head movement).
Worst case: 1 partition on the outer tracks + 1 partition on the inner tracks, with both partitions accessed simultaneously (maximum disk head movement).

Storage peculiarities: workarounds to speed up performance. Individual disks are slow and not reliable, with typical throughput of 5-70 MB/s. Workarounds to speed up storage performance: combining disks into RAID arrays, and caching on different system levels.

Storage peculiarities: lower maximum performance under higher load. Example: performance of 1 storage box. Test system: AMD Opteron CPU, Areca RAID controller (ARC1160), 12 SATA disks of 500 GB in RAID 6, xfs file system. All loads sequential.

Load                     Max READ [MB/s]   Max WRITE [MB/s]
1 READ                   311               -
1 WRITE                  -                 246
100 READS                89                -
100 WRITES               -                 79
100 READS + 10 WRITES    50                20
100 READS + 100 WRITES   27                26

What would you consider a solid storage measurement?
- We can reproduce the measurement.
- The applied test load reflects the real load of the system.
- We measure the right bottleneck: avoid measuring the cache (unless we want to), and avoid file copying.

What would you consider a solid storage measurement? Other criteria:
- Insert an analysis phase between subsequent measurements.
- Determine the deviation by performing each measurement more than once.
- Work bottom-up in order to know the efficiency of each layer.

How to perform a reproducible measurement. Sooner or later the questions come: "What value did we use for that parameter?", "I want to do an extra measurement", "What if we did ...?". (Diagram: the full stack again, from clients over the IP network, file system cluster nodes, FC network (SAN), storage controllers and FC-AL loops down to the hard disks.)

How to perform a reproducible measurement: record the relevant parameters at every layer of the stack.
- Client: client hardware, operating system, application parameters, transport protocol parameters.
- Server: server hardware, operating system, file system settings (caching, prefetching, redundancy, ...), transport protocol parameters.
- Network: network topology, network technology, network delay, network protocol parameters.
- Storage controllers: controller configuration, controller cache settings, RAID settings, LUN settings.
- Disks: number of disks, disk size, place of the partition on the disk, disk segment size, disk cache.

How to perform a reproducible measurement:
- Take the time to describe the System Under Test.
- Describe the test, or automate the test with a test script.
- Collect the relevant system parameters: take disk images, keep config files, and save the output of Linux monitoring tools (dmesg, sysctl, ifconfig, ethtool, lspci, netstat, the /proc directory, ...); a collection script is sketched below.
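
As an illustration, a minimal collection script covering the tools named above; the output directory layout and the interface name are our own assumptions, not part of the original cookbook.

#!/bin/sh
# Snapshot the System Under Test configuration so the measurement
# can be reproduced and re-analyzed later.
OUT="sut-$(hostname)-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUT"
dmesg        > "$OUT/dmesg.txt"
sysctl -a    > "$OUT/sysctl.txt" 2>/dev/null
ifconfig -a  > "$OUT/ifconfig.txt"
ethtool eth0 > "$OUT/ethtool-eth0.txt" 2>/dev/null   # example interface
lspci -v     > "$OUT/lspci.txt"
netstat -s   > "$OUT/netstat.txt"
cp /proc/cpuinfo /proc/meminfo /proc/mounts "$OUT/"  # relevant /proc entries
tar czf "$OUT.tar.gz" "$OUT"                         # archive next to the results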

Measurement iterations make reproducibility more difficult. Observation: multiple iterations of the measurement/analysis/simulation loop are often needed, and it is difficult to know all test parameter values in advance, so keep the data of all relevant iterations.

What would you consider a solid storage measurement?
- We can reproduce the measurement.
- The applied test load reflects the real load of the system.
- We measure the right bottleneck: avoid measuring the cache (unless we want to), and avoid file copying.

How to choose your test load. Do you want to test the storage performance of a specific application, or run a standard storage benchmark?
- A test load that resembles a specific application: what are your application's characteristics?
- The test load of a standard storage benchmark: e.g. to compare vendors without a specific application in mind (e.g. the SPC storage benchmarks).

What are your application's characteristics?
- What is its storage access pattern? Sequential or random access, read/write ratio, temporal and spatial locality of the storage access requests, number of simultaneous access requests.
- What is its requested performance? Needed throughput, latency sensitivity.
- Is it used together with other applications? The real load consists of a mix of applications, with concurrent sharing of data. (A sketch of turning such characteristics into a test load follows.)
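
As a hedged illustration of mapping these characteristics onto a load generator, an iozone invocation (iozone appears in the tools overview later in this cookbook); the record size, file size, thread count and file paths are assumptions chosen for the example:

# 4 concurrent streams (simultaneous access requests), sequential write
# then read (-i 0, -i 1), 128 KB requests, 2 GB per file.
iozone -t 4 -i 0 -i 1 -r 128k -s 2g \
       -F /mnt/test/f1 /mnt/test/f2 /mnt/test/f3 /mnt/test/f4
# Add -i 2 for a random read/write phase if the application is random-access.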

Storage benchmarks like SPC-1 and SPC-2 try to standardize storage system evaluation. The Storage Performance Council (SPC) defines industry-standard storage workloads and forces vendors to publish standardized performance figures for their storage systems. SPC-1 and SPC-2 evaluate complete storage systems; SPC-1C and SPC-2C (in development) evaluate storage subsystems, e.g. individual disk drives, HBAs, or storage software (e.g. LVM).

SPC-1 defines random I/O workloads; SPC-2 defines sequential I/O workloads.

                       SPC-1                              SPC-2
Typical applications   database operations, mail          large file processing, large
                       servers, OLTP                      database queries, video on demand
Workload               random I/O                         1 or more concurrent sequential I/Os
Workload variations    address request distribution:      transfer size, R/W ratio, number
                       uniform + sequential; R/W ratio    of outstanding I/O requests
Reported metrics       I/O rate (IOPS), total storage     data rate (MBPS), total storage
                       capacity, price-performance        capacity, price-performance

URL: www.storageperformance.org

What would you consider a solid storage measurement?
- We can reproduce the measurement.
- The applied test load reflects the real load of the system.
- We measure the right bottleneck: avoid measuring the cache (unless we want to), and avoid file copying.

Only measure the cache if you want to. Caching exists on multiple system levels. Hard disk cache: by default set to write-back mode on SATA disks and to write-through mode on SCSI disks (on the disks tested).

Min. sequential write throughput [MB/s]:
SATA disk without cache   43
SATA disk with cache      45
SCSI disk without cache    9
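
As a hedged aside, the on-disk write cache can be inspected and toggled on Linux; the device names below are assumptions:

# Query and set the write-cache flag of an (S)ATA disk with hdparm.
hdparm -W /dev/sda       # show current setting
hdparm -W0 /dev/sda      # disable write cache (write-through) before a run
hdparm -W1 /dev/sda      # re-enable write cache (write-back)
# For SCSI disks, the WCE bit of the caching mode page does the same (sdparm).
sdparm --get=WCE /dev/sdb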

Only measure the cache if you want to. Caching exists on multiple system levels. Disk controller cache: the (RAID) controller contains its own cache in front of the disks.

Only measure the cache if you want to: GPFS caching & prefetching. GPFS tries to recognize the access pattern (sequential, random, fuzzy sequential, strided) and, based on the detected access pattern, tries to prefetch data into its cache. GPFS also caches the inodes of recently used files. (Diagram: requested and prefetched blocks moving from the GPFS LUNs into the GPFS cache.)

Only measure the cache if you want to: NFS client-side caching. NFSv3 clients cache data locally. NFS provides close-to-open cache consistency, not POSIX semantics: reads may or may not return the last data written. (Diagram: a cache on each NFSv3 client in front of the NFSv3 server.)

If you don't want to measure the cache:
- Use a large data set.
- Allocate the buffer cache before the measurement, e.g. with a small C program, and disable swap.
- Clear the cache between measurements: restart the GPFS file system or the NFS server, remount the file system. (A sketch of these steps on Linux follows.)
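
A minimal sketch of the swap and cache-clearing steps on a Linux client, assuming a 2.6.16+ kernel (for the drop_caches interface); the mount point and init-script name are assumptions:

# Disable swap so paging cannot distort the measurement.
swapoff -a
# Between measurements: flush dirty pages, then drop page cache,
# dentries and inodes.
sync
echo 3 > /proc/sys/vm/drop_caches
# Remount the file system under test to discard its cached state.
umount /mnt/test && mount /mnt/test
# For NFS tests, restarting the server clears its state as well.
/etc/init.d/nfs-kernel-server restart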

Only measure the cache if you want to: measure the link transfer speed, not the writing speed to the socket buffer. Example: measuring a 1 Gbps link with the iperf tool. Data is first written to the socket buffer and only then sent on the link; iperf reports the write transfer speed to the buffer, so the reported transfer speed can exceed 1 Gbps. Remedy: also check with link monitoring tools. Socket buffer size parameters: /proc/sys/net/core/rmem_max and /proc/sys/net/core/wmem_max.
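
A hedged sketch of such a cross-check (iperf 2.x syntax; host and interface names are assumptions): compare iperf's application-level figure with the kernel's interface counters.

# Server side:
iperf -s
# Client side: 60-second TCP test, intermediate reports every 5 seconds.
iperf -c server.example.com -t 60 -i 5
# Cross-check against the bytes the interface actually transmitted:
cat /sys/class/net/eth0/statistics/tx_bytes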

Measure your system bottom-up. This facilitates the efficiency assessment of each layer: it becomes easier to determine the influence of parameter variations in each layer and to compare against the subsystem performance. Example, NFS protocol: Ethernet, then TCP/IP, then NFS (using NFS loopback, the bare file system and a RAM disk as intermediate steps). Example, FTP server: hard disk, then RAID, then file system, then the FTP application.

Tools, benchmarks and appliances for the different system layers:

Application layer
- Load generator/benchmark: a real application (FTP, NFS client, ...), SPC (sequential/random R/W), SPECsfs2008 (CIFS, NFS), DVDstore (SQL), TPC (transactions), Avalanche appliance (application-layer network testing).
- Monitor: top, dstat, wireshark/ethereal.

Network layer
- Load generator/benchmark: iperf (TCP/UDP bandwidth), SmartBits appliance (network infrastructure testing).
- Monitor: Optiview link analyzer, dstat.

Filesystem layer
- Load generator/benchmark: dd, iozone (file operations).
- Monitor: dstat (resource statistics).

Device layer
- Load generator/benchmark: dd (sequential R/W), iometer (random/sequential R/W), diskspeed32, hdtune, hdtach, zcav, or your own tool (e.g. written in C).
- Monitor: iostat, vmstat, the Linux /proc directory.

Example: monitoring the network layer with Optiview. (Diagram: an Optiview Tap on the link feeds an Optiview Link Analyzer with a 256 MB buffer, whose captures are examined with Optiview Protocol Expert.)

Use memory-to-memory transfers to measure network protocol performance, so that there is no disk access on either side. Example: measure TCP/UDP performance with iperf, memory to memory. Example: measure NFS performance by first using a server with an NFS-exported RAM disk, and only then replacing the RAM disk with the real storage. (Diagram: NFSv3 clients talking to an NFSv3 server backed by a RAM disk; a sketch of such a setup follows.)
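
A hedged sketch of the RAM-disk setup on a Linux NFSv3 server; the tmpfs size, export path and client subnet are assumptions:

# Server: create a RAM-backed file system and export it over NFS.
mkdir -p /export/ramdisk
mount -t tmpfs -o size=2g tmpfs /export/ramdisk
echo '/export/ramdisk 192.168.1.0/24(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
# Client: mount it explicitly as NFSv3.
mount -t nfs -o vers=3 server:/export/ramdisk /mnt/ramdisk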

Avoid using file copy commands. Use the special Linux devices so that only one side of the transfer touches the disk:
- Sequential write to storage with the dd tool: dd if=/dev/zero of=outputfile bs=1M count=1048576
- Sequential read from storage with the dd tool: dd if=inputfile of=/dev/null bs=1M
- /dev/urandom generates random contents when you read from it; it may load the CPU, however.
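
A hedged addition beyond the original slide: GNU dd can also bypass the client page cache with O_DIRECT, which complements the cache-avoidance advice above; the flag requires a file system and kernel that support direct I/O.

# Sequential write that bypasses the page cache (GNU coreutils dd).
dd if=/dev/zero of=outputfile bs=1M count=1024 oflag=direct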

Monitor all CPUs/cores: not all cores may be equally loaded. Example output from the dstat tool on a 4-core machine (per-core usage plus the machine-wide total):

--cpu0-- ---------cpu1---------- --cpu2-- --cpu3-- -------total-cpu--------
     idl usr sys idl wai hiq siq      idl      idl usr sys idl wai hiq siq
     100 100   0   0   0   0   0      100      100  25   0  75   0   0   0
     100 100   0   0   0   0   0      100      100  25   0  75   0   0   0
     100 100   0   0   0   0   0      100      100  25   0  75   0   0   0

One CPU is 100% loaded, yet the machine-wide average suggests the CPUs are only 25% loaded.
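
A hedged sketch of a dstat invocation that produces this kind of per-core view (the core list assumes a 4-core machine):

# Per-core plus total CPU statistics, one sample per second.
dstat -c -C 0,1,2,3,total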

CPU states: iowait time is idle time. The CPU iowait ('wai') state is the amount of time the CPU has been waiting for I/O to complete. A CPU is only the bottleneck if idl = 0% and wai = 0%; if wai > 0%, extra calculations can still be executed on the CPU. Example output from the dstat tool:

-------cpu-usage------- --disk/total--
usr sys idl wai hiq siq  read   write
  0  35   0  59   0   6     0    159M
  0  34   0  60   0   5 2458B    157M
  0  33   0  62   0   5     0    151M
  0  32   0  63   0   5 4096B    142M
  0  33   0  62   0   5     0    150M

CPU states: usr = user CPU time, sys = system CPU time, idl = idle CPU time, ni = nice CPU time, wai = iowait time, hiq = hardware IRQ servicing time, siq = software IRQ servicing time.

What does virtualization change in the storage measurement methodology? Xen: monitor in all relevant domains. (Diagram: an application on a guest OS in a domU uses the Xen driver, which connects over an event channel to the original driver in dom0; the Xen hypervisor sits between the domains and the physical device.)

Monitoring tools for Xen: monitor the domains with xentop and virt-top (CPU, memory, network).
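
A hedged sketch of logging xentop output for later analysis (batch-mode flags; the sample count is our choice):

# In dom0: sample all domains once per second, 60 samples, batch mode.
xentop -b -d 1 -i 60 > xentop.log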

What would you consider a solid storage measurement?
- We can reproduce the measurement.
- The applied test load reflects the real load of the system: know your application's storage access pattern.
- We measure the right bottleneck: avoid caching and file copying, measure bottom-up, and monitor the resources.

Storage benchmarking cookbook. With acknowledgement to the team members of the IBBT FIPA and GEISHA projects: http://www.ibbt.be/en/project/fipa and http://www.ibbt.be/en/project/geisha. Stijn Eeckhaut; Stijn De Smet, Brecht Vermeulen, Piet Demeester.