The IntelliMagic White Paper: Storage Performance Analysis for an IBM Storwize V7000




Summary: This document describes how to analyze performance on an IBM Storwize V7000.

This white paper was prepared by:

IntelliMagic B.V.
Leiden, The Netherlands
Phone: +31 71 579 6000

IntelliMagic, Inc.
Texas, USA
Phone: +1 214 432 7920

Email: info@intellimagic.net
Web: www.intellimagic.net

Disclaimer
This document discusses storage performance analysis for IBM Storwize V7000 storage systems. IntelliMagic products can be used to support all phases of the storage performance management processes. Appropriate usage and interpretation of the results of IntelliMagic products are the responsibility of the user.

Support
Please direct support requests to support@intellimagic.net. Please direct requests for general information to info@intellimagic.net.

Trademarks
All trademarks and registered trademarks are the property of their respective owners.

© 2012 IntelliMagic B.V.

Table of Contents

Section 1. Introduction
  1.1 I/O Path
Section 2. IBM Storwize V7000 Architectural Overview and Measurements
  2.1 IBM Storwize V7000 Architecture Overview
      Measurement Overview
  2.2 I/O Group
      Measurements
  2.3 Nodes
      Measurements
      Cache
      Cache Measurements
  2.4 Ports
  2.5 Internal RAID Groups or External Managed Disks
      Managed Disk Measurements
  2.6 Internal Disk
      Disk Measurements
  2.7 Storage Pool
      Storage Pool Measurements
  2.8 Volume
      Measurements
  2.9 Thin Provisioning
  2.10 Easy Tier
Section 3. Conclusion
Additional Resources

Section 1. Introduction

The purpose of this paper is to provide a practical guide for conducting performance analysis on an IBM Storwize V7000. This paper discusses the end-to-end I/O path, the Storwize V7000 architecture, and key Storwize V7000 measurements. In addition, it provides guidance on diagnosing and resolving performance issues using IntelliMagic products such as IntelliMagic Vision and IntelliMagic Direction. IntelliMagic Vision and IntelliMagic Direction are part of the IntelliMagic Storage Performance Management Suite. For additional information on these software products, please refer to http://www.intellimagic.net/intellimagic/products.

1.1 I/O Path

Figure 1: End-to-End View illustrates how I/Os traverse the I/O path. At its simplest level, any host-initiated I/O request is either a read or a write. The host device driver instructs the host bus adapter (HBA) to initiate communication with the IBM Storwize V7000 fibre channel ports. The connectivity equipment, such as the SAN switches and directors, confirms access and forwards the packet to the destination fibre ports on the IBM Storwize V7000. If the data requested by the host resides within the V7000 cache, the data is sent back across the fabric to the host. If the data requested by the host does not reside within the IBM Storwize V7000's cache, the IBM Storwize V7000 requests it from its local disks or from the back-end storage array associated with the volume. The read and write paths for different types of I/O are discussed in more detail in the cache section.

Figure 1: End-to-End View

Section 2. IBM Storwize V7000 Architectural Overview and Measurements

This section contains a brief overview of the IBM Storwize V7000 components and their relevant measurements. For an in-depth discussion of the IBM Storwize V7000 architecture and performance considerations, see the references in the Additional Resources section of this paper.

2.1 IBM Storwize V7000 Architecture Overview

The IBM Storwize V7000 is a storage system residing in the I/O path between the host and the back-end storage, as illustrated in Figure 1: End-to-End View. It leverages software from the IBM SAN Volume Controller (SVC), IBM XIV, and IBM DS8000 series to provide block-level storage virtualization and automated storage tiering for either internally or externally managed drives. Typically the IBM Storwize V7000 includes internal disk drives, but in some environments it may also be used to virtualize external storage. The IBM Storwize V7000 can support up to 32 PB of externally managed storage. The advantage of virtualizing the back-end storage arrays is that it facilitates centralized migrations, provisioning, and replication. The virtualization and pooling of traditional storage may also lead to improved capacity utilization, as well as performance improvements, because I/O can easily be balanced across the back-end resources.

An IBM Storwize V7000 consists of two or four hardware components called nodes or node canisters. Each pair of nodes is known as an I/O group, and it may contain up to twenty disk storage enclosures. The I/O group consists of two redundant, clustered nodes. At the time this document was published, each node consisted of an Intel chipset, 8 GB of memory, four 8 Gbps fibre channel ports, two Ethernet ports, two USB 2.0 ports, two 6 Gbps SAS ports, and either twelve 3.5-inch drives or twenty-four 2.5-inch drives. The Storwize V7000 supports up to eighteen disk expansion enclosures, each containing either twelve 3.5-inch drives or twenty-four 2.5-inch drives.

The physical components of the IBM Storwize V7000 are commodities. The uniqueness of the IBM Storwize V7000 lies in the software and the logical components discussed in the remainder of this section.

Figure 2: IBM Storwize V7000 Components illustrates the IBM Storwize V7000 components. Working our way from the bottom of the diagram to the top, the LUNs can come from internal storage system RAID groups or from external storage system LUNs. The IBM Storwize V7000 manages these objects as managed disks (mdisks). Each mdisk consists of a number of extents of a specified size (default 256 MB). The mdisks are grouped together to form storage pools. Once part of a storage pool, the extents are grouped to form volumes. Hosts are zoned to the nodes, which have access to all the volumes within the storage pools. The volumes are assigned to the hosts. The individual components, how they relate to each other, and their associated measurements are discussed in more detail in the remainder of this section.

Figure 2: IBM Storwize V7000 Components
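To make the extent-based layout concrete, the following minimal Python sketch (illustrative only; the function names and figures are assumptions, not a V7000 API) shows how a volume's capacity breaks down into extents of the default 256 MB size, and how a striped volume would place those extents round-robin across the mdisks of a pool, as described later in the volume section.

# Minimal sketch (not a V7000 API): model how a volume's capacity maps to
# extents and how a striped volume spreads extents round-robin across mdisks.
import math

EXTENT_MB = 256  # default extent size on the Storwize V7000

def extents_needed(volume_gb, extent_mb=EXTENT_MB):
    """Number of extents required to back a volume of the given size."""
    return math.ceil(volume_gb * 1024 / extent_mb)

def striped_allocation(volume_gb, mdisks):
    """Round-robin extent placement across the mdisks of a storage pool."""
    counts = {m: 0 for m in mdisks}
    for i in range(extents_needed(volume_gb)):
        counts[mdisks[i % len(mdisks)]] += 1
    return counts

# Example: a 100 GB striped volume in a pool with eight mdisks
print(extents_needed(100))                                        # 400 extents of 256 MB
print(striped_allocation(100, [f"mdisk{i}" for i in range(8)]))   # 50 extents per mdisk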

Measurement Overview

Both front-end and back-end measurements are available for read response times, write response times, read I/O rate, write I/O rate, read throughput, and write throughput. Front-end-only metrics include write cache delays and read hit percentage for the I/O groups and nodes. Back-end-only metrics include read queue time and write queue time. Users who are familiar with the measurements available for the IBM SAN Volume Controller (SVC) will find nearly identical measurements for the IBM Storwize V7000. The V7000 also includes statistics on the physical disks internal to the controller. Figure 3: V7000/SVC Multi-Chart Performance Overview illustrates how one might track key performance indicators for an IBM Storwize V7000.

Figure 3: V7000/SVC Multi-Chart Performance Overview

2.2 I/O Group

The I/O group is a logical entity that refers to a pair of redundant, clustered controller nodes or controller node canisters. If one node fails within the I/O group, its workload is transferred to the other node. Hosts should be zoned to both nodes within the I/O group so that a node failure does not cause a host's volumes to become inaccessible.

Within an I/O group, a volume is associated with a preferred node. In normal operations, the preferred node services all the I/Os for a given volume. The preferred node can be selected at volume creation. By default, the IBM Storwize V7000 attempts to distribute the volumes evenly across the nodes: if a preferred node is not manually selected, the IBM Storwize V7000 assigns the volume to the node with the fewest volumes. As with host workloads, volume workloads can vary dramatically. If one node is more heavily utilized than the other node, the preferred node can be manually changed for a specific volume.

Measurements

The IBM Storwize V7000 aggregates volume measurements such as response times, throughput, and I/O rates to the I/O group. The read and write response times for the volumes associated with the I/O group represent the average amount of time required for the IBM Storwize V7000 to service the workload. Acceptable I/O response times will vary depending on application and user requirements, the IBM Storwize V7000 hardware, the IBM Storwize V7000 firmware, and the back-end storage hardware and configuration. In addition to monitoring the I/O response times, the read and write I/O rates and throughput should be monitored to understand whether the workload is balanced across the I/O groups and the nodes and whether the workload is approaching the limits of the I/O group.

2.3 Nodes

The nodes run the IBM Storwize V7000 software and provide I/O processing, memory buffering, and connectivity. The IBM Storwize V7000 nodes utilize off-the-shelf Intel processors, memory, network ports, and fibre channel ports. The processors provide the compute power for processing all

the I/O operations. The memory serves as both a read and a write I/O cache. The Ethernet ports enable iSCSI host connections, while the fibre ports provide fibre channel connectivity between the attached hosts, the IBM Storwize V7000, and externally managed storage systems. The selected connectivity medium can also enable communication between peer IBM Storwize V7000 clusters for replication activities.

Measurements

The IBM Storwize V7000 provides a robust set of measurements for both the front end and the back end, as well as individual node CPU utilization. The node CPU utilization is illustrated in Figure 4: Node Utilization.

Figure 4: Node Utilization

Perhaps the only shortcoming of the IBM Storwize V7000 node metrics is the lack of visibility into the internal bandwidth utilization; however, IntelliMagic Direction may be used to estimate these metrics, as shown in Figure 5: IntelliMagic Direction Internal Components. Data throughput and I/O rate per port are required to understand the port utilizations.

Figure 5: IntelliMagic Direction Internal Components

The IBM Storwize V7000 provides response times for each of the volumes associated with the node, from the viewpoint of the host to the IBM Storwize V7000. These can be rolled up at the node level to provide a view of the latency for any particular node.

Tip #1: When planning connectivity requirements, ensure that there are adequate bandwidth and processor resources on the IBM Storwize V7000 ports to handle all the read hits, writes, and read misses. The read miss payload and the write payload must be staged and destaged to Storwize V7000 cache from the internal disks or external storage controllers, and sent out to the host over the same ports, effectively doubling the port bandwidth requirement for read miss and write workloads.
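As a rough way to apply Tip #1, the following minimal Python sketch (an illustration with assumed inputs, not an IntelliMagic or IBM tool) estimates the front-end port bandwidth a workload demands, counting read-miss and write payloads twice because they cross the same node ports on both the host side and the staging/destaging side.

# Minimal sketch (assumed inputs, not a vendor tool): estimate node port
# bandwidth demand per Tip #1. Read misses and writes count double because
# their payload crosses the ports toward the host and toward the back end.
def port_bandwidth_mbps(read_mbps, read_hit_pct, write_mbps):
    read_hit = read_mbps * read_hit_pct / 100.0   # served from cache: one pass over the ports
    read_miss = read_mbps - read_hit              # staged from the back end: two passes
    return read_hit + 2 * read_miss + 2 * write_mbps

# Example: 800 MB/s of reads at a 50% hit ratio plus 200 MB/s of writes
demand = port_bandwidth_mbps(800, 50, 200)
print(f"{demand:.0f} MB/s of port bandwidth")                   # 1600 MB/s
print(f"{100 * demand / (4 * 800):.0f}% of four 8 Gbps ports")  # ~50%, assuming ~800 MB/s per port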

Cache

The primary purpose of cache is to reduce the host I/O response time. This is achieved by providing cache hits for read requests and by buffering writes as fast cache writes. The entire cache on an IBM Storwize V7000 node can be used for read or write activity. If all the cache is consumed by write activity and draining writes to the back-end storage is slow, the system will go into write-through mode. This allows unwritten write data to be drained to internal storage in the case of a power outage.

Cache on an IBM Storwize V7000 is segmented into 4 KB segments, or pages. A track describes the unit of locking and destage granularity; there are up to eight 4 KB segments in a track. The IBM Storwize V7000 attempts to coalesce multiple segments into a single track destage when the segments to be written are within the same track.

Read I/O requests are either read cache hits or read cache misses. If the requested data is resident in cache, the data is immediately transferred to the requestor from memory. This is called a read cache hit and avoids back-end disk access. If the requested data is not resident in cache, the data is requested from the internal disk drives or the external storage systems. This is called a read cache miss. After the data is loaded into cache from the internal drives or the external storage system, the front-end adapter sends it to the host that initiated the request.

For write I/O requests, the data is written to cache on the host's preferred node. It is then mirrored to the partner node to provide resilience in the event of a node failure. Subsequently, the preferred node sends an acknowledgement to the host that the I/O has been completed. This is called a fast write (FW). At some point after the acknowledged completion of the I/O, the write is destaged to the internal drives or external storage system. The tracks are marked unmodified at this point and stay in cache until they are moved out of cache by the controller using an LRU algorithm. Assuming there is sufficient free write cache on the external storage system, writes to external storage systems should also complete as fast writes. This latency is not accumulated in the front-end response time; rather, it is associated with the mdisk latency, as discussed further in the mdisk section.

In the event of a node failure, all the modified cache from the remaining node is drained to the internal drives or external storage system. The behavior of write I/Os in this scenario changes to write-through mode: the acknowledgement from the IBM Storwize V7000 to the host that a write has been completed is only sent upon confirmation that the write has been completed to the internal drives or acknowledged as completed by the back-end storage system.

Tip #2: Consider running your nodes and their associated components at no more than 50% utilization during online periods. That way, if you have a node failure, your cluster will not severely impact the performance of your online applications.

Cache is also managed at the storage pool level. This is referred to as partitioning. An upper cache limit is set for each of the storage pools, which prevents a single pool from getting more than its fair share of cache. The upper cache limit depends on the number of storage pools. If the cache limit is reached for I/Os to a particular storage pool, write I/Os will behave in write-through mode. This behavior only continues while the amount of cache consumed for write I/Os to that storage pool exceeds the upper limit. As discussed previously, this has the effect of requiring write I/Os to be completed to the internal drives or acknowledged by the external storage system before the initiator is informed that the write has been completed. It is rare to encounter this behavior unless there is a problem draining writes to an external storage system due to overutilization, or a problem within the storage system.

Cache Measurements

From a performance analysis perspective, it is important to understand the effectiveness of the cache management. Key cache measurements include the read cache hit ratios and the write cache delays. Consistently low read cache hit ratios may indicate that the storage system has insufficient cache for the workload. These metrics are available at the IBM Storwize V7000 I/O group, node, partition, and volume level. They can be rolled up at the storage pool level, as illustrated in Figure 6: Read Hit % vs I/O Rate.

Figure 6: Read Hit % vs I/O Rate

Tip #3: For open systems workloads, an average read cache hit ratio greater than 50% is desired. IntelliMagic Direction can help identify whether additional cache will help improve the response time or throughput for a particular workload. For certain workloads that have very random read access patterns, it may be completely acceptable to have a read cache hit ratio much lower than 50%.

The number of write cache delays per second indicates whether or not the IBM Storwize V7000 is able to destage I/Os quickly enough to the back-end storage system. A write delay event indicates that cache is full and that new write I/O operations can only be completed after a physical destage to disk frees up space. This typically only happens when the internal drives or external storage system is saturated.

Tip #4: A small number of write delays can significantly increase write response times to hosts.
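The following minimal Python sketch (illustrative only; the counter names are assumptions, not actual Storwize statistics fields) shows how the read cache hit ratio could be derived from interval read-hit and read-miss counts and screened against the 50% guideline from Tip #3, together with a simple flag for any interval that reports write cache delays, per Tip #4.

# Minimal sketch (assumed counter names, not actual Storwize statistics fields):
# derive the read cache hit ratio per interval, screen it against the 50%
# open-systems guideline, and flag intervals reporting write cache delays.
def read_hit_ratio(read_hits, read_misses):
    total = read_hits + read_misses
    return 100.0 * read_hits / total if total else 0.0

intervals = [
    {"time": "09:00", "read_hits": 4200, "read_misses": 1800, "write_delays_per_sec": 0.0},
    {"time": "09:15", "read_hits": 2100, "read_misses": 3900, "write_delays_per_sec": 2.5},
]

for iv in intervals:
    ratio = read_hit_ratio(iv["read_hits"], iv["read_misses"])
    flags = []
    if ratio < 50.0:
        flags.append("read hit ratio below 50% guideline")
    if iv["write_delays_per_sec"] > 0:
        flags.append("write cache delays present")
    print(iv["time"], f"{ratio:.1f}% read hits,", "; ".join(flags) or "ok")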

2.4 Ports

There are fibre channel and Ethernet ports on the nodes. All traffic from hosts, between nodes, and to external storage systems passes through the node's ports. Measurements include read and write throughput, and read and write operations, to and from hosts, other SVC nodes, and external controllers. It is important to keep the node port throughput balanced and less than 50% utilized in case there is a node failure.

Figure 7: Fibre Host Read MB/sec Balance Chart illustrates how balanced the ports are on a V7000 cluster. Each bar represents a single fibre channel port. The average read MB/sec, standard deviation, and minimum/maximum values give the analyst a quick view of how well balanced port activity is across the entire cluster.

Figure 7: Fibre Host Read MB/sec Balance Chart

2.5 Internal RAID Groups or External Managed Disks

A managed disk has a one-to-one relationship with a back-end storage system LUN or an internally managed RAID group. There should also be a 1:1 relationship between externally managed LUNs and their supporting RAID groups. This section uses mdisk to refer to both the internally managed RAID groups and the external managed disks.

Managed Disk Measurements

You can monitor the performance of your managed disks using the managed disk statistics, which include both the internal RAID group and external mdisk read and write response times. These statistics measure the amount of time required to perform a read or write operation from the Storwize V7000 to the internal RAID groups or the external mdisk. For externally managed mdisks, read and write queue times measure how long read or write operations wait to be sent to the back-end storage system or internally managed drives.

Tip #5: Average external mdisk queue times should be less than a couple of milliseconds, as the Storwize queue times only measure the amount of time the I/O request spends waiting to be

sent to the internal disk or external storage systems. If this is over 1.0 ms, it is a sign of contention on the fabric or on the back-end internal disks or external storage system.

Figure 8: External Write Queue Time illustrates a situation where the external write queue time is exceptionally high.

Figure 8: External Write Queue Time

Tip #6: For increases in front-end response time that do not correlate with increases in the RAID group or external mdisk response times, the performance issue lies with the path from the host to the Storwize V7000, or with the Storwize V7000's ports or processors.

Tip #7: Average mdisk response times greater than 12.0 ms for fibre channel drives and 15.0 ms for SATA drives indicate some sort of constraint within the Storwize V7000 to the internally managed drives; for externally managed mdisks, they indicate a problem along the path or within the external storage system. The expected response times will vary depending on the exact configuration.
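Tips #5 through #7 lend themselves to a simple screening rule. The minimal Python sketch below (an illustration with assumed field names and made-up sample values, not IntelliMagic Vision logic) flags mdisks whose average queue time exceeds 1.0 ms or whose average response time exceeds the 12.0 ms (fibre channel) or 15.0 ms (SATA) guidelines.

# Minimal sketch (assumed field names, not IntelliMagic Vision logic): screen
# mdisk interval averages against the queue-time and response-time guidelines
# from Tips #5 and #7.
RESPONSE_LIMIT_MS = {"FC": 12.0, "SATA": 15.0}
QUEUE_LIMIT_MS = 1.0

def screen_mdisk(name, drive_type, resp_ms, queue_ms):
    findings = []
    if queue_ms > QUEUE_LIMIT_MS:
        findings.append(f"{name}: queue time {queue_ms:.1f} ms suggests fabric or back-end contention")
    if resp_ms > RESPONSE_LIMIT_MS.get(drive_type, 15.0):
        findings.append(f"{name}: response time {resp_ms:.1f} ms exceeds the {drive_type} guideline")
    return findings

# Example intervals for two hypothetical mdisks
for args in [("mdisk3", "FC", 14.2, 0.4), ("mdisk7", "SATA", 9.8, 2.3)]:
    for finding in screen_mdisk(*args):
        print(finding)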

Figure 9: Managed Disks External Read Response Time SLA Chart illustrates a situation in which the read response times to a significant number of the mdisks exceed desirable service levels.

Figure 9: Managed Disks External Read Response Time SLA Chart

In order to understand whether high back-end response times on the Storwize V7000 are the result of saturated internal disk RAID groups or of an external storage system component, you will need visibility into the storage system's components. For supported platforms, IntelliMagic Vision provides end-to-end visibility into both the internal RAID groups and the external storage system components.

When vendors describe I/O response times, they typically mean weighted averages similar to what is described in Example 1: Average Response Time.

Example 1: Average Response Time
Workload Type: OLTP
Read/Write Ratio: 80/20
Read Hit Ratio: 50%
Write Hit Ratio: 100%
Write Hit Response Time: 1.0 ms
Read Hit Response Time: 1.0 ms
Read Miss Response Time: 6.0 ms
Read I/Os per Second: 800
Write I/Os per Second: 200

Average Response Time = ((400 * 1.0) + (200 * 1.0) + (400 * 6.0)) / 1000 = 3.0 ms per I/O
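The same weighted average can be computed directly from the example's inputs. The short Python sketch below reproduces Example 1; the function name and structure are illustrative, not taken from the paper or any vendor tool.

# Minimal sketch: weighted average response time as in Example 1.
def avg_response_time_ms(read_iops, write_iops, read_hit_ratio,
                         read_hit_ms, read_miss_ms, write_ms):
    read_hits = read_iops * read_hit_ratio      # I/Os served from cache
    read_misses = read_iops - read_hits         # I/Os requiring back-end access
    total_time = read_hits * read_hit_ms + read_misses * read_miss_ms + write_iops * write_ms
    return total_time / (read_iops + write_iops)

# Example 1: 800 read IOPS (50% hits), 200 write IOPS, 1.0/6.0/1.0 ms components
print(avg_response_time_ms(800, 200, 0.5, 1.0, 6.0, 1.0))  # 3.0 ms per I/O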

2.6 Internal Disk

The IBM Storwize V7000 can have up to 240 3.5-inch drives or 480 2.5-inch drives. Table 1: Drives Supported shows the drives currently supported.

Table 1: Drives Supported

Drive Type                          Speed (RPM)   Size
2.5-inch form factor SSD            N/A           200, 300, and 400 GB
2.5-inch form factor SAS            10,000        300, 450, and 600 GB
2.5-inch form factor SAS            15,000        146 and 300 GB
2.5-inch form factor Nearline SAS   7,200         1 TB
3.5-inch form factor Nearline SAS   7,200         2 TB

Note: For the latest supported disk drives, please consult the IBM web site.

Disk Measurements

For each drive, the IBM Storwize V7000 provides read I/O rate, write I/O rate, read throughput, write throughput, read response time, and write response time. From these metrics, disk utilization can be calculated. Depending on the type of drive, you can expect different response times. For an estimate of expected service times, refer to Table 2: Disk Service Time Estimates.

Table 2: Disk Service Time Estimates

Drive             IOPS (No Queue)   Average Service Time (ms)
SATA 7,200 RPM    120               8.5
SAS/FC 10K RPM    167               6
SAS/FC 15K RPM    250               4
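Since the paper notes that disk utilization can be calculated from these metrics, the following minimal Python sketch (illustrative; it assumes the service-time estimates from Table 2 and a simple interval average) estimates per-drive utilization as the product of the measured I/O rate and the average service time.

# Minimal sketch: estimate drive utilization from the measured I/O rate and the
# service-time estimates in Table 2 (utilization = IOPS x service time).
SERVICE_TIME_MS = {"SATA 7.2K": 8.5, "SAS/FC 10K": 6.0, "SAS/FC 15K": 4.0}

def drive_utilization_pct(iops, drive_class):
    service_s = SERVICE_TIME_MS[drive_class] / 1000.0
    return min(100.0, 100.0 * iops * service_s)

# Example: a 10K RPM drive handling 120 IOPS during an interval
print(f"{drive_utilization_pct(120, 'SAS/FC 10K'):.0f}% busy")  # 72% busy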

Figure 10: Internal Disk Drive Utilization SLA Chart illustrates the utilization of all the internal drives of a six-node SVC cluster. This is a good way to quickly identify any disk hot spots within the internal drives.

Figure 10: Internal Disk Drive Utilization SLA Chart

2.7 Storage Pool

A storage pool is a grouping of more than one managed disk. When planning a storage pool, it is important to remember that when a single managed disk experiences a failure, it brings the entire storage pool offline. As a result, one of the primary design goals is to limit the hardware failure boundaries. General performance guidelines should also be addressed as part of the design and are discussed in this section. With these thoughts in mind, there are several best practices to consider when creating a storage pool, as enumerated in Table 3: Storage Pool Best Practices.

Table 3: Storage Pool Best Practices

- A storage pool should utilize managed disks from one storage system.
- Each external storage system should provide managed disks to a single Storwize V7000 cluster.
- A V7000 can present storage to an SVC, but a V7000 cannot present storage to another V7000.
- Each internal or external RAID group must be included in only one storage pool.
- Rather than adding capacity to an existing storage pool, create a new storage pool.
- Implement striped volumes for all workloads except 100% sequential workloads.
- For externally managed disks, utilize storage pools with at least eight managed disks to take advantage of the round-robin distribution of I/O workload to the external storage controller.
- Select an extent size that balances cluster capacity and volume granularity. Testing has shown good results with 128 MB and 256 MB extents.

Storage Pool Measurements

The storage pool measurements provide a good means for monitoring the performance of both the front end (host to port) and the back end (cache to disk). The storage pool measurements combine measurements aggregated from the front-end volume operations as well as from the back-end managed disk operations. On the front end, the response times, I/O rates, and I/O throughput provide an excellent means for understanding whether the storage pools are responsive and balanced. The response times include the read cache hits as well as the read cache misses. On a system with a normal read cache hit percentage, the average front-end read response time will be understandably lower than the back-end read response time for the same storage pool. The front-end write response times should generally be 1.0 ms or less, as they only measure the amount of time required to write to the preferred node's cache and mirror to the secondary node.

Figure 11: V7000 Read Response Time by Storage Pool

2.8 Volume

Storage pools are created from the storage provided by externally managed disks (mdisks) or internally managed RAID groups. The storage from the managed disks is grouped together and divided into extents. A volume is a discrete grouping of storage extents that can be made addressable to a host. The extent size can be selected during storage pool creation; the default size is 256 MB. When volumes are created, the capacity and the layout of the volume are selected. For IBM Storwize V7000 managed disks, the data layout can be striped or sequential. In all cases except 100% sequential workloads, the volumes should be created as striped volumes. Striped volumes consist of extents that are spread across the managed disks in a round-robin fashion. Non-managed Storwize V7000 volumes, or image-mode disks, are not covered in this paper as they are not typically part of steady-state configurations.

Table 4: vdisk Best Practices

- New vdisks should go to the least utilized storage pool within the cluster.

- Vdisk size should be appropriate for the hosts. For most hosts, fewer, larger vdisks work better from a performance and management perspective. This should be balanced with the size and activity level of the vdisk in relation to the storage pool in which it resides. A rule of thumb is that a vdisk should not consume more than 10% of the capacity of the storage pool, nor more than 20% of the performance bandwidth of the storage pool.
- Use striped volumes even if the application or host has its own striping, as long as the stripe sizes are dissimilar. Fine-grained host LVM striping can be beneficial for performance.
- Set thin-provisioned disks to auto-expand, so that if the allocated space is used the disk does not go offline when additional writes occur.

Measurements

In order to detect whether performance problems exist, it is important to ignore inactive volumes, since those volumes have an insignificant impact on the overall performance of the storage system. It is common for some of the volumes with the highest response times to have little I/O activity. Sorting the volumes based on their response time alone, therefore, is not a good way to find the volumes that cause poor overall performance. A good way to find the volumes that have the biggest impact on the overall performance is to sort the volumes by their I/O rate times their response time, a product called I/O Intensity, as illustrated in Figure 12: Top Volume by I/O Intensity.

Figure 12: Top Volume by I/O Intensity
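The I/O Intensity ranking can be illustrated with a few lines of Python (an illustrative sketch with made-up volume data, not IntelliMagic Vision output): multiplying each volume's I/O rate by its response time and sorting on the product surfaces the volumes that contribute most to overall latency, rather than idle volumes that merely show high response times.

# Minimal sketch (made-up data): rank volumes by I/O Intensity, defined as
# I/O rate multiplied by response time, so busy slow volumes rank above
# idle volumes that happen to show high response times.
volumes = [
    {"name": "vdisk_db01",   "io_per_sec": 2500, "resp_ms": 4.0},
    {"name": "vdisk_backup", "io_per_sec": 5,    "resp_ms": 40.0},
    {"name": "vdisk_web02",  "io_per_sec": 900,  "resp_ms": 1.2},
]

for v in volumes:
    v["io_intensity"] = v["io_per_sec"] * v["resp_ms"]

for v in sorted(volumes, key=lambda v: v["io_intensity"], reverse=True):
    print(f'{v["name"]:>13}  {v["io_intensity"]:>8.0f}')
# vdisk_db01 ranks first (10000); vdisk_backup, despite 40 ms, ranks last (200).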

Response times for volumes are provided from the view of the front end of the IBM Storwize V7000, so they include both cache hits and cache misses. On a well-balanced system that is not overloaded, 100% of writes should be satisfied from cache. The response time for writes satisfied from cache should be no more than the amount of time required to write to the target node's cache, mirror to the secondary node, and then send the host an acknowledgement.

Read hits should also take very little time to process and transfer the data. For internally managed disks, read misses require a physical disk access. For externally managed disks, a read miss may be served from the external controller's cache, or from its back-end disk drives if the requested tracks are not in cache. In either case, a read miss is significantly slower than a read hit. Figure 13: V7000 vdisk Response Time illustrates the average vdisk response time for the top 30 volumes.

Figure 13: V7000 vdisk Response Time

2.9 Thin Provisioning

Thin provisioning provides a mechanism to assign more logical capacity than physical capacity. In a thin provisioned environment, only the portion of the LUN that is written to is actually used. This means that if 1 GB of a 20 GB LUN is written to, then only 1 GB is in use. To accomplish this, the extents are divided into smaller elements called grains. The grain size is configurable at the storage pool or volume level; the default is 32 KB. In a thin provisioned environment, only the grains that are written to count towards the amount of storage actually used by the volume. The IBM Storwize V7000 tracks both the allocated and the used capacity.

Tip #8: When you create a thin-provisioned volume, set the cache mode to readwrite so that metadata is cached. This reduces the latency associated with I/Os to thin-provisioned volumes.

2.10 Easy Tier

Easy Tier provides a mechanism to transparently migrate hot extents to the most appropriate storage tier within an IBM Storwize V7000 environment. It evaluates the read miss activity of the volumes within an Easy Tier enabled storage pool and moves the active extents to a higher disk tier. The current implementation is designed to take advantage of a storage pool containing mixed drive technology, such as SSD and FC. Hot extents are migrated to the SSD drives, and extents with low activity are migrated to the slower and less expensive disk technology. The migration does not take place between storage pools.

The SSDs should be placed internally within the IBM Storwize V7000 even if the FC or SATA drives are located externally. This reduces the fabric overhead.

Tip #9: Set the grain size for non-FlashCopy thin volumes to 256 KB. For FlashCopy volumes, set the grain size to 64 KB. The default is 32 KB! If you leave the grain size at 32 KB, Easy Tier will assume that the I/O is not sequential even when it is. This may result in sub-optimal data placement and performance.

Section 3. Conclusion

In this paper, we highlighted some of the architectural components of the IBM Storwize V7000 and discussed some of their associated measurements. Finally, we used IntelliMagic Vision to examine several key performance metrics for the IBM Storwize V7000 storage system. Using IntelliMagic Vision, we were able to easily identify imbalances and service level exceptions.

We realize that many subjects were greatly simplified, and we understand that becoming an expert in storage performance management requires significant real-world experience that cannot be obtained by reading a white paper. Here at IntelliMagic, we strive to make storage performance management easier by providing world-class solutions, support, training, and services. Next time you need guidance on storage performance issues, feel free to contact us for a free performance analysis at sales@intellimagic.net.

Additional Resources

Implementing the IBM Storwize V7000, IBM Redbook SG24-7938
IBM Storwize V7000 Information Center, http://publib.boulder.ibm.com/infocenter/storwize/ic/index.jsp