Storage Basics Architecting the Storage Supplemental Handout




INTRODUCTION

With digital data growing at an exponential rate, it has become a requirement for the modern business to store data and analyze it in a timely fashion, simply to remain competitive. This handout provides a simplified step-by-step approach to architecting basic storage systems, focusing primarily on architecting for performance. Use this handout to architect basic direct-attached and shared storage systems (see Note 1).

STEPS TO ARCHITECTING STORAGE

1. Choose Storage Type

Determine the type of storage demanded by the business and the application, and the amount of storage required to meet business needs. Understanding the required storage type helps determine the physical storage devices required, the capabilities of those devices and your role in storage architecture. Object storage, for example, can be delivered as a service by cloud providers, needing no additional architecture.

Table 1: Storage types

Block Storage: Storage presented effectively as hard drives to the server. Block storage is typically provided by local disks, dedicated DAS and SAN units. Example uses include OS drives and database store.
Network File Storage: Network storage for file sharing. Network file storage is typically provided by servers configured with NFS/CIFS/SMB or dedicated NAS units.
Object Storage: Distributed storage where data is managed and organized via its attributes (metadata). Object storage includes hosted cloud storage like Rackspace Cloud Files. Example uses include storage of static files.

Table 2: Common physical storage devices

DAS (Directly-Attached Storage): Block storage device. Storage is directly attached to servers either internally or via SAS cable.
SAN (Storage Area Network): Block storage device. Storage from a physical storage unit presented to servers via a high-speed storage network such as iSCSI or FC.
NAS (Network-Attached Storage): Network file storage device. Centralized file-based storage presented over a local IP network.

Note 1: Storage architecture is an extremely complicated topic and this handout makes a number of generalizations. For large and complex storage systems, we recommend engaging storage architects from Rackspace or your local supplier.

2. Quantify Storage Capacity and Performance Requirements

Quantify Storage Capacity

Storage capacity requirements of your application need to be quantified to determine the final capacity requirements of your storage.

Table 3: Capacity measure based on new or existing application

New Applications:
- Estimate the required storage capacity based on available data
- Use scalable storage devices such as SAN/NAS/Object Storage to mitigate against under-projection
Existing Applications:
- Measure current storage capacity requirements using operating system tools
- Factor in an appropriate growth factor to cater for future growth

Quantify Storage Performance

Read and write performance of your application must be quantified in order to define the performance needs of the new storage.

Table 4: Performance measure based on new or existing application

New Applications:
- Use performance monitoring tools to measure during I/O load-testing and the Dev phase
- Use theoretical data and documentation
Existing Applications:
- Use performance monitoring tools to measure production workloads regularly throughout the day over a period of 30 days
- For storage with multiple applications, run performance testing as normal, but assume the results apply to the aggregate of applications

When measuring using performance monitoring tools, measurements should be collected over a period of at least one month across all periods of the day, including peak workloads. Measure performance using the IOPS metric (see Note 2), which quantifies the number of input or output operations occurring every second.

Measure using Windows Perfmon

Windows Perfmon can measure Read and Write IOPS as well as other disk metrics such as throughput and disk latency. Data collector sets can be used to automate collection over a period of time. For more information about Perfmon and how to use it, refer to: http://technet.microsoft.com/en-us/library/cc749249.aspx

Use data collector sets and the counters in Table 5 and Table 6 to determine the sustained peak (eliminating any obvious outliers) read and write IOPS of the server.
Table 5: Perfmon counters to measure read performance

Read Metric | Description
PhysicalDisk\Disk Reads/sec | Number of Read IOPS
PhysicalDisk\Avg. Disk Bytes/Read | Average Read I/O Request Size
PhysicalDisk\Disk Read Bytes/sec | Read Throughput

Table 6: Perfmon counters to measure write performance

Write Metric | Description
PhysicalDisk\Disk Writes/sec | Number of Write IOPS
PhysicalDisk\Avg. Disk Bytes/Write | Average Write I/O Request Size
PhysicalDisk\Disk Write Bytes/sec | Write Throughput

Refer to Appendix 1 for more information about other Perfmon counters.

Note 2: There are numerous storage performance metrics relevant for different applications and use cases. IOPS is a simple-to-use metric that is relevant for applications with random and small-to-medium I/O request sizes, or for shared storage. Other measures such as throughput may be more appropriate for applications with sequential access and large request sizes.

Measure using Linux IOStat

Linux IOStat can measure Read and Write IOPS and disk throughput on Linux devices. Cron, as well as the IOStat r flag, can be used to automate collection over a period of time. For more information about IOStat usage, please refer to the IOStat man page (http://linux.die.net/man/1/iostat).

Using crontab and the r flag, use the statistics in Table 7 and Table 8 to determine the sustained peak (eliminating any obvious outliers) read and write IOPS of the server.

Table 7: IOStat statistics to measure read performance

Read Metric | Description
r/s (reads/sec) | Number of Read IOPS
kr/s (read kilobytes/sec) | Read Throughput
1-%w (read %) | % of Read Requests

Table 8: IOStat statistics to measure write performance

Write Metric | Description
w/s (writes/sec) | Number of Write IOPS
kw/s (write kilobytes/sec) | Write Throughput
%w (write %) | % of Write Requests
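The handout asks for a sustained peak with obvious outliers eliminated, but does not say how to eliminate them. One possible approach is to take a high percentile of the collected samples rather than the absolute maximum. The sketch below is only an illustration: the 95th percentile cutoff, the nearest-rank method, the function name sustained_peak and the sample values are all assumptions, not part of the handout.

```python
def sustained_peak(samples, percentile=95):
    """Nearest-rank percentile of IOPS samples (assumed outlier filter).

    `percentile` is expected to be an integer between 1 and 100.
    """
    if not samples:
        raise ValueError("no samples collected")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest sample such that at least
    # `percentile` percent of all samples are at or below it.
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical read IOPS samples (for example from Disk Reads/sec or r/s).
# The single 5000 reading stands in for an obvious outlier.
read_samples = [220, 240, 260, 250, 230, 245, 255, 235, 265, 225,
                248, 252, 238, 242, 258, 228, 262, 232, 256, 5000]

read_peak = sustained_peak(read_samples)
print("Sustained peak read IOPS:", read_peak)
```

Run the same function separately over the write samples to obtain the write IOPS figure used in the later calculations.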

3. Determine Application I/O Characteristics

Understanding the I/O characteristics of your application allows you to architect storage specifically tailored for the application. Every implementation of an application has different I/O characteristics based on individual usage and configuration. Application I/O characteristics can be generalized into the following three categories:

Read and Write Split

The percentage of operations the application spends reading or writing data from disk during normal operation, compared to the total number of operations:

Write% = (Write operations per second / Total operations per second) x 100
Read% = (Read operations per second / Total operations per second) x 100

(Note also that the Perfmon counter Transfers/sec defines the total operations per second.)

Majority Random and Sequential Access Types

Sequential access is access to disk where data consecutively follows one after the other. Random access is access to disk where data is scattered throughout the disk.

I/O Request Size

The average size of each data transfer between application and disk (usually in KB):
- Request sizes between 0-64KB are considered small
- Request sizes between 64-256KB are considered medium
- Request sizes greater than 256KB are considered large

Application I/O characteristics can be determined by profiling data with tools (e.g. the Perfmon Avg. Disk Bytes/Read and Avg. Disk Bytes/Write counters, or IOStat) or by consulting the vendor or vendor documentation.

Table 9: Common I/O characteristics for common applications

Application | Seek Type | I/O Request Size | % I/O Writes
MS Exchange | Random | Small (32KB) | Mid-High
SAP/Oracle | Random | Small (~8KB) | Usage Specific
OLTP Database (e.g. MSSQL/MySQL) | Random | Small (~8-64KB / MSSQL: 64KB) | Mid-High
Database Transaction Logs | Sequential | Very Small (~512 bytes) | High
Database Temp Space | Random | Small (<64KB) | Very High
File Sharing (large files) | Sequential | Large (>256KB) | Usage Specific
File Sharing (small files) | Random | Small (<64KB) | Usage Specific
Online Media Streaming | Sequential | Large (>256KB) | Low
Data Warehouse/Archiving | Sequential | Large (>256KB) | Low
VMware Virtualization | Based on underlying App | Based on underlying App | Based on underlying App
Operating Systems | Random | Small (NTFS/EXT4: 4KB) | Low
Webservers (e.g. Apache/IIS) | Random | Small (<64KB) | Low

4. Make Initial Storage Decisions

Architecting storage starts with making initial storage decisions based on the data collected in steps 1 to 3. Two storage decisions must be made, and these decisions will be verified by performing calculations to determine whether they meet the needs of the application, budget and availability, while falling within the device limitations. These decisions are:

Disk Type

Every server or storage vendor has a list of supported disk drives. Each disk drive has different characteristics (cost, capacity and performance) and usage models. Use Table 10 to select an appropriate disk type as a starting point.

Table 10: Reference IOPS output based on disk type

Disk Type | Approximate Delivered IOPS | Cost | Storage Capacity | Optimal Use Case
SATA / NL-SAS | 90 IOPS | Low | High | Network File Sharing, Backups and workloads with low performance requirements
SAS 10K-rpm | 140 IOPS | Medium-Low | Medium-High | Operating Systems, General applications and workloads with predominately sequential operations
SAS 15K-rpm | 180 IOPS | Medium | Medium | Operating Systems, General applications and workloads with predominately sequential operations
SSD / EFD | 3500 IOPS | High | Low | OLTP Databases, Caching Applications and workloads with predominately random read/write operations

RAID Level

RAID combines multiple disks in a logical unit to improve data redundancy and performance, and should always be used by default. RAID has multiple configurations, each with differing I/O performance and redundancy characteristics, making each configuration appropriate for different use cases. Table 11 defines common RAID levels and their characteristics. Use Table 12 and the additional notes in Table 13 to choose the appropriate RAID level based on your application I/O characteristics.

Table 11: Common RAID Levels

RAID Level | Description and Protection | Min Disks | Available Storage Capacity (%) | Read Perf. | Write Perf. | Write Penalty | Suggested Uses
0 | Striping | 2 | 100 | Excellent | Excellent | 1x | Non-critical data requiring no data protection
1 | Mirror | 2 | 50 | Good | Good | 2x | Operating Systems or small OLTP DBs
1+0 | Mirror & Striping | 4 | 50 | Very Good | Very Good | 2x | High Perf OLTP or RDBMS Databases
5 | Striping w/ Parity | 3 | [(n-1)/n]*100 | Good | Fair | 4x | Mid Perf Messaging, Media Serving or RDBMS
6 | Striping w/ 2x Parity | 4 | [(n-2)/n]*100 | Good | Poor | 6x | Network File Sharing or critical static data

Table 12: Appropriate RAID Levels based on Application I/O Characteristics

Block Size | Significantly Random: Read | Significantly Random: Write | Significantly Sequential: Read | Significantly Sequential: Write
Small (<32KB) | RAID 1/10, 5, 6 | RAID 1/10 | RAID 1/10, 5, 6 | RAID 1/10, 5
Medium (32-256KB) | RAID 1/10, 5, 6 | RAID 1/10 | RAID 1/10, 5, 6 | RAID 5
Large (>256KB) | RAID 1/10, 5, 6 | RAID 1/10 | RAID 1/10, 5, 6 | RAID 5

Table 13: Additional notes about choosing RAID levels based on Application I/O characteristics

- RAID 5 and RAID 6 work best for sequential, large I/Os (>256KB)
- RAID 5 or RAID 1/10 for small I/Os (<32KB)
- For I/O sizes in between, the RAID level is dictated by other application characteristics:
  - RAID 5 and RAID 1/10 have similar characteristics for most read environments and sequential writes
  - RAID 5 and RAID 6 exhibit their worst performance mostly under random writes
  - In random I/O applications consisting of more than 10% write operations, RAID 1/10 provides the best performance
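The read/write split formulas from step 3 and the Table 12 lookup can be combined into a small helper. This is a minimal sketch, not the handout's Storage Calculator: the function names, the dictionary layout and the sample numbers are assumptions made for illustration, and the size bands follow Table 12 (small below 32KB, medium 32-256KB, large above 256KB).

```python
# Table 12 from the handout, keyed by (size band, access pattern, operation).
ALL_LEVELS = ["1/10", "5", "6"]
TABLE_12 = {
    ("small", "random", "read"): ALL_LEVELS,
    ("small", "random", "write"): ["1/10"],
    ("small", "sequential", "read"): ALL_LEVELS,
    ("small", "sequential", "write"): ["1/10", "5"],
    ("medium", "random", "read"): ALL_LEVELS,
    ("medium", "random", "write"): ["1/10"],
    ("medium", "sequential", "read"): ALL_LEVELS,
    ("medium", "sequential", "write"): ["5"],
    ("large", "random", "read"): ALL_LEVELS,
    ("large", "random", "write"): ["1/10"],
    ("large", "sequential", "read"): ALL_LEVELS,
    ("large", "sequential", "write"): ["5"],
}


def split_percentages(read_ops, write_ops):
    """Return (Read%, Write%) from read and write operations per second."""
    total = read_ops + write_ops
    if total <= 0:
        raise ValueError("total operations per second must be positive")
    return read_ops / total * 100, write_ops / total * 100


def size_band(avg_request_kb):
    """Classify an average request size using the Table 12 bands."""
    if avg_request_kb < 32:
        return "small"
    if avg_request_kb <= 256:
        return "medium"
    return "large"


def candidate_raid_levels(avg_request_kb, pattern, operation):
    """Look up the RAID levels Table 12 lists for this workload."""
    return TABLE_12[(size_band(avg_request_kb), pattern, operation)]


# Hypothetical OLTP-style workload: 300 reads/s, 100 writes/s, 8KB requests.
read_pct, write_pct = split_percentages(300, 100)
print(read_pct, write_pct)
print(candidate_raid_levels(8, "random", "write"))
```

For this made-up workload the write share is above the 10% threshold mentioned in the Table 13 notes, which agrees with the Table 12 suggestion of RAID 1/10 for random writes.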

5. Perform Storage Calculations

A number of storage calculations need to be performed to determine the number of disks required to meet the capacity and performance requirements, as well as to verify that the storage decisions are valid. The provided Storage Calculator uses the formulas below to calculate your disk requirements.

Calculate Performance Requirements

The storage performance needs can be calculated from the sustained peak IOPS determined in step 2, factoring in the RAID write penalty. Note that when calculating for shared storage or storage hosting multiple applications, simply sum the RequiredIOPS calculation for each application.

RequiredIOPS = ReadIOPS + (WritePenalty x WriteIOPS)

Calculate Capacity Requirements

The storage capacity needs can be calculated from the capacity requirements determined in step 2, adding a 12-month growth rate. The growth rate percentage depends on your data growth requirements.

RequiredCapacity = CurrentCapacity + (CurrentCapacity x GrowthRate%)

Calculate Required Number of Disks

The number of disks is the minimum number of disks needed to meet both the capacity and performance requirements based on the chosen RAID level.

The number of disks required to meet the required capacity depends on the chosen RAID level:

Table 13: Calculating number of disks to meet capacity based on RAID level

RAID Level | Number of Disks for Capacity (N_C)
RAID5 | (Total Storage Capacity Required / Capacity of single disk) + 1
RAID1/0 | (Total Storage Capacity Required / Capacity of single disk) x 2
RAID6 | (Total Storage Capacity Required / Capacity of single disk) + 2

The number of disks required to meet the required performance can be calculated with the formula below (use the Approximate Delivered IOPS in Table 10 for the single-disk IOPS):

N = RequiredIOPS / IOPS serviced by disk
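As a worked illustration of the RequiredIOPS and RequiredCapacity formulas, the sketch below plugs in made-up numbers. The write penalties come from Table 11; the function names and the workload figures (400 read IOPS, 100 write IOPS, 1000 GB, 20% growth) are assumptions chosen only for the example.

```python
# RAID write penalties as listed in Table 11 of the handout.
WRITE_PENALTY = {"0": 1, "1": 2, "1+0": 2, "5": 4, "6": 6}


def required_iops(read_iops, write_iops, raid_level):
    """RequiredIOPS = ReadIOPS + (WritePenalty x WriteIOPS)."""
    return read_iops + WRITE_PENALTY[raid_level] * write_iops


def required_capacity(current_capacity, growth_rate_percent):
    """RequiredCapacity = CurrentCapacity + (CurrentCapacity x GrowthRate%)."""
    return current_capacity + current_capacity * growth_rate_percent / 100


# Hypothetical workload: 400 read IOPS and 100 write IOPS at sustained peak.
iops_raid5 = required_iops(400, 100, "5")     # 400 + 4 x 100
iops_raid10 = required_iops(400, 100, "1+0")  # 400 + 2 x 100

# Hypothetical capacity: 1000 GB today with 20% growth over 12 months.
capacity_gb = required_capacity(1000, 20)

print(iops_raid5, iops_raid10, capacity_gb)
```

The same workload needs noticeably more back-end IOPS on RAID 5 than on RAID 1+0, which is the effect of the write penalty the handout describes.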

To meet the requirements of the respective RAID level, ensure that N meets the requirements defined in Table 14. If N does not meet the requirements, increment N until it does.

Table 14: Requirements to meet corresponding RAID level

RAID Level | RAID Rules
RAID5 | N >= 3
RAID1/0 | N must be an even number
RAID6 | N >= 4

The number of disks (N_T) required for the storage system can then be determined from the following formula:

N_T = MAX(N, N_C)

Verify Storage Decisions

Using the N_T calculation above, verify whether the chosen RAID level and disk types meet the overall business and technical requirements. For example:
- Is there sufficient budget to purchase the type and number of disks?
- Are the type and number of disks available within the required timeframes?
- Do the type and number of disks fit within the technical limitations of the storage device?

If the disk type or number of disks does not meet the requirements, iterate from step 4 with another, more appropriate disk type. If the disk type and number of disks are appropriate, proceed with implementation.
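The disk-count steps can be strung together as follows. This is a sketch under stated assumptions rather than the Storage Calculator itself: fractional results are rounded up to whole disks (the handout does not state the rounding rule), the function names are invented for the example, and the inputs (800 required IOPS, 1200 GB, 600 GB disks at 180 IOPS each) are hypothetical.

```python
import math


def disks_for_capacity(required_capacity, disk_capacity, raid_level):
    """N_C from the capacity table: data disks plus RAID overhead."""
    data_disks = math.ceil(required_capacity / disk_capacity)  # assumed rounding
    if raid_level == "5":
        return data_disks + 1
    if raid_level == "6":
        return data_disks + 2
    if raid_level == "1/0":
        return data_disks * 2
    raise ValueError("unsupported RAID level: " + raid_level)


def disks_for_performance(required_iops, disk_iops, raid_level):
    """N from RequiredIOPS, incremented until it satisfies Table 14."""
    n = math.ceil(required_iops / disk_iops)  # assumed rounding
    if raid_level == "5":
        return max(n, 3)
    if raid_level == "6":
        return max(n, 4)
    if raid_level == "1/0":
        return n + 1 if n % 2 else n  # must be an even number
    raise ValueError("unsupported RAID level: " + raid_level)


def total_disks(required_iops, required_capacity, disk_iops,
                disk_capacity, raid_level):
    """N_T = MAX(N, N_C)."""
    n = disks_for_performance(required_iops, disk_iops, raid_level)
    n_c = disks_for_capacity(required_capacity, disk_capacity, raid_level)
    return max(n, n_c)


# Hypothetical RAID 5 design on 600 GB SAS 15K-rpm disks (180 IOPS, Table 10).
n_t = total_disks(required_iops=800, required_capacity=1200,
                  disk_iops=180, disk_capacity=600, raid_level="5")
print("Disks required:", n_t)
```

In this example performance, not capacity, sets the disk count, which is common for random workloads on spinning disks.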

APPENDIX 1

Perfmon Storage Counter Descriptions

LogicalDisk Perfmon Object

Disk Reads/sec, Disk Writes/sec
Measures the number of IOPS. You should discuss the expected IOPS per disk for different types and rotational speeds with your storage hardware vendor. Typical sizing at the per-disk level is listed here:
- 10K RPM disk: 100 to 120 IOPS
- 15K RPM disk: 150 to 180 IOPS
- Enterprise-class solid state devices (SSDs): 5,000+ IOPS
Sizing is discussed at length later in this paper.

Average Disk sec/Read, Average Disk sec/Write
Measures disk latency. Numbers vary, but here are the optimal values for averages over time:
- 1-5 milliseconds (ms) for Log (ideally 1 ms or less on average)
  Note: For modern storage arrays, log writes should ideally be less than or equal to 1-2 ms on average if writes are occurring to a cache that guarantees data integrity (that is, battery backed up and mirrored). Storage-based replication and disabled write caching are two common reasons for log latencies in the range of 5 or more milliseconds.
- 5-20 ms for Database Files (OLTP) (ideally 10 ms or less on average)
- Less than or equal to 25-30 ms for Data (decision support or data warehouse)
  Note: The value for decision support or data warehouse workloads is affected by the size of the I/O being issued. Larger I/O sizes naturally incur more latency. When interpreting this counter, consider whether the aggregate throughput potential of the configuration is being realized. SQL Server scan activity (read-ahead operations) issues transfer sizes up to 512K, and it may push a large amount of outstanding requests to the storage subsystem. If the realized throughput is reasonable for the particular configuration, higher latencies may be acceptable for heavy workload periods.
If SSD is used, the latency of the transfers should be much lower than what is noted here. It is not uncommon for latencies to be less than 5 ms for any data access. This is especially true of read operations.

Average Disk Bytes/Read, Average Disk Bytes/Write
Measures the size of I/Os being issued. Larger I/Os tend to have higher latency (for example, BACKUP/RESTORE operations issue 1 MB transfers by default).

Current Disk Queue Length
Displays the number of outstanding I/Os waiting to be read or written from the disk. Deep queue depths can indicate a problem if the latencies are also high. However, if the queue is deep but latencies are low (that is, if the queue is emptied and then refilled very quickly), deep queue depths may just indicate an active and efficient system. A high queue length does not necessarily imply a performance problem.
Note: This value can be hard to interpret due to virtualization of storage in modern storage environments, which abstract away the physical hardware characteristics; this counter is therefore limited in its usefulness.

Disk Read Bytes/sec, Disk Write Bytes/sec
Measures total disk throughput. Ideally, larger block scans should be able to heavily utilize connection bandwidth. This counter represents the aggregate throughput at any given point in time.

SQL Server Buffer Manager Perfmon Object

The Buffer Manager counters are measured at the SQL Server instance level and are useful in characterizing a running SQL Server system to determine the ratio of scan-type activity to seek activity.

Checkpoint pages/sec
Measures the number of 8K database pages per second being written to database files during a checkpoint operation.

Page reads/sec, Readahead pages/sec
Page reads/sec measures the number of physical page reads being issued per second. Readahead pages/sec measures the number of physical page reads that are performed using the SQL Server read-ahead mechanism. Read-ahead operations are used by SQL Server for scan activity (which is common for data warehouse and decision support workloads). These can vary in size in any multiple of 8 KB, from 8 KB through 512 KB. This counter is a subset of Page reads/sec and can be useful in determining how much I/O is generated by scans as opposed to seeks in mixed workload environments.

(Source: Analyzing Characterizing and IO Size Considerations, Microsoft SQL Server Technical Article)
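Because Readahead pages/sec is described as a subset of Page reads/sec, the share of physical page reads caused by scans can be estimated as a simple ratio. The sketch below is illustrative only: the function name scan_share and the sample counter values are assumptions, and both inputs should be averages taken over the same measurement interval.

```python
def scan_share(page_reads_per_sec, readahead_pages_per_sec):
    """Percentage of physical page reads issued by read-ahead (scan) activity."""
    if page_reads_per_sec <= 0:
        raise ValueError("Page reads/sec must be positive")
    if readahead_pages_per_sec > page_reads_per_sec:
        # Read-ahead reads are a subset of all page reads, so this
        # suggests the two values were not sampled over the same interval.
        raise ValueError("Readahead pages/sec exceeds Page reads/sec")
    return readahead_pages_per_sec / page_reads_per_sec * 100


# Hypothetical averages taken from the two Buffer Manager counters.
share = scan_share(page_reads_per_sec=1000, readahead_pages_per_sec=250)
print("Scan-driven page reads: %.1f%%" % share)
```

A high share points towards scan-heavy (more sequential, larger I/O) behaviour, while a low share points towards seek-heavy activity.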