Managing Storage Space in a Flash and Disk Hybrid Storage System

Managing Storage Space in a Flash and Disk Hybrid Storage System. Xiaojian Wu and A. L. Narasimha Reddy, Dept. of Electrical and Computer Engineering, Texas A&M University. IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS '09), 2009.

Outline Introduction Related Work Proposed Scheme Evaluation Conclusion

Introduction (1/4) Building a large flash-only storage system is too expensive, so flash and magnetic disks are employed together as a hybrid storage system. The two device types have different characteristics: writes to flash can take longer than on magnetic disk drives, while reads can finish faster; flash has a limit on the number of times a block can be written; and magnetic disks typically perform better with larger request sizes. Data placement, retrieval, scheduling, and buffer management algorithms need to be revisited for the hybrid storage system.

Introduction (2/4) The disk drive is more efficient for larger reads and writes!

Introduction (3/4) Requests experience different performance at different devices based on the request type (read or write) and the request size (small or large)

Introduction (4/4) Managing the space across the devices in a hybrid system should adapt to changing device characteristics. The issues are allocation, and data redistribution or migration. The paper proposes a measurement-driven approach to migration that addresses these issues: observe the access characteristics of individual blocks and consider migrating them individually.

Related Work HP's AutoRAID system considered data migration between a mirrored device and a RAID device: migrate hot data to faster devices and cold data to slower devices, improving the access times of hot data by keeping it local to the faster devices. When data sets are larger than the capacity of the faster devices in such systems, thrashing may occur.

Proposed Scheme (1/5) Pool the storage space across flash and disk drives and make it appear as a single larger device to the file system. Maintain an indirection map, containing logical-to-physical address mappings, so that blocks can be flexibly assigned to different devices. When data is migrated, the indirection map needs to be updated; to reduce this cost, migration is considered at a unit larger than a typical page size (chunks or blocks of 64KB or larger).
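The indirection map above can be sketched as a simple table from logical chunk numbers to (device, physical chunk) pairs; a minimal sketch, assuming illustrative names and the 64KB chunk size from the slide:

```python
# Hypothetical sketch of the indirection map: logical chunk numbers map to
# (device, physical chunk) pairs so blocks can be reassigned on migration.
# Class and field names are assumptions; only the 64KB unit is from the paper.
CHUNK_SIZE = 64 * 1024  # 64KB migration unit

class IndirectionMap:
    def __init__(self):
        self.table = {}  # logical chunk number -> (device id, physical chunk number)

    def lookup(self, logical_addr):
        # Translate a logical byte address into (device, physical byte address).
        chunk = logical_addr // CHUNK_SIZE
        device, phys_chunk = self.table[chunk]
        offset = logical_addr % CHUNK_SIZE
        return device, phys_chunk * CHUNK_SIZE + offset

    def migrate(self, chunk, new_device, new_phys_chunk):
        # Update the mapping after the chunk's data has been copied.
        self.table[chunk] = (new_device, new_phys_chunk)
```

Keeping the map at chunk granularity (rather than per page) is what keeps its size, and the cost of updating it on migration, small.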

Proposed Scheme (2/5) Keep track of the access behavior of a block by maintaining two counters, for read and write accesses: 2 bytes per chunk (64KB or larger) track read and write frequency separately, about 32KB of counters per 1GB of storage. A block is considered for migration or relocation only after receiving a minimum number of accesses, so that sufficient access history has been observed. Block access counters are initialized to zero on boot-up and after migration. Every time a request is served by a device, keep track of the request response time at that device, maintaining read and write performance separately as an exponential average: average response time = 0.99 * previous average + 0.01 * current sample, allowing longer-term trends to be reflected in the performance measure.
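The per-chunk counters and per-device exponential average above can be sketched as follows; the class and method names are assumptions, while the 0.99/0.01 weights and the one-byte-per-counter budget (2 bytes per chunk, ~32KB per 1GB) follow the slide:

```python
# Illustrative sketch of the bookkeeping described above.
class DeviceStats:
    """Exponential average of read/write response times per device."""
    def __init__(self):
        self.avg_read = 0.0
        self.avg_write = 0.0

    def record(self, is_read, response_time):
        # average = 0.99 * previous average + 0.01 * current sample
        if is_read:
            self.avg_read = 0.99 * self.avg_read + 0.01 * response_time
        else:
            self.avg_write = 0.99 * self.avg_write + 0.01 * response_time

class ChunkCounters:
    """One byte each for read and write counts: 2 bytes per 64KB chunk,
    i.e. about 32KB of counters per 1GB of storage."""
    def __init__(self):
        self.reads = 0
        self.writes = 0

    def record(self, is_read):
        # Saturating one-byte counters.
        if is_read:
            self.reads = min(self.reads + 1, 255)
        else:
            self.writes = min(self.writes + 1, 255)

    def reset(self):
        # Counters are zeroed on boot-up and after a migration.
        self.reads = 0
        self.writes = 0
```

The heavy 0.99 weight on the previous average is what makes the measure track long-term device behavior rather than individual noisy samples.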

Proposed Scheme (3/5)

Proposed Scheme (4/5) For each device i, keep track of the read (r_i) and write (w_i) response times. To determine whether to migrate, given block j's read/write access history through its access counters R_j and W_j, and the device response times, the current cost of accessing block j on its current device i is: C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j). Compare with the cost C_jk of a block with similar access patterns at another device k: if C_ji > (1+δ) * C_jk, consider this block a candidate for migration.
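The cost comparison above can be written directly; a hedged sketch, with illustrative function names, where each device is represented by its measured (read, write) response-time pair:

```python
# Sketch of the migration test: a block becomes a candidate when its current
# device is more than (1 + delta) times costlier than the alternative.
def access_cost(R_j, W_j, r_i, w_i):
    """C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j)."""
    return (R_j * r_i + W_j * w_i) / (R_j + W_j)

def migration_candidate(R_j, W_j, current_dev, other_dev, delta=1.0):
    """current_dev and other_dev are (read_time, write_time) pairs.
    delta defaults to 1, as in the evaluation."""
    c_cur = access_cost(R_j, W_j, *current_dev)
    c_other = access_cost(R_j, W_j, *other_dev)
    return c_cur > (1 + delta) * c_other
```

With δ = 1, a block moves only when the other device would serve its access mix at less than half the cost, which keeps borderline blocks from ping-ponging between devices.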

Proposed Scheme (5/5) Employ a token scheme to control the rate of migration. The potential cost of a block migrated from device i to device k is r_i + w_k; to reduce this cost, only blocks that are currently being read from or written to the device, as part of normal I/O activity, are considered. Strategy for choosing which block to migrate: maintain a cache of recently accessed blocks and, whenever a migration token is generated, migrate a block from this cached list so that the most active blocks benefit. Migration is carried out in blocks or chunks of 64KB or larger; a larger block size increases migration costs but reduces the size of the indirection map and can benefit from spatial locality or similarity of access patterns.
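The token scheme and recently-accessed cache above can be sketched together; all names, the cache size, and the token-generation policy are assumptions, since the slide does not specify them:

```python
# Illustrative sketch: migrations proceed only when a token is available,
# and candidates come from an LRU cache of recently accessed blocks.
from collections import OrderedDict

class MigrationController:
    def __init__(self, cache_size=128):
        self.tokens = 0
        self.cache_size = cache_size
        self.recent = OrderedDict()  # recently accessed candidate blocks, most recent last

    def note_access(self, block):
        # Remember the block as a candidate; evict the least recent on overflow.
        self.recent[block] = True
        self.recent.move_to_end(block)
        if len(self.recent) > self.cache_size:
            self.recent.popitem(last=False)

    def add_token(self):
        # Tokens are generated at a controlled rate to bound migration cost.
        self.tokens += 1

    def next_migration(self):
        # Migrate only when a token is available, picking the most recently
        # accessed candidate so the most active blocks benefit first.
        if self.tokens > 0 and self.recent:
            self.tokens -= 1
            block, _ = self.recent.popitem(last=True)
            return block
        return None
```

Tying migration to tokens decouples "which block deserves to move" from "how fast we are willing to move data", so foreground I/O is never starved by migration traffic.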

Evaluation (1/7) NFS server: Intel Pentium Dual Core 3.2 GHz processor, 1GB main memory. Magnetic disk: one 7200RPM, 250GB Samsung SATA disk (SP2504C). Flash disk drives: a 16GB Transcend SSD (TS16GSSD25S-S) and a 32GB MemoRight GT drive. Software: Fedora 9 with a 2.6.21 kernel, Ext2 file system. Three workloads: SPECsfs 3.0 file system workload (read/write ratio about 1:4); Postmark (typical access patterns in an email server); IOzone (creates controlled workloads at the storage system, varying the read/write ratio among 100%, 75%, 50%, 25%, and 0%).

Evaluation (2/7) Four policies: FLASH-ONLY; MAGNETIC-ONLY; STRIPING (data is striped on both flash and magnetic disk); STRIPING-MIGRATION (data is striped on and migrated across both disks). (Figure: throughput saturation points, including 434, 426, and 600.) (a) STRIPING-MIGRATION benefits from data redistribution, matching the read/write characteristics of blocks to device performance. (b) It succeeds in redistributing write-intensive blocks to the magnetic disk.

Evaluation (3/7) Using δ = 1 and a chunk size of 64KB in all the following experiments (a block is a migration candidate if C_ji > (1+δ) * C_jk).

Evaluation (4/7) 2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed. Flash drives compared: Transcend 16G (slower), MemoRight 32G (faster). (a) 2-HARDDISK STRIPING outperforms the Transcend-based hybrid drive in both saturation point and response time. (b) The MemoRight-based hybrid drive achieves a nearly 50% higher throughput saturation point.

Evaluation (5/7) Using IOzone to create workloads ranging from 100% writes to 75%, 50%, 25%, and 0% writes. 2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed. STRIPING: data is striped on both flash and magnetic disk (Transcend-based hybrid drive). STRIPING-MIGRATION: data is striped on and migrated across both disks. The read/write characteristics of the workload have a critical impact on the hybrid system.

Evaluation (6/7) File sizes range from 500 bytes to 10KB. Migration improves the transaction rate and read/write throughputs in both hybrid systems by about 10%. The Transcend-based hybrid system cannot compete with the 2-HDD system; the MemoRight-based hybrid system outperforms the 2-HDD system by roughly 10-17%.

Evaluation (7/7) Migration-1: considers only read/write characteristics. Migration-2: the request size is also considered; requests smaller than 64KB are placed based on the read/write request pattern, while requests larger than 64KB are allowed to exploit the gain from striping data across both devices. File sizes range from 500 bytes to 500KB. For the MemoRight-based hybrid, Migration-1 improves performance over striping by about 7%, and Migration-2 by about 20% on average. For the Transcend-based hybrid, the improvement from both migration policies is smaller, and it cannot match the performance of the 2-HDD system. This shows that both read/write and request-size patterns can be exploited to improve performance.
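The Migration-2 decision above can be sketched as a small placement rule; the function name is illustrative, the 64KB threshold is from the slide, and the small-request direction (reads toward flash, writes toward disk) reflects the device asymmetry described in the introduction:

```python
# Hedged sketch of the request-size-aware placement used by Migration-2.
CHUNK = 64 * 1024  # 64KB threshold from the slide

def place(request_size, reads, writes):
    """Return the preferred target for a block, given its request size
    and read/write access counts."""
    if request_size >= CHUNK:
        return "striped"  # large requests gain from striping across both devices
    # Small requests: match the read/write asymmetry of the devices
    # (flash reads finish faster; disk handles writes better).
    return "flash" if reads >= writes else "disk"
```

This captures why Migration-2 beats Migration-1 in the MemoRight results: large sequential requests keep the parallelism of striping, while small requests are steered to whichever device serves their dominant operation best.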

Conclusion The paper proposed a measurement-driven migration strategy for managing storage space in a hybrid system, exploiting the performance asymmetry of the devices. It extracts the read/write access patterns and request-size patterns of different blocks and matches them with the read/write advantages of the different devices. The results indicate that the proposed approach can improve the performance of the system significantly, by up to 50% in some cases.