Managing Storage Space in a Flash and Disk Hybrid Storage System
Xiaojian Wu and A. L. Narasimha Reddy, Dept. of Electrical and Computer Engineering, Texas A&M University
IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2009
Outline Introduction Related Work Proposed Scheme Evaluation Conclusion
Introduction (1/4)
Building a large flash-only storage system is too expensive
- employ both flash and magnetic disks in a hybrid storage system
Different characteristics
- writes to flash can take longer than to magnetic disk drives, while reads can finish faster
- flash has a limit on the number of times a block can be written
- magnetic disks typically perform better with larger requests
Data placement, retrieval, scheduling, and buffer management algorithms need to be revisited for the hybrid storage system
Introduction (2/4) The disk drive is more efficient for larger reads and writes!
Introduction (3/4) Requests experience different performance at different devices based on the request type (read or write) and the request size (small or large)
Introduction (4/4)
Managing the space across the devices in a hybrid system should be adaptable to changing device characteristics
- issues: allocation; data redistribution or migration
Proposing a measurement-driven approach to migration to address these issues
- observe the access characteristics of individual blocks and consider migrating individual blocks
Related Work
HP's AutoRAID system considered data migration between a mirrored device and a RAID device
- migrate hot data to faster devices and cold data to slower devices
- improves the access times of hot data by keeping it local to the faster devices
- when data sets are larger than the capacity of the faster devices in such systems, thrashing may occur
Proposed Scheme (1/5)
- pool the storage space across the flash and disk drives and make it appear as a single larger device to the file system
- maintain an indirection map, containing logical-to-physical address mappings, to allow blocks to be flexibly assigned to different devices
- when data is migrated, the indirection map needs to be updated
- to reduce this cost, consider migration at a unit larger than a typical page size (data in chunks or blocks of 64KB or larger)
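The chunk-granularity indirection map described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the names (IndirectionMap, lookup, remap) and the dictionary representation are our assumptions.

```python
CHUNK_SIZE = 64 * 1024  # 64KB chunk: the paper's unit of mapping and migration

class IndirectionMap:
    """Maps a logical chunk number to a (device, physical chunk) pair.

    Illustrative sketch: the paper only requires that logical-to-physical
    mappings exist and can be updated on migration.
    """
    def __init__(self):
        self.table = {}  # logical chunk -> (device_id, physical_chunk)

    def lookup(self, logical_addr):
        """Translate a logical byte address to (device, physical byte address)."""
        chunk = logical_addr // CHUNK_SIZE
        device, phys_chunk = self.table[chunk]
        offset = logical_addr % CHUNK_SIZE
        return device, phys_chunk * CHUNK_SIZE + offset

    def remap(self, logical_chunk, device, phys_chunk):
        """Update the mapping after a chunk has been migrated."""
        self.table[logical_chunk] = (device, phys_chunk)
```

Mapping whole 64KB chunks rather than 4KB pages keeps the map 16x smaller, which is the cost trade-off the slide mentions.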
Proposed Scheme (2/5)
- keep track of the access behavior of a block by maintaining two counters, for read and write accesses
  - 2 bytes per chunk (64KB or larger) track read/write frequency separately, about 32KB of counters per 1GB of storage
- a block can be considered for migration or relocation only after receiving a minimum number of accesses, so that sufficient access history is observed
- block access counters are initialized to zero on boot-up and after migration
- every time a request is served by a device, record the request response time at that device
  - maintain read and write performance separately
  - exponential average of device performance: average response time = 0.99 * previous average + 0.01 * current sample
  - allows longer-term trends to be reflected in the performance measure
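The per-device performance tracking above can be sketched as a small class; the class and method names are ours, but the 0.99/0.01 exponential average is taken directly from the slide.

```python
class DevicePerf:
    """Per-device read/write response-time averages (illustrative sketch)."""

    ALPHA = 0.01  # weight of the newest sample, per the slide's 0.99/0.01 rule

    def __init__(self):
        self.read_avg = 0.0   # exponential average of read response time
        self.write_avg = 0.0  # exponential average of write response time

    def record(self, is_read, response_time):
        # average = 0.99 * previous average + 0.01 * current sample
        if is_read:
            self.read_avg = (1 - self.ALPHA) * self.read_avg \
                            + self.ALPHA * response_time
        else:
            self.write_avg = (1 - self.ALPHA) * self.write_avg \
                             + self.ALPHA * response_time
```

With a weight of 0.01 on each new sample, a single outlier request barely moves the average, while a sustained shift in device behavior gradually dominates it, which is what "longer-term trends" means here.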
Proposed Scheme (3/5)
Proposed Scheme (4/5)
- For each device i, keep track of the read (r_i) and write (w_i) response times
- Determine whether to migrate
  - given block j's read/write access history through its access counters R_j and W_j, and the device response times
  - current cost of accessing block j on its current device i: C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j)
  - compare with the cost of a block with similar access patterns on another device k, C_jk
  - if C_ji > (1 + δ) * C_jk, consider this block a candidate for migration
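The cost formula and the migration test above can be written out directly. The formula and the (1 + δ) threshold are from the slides; the function names and the `min_accesses` default are illustrative (the paper requires a minimum access history before a block is considered, but the slides do not give its value).

```python
def access_cost(reads, writes, read_rt, write_rt):
    """C_ji = (R_j * r_i + W_j * w_i) / (R_j + W_j): the access-weighted
    average response time block j sees on a device with read/write
    response times (read_rt, write_rt)."""
    total = reads + writes
    if total == 0:
        return 0.0
    return (reads * read_rt + writes * write_rt) / total

def is_migration_candidate(reads, writes, here, there, delta=1.0,
                           min_accesses=8):
    """True if the block's cost on its current device exceeds
    (1 + delta) times its estimated cost on the other device.
    `here` and `there` are (read_rt, write_rt) pairs; min_accesses
    is a hypothetical history threshold."""
    if reads + writes < min_accesses:
        return False  # not enough access history yet
    c_here = access_cost(reads, writes, *here)
    c_there = access_cost(reads, writes, *there)
    return c_here > (1.0 + delta) * c_there
```

For example, a write-heavy block sitting on a flash device with slow writes produces a high C_ji, while the same access mix costed with the disk's response times is much lower, so the block becomes a migration candidate.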
Proposed Scheme (5/5)
- Employ a token scheme to control the rate of migration
  - potential cost of migrating a block from device i to device k: r_i + w_k
  - to reduce this cost, only consider blocks that are currently being read from or written to the device as part of normal I/O activity
- Strategy for choosing which block to migrate
  - maintain a cache of recently accessed blocks
  - whenever a migration token is generated, migrate a block from this cached list, so that the most active blocks benefit
- Migration is carried out in blocks or chunks of 64KB or larger
  - a larger block size increases migration costs, reduces the size of the indirection map, and can benefit from spatial locality or similarity of access patterns
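The token scheme plus recently-accessed-block cache could look like the sketch below. The cache size, token handling, and all names are illustrative assumptions; the slides only state that tokens limit the migration rate and that victims come from a cache of recently accessed blocks.

```python
import collections

class MigrationController:
    """Rate-limits migration with tokens and picks victims from a
    most-recently-used cache of block ids (illustrative sketch)."""

    def __init__(self, cache_size=128):
        self.tokens = 0
        self.cache_size = cache_size
        self.recent = collections.OrderedDict()  # MRU cache of block ids

    def on_access(self, block_id):
        # Track recently accessed blocks; only these are considered for
        # migration, so migration piggybacks on normal I/O activity.
        self.recent[block_id] = None
        self.recent.move_to_end(block_id)
        if len(self.recent) > self.cache_size:
            self.recent.popitem(last=False)  # evict least recently used

    def grant_token(self):
        # Called periodically to allow one more migration.
        self.tokens += 1

    def pick_victim(self, is_candidate):
        """Consume a token and return the most recently accessed block
        that `is_candidate` approves, or None."""
        if self.tokens == 0:
            return None
        for block_id in reversed(self.recent):  # newest first
            if is_candidate(block_id):
                self.tokens -= 1
                del self.recent[block_id]
                return block_id
        return None
```

Scanning newest-first implements the slide's goal of letting the most active blocks benefit from migration first.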
Evaluation (1/7)
NFS server
- Intel Pentium Dual Core 3.2 GHz processor, 1GB main memory
- magnetic disk: one 7200RPM 250GB SAMSUNG SATA disk (SP2504C)
- flash disk drives: a 16GB Transcend SSD (TS16GSSD25S-S) and a 32GB MemoRight GT drive
- Fedora 9 with a 2.6.21 kernel, Ext2 file system
3 workloads
- SPECsfs 3.0 file system workload, read/write ratio about 1:4
- Postmark: typical access patterns in an email server
- IOzone: creates controlled workloads at the storage system, varying the write ratio across 100%, 75%, 50%, 25%, and 0%
Evaluation (2/7)
4 policies
- FLASH-ONLY
- MAGNETIC-ONLY
- STRIPING: data is striped on both flash and magnetic disk
- STRIPING-MIGRATION: data is striped on and migrated across both disks
[Figure: throughput saturation points of 434, 426, and 600]
(a) benefits from data redistribution: matches the read/write characteristics of blocks to the device performance
(b) succeeds in redistributing write-intensive blocks to the magnetic disk
Evaluation (3/7)
- if C_ji > (1 + δ) * C_jk, consider this block a candidate for migration
- using δ = 1 and a chunk size of 64KB in all the following experiments
Evaluation (4/7)
2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed
Transcend 16G (slower), MemoRight 32G (faster)
(a) 2-harddisk striping outperforms the hybrid drive on both throughput saturation point and response time
(b) the hybrid drive achieves a nearly 50% higher throughput saturation point
Evaluation (5/7)
Using IOzone to create workloads with write ratios of 100%, 75%, 50%, 25%, and 0%
- 2-HARDDISK STRIPING: data is striped on two HDDs and no migration is employed
- STRIPING: data is striped on both flash and magnetic disk (Transcend-based hybrid drive)
- STRIPING-MIGRATION: data is striped on and migrated across both disks
The read/write characteristics of the workload have a critical impact on the hybrid system
Evaluation (6/7)
File sizes from 500 bytes to 10KB
- migration improves the transaction rate and read/write throughputs in both hybrid systems by about 10%
- the Transcend-based hybrid system cannot compete with the 2-HDD system
- the MemoRight-based hybrid system outperforms the 2-HDD system by about 10-17%
Evaluation (7/7)
- Migration-1: considers only read/write characteristics
- Migration-2: request size is also considered
  - if < 64KB, place the block based on its read/write request pattern
  - if > 64KB, allow the block to exploit the gain from striping data across both devices
- file sizes from 500 bytes to 500KB
For the MemoRight hybrid
- Migration-1 improves performance over striping by about 7%
- Migration-2 improves it by about 20% on average
For the Transcend hybrid
- the improvement from both migration policies is smaller
- it cannot match the performance of the 2-HDD system
This shows that both read/write and request size patterns can be exploited to improve performance
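The Migration-2 size test reduces to a one-line dispatch; the 64KB threshold is from the slide, while the function name and the way placement decisions are passed in are illustrative assumptions.

```python
SIZE_THRESHOLD = 64 * 1024  # 64KB request-size cutoff from the slide

def migration2_placement(request_size, pattern_choice, stripe_choice):
    """Migration-2 sketch: requests below 64KB go to the device favored
    by the block's read/write pattern (Migration-1's decision); larger
    requests stay striped across both devices to exploit parallelism."""
    if request_size < SIZE_THRESHOLD:
        return pattern_choice
    return stripe_choice
```

This is why Migration-2 beats Migration-1 on the 500KB-file workload: large sequential requests keep the bandwidth of both devices instead of being pinned to one.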
Conclusion
- proposed a measurement-driven migration strategy for managing storage space in a hybrid system to exploit the performance asymmetry
- extracts the read/write access patterns and request size patterns of different blocks and matches them with the read/write advantages of different devices
- the results indicate that the proposed approach can improve system performance significantly, up to 50% in some cases