FIOS: A Fair, Efficient Flash I/O Scheduler
Stan Park and Kai Shen
Presented by Jason Gevargizian
Flash Devices
- NAND Flash devices are used for storage, aka solid-state drives (SSDs)
- much higher I/O performance than mechanical disks
- this increased performance has the potential to alleviate I/O bottlenecks in data-intensive applications
I/O Schedulers & Flash
- conventional I/O schedulers were built for mechanical storage devices
- they fail to recognize the unique Flash characteristics: read-blocked-by-write, I/O anticipation, suppression of parallelism
- this results in poor fairness and performance on Flash devices
Write Limitations of Flash
- erase-before-write: write target locations must be erased before writing
- large erasure granularity: 64-256x the read/write page size
- consequently, read and write speeds are highly asymmetric: reads are one to two orders of magnitude faster
- on mechanical disks, read and write operations are fairly symmetric
Read-Blocked-by-Write
- write limitations lead to read interference on these devices
- when reads are blocked by writes, reads suffer a substantial slowdown
- scheduling reads and writes with equal preference creates unfairness
I/O Anticipation
- anticipation is the idling of a device to prepare for soon-arriving I/O requests
- necessary for mechanical devices with high seek times
- some anticipation is necessary for fairness, to avoid premature switching/advancing in quanta and fair-queuing approaches
- Flash devices have low access times and are hurt by liberal use of anticipation
I/O Parallelism
- Flash devices have built-in parallelism: multiple channels, often each with multiple parallel planes
- conventional fairness-oriented I/O schedulers suppress parallelism, since doing so simplifies accounting/allocation of the device to queues
- this does not maximize the benefit of Flash devices
FIOS Design - 4 Techniques
1. fair timeslice management with timeslice fragmentation and concurrent request issuance
2. read preference to combat read-blocked-by-write
3. concurrent issuance to utilize device-level parallelism
4. judicious use of I/O anticipation to maintain fairness at minimal idling cost
1) Fair Timeslice Management
- each task is allotted one full timeslice within an epoch
- timeslices can be fragmented: the remaining timeslice is recalculated after each I/O request completion
- an epoch ends when either (1) no task has a nonzero timeslice, or (2) no task with a positive timeslice has pending I/O requests
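The epoch/timeslice bookkeeping above can be sketched as follows. This is a minimal illustration, not FIOS's actual implementation; the names `Task`, `charge`, and `epoch_over` are hypothetical.

```python
class Task:
    """Hypothetical per-task state for FIOS-style timeslice accounting."""
    def __init__(self, name, timeslice):
        self.name = name
        self.remaining = timeslice   # remaining timeslice in this epoch (seconds)
        self.pending = []            # queued I/O requests

def charge(task, service_time):
    """Deduct measured device time from a task's timeslice after a
    request completes (timeslice fragmentation: charged per request,
    not consumed in one contiguous block)."""
    task.remaining = max(0.0, task.remaining - service_time)

def epoch_over(tasks):
    """An epoch ends when (1) no task has timeslice left, or
    (2) no task with a positive timeslice has pending requests."""
    if all(t.remaining <= 0 for t in tasks):
        return True
    return not any(t.remaining > 0 and t.pending for t in tasks)
```

A new epoch would then refill every task's timeslice, so a task that exhausted its allocation early cannot starve the others indefinitely.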
2) Read/Write Interference
- reads suffer dramatic write interference, so a read preference policy is used: all writes are blocked until all reads complete
- writes have a large service time, so the additional queuing time is small
- the epoch policy governs read preference (to ensure writes don't starve)
- read preference is optional (some devices, like the Vertex, do not have high asymmetry)
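The read preference policy can be sketched as a dispatch rule: reads always go first, and a write is issued only when no read is queued or in flight. A hedged illustration (the function `next_request` and its parameters are hypothetical, not from FIOS's code):

```python
def next_request(read_queue, write_queue, reads_in_flight):
    """Hypothetical read-preference dispatch.

    Reads are issued first; a pending write is held back while any
    read is queued or outstanding, avoiding read-blocked-by-write.
    Returns the next request to issue, or None to wait.
    """
    if read_queue:
        return read_queue.pop(0)
    if write_queue and reads_in_flight == 0:
        return write_queue.pop(0)
    return None
```

Starvation is prevented separately: a write-heavy task's unspent timeslice keeps its epoch alive, so its writes are eventually issued.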
3) I/O Parallelism
- many Flash drives have internal parallelism
- after issuing an I/O request, FIOS searches for additional requests to run in parallel
- I/O cost accounting: outstanding requests can be delayed slightly by parallel request issuance; tasks must not be billed for the additional time (spent on other tasks' requests)
3) I/O Parallelism (continued)
- I/O cost accounting: FIOS attempts not to bill tasks for this additional device time
- FIOS calibrates (once for a given device) read and write request times at several sizes, then uses interpolation for other sizes
- when calibration is unavailable, FIOS assumes that outstanding requests take equal time on the device
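The size-based cost estimation can be illustrated with simple linear interpolation between calibrated points. A sketch under assumptions (the function name and the table format are hypothetical; FIOS's actual interpolation details are not given on the slide):

```python
import bisect

def interpolate_cost(calibrated, size):
    """Estimate a request's service time from a calibration table.

    calibrated: dict mapping request size (bytes) -> measured time (s),
    built once per device. Sizes outside the calibrated range are
    clamped; sizes in between are linearly interpolated.
    """
    sizes = sorted(calibrated)
    if size <= sizes[0]:
        return calibrated[sizes[0]]
    if size >= sizes[-1]:
        return calibrated[sizes[-1]]
    hi = bisect.bisect_left(sizes, size)
    lo = hi - 1
    frac = (size - sizes[lo]) / (sizes[hi] - sizes[lo])
    return calibrated[sizes[lo]] + frac * (calibrated[sizes[hi]] - calibrated[sizes[lo]])
```

A task that shared the device with concurrent requests would then be billed this estimated cost rather than the (inflated) wall-clock completion time.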
4) I/O Anticipation
- anticipation (idling) is used for mechanical disks to combat long seek times
- Flash devices are fast and are hurt by I/O anticipation
- some anticipation is still necessary to ensure fairness...
4) I/O Anticipation (continued)
- FIOS uses anticipation minimally
- anticipation is employed to delay epoch ending for tasks that have remaining timeslice (but no pending requests)
- anticipation is also employed after a read completes, to mitigate read-blocked-by-write: FIOS briefly anticipates further reads instead of immediately issuing a write
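The two anticipation cases above can be condensed into a single predicate. This is a hypothetical sketch of the decision, with invented parameter names; FIOS's real logic also bounds the idle period:

```python
def should_anticipate(read_just_completed, writes_pending, more_reads_likely,
                      timeslice_left, queue_empty):
    """Hypothetical sketch of FIOS's two anticipation cases.

    Case 1: a read just completed, a write is pending, and the reading
    task is likely to issue another read soon -> idle briefly rather
    than issue the write (mitigates read-blocked-by-write).
    Case 2: a task still has timeslice but no queued requests -> delay
    the epoch end so its fair share is not forfeited.
    """
    if read_just_completed and writes_pending and more_reads_likely:
        return True
    if timeslice_left and queue_empty:
        return True
    return False
```

In both cases the idling is short and targeted, in contrast to the liberal anticipation of disk-oriented schedulers.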
Evaluation - Schedulers
FIOS is compared against three fairness-oriented I/O schedulers:
1. the Linux CFQ scheduler (Completely Fair Queuing)
2. SFQ (Start-time Fair Queuing with a concurrency depth)
3. a quanta-based I/O scheduler similar to the one employed in Argon
Evaluation - Flash Devices
Evaluation was performed on the following drives:
- Intel X25-M Flash-based SSD (2009)
- Mtron Pro 7500 Flash-based SSD (2008)
- OCZ Vertex 3 Flash-based SSD (2011)
- SanDisk CompactFlash drive on a 6-watt wimpy node
Evaluation - Workloads
- a set of synthetic benchmarks with varying I/O concurrency
- realistic applications: a SPECweb workload on Apache and a TPC-C workload on a MySQL database
- the FAWN Data Store workload on the CompactFlash drive in a low-power wimpy node
Synthetic Benchmarks
- 1-reader/1-writer: tasks continuously issue 4KB reads/writes, respectively
- 4-reader/4-writer: 4KB reads/writes
- 4-reader/4-writer with think time: exponentially distributed think time between I/O requests, such that total think time equals the I/O device time
- 4KB-reader and 128KB-reader
Synthetic Benchmarks (results)
- slowdown is the latency ratio relative to running alone
- raw device access, CFQ, and SFQ suffer read interference
- quanta is fairer, due to its aggressive per-task quantum
- quanta suffers in performance due to excessive anticipation and lack of parallelism
Synthetic Benchmarks (Vertex)
- on the Vertex, all schedulers achieve decent fairness, due to its low read-write asymmetry
Synthetic Benchmarks (continued)
- the 4KB-reader and 128KB-reader benchmark reveals a case where, even on the Vertex, only FIOS and quanta achieve fairness
Synthetic Benchmarks (efficiency)
- overall concurrent efficiency: the relative throughput of concurrent execution compared to running-alone throughput
- quanta suffers for its aggressive methods for fairness
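The two evaluation metrics can be made concrete. Note the concurrent-efficiency formula below is one plausible reading of the slide's parenthetical definition, not necessarily the paper's exact formula; both function names are hypothetical:

```python
def slowdown(concurrent_latency, alone_latency):
    """Per-task slowdown: latency under concurrent execution relative
    to running alone. 1.0 means no interference; higher is worse."""
    return concurrent_latency / alone_latency

def concurrent_efficiency(concurrent_tputs, alone_tputs):
    """Average per-task throughput under concurrent execution relative
    to running alone (an assumed formalization of the slide's metric).
    1.0 would mean concurrency costs nothing; higher values can occur
    when the device's internal parallelism is exploited."""
    ratios = [c / a for c, a in zip(concurrent_tputs, alone_tputs)]
    return sum(ratios) / len(ratios)
```

Under these definitions, quanta's serialized, anticipation-heavy execution yields good (uniform) slowdowns but poor concurrent efficiency.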
SPECweb and TPC-C
- read-only SPECweb99 workload running on an Apache 2.2.3 web server
- write-intensive TPC-C running on a MySQL 5.5.13 database
- each application is driven by a closed-loop load generator with 4 concurrent clients, each client issuing requests continuously
SPECweb and TPC-C (results)
- the write-intensive TPC-C experiences more slowdown on Flash
- again, quanta suffers from excessive anticipation
- FIOS performs the best on these benchmarks
Low-Power CompactFlash
- a low-power wimpy node like the ones used with FAWN
- the node contains an Alix board with a single-core 500MHz AMD Geode CPU and 256MB SDRAM
- the 16GB SanDisk CompactFlash drive does not have parallelism
- FAWN Data Store application workload: two tasks, one performing hash gets and the other performing hash puts
Low-Power CompactFlash (results)
- quanta and FIOS perform the most fairly
- quanta does not suffer, as it did elsewhere, from its lack of parallelization (since CompactFlash does not support parallelism)
Wrap Up
- Flash storage devices show potential to alleviate I/O bottlenecks
- conventional I/O schedulers do not account for the unique characteristics of Flash
- the authors propose FIOS to better handle I/O scheduling for Flash devices, with promising results
Discussion
Thank you. Questions?
To the class: much of the benefit of FIOS relies on the read-write speed discrepancy of Flash. Do you think FIOS and systems like it will be useful for very long?