CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives
FAST '11
Yongseok Oh <ysoh@uos.ac.kr>
University of Seoul
Mobile Embedded System Laboratory
Introduction
  The limited lifespan is the Achilles' heel of flash memory based SSDs
  Not-so-well-known reasons
    As bit density increases, flash memory chips become less reliable and less durable
    Traditional redundancy solutions are considered less effective for SSDs
    Some prior research has presented empirical and modeling-based studies on the lifespan of flash memories
  The lifespan of SSDs is a function of three factors
    The amount of incoming write traffic
    The size of over-provisioned flash space
    The efficiency of garbage collection and wear-leveling mechanisms
Data Duplication is Common
  Kernel developers can have multiple kernel sources
  Word processing tools often automatically save a copy of documents
Contributions
  Study of data duplication in file systems and various workloads for improving the endurance of SSDs
  Design of a content-aware FTL that extends the SSD lifespan by removing duplicate writes (up to 24.2%) and redundant data (up to 31.2%) with minimal overhead
  Acceleration methods for in-line deduplication in SSD devices
  Implementation of CAFTL in the DiskSim simulator
Technical Challenges
  Limited resources
    CAFTL is designed to run in an SSD device with limited memory space and computing power, rather than on a dedicated, powerful enterprise server.
  Relatively lower redundancy
    CAFTL mostly handles regular file system workloads, which have an impressive but much lower duplication rate than backup streams with high redundancy (often 10 times or even higher).
  Lack of semantic hints
    CAFTL works at the device level and sees only a sequence of logical blocks, without any semantic hints from host file systems.
  Low overhead requirement
    CAFTL must retain high data access performance for regular workloads; this is a less stringent requirement in backup systems, which can run during out-of-office hours.
The Design of CAFTL
  Three critical objectives
    Reducing unnecessary write traffic
    Extending available flash space
    Retaining access performance
Hashing and Fingerprint Store
  A fixed-size chunking approach (e.g., 4KB)
    The basic operation unit in flash is a page (mapping policy)
  SHA-1 (160-bit)
  Fingerprint store in memory
    Only 10-20% of the fingerprints are highly duplicated (skewed)
    Most fingerprints are unique
    A complete search in the fingerprint store would incur high lookup latencies
    We should store and search only the most likely-to-be-duplicated fingerprints in memory
  [Figure: 15 disks]
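The chunking and hashing step above can be sketched in a few lines of Python. This is only an illustration, not the CAFTL firmware: the function name `fingerprint_pages` and the zero-padding of a short final chunk are assumptions made here.

```python
import hashlib

PAGE_SIZE = 4096  # fixed chunk size from the slide (4KB, one flash page)

def fingerprint_pages(data: bytes, page_size: int = PAGE_SIZE) -> list:
    """Split a write buffer into fixed-size pages and return one SHA-1 digest per page."""
    fingerprints = []
    for offset in range(0, len(data), page_size):
        # Assumption: a short final chunk is zero-padded to a full page.
        page = data[offset:offset + page_size].ljust(page_size, b"\x00")
        fingerprints.append(hashlib.sha1(page).digest())  # 160-bit (20-byte) fingerprint
    return fingerprints

# Three pages, the first and third with identical content.
buf = b"A" * PAGE_SIZE + b"B" * PAGE_SIZE + b"A" * PAGE_SIZE
fps = fingerprint_pages(buf)
```

Pages with the same content yield the same fingerprint, which is what lets the FTL detect a duplicate write without comparing page contents byte by byte.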
Hashing and Fingerprint Store
  Entry format: {fingerprint, (location, reference)}
  A fingerprint f is mapped by the hash function f mod N to one of N segments (Seg 0 to Seg N-1); each segment holds buckets
  Highly duplicated fingerprints are kept in RAM
  Optimization techniques
    (1) Range check
    (2) Hotness-based reorganization
    (3) Bucket-level binary search
  If a fingerprint cannot be found in the fingerprint store, out-of-line scanning can still identify these duplicates later
  [Figure labels: Memory, Flash]
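A minimal sketch of such a store follows, assuming each segment is a list of small sorted buckets. The class name `FingerprintStore`, the tiny `bucket_capacity`, and the dictionary entry layout are hypothetical choices for illustration. Only the range check (1) and the bucket-level binary search (3) are modelled; hotness-based reorganization (2) is left out.

```python
import bisect
import hashlib

class FingerprintStore:
    """Toy fingerprint store: N segments, each a list of sorted buckets."""

    def __init__(self, num_segments=8, bucket_capacity=4):
        self.num_segments = num_segments
        self.bucket_capacity = bucket_capacity  # hypothetical; kept tiny for the demo
        # A bucket is a pair (sorted fingerprints, parallel list of entries).
        self.segments = [[([], [])] for _ in range(num_segments)]

    def _segment(self, fp):
        # Segment selection: f mod N, treating the fingerprint as an integer.
        return self.segments[int.from_bytes(fp, "big") % self.num_segments]

    def lookup(self, fp):
        for keys, entries in self._segment(fp):
            # (1) Range check: skip buckets whose [min, max] cannot contain fp.
            if not keys or fp < keys[0] or fp > keys[-1]:
                continue
            # (3) Binary search inside the sorted bucket.
            i = bisect.bisect_left(keys, fp)
            if i < len(keys) and keys[i] == fp:
                return entries[i]
        return None

    def insert(self, fp, location):
        entry = self.lookup(fp)
        if entry is not None:
            entry["reference"] += 1  # duplicate: bump the reference count only
            return entry
        buckets = self._segment(fp)
        if len(buckets[-1][0]) >= self.bucket_capacity:
            buckets.append(([], []))  # last bucket is full, start a new one
        keys, entries = buckets[-1]
        i = bisect.bisect_left(keys, fp)
        keys.insert(i, fp)
        entry = {"location": location, "reference": 1}
        entries.insert(i, entry)
        return entry

store = FingerprintStore()
fp_a = hashlib.sha1(b"page A").digest()
fp_b = hashlib.sha1(b"page B").digest()
store.insert(fp_a, location=100)
store.insert(fp_b, location=101)
store.insert(fp_a, location=102)  # same content again: no new entry is created
```

After these calls the entry for `fp_a` still points at location 100 and carries a reference count of 2, matching the {fingerprint, (location, reference)} layout on the slide.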
Indirect Mapping
  Two-level indirect mapping
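The slide gives only the title, so the following is a sketch under stated assumptions: a primary table maps a logical block address (LBA) either directly to a physical address (PBA) or, for shared pages, to a virtual address (VBA), and a secondary table maps each VBA to a PBA. The class and method names are hypothetical.

```python
class TwoLevelMap:
    """Toy two-level mapping: LBA -> PBA for unique pages, LBA -> VBA -> PBA for shared pages."""

    def __init__(self):
        self.primary = {}    # LBA -> ("PBA", addr) or ("VBA", addr)
        self.secondary = {}  # VBA -> PBA
        self._next_vba = 0

    def map_unique(self, lba, pba):
        self.primary[lba] = ("PBA", pba)

    def map_shared(self, lbas, pba):
        # All LBAs with the same content share one VBA.
        vba = self._next_vba
        self._next_vba += 1
        self.secondary[vba] = pba
        for lba in lbas:
            self.primary[lba] = ("VBA", vba)
        return vba

    def resolve(self, lba):
        kind, addr = self.primary[lba]
        return addr if kind == "PBA" else self.secondary[addr]

    def relocate_shared(self, vba, new_pba):
        # One update redirects every LBA that shares the page.
        self.secondary[vba] = new_pba

m = TwoLevelMap()
m.map_unique(0, 500)
vba = m.map_shared([1, 2, 3], 600)
m.relocate_shared(vba, 700)  # e.g. the page was moved during garbage collection
```

The point of the extra level is that moving a shared physical page requires a single table update rather than one update per logical block that references it.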
Acceleration Methods
  Fingerprinting is the key bottleneck of in-line deduplication in CAFTL
  Three acceleration methods have been proposed
    Sampling for hashing
    Light-weight pre-hashing
    Dynamic switches
Sampling for hashing
  We selectively pick only one page as a sample page for fingerprinting
  We use this sample fingerprint to query the fingerprint store
  We propose to use content-based sampling
    We select the first four bytes of a page, called sample bytes
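A rough sketch of content-based sampling for one multi-page write request is given below. The slide only states that the first four bytes serve as sample bytes; choosing the page with the largest 32-bit sample value, and skipping fingerprinting for the rest of the request when the sample page is not found, are assumptions of this sketch. The function names and the plain `set` standing in for the fingerprint store are hypothetical.

```python
import hashlib

def pick_sample_page(pages):
    """Return the index of the page whose first four bytes form the largest 32-bit value."""
    return max(range(len(pages)), key=lambda i: int.from_bytes(pages[i][:4], "big"))

def dedup_request(pages, known_fingerprints):
    """Return one duplicate/unique decision per page of a write request."""
    s = pick_sample_page(pages)
    sample_fp = hashlib.sha1(pages[s]).digest()
    if sample_fp not in known_fingerprints:
        # Assumption: treat the whole request as unique and skip further hashing.
        return [False] * len(pages)
    # Sample page is a duplicate, so fingerprint every page in the request.
    return [hashlib.sha1(p).digest() in known_fingerprints for p in pages]

pages = [
    b"\x00\x00\x00\x01" + b"x" * 8,
    b"\xff\x00\x00\x00" + b"y" * 8,  # largest sample bytes, so this is the sample page
    b"\x10\x00\x00\x00" + b"z" * 8,
]
known = {hashlib.sha1(pages[0]).digest(), hashlib.sha1(pages[1]).digest()}
decisions = dedup_request(pages, known)
```

Because the sample is chosen from the content itself, two requests carrying the same data pick the same sample page regardless of where the data is written.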
Light-weight pre-hashing
  Producing a 32-bit CRC32 hash value is over 10 times faster than computing a 160-bit SHA-1 hash value
  Reducing the hash strength would not incur a significant increase of false positives for a typical SSD capacity
  Light-weight pre-hashing
    Compute a CRC32 and query the fingerprint store
    If a match is found
      Generate the SHA-1 fingerprint and confirm it
    Otherwise, write the data to flash
  We have also considered using a Bloom filter
    Multiple hashings
    Summary vector cannot be updated when a fingerprint is removed
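The pre-hashing steps above can be sketched as follows. The class `PreHashDedup` is hypothetical, a Python list stands in for flash, and computing the SHA-1 of an already stored page lazily at confirmation time is an assumption of this sketch rather than something the slides specify.

```python
import hashlib
import zlib

class PreHashDedup:
    """Toy write path: cheap CRC32 filter first, SHA-1 only to confirm a CRC32 match."""

    def __init__(self):
        self.flash = []       # list index acts as the physical location
        self.digests = []     # cached SHA-1 per location, None until needed
        self.by_crc = {}      # CRC32 value -> locations whose page has that CRC32
        self.inline_sha1 = 0  # SHA-1 computations on incoming pages

    def _stored_digest(self, loc):
        # Assumption: fingerprints of stored pages are computed lazily.
        if self.digests[loc] is None:
            self.digests[loc] = hashlib.sha1(self.flash[loc]).digest()
        return self.digests[loc]

    def write(self, page):
        """Return (location, was_duplicate)."""
        crc = zlib.crc32(page)
        digest = None
        candidates = self.by_crc.get(crc, [])
        if candidates:
            # CRC32 matched: pay for SHA-1 and confirm the match.
            self.inline_sha1 += 1
            digest = hashlib.sha1(page).digest()
            for loc in candidates:
                if self._stored_digest(loc) == digest:
                    return loc, True
        # No CRC32 match, or a false positive: write the data to flash.
        loc = len(self.flash)
        self.flash.append(page)
        self.digests.append(digest)
        self.by_crc.setdefault(crc, []).append(loc)
        return loc, False

dedup = PreHashDedup()
first = dedup.write(b"A" * 4096)
second = dedup.write(b"B" * 4096)
sha1_after_unique_writes = dedup.inline_sha1
third = dedup.write(b"A" * 4096)  # same content as the first write
```

In this run the two unique writes never trigger SHA-1 on the incoming data; only the third write, whose CRC32 matches a stored page, pays for the full fingerprint.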
Dynamic switches
  Incoming requests may wait for available buffer space to be released by previous requests
  Dynamic switch
    Set a high watermark and a low watermark to turn in-line deduplication on and off
    If the percentage of occupied cache space reaches the high watermark (95%), we disable in-line deduplication to flush writes quickly to flash
    Once the low watermark (50%) is hit, we re-enable in-line deduplication
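The watermark rule is a simple hysteresis and can be sketched as below. The class name `DedupSwitch` and the `update(used, capacity)` interface are hypothetical; the 95% and 50% thresholds come from the slide.

```python
class DedupSwitch:
    """Hysteresis switch for in-line deduplication driven by write-buffer occupancy."""

    def __init__(self, high=0.95, low=0.50):
        self.high = high
        self.low = low
        self.enabled = True  # in-line deduplication starts enabled

    def update(self, used, capacity):
        occupancy = used / capacity
        if self.enabled and occupancy >= self.high:
            self.enabled = False  # buffer nearly full: flush writes without hashing
        elif not self.enabled and occupancy <= self.low:
            self.enabled = True   # buffer drained: resume in-line deduplication
        return self.enabled

switch = DedupSwitch()
# Buffer occupancy (out of 100 slots) sampled over time.
states = [switch.update(used, 100) for used in (40, 94, 95, 80, 51, 50, 70)]
```

Using two thresholds instead of one keeps the switch from flapping when occupancy hovers around a single cutoff: once disabled at 95%, deduplication stays off at 80% and 51% and only returns at 50%.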
Out-of-line Deduplication
  CAFTL does not pursue perfect in-line deduplication
  An internal routine is periodically launched to perform out-of-line fingerprinting and out-of-line deduplication while the device is idle
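As a rough illustration of what such an idle-time routine might do, the sketch below scans stored pages, fingerprints them, and remaps duplicates to a single copy. The dictionaries standing in for flash and for the mapping table, and the function name `out_of_line_dedup`, are assumptions; the slides do not describe the routine at this level of detail.

```python
import hashlib

def out_of_line_dedup(flash, mapping):
    """Merge duplicate pages in place.

    flash:   dict PBA -> page bytes (stand-in for flash contents)
    mapping: dict LBA -> PBA
    Returns the number of physical pages reclaimed.
    """
    first_seen = {}  # fingerprint -> PBA of the copy that is kept
    remap = {}       # duplicate PBA -> kept PBA
    for pba in sorted(flash):
        fp = hashlib.sha1(flash[pba]).digest()
        if fp in first_seen:
            remap[pba] = first_seen[fp]
        else:
            first_seen[fp] = pba
    # Redirect logical blocks that pointed at a duplicate copy.
    for lba, pba in mapping.items():
        if pba in remap:
            mapping[lba] = remap[pba]
    # Duplicate copies are now unreferenced and can be reclaimed.
    for pba in remap:
        del flash[pba]
    return len(remap)

flash = {0: b"A", 1: b"B", 2: b"A"}
mapping = {10: 0, 11: 1, 12: 2}
reclaimed = out_of_line_dedup(flash, mapping)
```

Running this in the background lets the write path skip fingerprinting under load while still recovering the space later.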
Performance Evaluation
  Experimental systems
    SSD simulator
      Microsoft Research SSD extension for DiskSim
      Write buffering
    SSD configurations
  Workloads and trace collection
    Desktop
    Hadoop (TPC-H)
    Transaction (TPC-C)
Effectiveness of Deduplication
  Removing duplicate writes
Effectiveness of Deduplication
  Extending flash space
Performance Impact
  Cache size
Performance Impact
  Hashing speed
Performance Impact
  Fingerprint searching
Acceleration Methods
  Sampling
    Because of the significantly reduced waiting time for the buffer
Acceleration Methods
  Light-weight pre-hashing
    This workload is write-intensive and has a long waiting queue, which makes the queuing effect particularly significant
Acceleration Methods
  Dynamic switch
Conclusions
  CAFTL can remove duplicate writes
  It enhances the lifespan of SSDs while retaining high data access performance
  If budget allows, we would suggest maintaining the fingerprint store fully in memory, which not only improves the deduplication rate but also simplifies the design
  Integrating PCM into SSDs to maintain the metadata, which can remove much design complexity
  Integrating on-line compression into SSDs