Skimpy-Stash: RAM Space Skimpy Key-Value Store on Flash-based Storage. YASHWANTH BODDU FQ9316

Outline: Introduction; Flash Memory Overview; KV Store Applications; Skimpy-Stash Design; Architectural Components; Hash Table Directory Design

Introduction Skimpy-Stash is a key-value store on flash designed to achieve high throughput and low latency. Its distinguishing feature is the design goal of an extremely low RAM footprint of about 1 (± 0.5) byte per key-value pair, which is more aggressive than earlier designs. It uses a hash table directory in RAM to index key-value pairs stored in a log-structured manner on flash.

Q. Our base design uses less than 1 byte in RAM per key-value pair and our enhanced design takes slightly more than 1 byte per key-value pair. In FAWN, even the pointer to a KV pair alone needs 4 bytes. How can Skimpy-Stash possibly achieve such a low memory cost for metadata? Skimpy-Stash moves most of the pointers that locate each key-value pair from RAM to the flash itself: it resolves hash table collisions using linear chaining and stores the linked lists on flash itself, with a pointer in each hash table bucket in RAM pointing to the beginning record of the chain on flash. The enhanced design adds two-choice based load balancing to reduce wide variations in bucket sizes, and a bloom filter in each hash table directory slot in RAM summarizing the records in the bucket.

Q. Skimpy-Stash uses a hash table directory in RAM to index key-value pairs stored in a log structure on flash. Why are key-value pairs on the flash organized as a log? The key benefit of the log-structured data organization on flash is the high write throughput that can be obtained, since all updates to the data and the metadata are written in sequential order at the tail of the log, avoiding expensive random writes.

Q. The average bucket size is the critical design parameter that serves as a powerful knob for making a continuum of tradeoffs between low RAM usage and low lookup latencies. Please explain this statement. The average bucket size is the average number of keys chained in one bucket, i.e., the length of the linked list on flash. If the bucket size is large, the chains are long, so a lookup must read more records on flash and latency increases; in return, fewer buckets (and hence fewer RAM pointers) are needed, so RAM usage per key drops. If the bucket size is small, the chains are short and lookups are fast, but many more buckets are needed and more RAM is used, which is the drawback of that setting. Tuning the average bucket size therefore trades RAM usage against lookup latency.
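As a rough back-of-the-envelope sketch of this knob (the 4-byte pointer per bucket and roughly 1 byte of bloom filter per key are assumed here for illustration, not taken from the paper's exact accounting), RAM per key falls as the average bucket size k grows while the expected flash reads per lookup grow with k:

```python
# Illustrative RAM-vs-latency tradeoff for the average bucket size k.
# Assumed (hypothetical) constants: a 4-byte flash pointer per bucket in the
# base design, plus about 1 byte of bloom filter per key in the enhanced design.
POINTER_BYTES = 4
BLOOM_BYTES_PER_KEY = 1

def ram_bytes_per_key(k, enhanced=False):
    """RAM spent per key-value pair when k keys share one HT directory bucket."""
    base = POINTER_BYTES / k               # one bucket pointer amortized over k keys
    return base + (BLOOM_BYTES_PER_KEY if enhanced else 0)

def flash_reads_per_lookup(k):
    """A chain of k records sits on flash; on average about half of it is read."""
    return k / 2

for k in (1, 4, 8, 16):
    print(k, ram_bytes_per_key(k), flash_reads_per_lookup(k))
```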

Flash Memory Overview

KV Store Applications Online multi-player gaming: There is a need to maintain server-side state so as to track player actions on each client machine and update global game state to make it visible to other players as quickly as possible. There is also the requirement to store server-side game state for (i) resuming the game from its interrupted state if and when crashes occur, (ii) offline analysis of game popularity, progression, and dynamics with the objective of improving the game, and (iii) verification of player actions for fairness when outcomes are associated with monetary rewards. Storage deduplication is another application.

Skimpy-Stash Design Coping with flash constraints: random writes; writes smaller than a flash page.

Design Goals Support low-latency, high-throughput operations. Use flash-aware data structures and algorithms: random writes and in-place updates are expensive on flash memory, hence they must be avoided and sequential writes used to the extent possible. Keep a low RAM footprint per key, independent of key-value pair size.

Architectural Components RAM write buffer; RAM hash table (HT) directory; flash store.

Q. The client [write] call returns only after the write buffer is flushed to flash. Why cannot such a call be acknowledged earlier? It cannot be acknowledged earlier because the data still resides only in the volatile RAM write buffer; if the machine crashed before the buffer was flushed to flash, an already acknowledged write would be lost.

Q. Basic functions: Store, Lookup, Delete. Explain how these basic functions are executed. Store: A key insert (or update) operation (set) writes the key-value pair into the RAM write buffer. When there are enough key-value pairs in the RAM write buffer to fill a flash page (or when a configurable timeout interval since the client call, say 1 msec, has expired), these entries are written to flash and indexed in the RAM HT directory. Lookup: It first looks up the RAM write buffer; upon a miss there, it looks up the hash table directory in RAM and searches the chained key-value pair records on flash in the respective bucket. Delete: A delete operation on a key is supported through the insertion of a null value for that key. Eventually the null entry and the earlier inserted values of the key on flash will be garbage collected.
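A minimal sketch of these three operations, assuming a page-sized RAM write buffer, an HT directory mapping each bucket to the flash offset of its chain head, and the flash log abstracted as an append-only list (names such as FLASH_PAGE_RECORDS and flush_page are hypothetical, chosen for illustration only):

```python
# Sketch of Store / Lookup / Delete against the three components:
# RAM write buffer, RAM HT directory, and an append-only flash log.
# All names and sizes are illustrative, not the paper's implementation.
FLASH_PAGE_RECORDS = 64               # pretend a flash page holds this many records
NUM_BUCKETS = 1 << 20

write_buffer = {}                      # RAM write buffer: key -> value
ht_directory = [None] * NUM_BUCKETS    # RAM: bucket -> flash offset of chain head
flash_log = []                         # flash log: (key, value, next_offset) records

def bucket_of(key):
    return hash(key) % NUM_BUCKETS

def store(key, value):
    write_buffer[key] = value
    if len(write_buffer) >= FLASH_PAGE_RECORDS:   # or on a timeout in the real design
        flush_page()

def flush_page():
    # Append the buffered records to the log; each new record points to the
    # previous chain head of its bucket, then the RAM pointer is updated.
    # Only after this flush would the real design acknowledge the client calls.
    for key, value in write_buffer.items():
        b = bucket_of(key)
        flash_log.append((key, value, ht_directory[b]))
        ht_directory[b] = len(flash_log) - 1
    write_buffer.clear()

def lookup(key):
    if key in write_buffer:                 # 1) RAM write buffer
        return write_buffer[key]
    offset = ht_directory[bucket_of(key)]   # 2) RAM HT directory
    while offset is not None:               # 3) follow the chain on flash
        k, v, nxt = flash_log[offset]
        if k == key:
            return v                        # None here means a deleted key
        offset = nxt
    return None

def delete(key):
    store(key, None)   # a null value marks deletion; GC reclaims old records later
```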

Q. The chain of records on flash pointed to by each slot comprises the bucket of records corresponding to this slot in the HT directory. Please use the figure to describe Skimpy-Stash's data structure. Also explain how lookup, insert, and delete operations are executed.

Hash Table Directory Design Base design: (i) resolve hash table collisions using linear chaining, where multiple keys that resolve (collide) to the same hash table bucket are chained in a linked list, and (ii) store the linked lists on flash itself, with a pointer in each hash table bucket in RAM pointing to the beginning record of the chain on flash. Each key-value pair record on flash contains, in addition to the key and value fields, a pointer to the next record (in chain order) on flash.
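A minimal sketch of what one such on-flash record might look like, assuming fixed-width fields and a 4-byte next-record pointer (the field widths and sentinel value are illustrative assumptions, not the paper's layout):

```python
import struct

# Hypothetical on-flash record layout for the base design:
# | key (16 B) | value (4 B) | offset of next record in the chain (4 B) |
RECORD_FMT = "<16s4sI"
NULL_OFFSET = 0xFFFFFFFF   # sentinel meaning "end of chain"

def pack_record(key: bytes, value: bytes, next_offset: int) -> bytes:
    return struct.pack(RECORD_FMT, key, value, next_offset)

def unpack_record(raw: bytes):
    key, value, next_offset = struct.unpack(RECORD_FMT, raw)
    return key, value, next_offset

# The RAM HT directory then needs only one small offset per bucket, pointing at
# one record of the chain; all the other links live in the records on flash.
```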

Enhanced Design Load balancing across buckets; a bloom filter per bucket.

Q. Because we store the chain of key-value pairs in each bucket on flash, we incur multiple flash reads upon lookup of a key in the store. Please explain how this issue can be alleviated. This can be done by periodically compacting a bucket's chain on flash: the valid keys in the chain are placed contiguously on one or more flash pages that are appended to the tail of the log.

Q. ...a two-choice based load balancing strategy is used to reduce variations in the number of keys assigned to each bucket. Explain how this is achieved. With the load-balanced design for the HT directory, each key is hashed to 2 candidate HT directory buckets using 2 hash functions h1 and h2, and is actually inserted into the one that currently has fewer elements. On a lookup, both candidate buckets must then be considered, which is why the enhanced design also keeps a bloom filter per bucket in RAM: the filters summarize the keys in each bucket so that, in most cases, only one chain has to be searched on flash.
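A minimal sketch of two-choice insertion and the resulting lookup path, assuming a per-bucket load counter and a simplistic stand-in for the per-bucket bloom filter (all names, hash choices, and sizes here are illustrative assumptions):

```python
import hashlib

NUM_BUCKETS = 1 << 16   # kept small for illustration

def h1(key): return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_BUCKETS
def h2(key): return int(hashlib.sha1(key.encode()).hexdigest(), 16) % NUM_BUCKETS

bucket_load = [0] * NUM_BUCKETS                       # keys currently in each bucket
bucket_bloom = [set() for _ in range(NUM_BUCKETS)]    # stand-in for a real bloom filter

def choose_bucket_for_insert(key):
    a, b = h1(key), h2(key)
    chosen = a if bucket_load[a] <= bucket_load[b] else b   # pick the emptier bucket
    bucket_load[chosen] += 1
    bucket_bloom[chosen].add(key)    # a real design adds the key to a bloom filter
    return chosen

def candidate_buckets_for_lookup(key):
    # Both candidates must be considered, but the RAM bloom filters usually rule
    # out the bucket that does not hold the key, so only one chain on flash is
    # actually searched (modulo bloom filter false positives in a real design).
    return [b for b in (h1(key), h2(key)) if key in bucket_bloom[b]]
```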

Q. When the last record in a bucket chain is encountered in the log during garbage collection, all valid records in that chain are compacted and relocated to the tail of the log. Please explain how garbage is collected. When a certain configurable fraction of garbage accumulates in the log (in terms of space occupied), the pages on flash at the head of the log are recycled: valid entries from the head of the log are written back to the end of the log, while invalid entries are skipped. This effectively leads to the design decision of garbage collecting entire bucket chains on flash at a time.
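A minimal sketch of this head-of-log recycling, reusing the illustrative flash_log, ht_directory, and bucket_of names from the earlier sketch; a record is treated as live only if following its key's chain from the RAM HT directory reaches that exact record (handling of stale delete markers and the rebuilding of chains at the tail are omitted):

```python
# Sketch of head-of-log garbage collection over (key, value, next_offset) records.
# These helpers are illustrative, not the paper's implementation.

def is_live(offset, flash_log, ht_directory, bucket_of):
    """A record is live if the chain walk for its key reaches this exact record."""
    key = flash_log[offset][0]
    cur = ht_directory[bucket_of(key)]
    while cur is not None:
        k, _, nxt = flash_log[cur]
        if k == key:
            return cur == offset        # the first hit in the chain is the live one
        cur = nxt
    return False

def recycle_head(flash_log, ht_directory, bucket_of, n_records):
    """Walk the oldest n_records: live entries would be re-appended at the tail of
    the log (with their whole bucket chains compacted there, as in SkimpyStash),
    while garbage entries are simply dropped."""
    return [flash_log[off] for off in range(n_records)
            if is_live(off, flash_log, ht_directory, bucket_of)]
```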

Thank You