RAMCloud: Scalable Datacenter Storage Entirely in DRAM. John Ousterhout Stanford University

Introduction
  New research project at Stanford
  Create large-scale storage systems entirely in DRAM
  Interesting combination: scale, low latency
  The future of datacenter storage
  Low latency disruptive to database community
October 27, 2009 RAMCloud/HPTS Slide 2

RAMCloud Overview
  Storage for datacenters
  10-10000 commodity servers
  ~64 GB DRAM/server
  All data always in RAM
  Durable and available
  High throughput: 1M ops/sec/server
  Low-latency access: 5-10µs RPC
  [Figure: application servers and storage servers inside a datacenter]

Example Configurations

              # servers   GB/server   Total capacity   Total server cost   $/GB
  Today       1000        64GB        64TB             $4M                 $60
  5-10 years  1000        1024GB      1PB              $4M                 $4
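The $/GB column follows directly from the other columns. A minimal sketch of the arithmetic, assuming decimal gigabytes; the function name is illustrative and not from the talk:

```python
def cost_per_gb(total_cost_usd, servers, gb_per_server):
    """Total server cost divided by total DRAM capacity in GB."""
    return total_cost_usd / (servers * gb_per_server)

# Figures taken from the slide: 1000 servers, $4M total server cost.
today = cost_per_gb(4_000_000, 1000, 64)      # the slide rounds this to $60
future = cost_per_gb(4_000_000, 1000, 1024)   # the slide rounds this to $4

print(f"today: ${today:.2f}/GB, in 5-10 years: ${future:.2f}/GB")
```

The slide's figures are rounded, so the computed values differ slightly from the table.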

RAMCloud Motivation
  Relational databases don't scale
  Every large-scale Web application has problems:
    Facebook: 4000 MySQL servers + 2000 memcached servers
  New forms of storage starting to appear:
    Bigtable
    Dynamo
    PNUTS
    H-store
    memcached

RAMCloud Motivation, cont'd
  Disk access rate not keeping up with capacity:

                                       Mid-1980s   2009       Change
    Disk capacity                      30 MB       500 GB     16667x
    Max. transfer rate                 2 MB/s      100 MB/s   50x
    Latency (seek & rotate)            20 ms       10 ms      2x
    Capacity/bandwidth (large blocks)  15 s        5000 s     333x
    Capacity/bandwidth (1KB blocks)    600 s       58 days    8333x
    Jim Gray's rule                    5 min       30 hrs     360x

  Disks must become more archival
  More information must move to memory
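The two capacity/bandwidth rows can be reproduced from the first three rows of the table. The sketch below assumes decimal units and, for the 1KB case, that each block costs one full seek-and-rotate latency with negligible transfer time. These assumptions are mine; they happen to match the slide's numbers.

```python
KB, MB, GB = 10**3, 10**6, 10**9

def sequential_scan_s(capacity_bytes, bandwidth_bytes_per_s):
    """Time to read the whole disk in large sequential blocks."""
    return capacity_bytes / bandwidth_bytes_per_s

def random_scan_s(capacity_bytes, block_bytes, latency_s):
    """Time to read the whole disk one small block at a time (latency-bound)."""
    return (capacity_bytes / block_bytes) * latency_s

old_large = sequential_scan_s(30 * MB, 2 * MB)      # mid-1980s disk
new_large = sequential_scan_s(500 * GB, 100 * MB)   # 2009 disk
old_small = random_scan_s(30 * MB, KB, 0.020)       # 20 ms per 1KB block
new_small = random_scan_s(500 * GB, KB, 0.010)      # 10 ms per 1KB block

print(f"large blocks: {old_large:.0f} s -> {new_large:.0f} s "
      f"({new_large / old_large:.0f}x)")
print(f"1KB blocks:   {old_small:.0f} s -> {new_small / 86400:.0f} days "
      f"({new_small / old_small:.0f}x)")
```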

Impact of Latency
  [Figure: Traditional application: UI, app. logic and data structures on a single machine, << 1µs latency. Web application: UI and bus. logic on application servers, data on storage servers in the datacenter, 0.5-10ms latency]
  Large-scale apps struggle with high latency
  RAMCloud goal: low latency and large scale
  Enable a new breed of information-intensive applications

Research Issues
  Achieving 5-10 µs RPC
  Durability at low latency
  Data model
  Concurrency/consistency model
  Data distribution, scaling
  Automated management
  Multi-tenancy
  Node architecture

Low Latency: SQL is Dead?
  Relational query model tied to high latency:
    Describe what you need up front
    DBMS optimizes retrieval
  With sufficiently low latency:
    Don't need optimization; make individual requests as needed
    Can't afford query processing overhead
  The relational query model will disappear
  Question: what systems offer very low latency and use relational model?

Low Latency: Stronger Consistency?
  Cost of consistency rises with transaction overlap: O ~ R*D
    O = # overlapping transactions
    R = arrival rate of new transactions
    D = duration of each transaction
  R increases with system scale
    Eventually, scale makes consistency unaffordable
  But, D decreases with lower latency
    Stronger consistency affordable at larger scale
  Is this phenomenon strong enough to matter?
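The relation O ~ R*D has the same form as Little's law. A small sketch shows how shortening D offsets growth in R. The arrival rate and the two durations are hypothetical values chosen for illustration, not measurements from the talk:

```python
def overlapping_transactions(arrival_rate_per_s, duration_s):
    """O ~ R * D: average number of transactions in flight at once."""
    return arrival_rate_per_s * duration_s

# Hypothetical workload: one million new transactions per second.
rate = 1_000_000

# Assumed durations: about 10 ms when a transaction waits on disk-class
# storage, about 10 µs when every access is a low-latency RPC.
disk_class = overlapping_transactions(rate, 10e-3)
ram_class = overlapping_transactions(rate, 10e-6)

print(f"overlap at 10 ms: {disk_class:.0f}, at 10 µs: {ram_class:.0f}")
```

Under these assumptions a 1000x shorter duration gives a 1000x smaller overlap at the same arrival rate. Whether the effect is large enough in practice is the open question the slide poses.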

Low Latency: One Size Fits All Again?
  "One-size-fits-all is dead" - Mike Stonebraker
  Specialized databases proliferating:
    50x performance improvements in specialized domains
    Optimize disk layout to eliminate seeks
  With low latency:
    Layout doesn't matter
    General-purpose is fast
    One-size-fits-all rides again?

Conclusions
  All online data is moving to RAM
  RAMClouds = the future of datacenter storage
  Low latency will change everything:
    New applications
    Stronger consistency at scale
    One-size-fits-all again
    SQL is dead
  1000-10000 clients accessing 100TB - 1PB @ 5-10µs latency

Questions/Comments?
  For more on RAMCloud motivation & research issues:
    The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM
    To appear in Operating Systems Review
    http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf
  Or, google RAMCloud

Backup Slides

Why Not a Caching Approach?
  Lost performance:
    1% misses -> 10x performance degradation
  Won't save much money:
    Already have to keep information in memory
    Example: Facebook caches ~75% of data size
  Changes disk management issues:
    Optimize for reads, vs. writes & recovery
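The "1% misses, 10x degradation" claim can be checked with a weighted average of hit and miss latency. The sketch assumes a 10 µs hit (the upper end of the RPC target stated elsewhere in the talk) and a 10 ms miss (the 2009 disk latency from the motivation slide). Pairing those two numbers is my assumption, not something the slide states.

```python
def mean_access_time_us(miss_ratio, hit_us, miss_us):
    """Expected access time when a fraction miss_ratio of requests go to disk."""
    return (1 - miss_ratio) * hit_us + miss_ratio * miss_us

HIT_US = 10        # assumed: in-memory access via a 10 µs RPC
MISS_US = 10_000   # assumed: 10 ms disk seek and rotate

all_in_ram = mean_access_time_us(0.0, HIT_US, MISS_US)
one_pct_miss = mean_access_time_us(0.01, HIT_US, MISS_US)
slowdown = one_pct_miss / all_in_ram

print(f"{all_in_ram:.1f} µs -> {one_pct_miss:.1f} µs ({slowdown:.1f}x slower)")
```

With these inputs the slowdown comes out at roughly 11x, in line with the slide's order-of-magnitude figure.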

Why not Flash Memory?
  Many candidate technologies besides DRAM
    Flash (NAND, NOR)
    PC RAM
  DRAM enables lowest latency:
    5-10x faster than flash
  Most RAMCloud techniques will apply to other technologies
  Ultimately, choose storage technology based on cost, performance, energy, not volatility

Is RAMCloud Capacity Sufficient?
  Facebook: 200 TB of (non-image) data today
  Amazon:
    Revenues/year: $16B
    Orders/year: 400M? ($40/order?)
    Bytes/order: 1000-10000?
    Order data/year: 0.4-4.0 TB?
    RAMCloud cost: $24K-240K?
  United Airlines:
    Total flights/day: 4000? (30,000 for all airlines in U.S.)
    Passenger flights/year: 200M?
    Bytes/passenger-flight: 1000-10000?
    Order data/year: 0.2-2.0 TB?
    RAMCloud cost: $13K-130K?
  Ready today for all online data; media soon
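The back-of-the-envelope estimates above can be reproduced as follows. The sketch assumes decimal units and the rounded $60/GB price from the example configurations; the function names are illustrative.

```python
def yearly_data_tb(records_per_year, bytes_per_record):
    """Data generated per year, in decimal terabytes."""
    return records_per_year * bytes_per_record / 1e12

def ramcloud_cost_usd(data_tb, usd_per_gb=60):
    """Cost of holding data_tb in DRAM at an assumed price per GB."""
    return data_tb * 1000 * usd_per_gb

# Amazon: $16B revenue at a guessed $40 per order.
orders = 16e9 / 40
amazon_tb = (yearly_data_tb(orders, 1000), yearly_data_tb(orders, 10000))
amazon_cost = tuple(ramcloud_cost_usd(tb) for tb in amazon_tb)

# United Airlines: a guessed 200M passenger flights per year.
united_tb = (yearly_data_tb(200e6, 1000), yearly_data_tb(200e6, 10000))
united_cost = tuple(ramcloud_cost_usd(tb) for tb in united_tb)

print("Amazon:", amazon_tb, "TB,", amazon_cost, "USD")
print("United:", united_tb, "TB,", united_cost, "USD")
```

At exactly $60/GB the United figure comes out as $12K-120K, slightly below the slide's $13K-130K. The slide presumably used the unrounded price of about $62.50/GB.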

Data Durability/Availability
  Data must be durable when write RPC returns
  Unattractive possibilities:
    Synchronous disk write (100-1000x too slow)
    Replicate in other memories (too expensive)
  One possibility: log to RAM, then disk
  [Figure: a write is logged to DRAM on several storage servers; each log goes to disk asynchronously, in batches]
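The "log to RAM, then disk" idea can be illustrated with a toy, single-process model. This is only a sketch under my own assumptions, not RAMCloud's design: the `Backup` and `Master` classes, their methods, and the Python list standing in for a disk are all hypothetical.

```python
class Backup:
    """Toy backup server: buffers log entries in RAM, flushes them in batches."""

    def __init__(self):
        self.ram_log = []    # entries received but not yet on disk
        self.disk_log = []   # stand-in for an on-disk log file

    def append(self, entry):
        # Fast path: the entry only has to reach this server's memory.
        self.ram_log.append(entry)

    def flush(self):
        # Slow path, run asynchronously in a real system: one batched write.
        flushed = len(self.ram_log)
        self.disk_log.extend(self.ram_log)
        self.ram_log.clear()
        return flushed


class Master:
    """Toy storage server: keeps data in RAM and logs each write to backups."""

    def __init__(self, backups):
        self.table = {}
        self.backups = backups

    def write(self, key, value):
        self.table[key] = value
        for backup in self.backups:
            backup.append((key, value))
        # Returns once the entry is in the RAM of every backup; no disk I/O
        # sits on the write path.


backups = [Backup(), Backup()]
master = Master(backups)
master.write("a", 1)
master.write("b", 2)
flushed_counts = [b.flush() for b in backups]
```

The sketch ignores the case the talk lists under other issues: entries still buffered in RAM are lost if power fails before the flush.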

Durability/Availability, cont'd
  Buffered logging supports ~50K writes/sec./server (vs. 1M reads)
  Need fast recovery after crashes:
    Read 64 GB from disk? 10 minutes
    Shard backup data across 100s of servers
    Reduce recovery time to 1-2 seconds
  Other issues:
    Power failures
    Cross-datacenter replication
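The recovery-time figures follow from disk bandwidth. The sketch assumes 100 MB/s per disk (the 2009 transfer rate quoted earlier in the talk), perfectly parallel reads, and no network or replay cost. The count of 500 backups is a hypothetical value within the slide's "100s of servers".

```python
def recovery_seconds(data_gb, disk_mb_per_s, num_backups=1):
    """Time to read a crashed server's data back, split evenly across backups."""
    return data_gb * 1000 / disk_mb_per_s / num_backups

single_disk = recovery_seconds(64, 100)                # one disk reads all 64 GB
sharded = recovery_seconds(64, 100, num_backups=500)   # 500 disks read in parallel

print(f"single disk: {single_disk / 60:.1f} min, sharded: {sharded:.2f} s")
```

Under these assumptions a single disk needs a little over 10 minutes, and a few hundred disks reading in parallel bring recovery into the 1-2 second range.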

Low-Latency RPCs
  Achieving 5-10µs will impact every layer of the system:
  Must reduce network latency:
    Typical today: 10-30 µs/switch, 5 switches each way
    Arista: 0.9 µs/switch: 9 µs roundtrip
    Need cut-through routing, congestion mgmt
  [Figure: client and server connected through a chain of five switches]
  Tailor OS on server side:
    Dedicated cores
    No interrupts?
    No virtual memory?
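The switch numbers above translate into roundtrip latency as sketched below. The calculation counts switch delay only; ignoring wire, NIC and software time is my simplification.

```python
def roundtrip_switch_latency_us(per_switch_us, switches_each_way=5):
    """Total switch delay for a request and its reply."""
    return 2 * switches_each_way * per_switch_us

typical_low = roundtrip_switch_latency_us(10)    # typical switch, best case
typical_high = roundtrip_switch_latency_us(30)   # typical switch, worst case
arista = roundtrip_switch_latency_us(0.9)        # matches the slide's 9 µs

print(f"typical: {typical_low}-{typical_high} µs, Arista: {arista:.0f} µs")
```

With typical 2009 switches, switching delay alone is 100-300 µs per roundtrip, far above the 5-10 µs target. This is why the slide calls for reducing network latency first.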

Low-Latency RPCs, cont'd
  Client side: need efficient path through VM
    User-level access to network interface?
  Network protocol stack
    TCP too slow (especially with packet loss)
    Must avoid copies
  Preliminary experiments:
    10-15 µs roundtrip
    Direct connection: no switches

Interesting Facets
  Use each system property to improve the others
  High server throughput:
    No replication for performance, only durability?
    Simplifies consistency issues
  Low-latency RPC:
    Cheap to reflect writes to backup servers
    Stronger consistency?
  1000s of servers:
    Sharded backups for fast recovery

New Conference!
  USENIX Conference on Web Application Development:
    All topics related to developing and deploying Web applications
    First conference: June 20-25, 2010, Boston
    Paper submission deadline: January 11, 2010
    http://www.usenix.org/events/webapps10/