Web Technologies: RAMCloud and Fiz. John Ousterhout Stanford University

The Web is Changing Everything
- Phase 1 (2000-2010): Discovering the potential
  - New applications
  - 100-1000x scale
  - New development style
  - New approach to deployment
- Phase 2 (2010-2020): Realizing the potential
  - New models of computation (EC2)
  - New storage (Bigtable, Dynamo)
  - New algorithms (MapReduce)
  - New languages
  - New frameworks
  - New approaches to software development
September 24, 2009, CS300: RAMCloud and Fiz

RAMCloud Introduction
- New research project at Stanford (Kozyrakis, Mazières, Mitra, Ousterhout, Parulkar, Prabhakar, Rosenblum)
- Create large-scale storage systems entirely in DRAM
- Interesting combination: scale, low latency
- The future of datacenter storage?
- Topics for this talk:
  - Overview of RAMCloud
  - Motivation
  - Research challenges

RAMCloud Overview
- Storage for datacenters
- 1000-10000 commodity servers
- 64 GB DRAM/server
- All data always in RAM
- Durable and available
- High throughput: 1M ops/sec/server
- Low-latency access: 5-10 µs RPC
[Figure: application servers and storage servers inside a datacenter]

Example Configurations

                     Today     5-10 years
# servers            1000      1000
GB/server            64 GB     1024 GB
Total capacity       64 TB     1 PB
Total server cost    $4M       $4M
$/GB                 $60       $4
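The $/GB row follows from dividing total server cost by total DRAM capacity. A minimal sketch of that arithmetic, assuming binary-free round numbers as on the slide (the helper name `dollars_per_gb` is illustrative, not from the talk):

```python
def dollars_per_gb(total_server_cost: float, servers: int, gb_per_server: int) -> float:
    """Cost per GB of DRAM for a cluster (illustrative helper, not from the talk)."""
    return total_server_cost / (servers * gb_per_server)


# Figures taken from the table above.
today = dollars_per_gb(4_000_000, 1000, 64)
future = dollars_per_gb(4_000_000, 1000, 1024)

print(f"today: ${today:.2f}/GB, in 5-10 years: ${future:.2f}/GB")
```

The slide rounds these results to $60 and $4.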

RAMCloud Motivation
- Relational databases don't scale
- Every large-scale Web application has problems:
  - Facebook: 4000 MySQL servers + 2000 memcached servers
- New forms of storage starting to appear:
  - Bigtable
  - Dynamo
  - PNUTS
  - H-store
  - memcached
- Many apps don't need all RDBMS features, can't afford them

RAMCloud Motivation, cont'd
Disk access rate not keeping up with capacity:

                                     Mid-1980's   2009       Change
Disk capacity                        30 MB        200 GB     6667x
Max. transfer rate                   2 MB/s       100 MB/s   50x
Latency (seek & rotate)              20 ms        10 ms      2x
Capacity/bandwidth (large blocks)    15 s         2000 s     133x
Capacity/bandwidth (200B blocks)     3000 s       115 days   3333x

- Disks must become more archival
- Can't afford small random accesses
- RAM cost today = disk cost 10 years ago
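The two capacity/bandwidth rows can be reproduced from the capacity, transfer rate, and latency rows. The sketch below assumes decimal units (1 MB = 10^6 bytes) and, for small blocks, that each 200-byte access costs one full seek-and-rotate latency with negligible transfer time. The function names are illustrative.

```python
def sequential_scan_seconds(capacity_bytes: float, transfer_bytes_per_s: float) -> float:
    """Time to read the whole disk in large sequential blocks."""
    return capacity_bytes / transfer_bytes_per_s


def random_scan_seconds(capacity_bytes: float, block_bytes: float, latency_s: float) -> float:
    """Time to read the whole disk in small random blocks.

    Assumption: every block costs one seek-and-rotate latency and the
    transfer time of a tiny block is ignored.
    """
    return (capacity_bytes / block_bytes) * latency_s


# Figures from the table: (capacity, max transfer rate, latency).
mid_1980s = (30e6, 2e6, 0.020)
year_2009 = (200e9, 100e6, 0.010)

for label, (cap, rate, lat) in (("mid-1980's", mid_1980s), ("2009", year_2009)):
    large = sequential_scan_seconds(cap, rate)
    small = random_scan_seconds(cap, 200, lat)
    print(f"{label}: large blocks {large:.0f} s, 200B blocks {small:.0f} s "
          f"({small / 86400:.1f} days)")
```

Under these assumptions a full pass over a 2009 disk in 200-byte random reads takes about 10^7 seconds, which is the slide's 115 days.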

Why Not a Caching Approach?
- Encourages bad habits:
  - "A few misses are OK." NOT!
  - 1% misses -> 10x performance degradation
- Changes disk layout issues:
  - Optimize for reads, vs. writes & recovery
- Won't save much money:
  - Already have to keep information in memory
  - Example: Facebook caches 75% of data
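The "1% misses, 10x degradation" claim is consistent with a simple weighted-average model of access time. The sketch below is an assumption, not the talk's own calculation: it takes a hit to cost 10 µs (the upper end of the RAMCloud RPC target) and a miss to cost 10 ms (the 2009 disk latency quoted earlier).

```python
def mean_access_us(miss_rate: float, hit_us: float, miss_us: float) -> float:
    """Average access time when a fraction `miss_rate` of requests go to disk."""
    return (1.0 - miss_rate) * hit_us + miss_rate * miss_us


HIT_US = 10.0        # assumed: in-memory access over a low-latency RPC
MISS_US = 10_000.0   # assumed: one disk seek and rotation (10 ms)

all_hits = mean_access_us(0.0, HIT_US, MISS_US)
one_percent = mean_access_us(0.01, HIT_US, MISS_US)

print(f"slowdown with 1% misses: {one_percent / all_hits:.1f}x")
```

Because a miss is about 1000 times slower than a hit, even a 1% miss rate dominates the average.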

Does Latency Matter?
- Yes! Latency historically undervalued
- Web applications becoming more data intensive (100's of storage requests per Web page)
- Low latency enables richer query models, stronger consistency
[Figure: traditional application (UI, app. logic, data structures) compared with Web application (UI, bus. logic on application servers; storage servers)]

Is RAMCloud Capacity Sufficient?
- Facebook: 200 TB of (non-image) data today
- Amazon:
  - Revenues/year: $16B
  - Orders/year: 400M? ($40/order?)
  - Bytes/order: 1000-10000?
  - Order data/year: 0.4-4.0 TB?
- United Airlines:
  - Total flights/day: 4000? (30,000 for all airlines in U.S.)
  - Passenger flights/year: 200M?
  - Data/passenger-flight: 1000-10000?
  - Order data/year: 0.2-2.0 TB?
- Ready today for all online data; media soon
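The Amazon and United Airlines estimates are back-of-the-envelope products of record count and record size. A small sketch of that arithmetic, using the slide's guessed inputs and decimal terabytes (the helper name is illustrative):

```python
def yearly_data_tb(records_per_year: int, bytes_per_record: int) -> float:
    """Data generated per year in decimal terabytes (10**12 bytes)."""
    return records_per_year * bytes_per_record / 1e12


# Inputs are the slide's own rough guesses, marked with "?" there.
estimates = {
    "Amazon orders": 400_000_000,
    "United passenger flights": 200_000_000,
}

for name, count in estimates.items():
    low = yearly_data_tb(count, 1_000)
    high = yearly_data_tb(count, 10_000)
    print(f"{name}: {low:.1f}-{high:.1f} TB/year")

# Sanity check on the order count: $16B revenue over 400M orders.
print(f"Amazon revenue per order: ${16e9 / 400e6:.0f}")
```

Both ranges are small next to the 64 TB total capacity of the example configuration given earlier.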

Data Durability/Availability
- Data must be durable when write RPC returns
- Non-starters:
  - Synchronous disk write (100-1000x too slow)
  - Replicate in other memories (too expensive)
- One possibility: buffered logging
[Figure: buffered logging. A write is logged to DRAM on several storage servers; each server writes its log to disk asynchronously, in batches]
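One way to read the buffered-logging figure is sketched below. This is a toy model under stated assumptions, not the RAMCloud implementation: the `Master` and `Backup` classes, their methods, and the in-process "disk" list are all hypothetical. The point it illustrates is that a write returns once the log entry sits in the DRAM buffers of other servers, while the disk write happens later in a batch.

```python
class Backup:
    """Hypothetical backup server: buffers log entries in DRAM, flushes in batches."""

    def __init__(self) -> None:
        self.buffer: list[tuple[str, str]] = []  # log entries not yet on disk
        self.disk: list[tuple[str, str]] = []    # stands in for the on-disk log

    def append(self, entry: tuple[str, str]) -> None:
        # Fast path: only touches memory.
        self.buffer.append(entry)

    def flush(self) -> None:
        # Asynchronous in a real system; here it is called explicitly.
        self.disk.extend(self.buffer)
        self.buffer.clear()


class Master:
    """Hypothetical storage server that holds the data in DRAM."""

    def __init__(self, backups: list[Backup]) -> None:
        self.table: dict[str, str] = {}
        self.backups = backups

    def write(self, key: str, value: str) -> bool:
        self.table[key] = value
        for backup in self.backups:
            backup.append((key, value))
        # The write completes without waiting for any disk.
        return True


backups = [Backup(), Backup()]
master = Master(backups)
master.write("user:42", "alice")

for backup in backups:
    backup.flush()  # later, one batched disk write per backup
```

The sketch omits what makes the approach safe in practice, such as surviving a power failure before the flush, which the talk lists as an open issue.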

Durability/Availability, cont'd
- Buffered logging supports ~50K writes/sec./server (vs. 1M reads)
- Need fast recovery after crashes:
  - Read 64 GB from disk? 10 minutes
  - Shard backup data across 100's of servers
  - Reduce recovery time to 1-2 seconds
- Other issues:
  - Power failures
  - Cross-datacenter replication
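The recovery figures follow from dividing the data to be read by the aggregate disk bandwidth. A minimal sketch, assuming the 100 MB/s transfer rate quoted earlier for 2009 disks, perfectly even sharding, and no network or log-replay cost (the function name and the choice of 500 servers are illustrative):

```python
def recovery_seconds(data_bytes: float, disk_bytes_per_s: float, servers: int) -> float:
    """Time to read a crashed server's data when it is spread evenly over `servers` disks.

    Assumption: disks read in parallel and nothing else is a bottleneck.
    """
    return data_bytes / (disk_bytes_per_s * servers)


DATA = 64e9          # 64 GB of DRAM per server
DISK_RATE = 100e6    # 100 MB/s per disk

single = recovery_seconds(DATA, DISK_RATE, 1)
sharded = recovery_seconds(DATA, DISK_RATE, 500)

print(f"one disk: {single / 60:.1f} minutes, 500 disks: {sharded:.2f} seconds")
```

A single disk needs roughly ten minutes, while a few hundred disks reading in parallel bring the read time into the 1-2 second range.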

Other RAMCloud Research Issues
- Low-latency RPCs
- Data model
- Concurrency/consistency model
- Data distribution, scaling
- Automated management
- Multi-tenancy
- Client-server functional distribution
- Node architecture

Status and Plans
- Project plan: build production-quality RAMCloud implementation
- Just beginning detailed design/implementation
- Current students:
  - Ryan Stutsman
  - Steve Rumble
  - Aravind Narayanan
- Possibly room for one first-year student
  - Must love building real software
  - See me if interested

RAMCloud Summary
- Many interesting research issues
- Exciting combination of scale and latency:
  - 50-500 TBytes
  - 10 microsecond access time
- Enable new forms of data-intensive applications

Fiz Overview
- The problem:
  - Too hard to develop interactive Web applications
  - Existing frameworks too low-level
- The solution:
  - Raise the level of programming: don't write HTML!
  - Create applications from high-level reusable components
- Fiz:
  - Framework for creating components for Web applications
  - Library of built-in components
  - Goal: create community around component set

Questions/Comments

Why not Flash Memory?
- Many candidate technologies besides DRAM
  - Flash (NAND, NOR)
  - PC RAM
- DRAM enables lowest latency:
  - 5-10x faster than flash
- Most RAMCloud techniques will apply to other technologies
- Ultimately, choose storage technology based on cost, performance, energy, not volatility

Low-Latency RPCs
Achieving 5-10 µs will impact every layer of the system:
- Must reduce network latency:
  - Typical today: 10-30 µs/switch, 5 switches each way
  - Arista: 0.9 µs/switch: 9 µs roundtrip
  - Need cut-through routing, congestion mgmt
- Tailor OS on server side:
  - Dedicated cores
  - No interrupts?
  - No virtual memory?
[Figure: client connected to server through five switches]
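The switch numbers above combine as per-switch latency times the number of switches crossed in both directions. A short sketch of that sum, counting switch delay only and ignoring wire, NIC, and software time (the function name is illustrative):

```python
def switch_roundtrip_us(per_switch_us: float, switches_each_way: int = 5) -> float:
    """Round-trip latency contributed by switches alone."""
    return per_switch_us * switches_each_way * 2


for label, per_switch in (("typical, low end", 10.0),
                          ("typical, high end", 30.0),
                          ("Arista", 0.9)):
    print(f"{label}: {switch_roundtrip_us(per_switch):.0f} µs roundtrip")
```

With typical 2009 switches the network alone costs 100-300 µs per round trip, far above the 5-10 µs target, which is why the slide singles out sub-microsecond switches.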

Low-Latency RPCs, cont'd
- Client side: need efficient path through VM
  - User-level access to network interface?
- Network protocol stack
  - TCP too slow (especially with packet loss)
  - Must avoid copies
- Preliminary experiments:
  - 10-15 µs roundtrip
  - Direct connection: no switches