Sherpa: Cloud Computing of the Third Kind


Raghu Ramakrishnan, Yahoo!, and Platform Engineering Team

What's in a Name?
  Data Intensive Super Scalable Computing
  Grid Computing
  Super Computing
  Cloud Computing
  Parallel Database Management Systems
  Distributed Database Management Systems
  Vary across: workload, programming model, ownership model, architectural trade-offs

Cloud Computing: Computing as a Service
  Packaged Software
  Cloud Computing
    CPU Intensive
      High-throughput (e.g., Condor)
    Data Intensive
      Transactional Storage & Serving (e.g., PNUTS, S3, SSDS, UDB)
      Analytic (e.g., SSDS, Hadoop)

Trivia Question
  What's the world's most widely used parallel programming language?

Why Not Use an RDBMS for Analytics?
  RDBMS provides too much
    ACID transactions
    Complex query language
    Lots and lots of knobs to turn
  RDBMS provides too little
    Lots of optimization and tuning possible for analytics
      E.g., column stores, bit-map indexes
    Flexible programming model
      E.g., Group By vs. Map-Reduce; multi-dimensional OLAP
  But many good ideas to borrow!
    Declarative language; parallelization and optimization techniques; value of data consistency

Why Not Use an RDBMS for OLTP?
  RDBMS provides too much
    ACID transactions
    Complex query language
    Lots and lots of knobs to turn
  RDBMS provides too little
    Lack of (cost-effective) scalability, availability
    Not enough schema/data type flexibility
  RDBMS and Sherpa aim for different parts of the space
    RDBMS: heavyweight, strongly consistent OLTP
    Sherpa: lightweight but massive-scale, relaxed-consistency OLTP

I want a big, virtual database
  "What I want is a robust, high performance virtual relational database that runs transparently over a cluster, nodes dropping in and out of service at will, read-write replication and data migration all done automatically. I want to be able to install a database on a server cloud and use it like it was all running on one machine."
    -- Greg Linden's blog, 2006
  We're building a hosted version of such a system

An Example Web App
  Heavy use of simple database operations
  Updates
    Uploads
    Tags as "flower"
  Queries
    Your photos
    Photos tagged as "flower"
    Friend activity (e.g., "Sonja uploaded", "Brandon tagged a photo")

Why Hosted?
  Simple API
  Rapid application development
  On-demand scaling
  DBA functions amortized across applications

Rapid Application Development
  What does it take to get the Next Great Thing off the ground?
  Now:
    Set up multiple replicas of a clustered data store
    Set up a system for indexing
    Set up a system for caching
    Set up auxiliary DBMS instances for reporting, etc.
    Set up the feeds and messaging between them
    Write the application logic
    Result: a fairly complex system before the first line of new code
  Our vision:
    Write the application logic
    Use a hosted infrastructure to store and query your data
  Or, as Joshua Schachter puts it: "The next cool thing shouldn't take a team of 30, it should be three guys, PHP and a long weekend"

Implications
  Data management as a service
    Scientists and others who've resisted (installing, maintaining, and) using DBMSs will find it much easier to reap the benefits
    Data centers and computing centers will come into vogue again
  The Web is becoming open
    E.g., OpenSocial, OpenID
    Hosted back-ends and RAD tools will make Web application development accessible to all
    Ideas will be the most valuable currency, not the wherewithal to build complex systems
  Paradigm shifts possible for how we do research in many fields:
    Build applications that embed your algorithms and test them directly in the field
    Computer scientists can interact directly with users (ironically, this would still be a breakthrough of sorts after four decades!)
    Many other disciplines (e.g., sociology, microeconomics) can design and conduct online experiments involving unprecedented numbers of participants

PNUTS: DB in the Cloud
  Parallel database
  Geographic replication
  Indexes and views
  Structured, flexible schema
  Hosted, managed infrastructure

  CREATE TABLE Parts (
    ID VARCHAR,
    StockNumber INT,
    Status VARCHAR
  )

  [Figure: the same table shown at three sites]
    A  42342  E
    B  42521  W
    C  66354  W
    D  12352  E
    E  75656  C
    F  15677  E

Sherpa Data Services
  Applications
  PNUTS Services
    Query planning and execution
    Index maintenance
  YCA: Authorization
  Distributed infrastructure for tabular data
    Data partitioning
    Update consistency
    Replication
  YDOT FS: Ordered tables
  YDHT FS: Hash tables
  YMB: Pub/sub messaging
  ZooKeeper: Consistency service

Guiding Principles for PNUTS
  Reliable and robust storage
    Replication for fault tolerance
    Predictable consistency guarantees
  Simple to use
    Simple operations set
    Minimal client configuration
    Service-level authentication
    Flexible schemas
  Highly scalable / performant
    Partitioning data over many machines
    Horizontal scaling at every level
    Data is local to its usage
    Predictable performance via quality of service levels
    Predicates evaluated on back end
    Cheaper consistency guarantees than full ACID
  Multiple rich access methods
    Hash and ordered table types
    System-maintained secondary indexes
    Optimization for complex access patterns
  Rapid provisioning of new storage
    Simple, automated cluster growth
    Cheap table creation
    Pay as you grow, grow big as you need
  Operationally cheap
    Automated failover
    Automated load balancing
    No single points of failure
    Hosted platform

Data Model and Retrieval
  YDOT/YDHT
    Data model: key-value dictionary
      Value can be packed with multiple attributes
    YDHT operations: hash table calls
      Get
      Set (insert and update)
      Remove
      Scan
    YDOT: YDHT + ordered ranges
  PNUTS
    Data model: relational tables with flexible schema
      Typed, declared attributes
      Fast addition of new attributes
    Operations: PNUTS query language
      Point lookup
      Range queries
      Insert/Update/Remove
      Complex predicates
      Ordering
      Top-K
  Primary API is web services (JSON over HTTP)
  Client libraries for various languages (PHP, C++, Java, ...)
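As a rough illustration of the two storage-layer interfaces listed on this slide, the Python sketch below models a YDHT-style table as a dictionary with get/set/remove/scan, and a YDOT-style table as the same calls plus an ordered range scan. The class names and the `range_scan` method are hypothetical; the slides do not show the actual YDHT/YDOT API or its signatures.

```python
import bisect


class HashTable:
    """Toy stand-in for a YDHT-style table: key -> packed value (hypothetical API)."""

    def __init__(self):
        self._rows = {}

    def get(self, key):
        return self._rows.get(key)

    def set(self, key, value):
        # Set covers both insert and update, as on the slide.
        self._rows[key] = value

    def remove(self, key):
        self._rows.pop(key, None)

    def scan(self):
        # Full scan with no key-order guarantee.
        return list(self._rows.items())


class OrderedTable(HashTable):
    """Toy stand-in for YDOT: the hash-table calls plus ordered ranges."""

    def range_scan(self, low, high):
        # Return rows with low <= key < high, in key order.
        keys = sorted(self._rows)
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_left(keys, high)
        return [(k, self._rows[k]) for k in keys[start:end]]
```

For instance, after setting Lime, Apple and Kiwi, `range_scan("A", "L")` returns the Apple and Kiwi rows in key order. Re-sorting on every call is only for brevity; a real ordered store would keep keys clustered on disk.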

YDHT
  Scalable distributed record store
  Optimized for small reads and writes
  Focus on ease of operations, multi-region redundancy, organic scalability
  Storage as a service
  [Figure labels: Clients, Tablet Controller, Routers, Storage servers]

Ways to use YDHT
  As a primary store
    [Figure labels: APP, YDHT]
  As a materialized view/cache
    [Figure labels: APP, YDHT, Primary store]
  As part of PNUTS!
    [Figure labels: APP, PNUTS, YDHT]

Data Concepts: YDHT
  [Figure labels: Table, Primary key, Record, Tablet, Fields]

  Primary key   Value
  Grape         Grapes are good to eat
  Lime          Limes are green
  Apple         Apple is wisdom
  Strawberry    Strawberry shortcake
  Orange        Arrgh! Don't get scurvy!
  Avocado       But at what price?
  Lemon         How much did you pay for this lemon?
  Tomato        Is this a vegetable?
  Banana        The perfect fruit
  Kiwi          New Zealand

Data Concepts: YDOT
  Ordered by primary key
  Tablets contain clustered ranges

  Primary key   Value
  Apple         Apple is wisdom
  Avocado       But at what price?
  Banana        The perfect fruit
  Grape         Grapes are good to eat
  Kiwi          New Zealand
  Lemon         How much did you pay for this lemon?
  Lime          Limes are green
  Orange        Arrgh! Don't get scurvy!
  Strawberry    Strawberry shortcake
  Tomato        Is this a vegetable?

YDOT Ordered Table Store
  YDOT provides clustered, ordered retrieval of records
  [Figure: the router maps key intervals to storage units and splits a range query at tablet boundaries]
  Router interval map:
    (start) to Cantaloupe     Storage unit 1
    Cantaloupe to Lime        Storage unit 3
    Lime to Strawberry        Storage unit 2
    Strawberry to (end)       Storage unit 1
  Storage unit 1: Apple, Avocado, Banana, Blueberry; Strawberry, Tomato, Watermelon
  Storage unit 2: Lime, Mango, Orange
  Storage unit 3: Cantaloupe, Grape, Kiwi, Lemon
  Range query "Grapefruit..Pear?" is split into "Grapefruit..Lime?" and "Lime..Pear?"
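The router lookup in this figure can be sketched as a binary search over sorted tablet boundaries. The boundary keys and owners below are read off the slide; the function names `route` and `route_range` are hypothetical, and the half-open interval convention (a boundary key belongs to the tablet that starts at it) is an assumption, chosen because the slide places Lime on storage unit 2 and Cantaloupe on storage unit 3.

```python
import bisect

# Tablet boundaries and owners as shown in the router's interval map.
# Interval i covers keys in [BOUNDARIES[i-1], BOUNDARIES[i]); the first and
# last intervals are open-ended.
BOUNDARIES = ["Cantaloupe", "Lime", "Strawberry"]
OWNERS = ["Storage unit 1", "Storage unit 3", "Storage unit 2", "Storage unit 1"]


def route(key):
    """Return the storage unit whose key interval contains `key`."""
    return OWNERS[bisect.bisect_right(BOUNDARIES, key)]


def route_range(low, high):
    """Split the range [low, high) at tablet boundaries (assumes low < high).

    Returns one (start, end, storage_unit) piece per tablet touched.
    """
    pieces = []
    start = low
    first = bisect.bisect_right(BOUNDARIES, low)
    for i in range(first, len(OWNERS)):
        end = BOUNDARIES[i] if i < len(BOUNDARIES) else None
        if end is None or end >= high:
            # The query ends inside this tablet.
            pieces.append((start, high, OWNERS[i]))
            break
        pieces.append((start, end, OWNERS[i]))
        start = end
    return pieces
```

With these boundaries, `route_range("Grapefruit", "Pear")` yields one piece for storage unit 3 (Grapefruit up to Lime) and one for storage unit 2 (Lime up to Pear), which is the split drawn on the slide.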

Data Concepts: PNUTS
  Schema: declared, typed fields
  Retains tablet structure of YDHT/YDOT

  Name          Description                             Price
  Apple         Apple is wisdom                         $1
  Avocado       But at what price?                      $3
  Banana        The perfect fruit                       $2
  Grape         Grapes are good to eat                  $12
  Kiwi          New Zealand                             $8
  Lemon         How much did you pay for this lemon?    $1
  Lime          Limes are green                         $9
  Orange        Arrgh! Don't get scurvy!                $2
  Strawberry    Strawberry shortcake                    $900
  Tomato        Is this a vegetable?                    $14

Flexible Schema
  Primary table

  Posted date   Listing id   Item    Price
  6/1/07        424252       Couch   $570
  6/1/07        763245       Bike    $86
  6/3/07        211242       Car     $1123
  6/5/07        421133       Lamp    $15

  Additional fields present on only some records: Color (Red), Condition (Good, Fair)

Asynchronous Replication

Mastering
  [Figure: three replicas of the same table, one of them labeled "Tablet master"]
    A  42342  E
    B  42521  W
    C  66354  W
    D  12352  E
    E  75656  C
    F  15677  E

Basic Consistency Model
  Goal: make it easier for applications to reason about updates and cope with asynchrony
    An alternative to transactions in an asynchronous world
  What happens to a record with primary key "Brian"?
    [Timeline figure, left to right along a Time axis]
    Generation 1: record inserted (v. 1), update (v. 2), update (v. 3), delete
    Generation 2: record inserted (v. 1), update (v. 2), update (v. 3), update (v. 4), delete
    Generation 3: record inserted (v. 1), delete
  Guarantees:
    Every reader will always see some consistent, but possibly stale, version
    Readers can request a more up-to-date version, but may pay extra latency
      Special case: critical read (writers/readers see their own writes)
    Writers can verify that the record is still at the version they expect
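The three guarantees on this slide can be made concrete with a toy model of one record that has a master copy and one asynchronously updated copy. This is only a sketch under stated assumptions: the method names (`read_any`, `read_critical`, `test_and_set_write`, `propagate`) are illustrative and not taken from the slides, replication is simulated by an explicit call, and generations (delete followed by re-insert) are left out.

```python
class Replica:
    """One copy of a single record; holds (version, value)."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Updates arrive in master order; ignore anything already applied.
        if version > self.version:
            self.version, self.value = version, value


class Record:
    """Toy model: a master replica plus one asynchronously updated replica."""

    def __init__(self):
        self.master = Replica()
        self.remote = Replica()
        self._pending = []  # updates not yet shipped to the remote replica

    def write(self, value):
        version = self.master.version + 1
        self.master.apply(version, value)
        self._pending.append((version, value))
        return version

    def test_and_set_write(self, expected_version, value):
        # Succeeds only if the record is still at the version the writer saw.
        if self.master.version != expected_version:
            return None
        return self.write(value)

    def propagate(self):
        # Stand-in for asynchronous replication catching up.
        for version, value in self._pending:
            self.remote.apply(version, value)
        self._pending.clear()

    def read_any(self):
        # Cheap local read: a consistent but possibly stale version.
        return self.remote.version, self.remote.value

    def read_critical(self, min_version):
        # Return a version at least as new as min_version; fall back to the
        # master (extra latency in a real system) if the local copy is behind.
        replica = self.remote if self.remote.version >= min_version else self.master
        return replica.version, replica.value
```

A writer that keeps the version returned by `write` can pass it to `read_critical` to see its own write even before replication has caught up, while `read_any` may still return the older version.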

Distribution
  Distribution for parallelism
  Data shuffling for load balancing
  [Figure: rows of the listings table spread across Server 1, Server 2, Server 3, Server 4]

  Posted date   Listing id   Item      Price
  6/1/07        424252       Couch     $570
  6/1/07        256623       Car       $1123
  6/2/07        636353       Bike      $86
  6/5/07        662113       Chair     $10
  6/7/07        121113       Lamp      $19
  6/9/07        887734       Bike      $56
  6/11/07       252111       Scooter   $18
  6/11/07       116458       Hammer    $8000

Tablet Splitting and Balancing
  Each storage unit has many tablets
  Storage unit may become a hotspot
  Overfull tablets split
  Tablets may grow over time
  Shed load by moving tablets to other servers
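A minimal sketch of the two mechanisms named on this slide, splitting an overfull tablet and shedding load by moving a tablet, might look as follows. The size threshold, the choice of the midpoint as the split point, and the "move the smallest tablet from the fullest unit to the emptiest" policy are all assumptions made for illustration; the slides do not say how the real system chooses them.

```python
def split_if_overfull(tablet, max_records):
    """Split a sorted list of keys in two once it exceeds max_records."""
    if len(tablet) <= max_records:
        return [tablet]
    mid = len(tablet) // 2
    return [tablet[:mid], tablet[mid:]]


def shed_load(units):
    """Move one tablet from the fullest storage unit to the emptiest one.

    `units` maps a storage-unit name to its list of tablets; load is
    approximated by total record count (an assumption of this sketch).
    """
    def load(name):
        return sum(len(t) for t in units[name])

    hot = max(units, key=load)
    cold = min(units, key=load)
    if hot != cold and len(units[hot]) > 1:
        # Moving the smallest tablet keeps the amount of copied data low.
        tablet = min(units[hot], key=len)
        units[hot].remove(tablet)
        units[cold].append(tablet)
    return units
```

Because each storage unit holds many tablets, moving one tablet is a small, incremental step, which is what makes this kind of balancing cheap compared with repartitioning a whole table.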

Architecture
  Data-path components
    Each can be scaled horizontally
  [Figure labels: Clients, WS API, Routers, Tablet controller (Tablet map, Load balancer, Server monitor), YMB, SU API, Storage units, Query processing, Cluster 1, Cluster 2]

Yahoo! Message Broker (YMB)
  Pub/sub based on reliable logging
    Topic-based
    Persistent subscriptions
    Multi-region presence
  Guarantees
    In the presence of at most one YMB machine failure:
      Published messages will be delivered on live subscriptions system-wide
      Messages published in one region will be delivered to all subscribers in the order they were published (partial order)
      Published messages are available for re-delivery until the subscriber calls consume()
    If there are two machine failures:
      Subscribers will be notified of a broken subscription, since messages may have been lost
  Uses in YDHT/PNUTS
    Reliably replicate data and updates between regions
    Reliably communicate coordination/synchronization messages between distributed actors
    Reliably log to-do actions for individual actors
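The delivery contract described on this slide (per-topic publish order, re-delivery until the subscriber consumes) can be sketched with an in-memory log and one cursor per subscription. Only the name `consume()` comes from the slide; its signature here, the `Broker` class, and the other method names are hypothetical, and the sketch ignores regions, persistence, and failures entirely.

```python
from collections import defaultdict


class Broker:
    """Toy topic-based broker with per-subscriber cursors (hypothetical API)."""

    def __init__(self):
        self._log = defaultdict(list)  # topic -> messages in publish order
        self._cursor = {}              # (topic, subscriber) -> next unconsumed index

    def subscribe(self, topic, subscriber):
        # A new subscription starts at the current end of the topic's log.
        self._cursor[(topic, subscriber)] = len(self._log[topic])

    def publish(self, topic, message):
        self._log[topic].append(message)

    def deliver(self, topic, subscriber):
        # Return every message not yet consumed; repeated calls re-deliver them.
        return self._log[topic][self._cursor[(topic, subscriber)]:]

    def consume(self, topic, subscriber, count):
        # Acknowledge the first `count` delivered messages so they are not re-sent.
        pending = len(self._log[topic]) - self._cursor[(topic, subscriber)]
        self._cursor[(topic, subscriber)] += min(count, pending)
```

Keeping a message until it is explicitly consumed is what lets a replica that crashes mid-update pick the update up again after restart, at the cost of requiring subscribers to tolerate seeing a message more than once.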

Quality of Service
  Hosted platform supporting multiple applications
    And eventually, multi-tenancy!
  Inter-application isolation
    Applications run on leased servers
      Performance is as good as those servers give you
      Unaffected by other applications
    Some shared infrastructure
      Overprovisioned to ensure performance agreements
  Intra-application isolation
    How to share my data without hurting my app's performance?
    Gold versus best-effort access
      Best-effort may be interrupted to serve gold requests
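One simple way to picture gold versus best-effort access is a two-level priority queue in which gold requests always come out first. This is an assumed mechanism for illustration only: the slides do not describe how the platform schedules requests, and the sketch does not model interrupting a best-effort request that is already running. The names `RequestQueue`, `GOLD`, and `BEST_EFFORT` are hypothetical.

```python
import heapq
import itertools

GOLD, BEST_EFFORT = 0, 1  # lower number is served first


class RequestQueue:
    """Serve all queued gold requests before any best-effort request."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a level

    def submit(self, request, level=BEST_EFFORT):
        heapq.heappush(self._heap, (level, next(self._order), request))

    def next_request(self):
        # Return the highest-priority queued request, or None if the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A strict two-level scheme like this can starve best-effort work under sustained gold load, which is consistent with the slide's wording that best-effort access carries no guarantee.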

BigTable
  BigTable overview
    Rows and columns abstraction with flexible schemas and data versioning, range scans
    Built on top of GFS
  Things BigTable emphasizes that we don't (for now, anyway)
    Keeping multiple versions
    Tight integration with MapReduce
  Things we emphasize that BigTable doesn't
    Asynchrony
    Geographic replication
    Indexing

Dynamo
  Dynamo overview
    Highly write-available data store
    Uses gossip and eventual consistency: can write anywhere; eventually the update will propagate to all replicas
  PNUTS versus Dynamo
    Dynamo is a hash table; PNUTS is both hashed and ordered
    Eventual consistency model exposes dirty data
      PNUTS can operate in high availability or high consistency mode
    Gossip is not tuned for geographic replication
    No record structure or indexes in Dynamo

Summary
  Hosted data management is a new frontier
    Beyond the issues we discussed, many novel aspects arise because of hosting (e.g., multi-tenancy)
    Paradigm shift that goes beyond the technology (e.g., new kinds of usage, new business models)
  Formulas for new research problems:
    Old research problem + fine-grained asynchrony
    Old research problem + hosted service model
  Formulas for solutions?
    None so far, but lots of good ideas in the old solutions!