OldSQL vs. NoSQL vs. NewSQL on New OLTP



Similar documents
One-Size-Fits-All: A DBMS Idea Whose Time has Come and Gone. Michael Stonebraker December, 2008

What Does Big Data Mean and Who Will Win? Michael Stonebraker

Tackling The Challenges of Big Data Big Data Storage. Tackling The Challenges of Big Data Big Data Storage. History Lesson. Michael Stonebraker

Cassandra A Decentralized Structured Storage System

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE

TECHNICAL OVERVIEW HIGH PERFORMANCE, SCALE-OUT RDBMS FOR FAST DATA APPS RE- QUIRING REAL-TIME ANALYTICS WITH TRANSACTIONS.

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS

Actian Vector in Hadoop

SQL Server 2014 New Features/In- Memory Store. Juergen Thomas Microsoft Corporation

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

CISC 432/CMPE 432/CISC 832 Advanced Database Systems

Scalability and BMC Remedy Action Request System TECHNICAL WHITE PAPER

Why DBMSs Matter More than Ever in the Big Data Era

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

Cloud Computing with Microsoft Azure

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES

Mark Bennett. Search and the Virtual Machine

A survey of big data architectures for handling massive data

VoltDB Technical Overview

Infrastructures for big data

Testing Big data is one of the biggest

<Insert Picture Here> Adventures in Middleware Database Abuse

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Database Scalability and Oracle 12c

X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released

Top 10 reasons your ecommerce site will fail during peak periods

Cloud Computing at Google. Architecture

Data Modeling for Big Data

Cloud Computing Is In Your Future

these three NoSQL databases because I wanted to see a the two different sides of the CAP

Introduction to Apache Cassandra

Energy Efficient MapReduce

The Sierra Clustered Database Engine, the technology at the heart of

Big Data Use Cases. To Start Today. Paul Scholey Sales Director, EMEA. 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866)

Introduction to Apache Kafka And Real-Time ETL. for Oracle DBAs and Data Analysts

NoSQL for SQL Professionals William McKnight

STORAGE. Buying Guide: TARGET DATA DEDUPLICATION BACKUP SYSTEMS. inside

Maximizing SQL Server Virtualization Performance

Integrating Big Data into the Computing Curricula

Future-Proofing MySQL for the Worldwide Data Revolution

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

ScaleArc idb Solution for SQL Server Deployments

Getting Started Practical Input For Your Roadmap

The Future of Data Management

How graph databases started the multi-model revolution

Embracing the Cloud, Mobile, Social & Big Data

Preparing Your Data For Cloud

Top 10 Reasons why MySQL Experts Switch to SchoonerSQL - Solving the common problems users face with MySQL

Flash Use Cases Traditional Infrastructure vs Hyperscale

Storage in Database Systems. CMPSCI 445 Fall 2010

OLTP Meets Bigdata, Challenges, Options, and Future Saibabu Devabhaktuni

SCALABILITY AND AVAILABILITY

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success

Can the Elephants Handle the NoSQL Onslaught?

Big Data Integration: A Buyer's Guide

Production ready hadoop. By Deepak Rao Na,onal Head Datawarehousing Bajaj Finserv

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

CitusDB Architecture for Real-Time Big Data

GigaSpaces Real-Time Analytics for Big Data

Whitepaper: performance of SqlBulkCopy

Building Scalable Applications Using Microsoft Technologies

INTRODUCTION TO CASSANDRA

Performance Management in Big Data Applica6ons. Michael Kopp, Technology

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

How To Scale Out Of A Nosql Database

How To Handle Big Data With A Data Scientist

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Apache HBase. Crazy dances on the elephant back

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Cloud Big Data Architectures

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 5 - DBMS Architecture

How to Build a High-Performance Data Warehouse By David J. DeWitt, Ph.D.; Samuel Madden, Ph.D.; and Michael Stonebraker, Ph.D.

The Evolution of Cloud Storage - From "Disk Drive in the Sky" to "Storage Array in the Sky" Allen Samuels Co-founder & Chief Architect

Solution: B. Telecom

Enabling Real-Time Sharing and Synchronization over the WAN

IN-MEMORY DATABASE SYSTEMS. Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe

In-memory Tables Technology overview and solutions

Azure VM Performance Considerations Running SQL Server

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) /21/2013

Rackspace Cloud Databases and Container-based Virtualization

Transcription:

the NewSQL database you ll never outgrow OldSQL vs. NoSQL vs. NewSQL on New OLTP Michael Stonebraker, CTO VoltDB, Inc.

Old OLTP Remember how we used to buy airplane Hckets in the 1980s + By telephone + Through an intermediary (professional terminal operator) Commerce at the speed of the intermediary In 1985, 1,000 transachons per second was considered an incredible stretch goal!!!! + HPTS (1985) VoltDB 2

How has OLTP Changed in 25 Years? The internet + Client is no longer a professional terminal operator + Instead Aunt Martha is using the web herself + Sends volume through the roof VoltDB 3

How has OLTP Changed in 25 Years? PDAs and sensors + Your cell phone is a transachon originator + Everything is being geo- posihoned by sensors (marathon runners, your car,.) + Sends volume through the roof VoltDB 4

How has OLTP Changed in 25 Years? The definihons + Online no longer exclusively means a human operator The oncoming data tsunami is o_en device and system- generated + TransacHon now transcends the tradihonal business transachon High- throughput ACID write operahons are a new requirement + HA and durability are now core database requirements VoltDB 5

Examples Maintain the state of mulh- player internet games Real Hme ad placement Fraud/intrusion detechon Risk management on Wall Street VoltDB 6

New OLTP Challenges You need to ingest the firehose in real Hme You need to process, validate, enrich and respond in real- Hme New OLTP and You You o_en need real- Hme analyhcs VoltDB VoltDB 7 7

SoluHon Choices OldSQL + Legacy RDBMS vendors NoSQL + Give up SQL and ACID for performance NewSQL + Preserve SQL and ACID + Get performance from a new architecture VoltDB 8

OldSQL TradiHonal SQL vendors (the elephants ) + Code lines dahng from the 1980 s + bloatware + Not very good at anything Can be beaten by at least an order of magnitude in every verhcal market I know of + Mediocre performance on New OLTP At low velocity it doesn t maeer Otherwise you get to tear your hair out VoltDB 9

DBMS Landscape Other apps DBMS apps Data Warehouse OLTP VoltDB 10

DBMS Landscape Performance Needs high Other apps low high Data Warehouse high OLTP VoltDB 11

One Size Does Not Fit All - - Pictorially NoSQL Array DBMSs Elephants only get the crevices Open source Column stores Hadoop Low-overhead Main memory DBs VoltDB 12

Reality Check TPC- C CPU cycles On the Shore DBMS prototype Elephants should be similar VoltDB 13

The Elephants Are slow because they spend all of their Hme on overhead!!! + Not on useful work Would have to re- architect their legacy code to do beeer VoltDB 14

To Go a Lot Faster You Have to Focus on overhead + Beeer B- trees affects only 4% of the path length Get rid of ALL major sources of overhead + Main memory deployment gets rid of buffer pool Leaving other 75% of overhead intact i.e. win is 25% VoltDB 15

Long Term Elephant Outlook Up against The Innovators Dilemma + Steam shovel example + Disk drive example + See the book by Clayton Christenson for more details Long term dri_ into the sunset + The most likely scenario + Unless they can solve the dilemma VoltDB 16

NoSQL Give up SQL Give up ACID VoltDB 17

Give Up SQL? Compiler translates SQL at compile Hme into a sequence of low level operahons Similar to what the NoSQL products make you program in your applicahon 30 years of RDBMS experience + Hard to beat the compiler + High level languages are good (data independence, less code, ) + Stored procedures are good! One round trip from app to DBMS rather than one one round trip per record Move the code to the data, not the other way around VoltDB 18

Give Up ACID If you need data accuracy, giving up ACID is a decision to tear your hair out by doing database heavy li_ing in user code Can you guarantee you won t need ACID tomorrow? ACID = goodness, in spite of what these guys say VoltDB 19

Who Needs ACID? Funds transfer + Or anybody moving something from X to Y Anybody with integrity constraints + Back out if fails + Anybody for whom usually ships in 24 hours is not an acceptable outcome Anybody with a mulh- record state + E.g. move and shoot VoltDB 20

Who needs ACID in replicahon Anybody with non- commutahve updates + For example, + and * don t commute Anybody with integrity constraints + Can t sell the last item twice. Eventual consistency means creates garbage VoltDB 21

NoSQL Summary Appropriate for non- transachonal systems Appropriate for single record transachons that are commutahve Not a good fit for New OLTP Use the right tool for the job Interes5ng Two recently- proposed NoSQL language standards CQL and UnQL are amazingly similar to (you guessed it!) SQL VoltDB 22

NewSQL SQL ACID Performance and scalability through modern innovahve so_ware architecture VoltDB 23

NewSQL Needs something other than tradihonal record level locking (1 st big source of overhead) + Hmestamp order + MVCC + Your good idea goes here VoltDB 24

NewSQL Needs a soluhon to buffer pool overhead (2 nd big source of overhead) + Main memory (at least for data that is not cold) + Some other way to reduce buffer pool cost VoltDB 25

NewSQL Needs a soluhon to latching for shared data structures (3 rd big source of overhead) + Some innovahve use of B- trees + Single- threading + Your good idea goes here VoltDB 26

NewSQL Needs a soluhon to write- ahead logging (4th big source of overhead) + Obvious answer is built- in replicahon and failover + New OLTP views this as a requirement anyway Some details + On- line failover? + On- line failback? + LAN network parhhoning? + WAN network parhhoning? VoltDB 27

A NewSQL Example VoltDB Main- memory storage Single threaded, run Xacts to complehon + No locking + No latching Built- in HA and durability + No log (in the tradihonal sense) VoltDB 28

Yabut: What About MulHcore? For A K- core CPU, divide memory into K (non overlapping) buckets i.e. convert mulh- core to K single cores VoltDB 29

Where all the Hme goes revisited Before VoltDB VoltDB 30

Current VoltDB Status Runs a subset of SQL (which is getng larger) On VoltDB clusters (in memory on commodity gear) No WAN support yet + Working on it right now 50X a popular OldSQL DBMS on TPC- C 5-7X Cassandra on VoltDB K- V layer Scales to 384 cores (biggest iron we could get our hands on) Clearly note this is an open source system! VoltDB 31

Summary Old OLTP New OLTP OldSQL for New OLTP NoSQL for New OLTP NewSQL for New OLTP Too slow Does not scale Lacks consistency guarantees Low- level interface Fast, scalable and consistent Supports SQL VoltDB 32

the NewSQL database you ll never outgrow Thank You