NoSQL's biggest secret: SQL went nowhere
Matthew Revell, Director of Developer Advocacy, Couchbase





Meet my toaster

A toaster: redundancy built in; balanced input/output; commodity hardware

A cluster of toasters: redundancy built in; balanced input/output; commodity hardware; easily clustered; 100% NoSQL!

A brief history of data storage

This is data

But what does it mean?

This is data

1960: the first commercial database

The data model determines the query pattern

Hierarchical: the GOTO statement of databases

Hierarchical

Hierarchical (org chart): CEO; CTO; CFO; CMO; SVP HR; VP Engineering; Chief Architect; VP of Comms; VP of Demand Gen; Director of Product Marketing; Head of Media Relations; Head of Analyst Relations; Head of Event Marketing

Network: the programmer as navigator

Network

    Employee
        Name:   Matthew Revell
        Office: London
        Title:  Director of Developer Advocacy
        Next:   Owen Hughes
        Owner:  Arun Gupta
        Prior:  Adam Blackshaw

    Employee
        Name:   Owen Hughes
        Office: London
        Title:  Head of Pre-Sales
        Next:   Bindi Bhullar
        Owner:  Dipti Borkar
        Prior:  Matthew Revell
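A rough Python sketch of what "the programmer as navigator" means in practice: each record holds pointers to its neighbours, and a query is a loop that follows them. The `Employee` class and the `link` and `walk` helpers are hypothetical names for illustration, and only the Next/Prior link between the two records on the slide is modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Employee:
    name: str
    office: str
    title: str
    next: Optional["Employee"] = None   # next record in the same set
    prior: Optional["Employee"] = None  # previous record in the same set


def link(a: Employee, b: Employee) -> None:
    """Chain b after a, maintaining both pointers."""
    a.next = b
    b.prior = a


def walk(start: Employee) -> List[str]:
    """Navigate the chain by following next pointers; there is no query language."""
    names = []
    record = start
    while record is not None:
        names.append(record.name)
        record = record.next
    return names


matthew = Employee("Matthew Revell", "London", "Director of Developer Advocacy")
owen = Employee("Owen Hughes", "London", "Head of Pre-Sales")
link(matthew, owen)

print(walk(matthew))
```

Answering a different question, such as "who is in London?", would need a different traversal written by hand, which is the cost the slide title hints at.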

Relational (with SQL): a declarative data query language
Hierarchical; Network/CODASYL; Relational with SQL

Object oriented databases: why are we flattening everything?

2005-2010: NoSQL

Key value

    email: matthew@couchbase.com

    email: { personal: matthew@understated.co.uk, work: matthew@couchbase.com }

Document

    matthew@couchbase.com:
    {
        "city": "London",
        "glasses": true,
        "team": "Developer Advocacy",
        "music": "METAAAAAL!"
    }

    London: matthew@couchbase.com, james@couchbase.com, laura@couchbase.com, tom@couchbase.com, david@couchbase.com, greg@couchbase.com, nic@couchbase.com

    Developer Advocacy: matthew@couchbase.com, james@couchbase.com, laura@couchbase.com, laurent@couchbase.com, martin@couchbase.com, matt@couchbase.com, will@couchbase.com

Document

    London and Developer Advocacy: matthew@couchbase.com, james@couchbase.com, laura@couchbase.com
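The two document slides amount to keeping one list of keys per indexed value and intersecting the lists. A minimal Python sketch of that idea follows; the membership of each index is a best-effort reading of the slide's two columns, and `in_both` is a hypothetical helper name.

```python
# Each index maps one field value to the set of document keys that carry it.
# Contents are read from the slide as well as the transcript allows.
london = {
    "matthew@couchbase.com", "james@couchbase.com", "laura@couchbase.com",
    "tom@couchbase.com", "david@couchbase.com", "greg@couchbase.com",
    "nic@couchbase.com",
}
developer_advocacy = {
    "matthew@couchbase.com", "james@couchbase.com", "laura@couchbase.com",
    "laurent@couchbase.com", "martin@couchbase.com", "matt@couchbase.com",
    "will@couchbase.com",
}


def in_both(index_a, index_b):
    """Keys present in both indexes, sorted for stable output."""
    return sorted(index_a & index_b)


print(in_both(london, developer_advocacy))
```

The application has to build and maintain both sets itself, which is where the later "It's your problem" slide comes from.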

Column and graph

Context is all

There's always a trade-off: offload from some other data store (i.e. caching); computation offload; speed; scalability; availability; flexibility in what you store; query flexibility

Querying NoSQL

It's your problem (photo by Donarreiskoffer, CC-by-3.0)

Manual 2i (secondary indexes)

Map/Reduce
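As a reminder of the shape of a map/reduce query, here is a small in-memory Python sketch that counts staff per office. It is not any particular database's API: `map_reduce`, `map_office` and `reduce_count` are hypothetical names, and the third staff record is invented for the example.

```python
from collections import defaultdict

staff = [
    {"name": "Matthew Revell", "office": "London"},
    {"name": "Owen Hughes", "office": "London"},
    {"name": "Hypothetical Person", "office": "Amsterdam"},  # invented record
]


def map_office(doc):
    """Map step: emit (key, value) pairs for one document."""
    if "office" in doc:
        yield doc["office"], 1


def reduce_count(key, values):
    """Reduce step: collapse all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


print(map_reduce(staff, map_office, reduce_count))
```

Every new question needs a new pair of functions, which is the gap a declarative query language is meant to close.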

Declarative query

NoSQL declarative query: DBMS-specific; bold, new options; SQL derivatives

MongoDB query: index document contents and query natively

    db.staff.find({office: 'London'})
    db.staff.find({office: {$in: ['London', 'Amsterdam']}})
    db.staff.insert({name: 'Matthew Revell', office: 'London'})
    db.staff.update({name: 'Matthew Revell'}, {$set: {office: 'Amsterdam'}})
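To make the query-document idea concrete, here is a toy Python matcher covering only plain equality and `$in`. It is an illustration of the matching logic, not MongoDB's implementation or the pymongo API; `matches` and `find` are hypothetical helpers and the Amsterdam record is invented.

```python
def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict) and "$in" in condition:
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


staff = [
    {"name": "Matthew Revell", "office": "London"},
    {"name": "Hypothetical Person", "office": "Amsterdam"},  # invented record
]

print(find(staff, {"office": "London"}))
print(find(staff, {"office": {"$in": ["london", "Amsterdam"]}}))
```

String comparison here is case-sensitive, as it is in MongoDB unless a collation says otherwise, so the lower-case 'london' in the second query matches nothing.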

JSONiq: XQuery for JSON; declarative language for JSON; functional, composable, set-based

JSONiq

    for $p in collection('staff')
    where $p.serviceyears gt 2
    let $name := $p.firstname || " " || $p.lastname
    group by $p.office
    order by $p.serviceyears
    return { $name, $p.office, $p.serviceyears }

JSONiq

    for $captain in collection("captains"),
        $movie in collection("movies")[ try { $$.captain eq $captain.name } catch * { false } ]
    return { "captain" : $captain.name, "movie" : $movie.name }

Why SQL? (image by Per Erik Strandberg, Creative Commons Attribution-Share Alike 2.5 Generic)

Cassandra's CQL

Cassandra's CQL: really looks like SQL; schema is back; no JOINs, no GROUP BY

Cassandra's CQL

    CREATE TABLE authors (
        name text,
        year int,
        title text,
        isbn text,
        publisher text,
        PRIMARY KEY (name, year, title)
    ) WITH CLUSTERING ORDER BY (year DESC);

Source: http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/

Cassandra's CQL

    INSERT INTO authors (name, year, title, isbn, publisher)
        VALUES ('Tom Clancy', 1987, 'Patriot Games', '0-399-13241-4', 'Putnam');
    INSERT INTO authors (name, year, title, isbn, publisher)
        VALUES ('Tom Clancy', 1993, 'Without Remorse', '0-399-13825-0', 'Putnam');

Source: http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/

Cassandra's CQL

     name       | year | title           | isbn          | publisher
    ------------+------+-----------------+---------------+-----------
     Tom Clancy | 1993 | Without Remorse | 0-399-13825-0 | Putnam
     Tom Clancy | 1987 | Patriot Games   | 0-399-13241-4 | Putnam

    RowKey: Tom Clancy
    => (name=1993:Without Remorse:ISBN, value=0-399-13825-0)
    => (name=1993:Without Remorse:publisher, value=Putnam)
    => (name=1987:Patriot Games:ISBN, value=0-399-13241-4)
    => (name=1987:Patriot Games:publisher, value=Putnam)

Source: http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/

Cassandra's CQL

    SELECT * FROM authors WHERE name = 'Tom Clancy' AND year >= 1993;

Source: http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/

Cassandra's CQL

    RowKey: Tom Clancy
    => (name=1996:Executive Orders:publisher, value=Putnam)
    => (name=1996:Executive Orders:ISBN, value=0-399-13825-0)
    => (name=1994:Debt of Honor:publisher, value=Putnam)
    => (name=1994:Debt of Honor:ISBN, value=0-399-13826-1)
    => (name=1993:Without Remorse:publisher, value=Putnam)
    => (name=1993:Without Remorse:ISBN, value=0-399-13825-0)
    => (name=1991:The Sum of All Fears:publisher, value=Putnam)
    => (name=1991:The Sum of All Fears:ISBN, value=0-399-13241-6)
    ...
    => (name=1987:Patriot Games:publisher, value=Putnam)
    => (name=1987:Patriot Games:ISBN, value=0-399-13241-4)

Source: http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/
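One way to picture why the `year >= 1993` query is cheap: within a partition the cells are already stored in clustering order (year descending), so the answer is a contiguous slice from the front. The Python sketch below models that with a sorted list and `bisect`. It is an illustration of the idea, not Cassandra's storage engine; `select_from_year` is a hypothetical helper, and only the titles visible on the slide are included.

```python
import bisect

# One partition per author; rows kept in clustering order (year DESC).
partitions = {
    "Tom Clancy": [
        (1996, "Executive Orders"),
        (1994, "Debt of Honor"),
        (1993, "Without Remorse"),
        (1991, "The Sum of All Fears"),
        (1987, "Patriot Games"),
    ]
}


def select_from_year(name, min_year):
    """Rows for one partition key with year >= min_year, as a single slice."""
    rows = partitions.get(name, [])
    # bisect needs ascending keys, so negate the descending years.
    keys = [-year for year, _title in rows]
    end = bisect.bisect_right(keys, -min_year)
    return rows[:end]


print(select_from_year("Tom Clancy", 1993))
```

A filter on a column that is not part of the key, such as publisher, has no such slice to read, which is one reason CQL is not an ad-hoc query language.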

Cassandra's CQL: not an ad-hoc query language

This story is mostly about JSON

SQL++

SQL++: non-relational data is semi-structured; non-relational data is heterogeneous; JSON in, JSON out!

What must happen to SQL to make it JSON friendly? Data is nested; sometimes data is missing; data is likely to be found in more than one place; JOINs need thinking about

SQL++: superset of SQL for semi-structured data; handles missing data gracefully and/or explicitly; can query inside nested data; nests and unnests data; JOINs between documents

N1QL: SQL++ in practice

Couchbase Server 4.0: high availability cache; key-value store; document database; N1QL, a SQL-like query language for JSON

A profile

    {
        "email": "matthew@couchbase.com",
        "office": "London",
        "title": "Director of Developer Advocacy",
        "team": "Developer Advocacy",
        "manager": "Arun Gupta",
        "start-date": "2014-01-06",
        "meet-up-groups": ["London", "Dublin", "Manchester"],
        "conferences": [
            {
                "name": "OSCON Europe",
                "location": "Amsterdam",
                "roles": ["booth", "speaker"],
                "start-date": "2015-10-26",
                "end-date": "2015-10-28"
            },
            {
                "name": "Topconf",
                "location": "Tallinn",
                "roles": "speaker",
                "start-date": "2015-11-17",
                "end-date": "2015-11-18"
            },
            {
                "name": "Big Data Strategy",
                "location": "Vilnius",
                "roles": "speaker",
                "start-date": "2015-10-05",
                "end-date": "2015-10-05"
            }
        ]
    }

N1QL: implements much of SQL++; dives into arrays and objects; NESTs data from JOINs; UNNESTs data; gracefully handles MISSING data

SELECT

    SELECT email FROM `default` WHERE office = "London";

ARRAY ELEMENTS

    SELECT conferences[0].name AS event_name FROM `default`;

REMOVE MISSING ITEMS

    SELECT DISTINCT conferences[0].name AS event_name
    FROM `default`
    WHERE conferences IS NOT MISSING;
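The distinction that matters here is between a field that is absent and a field that is present but null. A small Python sketch of that distinction, using a sentinel object, is below. It illustrates the idea only and does not reproduce N1QL's exact evaluation rules; `MISSING`, `get_path` and the two example.com documents are hypothetical.

```python
MISSING = object()  # sentinel: "the field is not there at all"


def get_path(doc, *path):
    """Follow object keys and array indexes; return MISSING if any step fails."""
    current = doc
    for step in path:
        if isinstance(current, dict) and step in current:
            current = current[step]
        elif isinstance(current, list) and isinstance(step, int) and 0 <= step < len(current):
            current = current[step]
        else:
            return MISSING
    return current


docs = [
    {"email": "matthew@couchbase.com", "conferences": [{"name": "OSCON Europe"}]},
    {"email": "absent@example.com"},                       # no conferences field
    {"email": "null@example.com", "conferences": None},    # field present, value null
]

# Rough analogue of: WHERE conferences IS NOT MISSING
kept = [d["email"] for d in docs if get_path(d, "conferences") is not MISSING]
print(kept)
```

The document with an explicit null is kept, because null is a value; only the document without the field is filtered out.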

WHO IS GOING TO DROIDCON SWEDEN?

    SELECT email AS person, conferences[0].name AS event
    FROM `default`
    WHERE ANY event IN conferences SATISFIES event.name = "Droidcon Sweden" END;
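For readers who think in application code, `ANY ... SATISFIES ... END` behaves much like Python's `any()` over a nested array. A minimal sketch, assuming documents shaped like the profile slide; `going_to` is a hypothetical helper and the second document is invented so that the query has a match.

```python
docs = [
    {
        "email": "matthew@couchbase.com",
        "conferences": [
            {"name": "OSCON Europe"},
            {"name": "Topconf"},
            {"name": "Big Data Strategy"},
        ],
    },
    {
        "email": "hypothetical@example.com",  # invented document
        "conferences": [{"name": "Droidcon Sweden"}],
    },
    {"email": "nobody@example.com"},  # invented document with no conferences
]


def going_to(documents, event_name):
    """Emails of people with at least one conference named event_name."""
    return [
        d["email"]
        for d in documents
        if any(c.get("name") == event_name for c in (d.get("conferences") or []))
    ]


print(going_to(docs, "Droidcon Sweden"))
```

The predicate looks at every element of the array, whereas `conferences[0].name` in the SELECT list only ever reports the first one.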

Updating and deleting
DELETE: provide the key to delete the document
INSERT: provide a key and some JSON to create a new document
UPSERT: as INSERT but will overwrite existing docs
UPDATE: change individual values inside existing docs
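The difference between the four statements is easiest to see against a plain key-to-document map. The Python sketch below uses a dict as a stand-in for a bucket; the function names mirror the statements, but this is not the Couchbase SDK, and raising `KeyError` on a duplicate insert is an assumption made for the illustration.

```python
def insert(bucket, key, doc):
    """Create a new document; refuse to overwrite an existing key."""
    if key in bucket:
        raise KeyError(f"document {key!r} already exists")
    bucket[key] = dict(doc)


def upsert(bucket, key, doc):
    """Create the document, or replace it wholesale if the key exists."""
    bucket[key] = dict(doc)


def update(bucket, key, **changes):
    """Change individual values inside an existing document."""
    bucket[key].update(changes)


def delete(bucket, key):
    """Remove the document stored under key."""
    del bucket[key]


bucket = {}
insert(bucket, "matthew", {"office": "London", "team": "Developer Advocacy"})
update(bucket, "matthew", office="Amsterdam")
print(bucket["matthew"])
upsert(bucket, "matthew", {"office": "London"})
print(bucket["matthew"])
```

Note that the upsert replaces the whole document, so the team field is gone afterwards, while the update touched only the office.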

A larger data-set: travel-sample

TRAVEL SAMPLE DATA

JOINs

JOINs: retrieve data from multiple documents in a single query; join within a keyspace/bucket; join across keyspaces/buckets

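IN A RELATIONAL DATABASE

AIRLINES

     id   | country        | iata | icao | name            | callsign
    ------+----------------+------+------+-----------------+-----------
     5209 | United States  | UA   | UAL  | United Airlines | UNITED
     1355 | United Kingdom | BA   | BAW  | British Airways | SPEEDBIRD

AIRPORTS

     id   | airportname        | city          | country        | alt | lat       | lon         | icao | tz
    ------+--------------------+---------------+----------------+-----+-----------+-------------+------+---------------------
     507  | Heathrow           | London        | United Kingdom | 83  | 51.4775   | -0.461389   | EGLL | Europe/London
     3469 | San Francisco Intl | San Francisco | United States  | 13  | 37.618972 | -122.374889 | KSFO | America/Los_Angeles

FLIGHTS

     id      | airline | source | destination | equipment | day | flight | utc      | stops
    ---------+---------+--------+-------------+-----------+-----+--------+----------+-------
     57047-1 | UA      | LHR    | SFO         | 777       | 0   | UA894  | 02:32:00 | 0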

IN JSON

    {
        "callsign": "UNITED",
        "country": "United States",
        "iata": "UA",
        "icao": "UAL",
        "id": 5209,
        "name": "United Airlines",
        "type": "airline"
    }

    {
        "airline": "UA",
        "airlineid": "airline_5209",
        "destinationairport": "SFO",
        "equipment": "777",
        "id": 57047,
        "schedule": [
            { "day": 0, "flight": "UA894", "utc": "02:32:00" },
            ...
        ],
        "sourceairport": "LHR",
        "stops": 0,
        "type": "route"
    }

    {
        "airportname": "Heathrow",
        "city": "London",
        "country": "United Kingdom",
        "faa": "LHR",
        "geo": { "alt": 83, "lat": 51.4775, "lon": -0.461389 },
        "icao": "EGLL",
        "id": 507,
        "type": "airport",
        "tz": "Europe/London"
    }

A SIMPLE JOIN

    SELECT *
    FROM `travel-sample` r
    JOIN `travel-sample` a ON KEYS r.airlineid
    WHERE r.sourceairport = "LHR" AND r.destinationairport = "SFO";

WHO FLIES LHR->SFO?

    SELECT DISTINCT a.name
    FROM `travel-sample` r
    JOIN `travel-sample` a ON KEYS r.airlineid
    WHERE r.sourceairport = "LHR" AND r.destinationairport = "SFO";
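`JOIN ... ON KEYS` is a lookup join: the route document carries the key of the airline document, so the join is one key fetch per route. A minimal Python sketch of that, using a dict as a stand-in for the bucket; `airlines_flying` is a hypothetical helper, and the route's own key ("route_57047") is an assumption, since the slides only show the airline key.

```python
bucket = {
    "airline_5209": {"name": "United Airlines", "iata": "UA", "type": "airline"},
    "route_57047": {  # key name assumed for the example
        "airline": "UA",
        "airlineid": "airline_5209",
        "sourceairport": "LHR",
        "destinationairport": "SFO",
        "type": "route",
    },
}


def airlines_flying(bucket, source, destination):
    """Distinct airline names for routes between two airports."""
    names = set()
    for r in bucket.values():
        if r.get("sourceairport") != source or r.get("destinationairport") != destination:
            continue
        a = bucket.get(r.get("airlineid"))  # the ON KEYS lookup
        if a is not None:                   # inner join: drop routes with no airline
            names.add(a["name"])
    return sorted(names)


print(airlines_flying(bucket, "LHR", "SFO"))
```

Airport codes are compared as plain strings, so "lhr" would find nothing where "LHR" finds the route.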

UNNEST: breaks out nested JSON from the results

SOMETHING USEFUL

    SELECT a.name, s.flight, s.utc, r.sourceairport, r.destinationairport, r.equipment
    FROM `travel-sample` r
    UNNEST r.schedule s
    JOIN `travel-sample` a ON KEYS r.airlineid
    WHERE r.sourceairport = "LHR" AND r.destinationairport = "SFO" AND s.day = 1
    ORDER BY s.utc;
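In application terms, UNNEST turns one route document into one row per schedule entry, which can then be filtered and sorted like any other row. A rough Python sketch of the query above follows; `departures` is a hypothetical helper, the route key is assumed, and the two day-1 schedule entries are invented because the slide elides the rest of the schedule.

```python
bucket = {
    "airline_5209": {"name": "United Airlines", "type": "airline"},
    "route_57047": {  # key name assumed for the example
        "airlineid": "airline_5209",
        "sourceairport": "LHR",
        "destinationairport": "SFO",
        "equipment": "777",
        "schedule": [
            {"day": 0, "flight": "UA894", "utc": "02:32:00"},  # from the slide
            {"day": 1, "flight": "XX002", "utc": "17:00:00"},  # invented entry
            {"day": 1, "flight": "XX001", "utc": "09:15:00"},  # invented entry
        ],
    },
}


def departures(bucket, source, destination, day):
    rows = []
    for r in bucket.values():
        if r.get("sourceairport") != source or r.get("destinationairport") != destination:
            continue
        a = bucket.get(r.get("airlineid"))  # JOIN ... ON KEYS r.airlineid
        if a is None:
            continue
        for s in r.get("schedule", []):     # UNNEST r.schedule s
            if s["day"] == day:
                rows.append({
                    "name": a["name"],
                    "flight": s["flight"],
                    "utc": s["utc"],
                    "sourceairport": r["sourceairport"],
                    "destinationairport": r["destinationairport"],
                    "equipment": r["equipment"],
                })
    return sorted(rows, key=lambda row: row["utc"])  # ORDER BY s.utc


for row in departures(bucket, "LHR", "SFO", 1):
    print(row["utc"], row["flight"], row["name"])
```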

Next Steps

Couchbase Developer Portal: developer.couchbase.com

SQL++ paper: http://arxiv.org/abs/1405.3631

Forums: http://forums.couchbase.com