Special Relativity and the Problem of Database Scalability

Similar documents
Recovery Theory. Storage Types. Failure Types. Theory of Recovery. Volatile storage main memory, which does not survive crashes.

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability

A Survey of Distributed Database Management Systems

In Memory Accelerator for MongoDB

Big Data & Scripting storage networks and distributed file systems

Transactions and ACID in MongoDB

EMC MID-RANGE STORAGE AND THE MICROSOFT SQL SERVER I/O RELIABILITY PROGRAM

Avoid a single point of failure by replicating the server Increase scalability by sharing the load among replicas

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

NoSQL in der Cloud Why? Andreas Hartmann

Database Management. Chapter Objectives

Technical Note. Dell PowerVault Solutions for Microsoft SQL Server 2005 Always On Technologies. Abstract

Principles of Database. Management: Summary

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

Lecture 18: Reliable Storage

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Geo-Replication in Large-Scale Cloud Computing Applications

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Data Management in the Cloud. Zhen Shi

Big Data Management and NoSQL Databases

Divy Agrawal and Amr El Abbadi Department of Computer Science University of California at Santa Barbara

Massive Data Storage

Do Relational Databases Belong in the Cloud? Michael Stiefel

Not Relational Models For The Management of Large Amount of Astronomical Data. Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF)

Lecture 7: Concurrency control. Rasmus Pagh

Designing a Cloud Storage System

Distributed Data Stores

Business Whitepaper. Selecting The Right Database For Applications In The Cloud. Exec Insight

Practical Cassandra. Vitalii

Amr El Abbadi. Computer Science, UC Santa Barbara

MapReduce Jeffrey Dean and Sanjay Ghemawat. Background context

Information Systems. Computer Science Department ETH Zurich Spring 2012

CS 91: Cloud Systems & Datacenter Networks Failures & Replica=on

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

CAP Theorem and Distributed Database Consistency. Syed Akbar Mehdi Lara Schmidt

Introduction to NOSQL

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

Cloud DBMS: An Overview. Shan-Hung Wu, NetDB CS, NTHU Spring, 2015

This paper defines as "Classical"

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Hadoop IST 734 SS CHUNG

bigdata Managing Scale in Ontological Systems

COMPUTING SCIENCE. Designing a Distributed File-Processing System Using Amazon Web Services. Students of MSc SDIA TECHNICAL REPORT SERIES

Distributed Data Management

Challenges for Data Driven Systems

<Insert Picture Here> Oracle Cloud Storage. Morana Kobal Butković Principal Sales Consultant Oracle Hrvatska

Design Patterns for Distributed Non-Relational Databases

RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters

The ConTract Model. Helmut Wächter, Andreas Reuter. November 9, 1999

Microsoft SQL Server Always On Technologies

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Service Description Cloud Storage Openstack Swift

Segmentation in a Distributed Real-Time Main-Memory Database

HSS: A simple file storage system for web applications

References. Introduction to Database Systems CSE 444. Motivation. Basic Features. Outline: Database in the Cloud. Outline

Introduction to Database Systems CSE 444

Can the Elephants Handle the NoSQL Onslaught?

Database Replication with Oracle 11g and MS SQL Server 2008

Amazon EC2 Product Details Page 1 of 5

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

COS 318: Operating Systems

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

Chapter 6 The database Language SQL as a tutorial

X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released

Brewer s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

Client/Server and Distributed Computing

Outline. Failure Types

A Review of Column-Oriented Datastores. By: Zach Pratt. Independent Study Dr. Maskarinec Spring 2011

Introduction to the Cloud OS Windows Azure Overview Visual Studio Tooling for Windows Azure Scenarios: Dev/Test Web Mobile Hybrid

Hosting Transaction Based Applications on Cloud

DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing

DATABASE ENGINES ON MULTICORES SCALE: A PRACTICAL APPROACH. João Soares and Nuno Preguiça

Distribution transparency. Degree of transparency. Openness of distributed systems

RAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead

The Sierra Clustered Database Engine, the technology at the heart of

MongoDB in the NoSQL and SQL world. Horst Rechner Berlin,

Cloud data store services and NoSQL databases. Ricardo Vilaça Universidade do Minho Portugal

Chapter 1: Introduction

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Transactions and Recovery. Database Systems Lecture 15 Natasha Alechina

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

5 Signs You Might Be Outgrowing Your MySQL Data Warehouse*

Building highly available systems in Erlang. Joe Armstrong

Review: The ACID properties

Introduction to Apache Cassandra

Data Management in the Cloud

Design and Evolution of the Apache Hadoop File System(HDFS)

InDetail. InDetail Paper by Bloor Author Philip Howard Date October NuoDB Swifts Release 2.1

Building Hyper-Scale Platform-as-a-Service Microservices with Microsoft Azure. Patriek van Dorp and Alex Thissen

How to Choose Between Hadoop, NoSQL and RDBMS

Dr Markus Hagenbuchner CSCI319. Distributed Systems

<Insert Picture Here> Oracle In-Memory Database Cache Overview

Availability Digest. MySQL Clusters Go Active/Active. December 2006

Survey on Comparative Analysis of Database Replication Techniques

Cloud Computing at Google. Architecture

High Availability with Postgres Plus Advanced Server. An EnterpriseDB White Paper

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Transcription:

Special Relativity and the Problem of Database Scalability James Starkey NimbusDB, Inc.

The problem, some jargon, some physics, a little theory, and then NimbusDB.

Problem: Database systems scale badly beyond a single computer. or How can we get more oomph for more bucks?

[ Glossary ] Transaction: A unit of database work Atomic: Consistent: Isolated: Durable: A transaction happens or it doesn t Logical relationships are preserved A transaction sees only committed data and no partial transactions Once committed, it stays committed

[ Glossary ] Consistency: What does it mean? 1. Transactions are isolated (read-write and write-write) 2. Database constraints are enforced (unique keys, referential integrity, etc.) 3. If you can define it, you can enforce it.

[ Glossary ] Serializable: A database system in which concurrent transactions have the effect of having been executed one at a time in some order.

[ Glossary ] Node: A computer on a network.

[ Glossary ] Elasticity: The ability to add or remove a node from a running system.

[ Now, some physics ] Newton: A body at rest (in other words, a universal reference frame)

Theory of Luminiferous Æther Light is a wave Waves propagate in a medium Ergo luminiferous æther

Michelson and Morley: Oops.

Einstein: Observations are relative to the reference frame of the observer. Theory of Special Relativity, 1905

[ Returning to databases ] Serializability: Good idea or bad habit? Sufficient condition for consistency But not a necessary condition Expensive to enforce Almost serializable is utterly useless

Serializability: Good idea or bad habit? Serializable Sequential transaction order At every point, database has a definitive state [Gosh, another universal reference frame!]

Some thoughts on time Time is a sequence of events, not just a clock Communication, Einstein tells us, requires latency Two nodes just can t see events in the same order That s not a bug, it s the way it has to be. Deal with it.

Multi-Version Concurrency Control is an alternative to serializability. Row updates create new versions pointing to old version(s) Each version tagged with the transaction that created it A transaction sees a version consistent with when it started A transaction can t update a version it can t see Each transaction sees stable, consistent state

NimbusDB is an elastic, ACID, SQL-based relational database.

NimbusDB modest goals are: Elastic, scalable, ACID RDBMS Very high performance in data center High performance geographically disperse Software fault tolerant Hardware fault tolerant Geological fault tolerant

NimbusDB less modest goals are: Zero administration Dynamic, self-tuning Arbitrary redundancy Multi-tenant Of, for, and in the cloud

Glossary: A chorus is a set of nodes that instantiate a database.

A NimbusDB database is composed of distributed objects called atoms. An atom can be serialized to the network or to a disk An atom can reside on any number of chorus nodes All instances of an atom know about each other Atoms replicate peer to peer Every atom has a chairman node

Examples of NimbusDB atoms: Transaction manager starts and ends transactions Table metadata for a relational table Data container for user data Catalog tracks atom locations

A NimbusDB chorus has transactional nodes that do SQL and archive nodes that maintain a persistent disk archive.

NimbusDB communication is fully connected, asynchronous, ordered, and batched

NimbusDB Messaging Most data is archival and inactive A small fraction is active but stable A smaller fraction is volatile but local Even less data is volatile and global Replicate only to those who care

NimbusDB nodes are autonomous A node chooses where to get an atom A node chooses which atoms to keep A node chooses which atoms to drop

NimbusDB Transaction Control A transaction executes on a single node Record version based A transaction sees the results of transactions reported committed when and where it started Consistency is maintained by atom chairmen Atom updates broadcast replication messages Replication messages precede commit message

[ Folks, this is the key slide ] NimbusDB is relativistic The database can be viewed only through transactions Consistency is viewed only through transactions There is no single definitive database state Nodes may differ due to message skew And Dorothy, we re not in Kansas anymore.

Archive nodes provide durability Archive nodes see all atom updates followed by a pre-commit message An archive broadcasts the actual commit message Transactional nodes retain their dirty atoms until an archive node reports the atoms archived. Multiple archive nodes provide redundancy

Transactional nodes provide scalability Any transactional node can do anything Connection broker can give effect of sharding Transactional nodes tend to request atoms from local nodes Data dynamically trends toward locality

And the little stuff Semantic extensions (shirts are clothes but not pants) Unbounded strings (punch cards are oh so yesterday) Unbounded numbers All metadata is dynamic Members of the chorus are platform independent And software updates are rolling Goal: 24/365 and beyond

Network partitions, CAP, and NimbusDB Certain archive nodes are designated as commit agents Subsets of commit agents form into coteries (Coteries: subsets where no two are disjoint) A pre-commit must be received at least one commit agent in every coterie to commit Post partition, the partition that contains a coterie survives If a pre-commit was reported to the partition, it commits

Questions, comments and brickbats?