A1 and FARM scalable graph database on top of a transactional memory layer

Similar documents
Understanding Neo4j Scalability

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

Graph Database Proof of Concept Report

Web Technologies: RAMCloud and Fiz. John Ousterhout Stanford University

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

How to Choose Between Hadoop, NoSQL and RDBMS

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) /21/2013

How To Scale Out Of A Nosql Database

Cloud Computing and Advanced Relationship Analytics

Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks

Enabling High performance Big Data platform with RDMA

Lecture Data Warehouse Systems

Overview on Graph Datastores and Graph Computing Systems. -- Litao Deng (Cloud Computing Group)

NoSQL Data Base Basics

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

ScaleArc idb Solution for SQL Server Deployments

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Big Fast Data Hadoop acceleration with Flash. June 2013

Can the Elephants Handle the NoSQL Onslaught?

How To Handle Big Data With A Data Scientist

Technical Challenges for Big Health Care Data. Donald Kossmann Systems Group Department of Computer Science ETH Zurich

TECHNICAL OVERVIEW HIGH PERFORMANCE, SCALE-OUT RDBMS FOR FAST DATA APPS RE- QUIRING REAL-TIME ANALYTICS WITH TRANSACTIONS.

Benchmarking Cassandra on Violin

SMB Advanced Networking for Fault Tolerance and Performance. Jose Barreto Principal Program Managers Microsoft Corporation

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, XLDB Conference at Stanford University, Sept 2012

Domain driven design, NoSQL and multi-model databases

bigdata Managing Scale in Ontological Systems

Software-defined Storage at the Speed of Flash

ioscale: The Holy Grail for Hyperscale

Next Generation Operating Systems

Apache HBase. Crazy dances on the elephant back

Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect

Using distributed technologies to analyze Big Data


CSE-E5430 Scalable Cloud Computing Lecture 2

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

InfiniteGraph: The Distributed Graph Database

BigMemory & Hybris : Working together to improve the e-commerce customer experience

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, UC Berkeley, Nov 2012

Scaling Out With Apache Spark. DTL Meeting Slides based on

In Memory Accelerator for MongoDB

Putting Apache Kafka to Use!

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Energy Efficient MapReduce

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

Scaling Database Performance in Azure

The Methodology Behind the Dell SQL Server Advisor Tool

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Big Data With Hadoop

Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya

Overview: X5 Generation Database Machines

Network Attached Storage. Jinfeng Yang Oct/19/2015

BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011

Daniel J. Adabi. Workshop presentation by Lukas Probst

SQL Server 2014 New Features/In- Memory Store. Juergen Thomas Microsoft Corporation

The Sierra Clustered Database Engine, the technology at the heart of

Evaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

2009 Oracle Corporation 1

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Distributed Architecture of Oracle Database In-memory

Inge Os Sales Consulting Manager Oracle Norway

SMB Direct for SQL Server and Private Cloud

NoSQL and Hadoop Technologies On Oracle Cloud

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

In-Memory Columnar Databases HyPer. Arto Kärki University of Helsinki

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

System Architecture. In-Memory Database

Stateful Distributed Dataflow Graphs: Imperative Big Data Programming for the Masses

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014

Towards Rack-scale Computing Challenges and Opportunities

Virtualizing Microsoft SQL Server on Dell XC Series Web-scale Converged Appliances Powered by Nutanix Software. Dell XC Series Tech Note

An Oracle White Paper July Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide

Hadoop Architecture. Part 1

Datacenter Operating Systems

NoSQL for SQL Professionals William McKnight

Apache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source

Performance And Scalability In Oracle9i And SQL Server 2000

HadoopRDF : A Scalable RDF Data Analysis System

2.1.5 Storing your application s structured data in a cloud database

Enterprise GIS Architecture Deployment Options. Andrew Sakowicz

Big data blue print for cloud architecture

Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344

GigaSpaces Real-Time Analytics for Big Data

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

Big Graph Processing: Some Background

DISTRIBUTED AND PARALLELL DATABASE

BigMemory: Providing competitive advantage through in-memory data management

be architected pool of servers reliability and

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

<Insert Picture Here> Oracle In-Memory Database Cache Overview

Big Data and Apache Hadoop s MapReduce

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Transcription:

A1 and FARM scalable graph database on top of a transactional memory layer Miguel Castro, Aleksandar Dragojević, Dushyanth Narayanan, Ed Nightingale, Alex Shamis Richie Khanna, Matt Renzelmann Chiranjeeb Buragohain,

Distributed computing today ms SQL RDBMS no general platform for low latency at scale latency s hours Hadoop/MapReduce 1 machine > 1K machines scale

A platform for low latency computing latency ms s SQL RDBMS A1 FARM hours Hadoop/MapReduce 1 machine scale > 1K machines

A new frontier for low latency computing µs ms SQL RDBMS A1 FARM latency s hours Hadoop/MapReduce 1 machine > 1K machines scale

A graph workload me meeting Bob Ann Jane

Scale out me meeting Bob Ann Jane

Latency 1 me 3 2 meeting Ann Bob Jane 4

Freshness and consistency me meeting Bob Ann Jane

Fault tolerance me meeting Bob Ann Jane

It is hard to build low latency apps at scale

FARM simplifies building low latency apps transactions and replication simplify programming hide failures, distribution, and concurrency abstraction that transactions run sequentially on a reliable server no compromises: consistency, availability, performance strict serializability millions of transactions per second with sub millisecond latencies recovery from failure to peak performance in tens of milliseconds

FARM is enabled by three hardware trends inexpensive DRAM currently $8/GB machines with 128 GB, container will hold more than 100 TBs non-volatile RAM NVDIMMs available today: DRAM+battery+SSD fast commodity networks with RDMA CX3: 40 Gb/s and < 3 µs latency on Ethernet (CX4: 100 Gb/s) CX3: 35M messages per second (CX4: 150M) this hardware is commodity and being brought into datacenters

FARM software new algorithms and data structures optimized for RDMA single-object read only transactions with a single RDMA read Transaction commit protocol optimized for RDMA operations. read validation during commit using a single RDMA read btree and hash table lookups with a single RDMA read changes to OS and network card driver to speed up RDMA efficient memory translation connection multiplexing

FARM API Storage APIs Transaction CreateTransaction() ObjBuf *Transaction::Alloc(size, locality_hint) ObjBuf *Transaction::Read(addr, size) ObjBuf *Transaction::OpenForWrite(ObjBuf) void Transaction::Free(ObjBuf) void Transaction::Commit() Communication APIs for application level coordination 29/09/15 14

Scale-out OLTP throughput: TATP transac2ons per second (millions) 250 200 150 100 50 0 In- memory SQL (no fault tolerance) FARM (no fault tolerance) FARM (with fault tolerance) 30x 100x database size 0 10 20 30 40 50 60 70 80 90 100 number of servers

Why a Graph Database? On FARM? Many datasets naturally lend themselves to a graph structure Facebook, Twitter, enterprise relationships Graph traversal queries are hard on relational stores A1 Goals Scalable Transactional Complex traversal and subgraph queries Not targeted for graph compute

A1 Architecture overview A1 Trusted and untrusted coprocessors A1 graph database core A1 Core API FARM ACID Transac2ons FARM Shared memory FARM Communica2on primi2ves 17

The A1 Data Model Me Planning Meeting Alice Doc1 Bob Doc2 29/09/15 18

A1 Data Layout Locality of reference matters Graph : a set of vertexes with adjacency lists Building blocks FARM objects identified by 64 bit pointers RPC primitives BTree for indexes Linked lists for adjacency

A1 Data Layout

BFS Query Me Planning Meeting Alice Doc1 Bob Doc2 21

Traversal Algorithm Designate starting host as coordinator (maintain visited list) Coordinate level by level traversal Data Shipping Query shipping RPC for synchronization All predicates applied locally All neighbors discovered locally Hybrid

BFS Traversal via Query Shipping 1. Start 2. Find neighbors 3. Coord: Partition and ship query 4. Worker: Find neighbors 5. Worker: Ship list back to coordinator 6. Back to step 3 29/09/15 23

Extensibility by Coprocessors A1 Graph database / API FARM Communica2on primi2ves FARM ACID Transac2ons FARM Shared memory ` Coprocessor models Trusted core extensions No separation, running inside the FARM process space No latency Untrusted core hosted Separate address and process space on same machine. 1-2us (10-25% overhead) Untrusted cluster hosted Separate machines on the same failure/network domains 50-200 us (4x 16x overhead)

Preliminary Performance Linkbench with 1B vertexes and 4.5B edges 3X replication Single container / network domain / failure domain Single transactions Vertex Creation: 375 us Edge Create: 495 us Edge Get: 240 us Vertex Read: 100 us BFS 1 hop : 700 us 2 hop : 850 us Throughput ~ 350K transactions/second/ node Linear scale-up to 100 nodes

Backup slides

Important applications can benefit µs petabyte transactional memory latenc y ms s hekato n faceboo k hour s delv e commerce graph 1 machine > 1K machines scale

Petabyte transactional memory makes it easy general transactional memory model! objects, arrays, and pointers! graphs, relational databases, key-value stores on the same system abstraction of running on a single machine! scale out! low latency! freshness and consistency! fault tolerance high throughput to keep costs low it will speed up innovation as MapReduce did for batch analytics

Data Model Comparison Data Model Key-Value Relational Graph Horizontal Scaling Join Support Language Support Maturity Easy None None Medium Non-trivial Good SQL High Hard Join-less Specialized Low

Scale-out OLTP throughput-latency: TATP

Scale-out OLTP recovery: TATP 45ms

Scale-out OLTP: TPC-C benchmark Millions 250 200 150 tpmc 100 50 0 0 20 40 60 80 100 number of servers