Cassandra A Decentralized, Structured Storage System




Cassandra: A Decentralized, Structured Storage System
Avinash Lakshman and Prashant Malik, Facebook
Published: April 2010, ACM SIGOPS Operating Systems Review, Volume 44, Issue 2
http://dl.acm.org/citation.cfm?id=1773922
Presented by James Owens, Old Dominion University
For CS795 on 11//2014

About the Authors
Avinash Lakshman
Currently: Hedvig / Quexascale? 2013
Notable works:
  o Dynamo: Amazon's highly available key-value store - 2007
  o Cassandra: a decentralized structured storage system - 2010
  o Cassandra: structured storage system on a p2p network - 2009
  o System and method for providing high availability data - 2010
Prashant Malik
Currently:
  o LimeRoad? 2013
Notable works:
  o Cassandra: a decentralized structured storage system - 2010
  o Cassandra: structured storage system on a p2p network - 2009
  o Asynchronous communication within a server arrangement - 2007
  o Publishing digital content within a defined universe such as an organization in accordance with a digital rights management (DRM) system - 2009
Rank 4 search results on author name - Google Scholar, 11/14/2014
http://www.bizjournals.com/sanjose/news/2013/06/25/ex-facebook-amazon-data-engineer.html
http://techcircle.vccircle.com/2013/05/30/ex-facebook-tech-lead-prashant-malik-joins-social-commerce-firm-limeroad-as-cto/

Significance
Provides a platform for data storage and retrieval that supports very high write throughput and tolerates continuous component failure.
Integrates strategies from many other technologies; cited more than 916 times. [Google Scholar, Nov 2014]

Difficulties
  o Dense reading.
  o No diagrams.
  o Simultaneously defines a general-purpose tool (Cassandra) and a specific implementation (Inbox Search).
  o No clear separation of the above.

Approach
  o Handling Density: Inbox Search
  o Big-Picture View of Cassandra
      Data Model
      API
      Read/Write Model
  o How Cassandra solves Inbox Search
      Cassandra Internals

Why was Cassandra Created?
  o Solution to the Inbox Search problem
Consider the Facebook context:
  o Many simultaneous users
  o Billions of writes per day
  o Need for scalability
  o Cassandra is used for multiple services within Facebook.

Inbox Search Problem
  o A user wants to search his or her inbox for messages using one of two strategies:
      Term Search - keyword
      Interactions - name

What is Cassandra?
  o A structured data storage system
      Logical ring of servers
      Designed to support multiple, continuous component failures
      No central point of failure
      Highly configurable
      Runs on commodity hardware
  o *It is not a full relational DBMS
http://www.datastax.com/wp-content/uploads/2012/03/migration-distribution.png

Data Model
  o Distributed multidimensional map
      <Key : Value> pairs
  o String key
      Typically 16 to 36 B
  o Object value

Data Model
  o Column types:
      Column Families
      Super Column Families
  o A superset of column families
  o (The value model supports recursion)
  o A visual

Data Model http://www.divconq.com/category/cassandra/page/2/


API
  o insert(table, key, rowMutation)
  o get(table, key, columnName)
  o delete(table, key, columnName)
  o columnName (path)
      May refer to any column, column family, super column family, or a column within a super column
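A minimal sketch of these three calls, assuming an in-memory nested dictionary stands in for the distributed map. The class name `ToyTable`, the tuple-shaped column path, and the dict-shaped row mutation are illustrative assumptions, not Cassandra's actual client interface.

```python
class ToyTable:
    """In-memory stand-in for one table: key -> column family -> column -> value."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, row_mutation):
        # row_mutation maps column family -> {column name: value}; merge it into the row.
        row = self.rows.setdefault(key, {})
        for family, columns in row_mutation.items():
            row.setdefault(family, {}).update(columns)

    def get(self, key, column_path):
        # column_path is a tuple such as ("profile",) or ("profile", "name").
        node = self.rows.get(key, {})
        for part in column_path:
            if not isinstance(node, dict) or part not in node:
                return None
            node = node[part]
        return node

    def delete(self, key, column_path):
        # Remove whatever the path points at; missing paths are ignored.
        if not column_path:
            return
        node = self.rows.get(key, {})
        for part in column_path[:-1]:
            node = node.get(part, {})
            if not isinstance(node, dict):
                return
        node.pop(column_path[-1], None)


table = ToyTable()
table.insert("user42", {"profile": {"name": "Ada", "city": "London"}})
name = table.get("user42", ("profile", "name"))
table.delete("user42", ("profile", "city"))
```

The path tuple mirrors the idea that a column name may address a whole family or a single column inside it.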

Read/Write Overview
  o *Read and write requests are processed by any node.
  o The node that receives the request determines which nodes contain the data.
  o Writes
      Issue the write to all replica nodes, then wait for a quorum of commits
  o Reads (variable)
      Closest node: low integrity
      All nodes + quorum: high integrity
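The quorum write can be sketched as follows. This is a toy model under stated assumptions: a quorum is taken to be a simple majority, replicas are plain dictionaries, and the hypothetical `DownReplica` class simulates an unreachable node. A real system would issue the writes in parallel.

```python
def quorum_size(replica_count):
    # Assumption: a quorum is a simple majority of the replicas.
    return replica_count // 2 + 1


class DownReplica(dict):
    """Hypothetical replica that rejects every write, simulating a failed node."""

    def __setitem__(self, key, value):
        raise RuntimeError("replica unreachable")


def write_with_quorum(replicas, key, value):
    """Send the write to every replica; succeed if a quorum acknowledged it."""
    acks = 0
    for replica in replicas:
        try:
            replica[key] = value
            acks += 1
        except RuntimeError:
            pass  # an unreachable replica simply does not count as an ack
    return acks >= quorum_size(len(replicas))


replicas = [{}, {}, DownReplica()]
ok = write_with_quorum(replicas, "k", "v")  # two of three replicas acknowledge
```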

Inbox Search Problem
  o A user wants to search his or her inbox for messages using one of two strategies:
      Term Search - keyword
      Interactions - name

Term Search
  o Key: user ID
  o Super column (inverted index): the words that make up a message
  o Columns: individual message identifiers

Search by Interactions
  o Key: user ID
  o Super column (inverted index): recipient ID
  o Columns: individual message identifiers
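Both inverted indexes can be sketched with nested dictionaries. The function names, the whitespace tokenizer, and the sample messages are assumptions made for illustration; they are not taken from the paper or the slides.

```python
from collections import defaultdict

# index[user_id][super_column] -> set of message ids (the columns)
term_index = defaultdict(lambda: defaultdict(set))
interaction_index = defaultdict(lambda: defaultdict(set))


def index_message(user_id, other_user_id, message_id, text):
    # Term search: every word of the message points back to the message id.
    for word in set(text.lower().split()):
        term_index[user_id][word].add(message_id)
    # Interactions: the other party's id points back to the message id.
    interaction_index[user_id][other_user_id].add(message_id)


def search_terms(user_id, word):
    return sorted(term_index.get(user_id, {}).get(word.lower(), set()))


def search_interactions(user_id, other_user_id):
    return sorted(interaction_index.get(user_id, {}).get(other_user_id, set()))


index_message("alice", "bob", "m1", "Lunch on Friday")
index_message("alice", "carol", "m2", "Friday deadline")
```

Keying both indexes by user ID keeps each user's searches local to that user's row.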

Inbox Search Problem
  o Cassandra provides hooks for intelligent caching: e.g., a user clicks on the inbox, which primes the index
Production performance numbers:

Latency Stat   Search Interactions   Term Search
Min            7.69 ms               7.78 ms
Median         15.69 ms              18.27 ms
Max            26.13 ms              44.41 ms

Inbox Search Solution
  o The full solution requires an understanding of the underlying architecture.
  o Key points are:
      Any node can service a query
        o Global knowledge of data stores
      Query all relevant data stores
      Take the most recent good response
        o Quorum, flexibility
        o Time threshold

Questions

My Question What should I emphasize in the architecture?

Cassandra Architecture General Structure http://www.ibm.com/developerworks/library/osapache-cassandra/figure003.gif

Cassandra Architecture General Structure http://www.datastax.com/docs/1.0/cluster_architecture/replication

Cassandra Architecture Data Partitioning
Consistent Hashing Algorithm
  o Logical ring of hash values
  o Each node is given a position on this ring
  o Each node is responsible for the range of hashes to its LEFT.
  o Each data item's RIGHT neighbor is responsible for storage and replication
Recall that the nodes communicate about ranges, so any one node knows the locations that should contain data for a particular key.
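A minimal sketch of the ring lookup, assuming a small 32-bit ring and MD5 as the hash function. The ring size, the hash choice, and the node positions are all illustrative assumptions. A key belongs to the first node at or after its position, wrapping around at the end of the ring.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # assumption: a small ring keeps the numbers readable


def ring_position(name):
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class Ring:
    def __init__(self, node_positions):
        # node_positions maps node name -> position on the ring.
        self.entries = sorted((pos, name) for name, pos in node_positions.items())
        self.positions = [pos for pos, _ in self.entries]

    def owner_at(self, position):
        # First node clockwise from the position (its "right neighbor").
        i = bisect.bisect_left(self.positions, position)
        return self.entries[i % len(self.entries)][1]

    def owner(self, key):
        return self.owner_at(ring_position(key))


ring = Ring({"A": 100, "B": 200, "C": 300})
```

Because only the neighboring range changes when a node joins or leaves, most keys keep their owner.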

Cassandra Architecture Data Partitioning http://www.datastax.com/docs/1.0/cluster_architecture/replication

Cassandra Architecture Replication Schemes
Configurable:
  o Number of replicas
  o Replication policies
      Rack Unaware: non-coordinator replicas are chosen by picking the N-1 successors of the coordinator on the ring.
      Rack Aware and Datacenter Aware: rely on a Zookeeper (elected leader) abstraction.
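The rack-unaware policy can be sketched in a few lines, assuming the N-1 other nodes are the coordinator's clockwise successors, as the Cassandra paper describes. The function name and the list-based ring are hypothetical simplifications.

```python
def rack_unaware_replicas(ring_order, coordinator, n):
    """Return the coordinator followed by its N-1 successors, walking clockwise."""
    start = ring_order.index(coordinator)
    count = min(n, len(ring_order))  # cannot place more replicas than there are nodes
    return [ring_order[(start + i) % len(ring_order)] for i in range(count)]


replicas = rack_unaware_replicas(["A", "B", "C", "D"], "C", 3)
```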

Replication Coordination http://www.datastax.com/wp-content/uploads/2012/03/migration-distribution.png

Cassandra Architecture Domain Awareness
Each node has full awareness (handwaving)
  o Cassandra nodes communicate via a gossip protocol
      Based on Scuttlebutt
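A toy illustration of how gossip spreads membership knowledge. This is a deliberate simplification, not Scuttlebutt itself: each node keeps a map of node name to the highest heartbeat version it has seen, and a push-pull exchange keeps the larger version on both sides. All names here are hypothetical.

```python
import random


def merge_views(local, remote):
    # Keep the highest heartbeat version seen for every node.
    for node, version in remote.items():
        if version > local.get(node, -1):
            local[node] = version


def gossip_round(views, rng):
    names = sorted(views)
    for name in names:
        views[name][name] += 1  # bump own heartbeat
        peer = rng.choice([n for n in names if n != name])
        merge_views(views[name], views[peer])  # pull from the peer
        merge_views(views[peer], views[name])  # push to the peer


views = {n: {n: 0} for n in ("a", "b", "c")}
rng = random.Random(0)
for _ in range(3):
    gossip_round(views, rng)
```

After a few rounds every node holds an entry for every other node, which is the "full awareness" the slide refers to.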

Cassandra Architecture PHI Accrual Failure Detection
Failure detection (prediction) via gossip
  o A sliding window of gossip arrival times is used to calculate the likelihood of failure
  o (Phi, probability that suspecting the node at this Phi is a mistake)
  o { (1, 0.1) (2, 0.01) (3, 0.001) }
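A sketch of how Phi could be computed from a sliding window of heartbeat arrival times. It assumes inter-arrival times follow an exponential distribution, so the chance that a live node has stayed silent for t seconds is exp(-t / mean) and Phi is the negative base-10 logarithm of that value. The class name and window size are illustrative assumptions.

```python
import math
from collections import deque


class PhiAccrualDetector:
    def __init__(self, window_size=1000):
        self.intervals = deque(maxlen=window_size)  # sliding window of gaps
        self.last_arrival = None

    def heartbeat(self, now):
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # -log10(exp(-elapsed / mean)) simplifies to elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0, 1, 2, 3):
    detector.heartbeat(t)
suspicion = detector.phi(3 + math.log(10))
```

With a mean gap of one second, a silence of ln(10) seconds gives Phi = 1, which matches the 0.1 entry in the table above.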

Cassandra Architecture Failure Response
System architecture does not reconfigure
  o The server will return eventually.
  o Recall: Scuttlebutt, sliding window, and PHI
Permanent modification of the ring is an administrator task.
Phi represents the level of suspicion that a particular node is down or unreachable
  o (*) This is used for timeout avoidance?
(*) Protocols account for failure by design

Cassandra Architecture Request Handling
Request arrives at any node:
1. Servicing node looks up the data hosts.
   1. All nodes have knowledge of the data partition
2. (*) Request is routed to all data hosts.
3. If the requests time out, failure is returned.
4. Identify the response containing data with the most recent timestamp and return it.
5. Schedule an update of the nodes with older timestamps.
This ensures quorum integrity
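Steps 4 and 5 can be sketched as a read that returns the newest value and repairs stale replicas. In this toy version replicas are dictionaries mapping a key to a (value, timestamp) pair, and the repair is applied immediately instead of being scheduled; both are simplifying assumptions.

```python
def read_with_repair(replicas, key):
    """Return the value with the most recent timestamp and update stale replicas."""
    responses = []
    for replica in replicas:
        if key in replica:
            value, timestamp = replica[key]
            responses.append((timestamp, value, replica))
    if not responses:
        return None
    newest_ts, newest_value, _ = max(responses, key=lambda r: r[0])
    for timestamp, _, replica in responses:
        if timestamp < newest_ts:
            replica[key] = (newest_value, newest_ts)  # repair the stale copy
    return newest_value


r1 = {"k": ("old", 1)}
r2 = {"k": ("new", 2)}
r3 = {"k": ("old", 1)}
result = read_with_repair([r1, r2, r3], "k")
```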

Cassandra Architecture Local Data Persistence
Typical Write:
1. Write to the commit log
   o Dedicated disk
2. Update the in-memory structure
When the in-memory structure crosses a threshold (data size, number of data items):
  o Data is sequentially written to commodity disks.
Over time these files are merged and reorganized, à la Bigtable.
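The write path can be sketched with in-memory stand-ins. Lists replace the commit log and the on-disk files, and the threshold counts items only; the class name `ToyStore` and the threshold value are hypothetical.

```python
class ToyStore:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stand-in for the append-only log on a dedicated disk
        self.memtable = {}     # in-memory structure
        self.data_files = []   # each flush produces one immutable, key-sorted file
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sequential write of the sorted in-memory contents.
        self.data_files.append(sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        if not self.data_files:
            return
        merged = {}
        for data_file in self.data_files:  # oldest first, so newer values win
            merged.update(data_file)
        self.data_files = [sorted(merged.items())]


store = ToyStore(flush_threshold=3)
for key, value in [("a", 1), ("b", 2), ("c", 3), ("a", 9), ("d", 4), ("e", 5)]:
    store.write(key, value)
store.compact()
```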

Cassandra Architecture Local Data Persistence
Typical Read: (A key can be in many files)
1. Bloom filter (summary of the keys in each file)
2. Get values (reverse chronological order)
   1. Values (CF) have column indices allowing direct access to columns.
I've left out many of the chunking details. Optimizations, included at almost every conceivable point, are out of scope.
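A sketch of the read path with a small Bloom filter per file. The filter size, the number of hashes, the SHA-256 based hashing, and the helper names are illustrative assumptions. Files are checked newest first, and a file is opened only when its filter says the key might be present.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self.bits = 0  # an int used as a bit array

    def _positions(self, key):
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def make_file(data):
    bloom = BloomFilter()
    for key in data:
        bloom.add(key)
    return bloom, data


def read(key, files_newest_first):
    for bloom, data in files_newest_first:
        if bloom.might_contain(key) and key in data:
            return data[key]
    return None


older = make_file({"a": 1, "b": 2})
newer = make_file({"a": 9})
files = [newer, older]
```

A Bloom filter can report false positives but never false negatives, so the lookup in the file itself remains the final check.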

Questions

Read Path http://prettyprint.me/prettyprint.me/2010/05/02/understanding-cassandra-code-base/index.html

Write Path http://prettyprint.me/prettyprint.me/2010/05/02/understanding-cassandra-code-base/index.html

Questions
Additional Information: http://www.slideshare.net/datastax/an-overview-of-apache-cassandra