Hands-on Cassandra. OSCON July 20, Eric
|
|
|
- Mariah Bates
- 10 years ago
- Views:
Transcription
1 Hands-on Cassandra OSCON July 20, 2010 Eric
2 2 Background
3 Influential Papers BigTable Strong consistency Sparse map data model GFS, Chubby, et al Dynamo O(1) distributed hash table (DHT) BASE (aka eventual consistency) Client tunable consistency/availability 3
4 NoSQL HBase Hypertable MongoDB HyperGraphDB Riak Memcached Voldemort Tokyo Cabinet Neo4J Redis Cassandra CouchDB 4
5 NoSQL Big data HBase Hypertable MongoDB HyperGraphDB Riak Memcached Voldemort Tokyo Cabinet Neo4J Redis Cassandra CouchDB 5
6 Bigtable / Dynamo Bigtable Dynamo HBase Riak Hypertable Voldemort Cassandra?? 6
7 7 Dynamo-Bigtable Lovechild
8 CAP Theorem Pick Two CP AP Bigtable Dynamo Hypertable Voldemort HBase Cassandra 8
9 CAP Theorem Pick Two Consistency Availability Partition Tolerance 9
10 History Facebook ( ) Avinash, former Dynamo engineer Motivated by Inbox Search Google Code ( ) Dark times Apache (2009-Present) Digg, Twitter, Rackspace, Others Rapidly growing community Fast-paced development 10
11 11 Hands-on Setup
12 Installation $ TUT_ROOT=$HOME $ cd $TUT_ROOT $ tar xfz apache-cassandra-xxxx-bin.tar.gz $ tar xfz twissandra-xxxx.tar.gz $ tar xfz pycassa-xxxx.tar.gz 12
13 Setup $ cp twissandra/cassandra.yaml \ apache-cassandra-xxxx/conf $ mkdir $TUT_ROOT/log $ mkdir -p $TUT_ROOT/lib/data $ mkdir -p $TUT_ROOT/lib/commitlog 13
14 Setup (continued) conf/cassandra.yaml # Where data is stored on disk data_file_directories: - TUT_ROOT/lib/data # Commit log commitlog_directory: TUT_ROOT/lib/commitlog 14
15 Setup (continued) conf/log4j-server.properties log4j.rootlogger=debug,stdout,r log4j.appender.r.file=tut_root/log/system.log 15
16 Starting up / Initializing $ cd $TUT_ROOT/apache-cassandra-xxxx $ bin/cassandra -f # In a new terminal $ cd $TUT_ROOT/apache-cassandra-xxxx $ bin/loadschemafromyaml localhost
17 Pycassa / Twissandra $ cd $TUT_ROOT/pycassa $ sudo python setup.py -cassandra install \ [--prefix=/usr/local] $ cd $TUT_ROOT/twissandra $ python manage.py runserver :
18 18 Data Model
19 Users CREATE TABLE user ( id INTEGER PRIMARY KEY, username VARCHAR(64), password VARCHAR(64) ); 19
20 Friends and Followers CREATE TABLE followers ( user INTEGER REFERENCES user(id), follower INTEGER REFERENCES user(id) ); CREATE TABLE following ( user INTEGER REFERENCES user(id), followee INTEGER REFERENCES user(id) ); 20
21 Tweets CREATE TABLE tweets ( id INTEGER PRIMARY KEY, user INTEGER REFERENCES user(id), body VARCHAR(140), timestamp TIMESTAMP ); 21
22 Overview Keyspace Uppermost namespace Typically one per application ColumnFamily Associates records of a similar kind Record-level Atomicity Indexed Column Basic unit of storage 22
23 23 Sparse Table
24 Column name byte[] Queried against (predicates) Determines sort order value byte[] Opaque to Cassandra timestamp long Conflict resolution (Last Write Wins) 24
25 Column Comparators Bytes UTF8 TimeUUID Long LexicalUUID Composite (third-party) 25
26 Column Families User Username Friends Followers Tweet Timeline Userline 26
27 User / Username User Stores users Keyed on a unique ID (UUID). Columns for username and password Username Indexes User Keyed on username One column, the unique UUID for user 27
28 Friends and Followers Friends Maps a user to the users they follow Keyed on user ID Columns for each user being followed Followers Maps a user to those following them Keyed on username Columns for each user following 28
29 Tweets Keyed on a unique identifier Columns: Unique identifier User ID Body of the tweet timestamp 29
30 Timeline / Userline Timeline Keyed on user ID Columns that map timestamps to Tweet ID The materialized view of Tweets for a user. Userline Keyed on user ID Columns that map timestamps to Tweet ID The collection of Tweets attributed to a user 30
31 31 Pycassa
32 Pycassa Python Client API connect() Thrift proxy cf = ColumnFamily(proxy, ksp, cfname) cf.insert() long cf.get() dict cf.get_range() dict 32
33 Adding a User cass.save_user() username = 'jericevans' password = '**********' useruuid = str(uuid()) columns = { 'id': useruuid, 'username': username, 'password': password } USER.insert(useruuid, columns) USERNAME.insert(username, {'id': useruuid}) 33
34 Following a Friend cass.add_friends() FRIENDS.insert(userid, {friendid: time()}) FOLLOWERS.insert(friendid, {userid: time()}) 34
35 35 Tweeting cass.save_tweet() columns = { 'id': tweetid, 'user_id': useruuid, 'body': body, '_ts': timestamp } TWEET.insert(tweetid, columns) columns = {pack('>d', timestamp): tweetid} USERLINE.insert(useruuid, columns) TIMELINE.insert(useruuid, columns) for otheruuid in FOLLOWERS.get(useruuid, 5000): TIMELINE.insert(otheruuid, columns)
36 Getting a Timeline cass.get_timeline() start = request.get.get('start') limit = NUM_PER_PAGE timeline = TIMLINE.get( userid, column_start=start, column_count=limit, column_reversed=true ) tweets = TWEET.multiget(timeline.values()) 36
37 37 Hands-on pycassashell
38 38 Retweet
39 Adding Retweet $ cd $TUT_ROOT/twissandra $ patch -p1 <../django.patch $ patch -p1 <../retweet.patch 39
40 Retweet cass.save_retweet() ts = _long(int(time() * 1e6)) for follower in get_follower_ids(userid): TIMELINE.insert(follower_id, {ts: tweet_id}) 40
41 41 Clustering Concepts
42 42 P2P Routing
43 43 P2P Routing
44 Partitioning (see partitioner) Random 128bit namespace, (MD5) Good distribution Order Preserving Tokens determine namespace Natural order (lexicographical) Range / cover queries Yours?? 44
45 Replica Placement (see endpoint_snitch) SimpleSnitch Default N-1 successive nodes RackInferringSnitch Infers DC/rack from IP PropertyFileSnitch Configured w/ a properties file 45
46 46 Bootstrap (see auto_bootstrap)
47 47 Bootstrap
48 48 Bootstrap
49 Remember CAP? Consistency Availability Partition Tolerance 49
50 Choosing Consistency Write Level Description ZERO Hail Mary ANY 1 replica (HH) ONE 1 replica QUORUM (N / 2) +1 ALL All replicas Read Level Description ZERO N/A ANY N/A ONE 1 replica QUORUM (N / 2) +1 ALL All replicas R + W > N 50
51 51 Quorum ((N/2) + 1)
52 52 Quorum ((N/2) + 1)
53 53 Operations
54 Cluster sizing Data size and throughput Fault tolerance (replication) Data-center / hosting costs 54
55 Nodes Go commodity! Cores (more is better) Memory (more is better) Disks Commitlog Storage double-up for working space 55
56 56 Writes
57 57 Reads
58 Tuning (heap size) bin/cassandra.in.sh # Arguments to pass to the JVM JVM_OPTS= \ -Xmx1G \ 58
59 Tuning (memtable) conf/cassandra.yaml # Amount of data written memtable_throughput_in_mb: 64 # Number of objects written memtable_operations_in_millions: 0.3 # Time elapsed memtable_flush_after_mins: 60 59
60 Tuning (column families) conf/cassandra.yaml keyspaces: - name: Twissandra column_families: - name: User keys_cached: 100 preload_row_cache: true rows_cached:
61 Tuning (mmap) conf/cassandra.yaml # Choices are auto, standard, mmap, and # mmap_index_only. disk_access_mode: auto 61
62 Nodetool bin/nodetool host <arg> command ring info cfstats tpstats 62
63 Nodetool (cont.) bin/nodetool host <arg> command compact snapshot [name] flush drain repair decommission move loadbalance 63
64 Clustertool bin/clustertool host <arg> command get_endpoints <keyspace> <key> global_snapshot [name] clear_global_snapshot truncate <keyspace> <cfname> 64
65 65 Wrapping Up
66 When Things Go Wrong Where to look Logs ERRORs, stack traces Enable DEBUG Isolate if possible Crash files (java_pid*.hprof, hs_err_pid*.log) nodetool / jconsole / etc Thread pool stats Column family stats 66
67 When Things Go Wrong What to do #cassandra on irc.freenode.net 67
68 Further Reading Bigtable: A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels 68
69 Thanks Apache Cassandra: Twissandra: Eric Florenzano (and others) Pycassa: Jonathan Hseu (and others) 69
70 Fin
Dynamo: Amazon s Highly Available Key-value Store
Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and
Distributed Systems. Tutorial 12 Cassandra
Distributed Systems Tutorial 12 Cassandra written by Alex Libov Based on FOSDEM 2010 presentation winter semester, 2013-2014 Cassandra In Greek mythology, Cassandra had the power of prophecy and the curse
Slave. Master. Research Scholar, Bharathiar University
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper online at: www.ijarcsse.com Study on Basically, and Eventually
Practical Cassandra. Vitalii Tymchyshyn [email protected] @tivv00
Practical Cassandra NoSQL key-value vs RDBMS why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints Vitalii Tymchyshyn
Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
Evaluation of NoSQL databases for large-scale decentralized microblogging
Evaluation of NoSQL databases for large-scale decentralized microblogging Cassandra & Couchbase Alexandre Fonseca, Anh Thu Vu, Peter Grman Decentralized Systems - 2nd semester 2012/2013 Universitat Politècnica
The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg. Adam Marcus MIT CSAIL [email protected] / @marcua
The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg Adam Marcus MIT CSAIL [email protected] / @marcua About Me Social Computing + Database Systems Easily Distracted: Wrote The NoSQL Ecosystem in
Making Sense of NoSQL Dan McCreary Wednesday, Nov. 13 th 2014
Making Sense of NoSQL Dan McCreary Wednesday, Nov. 13 th 2014 Agenda Why NoSQL? What are the key NoSQL architectures? How are they different from traditional RDBMS Systems? What types of problems do they
Seminar Presentation for ECE 658 Instructed by: Prof.Anura Jayasumana Distributed File Systems
Seminar Presentation for ECE 658 Instructed by: Prof.Anura Jayasumana Distributed File Systems Prabhakaran Murugesan Outline File Transfer Protocol (FTP) Network File System (NFS) Andrew File System (AFS)
Joining Cassandra. Luiz Fernando M. Schlindwein Computer Science Department University of Crete Heraklion, Greece [email protected].
Luiz Fernando M. Schlindwein Computer Science Department University of Crete Heraklion, Greece [email protected] Joining Cassandra Binjiang Tao Computer Science Department University of Crete Heraklion,
High Throughput Computing on P2P Networks. Carlos Pérez Miguel [email protected]
High Throughput Computing on P2P Networks Carlos Pérez Miguel [email protected] Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured
Cassandra. Jonathan Ellis
Cassandra Jonathan Ellis Motivation Scaling reads to a relational database is hard Scaling writes to a relational database is virtually impossible and when you do, it usually isn't relational anymore The
NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015
NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,
Introduction to NoSQL
Introduction to NoSQL NoSQL Seminar 2012 @ TUT Arto Salminen What is NoSQL? Class of database management systems (DBMS) "Not only SQL" Does not use SQL as querying language Distributed, fault-tolerant
NoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
Advanced Data Management Technologies
ADMT 2014/15 Unit 15 J. Gamper 1/44 Advanced Data Management Technologies Unit 15 Introduction to NoSQL J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE ADMT 2014/15 Unit 15
Cassandra A Decentralized, Structured Storage System
Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM http://dl.acm.org/citation.cfm?id=1773922
Storage of Structured Data: BigTable and HBase. New Trends In Distributed Systems MSc Software and Systems
Storage of Structured Data: BigTable and HBase 1 HBase and BigTable HBase is Hadoop's counterpart of Google's BigTable BigTable meets the need for a highly scalable storage system for structured data Provides
Lecture Data Warehouse Systems
Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores
LARGE-SCALE DATA STORAGE APPLICATIONS
BENCHMARKING AVAILABILITY AND FAILOVER PERFORMANCE OF LARGE-SCALE DATA STORAGE APPLICATIONS Wei Sun and Alexander Pokluda December 2, 2013 Outline Goal and Motivation Overview of Cassandra and Voldemort
Study and Comparison of Elastic Cloud Databases : Myth or Reality?
Université Catholique de Louvain Ecole Polytechnique de Louvain Computer Engineering Department Study and Comparison of Elastic Cloud Databases : Myth or Reality? Promoters: Peter Van Roy Sabri Skhiri
Structured Data Storage
Structured Data Storage Xgen Congress Short Course 2010 Adam Kraut BioTeam Inc. Independent Consulting Shop: Vendor/technology agnostic Staffed by: Scientists forced to learn High Performance IT to conduct
Introduction to NOSQL
Introduction to NOSQL Université Paris-Est Marne la Vallée, LIGM UMR CNRS 8049, France January 31, 2014 Motivations NOSQL stands for Not Only SQL Motivations Exponential growth of data set size (161Eo
Elosztott, skálázódó adatbázis-kezelő rendszer
Elosztott, skálázódó adatbázis-kezelő rendszer http://cassandra.apache.org/ Molnár András ([email protected]) Garzó András ([email protected]) 2012. július 13. péntek Eredet In Greek mythology,...
Cloud Data Management: A Short Overview and Comparison of Current Approaches
Cloud Data Management: A Short Overview and Comparison of Current Approaches Siba Mohammad Otto-von-Guericke University Magdeburg [email protected] Sebastian Breß Otto-von-Guericke University
nosql and Non Relational Databases
nosql and Non Relational Databases Image src: http://www.pentaho.com/big-data/nosql/ Matthias Lee Johns Hopkins University What NoSQL? Yes no SQL.. Atleast not only SQL Large class of Non Relaltional Databases
Survey of Apache Big Data Stack
Survey of Apache Big Data Stack Supun Kamburugamuve For the PhD Qualifying Exam 12/16/2013 Advisory Committee Prof. Geoffrey Fox Prof. David Leake Prof. Judy Qiu Table of Contents 1. Introduction... 3
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
A Review of Column-Oriented Datastores. By: Zach Pratt. Independent Study Dr. Maskarinec Spring 2011
A Review of Column-Oriented Datastores By: Zach Pratt Independent Study Dr. Maskarinec Spring 2011 Table of Contents 1 Introduction...1 2 Background...3 2.1 Basic Properties of an RDBMS...3 2.2 Example
NoSQL: Going Beyond Structured Data and RDBMS
NoSQL: Going Beyond Structured Data and RDBMS Scenario Size of data >> disk or memory space on a single machine Store data across many machines Retrieve data from many machines Machine = Commodity machine
USC Viterbi School of Engineering
USC Viterbi School of Engineering INF 551: Foundations of Data Management Units: 3 Term Day Time: Spring 2016 MW 8:30 9:50am (section 32411D) Location: GFS 116 Instructor: Wensheng Wu Office: GER 204 Office
SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford
SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems
extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010
System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached
Apache Cassandra 1.2 Documentation
Apache Cassandra 1.2 Documentation January 13, 2013 2013 DataStax. All rights reserved. Contents Apache Cassandra 1.2 Documentation 1 What's new in Apache Cassandra 1.2 1 Key Improvements 1 Concurrent
NoSQL Databases. Nikos Parlavantzas
!!!! NoSQL Databases Nikos Parlavantzas Lecture overview 2 Objective! Present the main concepts necessary for understanding NoSQL databases! Provide an overview of current NoSQL technologies Outline 3!
Future Prospects of Scalable Cloud Computing
Future Prospects of Scalable Cloud Computing Keijo Heljanko Department of Information and Computer Science School of Science Aalto University [email protected] 7.3-2012 1/17 Future Cloud Topics Beyond
wow CPSC350 relational schemas table normalization practical use of relational algebraic operators tuple relational calculus and their expression in a declarative query language relational schemas CPSC350
Case study: CASSANDRA
Case study: CASSANDRA Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu Cassandra:
Designing Performance Monitoring Tool for NoSQL Cassandra Distributed Database
Designing Performance Monitoring Tool for NoSQL Cassandra Distributed Database Prasanna Bagade, Ashish Chandra, Aditya B.Dhende Pune Institute of Computer Technology University of Pune Pune ABSTRACT: The
NOSQL DATABASES AND CASSANDRA
NOSQL DATABASES AND CASSANDRA Semester Project: Advanced Databases DECEMBER 14, 2015 WANG CAN, EVABRIGHT BERTHA Université Libre de Bruxelles 0 Preface The goal of this report is to introduce the new evolving
A REVIEW ON EFFICIENT DATA ANALYSIS FRAMEWORK FOR INCREASING THROUGHPUT IN BIG DATA. Technology, Coimbatore. Engineering and Technology, Coimbatore.
A REVIEW ON EFFICIENT DATA ANALYSIS FRAMEWORK FOR INCREASING THROUGHPUT IN BIG DATA 1 V.N.Anushya and 2 Dr.G.Ravi Kumar 1 Pg scholar, Department of Computer Science and Engineering, Coimbatore Institute
Preparing Your Data For Cloud
Preparing Your Data For Cloud Narinder Kumar Inphina Technologies 1 Agenda Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability
Which NoSQL Database? A Performance Overview
2014 by the authors; licensee RonPub, Lübeck, Germany. This article is an open access article distributed under the terms and conditions Veronika of the Creative Abramova, Commons Jorge Attribution Bernardino,
NoSQL Databases: a step to database scalability in Web environment
NoSQL Databases: a step to database scalability in Web environment Jaroslav Pokorny Charles University, Faculty of Mathematics and Physics, Malostranske n. 25, 118 00 Praha 1 Czech Republic +420-221914265
BRAC. Investigating Cloud Data Storage UNIVERSITY SCHOOL OF ENGINEERING. SUPERVISOR: Dr. Mumit Khan DEPARTMENT OF COMPUTER SCIENCE AND ENGEENIRING
BRAC UNIVERSITY SCHOOL OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE AND ENGEENIRING 12-12-2012 Investigating Cloud Data Storage Sumaiya Binte Mostafa (ID 08301001) Firoza Tabassum (ID 09101028) BRAC University
Scalable Multiple NameNodes Hadoop Cloud Storage System
Vol.8, No.1 (2015), pp.105-110 http://dx.doi.org/10.14257/ijdta.2015.8.1.12 Scalable Multiple NameNodes Hadoop Cloud Storage System Kun Bi 1 and Dezhi Han 1,2 1 College of Information Engineering, Shanghai
An Open Source NoSQL solution for Internet Access Logs Analysis
An Open Source NoSQL solution for Internet Access Logs Analysis A practical case of why, what and how to use a NoSQL Database Management System instead of a relational one José Manuel Ciges Regueiro
Apache Cassandra 1.2
Apache Cassandra 1.2 Documentation January 21, 2016 Apache, Apache Cassandra, Apache Hadoop, Hadoop and the eye logo are trademarks of the Apache Software Foundation 2016 DataStax, Inc. All rights reserved.
Big Data and Hadoop with components like Flume, Pig, Hive and Jaql
Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.
Benchmarking Failover Characteristics of Large-Scale Data Storage Applications: Cassandra and Voldemort
Benchmarking Failover Characteristics of Large-Scale Data Storage Applications: Cassandra and Voldemort Alexander Pokluda Cheriton School of Computer Science University of Waterloo 2 University Avenue
HDB++: HIGH AVAILABILITY WITH. l TANGO Meeting l 20 May 2015 l Reynald Bourtembourg
HDB++: HIGH AVAILABILITY WITH Page 1 OVERVIEW What is Cassandra (C*)? Who is using C*? CQL C* architecture Request Coordination Consistency Monitoring tool HDB++ Page 2 OVERVIEW What is Cassandra (C*)?
Distributed Lucene : A distributed free text index for Hadoop
Distributed Lucene : A distributed free text index for Hadoop Mark H. Butler and James Rutherford HP Laboratories HPL-2008-64 Keyword(s): distributed, high availability, free text, parallel, search Abstract:
Challenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
Dynamo: Amazon s Highly Available Key-value Store
Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and
A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA
A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA Ompal Singh Assistant Professor, Computer Science & Engineering, Sharda University, (India) ABSTRACT In the new era of distributed system where
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System Avinash Lakshman Facebook Prashant Malik Facebook ABSTRACT Cassandra is a distributed storage system for managing very large amounts of structured
Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB
Overview of Databases On MacOS Karl Kuehn Automation Engineer RethinkDB Session Goals Introduce Database concepts Show example players Not Goals: Cover non-macos systems (Oracle) Teach you SQL Answer what
Cloud Scale Distributed Data Storage. Jürmo Mehine
Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented
Cassandra - A Decentralized Structured Storage System
Cassandra - A Decentralized Structured Storage System Avinash Lakshman Facebook Prashant Malik Facebook ABSTRACT Cassandra is a distributed storage system for managing very large amounts of structured
A programming model in Cloud: MapReduce
A programming model in Cloud: MapReduce Programming model and implementation developed by Google for processing large data sets Users specify a map function to generate a set of intermediate key/value
A TAXONOMY AND COMPARISON OF HADOOP DISTRIBUTED FILE SYSTEM WITH CASSANDRA FILE SYSTEM
A TAXONOMY AND COMPARISON OF HADOOP DISTRIBUTED FILE SYSTEM WITH CASSANDRA FILE SYSTEM Kalpana Dwivedi and Sanjay Kumar Dubey Department of Computer Science Engineering, Amity School of Engineering and
Comparing SQL and NOSQL databases
COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2015 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations
Contents Set up Cassandra Cluster using Datastax Community Edition on Amazon EC2 Installing OpsCenter on Amazon AMI References Contact
Contents Set up Cassandra Cluster using Datastax Community Edition on Amazon EC2... 2 Launce Amazon micro-instances... 2 Install JDK 7... 7 Install Cassandra... 8 Configure cassandra.yaml file... 8 Start
Big Table A Distributed Storage System For Data
Big Table A Distributed Storage System For Data OSDI 2006 Fay Chang, Jeffrey Dean, Sanjay Ghemawat et.al. Presented by Rahul Malviya Why BigTable? Lots of (semi-)structured data at Google - - URLs: Contents,
Real-Time Big Data in practice with Cassandra. Michaël Figuière @mfiguiere
Real-Time Big Data in practice with Cassandra Michaël Figuière @mfiguiere Speaker Michaël Figuière @mfiguiere 2 Ring Architecture Cassandra 3 Ring Architecture Replica Replica Replica 4 Linear Scalability
An Oracle White Paper April 2014. Back to the Future with Oracle Database 12c
An Oracle White Paper April 2014 Back to the Future with Oracle Database 12c Introduction... 1 How Database Concepts Influence the Database Market... 2 Database concepts and fundamentals a practical review...
these three NoSQL databases because I wanted to see a the two different sides of the CAP
Michael Sharp Big Data CS401r Lab 3 For this paper I decided to do research on MongoDB, Cassandra, and Dynamo. I chose these three NoSQL databases because I wanted to see a the two different sides of the
Distributed Data Stores
Distributed Data Stores 1 Distributed Persistent State MapReduce addresses distributed processing of aggregation-based queries Persistent state across a large number of machines? Distributed DBMS High
Review of Query Processing Techniques of Cloud Databases Ruchi Nanda Assistant Professor, IIS University Jaipur.
Suresh Gyan Vihar University Journal of Engineering & Technology (An International Bi Annual Journal) Vol. 1, Issue 2, 2015,pp.12-16 ISSN: 2395 0196 Review of Query Processing Techniques of Cloud Databases
Big Data Development CASSANDRA NoSQL Training - Workshop. March 13 to 17-2016 9 am to 5 pm HOTEL DUBAI GRAND DUBAI
Big Data Development CASSANDRA NoSQL Training - Workshop March 13 to 17-2016 9 am to 5 pm HOTEL DUBAI GRAND DUBAI ISIDUS TECH TEAM FZE PO Box 121109 Dubai UAE, email training-coordinator@isidusnet M: +97150
Institutionen för datavetenskap Department of Computer and Information Science
Institutionen för datavetenskap Department of Computer and Information Science Final thesis Exploration of NoSQL Technologies for managing hotel reservations by Sylvain Coulombel LiTH-IDA/ERASMUS-A--15/001--SE
COMPARATIVE STUDY OF NOSQL DOCUMENT, COLUMN STORE DATABASES AND EVALUATION OF CASSANDRA
COMPARATIVE STUDY OF NOSQL DOCUMENT, COLUMN STORE DATABASES AND EVALUATION OF CASSANDRA Manoj V VIT University, India ABSTRACT In the last decade, rapid growth in mobile applications, web technologies,
Cassandra A Decentralized Structured Storage System
Cassandra A Decentralized Structured Storage System Avinash Lakshman, Prashant Malik LADIS 2009 Anand Iyer CS 294-110, Fall 2015 Historic Context Early & mid 2000: Web applicaoons grow at tremendous rates
HBase A Comprehensive Introduction. James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367
HBase A Comprehensive Introduction James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367 Overview Overview: History Began as project by Powerset to process massive
Cassandra: Principles and Application
Cassandra: Principles and Application Dietrich Featherston [email protected] [email protected] Abstract Cassandra is a distributed database designed to be highly scalable both in terms of storage
Apache Cassandra 2.0
Apache Cassandra 2.0 Documentation December 16, 2015 Apache, Apache Cassandra, Apache Hadoop, Hadoop and the eye logo are trademarks of the Apache Software Foundation 2015 DataStax, Inc. All rights reserved.
EXPERIMENTAL EVALUATION OF NOSQL DATABASES
EXPERIMENTAL EVALUATION OF NOSQL DATABASES Veronika Abramova 1, Jorge Bernardino 1,2 and Pedro Furtado 2 1 Polytechnic Institute of Coimbra - ISEC / CISUC, Coimbra, Portugal 2 University of Coimbra DEI
Axway API Gateway. Version 7.4.1
K E Y P R O P E R T Y S T O R E U S E R G U I D E Axway API Gateway Version 7.4.1 26 January 2016 Copyright 2016 Axway All rights reserved. This documentation describes the following Axway software: Axway
Data Management Course Syllabus
Data Management Course Syllabus Data Management: This course is designed to give students a broad understanding of modern storage systems, data management techniques, and how these systems are used to
Automatic, multi- grained elasticity- provisioning for the Cloud
Automatic, multi- grained elasticity- provisioning for the Cloud Integration Prototype "CELAR System Prototype" V1 Companion Document Date: 30-09- 2013 CELAR is funded by the European Commission DG- INFSO
Measuring Elasticity for Cloud Databases
Measuring Elasticity for Cloud Databases Thibault Dory, Boris Mejías Peter Van Roy ICTEAM Institute Univ. catholique de Louvain [email protected], [email protected], [email protected]
Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
The Cloud Trade Off IBM Haifa Research Storage Systems
The Cloud Trade Off IBM Haifa Research Storage Systems 1 Fundamental Requirements form Cloud Storage Systems The Google File System first design consideration: component failures are the norm rather than
Comparison of Distribution Technologies in Different NoSQL Database Systems
Comparison of Distribution Technologies in Different NoSQL Database Systems Studienarbeit Institute of Applied Informatics and Formal Description Methods (AIFB) Karlsruhe Institute of Technology (KIT)
Scalable Queries For Large Datasets Using Cloud Computing: A Case Study
Scalable Queries For Large Datasets Using Cloud Computing: A Case Study James P. McGlothlin The University of Texas at Dallas Richardson, TX USA [email protected] Latifur Khan The University of
Cloud Computing in Distributed System
M.H.Nerkar & Sonali Vijay Shinkar GCOE, Jalgaon Abstract - Cloud Computing as an Internet-based computing; where resources, software and information are provided to computers on-demand, like a public utility;
On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform
On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform Page 1 of 16 Table of Contents Table of Contents... 2 Introduction... 3 NoSQL Databases... 3 CumuLogic NoSQL Database Service...
