NoSQL and Non-Relational Databases




NoSQL and Non-Relational Databases. Matthias Lee, Johns Hopkins University. (Image src: http://www.pentaho.com/big-data/nosql/)

What is NoSQL? Yes, no SQL, or at least "not only SQL". A large class of non-relational databases that trade consistency for availability; easily scalable (partitioning); highly fault tolerant; used by Google, Facebook, Amazon, Twitter et al.

Why non-relational? No complicated relationships; schema-light or schema-free; fewer interdependencies; easier scaling; higher fault tolerance. Distributed computing: store and search; hashing and MapReduce.

CAP Theorem (Eric Brewer, UC Berkeley): Consistency, Availability, Partition tolerance. Choose two and work around the third.

Dropping some of ACID: compromises must be made. Atomicity: all or nothing? Partial. Consistency: eventually consistent. Isolation: revision history. Durability: written in stone, sometimes.

BASE: Basically Available, Soft state, Eventual consistency. Asynchronous conflict resolution and repair.
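A minimal sketch of what "asynchronous conflict resolution and repair" can look like, assuming a last-write-wins rule and a per-value logical timestamp. Both are assumptions made for illustration; the slides do not name a specific strategy, and real systems may use vector clocks or application-level merges instead. The names `Versioned` and `merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # logical clock of the write (assumption for this sketch)


def merge(local, remote):
    """Return a copy of `local` where, for each key, the newer write wins."""
    merged = dict(local)
    for key, candidate in remote.items():
        current = merged.get(key)
        if current is None or candidate.timestamp > current.timestamp:
            merged[key] = candidate
    return merged


# Two replicas accepted writes independently, so their state differs (soft state).
replica_a = {"user:1": Versioned("alice@old.example", 1)}
replica_b = {
    "user:1": Versioned("alice@new.example", 2),
    "user:2": Versioned("bob@example.org", 1),
}

# Repair happens later, in the background: each replica merges the other's state.
replica_a = merge(replica_a, replica_b)
replica_b = merge(replica_b, replica_a)
```

After the exchange both replicas hold the same data, which is the "eventual" part of eventual consistency: readers may see stale values until the repair has run.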

Record layout (ID | Content): each of the four records has a <unique id> and its own set of (Key, Value) pairs. The sets differ from record to record, for example Key1 to Key4 in one record and Key1 to Key8 in another.

Things these DBs do easily. Highlights: fast processing of specific tasks; distributed queries and operations (MapReduce/Hadoop); asynchronous reads and writes (fire and forget); flexible schema (often); scalability and easy replication.

Things these DBs do easily. Distributed storage (performance/fault tolerance): improved response time and fault tolerance. (Diagram: user requests, master nodes, and Slave 1 to Slave 4.)
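A toy, in-memory sketch of the master/slave layout on this slide: writes go through the master and are copied to every slave, while reads are spread across the slaves. Replication is synchronous here only to keep the example short; the class names `Node` and `ReplicatedStore` are hypothetical and do not correspond to any particular database.

```python
import itertools


class Node:
    """One server holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next_slave = itertools.cycle(slaves)  # round-robin read balancing

    def write(self, key, value):
        # All writes are applied on the master first, then copied to each slave.
        self.master.data[key] = value
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        # Reads are served by the slaves in turn, which spreads the load.
        node = next(self._next_slave)
        return node.name, node.data.get(key)


store = ReplicatedStore(Node("master"), [Node(f"slave{i}") for i in range(1, 5)])
store.write("greeting", "hello")
served_by, value = store.read("greeting")
```

Because every slave holds a full copy, losing one slave removes read capacity but no data, which is where the fault tolerance comes from.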

Things these DBs do easily. Distributed storage (locality). (Map diagram: Server 1 to Server 4 placed across North America, Europe and Australia.)

Things these DBs *don't* do easily. Issues and challenges: ACID goes out of the window; no direct translation between SQL and NoSQL; relatively new field with many similar solutions; all solutions have different trade-offs.

Querying NoSQL: NoSQL means mostly no SQL. Options: Map/Reduce; various query languages, e.g. RQL (rasdaman) and CQL (Cassandra).

Map/Reduce: an easily distributed, fault-tolerant method of processing data, built on map() and reduce(). Pipeline: input reader/partitioner -> map() -> sort and partition -> reduce() -> output writer.

Map/Reduce. February 14th 2013

Map/Reduce data flow (diagram): INPUT (Chunk_1 to Chunk_8) -> map() (map_out_1 to map_out_8) -> sorting ("magic") -> reduce() (red_out_1 to red_out_3).
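The data flow above can be sketched in a single process with a word count, the usual introductory Map/Reduce example. This is only an illustration of the stages: the function names are hypothetical, and a real framework such as Hadoop would run map() and reduce() on many machines and handle partitioning and failures itself.

```python
from collections import defaultdict


def map_phase(chunk):
    """map(): emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_outputs):
    """Sort/partition step: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """reduce(): combine all values that share a key."""
    return key, sum(values)


def map_reduce(chunks):
    mapped = [map_phase(chunk) for chunk in chunks]  # one map() call per chunk
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


chunks = ["no sql", "not only sql", "sql"]
counts = map_reduce(chunks)
```

Each call to `map_phase` only sees its own chunk and each call to `reduce_phase` only sees one key, which is what makes both stages easy to distribute.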

Types of NoSQL DBs: document databases; key/value stores; array databases; column-oriented datastores; graph databases (they exist).

Types of NoSQL DBs. Column-oriented datastores: indexing over column families; fast aggregation and searching; inline compression; easy sharding.

Row view:

    id  Fname  Lname   Zip    Street
    1   Joe    Shmoe   32818  Cedar
    2   Ralph  Peters  65636  Birch
    3   Mary   Lewis   10337  Green

Column families:

    Name     { Joe,Ralph,Mary      Shmoe,Peters,Lewis }
    Location { 32818,65636,10337   Cedar,Birch,Green  }
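A small sketch of the same transformation in code, using the three rows from the slide: the row-oriented records are pivoted into per-column lists and then grouped into the "Name" and "Location" families. The helper `to_columns` and the lowercase field names are assumptions for illustration, not the storage format of any real column store.

```python
rows = [
    {"id": 1, "fname": "Joe", "lname": "Shmoe", "zip": 32818, "street": "Cedar"},
    {"id": 2, "fname": "Ralph", "lname": "Peters", "zip": 65636, "street": "Birch"},
    {"id": 3, "fname": "Mary", "lname": "Lewis", "zip": 10337, "street": "Green"},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Group related columns into families, as on the slide.
families = {
    "name": {key: columns[key] for key in ("fname", "lname")},
    "location": {key: columns[key] for key in ("zip", "street")},
}

# An aggregate over one column touches only that column's values.
max_zip = max(families["location"]["zip"])
```

Storing each column contiguously is what makes aggregation fast: computing `max_zip` never reads names or streets.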

Types of NoSQL DBs. Column-oriented datastores, Bigtable and its clones: Bigtable (Google); HBase (Facebook, Hulu and StumbleUpon); Hypertable (Baidu and Rediff).

Types of NoSQL DBs. Key/value stores (simple): among the earliest NoSQL systems (early 90s); easily distributed storage and searching; hashtable-like structure; MapReduce; often used as a caching engine; O(1) average lookup time. Layout: [hash] : bytes[n].
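A minimal sketch of why a hashtable-like structure distributes easily: the hash of the key decides which shard holds the value, so any client can locate a key without a central index. `ShardedKVStore` is a hypothetical in-memory class; the shards here are plain dicts standing in for separate servers, and the simple modulo placement is an assumption (production systems often use consistent hashing instead).

```python
import hashlib


class ShardedKVStore:
    """Byte keys mapped to byte values, partitioned across shards by key hash."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash of the key picks the shard; every client computes the same one.
        digest = hashlib.sha1(key).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        # One hash plus one dict lookup: O(1) on average.
        return self._shard_for(key).get(key, default)


store = ShardedKVStore(shard_count=4)
store.put(b"session:42", b"\x01\x02\x03")
cached = store.get(b"session:42")
```

The store never inspects the value bytes, which is why key/value stores work well as caches but offer little beyond lookup by key.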

Types of NoSQL DBs. Key/value stores: BerkeleyDB (MySQL, Bitcoin, MemcacheDB, SVN & more); Redis (Digg, Flickr, StackOverflow, Craigslist & more); Cassandra (CQL) (Facebook, Reddit, Twitter, Netflix & many more).

Types of NoSQL DBs. Document stores (mostly structured K,V stores): versatile; dynamic schema; eventual consistency; highly parallelizable; easy replication. Example document:

    { "_id": "4eea98de1550e2cc04000000", "lastmodified": "2011-12-15 20:03:26", "name": "Peter Lustig", "avatar": "4eea61a11550e26f7d000000", "email": "Peter.Lustig@void.net", "hobbies": "sleeping" }
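A toy sketch of the "dynamic schema" idea, assuming an in-memory dict in place of a real document store: documents are keyed by `_id`, and two documents in the same collection need not share any other fields. The second document, its id, and the helpers `insert` and `find` are hypothetical; they are not the API of MongoDB or CouchDB.

```python
import json

collection = {}


def insert(doc):
    """Store a document under its _id; no schema is checked."""
    collection[doc["_id"]] = doc


def find(**criteria):
    """Return documents whose fields match all criteria; missing fields never match."""
    return [
        doc for doc in collection.values()
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


insert({
    "_id": "4eea98de1550e2cc04000000",
    "name": "Peter Lustig",
    "email": "Peter.Lustig@void.net",
    "hobbies": "sleeping",
})
# Hypothetical second document with a different set of fields.
insert({"_id": "example-id-2", "name": "Example User", "zip": "21218"})

sleepers = find(hobbies="sleeping")
as_json = json.dumps(collection, indent=2)  # documents serialize directly to JSON
```

Queries simply skip documents that lack a field, so adding a new field to some documents requires no migration of the others.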

Types of NoSQL DBs. Document stores (mostly structured K,V stores): MongoDB (FourSquare, Shutterfly, Intuit, GitHub & more); CouchDB (BBC, Canonical, CERN, Android apps & more); Redis (Digg, Flickr, StackOverflow, Craigslist & more).

Distributed TinyURL crawler. CouchDB: set it up and relax; runs on a cluster of unreliable commodity hardware; RESTful JSON API; document store with easy replication; eventual consistency; lightweight (runs on phones).
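A hedged sketch of what storing one crawl result through CouchDB's RESTful JSON API might look like. CouchDB creates or updates a document with an HTTP PUT to /{db}/{docid}; everything else here is an assumption for illustration: the local server URL, the database name `tinyurls`, the document shape, and the helper `build_put_document`. The request is only constructed, not sent, and authentication is not handled.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumption: CouchDB on its default local port
DB_NAME = "tinyurls"                 # hypothetical database name


def build_put_document(doc_id, doc):
    """Build (but do not send) the HTTP request that stores `doc` under `doc_id`."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{COUCH_URL}/{DB_NAME}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical document shape: short code as the id, resolved target as a field.
request = build_put_document("2tx", {"target": "google.com"})

# With a running server, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the interface is plain HTTP and JSON, a crawler worker needs no database driver, only an HTTP client.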

Distributed TinyURL crawler: quick-to-deploy TinyURL resolver; master/slave architecture; replicating databases; Amazon EC2 spot instances.

Distributed TinyURL crawler. The resolver maps a TinyURL to its target: http://tinyurl.com/2tx -> google.com

Distributed TinyURL crawler. (Diagram: a master and several R instances on Amazon EC2.)

Any questions? Comments? Thanks for listening. Interested in this? Want to know more? #jhuacm on irc.freenode.net; +Matthias Lee; github.com/madmaze