Open source, high performance database

Size: px
Start display at page:

Download "Open source, high performance database"

Transcription

1 Open source, high performance database Anti-social Databases: NoSQL and MongoDB Will LaForest Senior Director of 10gen 1

2 SQL invented Dynamic Web Content released IBM s IMS Oracle founded Client Server Web applications SOA 10gen founded BigTable Codd publishes relational model paper in 1970 PC s gain traction WWW born 3 tier architecture Brewer s Cap born Cloud Computing NoSQL Movement 2

3 Attribute Tuple Relation 3

4 Category N 1 BlogCategory N 1 Author N 1 N 1 Blog BlogTag N N Content Tag 4

5 Data stored in a RDBMS is very compact (disk was more expensive) SQL and RDBMS made queries flexible with rigid schemas Rigid schemas helps opcmize joins and storage Massive ecosystem of tools, libraries and integracons Been around 40 years! 5

6 Sensor Data/SIGINT (volume, velocity) Asset Management (variety, velocity) Crowd Sourcing (volume, variety) Open Source (volume, variety, velocity) 6

7 VOLUME & NEW ARCHITECTURES Systems scaling horizontally, not verccally Commodity servers Cloud CompuCng DATA VARIETY & VOLATILITY Extremely difficult to find a single fixed schema Don t know data schema a- priori TRANSACTIONAL MODEL N x Inserts or updates Distributed transaccons 7

8 Non- relaconal been hanging around (heard of MUMPS?) Modern NoSQL theory and offerings started in early 2000s NoSQL = Not Only SQL A colleccon of very different products AlternaCves to relaconal databases when a bad fit Common mocvacons Horizontally scalable (commodity server/cloud compucng) Schema Flexibility 8

9 I group by data model Some people use or include data arangement Data arrangement is important and cuts across data models (I have separate talk on this) Column/Document Consistent hashing parcconing Range based parcconing Key/Value Big Table Descendents Document Oriented Graph 9

10 Value (Data) mapped to a key (think primary) Some typed, some just BLOBs Fast hash based queries No range queries, no ordering, simple data model Will Chris Will-obj [email protected] [email protected] [4e61 6d65 3a57 10

11 Data stored on disk in a column oriented fashion Predominantly hash based indexing Rudimentary or no secondary indexes Range queries and ordering on one dimension (row key) Some consistent hashing some range based row keys column family contact column family personal row Will twitter : wlaforest [email protected] phone : bio : Will attended picture : row Chris [email protected] phone : bio : picture : hobby : golf 11

12 Data modeled as documents or objects (XML and JSON) No fixed schema Richest data model Consistent hashing and range based parcconing {name: will, eyes: blue, birthplace: NY, aliases: [ bill, la ciacco ], gender:???, boss: ben } name: jeff, eyes: blue, height: 72, boss: ben } name: ben, hat: yes } {name: brendan, aliases: [ el diablo ]} {name: matt, pizza: DiGiorno, height: 72, boss: } 12

13 Not a database! Map Reduce on HDFS or other data source Great for grinding through data When you want to use it Can t use a index DistribuCng custom algorithms ETL Many NoSQL offerings have nacve map reduce funcconality 13

14 14

15 2007 founded 2009 first release of MongoDB MongoDB is open source 10gen has a Redhat- like business model SubscripCons (subscriber build, support) Training ConsulCng > 80M in funding Sequoia, NEA, In- Q- Tel, Union Square Ventures, Flybridge Capital, 15

16 #2 on Indeed s Fastest Growing Jobs Jaspersom BigData Index Demand for MongoDB, the document- oriented NoSQL database, saw the biggest spike with over 200% growth in Google Searches 451 Group MongoDB increasing its dominance 16

17 #2 ON INDEED S FASTEST GROWING JOBS 17

18 MongoDB INCREASING ITS DOMINANCE 18

19 Scale horizontally over commodity hardware Agility essencal (schema free/heterogeneous interface) RDBMSs great so keep what works Rich data models Adhoc queries Fully featured indexes What doesn t distribute well? Long running mulc- row transaccons Join Both arcfacts of the relaconal data model 19

20 Data stored as documents (JSON) Schema free CRUD operacons (Create Read Update Delete) Atomic document operacons Consistent (but is tunable advanced topic) Rich indexing (secondary, geospacal, covered) Ad hoc Queries like SQL Equality Ranges Regular expression searches GeospaCal ReplicaCon HA, read scalability, geo centric reads Sharding (somecmes called parcconing) for scalability 20

21 RDBMS MongoDB Database Database Table Collection Row Document 21

22 22

23 var p = { author: roger, date: new Date(), text: Spirited Away, tags: [ Tezuka, Manga ]} > db.posts.save(p) 23

24 >db.posts.find() { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Jul :47:11 GMT (PDT)", text : "Spirited Away", tags : [ "Tezuka", "Manga" ] } Notes: - _id is unique, but can be anything you d like 24

25 Create index on any Field in Document // 1 means ascending, -1 means descending >db.posts.ensureindex({author: 1}) >db.posts.find({author: 'roger'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger",... } 25

26 CondiConal Operators $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type $lt, $lte, $gt, $gte // find posts with any tags > db.posts.find( {tags: {$exists: true }} ) // find posts matching a regular expression > db.posts.find( {author: /^rog*/i } ) // count posts by an author before a certain date > db.posts.find( {author: roger, date:{ $lt: Sat }} ).count() 26

27 $set, $unset, $inc, $push, $pushall, $pull, $pullall, $bit > comment = { author: fred, date: new Date(), text: Best Movie Ever } > db.posts.update( { _id:... }, $push: {comments: comment} ); 27

28 { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Jul :47:11 GMT (PDT)", text : "Spirited Away", tags : [ "Tezuka", "Manga" ], comments : [ { author : "Fred", date : "Sat Jul :51:03 GMT (PDT)", text : "Best Movie Ever" } ]} 28

29 // Index nested documents > db.posts.ensureindex( comments.author :1 ) Ø db.posts.find({ comments.author : Fred }) // Index on tags > db.posts.ensureindex( tags: 1) > db.posts.find( { tags: Manga } ) // geospacal index > db.posts.ensureindex( author.locacon : 2d ) > db.posts.find( author.locacon : { $near : [22,42] } ) 29

30 30

31 Do them client side (Python or R) NaCve Map/Reduce in JS in MongoDB Distributes across the cluster with good data locality New aggregacon framework DeclaraCve (no JS required) Pipeline approach (like Unix ps - ax tee processes.txt more) Hadoop Intersect the indexing of MongoDB with the brute force parallelizacon of hadoop Hadoop MongoDB connector BI/ReporCng Tools Pentaho, Jaspersom, Ikanow, Nucleon 31

32 32

33 Prod clusters on order of 100B objects 50k qps per server ~1400 nodes Better data locality In-Memory Caching Auto-Sharding Replication /HA Horizontal Scaling 33

Building Your First MongoDB Application

Building Your First MongoDB Application Building Your First MongoDB Application Ross Lawley Python Engineer @ 10gen Web developer since 1999 Passionate about open source Agile methodology email: [email protected] twitter: RossC0 Today's Talk Quick

More information

Introduction to NoSQL and MongoDB. Kathleen Durant Lesson 20 CS 3200 Northeastern University

Introduction to NoSQL and MongoDB. Kathleen Durant Lesson 20 CS 3200 Northeastern University Introduction to NoSQL and MongoDB Kathleen Durant Lesson 20 CS 3200 Northeastern University 1 Outline for today Introduction to NoSQL Architecture Sharding Replica sets NoSQL Assumptions and the CAP Theorem

More information

MongoDB in the NoSQL and SQL world. Horst Rechner [email protected] Berlin, 2012-05-15

MongoDB in the NoSQL and SQL world. Horst Rechner horst.rechner@fokus.fraunhofer.de Berlin, 2012-05-15 MongoDB in the NoSQL and SQL world. Horst Rechner [email protected] Berlin, 2012-05-15 1 MongoDB in the NoSQL and SQL world. NoSQL What? Why? - How? Say goodbye to ACID, hello BASE You

More information

MongoDB Developer and Administrator Certification Course Agenda

MongoDB Developer and Administrator Certification Course Agenda MongoDB Developer and Administrator Certification Course Agenda Lesson 1: NoSQL Database Introduction What is NoSQL? Why NoSQL? Difference Between RDBMS and NoSQL Databases Benefits of NoSQL Types of NoSQL

More information

MongoDB. Or how I learned to stop worrying and love the database. Mathias Stearn. N*SQL Berlin October 22th, 2009. 10gen

MongoDB. Or how I learned to stop worrying and love the database. Mathias Stearn. N*SQL Berlin October 22th, 2009. 10gen What is? Or how I learned to stop worrying and love the database 10gen N*SQL Berlin October 22th, 2009 What is? 1 What is? Document Oriented JavaScript Enabled Fast, Scalable, Available, and Reliable 2

More information

Dr. Chuck Cartledge. 15 Oct. 2015

Dr. Chuck Cartledge. 15 Oct. 2015 CS-695 NoSQL Database MongoDB (part 2 of 2) Dr. Chuck Cartledge 15 Oct. 2015 1/17 Table of contents I 1 Miscellanea 2 Assignment #4 3 DB comparisons 4 Extensions 6 Midterm 7 Conclusion 8 References 5 Summary

More information

Comparing SQL and NOSQL databases

Comparing SQL and NOSQL databases COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2015 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations

More information

Getting Started with MongoDB

Getting Started with MongoDB Getting Started with MongoDB TCF IT Professional Conference March 14, 2014 Michael P. Redlich @mpredli about.me/mpredli/ 1 1 Who s Mike? BS in CS from Petrochemical Research Organization Ai-Logix, Inc.

More information

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY ANN KELLY II MANNING Shelter Island contents foreword preface xvii xix acknowledgments xxi about this book xxii Part 1 Introduction

More information

Cloud Scale Distributed Data Storage. Jürmo Mehine

Cloud Scale Distributed Data Storage. Jürmo Mehine Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented

More information

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre NoSQL systems: introduction and data models Riccardo Torlone Università Roma Tre Why NoSQL? In the last thirty years relational databases have been the default choice for serious data storage. An architect

More information

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Zettabytes Petabytes ABC Sharding A B C Id Fn Ln Addr 1 Fred Jones Liberty, NY 2 John Smith?????? 122+ NoSQL Database

More information

X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released

X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released General announcements In-Memory is available next month http://www.oracle.com/us/corporate/events/dbim/index.html X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released

More information

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1 Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots

More information

Can the Elephants Handle the NoSQL Onslaught?

Can the Elephants Handle the NoSQL Onslaught? Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented

More information

L7_L10. MongoDB. Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD.

L7_L10. MongoDB. Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD. L7_L10 MongoDB Agenda What is MongoDB? Why MongoDB? Using JSON Creating or Generating a Unique Key Support for Dynamic Queries Storing Binary Data Replication Sharding Terms used in RDBMS and MongoDB Data

More information

NoSQL for SQL Professionals William McKnight

NoSQL for SQL Professionals William McKnight NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to

More information

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010 System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached

More information

HBase A Comprehensive Introduction. James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367

HBase A Comprehensive Introduction. James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367 HBase A Comprehensive Introduction James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367 Overview Overview: History Began as project by Powerset to process massive

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores

More information

How To Scale Out Of A Nosql Database

How To Scale Out Of A Nosql Database Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI

More information

Integrating Big Data into the Computing Curricula

Integrating Big Data into the Computing Curricula Integrating Big Data into the Computing Curricula Yasin Silva, Suzanne Dietrich, Jason Reed, Lisa Tsosie Arizona State University http://www.public.asu.edu/~ynsilva/ibigdata/ 1 Overview Motivation Big

More information

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing Evaluating NoSQL for Enterprise Applications Dirk Bartels VP Strategy & Marketing Agenda The Real Time Enterprise The Data Gold Rush Managing The Data Tsunami Analytics and Data Case Studies Where to go

More information

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3

More information

NoSQL: Going Beyond Structured Data and RDBMS

NoSQL: Going Beyond Structured Data and RDBMS NoSQL: Going Beyond Structured Data and RDBMS Scenario Size of data >> disk or memory space on a single machine Store data across many machines Retrieve data from many machines Machine = Commodity machine

More information

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO [email protected] DAMA SF December 15, 2011

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO paul@10gen.com DAMA SF December 15, 2011 NoSQL - What we ve learned with mongodb Paul Pedersen, Deputy CTO [email protected] DAMA SF December 15, 2011 DW2.0 and NoSQL management decision support intgrated access - local v. global - structured v.

More information

Open Source Technologies on Microsoft Azure

Open Source Technologies on Microsoft Azure Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions

More information

Evaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group

Evaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group NoSQL Evaluator s Guide McKnight Consulting Group William McKnight is the former IT VP of a Fortune 50 company and the author of Information Management: Strategies for Gaining a Competitive Advantage with

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2

More information

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!) MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!) Erdélyi Ernő, Component Soft Kft. [email protected] www.component.hu 2013 (c) Component Soft Ltd Leading Hadoop Vendor Copyright 2013,

More information

Big Data & Data Science Course Example using MapReduce. Presented by Juan C. Vega

Big Data & Data Science Course Example using MapReduce. Presented by Juan C. Vega Big Data & Data Science Course Example using MapReduce Presented by What is Mongo? Why Mongo? Mongo Model Mongo Deployment Mongo Query Language Built-In MapReduce Demo Q & A Agenda Founders Max Schireson

More information

How graph databases started the multi-model revolution

How graph databases started the multi-model revolution How graph databases started the multi-model revolution Luca Garulli Author and CEO @OrientDB QCon Sao Paulo - March 26, 2015 Welcome to Big Data 90% of the data in the world today has been created in the

More information

NoSQL Database Options

NoSQL Database Options NoSQL Database Options Introduction For this report, I chose to look at MongoDB, Cassandra, and Riak. I chose MongoDB because it is quite commonly used in the industry. I chose Cassandra because it has

More information

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Summary Xiangzhe Li Nowadays, there are more and more data everyday about everything. For instance, here are some of the astonishing

More information

.NET User Group Bern

.NET User Group Bern .NET User Group Bern Roger Rudin bbv Software Services AG [email protected] Agenda What is NoSQL Understanding the Motivation behind NoSQL MongoDB: A Document Oriented Database NoSQL Use Cases What is

More information

NoSQL. Thomas Neumann 1 / 22

NoSQL. Thomas Neumann 1 / 22 NoSQL Thomas Neumann 1 / 22 What are NoSQL databases? hard to say more a theme than a well defined thing Usually some or all of the following: no SQL interface no relational model / no schema no joins,

More information

The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect

The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect IT Insight podcast This podcast belongs to the IT Insight series You can subscribe to the podcast through

More information

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & Innovation 04-08-2011 to the EC 8 th February, Luxembourg Your Atos business Research technologists. and Innovation

More information

Using distributed technologies to analyze Big Data

Using distributed technologies to analyze Big Data Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/

More information

NoSQL Databases. Nikos Parlavantzas

NoSQL Databases. Nikos Parlavantzas !!!! NoSQL Databases Nikos Parlavantzas Lecture overview 2 Objective! Present the main concepts necessary for understanding NoSQL databases! Provide an overview of current NoSQL technologies Outline 3!

More information

NoSQL Data Base Basics

NoSQL Data Base Basics NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS

More information

NoSQL in der Cloud Why? Andreas Hartmann

NoSQL in der Cloud Why? Andreas Hartmann NoSQL in der Cloud Why? Andreas Hartmann 17.04.2013 17.04.2013 2 NoSQL in der Cloud Why? Quelle: http://res.sys-con.com/story/mar12/2188748/cloudbigdata_0_0.jpg Why Cloud??? 17.04.2013 3 NoSQL in der Cloud

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,

More information

Big Data. Facebook Wall Data using Graph API. Presented by: Prashant Patel-2556219 Jaykrushna Patel-2619715

Big Data. Facebook Wall Data using Graph API. Presented by: Prashant Patel-2556219 Jaykrushna Patel-2619715 Big Data Facebook Wall Data using Graph API Presented by: Prashant Patel-2556219 Jaykrushna Patel-2619715 Outline Data Source Processing tools for processing our data Big Data Processing System: Mongodb

More information

Database Scalability and Oracle 12c

Database Scalability and Oracle 12c Database Scalability and Oracle 12c Marcelle Kratochvil CTO Piction ACE Director All Data/Any Data [email protected] Warning I will be covering topics and saying things that will cause a rethink in

More information

Big Data, Fast Data, Complex Data. Jans Aasman Franz Inc

Big Data, Fast Data, Complex Data. Jans Aasman Franz Inc Big Data, Fast Data, Complex Data Jans Aasman Franz Inc Private, founded 1984 AI, Semantic Technology, professional services Now in Oakland Franz Inc Who We Are (1 (2 3) (4 5) (6 7) (8 9) (10 11) (12

More information

Introduction to Apache Cassandra

Introduction to Apache Cassandra Introduction to Apache Cassandra White Paper BY DATASTAX CORPORATION JULY 2013 1 Table of Contents Abstract 3 Introduction 3 Built by Necessity 3 The Architecture of Cassandra 4 Distributing and Replicating

More information

Not Relational Models For The Management of Large Amount of Astronomical Data. Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF)

Not Relational Models For The Management of Large Amount of Astronomical Data. Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF) Not Relational Models For The Management of Large Amount of Astronomical Data Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF) What is a DBMS A Data Base Management System is a software infrastructure

More information

NoSQL storage and management of geospatial data with emphasis on serving geospatial data using standard geospatial web services

NoSQL storage and management of geospatial data with emphasis on serving geospatial data using standard geospatial web services NoSQL storage and management of geospatial data with emphasis on serving geospatial data using standard geospatial web services Pouria Amirian, Adam Winstanley, Anahid Basiri Department of Computer Science,

More information

Practical Cassandra. Vitalii Tymchyshyn [email protected] @tivv00

Practical Cassandra. Vitalii Tymchyshyn tivv00@gmail.com @tivv00 Practical Cassandra NoSQL key-value vs RDBMS why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints Vitalii Tymchyshyn

More information

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases Comparison of the Frontier Distributed Database Caching System with NoSQL Databases Dave Dykstra [email protected] Fermilab is operated by the Fermi Research Alliance, LLC under contract No. DE-AC02-07CH11359

More information

An Approach to Implement Map Reduce with NoSQL Databases

An Approach to Implement Map Reduce with NoSQL Databases www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA Ompal Singh Assistant Professor, Computer Science & Engineering, Sharda University, (India) ABSTRACT In the new era of distributed system where

More information

NoSQL Systems for Big Data Management

NoSQL Systems for Big Data Management NoSQL Systems for Big Data Management Venkat N Gudivada East Carolina University Greenville, North Carolina USA Venkat Gudivada NoSQL Systems for Big Data Management 1/28 Outline 1 An Overview of NoSQL

More information

Introduction to Polyglot Persistence. Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace

Introduction to Polyglot Persistence. Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace Introduction to Polyglot Persistence Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace FOSSCOMM 2016 Background - 14 years in databases and system engineering - NoSQL DBA @ ObjectRocket

More information

How To Improve Performance In A Database

How To Improve Performance In A Database Some issues on Conceptual Modeling and NoSQL/Big Data Tok Wang Ling National University of Singapore 1 Database Models File system - field, record, fixed length record Hierarchical Model (IMS) - fixed

More information

nosql and Non Relational Databases

nosql and Non Relational Databases nosql and Non Relational Databases Image src: http://www.pentaho.com/big-data/nosql/ Matthias Lee Johns Hopkins University What NoSQL? Yes no SQL.. Atleast not only SQL Large class of Non Relaltional Databases

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2014/15 Unit 15 J. Gamper 1/44 Advanced Data Management Technologies Unit 15 Introduction to NoSQL J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE ADMT 2014/15 Unit 15

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh [email protected] The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

How To Use Big Data For Telco (For A Telco)

How To Use Big Data For Telco (For A Telco) ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

A survey of big data architectures for handling massive data

A survey of big data architectures for handling massive data CSIT 6910 Independent Project A survey of big data architectures for handling massive data Jordy Domingos - [email protected] Supervisor : Dr David Rossiter Content Table 1 - Introduction a - Context

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current

More information

The MongoDB Tutorial Introduction for MySQL Users. Stephane Combaudon April 1st, 2014

The MongoDB Tutorial Introduction for MySQL Users. Stephane Combaudon April 1st, 2014 The MongoDB Tutorial Introduction for MySQL Users Stephane Combaudon April 1st, 2014 Agenda 2 Introduction Install & First Steps CRUD Aggregation Framework Performance Tuning Replication and High Availability

More information

INTRODUCTION TO CASSANDRA

INTRODUCTION TO CASSANDRA INTRODUCTION TO CASSANDRA This ebook provides a high level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster.

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster. MongoDB 1. Introduction MongoDB is a document-oriented database, not a relation one. It replaces the concept of a row with a document. This makes it possible to represent complex hierarchical relationships

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

NoSQL and Hadoop Technologies On Oracle Cloud

NoSQL and Hadoop Technologies On Oracle Cloud NoSQL and Hadoop Technologies On Oracle Cloud Vatika Sharma 1, Meenu Dave 2 1 M.Tech. Scholar, Department of CSE, Jagan Nath University, Jaipur, India 2 Assistant Professor, Department of CSE, Jagan Nath

More information

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems

More information

Structured Data Storage

Structured Data Storage Structured Data Storage Xgen Congress Short Course 2010 Adam Kraut BioTeam Inc. Independent Consulting Shop: Vendor/technology agnostic Staffed by: Scientists forced to learn High Performance IT to conduct

More information

Business Intelligence for Big Data

Business Intelligence for Big Data Business Intelligence for Big Data Will Gorman, Vice President, Engineering May, 2011 2010, Pentaho. All Rights Reserved. www.pentaho.com. What is BI? Business Intelligence = reports, dashboards, analysis,

More information

Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco

Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco Decoding the Big Data Deluge a Virtual Approach Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco High-volume, velocity and variety information assets that demand

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

Introduction to NoSQL Databases and MapReduce. Tore Risch Information Technology Uppsala University 2014-05-12

Introduction to NoSQL Databases and MapReduce. Tore Risch Information Technology Uppsala University 2014-05-12 Introduction to NoSQL Databases and MapReduce Tore Risch Information Technology Uppsala University 2014-05-12 What is a NoSQL Database? 1. A key/value store Basic index manager, no complete query language

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

http://glennengstrand.info/analytics/fp

http://glennengstrand.info/analytics/fp Functional Programming and Big Data by Glenn Engstrand (September 2014) http://glennengstrand.info/analytics/fp What is Functional Programming? It is a style of programming that emphasizes immutable state,

More information

Making Sense of NoSQL Dan McCreary Ann Kelly

Making Sense of NoSQL Dan McCreary Ann Kelly Sample Chapter Making Sense of NoSQL Dan McCreary Ann Kelly Chapter 1 Copyright 2013 Manning Publications brief contents PART 1 INTRODUCTION...1 1 NoSQL: It s about making intelligent choices 3 2 NoSQL

More information

Big Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases

Big Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases Big Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases Introduction The world is awash in data and turning that data into actionable

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

NOSQL DATABASES AND CASSANDRA

NOSQL DATABASES AND CASSANDRA NOSQL DATABASES AND CASSANDRA Semester Project: Advanced Databases DECEMBER 14, 2015 WANG CAN, EVABRIGHT BERTHA Université Libre de Bruxelles 0 Preface The goal of this report is to introduce the new evolving

More information

Preparing Your Data For Cloud

Preparing Your Data For Cloud Preparing Your Data For Cloud Narinder Kumar Inphina Technologies 1 Agenda Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability

More information

Sentimental Analysis using Hadoop Phase 2: Week 2

Sentimental Analysis using Hadoop Phase 2: Week 2 Sentimental Analysis using Hadoop Phase 2: Week 2 MARKET / INDUSTRY, FUTURE SCOPE BY ANKUR UPRIT The key value type basically, uses a hash table in which there exists a unique key and a pointer to a particular

More information

Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com [email protected]

Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com [email protected] How do you model data in the cloud? Relational Model A query operation on a relation

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

Big Data and Hadoop with components like Flume, Pig, Hive and Jaql

Big Data and Hadoop with components like Flume, Pig, Hive and Jaql Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.

More information

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #13: NoSQL and MapReduce

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #13: NoSQL and MapReduce CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #13: NoSQL and MapReduce Announcements HW4 is out You have to use the PGSQL server START EARLY!! We can not help if everyone

More information

Slave. Master. Research Scholar, Bharathiar University

Slave. Master. Research Scholar, Bharathiar University Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper online at: www.ijarcsse.com Study on Basically, and Eventually

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Applications for Big Data Analytics

Applications for Big Data Analytics Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:

More information