L7_L10. MongoDB. Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD.

Similar documents
Getting Started with MongoDB

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster.

MongoDB Developer and Administrator Certification Course Agenda

Dr. Chuck Cartledge. 15 Oct. 2015

Please ask questions! Have people used non-relational dbs before? MongoDB?

MongoDB: document-oriented database

Big Data & Data Science Course Example using MapReduce. Presented by Juan C. Vega

NOSQL INTRODUCTION WITH MONGODB AND RUBY GEOFF

Building Your First MongoDB Application

SURVEY ON MONGODB: AN OPEN- SOURCE DOCUMENT DATABASE

NoSQL: Going Beyond Structured Data and RDBMS

MongoDB Aggregation and Data Processing Release 3.0.4

MongoDB Aggregation and Data Processing

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO paul@10gen.com DAMA SF December 15, 2011

Certified MongoDB Professional VS-1058

MongoDB. Or how I learned to stop worrying and love the database. Mathias Stearn. N*SQL Berlin October 22th, gen

The MongoDB Tutorial Introduction for MySQL Users. Stephane Combaudon April 1st, 2014

NoSQL Database - mongodb

Cloud Scale Distributed Data Storage. Jürmo Mehine

MongoDB Aggregation and Data Processing

Lecture 21: NoSQL III. Monday, April 20, 2015

Hacettepe University Department Of Computer Engineering BBM 471 Database Management Systems Experiment

Open source, high performance database

Can the Elephants Handle the NoSQL Onslaught?

MongoDB. The Definitive Guide to. The NoSQL Database for Cloud and Desktop Computing. Apress8. Eelco Plugge, Peter Membrey and Tim Hawkins

DbSchema Tutorial with Introduction in MongoDB

Brad Dayley. NoSQL with MongoDB

Document Oriented Database

NoSQL Databases, part II

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Application of NoSQL Database in Web Crawling

Big Data. Facebook Wall Data using Graph API. Presented by: Prashant Patel Jaykrushna Patel

.NET User Group Bern

A Brief Introduction to MySQL

Visual Statement. NoSQL Data Storage. MongoDB Project. April 10, Bobby Esfandiari Stefan Schielke Nicole Saat

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

MongoDB. An introduction and performance analysis. Seminar Thesis

Introduction to NoSQL and MongoDB. Kathleen Durant Lesson 20 CS 3200 Northeastern University

ADVANCED DATABASES PROJECT. Juan Manuel Benítez V Gledys Sulbaran

these three NoSQL databases because I wanted to see a the two different sides of the CAP

Cloud Powered Mobile Apps with Azure

Cloudant Querying Options

An Approach to Implement Map Reduce with NoSQL Databases

Using distributed technologies to analyze Big Data

Facebook Twitter YouTube Google Plus Website

Once the schema has been designed, it can be implemented in the RDBMS.

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions

CSCC09F Programming on the Web. Mongo DB

Frictionless Persistence in.net with MongoDB. Mogens Heller Grabe Trifork

NoSQL web apps. w/ MongoDB, Node.js, AngularJS. Dr. Gerd Jungbluth, NoSQL UG Cologne,

MONGODB - THE NOSQL DATABASE

COSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015

NoSQL in der Cloud Why? Andreas Hartmann

Understanding NoSQL Technologies on Windows Azure

Perl & NoSQL Focus on MongoDB. Jean-Marie Gouarné jmgdoc@cpan.org

Spring Data, Jongo & Co. Java Persistenz-Frameworks für MongoDB

Lab 9 Access PreLab Copy the prelab folder, Lab09 PreLab9_Access_intro

Scaling with MongoDB. by Michael Schurter Scaling with MongoDB by Michael Schurter - OS Bridge,

Visualizing MongoDB Objects in Concept and Practice

HO5604 Deploying MongoDB. A Scalable, Distributed Database with SUSE Cloud. Alejandro Bonilla. Sales Engineer abonilla@suse.com

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015

PHP and MongoDB Web Development Beginners Guide by Rubayeet Islam

NoSQL Databases. Nikos Parlavantzas

CSE 530A Database Management Systems. Introduction. Washington University Fall 2013

SQL. Short introduction

In Memory Accelerator for MongoDB

A table is a collection of related data entries and it consists of columns and rows.

MyOra 3.0. User Guide. SQL Tool for Oracle. Jayam Systems, LLC

MongoDB and Couchbase

Building a RESTful service with MongoDB and Spring MVC

Structured Data Storage

Advanced Data Management Technologies

Big Data Visualization with JReport

Databases for text storage

MongoDB in the NoSQL and SQL world. Horst Rechner Berlin,

database abstraction layer database abstraction layers in PHP Lukas Smith BackendMedia

Predictive Research Inc., Predict & Benefit

Data Model Design for MongoDB

NORWEGIAN UNIVERSITY OF SCIENCE AND TECHNOLOGY DEPARTMENT OF CHEMICAL ENGINEERING ADVANCED PROCESS SIMULATION. SQL vs. NoSQL

How To Create A Table In Sql (Ahem)

Integrating VoltDB with Hadoop

Big data and urban mobility

Access Queries (Office 2003)

Title. Syntax. stata.com. odbc Load, write, or view data from ODBC sources. List ODBC sources to which Stata can connect odbc list

An Open Source NoSQL solution for Internet Access Logs Analysis

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

MS Access Lab 2. Topic: Tables

Database Management System Choices. Introduction To Database Systems CSE 373 Spring 2013

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН

Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD. Introduction to Pig

Data Warehouse and Hive. Presented By: Shalva Gelenidze Supervisor: Nodar Momtselidze

2874CD1EssentialSQL.qxd 6/25/01 3:06 PM Page 1 Essential SQL Copyright 2001 SYBEX, Inc., Alameda, CA

Transcription:

L7_L10 MongoDB

Agenda What is MongoDB? Why MongoDB? Using JSON Creating or Generating a Unique Key Support for Dynamic Queries Storing Binary Data Replication Sharding Terms used in RDBMS and MongoDB Data Types in MongoDB CRUD (Insert(), Update(), Save(), Remove(), Find()) MapReduce Functions Aggregation Java Scripting MongoImport MongoExport

What is MongoDB? MongoDB is: 1. Cross-platform. 2. Open source. 3. Non-relational. 4. Distributed. 5. NoSQL. 6. Document-oriented data store.

Why MongoDB?

Why MongoDB? Open Source Distributed Fast In-Place Updates Replication Full Index Support Rich Query Language Easy Scalability Auto sharding

Database Collection - Document Database: Collection of collections Created by a reference or on demand MongoDB several DBs Set of files Collection: Analogous to a table of RDBMS Created on demand or attempt to save a document that references it Hold several MongoDB documents Does not enforce a schema: no. of fields / order of fields may vary Document: Analogous to a tuple in an RDBMS table

JSON

JSON (Java Script Object Notation) Sample JSON Document { FirstName: John, LastName: Mathews, ContactNo: [+123 4567 8900, +123 4444 5555] } { FirstName: Andrews, LastName: Symmonds, ContactNo: +456 7890 1234 }

Unique Identifier Each JSON document should have a unique identifier. It is the _id key. 0 1 2 3 4 5 6 7 8 9 10 11 Timestamp Machine ID Process ID Counter

Support for Dynamic Queries MongoDB has extensive support for dynamic queries. This is in keeping with traditional RDBMS wherein we have static data and dynamic queries.

Storing Binary Data MongoDB provides GridFS to support the storage of binary data. It can store up to 4 MB of data. Metadata: file Data chunks chunks collection scalability

Replication in MongoDB Client Application Writes Reads Primary Replication Replication Replication Secondary Secondary Secondary

Sharding in MongoDB Collection 1 1 TB database Shard 1 Shard 2 Shard 3 Shard 4 (256 GB) (256 GB) (256 GB) Logical Database (Collection 1) (256 GB)

Terms Used in RDBMS and MongoDB

Terms Used in RDBMS and MongoDB RDBMS Database Table Record Columns Index Joins Primary Key MongoDB Database Collection Document Fields / Key Value pairs Index Embedded documents Primary key (_id is a identifier)

Data Types in MongoDB

Data Types in MongoDB String Integer Boolean Double Min/Max keys Arrays Timestamp Null Date Object ID Binary data Code Regular expression Must be UTF-8 valid. Most commonly used data type. Can be 32-bit or 64-bit (depends on the server). To store a true/false value. To store floating point (real values). To compare a value against the lowest or highest BSON elements. To store arrays or list or multiple values into one key. To record when a document has been modified or added. To store a NULL value. A NULL is a missing or unknown value. To store the current date or time in Unix time format. One can create object of date and pass day, month and year to it. To store the document s id. To store binary data (images, binaries, etc.). To store javascript code into the document. To store regular expression.

CRUD in MongoDB

Database To create a DB use mydb; To confirm the existence of DB db; To get a list of all DBs show dbs To Drop a DB db.dropdatabase(); To switch to a new DB use mydb1; To display the list of collection show collections TO display statistics db.stats();

Collections To create a collection by the name Person. Let us take a look at the collection list prior to the creation of the new collection Person. db.createcollection( Person );

Collections To drop a collection by the name Person. db.person.drop();

Insert Method Create a collection by the name Students and store the following data in it. db.students.insert({_id:1, StudName:"Michelle Jacintha", Grade: "VII", Hobbies: "Internet Surfing"}); show collections db.students.find(); db.students.find().pretty();

Insert additional records db.students.insert({_id:2, StudName:"Mabel Mathews", Grade: VII", Hobbies: Baseball"}); Show collections db.students.find(); db.students.find().pretty();

Update Method Insert the document for Aryan David into the Students collection only if it does not already exist in the collection. However, if it is already present in the collection, then update the document with new values. (Update his Hobbies from Skating to Chess.) Use Update else insert (if there is an existing document, it will attempt to update it, if there is no existing document then it will insert it). db.students.update({_id:3, StudName:"Aryan David", Grade: "VII"},{$set:{Hobbies: "Skating"}},{upsert:false}); db.students.update({_id:3, StudName:"Aryan David", Grade: "VII"},{$set:{Hobbies: "Skating"}},{upsert:true}); Update the Hobbies of Aryand David to Chess

Find Method To search for documents from the Students collection based on certain search criteria. db.students.find(); db.students.find({_id:3}); db.students.find({grade: VII }); db.students.find({studname:"aryan David"});

Find Method To display only the StudName and Grade from all the documents of the Students collection. The identifier _id should be suppressed and NOT displayed. db.students.find({},{studname:1,grade:1}); db.students.find({},{studname:1,grade:1,_id:0}); db.students.find({grade: VII }).pretty().limit(2);

Save Method db.students.save({_id:4,studname: Vamsi Bapat", Grade: VI"}); db.students.find().pretty(); db.students.save({_id:4,studname: Vamsi Bapat", Grade: VI, Hobbies: Cricket }); db.students.find().pretty();

Adding a new field db.students.update({_id:4},{$set:{location: Bangalore"}}) db.students.find().pretty(); db.students.update({grade: VII },{$set:{location: Mumbai }}) db.students.find().pretty(); db.students.update({grade: VII }, "},{$set:{location: Mumbai"}}, {multi:true}) db.students.find().pretty();

Removing a document / field db.students.remove({_id:4}) db.students.find().pretty(); db.students.insert({_id:4,studname: VamsiBapat",Grade: VI, Hobbies: Cricket }); db.students.update({_id:4}, "},{$unset:{location: Bangalore"}}) db.students.find({_id:4}).pretty();

Find Method To find those documents where the Grade is set to VII db.students.find({grade:{$eq:'vii'}}).pretty(); Other relational operators: ne, gt, lt, gte, lte

Find Method To find those documents from the Students collection where the Hobbies is set to either Chess or is set to Skating. db.students.find ({Hobbies :{ $in: ['Chess','Skating']}}).pretty (); db.students.find ({Hobbies :{ $nin: [ Cricket']}}).pretty ();

Find Method To find documents from the Students collection where the StudName begins with M. db.students.find({studname:/^m/}).pretty();

Find Method To find documents from the Students collection where the StudName has an e in any position. db.students.find({studname:/e/}).pretty();

Find Method To find documents from the Students collection where the StudName ends in a. db.students.find({studname:/a$/}).pretty();

Find Method To find documents from the Students collection where id is 3 or 4. db.students.find({$or:[{_id:3},{_id:4}]}).pretty();

Find Method To find documents from the Students collection where location is null. db.students.find({location:null}).pretty();

Find Method To find the number of documents in the Students collection. db.students.count(); db.students.count({grade: VII });

Find Method To sort the documents from the Students collection db.students.find().sort({studname:1}).pretty(); // ascending order db.students.find().sort({studname:-1}).pretty(); db.students.find().sort({studname:1, Hobbies: -1}).pretty();

Find Method To display the records from the Students collection db.students.find().skip(2).pretty(); // skip first 2 documents from the output db.students.find().skip(1).sort({studname:1}).pretty(); db.students.find().skip(db.students.count()-2).pretty(); //last two documents db.students.find().skip(2).limit(2).pretty(); //display 3 rd and 4 th documents

Aggregate Function

Aggregate Function { } { } { } { } CustID: C123, AccBal: 500, AccType: S CustID: C123, AccBal: 900, AccType: S CustID: C111, AccBal: 1200, AccType: S CustID: C123, AccBal: 1500, AccType: C $match { } { } { } CustID: C123, AccBal: 500, AccType: S CustID: C123, AccBal: 900, AccType: S CustID: C111, AccBal: 1200, AccType: S $group { { _id: C123, TotAccBal: 1400 } _id: C111, TotAccBal: 1200 } Customers

Aggregate Function First filter on AccType:S and then group it on CustID and then compute the sum of AccBal and then filter those documents wherein the TotAccBal is greater than 1200, use the below syntax: // Insert customers documents db.customers.insert([{custid:"c123",accbal:500,acctype:"s"}, {CustID:"C123",AccBal:900,AccType:"S"}, {CustID:"C11",AccBal:1200,AccType:"S"}, {CustID:"C123",AccBal:1500,AccType:"C"}]);

Aggregate Function First filter on AccType:S and then group it on CustID and then compute the sum of AccBal db.customers.aggregate( { $match : {AccType : "S" } }, { $group : { _id : "$CustID",TotAccBal : { $sum : "$AccBal" } } } ); Compute average(avg)/maximum(max)/minimum(min) balance

Aggregate Function First filter on AccType:S and then group it on CustID and then compute the sum of AccBal and then filter those documents wherein the TotAccBal is greater than 1200, use the below syntax: db.customers.aggregate( { $match : {AccType : "S" } }, { $group : { _id : "$CustID",TotAccBal : { $sum : "$AccBal" } } }, { $match : {TotAccBal : { $gt : 1200 } }});

MapReduce

MapReduce Framework { } { } { } { } CustID: C123, AccBal: 500, AccType: S CustID: C123, AccBal: 900, AccType: S CustID: C111, AccBal: 1200, AccType: S CustID: C123, AccBal: 1500, AccType: C query { } { } { } CustID: C123, AccBal: 500, AccType: S CustID: C123, AccBal: 900, AccType: S CustID: C111, AccBal: 1200, AccType: S map { C123 :[ 500,900 ]} { C111 : 1200 } { } { } _id: C123, value: 1400 _id: C111, value: 1200 Customer_Totals Customers

MapReduce 1. Map function: var map = function() { emit(this.custid, this.accbal); } 2. Reduce Function: var reduce = function(key, values) { return Array.sum(values) ; }

MapReduce 1. Map function: var map = function() { emit(this.custid, this.accbal); } 2. Reduce Function: var reduce = function(key, values) { return Array.sum(values) ; } 3. Map-Reduce query: db.customers.mapreduce( map, reduce, {out: Customer_Totals, query: {AccType: S }} ); 4. Output: db.customer_totals.find();

Arrays

Arrays Create a collection by the name food and then insert documents into the foold collection db.food.insert({_id:1, fruits:['banana', 'apple', 'cherry']}); db.food.insert({_id:2, fruits:['orange', 'butterfruit', 'mango']}); db.food.insert({_id:3, fruits:['peneapple', 'strawberry', 'grapes']});

Arrays Find those collections from the food collection which the fruits array constituted of banana, apple and cherry. db.food.find({fruits:['banana','apple','cherry']});

Arrays Find those collections from the food collection which the fruits array having banana as an element. db.food.find({fruits: banana });

Arrays Find those collections from the food collection which the fruits array having banana in the first index position db.food.find({ fruits.0 : banana });

Arrays Find those collections from the food collection where size of the array is two. db.food.find({fruits: {$size: 2}});

Arrays Find collections from the food collection and display first two elements db.food.find({},{fruits: {$slice: 2}});

Arrays Find collections from the food collection and display two elements starting with the element at 1 st index position db.food.find({},{fruits: {$slice: [1,2]}});

Arrays Update the element at 0th index position of document with _id:3 by apple db.food.update({_id:3},{$set: { fruits.0 : apple }});

Arrays Update the document with _id:1 and push new key value pairs in the fruits array db.food.update({_id:1},{$push: {price:{banane: 50, apple:150, cherry:100}}});

Arrays Update document with _id:3 by adding an element an Orange db.food.update({_id:3},{$addtoset:{fruits:'orange'}});

Arrays Update document with _id:3 by popping an element db.food.update({_id:3},{$pop:{fruits:1}});

Arrays Update document with _id:3 by popping an element from the beginning of the array db.food.update({_id:3},{$pop:{fruits:-1}});

Arrays Update document with _id:2 by popping two elements from the list : orange and mango. db.food.update({_id:2},{$pullall:{fuits:['orange','mango']}});

Arrays Update document with _id:2 by popping two elements from the list : orange and mango. db.food.update({_id:2},{$pullall:{fuits:['orange','mango']}});

Arrays To pull out an array element based on index position. db.food.update({_id:1},{$unset:{"fuits.1":null}}); db.food.update({_id:1},{$pull:{"fuits":null}});

Cursors

Cursors To create a collection of a, b,. z.by the name alphabets db.alphabets.insert({_id:1, alphabet: a });. db.alphabets.insert({_id:26, alphabet: z }); var mycur = db.alphabets.find(); mycur

Cursors: hasnext(), next() Diplay all alphabets using cursor: var mycur = db.alphabets.find(); while (mycur.hasnext()) { var myrec= mycur.next(); print( The alphabet is: + myrec.alphabet); }

Indexes

Indexes Create an index on AccType for Customers Collection db.customers.ensureindex( AccType :1}); db.customers.find({"acctype":"s"}).hint({"acctype":1});

Java Script Programming

Java Script Programming To compute the factorial of a given positive number. The user is required to create a function by the name factorial and insert it into the system.js collection.

MongoImport

Import data from a CSV file Given a CSV file sample.txt in the D: drive, import the file into the MongoDB collection, SampleJSON. The collection is in the database test. Mongoimport --db test --collection SampleJSON --type csv --headerline --file d:\sample.txt

MongoExport

Export data to a CSV file This command used at the command prompt exports MongoDB JSON documents from Customers collection in the test database into a CSV file Output.txt in the D: drive. Mongoexport --db test --collection Customers --csv --fieldfile d:\fields.txt --out d:\output.txt

Answer a few quick questions

Crossword

Answer Me What is MongoDB? Comment on Auto-sharding in MongoDB. What are collections and documents? What is JSON? Explain your understanding of Update In-Place.

References

Further Readings http://www.mongodb.org/ https://university.mongodb.com/ http://www.tutorialspoint.com/mongodb/

Thank you