Building Your First MongoDB Application

Similar documents
Open source, high performance database

MongoDB. Or how I learned to stop worrying and love the database. Mathias Stearn. N*SQL Berlin October 22th, gen

Getting Started with MongoDB

MongoDB Developer and Administrator Certification Course Agenda

.NET User Group Bern

L7_L10. MongoDB. Big Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD.

MongoDB: document-oriented database

MongoDB in the NoSQL and SQL world. Horst Rechner Berlin,

NoSQL - What we ve learned with mongodb. Paul Pedersen, Deputy CTO paul@10gen.com DAMA SF December 15, 2011

Big Data & Data Science Course Example using MapReduce. Presented by Juan C. Vega

NoSQL: Going Beyond Structured Data and RDBMS

Please ask questions! Have people used non-relational dbs before? MongoDB?

NoSQL in der Cloud Why? Andreas Hartmann

NOSQL INTRODUCTION WITH MONGODB AND RUBY GEOFF

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions

Introduction to NoSQL and MongoDB. Kathleen Durant Lesson 20 CS 3200 Northeastern University

Cloud Scale Distributed Data Storage. Jürmo Mehine

MONGODB - THE NOSQL DATABASE

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster.

Big Data. Facebook Wall Data using Graph API. Presented by: Prashant Patel Jaykrushna Patel

Document Oriented Database

NoSQL Databases. Nikos Parlavantzas

The MongoDB Tutorial Introduction for MySQL Users. Stephane Combaudon April 1st, 2014

Brad Dayley. NoSQL with MongoDB

MongoDB. The Definitive Guide to. The NoSQL Database for Cloud and Desktop Computing. Apress8. Eelco Plugge, Peter Membrey and Tim Hawkins

Dr. Chuck Cartledge. 15 Oct. 2015

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)

SURVEY ON MONGODB: AN OPEN- SOURCE DOCUMENT DATABASE

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

HO5604 Deploying MongoDB. A Scalable, Distributed Database with SUSE Cloud. Alejandro Bonilla. Sales Engineer abonilla@suse.com

NoSQL Database - mongodb

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

How graph databases started the multi-model revolution

Certified MongoDB Professional VS-1058

Application of NoSQL Database in Web Crawling

An Approach to Implement Map Reduce with NoSQL Databases

Scaling with MongoDB. by Michael Schurter Scaling with MongoDB by Michael Schurter - OS Bridge,

NoSQL replacement for SQLite (for Beatstream) Antti-Jussi Kovalainen Seminar OHJ-1860: NoSQL databases

Integrating Big Data into the Computing Curricula

Comparing SQL and NOSQL databases

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

Frictionless Persistence in.net with MongoDB. Mogens Heller Grabe Trifork

Domain driven design, NoSQL and multi-model databases

Can the Elephants Handle the NoSQL Onslaught?

MongoDB. An introduction and performance analysis. Seminar Thesis

MongoDB Aggregation and Data Processing Release 3.0.4

MongoDB Aggregation and Data Processing

MongoDB Aggregation and Data Processing

Hacettepe University Department Of Computer Engineering BBM 471 Database Management Systems Experiment

How To Handle Big Data With A Data Scientist

X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

Advanced Data Management Technologies

CSCC09F Programming on the Web. Mongo DB

Building a RESTful service with MongoDB and Spring MVC

PHP and MongoDB Web Development Beginners Guide by Rubayeet Islam

Sentimental Analysis using Hadoop Phase 2: Week 2

Big data and urban mobility

Structured Data Storage

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

Lecture 21: NoSQL III. Monday, April 20, 2015

3 Case Studies of NoSQL and Java Apps in the Real World

the missing log collector Treasure Data, Inc. Muga Nishizawa

ADVANCED DATABASES PROJECT. Juan Manuel Benítez V Gledys Sulbaran

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015

An Open Source NoSQL solution for Internet Access Logs Analysis

these three NoSQL databases because I wanted to see a the two different sides of the CAP

MongoDB and Couchbase

Evaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group

Spring Data, Jongo & Co. Java Persistenz-Frameworks für MongoDB

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

Lecture Data Warehouse Systems

Perl & NoSQL Focus on MongoDB. Jean-Marie Gouarné jmgdoc@cpan.org

Data Model Design for MongoDB

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Practical Cassandra. Vitalii

Big Systems, Big Data

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

Spark Application Carousel. Spark Summit East 2015

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

Lecture 10: HBase! Claudia Hauff (Web Information Systems)!

Scalable Architecture on Amazon AWS Cloud

NoSQL Systems for Big Data Management

@tobiastrelle. codecentric AG 1

Using distributed technologies to analyze Big Data

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

How To Compare The Economics Of A Database To A Microsoft Database

In Memory Accelerator for MongoDB

NoSQL web apps. w/ MongoDB, Node.js, AngularJS. Dr. Gerd Jungbluth, NoSQL UG Cologne,

Configuration and Deployment Guide for the Cassandra NoSQL Data Store on Intel Architecture

Transcription:

Building Your First MongoDB Application

Ross Lawley Python Engineer @ 10gen Web developer since 1999 Passionate about open source Agile methodology email: ross@10gen.com twitter: RossC0

Today's Talk Quick introduction to NoSQL Some Background about mongodb Using mongodb, general querying and usage Building your first app, creating a location-based app Data modelling in mongodb, queries, geospatial, updates and map reduce (examples work in mongodb JS shell)

What is NoSQL? Key / Value Column Graph Document

Key-Value Stores A mapping from a key to a value The store doesn't know anything about the the key or value The store doesn't know anything about the insides of the value Operations Set, get, or delete a key-value pair

Column-Oriented Stores Like a relational store, but flipped around: all data for a column is kept together An index provides a means to get a column value for a record Operations: Get, insert, delete records; updating fields Streaming column data in and out of Hadoop

Graph Databases Stores vertex-to-vertex edges Operations: Getting and setting edges Sometimes possible to annotate vertices or edges Query languages support finding paths between vertices, subject to various constraints

Document Stores The store is a container for documents Documents are made up of named fields Fields may or may not have type definitions e.g. XSDs for XML stores, vs. schema-less JSON stores Can create "secondary indexes" These provide the ability to query on any document field(s) Operations: Insert and delete documents Update fields within documents

What is? MongoDB is a scalable, high-performance, open source NoSQL database. Document-oriented storage Full Index Support Querying Fast In-Place Updates Replication & High Availability Auto-Sharding Map/Reduce GridFS

Database Landscape scalability & performance memcached key/value RDBMS depth of functionality

History First release February 2009 v1.0 - August 2009 v1.2 - December 2009 MapReduce, ++ v1.4 - March 2010 Concurrency, Geo V1.6 - August 2010 Sharding, Replica Sets V1.8 March 2011 Journaling, Geosphere V2.0 - Sep 2011 V1 Indexes, Concurrency V2.2 - Soon - Aggregation, Concurrency

Company behind mongodb (A)GPL license, own copyrights, engineering team support, consulting, commercial license Management Google/DoubleClick, Oracle, Apple, NetApp Funding: Sequoia, Union Square, Flybridge Offices in NYC, Palo Alto, London, Dublin 100+ employees

Where can you use it? MongoDB is Implemented in C++ Platforms 32/64 bit Windows, Linux, Mac OS-X, FreeBSD, Solaris Drivers are available in many languages 10gen supported C, C# (.Net), C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala Community supported Clojure, ColdFusion, F#, Go, Groovy, Lua, R... http://www.mongodb.org/display/docs/drivers

Why use mongodb? Intrinsic support for agile development Super low latency access to your data Very little CPU overhead No additional caching layer required Built in replication and horizontal scaling support

Terminology RDBMS Table Row(s) Index Join Partition Partition Key MongoDB Collection JSON Document Index Embedding & Linking Shard Shard Key

Documents Blog Post Document > p = { author: "Ross", date: new Date(), text: "About MongoDB...", tags: ["tech", "databases"]} > db.posts.save(p)

Querying > db.posts.find() { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Ross", date : ISODate("2012-02-02T11:52:27.442Z"), text : "About MongoDB...", tags : [ "tech", "databases" ] } Notes: _id is unique, but can be anything you'd like

Introducing BSON JSON has powerful, but limited set of datatypes arrays, objects, strings, numbers and null BSON is a binary representation of JSON Adds extra dataypes with Date, Int types, Id, Optimized for performance and navigational abilities And compression MongoDB sends and stores data in BSON

BSON http://bsonspec.org

Query Operators Conditional Operators - $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type - $lt, $lte, $gt, $gte // find posts with any tags > db.posts.find({tags: {$exists: true }}) // find posts matching a regular expression > db.posts.find({author: /^ro*/i }) // count posts by author > db.posts.find({author: 'Ross'}).count()

Secondary Indexes Create index on any Field in Document // 1 means ascending, -1 means descending > db.posts.ensureindex({author: 1}) > db.posts.findone({author: 'Ross'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Ross",... }

Compound Indexes Create index on multiple fields in a Document // 1 means ascending, -1 means descending > db.posts.ensureindex({author: 1, ts: -1}) > db.posts.find({author: 'Ross'}).sort({ts: -1}) [{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Ross",...}, { _id : ObjectId("4f61d325c496820ceba84124"), author: "Ross",...}]

Examine the query plan > db.posts.find({"author": 'Ross'}).explain() { "cursor" : "BtreeCursor author_1", "nscanned" : 1, "nscannedobjects" : 1, "n" : 1, "millis" : 0, "indexbounds" : { "author" : [ [ "Ross", "Ross" ] ] } }

Atomic Operations $set, $unset, $inc, $push, $pushall, $pull, $pullall, $bit // Create a comment > new_comment = { author: "Fred", date: new Date(), text: "Best Post Ever!"} // Add to post > db.posts.update({ _id: "..." }, {"$push": {comments: new_comment}, "$inc": {comments_count: 1} });

Nested Documents { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Ross", date : "Thu Feb 02 2012 11:50:01", text : "About MongoDB...", tags : [ "tech", "databases" ], comments : [{ author : "Fred", date : "Fri Feb 03 2012 13:23:11", text : "Best Post Ever!" }], comment_count : 1 }

Nested Documents { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Ross", date : "Thu Feb 02 2012 11:50:01", text : "About MongoDB...", tags : [ "tech", "databases" ], comments : [{ author : "Fred", date : "Fri Feb 03 2012 13:23:11", text : "Best Post Ever!" }], comment_count : 1 }

Secondary Indexes // Index nested documents > db.posts.ensureindex("comments.author": 1) > db.posts.find({"comments.author": "Fred"}) // Index on tags (multi-key index) > db.posts.ensureindex( tags: 1) > db.posts.find( { tags: "tech" } )

Geo Geo-spatial queries Require a geo index Find points near a given point Find points within a polygon/sphere // geospatial index > db.posts.ensureindex({"author.location": "2d"}) > db.posts.find({ "author.location" : { $near : [22, 42] }})

Map Reduce The caller provides map and reduce functions written in JavaScript // Emit each tag > map = "this['tags'].foreach( function(item) {emit(item, 1);} );" // Calculate totals > reduce = "function(key, values) { var total = 0; var valuessize = values.length; for (var i=0; i < valuessize; i++) { total += parseint(values[i], 10); } return total; };

Map Reduce // run the map reduce > db.posts.mapreduce(map, reduce, {"out": { inline : 1}}); { "results" : [ {"_id" : "databases", "value" : 1}, {"_id" : "tech", "value" : 1 } ], "timemillis" : 1, "counts" : { "input" : 1, "emit" : 2, "reduce" : 0, "output" : 2 }, "ok" : 1, }

Aggregation - coming in 2.2 // Count tags > agg = db.posts.aggregate( {$unwind: "$tags"}, {$group : {_id : "$tags", count : {$sum: 1}}} ) > agg.result [{"_id": "databases", "count": 1}, {"_id": "tech", "count": 1}]

GridFS Save files in mongodb Stream data back to the client // (Python) Create a new instance of GridFS >>> fs = gridfs.gridfs(db) // Save file to mongo >>> my_image = open('my_image.jpg', 'r') >>> file_id = fs.put(my_image) // Read file >>> fs.get(file_id).read()

Building Your First MongoDB App Want to build an app where users can check in to a location Leave notes or comments about that location Iterative Approach: Decide requirements Design documents Rinse, repeat :-)

Requirements "As a user I want to be able to find other locations nearby" Need to store locations (Offices, Restaurants etc) name, address, tags coordinates user generated content e.g. tips / notes

Requirements "As a user I want to be able to 'checkin' to a location" Checkins User should be able to 'check in' to a location Want to be able to generate statistics: Recent checkins Popular locations

Locations v1 > location_1 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402" }

Locations v1 > location_1 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402" } > db.locations.save(location_1) > db.locations.find({name: "Holiday Inn"})

Locations v2 > location_2 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402", } tags: ["hotel", "conference", "mongodb"]

Locations v2 > location_2 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402", } tags: ["hotel", "conference", "mongodb"] > db.locations.ensureindex({tags: 1}) > db.locations.find({tags: "hotel"})

Locations v3 > location_3 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402", tags: ["hotel", "conference", "mongodb"], } lat_long: [52.5184, 13.387]

Locations v3 > location_3 = { name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402", tags: ["hotel", "conference", "mongodb"], } lat_long: [52.5184, 13.387] > db.locations.ensureindex({lat_long: "2d"}) > db.locations.find({lat_long: {$near:[52.53, 13.4]}})

Finding locations // creating your indexes: > db.locations.ensureindex({name: 1}) > db.locations.ensureindex({tags: 1}) > db.locations.ensureindex({lat_long: "2d"}) // with regular expressions: > db.locations.find({name: /^holid/i}) // by tag: > db.locations.find({tag: "hotel"}) // finding places: > db.locations.find({lat_long: {$near:[52.53, 13.4]}})

Inserting locations - adding tips // initial data load: > db.locations.insert(location_3) // adding a tip with update: > db.locations.update( {name: "Holiday Inn"}, {$push: { tips: { user: "Ross", date: ISODate("05-04-2012"), tip: "Check out my replication talk later!"} }})

Tips added > db.locations.findone() { _id : ObjectId("5c4ba5c0672c685e5e8aabf3"), name: "Holiday Inn", address: "Engelhardsgasse 12", city: "Nürnberg ", post_code: "D-90402", tags: ["hotel", "conference", "mongodb"], lat_long: [52.5184, 13.387], } tips:[{ user: "Ross", date: ISODate("05-04-2012"), tip: "Check out my replication talk later!" }]

Requirements "As a user I want to be able to find other locations nearby" Need to store locations (Offices, Restaurants etc) name, address, tags coordinates user generated content e.g. tips / notes "As a user I want to be able to 'checkin' to a location" Checkins User should be able to 'check in' to a location Want to be able to generate statistics: Recent checkins Popular locations

Users and Checkins > user_1 = { _id: "ross@10gen.com", name: "Ross", twitter: "RossC0", checkins: [ {location: "Holiday Inn", ts: "25/04/2012"}, {location: "Munich Airport", ts: "24/04/2012"} ] }

Simple Stats // find all users who've checked in here: > db.users.find({"checkins.location":"holiday Inn"})

Simple Stats // find all users who've checked in here: > db.users.find({"checkins.location":"holiday Inn"}) // Can't find the last 10 checkins easily > db.users.find({"checkins.location":"holiday Inn"}).sort({"checkins.ts": -1}).limit(10) Schema hard to query for stats.

User and Checkins v2 > user_2 = { _id: "ross@10gen.com", name: "Ross", twitter: "RossC0", } > checkin_1 = { location: location_id, user: user_id, ts: ISODate("05-04-2012") } > db.checkins.find({user: user_id})

Simple Stats // find all users who've checked in here: > loc = db.locations.findone({"name":"holiday Inn"}) > checkins = db.checkins.find({location: loc._id}) > u_ids = checkins.map(function(c){return c.user}) > users = db.users.find({_id: {$in: u_ids}}) // find the last 10 checkins here: > db.checkins.find({location: loc._id}).sort({ts: -1}).limit(10) // count how many checked in today: > db.checkins.find({location: loc._id, ts: {$gt: midnight}} ).count()

Map Reduce // Find most popular locations > map_func = function() { emit(this.location, 1); } > reduce_func = function(key, values) { return Array.sum(values); } > db.checkins.mapreduce(map_func, reduce_func, {query: {ts: {$gt: now_minus_3_hrs}}, out: "result"}) > db.result.findone() {"_id": "Holiday Inn", "value" : 99}

Aggregation // Find most popular locations > agg = db.checkins.aggregate( {$match: {ts: {$gt: now_minus_3_hrs}}}, {$group: {_id: "$location", value: {$sum: 1}}} ) > agg.result [{"_id": "Holiday Inn", "value" : 17}]

Deployment Single server - need a strong backup plan P

Deployment Single server - need a strong backup plan Replica sets - High availability - Automatic failover P P S S

Deployment Single server - need a strong backup plan Replica sets - High availability - Automatic failover P P S S Sharded - Horizontally scale - Auto balancing P S S P S S

MongoDB Use Cases Archiving Content Management Ecommerce Finance Gaming Government Metadata Storage News & Media Online Advertising Online Collaboration Real-time stats/analytics Social Networks Telecommunications

Any Questions?

download at mongodb.org conferences, appearances, and meetups http://www.10gen.com/events Facebook Twitter LinkedIn http://bit.ly/mongofb @mongodb http://linkd.in/joinmongo support, training, and this talk brought to you by