Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world



Similar documents
Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Build more and grow more with Cloudant DBaaS

Cloud Scale Distributed Data Storage. Jürmo Mehine

INTRODUCTION TO CASSANDRA

Structured Data Storage

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island

Open Source Technologies on Microsoft Azure

NoSQL Database Options

Preparing Your Data For Cloud

The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect

Understanding NoSQL Technologies on Windows Azure

Lecture Data Warehouse Systems

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

Data Modeling for Big Data

NoSQL Databases. Nikos Parlavantzas

An Approach to Implement Map Reduce with NoSQL Databases

GigaSpaces Real-Time Analytics for Big Data

Domain driven design, NoSQL and multi-model databases

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

NoSQL Data Base Basics

IBM Cloudant: The Do-More NoSQL Data Layer IBM Redbooks Solution Guide

Can the Elephants Handle the NoSQL Onslaught?

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

IBM Cognos Enterprise: Powerful and scalable business intelligence and performance management

Evaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group

Big Data Analytics. Rasoul Karimi

Introduction to Polyglot Persistence. Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace

How To Scale Out Of A Nosql Database

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

MongoDB in the NoSQL and SQL world. Horst Rechner Berlin,

IBM Analytics The fluid data layer: The future of data management

Advanced Data Management Technologies


Comparing SQL and NOSQL databases

Slave. Master. Research Scholar, Bharathiar University

NoSQL Databases. Polyglot Persistence

BIG DATA TOOLS. Top 10 open source technologies for Big Data

How Cloud-Based Technology is Transforming the Retail Shopping Experience

IBM System x and VMware solutions

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

NoSQL Systems for Big Data Management

NoSQL: Going Beyond Structured Data and RDBMS

Fiserv. Saving USD8 million in five years and helping banks improve business outcomes using IBM technology. Overview. IBM Software Smarter Computing

Reduce your data storage footprint and tame the information explosion

Understanding NoSQL on Microsoft Azure

Big data management with IBM General Parallel File System

Performance and Scalability Overview

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

NoSQL and Hadoop Technologies On Oracle Cloud

Introduction to Apache Cassandra

IBM Software Database strategies for the world of big data

IBM Software Information Management. Scaling strategies for mission-critical discovery and navigation applications

So What s the Big Deal?

Performance Evaluation of NoSQL Systems Using YCSB in a resource Austere Environment

How graph databases started the multi-model revolution

IBM System x reference architecture solutions for big data

How To Handle Big Data With A Data Scientist

nosql and Non Relational Databases

NOSQL INTRODUCTION WITH MONGODB AND RUBY GEOFF

Making Sense of NoSQL Dan McCreary Ann Kelly

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Taking control of the virtual image lifecycle process

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

IBM SmartCloud Monitoring

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015

MEAP Edition Manning Early Access Program Neo4j in Action MEAP version 3

The IBM Cognos family

Databases 2 (VU) ( )

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

IBM Analytical Decision Management

Making Sense of NoSQL Dan McCreary Wednesday, Nov. 13 th 2014

Database Scalability {Patterns} / Robert Treat

NOSQL, BIG DATA AND GRAPHS. Technology Choices for Today s Mission- Critical Applications

The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg. Adam Marcus MIT CSAIL

Big Data Management. Big Data Management. (BDM) Autumn Povl Koch September 30,

NoSQL Database Systems and their Security Challenges

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Introduction to NOSQL

An Open Source NoSQL solution for Internet Access Logs Analysis

The business value of improved backup and recovery

NoSQL and Graph Database

NOSQL DATABASES AND CASSANDRA

Table of Contents. Développement logiciel pour le Cloud (TLC) Table of Contents. 5. NoSQL data models. Guillaume Pierre

Database Management System Choices. Introduction To Database Systems CSE 373 Spring 2013

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Study concluded that success rate for penetration from outside threats higher in corporate data centers

How To Use Big Data For Telco (For A Telco)

Data Services Advisory

IBM SmartCloud for Service Providers

Hacettepe University Department Of Computer Engineering BBM 471 Database Management Systems Experiment

IBM Social Media Analytics

Sentimental Analysis using Hadoop Phase 2: Week 2

Beyond listening Driving better decisions with business intelligence from social sources

Enabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings

Transcription:

Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world

2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3 Types of NoSQL databases 4 Why consider NoSQL? 5 Summary 5 Getting started with IBM Cloudant 5 For more information New types of apps are generating new types of data The continual increase in web, mobile and IoT applications, alongside emerging trends shifting online consumer behavior and new classes of data, is causing developers to reevaluate how their data is stored and managed. Today s needs require a database that is capable of providing a scalable, flexible solution to efficiently and safely manage the massive flow of data to and from a global user base. Developers and IT alike are finding it difficult, and sometimes even impossible, to quickly incorporate all of this data into the relational model while dynamically scaling to maintain the performance levels users demand. This is causing many to look at NoSQL databases for the flexibility they offer, and is a big reason why the global NoSQL market is forecasted to nearly double and reach USD3.4 billion in 2020 1. A brief history of NoSQL NoSQL, also known as not only SQL or non-relational databases, were specifically introduced to handle the rise in data types, data access, and data availability needs brought on by the dot-com boom. Going back 20 years or so, when application architects and developers needed a data store for their applications, they were choosing among various relational databases. In fact, relational databases have been the de facto choice since the 1970s, as they have been the sole option available for both developers (taught in computer science programs everywhere) and infrastructure teams (well-understood tooling and predictable operational characteristics). Some of the most popular are Oracle, MySQL, SQL Server and IBM DB2. However, when Internet applications and companies started exploding during the late 90s to early 2000s, applications went from serving thousands of internal employees within companies to having millions of users on the public Internet. For these applications, performance and availability were paramount. The new problem of high availability at large scale drove companies like Google, Facebook and Amazon to create new technologies. Thankfully, they documented their efforts, released white papers, and open sourced their technology for the Internet community to continue building upon. By the late 2000s, several new non-relational database technologies had emerged, and NoSQL was the name that stuck to describe them all. NoSQL s roots in open source Many NoSQL databases have roots in the open source community. This heritage has been fundamental for their ever-increasing popularity and usage. You will often see companies that provide a commercial version of a database with services and support for the technology, while at the same time participating in the communities of their open source

Analytics counterparts. Examples of these relationships include Datastax for Apache Cassandra, IBM Cloudant for Apache CouchDB, and even MongoDB with an open source version of its MongoDB software. Types of NoSQL databases NoSQL is simply the term used to describe a family of databases that are all non-relational. While the technologies, data types, and use cases vary wildly among them, it is generally agreed that there are four types of NoSQL databases: Key-value stores These databases pair keys to values, like a hash table in computer programming. A good analogy for a key-value store is a file system. The path acts as the key and the contents act as the file. There are often no fields to update; rather, the entire value other than the key must be updated if changes are to be made. This simplicity scales well; however, it can limit the complexity of queries and other advanced features available in key-value stores. Examples of pure key-value stores include MemcacheD, REDIS and Riak. Graph stores Graph data stores excel at dealing with interconnected data. Graph databases consist of connections (called edges in graph DB terms) between nodes. Both nodes and their edges can store additional properties, such as key-value pairs. The strength of a graph database is in traversing the connections between nodes. Graph databases, however, generally require all data to fit on one machine, limiting their scalability. Examples of graph stores include Neo4j and Sesame. Column stores Relational databases store all of the data in a particular table s row together on disk, which makes retrieval of a particular row fast. Column-family databases generally serialize all the values of a particular column together on disk, which makes retrieval of aggregated data on a specific attribute fast. This is valuable in data warehousing and analytics scenarios where you might run range queries over a specific value. Examples of column stores include Hbase and Cassandra. Document stores Document databases store records as documents, where a document can generally be thought of as a grouping of key-value pairs. Keys are always strings and values can be stored as strings, numerics, booleans, arrays and other nested, key-value pairs. Values can be nested to arbitrary depths. Popular syntaxes for document DBs include XML and JavaScript Object Notation (JSON). Examples of document stores include MongoDB, CouchDB, Cloudant and MarkLogic. Why consider NoSQL? The various NoSQL databases available today differ quite a bit, but there are common threads uniting them: flexibility, scalability, availability, lower costs and special capabilities. Flexibility Schema flexibility and intuitive data structures are key features

4 Why NoSQL? that attract developers working in agile development cycles, and most NoSQL databases accommodate these qualities. Schema updates and their associated downtime are generally not required, unlike in a relational database. NoSQL s flexibility is popular for supporting agile development practices by eliminating the complexity of database schema changes to support changing codebases. Scalability Scale refers to both the size of the data as well as the concurrent users acting on that data. NoSQL databases are typically more specialized to various use cases involving scale and can be much simpler for developing related application functions than relational databases. Many NoSQL databases are built to scale horizontally and spread their data across a cluster of servers more easily than their relational counterparts. Relational database performance suffers when queries execute JOINs across shards, whereas a NoSQL database can avoid JOINs altogether and maintain high performance. Availability Downtime results in lost revenue, lost customers, and frustrated users. Thankfully, some NoSQL databases offer new or different replication architectures that can make different types of NoSQL databases more highly available for writes in addition to reads. This means that if one or more database servers, or nodes, goes down, the other nodes in the system are able to continue operations without data loss. Lower operational cost The open source roots of NoSQL make its entry point an obvious incentive, and it is common to hear of NoSQL adopters cutting significant costs versus their existing databases while still receiving the same or better performance and functionality. Historically, large relational databases have run on expensive machines and mainframes. The distributed nature The Appeal of NoSQL JSON document stores JSON schema can be evolved rapidly without the need for intervention It only contains six kinds of values, so it is easy to implement and use JSON is self-describing and easy to understand It helps enable performance and scalability gains Example: NoSQL JSON document store versus a RDBMS Records in a NoSQL document store may be fully denormalized into individual JSON documents (store a record and all its related data together) versus the RDBMS approach of decomposing relations into separate tables (separate records into logical tables to better enforce consistency). The benefit here is that by using NoSQL, you can get and put objects at once without JOINing data from separate tables. The JOINs of a relational database are absolute scalability killers for clustered databases. of NoSQL databases means that they can be deployed and operated on clusters of servers in cloud architectures. Specialized capabilities Not all NoSQL databases are created equal. To offer incentive and added integration to their users, many NoSQL providers include various specialized capabilities. Examples include specific indexing and querying capabilities for geospatial data, integrated full-text indexing for search, automatic data replication and synchronization features and applicationfriendly RESTful web APIs. Summary The database you pick for your next application matters now more than ever. Thankfully, you don t have to pick just one. Today s applications are expected to run non-stop and must efficiently manage continuously growing amounts of multistructured data in order to do so. This has caused NoSQL to grow from a buzzword to a serious consideration for every

Analytics database, from small shops to the enterprise. While NoSQL databases share some common qualities, such as being non-relational and generally easy to scale, there are many unique differentiators to consider when deciding upon the correct NoSQL solution for your unique data needs. The right NoSQL database can act as a viable alternative to relational databases or can be utilized in a complementary fashion along with existing systems. The requirement for efficient data delivery is our future, and you should consider NoSQL technology when planning any application development project that involves scale, different varieties of data or large amounts of potential users. Getting started with IBM Cloudant IBM Cloudant is available as a fully managed NoSQL database as a service (DBaaS) for fast, turnkey provisioning and worryfree data management. It is also available as Cloudant Local, which puts the power of the Cloudant platform in the privacy of your data centers. You can even connect Cloudant Local and Cloudant Managed DBaaS databases together to form hybrid cloud databases for the greatest balance of cloud cost, reach, performance, and compliance control. Simply sign up for a no-charge account and get started at cloudant.com. For more information For more information, please contact your IBM representative or IBM Business Partner, or visit cloudant.com or ibm.com/cloudant.

Copyright IBM Corporation 2015 IBM Corporation Software Group (or appropriate division, or no division) Route 100 Somers, NY 10589 Produced in the United States of America February 2015 IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. The Apache Software Foundation, Apache Cassandra, CouchDB, and Apache CouchDB are trademarks or registered trademarks of The Apache Software Foundation. IOS is a trademark or registered trademark of Cisco Systems, Inc., in the U.S. and other countries. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at Copyright and trademark information at www.ibm.com/legal/copytrade.shtml. This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates. THE INFORMATION IN THIS DOCUMENT IS PROVIDED AS IS WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANT- ABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NONINFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided. 1 www.marketresearchmedia.com/?p=568 Please Recycle KUW12354-USEN-00