Linux on Power. Open Source Databases. Kevin Lawrence IBM - NA Power Systems - Server Solutions Ecosystem Open Source Databases.

2016 IBM Corporation

Linux on Power - Open Source Databases

"By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process."*

*Gartner, The State of Open Source RDBMSs, 2015, by Donald Feinberg and Merv Adrian, published April 21, 2015.

Database Ecosystem

- Many database choices, spanning commercial to open source products and relational and non-relational models. No single winner takes all.
- Relational DB strengths: transactional integrity and a large ecosystem around SQL.
- NoSQL DBs are much lower cost and give clients a simple data model with dynamic control over storing and retrieving primarily unstructured data types.
- The four primary flavors of NoSQL DBs are all available on POWER8:
  - Key/Value Store (example: Redis)
  - Document Store (example: MongoDB)
  - Columnar Store (example: Cassandra)
  - Graph Store (example: Neo4j)

Types of Databases

- Relational database management systems (RDBMS) support the relational (table-oriented) data model. The schema of a table (relation schema) is defined by the table name and a fixed number of attributes with fixed data types. A record (entity) corresponds to a row in the table and consists of the values of each attribute. (An open source example is Postgres/EnterpriseDB.)
- Document databases (e.g. MongoDB) store data in documents, and documents contain one or more fields. Data can be queried based on any combination of fields in a document. The appeal of these systems is that they are very general purpose, have large application ecosystems, and map very nicely to many of today's object-oriented programming styles.
- Key-value store databases (e.g. Redis) are the most basic type of non-relational DB. They store a key and its associated values.
- Wide column stores (e.g. Cassandra) let the number of columns stored vary from row to row. The appeal of these systems is their very high performance and scalability.
- Graph databases (e.g. Neo4j) focus on storing relationships and can be queried to discover both simple and more complex relationships in the data.
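To make the difference between the key-value and document models concrete, here is a minimal sketch in plain Python. It is not the API of Redis, MongoDB, or any other product: the `sessions` dict, the `documents` list, and the `find` helper are hypothetical stand-ins chosen only to illustrate lookup by key versus querying on any combination of fields.

```python
# Illustrative only: plain Python structures stand in for the two models.

# Key-value model: the only access path is the key itself.
sessions = {"session:42": "ann", "session:43": "bob"}
owner = sessions["session:42"]

# Document model: each document is a set of fields, and documents
# in the same collection need not share the same fields.
documents = [
    {"name": "Ada", "city": "Austin", "age": 36, "tags": ["admin"]},
    {"name": "Lin", "city": "Boston", "age": 29},
    {"name": "Sam", "city": "Austin"},
]

def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

# Query on one field, then on a combination of fields.
in_austin = find(documents, city="Austin")
in_austin_aged_36 = find(documents, city="Austin", age=36)
```

A real document database would answer such queries through indexes rather than a linear scan; the scan here only keeps the sketch short.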

Types of Databases with Open Source Examples

Document - Example: MongoDB
Key-value - Example: Redis
Relational - Example: EnterpriseDB
Wide column store - Example: Cassandra
Graph - Example: Neo4j

Common Linux on Power OSDBs

MongoDB
  Classification: NoSQL - Document Store
  Optimized for: Document model, document stores, semi-structured or unstructured data
  Common use cases: Single view of customer records, enterprise content management, catalogs, personalization

Redis
  Classification: NoSQL - In-memory Key Value Store
  Optimized for: Data queues, strings, lists, counts, caching, statistics, text, session IDs, pictures, videos
  Common use cases: Live in-memory cache, data queues, user session data, shopping cart data, messaging

Cassandra
  Classification: NoSQL - Wide Column Store
  Optimized for: NoSQL environments that need very high performance and scalability, very high data volumes
  Common use cases: Fraud detection, Internet of Things sensor data, log data, telco call detail records

Neo4j
  Classification: NoSQL - Graph Store
  Optimized for: Data stored as edges, nodes, or attributes (graphs)
  Common use cases: Fraud detection, social network analysis, location-aware apps, master data management, machine learning

Postgres (EnterpriseDB)
  Classification: Open source object-relational database
  Optimized for: Wide variety of transactional work at lower TCO, from relational/structured queries to object store and retrieval
  Common use cases: Oracle RDBMS migrations and takeouts

MariaDB
  Classification: Open source relational database
  Optimized for: Lower-cost transactional SQL-based queries and updates
  Common use cases: Migrations from Oracle MySQL, Turbo LAMP stack

Redis

Main points: Stores simple values or data structures by key. Blazing fast.
Exploits POWER8: Redis Labs on Power utilizes IBM POWER8 servers, the IBM FlashSystem, the IBM CAPI-Flash card and the Redis Labs Enterprise Cluster (RLEC) for Flash software.
Other features: Master-slave replication, automatic failover.
Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).
For example: Storing real-time stock prices, real-time analytics, leaderboards, real-time communication, and wherever you used memcached before.
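The leaderboard use case can be sketched as follows. This is an in-memory stand-in written in plain Python, not the Redis client API: the `Leaderboard` class and its method names are hypothetical. Redis itself offers a sorted-set data type that fits this pattern, but the sketch avoids any server dependency so it can be run as is.

```python
class Leaderboard:
    """Hypothetical in-memory stand-in for a score structure stored under one key."""

    def __init__(self):
        self._scores = {}  # player name -> running score

    def add_points(self, player, points):
        """Increment a player's score and return the new total."""
        self._scores[player] = self._scores.get(player, 0) + points
        return self._scores[player]

    def top(self, n):
        """Return the n best (player, score) pairs, highest score first.

        Ties are broken alphabetically so the order is deterministic.
        """
        ranked = sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


board = Leaderboard()
board.add_points("ann", 50)
board.add_points("bob", 70)
board.add_points("cat", 10)
board.add_points("ann", 30)   # ann now has 80 points
top_two = board.top(2)
```

The data changes rapidly (every score update) while its size stays bounded by the number of players, which matches the "rapidly changing data with a foreseeable database size" profile described on the slide.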

MongoDB

Main point: Retains some friendly properties of SQL (query, index).
Exploits POWER8 features: Performance. Testing of MongoDB with CAPI Flash on POWER8 is just starting.
Other features:
- Master/slave replication (auto failover with replica sets)
- Sharding
- Integrated text search
- Geospatial indexing
- Data center aware
Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
For example: Most popular NoSQL document DB.

Cassandra

Main point: Stores huge datasets and retrieves them in "almost" SQL (CQL3).
Exploits POWER8 features: Apache
Other features:
- CQL3 is the official interface and is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs, no aggregate functions).
- Querying by key or key range (secondary indices are also available).
- Highly scalable and highly available with no single point of failure.
- NoSQL column family implementation.
- Very high write throughput and good read throughput. Writes can be much faster than reads (when reads are disk-bound).
- SQL-like query language (since 0.8) and support for search through secondary indexes.
- Tunable consistency and support for replication.
- Flexible schema.
- Map/reduce possible with Apache Hadoop.
- Very good and reliable cross-datacenter replication.
Best used: When you need to store data so huge that it doesn't fit on a single server, but still want a friendly, familiar interface to it.
For example: Web analytics, to count hits by hour, by browser, by IP, etc. Transaction logging. Data collection from huge sensor arrays.
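The "count hits by hour" example, together with querying by key or key range, can be sketched in plain Python. This is a conceptual model only and does not use CQL or any Cassandra driver: the `hits` structure and the `record_hit` and `hits_in_range` helpers are hypothetical names. Each page acts as a row key, and the hours within that row act as an ordered set of columns whose number can differ from row to row.

```python
from collections import defaultdict

# row key (page) -> {column (hour, ISO-style string) -> hit count}
hits = defaultdict(lambda: defaultdict(int))

def record_hit(page, hour):
    """Increment the counter stored at (page, hour)."""
    hits[page][hour] += 1

def hits_in_range(page, start_hour, end_hour):
    """Key-range query inside one row: hours between the bounds, inclusive.

    No JOIN is involved; everything needed lives under a single row key.
    """
    row = hits.get(page, {})
    return {h: row[h] for h in sorted(row) if start_hour <= h <= end_hour}

events = [
    ("/home", "2016-05-01T09"),
    ("/home", "2016-05-01T09"),
    ("/home", "2016-05-01T10"),
    ("/docs", "2016-05-01T09"),
    ("/home", "2016-05-01T12"),
]
for page, hour in events:
    record_hit(page, hour)

morning_home = hits_in_range("/home", "2016-05-01T09", "2016-05-01T10")
```

Because the hour strings sort lexicographically in time order, a range over them is a range over time, which is why time-stamped keys are a natural fit for this kind of store.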

Neo4j

Main point: NoSQL graph database optimized for connected data.
Exploits POWER8 features:
- Neo4j on POWER8 offers 56 TB of extended memory, drastically increasing the size at which real-time graph queries are possible.
- Real-time graph processing with Neo4j on POWER8 supports both standard operational requirements and analytic insights that normally require offline processing.
- IBM POWER8 hardware allows Neo4j to scale both up and out for graphs of greater size than ever before.
Other features:
- HTTP/REST (or embedding in Java)
- Full ACID (Atomicity, Consistency, Isolation, Durability) conformity (including durable data)
- Integrated pattern-matching-based query language ("Cypher")
- Indexing of keys, nodes and relationships
- Advanced path-finding with multiple algorithms
- Optimized for reads
- Has transactions (in the Java API)
- Clustering, replication, caching, online backup, advanced monitoring and High Availability are commercially licensed
Best used: For graph-style, rich or complex, interconnected data.
For example: Searching routes in social relations, public transport links, road maps, or network topologies.
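Route searching over connected data can be illustrated with a breadth-first search on a small adjacency list. This is a plain-Python sketch of the idea only; it does not use Neo4j or Cypher, and the station names and the `shortest_route` function are made up for the example.

```python
from collections import deque

# Hypothetical public transport links: node -> directly connected nodes.
links = {
    "Central": ["Museum", "Harbor"],
    "Museum": ["Central", "Park"],
    "Harbor": ["Central", "Airport"],
    "Park": ["Museum", "Airport"],
    "Airport": ["Harbor", "Park"],
}

def shortest_route(graph, start, goal):
    """Breadth-first search: return a route with the fewest hops, or None."""
    queue = deque([[start]])   # each queue entry is the path walked so far
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # goal is not reachable from start

route = shortest_route(links, "Central", "Airport")
```

A graph database stores the relationships directly, so traversals like this follow stored links instead of being reassembled through repeated joins.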

EnterpriseDB (Postgres)

Main point: Enterprise-class, open source, relational database.
- Easily integrates with or supplants Oracle DB: many applications written for Oracle run on Postgres Advanced Server without modification, and Oracle-skilled developers can use it with minimal re-training.
- Performance: EDB running on POWER8 brings a cost-effective, enterprise-class solution to CIOs and IT managers running Red Hat Enterprise Linux 7.x on little-endian POWER8. EDB Postgres Advanced Server on POWER8 offers 2x higher performance over Intel-based systems for OLTP applications, high-performance multi-threading, more cache and greater data bandwidth.
- Scalability: Reliably handles multi-terabyte data sets supporting millions of users with guaranteed transactional integrity and continuous availability.
- TCO: Reduces operating costs by requiring fewer systems at a lower acquisition cost.
- DBMS convergence: Supports traditional structured, semi-structured, and unstructured data types, reducing the need to deploy costly, one-off NoSQL data silos and supporting adoption of Postgres and migration of workloads from proprietary databases.
- Services: Brings together two industry leaders committed to open source offerings. EDB Postgres Management, Integration, and Migration Suites support replication, HA, database monitoring/management and data integration for mission-critical enterprise applications.
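What "transactional integrity" buys an application can be shown with a small, runnable sketch. It uses Python's built-in `sqlite3` module purely as a stand-in relational engine because it needs no server; it is not Postgres or EDB Postgres Advanced Server, and the `accounts` table and `transfer` function are hypothetical. The point is the behavior: either both updates in a transfer are applied, or neither is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Fixed schema: named columns with fixed types, plus a rule the data must obey.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "ann", 100), (2, "bob", 20)],
)
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False

ok = transfer(conn, 1, 2, 30)         # ann -> bob, allowed
rejected = transfer(conn, 2, 1, 500)  # bob -> ann, would overdraw bob
balances = dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY id"))
```

After the rejected transfer, the credit that had already been applied to the destination account is rolled back, so the balances reflect only the successful transfer.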

Modernize your Database with POWER8 and EnterpriseDB

- 30% fewer servers
- 84% reduction in SW licensing cost with fewer cores and EnterpriseDB
- 29% reduction in HW costs and maintenance
- 68% reduction in core count
- 79% 3-year TCO reduction

[Chart: Solution TCO for 3 years, broken down into Environmentals, HW and SW, comparing S822LC/20c/2.926 with EnterpriseDB against HP DL380p/Brwell (2s) with OracleEE]

Assumptions: 7 Power S822LC servers (65% utilization) have equivalent performance to 10 x86 servers (40% utilization).
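The server and core reductions quoted on this slide can be checked with simple arithmetic. The sketch below takes the 7 and 10 server counts and the 20 cores per Power server from the slide itself. The figure of 44 cores per x86 server is an assumption: it is not stated on this slide and is borrowed from the x86 configuration named on the MongoDB comparison slide. The `percent_reduction` helper is hypothetical.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100 * (before - after) / before

# From the slide's stated assumptions.
x86_servers = 10
power_servers = 7
power_cores_per_server = 20   # "S822LC/20c" in the chart label

# ASSUMPTION: not stated on this slide; taken from the 44-core x86
# configuration named on the MongoDB comparison slide.
x86_cores_per_server = 44

server_reduction = percent_reduction(x86_servers, power_servers)
core_reduction = percent_reduction(
    x86_servers * x86_cores_per_server,
    power_servers * power_cores_per_server,
)
```

Under that assumption the core count drops from 440 to 140, which rounds to the 68% shown on the slide, and the server count drops by 30%. The licensing and TCO percentages depend on pricing data that the slide does not give, so they are not recomputed here.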

Modernize your Database with POWER8/PowerKVM and MongoDB vs x86/VMware and Oracle EE

- 90% reduction in SW licensing cost with fewer cores and MongoDB
- 23% reduction in HW costs and maintenance
- 45% reduction in core count
- 85% 3-year TCO reduction

[Chart: Solution TCO for 3 years, broken down into Environmentals, HW and SW, comparing S822LC/20c/2.926 with MongoDB against HP DL380/BWL/44c/2.2 with OracleEE]

Assumptions: 7 Power S822LC/20c servers with PowerKVM (40% utilization) have equivalent performance to 10 HP DL380/E5-2699 v4/44c servers with VMware (40% utilization). Performance is based on SPECint_rate.

Hortonworks Announcement

Announced at IBM Edge: Hortonworks HDP is coming to Power!

What is Hortonworks HDP? It is an enterprise-ready, open source Apache Hadoop distribution based on a centralized architecture (YARN). HDP addresses the complete needs of data-at-rest, powers real-time customer applications, and delivers robust analytics that accelerate decision making and innovation.



Welcome to the Waitless World.