Comparing NoSQL Solutions In a Real-World Scenario: Aerospike, Cassandra Open Source, Cassandra DataStax, Couchbase and Redis Labs



Similar documents
Deep Dive: Maximizing EC2 & EBS Performance

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB, Cassandra, and MongoDB

Marvell DragonFly Virtual Storage Accelerator Performance Benchmarks

INTRODUCTION TO CASSANDRA

Resource Sizing: Spotfire for AWS

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

2) Xen Hypervisor 3) UEC

The Total Cost of (Non) Ownership of a NoSQL Database Cloud Service

Building a Scalable News Feed Web Service in Clojure

Performance Tuning and Optimizing SQL Databases 2016

Benchmarking Top NoSQL Databases Apache Cassandra, Couchbase, HBase, and MongoDB Originally Published: April 13, 2015 Revised: May 27, 2015

Benchmarking the Availability and Fault Tolerance of Cassandra

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra

Enterprise Edition Scalability. ecommerce Framework Built to Scale Reading Time: 10 minutes

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server)

Tableau Server 7.0 scalability

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES

LARGE-SCALE DATA STORAGE APPLICATIONS

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Stratusphere Solutions

Application Performance Testing Basics

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Case Study - I. Industry: Social Networking Website Technology : J2EE AJAX, Spring, MySQL, Weblogic, Windows Server 2008.

THE DEFINITIVE GUIDE FOR AWS CLOUD EC2 FAMILIES

Getting Started with SandStorm NoSQL Benchmark

Performance And Scalability In Oracle9i And SQL Server 2000

Benchmarking Cassandra on Violin

NoSQL Failover Characteristics: Aerospike, Cassandra, Couchbase, MongoDB

Scalable Architecture on Amazon AWS Cloud

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Putchong Uthayopas, Kasetsart University

Drupal Performance Tuning

Scalability Factors of JMeter In Performance Testing Projects

AppDynamics Lite Performance Benchmark. For KonaKart E-commerce Server (Tomcat/JSP/Struts)

Time series IoT data ingestion into Cassandra using Kaa

Choosing Between Commodity and Enterprise Cloud

MakeMyTrip CUSTOMER SUCCESS STORY

C3: Cutting Tail Latency in Cloud Data Stores via Adaptive Replica Selection

DataStax Enterprise, powered by Apache Cassandra (TM)

Data Center Solutions

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE

Leveraging Public Clouds to Ensure Data Availability

SCALABILITY IN THE CLOUD

Amazon EC2 Product Details Page 1 of 5

Dell Reference Configuration for DataStax Enterprise powered by Apache Cassandra

Introduction to Apache Cassandra

Amazon Elastic Beanstalk

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

BIG DATA: STORAGE, ANALYSIS AND IMPACT GEDIMINAS ŽYLIUS

The State of Cloud Storage

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

Benchmarking Hadoop & HBase on Violin

UBUNTU DISK IO BENCHMARK TEST RESULTS

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Load Testing on Web Application using Automated Testing Tool: Load Complete

How AWS Pricing Works

Evaluating Network Attached Storage Units

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

PostgreSQL Performance Characteristics on Joyent and Amazon EC2

IJRSET 2015 SPL Volume 2, Issue 11 Pages: 29-33

CLOUD COMPUTING An Overview

Introduction 1 Performance on Hosted Server 1. Benchmarks 2. System Requirements 7 Load Balancing 7

Performance White Paper

Network Infrastructure Services CS848 Project

Cloud Big Data Architectures

EXECUTIVE SUMMARY CONTENTS. 1. Summary 2. Objectives 3. Methodology and Approach 4. Results 5. Next Steps 6. Glossary 7. Appendix. 1.

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

An Introduction to Cloud Computing Concepts

Topology Aware Analytics for Elastic Cloud Services

Accelerating Web-Based SQL Server Applications with SafePeak Plug and Play Dynamic Database Caching

Big Data Analytics - Accelerated. stream-horizon.com

Performance Analysis and Capacity Planning Whitepaper

Migration Scenario: Migrating Batch Processes to the AWS Cloud

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Evaluating HDFS I/O Performance on Virtualized Systems

A Middleware Strategy to Survive Compute Peak Loads in Cloud

Cloud Spectator Comparative Performance Report July 2014

How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda

Milestone Solution Partner IT Infrastructure MTP Certification Report Scality RING Software-Defined Storage

Performance Analysis: Benchmarking Public Clouds

Optimizing Cloud Performance Using Veloxum Testing Report on experiments run to show Veloxum s optimization software effects on Terremark s vcloud

QUALITY OF SERVICE FOR CLOUD-BASED MOBILE APPS: Aruba Networks AP-135 and Cisco AP3602i


By Cloud Spectator July 2013

Public Cloud Offerings and Private Cloud Options. Week 2 Lecture 4. M. Ali Babar

Performance Management for Cloudbased STC 2012

Contents Introduction... 5 Deployment Considerations... 9 Deployment Architectures... 11

Accelerating Big Data: Using SanDisk SSDs for MongoDB Workloads

DataStax Enterprise Reference Architecture

Xangati Storage Solution Brief. Optimizing Virtual Infrastructure Storage Systems with Xangati

Directions for VMware Ready Testing for Application Software

Transcription:

Comparing NoSQL Solutions In a Real-World Scenario: Aerospike, Cassandra Open Source, Cassandra DataStax, Couchbase and Redis Labs Composed by Avalon Consulting, LLC June 2015 1

Introduction Specializing in cloud architecture, Emind Cloud Experts is an AWS Advanced Consulting Partner and a Google Cloud Platform Premier Partner that assists enterprises and startups establish secure and scalable IT operations. The following benchmark employed a real-world use case from an Emind customer. The Emind team was tasked with the following high-level requirements: Support a real-time voting process during massive live events (e.g. televised election surveys or America Votes type game shows) Keep voters data anonymous but unique Ensure scalability to support surges in requests Why Cloud-Based? During a voting event, concurrent voting app users could spike to the 10 s of millions. On the other hand, when there are no voting events, usage is practically non-existent. Therefore, it is important to set up a cloud infrastructure that can be quickly built up to handle usage spikes, and quickly torn down once a voting event ends. Efficiently scaling the app's cloud infrastructure up and down to match an event's time schedule can dramatically reduce cost. Thus a cloud-based infrastructure is highly suitable for this task. Composed by Avalon Consulting, LLC June 2015 2

Why NoSQL? Since scalability and cost effectiveness were considered of utmost importance in this project, Emind decided to use a NoSQL-type database to meet application requirements rather than a relational database. All the NoSQL offerings considered for this application provide efficient horizontal scaling that enables operators to quickly add or remove resources in response to the sudden spikes inherent in a voting application. Application Flow High performance is vital to optimize the apps user experience. The application flow starts on a client device, which makes a call to a stateless application server and sends the user s vote. The application server sends the vote to the database for recording. During this flow, it is important that throughput remain high enough to handle usage spikes, while ensuring low latency. The following figure represents the benchmark application flow: Composed by Avalon Consulting, LLC June 2015 3

Each test load was configured to run for 60 seconds, creating 35 threads and 5000 connections. The following is the wrk command used to execute each test: wrk-c5000-t35-d60"http://172.31.62.126/vote?u=1&v=y The NoSQL Candidates The following NoSQL databases were tested: Aerospike Cassandra Open Source Cassandra DataStax Enterprise In Memory Couchbase Redis Labs Enterprise Cluster A Go mockup application, "mockapp" was created to simulate the voting app ( https://github.com/emind-systems/real_time_vote_benchmark ). The mockapp was responsible for providing stateless application server endpoints, and for sending voting requests to each NoSQL database being benchmarked. The wrk ( https://github.com/wg/wrk ) HTTP benchmarking tool was used to generate load on each application setup. It captured throughput and latency metrics to determine which database was best suited for the scenario s requirements. Composed by Avalon Consulting, LLC June 2015 4

Finally, Avalon Consulting, LLC was hired to work with each vendor (refer to Appendix A for the introductory email sent to vendors) in order to determine the optimal configuration for each NoSQL database solution and to validate the results by re-running all tests. All vendors responded, which ensured that each system was configured correctly. The following precautions were taken to fully optimize each product: Aerospike As recommended by the vendor, Avalon referred to its documentation for optimal set-up and configuration of the system. Cassandra Open Source and DataStax Enterprise In-Memory DataStax recommended that Cassandra DataStax Enterprise In-Memory be implemented. However, the company raised a concern that the benchmark scenario is not optimal for this database. Consequently, Avalon decided to test both Open Source Cassandra and Cassandra DataStax Enterprise In-Memory. Couchbase In order to avoid disk IO and obtain the best performance, Avalon verified that the entire Couchbase dataset fit into RAM. Redis Labs As recommended by the vendor, Avalon referred to its documentation for optimal set-up and configuration of the system. Composed by Avalon Consulting, LLC June 2015 5

Benchmark Methodology Setup Each test run utilized 3 servers: WRK Server used to run the workload and send requests to the App Server: c4.large EC2 instance App Server used to run the Go HTTP server plus the Mockapp server software: c4.8xlarge EC2 instance During initial testing with c4.large, it was clear that the client server was going to be a bottleneck. Instead of scaling out with more servers, Avalon determined that the larger c4.8xlarge would be able to handle the load without additional servers. Database Server used to run each NoSQL database: c4.large EC2 instance Composed by Avalon Consulting, LLC June 2015 6

Database Configuration For all systems, each database server had enough RAM to store the entire dataset in memory. The exact database configuration for each of the vendors was: Aerospike Version: 3.5.9 OS: Amazon Linux 2015.03 AWS disks initialized (or pre-warmed) using the dd tool Cassandra Open Source Version: 2.1.3 OS: Ubuntu 14.04 Cassandra DataStax Enterprise In-Memory Datastax Enterprise 4.6.5 OS: Ubuntu 14.04 Configured for MemoryOnlyStrategy Couchbase Version: Couchbase Enterprise 3.0.3 OS: Ubuntu 14.04 Replicas disabled Redis Labs Enterprise Cluster Version: 0.99.7-3 OS: Ubuntu 14.04 Composed by Avalon Consulting, LLC June 2015 7

Benchmark Process Here is how the overall benchmark was conducted: Emind set up the initial infrastructure for the benchmark. This included setting up the servers for running wrk, the Go HTTP code and each NoSQL database. Emind wrote the initial code for running the HTTP server, capturing user input, and writing that user input to each database. Avalon Consulting, LLC reviewed and optimized the initial code, making changes where necessary. Avalon Consulting, LLC reviewed each server setup. This process included communicating with each vendor to determine optimal settings. Avalon Consulting, LLC reviewed the code for running each HTTP server to ensure it was as similar as possible for each system. Avalon Consulting, LLC executed every test in each newly configured environment. Composed by Avalon Consulting, LLC June 2015 8

Final Results Summary of Application Throughput and Latency Product App Req/Sec App Latency(ms) Aerospike 16481.55 161.94 Cassandra Open Source 5234.52 381.31 Cassandra DSE In-Memory 6659.26 372.31 Couchbase 4417.15 394.42 Redis Labs Enterprise Cluster 37038.82 71.22 * Raw data results are provided in Appendix B Composed by Avalon Consulting, LLC June 2015 9

Composed by Avalon Consulting, LLC June 2015 10

Summary This benchmark determined which NoSQL database would perform best with the dynamic load of a voting application scenario. For this particular workload, in which a large amount of write requests were made, Redis Labs Enterprise Cluster was the clear winner, Aerospike came in second with about half the throughput and double the latency, and the rest of the NoSQL databases were way behind. The test results are also applicable to other use cases where massive and continuous write operations are required, such as ingestion data-store for a real-time analytics application accessed by API, or for an Internet of Things (IoT) application that receives telemetry data from a large amount of sensors. Composed by Avalon Consulting, LLC June 2015 11

Appendix A Intro Email to Vendors Composed by Avalon Consulting, LLC June 2015 12

Appendix B Raw Data Results Aerospike Running1mtest@ http://172.31.62.126/vote?u=1&v=y 35threadsand5000connections ThreadStats Avg Stdev Max +/- Stdev Latency 161.94ms 123.90ms 1.99s 67.74% Req/Sec 475.56 386.61 14.18k 77.07% 990530requestsin1.00m, 99.19MBread Socketerrors: connect0, read0, write0, timeout56 Requests/sec: 16481.55 Transfer/sec: 1.65MB Cassandra Open Source Running1mtest@ http://172.31.62.126/vote?u=1&v=y 35threadsand5000connections ThreadStats Avg Stdev Max +/- Stdev Latency 381.31ms 466.79ms 2.00s 83.33% Req/Sec 174.43 160.60 1.75k 73.46% 314594requestsin1.00m, 31.50MBread Socketerrors: connect0, read0, write0, timeout16849 Requests/sec: 5234.52 Transfer/sec: 536.74KB Composed by Avalon Consulting, LLC June 2015 13

Cassandra DataStax Enterprise In-Memory Running1mtest@ http://172.31.62.126/vote?u=1&v=y 35threadsand5000connections ThreadStats Avg Stdev Max +/- Stdev Latency 372.31ms 390.91ms 2.00s 83.05% Req/Sec 204.09 153.23 1.15k 66.63% 400219requestsin1.00m, 40.08MBread Socketerrors: connect0, read0, write0, timeout6826 Requests/sec: 6659.26 Transfer/sec: 682.83KB Couchbase Running1mtest@ http://172.31.62.126/vote?u=1&v=y 35threadsand5000connections ThreadStats Avg Stdev Max +/- Stdev Latency 394.42ms 469.91ms 2.00s 82.95% Req/Sec 187.50 153.00 1.76k 64.79% 265463requestsin1.00m, 26.58MBread Socketerrors: connect0, read0, write0, timeout11358 Requests/sec: 4417.15 Transfer/sec: 452.93KB Redis Labs Enterprise Cluster Running1mtest@ http://172.31.62.126/vote?u=1&v=y 35threadsand5000connections ThreadStats Avg Stdev Max +/- Stdev Latency 71.22ms 43.85ms661.01ms 73.02% Req/Sec 1.06k 732.44 9.92k 75.22% 2225247requestsin1.00m, 222.96MBread Requests/sec: 37038.82 Transfer/sec: 3.71MB Composed by Avalon Consulting, LLC June 2015 14