PostgreSQL Performance Characteristics on Joyent and Amazon EC2




OVERVIEW

In today's big data world, high-performance databases are not only required but are a major part of any critical business function. With the advent of mobile devices, users are consuming data like never before, requiring enterprises to serve large user populations and deliver highly responsive applications with no downtime. CTOs must make a prudent decision, so they investigate a database's performance characteristics to gain assurance that their database applications are both durable and scalable. They need hard facts to back up such a critical business decision. Furthermore, this decision has to address both present and future needs.

The intent of this document is to provide database performance benchmarks for PostgreSQL across popular cloud platforms. This effort reflects some of the current architectures used by companies on those platforms. Results from such benchmarks can be as varied as the applications that use the database, but the results in this report can serve as a reference for running a high-performance database on both Joyent and Amazon (AWS).

TEST DESCRIPTION

In our tests we analyzed the performance of PostgreSQL using several configurations of both Amazon EC2 and Joyent's public cloud. Our goal was to measure the performance of PostgreSQL with an architecture and configuration that is widely used in production environments on comparable systems. Thus no tuning was done to either PostgreSQL or the servers other than what is noted below. We acknowledge that in expert hands both PostgreSQL and the servers it runs on can be tuned to an astonishing degree. But such tunings may not be available or even advantageous on all systems, and they make valid side-by-side comparisons increasingly difficult. For all the tests, PostgreSQL was set up in a master/slave configuration using asynchronous binary replication.

On Amazon EC2 we tested 3 server configurations. The first EC2 configuration was a pair of M1 Large 7.5gb instances, optimized for EBS, each with a 100gb EBS volume without provisioned IOPS. The second EC2 configuration was a pair of M1 Large 7.5gb instances, also optimized for EBS, each with a 100gb EBS volume provisioned with 3,000 IOPS. The third EC2 configuration was a pair of M3 Large 7.5gb instances using the 32gb SSD ephemeral (local) storage rather than an attached EBS volume.

On Joyent's public cloud we tested a pair of Standard 7.5gb instances running SmartOS. On these instances we configured PostgreSQL to use the available 738gb local storage for data storage.

The systems were tested using the Yahoo! Cloud Serving Benchmark tool (YCSB) with 5 standard workloads that demonstrate a variety of usage scenarios. YCSB is a purpose-built framework for evaluating different workloads on cloud platforms. It has been used in more than 70 published benchmark comparisons and represents a standardized way to compare performance between systems. For a more detailed description of the configuration of the machines used in these tests, please refer to the methodology section of this paper.

DATA LOADING

To begin testing the various workloads, each system needed to be loaded with data. We loaded 6 million records, each containing 10 fields of 100 bytes, for 1k of data per record. The total table size was 6gb.

AWS M1 optimized for EBS performed an average of 2,523.4 ops/s, AWS M1 with provisioned IOPS performed an average of 4,146.8 ops/s, and AWS M3 SSD performed 4,820.8 ops/s. Joyent SmartOS performed 15,225.5 ops/s. Comparatively, Joyent SmartOS delivered 603% of the throughput of AWS M1 optimized for EBS, 367% of that of AWS M1 with provisioned IOPS and 316% of that of AWS M3 SSD. The total load time was 2,300 seconds for AWS M1 optimized for EBS, 1,447 seconds for AWS M1 with provisioned IOPS, 1,245 seconds for AWS M3 SSD and 394 seconds for Joyent SmartOS.
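The sizing in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the names are hypothetical, it assumes decimal units (1gb = 10^9 bytes), and it assumes one insert per record at a steady average rate.

```python
# Sizing arithmetic for the data load described above.
RECORDS = 6_000_000
FIELDS_PER_RECORD = 10
BYTES_PER_FIELD = 100

# Payload bytes per record and for the whole table (row overhead is ignored).
record_bytes = FIELDS_PER_RECORD * BYTES_PER_FIELD
table_bytes = RECORDS * record_bytes


def estimated_load_seconds(ops_per_second: float, records: int = RECORDS) -> float:
    """Estimate load time assuming one insert per record at a steady rate."""
    return records / ops_per_second


print(f"bytes per record: {record_bytes}")
print(f"table size: {table_bytes / 1e9:.1f} gb")
print(f"estimated load time at 15,225.5 ops/s: {estimated_load_seconds(15225.5):.0f} s")
```

At the 15,225.5 ops/s reported for Joyent SmartOS, the estimate comes to roughly 394 seconds, in line with the load time reported above.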

50% READ, 50% UPDATE

The 50% read, 50% update workload represents a workload similar to a session store that records recent actions. 50% of the operations are reads, while 50% are updates of an existing record.

AWS M1 optimized for EBS performed an average of 472.5 ops/s, AWS M1 with provisioned IOPS performed an average of 1,642.3 ops/s, and AWS M3 SSD performed 3,919.0 ops/s. Joyent SmartOS performed 5,969.5 ops/s. Comparatively, Joyent SmartOS delivered 1,263% of the throughput of AWS M1 optimized for EBS, 363% of that of AWS M1 with provisioned IOPS and 152% of that of AWS M3 SSD. The total run time was 1,058 seconds for AWS M1 optimized for EBS, 304 seconds for AWS M1 with provisioned IOPS, 128 seconds for AWS M3 SSD and 84 seconds for Joyent SmartOS.

95% READ, 5% UPDATE

The 95% read, 5% update workload represents a workload similar to a photo tagging website, where a photo is viewed frequently and tags are occasionally added to the photo's data. 95% of the operations are reads of a single record and 5% are updates.

AWS M1 optimized for EBS performed an average of 786.1 ops/s, AWS M1 with provisioned IOPS performed an average of 2,655.7 ops/s, and AWS M3 SSD performed 4,634.3 ops/s. Joyent SmartOS performed 11,033.6 ops/s. Comparatively, Joyent SmartOS delivered 1,404% of the throughput of AWS M1 optimized for EBS, 415% of that of AWS M1 with provisioned IOPS and 238% of that of AWS M3 SSD. The total run time was 636 seconds for AWS M1 optimized for EBS, 188 seconds for AWS M1 with provisioned IOPS, 108 seconds for AWS M3 SSD and 45 seconds for Joyent SmartOS.

100% READ

The 100% read workload represents a static dataset such as a user profile cache, where profiles are constructed infrequently but read often. No operations write to the database under this test.

AWS M1 optimized for EBS performed an average of 762.1 ops/s, AWS M1 with provisioned IOPS performed an average of 3,216.3 ops/s, and AWS M3 SSD performed 4,850.0 ops/s. Joyent SmartOS performed 9,042.1 ops/s. Comparatively, Joyent SmartOS delivered 1,186% of the throughput of AWS M1 optimized for EBS, 281% of that of AWS M1 with provisioned IOPS and 186% of that of AWS M3 SSD. The total run time was 656 seconds for AWS M1 optimized for EBS, 155 seconds for AWS M1 with provisioned IOPS, 103 seconds for AWS M3 SSD and 55 seconds for Joyent SmartOS.

95% READ, 5% INSERT

The 95% read, 5% insert workload represents usage similar to status updates, where the latest records are also the records most likely to be read. 95% of the operations are reads and 5% are inserts of new records into the database.

AWS M1 optimized for EBS performed an average of 2,486.4 ops/s, AWS M1 with provisioned IOPS performed an average of 3,742.1 ops/s, and AWS M3 SSD performed 4,393.1 ops/s. Joyent SmartOS performed 7,382.3 ops/s. Comparatively, Joyent SmartOS delivered 297% of the throughput of AWS M1 optimized for EBS, 197% of that of AWS M1 with provisioned IOPS and 168% of that of AWS M3 SSD. The total run time was 201 seconds for AWS M1 optimized for EBS, 134 seconds for AWS M1 with provisioned IOPS, 114 seconds for AWS M3 SSD and 45 seconds for Joyent SmartOS.

50% READ, 50% READ MODIFY WRITE

The 50% read, 50% read modify write workload simulates users reading a record and then modifying it. 50% of the operations are reads of a single record, and 50% are a read of a single record, a modification of the record retrieved and a subsequent save of the modification back to the database.

AWS M1 optimized for EBS performed an average of 439.2 ops/s, AWS M1 with provisioned IOPS performed an average of 1,581.2 ops/s, and AWS M3 SSD performed 3,198.6 ops/s. Joyent SmartOS performed 6,780.9 ops/s. Comparatively, Joyent SmartOS delivered 1,544% of the throughput of AWS M1 optimized for EBS, 429% of that of AWS M1 with provisioned IOPS and 212% of that of AWS M3 SSD. The total run time was 1,138 seconds for AWS M1 optimized for EBS, 316 seconds for AWS M1 with provisioned IOPS, 156 seconds for AWS M3 SSD and 61 seconds for Joyent SmartOS.
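The comparative percentages quoted for each workload correspond to the ratio of Joyent SmartOS throughput to each AWS configuration's throughput, expressed as a percentage. The following sketch recomputes them from the reported ops/s figures. The function names and the configuration labels in the comments are illustrative and are not taken from the paper.

```python
# Average throughput (ops/s) as reported for each workload. Values are ordered:
# AWS M1 EBS optimized, AWS M1 provisioned IOPS, AWS M3 SSD, Joyent SmartOS.
# (Labels are reconstructed from the surrounding text.)
THROUGHPUT = {
    "Data Load": (2523.4, 4146.8, 4820.8, 15225.5),
    "50% Read / 50% Update": (472.5, 1642.3, 3919.0, 5969.5),
    "95% Read / 5% Update": (786.1, 2655.7, 4634.3, 11033.6),
    "100% Read": (762.1, 3216.3, 4850.0, 9042.1),
    "95% Read / 5% Insert": (2486.4, 3742.1, 4393.1, 7382.3),
    "50% Read / 50% Read Modify Write": (439.2, 1581.2, 3198.6, 6780.9),
}


def throughput_ratio_percent(baseline: float, candidate: float) -> int:
    """Candidate throughput as a percentage of baseline, rounded to an integer."""
    return round(100 * candidate / baseline)


def joyent_vs_aws(workload: str) -> list:
    """Joyent throughput as a percentage of each AWS configuration's throughput."""
    *aws, joyent = THROUGHPUT[workload]
    return [throughput_ratio_percent(base, joyent) for base in aws]


for name in THROUGHPUT:
    print(f"{name}: {joyent_vs_aws(name)}")
```

Note that a ratio of 603% means about 6.0 times the throughput, which is the same as a 503% increase over the baseline.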

DATA TABLE: OPERATIONS PER SECOND

Workload                          AWS M1 EBS optimized  AWS M1 provisioned IOPS  AWS M3 SSD  Joyent SmartOS
Data Load                         2,523.4               4,146.8                  4,820.8     15,225.5
50% Read / 50% Update             472.5                 1,642.3                  3,919.0     5,969.5
95% Read / 5% Update              786.1                 2,655.7                  4,634.3     11,033.6
100% Read                         762.1                 3,216.3                  4,850.0     9,042.1
95% Read / 5% Insert              2,486.4               3,742.1                  4,393.1     7,382.3
50% Read / 50% Read Modify Write  439.2                 1,581.2                  3,198.6     6,780.9

DATA TABLE: TOTAL TIME IN SECONDS

Workload                          AWS M1 EBS optimized  AWS M1 provisioned IOPS  AWS M3 SSD  Joyent SmartOS
Data Load                         2300                  1447                     1245        394
50% Read / 50% Update             1058                  304                      128         84
95% Read / 5% Update              636                   188                      108         45
100% Read                         656                   155                      103         55
95% Read / 5% Insert              201                   134                      114         45
50% Read / 50% Read Modify Write  1138                  316                      156         61
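Because each workload test ran for 500,000 operations (as stated in the YCSB section of this paper), total run time should be close to the operation count divided by average throughput. The sketch below, with hypothetical names and an arbitrary one-second tolerance, compares that estimate with the reported times for the five workloads and lists entries where the two differ.

```python
OPERATIONS_PER_RUN = 500_000  # stated in the YCSB section of the paper

# workload -> (ops/s, reported seconds) per configuration, in column order:
# AWS M1 EBS optimized, AWS M1 provisioned IOPS, AWS M3 SSD, Joyent SmartOS.
REPORTED = {
    "50% Read / 50% Update": [(472.5, 1058), (1642.3, 304), (3919.0, 128), (5969.5, 84)],
    "95% Read / 5% Update": [(786.1, 636), (2655.7, 188), (4634.3, 108), (11033.6, 45)],
    "100% Read": [(762.1, 656), (3216.3, 155), (4850.0, 103), (9042.1, 55)],
    "95% Read / 5% Insert": [(2486.4, 201), (3742.1, 134), (4393.1, 114), (7382.3, 45)],
    "50% Read / 50% Read Modify Write": [(439.2, 1138), (1581.2, 316), (3198.6, 156), (6780.9, 61)],
}


def estimated_seconds(ops_per_second: float, operations: int = OPERATIONS_PER_RUN) -> float:
    """Run time implied by a steady average throughput."""
    return operations / ops_per_second


def mismatches(tolerance_seconds: float = 1.0) -> list:
    """Return (workload, column index, estimated s, reported s) for outliers."""
    found = []
    for workload, rows in REPORTED.items():
        for column, (ops, reported) in enumerate(rows):
            estimate = estimated_seconds(ops)
            if abs(estimate - reported) > tolerance_seconds:
                found.append((workload, column, round(estimate), reported))
    return found


for entry in mismatches():
    print(entry)
```

With the figures above, most entries agree to within a second. The check flags the Joyent SmartOS entries for the 95% read / 5% insert and 50% read / 50% read modify write workloads, where the reported 45 and 61 seconds differ from the roughly 68 and 74 seconds implied by the reported throughput. The paper does not indicate which figure is affected.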

Methodology

In order to test the performance of PostgreSQL across various cloud platform configurations, we set up several PostgreSQL master/slave pairs, with both servers in each pair configured identically. The slave server used asynchronous binary replication and ran in hot standby mode. Each PostgreSQL server was configured to accept up to 1,000 connections, and its shared buffers size was increased to 2 gigabytes.

At Joyent, we created a master/slave PostgreSQL pair using Standard 7.5gb instances for both the master and slave servers. The 7.5gb machine was equipped with 2 virtual CPUs, a 738gb disk and up to 10Gbit/s network access. While not officially part of the cluster, a Standard 3.75gb instance was created to be used as a YCSB client machine, running the same operating system as the server pair. In addition, the YCSB client was within the same internal network, allowing communication over the internal network IP rather than the public IP address.

At Amazon, we created a master/slave PostgreSQL pair in three different EC2 configurations: M1 Large EBS, M1 Large EBS with provisioned IOPS, and M3 SSD ephemeral disk (without EBS). In each case, CentOS 6.4 was used from the disk image provided by CentOS.org (community AMI ami-b3bf2f83). Each instance was also set to single tenant to prevent noisy neighbors from disturbing the test.

The first AWS configuration used m1.large 7.5gb single tenant instances. M1 large instances have 2 virtual CPUs, 840gb ephemeral storage (which was not used) and 500 Mbps network access. Each server was configured to run as EBS optimized and was configured with a 100gb EBS volume. The EBS volume used the ext4 file system. To improve the performance of the ext4 file system, we set the following mount options: noatime,nodiratime,data=writeback,barrier=0,nobh,errors=remount-ro.

The second AWS configuration was identical to the first (m1.large 7.5gb single tenant instances, EBS optimized, with a 100gb ext4 EBS volume and the same mount options), except that the EBS volume was provisioned with 3,000 IOPS, the maximum IOPS available.

The third AWS configuration used m3.large 7.5gb single tenant instances. AWS M3 large instances have 2 virtual CPUs, 32gb ephemeral storage and moderate network performance. The 32gb storage is an SSD-based disk and was configured for use at the time of launch. A small 8gb EBS volume was also added to these instances because the CentOS AMI used required EBS storage to run. These EBS volumes were otherwise not used. As with the M1 instances, the ephemeral storage used an ext4 file system with the same mount options.

While not officially part of the cluster, a Standard 3.75gb m3.medium instance was created to be used as a YCSB client machine, running the same operating system as the server pair. In addition, the YCSB client was within the same internal network, allowing communication over the internal network IP rather than the public IP address.

Definitions

YCSB: Yahoo! Cloud Serving Benchmark tool. The tool is available from GitHub at github.com/brianfrankcooper/ycsb
ops/s: Operations per second as reported by the YCSB tool.
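The PostgreSQL settings named in the methodology map onto standard postgresql.conf parameters. The sketch below renders them as configuration lines. It is an assumption about how the described setup might be expressed, not the authors' actual configuration file: the paper does not state the PostgreSQL version, so version-dependent replication settings (such as the WAL level and connection details for the standby) are deliberately left out, and the helper name is hypothetical.

```python
# Settings named in the methodology, expressed as postgresql.conf parameters.
# The parameter names are standard PostgreSQL settings; the mapping to the
# paper's description is an assumption, since the paper lists no config file.
SETTINGS = {
    "max_connections": "1000",           # accept up to 1,000 connections
    "shared_buffers": "2GB",             # shared buffers increased to 2 gigabytes
    "hot_standby": "on",                 # standby serves read-only queries
    "synchronous_standby_names": "''",   # empty list keeps replication asynchronous
}


def render_conf(settings: dict) -> str:
    """Render settings as 'name = value' lines in postgresql.conf style."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items()) + "\n"


print(render_conf(SETTINGS), end="")
```

Keeping the settings in one mapping makes it easy to apply the same values to both servers in a pair, which matches the paper's statement that each pair was configured identically.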

Yahoo! Cloud Serving Benchmark Tool

The benchmarking tests were performed using the Yahoo! Cloud Serving Benchmark (YCSB) tool, which provided a consistent framework for loading data and running various test scenarios. YCSB was developed at Yahoo Labs to assist in evaluating various key-value and cloud databases. Since the publication of Benchmarking Cloud Serving Systems with YCSB (http://research.yahoo.com/node/3202) in 2010 and the release of the YCSB source code, over 70 publications have used the tool for various benchmark comparisons.

While it is possible to run the YCSB tool locally on the same machine as the database server, we chose to run it on a separate instance in order to prevent any possible interference with the CPU or memory of the machines under test. YCSB allows the user to set the number of threads and a target number of operations per second. We found that setting the thread count to 50 gave the highest operations per second.

We initially loaded each cluster with 6 million records. Each record contained 10 fields of 100 bytes each, and the entire data size was approximately 6gb. Each workload test was run for 500,000 operations.
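The run parameters above can be written down as YCSB core workload properties. The following sketch generates a properties file body for each of the five operation mixes. The property names (recordcount, operationcount, fieldcount, fieldlength, threadcount and the proportion keys) are YCSB core properties, but the exact files the authors used are not given in the paper, so this is a reconstruction under that assumption. The workload class, request distribution and database connection properties are omitted because the paper does not state them, and the mix labels and helper name are hypothetical.

```python
# Parameters stated in the paper, shared by every run.
COMMON = {
    "recordcount": 6_000_000,    # records loaded into each cluster
    "operationcount": 500_000,   # operations per workload test
    "fieldcount": 10,            # fields per record
    "fieldlength": 100,          # bytes per field
    "threadcount": 50,           # client threads that gave the highest ops/s
}

PROPORTION_KEYS = (
    "readproportion",
    "updateproportion",
    "insertproportion",
    "scanproportion",
    "readmodifywriteproportion",
)

# Operation mixes for the five workloads described in the paper.
WORKLOAD_MIXES = {
    "50% read / 50% update": {"readproportion": 0.5, "updateproportion": 0.5},
    "95% read / 5% update": {"readproportion": 0.95, "updateproportion": 0.05},
    "100% read": {"readproportion": 1.0},
    "95% read / 5% insert": {"readproportion": 0.95, "insertproportion": 0.05},
    "50% read / 50% read modify write": {
        "readproportion": 0.5,
        "readmodifywriteproportion": 0.5,
    },
}


def render_properties(mix: dict) -> str:
    """Render one workload as key=value lines, checking the mix sums to 1."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"operation proportions sum to {total}, expected 1.0")
    props = dict(COMMON)
    for key in PROPORTION_KEYS:
        props[key] = mix.get(key, 0.0)  # unused operation types are set to 0
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


for label, mix in WORKLOAD_MIXES.items():
    print(f"# {label}")
    print(render_properties(mix))
```

Setting every unused proportion to zero explicitly avoids relying on YCSB's defaults for operation types a workload does not exercise.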