OVERVIEW

In today's big data world, high-performance databases are not only required but are a major part of any critical business function. With the advent of mobile devices, users are consuming data like never before, requiring enterprises to serve large user populations and deliver highly responsive applications with no downtime. CTOs must make a prudent decision, and thus seek out and investigate a database's performance characteristics to obtain a level of assurance that their database applications are both durable and scalable. They need hard facts to back up such a critical business decision. Furthermore, this decision has to speak to both the present and the future.

The intent of this document is to provide database performance benchmarks for PostgreSQL across popular cloud platforms. This effort embodies some of the current architectures used by companies on those platforms. Results from such benchmarks can be as varied as the applications interfacing with them, but the results included in this report can be used as a reference for running a high-performance database on both Joyent and Amazon Web Services (AWS).

TEST DESCRIPTION

In our tests we analyzed the performance of PostgreSQL using several configurations of both Amazon EC2 and Joyent's public cloud. Our goal was to measure the performance of PostgreSQL using an architecture and configuration that is widely used in production environments on comparable systems. Thus no tuning was done to either PostgreSQL or the servers themselves beyond what is noted below. We certainly acknowledge that in expert hands both PostgreSQL and the servers it runs on can be tuned to an astonishing degree, but such tunings may not be available or even advantageous on all systems, and they make valid side-by-side comparisons more difficult. For all tests, PostgreSQL was set up in a master/slave configuration using asynchronous binary replication. On Amazon EC2 we tested three server configurations.
The first EC2 configuration was a pair of EBS-optimized M1 Large 7.5gb instances with a 100gb EBS volume without provisioned IOPS. The second EC2 configuration was a pair of EBS-optimized M1 Large 7.5gb instances with a 100gb EBS volume provisioned with 3,000 IOPS. The third EC2 configuration was a pair of M3 Large 7.5gb instances using the 32gb SSD ephemeral (local) storage rather than an attached EBS volume. On Joyent's public cloud we tested a pair of Standard 7.5gb instances running SmartOS. On these instances we configured PostgreSQL to use the available 738gb of local storage for data storage. The systems were tested using the Yahoo! Cloud Serving Benchmark (YCSB) tool with 5 standard loads that demonstrate a variety of usage scenarios. YCSB is a purpose-built tool that provides a framework for evaluating different workloads on cloud platforms. YCSB has been used in more than 70 published benchmark comparisons and represents a standardized way to compare performance between systems. For a more detailed description of the configuration of the machines used in these tests, please refer to the Methodology section of this paper.
DATA LOADING

To begin the work of testing various workloads, each system needed to be loaded with data. We loaded 6 million records, each containing 10 fields of 100 bytes of data, creating 1kb of data per record. The total table size was 6gb. AWS M1 optimized for EBS performed an average of 2,523.4 ops/s; AWS M1 with provisioned IOPS performed an average of 4,146.8 ops/s; and AWS M3 performed 4,820.8 ops/s. Joyent SmartOS performed 15,225.5 ops/s. Comparatively, Joyent SmartOS was 603% faster than AWS M1 optimized for EBS, 367% faster than AWS M1 with provisioned IOPS, and 316% faster than AWS M3. The total load time for AWS M1 optimized for EBS was 2,300 seconds, for AWS M1 with provisioned IOPS was 1,447 seconds, for AWS M3 was 1,245 seconds, and 394 seconds for Joyent SmartOS.
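The sizing and load-time figures above can be checked with simple arithmetic; the sketch below (variable names are ours, not part of the test harness) confirms the record size and derives the SmartOS load time from the reported throughput.

```python
# Sanity-check the dataset sizing and load time reported above.

FIELDS_PER_RECORD = 10
BYTES_PER_FIELD = 100
RECORDS = 6_000_000

record_bytes = FIELDS_PER_RECORD * BYTES_PER_FIELD  # 1,000 bytes, ~1kb per record
total_gb = RECORDS * record_bytes / 1_000_000_000   # ~6gb total table size

# Load time follows from record count divided by average throughput (ops/s).
load_seconds_smartos = RECORDS / 15_225.5           # ~394 s, matching the report

print(record_bytes, total_gb, round(load_seconds_smartos))  # → 1000 6.0 394
```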
50% READ, 50% UPDATE

The 50% read, 50% update workload represents a workload similar to a session store recording recent actions. 50% of the operations are read operations while 50% are updates of an existing record. AWS M1 optimized for EBS performed an average of 472.5 ops/s; AWS M1 with provisioned IOPS performed an average of 1,642.3 ops/s; and AWS M3 performed 3,919.0 ops/s. Joyent SmartOS performed 5,969.5 ops/s. Comparatively, Joyent SmartOS was 1,263% faster than AWS M1 optimized for EBS, 363% faster than AWS M1 with provisioned IOPS, and 152% faster than AWS M3. The total run time for AWS M1 optimized for EBS was 1,058 seconds, for AWS M1 with provisioned IOPS was 304 seconds, for AWS M3 was 128 seconds, and 84 seconds for Joyent SmartOS.
95% READ, 5% UPDATE

The 95% read, 5% update workload represents a workload similar to a photo tagging website, where a photo is viewed frequently and tags are occasionally added to the photo's data. 95% of the operations are reads of a single record and 5% are updates. AWS M1 optimized for EBS performed an average of 786.1 ops/s; AWS M1 with provisioned IOPS performed an average of 2,655.7 ops/s; and AWS M3 performed 4,634.3 ops/s. Joyent SmartOS performed 11,033.6 ops/s. Comparatively, Joyent SmartOS was 1,404% faster than AWS M1 optimized for EBS, 415% faster than AWS M1 with provisioned IOPS, and 238% faster than AWS M3. The total run time for AWS M1 optimized for EBS was 636 seconds, for AWS M1 with provisioned IOPS was 188 seconds, for AWS M3 was 108 seconds, and 45 seconds for Joyent SmartOS.
100% READ

The 100% read workload represents a static dataset such as a user profile cache, where profiles are constructed infrequently but read often. No operations write to the database in this test. AWS M1 optimized for EBS performed an average of 762.1 ops/s; AWS M1 with provisioned IOPS performed an average of 3,216.3 ops/s; and AWS M3 performed 4,850.0 ops/s. Joyent SmartOS performed 9,042.1 ops/s. Comparatively, Joyent SmartOS was 1,186% faster than AWS M1 optimized for EBS, 281% faster than AWS M1 with provisioned IOPS, and 186% faster than AWS M3. The total run time for AWS M1 optimized for EBS was 656 seconds, for AWS M1 with provisioned IOPS was 155 seconds, for AWS M3 was 103 seconds, and 55 seconds for Joyent SmartOS.
95% READ, 5% INSERT

The 95% read, 5% insert workload represents usage similar to status updates, where the latest records are also the records most likely to be read. 95% of the operations are read operations and 5% are inserts of new records into the database. AWS M1 optimized for EBS performed an average of 2,486.4 ops/s; AWS M1 with provisioned IOPS performed an average of 3,742.1 ops/s; and AWS M3 performed 4,393.1 ops/s. Joyent SmartOS performed 7,382.3 ops/s. Comparatively, Joyent SmartOS was 297% faster than AWS M1 optimized for EBS, 197% faster than AWS M1 with provisioned IOPS, and 168% faster than AWS M3. The total run time for AWS M1 optimized for EBS was 201 seconds, for AWS M1 with provisioned IOPS was 134 seconds, for AWS M3 was 114 seconds, and 45 seconds for Joyent SmartOS.
50% READ, 50% READ-MODIFY-WRITE

The 50% read, 50% read-modify-write workload simulates users reading and then modifying the record they read. 50% of the operations are reads of a single record, and 50% are a read of a single record, a modification of the record retrieved, and a subsequent save of the modification back to the database. AWS M1 optimized for EBS performed an average of 439.2 ops/s; AWS M1 with provisioned IOPS performed an average of 1,581.2 ops/s; and AWS M3 performed 3,198.6 ops/s. Joyent SmartOS performed 6,780.9 ops/s. Comparatively, Joyent SmartOS was 1,544% faster than AWS M1 optimized for EBS, 429% faster than AWS M1 with provisioned IOPS, and 212% faster than AWS M3. The total run time for AWS M1 optimized for EBS was 1,138 seconds, for AWS M1 with provisioned IOPS was 316 seconds, for AWS M3 was 156 seconds, and 61 seconds for Joyent SmartOS.
DATA TABLE: OPERATIONS PER SECOND

Workload                            AWS M1 EBS   AWS M1 PIOPS   AWS M3 SSD   Joyent SmartOS
Data Load                              2,523.4        4,146.8      4,820.8         15,225.5
50% Read / 50% Update                    472.5        1,642.3      3,919.0          5,969.5
95% Read / 5% Update                     786.1        2,655.7      4,634.3         11,033.6
100% Read                                762.1        3,216.3      4,850.0          9,042.1
95% Read / 5% Insert                   2,486.4        3,742.1      4,393.1          7,382.3
50% Read / 50% Read Modify Write         439.2        1,581.2      3,198.6          6,780.9

DATA TABLE: TOTAL TIME IN SECONDS

Workload                            AWS M1 EBS   AWS M1 PIOPS   AWS M3 SSD   Joyent SmartOS
Data Load                                2,300          1,447        1,245              394
50% Read / 50% Update                    1,058            304          128               84
95% Read / 5% Update                       636            188          108               45
100% Read                                  656            155          103               55
95% Read / 5% Insert                       201            134          114               45
50% Read / 50% Read Modify Write         1,138            316          156               61
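The "percent faster" figures quoted throughout this report are derived from the ops/s table by simple ratio: Joyent SmartOS throughput as a percentage of the competing configuration's throughput, rounded to the nearest whole percent. The sketch below (names are ours) reproduces a few of them.

```python
# Reproduce the percent-faster comparisons from the ops/s data table.
# Columns: AWS M1 EBS, AWS M1 PIOPS, AWS M3 SSD, Joyent SmartOS.

OPS = {
    "Data Load":                        (2_523.4, 4_146.8, 4_820.8, 15_225.5),
    "50% Read / 50% Update":            (472.5, 1_642.3, 3_919.0, 5_969.5),
    "50% Read / 50% Read Modify Write": (439.2, 1_581.2, 3_198.6, 6_780.9),
}

def percent_faster(joyent: float, other: float) -> int:
    """Joyent throughput as a rounded percentage of the other configuration's."""
    return round(joyent / other * 100)

for workload, (ebs, piops, m3, joyent) in OPS.items():
    print(workload, [percent_faster(joyent, x) for x in (ebs, piops, m3)])
# Data Load → [603, 367, 316], matching the figures in the text.
```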
METHODOLOGY

In order to test the performance of PostgreSQL across various cloud platform configurations, we set up several pairs of PostgreSQL servers in a master/slave configuration, with each server in a pair configured identically. The slave server used asynchronous binary replication and was set to run in hot standby mode. Each PostgreSQL server was configured to accept up to 1,000 connections, and its shared buffers size was increased to 2 gigabytes.

DEFINITIONS
YCSB: the Yahoo! Cloud Serving Benchmark tool, available on GitHub at github.com/brianfrankcooper/ycsb.
Ops/s: operations per second as reported by the YCSB tool.

At Joyent, we created a master/slave PostgreSQL pair using Standard 7.5gb instances for both the master and slave servers. The 7.5gb machine was equipped with 2 virtual CPUs, a 738gb disk, and up to 10Gbit/s network access. While not officially part of the cluster, a Standard 3.75gb instance was created to be used as a YCSB client machine, running the same operating system as the server pair. The YCSB client was placed on the same internal network so that communication used the internal IP address rather than the public IP address.

At Amazon, we created a master/slave PostgreSQL pair in three different EC2 configurations: M1 Large with EBS, M1 Large with EBS and provisioned IOPS, and M3 with SSD ephemeral disk (data not stored on EBS). In each case, CentOS 6.4 was used from the disk image provided by CentOS.org (community AMI ami-b3bf2f83). Each instance was also set to single tenancy to prevent noisy neighbors from disturbing the tests.

The first AWS configuration used m1.large 7.5gb single-tenant instances. M1 Large instances have 2 virtual CPUs, 840gb of ephemeral storage (which was not used), and 500 Mbps network access. Each server was configured to be EBS-optimized and was given a 100gb EBS volume. The EBS volume used the ext4 file system.
To improve the performance of the ext4 file system, we set the following mount options: noatime,nodiratime,data=writeback,barrier=0,nobh,errors=remount-ro.

The second AWS configuration used m1.large 7.5gb single-tenant instances. M1 Large instances have 2 virtual CPUs, 840gb of ephemeral storage (which was not used), and 500 Mbps network access. Each server was configured to be EBS-optimized and was given a 100gb EBS volume. The EBS volume was provisioned with 3,000 IOPS, the maximum IOPS available, and used the ext4 file system with the same mount options noted above.

The third AWS configuration used m3.large 7.5gb single-tenant instances. AWS M3 Large instances have 2 virtual CPUs, 32gb of ephemeral storage, and moderate network performance. The 32gb of storage is SSD-based disk and was configured for use at the time of launch. A small 8gb EBS volume was also added to these instances because the CentOS AMI used required EBS storage to run; these EBS volumes were otherwise not used. As with the M1 instances, the ephemeral storage used an ext4 file system, which was set with the same mount options as the AWS M1 instances.

While not officially part of the cluster, an m3.medium 3.75gb instance was created to be used as a YCSB client machine, running the same operating system as the server pair. The YCSB client was placed on the same internal network so that communication used the internal IP address rather than the public IP address.
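For reference, the ext4 mount options listed above can be applied persistently via /etc/fstab. A sketch, assuming the EBS volume appears as /dev/xvdf and PostgreSQL's data directory lives at /var/lib/pgsql (both names are illustrative, not taken from the test setup):

```
# /etc/fstab entry applying the ext4 options used in these tests
# (device name and mount point are illustrative)
/dev/xvdf  /var/lib/pgsql  ext4  noatime,nodiratime,data=writeback,barrier=0,nobh,errors=remount-ro  0  0
```

Note that data=writeback and barrier=0 favor throughput over crash-consistency guarantees, which is consistent with the benchmark's goal of measuring raw performance.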
YAHOO! CLOUD SERVING BENCHMARK TOOL

The benchmarking tests were performed using the Yahoo! Cloud Serving Benchmark (YCSB) tool, which provided a consistent framework for loading the databases and running various test scenarios. YCSB was developed at Yahoo! Labs to assist in evaluating various key-value and cloud databases. Since the publication of "Benchmarking Cloud Serving Systems with YCSB" (http://research.yahoo.com/node/3202) in 2010 and the release of the YCSB source code, over 70 publications have used the tool for various benchmark comparisons.

While it is possible to run the YCSB tool locally on the same machine as the database server, we chose to run it on a separate instance in order to prevent any possible interference with the CPU or memory of the machines under test. YCSB allows the user to set the number of client threads and a target number of operations per second. We found that setting the thread count to 50 gave the highest operations per second. We initially loaded each cluster with 6 million records; each record contained 10 fields of 100 bytes each, and the entire data size was approximately 6gb. Each workload test was run for 500,000 operations.
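The dataset and workload parameters described above map onto YCSB's standard core-workload properties. A sketch of a properties file for the 50% read / 50% update workload, assuming the stock CoreWorkload class is used (the property names are YCSB's; the specific file is ours, not part of the original test scripts):

```
# YCSB core workload properties matching the tests described above
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=6000000       # 6 million records loaded
operationcount=500000     # each workload run for 500,000 operations
fieldcount=10             # 10 fields per record
fieldlength=100           # 100 bytes per field, ~1kb per record
readproportion=0.5        # 50% reads
updateproportion=0.5      # 50% updates
threadcount=50            # thread count that gave the highest ops/s
```

The other workloads vary only the proportion properties (e.g. readproportion=0.95 with updateproportion=0.05, insertproportion=0.05, or readmodifywriteproportion=0.5).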