High Performance MySQL Choices in Amazon Web Services: Beyond RDS. Andrew Shieh, SmugMug Operations shandrew @ smugmug.



Similar documents
Choosing Storage Systems

OTM in the Cloud. Ryan Haney

Cloud Computing and Amazon Web Services

Web Application Deployment in the Cloud Using Amazon Web Services From Infancy to Maturity

MySQL: Cloud vs Bare Metal, Performance and Reliability

Flash Use Cases Traditional Infrastructure vs Hyperscale

Amazon Cloud Storage Options

Design for Failure High Availability Architectures using AWS

Storage and Disaster Recovery

Running an E-Commerce Database in the Cloud. Mark Uhrmacher (CTO) Aaron Brown (Senior Systems Engineer) ideeli

How AWS Pricing Works

Avoiding Pain Running MySQL in the Cloud

How To Choose Between A Relational Database Service From Aws.Com

Understanding AWS Storage Options

How AWS Pricing Works May 2015

Storage Solutions in the AWS Cloud. Miles Ward Enterprise Solutions Architect

Deploying Database clusters in the Cloud

Web Application Hosting in the AWS Cloud Best Practices

Migration Scenario: Migrating Backend Processing Pipeline to the AWS Cloud

TECHNOLOGY WHITE PAPER Jun 2012

TECHNOLOGY WHITE PAPER Jan 2016

Top 10 Reasons why MySQL Experts Switch to SchoonerSQL - Solving the common problems users face with MySQL

High-Availability in the Cloud Architectural Best Practices

AWS Performance Tuning

Intro to AWS: Storage Services

Enhancements of ETERNUS DX / SF

Deploying Splunk on Amazon Web Services

Accelerating Real Time Big Data Applications. PRESENTATION TITLE GOES HERE Bob Hansen

Zadara Storage Cloud A

Cloud Computing Disaster Recovery (DR)

Web Application Hosting in the AWS Cloud Best Practices

MySQL backups: strategy, tools, recovery scenarios. Akshay Suryawanshi Roman Vynar

PostgreSQL on Amazon. Christophe Pettus PostgreSQL Experts, Inc.

nomorerack CUSTOMER SUCCESS STORY RELIABILITY AND AVAILABILITY WITH FAST GROWTH IN THE CLOUD

Software- as- a- Service (SaaS) on AWS Business and Architecture Overview

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

Scalable Architecture on Amazon AWS Cloud

DISASTER RECOVERY WITH AWS

Designing Apps for Amazon Web Services

Expert Reference Series of White Papers. Introduction to Amazon Relational Database Service (Amazon RDS)

Increased Security, Greater Agility, Lower Costs for AWS DELPHIX FOR AMAZON WEB SERVICES WHITE PAPER

never 20X spike ClustrixDB 2nd Choxi (Formally nomorerack.com) Customer Success Story Reliability and Availability with fast growth in the cloud

Be Very Afraid. Christophe Pettus PostgreSQL Experts Logical Decoding & Backup Conference Europe 2014

High Availability & Disaster Recovery Development Project. Concepts, Design and Implementation

How To Store Data On A Server Or Hard Drive (For A Cloud)

SOFTWAREDEFINED-STORAGE

How To Scale Myroster With Flash Memory From Hgst On A Flash Flash Flash Memory On A Slave Server

Best Practices for Using MySQL in the Cloud

StorReduce Technical White Paper Cloud-based Data Deduplication

Deep Dive: Maximizing EC2 & EBS Performance

Migrating VSMs & Performing disaster recovery

HyperQ DR Replication White Paper. The Easy Way to Protect Your Data

Preparing Your IT for the Holidays. A quick start guide to take your e-commerce to the Cloud

RDBMS in the Cloud: Oracle Database on AWS

An Introduction - ZNetLive's Hybrid Dedicated Servers

Amazon EC2 Product Details Page 1 of 5

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products

Service Catalogue. virtual services, real results

Service Availability Metrics

Parallels Cloud Storage

House of Cards. IaaS without storage performance testing. Howard Marks, Deep Storage Len Rosenthal, Load DynamiX

How To Speed Up A Flash Flash Storage System With The Hyperq Memory Router

GET. tech brief FASTER BACKUPS

Ground up Introduction to In-Memory Data (Grids)

Amazon Relational Database Service (RDS)

All-Flash Storage Solution for SAP HANA:

Cloud Database Demystified to Deliver SaaS Customer Value

IAN MASSINGHAM. Technical Evangelist Amazon Web Services

PARALLELS CLOUD STORAGE

Deep Dive on SimpliVity s OmniStack A Technical Whitepaper

How To Manage An Orgsync Database On An Amazon Cloud 2 Instance

BORG DIGITAL High Availability

Storage Options in the AWS Cloud

Combining Onsite and Cloud Backup

KT ucloud storage. Two Years of Life with OpenStack Swift / Jaesuk Ahn, Cloud OS Dev. Team, Korea Telecom

Amazon Web Services Yu Xiao

Traditional v/s CONVRGD

SQL Server on Azure An e2e Overview. Nosheen Syed Principal Group Program Manager Microsoft

Microsoft Azure Cloud on your terms. Start your cloud journey.

NEXT GENERATION EMC: LEAD YOUR STORAGE TRANSFORMATION. Copyright 2013 EMC Corporation. All rights reserved.

Chapter 9 PUBLIC CLOUD LABORATORY. Sucha Smanchat, PhD. Faculty of Information Technology. King Mongkut s University of Technology North Bangkok

Are you ready for your Journey to the cloud? Maybe some of you are already using some cloud- based services?

Dimension Data Enabling the Journey to the Cloud

THE FIRST LOCAL ENTERPRISE CLOUD STORAGE FEATURES. Enterprise iscsi (Block) & NFS/ CIFS (File) Storage-as-a-Service

So What s the Big Deal?

Cloud computing infrastructure and related issues

PostgreSQL Performance Characteristics on Joyent and Amazon EC2

High Availability Solutions for the MariaDB and MySQL Database

The Data Placement Challenge

AMAZON S3: ARCHITECTING FOR RESILIENCY IN THE FACE OF FAILURES Jason McHugh

SysAid Cloud Architecture Including Security and Disaster Recovery Plan

Understanding SAN and Storage Constraints In Private Clouds

Scalable Application. Mikalai Alimenkou

Providing Self-Service, Life-cycle Management for Databases with VMware vfabric Data Director

Deploying ArcGIS for Server Using Managed Services

GeoCloud Project Report USGS/EROS Spatial Data Warehouse Project

Drupal in the Cloud. Scaling with Drupal and Amazon Web Services. Northern Virginia Drupal Meetup

Migration and Building of Data Centers in IBM SoftLayer with the RackWare Management Module

How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda

Transcription:

High Performance MySQL Choices in Amazon Web Services: Beyond RDS Andrew Shieh, SmugMug Operations shandrew @ smugmug.com April 15, 2015

Agenda 2 All about AWS Current RDS alternatives Cloud failures -> Cloud improvements Migrating MySQL into AWS MySQL backups in AWS The future

SmugMug Who are we? 3

The early days of SmugMug 4 Gradual bootstrapped growth Multiple self-managed datacenter cages Too many servers of varying types Too many disks Tons of valuable skilled employee hours spent in cages

5 Data Center Fantasy

6 Data Center Reality

SmugMug <3 AWS 7 Early adopter of Amazon S3 Over the years, moved rendering, upload, archiving, payments, permissions, email, and more compute to AWS! But DBs? Before mid-2012, no ultra-high performance I/O in AWS In 2013, we went 100% AWS

All About AWS 8 AWS Customer Focus - amazing support Features Tech Cloud is not easy, and is far from a commodity!

RDS? If it works great for you, great! In our 2012 evaluation, it lacked scalability and features that we desired Loss of SUPER, shell access no local SSD We already knew how to run MySQL effectively 9

RDS Alternatives 10 Run your database hardware yourself, Connect via AWS Direct Connect (peering to AWS).

RDS Alternatives 11 No! The hybrid cloud is a trap that will come back to bite you High I/O hardware, complexity goes up, costs can skyrocket

12 Data Center Reality

Use Amazon EC2 13! Pre-EC2, our custom, obscure SSD hardware => difficult to resolve problems, difficult to upgrade TRIM through FS, LVM, MD, driver layers: nightmare Micron/Crucial 5000 hour bug hi1 overall DB IO performance comparable to 8 x SSD RAID10 hi1 < 3%/yr instance failure rate! Many small important SSD details solved for you

EC2 i2.* instances: even better 14 1/2/4/8x sizes (up to 16 core, 244 GB RAM, 6.4 TB SSD) Better everything/$ vs hi1 Slightly higher failure rate, ~5%/yr

Two faces of EBS 15 Since 2008, AWS' primary persistent storage Prior to 2013, large source of the major AWS outages: April 2011, October 2012, December 2012. Regional failures caused by cascading EBS failures Caused EBS-dependent services to fail as well So, we avoided it for production

Two faces of EBS 16 But AWS has learned and continuously improved Limits on amount of reprovisioning when there's a failure Avoiding multi-az outages is key. Separate AZs as much as possible: independent power, networking, cooling, pushes

Two faces of EBS 17 Mid-2012 EBS provisioned- IOPS: EBS on SSDs Late-2014 EBS gp2: EBS on SSDs now standard EBS-optimized lower latency high bw connections from EC2 to EBS at small cost 2015 EBS: 16 TB volumes

Use Amazon EC2+EBS? 18 Improved EBS reliability/ technology=>greater use of it in production Cost advantages, select compute separately from SSD size Particularly useful for replicas and backups

Migration to AWS 19

SmugMug Architecture ~2006 20 AWS: S3 SV: Web, DB, Image*

SmugMug Architecture ~2011 21 AWS: S3 SV: Web, DB AWS: S3, Image (upload, processing, render, video, )

SmugMug Architecture - Transition 22 AWS: S3 SV: Web, DB AWS: S3, Image*, Web DC: Replication DB, Direct Connect

SmugMug Architecture Today 23 Ø AWS: S3, Image*, Web, DB

Zero Downtime Move Requirements 24 Read-only site mode Traffic control shadow load Cross country MySQL replication + sufficient bandwidth Make sure replicas are caught up, go read only, then switch masters to your AWS MySQL instances

High Availability Solution for conventional MySQL Percona MySQL 5.5/5.6: Stable and well tested Other HA solutions address SPOF of conventional MySQL master replication But newer software may be less tested, more buggy. Bugs = SPOF 25

High Availability Solution for conventional MySQL Reduce failure exposure by distributing replicas in separate AZs, masters in one AZ Minimize master failure impact by switching to a read-only mode Recover quickly by reducing time needed to swap in new master 26

Controlling AWS costs 27 Reserved instances, 1/3 year commit discount rates Monitor usage always Talk to your account manager

MySQL Backups in AWS 28 With replicas on EBS, use EBS-S3 Snapshots Simple, fast, and only pay for changed blocks Snapshots also increases durability of your volumes Fast, lazy-loaded volume recovery. Your multi- TB backup can be ready in < ~5 min

MySQL Backups in AWS 29 Percona xtrabackup Consistent, reliable, easy to manage Runs on production DB with minimal impact Choose your favorite storage method for short-term/long-term

The Future: PXC / Galera High durability Masterless, easy architecture Easy to migrate in/out with replication But little track record And performance penalties, esp crossdatacenter 30

Amazon Aurora 31 Admin similar to RDS, backend all different MySQL 5.6 hacked up to separate cache, compute, and storage Compute: r3 (high RAM) EC2 instances only

Amazon Aurora: Storage 32 A new storage method for AWS: Log-structured file system, on a single shared three-datacenter distributed system Master and all replicas share the same storage--each proceeding with replication by points in time on the log

Amazon Aurora, continued External distributed cache Master performance roughly 2-3x higher than fastest EC2 ephemeral SSD (8 x 800G) Near-zero replica lag No need to provision storage or IOPS, ever! 33

Amazon Aurora, continued Server cost is 65-70% higher than equivalent EC2 instances - a bargain! Storage cost is same as GP2 EBS SSD, 10 cents/gb-mo, despite having 6 copies on SSD! IO charges similar to RDS Increased commit latency No replication in currently More details: watch Anurag Gupta's presentation (30m) at re:invent 2014 34

Make your picks 35 PXC and Aurora fix failure modes of traditional MySQL replication But add lots of unknowns The leading MySQL users continue to choose MySQL 5.6 with (mostly) traditional replication

Make your picks 36 Amazon designed Aurora for customers like SmugMug! Trust = Reputation + Time x Track record AWS wants to "lock you in" by being the best choice. This is their time tested track record (YMMV)

Stuff we really like 37 AWS S3 and VPC Stackdriver and NewRelic for monitoring Percona s support and tools Puppet, configuration management SmugMug s Super Heroes

Questions? 38 Andrew Shieh, Sunnyvale, CA shandrew @ smugmug.com @shandrew! http://www.smugmug.com/ http://pics.shieh.info/!! Thank you!! Please email me to request an audio recording of my session.