MySQL: Cloud vs Bare Metal, Performance and Reliability Los Angeles MySQL Meetup Vladimir Fedorkov, March 31, 2014
Let s meet each other Performance geek All kinds MySQL and some Sphinx Working for Blackbird Twitter @vfedorkov Slides will be there: http://astellar.com
Credits and thank yous LAMySQL Meetup Joe Devon Caroline Rose Blackbird, AWS advances tech partner Laine Campbell http://blackbird.io Blackbird s Rsync and EU team
How does it start Startup deciding to go with MySQL Getting successful That server gets slower and slower Now what to do?
Usual ways to grow Buy bigger server (scale up) Buy another server (scale out) Tune up your application Limit functionality
Questions to answer Which server to buy? Where? How long does it take to get new server? How long does it take to make it work? Why it didn t help? Why application is slow?
Which server to get? How much would infrastructure will cost me? Which performance will I get? How easy would be to add more servers? How much would it cost me?
Bare metal vs Cloud
Physical servers Predictable performance Raw infrastructure Takes long to install Old-style operations
Amazon EC2 Virtually provisioned boxes Managed infrastructure Boxes Network Attached disks Web/API bound operations Strong scalability suport Full OS access Flexible performance More your pay more you ll get
Amazon RDS Fully managed MySQL instances (SaaS) Oracle, MSSQL, RedShift, etc. No OS access Same on-demand scalability as EC2 Scripted scaling up/down MySQL-specific operations can be scripted
Resources Bare metal EC2 RDS CPU Physical Virtual, limited Virtual, limited RAM Physical Shared Shared IO Predictable Networkbased, limited Networkbased, limited
Operations
Day-to-day MySQL operations Hardware faults Software bugs Reliable backup and restore Monitoring Upgrading and replication setup Networking and Security
Physical servers vs EC2 vs RDS Bare metal EC2 RDS Provisioning By you Ready-to-go boxes DB instances Monitoring Your own OS-level by amazon All by amazon Backups Your own Or no backups Snapshot-based Manual on MySQL level Automated, no delayed replicas Replication Manual Manual Auto magic Networking All yours Provided by AWS Provided by AWS Hardware faults and software bugs All yours Included Included
Amazon cloud details Several regions (US, Europe, Asia) Several availability zones inside one region Different instance types General Memory optimized Compute-optimized GPU-instances Storage optimized
Storage EBS EC2 / RDS POIPS EC2 / RDS SSD-based backend Ephemeral EC2 only, attached SSD Attached
RDS details Parameter groups instead of my.cnf One click Web interface API available Limited visibility and operations Cloud-specific behavior Upgrade as a Surprise Can be disables in control panel Instance relocation as-a-surprise
How to survive?
Know your application Configure MySQL properly Tune your queries Monitor your system Predict your traffic
EC2 and RDS specific Hardware might be different Even each benchmark run IO is highly relies on network CPU & IO limiters are in place You don t know your cloud neighbors
First step MySQL configuration 5% settings makes 95% performance Which every query could kill What to tune at the first place innodb_buffer_pool_size Set on ¾ memory in RDS by default innodb_file_per_table key_buffer for MyISAM server-id for EC2 and standalone
Second step Network IO skip_name_resolv May affect authentication max_connect_errors FLUSH HOSTS max_allowed_packet innodb_flush_log_at_trx_commit innodb_log_file_size
Changes to avoid (usually) max_join_size sort_buffer_size tx_isolation Remember about 5% settings and 95% performance!
Backups What would you do if your MySQL has gone? Replicas-Replicas-Replicas and binary logs For point-in-time recovery Better to another host expire_logs_days max_binlog_size RDS snapshots Volume snapshots on EC2 Works like LVM backups
Consider worst case Limit InnoDB IO capacity on EBS Affecting replica creation time! Tune flush_log_at_trx_commit Keep tmp_table_size big enough Temporary tables are BAD! There are drawbacks! Know your queries!
Queries Most likely the cause of MySQL slowness Affect application speed Fastest MySQL query is the query that did not MySQL at all Cached Sent to another storage (you name it) Just didn t run Helps in case of overload
Bad queries: full table scans SELECT * FROM table WHERE DAY(FROM_UNIXTIME(`ts`)) = 205 WHERE enabled!= 1 WHERE id NOT IN (1,2,3,,10) WHERE url LIKE %something% LIKE something% is still okay ORDER BY RAND() All time classic
Poor index usage SELECT * / COUNT(*) FROM users WHERE enabled=1 Hello half of the table How to fix? Pre-aggregation Caches! Not in MySQL query cache! Right time to use external storage (Mongo, Redis, Sphinx, etc)
Temporary tables Bad Very bad if they hit the disk When? Will hit if there are TEXT or BLOB fields GROUP BY Sub SELECTS (better in 5.6) DISTINCT + ORDER BY
What else to do? Scale Run heavy queries on reporting replica Rewrite bad queries where possible
When load is really high Query cache (off) Thread cache Table cache Slow query log wait_timeout connection pooling Sharding and replication
Questions!
Thank you! http://blackbird.io, Vladimir Fedorkov, LA MySQL 2014