MySQL High Availability Solutions. Lenz Grimmer <lenz@grimmer.com> http://lenzg.net/ 2009-08-22 OpenSQL Camp St. Augustin Germany

Agenda
- High Availability: Concepts & Considerations
- MySQL Replication
- DRBD / Heartbeat
- MySQL Cluster
- Other HA tools/applications
- Links

What Is HA Clustering?
- Redundancy
- One service goes down, others take over
- IP address takeover, service takeover
- Failover vs. failback vs. switchover
- Not designed for high performance
- Not designed for high throughput (load balancing)

Replication vs. Shared Storage
- Shared storage resource can become SPOF
- Split brain situations can lead to mayhem (e.g. mounting file systems concurrently)
- SAN/NAS read I/O overhead
- Consistency of replicated data
- Synchronous vs. asynchronous replication

MySQL Replication
- One-way, statement- or row-based
- One Master, many Slaves
- Asynchronous: Slaves can lag
- Master maintains binary logs & index
- Easy to use and set up
- Built into MySQL
- Replication is single-threaded
- No automated fail-over
- No heartbeat, no monitoring
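Since replication ships with no monitoring and Slaves can lag, a health check has to be added from outside. A minimal sketch, assuming one row of SHOW SLAVE STATUS has already been fetched into a dict; the function name, the return labels and the 30-second threshold are illustrative assumptions, not part of MySQL:

```python
def check_slave(status, max_lag_seconds=30):
    """Classify one row of SHOW SLAVE STATUS, passed in as a dict.

    The keys are real SHOW SLAVE STATUS columns; the return labels and
    the default threshold are made up for this sketch.
    """
    if status.get("Slave_IO_Running") != "Yes":
        return "io_thread_stopped"      # not fetching binlog events from the Master
    if status.get("Slave_SQL_Running") != "Yes":
        return "sql_thread_stopped"     # not applying relay log events
    lag = status.get("Seconds_Behind_Master")
    if lag is None:                     # NULL in MySQL: lag cannot be determined
        return "lag_unknown"
    if int(lag) > max_lag_seconds:
        return "lagging"
    return "ok"
```

A monitoring script could call this for every Slave and alert on anything other than "ok".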

MySQL Replication Overview
[Diagram: Web/App Server (read & write, write) -> MySQL Master (mysqld, data, binlogs & index) -> replication of binlog data -> MySQL Slave (I/O thread, relay log, SQL thread, mysqld, data)]

Statement-based replication
Pro
- Proven (around since MySQL 3.23)
- Smaller log files
- Auditing of actual SQL statements
- No primary key requirement for replicated tables
Con
- Non-deterministic functions and UDFs
- LOAD_FILE(), UUID(), USER(), FOUND_ROWS() (but RAND() and NOW() work)

Row-based replication
Pro
- All changes can be replicated
- Similar technology used by other RDBMSes
- Fewer locks required for some INSERT, UPDATE or DELETE statements
Con
- More data to be logged
- Log file size increases (backup/restore implications)
- Replicated tables require explicit primary keys
- Possible different result sets on bulk INSERTs
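The difference between the two formats shows up with a non-deterministic function such as UUID(). A toy simulation, not MySQL code: tables are plain dicts, and all function names here are hypothetical stand-ins for what the Master and Slave do.

```python
import uuid


def run_statement(table, row_id):
    # Stand-in for: INSERT INTO t VALUES (row_id, UUID())
    table[row_id] = str(uuid.uuid4())


def replicate_statement_based(master, slave, row_id):
    run_statement(master, row_id)   # Master evaluates UUID()
    run_statement(slave, row_id)    # Slave re-runs the statement and evaluates UUID() again


def replicate_row_based(master, slave, row_id):
    run_statement(master, row_id)   # Master evaluates UUID() once
    slave[row_id] = master[row_id]  # only the resulting row is shipped


master, slave = {}, {}
replicate_statement_based(master, slave, 1)
print(master[1] == slave[1])        # the two copies have diverged

master, slave = {}, {}
replicate_row_based(master, slave, 1)
print(master == slave)              # the two copies are identical
```

The price of the row-based variant is the one listed under Con: the log carries every changed row instead of one short statement.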

Replication Topologies
- Master > Slave
- Master > Slaves
- Master > Slave > Slaves
- Masters > Slave (Multi-Source)
- Master <> Master (Multi-Master)
- Ring (Multi-Master)

Master-Master Replication
- Useful for easier failover
- Not suitable for load-balancing
- Writes still end up on both machines
- Neither machine has the authoritative data
- Don't write to both masters!
- Use Sharding or Partitioning instead (e.g. MySQL Proxy)
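Sharding means every key has exactly one writable home, so no two masters ever accept writes for the same row. A minimal sketch of the routing step, assuming the application (or a proxy) picks the server; the function name and host names are made up for illustration:

```python
import zlib


def shard_for(key, shards):
    """Map a key to exactly one shard using a stable hash (CRC32)."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


shards = ["db-a.example", "db-b.example"]   # hypothetical host names
print(shard_for("user:42", shards))         # the same key always lands on the same shard
```

A plain modulo like this reshuffles most keys when the number of shards changes, so real deployments tend to use a lookup table or consistent hashing instead.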

MySQL Replication as an HA Solution
- What happens if the Master fails?
  - The application won't work anymore
  - Slave has nothing to replicate from
- What happens if the Slave fails?
  - Data will no longer be replicated
- This doesn't sound like High Availability? Correct!
- Replication is just one component of an HA setup
- It doesn't implement all parts of an HA Solution!

Replication & HA
- Combined with Heartbeat
- Virtual IP takeover
- Slave gets promoted to Master
- Side benefits: load balancing & backup
- Tricky to fail back
- No automatic conflict resolution
- Proper failover needs to be scripted
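One step such a failover script needs is deciding which Slave to promote. A minimal sketch, assuming all Slaves replicated from the same failed Master and that their SHOW SLAVE STATUS rows were collected into dicts beforehand; the function names and host names are hypothetical:

```python
def binlog_position(status):
    """Return (binlog file number, position) of the last event the Slave executed."""
    # Relay_Master_Log_File looks like "mysql-bin.000012"; compare the numeric suffix.
    file_number = int(status["Relay_Master_Log_File"].rsplit(".", 1)[1])
    return (file_number, int(status["Exec_Master_Log_Pos"]))


def pick_promotion_candidate(slaves):
    """slaves maps host name -> SHOW SLAVE STATUS row; return the most advanced host."""
    return max(slaves, key=lambda host: binlog_position(slaves[host]))


slaves = {
    "db2": {"Relay_Master_Log_File": "mysql-bin.000012", "Exec_Master_Log_Pos": 400},
    "db3": {"Relay_Master_Log_File": "mysql-bin.000012", "Exec_Master_Log_Pos": 950},
    "db4": {"Relay_Master_Log_File": "mysql-bin.000011", "Exec_Master_Log_Pos": 99999},
}
print(pick_promotion_candidate(slaves))  # db3
```

This covers only the selection. Moving the virtual IP, repointing the remaining Slaves and failing back later still have to be scripted separately.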

Linux-HA / Heartbeat
- Supports 2 or more nodes
- Resource monitoring
- Active fencing mechanism: STONITH
- Policy-based resource management, dependencies & constraints
- Time-based rules
- Includes support for many applications
- GUI support
- Low dependencies / requirements
- Subsecond node failure detection
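Node failure detection boils down to noticing that heartbeats have stopped arriving. A toy sketch of that idea only, not how Linux-HA is implemented; the class, method and parameter names are illustrative, and timestamps are passed in explicitly so the logic is easy to test:

```python
class HeartbeatMonitor:
    """Declare a node dead once no heartbeat arrived within `deadtime` seconds."""

    def __init__(self, deadtime=2.0):
        self.deadtime = deadtime
        self.last_seen = {}               # node name -> time of last heartbeat

    def beat(self, node, now):
        self.last_seen[node] = now        # record a heartbeat from `node`

    def dead_nodes(self, now):
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.deadtime)


monitor = HeartbeatMonitor(deadtime=2.0)
monitor.beat("node-a", now=0.0)
monitor.beat("node-b", now=0.0)
monitor.beat("node-a", now=1.5)           # node-b stays silent
print(monitor.dead_nodes(now=3.0))        # ['node-b']
```

A silent node is not necessarily a dead node, which is why detection is paired with fencing (STONITH) before resources are taken over.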

[Diagram labels: Applications, Virtual IP, Master, HA Slave, Scale-out Slave, two Replication links]

DRBD
- Distributed Replicated Block Device
- "RAID-1 over network"
- Synchronous/asynchronous block replication
- Automatic resync on recovery
- Application-agnostic
- Can mask local I/O errors
- Active/passive configuration by default
- Dual-primary mode (requires a cluster filesystem like GFS or OCFS2)
- http://drbd.org/

DRBD in detail
- DRBD replicates data between two disk partitions
- DRBD integrates nicely with Linux-HA and other HA Solutions
- MySQL runs on the Active node as usual
- MySQL is dormant on the Passive node
- DRBD is Linux only
[Diagram labels: Applications, Virtual IP, Active Node, Passive Node, DRBD]

MySQL Cluster
- Shared-nothing architecture
- Automatic partitioning
- Distributed Fragments
- Synchronous replication
- Fast automatic fail-over of data nodes
- Automatic resynchronization
- Transparent to Application
- Supports Transactions

MySQL Cluster
- In-memory indexes
- Not suitable for all query patterns (multi-table JOINs, range scans)
- No support for foreign key constraints
- Not suitable for large datasets/transactions
- Latency matters
- Can be combined with MySQL Replication (RBR)

MySQL Cluster & Replication
- Easy failover from one master to another
- Scaling writes to multiple masters
- Asynchronous replication from Cluster to multiple slaves
- Reads are distributed to slave servers (running InnoDB/MyISAM)
- Quick setup of new slaves (using Cluster Online Backup)
- Easy failover and quick recovery

http://johanandersson.blogspot.com/2009/05/ha-mysql-write-scaling-using-cluster-to.html

Requirements            | MySQL Replication | MySQL Replication & Heartbeat | MySQL, Heartbeat & DRBD | MySQL Cluster
Availability
Automated IP Failover   | No                | Yes                           | Yes                     | No
Automated DB Failover   | No                | No                            | Yes                     | Yes
Typical Failover time   | Varies            | Varies                        | < 30s                   | < 3s
Auto resync of data     | No                | No                            | Yes                     | Yes
Geographic redundancy   | Yes               | Yes                           | MySQL Replication       | MySQL Replication
Scalability
Built-in load balancing | MySQL Replication | MySQL Replication             | MySQL Replication       | Yes
Read-intensive          | Yes               | Yes                           | MySQL Replication       | Yes
Write-intensive         | No                | No                            | Possible                | Yes
#Nodes/Cluster          | Master/Slave(s)   | Master/Slave(s)               | Active/Passive          | 255

MMM
- MySQL Master-Master Replication Manager
- Collection of scripts to perform monitoring/failover and management
- Master-Master replication configurations (one writable node)
- Failover by moving virtual IP
- http://code.google.com/p/mysql-master-master/

Flipper
- Perl script that manages pairs of MySQL servers replicating to each other
- Automatic switchover, triggered manually
- Enables one half of the pair to be taken offline for maintenance work while the other half carries on dealing with queries from clients
- Moves IP addresses based on a role ("read-only", "writable") between two nodes in the master pair, to ensure that each role is available
- http://provenscaling.com/software/flipper/
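The role-based idea can be sketched in a few lines. This is an illustration of the concept only, not Flipper's actual code (Flipper is written in Perl); the function name and node names are assumptions, and the sketch only computes who should hold each role rather than moving any IP address:

```python
ROLES = ("writable", "read-only")


def assign_roles(current, online):
    """Return role -> node so that every role is held by an online node.

    `current` maps role -> node holding it now; `online` is the set of
    nodes still in service. A role moves only if its holder went offline.
    """
    if not online:
        raise RuntimeError("no node left to hold the roles")
    fallback = sorted(online)[0]          # deterministic choice of the surviving node
    return {role: current[role] if current.get(role) in online else fallback
            for role in ROLES}


before = {"writable": "db1", "read-only": "db2"}
print(assign_roles(before, online={"db2"}))   # db1 taken offline: db2 holds both roles
```

Roles are not moved back automatically when the first node returns, which mirrors the "triggered manually" behaviour on the slide.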

Red Hat Cluster Suite
- HA and load balancing
- Supports up to 128 nodes
- Shared storage
- Monitors services, file systems and network status
- Fencing (STONITH) or distributed lock manager
- Configuration synchronization
- http://www.redhat.com/cluster_suite/

Solaris Cluster / OpenHA Cluster
- Provides failover and scalability services
- Solaris / OpenSolaris
- Kernel-level components plus userland
- Agents monitor applications
- Geo Edition to provide Disaster Recovery using Replication
- Open Source since June 2007
- See Thorsten's FrOSCon talk for more details! (Sunday, 11:15, HS4)
- http://www.opensolaris.org/os/community/ha-clusters/

Related tools / Links
- Linux Heartbeat: http://linux-ha.org/
- Red Hat Cluster Suite: http://www.redhat.com/cluster_suite/
- Sun Open High Availability Cluster: http://opensolaris.org/os/project/ha-mysql/
- Linux Cluster Information Center: http://www.lcic.org/ha.html

Tools/Links
- MySQL Multi-Master Replication Manager: http://code.google.com/p/mysql-master-master/
- Maatkit: http://maatkit.sourceforge.net/
- Mon scheduler and alert management: http://www.kernel.org/software/mon/
- Continuent Tungsten Replicator: https://community.continuent.com/community/tungsten-replicator

Q & A Questions, Comments?

Thank you! Lenz Grimmer <lenz@grimmer.com> http://lenzg.net/

Eliminating the SPOF
- Identify what will fail
  - Disks
  - Fans
- Find out what can fail
  - Network cables
  - Power supplies
  - CPUs
  - NICs
  - OOM (out of memory)
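Why duplicating a component removes it as a SPOF can be shown with simple arithmetic. A minimal sketch, assuming the copies fail independently of each other (which shared power or a shared rack can violate); the function name and the 99% figure are made up for illustration:

```python
def parallel_availability(single, copies):
    """Availability of `copies` redundant components that each have
    availability `single`, assuming independent failures: 1 - (1 - a)^n."""
    return 1 - (1 - single) ** copies


# A component that is up 99% of the time, deployed once and as a redundant pair.
print(round(parallel_availability(0.99, 1), 6))   # 0.99
print(round(parallel_availability(0.99, 2), 6))   # 0.9999
```

The gain only materializes if the failover between the copies works, which is what the rest of the HA stack is for.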

HA Components
- Heartbeat
  - Checks that the services being failed over are alive
  - Can check individual servers, software services, networking etc.
- HA Monitor
  - Configuration of the services
  - Ensures proper shutdown and startup
  - Allows manual control
- Shared storage / Replication

Rules of High Availability
- Prepare for failure
- Aim to ensure that no important data is lost
- Keep it simple, stupid (KISS)
- Complexity is the enemy of reliability
- Automate it
- Test your setup frequently!

Split-Brain
- Communications failures can lead to separated partitions of the cluster
- If those partitions each try to take control of the cluster, it's called a split-brain condition
- If this happens, then bad things will happen
- http://linux-ha.org/badthingswillhappen
- Use Fencing or Moderation/Arbitration to avoid it
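Arbitration is commonly done by majority vote: a partition may take control only if it can see more than half of all votes, so two partitions can never both win. A minimal sketch of that rule; the function name is hypothetical and the vote counting is deliberately simplified:

```python
def may_take_control(votes_seen, total_votes):
    """A partition may run the cluster only with a strict majority of votes."""
    return 2 * votes_seen > total_votes


# Two-node cluster plus one arbitrator = 3 votes in total.
# After a network split, one node still reaches the arbitrator, the other does not.
print(may_take_control(votes_seen=2, total_votes=3))  # True: node + arbitrator
print(may_take_control(votes_seen=1, total_votes=3))  # False: isolated node must stand down
```

With only two votes neither side of a split can reach a majority, which is why two-node setups add an arbitrator or rely on fencing instead.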