Using Tungsten Replicator to solve replication problems

Using Tungsten Replicator to solve replication problems. Neil Armitage, Cluster Implementation Engineer, Continuent; Giuseppe Maxia, QA Director, Continuent

About us. Neil Armitage: Continuent Tungsten Deployment and Support Engineer, Continuent, Inc.; 20 years of development and DB experience. Giuseppe Maxia, a.k.a. "The Data Charmer": QA Director, Continuent, Inc.; 25 years of development and DB experience; long-time MySQL community member; Oracle ACE Director.

Tungsten Replicator: global transaction ID, multiple masters, multiple sources, flexible topologies, parallel replication, heterogeneous replication... and more

What Tungsten Replicator is NOT: automated management, automatic failover, transparent connections. All of the above (and more) are available with a commercial solution named Continuent Tungsten (a.k.a. Tungsten Enterprise).

What are we talking about? Requirements, components, installation, topologies, administration, troubleshooting.

Tungsten Replicator concepts. Replicator: the replication engine. Role: master, slave, or direct slave. Service: a.k.a. "pipeline". Stage: extract, queue, apply.

Tungsten Replicator components. THL: the Transaction History Log. Service schema: makes the node crash-proof. Properties file: the service definition. Tools: ruling from a centralized location.

Tungsten Replicator in a nutshell [diagram: host1 (master) and host2 (slave); the master's binlog feeds a THL on each host, and trep_commit_seqno on each host records the global transaction ID (origin, seqno, eventid)]

Planning: hosts; topology; stand-alone or taking over.

[Diagram: topologies: master-slave, fan-in slave, all-masters, star, and heterogeneous (MySQL to Oracle, Oracle to MySQL)]

Installation

Installation: system requirements; validate first; deploying from a single location.

Installation tools: tools/tungsten-installer, tools/configure-service, tools/update. (Using the cookbook recipes, you hardly see them.)

Tungsten in practice: installation

Installation: check the requirements; get the binaries; expand the tarball; run the cookbook.

Requirements: Java JRE or JDK (Sun/Oracle or OpenJDK); Ruby 1.8 (only during installation); ssh access to the same user on all nodes; a MySQL user with all privileges.

Installation choices: --master-slave, --direct

master-slave [diagram: host1 (master) with its binlog and a THL; host2 and host3 (slaves), each with a THL]

direct [diagram: host1 (master) with its binlog; host2 and host3 (slaves), each with a relay log and a THL]

Overview of virtual machines: copy the zip files from the USB key; expand them on local disk; start all 4 machines in VirtualBox.

Virtual machines: 4 nodes, host1 to host4, running CentOS 6.3 and Percona 5.5. Root and tungsten password = password. localhost port 2222 redirects to port 22 on the hosts:
    ssh -p 2222 tungsten@localhost

VERY important definitions. Staging directory: where you unpack the software and run the installer. There is generally only one, on one host; it can be discarded after installation. Installation directory: where your installed software will go. There is one for every host.

Example. host1: staging directory $HOME/tungsten-replicator-2.0.8-167, installation directory /opt/replication. host2: installation directory /opt/replication. host3: installation directory /opt/replication.

Requirements: how to, step by step; how it happened

Installing VMs: step-by-step demo

Overview of Tungsten cookbook

tungsten cookbook
    tungsten-replicator-2.0.8-167
    +--/cluster-home
    +--/cookbook
    +--/tools
    +--/tungsten-replicator

tungsten cookbook (master-slave)
    tungsten-replicator-2.0.8-167
    +--/cookbook
       +--COMMON_NODES.sh
       +--USER_VALUES.sh
       +--NODES_MASTER_SLAVE.sh
       +--install_master_slave
       +--show_cluster
       +--test_cluster
       ...

tungsten cookbook (all-masters)
    tungsten-replicator-2.0.8-167
    +--/cookbook
       +--COMMON_NODES.sh
       +--USER_VALUES.sh
       +--NODES_ALL_MASTERS.sh
       +--install_all_masters
       +--show_cluster
       +--test_cluster
       ...

tungsten cookbook (star)
    tungsten-replicator-2.0.8-167
    +--/cookbook
       +--COMMON_NODES.sh
       +--USER_VALUES.sh
       +--NODES_STAR.sh
       +--install_star
       +--show_cluster
       +--test_cluster
       ...

tungsten cookbook (fan-in)
    tungsten-replicator-2.0.8-167
    +--/cookbook
       +--COMMON_NODES.sh
       +--USER_VALUES.sh
       +--NODES_FAN_IN.sh
       +--install_fan_in
       +--show_cluster
       +--test_cluster
       ...

tungsten cookbook
    $ cat COMMON_NODES.sh
    export NODE1=host1
    export NODE2=host2
    export NODE3=host3
    export NODE4=host4

tungsten cookbook
    $ cat USER_VALUES.sh
    # User defined values for the cluster to be installed.
    export TUNGSTEN_BASE=$HOME/installs/cookbook
    export DATABASE_USER=tungsten
    export BINLOG_DIRECTORY=/var/lib/mysql
    export MY_CNF=/etc/my.cnf
    export DATABASE_PASSWORD=secret
    export DATABASE_PORT=3306
    export TUNGSTEN_SERVICE=cookbook
    export RMI_PORT=10000
    export THL_PORT=2112
    export START_OPTION=start
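Before running any installer it can help to confirm that every value from USER_VALUES.sh is actually set. The sketch below is only an illustration: check_user_values is a hypothetical helper, not part of the cookbook, and the values are the ones shown on the slide.

```sh
#!/bin/sh
# Values copied from the USER_VALUES.sh slide.
export TUNGSTEN_BASE=$HOME/installs/cookbook
export DATABASE_USER=tungsten
export BINLOG_DIRECTORY=/var/lib/mysql
export MY_CNF=/etc/my.cnf
export DATABASE_PASSWORD=secret
export DATABASE_PORT=3306
export TUNGSTEN_SERVICE=cookbook
export RMI_PORT=10000
export THL_PORT=2112
export START_OPTION=start

# check_user_values: hypothetical helper (an assumption, not a cookbook script).
# Prints "missing: NAME" for every unset or empty variable and
# returns 1 if at least one is missing, 0 otherwise.
check_user_values() {
    missing=0
    for name in TUNGSTEN_BASE DATABASE_USER BINLOG_DIRECTORY MY_CNF \
                DATABASE_PASSWORD DATABASE_PORT TUNGSTEN_SERVICE \
                RMI_PORT THL_PORT START_OPTION; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

check_user_values && echo "all values set"
```

This does not replace the cookbook's own validation scripts; it only catches an empty variable before they run.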

Getting started: VALIDATE FIRST
    export VERBOSE=1
    ./cookbook/check_cookbook
    ./cookbook/validate_cluster

Sample master-slave installation: edit cookbook/COMMON_NODES.sh; edit cookbook/USER_VALUES.sh; run cookbook/install_master_slave; and then run cookbook/show_cluster and cookbook/test_cluster.

What does the installation do? 1: validate all servers (host1, host2, host3, host4) and report all errors.

What does the installation do? 1 (again): validate all servers (host1, host2, host3, host4).

What does the installation do? 2: install Tungsten on all servers (host1, host2, host3, host4). $HOME/tinstall/ contains config/, releases/, relay/, thl/, tungsten/, and backups/.

Example (from manual installation)
    ssh r2 chmod 444 $HOME/tinstall
    ./tools/tungsten-installer \
      --master-slave --master-host=r1 \
      --datasource-user=tungsten \
      --datasource-password=secret \
      --service-name=dragon \
      --home-directory=$HOME/tinstall \
      --thl-directory=$HOME/tinstall/logs \
      --relay-directory=$HOME/tinstall/relay \
      --cluster-hosts=r1,r2,r3,r4 --start
    ERROR >> qa.r2.continuent.com >> /home/tungsten/tinstall is not writeable

Example
    ssh r2 chmod 755 $HOME/tinstall
    ./tools/tungsten-installer \
      --master-slave --master-host=r1 \
      --datasource-user=tungsten \
      --datasource-password=secret \
      --service-name=dragon \
      --home-directory=$HOME/tinstall \
      --thl-directory=$HOME/tinstall/logs \
      --relay-directory=$HOME/tinstall/relay \
      --cluster-hosts=r1,r2,r3,r4 --start
    # no errors

After installation: a tour of the cookbook utilities

General principles (1). Scripts without an extension are designed to be launched by users, e.g. ./cookbook/help or ./cookbook/install_master_slave. Scripts with the extension ".sh" are either for internal use only or deprecated. The ./cookbook/install_* scripts can be used before installing; almost everything else requires an installed topology.

General principles (2). After installation there is a file CURRENT_TOPOLOGY in the staging directory. Cookbook scripts can be used either from the staging directory or from the installation directory.

Cookbook tour: help and checks
    ./cookbook/check_cookbook
    ./cookbook/help
    ./cookbook/readme

Cookbook tour: getting information
    ./cookbook/show_cluster
    ./cookbook/paths
    ./cookbook/backups
    ./cookbook/services
    ./cookbook/query_node {node} {query}
    ./cookbook/query_all_nodes {query}

Cookbook tour: inspecting replication
    ./cookbook/replicator
    ./cookbook/trepctl
    ./cookbook/thl
    ./cookbook/show_conf
    ./cookbook/edit_conf
    ./cookbook/show_log
    ./cookbook/vimlog
    ./cookbook/emacslog

Cookbook tour: testing tools
    ./cookbook/test_cluster
    ./cookbook/start_load [start|stop]
    ./cookbook/test_all_topologies

Cookbook tour: powerful admin tools
    ./cookbook/heartbeat
    ./cookbook/switch
    ./cookbook/add_node_master_slave
    ./cookbook/add_node_star
    ./cookbook/copy_backup
    ./cookbook/clear_cluster   # <--- CAUTION!

More installation

DRY-RUN: a method to simulate installation. It does NOT perform the installation and does NOT even do validation; it only shows the commands used to install. This lets you take the commands and do an installation manually (e.g. when you can't ssh between nodes).

DRY-RUN
    export DRYRUN=1
    ./cookbook/install_master_slave

Intro to multi-master installation

How tungsten-installer works for a basic master/slave deployment [diagram: from the staging copy of the files, the installer checks prerequisites, copies code, and configures db1, db2, and db3]

From master/slave replication... [diagram: db1, db2, and db3 each run a replicator with service alpha; tungsten-installer installs master and slaves on the whole cluster]

...to multi-master [diagram: db1 and db2 each run a replicator with services alpha and bravo]. tungsten-installer installs the master on db1 and the master on db2; configure-service installs the slave service on db1 and the slave service on db2.

tungsten-installer, master 1
    TUNGSTEN_HOME=/home/tungsten/installs/cookbook
    ./tools/tungsten-installer \
      --master-slave \
      --master-host=$master1 \
      --datasource-port=3306 \
      --datasource-user=tungsten \
      --datasource-password=secret \
      --datasource-log-directory=/var/lib/mysql \
      --service-name=alpha \
      --home-directory=$TUNGSTEN_HOME \
      --cluster-hosts=$master1 \
      --start
Creates service 'alpha'. Notice: --cluster-hosts has only one host.

tungsten-installer, master 2
    TUNGSTEN_HOME=/home/tungsten/installs/cookbook
    ./tools/tungsten-installer \
      --master-slave \
      --master-host=$master2 \
      --datasource-port=3306 \
      --datasource-user=tungsten \
      --datasource-password=secret \
      --datasource-log-directory=/var/lib/mysql \
      --service-name=bravo \
      --home-directory=$TUNGSTEN_HOME \
      --cluster-hosts=$master2 \
      --start
Creates service 'bravo'. Notice: --cluster-hosts has only one host.

Configure service, master 1
    TUNGSTEN_HOME=/home/tungsten/installs/cookbook
    $TUNGSTEN_HOME/tungsten/tools/configure-service -C --quiet \
      --host=$master1 \
      --datasource=$master1 \
      --local-service-name=alpha \
      --role=slave \
      --service-type=remote \
      --release-directory=$TUNGSTEN_HOME/tungsten \
      --skip-validation-check=thlstoragecheck \
      --master-thl-host=$master2 \
      --master-thl-port=2112 \
      --svc-start bravo
Notice: bravo is the master service on host 2.

Configure service, master 2
    TUNGSTEN_HOME=/home/tungsten/installs/cookbook
    $TUNGSTEN_HOME/tungsten/tools/configure-service -C --quiet \
      --host=$master2 \
      --datasource=$master2 \
      --local-service-name=bravo \
      --role=slave \
      --service-type=remote \
      --release-directory=$TUNGSTEN_HOME/tungsten \
      --skip-validation-check=thlstoragecheck \
      --master-thl-host=$master1 \
      --master-thl-port=2112 \
      --svc-start alpha
Notice: alpha is the master service on host 1.
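The pattern in the four commands above (one master service per node, plus one remote slave service for every other master) grows quickly with the number of nodes. The sketch below only prints the resulting plan and runs no installer. plan_all_masters is a hypothetical helper, and naming each service after its source host is an assumption borrowed from the topology diagrams rather than from the alpha/bravo commands.

```sh
#!/bin/sh
# plan_all_masters: hypothetical helper, for illustration only.
# For every node, print the local master service and one remote
# slave service per other master. Services are named after the
# host whose changes they carry (an assumption).
plan_all_masters() {
    for node in "$@"; do
        for origin in "$@"; do
            if [ "$node" = "$origin" ]; then
                echo "$node: master service $origin"
            else
                echo "$node: slave service $origin (remote, THL from $origin)"
            fi
        done
    done
}

plan_all_masters db1 db2
```

With N nodes the plan has N master services and N*(N-1) slave services, which is why extending an all-masters topology touches every existing node.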

From master/slave replication... [diagram: db1, db2, and db3 each run a replicator with service db1]
    ./cookbook/install_master_slave

How do I install fan-in replication? [diagram: db1 runs service db1, db2 runs service db2, and db3 runs services db1 and db2]
    ./cookbook/install_fan_in

How do I install multi-master? [diagram: db1 and db2 each run a replicator with services db1 and db2]
    ./cookbook/install_all_masters

How do I extend multi-master? [diagram: db1, db2, and db3 each run a replicator with services db1, db2, and db3]

How do I extend multi-master? [diagram: db1, db2, db3, and db4 each run a replicator with services db1, db2, db3, and db4]

How do I install a star topology? [diagram: db3 is the hub and runs services db1, db2, and db3; db1 runs services db1 and db3; db2 runs services db2 and db3]
    ./cookbook/install_star

How do I extend a star topology? [diagram: db4 joins as a spoke with services db4 and db3; the hub db3 gains service db4]

How do I extend a star topology? [diagram: db5 joins as a spoke with services db5 and db3; the hub db3 gains service db5]

BI-DIR, the painless way: edit cookbook/COMMON_NODES.sh; edit cookbook/USER_VALUES.sh; remove two nodes; edit the variables in cookbook/NODES_ALL_MASTERS.sh; run cookbook/install_all_masters.

Multiple masters, fan-in. Steps: install a master service on each node; install a slave service for each master on the fan-in node. Or: cookbook/install_fan_in

Multiple masters, star topology. Steps: install a master service on each server; on the hub, install a slave service for each spoke; on each spoke, install a slave service for the hub, using the bypass option. cookbook/install_star
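The star steps can be sketched as a plan generator. It prints which services each node needs and installs nothing. plan_star is a hypothetical helper, and the bypass option mentioned in the steps is not modelled because the slides do not spell it out.

```sh
#!/bin/sh
# plan_star: hypothetical helper, for illustration only.
# Usage: plan_star HUB SPOKE...
# The hub gets its own master service plus one slave service per spoke;
# each spoke gets its own master service plus one slave service for the hub.
plan_star() {
    hub=$1
    shift
    echo "$hub: master service $hub"
    for spoke in "$@"; do
        echo "$hub: slave service $spoke"
        echo "$spoke: master service $spoke"
        echo "$spoke: slave service $hub"
    done
}

plan_star db3 db1 db2
```

Each extra spoke adds three services (two on the new spoke, one on the hub), so a star grows linearly while an all-masters topology grows quadratically.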

Taking over from standard replication: cookbook/install_standard_replicaton, cookbook/takeover

Replication management

Common commands: replicator, trepctl, thl, and the Tungsten service schema

replicator: it's the service provider. You launch it once when you start; you may restart it when you change the configuration.

trepctl: Tungsten Replicator ConTroLler. It's the driving seat for your replication. You can start, update, and stop services, and you can get specific info.

trepctl (Tungsten Replicator controller): put services online or offline; check status; skip events; inspect internals; change roles; heartbeat; backup/restore; ... and a lot more.

thl (Transaction History Log): gives you access to the Tungsten transaction history logs.

thl (Transaction History Log) commands: info; index; list (total, a specific event, or a range); purge.

Tungsten service schema: one for each service, named "tungsten_" followed by the service name, e.g. tungsten_alpha, tungsten_dragon. Most important table: trep_commit_seqno.

Looking at the Tungsten service database
    select * from tungsten_dragon.trep_commit_seqno\G
    ******************* 1. row *******************
              task_id: 0
                seqno: 102
               fragno: 0
            last_frag: 1
            source_id: qa.r1.continuent.com
         epoch_number: 0
              eventid: mysql-bin.000002:0000000000018903;0
      applied_latency: 0
     update_timestamp: 2012-02-06 05:56:12
             shard_id: tungsten_dragon
    extract_timestamp: 2012-02-06 05:56:09

Where are the tools? In the tungsten directory:
    $TUNGSTEN_BASE/tungsten/tungsten-replicator/bin
        replicator   # the daemon
        trepctl      # replicator controller
        thl          # transaction history log tool

Starting and stopping the replicator
    cd $TUNGSTEN_BASE/tungsten/tungsten-replicator/bin
    ./replicator status
    Tungsten Replicator Service is running (PID:32400).
    ./replicator stop
    Stopping Tungsten Replicator Service...
    Stopped Tungsten Replicator Service.
    ./replicator start
    Starting Tungsten Replicator Service...
... or ./cookbook/replicator ...

Checking replicator vitals
    trepctl services
    Processing services command...
    NAME              VALUE
    ----              -----
    appliedlastseqno: -1      # bad sign?
    appliedlatency  : -1.0
    role            : slave
    servicename     : dragon
    servicetype     : local
    started         : true
    state           : ONLINE
    Finished services command...

Sending a heartbeat
    trepctl -host $MASTER_HOST heartbeat
    trepctl services
    Processing services command...
    NAME              VALUE
    ----              -----
    appliedlastseqno: 102
    appliedlatency  : 3.139
    role            : slave
    servicename     : dragon
    servicetype     : local
    started         : true
    state           : ONLINE
    Finished services command...

Replicator status (1)
    trepctl status
    Processing status command...
    NAME                     VALUE
    ----                     -----
    appliedlasteventid     : mysql-bin.000002:0000000000018903;0
    appliedlastseqno       : 102
    appliedlatency         : 3.139
    clustername            : default
    currenteventid         : NONE
    currenttimemillis      : 1328504342058
    dataserverhost         : qa.r4.continuent.com
    extensions             :
    latestepochnumber      : 0
    masterconnecturi       : thl://qa.r1.continuent.com:2112/
    masterlistenuri        : thl://qa.r4.continuent.com:2112/
    maximumstoredseqno     : 102
    minimumstoredseqno     : 0
    [...]

Replicator status (2)
    [...]
    offlinerequests        : NONE
    pendingerror           : NONE
    pendingerrorcode       : NONE
    pendingerroreventid    : NONE
    pendingerrorseqno      : -1
    pendingexceptionmessage: NONE
    resourceprecedence     : 99
    rmiport                : 10000
    role                   : slave
    seqnotype              : java.lang.Long
    servicename            : dragon
    servicetype            : local
    simpleservicename      : dragon
    sitename               : default
    sourceid               : qa.r4.continuent.com
    state                  : ONLINE
    timeinstateseconds     : 245.215
    uptimeseconds          : 245.539
    Finished status command...
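For monitoring scripts it is handy to pull single fields out of this "name : value" output. The sketch below reads status text on standard input; status_field and sample_status are hypothetical names, and the sample lines are copied from the slides so the example runs without a replicator. On a live node the same filter could be fed from trepctl status.

```sh
#!/bin/sh
# status_field: hypothetical helper. Reads "name : value" lines on stdin
# and prints the value of the requested field. Only the first ":" is
# treated as the separator, so values such as thl://host:2112/ survive.
status_field() {
    awk -v key="$1" '{
        pos = index($0, ":")
        if (pos == 0) next
        name = substr($0, 1, pos - 1)
        value = substr($0, pos + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        if (name == key) { print value; exit }
    }'
}

# sample_status: a few lines of the status output shown on the slides.
sample_status() {
    cat <<'EOF'
appliedlastseqno       : 102
appliedlatency         : 3.139
masterconnecturi       : thl://qa.r1.continuent.com:2112/
role                   : slave
state                  : ONLINE
EOF
}

echo "state:   $(sample_status | status_field state)"
echo "latency: $(sample_status | status_field appliedlatency)"
```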

A failover scenario I: MySQL native replication

1. One master, two slaves. Loading the employees test database.

2. Master goes away. Stop replication. Slaves are updated at different levels: on slave #2, select count(*) from titles returns 333,145; on slave #3 it returns 443,308.

3. Look into slave #2's binary logs: find the last transaction.

4. Look into slave #3's binary logs: 1. Find the transaction that was last in slave #2. 2. Recognize that last transaction in the log of slave #3 (this can actually take you a LOOOONG TIME). 3. Get the position immediately after this transaction (e.g. 134000 in file mysql-bin.000018).

5. Promote slave #3 to master. In slave #2:
    CHANGE MASTER TO
      master_host='slave_3_ip',
      master_user='slavename',
      master_password='slavepassword',
      master_log_file='mysql-bin.000018',
      master_log_pos=134000;

A failover scenario II: Tungsten Replicator

1. One master, two slaves. Loading the employees test database.

2. Master goes away. Stop replication. Slaves are updated at different levels: on slave #2, select count(*) from titles returns 333,145; on slave #3 it returns 443,308.

3. No need to find the last transaction
    # simply change roles
    trepctl -host slave3 setrole -role master
    trepctl -host slave2 setrole \
      -role slave -uri thl://slave3
    trepctl -host slave3 online
    State: ONLINE
    trepctl -host slave2 online
    State: GOING-ONLINE:SYNCHRONIZING

4. Check that the slave has synchronized
    # new master
    select seqno from tungsten.trep_commit_seqno;
    78
    # new slave
    select seqno from tungsten.trep_commit_seqno;
    64

4. Tell the replicator to hurry up
    # new master
    trepctl -node slave3 flush
    Master log is synchronized with database at log sequence number: 78
    # new slave
    trepctl -host slave2 wait -applied 78
    ONLINE
    select seqno from tungsten.trep_commit_seqno;
    78

4. ... and we're done
    # new master
    select count(*) from employees.titles
    count(*): 443308
    # new slave
    count(*): 443308

Planned role switch: cookbook/install_master_slave, cookbook/switch

Switching roles in master/slave replication (1) [diagram: db1, db2, and db3 each run a replicator with service db1; all three are online]

Switching roles in master/slave replication (2) [diagram: one replicator goes offline; the other two stay online]

Switching roles in master/slave replication (3): wait for transactions to be applied. [diagram: one replicator offline, two online]

Switching roles in master/slave replication (4): slaves go offline. [diagram: all three replicators are offline]

Switching roles in master/slave replication (5): slave is promoted. Notice: 2 masters, but offline.

Switching roles in master/slave replication (6): old master becomes slave. [diagram: all offline]

Switching roles in master/slave replication (7): slaves are directed to the new master. [diagram: all offline]

Switching roles in master/slave replication (8): all nodes go online, using the new master.
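The eight steps can be written down as the trepctl calls they imply. The sketch below only prints those calls, in the spirit of the DRY-RUN mode, and executes nothing. print_switch_plan is a hypothetical helper and is not the cookbook/switch script; the trepctl options it prints are the ones that appear in the failover slides, and the waiting step is left as a comment because the slides do not show which command the cookbook uses for it.

```sh
#!/bin/sh
# print_switch_plan: hypothetical helper, prints commands only.
# Usage: print_switch_plan OLD_MASTER NEW_MASTER [OTHER_SLAVE...]
print_switch_plan() {
    old_master=$1
    new_master=$2
    shift 2
    # Steps 2-3: take the master offline, let the slaves catch up.
    echo "trepctl -host $old_master offline"
    echo "# wait until every slave has applied the old master's last seqno"
    # Step 4: slaves go offline.
    for host in "$new_master" "$@"; do
        echo "trepctl -host $host offline"
    done
    # Step 5: promote the chosen slave.
    echo "trepctl -host $new_master setrole -role master"
    # Steps 6-7: old master and remaining slaves follow the new master.
    for host in "$old_master" "$@"; do
        echo "trepctl -host $host setrole -role slave -uri thl://$new_master"
    done
    # Step 8: everything back online, new master first.
    for host in "$new_master" "$old_master" "$@"; do
        echo "trepctl -host $host online"
    done
}

print_switch_plan db1 db2 db3
```

Roles are changed only while every service is offline, which is why step 5 can briefly show two masters without any risk of both accepting replication traffic.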

Tungsten GTID vs MySQL 5.6 GTID: what GTID is; how it works in Tungsten; how it works (or not) in MySQL 5.6.

Without global transaction ID [diagram: master A writes each commit to its binlog; slaves B and C each track their own binlog position]

With global transaction ID [diagram: master A and slaves B and C all identify the same commit as id#200]

Tungsten and global transaction ID: activation. None needed: it is active by default.

Tungsten and global transaction ID: status
    trepctl status
    Processing status command...
    NAME                     VALUE
    ----                     -----
    appliedlasteventid     : mysql-bin.000002:0000000000001442;0
    appliedlastseqno       : 6
    appliedlatency         : 0.862
    clustername            : default
    currenteventid         : NONE
    currenttimemillis      : 1354304680923
    dataserverhost         : qa.r4.continuent.com

Tungsten and global transaction ID: seeing transactions
    thl list -seqno 6
    SEQ# = 6 / FRAG# = 0 (last frag)
    - TIME = 2012-11-30 20:44:35.0
    - EPOCH# = 0
    - EVENTID = mysql-bin.000002:0000000000001442;0
    - SOURCEID = qa.r1.continuent.com
    - SQL(0) = insert into test.v1 values (1, 'inserted by node #1') /* SERVICE = [cookbook] */

Tungsten and global transaction ID: changing master connection
    trepctl offline
    trepctl online -seqno 105

Tungsten and global transaction ID: crash-safe slave tables
    mysql -e 'select * from tungsten_cookbook.trep_commit_seqno\G'
    *************************** 1. row ***************************
              task_id: 0
                seqno: 6
               fragno: 0
            last_frag: 1
            source_id: qa.r1.continuent.com
         epoch_number: 0
              eventid: mysql-bin.000002:0000000000001442;0
      applied_latency: 0
     update_timestamp: 2012-11-30 20:44:35
             shard_id: test
    extract_timestamp: 2012-11-30 20:44:35

Tungsten and global transaction ID: crash-safe tables and parallel replication
    mysql -e 'select seqno, source_id, shard_id, update_timestamp from tungsten_cookbook.trep_commit_seqno'
    +-------+----------------------+----------+---------------------+
    | seqno | source_id            | shard_id | update_timestamp    |
    +-------+----------------------+----------+---------------------+
    |     7 | qa.r1.continuent.com | db1      | 2012-11-30 20:54:14 |
    |     8 | qa.r1.continuent.com | db2      | 2012-11-30 20:54:14 |
    |     9 | qa.r1.continuent.com | db3      | 2012-11-30 20:54:14 |
    |    10 | qa.r1.continuent.com | db4      | 2012-11-30 20:54:14 |
    |    11 | qa.r1.continuent.com | db5      | 2012-11-30 20:54:14 |
    |    12 | qa.r1.continuent.com | db6      | 2012-11-30 20:54:14 |
    |    13 | qa.r1.continuent.com | db7      | 2012-11-30 20:54:14 |
    |    14 | qa.r1.continuent.com | db8      | 2012-11-30 20:54:14 |
    |    15 | qa.r1.continuent.com | db9      | 2012-11-30 20:54:14 |
    |    16 | qa.r1.continuent.com | db10     | 2012-11-30 20:54:14 |
    +-------+----------------------+----------+---------------------+

MySQL 5.6 and global transaction ID: activation
    mysqld --log-slave-updates \
      --gtid-mode=on \
      --enforce-gtid-consistency
WARNING: before MySQL 5.6.10, it was --disable-gtid-unsafe-statements

MySQL 5.6 and global transaction ID: seeing transactions
    #121203 11:15:49 server id 1 end_log_pos 344 CRC32 0x45b25c8f GTID [commit=yes]
    SET @@SESSION.GTID_NEXT= '7A77A490-3D3A-11E2-8CC9-7DCF9991097B:2'/*!*/;
    # at 344
    #121203 11:15:49 server id 1 end_log_pos 423 CRC32 0x873c8fac Query thread_id=3 exec_time=0 error_code=0
    SET TIMESTAMP=1354533349/*!*/;
    BEGIN
    /*!*/;
    # at 423
    #121203 11:15:49 server id 1 end_log_pos 522 CRC32 0xb4bf4372 Query thread_id=3 exec_time=0 error_code=0
    SET TIMESTAMP=1354533349/*!*/;
    insert into t1 values (1)

MySQL 5.6 and global transaction ID: status
    show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 127.0.0.1
    Master_User: rsandbox
    Master_Port: 13233
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000002
    Read_Master_Log_Pos: 1837
    Relay_Log_File: mysql_sandbox13234-relay-bin.000005
    Relay_Log_Pos: 2047
    Relay_Master_Log_File: mysql-bin.000002
    ...
    Retrieved_Gtid_Set: 46E13434-3B28-11E2-BF47-6C626DA07446:1-7
    Executed_Gtid_Set: 46E13434-3B28-11E2-BF47-6C626DA07446:1-7

MySQL 5.6 and global transaction ID: changing master connection
    CHANGE MASTER TO master_log_file='mysql-bin-000003', master_log_pos='1234'
    # No global transaction ID is used

MySQL 5.6 and global transaction ID: crash-safe slave table
    select * from slave_relay_log_info\G
    ********************* 1. row ********************
    Number_of_lines: 7
    Relay_log_name: ./mysql_sandbox13234-relay-bin.000005
    Relay_log_pos: 2047
    Master_log_name: mysql-bin.000002
    Master_log_pos: 1837
    Sql_delay: 0
    Number_of_workers: 5
    Id: 1
    # NO global transaction ID is used!

MySQL 5.6 and global transaction ID: crash-safe slave table + parallel
    select * from mysql.slave_worker_info\G
    Id: 12
    Relay_log_name: ./mysql_sandbox13234-relay-bin.000007
    Relay_log_pos: 4299
    Master_log_name: mysql-bin.000002
    Master_log_pos: 7155
    Checkpoint_relay_log_name: ./mysql_sandbox13234-relay-bin.000007
    Checkpoint_relay_log_pos: 1786
    Checkpoint_master_log_name: mysql-bin.000002
    Checkpoint_master_log_pos: 4642
    Checkpoint_seqno: 9
    Checkpoint_group_size: 64
    Checkpoint_group_bitmap: ?
    # NO global transaction ID is used!

Filters

Tungsten replication service pipeline [diagram: master DBMS -> stage (extract, filter, apply) -> Transaction History Log -> stage (extract, filter, apply) -> in-memory queue -> stage (extract, filter, apply) -> slave DBMS]

Restrict replication to some schemas and tables
    ./tools/tungsten-installer \
      --master-slave -a \
      ...
      --svc-extractor-filters=replicate \
      "--property=replicator.filter.replicate.do=test,*.foo" \
      ...
      --start-and-report
    # test="test.*" -> same drawback as binlog-do-db in MySQL
    # *.foo = table 'foo' in any database
    # employees.dept_codes,employees.salaries => safest way

Exclude some schemas and tables from replication
    ./tools/tungsten-installer \
      --master-slave -a \
      ...
      --svc-extractor-filters=replicate \
      "--property=replicator.filter.replicate.ignore=test,*.foo" \
      ...
      --start-and-report
    # test="test.*" -> same drawback as binlog-ignore-db in MySQL
    # *.foo = table 'foo' in any database
    # employees.dept_codes,employees.salaries => safest way
    # DO NOT MIX .do and .ignore!
    # (you can do it, but it may not do what you mean)
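To see what a list such as test,*.foo selects, the matching rules from the comments above can be imitated with shell patterns. This is only an illustration of the semantics described on the slides (a bare schema name means schema.*, and *.foo means table foo in any schema); matches_filter is a hypothetical helper and is not how the replicate filter is implemented.

```sh
#!/bin/sh
# matches_filter: hypothetical helper, illustration only.
# Usage: matches_filter SCHEMA.TABLE "pattern1,pattern2,..."
# Succeeds (exit 0) when the table matches at least one pattern.
matches_filter() {
    target=$1
    old_ifs=$IFS
    IFS=,
    set -f                       # keep "*" from expanding to file names
    for pattern in $2; do
        case $pattern in
            *.*) ;;                          # already schema.table
            *)   pattern="$pattern.*" ;;     # bare schema means schema.*
        esac
        case $target in
            $pattern) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

for table in test.t1 hr.foo hr.bar; do
    if matches_filter "$table" "test,*.foo"; then
        echo "$table: selected"
    else
        echo "$table: not selected"
    fi
done
```

With the .do property the selected tables are the ones replicated; with .ignore they are the ones skipped, which is one reason mixing the two lists is easy to get wrong.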

Change name of replicated schema -a --svc-applier-filters=dbtransform \ --property=replicator.filter.dbtransform.from_regex1=stores \ --property=replicator.filter.dbtransform.to_regex1=playground # from_regex1=stores -> name of the schema in the master # to_regex1=playground -> name of the schema in the slave # WARNING: requires "USE schema_name" to work properly. 128 128

Multi-master: Con#ict prevention 129 129

CONFLICTS
Continuent 2012

What's a conflict

Data modified by several sources (masters)
Creates one or more of:
- data loss (unwanted delete)
- data inconsistency (unwanted update)
- duplicated data (unwanted insert)
- replication break

Data duplication

Masters: alpha, bravo, charlie

id  name   amount
1   Joe    100
2   Frank  110
3   Sue    100

alpha inserts:    4  Matt  140
charlie inserts:  4  Matt  130

BREAKS REPLICATION

auto_increment offsets are not a remedy

A popular recipe: auto_increment_increment + auto_increment_offset
- They don't prevent conflicts
- They hide duplicates

Hidden data duplication

Masters: alpha (offset 1), bravo (offset 2), charlie (offset 3)

id  name   amount
1   Joe    100
2   Frank  110
3   Sue    100

alpha INSERT:    11  Matt  140
charlie INSERT:  13  Matt  130
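The effect can be reproduced with a small simulation. This is a simplified model of how auto_increment_offset and auto_increment_increment pick the next key, not MySQL code; the increment of 10 is an assumption chosen because it yields the ids 11 and 13 shown on the slide, and the helper name `next_auto_increment` is hypothetical.

```python
def next_auto_increment(current_max: int, offset: int, increment: int) -> int:
    """Smallest value of the form offset + n*increment that is > current_max."""
    n = max(0, (current_max - offset) // increment + 1)
    return offset + n * increment


INCREMENT = 10  # assumed value; the slide shows only the offsets
OFFSETS = {"alpha": 1, "bravo": 2, "charlie": 3}

# Rows present on every master before the concurrent inserts.
table = {1: ("Joe", 100), 2: ("Frank", 110), 3: ("Sue", 100)}
current_max = max(table)

# alpha and charlie insert the same person at the same time.
alpha_id = next_auto_increment(current_max, OFFSETS["alpha"], INCREMENT)
charlie_id = next_auto_increment(current_max, OFFSETS["charlie"], INCREMENT)
table[alpha_id] = ("Matt", 140)
table[charlie_id] = ("Matt", 130)

# No primary key collision, so replication keeps running,
# but the merged table now holds two different rows for Matt.
matt_rows = {k: v for k, v in table.items() if v[0] == "Matt"}
print(matt_rows)
```

Because the keys differ, no server raises a duplicate-key error, and the logical duplicate goes unnoticed.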

Data inconsistency

Masters: alpha, bravo, charlie

id  name   amount
1   Joe    100
2   Frank  110
3   Sue    100

alpha UPDATE:    3  Sue  108
charlie UPDATE:  3  Sue  105

Data loss

Masters: alpha, bravo, charlie

id  name   amount
1   Joe    100
2   Frank  110
3   Sue    100

alpha UPDATE:    3  Sue  108
charlie DELETE:  record #3

MAY BREAK REPLICATION

Conflict handling strategies

resolving (planned for future use)
- after the fact
- needs information that is missing in async replication
avoiding
- requires synchronous replication with 2PC
preventing (used by Tungsten)
- setting and enforcing a split sources policy
transforming and resolving (planned for future use)
- all records are converted to INSERTs
- conflicts are resolved within a given time window

Multi-master: Conflict prevention

Tungsten conflict prevention in a nutshell

1. Define the rules (which master can update which database)
2. Tell Tungsten the rules
3. Define the policy (error, drop, warn, or accept)
4. Let Tungsten enforce your rules

Tungsten conflict prevention facts

- Sharded by database
- Defined dynamically
- Applied on the slave services
- Methods:
    error: make replication fail
    drop: drop silently
    warn: drop with warning

Tungsten conflict prevention applicability

unknown shards
- the schema being updated is not planned
- actions: accept, drop, warn, error
unwanted shards
- the schema is updated from the wrong master
- actions: accept, drop, warn, error
whitelisted shards
- can be updated by any master

Conflict prevention directives

--svc-extractor-filters=shardfilter
replicator.filter.shardfilter.unknownshardpolicy=error
replicator.filter.shardfilter.unwantedshardpolicy=error
replicator.filter.shardfilter.enforcehomes=false
replicator.filter.shardfilter.allowwhitelisted=false
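How the unknown, unwanted, and whitelisted cases lead to an action can be sketched as a small decision function. This is a hypothetical model written for illustration, not Tungsten's shard filter: the function name `decide` is invented, the shard map uses the database ownership from the star-topology example in this deck, and the `enforcehomes` and `allowwhitelisted` switches are deliberately left out.

```python
# Assumed shard map: database -> master allowed to update it.
SHARD_MAP = {
    "employees": "alpha",
    "buildings": "bravo",
    "vehicles": "charlie",
    "test": "whitelisted",
}


def decide(shard_id, service, unknown_policy="error", unwanted_policy="error"):
    """Return the action for an event arriving on a slave service.

    `service` is the name of the slave service, which in this deck
    matches the name of the master it replicates from.
    """
    master = SHARD_MAP.get(shard_id)
    if master is None:
        return unknown_policy      # unknown shard: schema not planned
    if master == "whitelisted" or master == service:
        return "accept"            # any master, or the rightful master
    return unwanted_policy         # unwanted shard: wrong master


print(decide("employees", "alpha"))   # update comes from the owning master
print(decide("employees", "bravo"))   # bravo is not allowed to update employees
print(decide("test", "bravo"))        # whitelisted shard
print(decide("payroll", "alpha", unknown_policy="warn"))  # not in the map
```

With both policies set to error, as in the directives above, an update to employees arriving through the bravo service stops that service instead of being applied.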

Conflict prevention in a star topology: alpha updates employees

[Diagram]
Host1: master alpha, database employees (services A, C)
Host2: master bravo, database buildings (services B, C)
Host3: master charlie (hub), database vehicles (services A, B, C)

Conflict prevention in a star topology: alpha updates vehicles

[Diagram]
Host1: master alpha, database employees (services A, C)
Host2: master bravo, database buildings (services B, C)
Host3: master charlie (hub), database vehicles (services A, B, C)

Conflict prevention in an all-masters topology: alpha updates employees

[Diagram]
Host1: master alpha, database employees (services A, B, C)
Host2: master bravo, database buildings (services A, B, C)
Host3: master charlie, database vehicles (services A, B, C)

Conflict prevention in an all-masters topology: charlie updates vehicles

[Diagram: same hosts and services as above]

Conflict prevention in an all-masters topology: bravo updates employees

[Diagram: same hosts and services as above]

Conflict prevention in an all-masters topology: charlie updates employees

[Diagram: same hosts and services as above]

Setting conflict prevention rules

trepctl -host host1 -service charlie \
    shard -insert < shards.map

cat shards.map
shard_id    master       critical
personnel   alpha        false
buildings   bravo        false
vehicles    charlie      false
test        whitelisted  false

# charlie is slave service in host 1

Setting conflict prevention rules

trepctl -host host2 -service charlie \
    shard -insert < shards.map

cat shards.map
shard_id    master       critical
personnel   alpha        false
buildings   bravo        false
vehicles    charlie      false
test        whitelisted  false

# charlie is slave service in host 2

Setting conflict prevention rules

trepctl -host host3 -service alpha \
    shard -insert < shards.map
trepctl -host host3 -service bravo \
    shard -insert < shards.map

cat shards.map
shard_id    master       critical
personnel   alpha        false
buildings   bravo        false
vehicles    charlie      false
test        whitelisted  false

# alpha and bravo are slave services in host 3

Conflict prevention demo reminder

- Server #1 can update "employees"
- Server #2 can update "buildings"
- Server #3 can update "vehicles"

Sample correct operation (1)

mysql #1> create table employees.names(... )

# all servers receive the table
# all servers keep working well

Sample correct operation (2)

mysql #2> create table buildings.homes(... )

# all servers receive the table
# all servers keep working well

Sample incorrect operation (1)

mysql #2> create table employees.nicknames(... )

# Only server #2 receives the table
# slave service in hub gets an error
# slave service in #1 does not receive anything

Sample incorrect operation (2)

#3 $ trepctl services | simple_services
alpha    [slave]   seqno:  7 - latency:  0.136 - ONLINE
bravo    [slave]   seqno: -1 - latency: -1.000 - OFFLINE:ERROR
charlie  [master]  seqno: 66 - latency:  0.440 - ONLINE

Sample incorrect operation (3)

#3 $ trepctl -service bravo status
NAME                     VALUE
----                     -----
appliedlasteventid     : NONE
appliedlastseqno       : -1
appliedlatency         : -1.0
(...)
offlinerequests        : NONE
pendingerror           : Stage task failed: q-to-dbms
pendingerrorcode       : NONE
pendingerroreventid    : mysql-bin.000002:0000000000001241;0
pendingerrorseqno      : 7
pendingexceptionmessage: Rejected event from wrong shard: seqno=7 shard ID=employees shard master=alpha service=bravo
(...)

Fixing the issue

mysql #1> drop table if exists employees.nicknames;
mysql #1> create table if not exists employees.nicknames (... ) ;

#3 $ trepctl -service bravo online -skip-seqno 7

# all servers receive the new table

Sample whitelisted operation

mysql #2> create table test.hope4best(... )
mysql #1> insert into test.hope4best values (... )

# REMEMBER: 'test' was explicitly whitelisted
# All servers get the new table and records
# But there is no protection against conflicts

Administration

Viewing THL events

thl info
log directory = /home/tungsten/installs/master_slave/thl/dragon/
min seq# = 0
max seq# = 101
events = 101

Viewing THL events

thl index
LogIndexEntry thl.data.0000000001(0:102)

Viewing THL events

thl index
[...]
LogIndexEntry thl.data.0000000001(0:18)
LogIndexEntry thl.data.0000000002(19:33)
LogIndexEntry thl.data.0000000003(34:35)
LogIndexEntry thl.data.0000000004(36:3641)
LogIndexEntry thl.data.0000000005(3642:3712)
LogIndexEntry thl.data.0000000006(3713:3838)
LogIndexEntry thl.data.0000000007(3839:3949)
LogIndexEntry thl.data.0000000008(3950:4011)
LogIndexEntry thl.data.0000000009(4012:4039)
LogIndexEntry thl.data.0000000010(4040:4057)
LogIndexEntry thl.data.0000000011(4058:4067)
LogIndexEntry thl.data.0000000012(4068:4073)
LogIndexEntry thl.data.0000000013(4074:4085)
LogIndexEntry thl.data.0000000014(4086:4095)
LogIndexEntry thl.data.0000000015(4096:4101)
LogIndexEntry thl.data.0000000016(4102:4111)
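Each index entry gives a THL file name and the first and last sequence number it holds. A minimal sketch, assuming the `LogIndexEntry name(first:last)` layout shown in this output, of how to find the file that contains a given seqno; the function names are hypothetical and the sample is a shortened copy of the listing.

```python
import re

# One entry looks like: LogIndexEntry thl.data.0000000004(36:3641)
INDEX_ENTRY = re.compile(r"LogIndexEntry\s+([\w.]+)\((\d+):(\d+)\)")


def parse_index(text):
    """Return a list of (file_name, first_seqno, last_seqno) tuples."""
    return [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in INDEX_ENTRY.finditer(text)]


def file_for_seqno(entries, seqno):
    """Return the THL file whose range contains seqno, or None."""
    for name, first, last in entries:
        if first <= seqno <= last:
            return name
    return None


SAMPLE = """
LogIndexEntry thl.data.0000000001(0:18)
LogIndexEntry thl.data.0000000002(19:33)
LogIndexEntry thl.data.0000000003(34:35)
LogIndexEntry thl.data.0000000004(36:3641)
LogIndexEntry thl.data.0000000005(3642:3712)
"""

entries = parse_index(SAMPLE)
print(file_for_seqno(entries, 102))  # prints thl.data.0000000004
```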

Viewing THL events

thl list -seqno 102
[...]
SEQ# = 102 / FRAG# = 0 (last frag)
- TIME = 2012-02-06 05:56:09.0
- EPOCH# = 0
- EVENTID = mysql-bin.000002:0000000000018903;0
- SOURCEID = qa.r1.continuent.com
- METADATA = [mysql_server_id=10;is_metadata=true;service=dragon;shard=tungsten_dragon;heartbeat=none]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1, foreign_key_checks = 1, unique_checks = 1, sql_mode = 'IGNORE_SPACE', character_set_client = 8, collation_connection = 8, collation_server = 8]
- SCHEMA = tungsten_dragon
- SQL(0) = UPDATE tungsten_dragon.heartbeat SET source_tstamp= "2012-02-06 05:56:09", salt= 2, name= "NONE" WHERE id= 1 /* SERVICE = [dragon] */

Skipping a THL event

trepctl online -skip-seqno 1092
trepctl online -skip-seqno 1092,1093,1094

# see example

Adding a member

Let's see the cookbook, and use it

Parallel replication

Replicator Pipeline Architecture

[Diagram. Elements shown: Tungsten Replicator process; a pipeline of three stages (Extract and Apply; Extract, Assign Shard ID, and Apply; several parallel Extract and Apply pairs labeled "channels"); MySQL Binlog; Transaction History Log (THL); Parallel Queue; shard.list file; Slave DBMS.]

Parallel replication facts

- Sharded by database
- Good choice for slave lag problems
- Bad choice for single database projects
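Why sharding by database helps with many schemas and does nothing for a single one can be shown with a toy model. The sketch below is illustrative only: mapping a shard (database name) to a channel with a CRC32 hash is an assumption made for this example and is not claimed to be Tungsten's partitioning method, and the names `channel_for` and `channels_in_use` are hypothetical.

```python
import zlib
from collections import Counter


def channel_for(shard_id: str, channels: int) -> int:
    """Assumed mapping: a stable hash of the database name picks the channel."""
    return zlib.crc32(shard_id.encode("utf-8")) % channels


def channels_in_use(databases, channels):
    """Count how many databases land on each channel."""
    return Counter(channel_for(db, channels) for db in databases)


CHANNELS = 10
many_databases = [f"db{i}" for i in range(30)]  # hypothetical schema names
single_database = ["shop"]                      # hypothetical schema name

# All transactions of one database stay on one channel, so their order is kept.
print(channels_in_use(many_databases, CHANNELS))
# A single-database project can never use more than one channel.
print(channels_in_use(single_database, CHANNELS))
```

In this model every transaction for a given database goes through the same channel, so work spreads out only when the load is spread over several databases.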

Parallel replication test

- Concurrent sysbench on 30 databases, running for 1 hour
- TOTAL DATA: 130 GB
- RAM per server: 20 GB
- MySQL slave: STOPPED
- Tungsten slave: OFFLINE (direct: alpha (slave), replicator alpha)
- Slaves will have 1 hour lag

[Diagram: binary logs, MySQL slave, Tungsten slave]

Measuring results

- MySQL slave: START
- Tungsten slave: ONLINE (direct: alpha (slave), replicator alpha)
- Recording catch-up time

[Diagram: binary logs, MySQL slave, Tungsten slave]

MySQL native replication slave catch up in 04:29:30

Tungsten parallel replication slave catch up in 00:55:40
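The two catch-up times reported for this test (04:29:30 for native replication, 00:55:40 for Tungsten parallel replication) can be turned into a speedup factor with simple arithmetic. The helper name `to_seconds` is just for illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


native = to_seconds("04:29:30")    # MySQL native replication
tungsten = to_seconds("00:55:40")  # Tungsten parallel replication
speedup = native / tungsten

print(f"native: {native} s, Tungsten: {tungsten} s, speedup: {speedup:.2f}x")
# prints: native: 16170 s, Tungsten: 3340 s, speedup: 4.84x
```

In this test the Tungsten slave caught up a little under five times faster than the native slave.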

Parallel replication made simpler

FROM HERE...

TO HERE

Parallel replication direct slave facts

- No need to install Tungsten on the master
- Tungsten runs only on the slave
- Replication can revert to native slave with two commands (trepctl offline; start slave)
- Native replication can continue on other slaves
- Failover (either native or Tungsten) becomes a manual task

Installing parallel replication

MORE_OPTIONS='--channels=10' ./cookbook/install_master_slave

Checking parallel replication

trepctl status
trepctl status -name tasks
trepctl status -name shards
trepctl status -name stores

Parallel replication demo

Troubleshooting

Identify the failed component

Steps
1. trepctl services
2. trepctl -service SVC_NAME status
3. Look at the logs
4. Take action

Reading the logs

ls $TUNGSTEN_BASE/tungsten/tungsten-replicator/logs/
trepsvc.log  user.log

... or

./cookbook/show_log

# let's see it in practice

Parting thoughts

Open source Tungsten Replicator now includes Oracle-to-MySQL and Oracle-to-Oracle extractors and appliers!

560 S. Winchester Blvd., Suite 500
San Jose, CA 95128
Tel +1 (866) 998-3642
Fax +1 (408) 668-1009
e-mail: sales@continuent.com

Our Blogs:
http://scale-out-blog.blogspot.com
http://datacharmer.blogspot.com
http://flyingclusters.blogspot.com

Continuent Website: http://www.continuent.com
Tungsten Replicator 2.0: http://code.google.com/p/tungsten-replicator

Continuent 2012