MySQL Cluster Deployment Best Practices




MySQL Cluster Deployment Best Practices
Johan ANDERSSON, MySQL Cluster Practice Manager
Joffrey MICHAÏE, MySQL Consultant

The presentation is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Agenda
- Cluster Setup
  - Minimal & recommended setup
  - Networking & hardware selection
  - Disk data tables
  - Configuration
- Administration
  - Online/offline operations
  - Backup and restore
  - Monitoring

Minimal Setup
- 4 computers in total: 2 data nodes, plus 2 machines each running a MySQL server and a management node (co-located with the application/web servers).
- Never co-locate a data node and a management node on the same machine: losing that machine can cause split brain / network partitioning.
- Caveats of the minimal setup: no redundant network (single point of failure) and no disaster recovery plan (DRP).

Recommended Setup
- Clients connect through redundant load balancer(s).
- Redundant switches, with NIC bonding on every node.
- SQL + management (+ application/web server) machines, separate from the data nodes.

Networking
- Dedicated >= 1 Gb/s networking.
- Prevent network failures: 2 NICs with bonding.
- Use low-latency networking (Dolphin...) if you have >= 8 data nodes or want higher throughput and lower latency.
- Use a dedicated network for cluster communication.
- There is no security layer to the management node (remote shutdown is allowed...), so enable access to port 1186 only from cluster nodes and administrators.

Hardware Selection - RAM & CPU
Storage layer (data nodes):
- CPU: one data node can (7.0+) use 8 cores; 2 x 4-core (Nehalem works really well). A fast CPU means fast processing of messages.
- RAM: as much as you need. With two replicas, a 10 GB data set requires 20 GB of cluster-wide DataMemory; each node then needs 2 x 10 GB / (number of data nodes). With 2 data nodes that is 10 GB of DataMemory per node, so machines with 16 GB of RAM are a good fit.
- Disk space: 10 x DataMemory, plus space for BACKUP, plus tablespace files (if disk data tables are used).
SQL layer (MySQL servers):
- CPU: 2-16 cores.
- RAM: not so important, 4 GB is enough (depends on connections and buffers).
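The per-node RAM rule above can be sketched as a small calculation. The 25% headroom factor for buffers and indexes is an assumption for illustration, not a number from the slide:

```python
def data_node_ram_gb(dataset_gb, num_data_nodes, replicas=2, headroom=1.25):
    """Per-node RAM estimate for NDB data nodes: the data set is stored
    `replicas` times and spread across the data nodes; `headroom` adds
    room for buffers and indexes (assumed factor, tune to taste)."""
    return dataset_gb * replicas / num_data_nodes * headroom

# 10 GB data set, 2 data nodes, 2 replicas:
# 10 * 2 / 2 * 1.25 = 12.5 GB per node, so 16 GB machines fit comfortably.
print(data_node_ram_gb(10, 2))
```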

Hardware Selection - Disk Subsystem
Each tier holds the LCP (local checkpoint) and REDO log:
- Low-end: 1 x SATA 7200 RPM. For read-mostly, write-not-so-much workloads. No disk redundancy (but the other data node is the mirror).
- Mid-end: 1 x SAS 10K RPM. Heavy duty (many MB/s). No disk redundancy (but the other data node is the mirror).
- High-end: 4 x SAS 10K RPM. Heavy duty (many MB/s), disk redundancy (RAID 1+0), hot swap.
REDO log, LCP, and BACKUP are written sequentially in small chunks (256 KB). If possible, use ODirect=1.

Filesystem
- Most customers use ext3 (Linux) or UFS (Solaris).
- ext2 can be an option (but recovery takes longer).
- XFS: we haven't much experience with it yet.
- ZFS: you must separate the journal (ZIL) from the filesystem.
- Mount with noatime.
- Raw devices are not supported.

Hardware Selection - Disk Data Storage
- Minimal recommended: 2 x SAS 10K RPM (preferably) - one disk for LCP + REDO log, one for UNDO log + tablespace.
- High-end: 4 x SAS 10-15K RPM (preferably) - separate disks for LCP, REDO log, UNDO log, and tablespaces.
- Use the high-end layout for heavy read/write of data (1000's of 10 KB records per second, e.g. content delivery platforms).
- SSD for the tablespace is also interesting; not much experience with this yet.
- Having the tablespace on a separate disk is good for read performance.
- Enable WRITE_CACHE on the devices holding the REDO log / UNDO log.

Configuration - Disk Data Storage
- Use disk data tables for simple accesses (read/write on the primary key).
- As with InnoDB, you can easily become DISK BOUND (check iostat).
- DiskPageBufferMemory=3072M is a good start if you rely a lot on disk data. Like the InnoDB buffer pool, set it as high as you can: it increases the chance that a page is cached.
- SharedGlobalMemory=384M-1024M.
- UNDO_BUFFER=64M to 128M (if you write a lot). You cannot change this buffer later! It is specified at LOGFILE GROUP creation time.
- DiskIOThreadPool=[8..16] (introduced in 7.0).
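Since the undo buffer cannot be resized later, size it when the log file group is created. A sketch (file name and initial size are illustrative):

```sql
-- UNDO_BUFFER_SIZE is fixed at LOGFILE GROUP creation time,
-- so choose it up front (64M-128M for write-heavy workloads).
CREATE LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_1.log'
    INITIAL_SIZE 512M
    UNDO_BUFFER_SIZE 128M
    ENGINE NDBCLUSTER;
```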

Configuration - General
- Set MaxNoOfExecutionThreads <= number of cores, otherwise contention will occur -> unexpected behaviour.
- RedoBuffer=32-64M. If you need to set it higher, your disks are probably too slow.
- FragmentLogFileSize=256M.
- NoOfFragmentLogFiles = 6 x DataMemory (in MB) / (4 x 256 MB). The most common issue: customers never configure a big enough redo log.
- The above parameters (and others, also for MySQL) can be generated with www.severalnines.com/config
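The redo-log sizing rule above, as a quick calculation:

```python
def no_of_fragment_log_files(data_memory_mb, fragment_log_file_size_mb=256):
    """Slide rule of thumb: the redo log should hold about 6 x DataMemory,
    and fragment log files come in sets of 4 of FragmentLogFileSize each."""
    return 6 * data_memory_mb // (4 * fragment_log_file_size_mb)

# 8 GB DataMemory -> 6 * 8192 / (4 * 256) = 48 fragment log files
print(no_of_fragment_log_files(8192))
```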

Application : Primary Keys
- ALWAYS DEFINE A PRIMARY KEY ON THE TABLE! This avoids problems with cluster-to-cluster replication, recovery, and application behaviour (KEY NOT FOUND, etc.).
- A hidden primary key is added if no PK is specified. BUT it is NOT recommended: the hidden primary key is, for example, not replicated between clusters!
- There are known problems in this area, so avoid them: always have at least id BIGINT AUTO_INCREMENT PRIMARY KEY, even if your application doesn't need it.
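A minimal illustration (table and columns are hypothetical):

```sql
-- Explicit surrogate key, even if the application never reads it.
CREATE TABLE session_store (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    payload VARBINARY(1024)
) ENGINE=NDBCLUSTER;
```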

Application : Query Cache
- Don't cache everything in the query cache: it is very expensive to invalidate across X MySQL servers. A write on one server forces the others to purge their cache.
- If you have tables that are read-only (or change very seldom), set query_cache_type=2 (DEMAND) in my.cnf and use SELECT SQL_CACHE <cols> ... FROM table.
- This can be good for STATIC data.
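For example (hypothetical table), with query_cache_type=2 only explicitly tagged statements are cached:

```sql
-- Cached on demand; writes to country_codes are rare, so cache
-- invalidation storms across the mysqld nodes are not a concern.
SELECT SQL_CACHE code, name FROM country_codes;
```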

Application : Transactions
- Failed transactions must be retried by the application.
- If the REDO log or RedoBuffer is full, the transaction is aborted. This differs from InnoDB behaviour.
- Other resources/timeouts can abort transactions too:
  - "Lock wait timeout": the transaction aborts after TransactionDeadlockDetectionTimeout.
  - MaxNoOfConcurrent[Operations/Transactions] exhausted: you can then increase this.
  - Node failure / node restart causes ongoing transactions to abort.

Application : Transactions (large updates)
- Remember NDB is designed for many short transactions.
- You are recommended to UPDATE/DELETE in small chunks: use LIMIT 10000 and repeat until all records are updated/deleted.
- MaxNoOfConcurrentOperations sets the upper limit on how many records can be modified simultaneously on one data node. MaxNoOfConcurrentOperations=1000000 will use 1 GB of RAM.
- Even though larger transactions are possible, we recommend DELETE/UPDATE in smaller chunks.
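The chunked-delete pattern, sketched (table and predicate are hypothetical; the application re-issues the statement until it affects 0 rows):

```sql
-- Repeat until ROW_COUNT() = 0; each pass stays well below
-- MaxNoOfConcurrentOperations on the data nodes.
DELETE FROM event_log WHERE expired = 1 LIMIT 10000;
```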

Application : Table Locks
- FLUSH TABLES WITH READ LOCK and LOCK TABLES <table> READ only lock the table(s) on the LOCAL MySQL server.
- For a cluster-wide lock, you must take the lock on all MySQL servers.

Application : Schema Operations
- Avoid frequent CREATE/DROP TABLE of NDB tables: it is a heavy operation within the cluster and takes much longer than with standard MySQL.

Log Only What Needs To Be Recovered
- Some tables account for a lot of WRITEs but do not need to be recovered (e.g. session tables). Such tables often do not need to be REDO-logged or checkpointed.
- Create these tables as 'NO LOGGING' tables:

mysql> set @ndb_curr_val=@@ndb_table_no_logging;
mysql> set ndb_table_no_logging=1;
mysql> create table session_table(..) engine=ndb;
mysql> set ndb_table_no_logging=@ndb_curr_val;

- 'session_table' will not be REDO-logged or checkpointed: no disk activity for this table!
- After a system restart the table will still exist, but empty!

Administration Layer
- Introduce a MySQL server for administration purposes! It should never, ever receive application requests.
- It simplifies heavy (non-online) schema changes.
- Give it an explicit node id:

# config.ini:
[mysqld]
id=8
hostname=x

# my.cnf:
ndb_connectstring=nodeid=8;x,y
ndb_cluster_connection_pool=1

Administration Layer
- Modifying a schema is NOT online when you: rename a table; change a data type; change storage size; drop a column; rename a column; add/drop a PRIMARY KEY.
- Altering a 1 GB table requires 1 GB of free DataMemory (copying alter).
- Online (and ok to do with transactions ongoing): add column (ALTER ONLINE ...); CREATE INDEX; online add node (see my presentation from last year for how to do it).

Administration Layer
Non-online DDL (ALTER TABLE etc.) is performed on the admin layer:
1. Block traffic from the SQL layer to the data nodes:
   ndb_mgm> ENTER SINGLE USER MODE 8
   Only the admin mysqld (node id 8) is now connected to the data nodes.
2. Perform the heavy ALTER on the admin layer.
3. Allow traffic from the SQL layer to the data nodes again:
   ndb_mgm> EXIT SINGLE USER MODE

Administration Layer
- You can also set up MySQL replication from the admin layer to the SQL layer, replicating only the mysql database (binlog_do_db=mysql).
- GRANTs, stored procedures etc. will then be replicated, keeping the SQL layer aligned.

Online Upgrades
Change online:
- OS, software version (7.0.x -> 7.1.x)
- Configuration: e.g. increase DataMemory, IndexMemory, buffers, redo log, [mysqld] slots etc.
- Hardware (add more RAM etc.)
These procedures require a rolling restart:
1. Change config.ini and copy it to all ndb_mgmd hosts.
2. Stop BOTH ndb_mgmd, then start BOTH ndb_mgmd.
3. Restart one data node at a time.
4. Restart one mysqld at a time.
Adding data nodes (from 7.0): see my presentation from last year (UC 2009).
Adding MySQL servers: make sure you have free [mysqld] slots, then start the new mysqld.

Backup
- Backup of NDB tables is online (can have ongoing transactions) and consistent (only committed data and changes are backed up):
  ndb_mgm -e "START BACKUP"
  Then copy the backup files from the data nodes to a safe location.
- Non-NDB tables must be backed up separately. The MySQL system tables are stored only in MyISAM, so on each MySQL server back up the mysql database plus triggers, stored procedures etc. with mysqldump:
  mysqldump mysql > mysql.sql
  mysqldump --no-data --no-create-info -R > routines.sql
- Also copy the my.cnf and config.ini files.

Restore
- ndb_restore is in many cases the MOST write-intensive operation on the cluster: ndb_restore produces REDO log. This is unnecessary, but a fact for now.
- It restores many records in parallel, with no throttling. 128 or more small records per batch may be fine, but not 128 BLOBs.
- Temporary error 410: "REDO log buffers overloaded, consult online manual (increase RedoBuffer, and/or decrease TimeBetweenLocalCheckpoints, and/or increase NoOfFragmentLogFiles)". If you run into this during restore:
  - Try increasing RedoBuffer (a value higher than 64 MB is seldom practical nor needed; the RedoBuffer is synced every TimeBetweenGlobalCheckpoints).
  - Run only one instance of ndb_restore.
  - Lower the parallelism: ndb_restore -p10 ..., or even lower, e.g. -p1.
  - If this does not help, faster disk(s) are needed.

Separate Realtime and Reporting
- MySQL Cluster is really good at many short, parallel transactions, but bad at most reporting-type queries (complex JOINs).
- For reporting, replicate/copy the data out to a reporting server more suitable for the job: a MySQL server running InnoDB.
- Ways to move the data: MySQL replication, mysqldump, or ndb_restore to CSV followed by LOAD DATA INFILE.
- For replication see http://johanandersson.blogspot.com/2009/05/hamysql-write-scaling-using-cluster-to.html

Scaling
Data nodes:
- CPU: one data node can (7.0+) use up to 8 cores; it reaches a bottleneck at about 370% CPU.
- DISK: iostat -kx 1 - check %util, await, svctm etc.
- NETWORK: iftop (Linux) - add another data node to spread the load.
- Add data nodes (online or offline).
MySQL servers:
- CPU: about the same, 300-500%.
- DISK: should not be a factor.
- NETWORK: add another MySQL server to offload query processing.

Monitoring
Mandatory to monitor:
- CPU/network/memory usage
- Disk capacity and I/O usage
- Network latency between nodes
- Node status
- Used IndexMemory/DataMemory
www.severalnines.com/cmon monitors data nodes and MySQL servers.
New in 7.1: NDB$INFO, exposed as SQL tables: check node status, buffer status, statistics etc.
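In the released 7.1 this surfaced as the ndbinfo schema; for example, memory usage per data node can be queried as:

```sql
-- Used vs. configured DataMemory/IndexMemory per data node (7.1+).
SELECT node_id, memory_type, used, total
FROM ndbinfo.memoryusage;
```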

Questions? Talk to us outside! We offer cheap consulting this week: one question, one beer.
Go to the other sessions:
- MySQL Cluster Performance Tuning Best Practices - 2.00pm Wednesday 04/14/2010, Ballroom D
- MySQL Cluster and Pushdown-joins (In Pursuit of the Holy Grail), Jonas Oreland - 3.05pm Wednesday 04/14/2010, Ballroom B