Effective MySQL Monitoring Baron Schwartz March 2012
Who Am I?
- Maatkit
- Percona Toolkit
- Innotop
- Monitoring Plugins
- Aspersa
- Online Tools
- JavaScript Libraries
Consulting, Support, Training, Conferences, Engineering
Percona Server, Percona XtraBackup, Percona XtraDB Cluster, Percona Toolkit, many more
Agenda
- What Kind of Monitoring?
  - Historical Data, Graphing
  - Fault Detection and Alerting
- What to Monitor
- Tools
  - Overview of Nagios Plugins
  - Demo of Cacti Templates
Two Kinds of Monitoring
- Capturing metrics
- Detecting faults
Capturing Metrics
- Useful for troubleshooting, diagnosis, and noticing things
- Very handy for graphs/charts
- Typical problem: you're missing the one thing you want
- What to monitor: everything?
Fault Detection
- Monitor system health
- Alert when there are problems
- Typical problem: spammy alerts
  - Non-actionable
  - False positives
  - /dev/null email filters
- What to monitor: only as much as needed?
Why not do both in one tool?
- Caching
- Consistency
- Staleness
Monitoring Tools
Why So Many? Thus is born Yet Another Tool
Popular Tools
- Nagios
- MRTG
- Cacti
- Munin
- Zabbix
- Monit
- MySQL Enterprise
- MonYOG
- New Relic
- Zenoss
- Circonus
- and many more
Percona Monitoring Plugins
- The philosophy:
  - Don't create Yet Another Tool
  - Make existing tools better and easier to use
- Nagios and Cacti are good enough for most
- The plugins are 100% free and open source
- See http:///software
Monitoring with Nagios
Using Nagios
- Nagios is primarily for fault detection
- Works via plugins
  - Executables that follow specific conventions
    - Exit code
    - Text to STDOUT
- Nagios plugins are pretty universal
  - Portable to many/most similar systems
  - Zabbix, etc.
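The plugin conventions on this slide (an exit code plus a line of text on STDOUT) can be sketched in a few lines. This is a minimal illustration, not code from any real plugin: the function names, the replication-lag check, and the 300/600-second thresholds are all assumptions made for the example. Only the exit-code values (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_replication_lag(seconds, warn=300, crit=600):
    """Hypothetical check: map a lag measurement to (exit_code, status_line)."""
    if seconds is None:
        # The value could not be measured, so report UNKNOWN rather than guess.
        return UNKNOWN, "UNKNOWN: could not determine replication lag"
    if seconds >= crit:
        code = CRITICAL
    elif seconds >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"{LABELS[code]}: replica is {seconds} seconds behind"


def run(seconds):
    """Print the status line to STDOUT and return the exit code."""
    code, line = check_replication_lag(seconds)
    print(line)
    return code

# A real plugin would obtain the measurement itself and finish with
# sys.exit(run(measurement)) so that Nagios sees the exit code.
```

Keeping the decision logic in a pure function, separate from printing and exiting, is one way to make such a check easy to test.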
Nagios Plugins for MySQL
- There are too many plugins, too!
- My suggestion:
  - The official plugins (there are only a couple)
  - Plus Percona's (there is a decent selection)
Why Percona's Plugins?
- Targeted towards avoiding spammy alerts
- High quality
- Good documentation
- Free
Why Percona's Plugins?
- Integrates with other Percona tools
- Easy to extend and customize
- Highly structured
- Designed for testability
- Written in Bourne shell
Why Percona's Plugins? This makes the plugins unit-testable for quality.
Overview of Percona Plugins
- Operating Environment:
  - LVM snapshots
  - Deleted files
  - Privileges
  - PID files
  - System memory
Overview of Percona Plugins
- MySQL:
  - Table checksums
  - Replication
  - Processes
  - InnoDB status
  - Deadlocks
  - Arbitrary status counters
What to Monitor
- Everything appropriate from the previous slides
- Carefully selected status counters
  - Uptime. If too small, server has restarted
  - Threads_connected approaches max_connections
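The two counter checks named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the Percona plugins' implementation: the function names, the 300-second minimum uptime, and the 80%/90% thresholds are hypothetical. The inputs are assumed to be dictionaries built from `SHOW GLOBAL STATUS` and `SHOW GLOBAL VARIABLES`.

```python
def check_uptime(status, min_uptime=300):
    """Flag a recent restart: Uptime is the number of seconds since startup."""
    uptime = int(status["Uptime"])
    if uptime < min_uptime:
        return 1, f"WARNING: server restarted {uptime} seconds ago"
    return 0, f"OK: uptime is {uptime} seconds"


def check_connections(status, variables, warn_pct=80, crit_pct=90):
    """Flag Threads_connected approaching max_connections, as a percentage."""
    used = int(status["Threads_connected"])
    limit = int(variables["max_connections"])
    pct = 100.0 * used / limit
    if pct >= crit_pct:
        code, label = 2, "CRITICAL"
    elif pct >= warn_pct:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    return code, f"{label}: {used} of {limit} connections in use ({pct:.0f}%)"
```

Comparing Threads_connected against the server's own max_connections, rather than against a fixed number, keeps the check meaningful across servers with different limits.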
What NOT to Monitor
- In general, be cautious with...
  - Cache hit ratios
  - $variable-per-second
  - Threshold-based alerts
What NOT to Monitor
- Why?
  - A single right threshold is hard to find
  - These are usually not reliable indicators
  - They are also usually not actionable
  - They generate spam alerts!
Writing Custom Checks
- Use the pmp-check-mysql-status plugin
- You can easily monitor counters, ratios...
- Example: -x Threads_running -w 20 -c 40
- See the documentation for more examples
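To make the example on this slide concrete, the sketch below mimics what a check of Threads_running with a warning threshold of 20 and a critical threshold of 40 amounts to. It is not the pmp-check-mysql-status implementation. The function names are hypothetical, and the input is assumed to be the tab-separated output of `SHOW GLOBAL STATUS` as produced by the mysql client in batch mode.

```python
def parse_status(text):
    """Parse tab-separated 'Variable_name<TAB>Value' lines into a dict."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        if sep:  # skip lines that have no tab separator
            status[name.strip()] = value.strip()
    return status


def check_counter(status, name, warn, crit):
    """Compare one status counter against warning and critical thresholds."""
    if name not in status:
        return 3, f"UNKNOWN: {name} not found"
    value = float(status[name])
    if value >= crit:
        code, label = 2, "CRITICAL"
    elif value >= warn:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    return code, f"{label}: {name} = {value:g}"


# Assumed sample input, for illustration only.
sample = "Threads_connected\t12\nThreads_running\t25\nUptime\t86400\n"
code, line = check_counter(parse_status(sample), "Threads_running", warn=20, crit=40)
print(line)
```

With the sample input, Threads_running is 25, which is at or above the warning threshold of 20 but below the critical threshold of 40, so the check reports a warning.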
Monitoring with Cacti
Cacti
- Cacti is for metrics collection and graphing
- Good: easy to install, easy to get started
- Bad: tedious to template-ize
- Percona Monitoring Plugins fixes this
  - Ready-made templates
  - Template-generation utilities
  - Easy to extend and customize
Take-away Points
- Record metrics eagerly
- Alert cautiously
- Try not to reinvent the wheel
- Nothing's perfect; usually 2 systems are best
  - One for graphing/trending/metrics
  - One for fault detection
Monitoring Resources
- Blog posts about our plugins: http://goo.gl/pcen3 and http://goo.gl/i2xna
- White papers about preventing downtime: percona.com/about-us/mysql-white-papers
Public and on-premises courses
- Upcoming dates:
  - Dallas, May 14
  - London, May 21
  - Raleigh, June 18
  - Chicago, July 9
- See http:///training/
MySQL Conference & Expo April 10-12, Santa Clara /live baron@percona.com @xaprb