Booking.com: Evolution of MySQL System Design. Nicolai Plum <nicolai.plum@booking.com>



Similar documents
2000 databases later. Kristian Köhntopp Mittwoch, 17. April 13

High Availability Solutions for the MariaDB and MySQL Database

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam

OLTP Meets Bigdata, Challenges, Options, and Future Saibabu Devabhaktuni

Scaling Pinterest. Yash Nelapati Ascii Artist. Pinterest Engineering. Saturday, August 31, 13

Tushar Joshi Turtle Networks Ltd

Scaling Graphite Installations

Scaling Database Performance in Azure

Dave Stokes MySQL Community Manager

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Parallel Replication for MySQL in 5 Minutes or Less

Monitoring MySQL. Kristian Köhntopp

Exploring Amazon EC2 for Scale-out Applications



MySQL Cluster New Features. Johan Andersson MySQL Cluster Consulting johan.andersson@sun.com

Apache HBase. Crazy dances on the elephant back

Big data blue print for cloud architecture

MySQL High-Availability and Scale-Out architectures

Scalable Architecture on Amazon AWS Cloud

Real-time reporting at 10,000 inserts per second. Wesley Biggs CTO 25 October 2011 Percona Live

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)

Tier Architectures. Kathleen Durant CS 3200

About Me: Brent Ozar. Perfmon and Profiler 101

Synchronous multi-master clusters with MySQL: an introduction to Galera

MySQL: Cloud vs Bare Metal, Performance and Reliability

How To Use Big Data For Telco (For A Telco)

Big Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016

SCALABLE DATA SERVICES

Database Scalability and Oracle 12c

Top 10 Performance Tips for OBI-EE

HBase Schema Design. NoSQL Ma4ers, Cologne, April Lars George Director EMEA Services

Big Systems, Big Data

Data warehousing with PostgreSQL

MySQL High Availability Solutions. Lenz Grimmer OpenSQL Camp St. Augustin Germany

W I S E. SQL Server 2008/2008 R2 Advanced DBA Performance & WISE LTD.

Would-be system and database administrators. PREREQUISITES: At least 6 months experience with a Windows operating system.

Zynga Analytics Leveraging Big Data to Make Games More Fun and Social

Hypertable Goes Realtime at Baidu. Yang Dong Sherlock Yang(

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) /21/2013

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

The Classical Architecture. Storage 1 / 36

StratioDeep. An integration layer between Cassandra and Spark. Álvaro Agea Herradón Antonio Alcocer Falcón

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Large-Scale Web Applications

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Performance Baseline of Hitachi Data Systems HUS VM All Flash Array for Oracle

HBase A Comprehensive Introduction. James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays

Cloud Based Application Architectures using Smart Computing

Managing MySQL Scale Through Consolidation

Performance and Scalability Overview

MySQL Reference Architectures for Massively Scalable Web Infrastructure

Hadoop: Embracing future hardware

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, XLDB Conference at Stanford University, Sept 2012

Alexander Rubin Principle Architect, Percona April 18, Using Hadoop Together with MySQL for Data Analysis

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

High Availability and Scalability for Online Applications with MySQL

Best Practices for Hadoop Data Analysis with Tableau

Comparing SQL and NOSQL databases

Benchmarking Hadoop & HBase on Violin

THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS

Partitioning under the hood in MySQL 5.5

Exploring the Synergistic Relationships Between BPC, BW and HANA

Enabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings

SharePoint 2010 Performance and Capacity Planning Best Practices

Splice Machine: SQL-on-Hadoop Evaluation Guide

Inge Os Sales Consulting Manager Oracle Norway

Search Big Data with MySQL and Sphinx. Mindaugas Žukas

KNIME & Avira, or how I ve learned to love Big Data

An Introduction to Data Virtualization as a Tool for Application Development

MySQL. Leveraging. Features for Availability & Scalability ABSTRACT: By Srinivasa Krishna Mamillapalli

How To Scale Myroster With Flash Memory From Hgst On A Flash Flash Flash Memory On A Slave Server

Architectures Haute-Dispo Joffrey MICHAÏE Consultant MySQL

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Zero Downtime Deployments with Database Migrations. Bob Feldbauer

MySQL backups: strategy, tools, recovery scenarios. Akshay Suryawanshi Roman Vynar

Using Apache Derby in the real world

SQL Server Virtualization 101. David Klee, Group Principal and Practice Lead. SQL PASS Virtualization VC,

YouTube Vitess. Cloud-Native MySQL. Oracle OpenWorld Conference October 26, Anthony Yeh, Software Engineer, YouTube.

Building Scalable Big Data Infrastructure Using Open Source Software. Sam William

Apache Kylin Introduction Dec 8,

Performance and Scalability Overview

Putting Apache Kafka to Use!

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

How Lucene Powers LinkedIn Segmentation & Targeting Platform

Cognos Performance Troubleshooting

Supercharge your MySQL application performance with Cloud Databases

The Sierra Clustered Database Engine, the technology at the heart of

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

MySQL Enterprise Backup

Transcription:

Booking.com: Evolution of MySQL System Design Nicolai Plum <nicolai.plum@booking.com>

Booking.com 2

Founded in 1996 as bookingsportal.nl Early days Purchased by Priceline.com Inc in 2005 Became Booking.com in 2006

Still small in 1999 Picture by Geert-Jan Bruinsma

We got bigger since then Available Accommodations Daily Reservations

Travel business opportunity Booking.com Accommodation Opportunty Other spend

Architecture decisions Around 2003 we decided Keep using Perl MySQL + replication Analytics and dashboards A/B testing 7

Hotel reservation website design experts Picture by SantiMB.Photos on flickr; license cc-by-na 8

Scalability dimensions Schema growth Data growth Query rate Data complexity Each has different solutions 9

Scalability dimensions Schema growth Data growth Query rate Data complexity Each has different solutions 10

Data complexity Complex multi-directional relations and normalisation Many-way JOINs Foreign Key constraints All put stress on SQL Optimiser query plans Storage engines Schema design Developers DBAs 11

Data complexity reduction Prefer client-side logic to Foreign Keys and Stored Procedures Client-side scales better in CPU We have control of all our code Prefer simpler joins Denormalise pragmatically Fast schema changes Online schema change, low bureaucracy 12

Scalability dimensions Schema growth Data growth Query rate Data complexity 13

Query rate Travel websites are all read-intensive Replication for the win! Winning for us since 2003 How to monitor and manage? Puppet, Graphite, Nagios, etc Comprehensive application event and error analysis 14

Databases - beginning 15

Database replication 16

Sharing databases Cells 17

Sharing databases Cells Simple to administer Good failure isolation Poor efficiency Even worse with many schemas 18

Use a Load Balancer! 19

Use a Load Balancer! Network stress Single Point Of Failure Scalability nightmare 20

Use a Load Balancer! 21

DNS Database Load Balancer Separate control and signal path Modified HAProxy Standard HAProxy MySQL healthcheck HAProxy tracks server availability Returns list of severs in DNS query 22

DNS Database Load Balancer 23

DNS Database Load Balancer 24

Rosters of eligible DB servers Separate control and signal path De-centralised service checks Apache Zookeeper Pools of available servers ZooAnimal deamon registers available database servers ZooRoster deamon retrieves servers for clients 25

Rosters of eligible DB servers 26

Rosters of eligible DB servers 27

Reliability Cells Strong failure isolation, inflexible DNS LB Less failure isolation, more flexible, LB is scaling and reliability problem Rosters Less failure isolation, more flexible, very scalable, Zookeper very reliable 28

Replication Speed challenges Single threading hurts us Especially on a SAN Careful optimisation of bulk jobs Binlog server, make it all faster Some help with failover Bodge: copy tables Works on myisam Needs transportable tables for InnoDB Alter tables from innodb to myisam, copy 29

Scalability dimensions Schema growth Data growth Query rate Data complexity 30

Acommodation reservation data Accommodation catalogue Descriptions, amenities, policies Inventory Room prices, quantities and restrictions Customer details Names, contact info, payment Different growth and use patterns 31

Multiple schemas Split data by function Keeps it simple for most developers Queries against single schema Keeps it simple for DBAs Less simple for infrastructure developers ORM changes, data pumps, consistency checks Just feed them more coffee 32

Multiple schemas 33

Multiple schemas Consistency Distributed transactions, XA = pain DB failures, code bugs, app server crashes Careful order of updates so critical things last Consistency check references later Requires skilled developers and strong code knowledge APIs and ORM layers help 34

When to split? Analysis tools for busiest tables Performance_Schema and SYS Schema Business impacts, development time Isolate critical functions from complex, less critical functions 35

Scalability dimensions Schema growth Data growth Query rate Data complexity 36

Data growth Business growth 30-50% annually Data growth 40-60% annually Faster than Moore s Law And disk IOPS 37

38

We outgrow CPU speed Booking.com CPU industry 39 SPEC graph by Jeff Preshing

40

Database growing pains Dataset size exceeds memory Read performance decreases (a lot) Dataset size exceeds local disc size SAN latency, management, cost Write perf decreases, Read perf decreases more Dataset exceeds size a CPU can scan in reasonable time Dataset exceeds storage volume size, disc array size, backup capacity, filesystem limits 41 Ad-hoc queries are impossible, analyse table difficult, schema changes difficult, table scans lethal Totally unmanageable. Give up!

Database growing pains Dataset size exceeds memory Read performance decreases (a lot) ~200GB Dataset size exceeds local disc size Dataset exceeds size a CPU can scan in reasonable time Dataset exceeds storage volume size, disc array size, backup capacity, filesystem limits SAN latency, cost, complexity Write perf decreases, Read perf decreases more Ad-hoc queries are impossible, analyse table difficult, schema changes difficult, table scans lethal Totally unmanageable. Give up! 42 ~5TB ~20TB ~300TB

Archives Separate transcational and analytical Store the past in another schema File off payment, PII where possible Also shrinks dataset Win-win J but you need more (later) 43

Materialisation and data models Read-optimised is not write-optimised OLTP vs OLAP the timeless struggle Two schemas Different read and write data models Data pumps, materialisation queues Inevitably more complex Needs smarter infrastructure to keep feature development easy 44

Inventory first materialisation Flat availability Write Complex relational structure of rooms, rates, restrictions Read Simple point query for inventory for a single stay Much more predictable than caching 45

Row index Hotel ID order Hotel_ID District City UFI Country 1 Kensington London England 2 Chaoyang Beijing China 20000 La Défense Paris France 20001 Dongchen Beijing China 20002 TriBeCa New York USA. 35678 Xicheng Beijing China 35679 Gentofte København Danmark. 70035 Chaoyang Beijing China 46

Row indexing Hotel ID order 47

Row index UFI order Hotel_ID District City UFI Country 1 Kensington London England 70035 Chaoyang Beijing China 35678 Xicheng Beijing China 2 Chaoyang Beijing China 20001 Dongchen Beijing China 20002 TriBeCa New York USA 20000 La Défense Paris France 35679 Gentofte København Danmark 48

Row indexing UFI order 49

Location coding - Z-order curve Location latitude and longitude 12 bits is 10km longitude 6-10km latitude (for most hotels) Index with bitwise interleave of latitude and longitude in a space-filling curve 50

51 (David Eppstein, via Wikipedia)

Row index Z-order Hotel_ID District City UFI Country Z-location 1 Kensington London England 3456789 35678 Xicheng Beijing China 6788567 70035 Chaoyang Beijing China 6789456 2 Chaoyang Beijing China 6789456 20001 Dongchen Beijing China 6790463 20002 TriBeCa New York USA 8534535 20000 La Défense Paris France 10013346 35679 Gentofte København Danmark 13036743 52

Row indexing Z-order curve 53

Materialised inventory 54

Sharding Prefer schema split to sharding Necessary in growing, busy, transactional schemas Inventory Materialised datasets Requires good API (or developer awareness) Complexity, overhead 55

Analytics Two types of analytics Exploitation Canned reports with some parameters exploited many staff Exploration New queries, unknown unknowns 56

Analytics exploitation Pre-prepared reports - Controlrooms Part pre-aggregated data Intermediate infrastructure between raw data and single purpose report data Satisfies need for regular reports, common questions Fixed reports with parameters Less flexible for ad-hoc queries Technical dead-end, useful for medium-term Moving to Hadoop 57

Analytics exploration Surprisingly easy to make a write only dataset in various ways Too big to query Queries hit performance too hard Can t add indexes so queries hit too hard Users too bad at SQL (Excel/ODBC) so they give up 58

Analytics - exploration Need constant compute power per unit of data during growth Data to MySQL and Hadoop Hadoop answers in linear time but not quickly Business analysts love it Most people need a friendly interface 59

Web marketing Business systems More traditional database Large imports, ETL >20TB MySQL Even with split schemas Analysis is moving to Hadoop 60

Masters First DRBD Now SAN Netapp filers Future is trying to reduce number of important machines Now: rapid automated failovers SAN == safety + latency For arbitrary topology changes, need a global way to identify of changes (transactions) GTID Future: (pseudo) gtid, no special masters 61

Measuring capacity In fixed config cells, you need more DB than app In flexible pools, capacity is hard Metrics lie, due to nonlinearity in the database Qps, etc, help a bit Traffic replay at high rate helps more (if you can) Replication capacity is also important Single thread, often a limit Stop slave and measure time to catch up P_S replication stats too complicated in 5.6 Contention and non-linearity really hard 62

Abstraction Layers No need for a full microservice intercommunication framework architecture standardisation committee Just a function call will do Inventory was easy Few calls to well defined API functions Search was not Search: everyone fetched hotels and filtered themselves even for common searches 63

64 nicolai.plum@booking.com