Building resilient infrastructure with CouchDB

Building resilient infrastructure with CouchDB Tim Perry Tech Lead & Open-Source Champion at Softwire tim-perry.co.uk @pimterry github.com/pimterry

Document Store
    {
      "_id": "my-document-example",
      "_rev": "21-qwe123asd",
      "some-content": { "a": 1, "b": 2 },
      "a list!": [3, 4, 5]
    }

HTTP API
    $ curl -X GET http://couchdb:5984/my-db/a-doc-id
    {"_id":"a-doc-id", "_rev":"4-9812eojawd", "data":[1,2,3]}

HTTP API
    $ curl -X PUT http://couchdb:5984/my-db/another-id \
        -H 'Content-Type: application/json' \
        -d '{ "other data": 4 }'
    {"ok":true, "id":"another-id", "rev":"1-2902191555"}

Replication
    # Pull from B -> A
    $ curl -X POST http://couchdb-a:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{"source": "http://couchdb-b:5984/demo-db", "target": "demo-db", "continuous": true}'

    # Pull from A -> B
    $ curl -X POST http://couchdb-b:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{"source": "http://couchdb-a:5984/demo-db", "target": "demo-db", "continuous": true}'

Indexed Views
Incremental Map/Reduce
ACID (locally)
Erlang-based
Web UI
Show Functions
Filters
Validation

Resilient Infrastructure

Let's break everything!
    while true
    do
      curl -X POST 'http://couchdb-a:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at":"'"`date`"'"}' \
        --max-time 0.1
      curl -X POST 'http://couchdb-b:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at":"'"`date`"'"}' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done
(Some console logging omitted)

Is this useful?

Real World Example (Anonymized)
B2B SaaS product, with strict SLAs
Millions of paying daily users
3,000 servers across 25 datacentres
50,000 requests per second, average
Highly latency sensitive
Every request needs the (read-only) user session

Bonus Challenges
Struggling network infrastructure
Frequent loss of connection to datacentres
Occasional power outages in datacentres
Users can and do roam, worldwide
Server failover is always to a different datacentre
Datacentres have hub & spoke connectivity only (through London)

Previous Solution
Hold all user sessions in memory on every server
Announce new sessions to every server with a central message queue
Canonical store kept in a single RDBMS (for server initialisation)

Real World Problems
Memory usage doesn't scale
Network and server failures are big problems
Message queue failures are catastrophic problems

CouchDB Solution
Small LRU cache in every server
CouchDB in every datacentre
CouchDB in the central datacentre
Hub & spoke replication
Servers query local CouchDB by default, or fall back to central CouchDB
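
The lookup order on this slide (in-process LRU cache, then the local CouchDB, then the central one) could be sketched roughly as follows. This is an illustration only, not the anonymized product's code: `LruCache`, `findSession`, and the `local` / `central` objects with an async `get(id)` method are hypothetical names standing in for whatever HTTP client wraps each CouchDB instance.

```javascript
// Minimal LRU cache built on Map, which iterates in insertion order.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}

// Hypothetical lookup: cache first, then the local store, then the central one.
// `local` and `central` are assumed to expose an async get(id) that resolves to
// a session document, resolves to undefined when the document is missing, or
// rejects when the node is unreachable.
async function findSession(id, cache, local, central) {
  const cached = cache.get(id);
  if (cached !== undefined) return cached;

  let session;
  try {
    session = await local.get(id);
  } catch (err) {
    session = undefined; // local node unreachable: treat as a miss
  }
  if (session === undefined) {
    // May still be undefined, or reject if the central node is down too.
    session = await central.get(id);
  }
  if (session !== undefined) cache.set(id, session);
  return session;
}
```

Because sessions are read-only per request, a stale cache entry is the main risk in a design like this; how long entries may live is a policy choice the slide does not specify.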

Real World Improvements
No single point of failure
Scales horizontally easily
Major memory savings

Some Challenges
Ops ramp-up
Support service setup
Disk usage

Hoodie http://hood.ie

Hoodie
No Backend
Offline-First

Hoodie: Save data
    $('.addtask.submit').click(function () {
      var desc = $('.addtask.desc').val();
      hoodie.store.add('task', { desc: desc });
    });

Hoodie: Handle new data
    hoodie.store.on('add:task', function (task) {
      $('.tasklist').append('<li>' + task.desc + '</li>');
    });

Hoodie: Log in users
    $('.login').click(function () {
      var username = $(".username").val();
      var password = $(".password").val();
      hoodie.account.signIn(username, password).done(loginsuccessful);
    });

Hoodie Architecture (From the Hoodie team at http://hood.ie/intro#magic, CC-BY-SA-NC)

Hoodie Future Architecture (Probably) (Modified, from the Hoodie team's diagram at http://hood.ie/intro#magic, CC-BY-SA-NC)

Why does any of this work?

Reliable Replication

Multiversion Concurrency Control (or MVCC)
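
As a rough illustration of what MVCC means for a document store, the toy class below rejects any write that does not name the revision it was based on, in the spirit of the `_rev` field shown on the earlier slides. It is a sketch under stated assumptions: `TinyDocStore` is a hypothetical in-memory stand-in, and its revision format (a generation counter plus an MD5 of the body) only imitates the look of CouchDB revisions, not the real algorithm.

```javascript
const crypto = require('crypto');

// Toy in-memory store imitating CouchDB-style optimistic concurrency.
class TinyDocStore {
  constructor() {
    this.docs = new Map(); // _id -> latest stored document
  }

  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    // A write must name the revision it was based on; otherwise it is rejected.
    if (doc._rev !== currentRev) {
      return { ok: false, error: 'conflict' };
    }
    const generation = currentRev ? parseInt(currentRev, 10) + 1 : 1;
    const { _rev, ...body } = doc;
    // Assumption: MD5 of the JSON body, purely for illustration.
    const digest = crypto.createHash('md5').update(JSON.stringify(body)).digest('hex');
    const rev = `${generation}-${digest}`;
    this.docs.set(doc._id, { ...body, _rev: rev });
    return { ok: true, id: doc._id, rev };
  }

  get(id) {
    return this.docs.get(id);
  }
}

const store = new TinyDocStore();
const first = store.put({ _id: 'a-doc-id', data: [1, 2, 3] });
// Missing _rev on an existing document: rejected rather than overwriting.
const stale = store.put({ _id: 'a-doc-id', data: [4] });
// Naming the current revision succeeds and produces generation 2.
const fresh = store.put({ _id: 'a-doc-id', _rev: first.rev, data: [4] });
```

No lock is held between read and write here; a writer with an out-of-date revision simply gets a conflict and has to re-read and retry.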

Reliable Replication: The Changes Feed
    $ curl -X GET 'http://couchdb:5984/my-db/_changes?since=1'
    {
      "results": [
        {"seq":2, "id":"my-doc", "changes":[{"rev":"1-128qw99"}]},
        {"seq":3, "id":"my-doc", "changes":[{"rev":"2-98s9123"}]}
      ],
      "last_seq": 3
    }

Reliable Replication: Replication Process
1. Track the source's sequence number in a local-only metadata document in the target DB, unique to this replication, set to 0 initially
2. Read the changes from the source, since the sequence id stored in the local document in the target
3. Read any missing document revisions from the source DB
4. Write these updates to the target DB
5. Update the sequence number tracked in the target
6. Go to 2
(Paraphrased from http://replication.io)
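
The six steps above can be sketched with two in-memory databases. This is a simplified model, not the CouchDB replicator: `ToyDb`, `replicateOnce`, and the `_local/<id>` checkpoint key are hypothetical names, and the checkpoint is advanced after every change here for clarity, whereas a real replicator would work in batches.

```javascript
// Toy in-memory database: a sequence-numbered change log plus stored revisions.
class ToyDb {
  constructor() {
    this.seq = 0;
    this.changes = [];          // [{ seq, id, rev }]
    this.revisions = new Map(); // "id@rev" -> document body
    this.local = new Map();     // local-only metadata, never replicated
  }

  write(id, rev, body) {
    const key = `${id}@${rev}`;
    if (this.revisions.has(key)) return; // already have this revision
    this.revisions.set(key, body);
    this.seq += 1;
    this.changes.push({ seq: this.seq, id, rev });
  }

  changesSince(seq) {
    return this.changes.filter((change) => change.seq > seq);
  }

  has(id, rev) {
    return this.revisions.has(`${id}@${rev}`);
  }

  read(id, rev) {
    return this.revisions.get(`${id}@${rev}`);
  }
}

// One pass of steps 1-5; calling it repeatedly is step 6.
// Returns the number of changes that were examined.
function replicateOnce(source, target, replicationId) {
  const checkpointKey = `_local/${replicationId}`;
  const since = target.local.get(checkpointKey) ?? 0;  // step 1
  const changes = source.changesSince(since);           // step 2
  for (const { seq, id, rev } of changes) {
    if (!target.has(id, rev)) {                         // step 3
      target.write(id, rev, source.read(id, rev));      // step 4
    }
    target.local.set(checkpointKey, seq);               // step 5
  }
  return changes.length;
}

const dbA = new ToyDb();
const dbB = new ToyDb();
dbA.write('my-doc', '1-128qw99', { n: 1 });
dbA.write('my-doc', '2-98s9123', { n: 2 });
const firstPass = replicateOnce(dbA, dbB, 'a-to-b');  // copies both revisions
const secondPass = replicateOnce(dbA, dbB, 'a-to-b'); // nothing new since the checkpoint
```

Because the checkpoint only moves forward after a write has landed, an interrupted pass can simply be rerun: already-copied revisions are skipped and the rest are picked up from the last recorded sequence number.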

Append-Only B+ Trees

Append-Only B+ Trees (From 'CouchDB: The Definitive Guide', CC-BY)
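
To give a feel for the append-only idea, here is a deliberately simplified sketch. It uses a plain binary search tree rather than a B+ tree, and an array standing in for the database file; `AppendOnlyTree` and its methods are hypothetical names for illustration. The point it tries to show is that an update copies the path from the changed node up to the root, appends the copies, and writes the new root last, so earlier roots stay readable.

```javascript
// Append-only store: `file` only ever grows; nodes are never modified in place.
class AppendOnlyTree {
  constructor() {
    this.file = [];  // stands in for the database file
    this.roots = []; // offset of the root node after each update
  }

  // Append a node and return its offset in the file.
  append(node) {
    this.file.push(Object.freeze(node));
    return this.file.length - 1;
  }

  // Copy the path from the root to `key`, appending fresh nodes as we go.
  // Children are appended before their parents, so the new root lands last.
  insertAt(offset, key, value) {
    if (offset === null) {
      return this.append({ key, value, left: null, right: null });
    }
    const node = this.file[offset];
    if (key === node.key) {
      return this.append({ ...node, value });
    }
    if (key < node.key) {
      return this.append({ ...node, left: this.insertAt(node.left, key, value) });
    }
    return this.append({ ...node, right: this.insertAt(node.right, key, value) });
  }

  latestRoot() {
    return this.roots.length ? this.roots[this.roots.length - 1] : null;
  }

  set(key, value) {
    this.roots.push(this.insertAt(this.latestRoot(), key, value));
  }

  // Read as of a given root offset; defaults to the latest root.
  get(key, root = this.latestRoot()) {
    let offset = root;
    while (offset !== null) {
      const node = this.file[offset];
      if (key === node.key) return node.value;
      offset = key < node.key ? node.left : node.right;
    }
    return undefined;
  }
}

const tree = new AppendOnlyTree();
tree.set('b', 1);
tree.set('a', 2);
const snapshot = tree.latestRoot(); // a reader can keep using this root
tree.set('b', 3);                   // appends a new root; the old one is untouched
```

In this model a crash halfway through an update leaves only unreferenced nodes at the end of the file, and the previous root still describes a consistent tree. The cost is that the file keeps growing until it is compacted, which matches the "Disk usage" challenge mentioned earlier.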

Did we break everything?
    while true
    do
      curl -X POST 'http://couchdb-a:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at":"'"`date`"'"}' \
        --max-time 0.1
      curl -X POST 'http://couchdb-b:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at":"'"`date`"'"}' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done
(Some console logging omitted)

Phew.

CouchDB is not perfect

But 'always available' is a great superpower

Any questions? Tim Perry Tech Lead & Open-Source Champion at Softwire tim-perry.co.uk @pimterry github.com/pimterry