Designing, Scoping, and Configuring Scalable Drupal Infrastructure. Presented 2009-05-30 by David Strauss




Understanding Load Distribution

Predicting peak traffic
Traffic over the day can be highly irregular. To plan for peak loads, design as if all traffic were as heavy as the peak hour of load in a typical month, and then plan for some growth.
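The peak-hour rule above can be turned into a quick calculation. This is a minimal sketch: the function name and the `growth_factor` default of 1.5 are assumptions for illustration, not figures from the talk.

```python
def peak_requests_per_second(peak_hour_hits: int, growth_factor: float = 1.5) -> float:
    """Design load in requests per second.

    peak_hour_hits: hits recorded in the busiest hour of a typical month.
    growth_factor: headroom for growth (assumed value, adjust to your forecast).
    """
    seconds_per_hour = 3600
    return peak_hour_hits / seconds_per_hour * growth_factor


# Example: 360,000 hits in the peak hour, planning for 50% growth.
design_load = peak_requests_per_second(360_000, growth_factor=1.5)
print(design_load)
```

Every later sizing step can then start from this one number instead of from daily averages.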

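Analyzing hit distribution
[Figure: tree diagram splitting 100% of hits into Static Content (30%) and Dynamic Pages (70%). Dynamic pages split into Anonymous (50%) and Authenticated (20%). Anonymous hits split into Human (40%) and Web Crawler (10%). Web crawler hits split further into No Special Treatment and Pay Wall Bypass; the remaining labels are 3% and 7%.]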

Throughput vs. Delivery Methods
[Figure: delivery methods compared by throughput on a scale from 10 req/s to 1000 req/s. More dots = more throughput. Legend: Green (Static), Yellow (Dynamic, Cacheable), Red (Dynamic). Methods shown: Content Delivery Network, Reverse Proxy Cache, Drupal + Page Cache + memcached, Drupal + Page Cache, Drupal. Footnote 1: Delivered by Apache without Drupal. Footnote 2: Some actually can do this.]

Objective
Deliver hits using the fastest, most scalable method available.

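Layering: Less Traffic at Each Step
[Figure: traffic passes through successive layers, with less traffic reaching each one. Labels: Traffic, CDN, DNS Round Robin, Your Datacenter, Load Balancer, Reverse Proxy Cache, Database.]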

Offload from the master database
Your master database is the single greatest limitation on scalability.
[Figure labels: Search, Slave Database, Memory Cache, Master Database.]

Tools to use
- Apache Solr for search. (Acquia offers hosting of this now.)
- Squid or Varnish for reverse proxy caching.
- Any third-party service for CDN.

Do the math
- All non-CDN traffic travels through your load balancers and reverse proxy caches.
- Even traffic passed through to application servers must run through the initial layers.
- What hit rate is each layer getting?
- How many servers share the load?
[Figure labels: Traffic, Load Balancer, Reverse Proxy Cache.]
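One way to do this math is to push the incoming request rate through each layer in turn. The sketch below is illustrative only: the function names, the 75% proxy hit rate, and the one-spare-server policy are assumptions. The per-server capacities reuse the talk's own rules of thumb (about 1000 requests per second per Squid, and about 1 request per second per Apache thread).

```python
import math


def requests_passed_through(incoming_rps: float, hit_rate: float) -> float:
    """Requests per second that miss this layer and reach the next one."""
    return incoming_rps * (1.0 - hit_rate)


def servers_needed(load_rps: float, per_server_rps: float, spare: int = 1) -> int:
    """Servers required to carry the load, plus spares for N+1 redundancy."""
    return math.ceil(load_rps / per_server_rps) + spare


# Example: 2000 req/s of non-CDN traffic arrives at the reverse proxy layer.
incoming = 2000.0
proxies = servers_needed(incoming, per_server_rps=1000)

# Assume the proxies answer 75% of requests from cache.
to_web = requests_passed_through(incoming, hit_rate=0.75)

# An 8-core web server at 25 threads per core handles about 200 req/s.
web_servers = servers_needed(to_web, per_server_rps=200)

print(proxies, to_web, web_servers)
```

Measured hit rates should replace the assumed one as soon as monitoring data is available, because the web tier size is very sensitive to it.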

Get a management/monitoring box
(maybe two or three and have them specialized or redundant)
[Figure labels: Load Balancer, Reverse Proxy Cache, Database, Management.]

Planning + Scoping

Infrastructure goals
- Redundancy
- Scalability
- Performance
- Manageability

Redundancy
- When one server fails, the website should be able to recover without taking too long.
- This requires N+1, putting a floor on system requirements.
- How long can your site be down?
- Automatic versus manual failover

Performance
- Find the sweet spot for hardware. This is the best price/performance point.
- Avoid overspending on any type of component.
- Yet, avoid creating bottlenecks.
- Swapping memory to disk is very dangerous.

Relative importance
[Figure: grid comparing the relative importance of Processors/Cores, Memory, and Disk Speed for each server role: Reverse Proxy Cache, Web, Database, Monitoring.]

Reverse proxy caches
- Squid makes poor use of multiple cores. Focus on getting the highest per-core performance.
- The best per-core performance is often on dual-core processors with high clock rates and lots of cache.
- Varnish is much more multithreaded.
- 4-8 GB memory, total
- Expect 1000 requests per second, per Squid
- 64-bit operating system if more than 2 GB RAM

Web servers
- Apache 2.2 + mod_php + memcached
- Many processors + many cores is best
- 25 Apache threads per core
- 50 MB memory per thread, system-wide
- 1 GB memory for system
- 1 GB memory for memcached
- Configure MaxClients in Apache to maximum system-wide thread count
- Expect 1 request per thread, per second
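These rules of thumb can be combined into a rough MaxClients estimate. The sketch below is one interpretation, not the speaker's formula: it assumes the thread count is limited by whichever runs out first, cores or memory. The function name and parameter names are hypothetical; the default values come from the list above.

```python
def apache_max_clients(
    cores: int,
    ram_mb: int,
    threads_per_core: int = 25,
    mb_per_thread: int = 50,
    system_mb: int = 1024,
    memcached_mb: int = 1024,
) -> int:
    """Estimate a MaxClients value for one web server.

    Assumption: take the smaller of the CPU-based and memory-based limits,
    so that Apache never grows far enough to push the box into swap.
    """
    by_cpu = cores * threads_per_core
    by_memory = (ram_mb - system_mb - memcached_mb) // mb_per_thread
    return max(0, min(by_cpu, by_memory))


# 8 cores with 16 GB RAM: the CPU rule is the limit (200 threads).
print(apache_max_clients(cores=8, ram_mb=16384))

# 8 cores with 8 GB RAM: memory is the limit (122 threads).
print(apache_max_clients(cores=8, ram_mb=8192))
```

At roughly 1 request per thread per second, the result also approximates the server's sustained dynamic throughput.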

Database servers
- MySQL 5.0 cannot use more than eight cores effectively but gets good gains from at least quad-core processors.
- Plan on each Apache thread needing one connection, and add another 50.
- Each MySQL connection needs around 6 MB.
- MySQL with InnoDB needs a buffer pool large enough to cache all indexes. Start by giving the pool most remaining database server memory and working from there.
- 64-bit operating system if more than 2 GB RAM
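The connection and buffer pool guidance above can be worked through as arithmetic. This is a sketch under stated assumptions: the function name is hypothetical, and the 1 GB reserved for the operating system is an assumed figure that the talk gives only for web servers. The result is a starting point for tuning, not a final setting.

```python
def mysql_memory_plan(
    apache_threads_total: int,
    ram_mb: int,
    extra_connections: int = 50,
    mb_per_connection: int = 6,
    system_mb: int = 1024,  # assumed OS reserve, not stated for database servers
) -> dict:
    """Starting values for max_connections and the InnoDB buffer pool size."""
    max_connections = apache_threads_total + extra_connections
    connection_mb = max_connections * mb_per_connection
    buffer_pool_mb = max(0, ram_mb - system_mb - connection_mb)
    return {
        "max_connections": max_connections,
        "connection_mb": connection_mb,
        "buffer_pool_mb": buffer_pool_mb,
    }


# Two web servers with 200 Apache threads each, database server with 16 GB RAM.
plan = mysql_memory_plan(apache_threads_total=400, ram_mb=16384)
print(plan)
```

If the computed buffer pool is smaller than the total size of the InnoDB indexes, the server needs more memory or fewer connections.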

Monitoring server
- Very low hardware requirements
- Choose hardware that is inexpensive but essentially similar to the rest of the cluster to reduce management overhead
- Reliability and fast failover are typically low priorities for monitoring services

Assembling the numbers
- Start with an architecture providing redundancy: two servers, each running the whole stack.
- Increase the number of proxy caches based on anonymous and search engine traffic.
- Increase the number of web servers based on authenticated traffic.
- Databases are harder to predict, but large sites should run them on at least two separate boxes with replication.

Pressflow
Make Drupal sites scale by upgrading core with a compatible, powerful replacement.

Common large-site issues
- Drupal core requires patching to effectively support the advanced scalability techniques discussed here.
- Patches often conflict and have to be reapplied with each Drupal upgrade.
- The original patches are often unmaintained.
- Sites stagnate, running old, insecure versions of Drupal core because updating is too difficult.

What is Pressflow?
- Pressflow is a derivative of Drupal core that integrates the most popular performance and scalability enhancements.
- Pressflow is completely compatible with existing Drupal 5 and 6 modules, both standard and custom.
- Pressflow installs as a drop-in replacement for standard Drupal.
- Pressflow is free as long as the matching version of Drupal is also supported by the community.

What are the enhancements?
- Reverse proxy support
- Database replication support
- Lower database and session management load
- More efficient queries
- Testing and optimization by Four Kitchens with standard high-performance software and hardware configuration
- Industry-leading scalability support by Four Kitchens and Tag1 Consulting

Four Kitchens + Tag1
- Provide the development, support, scalability, and performance services behind Pressflow
- Comprise most members of the Drupal.org infrastructure team
- Have the most experience scaling Drupal sites of all sizes and all types

Ready to scale?
Learn more about Pressflow:
- Pick up pamphlets in the lobby
- Request Pressflow releases at fourkitchens.com
Get the help you need to make it happen:
- Talk to me (David) or Todd here at DrupalCamp
- Email shout@fourkitchens.com

Managing the Cluster

The problem
[Figure label: Software and Configuration.]
Objectives:
- Fast, atomic deployment and rollback
- Minimize single points of failure and contention
- Restart services
- Integrate with version control systems

Manual updates and deployment
[Figure: each server updated by hand by a human.]
Why not: slow deployment, non-atomic/difficult rollbacks

Shared storage
[Figure: servers sharing storage over NFS.]
Why not: single point of contention and failure

rsync
[Figure: servers synchronized with rsync.]
Why not: non-atomic, does not manage services

Capistrano
[Figure: servers deployed with Capistrano.]
Capistrano provides near-atomic deployment, service restarts, automated rollback, test automation, and version control integration (tagged releases).
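The near-atomic switch comes from keeping each release in its own directory and repointing a `current` symlink at the chosen one. The Python sketch below illustrates that idea only; it is not Capistrano's code, and the `activate_release` helper and the directory names are assumptions for illustration. It relies on POSIX rename semantics, so it is not expected to behave the same way on Windows.

```python
import os


def activate_release(deploy_root: str, release_name: str) -> str:
    """Point <deploy_root>/current at <deploy_root>/releases/<release_name>.

    The new symlink is created under a temporary name and then renamed over
    the old one. On POSIX systems the rename is atomic, so a web server sees
    either the old release or the new one, never a half-updated tree.
    """
    release_path = os.path.join(deploy_root, "releases", release_name)
    if not os.path.isdir(release_path):
        raise FileNotFoundError(release_path)

    current = os.path.join(deploy_root, "current")
    tmp_link = current + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clean up after an interrupted deployment
    os.symlink(release_path, tmp_link)
    os.replace(tmp_link, current)
    return current
```

Rollback is the same operation with the previous release name, which is why keeping old release directories around makes rollbacks fast.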

Multistage deployment
Deployments can be staged.
    cap staging deploy
    cap production deploy
[Figure: Development, Integration, and Staging environments, each deployed with Capistrano.]

But your application isn't the only thing to manage.

Beneath the application
[Figure labels: Reverse Proxy Cache, Database, Cluster-level configuration.]
- Cluster management applies to package management, updates, and software configuration.
- cfengine and bcfg2 are popular cluster-level system configuration tools.

System configuration management
- Deploys and updates packages, cluster-wide or selectively.
- Manages arbitrary text configuration files.
- Analyzes inconsistent configurations (and converges them).
- Manages device classes (app. servers, database servers, etc.).
- Allows confident configuration testing on a staging server.

All on the management box
[Figure: the Management box hosts Development, Integration, Staging, Deployment Tools, and Monitoring.]

Monitoring

Types of monitoring
Failure:
- Analyzing Downtime
- Viewing Failover
- Troubleshooting
- Notification
Capacity/Load:
- Analyzing Trends
- Predicting Load
- Checking Results of Configuration and Software Changes

Everyone needs both.

What to use
Failure/Uptime:
- Nagios
- Hyperic
Capacity/Load:
- Cacti
- Munin

Nagios
- Highly recommended.
- Used by Four Kitchens and Tag1 Consulting for client work, Drupal.org, Wikipedia, etc.
- Easy to install on CentOS 5 using EPEL packages.
- Easy to install nrpe agents to monitor diverse services.
- Can notify administrators on failure.
- We use this on Drupal.org.

Hyperic
- I haven't used this much, but it's fairly popular.
- More difficult to set up than Nagios.

Cacti
- Highly annoying to set up.
- One instance generally collects all statistics. (No agents on the systems being monitored.)
- Provides flexible graphs that can be customized on demand.
- Optimized database for perpetual statistics collection.
- We use this on Drupal.org and for client sites.

Munin
- Fairly easy to set up.
- One instance generally collects all statistics. (No agents on the systems being monitored.)
- Provides static graphs that cannot be customized.

Cluster Problems

Cache/session coherency
- Systems that run properly on single boxes may lose coherency when run on a networked cluster.
- Some caches, like APC's object cache, have no ability to handle network-level coherency. (APC's opcode cache is safe to use on clusters.)
- memcached, if misconfigured, can hash values inconsistently across the cluster, resulting in different servers using different memcached instances for the same keys.
- Session coherency can be helped with load balancer affinity.
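The memcached problem is easiest to see with a toy key-to-server mapping. The sketch below uses simple modulo hashing over a server list; real memcached clients use their own distribution schemes, so treat `pick_server` and the host names as hypothetical. It shows how two web servers with the same instances listed in a different order can disagree about where a key lives.

```python
import hashlib


def pick_server(key: str, servers: list) -> str:
    """Map a cache key to one server by hashing the key (toy modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


# Same three instances, but configured in a different order on each web server.
web1_servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
web2_servers = ["cache2:11211", "cache1:11211", "cache3:11211"]

keys = ["key%d" % i for i in range(100)]
disagreements = [
    k for k in keys if pick_server(k, web1_servers) != pick_server(k, web2_servers)
]
print(len(disagreements), "of", len(keys), "keys map to different instances")

# Normalizing the list (here, by sorting) makes both web servers agree again.
assert all(
    pick_server(k, sorted(web1_servers)) == pick_server(k, sorted(web2_servers))
    for k in keys
)
```

Generating the server list from one shared source, for example the cluster configuration tool, avoids this class of error.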

Cache regeneration races
- Downside to network cache coherency: synched expiration
- Hard to solve
[Figure: timeline from Old Cached Item through Expiration to New Cached Item, with all servers regenerating the item in the gap between.]
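One common mitigation, not described in the talk, is to let a single worker regenerate an expired item while the others keep serving the stale copy. The sketch below shows the idea inside one process using a thread lock; the class name and parameters are hypothetical. Across a cluster the lock would have to live in a shared store (memcached's atomic `add` command is often used for this), which this sketch does not implement.

```python
import threading
import time


class StaleWhileRegenerating:
    """Cache one value; on expiry, one caller regenerates and others get the stale copy."""

    def __init__(self, regenerate, ttl_seconds, clock=time.monotonic):
        self._regenerate = regenerate
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet
        self._regen_lock = threading.Lock()

    def get(self):
        if self._expires_at is not None and self._clock() < self._expires_at:
            return self._value  # fresh hit

        # With no stale copy to serve, wait for the lock. Otherwise try once
        # and fall back to the stale value if another caller is regenerating.
        must_wait = self._expires_at is None
        if self._regen_lock.acquire(blocking=must_wait):
            try:
                # Re-check: another caller may have refreshed while we waited.
                if self._expires_at is None or self._clock() >= self._expires_at:
                    self._value = self._regenerate()
                    self._expires_at = self._clock() + self._ttl
            finally:
                self._regen_lock.release()
        return self._value
```

The trade-off is that some requests see slightly outdated content for the duration of one regeneration, in exchange for the expensive rebuild running once instead of on every server at the same moment.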

Broken replication
- MySQL slave servers get out of synch, fall further behind
- No means of automated recovery
- Only solvable with good monitoring and recovery procedures
- Can automate removal from use, but requires cluster management tools
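Automated removal from use can be as simple as a lag check feeding the list of slaves the application is allowed to read from. The sketch below assumes the monitoring system already collects `Seconds_Behind_Master` from `SHOW SLAVE STATUS` on each slave, which MySQL reports as NULL (here `None`) when replication is not running. The function names, host names, and the 30-second threshold are assumptions for illustration.

```python
from typing import Dict, List, Optional


def slave_is_usable(seconds_behind_master: Optional[int], max_lag_seconds: int = 30) -> bool:
    """A slave is usable only if replication is running and lag is within bounds."""
    if seconds_behind_master is None:  # replication stopped or broken
        return False
    return seconds_behind_master <= max_lag_seconds


def usable_slaves(lag_by_host: Dict[str, Optional[int]], max_lag_seconds: int = 30) -> List[str]:
    """Hosts that should stay in the read pool, in sorted order."""
    return sorted(
        host
        for host, lag in lag_by_host.items()
        if slave_is_usable(lag, max_lag_seconds)
    )


# Example readings: db2 is in sync, db3 is far behind, db4 has stopped replicating.
readings = {"db2": 0, "db3": 900, "db4": None}
print(usable_slaves(readings))
```

Removed slaves still need a human or a scripted procedure to repair them, which is the recovery half of the problem.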

Failure
- Load balancers can remove broken or overloaded application reverse proxy caches.
- Reverse proxy caches like Varnish can automatically use only functional application servers.
- Cluster management tools like heartbeat2 can manage service IPs on MySQL servers to automate failover.
- Conclusion: Each layer intelligently monitors and uses the servers beneath it.

All content in this presentation, except where noted otherwise, is Creative Commons Attribution-ShareAlike 3.0 licensed and copyright 2009 Four Kitchen Studios, LLC.