Drupal performance and scalability Ensuring scalability and performance with Drupal as your audience grows Presented by Jon Anthony
Bounty.com; Northern and Shell (OK! Magazine etc.); drupal.org/project/phonegap; adappt.co.uk WHO ON EARTH IS JON ANTHONY
Perform & Scale Environment Team HYGIENE FACTORS
Size of decision; Decreasing time to implement; Tuckman CANNOT SCALE WITHOUT A STABLE TEAM
DEFINE YOUR SDLC
Must haves; Should haves; Could haves; Won't do this time EDUCATE MANAGEMENT AND TEAMS ON YOUR PROJECT MANAGEMENT METHODOLOGY
Basecamp: simple and loved by clients; Feng Office: Ajax-based clone of Basecamp (open source); OnTime: Scrum-based project management system, hugely flexible; Jira and GreenHopper combo: very powerful Agile-based project management system SOME PROJECT MANAGEMENT TOOLS
SOURCE CONTROL ROBUST DEVELOPMENT ENVIRONMENT
Think about ditching Windows; get your devs used to using the shell, doing their own MySQL dumps etc.; Eclipse or Aptana; Apache or Zend Apache; Apache xdebug or Zend debugger extension INTEGRATED DEVELOPMENT ENVIRONMENT (IDE)
Slow page loads; Write locks; Read queues; Server crashes; Google Quality Score penalty; Bad user experience; LOST USERS; Great news for your competition WHEN IT GOES BAD
Client rendering; Page size; Network pipe; Page serving; Concurrent session handling; Code execution; Database writing and reading; Processors and multi-threading; RAM access; Disk I/O; Internal network transit THE CHAIN BY FLEETWOOD MAC
6 Volunteers 1 stop watch What is a x b AUDIENCE PARTICIPATION TIME
Know your audience: What % are anonymous? What % are first-time visitors (why does it matter)? What % view personalised pages? With the right caching setup an anonymous user may never touch the database. In fact an anonymous user may never touch Apache if you are using a reverse proxy such as Varnish. Just how many database queries can a logged-in or personalised page delivery require? UNDERSTANDING YOUR TRAFFIC
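To make the traffic questions above concrete, here is a minimal sketch of how the anonymous share changes backend load. It assumes, purely for illustration, that every logged-in request reaches Apache and the database, and that an anonymous request reaches them only on a cache miss. The function name and the numbers in the usage line are hypothetical, not measurements from the talk.

```python
def backend_requests(total_requests, anonymous_share, anon_cache_hit_rate):
    """Estimate how many requests get past the cache to Apache and the database.

    Assumption (illustrative only): logged-in requests always reach the
    backend; anonymous requests reach it only when the cache misses.
    """
    if not 0.0 <= anonymous_share <= 1.0:
        raise ValueError("anonymous_share must be between 0 and 1")
    if not 0.0 <= anon_cache_hit_rate <= 1.0:
        raise ValueError("anon_cache_hit_rate must be between 0 and 1")

    anonymous = total_requests * anonymous_share
    logged_in = total_requests - anonymous
    anonymous_misses = anonymous * (1.0 - anon_cache_hit_rate)
    return logged_in + anonymous_misses


# Hypothetical site: 90% anonymous traffic, 95% of it served from cache.
print(backend_requests(100_000, 0.90, 0.95))
```

The point of the sketch is the shape of the result rather than the figures: the higher the anonymous share and the cache hit rate, the smaller the fraction of traffic your application servers ever see.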
OXO Tower; McDonald's; Tesco's sandwiches. Do you need to write data? Are you tracking page reads in Drupal? Why? Use client-side analytics CLIENT LOAD VS SERVER LOAD
Get these answers right and you may not need as much hardware as you thought. Are you reading data or writing it? Doing it all from memory; Bus speed vs network; 1 big server vs cluster; Disaster recovery; Load balancing; Processor cores; RAM; Content Distribution Network; Cloud stacks e.g. Acquia HARDWARE
The original Bounty search was hidden at the bottom of the page because it maxed out the CPU for 4 seconds. Searching a big site takes processor time. Use Solr. It runs under Tomcat (Java), so put it on another box. Or better still, get Acquia to do the hard work USING SERVICES
Multi sites are massively powerful; Common user database; Only load the modules you need; Use separate DB servers for content; Run multiple sites under one URL THE GREAT MULTI SITE TRICK
Writing; Reading; Table locking; Record locking; MyISAM vs InnoDB; Query caching; Query cache size; DB Maintenance (module available) DATABASE CONSIDERATIONS
Query caching on MySQL; opcode caching with APC or...; query caching with Memcache or APC; caching of Views data; caching of block data; Block Cache Alter: cache by page, role, universal etc.; example of two menu blocks, one universal and one by role and by page CACHING
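The two-menu-block example can be sketched as a cache whose key depends on the chosen granularity. This is an illustration of the idea only, not Drupal's or Block Cache Alter's actual implementation. The names block_cache_key and BlockCache, and the key format, are made up for this sketch.

```python
def block_cache_key(block_id, granularity, page=None, roles=()):
    """Build a cache key for a block.

    granularity is a set containing "role" and/or "page"; an empty set
    means the block is universal (one cached copy for everyone).
    """
    parts = [block_id]
    if "role" in granularity:
        parts.append("r:" + ",".join(sorted(roles)))
    if "page" in granularity:
        parts.append("p:" + (page or ""))
    return "|".join(parts)


class BlockCache:
    """Tiny in-memory cache that counts how often a block is rebuilt."""

    def __init__(self):
        self._store = {}
        self.builds = 0

    def get(self, block_id, granularity, build, page=None, roles=()):
        key = block_cache_key(block_id, granularity, page, roles)
        if key not in self._store:
            self._store[key] = build()  # expensive render happens only on a miss
            self.builds += 1
        return self._store[key]


cache = BlockCache()
render = lambda: "<ul>menu</ul>"

# Universal block: rendered once, whatever the page or role.
cache.get("main-menu", set(), render, page="/news", roles=["anonymous"])
cache.get("main-menu", set(), render, page="/shop", roles=["editor"])

# Block cached by role and by page: one copy per combination.
cache.get("user-menu", {"role", "page"}, render, page="/news", roles=["editor"])
cache.get("user-menu", {"role", "page"}, render, page="/shop", roles=["editor"])

print(cache.builds)
```

The universal block costs one render in total, while the per-role, per-page block costs one render for each combination it is seen in, which is why the coarsest granularity that is still correct is usually the one to pick.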
Boost; Block Cache Alter; JS min; CSS zip; APC; Memcached; Cacherouter for D6; Supercron (housekeeping on 'roids); DB Maintenance; Devel (for slow queries); Throttle (last resort); Pressflow core replacement; Move Apache .htaccess rules to httpd.conf MODULES + CORE
Boost. Pros: easy setup, file based, fast, robust. Cons: still uses Apache. Varnish. Pros: very, very fast, all files held in memory, unflinching at high loads (during the riots a small 4 GB server received 120,000 uniques in one day, > 99% new visitors, and did not even break stride). Cons: tough setup, needs dedicated host BOOST VS VARNISH
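As a rough feel for what the 120,000 uniques figure means per second, the arithmetic can be sketched as below. Only the 120,000 comes from the slide. The pages-per-visit and peak-factor parameters are hypothetical knobs, since the talk gives neither, and real traffic during a news event is far from evenly spread.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(visitors_per_day, pages_per_visit=1.0):
    """Average page requests per second if traffic were spread evenly."""
    return visitors_per_day * pages_per_visit / SECONDS_PER_DAY


def peak_rate(average, peak_factor):
    """Scale the average by an assumed peak factor (hypothetical value)."""
    return average * peak_factor


avg = average_rate(120_000)          # figure from the slide, one page per visit assumed
print(round(avg, 2))                 # average requests per second
print(round(peak_rate(avg, 10), 2))  # assuming the busiest moments run 10x the average
```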
Apache sessions; MySQL query cache; APC; Memcached; PHP session script size; Varnish; How Linux uses spare memory; /etc/init.d/httpd restart graceful ALLOCATING MEMORY
Missing images!!! (require a 404 page to be generated); Badly coded modules with hard-coded queries (which cannot be cached); Developers who do not understand how to code behind a reverse proxy; Not compressing files; Wrong choice of image format (jpg, gif, png, sprites); Not turning unused code off BIGGEST MISTAKES
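One way to reason about the memory items above is as a budget: reserve RAM for each cache and service, then see how many Apache processes fit in what is left. The sketch below assumes a fixed average size per Apache process. Every figure in the usage example is hypothetical and should be replaced with numbers measured on your own server.

```python
def max_apache_workers(total_ram_mb, reserved_mb, apache_process_mb):
    """Return how many Apache processes fit after the reservations.

    reserved_mb maps a label (e.g. "varnish") to the RAM set aside for it.
    Returns 0 if the reservations already use up all the memory.
    """
    if apache_process_mb <= 0:
        raise ValueError("apache_process_mb must be positive")
    free_mb = total_ram_mb - sum(reserved_mb.values())
    if free_mb <= 0:
        return 0
    return free_mb // apache_process_mb


# Hypothetical 4 GB box; all values are placeholders, not recommendations.
reservations = {
    "operating_system": 512,
    "mysql": 1024,
    "apc": 128,
    "memcached": 512,
    "varnish": 1024,
}
print(max_apache_workers(4096, reservations, apache_process_mb=40))
```

If the result is lower than the number of concurrent Apache processes the server is configured to allow, the box will start swapping under load, which is usually far worse than queueing requests.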
A chain is as strong as its weakest link. One piece of bad code (a hard-coded query pointing to your dev server) can ruin all of your hard work. Just because they charge 450 per day does not mean they understand enterprise-level coding. Does your dev understand how to code behind a reverse proxy such as Varnish? Ask if they know what ESI stands for (Edge Side Includes). If they are PHP only, without an impressive Ajax portfolio, chances are they cannot deal with a reverse proxy. In this scenario generally two things happen: the dev is completely unable to understand why their code works in test but not on staging or live, and a single dirty hack can cause Varnish not to cache, undoing all of your carefully thought-out architecture CHOOSE YOUR DEVELOPERS WISELY
Varnish; RAM cache; All files cached; Cache size; Edge Side Includes; When to use Ajax; To bootstrap or not to bootstrap; Cookies; Breaking Varnish UNDERSTANDING REVERSE PROXIES
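Why cookies break a reverse proxy cache can be sketched with a deliberately simplified model of the decision a proxy makes. This is not Varnish's actual logic or configuration language, just an approximation of the common rules: only GET and HEAD are cached, and anything carrying a cookie or marked private is passed to the backend. The function name is hypothetical.

```python
def is_cacheable(method, request_headers, response_headers):
    """Simplified model of whether a reverse proxy would cache a response."""
    request = {name.lower(): value for name, value in request_headers.items()}
    response = {name.lower(): value for name, value in response_headers.items()}

    if method.upper() not in ("GET", "HEAD"):
        return False  # writes always go to the backend
    if "cookie" in request or "authorization" in request:
        return False  # the request looks personalised
    if "set-cookie" in response:
        return False  # the backend started a session for this visitor
    cache_control = response.get("cache-control", "").lower()
    if any(token in cache_control for token in ("private", "no-store", "no-cache")):
        return False  # the backend asked for the response not to be shared
    return True


# An anonymous page view is cacheable...
print(is_cacheable("GET", {}, {"Cache-Control": "public, max-age=300"}))
# ...until one module sets a cookie on every response.
print(is_cacheable("GET", {}, {"Set-Cookie": "SESSabc=123; path=/"}))
```

Under this model a single module that starts a session for anonymous visitors turns every page into a cache miss, which is the kind of "dirty hack" that works in test and falls over behind the proxy.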
Chrome / Safari resource tracking tools; Firebug Net tab; Apache Bench (ab -c 20 -n 50 http://www.mysite.co.uk/); http://loadimpact.com/; Turn off your advertising unless you want to be blacklisted!!!!!; Devel module; Test, test and test again TESTING FOR SPEED
Unless you have masochistic tendencies, or a large in-house team of sysadmins, pick a managed host. Get a cheap unmanaged VPS for development. How much does your traffic fluctuate? Dedicated boxes; Virtual servers (upgrade with a restart); Acquia Dev Cloud; Acquia enterprise hosting HOSTING
ARCHITECT FIRST!!!! BUILD SECOND. TEST AND ITERATE. FINAL WORD
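For a scripted equivalent of the Apache Bench run above, a small concurrent load test can be sketched with the standard library alone. This is a rough stand-in for ab, not a replacement for it. The function names are made up for this sketch, and the defaults simply mirror the 20 concurrent clients and 50 requests from the slide. Only point it at a site you own.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url):
    """Fetch a URL and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            response.read()
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses still count as a result


def timed_call(fetch, url):
    """Return (status, seconds taken) for one request."""
    start = time.perf_counter()
    status = fetch(url)
    return status, time.perf_counter() - start


def run_load_test(url, concurrency=20, requests=50, fetch=fetch_url):
    """Issue `requests` fetches with up to `concurrency` running in parallel."""
    if requests < 1 or concurrency < 1:
        raise ValueError("requests and concurrency must be at least 1")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(fetch, url), range(requests)))
    timings = [seconds for _, seconds in results]
    return {
        "requests": len(results),
        "ok": sum(1 for status, _ in results if status == 200),
        "mean_s": sum(timings) / len(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    print(run_load_test("http://www.mysite.co.uk/"))
```

The fetch parameter exists so the timing logic can be exercised with a fake fetcher before it is aimed at a real server.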
This presentation will be available online at www.adappt.co.uk/drupalspeed NOTES AVAILABLE ONLINE
Seeking supporters for www.drupal.org/project/phonegap: bringing native mobile to Drupal. Contact drupal.org:jonanthony www.adappt.co.uk/contact
What did you think? Locate this session on the DrupalCon London website: http://london2011.drupal.org/conference/schedule and click the "Take the survey" link. THANK YOU!