How to Plan a Successful Load Testing Programme for today's websites

This guide introduces best practice for load testing to overcome the complexities of today's rich, dynamic websites. It includes 10 top tips for implementing a successful load testing programme.
How to Plan a Successful Load Testing Programme

Whether it's a new Content Management System, a new ecommerce platform, a software release or simply looking ahead to traffic peaks, risk reduction is important. Applying an artificial load to a website is the best way to determine how well it will cope under stress. Load testing, stress testing and capacity planning essentially refer to the same thing, and are typically carried out before going live following a significant change, in advance of a predicted peak of seasonal traffic, or before a marketing campaign.

Many engineers see the prime purpose of this testing as finding out whether more hardware is needed to handle the desired traffic levels. For managers and marketers, the purpose is to find out the size of the shop: how many customers can the website manage at one time, and will it cope at peak times? In nine cases out of ten, major benefits also come from identifying below-par areas of the web application's architecture or coding. By fixing these often small issues there can be a substantial capacity gain - far greater than can be achieved by simply doubling the hardware.

As cloud computing becomes more widely adopted, cloud auto-scaling is becoming a practical option. This allows a site owner to have a site automatically dial up more cloud capacity as traffic on the website increases and the extra capability is needed. This is a great feature if you're unsure about predicted traffic and want to be certain your site will handle it. But it's still important to load test, to ensure cloud auto-scaling is delivering as intended as traffic ramps up. According to Google, getting auto-scaling just right is one of the hardest things about cloud apps.
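As a toy sketch of the dial-up idea - not any cloud provider's actual algorithm, and with assumed capacity and headroom figures - a simple threshold-based scaler might size capacity like this:

```python
import math

def desired_instances(req_per_sec, capacity_per_instance,
                      headroom=0.7, min_instances=2):
    """Instances a naive threshold auto-scaler would request.

    Illustrative only: each instance is assumed to serve
    `capacity_per_instance` requests/sec while kept at `headroom`
    utilisation. Real cloud auto-scalers layer cooldown periods and
    metric aggregation windows on top of a rule like this.
    """
    needed = math.ceil(req_per_sec / (capacity_per_instance * headroom))
    return max(min_instances, needed)

# As traffic ramps from quiet to peak, the instance count "dials up":
print(desired_instances(5, 20))    # quiet period: stays at the floor of 2
print(desired_instances(100, 20))  # busy period: scales up to 8
```

A load test against auto-scaling is essentially a check that the real system tracks a curve like this one, without the premature ramp-downs discussed later in this guide.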
Modelling your system

A system requires modelling and preparatory work before load testing can begin. When modelling website usage patterns, average traffic figures are of little use. At its simplest, you need to identify from your web logs the peak traffic levels on the site. Keeping to simple round numbers, 24,000 hits a day doesn't mean 1,000 hits every hour. Find the traffic levels during the busiest hour, then use the average number of pages per visitor to calculate the equivalent pages per hour and work out the worst-case number of concurrent users on the site. For example: 360 visits per hour at 10 pages per visit = 3,600 pages per hour = 1 page per second.

So when looking for a load or stress testing supplier, what should you insist upon to achieve the most benefit? Your load testing supplier should test the system as closely as possible to its real-world state. Use the same server hardware, firewalls and ISP bandwidth as the final site. This is often best carried out overnight, or whenever you have fewest visitors.

The virtual user traffic used to load your website should be as near as possible to real traffic. This rules out forms of load testing that simply hit isolated pages or your home page. Your site needs to be loaded with virtual users performing the tasks and making the choices that real users would - following multi-page User Journeys. It's critical to test the correct user journeys, and the right mix of user journeys, for useful results. Choose a supplier that will develop realistic load testing user journeys and user journey mixes based on your weblogs.
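The arithmetic above is easy to script. A minimal sketch - the concurrent-users estimate via Little's law is our addition, assuming an average time-on-site figure is available from your analytics:

```python
def pages_per_second(visits_per_hour, pages_per_visit):
    """Peak page rate implied by the busiest hour's traffic."""
    return visits_per_hour * pages_per_visit / 3600.0

def concurrent_users(visits_per_hour, avg_session_minutes):
    """Worst-case simultaneous visitors via Little's law:
    users on site = arrival rate x average time on site."""
    return (visits_per_hour / 60.0) * avg_session_minutes

# The worked example from the text: 360 visits/hour at 10 pages/visit.
print(pages_per_second(360, 10))   # -> 1.0 page per second
```

The point of the exercise is that these busiest-hour numbers, not daily averages, set the target load for the test.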
Top 10 steps to implementing a successful load testing programme

Here are our top 10 tips for implementing a successful User Journey load testing programme. An experienced, proactive testing supplier will perform most of these steps for you.

1. Identify & script User Journeys

Identify key multi-step User Journeys. These will emulate the activity of real visitors to your site and need to be realistic. For example, it's bad to have an "Add to basket" journey that always buys the same product. Instead, specify your Journeys with realistic variations, and don't simplify the routes just because it will make testing simpler. Depending on the complexity of your site and the variety of routes users take to perform different tasks, you will most likely need at least 3 to 5 journeys.

User Journeys - one for each of the main services or transactions offered to users - need to be carefully scripted so that at each step they act like a real user, e.g. when filling in forms and making choices. Creating effective scripts for every step of each Journey requires expert testing staff.

Defining your User Journeys is not a technical task - you don't need to know anything about how your site works under the bonnet; it is purely about how it works for your actual users. Do bring in your web analytics team to suggest user routes that are important and widely used, but bear in mind the 80/20 rule: you can't run every possible journey to uniquely cover each of your unique historic visits. So decide on the routes that are most valuable to the business: the most used, and those with the most commercial value.

Building the User Journey scripts, by contrast, requires a good understanding of your web system - how it handles session IDs, directory structures and product numbering conventions, how caching of page content and images is handled by the server, javascript usage and so on. So consider a testing service that has the experience and knowledge to do this for your tech team, saving you time and resource.
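As a sketch of what "realistic variation" means in practice - the URLs and catalogue below are hypothetical, for illustration only - an "Add to basket" journey might pick a different product on every run:

```python
import random

# Hypothetical catalogue and URL scheme, for illustration only.
CATALOGUE = ["SKU-1001", "SKU-1002", "SKU-2040", "SKU-3317"]

def add_to_basket_journey(rng):
    """Return the page sequence one virtual user will request.

    Each run picks a different product, so the load exercises varied
    cache and database paths instead of hammering one hot record.
    """
    sku = rng.choice(CATALOGUE)
    return [
        "/",                      # landing page
        "/search?q=boots",        # a search step with a realistic term
        f"/product/{sku}",        # the varied product page
        f"/basket/add/{sku}",     # add the chosen product
        "/checkout",              # final step of the journey
    ]

journey = add_to_basket_journey(random.Random())
print(journey)
```

Each virtual user would then replay its own sequence, so no two users need follow an identical path.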
2. Check page content

It's not enough to simply have a page returned; it must contain the content that is expected, as well as the link or button required to move to the next step in the Journey. Simple online self-service load testing approaches often prove disappointing because the level of scripting expertise required is not available.

3. Run scripts in single-user mode

Once Journeys for your site have been agreed, check that the generated scripts are stable over a few days, usually by running them in single-user mode. This is done to measure the system's responses without any stress. Performed repeatedly, it provides data to tell you whether the system is running stably, consistently and error-free at low user levels. If your web system is already running out of hardware capacity, or the application design is flawed, inconsistent performance will be highlighted here. It is difficult to get reliable test data from a system whose performance varies greatly, independent of the changing stress test load.

4. Run User Journey scripts at low levels to identify system flaws

A range of system design flaws can contribute to site errors and slow page delivery times. These are usually random, difficult to pin down, and don't go away when hardware is upgraded. Flaws are identified by running the User Journey scripts at relatively low levels - simulating 2, 5 or maybe 10% of your intended virtual users - and running them at high simultaneity: accessing the same page at exactly the same moment. A high percentage of errors here indicates that an application may have database locking flaws, errors when handling shared application variables, or similar.
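The high-simultaneity runs in step 4 hinge on releasing every virtual user at the same instant. A minimal sketch using a thread barrier, with a stubbed `fetch_page` standing in for a real HTTP request:

```python
import threading

N_USERS = 10
barrier = threading.Barrier(N_USERS)   # releases all threads together
results = []
results_lock = threading.Lock()

def fetch_page(url):
    # Stand-in for a real HTTP request; a real test would record the
    # actual status code and response body here.
    return 200

def virtual_user(url):
    barrier.wait()                     # every thread blocks here...
    status = fetch_page(url)           # ...then all hit the page at once
    with results_lock:
        results.append(status)

threads = [threading.Thread(target=virtual_user, args=("/checkout",))
           for _ in range(N_USERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

error_rate = sum(s >= 500 for s in results) / N_USERS
```

A burst of 5xx responses from a run like this points at race conditions or database locking rather than raw capacity.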
5. Synchronise your internal system monitoring

To deliver the maximum value to clients, we always ask to hook the client's internal system monitoring tools up to ours: it makes it much easier to carry out a joined-up investigation of root causes, and ultimately allows the load testers to provide more actionable data about any bottlenecks, whether easy to fix or more complex.

6. Run tests for each User Journey under load

Next, test each User Journey in isolation, increasing the load and observing the performance degradation. As the quantity of virtual users is ramped up, the effect on page delivery time and error rates is measured. This stage is not just about numbers - it requires expert staff to reflect on the pages that returned errors or were missing content. Expert analysis provides invaluable insight into the root causes of problems, and these can be quickly fed back to clients. The key statistic you'll get for each Journey at this point is the peak number of journeys per second that the system can handle.

Testing needs to be able to monitor and report on a range of defects that can arise as a system starts to creak under load. Look out for missing page components (graphics, style sheets, javascript files), the appearance of error messages within pages, or missing page content. A page may be delivered many times without any server errors being reported, yet have obvious defects that impact on users.
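The point in step 6 that an HTTP 200 can still be a broken page is straightforward to check mechanically. A sketch - the required fragments and error markers are illustrative assumptions, not a fixed standard:

```python
# Fragments every healthy page of this (hypothetical) site should contain,
# and error text that should never appear in a delivered page.
REQUIRED_FRAGMENTS = ["</html>", 'id="add-to-basket"']
ERROR_MARKERS = ["Exception", "SQL error", "Service Unavailable"]

def page_defects(body):
    """Return a list of defects found in an otherwise 'successful' page."""
    defects = []
    for fragment in REQUIRED_FRAGMENTS:
        if fragment not in body:
            defects.append("missing: " + fragment)
    for marker in ERROR_MARKERS:
        if marker in body:
            defects.append("error text: " + marker)
    return defects

# A page that returned 200 but lost its basket button and leaked an error:
sample = "<html><body><p>SQL error near line 1</p></body></html>"
print(page_defects(sample))
```

Running a check like this on every page a virtual user receives turns "the site stayed up" into a much more meaningful pass/fail signal.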
7. Reconfigure and retest the improved system

It's not unusual at this point for experienced testers to highlight substantial performance gains. Testing can then be continued, after a short break, with a much improved web system offering much higher throughput. This added intelligence is not available with automated test tools.

8. Run more complex test sequences

Testing of each journey should include a more complex test sequence - for example SciVisum's high simultaneity runs, where many virtual users follow a journey in their own time but all stop before a defined page and hit it concurrently. This is an aggressive test and must be used carefully, but it can identify a range of coding and configuration problems - known as race conditions - often down to coding errors, configuration oversights or database locking problems, including root causes that are extremely difficult to find by any other method.

For clients who can provide data on their drop-off rates (what percentage of users drop off, and at what point in, say, a Checkout journey), extra journeys can be scripted to model the same drop-off ratios. These provide a more realistic load than Journeys where 100% of those starting finish, and often produce quite different throughput capacity figures.

Load testing means increasing load to a pre-defined level, one expected to cover the busiest peaks; testing stops once the pre-set level is reached. As the traffic volumes increase and the performance of different User Journeys begins to decline, the planned test cases evolve to drill deeper and identify likely root causes. New test scripts are written to further expose the problem causes and to rule out alternative explanations. Test team experience and expertise is vital here, as is a flexible and powerful test engine that can smoothly construct new test cases on the fly and rapidly correlate data across a number of different runs and user journeys.

Never approach testing with preconceived assumptions about potential problems. In our experience it's common for a team to have a specific issue as their prime suspect - but when testing begins and real evidence is produced, these suspicions may be shown to be wrong. The real bottlenecks will be discernible and unambiguous from analysis of the test data. It's sometimes a shock to find that perhaps the ISP is not the bottleneck after all, and that what's causing the problem is actually an element in the application design.

9. Run a realistic mix of User Journeys

The final stage of heavy testing is running mixed User Journeys. This helps identify more realistically how the system performs in the real world, by having a mix of User Journeys running which match the expected load ratios - e.g. 20%
checkouts, 30% add to baskets, 40% product searches and so on. Weblogs should be used to identify the appropriate mix of journeys; some load testing suppliers will analyse your weblogs and suggest the best mix for you.

After the heavy testing is done, some vital investigation work remains: analysing the delivered pages for more root causes and possible optimisation. Having the same engineers who scripted the User Journeys at the start perform this final stage usually ensures best value.

10. Consider auto-scaling and test that it's delivering what you expect

With today's elastic cloud technology it's possible to scale up a website as and when you need it, even for short periods of time. However, it's also important to carry out load testing specifically to ensure that cloud auto-scaling is delivering against expectations. Firstly, the auto-scaling you thought your team had built may in reality not scale very well at all: as always in technology, there are many pitfalls, configuration issues and bugs that get in the way. Even auto-scaling that works can prove unreliable: if the scaling algorithm is not right, you can find your site capacity ramping down prematurely just because of a short downward trend in the traffic, before it ramps up again; conversely, we have tested sites where the capacity did not scale down afterwards at all.

At the end of the day, what's needed from a Load Test is a lot more than just a set of numbers. Watch out for this symptom of poor load testing practice at your company: if the output is measured simply in terms of "concurrent users", then you know it's neither realistic nor meaningful. Clarks Shoes phrased it nicely when asking for a load test: "We want to know the size of our online store." A realistic Load Test will answer that question.
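Returning to step 9: the expected load ratios translate directly into a weighted choice of journey per virtual user. A sketch, where the 10% "browse" share is an assumed filler so the mix sums to 100%:

```python
import random
from collections import Counter

JOURNEY_MIX = {               # expected load ratios from step 9
    "checkout": 0.20,
    "add_to_basket": 0.30,
    "product_search": 0.40,
    "browse": 0.10,           # assumed filler so the mix sums to 100%
}

def pick_journeys(n, rng):
    """Assign a journey to each of n virtual users, weighted by the mix."""
    names = list(JOURNEY_MIX)
    weights = [JOURNEY_MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)

counts = Counter(pick_journeys(10_000, random.Random(1)))
print(counts)
```

Over a long run the observed counts converge on the weblog-derived ratios, which is exactly the property a mixed-journey test needs.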
The report provided for your tech teams should include the specific engineering issues identified on your site and recommended actions. Vital though these are, it is the engineers' insight into root causes - and the resulting re-factoring and re-configuring - that will provide the biggest performance gains for the lowest cost.

About SciVisum

With over 10 years' performance testing experience, helping clients such as Debenhams, Boden, Joules and Dixons to maximise user experience and protect their brand, our highly experienced team of test professionals helps clients implement monitoring programmes and proactively oversees testing: automatically updating journeys as a website changes, highlighting performance issues and helping clients quickly pinpoint root causes - saving time and money.

To find out how SciVisum can help you implement the best monitoring programme to suit your organisation, please contact us on 01227 or visit our website at www.scivisum.co.uk