Performance Testing: A Guide to Successful Real World Performance Testing
November 2013
Mohit Verma, Performance Engineering Evangelist, Tufts Health Plan

Abstract
In this paper, we present the benefits of performance testing, the forms of performance testing, and key success factors, and we provide a framework for building a business case for Performance Testing and Application Performance Monitoring. The paper should benefit beginner Performance Engineers and help close gaps for experienced engineers by illustrating best practices and guidelines for successful performance testing and for building a Performance Testing Center of Excellence.
Performance and Capacity, Nov 2013

Agenda
- About us
- Market State
- Why Performance Test?
- Technical Environment
- Performance Testing
- Performance Testing Benefits
- Performance Testing CSFs
- Performance Testing Synergies
- Questions?

About us
Founded in 1979 as a not-for-profit health maintenance organization, Tufts Health Plan is one of the nation's most highly rated health plans. The company is distinguished for providing an outstanding member experience and access to quality care. Tufts Health Plan offers a broad array of health care coverage options to individuals and employer groups. Through diverse product offerings, the plan covers members regionally and across the country.
Our applications typically support:
- Health Care Providers (Hospitals, Physician Practices, ACOs, etc.)
- Employers
- Members
- Brokers
- Employees

Market State
Forrester recently reported that among companies with revenue of more than $1 billion, nearly 85% had experienced incidents of significant application performance degradation. Respondents identified application architecture and deployment as the most important factors behind application performance problems.

Example 1: Amazon.com June 29th Outage*
- Amazon.com experienced a widespread outage in the morning that lasted, at least for many customers, more than three hours and displayed blank or partial pages instead of product listings.
- By mid-afternoon, Amazon's home page was devoid of any product photographs and showed only a list of categories on the left of the screen. Searching for items often didn't work, and customers' shopping carts and saved item lists were temporarily displayed as empty.
- At an annual revenue of nearly $27 billion, Amazon faces a potential loss of an average of $51,400 a minute when its site is offline.
- Amazon shares closed down 7.8 percent, a sharper fall than the Nasdaq index.
- A post on an Amazon seller community forum at 12:47 p.m. PDT said: "We are currently experiencing an issue that is impacting customers' ability to place orders on the Amazon.com website." A follow-up announcement an hour later said the problem had not been resolved.

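The per-minute loss quoted for Amazon follows from simple arithmetic on annual revenue. The sketch below is a minimal illustration, assuming revenue is spread evenly over every minute of a 365-day year (a simplification, since real sales vary by hour and season). The function names are illustrative and do not come from the original deck.

```python
# Assumption: revenue is earned uniformly across a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def revenue_per_minute(annual_revenue):
    """Average revenue earned per minute, in the same currency unit."""
    return annual_revenue / MINUTES_PER_YEAR


def outage_cost(annual_revenue, outage_minutes):
    """Average revenue at risk for an outage of the given length."""
    return revenue_per_minute(annual_revenue) * outage_minutes
```

With an annual revenue of exactly $27 billion, `revenue_per_minute(27e9)` comes to about $51,370, in line with the roughly $51,400 a minute quoted on the slide. `outage_cost(27e9, 180)` gives the same average for a three-hour outage.
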
Example 2: Dell.com gave shoppers the lowest high broadband access time among large web retailers, according to Gomez.

Retailer              High Broadband   Low Broadband   Dial Up
Dell.com              7.48             23.08           49.64
ColdwaterCreek.com    7.5              22.76           55.88
Williams-Sonoma.com   7.78             23.28           56.52
QVC.com               8.03             24.8            57.69
Amazon.com            8.24             22.05           49.46
OfficeDepot.com       8.52             25.24           57.46
Scholastic.com        8.59             29.63           66.01
CDW.com               8.95             25.38           51.15
Netflix.com           9.48             28.39           65.09
Staples.com           9.59             29.16           53.17

Example 3: Thursday, Jan 31st, 2013
- The gateway page of Amazon.com was offline to some customers for approximately 49 minutes.
- Other pages of the site were accessible and AWS was not impacted, a spokesperson told TechCrunch in an email.
- Visits to the site at the time of the outage were bringing up 503 errors, which sometimes are linked to DDoS attacks or other kinds of overloads. 503 errors can also be linked to maintenance issues.
- Site outages are never good things but feel particularly shaky when they are linked to e-commerce sites or other places where user data is stored.

Monitoring Dashboard AWS

Why Performance Test?
- Software engineers often build software components and products without being aware of the target load, environment requirements or service level agreements
- The complexity and highly distributed nature of hardware and web hosting servers make optimal configuration of applications challenging
- Globalization of users adds further complexity
- Virtualization of business-critical applications demands performance testing
- Mobility complicates end users' perceptions because of the myriad devices that all require acceptable performance
Recommendation: Performance test proactively and early in the Software Development LifeCycle

Technical Environment
N-tier Diagram - Simple

Typical Technical Environment
Technologies used (Complex and Diverse Environment):
- Web Application Servers: Weblogic, WebSphere, JBOSS, Aqualogic
- Infrastructure Security: CA SiteMinder, IBM Tivoli Access Manager
- Web Server: Apache, IIS, IHS
- Middleware: Tibco BusinessWorks and BusinessConnect
- Reporting: Siebel, Lawson, Actuate, Cognos, Hyperion
- Midrange/Mainframe/Legacy: HP/IBM/AS400

Performance Testing
What is performance testing?
- Load: testing which measures application performance under user load
- Sociability: testing which measures system performance under user load of all system variables in the deployment environment
- Stress: testing to stress the application/system to find its limits
- Endurance: testing to validate system stability

Performance Testing
Key variables measured:
- End User Response Time (includes Web 2.0 metrics when needed)
- Resource utilization (CPU, Memory, Disk, etc.)
- Network utilization & latency
- Throughput (bytes/sec, hits/sec)

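Two of the variables above, end-user response time and throughput, can be derived directly from raw request timings. The sketch below is a minimal illustration, not the output of any particular load test tool. The `Sample` record, its field names and the nearest-rank percentile method are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One completed request (hypothetical record layout)."""
    start_s: float       # seconds since the start of the test
    elapsed_s: float     # end-user response time in seconds
    bytes_received: int  # size of the response body


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Response-time and throughput summary for a non-empty list of samples."""
    first_start = min(s.start_s for s in samples)
    last_end = max(s.start_s + s.elapsed_s for s in samples)
    duration_s = last_end - first_start
    times = [s.elapsed_s for s in samples]
    return {
        "avg_response_s": sum(times) / len(times),
        "p90_response_s": percentile(times, 90),
        "hits_per_sec": len(samples) / duration_s,
        "bytes_per_sec": sum(s.bytes_received for s in samples) / duration_s,
    }
```

Reporting a percentile alongside the average is a common choice because a few slow requests can hide behind a healthy-looking mean.
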
Performance Testing Benefits
- Measure response time for applications and enforce SLAs
- Improve end-user experience
- Proactive load/stress testing of mission-critical applications lets us benchmark them for concurrent user support, response times, etc.
- Capacity planning: save costs by sizing production and non-production environments more accurately
- Help build proven scalable applications
- Failover Capabilities*

Performance Testing: Critical Success Factors
- Understand the Drivers and Triggers for Performance testing (NFR)
- Build or identify Production Workload model
- Well Defined Success criteria - SLAs
- Identify Business Critical Workflows of application
- Identify/Create Test Data
- Build Test Environment that models production
- Support of all teams: Performance Testing is a TEAM Effort!!
- Workflow Automation Tool (Load Test Tool)
- Load Generation environment
- Performance Test Analysis/Reporting
- Need Management that values Performance Testing
- Keep control of the Performance Test Environment
- Never let Development teams run the Performance Test for you

Successful Performance Test LifeCycle
1. Performance Test Triggers/Requirements: is a performance test required?
   - NO: Perf Report: No Performance Testing Required
   - YES: continue
2. Existing Application/System?
   - YES: Model Existing Production Workload (Analytics: Production Report identifying transactional throughput of business transactions)
   - NO: Based on Triggers/Requirements
3. Build Accurate Test Scenarios (Load, Stress & Sociability)
4. Define Test Success Criteria/Script Workflows (Performance Test Plan; Test Scripts, Test Data)
5. Execute Test Scenario (Load Test Tool: LoadRunner/Homegrown; Performance Test Environment; Prod vs Test Report of Any Differences)
6. Are Results Acceptable?
   - NO: Identify Issue, Make Tweaks (software, configuration or hardware)
   - YES: SIGNOFF; Performance Test Result Report (Perf Report with results and any Exceptions)

Performance Testing CSFs: Drivers and Triggers
- SLA Change
- Hardware change (upgrade/downgrade, Virtualization)
- Application Software Upgrade (New features/enhancements)
- Infrastructure Software Upgrade/Patch (Security, Database, Systems, etc.)
- Compliance Patch (DOD)
- Java/.Net version upgrade
- Unexpected growth in number of users
- Database retention policy change
Typically, the non-functional requirements (NFRs) should dictate the need for performance testing.

Performance Testing CSFs: Production Workload Model
What is the existing usage of the application/system?
- Transaction Throughput (hour, day)
- Number of concurrent users for the average hour/peak hour
- Most used transactions

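The three workload questions above can often be answered from a production transaction log. The sketch below is a minimal illustration, assuming the log has already been reduced to `(hour, transaction_name)` pairs. The function names are hypothetical, and the concurrency estimate uses Little's Law (users = throughput x time per user cycle), which is a swapped-in technique and not a method named in the deck.

```python
from collections import Counter


def throughput_per_hour(log):
    """Count transactions per hour; log is a list of (hour, transaction_name)."""
    return Counter(hour for hour, _ in log)


def most_used(log, n=3):
    """The n most frequently executed transactions with their counts."""
    return Counter(name for _, name in log).most_common(n)


def estimated_concurrent_users(tx_per_hour, avg_response_s, avg_think_s):
    """Little's Law: users = throughput (tx/sec) * (response time + think time)."""
    return tx_per_hour / 3600 * (avg_response_s + avg_think_s)
```

For example, 7,200 transactions in the peak hour with a 2-second response time and 8 seconds of think time between transactions suggests about 20 concurrent users. The think time is an assumption that should be confirmed against real session data.
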
Performance Testing CSFs: Well-Defined Success Criteria
How do we know if the test was a success?
- Document SLAs (response time, CPU/Memory usage thresholds)
- Meets customer goals

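Documented SLAs become most useful when a test result can be checked against them mechanically. The sketch below is a minimal illustration. The metric names and threshold values are assumptions chosen for the example, and a metric that was not measured is treated as a violation.

```python
def sla_violations(measured, sla):
    """Return {metric: (measured_value, limit)} for every breached SLA.

    A metric missing from `measured` counts as a violation, because an
    unmeasured criterion cannot be signed off.
    """
    violations = {}
    for metric, limit in sla.items():
        value = measured.get(metric)
        if value is None or value > limit:
            violations[metric] = (value, limit)
    return violations


def meets_sla(measured, sla):
    """True only when every documented threshold is satisfied."""
    return not sla_violations(measured, sla)
```

A call such as `sla_violations({"login_p90_s": 4.2, "cpu_pct": 65.0}, {"login_p90_s": 3.0, "cpu_pct": 80.0})` reports only the login response time as non-compliant.
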
Performance Testing CSFs: Define Business Critical Workflows
- Identify Business Critical Workflows of application
- Use the 80/20 rule (Pareto's Principle): 20% of the transactions cause 80% of the defects in production
- Performance Testing is not typically a full regression test: 20% of the total test cases provides you 80% coverage
- Include resource-intensive transactions (CPU, database, memory, network)
- Include highly used transactions

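One way to apply the 80/20 rule to usage data is to rank transactions by volume and keep the smallest set that covers the target share of traffic. The sketch below is a minimal illustration, assuming per-transaction counts are available from production. It ranks by volume only, so resource-intensive but rarely used transactions still have to be added by hand.

```python
def critical_workflows(usage_counts, coverage_pct=80):
    """Smallest set of most-used transactions covering coverage_pct of volume.

    usage_counts maps a transaction name to its production execution count.
    """
    total = sum(usage_counts.values())
    ranked = sorted(usage_counts.items(), key=lambda item: item[1], reverse=True)
    selected, covered = [], 0
    for name, count in ranked:
        # Integer comparison avoids floating-point rounding at the boundary.
        if covered * 100 >= coverage_pct * total:
            break
        selected.append(name)
        covered += count
    return selected
```

With hypothetical counts of 500 logins, 300 searches, 120 claim lookups, 50 profile updates and 30 reports, the function keeps only the first two, which together account for 80% of the volume.
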
Performance Testing CSFs: Test Data Identification
- Performance Testing is data-driven testing
- Choose your test data carefully in consultation with production workload models or business analysts
- Represent boundary value conditions (for example, large result sets)
- Represent required security roles when creating test IDs
- Test with a production-sized database
- Test with the same data setup at least 2 times for consistency
- Test with a randomized data setup at least once

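The last two points, repeating a run with the same data and then running with randomized data, can both be served by a seeded random selection. The sketch below is a minimal illustration, assuming the candidate test IDs are available as a list. The function name and the ID format in the usage note are hypothetical.

```python
import random


def pick_test_ids(all_ids, count, seed):
    """Select `count` distinct IDs; the same seed always yields the same selection."""
    rng = random.Random(seed)  # private generator, independent of global state
    return rng.sample(all_ids, count)
```

Calling `pick_test_ids(ids, 50, seed=42)` for two consistency runs gives an identical data setup, and changing the seed for the next run gives a randomized setup that can still be reproduced later if a problem needs to be investigated.
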
Performance Testing CSFs: Test Environment Considerations*
- Develop and enforce a test readiness checklist
- Pristine Performance Test Environment
- Monitoring tools set up
- Historical data is mandatory
- Locked-down environment (including disabling virus scans)
- Production sized in all respects, if possible
- Document and communicate any deviations from production to stakeholders
- If the environment is shared, disable builds and deployments during test times
- Build and communicate the test schedule
- Communicate, communicate, communicate
- Shut down environments that are not needed
- Monitor, monitor, monitor

Performance Testing CSFs: Team Support needed
Performance testing is a TEAM effort!
- Developers
- DBAs
- Network Engineers
- System Engineers
- Business (involve them to run UAT during performance test execution)
- Performance Engineers typically do the first/second line of analysis
- A root cause analysis tool may reduce the need for a full team effort

Performance Testing CSFs: Load Test Tool
Efficient performance testing needs an automation tool (industry standard or Open Source) for:
- Quick scripting, correlation & replay of scripts
- Building test models/scenarios
- Executing test scenarios
- Analysis
- Monitoring
Home-grown tools may suffice where the technology platform is not as varied, or for proprietary applications.

Performance Testing CSFs: Load Generation Environment
- Mimic production if possible
- Firewalls
- Several network locations, or use a WAN emulator

Performance Testing CSFs: Performance Test tools
- HP Loadrunner & Performance Center
- Microfocus SilkPerformer
- NeoLoad
- RadView - WebLoad
- MicroFocus QALoad
- IBM Rational Performance Tester

Load/Performance Test Tool Benefits
- Identify and resolve performance bottlenecks quickly
- Repeatable tests can be scripted and run quickly
- Real world user scenarios can be modeled by the tools
- Helps improve the quality and stability of applications
- Provides server monitoring capability for non-production environments
- Provides correlated performance analysis reports with drill-down capability
- Integrates with existing production monitoring tools

Performance Testing CSFs: Performance Test Analysis/Reporting Tool
The analysis module provides:
- Real Time monitoring graphs
- Transaction Response Time Reports
- User Ramp-up graphs
- Transaction Response Summary graphs
- Drill-Down for Root cause analysis
- Correlating Graphs and results

Performance Testing Analysis/Reporting: Sample Report

Performance Testing Analysis/Reporting: Performance Test Reports, Error Rate graph

Performance Testing Analysis/Reporting: Non-Compliant SLA Report (MP_Login)

Performance Testing Analysis/Reporting: SLA Report after enhancements

Front End Analysis - Tools

Front End Analysis Tools (2)

Front End Analysis Tools (3)

Root Cause Analysis - Server Side Tools

Performance Testing Synergies
- Performance Testing and Application Performance Management (APM) go hand in hand
- Performance Testing proactively identifies and resolves issues before production
- Metrics captured during performance testing can help build and monitor production systems more accurately
- Performance Testing scripts can be reused for synthetic transaction monitoring in production for SLA enforcement
- Performance Testing tools can be used for Root Cause Analysis and to replicate production issues

Application Performance Testing/Monitoring Magic Quadrant

Conclusion
- Performance Testing/Engineering is critical to application success
- Building in-house competencies or outsourcing/cloud-based testing is possible today
- Successfully identifying your CSFs is imperative
- APM and PT/PE go hand in hand and provide immense benefit to organizations

Questions/Discussion?

References
- Dynatrace Software: http://www.compuware.com/en_us/application-performance-management.html
- Google Page Speed: https://developers.google.com/speed/pagespeed
- HP Performance Engineering Solutions: http://www8.hp.com/us/en/softwaresolutions/software.html?compuri=1170507