Technical White Paper: Web Load Testing

To perform as intended, today's mission-critical applications rely on highly available, stable and trusted software services. Load testing ensures that those criteria are met in both testing and staging environments. But the fact is that most solutions fail to support real root-cause analysis, because the limited monitoring capabilities of most load testing tools treat applications essentially as a black box. In addition, most load testing does not provide an accurate picture of the end-user experience, the most crucial measurement for the success of web applications. Applying traditional operational monitoring solutions or development tools such as code profilers to a load test does not provide the required remedy either: the former do not provide the required level of detail, and the latter are impractical for use under full production-like application load levels.

To gather the performance and diagnostic information required for rapid problem resolution, a new generation of tools is required. With the Compuware Gomez 360° Web Load Testing solution, the combined power of Gomez Web Load Testing and the technology of dynatrace, a division of Compuware, provides both the most accurate measurement of the end-user experience and the most advanced problem-resolution techniques available. Gomez 360° Web Load Testing is the only solution that combines high-volume, real-world load from the industry's largest cloud and Last Mile testing network with deep transaction-specific analytics to pinpoint scalability and performance problems across the entire application delivery chain and provide a collaboration platform for resolution.
With Gomez, you can: find and fix performance and scalability problems from the First Mile to the Last Mile; rapidly determine problem root cause down to the component, SQL statement or method level; reduce testing iterations and cycle time with proactive problem resolution and prevention; and validate architecture for maximum scalability.

Why Gomez 360° Web Load Testing Is a Critical Part of the Web Application Life Cycle

Web applications have never been more critical to business and their importance is increasing all the time. As a result, the application delivery chain becomes more complex and dependent on third-party content and services, and the end-user experience, which is the most critical factor to consider when launching applications, becomes more difficult to ensure. With business wanting more application functionality faster, there is less time to test but more need to do so. Application failures or even degraded response times can result in significant costs such as reduced revenue, lost customers and negative publicity. Thus, validating that applications correctly fulfill business requirements prior to deployment, from both a functional and a performance standpoint, has become a critical focal point for IT management.

Web load testing strives to ensure that web applications deliver expected performance and reliability before they are placed into service. Web load testing tools enable the simulation of actual user behavior and high load levels across the web application delivery chain before the application is deployed into production to customers. The goals of load testing are: 1) to identify performance problems before they impact user experience and 2) to enable corrective actions to be taken proactively.

The Evolution of Load Testing

The evolution in web application development and the critical nature of its deployment have mandated a change in approach to load testing.
When applications were contained mostly inside a data center, testing could be accomplished from inside the firewall. When applications began to expand outside the organization, it became more important to test from outside the firewall. Now, with applications being assembled by the browser, in the end user's hands, it is critical to test from that perspective. Given all this complexity, the ability to find the root cause of performance problems is equally important. For many years, load testing tools had been used primarily as load generators in order to test application responsiveness and stability, as well as to provide scalability information for capacity planning purposes. Today's browser has, however, come a long way and needs to manage considerable client-side logic, becoming the integration platform for increasingly complex web applications.
This forces load testing tools to take a browser-based approach, which improves the accuracy of simulation and improves tester productivity by simplifying scripting.

Another evolution is that traditional load testing capabilities for analyzing performance problems are rather limited, as they concentrate mainly on system-level resource bottlenecks such as overall CPU and memory usage. Applications under load-test conditions are traditionally treated as a black box, with little consideration given to their inner workings. Application performance diagnostic tools have not generally been used as part of the load testing process. However, the need to gather precise information about the dynamic behavior of the application has increased significantly in recent years. There are a number of reasons for this, one of which is that load testing is typically performed relatively late in the development cycle, hence the time window to fix problems is extremely short. Additionally, complex, distributed and increasingly heterogeneous application architectures make the root-cause analysis of application performance problems extremely difficult. Consequently, simple load generation and system monitoring fail to support these requirements. Additional tools are required: ones that enable in-depth diagnosis in load testing environments. [1]

Diagnostic Tools for Load Testing

Traditional development and operational monitoring tools also fail to provide this enhanced functionality. Development tools provide good insight into the application; however, they are limited to low-load scenarios and can only be used for applications that are isolated to a single server. Additionally, these tools are limited to a specific technology platform. Therefore neither profilers nor debuggers can be used to fulfill this requirement. Operational monitoring tools can be used when the application is under full load.
However, these tools are designed to provide only high-level, aggregated availability and performance metrics. Their primary goal is the monitoring of infrastructure resource utilization. Consequently, they fail to support root-cause analysis of application performance problems.

To gather the required performance and diagnostic information, a new solution is required. Such a solution must introduce minimum overhead on applications under full load in order to guarantee accurate, uncontaminated load testing results. At the same time, it must be able to precisely pinpoint the root cause of performance problems deep within application source code at runtime in order to enable rapid problem resolution. In the age of SOAs and highly distributed application architectures, these tools must likewise be able to trace requests across multiple servers and correlate this information in near real time. Finally, they must be able to work in a heterogeneous application ecosystem in which Java and .NET application components interoperate seamlessly.

In-Depth Performance Diagnosis Across the Application Life Cycle

In-depth performance diagnostics tools that can be used across the entire application life cycle provide the precise capabilities IT personnel are looking for to rapidly identify the root cause of problems discovered within a load test. These include response time, scalability or reliability issues. dynatrace, as part of the 360° Web Load Testing solution, delivers on this promise, as shown in Figure 1. It is designed to capture code-level performance and contextual data for each and every individual transaction (e.g., HTTP requests for a web page) using lightweight dynamic byte-code instrumentation (requiring no source code modification), even as applications operate under real-world user load levels.

Figure 1: Extending load testing with dynatrace

[1] Application Load Testing Report, Yankee Research Group 2005
Traditional development and operational monitoring tools fail to provide this kind of code-level visibility. A solution must be able to precisely track down a problem's root cause deep within source code under full load levels. It must also be an in-depth performance diagnostics solution that can be used across the entire application life cycle to rapidly identify the offending lines of code that have caused a performance or stability problem under load.

In today's complex application environments, this process is highly challenging. It requires in-depth knowledge of the inner workings of the application, and the means to trace problematic transactions down to the offending lines of code. To quickly and easily provide the necessary in-depth performance information about the application under test, dynatrace captures relevant performance metrics along with contextual information (SQL statements including bind values, log messages, exceptions, etc.) for all major standard Java and .NET application APIs (e.g., Web Service, Servlet, RMI, EJB, JDBC, JMS, ASP.NET, ADO.NET, etc.). It achieves this using pre-packaged KnowledgeSensorsPacks. Additionally, dynatrace automatically pinpoints the performance hot spots of each individual transaction using Auto Sensors, which monitor, immediately and without any manual configuration, the methods in which an application spends most of its time. Thus the code-level root cause of any performance problem is captured with the first load test, making problem resolution effective and reliable.

Improving the Communication Process

In addition to providing compelling technical capabilities, dynatrace also improves collaboration between all stakeholders involved in application development. Poor communication across departments is a frequently cited reason for the high costs and lengthy cycle times often required to resolve performance issues. Today, QA often detects performance problems and communicates them to developers.
Developers, however, cannot generally rely on ambiguous information such as "customer transaction X is slow" to uncover the root cause of a problem. They also may find it difficult or impossible to reproduce application failures on their development workstations. This is because development environments tend to be small, single-server environments, whereas staging environments are larger, distributed and/or clustered. A lot of effort is expended trying to reproduce the problems found by QA in production-like environments. This process is shown in Figure 2.

Figure 2: Classical performance problem-solving process
Detailed execution information is collected for each individual transaction path across heterogeneous, distributed application components. Although granular information is collected, the impact to both the application under load and the system under test is minimal. The relevant information is provided to developers and system architects in order to reconstruct the error directly from the monitoring data offline, without having to reproduce it on their systems. This reduces the error reproduction time to zero and speeds up the problem-analysis process significantly, as shown in Figure 3.

Figure 3: Improved performance-solving process

Poor communication between development and QA and the inability to easily reproduce performance problems on the developer's workstation result in very high costs for resolving performance problems. dynatrace provides an easy way to reconstruct problems offline, which significantly improves not only the communication process but also the entire problem diagnosis and resolution process.
A Comprehensive Triage Process for Web Applications

Efficient resolution of web application performance problems requires an integrated solution that enables clear communication between all stakeholders. This solution should also employ extensive automation to triage, diagnose and ultimately resolve performance problems by aligning the efforts of developers, QA and performance engineers, and system and application architects. The Gomez 360° Web Load Testing solution is based upon this very principle.

dynatrace provides an efficient, integrated problem-triage process that enables clear communication between all stakeholders and employs extensive automation to accelerate problem resolution.

QA and performance engineers use Gomez 360° Web Load Testing to quickly capture end-user transactions and execute accurate load tests against the entire web application delivery chain to uncover potential performance problems. These tests, which are run from outside the firewall from an end-user perspective, provide the most accurate view of user experience. At the same time, dynatrace PurePath Technology is used to trace each transaction; the sessions are recorded, providing in-depth code-level performance data for the target application. Application performance metrics such as end-to-end response time, time spent in various application components (e.g., the Java or .NET stack), and database actions are automatically correlated with the client-side load testing results. This information can be analyzed in real time while the load test is executing, as shown in Figure 4, which significantly speeds up performance tuning activities. This correlation provides a comprehensive overview of the behavior of the target application, allowing QA and performance engineers to easily triage any problems they uncover during the load test. This in turn enables them to quickly engage the appropriate personnel responsible for diagnosing and resolving the performance or stability problem.
To do that, the Gomez 360° solution helps QA and performance engineers to answer questions such as: What is the real end-user performance going to be from varying geographies and different end-user devices? What are the top resource- and/or time-consuming components in the application? Do the application components scale as expected? Is the problem caused by the application itself or is it located within the infrastructure of the web application delivery chain?

Figure 4: Real-time performance metrics using dynatrace
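The last of these triage questions, whether a slow transaction is caused by the application itself or by the delivery chain around it, can be illustrated with a minimal sketch. It assumes that client-side load test samples and server-side traces share a transaction identifier. All names here (`ClientSample`, `ServerTrace`, `triage`, the component labels and the sample data) are hypothetical and chosen for illustration only; none of them are part of the Gomez or dynatrace products.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ClientSample:
    """One client-side measurement from the load test (hypothetical shape)."""
    transaction_id: str
    region: str
    response_ms: float


@dataclass
class ServerTrace:
    """Server-side time per component for one transaction (hypothetical shape)."""
    transaction_id: str
    component_ms: Dict[str, float]


def triage(samples: List[ClientSample],
           traces: List[ServerTrace],
           threshold_ms: float) -> List[Tuple[str, str]]:
    """Return (transaction_id, suspected_owner) for every slow sample."""
    traces_by_id = {t.transaction_id: t for t in traces}
    findings = []
    for sample in samples:
        if sample.response_ms <= threshold_ms:
            continue  # fast enough, nothing to triage
        trace = traces_by_id.get(sample.transaction_id)
        if trace is None or not trace.component_ms:
            findings.append((sample.transaction_id, "no server trace"))
            continue
        server_total = sum(trace.component_ms.values())
        outside_app = sample.response_ms - server_total
        if outside_app > server_total:
            # Most of the time was spent outside the application servers.
            owner = "delivery chain"
        else:
            # Blame the component with the largest share of server time.
            owner = max(trace.component_ms, key=trace.component_ms.get)
        findings.append((sample.transaction_id, owner))
    return findings


samples = [
    ClientSample("t1", "us-east", 900.0),
    ClientSample("t2", "eu-west", 4000.0),
    ClientSample("t3", "apac", 5000.0),
]
traces = [
    ServerTrace("t2", {"servlet": 300.0, "jdbc": 3200.0}),
    ServerTrace("t3", {"servlet": 400.0, "jdbc": 600.0}),
]
findings = triage(samples, traces, threshold_ms=2000.0)
print(findings)
```

In this made-up data set, transaction `t2` spends most of its time in the database layer, while `t3` spends most of its time outside the application servers, so the two would be routed to different teams.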
Diagnosing the Problem's Root Cause for Rapid Resolution

Using analysis based on PurePath Technology, you can then rapidly diagnose the root cause of any performance and stability issue that has occurred during the web load test, including performance bottlenecks, functional errors, memory leaks, and synchronization and configuration problems. To do that, PurePath Technology records and visually maps the precise runtime execution path of each and every discrete transaction across heterogeneous and distributed application components down to code level, imparting only minimal performance overhead.

dynatrace PurePath Technology enables you to rapidly diagnose the root cause of performance and stability issues that occur during a load test, significantly accelerating problem resolution.

Whether the problem is a performance bottleneck, a functional error, a memory leak, or a synchronization or configuration problem, PurePath Technology helps you rapidly find answers to the most relevant questions: Which component is causing the problem (WHERE)? What is the root cause of the problem (WHY)? Unlike other tools, PurePath Technology introduces only a statistically insignificant performance overhead under load in order to avoid adversely influencing the behavior of the application, thus ensuring the accuracy of test results and diagnostic findings. Additionally, other solutions often provide only aggregated results that group all transaction information together, making it very difficult to isolate performance issues related to specific individual transactions. In contrast, PurePath Technology can identify tricky problems such as memory leaks, synchronization issues, inadequately configured frameworks, excessive SQL calls to the database, and excessive remoting/messaging in the context of a single transaction across servers in heterogeneous Java and .NET environments.
PurePath results contain not only pure performance metrics (e.g., response times, CPU usage), but also contextual information (e.g., memory usage, method arguments, exceptions, log events, I/O usage, SQL statements, synchronization delays, etc.) in order to enable a precise root-cause analysis. Its ability to trace discrete transactions across heterogeneous distributed applications at production-safe overhead levels makes PurePath unique.

Figure 5: In-depth drill-down for heterogeneous distributed applications using dynatrace PurePath technology
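The difference between aggregated results and per-transaction results can be shown with a small sketch. The record layout (`id`, `response_ms`, `sql_calls`), the function names and the numbers are assumptions made up for this illustration; they do not reflect the PurePath data format.

```python
from statistics import mean


def aggregate_view(transactions):
    """Average response time across all transactions, as an aggregated report would show."""
    return mean(t["response_ms"] for t in transactions)


def per_transaction_view(transactions, max_ms, max_sql_calls):
    """IDs of individual transactions that are both slow and issue excessive SQL calls."""
    return [
        t["id"]
        for t in transactions
        if t["response_ms"] > max_ms and t["sql_calls"] > max_sql_calls
    ]


# Nine well-behaved transactions and one outlier (hypothetical data).
transactions = [
    {"id": f"tx-{i}", "response_ms": 200.0, "sql_calls": 3} for i in range(9)
]
transactions.append({"id": "tx-9", "response_ms": 3800.0, "sql_calls": 250})

average_ms = aggregate_view(transactions)
offenders = per_transaction_view(transactions, max_ms=2000.0, max_sql_calls=50)
print(average_ms, offenders)
```

The average over these ten transactions looks acceptable, while the per-transaction view singles out the one request that issues hundreds of SQL calls. That is the kind of problem an aggregated report tends to hide.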
Conclusion

The Gomez 360° Web Load Testing solution provides enormous benefits to your software delivery process by ensuring that you find all the problems that affect end users. Plus, the solution enhances communication between development and test/QA as they collaborate to quickly resolve performance and stability problems.

Testers and performance engineers can easily and automatically document the root cause of performance and stability issues down to the offending line of code, even in complex, highly distributed applications. They require no programming knowledge or access to source code to deliver an offline diagnostic report that allows development to efficiently resolve issues.

Developers, architects and database engineers become more productive in resolving performance and stability issues that occur under load. The time-intensive, tedious and error-prone process of manually correlating log files from different back-end systems with the client-side load testing results becomes obsolete, thus significantly reducing problem resolution times. Additionally, developers can now easily specify and configure the desired in-depth monitoring requirements for their application prior to the load test. These configurations are then automatically applied to the testing environment.

Gomez 360° Web Load Testing creates repeatable value by: providing the most accurate picture of end-user performance available; analyzing and optimizing application performance before deployment, reducing often catastrophic, costly periods of application downtime in production; reducing the resolution times of performance and stability issues by enabling test/QA to rapidly triage problems and provide developers with timely, accurate and relevant diagnosis information down to the offending lines of code; and automatically reducing the time required by development teams to isolate performance issues and understand who owns them.
To learn more about Gomez, visit: compuware.com/apm

Compuware Corporation, the technology performance company, provides software, experts and best practices to ensure technology works well and delivers value. Compuware solutions make the world's most important technologies perform at their best for leading organizations worldwide, including 46 of the top 50 Fortune 500 companies and 12 of the top 20 most visited U.S. web sites. Learn more at: compuware.com.

Compuware Corporation World Headquarters, One Campus Martius, Detroit, MI 48226-5099

© 2012 Compuware Corporation. Compuware products and services listed within are trademarks or registered trademarks of Compuware Corporation. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

1.16.12 20438JP