Automated Analysis of the Results of Performance Regression Tests

Similar documents
TRACE PERFORMANCE TESTING APPROACH. Overview. Approach. Flow. Attributes

Table 1: Comparison between performance regression detection and APM tools

Automated Verification of Load Tests Using Control Charts

Hardware Performance Optimization and Tuning. Presenter: Tom Arakelian Assistant: Guy Ingalls

A Framework for Measurement Based Performance Modeling

Understand Performance Monitoring

Successful Factors for Performance Testing Projects. NaveenKumar Namachivayam - Founder - QAInsights

Throughput Capacity Planning and Application Saturation

A Survey on Load Testing of Large-Scale Software Systems

Load Testing on Web Application using Automated Testing Tool: Load Complete

A Framework for Enterprise IT Capacity Management

Predictive Analytics And IT Service Management

Load Testing and Monitoring Web Applications in a Windows Environment

Predictive Analytics And IT Service Management

System Requirements for Microsoft Dynamics NAV 2009 SP1

Application of Predictive Analytics for Better Alignment of Business and IT

Leveraging Performance Counters and Execution Logs to Diagnose Memory-Related Performance Issues

Fingerprinting the datacenter: automated classification of performance crises

Advanced analytics at your hands

Testing and optimizing Web Performance on the EPiServer platform Excellence in Web Performance

Performance TesTing expertise in case studies a Q & ing T es T

Oracle Database 12c: Performance Management and Tuning NEW

Sitecore Health. Christopher Wojciech. netzkern AG. Sitecore User Group Conference 2015

ROCANA WHITEPAPER How to Investigate an Infrastructure Performance Problem

Keywords: Load testing, testing tools, test script, Open-source Software, web applications.

Estimate Performance and Capacity Requirements for Workflow in SharePoint Server 2010

Condusiv s V-locity Server Boosts Performance of SQL Server 2012 by 55%

Performance Modeling for Web based J2EE and.net Applications

Performance Test Report For OpenCRM. Submitted By: Softsmith Infotech.

Audit & Tune Deliverables

Case Study - I. Industry: Social Networking Website Technology : J2EE AJAX, Spring, MySQL, Weblogic, Windows Server 2008.

Dell Active Administrator 8.0

Best Practices for Web Application Load Testing

Web Application s Performance Testing

Anomaly detection. Problem motivation. Machine Learning

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Datasheet FUJITSU Software ServerView Cloud Monitoring Manager V1.0

Overview. NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT

IP Forwarding Anomalies and Improving their Detection using Multiple Data Sources

Cloud Based Application Architectures using Smart Computing

Top Purchase Considerations for Virtualization Management

Monitor and Manage Your MicroStrategy BI Environment Using Enterprise Manager and Health Center

Response Time Analysis

3 Examples of Reliability Testing. Dan Downing, VP Testing Services MENTORA GROUP

VDI FIT and VDI UX: Composite Metrics Track Good, Fair, Poor Desktop Performance

Hands-On Microsoft Windows Server 2008

Security Toolsets for ISP Defense

MIMIC Simulator helps testing of Business Service Management Products

Load Testing Analysis Services Gerhard Brückl

Oracle Enterprise Manager 12c New Capabilities for the DBA. Charlie Garry, Director, Product Management Oracle Server Technologies

Deep Dive: Maximizing EC2 & EBS Performance

theguard! ApplicationManager System Windows Data Collector

Biometrics Workshop. The evolution of large-scale biometric architecture. Facilitators. Mark Crego, Accenture Mike Matyas, Mount Airey Group

Performance Monitoring with Dynamic Management Views

System Management and Operation for Cloud Computing Systems

Load Testing Tools. Animesh Das

Company & Solution Profile

FIGURE Selecting properties for the event log.

Capacity Management for Oracle Database Machine Exadata v2

XpoLog Center Suite Log Management & Analysis platform

U.S. Navy Automated Software Testing

Monitoring Best Practices for

WHAT WE NEED TO START THE PERFORMANCE TESTING?

Website Performance Analysis Based on Component Load Testing: A Review

Case Study I: A Database Service

Automating Big Data Benchmarking for Different Architectures with ALOJA

Mike Chyi, Micro Focus Solution Consultant May 12, 2010

Bright Idea: GE s Storage Performance Best Practices Brian W. Walker

Table of Contents. Chapter 1: Introduction. Chapter 2: Getting Started. Chapter 3: Standard Functionality. Chapter 4: Module Descriptions

System Behavior Analysis by Machine Learning

Monitoring Best Practices for COMMERCE

Microsoft Dynamics NAV 2013 R2 Sizing Guidelines for On-Premises Single Tenant Deployments

Web Server Software Architectures

SP Apps Performance test Test report. 2012/10 Mai Au

Appendix M INFORMATION TECHNOLOGY (IT) YOUTH APPRENTICESHIP

Response Time Analysis

Enterprise Applications in the Cloud: Non-virtualized Deployment

Web Load Stress Testing

Tableau Server 7.0 scalability

Transcription:

Automated Analysis of the Results of Performance Regression Tests

Zhen Ming (Jack) Jiang Assistant Professor, Department of Electrical Engineering and Computer Science, Lassonde School of Engineering, York University. Toronto, ON, Canada. Head of Software Construction, AnaLytics and Evaluation (SCALE) Lab Co-founder and co-organizer for the International Workshop on Large-Scale Testing (LT) Research Interests: Software Performance Engineering Mining Software Engineering Data Software Construction, AnaLysis and Evaluation (SCALE) Lab LT 2012 present

Most field problems for large scale systems are rarely functional instead they are load-related

Software Testing (Functional) Regression Testing Unit Testing System Integration Testing Non-Functional Testing Performance Regression Testing

Performance Regression Testing Mimics multiple users repeatedly performing the same tasks Take hours or even days Produces GB/TB of data that must be analyzed

Is the system ready for release?

Data Cleanup/ Abstraction Model Derivation Past Test(s) Current Test Data Cleanup/ Abstraction Test Validation

Studied Systems

Execution Logs

Automated Functional Analysis (E2, E3) are always together: (acquire_lock, release_lock) (open_inbox, close_inbox) If we see (E2, E6) this might be a problem E1 E2 E3 E4 E1 E2 E3 E4 E1 E2 E3 E4 E1 E2 E6 E4

Dell DVD Store 99.99% reduction in viewed log lines

Automated Performance Analysis Each load test has periods of peak/light load Fluctuations of response time Test # 1 Test # 2 Test # 3 Same workload applied across different tests Similar response time distributions

Performance Analysis Report - Visual Comparison Distribution Evolution

Performance Analysis Report - Step-wise Performance Diagnosis Stepwise We discovered a MySQL performance bug

Performance Counters

[Nguyen et al., ICPE 2012] Deriving Performance Ranges Using Control Charts Normal Operations Suspicious Test Derive control charts from the past good tests Flag new tests as anomalous if there are many violations in the control charts

Deriving Frequent Itemset from Past Test(s) Time DB read/sec Throughput Request Queue Size 10:00 Medium Medium Low 10:03 Medium Medium Low 10:06 Low Medium Medium 10:09 Medium Medium Low 10:12 Medium Medium Low 10:15 Medium Medium Low

Deriving Frequent Itemset from Past Test(s) Time DB read/sec Throughput Request Queue Size 10:00 Medium Medium Low 10:03 Medium Medium Low 10:06 Low Medium Medium 10:09 Medium Medium Low 10:12 Medium Medium Low 10:15 Medium Medium Low DB read/sec Medium Throughput Medium Request Queue size Low

Deriving Frequent Itemset from Past Test(s) DB read/sec Medium Throughput Medium Request Queue size Low DB read/sec Medium DB read/sec Medium Request Queue Size Low Throughput Medium Request Queue Size Low Throughput Medium Request Queue size Low Throughput Medium DB read/sec Medium

Automated Derivation of Performance Rules We have derived from the past test(s) a set of performance rules: DB read/sec Medium Throughput Medium Request Queue size Low Premise Consequence Flags tests where the rule does not hold DB read/sec Medium Throughput Medium Request Queue size High

Counter Analysis Report

Counter Analysis Report - Examine the Anomalous Period

Deriving Transaction Profiles Using Queuing Theory Every transaction visits various resources (e.g., CPU and disk) The complete series of service demands for one transaction is known as the Transaction Profile (TP) TP for transaction type A TP should be stable across different tests. Otherwise, there is an anomaly [Ghaith et al., CSMR 2013, STVR 2015]

Correlating Logs and Counters

Combining Logs and Counters 00:01, USER starts a conversation with USER 00:01, USER says MSG to USER 00:02, USER says MSG to USER 00:11, USER says MSG to USER 00:12, USER says MSG to USER 00:18, USER ends a conversation with USER 00:00, 5MB 00:08, 15MB 00:16, 15MB 00:24, 5MB

Memory (MB) Diagnosing Memory-Related Problems 20 15 10 5 0 00:00 00:08 00:16 00:24 Time

Creating A Research Community

Large-Scale Testing includes all different objectives and strategies of testing large-scale software systems using load. Examples of large-scale testing include live upgrade testing, load testing, high availability testing, operational profile testing, performance testing, reliability testing, stability testing and stress testing. Workshop: http://lt2016.eecs.yorku.ca/ Contact: zmjiang@cse.yorku.ca