Automatically Finding Performance Problems with Feedback-Directed Learning Software Testing

Similar documents

The Theory And Practice of Testing Software Applications For Cloud Computing. Mark Grechanik University of Illinois at Chicago

Whitepaper: performance of SqlBulkCopy

Testing Automation for Distributed Applications By Isabel Drost-Fromm, Software Engineer, Elastic

MapReduce. MapReduce and SQL Injections. CS 3200 Final Lecture. Introduction. MapReduce. Programming Model. Example

How to Plan a Successful Load Testing Programme for today s websites

MAGENTO HOSTING Progressive Server Performance Improvements

Scalability Factors of JMeter In Performance Testing Projects

How To Handle Big Data With A Data Scientist

The Evolution of Load Testing. Why Gomez 360 o Web Load Testing Is a

Ce document a été téléchargé depuis le site de Precilog. - Services de test SOA, - Intégration de solutions de test.

The Scientific Data Mining Process

THE 2014 THREAT DETECTION CHECKLIST. Six ways to tell a criminal from a customer.

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Load Testing on Web Application using Automated Testing Tool: Load Complete

B M C S O F T W A R E, I N C. BASIC BEST PRACTICES. Ross Cochran Principal SW Consultant

Muse Server Sizing. 18 June Document Version Muse

A New Era Of Analytic

Taking the First Steps in. Web Load Testing. Telerik

Big Data Processing with Google s MapReduce. Alexandru Costan

Lab 2 : Basic File Server. Introduction

SPATIAL DATA CLASSIFICATION AND DATA MINING

Big Insights from Little Data: A Spotlight on Unlocking Insights from the Log Data That Matters

The Big Data Paradigm Shift. Insight Through Automation

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

VDI FIT and VDI UX: Composite Metrics Track Good, Fair, Poor Desktop Performance

TESTING AND OPTIMIZING WEB APPLICATION S PERFORMANCE AQA CASE STUDY

WhatsUp Gold v11 Features Overview

Testing Rails. by Josh Steiner. thoughtbot

Web Performance, Inc. Testing Services Sample Performance Analysis

RTI v3.3 Lightweight Deep Diagnostics for LoadRunner

Parallel Analysis and Visualization on Cray Compute Node Linux

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Principles of Continuous Integration

Monitoring Best Practices for

Data Warehouse and Business Intelligence Testing: Challenges, Best Practices & the Solution

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Programa de Actualización Profesional ACTI Oracle Database 11g: SQL Tuning Workshop

Zend and IBM: Bringing the power of PHP applications to the enterprise

Intelligent Heuristic Construction with Active Learning

MONyog White Paper. Webyog

Testing Big data is one of the biggest

Mobile Performance Testing Approaches and Challenges

Top 10 reasons your ecommerce site will fail during peak periods

A Performance Engineering Story

Monitoring Best Practices for COMMERCE

Intelligent Monitoring Software. IMZ-RS400 Series IMZ-RS401 IMZ-RS404 IMZ-RS409 IMZ-RS416 IMZ-RS432

The Benefits of Deployment Automation

D5.3.2b Automatic Rigorous Testing Components

Oracle Database 12c Plug In. Switch On. Get SMART.

Peach Fuzzer Platform

Chronon: A modern alternative to Log Files

Networking in the Hadoop Cluster

Metatrader 4 Tutorial

Product Review: James F. Koopmann Pine Horse, Inc. Quest Software s Foglight Performance Analysis for Oracle

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

The Methodology Behind the Dell SQL Server Advisor Tool

Data Refinery with Big Data Aspects

Case Study: Load Testing and Tuning to Improve SharePoint Website Performance

Choosing the Right Way of Migrating MySQL Databases

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

TECHNOLOGY WHITE PAPER. Application Performance Management. Introduction to Adaptive Instrumentation with VERITAS Indepth for J2EE

THE BUSY DEVELOPER'S GUIDE TO JVM TROUBLESHOOTING

25 May Code 3C3 Peeling the Layers of the 'Performance Onion John Murphy, Andrew Lee and Liam Murphy

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

About me - Joel Montvelisky

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

3 Examples of Reliability Testing. Dan Downing, VP Testing Services MENTORA GROUP

Course 55144B: SQL Server 2014 Performance Tuning and Optimization

Exploring the Synergistic Relationships Between BPC, BW and HANA

Monitoring applications in multitier environment. Uroš Majcen A New View on Application Management.

The Easiest Way to Run Spark Jobs. How-To Guide

Evaluation of Open Source Data Cleaning Tools: Open Refine and Data Wrangler

New Constructions and Practical Applications for Private Stream Searching (Extended Abstract)

Cloud computing is a marketing term that means different things to different people. In this presentation, we look at the pros and cons of using

Toad for Oracle 8.6 SQL Tuning

WHAT WE NEED TO START THE PERFORMANCE TESTING?

Bringing Value to the Organization with Performance Testing

How To Test A Web Server

EMC SOLUTION FOR SPLUNK

Installing and Running the Google App Engine On Windows

ISTQB Certified Tester. Foundation Level. Sample Exam 1

Continuous Integration (CI) for Mobile Applications

Transcription:

Automatically Finding Performance Problems with Feedback-Directed Learning Software Testing Mark Grechanik University of Illinois at Chicago Chen Fu and Qing Xie Accenture Technology Lab

Good Performance Is Important

Excellent Performance Is Even More Important!

Finding Performance Problems A goal is to find situations when applications unexpectedly exhibit worsened characteristics for certain combinations of input values.

Finding This Situation Is Like Getting a Perfect Hand!

Example: Bottlenecks A slow component of the application A C B Bottlenecks or hot spots are phenomena where the performance of the application is limited by one or few components

Bottlenecks in Nontrivial Applications Inputs Output

Performance Testing Applications Client calls Server methods Server Server interacts with back-end data storages Data base The Client executes methods of the Server using different combination of input parameter values

Performance Testing Applications Client receives data from Server Server Server receives data from Database Data base Performance can be measured in the number of roundtrips per time unit

Automating Performance Testing Applications Client calls Server methods and receives data Server Server interacts with back-end data storages Data base Automated test scripts simulate clients who perform transactions against the Server, and they use input data from the backend database

Automating Performance Testing Applications Server Server interacts with back-end data storages Data base Automated test scripts simulate clients who perform transactions against the Server, and they use input data from the backend database

A Problem With Performance Testing of Large Applications A medium-size renters insurance application has a large universe of input data. For 78,000,000 customer profiles it would take close to 1,500 years to test this application on all profile inputs provided that it takes only 10mins per input. Just touching one customer profile entry for 1sec, it takes about 2.5 years to touch them all! Some customer profiles may lead to significant load on different resources. 12

A Problem With Performance Testing of Large Applications How to select a manageable subset of the input data without compromising the effectiveness of performance testing. We need an insight into selecting this subset!

An Example Most of the renters are good customers, have hardly any accident/claims or fraudulent activities. That s how insurance companies make money. Unfortunately, the data of these good customers are not always good for performance testing purpose. Randomly selecting customer profiles does not work well keep encountering good customers. Yet, exploratory performance testing is one of the main ways of performance testing applications in industry! 14

The Intuition Behind Performance Testing Trigger more computation More data to process in programs Bigger executing footprint Bigger database footprint Trigger more database queries

Core Idea Select only these customer profiles that trigger intensive computations. These computations most likely involve significant interactions with databases. 16

The State of the Practice Intuitive testing is a method for testers to exercise the product based on their intuition and experience, surmising probable errors. Intuitive testing dates back to 1979. Use experience of test engineers to focus on error-prone and relevant system functions without writing time-consuming performance test specifications. Intuitive testing cannot cope with demands of performance testing. Frequently, guesses by test engineers and system administrators are wrong. Typically, performance degradations of up to 20% go unnoticed.

Performance Testing Is Laborious and Difficult

But, what if I wish I knew the rules that concisely expressed properties of the customer profiles that triggered intensive computations!

We Need Rules!

We Need Rules! Descriptive rules for selecting test input data play a significant role in software testing, where these rules approximate the functionality of the application. Example rule: some customers will pose a high insurance risk if these customers have one or more prior insurance fraud convictions and deadbolt locks are not installed on their premises. Bottleneck

We Need Rules! Descriptive rules for selecting test input data play a significant role in software testing, where these rules approximate the functionality of the application. Example rule: some customers will pose a high insurance risk if these customers have one or more prior insurance fraud convictions and deadbolt locks are not installed on their premises. Bottleneck

Our Solution - FOREPOST Feedback-ORiEnted PerfOrmance Software Testing (FOREPOST) finds performance problems automatically by learning and using rules that describe classes of input data that lead to intensive computations. FOREPOST is an adaptive, feedback-directed learning testing system that learns rules from execution traces and uses these learned rules to select test input data automatically to find more performance problems in applications when compared to exploratory random performance testing. FOREPOST is like a heat-seeking solution!

Seeking Hot Spots

Components of FOREPOST Adaptive Test Script Runtime Monitoring Machine Learning 1 2 3 Information Compression 4 Bottleneck Detection Algorithm 5 Adaptive test script uses rules that are issued by machine learning components to select input data that lead to more intensive computations. When the test script executes the application, its execution traces are collected as part of runtime monitoring, leading to (re)learning rules.

Collect Execution Traces Test Data Database Name User SSN JVM h( 354-78-2648,1100, f(5, g() user ) Don ) 28

Execution Trace Entries Method entry and exit Every time a method is called, a corresponding entry is put into a file where execution traces are stored Every time a method call is finished, a corresponding entry is put into a profile file Database Data When an application sends data to and receives data from the database, this data is captured and put into the execution trace file Input parameters Capture choices of inputs that users provides 29

Execution Trace Sample MTDENT_ _WebContainer : 0_ _21088404309_ _statefarm/framework/controllerw0089701/appcoordination/aspurlfilter _ _getrequest(ljavax/servlet/servletrequest;)ljavax/servlet/http/httpservletrequest;_ _ MTDENT_ _WebContainer : 0_ _21236207529_ _statefarm/framework/jcaimsframeworkw0099456/msg_ _ addstringtobytearray(i[bljava/lang/string;)v_ _ MTDRET_ _WebContainer : 0_ _21236207566_ _statefarm/framework/jcaimsframeworkw0099456/msg_ _ addstringtobytearray(i[bljava/lang/string;)v_ _ MTDRET_ _WebContainer : 0_ _21088404359_ _statefarm/framework/controllerw0089701/appcoordination/aspurlfilter _ _getrequest(ljavax/servlet/servletrequest;)ljavax/servlet/http/httpservletrequest;_ _ REQ_ _WebContainer : 0_ _21208103665_ _wateravailableyearround={n+} CTNextPage={unprotectedDwelling+}visibleFromRoadOrNeighborUnselected={+}command={navigate+} propertycity={hastings+}firedeptwithin10unselected={+}nextpagefocuselement={+} propertycountyunselected={+}directionstolocation={+}sortcolumn={+}street2={+}street1={63 JACK PL+} insidecitylimit={y+}tf:rn={3+}firedeptwithin10={n+}insidecitylimitunselected={+}returncommand={+} requestid={v1ky805mv05+}wateravailableyearroundunselected={+}tf:id={controllerframework:0+} nextpage={unprotecteddwelling+}currentpage={location+}selectedentityid={+}ctcurrentpage={location+} accessibletofiredeptunselected={+}clearerrorlist={+}selectedcontainerid={+}propertyzipcode={550331097+ }accessibletofiredept={n+}propertycounty={dakota+}returnpage={current+} visiblefromroadorneighbor={+}returnpagefocuselement={unprotecteddwellinglink+}_ _ IMSRET_ _WebContainer : 0_ _21238137181_ _331617041_ _APUN7N_ _112FE009Z1 081V1KY805MV05082abcd083 AGENT_APPLICATION09505092GNRT_RATE_ZONES 041042APPLY_FOR_POLICY041042AGRE041042AGREEMENT041 042INSURED_LOCATION_ _010Z1081V1KY805MV050840_ _

Some Statistics It takes about 10mins to execute the instrumented renters app end-2-end automatically using a test script. An average run invokes over 3,000 methods more than 3,000,000 times and it retrieves over 1Mb of data. An average execution traces contains over 1Gb of data. It takes less than 10mins on average to process and analyze one trace.

Constructing Data For a Learner and Learning Rules Input_1_Name Input_2_Name Input_k_Name Class Value_p Value_q Value_m Good Value_p Value_q Value_m Bad Value_p Value_q Value_m Good

Finding Bottlenecks A slow component of the application A C B Bottlenecks or hot spots are phenomena where the performance of the application is limited by one or few components

Finding Bottlenecks

Cluster Execution Traces We want to generalize a set of execution traces, compress information, and find common patterns All execution traces in the same set share some common properties It is much easier to find differences between few patterns than between billions of pairs of execution trace entries These differences should tell us what properties are specific to execution traces in the same set

Intuition There are some patterns in profiles that are repeated for different executions, for example, logging, database accesses These patterns can be thought of as components that correspond to implementations of some requirements We should reduce many different method calls in different profiles to a small set of components that offer us some insight into what methods are important for meaningful tests

It Is A Difficult Problem It s like trying to locate aliens in the universe! 37

Too Much Data To Process We have too many observations and dimensions Many inputs, key/values, method invocations to reason about or obtain insights from Too much noise in the data Too many dimensions for humans to deal with think requirements Need to reduce them to a smaller set of factors Better representation of data without losing much information Build more effective data analyses on the reduced-dimensional space: classification, clustering, pattern recognition 38

FOREPOST Architecture Profiler Test Script Rules Learner Good test traces Method and data statistics ICA Advisor Bad test traces Method weights Execution Trace Analyzer Trace Statistics Trace Clustering

A Highlight Of Our Results

The Method checkwildfirearea This method is instrumental in computing quotes for the Renters application Thus this method is important for testing With FOREPOST, this method is selected as the top of 30 from over 3,000 methods However, with the state MN this method does not affect quote computation Yet FOREPOST identified this method is important. Why?

Discovery Of a Problem FOREPOST automatically selected the method checkwildfirearea as important because its weight is significant in interesting profiles. It means that this method is invoked many times for the state MN even though its functional contribution is zero for this state. Invoking this method consumes resources and time for this state. It took few hours and several developers to uncover this information based on the results of running DATE. When we reported it developers thought that we made a mistake since this method was not supposed to execute on MN.

Result for Renters App

Result for JPetStore

Conclusions This proposed research program is novel, as to the best of our knowledge, there exists little but a growing research that addresses the problem of the controlled release of sensitive information that balances privacy and software engineering tasks. The results of this work will be a foundation for a new direction in requirements engineering, program comprehension, globally distributed software development, maintenance, evolution, and testing supported by a set of tools for low-cost automated software engineering tasks that consider software privacy issues.

Email: drmark@uic.edu