Integrating Artificial Intelligence. Software Testing



Similar documents
Exploring the Duality in Conflict-Directed Model-Based Diagnosis

Measuring the Performance of an Agent

Tuesday, October 18. Configuration Management (Version Control)

Safe robot motion planning in dynamic, uncertain environments

POMPDs Make Better Hackers: Accounting for Uncertainty in Penetration Testing. By: Chris Abbott

Applied Software Project Management

Module 10. Coding and Testing. Version 2 CSE IIT, Kharagpur

Towards Model-Based Diagnosis of Coordination Failures

Guideline for stresstest Page 1 of 6. Stress test

Simulation and Risk Analysis

Diagnosis of Simple Temporal Networks

What Is Specific in Load Testing?

Testing Automation for Distributed Applications By Isabel Drost-Fromm, Software Engineer, Elastic

Model-based Synthesis. Tony O Hagan

Model Based Diagnosis in ITS for Programming with Pedagogical Patterns

Neural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation

CSC2420 Fall 2012: Algorithm Design, Analysis and Theory

Software Configuration Management

Reinforcement Learning

Change Request Process Overview

The Association of System Performance Professionals

Teaching Introductory Artificial Intelligence with Pac-Man

Reinforcement Learning

Confirmation Bias as a Human Aspect in Software Engineering

Copyright. Network and Protocol Simulation. What is simulation? What is simulation? What is simulation? What is simulation?

Towards better accuracy for Spam predictions

Pervasive Diagnosis: Integration of Active Diagnosis into Production Plans

Once upon a product cycle, there were four testers who set. Intelligent Test Automation. Tester1. Tools & Automation. out on a quest to test software.

Intelligent and Automated Software Testing Methods Classification

Chapter 2: Intelligent Agents

Automatic Gameplay Testing for Message Passing Architectures

ISTQB Certified Tester. Foundation Level. Sample Exam 1

Why Study NP- hardness. NP Hardness/Completeness Overview. P and NP. Scaling 9/3/13. Ron Parr CPS 570. NP hardness is not an AI topic

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

University of Portsmouth PORTSMOUTH Hants UNITED KINGDOM PO1 2UP

Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data

Machine Learning. CS 188: Artificial Intelligence Naïve Bayes. Example: Digit Recognition. Other Classification Tasks

System Copy GT Manual 1.8 Last update: 2015/07/13 Basis Technologies

1.1 Difficulty in Fault Localization in Large-Scale Computing Systems

Procedia Computer Science 00 (2012) Trieu Minh Nhut Le, Jinli Cao, and Zhen He. trieule@sgu.edu.vn, j.cao@latrobe.edu.au, z.he@latrobe.edu.

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem

LECTURE 4. Last time: Lecture outline

Introduction to Automated Testing

Experimental Comparison of Concolic and Random Testing for Java Card Applets

Decision-Theoretic Troubleshooting: A Framework for Repair and Experiment

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS

NEURAL networks [5] are universal approximators [6]. It

ASSURE: Automatic Software Self- Healing Using REscue points. Automatically Patching Errors in Deployed Software (ClearView)

CS/COE

Introduction to ios Testing

Search based Software Testing Technique for Structural Test Case Generation

Topic 2: Structure of Knowledge-Based Systems

Lecture 1: Introduction to Reinforcement Learning

Evaluating and Comparing the Impact of Software Faults on Web Servers

Online Algorithms: Learning & Optimization with No Regret.

Testing Lifecycle: Don t be a fool, use a proper tool.

A Bayesian Approach for on-line max auditing of Dynamic Statistical Databases

RECIPE: a prototype for Internet-based real-time collaborative programming

Fault Localization in a Software Project using Back- Tracking Principles of Matrix Dependency

Using Markov Decision Processes to Solve a Portfolio Allocation Problem

Quality Meets the CEO

Model for dynamic website optimization

2IOE0 Interactive Intelligent Systems

Testing Rails. by Josh Steiner. thoughtbot

Improving Knowledge-Based System Performance by Reordering Rule Sequences

cs171 HW 1 - Solutions

Predictive Models for Min-Entropy Estimation

Runtime Hardware Reconfiguration using Machine Learning

Machine Learning and Statistics: What s the Connection?

Software Metrics. Lord Kelvin, a physicist. George Miller, a psychologist

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, Abstract. Review session.

How to Plan a Successful Load Testing Programme for today s websites

Psychological Questionnaires and Kernel Extension

Dynamic programming formulation

Tracking Groups of Pedestrians in Video Sequences

Co-Presented by Mr. Bill Rinko-Gay and Dr. Constantin Stanca 9/28/2011

CLIC: CLient-Informed Caching for Storage Servers

Oracle Insurance Policy Administration System Quality Assurance Testing Methodology. An Oracle White Paper August 2008

Change Impact Analysis

CS MidTerm Exam 4/1/2004 Name: KEY. Page Max Score Total 139

APPROACHES TO SOFTWARE TESTING PROGRAM VERIFICATION AND VALIDATION

Custom Web Development Guidelines

Introduction to Learning & Decision Trees

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

Coverity White Paper. Effective Management of Static Analysis Vulnerabilities and Defects

Part III: Machine Learning. CS 188: Artificial Intelligence. Machine Learning This Set of Slides. Parameter Estimation. Estimation: Smoothing

Applied Algorithm Design Lecture 5

Informatics 2D Reasoning and Agents Semester 2,

A Vague Improved Markov Model Approach for Web Page Prediction

Search Algorithm in Software Testing and Debugging

LOGICAL COMPONENT CONCEPT... 1 CONTENTS... 1 CUSTOMER SCENARIO... 2 SYSTEM LANDSCAPE... 2

Globally Optimal Crowdsourcing Quality Management

Context-Aware Route Planning

Moving Target Search. 204 Automated Reasoning

Identification and Analysis of Combined Quality Assurance Approaches

On the Importance of Thread Placement on Multicore Architectures

Obfuscation of sensitive data in network flows 1

Module 10. Coding and Testing. Version 2 CSE IIT, Kharagpur

QA Analysis of the WRF Program

Transcription:

Integrating Artificial Intelligence in Software Testing Roni Stern and Meir Kalech, ISE department, BGU Niv Gafni, Yair Ofir and Eliav Ben-Zaken, Software Eng., BGU 1

Abstract Artificial Intelligence Planning Diagnosisi Software Engineering Testing 2

Testing Planning Diagnosis 3

Testing is Important Quite a few software development paradigms All of them recognize the importance of testing There are several types of testing E.g., unit testing, black-box b (functional) testing. Programmer Write programs Follows spec. Goal: fix bugs! Tester Runs tests Follows a test plan Goal: find bugs! 4

Handling a Bug 1. Bugs are reported by the tester - Screen A crashed on step 13 of test suite 4 2. Prioritized bugs are assigned to programmers 3. Bugs are diagnosed - What caused the bug? 4. Bugs are fixed - Hopefully 5

Why is Debugging Hard? The developer needs to reproduce the bug Reproducing (correctly) a bug is non-trivial Bug reports may be inaccurate or missing Software may have stochastic elements Bug occurs only once in a while Bugs may appear only in special cases Let the tester provide more information! 6

7

Introducing AI! Te ester Run a test suite Find a bug mer Pr rogram Diagnose Fix the bug Tester Run a test suite Find a bug AI Compute Diagnoses Plan next tests Program mmer Fix the bug 8

Test, DiagnoseandPlan(TDP) and Tester Run a test suite Find a bug AI Compute Diagnoses Plan next testst Progra ammer Fix the bug 1. The tester runs a set of planned tests (test suite) 2. The tester finds a bug 3. AI generates possible diagnoses 4. If there is only one candidate pass to programmer 5. Else, AI plans new tests to prune candidates 9

Diagnosis Find the reason of a problem given observed symptoms Requirement: knowledge of the diagnosed system Given by experts Learned by AI techniques 10

Model-Based Diagnosis Given: a model of how the system works Infer f what went wrong Example: IF ok(battery) AND ok(ignition) THEN start(engine) What if the engine doesn t start? 11

Where is MBD applied? Printers (Kuhn and de Kleer 2008) Vehicles (Pernestål et. al. 2006) Robotics (Steinbauer 2009) Spacecraft - Livingstone (Williams and Nayak, 1996; Bernard et al., 1998). 12

Software is Difficult to Model Need to code how the software should behave Specs are complicated to model Static code analysis ignores runtime Polymorphism Instrumentation 13

Zoltar [Abrue et. al. 2011 ] Construct a model from the observations Observations should include execution traces Functions visited during execution Observed outcome: bug / no bug Weaker model Ok(function1) function1 outputs valid value A bug entails that at least one comp was not Ok 14

Execution Matrix A key component in Zoltar is the execution matrix It is built from the observed execution traces Observation 1 (BUG) : F1 F5 F6 Observation 2 (BUG): F2 F5 Observation 3 (OK) : F2 F6 F7 F8 F1 F2 F3 F4 F5 F6 F7 F8 Bug Obs1 1 0 0 0 1 1 0 0 1 Obs2 0 1 0 0 1 0 0 0 1 Obs3 0 1 0 0 0 1 1 1 0 15

Diagnosis = Hitting Sets of Conflicts In every BUG trace at least one function is faulty Observations: Observation 1 (BUG) : F1 F5 F6 Observation 2 (BUG) : F2 F5 Conflicts: Ok(F1) AND Ok(F5) AND Ok(F6) Ok(F2) ANDOk(F5) Possible diagnoses: {F5},{F1,F2},{F6,F2} Hitting sets of Conflicts 16

Software Diagnosis with Zoltar Bug Report Pi Prioritize iti Diagnoses Zoltar Set of Candidate Diagnoses 17

Plan More Tests Bug Report AIEngine Set of Possible Diagnoses Suggest New Test to Prune Candidates 18

Plan More Tests Bug Report AIEngine A single diagnosis Suggest New Test to Prune Candidates 19

Pacman Example 20

Pacman Example 21

Pacman Example 22

Pacman Example 23

Pacman Example 24

Pacman Example 25

Pacman Example 26

Execution Trace F1 F2 F3 Move Stop Eat Pill 27

Which Diagnosis is Correct? 28

Knowledge Most probable cause: Eat Pill 29

Knowledge Most probable cause: Eat Pill Most easy to check: Stop 30

Objective Plan a sequence of tests to find the best diagnosis with minimal tester actions 31

1. Highest Probability (HP) Compute the prob. of every function C i = Sum of prob. of candidates containing C i Plan a test to check the most probable function 0.2 F1 0.4 0.2 F2 F5 0.4 F6 0.4 32

1. Highest Probability (HP) Compute the prob. of every function C i = Sum of prob. of candidates containing C i Plan a test to check the most probable function 0.2 F1 0.4 0.6 F2 F5 F6 0.4 33

2. Lowest Cost (LC) Consider only functions C i with 0<P(C i )<1 Test the closest function from this set F1 F5 F6 F2 34

3. Entropy-based A test α = a set of components Ω + = set of candidates that assume α will pass Ω - = set of candidates that assume α will fail Ω? = set of candidates that are indifferent to α Prob. α will pass = prob. of candidates in Ω + Choose test with lowest Entropy(Ω +,Ω -,Ω??) = highest information gain 0.2 F1 0.4 0.2 F2 F5 0.4 F6 0.4 35

4. Planning Under Uncertainty Outcome of test is unknown we would like to minimize expected cost Formulate as Markov Decision Process (MDP) States: set of performed tests and their outcomes Actions: Possible tests Transition: Pr(Ω + ), Pr(Ω - )andpr(ω? ) Reward (cost): Cost of performed tests Current solver uses myopic sampling 36

Preliminary Experimental Results Synthetic code Setup Generated random call graphs with 300 nodes Generate code from the random call graphs Injected 10/20 random bugs Generate 15 random initial tests Run until a diagnosis with prob.>0.9 is found Results on 100 instances 37

Preliminary Experimental Results HP and Entropy perform worse than (LC) and MDP probability based (HP and Entropy) perform worse than the cost based approach. MDP which considers cost and probability outperforms the others 20 bugs is more costly for all algorithms than 10 bugs 38

Preliminary Experimental Results - NUnits Real code Well-known testing framework for C# Setup Generated call graphs with 302 nodes from NUnits Injected 10,15,20,25,30 random bugs Generate 15 random initial tests Run until a diagnosis with prob.>0.9 is found Results on 110 instances 39

Preliminary Experimental Results - NUnits Similar results The cost increases with the probability bound MDP which considers cost and probability outperforms the others 40

Conclusion The test, diagnose and plan (TDP) paradigm: AI diagnoses observed bug with MBD algorithm AI plans further tests to find correct diagnosis Empowers tester t using AI Bug diagnosis done by tester+ai 41

SCM Server Logs Source Code AI Engine QA Tester Developer 42