PREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING



Similar documents
Applications of Machine Learning in Software Testing. Lionel C. Briand Simula Research Laboratory and University of Oslo

DATA MINING TECHNIQUES AND APPLICATIONS

Intelligent and Automated Software Testing Methods Classification

Machine Learning-based Software Testing: Towards a Classification Framework

Software Development Cost and Time Forecasting Using a High Performance Artificial Neural Network Model

Performance Based Evaluation of New Software Testing Using Artificial Neural Network

Software Engineering/Courses Description Introduction to Software Engineering Credit Hours: 3 Prerequisite: (Computer Programming 2).

Process Models and Metrics

Software Defect Prediction Modeling

Using Data Mining for Mobile Communication Clustering and Characterization

Confirmation Bias as a Human Aspect in Software Engineering

EXTENDED ANGEL: KNOWLEDGE-BASED APPROACH FOR LOC AND EFFORT ESTIMATION FOR MULTIMEDIA PROJECTS IN MEDICAL DOMAIN

Formal Software Testing. Terri Grenda, CSTE IV&V Testing Solutions, LLC

Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

Customer Classification And Prediction Based On Data Mining Technique

Karunya University Dept. of Information Technology

Classification algorithm in Data mining: An Overview

Data Mining Part 5. Prediction

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

An Overview of Knowledge Discovery Database and Data mining Techniques

Software Engineering: Analysis and Design - CSE3308

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Software Engineering Compiled By: Roshani Ghimire Page 1

Learning is a very general term denoting the way in which agents:

not possible or was possible at a high cost for collecting the data.

Keywords Class level metrics, Complexity, SDLC, Hybrid Model, Testability

Software Metrics & Software Metrology. Alain Abran. Chapter 4 Quantification and Measurement are Not the Same!

Chap 4. Using Metrics To Manage Software Risks

Comparison of K-means and Backpropagation Data Mining Algorithms

Pattern-Aided Regression Modelling and Prediction Model Analysis

Detecting Defects in Object-Oriented Designs: Using Reading Techniques to Increase Software Quality

Integer Programming: Algorithms - 3

Requirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao

Presentation: 1.1 Introduction to Software Testing

Module 10. Coding and Testing. Version 2 CSE IIT, Kharagpur

Machine Learning: Overview

The Software Process. The Unified Process (Cont.) The Unified Process (Cont.)

How To Write Software

Software Defect Prediction for Quality Improvement Using Hybrid Approach

Microsoft Azure Machine learning Algorithms

Software Defect Prediction Tool based on Neural Network

Certification of a Scade 6 compiler

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Introduction to Software Engineering. 8. Software Quality

Data Mining. Nonlinear Classification

Total Quality Management (TQM) Quality, Success and Failure. Total Quality Management (TQM) vs. Process Reengineering (BPR)

What do you think? Definitions of Quality

ISTQB Certified Tester. Foundation Level. Sample Exam 1

Software Engineering Introduction & Background. Complaints. General Problems. Department of Computer Science Kent State University

The Scientific Data Mining Process

Techniques and Tools for Rich Internet Applications Testing

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network

Software Quality Management

Azure Machine Learning, SQL Data Mining and R

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Data Mining - Evaluation of Classifiers

A Hybrid Approach of Module Sequence Generation using Neural Network for Software Architecture

AN EMPIRICAL REVIEW ON FACTORS AFFECTING REUSABILITY OF PROGRAMS IN SOFTWARE ENGINEERING

1.1 The Nature of Software... Object-Oriented Software Engineering Practical Software Development using UML and Java. The Nature of Software...

Basic academic skills (1) (2) (4) Specialized knowledge and literacy (3) Ability to continually improve own strengths Problem setting (4) Hypothesis

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research

CS 1632 SOFTWARE QUALITY ASSURANCE. 2 Marks. Sample Questions and Answers

Requirements engineering

Software Engineering. Software Testing. Based on Software Engineering, 7 th Edition by Ian Sommerville

SOFTWARE ENGINEERING INTERVIEW QUESTIONS

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

Software Engineering Question Bank

Gerard Mc Nulty Systems Optimisation Ltd BA.,B.A.I.,C.Eng.,F.I.E.I

APPLYING MACHINE LEARNING ALGORITHMS IN SOFTWARE DEVELOPMENT

Mining Metrics to Predict Component Failures

Software Metrics: Roadmap

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Engineering Process Software Qualities Software Architectural Design

Software estimation process: a comparison of the estimation practice between Norway and Spain

APPROACHES TO SOFTWARE TESTING PROGRAM VERIFICATION AND VALIDATION

International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

How To Develop Software

Chapter 7. Cluster Analysis

Prediction of Software Development Faults in PL/SQL Files using Neural Network Models

Agile Development and Testing Practices highlighted by the case studies as being particularly valuable from a software quality perspective

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

The purpose of Capacity and Availability Management (CAM) is to plan and monitor the effective provision of resources to support service requirements.

Measurement Information Model

A New Approach For Estimating Software Effort Using RBFN Network

About the Author. The Role of Artificial Intelligence in Software Engineering. Brief History of AI. Introduction 2/27/2013

SOMA, RUP and RMC: the right combination for Service Oriented Architecture

SCADE SUITE SOFTWARE VERIFICATION PLAN FOR DO-178B LEVEL A & B

Hathaichanok Suwanjang and Nakornthip Prompoon

Machine learning for algo trading

Predictive modelling around the world

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem

Comparison of Data Mining Techniques used for Financial Data Analysis

Transcription:

PREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING Jelber Sayyad Shirabad Lionel C. Briand, Yvan Labiche, Zaheer Bawar Presented By : Faezeh R.Sadeghi

Overview Introduction Process of Applying ML to Software Engineering (SE) Applications of Predictive Models in SE MELBA (MachinE Learning based refinement of BLAck-box test specification) Future Directions

Software Development Cycle Data is created at different stages in the software development cycle i.e. Creation of documents in the analysis stage, diagrams in the design stage, code in implementation stage, and test cases in the testing phase

Use of Technology to... Better understand a system Make more informative decisions as needed through the life of an existing system Apply lessons learned from building other systems to the creation of a new system MACHINE LEARNING (ME)!

Steps in Applying ML to SE

Step 1: Understanding the Problem Estimate cost or effort involved in developing a software To be able to characterize the quality of a software system To be able to predict what modules in a system are more likely to have a defect

Step 2: Casting the Original Problem as a Learning Problem Decide on how to formulate the problem as a machine learning task. i.e. Classification vs. Numeric prediction problem May not be straight forward and require further refinement of the original problem into sub problems

Step 3: Collection of Data and Relevant Background Knowledge Easier to collect data than knowledge background i.e. It is easy to collect data regarding faults discovered in software system and changes applied to the source to correct a fault, but there is no agreed upon domain theory describing software systems. We need to limit ourselves to incomplete background knowledge, i.e. Choosing a subset of the system

Step 4: Data Pre-Processing and Encoding Noise reduction Selecting appropriate subset of collected data Determining a proper subset of features that describe concepts to be learned Data and background knowledge need to be described and formatted in a manner that complies with the requirements of algorithm used

Step 5: Applying Machine Learning and Evaluating Results Need to measure the goodness of what is learned Accuracy in classification problems Mean Magnitude of Relative Error(MMRE) in numeric prediction Software Engineering: Percentage of the examples with magnitude of relative error (MRE)<= x (known as PRED) to assess cost, effort, and schedule models MMRE <25% is considered good, PRED >75%

Step 5: Applying Machine Learning and Evaluating Results (cont.) If what is learned is inadequate Retry this step by adjusting parameters OR Reconsider the decisions made in earlier stages

Step 6: Field Testing and Deployment Used by intended users Unfortunately number of articles discussing actual use and impact of the research in industry is very small Confidentiality

Applications of Predictive Models in SE Software Size Prediction Software Quality Prediction Software Cost Prediction Software Defect Prediction Software Reliability Prediction Software Reusability Prediction Recent Uses (Software Test Suite Reliability Prediction)

Software Size Prediction Methods for measuring Lines of code (LOC) Regolin, de Souza, Pozo, and Vergilio (2003) used neural networks(nn) and genetic programming(gp) Component-based method (CBM) Dolado (2000) concluded that NN and GP perform as well or better than multiple linear regression Function points Pendharkar (2004) used decision tree regression to predict the size of OO components

Software Quality Prediction ISO 9126 : Functionality, reliability, efficiency, usability, maintainability and portability McCall : factors composed of quality criteria Fault Density a measure for quality Genetic programming used by Evett and Khoshgoftaar (1998) to build models that predict the number of faults expected in each module Khoshgoftaar, Allen, Hudepohl, and Aud(1997) Neural networks Xing, Guo, and Lyu (2005) used SVM Seliya and Khoshgoftaar(2007) used an EM semisupervised learning algorithm

Software Cost Prediction Early estimation allows us to determine the feasibility Detailed estimations allow managers to better plan the project

Cost Estimation Methods Expert opinion Analogy based on similarity to other projects Decompose project Components to deliver Tasks to accomplish Total estimate of cost of individual components Use of estimation models

Examples of Cost Prediction: Shepperd and schofield (1997) Use of analogies for effort prediction Projects characterized in terms of attributes such as the number of interfaces, development method, or the size of functional requirements document Euclidean distance in n-dimensional space of project feature Decision trees and neural networks used More recently: Oliveira(2006) uses support vector regression(svr), radial basis function neural networks(rbfns) and linear regression-based models In an experiment SVR outperforms RBFN and linear regression

Software Reliability Prediction The probability of a failure-free operation of a computer program for a given environment for a given time Pai and Hong (2006), SVM-based models with simulated annealing perform better than existing Bayesian models Neural networks can also be used

Software Reusability Prediction Use of existing software knowledge Aim: Increase productivity of software developers Increase quality of end product Label code as reusable or non-reusable, then use software metrics to describe the example of interest

Related Work on Software Reusability Mao, Sahraoui, and Lounis(1998) Verify hypothesis of correlation between reusability and quantitative attributes of a piece of software: inheritance, coupling, and complexity Four labels, from totally reusable to not reusable at all

Recent Uses: Predict defect contents of documents after software inspection. Padberg, Ragg, and Schoknecht(2004) Predicting stability of object-oriented software Grosser, Sahroui, and Valtchev(2002) Software Test Suite Reliability Prediction Lionel C.Briand, Yvan Labiche, Zaheer Bawar (2008)

Software Test Suite Reliability Prediction Problem: Developers are faced with test suites which have been developed with no apparent rationale Developers have to evaluate and reduce or augment them depending on the level of weakness or redundancy Quality prediction???

Goal: An automated methodology based on machine learning to help software engineers analyze the weakness of test suites to be able to improve them Re- engineering of test suites MachinE Learning based refinement of BlAck-box test specification (MELBA)

Melba Methodology

Category Partitioning: Tested function Function s parameters and environment variables Parameter/variable characteristics Characteristic s blocks Values of block

Design First case: Test suite but not CP specification Testers must then build the CP specification based on their understanding of the software functional behaviour iteratively refine it Second Case: CP specification is used to generate the test suite and the test suite must evolve to account for changes in the system under test (Evolution context).

Potential Problems and Causes in Test Suite

Case Study: PackHexChar Program Inputs: RLEN String of characters ODD_DIGIT Program output Array of Bytes and Integer value -1 for even RLEN -2 for illegal RLEN -3 for illegal ODD_DIGIT

MELBA: Snapshot of Decision Tree

Melba Snap Shot of Unused CP Elements

Melba : Snap Shot of Unused Combination of Choices

Results: Improvement of cp specification Improvement of test suites

Future Work for MELBA Investigate other black box testing specifications additional evaluations of the iterative process on programs of varying sizes and complexities User-friendly automated tool support

Future Directions in SE: Self-configuring Self-optimizing Self-healing Self-protecting