An Evaluation of Neural Networks Approaches used for Software Effort Estimation



Similar documents
Software Development Cost and Time Forecasting Using a High Performance Artificial Neural Network Model

Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects

Hybrid Neuro-Fuzzy Systems for Software Development Effort Estimation

A Concise Neural Network Model for Estimating Software Effort

A New Approach in Software Cost Estimation with Hybrid of Bee Colony and Chaos Optimizations Algorithms

A New Approach For Estimating Software Effort Using RBFN Network

Keywords : Soft computing; Effort prediction; Neural Network; Fuzzy logic, MRE. MMRE, Prediction.

A HYBRID FUZZY-ANN APPROACH FOR SOFTWARE EFFORT ESTIMATION

Approach of software cost estimation with hybrid of imperialist competitive and artificial neural network algorithms

A Fuzzy Decision Tree to Estimate Development Effort for Web Applications

Fuzzy Logic based framework for Software Development Effort Estimation

Software project cost estimation using AI techniques

Comparison and Analysis of Different Software Cost Estimation Methods

Towards applying Data Mining Techniques for Talent Mangement

Size-Based Software Cost Modelling with Artificial Neural Networks and Genetic Algorithms

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Keywords Software Cost; Effort Estimation, Constructive Cost Model-II (COCOMO-II), Hybrid Model, Functional Link Artificial Neural Network (FLANN).

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring

SOFT COMPUTING TECHNIQUES FOR SOFTWARE PROJECT EFFORT ESTIMATION

Software Cost Estimation: A Tool for Object Oriented Console Applications

Software Defect Prediction Tool based on Neural Network

Computational Neural Network for Global Stock Indexes Prediction

Efficient Indicators to Evaluate the Status of Software Development Effort Estimation inside the Organizations

A HYBRID INTELLIGENT MODEL FOR SOFTWARE COST ESTIMATION

EARLY STAGE SOFTWARE DEVELOPMENT EFFORT ESTIMATIONS MAMDANI FIS VS NEURAL NETWORK MODELS

DATA MINING TECHNIQUES AND APPLICATIONS

Learning is a very general term denoting the way in which agents:

First Semester Computer Science Students Academic Performances Analysis by Using Data Mining Classification Algorithms

Artificial Neural Networks for Software Effort Estimation: A Review

SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH

2. IMPLEMENTATION. International Journal of Computer Applications ( ) Volume 70 No.18, May 2013

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network

Artificial Neural Network-based Electricity Price Forecasting for Smart Grid Deployment

OPTIMUM LEARNING RATE FOR CLASSIFICATION PROBLEM

An Empirical Study of Software Cost Estimation in Saudi Arabia Software Industry

Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department

Pragmatic Peer Review Project Contextual Software Cost Estimation A Novel Approach

A Comparison of Calibrated Equations for Software Development Effort Estimation

A Classical Fuzzy Approach for Software Effort Estimation on Machine Learning Technique

Visualization of large data sets using MDS combined with LVQ.

Software Development Effort Estimation by Means of Genetic Programming

Introduction to Data Mining

A hybrid method for increasing the accuracy of software development effort estimation

How To Use Neural Networks In Data Mining

IMPROVEMENT AND IMPLEMENTATION OF ANALOGY BASED METHOD FOR SOFTWARE PROJECT COST ESTIMATION

Using Data Mining for Mobile Communication Clustering and Characterization

A survey on Data Mining based Intrusion Detection Systems

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES

FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS

Hathaichanok Suwanjang and Nakornthip Prompoon

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

A Multi-level Artificial Neural Network for Residential and Commercial Energy Demand Forecast: Iran Case Study

disagrees. Given all the factors that can influence a software project, it is very surprising that pred(30) = 69 can be achieved at all.

Software Estimation: Practical Insights & Orphean Research Issues

Comparative Analysis of Classification Algorithms on Different Datasets using WEKA

Scalable Developments for Big Data Analytics in Remote Sensing

Pentaho Data Mining Last Modified on January 22, 2007

Performance Based Evaluation of New Software Testing Using Artificial Neural Network

An Introduction to Data Mining

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

Comparison of K-means and Backpropagation Data Mining Algorithms

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE

A Specific Effort Estimation Method Using Function Point

Chapter 12 Discovering New Knowledge Data Mining

Advanced analytics at your hands

Impact of Boolean factorization as preprocessing methods for classification of Boolean data

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Data Mining and Neural Networks in Stata

Artificial Neural Network Approach for Classification of Heart Disease Dataset

Prediction of Stock Performance Using Analytical Techniques

Supply Chain Forecasting Model Using Computational Intelligence Techniques

Short Term Electricity Price Forecasting Using ANN and Fuzzy Logic under Deregulated Environment

An Intelligent Approach to Software Cost Prediction

Data Mining & Data Stream Mining Open Source Tools

Real-Time Credit-Card Fraud Detection using Artificial Neural Network Tuned by Simulated Annealing Algorithm

IMPROVISATION OF STUDYING COMPUTER BY CLUSTER STRATEGIES

Research Article Predicting Software Projects Cost Estimation Based on Mining Historical Data

American International Journal of Research in Science, Technology, Engineering & Mathematics

Extending Change Impact Analysis Approach for Change Effort Estimation in the Software Development Phase

Data Mining in Education: Data Classification and Decision Tree Approach

Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin.

Analecta Vol. 8, No. 2 ISSN

Predictive Analytics Tools and Techniques

ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS

How To Identify A Churner

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Neural Network Applications in Stock Market Predictions - A Methodology Analysis

FrIDA Free Intelligent Data Analysis Toolbox.

Data Mining Part 5. Prediction

EXTENDED ANGEL: KNOWLEDGE-BASED APPROACH FOR LOC AND EFFORT ESTIMATION FOR MULTIMEDIA PROJECTS IN MEDICAL DOMAIN

Estimating Size and Effort

Cost Drivers of a Parametric Cost Estimation Model for Data Mining Projects (DMCOMO)

Transcription:

Proc. of Int. Conf. on Multimedia Processing, Communication and Info. Tech., MPCIT An Evaluation of Neural Networks Approaches used for Software Effort Estimation B.V. Ajay Prakash 1, D.V.Ashoka 2, V.N. Manjunath Aradhya 3 1 SJB Institute of Technology/Department of ISE, Bangalore, India ajayprakas@gmail.com 2 JSSATE, Dept. of Information Science and Engineering, Bangalore, India dr.ashok_research@hotmail.com 3 SJCE, Department of MCA, Mysore, India aradhya1980@yahoo.co.in Abstract Demand for software is increasing day by day due to its more usage in IT industries. It is a mega challenge for software industry to develop very high quality software effectively within stipulated time and budget. To accomplish this challenge, the software development process needs to be effectively managed and well planned. It is very important to have good effort estimation in order to manage well budget. This paper aims to highlight the application of data mining techniques that yields high software effort estimation accuracy in contrast with other well established effort estimation models using little features. The effect of the proposed work discussed in this paper consisting of several steps. These steps can provide cost-saving project effort estimation means to identify and select only relevant and necessary features. By applying data mining techniques project managers and experts can consume less time to predict software project effort and more time on more important issues in releasing the project in time to customers. Further, this work illustrates the advantages like less computation time and effort resulting in energy saving mandates that can be easily adapted in software development process. Index Terms Data mining, effort estimation, neural networks, project planning. I. INTRODUCTION Estimating size or resources is one of the vital topics in software engineering and IT. During software development, effort estimation is the basic step taken in budgeting, planning and development of the software project. Exact estimation of software development effort, predicting and scheduling are the essential components to deliver the software product on time, to produce superior quality and contained by estimated cost. Effort estimation inaccuracy can be harmful to an IT industry s economics and dependability due to behind schedule release, poor quality and stakeholder s displeasure with the software product. From the Brazilian Ministry of Science and Technology-MCT research report in 2001, only 29% of the IT industry s able to achieve size estimation and 45.7% of software effort estimation [1]. So in current years effort estimation has motivated as an extensive research. The key in component is developed kilo line of code (KLOC), which affects the software effort estimation which includes all program instructions [10]. Many software estimation models have been proposed to support project manager in accurate decision making about their projects [11, 12]. One of the known mathematical approaches for software cost estimation is the DOI: 03.AETS.2013.4.105 Association of Computer Electronics and Electrical Engineers, 2013

COnstructive COst MOdel (COCOMO) model was first provided by Boehm [11], it was built based on 63 software projects. COCOMO model helps in finding the development time, the effort. Here E represents the evaluated software effort in man-months. In equation 1 a and b parameters depend mainly on the type of software project. Software projects were categorized based on the complexity of the project into three categories. They are: TABLE I. BASIC COCOMO MODELS Model Name Effort (E) Time (D) Organic Model E 1.0 5 2.4 ( K L O C ) D 2.5 ( E ) 0.38 1.12 0.35 Semi-Detached Model E 3.0( K LO C ) D 2.5( E ) Embedded Model E 3.6( KLOC) 1.20 D 2.5( E) 0.32 II. RELATED WORKS Quite a lot of research works has been carried out on building efficient effort estimation using soft computing techniques [2] [3]. Kelly et. al. [4] provides methodology for exploring software cost estimation using Neural Networks (NNs), Genetic Algorithms (GAs) and Genetic Programming (GP). Artificial neural network are used in effort estimation due to its ability to learn from previous data [5] [6] It is also complex relationships between the dependent (effort) and independent variables (cost drivers). Many authors encompass neural network to effort estimation using feed-forward multi-layer Perceptron, Back propagation algorithm and sigmoid function [4]. Idri et.al [7] has made research on estimating software cost using the neural network and fuzzy logic rules on COCOMO'81 dataset. Samson et. al [8] applied multilayer perception in software effort prediction for boehm s COCOMO dataset also compares neural network with linear regression. Radlinki et. al. [13], analyses the accuracy of predictions for software development effort using different machine learning techniques. Parag [14], proposed a Probabilistic Neural Networks (PNN) approach for simultaneously estimating values of software development parameters (either software size or software effort) and probability that the actual value of the parameter will be less than its estimated value. Attarzadeh et. al. [15], the author s presents that the one of the greatest challenges for software developers is predicting the development effort for a software system based on developer abilities, size, complexity and other metrics for the last decades. Stamatia et. al. [16], comparative research has been done by using three machine learning methods such as Artificial Neural Networks (ANNs), Case-Based Reasoning (CBR) and Rule Induction (RI) to build software effort prediction systems. III. ESTIMATING DEVELOPMENT EFFORT USING NEURAL NETWORKS TOOLS Different DM tools such as weka 3.6, neuro solutions evaluation version 6.29, and MATLAB neural network tool box used for software effort estimation. In this research work publicly available PROMISE repository (http://promisedata.org/data is used. Table 2 shows the different data sets used for effort estimation. TABLE II. EIGHT DIFFERENT DATASET AND 890 PROJECTS USED FOR EFFORT ESTIMATION Dataset Description Features Size Cocomonasa NASA projects 17 60 Maxwell Biggest commercial Banks projects in finland 27 62 China Chinese software companies projects data 18 499 Nasa93 NASA projects 17 93 Desharnais Canadian software projects 12 81 Cocomo81 Projects data from NASA 17 63 Kemerer Large business applications 7 15 Telecom Telecom companies projects 3 18 Total 890 293

The steps followed in calculating the software effort is as shown in Figure 2. In this work the application of neural network to effort estimation made use of feed-forward multi-layer Perceptron, Radial basis function network and Support vector machine. Fig.2. Steps in effort estimation Fig.3. Multilayer Perceptron A. Weka Tool: As a DM tool Weka is a tool for performing data mining tasks such as classification, association rules, clustering and visualization. Also contains many machine algorithms, data pre-processing. It can be directly applied to dataset or can be called by developing own java source code. In this experiment, used multilayer preceptron with setting the test cross validation folds to 10 and applied to different data set which is mentioned in table 2. B. Neuro Solutions Neuro Solutions is developed by NeuroDimension which provides the neural network development environment. It can be used as modula or icon based network design with user interface with implementation of different machine learning algorithms such as back propagation, Levenberg-Marquardt etc. Using Neuro Solution tool, we can design, train, test and deploy neural network (supervised learning and unsupervised learning) models to perform a wide variety of tasks such as data mining, classification, function approximation, multivariate regression and time-series prediction [17] Figure 4, shows the neuro solution tool output. In neuro solution, performance of neural network results is evaluated using the magnitude of related error (MRE) and mean magnitude of relative error (MMRE). MRE equation 2 is defined as follows ActualEffort Pr edictedeffort M RE *100 Actualeffort 294 (2)

M M RE 1 n M RE i n i 1 (1) Fig.4. Neuro solution result C. MATLAB Using MATLAB R2011a, different data sets are trained and tested. China dataset is selected as input and target output is the effort. Figure 5 shows the neural network model used in the experiment and figure 6 shows the result of effort. Fig.5. Neural network training Fig. 6. Result of effort 295

IV. CONCLUSIONS Software effort estimation implementation in a large software development organization takes more than a year. Getting hold of estimation experience and integrating it into project management processes along with the consequent introduction of IT measurements for continuing improvement might require another couple of years. This paper has illustrated the application of different data mining tools for software effort estimation to reduce time and effort in software development. A neural network, Weka 3.6 tool and MATLAB neural network toolbox are used to different dataset of 890 projects in order to estimate efforts were discussed. The discussion made in this paper may use as a reference guide to software project managers in software effort estimation. REFERENCES [1] I.F. Barcelos Tronto, J.D. Simoes da Silva, N. Sant. Anna, Comparison of Artificial Neural Network and Regression Models in Software Effort Estimation, INPE eprint, Vol.1, 2006. [2] J. Ryder, Fuzzy COCOMO: Software Cost Estimation. PhD thesis, Binghamton University, 1995. [3] A. C. Hodgkinson and P. W. Garratt, A neuro-fuzzy cost estimator, in Proceedings of the Third Conference on Software Engineering and Applications, pp. 401 406, 1999. [4] M. A. Kelly, A methodlolgy for software cost estimation using machine learning techniques, Master s thesis, Naval Postgratuate School, Monterey, California, 1993. [5] A. J. Albrecht and J. R. Gaffney, Software function, source lines of code, and development effort prediction: A software science validation, IEEE Trans. Software Engineering, vol. 9, no. 6, pp. 630 648, 1983. [6] J. E. Matson, B. E. Barret, and J. M. Mellinchamp, Software developemnet cost estimation using function points, IEEE Trans. Software Engineering, vol. 20, no. 4, pp. 275 287, 1994. [7] Ali Idri and Taghi M. Khoshgoftaar& Alain Abran, Can Neural Networks be easily Interpreted in Software Cost Estimation, IEEE Transaction, 2002, page:1162-1167. [8] Samson, B., Ellison, D., Dugard, P., Software cost estimation using an Albus perceptron (CMAC), Journal of Information and Software Technology, 39 (1), 55 60. 1997. [9] C. F. Kemere, An empirical validation of software cost estimation models, Communication ACM, vol. 30, pp. 416 429, 1987. [10] T. Menzies, D. Port, Z. Chen, J. Hihn, and S. Stukes, Validation methods for calibrating software effort models, in Proceedings of the 27th international conference on Software Engineering (ICSE 05), (New York, NY, USA), pp. 587 595, ACM Press, 2005. [11] B. Boehm, Software Engineering Economics. Englewood Cliffs, NJ, Prentice-Hall, 1981. [12] B. Boehm, Cost Models for Future Software Life Cycle Process: COCOMO2. Annals of Software Engineering, 1995. [13] L. Radlinki and W. Hoffmann, On Predicting Software Development Effort Using Machine Learning Techniques and Local Data, International Journal of Software Engineering and Computing, vol. 2, pp.123-136, 2010. [14] Parag C. Pendharkar, Probabilistic estimation of software size and effort, An International Journal of Expert Systems with Applications, vol. 37, pp.4435-4440, 2010. [15] I. Attarzadeh and Siew Hock Ow, Software Development Effort Estimation Based on a New Fuzzy Logic Model, International Journal of Computer Theory and Engineering, Vol. 1, No. 4, pp.1793-8201, October 2009. [16] Bibi Stamatia and Stamelos Ioannis, Selecting the Appropriate Machine Learning Techniques for Predicting of Software Development Costs, Artificial Intelligence Applications and Innovations, vol. 204, pp.533-540, 2006. [17] http://en.wikipedia.org/wiki/neurosolutions [18] http://promisedata.org/data [19] http://www.knime.org 296