A Specific Effort Estimation Method Using Function Point




JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 27, 1363-1376 (2011)

BINGCHIANG JENG 1,*, DOWMING YEH 2, DERON WANG 3, SHU-LAN CHU 2 AND CHIA-MEI CHEN 1
1 Department of Information Management, National Sun Yat-sen University, Kaohsiung, 804 Taiwan
2 Department of Software Engineering, National Kaohsiung Normal University, Kaohsiung, 802 Taiwan
3 Department of Information System, China Steel Corporation, Kaohsiung, 812 Taiwan

Abstract: Software estimation provides an important tool for project planning, and its quality and accuracy greatly affect the success of a project. Despite a plethora of estimation models, practitioners experience difficulties in applying them because most models attempt to include as many influential factors as possible in estimating software size and/or effort. This research suggests a different approach that simplifies and tailors a generic function point analysis model to increase ease of use. The proposed approach redefines the function type categories in the FPA model on the basis of the target application's characteristics and system architecture. This makes the function types more suitable for the particular application domain. It also enables function point counting by the programmers themselves instead of by an expert. An empirical study using historical data establishes the regression model and demonstrates that its prediction accuracy is comparable to that of an FPA model.

Keywords: effort estimation, function point analysis, empirical model, project management, size measurement

1. INTRODUCTION

Size and effort of a software system are two different but correlated terms; size measures how large the system is, while effort specifies how much endeavor is required to create it. Given tasks of the same size, the effort required from different teams may vary considerably because their productivities differ.
Thus, it is effort that provides the more useful reference for project cost estimation, schedule planning, and so on [1, 2]. However, effort estimation in practice usually relies on human domain experts, who offer estimates based on their experience with similar projects in the past. Estimation quality thus depends on personal experience and subjective judgment, which tends to result in unstable or even poor estimation accuracy. Algorithmic estimation models address this problem by trying to capture the influential factors and estimate software size and/or effort using predefined equations [3-7]. Yet practitioners experience difficulties with these models due to hidden factors, such as a team's productivity, that affect development effort differently but are not fully reflected in the models [8-11]. In addition, most existing models are complicated and difficult to implement,

Received November 9, 2009; revised March 12, 2010; accepted June 10, 2010. Communicated by Chih-Ping Chu.
* Corresponding author.

which further hinders their acceptance among practitioners [12, 13]. Given this situation, a more sophisticated model covering ever more factors seems inappropriate, since existing ones already take considerable effort to implement [12, 13]. Besides, some factors are inherent to a domain and may not be easy to generalize. An alternative approach may be necessary to address these problems. This paper presents a new type of estimation approach that is created solely for a specific organization, using its historical project data and some metrics pertaining to the application type and system architecture. It inherits the spirit of function point analysis (FPA) [3, 4, 15] but is much simplified. The FPA process usually relies on experts who judge and estimate the various aspects of a software system [13], which makes it difficult for a common project team to adopt. In addition, the new approach generates estimated effort only, without the intermediate step of size estimation, which further simplifies the process. The remainder of this article is organized as follows: Section 2 briefly introduces related methods of size and effort estimation and FPA. The new model and a redefinition of function points appear in section 3. A regression process establishes the model in section 4, and an empirical study then applies the created model to a real example and compares its results with FPA in section 5. The final section concludes with some implications and further research directions.

2. RELATED ESTIMATION METHODS

2.1 Estimation Methods

Effort (and size) estimation is critical to the success of a software project [16]. Of the different techniques presented in prior literature [6-9, 17], the most popular include algorithmic and parametric models, expert judgment, and reasoning by analogy [18]. According to Heemstra's survey, 29 different software-based cost models have been proposed since 1966 [6].
Most effort estimation models rely on empirical derivation, using regression analysis of a collection of historical project data. These models usually take software size as input to estimate development effort, where the size measure can be either lines of code (LOC) or function points. The LOC-based models are mostly nonlinear, with estimated effort at least quadratic in size. Models based on function points are not; their effort relates linearly to the function points, e.g., the one proposed by Albrecht and Gaffney [4]. A possible explanation is that the process of counting function points already involves nonlinear computation. Machine learning models for size and/or effort estimation are more recent; they include case-based reasoning [20], fuzzy logic [21], neural networks [22], and many others [23]. Most such approaches report results comparable to those of other techniques [9]. However, because the data needed by these models usually are not available at the start of a software project, most can be applied only during the later stages of the software development process.
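The contrast between nonlinear LOC-based models and linear FP-based models can be sketched as follows. The coefficients a, b, and rate are purely illustrative and are not taken from any model in this paper.

```python
# Illustrative comparison only: the coefficients a, b, and rate are
# hypothetical, not values from the paper.

def effort_loc(kloc, a=3.2, b=1.05):
    # LOC-based models are typically nonlinear: effort grows as a
    # power of size (COCOMO-style), so doubling size more than
    # doubles effort when b > 1.
    return a * kloc ** b

def effort_fp(fp, rate=0.3):
    # FP-based models are typically linear: a constant number of
    # person-days per function point.
    return rate * fp

print(effort_loc(1.0))  # 3.2
print(effort_fp(400))   # 120.0
```

The linear FP form is what the tailored model in section 3 also assumes; only the way the function points are produced differs.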

2.2 Function Point Analysis

FPA denotes a family of algorithmic methods for size estimation. It separately evaluates two classes of attributes of a software system: size factors and influence factors. The first version of FPA, invented by Albrecht at IBM in 1979 [3], proposed a new metric, the function point, for software size rather than lines of code. The International Function Point Users Group (IFPUG) adopted a revised method [4], defining a function point as a means to measure software size by quantifying the functionality provided to the user, based solely on logical designs and functional specifications [24]. Because the functionality of a software system, from the user's perspective, usually emerges early in a project, FPA offers the unique advantage of being applicable during the early stages, when other approaches to size measurement are not appropriate. An FPA model classifies the functions of a software system into five types: external inputs (EI), the unique user data or control inputs that add to or change the data; external outputs (EO), the unique user data or control outputs that leave the boundary of the system; external inquiries (EQ), the unique inputs that generate immediate output; internal logical files (ILF), internally maintained logical groups of data; and external interface files (EIF), files passed or shared among applications. Furthermore, IFPUG groups these functions into either data functions (ILFs and EIFs) or transactional functions (EIs, EOs, and EQs). Each category consists of different function elements.
For example, the data function category consists of the data element type (DET), a user-recognizable field from a business perspective that participates in a transaction or is stored in a logical data file, and the record element type (RET), a user-recognizable subgroup of data elements within an ILF or EIF. The suggested steps for FPA, according to IFPUG, are as follows: (1) identify the function elements in each prescribed function; (2) assign each function a ranking level of simple, medium, or complex according to the number of function elements found; (3) calculate an initial function point count; (4) determine the total degree of influence exerted on the general system characteristics; and (5) calculate the final function points on the basis of the initial point count and the influence adjustment factor. Although the description of the first step is simple, it takes effort to identify and count the function elements (i.e., DET, RET, and so on) that occur in each of the five categories. Based on the findings, a user function is then ranked simple, medium, or complex according to the complexity ranking table defined for each category. This rank maps to a numeric value in the complexity weighting factor table, ranging from 3 to 15. These numeric values are summed to generate the initial count of unadjusted function points, as described in the third step of the process. Because FPA is designed to apply across different organizations, the next step considers fourteen environmental factors and their influences. Depending on its degree of influence, each characteristic receives a value from 0 to 5. The sum of the fourteen factors' values, multiplied by 0.01 and then added to a constant of 0.65, gives the value adjustment factor (VAF). The unadjusted function point count multiplied by this VAF equals the final count. Various other FPA models exist as well.
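The VAF computation in steps (4) and (5) above can be sketched directly; the constants follow the rule just described (fourteen ratings in 0-5, VAF = 0.65 + 0.01 times their sum).

```python
def value_adjustment_factor(gsc_ratings):
    # VAF = 0.65 + 0.01 * (sum of the fourteen general system
    # characteristic ratings, each an integer from 0 to 5).
    assert len(gsc_ratings) == 14
    assert all(0 <= r <= 5 for r in gsc_ratings)
    return 0.65 + 0.01 * sum(gsc_ratings)

def adjusted_fp(unadjusted_fp, gsc_ratings):
    # Final count = unadjusted function point count * VAF.
    return unadjusted_fp * value_adjustment_factor(gsc_ratings)

# All fourteen factors rated 3 -> VAF = 0.65 + 0.42 = 1.07.
print(round(adjusted_fp(400, [3] * 14), 2))  # 428.0
```

Note that VAF ranges from 0.65 (all ratings 0) to 1.35 (all ratings 5), so the adjustment can move the count by up to 35% in either direction.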
Since the first revision of FPA in 1984, variations and extensions have altered the original model. For example, Symons introduced Mark II Function Point in 1991 [11], and Abran et al. considered Full Function Point in

1997 [25]. Other related models include COSMIC, Object Point Analysis, and the Feature Point Model [26]; IFPUG's version 4.2.1 is the most recent. Although FPA is well known, it is difficult to adopt in modern software environments, which has been the impetus for various extensions [11]. The coefficients in the complexity weighting factor table also must be justified before implementation, as noted by Kitchenham [14]. The determination of the influence adjustment factor is another problem; empirical studies show that each estimator demands different considerations regarding the influence factors of the general system characteristics, and the final function points may vary by as much as 30% within an organization, and even more across organizations, when they are based on different estimators [10, 15]. In a specific application domain with historical project data available, a simpler approach might avoid or at least alleviate these problems. In particular, the FPA model might be made specific to the domain, which could mitigate existing concerns as well as the variance of subjective judgments regarding the influence adjustment factor.

3. TAILORED FUNCTION POINT ANALYSIS

The proposed approach consists of three steps. First, it reduces the gap between the classification of function types in FPA and the classification contained in a real application. The five basic function types and function elements from FPA usually cannot match well the functional complexity of a modern application. In addition, counting the function elements requires some expertise. Therefore, the first step redefines the basic function types with respect to the application domain and upgrades the classification to a higher level of meaningful indicators. Second, general system characteristics are incorporated into the function points to reduce possible subjective bias.
Our goal here is not to develop a generic model, as FPA does, so there is no need to separate the consideration of function complexity into two parts. Both Kemerer [23] and Low and Jeffery [27] show that size predictions based on raw function points are just as accurate as predictions based on adjusted function points, and Kitchenham [14] supports the abandonment of adjustments. This second step therefore simply sets the value of the influence adjustment factor to 1. Third, the approach readjusts the complexity weighting factor table in FPA according to the system characteristics of different applications. Because the first step has modified the basic function types, which requires updating the weighting table, the third step consists of a regression process that recreates the complexity weighting factor table and determines its coefficients on the basis of historical data.

3.1 Function Type Reclassification

Although the five function types in FPA are useful in size estimation, within a specific application domain and developer team, other factors may better reflect the complexity and effort of the task. The application domain here is the information department of a large steel company, whose information systems mainly feature mainframe computers running the third-generation language COBOL. This case provides an effective experimental target for two reasons. First, the department keeps good records of its historical project

data, and second, it experiences a low turnover rate among software engineers, which helps reduce the effect of the human factor in evaluating the accuracy of the estimation model. In the formalized software development process, most software engineers perform their work according to the schedule assigned by the project. Our study case is a set of On-Line Transaction Processing (OLTP) applications running in this company, and interviews with senior engineers reveal six influential factors that affect their daily programming jobs:

1. Intercommunication parameter sets: As a common feature of the company's OLTP programs, many programs call one another to complete a specific task and usually exchange complicated parameters during their communication. The number of such parameter sets thus reflects the complexity of the task a program performs.

2. Connected subsystems: Depending on its business purpose, a program might be connected to the sales management, production management, logistics and data warehouse, equipment management, administration, and/or financial management systems. The more subsystems involved, the greater the functional complexity; at the least, engineers need more time to study the connected systems and their internal data elements.

3. Updated files: As good standard practice, business application programs keep logs of inserted, deleted, or modified data for failure recovery or audit trails. These logs provide good indicators of the workload and complexity of a program.

4. User departments involved: A special characteristic of the systems in this study is that most applications have similar user interfaces. The more user departments are involved, the more complicated the user interface is. In addition, more user departments imply more complicated functional requirements.

5. Utility programs: Utilities are programs that implement code tables, data checking and validation, or certain business rules. The more utilities are invoked, the more data transformation functions the program performs.

6. Subroutines (or subprograms): Similar to a utility program, a subroutine performs various jobs. Again, the more subroutines a program calls, the more complicated it is.

Table 1 summarizes the new function categories compared to the original ones. As the table reveals, the spirit of the new function categories follows FPA's original classification, except that the levels of emphasis differ. In addition, the proposed categories are more natural and easier for project team members to use, because they refer to familiar topics.

Table 1. Comparison between basic function types and proposed classification.

Proposed category                   Equivalent in FPA
Intercommunication parameter sets   Internal logical files, external interface files
Connected subsystems                Data communications, distributed functions, complex processing, multiple sites
Updated files                       Internal logical files
User departments                    External inputs, external outputs, external inquiries
Utility programs                    External interface files
Subroutines                         External interface files

An example illustrates the point: according to FPA's rules, one cannot update an external file directly but instead must assume a (virtual) external interface with another program that handles the processing between the application program and the file. The proposed function category is more direct and does not suffer from such a problem. In addition, counting function elements is much easier: because the new function types are already the most basic elements that carry meaning, there is no need to divide them further to identify the function elements. These factors reflect the specific characteristics for effort estimation in this study. To derive factors for another domain, one may follow a similar procedure: collect possible factors from senior engineers and consolidate them into preferably no more than seven factors, since too many factors would complicate the estimation model. It is best to compare these factors with the original function point classification to check for possible biases or even omission of important aspects.

3.2 Creating the Complexity Weighting Factor Table

In FPA, the complexity weighting factor table helps generate the function points of each individual function. Redefining the basic function types according to the application domain requires rebuilding the table and determining its coefficients, using a regression analysis of the historical data. Start with the equation for computing function points:

FP = (Σ_{i=1..n} UFP_i) × VAF,

where n is the number of function types and UFP_i is the unadjusted function point count for the ith function type. In line with the preceding discussion, VAF equals 1 and thus may be ignored hereafter. The total function point count is the sum of the function points in each classified function type.
Assume X_i denotes the function points for the ith function type in a program, W is the ratio of person-days per function point, and C is the physical person-days of a programming task. The rewritten equation becomes:

Σ_{i=1..n} X_i · W = C.   (1)

The calculation of X_i requires a complexity weighting factor table similar to that of FPA: multiply the number of functions, N_i, by the corresponding weight in that category, i.e., X_i = N_i · W_i. If the weights for the low, medium, and high complexity levels of a function type are treated as independent, 3n variables must be determined. In the current FPA, by contrast, the weight factors for the low, medium, and high complexity levels roughly follow the ratio 3:4:6. The proposed model therefore introduces another variable π_i and rewrites

X_i = (N_i^1 A_i^1 + N_i^2 A_i^2 + N_i^3 A_i^3) · π_i,

where A_i^j denotes the ratio for the different complexity levels, assigned the constant values 3, 4, and 6, and π_i is the relative weight over the different function types, to be determined. Setting π_i · W = W_i, the equation simplifies to:

Σ_{i=1..n} Σ_{j=1..3} (N_i^j A_i^j) · W_i = C.   (2)
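Eq. (2) can be sketched in a few lines; the function counts and fitted weights below are hypothetical, for illustration only.

```python
RATIOS = (3, 4, 6)  # low : medium : high complexity, as in the text

def estimate_effort(counts, weights):
    # counts[i]  = (N_i^1, N_i^2, N_i^3): number of functions of type i
    #              at low, medium, and high complexity.
    # weights[i] = W_i, the per-type weight to be fitted by regression.
    # Returns C = sum_i sum_j (N_i^j * A_i^j) * W_i, i.e. Eq. (2).
    return sum(
        sum(n * a for n, a in zip(count, RATIOS)) * w
        for count, w in zip(counts, weights)
    )

# Two hypothetical function types with hypothetical fitted weights:
counts = [(2, 1, 0), (0, 3, 1)]
weights = [0.16, 0.19]
print(round(estimate_effort(counts, weights), 2))  # 5.02 person-days
```

Given historical (counts, effort) pairs, the weights are the only unknowns, which is what makes the linear regression in the next step possible.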

This equation is ready for regression; substituting the historical data into it determines the best values for W_i. Let the initial solution be represented by W_i, in which case the equation becomes

Σ_{i=1..n} Σ_{j=1..3} (N_i^j A_i^j) · W_i = C + e.   (3)

The next step minimizes the sum of squared errors over the m training data points to find W_i:

s = Σ_{j=1..m} e_j².   (4)

The convergence criterion restricts the difference between the largest and smallest W_i to less than 5% (which can be adjusted). Alternatively, all A_i · W_i can be divided by the smallest W_i to yield a new set of A_i, which, substituted into Eq. (3), generates a new set of W_i. The iteration terminates when the criterion is satisfied.

4. REGRESSION MODELS

The software team under study develops new functions for existing systems in response to daily requests from other departments in the company. To simplify the data analysis, this investigation applies effort estimation to the last two phases of software development, implementation and testing, when the function requirements and overall design of the programs are already known, so that counting function points is easier and more certain. The analysis can be extended to the whole life cycle if necessary, in that the effort expended during these two phases exhibits a roughly constant ratio to total development effort, with minor variance.

Table 2. Test data profiles.

System   Number of programs   Total lines of code   Effort (person-days)
A        6                    10230                 32
B        21                   23318                 126
C        20                   30548                 95
D        20                   13458                 122
E        5                    3330                  23

The regression analysis includes program data from five historical projects, as Table 2 shows. For each system, the programs are sorted according to their complexity in each classified function category.
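The complexity-ranking step used to prepare these data (a program ranks low below 30% of a category's complexity spectrum, medium between 30% and 70%, and high otherwise) can be sketched as follows. Interpreting the "spectrum" as a min-max normalized position is an assumption; the paper does not spell out the normalization.

```python
def complexity_rank(value, category_values):
    # Position of a program on the observed complexity spectrum of one
    # function category, normalized to [0, 1]. Min-max normalization is
    # an assumption, not stated explicitly in the paper.
    lo, hi = min(category_values), max(category_values)
    pos = 0.0 if hi == lo else (value - lo) / (hi - lo)
    if pos < 0.3:
        return "low"
    if pos < 0.7:
        return "medium"
    return "high"

values = [1, 2, 4, 5, 9, 10]  # e.g. parameter-set counts of six programs
print([complexity_rank(v, values) for v in values])
# ['low', 'low', 'medium', 'medium', 'high', 'high']
```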
A program earns a low complexity rank in a function category if its complexity falls below 30% on the spectrum, a medium rank if it falls between 30% and 70%, and a high rank otherwise. During the regression process, sensitivity analysis reveals the strength of the different factors that contribute to development effort; factors with smaller values disappear from

the function category. The test indicates that the called subroutines category contributes far less than the others, as Table 3 displays. This category is therefore merged into the similar category of connected utility programs.

Table 3. Sensitivity analyses of function categories.

Complexity weights   Intercommunication parameter sets   Connected subsystems   User departments   Called subroutines   Connected utility programs   Updated files
(3, 4, 5)            0.2667                              0.6260                 0.1705             0.0148               0.4187                       0.1418
(3, 4, 6)            0.2969                              0.5298                 0.1904             0.0074               0.3860                       0.1627
(7, 10, 15)          0.3128                              0.5401                 0.2054             0.0109               0.1410                       0.1660

An iteration of the regression analysis process determines the best values for W_i that bring the total error summation in Eq. (4) within the threshold value of 5%. The initial values of A_i are 3, 4, and 6, respectively. If the error summation after the regression fails to reach the threshold, the values of A_i are adjusted, by multiplying by the corresponding W_i and dividing by the minimum W_i, and the regression analysis is iterated.

Table 4. Convergence of weight control factors.

Category                            Model 1            Model 2             Model 3            Model 4                Model 5
                                    W_i     A_i        W_i     A_i         W_i     A_i        W_i     A_i            W_i     A_i
Intercommunication parameter sets   0.167   6/8/12     0.1506  14/18/27    0.166   3/4/6      0.0123  5/6/10         0.1612  5/6/10
Connected subsystems                0.1924  10/13/19   0.1613  5/7/11      0.1595  10/13/20   0.0112  127/170/255    0.1529  11/15/22
User departments                    0.1472  5/7/11     0.1604  4/6/9       0.1326  6/7/11     0.0116  12/16/24       0.1416  3/4/6
Connected utility programs          0.1746  4/5/8      0.1405  4/5/8       0.1579  7/9/14     0.0113  178/238/357    0.1669  7/9/13
Updated files                       0.1762  3/4/6      0.1393  4/5/8       0.1675  5/6/9      0.0113  63/84/125      0.1589  4/6/9
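The iterative fit of Eqs. (3)-(4), with the ratio-rescaling step, can be sketched with numpy. This is one reading of the procedure, using synthetic data, not the authors' implementation; the refit rule (fold A_i · W_i back into the ratios, divided by the smallest W_i) follows the description above.

```python
import numpy as np

def fit_weights(counts, efforts, ratios, tol=0.05, max_iter=50):
    # counts:  (m tasks, n types, 3 levels) array of N_i^j values.
    # efforts: m observed person-day totals (C in Eq. (3)).
    # ratios:  initial (low, med, high) ratios A_i per type, e.g. (3, 4, 6).
    # Fit W_i by least squares, then fold the fitted weights back into
    # the ratios and refit until the spread of the W_i is below tol.
    A = np.array(ratios, dtype=float)      # shape (n, 3)
    N = np.array(counts, dtype=float)      # shape (m, n, 3)
    C = np.array(efforts, dtype=float)
    for _ in range(max_iter):
        X = (N * A).sum(axis=2)            # sum_j N_i^j A_i^j, per task/type
        W, *_ = np.linalg.lstsq(X, C, rcond=None)
        if (W.max() - W.min()) / W.min() < tol:
            break                          # convergence criterion (< 5%)
        A = A * W[:, None] / W.min()       # rescaled ratios for next pass
    return W, A
```

With synthetic data generated from a constant weight vector, the loop recovers the weights in a single pass; real data would typically need several iterations before the W_i spread falls below the threshold.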
The evaluation of the generated regression models employs a five-fold cross-validation process: in each run, data from four of the five projects train the model, and data from the remaining project test it. The five repetitions of this process produce the results in Table 4. One of the basic assumptions of multiple regression analysis is that the errors e_i should not be correlated; the Durbin-Watson (DW) test examines this assumption. The DW values in all five runs are close to 2, which implies there are no significant correlations in the model and that its predictions are reliable. Another test of the validity of the overall regression model is the analysis of variance (ANOVA). The test results demonstrate that all p-values are less than 0.05, in support of the linear relationship in the proposed model.

5. EXPERIMENTAL RESULTS AND MODEL EVALUATION

This section describes the physical test of the effectiveness of the proposed new models and compares their estimation accuracy with that obtained by standard FPA. For easy comparison, all results are shown in the commonly used measure of the mean magnitude of relative error (MMRE) [9, 16, 22]. The magnitude of relative

error (MRE) is a normalized measure of the difference between actual values (V_A) and predicted values (V_F):

MRE = |V_A − V_F| / V_A.

5.1 Estimation Accuracy of the Tailored Models

The tests of the five regression models established in the previous section return the average estimation accuracies in Table 5. The mean error of the new estimation method based on the five-fold test is 25.91% in MMRE, with a 13.37% standard deviation: not very good, but comparable to other known studies [19, 22].

Table 5. New models' effort estimation (by application systems).

Reference projects   Tested project   MRE
A, B, C, D           E                49.55%
A, B, C, E           D                21.52%
A, B, D, E           C                18.75%
A, C, D, E           B                17.36%
B, C, D, E           A                22.39%
MMRE                                  25.91%
Standard deviation                    13.37%

Table 6. New models' effort estimation (by random partition).

Reference groups   Tested group   MRE
V, W, X, Y         Z              13.06%
V, W, X, Z         Y              17.67%
V, W, Y, Z         X              25.00%
V, X, Y, Z         W              15.86%
W, X, Y, Z         V              9.12%
MMRE                              16.14%
Standard deviation                5.91%

Yet the experiment is not perfect. Model quality depends on the regression analysis of historical project data, and these models are not trained with a sufficient number of projects. If different projects vary greatly in their characteristics, the proposed model is less reliable, as indicated by the large standard deviation in MRE. Another experiment therefore trained the models using program, rather than system, units. This experiment breaks down the application system boundaries and randomly assigns each program from the five projects to five groups, so that the application characteristics of the five systems distribute evenly across the groups. After random partitioning, the five groups contained 14, 16, 17, 13, and 12 programs. The same process applied to these groups creates five new models. Estimation accuracies in this test are better (see Table 6). These experimental outcomes appear closer to

the real evaluation of the new approach, because the models derive from more randomly drawn data items and the standard deviation of MRE declines greatly.

1372 BINGCHIANG JENG, DOWMING YEH, DERON WANG, SHU-LAN CHU AND CHIA-MEI CHEN

5.2 Comparison with Standard FPA

Another way to evaluate the new approach is to compare the estimation results with those from standard FPA. Table 7 shows the function points and person-days per function point for each of the five applications under the standard FPA method.

Table 7. Function points and person-days conversion.
Application                      A        B        C        D        E
Adjusted function points         111      400      389      368      78
Actual person-days               32       126      95       122      23
Person-days per function point   0.2883   0.3150   0.2442   0.3315   0.2949

A similar experimental process uses four of the five projects as the historical data to compute an average number of person-days per function point, which then serves to estimate the effort for the remaining project. For example, the average person-days per function point from projects A, B, C, and D is (32 + 126 + 95 + 122)/(111 + 400 + 389 + 368) = 0.2957. This value, multiplied by the 78 function points of project E, gives an estimated effort of 23.0646 person-days. Compared with the actual value of 23 person-days, the estimation error is 0.28% in MRE. Table 8 shows the estimation results of this experiment.

Table 8. FPA's effort estimation (by application systems).
Reference Projects   Average MD/FP   Tested Project   MRE
A, B, C, D           0.2957          E                 0.28%
A, B, C, E           0.2822          D                14.88%
A, B, D, E           0.3166          C                29.64%
A, C, D, E           0.2875          B                 8.73%
B, C, D, E           0.2964          A                 2.84%
MMRE: 11.28%    Standard deviation: 11.72%

The surprisingly high estimation accuracy of some models indicates that standard FPA has empirical support, which likely explains its persistent popularity. Its only shortcoming is its instability: estimation accuracy varies by up to 29% from the best case to the worst.
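The FPA baseline computation above can be sketched in a few lines. The figures come from Table 7, the function name `fpa_estimate` is illustrative, and rounding the rate to four decimal places mirrors the paper's worked example for project E.

```python
# Table 7 data: adjusted function points and actual person-days per project.
function_points = {"A": 111, "B": 400, "C": 389, "D": 368, "E": 78}
person_days = {"A": 32, "B": 126, "C": 95, "D": 122, "E": 23}

def fpa_estimate(reference, target):
    """Average person-days per function point over the reference projects,
    applied to the target project's function points; returns the rate,
    the effort estimate, and the resulting MRE."""
    rate = (sum(person_days[p] for p in reference)
            / sum(function_points[p] for p in reference))
    rate = round(rate, 4)  # the paper rounds the rate to 4 decimals
    estimate = rate * function_points[target]
    mre = abs(person_days[target] - estimate) / person_days[target]
    return rate, estimate, mre

rate, est, mre = fpa_estimate(["A", "B", "C", "D"], "E")
print(rate, round(est, 4), round(mre * 100, 2))  # 0.2957 23.0646 0.28
```

Repeating the call for each held-out project reproduces the MRE column of Table 8.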
The large variation indicates that the intrinsic characteristics of these applications vary greatly across the projects in the experiment. Thus another experiment, similar to the second one in the previous subsection, follows; its estimation outcomes appear in Table 9. The statistics in this table, however, are not as reliable as those in Table 8, because the standard deviation of MRE increases. That is, FPA estimates better from whole-project data than from randomized data. This experiment also does not represent the usual way of using FPA, because the value adjustment factors (VAF) applied in the final step to adjust function points are designed to evaluate a whole system, not individual programs.
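The MMRE and standard-deviation figures reported in these tables follow directly from the per-fold MRE values. As a minimal sketch, the Table 5 column can be reproduced with Python's statistics module; note that the reported 13.37% corresponds to the sample (not population) standard deviation.

```python
import statistics

# Per-fold MRE values (%) from Table 5 (new models, by application system).
mre = [49.55, 21.52, 18.75, 17.36, 22.39]

mmre = statistics.mean(mre)  # mean magnitude of relative error
sd = statistics.stdev(mre)   # sample standard deviation (n - 1 divisor)

print(round(mmre, 2), round(sd, 2))  # 25.91 13.37
```

Substituting the MRE columns of Tables 6, 8, and 9 yields the remaining summary rows.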

Table 9. FPA's effort estimation (by random partition).
Reference Groups   Average MD/FP   Tested Group   MRE
V, W, X, Y         0.2074          Z                9.80%
V, W, X, Z         0.1990          Y               26.31%
V, W, Y, Z         0.2320          X               44.33%
V, X, Y, Z         0.2103          W                3.73%
W, X, Y, Z         0.2125          V                2.13%
MMRE: 17.26%    Standard deviation: 17.90%

6. DISCUSSION AND CONCLUSION

Although effort estimation is a critical step in the success of a software project, in practice many projects still perform this task with ad hoc methods. This situation may arise because most generic algorithmic estimation models are difficult to adopt: they require the collection of many detailed items that may affect the size (and effort) of an application. This research presents a different approach that trades the generic for the specific and greatly simplifies the estimation model so that ordinary programmers can use it. Although the new approach may not be as formal as a conventional estimation model, it reduces the difficulty of use by giving programmers a method to build their own model, which captures more of a given application domain's characteristics and is easier to apply. The demonstration of this approach offers a unique benefit: it makes the function classification more suitable for a particular application domain, so that function point counting can be conducted by the programmers themselves instead of by a certified FPA expert. We believe such an approach should be applicable to any domain where FPA works, since it stays within the spirit of FPA; however, it may also share the limitations of FPA. To verify its usefulness, more experimental data from a wider range of application domains and environments should undoubtedly be collected to deepen the investigation into the trade-off between the generic and the specific. At least two research directions exist. First, studies could evaluate whether the tailored FPA model always fits a business application domain while maintaining comparable estimation accuracy. Second, the ideas presented herein might be applied to another estimation model to determine how it performs. Each direction will lead to more detailed research findings.

REFERENCES

1. K. Molokken and M. Jorgensen, A review of software surveys on software effort estimation, in Proceedings of the International Symposium on Empirical Software Engineering, 2003, pp. 223-230.
2. J. S. Osmundson, J. B. Micheal, M. J. Machniak, and M. A. Grossman, Quality management metrics for software development, Information and Management, Vol. 40, 2003, pp. 799-812.

3. A. J. Albrecht, Measuring application development productivity, in Proceedings of the IBM Applications Development Symposium, 1979, pp. 83-92.
4. A. J. Albrecht and J. E. Gaffney, Jr., Software function, source lines of code, and development effort prediction: A software science validation, IEEE Transactions on Software Engineering, Vol. 9, 1983, pp. 639-648.
5. B. W. Boehm, Software Cost Estimation with COCOMO II, Prentice Hall, New Jersey, 2000.
6. A. L. Lederer and J. Prasad, A causal model for software cost estimating error, IEEE Transactions on Software Engineering, Vol. 24, 1998, pp. 137-148.
7. F. Walkerden and R. Jeffery, An empirical study of analogy-based software effort estimation, Empirical Software Engineering, Vol. 4, 1999, pp. 135-158.
8. N. E. Fenton and M. Neil, Software metrics: successes, failures and new directions, Journal of Systems and Software, Vol. 47, 1997, pp. 149-157.
9. A. R. Gray and S. G. MacDonell, A comparison of techniques for developing predictive models of software metrics, Information and Software Technology, Vol. 39, 1997, pp. 425-437.
10. D. Garmus and D. Herron, Function Point Analysis: Measurement Practices for Successful Software Projects, Addison-Wesley, Boston, MA, 2001.
11. C. R. Symons, Software Sizing and Estimating: Mk II FPA (Function Point Analysis), John Wiley and Sons, Chichester, U.K., 1991.
12. J. J. Dolado, On the problem of the software cost function, Information and Software Technology, Vol. 43, 2001, pp. 61-72.
13. A. Abran and P. N. Robillard, Function points analysis: An empirical study of its measurement processes, IEEE Transactions on Software Engineering, Vol. 22, 1996, pp. 895-909.
14. B. Kitchenham, The problem with function points, IEEE Software, Vol. 14, 1997, pp. 29-31.
15. R. Jeffery and J. Stathis, Function point sizing: Structure, validity and applicability, Empirical Software Engineering, Vol. 1, 1996, pp. 11-30.
16. M. Jorgensen and D. I. K. Sjoberg, Impact of effort estimates on software project work, Information and Software Technology, Vol. 43, 2001, pp. 939-948.
17. C. Gencel and O. Demirors, Functional size measurement revisited, ACM Transactions on Software Engineering and Methodology, Vol. 17, 2008, pp. 1-36.
18. B. Boehm, C. Abts, and S. Chulani, Software development cost estimation approaches: A survey, Annals of Software Engineering, Vol. 10, 2000, pp. 177-205.
19. L. H. Putnam and W. Myers, Measures for Excellence: Reliable Software on Time, within Budget, Yourdon Press, Englewood Cliffs, NJ, 1992.
20. R. Bisio and F. Malabocchia, Cost estimation of software projects through case-based reasoning, in Proceedings of Case-Based Reasoning Research and Development, 1995, pp. 11-22.
21. O. D. Lima, P. M. Farias, and A. D. Belchior, Fuzzy modeling for function points analysis, Software Quality Journal, Vol. 11, 2003, pp. 149-166.
22. J. Hakkarainen, P. Laamanen, and R. Rask, Neural networks in specification level software size estimation, in P. K. Simpson, ed., Neural Network Applications, IEEE Technology Update Series, 1993, pp. 887-895.
23. A. Heiat, Comparison of artificial neural network and regression models for estimating software development effort, Information and Software Technology, Vol. 44, 2002, pp. 911-922.
24. R. Boehm, Frequently asked questions, http://www.ifpug.org, 2002.
25. A. Abran, M. Maya, J. M. Desharnais, and D. St-Pierre, Adapting function points to real-time software, American Programmer, Vol. 10, 1997, pp. 32-43.
26. C. Jones, Applied Software Measurement: Assuring Productivity and Quality, McGraw-Hill, New York, 2008.
27. G. C. Low and D. R. Jeffery, Function points in the estimation and evaluation of the software process, IEEE Transactions on Software Engineering, Vol. 16, 1990, pp. 64-71.

Bingchiang Jeng joined National Sun Yat-Sen University as an Associate Professor in 1990 and became a full Professor in 1998. He is currently the department chair of Information Management and the graduate director of Communication Management. He received his B.S. in Computer Science and Information Engineering from National Chiao Tung University and his Ph.D. in Computer Science from New York University. His current research interests include software testing, model checking, and computer-aided instruction in programming.

Dowming Yeh is a Professor in the Department of Software Engineering at National Kaohsiung Normal University, Taiwan, R.O.C. He received his Ph.D. in Computer Science from the University of Utah in the U.S.A. Before assuming his current post, he was an Associate Professor in the Department of Management Information Systems at National Pingtung University of Science and Technology and a manager at the Institute for Information Industry in Taiwan. His research interests include software reengineering, e-learning, web engineering, program analysis, and human-computer interaction. Dr. Yeh is a member of the IEEE and the ACM.

Deron Wang joined China Steel Corporation as a system engineer in 1978 and became a section manager of the Information System Department in 1989. He received his M.S.
in Information Management from National Sun Yat-Sen University in 2003. His professional interests include ERP consulting, downsizing migration, data center management, and network security.

Shu-Lan Chu is presently an Associate Technical Specialist in the Southern Regional Regulatory Department of the National Communications Commission. She received her Master of Education from the Institute of Information and Computer Education, National Kaohsiung Normal University, in 2006.

Chia-Mei Chen joined National Sun Yat-Sen University as an Associate Professor in 1996 and became a full Professor in 2004. In addition, she is Section Chief of the Network Division, Office of Library and Information Services. She received her B.S. in Computer Science and Information Engineering from National Chiao Tung University and her Ph.D. in Computer Science from the University of Maryland, College Park. She has served as a coordinator of TWCERT/CC (Taiwan Computer Emergency Response Team/Coordination Center) since 1998 and continues to work in the network security community. Her current research interests include mobile networks, multimedia systems, and network security.