Exploring the Accuracy of Existing Effort Estimation Methods for Distributed Software Projects - Two Case Studies




Master Thesis
Software Engineering
Thesis no: MSE-2009-09
June 2009

Exploring the Accuracy of Existing Effort Estimation Methods for Distributed Software Projects - Two Case Studies

Abid Ali Khan
Zaka Ullah Muhammad

School of Engineering
Blekinge Institute of Technology
Box 520
SE 372 25 Ronneby
Sweden

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Authors:
Abid Ali Khan
Address: Folkparksvägen 19:16, 37240 Ronneby, Sweden
E-mail: bestwish25@gmail.com

Zaka Ullah Muhammad
Address: Folkparksvägen 22:02, 37240 Ronneby, Sweden
E-mail: majorzaka@hotmail.com

University advisor:
Dr. Darja Šmite
Assistant Professor, BTH, School of Computing, SoftCenter, Ronneby

School of Computing
Blekinge Institute of Technology
Box 520
SE 372 25 Ronneby
Sweden
Internet: www.bth.se/tek
Phone: +46 457 38 50 00
Fax: +46 457 271 25

ABSTRACT
Globalization has brought many challenges to the field of software development. Accurate effort estimation in Global Software Development (GSD) is one of them. A number of effort estimation methods are available, but the methods used for co-located projects might not be capable of estimating effort for distributed projects. This is why the failure rate of GSD projects is high. It is important to calibrate existing methods, or to invent new ones, with respect to the GSD environment. This thesis is an attempt to explore the accuracy of effort estimation methods for distributed projects. For this purpose, the authors selected three estimation approaches: COCOMO II, SLIM and ISBSG. COCOMO II and SLIM are two well-known effort estimation methods, whereas ISBSG is used to check the trend of a project against ISBSG's repository. The methods and approaches were selected for their popularity and their advantages over other methods/approaches. Two finished projects from two different organizations were selected and analyzed as case studies. The results indicated that effort estimation with COCOMO II deviated by 15.97% for project A and 9.71% for project B, whereas SLIM showed a deviation of 4.17% for project A and 10.86% for project B. The authors concluded that both methods underestimated the effort in the studied cases. Furthermore, factors that might cause deviation are discussed and several solutions are recommended. In particular, the authors state that existing effort estimation methods can be used for GSD projects, but they need to be calibrated with GSD factors in mind to achieve accurate results. This calibration will help in process improvement of effort estimation.

Keywords: Effort Estimation, Global Software Development (GSD), COCOMO II, SLIM, ISBSG

ACKNOWLEDGEMENT
First, we are grateful to our creator, who blessed us with the abilities to contribute to human knowledge. We would like to express our heartfelt gratitude to our advisor Dr. Darja Šmite, who always guided and motivated us during this thesis work. Besides acting as an advisor, she used her leadership abilities to encourage and motivate us when we needed it. Whenever we were stuck on an issue regarding the thesis, she always responded positively. She assisted us in numerous ways so that we could understand things easily. This thesis would not have been possible without her kind support, and we are thankful to her for helping us throughout. We would also like to thank Dr. Cidgem Gencel, Peter Vd Stad (QSM) and Christine Green (EDS) for their kind support and for providing us with valuable information regarding our work. Our sincere thanks also go to our friends and fellows for their moral support during our studies. Last but not least, we would like to thank our families for their love, motivation and support, which have guided us to this milestone. We dedicate our work to our respective families.

Contents
ABSTRACT
ACKNOWLEDGEMENT
1 INTRODUCTION
1.1 BACKGROUND
1.2 PROBLEM DEFINITION
1.3 AIMS AND OBJECTIVES
1.4 RESEARCH QUESTIONS
1.5 RESEARCH OUTCOMES
1.6 RESEARCH METHODOLOGY
1.6.1 Literature Review
1.6.2 Empirical Case Studies
1.6.2.1 Data Source
1.6.2.2 Data Analysis
1.7 VALIDITY THREATS
2 PROJECT MANAGEMENT IN GSD
2.1 GSD CHALLENGES
2.2 PROJECT MANAGEMENT IN GSD
3 EFFORT ESTIMATION
3.1 OBJECTIVES OF EFFORT ESTIMATION
3.2 EFFORT ESTIMATION METHODS
3.2.1 COCOMO II
3.2.1.1 Size estimation
3.2.1.2 Cost Drivers
3.2.1.3 Scale Factors
3.2.1.4 COCOMO II Equation
3.2.2 SLIM
3.2.2.1 SLIM Tool
3.3 CONTINUOUS IMPROVEMENT
3.3.1 Reasons for Selecting COCOMO II and SLIM
4 CASE STUDIES
4.1 PROJECTS OVERVIEW
4.1.1 Project A
4.1.1.1 Description
4.1.2 Project B
4.1.2.1 Description
4.2 EFFORT ESTIMATION WITH SELECTED APPROACHES
4.2.1 Effort Estimation with COCOMO II
4.2.1.1 Equation for COCOMO II
4.2.1.2 Project A: COCOMO II
4.2.1.3 Project B: COCOMO II
4.2.2 Effort Estimation with SLIM
4.2.2.1 Some Facts about SLIM
4.2.2.2 Equations for SLIM
4.2.2.3 Project A: SLIM
4.2.2.4 Project B: SLIM
4.3 ESTIMATION WITH ISBSG
4.3.1 Project A: ISBSG
4.3.2 Project B: ISBSG
4.4 SUMMARY OF CALCULATIONS
5 ACCURACY OF EFFORT ESTIMATION
5.1 COMPARISON OF ESTIMATED RESULTS VS ACTUAL EFFORT
5.2 FACTORS THAT MIGHT CAUSE DEVIATION
5.2.1 Project A
5.2.2 Project B
5.2.3 Comparison of Factors that Might Cause Deviation
5.3 RELATED WORK
6 RECOMMENDATIONS FOR IMPROVING EFFORT ESTIMATION
6.1 SUGGESTIONS AND NEW FACTORS INVOLVEMENT FOR GSD
6.1.1 Purpose and Process of Calibration
7 DISCUSSION AND CONCLUSIONS
7.1 RESULTS
7.2 APPLICABILITY OF DIFFERENT METHODS IN INDUSTRY
7.3 FUTURE WORK
8 REFERENCES
APPENDIX X
APPENDIX Y

LIST OF TABLES
Table 1: Cost Drivers - COCOMO II
Table 2: Analyses of different effort estimation methods [21, 23]
Table 3: Selected Effort Estimation Methods [21, 23]
Table 4: Actual Effort - Project A
Table 5: Actual Effort - Project B
Table 6: Scale Factors - Project A
Table 7: Cost Drivers (Effort Multiplier) - Project A
Table 8: Scale Factors - Project B
Table 9: Cost Drivers (Effort Multiplier) - Project B
Table 10: Variables and Values for Putnam's Equation
Table 11: Analysis and Selection of ISBSG
Table 12: SLOC/FP [49]
Table 13: Total Functional Size of Studied Projects
Table 14: Summary of Calculations with COCOMO II and SLIM
Table 15: Summary of Calculations with ISBSG
Table 16: Comparison of Estimated Results Vs Actual Effort
Table 17: Estimation Results from ISBSG - Both Projects
Table 18: Factors that might cause deviation - All factors
Table 19: Factors that might cause deviation - Not common in both projects

LIST OF FIGURES
Figure 1: Research Outline
Figure 2: Project Management and GSD Factors [50]
Figure 3: Effort Estimation Equation - COCOMO II
Figure 4: Putnam's Equation for SLIM (Productivity Parameter) [29]
Figure 5: Putnam's Equation for SLIM (Effort Estimation)
Figure 6: Phase Division - Project A
Figure 7: Phase Division - Project B
Figure 8: Putnam's Equation for SLIM with PP showing PI and B value [29]
Figure 9: Effort Estimation with ISBSG - Project A
Figure 10: Effort Estimation with SLIM - Project B
Figure 11: Comparison of Actual Effort with Estimated from SLIM and COCOMO II
Figure 12: Comparison of Actual Calendar Month with Estimated from ISBSG
Figure 13: Effort Multipliers Outsourcing Factors for Offshore Outsourcing Software Development [2]
Figure 14: Proposed equation for COCOMO II [2]

Abbreviations
GSD: Global Software Development
ISBSG: International Software Benchmarking Standards Group
RE: Requirement Engineering
SRS: Software Requirements Specification
EDS: Electronic Data System
SLIM: Software Life-cycle Model
PI: Productivity Index
MBI: Manpower Buildup Index
LOC: Lines of Code
SLOC: Source Lines of Code
PM: Person Month
EM: Effort Multiplier
SF: Scale Factors
QSM: Quantitative Software Management
COCOMO II: Constructive Cost Model II
SPI: Software Process Improvement
IEEE: Institute of Electrical and Electronics Engineers
ACM: Association for Computing Machinery
HRM: Human Resource Management
UK: United Kingdom
PAK: Pakistan
USA: United States of America
ISO: International Organization for Standardization
MS: Microsoft

1 INTRODUCTION
1.1 Background
Globalization is on the doorstep: economic growth and a rapid stream of new inventions are forcing the software industry to speed up development in order to cope with these challenges. Software development needs to be globalized in order to save time, cost and resources [1]. Global Software Development (GSD) is a strategy in which software development is performed across boundaries such as contextual, organizational, cultural, temporal, geographical and political ones. In GSD, the software life cycle activities are distributed among teams across these boundaries [2]. Distributing the activities among different organizations all over the world raises a number of questions about the realization and successful execution of GSD projects. However, the tools, techniques and methods available for distributed projects are insufficient [3], and the software industry is struggling to invent new ones [3]. This is one of the reasons why GSD still uses the same tools, techniques, methods and even practices for effort estimation that are used in co-located projects [4]. Effort estimation is a process that estimates the necessary effort (cost, time, etc.). It is highly important to make accurate and reliable estimates for a project [35, 36] at the beginning [5], in order to support project planning and to keep the project within budget and schedule [1]. To be called successful, a project has to be on time, within budget, and has to fulfil the customer's requirements effectively. On-time delivery is a burning issue in development organizations [1, 3, 33], and it becomes more difficult in distributed organizations. The complex nature of distributed projects is one of the hurdles in effort estimation, and because of it organizations fail to estimate accurately. There is a need to overcome these problems.
Researchers and software engineering communities are struggling to explore these complexities and their solutions [7, 8, 9]. The process of software effort estimation has been explored extensively over the past couple of decades [4]. Research shows that wrong estimation of effort can lead a project to failure [10]. A wrong estimate of the expected cost reduction tends to make a project fail, because organizations do not calculate all the risks and costs associated with outsourced development. Statistics show that 60% of GSD projects fail [11]; more recent research puts this figure at 40% [12]. Research also highlights that 2 out of 5 international joint-venture project teams perform poorly [13]. Developing tools, techniques and methods to overcome these failures is therefore a big challenge. Methods like COCOMO II, SLIM and COBRA have been developed to support the software development industry [6]. These models contribute to process improvement [14, 39, 40] and help the planning team. Magne Jorgensen [15] used regression analysis to explain the variation in accuracy and bias of an organization's estimates of software development effort. He collected information about variables that could affect the accuracy or bias of the estimates for tasks completed by the organization. He concluded that this type of formal analysis and regression-based model can, in some cases, support human judgment [15, 37, 38]. Estimation of cost and duration is one of the inherent problems and main challenges of the software engineering business. It is difficult for a project manager to give an estimate for a distributed project because additional factors are involved that should be taken into account when estimating effort.
Besides these additional factors, the processes of communication and coordination (which consume considerably more effort than in co-located projects) also contribute to the difficulty of estimating distributed software projects [2].

Co-located development practices are frequently used in GSD. Reusing the same practices is reasonable, but the question is whether the tools, techniques and methods used in co-located development are also efficient for GSD. There are no effort estimation methods specific to distributed projects [2]; distributed projects use the conventional effort estimation methods. The literature does not evaluate which effort estimation methods provide accurate results for distributed projects and which lead to faulty results [2]. Literature about the factors affecting the accuracy of existing effort estimation methods for distributed projects is also lacking. This thesis is an attempt to explore the accuracy of existing effort estimation methods for GSD projects and to investigate the factors affecting their accuracy.

1.2 Problem Definition
As stated above, current effort estimation methods used for co-located software projects are not capable enough to estimate effort and duration accurately for distributed software projects. This results in cost and schedule overruns, and most of the projects fail. It is therefore necessary to find new methods or to calibrate existing methods for distributed software projects. This requires exploring the accuracy of existing effort estimation methods and finding potential improvement alternatives.

1.3 Aims and Objectives
The main aim of this thesis project is to explore the accuracy of existing effort estimation methods for distributed software projects.
To meet this goal, the following objectives are set:
- Investigation of well-known effort estimation methods in software engineering;
- Collection of empirical data from distributed projects for further analysis;
- Application of different effort estimation methods;
- Comparison of results produced by different methods/approaches with the actual effort of investigated projects;
- Analysis of the results;
- Comparison of the results from COCOMO II and SLIM with related studies;
- Improvement suggestions.

1.4 Research Questions
Corresponding to the aims and objectives, the research questions for this thesis project are:
- Do existing effort estimation methods accurately estimate effort for distributed software projects?
- Do existing effort estimation methods provide optimistic or pessimistic calculations for the selected studied cases?
- Which method tends to provide more accurate estimates?
- In case of deviations, what are the factors that should be considered when calculating effort for distributed software projects?
- What could be done to improve effort estimation methods in distributed software projects?

1.5 Research Outcomes
In answering the research questions above, the following outcomes were produced:
- Analysis of the accuracy of effort estimation methods/approaches (COCOMO II, SLIM and ISBSG) for two finished distributed projects;
- Evaluated trends in the deviation of COCOMO II, SLIM and ISBSG effort estimates from the actual effort for the selected studied cases;
- List of factors influencing effort estimation in the studied cases;
- Recommendations for Software Process Improvement (SPI) activities in relation to effort estimation for distributed projects;
- Recommendations with respect to related studies;
- Hypotheses for future research.

1.6 Research Methodology
The research conducted within this thesis was exploratory in nature and based on two case studies. Interviews were conducted in order to obtain empirical data on GSD projects in the studied organizations. A structured questionnaire (see Appendix X) was designed and sent to the project manager in advance, so that he would know what would be asked during the interview. The questionnaire was divided into several parts to collect personal data of the interviewee as well as organizational and project information. The main categories defined in the questionnaire were:
- Basic information about the interviewee;
- General information about the organization and its projects/products;
- Questions regarding the finished project and the team for that project;
- Domain related questions;
- Data related questions.
The interviews were conducted in a very friendly environment. The authors tried to extract all the related information from the interviewee. The project manager helped the authors fill in the questionnaire, which was documented later. Afterwards, the authors used emails and telephone calls to collect missing data. The collected data was later used as input for analyses, calculations and comparisons. The actual effort reported by the organizations and the results from the effort estimation methods in this research were compared with each other. These comparisons showed some variation between them.
The results were also compared with related scientific studies. From this point, a discussion of the deviation in the results was started, and the factors that caused these deviations were discussed as well. Last but not least, suggestions were given for including some factors related to the distributed environment in existing effort estimation methods. This also led to new hypotheses for future studies. To make the research outline clear, the authors divided the above description into four phases. The following diagram shows a detailed view of the research outline.

Figure 1: Research Outline

In phase 1, the authors gathered data on two finished GSD projects from different organizations. This data consisted of user manuals, SRS documents and the size of each project in SLOC. Interviews were the main approach for gathering data. In phase 2, the authors analyzed the gathered data according to the selected effort estimation approaches. In phase 3, the estimated results from phase 2 were compared with the actual effort and with related studies. Phase 4 consisted of analyses of different factors affecting these projects. The authors provided some recommendations based on the estimated results and the analyses of the affecting factors. At the end, the authors drew conclusions, discussed the outcomes of this research, and provided some hypotheses for future work.

1.6.1 Literature Review
The authors used a literature review to gain background and current knowledge of effort estimation methods, their pros and cons, and their usage in the software industry. This background knowledge helped the authors during the empirical case studies. Furthermore, the literature review helped in comparing the case study results with the scientific literature. This led to suggestions for improving effort estimation in GSD.

1.6.2 Empirical Case Studies
Data from two finished GSD projects was used in this research project. COCOMO II, SLIM and ISBSG were examined with the data collected from the GSD organizations. The authors used the collected data as input for the calculations and obtained different results. These results were compared with different related studies.
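The comparison of estimated and actual effort described above can be sketched in a few lines. This is a minimal illustration, assuming that deviation is measured as the signed difference relative to the actual effort; the function names and the person-month figures are hypothetical and are not taken from the case studies.

```python
def signed_deviation_percent(actual_effort, estimated_effort):
    """Return the estimate's deviation from the actual effort, in percent.

    Assumption: deviation is expressed relative to the actual effort.
    Negative values mean the method underestimated (optimistic estimate),
    positive values mean it overestimated (pessimistic estimate).
    """
    if actual_effort <= 0:
        raise ValueError("actual effort must be positive")
    return (estimated_effort - actual_effort) / actual_effort * 100.0


def classify_estimate(deviation_percent):
    """Label an estimate as optimistic, pessimistic or exact."""
    if deviation_percent < 0:
        return "optimistic (underestimate)"
    if deviation_percent > 0:
        return "pessimistic (overestimate)"
    return "exact"


# Hypothetical figures in person-months, not data from the studied projects.
actual = 50.0
estimates = {"method_1": 45.0, "method_2": 52.5}
for name, estimated in estimates.items():
    deviation = signed_deviation_percent(actual, estimated)
    print(f"{name}: {deviation:+.2f}% -> {classify_estimate(deviation)}")
```

Keeping the sign of the deviation, rather than only its magnitude, is what allows the same number to answer both the accuracy question and the optimistic-versus-pessimistic question.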

1.6.2.1 Data Source
Scientific literature and the industry were the main sources for this thesis. Online databases were used to gather scientific research published through IEEE, ACM, and other libraries. Two structured interviews were conducted with the project managers of the investigated GSD projects. These project managers were highly experienced in the field of software development and had extensive knowledge of the studied cases. The interviews were documented immediately. Both the literature review and the interviews were used as input for further operations in this thesis.

1.6.2.2 Data Analysis
The collected data was analyzed according to the objectives of this thesis. The authors analyzed different factors related to COCOMO II, SLIM and ISBSG. These analyses were then presented in tabular form. The investigation was completed on the basis of the analyzed data. Furthermore, the investigation was used to model the possible results and to perform further actions such as comparisons and recommendations. During analysis, the authors considered all parameters that helped in the accomplishment of this thesis report.

1.7 Validity Threats
As mentioned above, this study was exploratory in nature. The authors investigated two finished GSD projects in order to explore the accuracy of existing effort estimation methods. Providing the data required for this thesis raises many confidentiality issues for an organization. Therefore, it was quite difficult to find more case studies in this short period. The authors documented the interview data immediately, before details could be forgotten. Results gained from these two case studies might not be strong enough to support decision making. However, they showed a trend, and the authors pointed out some critical issues. These results still need to be validated over a larger number of projects. Furthermore, the case study organizations were distributed across the UK and PAK.
The results of this thesis therefore still need to be validated in organizations with different structures and boundaries, where the time zone difference is more than eight hours, such as Africa, USA, etc.

2 PROJECT MANAGEMENT IN GSD
2.1 GSD Challenges
The phenomenon of Global Software Development (GSD) is gaining momentum, and its importance grows with the passage of time. GSD offers highly skilled personnel at low cost. This is one of the reasons behind the shift of the software industry from co-located development to GSD. The shift has brought some new challenges to software development. The problems arising from geographical, temporal and socio-cultural distance are the main challenges for GSD. Lack of formal communication generates misunderstandings about requirements or changes in the requirement specification. Real-time contact within a distributed team is more difficult with temporal and geographical differences. Socio-cultural differences lead to problems such as different opinions about the nature of the software development process. Language problems also discourage employees from taking part in online meetings. To avoid this fear, an employee prefers asynchronous communication, e.g. email. However, sometimes a problem cannot be expressed through words alone; it needs gesture, body language and pitch [16, 41, 42, 43]. Moreover, distance and the complexity of coordination are directly proportional: as distance increases, the complexity of coordination in the software development process increases. This complexity also brings a lack of familiarity with remotely located colleagues and an increase in communication cost. Trust and teamwork cannot develop in this situation. A GSD environment urgently needs to reduce these complexities and to enhance the ability to focus on coordination of resources [16, 17, 44]. There is a need to overcome these challenges and to provide useful and common solutions to them.

2.2 Project Management in GSD
Because of time zone, geographic and communication differences, project management is now a great challenge in GSD.
Project management seems to be the same entity in both co-located and GSD environments, and an experienced project manager can set the project on the path to success. However, some concealed risks sway a project's success. Project managers are advancing and updating themselves according to the new paradigm of GSD. Both literature and industry fail to provide standardized approaches, insights into GSD process management, and solutions to its problems [18, 45, 46, 47]. Some generic problems and hurdles need to be solved and removed. According to the conclusions of [18] on global project management, problems in the manner of communication should be overcome. Employees also hesitate to adopt new means of interactive communication. Switching personnel between partners according to task distribution causes problems associated with a wrong approach to project process distribution, which affects the overall result of the project. There are also problems with unclear shared goals in the GSD environment. A project manager has to share this knowledge with the employees and the organizations. Effort estimation is one of the important activities of project management. It is closely associated with risk management [50]. For a GSD project, risk management requires more attention because some additional GSD factors are involved. According to [50], GSD factors, e.g. multisourcing, geographical distribution, temporal diversity, socio-cultural diversity, linguistic diversity, contextual diversity and political and legislative diversity, are the main roots of the GSD threats that imperil the success of a global project. These threats directly affect GSD project management activities. They reveal the unusual nature of GSD projects, and they generate forces that create obstacles in a project. Figure 2 represents the relation between GSD factors, their threats, and the effects of these threats on GSD project management.

[Figure 2: Project Management and GSD Factors [50]. GSD factors shown: multisourcing, geographic distribution, temporal diversity, socio-cultural diversity, linguistic diversity, contextual diversity, political and legislative diversity. GSD threats shown: lack of language skills, terminology difference, lack of joint risk management, wrong effort estimation, lack of trust, temporal difference, complex project measurement, inconsistency in work practice.]

Project managers need significant effort to forecast the obvious and hidden risks associated with GSD factors, and to take the necessary precautions to overcome the challenges associated with these factors in order to succeed in GSD. All these GSD factors affect project management activities and thus should be taken into account by project managers when estimating the project schedule.

3 EFFORT ESTIMATION

3.1 Objectives of Effort Estimation

Effort estimation plays a key role in software engineering. It helps in predicting the effort and duration required to complete a project [9]. Research shows that projects that overrun are often unplanned or use unrealistic approaches to plan their costs and schedules [9, 20, 35]. As the cost and duration of projects continue to increase, research attention is turning to better methods for accurate effort estimation. Accurate software project estimation is one of the most important activities in software development and can help solve the problem of project delays [20, 35, 48]. It is very difficult to plan and control projects without an effective and sound estimate of effort and of schedule, i.e. the calendar time required. Therefore, effort must be estimated for software development projects in order to set deadlines and costs while meeting quality goals. Project history helps in estimating new projects and supports good results [20]. Estimation also draws on the expertise of estimators, which leads towards good results.

3.2 Effort Estimation Methods

This section describes two well-known effort estimation methods, COCOMO II and SLIM, which are used in this thesis for analyzing data from finished projects. The authors also give some reasons for selecting these methods.

3.2.1 COCOMO II

COCOMO II is an extension of COCOMO 81, which was originally published by Dr. Barry Boehm in Software Engineering Economics [21] in 1981. The name COCOMO is derived from COnstructive COst MOdel. It is a widely used method for estimating the cost and schedule of projects at early stages, and it supports decision making for software projects. It estimates project cost, which is derived directly from effort in Person Months (PM). A PM is the number of hours one person spends working on a software development project in a calendar month.
Its nominal value is defined as 152 hours for one person in a calendar month; 160 person-hours are also treated as one PM [20]. These hours are calculated by excluding the weekends and leave allowed in one calendar month. COCOMO II estimates effort in PM. Note that PM and project duration are different quantities. For example, a project estimated at 20 PM can have a duration of 5 months [20], which corresponds to an average of four people working in parallel.

COCOMO II has a well-described structure for estimating the necessary effort and duration of projects. Its main input is project size, expressed in Source Lines Of Code (SLOC) or Function Points (FP). Many organizations now use FP as the size input for estimating effort in COCOMO II [27]. COCOMO II deals with a large variety of factors that influence the estimate of a project. It has 17 cost drivers (for the Post-Architecture model) and 5 scale factors. The scale factors were introduced in COCOMO II and were not available in COCOMO 81 [20, 22]. COCOMO II has three sub-models:

Application Composition Model;
Post-Architecture Model;
Early Design Model.

Projects that use Integrated Computer Aided Software Engineering tools are mostly estimated for effort and schedule with the Application Composition Model. Rapid application development uses these tools. Furthermore, these tools also support prototyping activities occurring later in the spiral model [22]. The Early Design and Post-Architecture models are used to estimate effort and schedule for projects such as software infrastructure, embedded systems and large applications [20]. When project analysis is incomplete, the Early Design model is used for a rough estimate, whereas the Post-Architecture model is used once the analysis and top-level design are complete and detailed information about the project is known [20]. The Post-Architecture and Early Design models use the same approach to determine project size and scale factors. Code and other data reused from previous projects are also included in product sizing [20].

3.2.1.1 Size estimation

Software size is a direct input to effort and schedule estimation in COCOMO II, so it is very important for a reliable estimate. Sizing is a difficult task because it covers new, reused and modified code. COCOMO II uses an aggregate size for new code and for reused code with updates. The adjustments take into account the amount of design, code and testing changes. Two main size measures are used in COCOMO II for effort and schedule estimation [20]:

Source Lines Of Code (SLOC);
Function Points (FP).

3.2.1.2 Cost Drivers

COCOMO II enhances the cost drivers of COCOMO 81 and adds new ones. For the Post-Architecture model, these cost drivers are divided into four categories. Table 1 shows the cost drivers used in the Post-Architecture model [20].
Product Factors | Platform Factors | Personnel Factors | Project Factors
Required Software Reliability (RELY) | Execution Time Constraint (TIME) | Analyst Capability (ACAP) | Use of Software Tools (TOOL)
Database Size (DATA) | Main Storage Constraint (STOR) | Programming Capability (PCAP) | Multisite Development (SITE)
Product Complexity (CPLX) | Platform Volatility (PVOL) | Application Experience (APEX) | Required Development Schedule (SCED)
Developed for Reusability (RUSE) | | Platform Experience (PLEX) |
Documentation Match to Life-Cycle Needs (DOCU) | | Personnel Continuity (PCON) |
 | | Language & Tools Experience (LTEX) |

Table 1: Cost Drivers COCOMO II

3.2.1.3 Scale Factors

These factors were not available in COCOMO 81 and were added in COCOMO II. They let the estimation team make a better approximation of influencing factors, and they relate to organizational and team characteristics. Each scale factor is rated on a scale from very low to extra high, and each rating level has a weight. The weights of the scale factors can differ between organizations and projects. The scale factors in COCOMO II are [20]:

Precedentedness (PREC);
Development Flexibility (FLEX);
Architecture / Risk Resolution (RESL);
Team Cohesion (TEAM);
Process Maturity (PMAT).

3.2.1.4 COCOMO II Equation

The basic equation for COCOMO II is shown in Figure 3 [20, 28]:

$$PM = A \times \mathrm{Size}^{E} \times \prod_{i=1}^{17} EM_i$$

Figure 3: Effort Estimation Equation COCOMO II

where A = 2.94 (for COCOMO II), Size is measured in Kilo Source Lines Of Code (KSLOC), $EM_i$ are the effort multipliers, whose values come from the cost drivers given in Table 1, and E is the exponent calculated from the scale factors (SF):

$$E = B + 0.01 \times \sum_{j=1}^{5} SF_j$$

where B = 0.91 for COCOMO II [20]. The equation for duration in COCOMO II is [20, 28]:

$$\mathrm{Duration} = C \times (PM_{NS})^{D + 0.2 \times (E - B)}$$

where C = 3.67, D = 0.28, B = 0.91, and $PM_{NS}$ is the effort in PM excluding the SCED cost driver:

$$PM_{NS} = A \times \mathrm{Size}^{E} \times \prod_{i=1}^{16} EM_i$$
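The COCOMO II equations of Section 3.2.1.4 translate directly into a few lines of Python. The sketch below is only an illustration: the function and parameter names are not from the source, and the input values in the usage example are hypothetical, chosen so that E is 1.0 and all effort multipliers are nominal.

```python
import math

A = 2.94  # multiplicative constant for COCOMO II, as stated in the text
B = 0.91  # base exponent
C = 3.67  # schedule coefficient
D = 0.28  # schedule base exponent


def exponent(scale_factors):
    """E = B + 0.01 * (sum of the five scale factor weights)."""
    return B + 0.01 * sum(scale_factors)


def effort_pm(ksloc, scale_factors, effort_multipliers):
    """Effort in person-months: A * Size^E * product of effort multipliers."""
    return A * ksloc ** exponent(scale_factors) * math.prod(effort_multipliers)


def duration_months(ksloc, scale_factors, multipliers_without_sced):
    """Duration = C * PM_NS^(D + 0.2 * (E - B)); PM_NS omits the SCED driver."""
    e = exponent(scale_factors)
    pm_ns = effort_pm(ksloc, scale_factors, multipliers_without_sced)
    return C * pm_ns ** (D + 0.2 * (e - B))


# Hypothetical input: 10 KSLOC, scale factors summing to 9.0, nominal drivers.
example_effort = effort_pm(10.0, [1.8] * 5, [1.0] * 17)
example_duration = duration_months(10.0, [1.8] * 5, [1.0] * 16)
```

With these hypothetical inputs the effort is about 29.4 PM, since E is 1.0 and the multiplier product is 1.0, so the result reduces to A times the size.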

3.2.2 SLIM

SLIM is another algorithmic method used to estimate the effort and schedule of projects. SLIM stands for Software LIfe-cycle Model and was introduced by Putnam [23]. It was developed for measuring the general size of a project based on its estimated SLOC. Later, it was modified for effort estimation using the Rayleigh curve model [23]. Equation I (Figure 4) is used to find the productivity parameter (PP), which in turn is used to calculate the effort in man-years. Equation II (Figure 5) is used to calculate effort, using the value of PP from Equation I.

Figure 4: Putnam's Equation for SLIM (Productivity Parameter) [29] (Eq. I)

This implies that:

Figure 5: Putnam's Equation for SLIM (Effort Estimation) (Eq. II)

where
B is the special skill factor;
PP is the productivity parameter;
Duration is the development schedule length in years;
Size is the size in Source Lines Of Code.

3.2.2.1 SLIM Tool

The SLIM tool is a proprietary, metrics-based estimation tool built on Putnam's model. It was developed by Quantitative Software Management (QSM) using validated data from over 2600 projects, which were classified into nine application categories. The tool helps management estimate the effort and time required to build medium and large software projects. Most importantly, it can be customized for a specific organization [24].

There are two main management indicators, the Productivity Index (PI) and the Manpower Buildup Index (MBI). PI can be taken as process productivity. PI values have been observed from 0.1 to 34, whereas the SLIM tool allows values from 0.1 to 40.0 to accommodate future contingencies. A high PI value means that the project's productivity is high and its complexity is low. A PI value below average implies 10% more development time and 30% more cost. The second important indicator, MBI, is a measure of the staff buildup rate. MBI is influenced by factors such as schedule pressure, task concurrency and resource constraints.
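As a reading aid for Equations I and II of Section 3.2.2, Putnam's software equation is commonly stated in the form below. This is the standard textbook form and is given here on the assumption that it matches the content of Figures 4 and 5; the symbols are the ones defined in Section 3.2.2 (Size in SLOC, Effort in man-years, Duration in years, B the special skill factor).

```latex
% Eq. I: productivity parameter
PP = \frac{\mathrm{Size}}{\left(\mathrm{Effort}/B\right)^{1/3} \, \mathrm{Duration}^{4/3}}

% Eq. II: Eq. I solved for effort
\mathrm{Effort} = \left(\frac{\mathrm{Size}}{PP \cdot \mathrm{Duration}^{4/3}}\right)^{3} B
```

Because Duration enters Eq. II with the power 4 (the exponent 4/3 cubed), a modest change in schedule has a large effect on the estimated effort in this model.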
MBI values are observed in the range of -3 to 10. A low MBI value implies [24]:

Longer time;
Fewer people;
Less effort;
Fewer defects;
Fewer LOC/month;
Higher LOC/PM.

Effort estimation with the SLIM tool involves the following steps [24]:

Estimate the product size;
Select or analyze the PI and MBI values for the project;
Obtain the minimum-time solution;
Determine alternative time-effort-cost solutions;
Choose the desired solution;
Generate the project plan from the chosen solution.

3.3 Continuous Improvement

Calibration is an important consideration in estimation. Calibration is a technique that allows a general model to be applied to a specific set of data. Case studies have shown that calibration is important for estimation for the following reasons [22, 25]:

The size of a project and its relation to effort, which different estimation methods rely on, can differ between environments (coding skills, tools used, platform understandability, etc.);
Using generic algorithms can involve high risk because they may not match the organization's structure and way of working;
Organizational needs and the availability of data vary.

Historical data about completed projects plays an integral role in effort estimation. However, it is important to calibrate the historical data before using it for estimation [34]. Factors that influence calibration include similar projects executed in the past, features of the finished project, and the complexity of the software to be developed [25, 26]. Organizations need to determine whether the estimates they arrive at for a project are realistic. This can be done by validating estimates against data from completed projects, which shows how accurate the estimates are. The current development environment must also be considered in relation to previously completed projects [26].
Calibrating co-located project experience would be helpful for GSD projects. Every organization might have its own calibrated estimation models, and even within one organization different models might be used for different kinds of projects. The methodology may remain the same, but some influencing factors might vary from project to project, which makes calibration very important [26, 31]. Organizations hold data on many completed projects, and it would be beneficial to use that data to estimate effort and duration for new projects. In general, existing effort estimation methods need to be calibrated by adding new factors related to the GSD environment.
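One simple way to apply a general model to local data is to refit only its multiplicative constant against completed projects. The sketch below assumes a COCOMO II style model of the form PM = A * Size^E * product(EM) and chooses A as the geometric mean of actual effort divided by the remaining model term. The function name and the project history are hypothetical and purely illustrative; the text does not prescribe this procedure.

```python
import math


def calibrate_constant(projects):
    """Return a locally calibrated multiplicative constant A.

    Each project is a tuple (actual_pm, ksloc, exponent_e, em_product).
    A is the geometric mean of actual_pm / (ksloc^E * em_product),
    computed in log space.
    """
    log_ratios = [
        math.log(actual_pm) - math.log(ksloc ** e * em_product)
        for actual_pm, ksloc, e, em_product in projects
    ]
    return math.exp(sum(log_ratios) / len(log_ratios))


# Hypothetical history of completed projects (illustrative values only).
history = [
    (30.0, 10.0, 1.0, 1.0),
    (60.0, 20.0, 1.0, 1.0),
]
local_a = calibrate_constant(history)  # about 3.0 for this made-up history
```

Working in log space keeps a single large project from dominating the fit, which is why the geometric rather than the arithmetic mean is used in this sketch.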

There is also a need to introduce new tools and techniques for calibrating existing effort estimation methods. Dynamic tools and techniques are one option in this regard [31, 32].

3.3.1 Reasons for Selecting COCOMO II and SLIM

A number of methods are used for effort estimation. In this thesis, two well-known methods (COCOMO II and SLIM) were used. The authors investigated and analyzed different effort estimation methods against several selection criteria. Table 2 shows the analysis of these methods for software projects.

Reasons | COCOMO II | SLIM | Expert Judgment | Analogy | Parkinson | Price to Win | Top Down | Bottom-Up
Availability of literature | Many | Sufficient, some are not free | Available but not sufficient | Satisfactory | Few | Few | Few | Few
Availability of tools | Free tools available | Yes, but most of the tools are not free | Not specific | Few | - | Not specific | Not specific | Not specific
Coverage of parameters | More than 17 cost drivers and 5 very important scale factors | 2 main parameters such as MBI and PP | Depends on expert | Depends on the nature of the project with which it has to be compared | - | Varies accordingly | Varies accordingly | Varies accordingly
Used for early estimation | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Algorithmic | Yes | Yes | No | No | No | No | No | No
Widely used | Yes | Yes | Yes | No | No | No | No | No
Possible to calibrate | Yes, different factors are available according to situations | Yes, according to QSM | - | Yes | - | - | Yes | Yes

Table 2: Analyses of different effort estimation methods [21, 23]

After the analysis, the authors selected two effort estimation methods (COCOMO II and SLIM) in order to assess their accuracy for the selected cases. Both methods have many advantages over the other methods. Table 3 summarizes the selection criteria for COCOMO II and SLIM specifically.

Reasons | COCOMO II | SLIM
Availability of literature | Many | Sufficient, some are not free
Availability of tools | Free tools available | Yes, but most of the tools are not free
Coverage of parameters | More than 17 cost drivers and 5 very important scale factors | 2 main parameters such as MBI and PP
Used for early estimation | Yes | Yes
Algorithmic | Yes | Yes
Widely used | Yes | Yes
Possible to calibrate | Yes, different factors are available according to situations | Yes, according to QSM

Table 3: Selected Effort Estimation Methods [21, 23]

4 CASE STUDIES

4.1 Projects Overview

After the literature review of both methods, the authors selected two different organizations for data collection. Data was collected through email exchanges and online interviews. The parameters of both methods were considered during the interviews so that data could be elicited according to these parameters. Other necessary data, such as delay factors and the project managers' suggestions for improvement, were also collected.

4.1.1 Project A

4.1.1.1 Description

Name: Workforce Evaluation Tool

The Workforce Evaluation Tool offers an online balanced scorecard that helps users assess changes in workforce practice. It enables organizations to calculate performance in four key perspectives: workforce, customer, service, and finance. The tool offers a visual representation of performance in these perspectives and highlights the best and worst performance indicators.

The data collection process was very difficult because of time constraints and the unavailability of the concerned person. Confidentiality was another issue, and documentation was lacking. Although the studied organization is ISO 9001:2008 and ISO 20000 certified and a Microsoft Gold Certified Partner, it was observed that the organization does not follow formal steps of the software life cycle, such as documentation and record keeping. An online interview was conducted to discuss the questionnaire and to extract more information about the project. The interview session lasted one and a half hours. A series of emails was exchanged later to remove ambiguities from the gathered data. The interview information was then documented and arranged in a readable and understandable form. This data was used as input to both methods. The studied project was initially estimated through expert judgment at three calendar months, but it was delayed. The actual completion time of the project was four calendar months.
According to the project manager, the development team consisted of 7 persons. The project manager also provided the time spent on each activity, which helped the authors determine the actual effort of 14.4 person months. Two locations were involved in this project, and the project activities were distributed as follows. Requirement engineering and deployment were performed in the onshore (UK) office of the organization, whereas designing, coding, technical writing and testing were performed in the offshore (PAK) office. The project had no explicitly designated project manager, and the requirements analyst carried out the project management activities. Figure 6 shows the phases of the project. Table 4 provides the detailed distribution of effort spent on each activity.

Figure 6: Phase Division Project A

Phase | Number of Persons | Person Hours | Person Days
Requirements Engineering | 2 | 352 | 44
Design | 1 | 128 | 16
Coding | 2 | 1152 | 144
Testing | 1 | 400 | 50
Technical Writing | 1 | 152 | 19
Total | 7 | 2184 | 273 = 14.36 person months

Table 4: Actual Effort Project A
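The totals in Table 4 can be checked with a short script. The sketch assumes the nominal 152 hours per person month quoted in Section 3.2.1 and an 8-hour working day, which is implied by the table itself (352 hours = 44 days); the variable names are illustrative.

```python
HOURS_PER_PERSON_MONTH = 152  # nominal COCOMO II value quoted in the text
HOURS_PER_DAY = 8             # assumption implied by Table 4 (352 h = 44 days)

# (phase, persons, person_hours, person_days) as listed in Table 4
table4 = [
    ("Requirements Engineering", 2, 352, 44),
    ("Design", 1, 128, 16),
    ("Coding", 2, 1152, 144),
    ("Testing", 1, 400, 50),
    ("Technical Writing", 1, 152, 19),
]

total_persons = sum(persons for _, persons, _, _ in table4)
total_hours = sum(hours for _, _, hours, _ in table4)
total_days = sum(days for _, _, _, days in table4)
person_months = total_hours / HOURS_PER_PERSON_MONTH

# Every row should satisfy hours = days * 8 under the stated assumption.
rows_consistent = all(hours == days * HOURS_PER_DAY for _, _, hours, days in table4)

print(total_persons, total_hours, total_days, round(person_months, 2))
# 7 2184 273 14.37
```

The quotient 2184 / 152 is 14.368, so the 14.36 reported in the table corresponds to truncation rather than rounding.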

4.1.2 Project B

4.1.2.1 Description

Name: Computerized account system

The Computerized account system was an accounting application. It supported the organization in several ways, such as record keeping, inventory, sales and reports. The client was using a manual record keeping procedure and decided to convert the manual accounts into a computerized account system in order to benefit from modern technology. The duration of this project was estimated at four calendar months by expert judgment, but it was delayed by one and a half calendar months. The total development period was therefore five and a half calendar months.

The authors used the same data collection strategy as for project A. This time, the authors were better prepared because of the experience gained in project A, so data collection for project B took less time than for project A. An online interview was arranged with the project manager, and the same questionnaire was used to gather data. According to the data provided by the organization, it was a seventeen person-month project. The team consisted of six persons. During the interview, different issues and delay factors were discussed. As in project A, the same agenda was strictly followed for data collection. Data was collected according to the requirements and the parameters of the methods. The authors then organized the data and resolved all ambiguities with the concerned person through emails. This data was used for further processing. Figure 7 shows the phases of the project. Table 5 provides the detailed time spent on each activity.
Figure 7: Phase Division Project B

Phase | Number of Persons | Person Hours | Person Days
Project Management | 1 | 328 | 41
Requirement Engineering | 1 | 264 | 33
Design + Technical Writing | 1 | 232 | 29
Coding + Deployment | 2 | 1352 | 169
Testing | 1 | 432 | 54
Total | 6 | 2608 | 326 = 17.15 person months

Table 5: Actual Effort Project B

4.2 Effort Estimation with Selected Approaches

4.2.1 Effort Estimation with COCOMO II

4.2.1.1 Equation for COCOMO II

As discussed above, the COCOMO II model estimates the required effort (in Person-Months) based on the size of the software project. The size is expressed in thousands of source lines of code (KSLOC). The size collected from the organizations is taken to be effective SLOC, excluding blank lines and comments. The studied organizations confirmed that they counted SLOC using specific tools. The formula for estimating effort in PM with COCOMO II is:

$$\mathrm{Effort\ (PM)} = A \times \mathrm{Size}^{E} \times \prod_{i=1}^{17} EM_i \qquad (1)$$

where A = 2.94 (for COCOMO II) and $EM_i$ are the effort multipliers; their product is obtained by multiplying together all cost driver values that apply to the project. E is the exponent derived from the five scale factors:

$$E = B + 0.01 \times \sum_{j=1}^{5} SF_j \qquad (2)$$

where B = 0.91 (for COCOMO II). The scale factors $SF_j$ were introduced in COCOMO II and mostly capture organizational and project team characteristics. Their sum is calculated by adding all five scale factor values. For both cases, the values of the scale factors and cost drivers were determined from interview data, a study of organizational characteristics, expert judgment, etc. The organizations were contacted several times for these factors. The equation for the duration of a project is:

$$\mathrm{Duration} = C \times (PM_{NS})^{D + 0.2 \times (E - B)} \qquad (3)$$

where C = 3.67, D = 0.28, B = 0.91, and $PM_{NS}$ is the effort in PM excluding the SCED cost driver. That is, $PM_{NS}$ is calculated like PM, but the SCED cost driver value is left out of the multiplication:

$$PM_{NS} = A \times \mathrm{Size}^{E} \times \prod_{i=1}^{16} EM_i \qquad (4)$$

4.2.1.2 Project A: COCOMO II

Tables 6 and 7 show the selected scale factor and cost driver values for project A, respectively. Some of the ratings were gathered during interviews, and the remaining ones were selected by studying the organizational characteristics.

Scale Factors | Value
Precedentedness (PREC) | Very High: 1.24
Development Flexibility (FLEX) | High: 2.03
Architecture / Risk Resolution (RESL) | High: 2.83
Team Cohesion (TEAM) | Nominal: 3.29
Process Maturity (PMAT) | High: 3.14

Table 6: Scale Factors Project A

Effort Multiplier | Value
Product Factors |
Required Software Reliability (RELY) | Nominal: 1.00
Data Size (DATA) | N/A
Develop for Reuse (RUSE) | Nominal: 1.00
Documentation match to life-cycle needs (DOCU) | Nominal: 1.00
Product Complexity (CPLX) | Low: 0.87
Platform Cost Drivers |
Execution Time Constraint (TIME) | N/A
Main Storage Constraint (STOR) | N/A
Platform Volatility (PVOL) | Low: 0.87
Personnel Cost Drivers |
Analyst Capability (ACAP) | High: 0.85
Programmer Capability (PCAP) | High: 0.88
Applications Experience (APEX) | Very High: 0.81
Platform Experience (PLEX) | Very High: 0.85
Language and Tool Experience (LTEX) | Very High: 0.85
Personnel Continuity (PCON) | N/A
Project Cost Drivers |
Use of Software Tools (TOOL) | Very High: 0.78
Multisite Development (SITE) | Very Low: 1.22
Required Development Schedule (SCED) | Low: 1.14

Table 7: Cost Drivers (Effort Multiplier) Project A

N/A means that the effort multiplier is not applicable to the studied project.

To solve equation 1 with the gathered data, E was calculated first, using the sum of the scale factor values in Table 6 (12.53):

$$E = 0.91 + 0.01 \times 12.53 = 1.0353$$

This value of E was then used in the COCOMO II effort estimation equation, together with the project size of 10.537 KSLOC and the product of the applicable effort multipliers in Table 7 (0.3594):

$$\mathrm{Effort\ (PM)} = 2.94 \times (10.537)^{1.0353} \times 0.3594 = 12.1\ \mathrm{PM}$$
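The calculation of E and effort for project A can be reproduced from the table values with the short sketch below. The dictionary layout and variable names are illustrative; the numbers are those of Tables 6 and 7, with the drivers marked N/A left out of the product.

```python
import math

A, B = 2.94, 0.91
SIZE_KSLOC = 10.537  # project size used in the effort calculation above

# Scale factor values from Table 6
scale_factors = {"PREC": 1.24, "FLEX": 2.03, "RESL": 2.83, "TEAM": 3.29, "PMAT": 3.14}

# Effort multipliers from Table 7; drivers marked N/A are omitted.
effort_multipliers = {
    "RELY": 1.00, "RUSE": 1.00, "DOCU": 1.00, "CPLX": 0.87,
    "PVOL": 0.87,
    "ACAP": 0.85, "PCAP": 0.88, "APEX": 0.81, "PLEX": 0.85, "LTEX": 0.85,
    "TOOL": 0.78, "SITE": 1.22, "SCED": 1.14,
}

sf_sum = sum(scale_factors.values())
e = B + 0.01 * sf_sum
em_product = math.prod(effort_multipliers.values())
effort = A * SIZE_KSLOC ** e * em_product

print(round(sf_sum, 2), round(e, 4), round(em_product, 4), round(effort, 1))
# 12.53 1.0353 0.3594 12.1
```

The script confirms the intermediate values quoted in the text: a scale factor sum of 12.53, an exponent of 1.0353 and a multiplier product of 0.3594.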

The above result shows that the effort required for this project is approximately 12.1 Person-Months. The duration of the project is calculated by substituting the related values into equations 3 and 4:

$$PM_{NS} = 2.94 \times (10.537)^{1.0353} \times 0.3152 = 10.61\ \mathrm{PM}$$

$$\mathrm{Duration} = 3.67 \times (10.61)^{0.28 + 0.2 \times (1.0353 - 0.91)} = 7.54$$

Applying COCOMO II to the gathered data therefore shows that an effort of 12.1 Person Months was required to complete project A. With the SCED factor excluded from the effort, the duration is calculated as 7.54 months.

4.2.1.3 Project B: COCOMO II

Tables 8 and 9 show the scale factor and cost driver values for project B, respectively. The same strategy as in project A was used to gather the ratings and values for the cost drivers and scale factors.

Scale Factors | Value
Precedentedness (PREC) | Very High: 1.24
Development Flexibility (FLEX) | Very High: 1.01
Architecture / Risk Resolution (RESL) | Very High: 1.41
Team Cohesion (TEAM) | High: 2.19
Process Maturity (PMAT) | High: 3.14

Table 8: Scale Factors Project B

Effort Multiplier | Value
Product Factors |
Required Software Reliability (RELY) | Very Low: 0.82
Data Size (DATA) | N/A
Develop for Reuse (RUSE) | Low: 0.95
Documentation match to life-cycle needs (DOCU) | Low: 0.91
Product Complexity (CPLX) | Low: 0.87
Platform Cost Drivers |
Execution Time Constraint (TIME) | N/A
Main Storage Constraint (STOR) | N/A
Platform Volatility (PVOL) | Low: 0.87
Personnel Cost Drivers |
Analyst Capability (ACAP) | High: 0.85
Programmer Capability (PCAP) | High: 0.88
Applications Experience (APEX) | Very High: 0.81
Platform Experience (PLEX) | Very High: 0.85
Language and Tool Experience (LTEX) | Very High: 0.84