

DOMAIN-BASED EFFORT DISTRIBUTION MODEL FOR SOFTWARE COST ESTIMATION

by

Thomas Tan

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE)

August 2012

Copyright 2012 Thomas Tan

DEDICATION

To my parents,
To my family,
And to my friends.

ACKNOWLEDGEMENTS

I would like to thank the many researchers who worked alongside me through the long and painful process of data cleansing, normalization, and analysis. These individuals not only pushed me through the hardships but also enlightened me to find better solutions: Dr. Brad Clark from Software Metrics Inc., Dr. Wilson Rosa from the Air Force Cost Analysis Agency, and Dr. Ray Madachy from the Naval Postgraduate School. Additionally, I would like to acknowledge my colleagues from the USC Center for Systems and Software Engineering for their support and encouragement. Sue, Tip, Qi, and many others, you made most of my days in the research lab fun and easy and were able to pull me out of the gloomy ones. I would also like to thank my PhD committee members, who always provided insightful suggestions to guide me through my research and helped me achieve my goals: Prof. Nenad Medvidovic, Prof. F. Stan Settles, Prof. William GJ Halfond, and Prof. Richard Selby. Most importantly, I would like to express my deepest gratitude to my mentor and advisor, Dr. Barry Boehm. Throughout my graduate school career at USC, Dr. Boehm has always been there guiding me in the right direction, pointing me to the right answer, and teaching me to make the right decision. His influence has not only helped me to

make it through graduate school, but will also have a long-lasting impact on me as a professional and scholar in the field of Software Engineering. Last, a special thanks to the special person in my life, Sherry, who supported me wholeheartedly in any way she could and provided many suggestions that proved to be more than just useful, but brilliant.

TABLE OF CONTENTS

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
    Motivation
    Propositions and Hypotheses
    Contributions
    Outline of the Dissertation
Chapter 2: Review of Existing Software Cost Estimation Models and Related Research Studies
    Existing Software Estimation Models
        Conventional Industry Practice
        COCOMO 81 Model
        COCOMO II Model
        SLIM
        SEER-SEM
        True S
    Research Studies on Effort Distribution Estimations
        Studies on RUP Activity Distribution
        Studies on Effort Distribution Impact Drivers
Chapter 3: Research Approach and Methodologies
    Research Overview
    Effort Distribution Definitions
    Establish Domain Breakdown
    Select and Process Subject Data
    Analyze Data and Build Model
        Analyze Effort Distribution Patterns
        Build Domain-based Effort Distribution Model
Chapter 4: Data Analyses and Results
    Summary of Data Selection and Normalization
    Data Analysis of Domain Information
        Application Domains
        Productivity Types
    Data Analysis of Project Size
        Application Domains
        Productivity Types
    Data Analysis of Personnel Capability
        Application Domains
        Productivity Types
    Comparison of Application Domains and Productivity Types
    Conclusion of Data Analyses
Chapter 5: Domain-Based Effort Distribution Model
    Model Description
    Model Implementation
    Comparison of Domain-Based Effort Distribution and COCOMO II Effort Distribution
Chapter 6: Research Summary and Future Works
    Research Summary
    Future Work
References
Appendix A: Domain Breakdown
Appendix B: Matrix Factorization Source Code
Appendix C: COCOMO II Domain-Based Extension Tool and Examples
Appendix D: DCARC Sample Data Report

LIST OF TABLES

Table 1: COCOMO 81 Phase Distribution of Effort: All Modes [Boehm, 1981]
Table 2: COCOMO II Waterfall Effort Distribution Percentages
Table 3: COCOMO II MBASE/RUP Effort Distribution Percentages
Table 4: SEER-SEM Phases and Activities
Table 5: Lifecycle Phases Supported by True S
Table 6: Mapping of SRDR Activities to COCOMO II Phases
Table 7: Comparisons of Existing Domain Taxonomies
Table 8: Productivity Types to Application Domain Mapping
Table 9: COCOMO II Waterfall Effort Distribution Percentages
Table 10: Personnel Rating Driver Values
Table 11: Data Selection and Normalization Progress
Table 12: Research Data Records Count - Application Domains
Table 13: Average Effort Percentages - Perfect Set by Application Domains
Table 14: Average Effort Percentages - Missing 2 Set by Application Domains
Table 15: ANOVA Results - Application Domains
Table 16: T-Test Results - Application Domains
Table 17: Research Data Records Count - Productivity Types
Table 18: Average Effort Percentages - Perfect Set by Productivity Types
Table 19: Average Effort Percentages - Missing 2 Set by Productivity Types
Table 20: ANOVA Results - Productivity Types
Table 21: T-Test Results - Productivity Types
Table 22: Effort Distribution by Size Groups - Communication (Perfect)
Table 23: Effort Distribution by Size Groups - Mission Management (Perfect)
Table 24: Effort Distribution by Size Groups - Command & Control (Missing 2)
Table 25: Effort Distribution by Size Groups - Sensor Control (Missing 2)
Table 26: Effort Distribution by Size Groups - RTE (Perfect)
Table 27: Effort Distribution by Size Groups - VC (Perfect)
Table 28: Effort Distribution by Size Groups - MP (Missing 2)
Table 29: Effort Distribution by Size Groups - SCI (Missing 2)
Table 30: Effort Distribution by Size Groups - SCP (Missing 2)
Table 31: Personnel Rating Analysis Results - Application Domains
Table 32: Personnel Rating Analysis Results - Productivity Types
Table 33: Effort Distribution Patterns Comparison
Table 34: Effort Distribution Patterns Comparison
Table 35: ANOVA Results Comparison
Table 36: T-Test Results Comparison
Table 37: Average Effort Percentages Table for the Domain-Based Model
Table 38: Sample Project Summary
Table 39: COCOMO II Estimation Results
Table 40: Project 49 Effort Distribution Estimate Comparison
Table 41: Project 51 Effort Distribution Estimate Comparison
Table 42: Project 62 Effort Distribution Estimate Comparison

LIST OF FIGURES

Figure 1: Cone of Uncertainty in Software Cost Estimation [Boehm, 2010]
Figure 2: RUP Hump Chart
Figure 3: Research Overview
Figure 4: Example Backfilled Data Set
Figure 5: Effort Distribution Pattern - Perfect Set by Application Domains
Figure 6: Effort Distribution Pattern - Missing 2 Set by Application Domains
Figure 7: Effort Distribution Pattern - Perfect Set by Productivity Types
Figure 8: Effort Distribution Pattern - Missing 2 Set by Productivity Types
Figure 9: Effort Distribution by Size Groups - Communication (Perfect)
Figure 10: Effort Distribution by Size Groups - Mission Management (Perfect)
Figure 11: Effort Distribution by Size Groups - Command & Control (Missing 2)
Figure 12: Effort Distribution by Size Groups - Sensor Control (Missing 2)
Figure 13: Effort Distribution by Size Groups - RTE (Perfect)
Figure 14: Effort Distribution by Size Groups - VC (Perfect)
Figure 15: Effort Distribution by Size Groups - MP (Missing 2)
Figure 16: Effort Distribution by Size Groups - SCI (Missing 2)
Figure 17: Effort Distribution by Size Groups - SCP (Missing 2)
Figure 18: Domain-based Effort Distribution Model Structure
Figure 19: Project Screen of the Domain-based Effort Distribution Tool
Figure 20: Effort Results from the Domain-based Effort Distribution Tool

ABSTRACT

In software cost estimation, effort allocation is an important and usually challenging task for project management. Due to the Cone of Uncertainty effect on overall effort estimation and the lack of representative effort distribution data, project managers often find it difficult to plan for staffing and other team resources. This often leads to risky decisions to assign too few or too many people to software lifecycle activities. As a result, projects with inaccurate resource allocation generally experience serious schedule delays or cost overruns, which has been the outcome of 44% of the projects reported by the Standish Group [Standish, 2009]. Due to this lack of data, most effort estimation models, including COCOMO II, use a one-size-fits-all distribution of effort by phase and activity. The availability of a critical mass of data on effort distribution from U.S. Defense Department software projects has enabled me to test several hypotheses that effort distributions vary by project size, personnel capability, and application domain. This dissertation summarizes the analysis approach, describes the techniques and methodologies used, and reports the results. The key results were that size and personnel capability were not significant sources of effort distribution variability, but that analysis of the influence of application domain on effort distribution rejected the null hypothesis that the distributions do not vary by domain, at least for the U.S. Defense Department sector. The results were then used to

produce an enhanced version of the COCOMO II model and tool for better estimation of the effort distributions for the data-supported domains.

CHAPTER 1: INTRODUCTION

This opening chapter reveals the motivation behind this research, states the central question and hypotheses of the dissertation, lists the contributions, and introduces the organization of the dissertation.

1.1 Motivation

In most engineering projects, a good estimate does not stop when the total cost or schedule is calculated: both management and the engineering team need to know the details in terms of resource allocation. In software cost estimation, the estimator must provide effort (cost) and schedule breakdowns among the primary software lifecycle activities: specification, design, implementation, testing, etc. Such effort distribution is important for many reasons, for instance:

- Before the project kicks off, we need to know what types of personnel are needed at what time.
- When designing the project plan, we need to plan ahead the assignments and responsibilities with respect to team members.
- When overseeing the project's progress, we need to make sure that the right amount of effort is being allocated to different activities.

In the COCOMO II model, which supports both the Waterfall and MBASE/RUP software processes, an effort distribution percentages table is given as a guideline to help the estimator calculate the detailed effort needed for the engineering activities. However, due to the well-known Cone of Uncertainty effect [Boehm, 2010], illustrated in Figure 1, the early-stage estimate of overall project effort is too uncertain for project management to design a reliable schedule for resource allocation. Some progress has been made in a concurrent USC-CSSE dissertation [Aroonvatanaporn, 2012] in narrowing the Cone of Uncertainty, but the uncertainty in effort distribution by activity still remains. Due to lack of data, most effort estimation models, including COCOMO II, use a one-size-fits-all distribution of effort by phase and activity. The availability of a critical mass of data on effort distribution from U.S. Defense Department software projects has enabled me to test several hypotheses that effort distributions vary by project size, personnel capability, and application domain.

Figure 1: Cone of Uncertainty in Software Cost Estimation [Boehm, 2010]

1.2 Propositions and Hypotheses

The goal of this research is to use information about application domain, project size, and personnel capability in a large software project data set to enhance the current COCOMO II effort distribution guideline in order to provide more accurate resource allocation for software projects. To achieve this goal, hypotheses are tested on whether different effort distribution patterns are observed across different application domains, project sizes, and personnel capabilities.

1.3 Contributions

In this dissertation, I present the analysis approach, describe the techniques and methodologies used, and report the primary contributions, as summarized below:

1) Confirmed the hypothesis that software phase effort distributions vary by domain; rejected the hypotheses that the distributions vary by project size and personnel capability.

2) Built a domain-based effort distribution model that can help improve the accuracy of resource allocation guidelines for the supported domains, especially at the early stage of the software development lifecycle, when domain knowledge may be the only piece of information available to the management team.

3) Provided a detailed definition of application domains and productivity types as well as their relationship to each other. Also performed a head-to-head usability comparison to determine that domain breakdowns would be more relevant and useful as model inputs than productivity types.

4) Provided a guideline to process and backfill missing phase distribution of effort data: the use of non-negative matrix factorization.

1.4 Outline of the Dissertation

This dissertation is organized as follows: Chapter 1 introduces the research topic, its motivation, and the central question and hypotheses; Chapter 2 summarizes mainstream estimation models and reviews their use of domain knowledge; Chapter 3 outlines the research approach and methodologies; Chapter 4 describes the analysis results and discusses their implications and key discoveries; Chapter 5 presents the domain-based

effort distribution model with its design and implementation details; Chapter 6 concludes the dissertation with a research summary and a discussion of future work.

CHAPTER 2: REVIEW OF EXISTING SOFTWARE COST ESTIMATION MODELS AND RELATED RESEARCH STUDIES

As effort distribution is an important part of software cost estimation, many mainstream software cost estimation models provide guidelines to assist project managers in allocating resources for software projects. In Section 2.1, we review some mainstream cost estimation models and their approaches to providing effort distribution guidelines. Additionally, in Section 2.2, we examine the results of several research studies that work toward refining effort distribution guidelines.

2.1 Existing Software Estimation Models

2.1.1 Conventional Industry Practice

Many practitioners use a conventional industry rule of thumb for distributing software development effort across a generalized software development life cycle [Borysowich, 2005]: 15 to 20 percent toward requirements, 15 to 20 percent toward analysis and design, 25 to 30 percent toward construction (coding and unit testing), 15 to 20 percent toward system-level testing and integration, and 5 to 10 percent toward transition. This approach is adopted by many mainstream software cost estimation models, which produce tables of mean effort distribution percentages across the different activities or phases of the software development process.
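To make the arithmetic concrete, the short sketch below distributes a total effort estimate using midpoints of the rule-of-thumb ranges quoted above. It is a hypothetical illustration only: the 600 person-month total and the choice of midpoints are assumptions, not values taken from any of the models reviewed in this chapter.

# Minimal sketch: distributing a total effort estimate using midpoints of the
# conventional rule-of-thumb ranges. The total and the midpoints are illustrative
# assumptions, not values from this research.
rule_of_thumb = {
    "Requirements": 0.175,          # midpoint of 15-20%
    "Analysis and Design": 0.175,   # midpoint of 15-20%
    "Construction": 0.275,          # midpoint of 25-30%
    "Test and Integration": 0.175,  # midpoint of 15-20%
    "Transition": 0.075,            # midpoint of 5-10%
}

total_effort_pm = 600.0  # hypothetical total estimate in person-months

# Normalize so the shares sum to 1.0 before applying them.
share_sum = sum(rule_of_thumb.values())
for phase, share in rule_of_thumb.items():
    phase_effort = total_effort_pm * share / share_sum
    print(f"{phase:22s} {phase_effort:6.1f} PM")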

2.1.2 COCOMO 81 Model

The COCOMO 81 model is the first in the series of COCOMO (COnstructive COst MOdel) models and was published by Barry Boehm [Boehm, 1981]. The original model is based on an empirical study of 63 projects at TRW Aerospace and other sources, where Boehm was Director of Software Research and Technology. There are three sub-models of the COCOMO 81 model: the basic model, the intermediate model, and the detailed model. There are also three development modes: organic, semidetached, and embedded. The development mode is used to determine the development characteristics of a project and the corresponding size exponents and project constants. The basic model is quick and easy to use for a rough estimate, but it lacks accuracy. The intermediate model provides a much better overall estimate by including the effects of impacting cost drivers. The detailed model further enhances the accuracy of the estimate by projecting to the phase level with a three-level product hierarchy and adjustment of phase-sensitive effort multipliers. The project phases supported by the COCOMO 81 model follow the waterfall process: plan and requirements, product design, programming (detailed design, coding, and unit testing), and integration and test. All three sub-models use an effort distribution percentages table to guide estimators in allocating resources. The percentages table, as shown in Table 1, provides effort percentages for each of the development modes, separated by five size groups.

Table 1: COCOMO 81 Phase Distribution of Effort: All Modes [Boehm, 1981]

Modes: Organic, Semidetached, Embedded
Size groups: Small (2 KDSI), Intermediate (8 KDSI), Medium (32 KDSI), Large (128 KDSI), Very Large (512 KDSI)
Phases (per mode): Plan & requirements; Product design; Programming (detailed design, code and unit test); Integration and test

The general approach for determining the effort distribution is simple: the estimator calculates the total estimate using the overall COCOMO 81 model and multiplies it by the given effort percentages to obtain the estimated effort for a specific phase of the given development mode and size group. This approach is the same for the basic and intermediate models but somewhat different in the detailed model, whose complete step-by-step process is documented in Chapter 23 of Boehm's publication [Boehm, 1981]. The detailed COCOMO model is based on the module-subsystem-system hierarchy and phase-sensitive cost drivers, whose values differ by phase and/or activity. Using this model, practitioners can calculate more accurate estimates

with specific details on resource allocations. However, because this process is somewhat complicated, especially considering the varying cost driver values in common projects, practitioners often find it exhausting to perform the detailed COCOMO estimation and fall back on the intermediate model. Overall, use of the effort distribution percentages table is straightforward, and the approach to developing such a table sets a significant example for our research. With regard to application types, the COCOMO 81 model eliminates their use due to lack of data support and possible overlap with other cost drivers, although Boehm suggests that application type is a useful indicator that can help shape estimates at an early stage of a project lifecycle and is a possible influential factor for effort distribution patterns. However, the notion of using the development mode is similar to using domain information: the three modes are chosen based on features that we can also use to define domains. For example, the Organic mode was applied primarily to business data processing projects. Although there are only three development modes to choose from, COCOMO 81 provides substantial assurance that domain information is carried into the calculation of both the total ownership cost and the effort distribution patterns.

2.1.3 COCOMO II Model

The COCOMO II model [Boehm, 2000] inherits the approach of the COCOMO 81 model and is re-calibrated to address the issues in estimating costs of modern software projects, such as those developed with newer lifecycle processes and capabilities. Instead of using the project modes (organic, semidetached, and embedded) to

determine the scaling exponent for the input size, the COCOMO II model calculates the exponent from a set of scale factors identified as precedentedness, flexibility, architecture/risk resolution, team cohesion, and process maturity. These scale factors replace the development modes and are meant to capture domain information in the early stages: precedentedness indicates how well we understand the system domain, and flexibility indicates the degree of conformance required with respect to pre-established requirements. The model also modifies the four sets of cost drivers to cover more aspects of modern software development practices. Equations 1 and 2 [Boehm, 2000] are the basic estimation formulas used in the COCOMO II model. The model does not take any direct input of application type or environment; this information is captured by the product and platform factors (a total of eight effort multipliers).

PM = A × Size^E × Π(i=1..n) EM_i    (EQ. 1)

where

E = B + 0.01 × Σ(j=1..5) SF_j    (EQ. 2)

The COCOMO II model outputs total effort, schedule, costs, and staffing, as in COCOMO 81. It also continues the use of an effort distribution percentages table for resource allocation guidance. The COCOMO II model replaces the original phase definition with two activity schemes: Waterfall and MBASE/RUP, where MBASE stands for Model-Based Architecting and Software Engineering and was co-evolved with the Rational Unified Process, or RUP [Kruchten, 2003]. The Waterfall scheme is essentially the same as defined in the COCOMO 81 model. The MBASE/RUP scheme is for newer development lifecycles and covers the Inception, Elaboration, Construction, and Transition phases. Although the model acknowledges the variation of effort distribution

due to size, the general effort distribution percentages are not separated by size groups as they were in the COCOMO 81 model. The use of the effort distribution percentages table is similar to that in COCOMO 81; the following tables are the effort distribution percentages tables used by the COCOMO II model.

Table 2: COCOMO II Waterfall Effort Distribution Percentages

Phase/Activity            Effort %
Plan and Requirement      7 (2-15)
Product Design            17
Detailed Design
Code and Unit Test
Integration and Test
Transition                12 (0-20)

Table 3: COCOMO II MBASE/RUP Effort Distribution Percentages

Phases (End Points)         MBASE Effort %   RUP Effort %
Inception (IRR to LCO)      6 (2-15)         5
Elaboration (LCO to LCA)    24 (20-28)       20
Construction (LCA to IOC)   76 (72-80)       65
Transition (IOC to PRR)     12 (0-20)        10
Totals

2.1.4 SLIM

SLIM (Software Lifecycle Model) was developed by Quantitative Software Management (QSM) based on the analysis of staffing profiles and the Rayleigh distribution in software projects published by Lawrence H. Putnam in the late 1970s. SLIM can be summarized by the following equation:

Effort = [Size / (Productivity × Time^(4/3))]^3 × B    (EQ. 3)

where B is a scaling factor that is a function of the project size [Putnam, 1992].
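As a rough numerical illustration of EQ. 3, the sketch below evaluates the Putnam-style effort equation under assumed values; the productivity constant, schedule, and scaling factor B are placeholders chosen for demonstration, not QSM-calibrated parameters.

# Minimal sketch of EQ. 3 with illustrative placeholder values; the productivity
# constant and scaling factor B below are assumptions for demonstration only and
# are not QSM-calibrated parameters.
def slim_effort(size_sloc: float, productivity: float, time_years: float, b: float) -> float:
    """Putnam-style effort for a given size, productivity constant, schedule, and B."""
    return (size_sloc / (productivity * time_years ** (4.0 / 3.0))) ** 3 * b

# Hypothetical project: 100 KSLOC, productivity constant 10000, 2-year schedule, B = 0.39.
print(round(slim_effort(100_000, 10_000.0, 2.0, 0.39), 1), "person-years")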

In QSM's recent release of the SLIM tool, SLIM-Estimate [QSM] (the estimation part of the complete package) takes a primary sizing parameter of Implementation Units, which can be converted from a variety of sizing metrics such as SLOC, function points, CSCIs, interfaces, etc. The tool also needs a Productivity Index (PI) in order to produce an estimate. The Productivity Index can be derived from historical data or from the QSM industry standard, and it can be adjusted by additional factors ranging from software maturity to project tooling. Additional inputs such as system type, languages, personnel experience, management constraints, etc. can also be factored into the equation to produce the final estimate. Another important parameter for SLIM-Estimate is the Manpower Buildup Index (MBI), which is hidden from user input but derived from various user inputs on project constraints. The MBI reflects the rate at which personnel are added to a project: a higher rate indicates higher cost with a shorter schedule, whereas a lower rate results in lower cost with a longer schedule. Combined with the PI and size, SLIM-Estimate is able to draw the Rayleigh-Norden curve [Norden, 1958], which describes the overall delivery schedule for a project. The output of SLIM-Estimate is usually illustrated by a distribution graph that depicts the staffing level throughout the user-defined project phases. Overall schedule, effort, and costs are produced along with a master plan that applies to both iterative and conventional development processes. SLIM-Estimate outlines the staffing resource distribution by four general phases: Concept Definition, Requirements and Design, Construct and Test, and Perfective Maintenance. Additionally, it provides a list of WBS elements for each phase while

offering users the ability to change names, work products, and descriptions for both phases and WBS elements. Looking through SLIM-Estimate's results, we cannot find any direct connection between effort distribution and the application type input. It seems application type may be a contributor to PI or MBI for calculating the overall effort and schedule. From the overall effort and schedule, SLIM-Estimate calculates effort distribution based on user flexibility, a parameter that SLIM-Estimate uses to choose from user-defined historical effort distribution profiles. In summary, SLIM-Estimate acknowledges that application domains or types are important inputs for its model, but it does not provide specific instructions on translating application domains into estimated effort distribution patterns.

2.1.5 SEER-SEM

The System Evaluation and Estimation Resources Software Estimation Model (SEER-SEM) is a parametric cost estimation model developed by Galorath Inc. The model is inspired by the Jensen Model [Jensen, 1983] and has evolved into one of the leading products for software cost estimation. SEER-SEM [Galorath, 2005] accepts SLOC and function points as its primary size inputs. It incorporates a long list of environment parameters, such as complexity, personnel capabilities and experience, development requirements, etc. Based on the inputs, the model is able to predict effort, schedule, staffing, and defects. The detailed equations of the model are proprietary, so we can only study the model from its inputs and outputs.

To simplify the input process, SEER-SEM allows users to choose preset scenarios that automatically populate input environment factors. The tool calls these predetermined sets "knowledge bases," and users can change them to fit their own needs. To determine which knowledge base to use, users need to identify the project's platform, application type, development method, and development standard. Development methods describe the development approach, such as object-oriented design, spiral, prototyping, waterfall, etc. Development standards summarize the standards for various categories such as documentation, tests, quality, etc. Platforms and application types are used to describe the product's characteristics compared with existing systems. Platforms include systems built for avionics, business, ground-based, manned space, shipboard, and more. Application types cover a wide spectrum of applications from computer-aided design to command and control, and so on. The output of the SEER-SEM tool includes overall effort, costs, and schedule. There are also a number of different reports such as estimation overview, trade-off analyses, decision support information, staffing, risks, etc. If given the work breakdown structure, SEER-SEM will also map all the estimated costs, effort, and schedule to the WBS, and it is able to export the master plan to Microsoft Project. In terms of effort distribution, SEER-SEM covers eight development phases and all major lifecycle activities, as shown in Table 4. It allows full customization of these phases and activities. Effort and labor can be displayed by phase as well as by activity.

Table 4: SEER-SEM Phases and Activities

SEER-SEM Phases: System Requirements Design; Software Requirements Analysis; Preliminary Design; Detailed Design; Code / Unit Test; Component Integrate and Test; Program Test; System Integration Through OT&E
Activities: Management; Software Requirements; Design; Code; Data Programming; Test; CM; QA

SEER-SEM uses application types as contributors to find appropriate historical profiles for setting its cost drivers and calculating estimates. The model does not provide any specific rules linking application types to effort distribution patterns.

2.1.6 True S

The Programmed Review of Information for Costing and Evaluation (PRICE) model was first developed for internal use by Frank Freiman in the 1970s at RCA. Modified for modern software development practices in 1987, the model was released by PRICE Systems as PRICE S for effort and schedule estimation for computer systems. True S [PRICE, 2005] is the current product of the PRICE S model. True S takes a list of inputs including sizing input in SLOC, productivity and complexity factors, integration parameters, new design/code percentages, etc. It also allows users to define application types by selecting from seven categories: mathematical, string manipulation, data storage and retrieval, on-line, real-time, interactive, or operating system. There is also a platform input that describes the operating environment, structure, and reliability requirements. From the size inputs and application types, the model is able

to compute the weight of the software. Combined with other factors, effort in person-hours or person-months is calculated, and a schedule is produced that maps to the nine DOD-STD-2167A phases, from System Concept through Operational Test and Evaluation; the detailed phases are shown in Table 5. TruePlanning, the commercial suite that contains True S and the COCOMO II model, produces a staffing distribution that depicts the number of staff needed by category throughout the project lifecycle, i.e., the number of test engineers or design engineers needed as the project progresses. Additionally, True S calculates support effort in three support phases: maintenance, enhancements, and growth.

Table 5: Lifecycle Phases Supported by True S

DoD-STD-2167A Phases: System Requirements; Software Requirements; Preliminary Design; Detailed Design; Code/Unit Test; Integration & Test; Hardware/Software Integration; Field Test; System Integration and Test
Other Support Phases: Maintenance; Enhancement; Growth

Similar to SEER-SEM and SLIM, it is difficult to trace the connection between the application type input and the effort distribution guideline, as little is known about the model or how total effort is distributed to each phase. From the surface, we can only see the end results, in which a cost schedule is produced according to the engineering phases.

2.2 Research Studies on Effort Distribution Estimations

In addition to the effort distribution guidelines proposed by the mainstream models, some recent studies also focus on effort distribution patterns.

2.2.1 Studies on RUP Activity Distribution

A number of studies are related to the Rational Unified Process (RUP) because of its clear definitions of project phases and disciplines as well as its straightforward guidance on effort distribution. The Rational Unified Process [Kruchten, 2003] is an iterative software development process that is commonly used in modern software projects. The RUP hump chart, as shown in Figure 2, is famous for setting a general guideline of effort distribution for the RUP process. The humps in the chart represent the amount of effort estimated for a particular discipline over the four major lifecycle phases in RUP. There are six engineering disciplines: business modeling, requirements, analysis and design, implementation, test, and deployment. There are also three supporting disciplines: configuration and change management, project management, and environment.

Figure 2: RUP Hump Chart

Over the years, there have been many attempts to validate the RUP hump chart with sample data sets. A study by Port, Chen, and Kruchten [Port, 2005] describes their experiment assessing 26 classroom projects that used the MBASE/RUP process; they found that their results do not follow the RUP guideline. Similarly, Heijstek investigated Rational Unified Process effort distribution based on 21 industrial software engineering projects [Heijstek, 2008]. In his study, Heijstek compared the measured phase effort against several other studies and found that his industrial projects spent less time during elaboration and more during transition. He also produced a visualization of his effort data to compare against the RUP hump chart. He observed similarities in most major disciplines, but noted discrepancies in supporting

disciplines such as configuration and change management and environment. He also extended his research to model the impact of effort distribution on software engineering process quality and concluded that effort distribution can serve as a predictor of system quality.

2.2.2 Studies on Effort Distribution Impact Drivers

In addition to studies on RUP, there are also works that investigate the influential factors that impact effort distribution patterns. These works also use extensive empirical analyses to back their findings. Yang et al. [Yang, 2008] conducted a study of 75 Chinese projects to investigate the factors behind variations in phase effort distribution. They compared the overall effort distribution percentages against the COCOMO II effort percentages and found disagreements between the two in the plan/requirements and design phases. They also performed in-depth analyses on four candidate factors: development lifecycle (waterfall vs. iterative), development type (new development, re-development, or enhancement), software size (divided into six size groups), and team size (four team size groups). For each of the candidate factors, Yang et al. compared the effort distributions between the sub-groups visually and then verified the significance of the differences using simple ANOVA tests. Their results indicate that factors such as development type, software size, and team size have visible impacts on effort distribution patterns and can be used as supporting drivers when making resource allocation decisions.

Kultur et al. conducted a similar study with application domain as an additional factor alongside development type and software size [Kultur, 2009]. In their study, they selected 395 ISBSG data points out of 4,106 software projects, where each data point has a clear application domain along with development type (new development, re-development, or enhancement) and software size. The application domains used in their research include banking, communications, electricity/gas/water, financial/property/business services, government, insurance, manufacturing, and public administration. The researchers compared the overall effort distribution by domain with the COCOMO II effort percentages and suggested that some domains follow the COCOMO II distribution whereas others present visible differences. Additionally, they cross-examined application domains, development types, and software size for each phase in order to uncover more detailed effort distribution patterns. They applied these distribution patterns to the sample data sets and calculated MMRE values with and without the use of domain-specific distributions. Their results indicate obvious improvements for various domains and phases, and they therefore encourage the use of domain-specific effort distribution for future analysis. However, unlike Yang's analysis, their report does not include detailed definitions of the application domains and software process phases, which calls for further investigation into the validity of their results.
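For reference, MMRE (mean magnitude of relative error) is the accuracy measure used in the Kultur et al. comparison. The short sketch below shows one common way to compute it; the actual and estimated effort values are made up purely for illustration.

# Minimal sketch of the MMRE accuracy measure referenced above; the actual and
# estimated effort values here are invented for illustration.
def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error: mean of |actual - estimate| / actual."""
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)

actual_effort = [120.0, 300.0, 45.0]      # hypothetical person-months
estimated_effort = [100.0, 340.0, 50.0]   # hypothetical estimates
print(f"MMRE = {mmre(actual_effort, estimated_effort):.2f}")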

CHAPTER 3: RESEARCH APPROACH AND METHODOLOGIES

This chapter documents the main approach used to achieve the research goal, including descriptions of the various methodologies and techniques used for the different analyses.

3.1 Research Overview

In order to achieve the research goal of improving the COCOMO II effort distribution guideline, I began by addressing the most complex hypothesis: the variation of effort distribution by application domain. Subsequently, I addressed the simpler hypotheses on variation by project size and personnel capability. Three smaller goals are defined to accomplish this: 1) determine the domain definitions (or domain breakdown) to be supported by the improved model; 2) find a sufficient data set; and 3) find solid evidence of different effort distribution patterns to support improved effort distribution percentages for the COCOMO II model. For these smaller objectives, the following separate yet correlated studies are conducted:

1) Establish domain breakdown definitions.
2) Select and process the subject data set.
3) Analyze data and build the model.

Note that these studies are not necessarily done sequentially. For instance, the tasks of establishing the domain breakdown are generally done in parallel with the data processing tasks so that the right domain breakdown can be generated to cover all the

data points. Subsequently, I tested the variation hypotheses for project size and personnel capability and found no support for them. Figure 3 depicts the relationships between domains, project size, personnel capability, and the subject data set for this research. It also provides overall guidance listing the detailed tasks for each smaller study.

Figure 3: Research Overview

3.2 Effort Distribution Definitions

In order for this research to run smoothly, a unified set of effort distribution definitions must be established before conducting any analysis. Two sets of standard definitions are considered for this research: the development activities defined in the data dictionary from the data source [DCRC, 2005] and the COCOMO II model definitions of lifecycle activities and phases. Each set has its own appeal for this research: the data dictionary is used by all the data points, and the COCOMO II model

definition is well-known and widely used by industry leaders. Still, neither is perfect on its own. Therefore, a merging effort takes place to map the overlapping activities, namely plan & requirements, architecture & design, code & unit testing, and integration & qualification tests. The result of this mapping is shown in Table 6. Using this mapping, the two sets of definitions can be connected to form a unified set that facilitates the data analyses in this research. Note that because the data does not cover any transition activities, the transition phase from the COCOMO II model is excluded from this research.

Table 6: Mapping of SRDR Activities to COCOMO II Phases

COCOMO II Phase                     SRDR Activities
Plan and Requirement                Software requirements analysis
Product Design and Detail Design    Software architecture and detailed design
Coding and Unit Testing             Coding, unit testing
Integration and Testing             Software integration and system/software integration; Qualification/Acceptance testing

3.3 Establish Domain Breakdown

Another important set of definitions is the domain breakdown: the definitions of the domains or types that are used as the input to the domain-based effort distribution model. Establishing such a domain breakdown from scratch is extremely challenging. It would require years of effort summarizing distinctive features and characteristics from many different software projects with valid domain information. It would then need a number of Delphi discussions among various experts to establish the best definitions, and a number of independent reviews would also be required to finalize them. Any of these

tasks would take a long time to complete, and the resulting breakdown could constitute a dissertation of its own. An alternative and rather simple approach is to research well-established domain taxonomies and use either an appropriate taxonomy or a combination of several taxonomies that have enough domain definitions to cover the research data. The following tasks outline the approach to establishing the domain breakdown:

- Select the appropriate domain taxonomies.
- Understand how these taxonomies describe the domains, i.e., the dimensions these taxonomies use to arrive at domain definitions.
- Make a master domain list and group the similar domains.
- Select those that can be applied to the research data set.

Note that this part of the research was done with researchers from my sponsored program [AFCAA, 2011], with whom we are building a software cost estimation manual for the government. In this joint research, we reviewed a long list of domain taxonomies and selected the following seven that cover both government and commercial projects and are applicable to our data set:

- North American Industry Classification System (NAICS)
- IBM's Work-group Taxonomy
- Digital's Industry and Application Taxonomy
- MIL-HDBK-881A WBS Standard

- Reifer's Application Domains
- Putnam's Breakdown of Application Types by Productivity
- McConnell's Kinds of Software Breakdown

Among these taxonomies, NAICS [NAICS, 2007] is the official taxonomy for categorizing industries based on the goods-producing and service-providing functions of businesses. It is a rather high-level categorization, yet it does provide a quality perspective on industry taxonomy. IBM's [IBM, 1988] and Digital's [Digital, 1991] taxonomies focus primarily on commercial software projects from the perspectives of both industry and system capability. Both provide comprehensive guidelines for determining a software project's domain using cross-references between its industry and application characteristics. The MIL-HDBK-881A standard [DoD HDBK, 2005] is used by the US government to provide detailed WBS guidelines for different military systems. The first two levels of the WBS structure describe the system's overall operating environment and high-level functionality, thus giving us a breakdown in terms of domain knowledge. This standard is especially useful because it provides a broad view of government projects, which we did not get explicitly from the previous three taxonomies. Both Putnam's [Putnam, 1976] and McConnell's [McConnell, 2006] application type breakdowns are based on productivity ranges and size divisions. This tells us that application types may capture certain software product characteristics that have direct relationships with productivity. Confirming this approach, Reifer [Reifer, 1990] also indicates the importance of productivity in relation to application domains, which he summarized from real-world data points covering a wide range of software systems

ranging from government to commercial projects. The following table shows a comparison of our subject taxonomies.

Table 7: Comparisons of Existing Domain Taxonomies

NAICS: 10 industry domains. Rationale: categorizes goods-producing industries and service-providing industries. Considers size effect: No. Considers productivity effect: No.
IBM: 46 work groups; 8 business functions (across-industry application domains). Rationale: uses work groups as the horizontal perspective and business functions as the vertical perspective to pinpoint a software project. Considers size effect: No. Considers productivity effect: No.
Digital: 18 industry sectors; 18 application domains. Rationale: combines domain characteristics with industry definitions. Considers size effect: No. Considers productivity effect: No.
Reifer's: 12 application domains. Rationale: summarized from 500 data points based on project size and productivity range. Considers size effect: Yes. Considers productivity effect: Yes.
Mil-881A: 8 system types. Rationale: provides a WBS for each system type; Level 2 in the WBS describes the application features. Considers size effect: No. Considers productivity effect: No.
Putnam's: 11 application types. Rationale: uses productivity range to categorize application types. Considers size effect: Yes. Considers productivity effect: Yes.
McConnell's: 13 kinds of software. Rationale: adopted from Putnam's application types, refined with software size groups. Considers size effect: Yes. Considers productivity effect: Yes.

After careful review and study of these taxonomies, we have come to a common understanding that there are two main dimensions we can use to determine domains: platform and capability. Platform describes the operating environment in

which the software system will reside. It provides the key constraints of the software system in terms of physical space, power supply, data storage, computing flexibility, etc. Capability outlines the intended operations of the software system and indicates the requirements on the development team in terms of domain expertise. Capability may also suggest the difficulty of the software project given a nominal personnel rating for the development team. With a prepared master list of domain categories and the notion of using both the platform and capability dimensions, we put together our domain breakdown in terms of operating environments (8) and application domains (21). Each can describe a software project on its own terms: operating environment defines the platform and product constraints of the software system, whereas application domains focus on the common functionality descriptions of the software system. We can use them together or separately, as they do not interfere with each other. A detailed breakdown of the operating environments and application domains is documented in Appendix A. Although this initial version of the domain breakdown is sufficient to differentiate software projects, we can do more: it does not yet apply productivity ratings, so we analyzed a further breakdown using productivity rates. Upon further comparison of these taxonomies and data analysis, a simplified domain breakdown was proposed. In this newer version of the domain breakdown, the 21 application domains are grouped by their productivity ranges. We call these new groups productivity types, or PT. Detailed definitions of the productivity types can also be found in Appendix A. The eight

operating environments are essentially the same as in the previous version but are split into 10 operating environments. The following table shows a mapping between our application domains and the productivity types. Note that some application domains map to more than one productivity type. This is because those application domains cover more than one major capability, spreading across more than one productivity range.

Table 8: Productivity Types to Application Domain Mapping

Sensor Control and Signal Processing (SCP): Sensor Control and Processing
Vehicle Control (VC): Executive; Spacecraft Bus
Real Time Embedded (RTE): Communication; Controls and Displays; Mission Planning
Vehicle Payload (VP): Weapons Delivery and Control; Spacecraft Payload
Mission Processing (MP): Mission Management; Mission Planning
Command & Control (C&C): Command & Control
System Software (SYS): Infrastructure or Middleware; Information Assurance; Maintenance & Diagnostics
Telecommunications (TEL): Communication; Infrastructure or Middleware
Process Control (PC): Process Control
Scientific Systems (SCI): Scientific Systems; Simulation and Modeling
Training (TRN): Training
Test Software (TST): Test and Evaluation
Software Tools (TUL): Tools and Tool Systems
Business Systems (BIS): Business; Internet

In this research, both application domains and productivity types are candidates for support by the domain-based effort distribution model. Both breakdowns will be

thoroughly analyzed, compared, and contrasted according to the analysis procedure documented in Section 3.5.

3.4 Select and Process Subject Data

The data is primarily project data from government-funded programs collected through the Department of Defense's Software Resources Data Report (SRDR) [DCRC, 2005]. Each data point consists of the following sets of information: a set of effort values, such as person-hours for requirements through qualification testing activities; a set of sizing measurements, such as new size, modified size, unmodified size, etc.; and a set of project-specific parameters, such as maturity level, staffing, requirements volatility, etc. Additionally, each data point is accompanied by its own refined data dictionary. Our program sponsor, the Air Force Cost Analysis Agency, sanitized the data set by removing all project identity information and helped us define the application domains and operating environments for these projects. The original data sets are not perfect: missing data points, unrealistic data values, and ambiguous data definitions are common. As a result, normalizing and cleansing the data points is needed for the data analyses. First, records with significant defects need to be located and eliminated from the subject data set: defects such as 1) missing important effort or size data; 2) missing data definitions for important effort or size data, i.e., no definition indicating whether size is measured in logical or physical lines of code; and 3) duplicated records. Second, abnormal and untrustworthy data patterns need to be reviewed and handled in the subject data set: patterns consisting of huge

size with little effort or vice versa. For example, there is a record with one million lines of code, all counted as new size, produced in 3 to 4 person-months. After removing all problematic records, two additional tasks need to be performed: 1) backfill the missing effort of the remaining activities and 2) test the overall normality of the data set. There are two approaches to backfilling effort data. The first uses simple averages of the existing records to calculate the missing values. The second uses matrix factorization to approximate the missing values [Au Yeung; Lee, 2001]. After a few attempts, the first approach proved to be less effective than the second and produced large margins of error. Therefore, the second approach, matrix factorization, is the better choice. In matrix factorization, we start with two random matrices, W and H, whose product has the same dimensions as our data set X0. By iteratively adjusting the values of W and H, we find the closest approximation such that W × H = X, where X approximates X0. This process typically sets a maximum iteration count of at least 5,000 to 10,000 in case W × H never gets close enough to X0. This algorithm is applied to three subsets: a subset missing at most 2 out of 5 activities, a subset missing 3 out of 5 activities, and a subset missing 4 out of 5 activities. With the approximation exit margin set and 10,000 iterations run, backfilled data with very small margins of error is produced, usually within 10% of the original (when compared against existing data values). Figure 4 shows an example of the resulting data.
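As a concrete illustration of this backfilling step, the sketch below runs a basic masked non-negative matrix factorization with multiplicative updates over a small effort matrix containing missing entries. The matrix values, rank, and update rule are illustrative assumptions, not the exact implementation given in Appendix B.

import numpy as np

# Illustrative effort matrix: rows are projects, columns are the five activity
# groups; np.nan marks missing effort values to be backfilled. Values are made up.
X0 = np.array([
    [10.0, 25.0, 30.0, 20.0, 15.0],
    [ 8.0, np.nan, 35.0, np.nan, 12.0],
    [12.0, 22.0, np.nan, 25.0, 14.0],
])

mask = ~np.isnan(X0)              # True where a value is actually observed
X = np.where(mask, X0, 0.0)       # treat missing cells as 0 during the updates

rank = 2
rng = np.random.default_rng(0)
W = rng.random((X0.shape[0], rank))
H = rng.random((rank, X0.shape[1]))

for _ in range(10_000):           # cap the iterations, as described above
    WH = W @ H
    # Multiplicative updates restricted to observed cells (masked NMF).
    H *= (W.T @ (X * mask)) / (W.T @ (WH * mask) + 1e-9)
    WH = W @ H
    W *= ((X * mask) @ H.T) / ((WH * mask) @ H.T + 1e-9)
    if np.linalg.norm((X - W @ H) * mask) < 1e-3:   # illustrative exit margin
        break

X_filled = np.where(mask, X0, W @ H)   # keep observed values, backfill the rest
print(np.round(X_filled, 1))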

Figure 4: Example Backfilled Data Set

Notice that a few records are observed with huge margins of error. This typically happens when the value for one activity is extremely small while the values for the other activities are relatively large. Although this discrepancy seems harmful to the data processing results, the small number of such records among a large collection of data points limits the error introduced. On the positive side, this discrepancy can help us identify possible outliers in our data set if we need to analyze outlier effects. The last step is to run basic normality tests on the data set. These tests are important because they check the initial assumption that all of the data points are independent of each other and normally distributed. This assumption is made because there is no known source information or detailed background information for the projects in the data set; we can only assume that they are not correlated in any way and are thus independent of each other. Since the subject data fields are effort data, it is only necessary to run the tests on this data. Both histograms and Q-Q diagrams [Blom, 1958; Upton, 1996] are produced to visualize the distribution. Several normal distribution tests, such as the Shapiro-Wilk test

[Shapiro, 1965], the Kolmogorov-Smirnov test [Stephens, 1974], and Pearson's chi-square test [Pearson, 1901], are performed to check the distribution's normality and to determine whether the data set is good for analysis. In addition to checking, eliminating, and backfilling, calculating the equivalent lines of code, converting person-hours to person-months, summing up the schedule in calendar months, and calculating the equivalent personnel ratings for each project also take place as part of the data processing.

3.5 Analyze Data and Build Model

The final piece of this research focuses on answering the central question and building an alternative to the current COCOMO II Waterfall effort distribution guideline. There are two major steps in this part of the study: 1) calculate and analyze effort distribution patterns and 2) build and implement the model. These two steps are described in full detail in the following subsections.

Analyze Effort Distribution Patterns

In studying effort distribution patterns, two rounds of analyses are conducted. In the first round, we analyze the initial version of the domain breakdown with 21 application domains. In the second round, we analyze the refined version with 14 productivity types. Each round follows the same analysis steps, as described below. The results from each round are analyzed and compared. Based on the comparison, one domain breakdown

is determined as the domain information set that will be supported by the new effort distribution model.

Effort Distribution Percentages: From the data processing results, effort percentages by activity group can be calculated for each project. The percentage means for each domain can then be found by grouping the records. By looking at the trend lines, simple line graphs help us visualize the distribution patterns and find interesting points. Although the plots may indicate large gaps in percentage means between domains, this evidence alone is not solid enough to prove the differences significant; statistical proof is also needed. Single-factor analysis of variance (ANOVA) can be used for this proof. We line up all the data points in each domain and use the ANOVA test to determine whether the variance between domains is caused by mere noise or truly represents differences. The null hypothesis for this test is that all domains have the same distribution percentage means for each activity. The alternative hypothesis is that domains have different percentage means, which is the desired result because it would prove that domains have an effect on effort distribution patterns. Once the ANOVA tests conclude, a subsequent test must be performed to find out whether the domains' percentage means differ from the COCOMO II Waterfall effort distribution guideline's percentage means. This is important because if the domains' percentage means are no different from the current COCOMO II Waterfall model, then this research has no grounds for enhancing the COCOMO II model's effort distribution. Table 9 shows the COCOMO II Waterfall effort distribution percentages.

Table 9: COCOMO II Waterfall Effort Distribution Percentages

Phase/Activity                           Effort %
Plan and Requirement                     6.5
Product Architecture & Design            39.3
Code and Unit Testing                    30.8
Integration and Qualification Testing    23.4

Note that the COCOMO II model's percentage means have been divided by 1.07 because the original COCOMO II percentage means sum to 107% of the full effort distribution. To find out whether there are any differences, the independent one-sample t-test [O'Connor, 2003] is used. The formula for the t-test is as follows:

t = (x̄ − μ0) / (s / √n)    (EQ. 4)

The statistic t in the above formula tests the null hypothesis that the sample average x̄ is equal to a specific value μ0, where s is the sample standard deviation and n is the sample size. In our case, μ0 is the COCOMO II model's effort distribution average and x̄ is a domain's distribution percentage average for each activity group. Rejection of the null hypothesis for each effort activity supports the conclusion that the domain average does not agree with the COCOMO II model's effort distribution averages. Such a result would indicate that the current COCOMO II model's effort distribution percentages are not sufficient for accurate effort allocation and thus that it is necessary to find an improvement. In both the ANOVA and t-tests, a 90% confidence level (α = 0.10) is used to accept or reject the null hypothesis because the data comes from real-world projects and the noise level is

rather high. If the tests indicate a mix of rejections and acceptances, a consensus of the results can be used to make a final call on rejection or acceptance.

Comparison of Application Domains and Productivity Types

The final step in the data analysis is a comparison of the results from application domains and productivity types. The purpose of this comparison is to evaluate the applicability of these domain breakdowns as the main domain definition set to be supported by the domain-based effort distribution model. In this comparison, the general effort distribution patterns are compared to find out which breakdown provides stronger trends, i.e., shows more differences between domains or types. Similarly, the statistical test results are analyzed for the same reason. Lastly, the characteristics and behaviors of application domains and productivity types are analyzed to compare their identifiability, availability, and supportability.

Project Size

To study project size, the data points are divided into different size groups. Effort distribution patterns for each size group are produced and analyzed. The goal of this analysis is to find possible trends within a domain or type that are differentiated by project size. Since size is a direct influential driver of effort in most estimation models, a direct and simple relationship in which effort percentages increase or decrease proportionally with project size is expected. Again, statistical tests are necessary, if such a trend is found, to prove the significance of its variance.
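To make the testing procedure concrete, the sketch below runs a single-factor ANOVA across domain groups and a one-sample t-test against a COCOMO II phase percentage using scipy. The group data are made-up effort percentages, not the SRDR values analyzed in Chapter 4; the same machinery applies to the size-group comparisons described above.

from scipy import stats

# Made-up plan-and-requirements effort percentages for three hypothetical domains;
# these are illustrative numbers, not the SRDR data analyzed in this research.
domain_a = [12.0, 15.5, 10.8, 14.2, 13.1]
domain_b = [22.3, 19.8, 24.1, 21.0, 20.5]
domain_c = [16.4, 18.2, 15.9, 17.5, 16.8]

# Single-factor ANOVA: do the domain means differ beyond noise?
f_stat, p_anova = stats.f_oneway(domain_a, domain_b, domain_c)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")  # reject H0 at alpha = 0.10 if p < 0.10

# One-sample t-test (EQ. 4): does one domain's mean differ from the COCOMO II
# Waterfall value of 6.5% for Plan and Requirement (Table 9)?
t_stat, p_ttest = stats.ttest_1samp(domain_a, popmean=6.5)
print(f"t-test vs. 6.5%: t = {t_stat:.2f}, p = {p_ttest:.4f}")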

The challenge of this analysis is the division into size groups. Some domains/types may not have enough total data points to be divided into size groups, and others may not have enough data points in one or more of the groups. Either case can prevent a determination of how strongly project size drives the effort distribution patterns.

Personnel Capability

For the personnel rating, the SRDR data supplies three personnel experience percentages: Highly Experienced, Nominally Experienced, and Inexperienced/Entry Level. The experience level reflects the years the staff has worked on software development as well as the years the staff has worked within the mission discipline or project domain. Given these percentages, an overall personnel rating can be calculated using the three COCOMO II personnel experience driver values: Application Experience (APEX), Platform Experience (PLEX), and Language and Tool Experience (LTEX). The following formula is used to calculate the personnel rating for each data point, where Table 10 gives the APEX, PLEX, and LTEX driver values at the different experience levels:

    PEXP = P_{High} \cdot V_{High} + P_{Nominal} \cdot V_{Nominal} + P_{Low} \cdot V_{Low}    (EQ. 5)

where P_{High}, P_{Nominal}, and P_{Low} are the Highly Experienced, Nominally Experienced, and Inexperienced/Entry Level staff percentages, and V_{High}, V_{Nominal}, and V_{Low} are the corresponding composite driver values from Table 10.

Table 10: Personnel Rating Driver Values - APEX, PLEX, LTEX, and the composite PEXP values at the High (~3 years), Nominal (~1 year), and Low (~6 months) experience levels.
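A small sketch of this calculation is shown below. It assumes EQ. 5 is the staff-mix-weighted combination of the per-level composite driver values; the weighting scheme and the numeric values in the example call are assumptions for illustration, not the Table 10 entries.

    def personnel_rating(pct_high, pct_nominal, pct_low, level_values):
        # pct_* are the SRDR experience-mix percentages (expected to sum to 1.0);
        # level_values maps each experience level to the composite (mean) of the
        # APEX, PLEX, and LTEX driver values at that level.
        return (pct_high * level_values["high"]
                + pct_nominal * level_values["nominal"]
                + pct_low * level_values["low"])

    # Illustrative call with placeholder driver values:
    # rating = personnel_rating(0.5, 0.3, 0.2,
    #                           {"high": 0.90, "nominal": 1.00, "low": 1.09})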

Using the calculated personnel ratings, data points can be plotted as personnel rating versus effort percentage for each activity group in a domain/type. Trends can be observed from these plots if an increase in personnel rating results in a decrease in effort percentages, or vice versa. For simplicity, the end result for the personnel rating is kept as a linear adjustment factor (or at least as close to linear as possible).

Build Domain-based Effort Distribution Model

From the analyses of effort distribution patterns by application domains and productivity types, project size, and personnel ratings, the variations by size and personnel rating proved negligible, so the effort distribution model is based on domain variation. A set of effort distribution percentages by application domain, together with the domain definition set to be supported by the model, was collected in preparation for building the domain-based effort distribution model. The key design guideline is that the model has to be similar to the current COCOMO II model design: both the Waterfall and MBASE effort distribution models use average percentage tables in conjunction with size as a partial influencing factor, and this compatibility must be preserved in the new model. Additionally, the procedures for using the model should be no more complicated than those currently provided by the COCOMO II model. That is, with no input beyond the COCOMO II drivers and the other necessary information, the model should produce the effort distribution guideline automatically as part of the COCOMO II estimates; the only additional input required may be the domain information. A comparison of the effort distribution guidelines produced by the COCOMO II Waterfall model and the new model can also be added as a feature to make the model more useful.

When the design of the model is complete, it is important to provide an implementation instance of the model to demonstrate its features. To accomplish this, an existing COCOMO II model implementation with available source code must be selected as the implementation environment for the new model. After the implementation is complete, a comparison of results between the COCOMO II Waterfall model and the domain-based effort distribution model must be conducted to test the new model's performance.

CHAPTER 4: DATA ANALYSES AND RESULTS

This chapter summarizes the key data analyses, and their results, conducted in building the domain-based effort distribution model and testing the domain-variability hypothesis. Section 4.1 provides an overview of the data selection and normalization results that define the baseline data sets for the analyses. Sections 4.2 through 4.4 report the analyses performed on the data sets grouped by application domains and productivity types: domain information, project size, and personnel capability. Section 4.5 compares the pros and cons of application domains and productivity types, and Section 4.6 discusses the conclusions drawn from the data analyses.

4.1 Summary of Data Selection and Normalization

Data selection and normalization were completed before most data analyses started. A set of 1,023 project data points was collected by our data source and research sponsor, the AFCAA, and given to us to initiate the data selection and normalization process. As discussed in Section 3.4, a simple, straightforward review of the data points eliminated most defective records (missing effort, size, or important data definitions such as counting method or domain information) and duplicated data points, and further analysis of abnormal data patterns identified and removed more than a dozen additional data points. A total of 530 records remained in our subject data set to begin the effort distribution analysis. Although these 530 records are complete with total effort, size, and other important attributes, they need further processing to ensure sufficient phase effort data, because some records do not have all the phase effort distribution data needed for the analysis. Eliminating those without phase effort distribution data leaves 345 data points to work with. The table below illustrates the overall data selection and normalization progress.

Table 11: Data Selection and Normalization Progress

    Record Count | Action                                                             | Results
    1,023        | Browse the data records for defective and duplicated data points.  | Eliminated 479 defective and duplicated data points.
    544          | Look for abnormal or unusual data patterns.                         | Eliminated 14 data points.
    530          | Remove records with insufficient effort distribution data.          | Eliminated 185 data points.
    345          | Divide the data set by the number of missing phase effort fields.   | Ready to create three subsets: Missing 2, Missing 1, and Perfect. (Missing 4 and Missing 3 are ignored because they lack too much information to be persuasive.)
    257          | Backfill the two missing phase effort fields.                        | Created the Missing 2 set.
    221          | Backfill the single missing phase effort field.                      | Created the Missing 1 set.
    135          | None.                                                                | Created the Perfect set.

The Missing 2 and Missing 1 sets were created as comparison sets against the Perfect set in order to 1) increase the number of data points in the sample data set, and 2) preview how the data patterns may look as more data points become available in the future.
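A minimal sketch of this selection-and-splitting step is shown below; the column names are assumptions, and the backfilling applied to the Missing 1 and Missing 2 sets is not shown.

    import pandas as pd

    PHASES = ["REQ", "ARCH", "CUT", "INT"]

    def build_subsets(raw: pd.DataFrame):
        # Drop defective records (missing total effort, size, or domain) and duplicates.
        df = raw.dropna(subset=["total_effort", "size", "domain"]).drop_duplicates()
        # Count how many phase-effort fields each remaining record is missing.
        missing = df[PHASES].isna().sum(axis=1)
        perfect = df[missing == 0]        # complete phase effort data
        missing_1 = df[missing <= 1]      # at most one phase field to backfill
        missing_2 = df[missing <= 2]      # at most two phase fields to backfill
        return perfect, missing_1, missing_2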

The backfilling method is effective at predicting continuous, correlated data patterns such as the effort distribution patterns we focus on, so the resulting data sets are sufficient for our analysis. Since there is little difference between the Missing 2 and Missing 1 sets, only the Missing 2 set is used from this point on. Two copies are created for each of the Missing 2 and Perfect sets: one grouped by Application Domains and the other grouped by Productivity Types. We are now ready for the main data analyses.

4.2 Data Analysis of Domain Information

Application Domains

The following table shows the record counts by application domain. The domains carried forward are those with a sufficient number of data points in all three subsets; the threshold for a sufficient count is five.

Table 12: Research Data Records Count - Application Domains (record counts per application domain for the Missing 2, Missing 1, and Perfect sets)

    Business Systems
    Command & Control
    Communications
    Controls & Displays
    Executive Information Assurance
    Infrastructure or Middleware
    Maintenance & Diagnostics
    Mission Management
    Mission Planning
    Process Control
    Scientific Systems
    Sensor Control and Processing
    Simulation & Modeling
    Spacecraft BUS
    Spacecraft Payload
    Test & Evaluation
    Tool & Tool Systems
    Training
    Weapons Delivery and Control
    Total

Overall Effort Distribution Patterns: For each application domain with sufficient data points, the average effort percentages of each activity group are calculated from the data records and then plotted to visualize the effort distribution pattern. Table 13 and Figure 5 illustrate the Perfect set, whereas Table 14 and Figure 6 illustrate the Missing 2 set.

Table 13: Average Effort Percentages - Perfect Set by Application Domains

    Domain                            Abbrev.   REQ      ARCH     CUT      INT
    Business                          Biz       20.98%   22.55%   24.96%   31.51%
    Command & Control                 CC        21.04%   22.56%   33.73%   22.66%
    Communications                    Comm      14.95%   30.88%   28.54%   25.62%
    Control & Display                 CD        14.72%   34.80%   24.39%   26.09%
    Mission Management                MM        15.40%   17.78%   28.63%   38.20%
    Mission Planning                  MP        17.63%   12.45%   44.32%   25.60%
    Sensors Control and Processing    Sen        7.78%   45.74%   22.29%   24.19%
    Simulation                        Sim       10.71%   39.11%   30.80%   19.38%
    Spacecraft Bus                    SpBus     33.04%   20.66%   30.00%   16.30%
    Weapons Delivery and Control      Weapons   11.50%   17.39%   29.82%   41.29%

Figure 5: Effort Distribution Pattern - Perfect Set by Application Domains (line plot of the Table 13 averages across Requirements, Architecture & Design, Code & Unit Test, and Integration & QT, with the COCOMO II averages for comparison)

Among the average effort percentages, it is clear that none of the application domains follows a trend similar to the COCOMO II averages, which are 6.5% for Requirements & Planning, about 40% for Architecture & Design, about 30% for Code & Unit Testing, and roughly 23.5% for Integration & Qualification Testing. The application domain closest to the COCOMO II averages is Simulation, which still suggests that more time should be dedicated to Requirements & Planning and less to Integration & Qualification Testing. All other application domains call for a significant increase in time for Requirements & Planning, except Sensors Control and Processing, which allocates most of its time to Architecture & Design. Because most application domains spend more time on Requirements & Planning, most of them spend less on Architecture & Design. A few application domains show large differences for Code & Unit Testing: only Mission Planning allocates a notably large share of time to Code & Unit Testing, perhaps because it tends to do less architecting (only 12.45%), while Sensor Control and Processing spends less time on coding and unit testing because of its greater design effort. Overall, the gap between the minimum and maximum average effort percentages for each activity group is significant, and the domain averages are spread fairly evenly across these gaps, which will be reflected in the hypothesis testing. For the domains with sufficient data, the domain-specific effort distributions appear to fit better than the COCOMO II distributions, although this also needs statistical validation, which is explored next.

Table 14: Average Effort Percentages - Missing 2 Set by Application Domains

    Domain                            Abbrev.   REQ      ARCH     CUT      INT
    Business                          Biz       18.51%   23.64%   27.85%   30.00%
    Command & Control                 CC        19.41%   23.70%   34.59%   22.31%
    Communications                    Comm      14.97%   27.85%   27.89%   29.28%
    Control & Display                 CD        14.67%   26.66%   27.72%   30.95%
    Mission Management                MM        16.59%   17.60%   25.74%   40.07%
    Mission Planning                  MP        14.41%   16.42%   43.47%   25.69%
    Sensors Control and Processing    Sen        7.75%   31.84%   25.32%   35.09%
    Simulation                        Sim       13.73%   29.09%   30.33%   26.85%
    Spacecraft Bus                    SpBus     36.38%   17.23%   27.90%   18.50%
    Weapons Delivery and Control      Weapons   11.73%   17.79%   28.97%   41.51%

Figure 6: Effort Distribution Pattern - Missing 2 Set by Application Domains (line plot of the Table 14 averages across the four activity groups, with the COCOMO II averages for comparison)

Similar to what was observed in the Perfect set, the data points from the Missing 2 set produce effort distribution patterns with wide gaps between the minimum and the maximum for each activity group, while the application domain averages are scattered fairly evenly within those gaps. One thing to notice is that for Requirements & Planning, Spacecraft Bus pushes the maximum to over 35% while most of the other application domains stay below 20%. This may be because the Spacecraft Bus domain needs to satisfy the requirements of a family of spacecraft systems.

ANOVA and T-Test: Although the plots show obvious differences between the distribution patterns of the application domains, more mathematical evidence is needed to fully support the use of domain-based effort distribution patterns. Using ANOVA to test the significance of the differences between application domains helps confirm the hypothesis: we need to reject the null hypothesis that all application domains produce the same effort distribution pattern. Table 15 lays out the results for both the Perfect and Missing 2 sets. The F statistic and P-value are calculated for each activity group to evaluate whether the variances between application domains are produced by noise or reflect real differences. With strong rejections for every activity group, the ANOVA results favor the hypothesis and indicate that the differences between application domains are not merely coincidental.

Table 15: ANOVA Results - Application Domains

    Activity Group                              Result (Perfect Data Set)   Result (Missing 2 Data Set)
    Plan & Requirements                         Reject                      Reject
    Architecture & Design                       Reject                      Reject
    Code & Unit Testing                         Reject                      Reject
    Integration and Qualification Testing       Reject                      Reject

Given the strong support the ANOVA results lend to the hypothesis, the T-Test results provide further evidence that domain-based effort distribution is a good alternative to the conventional COCOMO averages. As shown in Table 16, most application domains are far from the COCOMO average in the Plan & Requirements and Architecture & Design activity groups, and somewhat apart in Integration & Qualification Testing. However, only a few application domains disagree with the COCOMO average in Code & Unit Testing. This may be because most of COCOMO II's calibration data points have complete Code & Unit Testing effort data while lacking quality support for the other activity groups, particularly Plan & Requirements and Architecture & Design. In summary, the results are favorable for three of the four activity groups and therefore add support for using domain-based effort distribution patterns.

Table 16: T-Test Results - Application Domains

    Plan & Requirements (COCOMO average 6.5%)
        Perfect set: all domains reject except Control and Display, Sensor Control, and Simulation.
        Missing 2 set: all domains reject except Sensor Control.
    Architecture & Design (COCOMO average 39.3%)
        Perfect set: all domains reject except Control and Display, Sensor Control, and Simulation.
        Missing 2 set: all domains reject except Control and Display.
    Code & Unit Testing (COCOMO average 30.8%)
        Perfect set: only Mission Planning rejects.
        Missing 2 set: Communications, Mission Management, Mission Planning, and Sensor Control reject; the other six domains do not.
    Integration and Qualification Testing (COCOMO average 23.4%)
        Perfect set: only Mission Management, Spacecraft Bus, and Weapons Delivery reject.
        Missing 2 set: Communications, Mission Management, Sensor Control, Spacecraft Bus, and Weapons Delivery reject; the other five domains do not.

Productivity Types

The following table shows the record counts by productivity type. The types carried forward are those with a sufficient number of data points in all three subsets; again, the threshold for a sufficient count is five.

Table 17: Research Data Records Count - Productivity Types (record counts per productivity type for the Missing 2, Missing 1, and Perfect sets)

    C&C, ISM, MP, PC, PLN, RTE, SCI, SCP, SYS, TEL, TRN, TST, TUL, VC, VP, Total

Overall Effort Distribution Patterns: For each productivity type with sufficient data points, the average effort percentages of each activity group are calculated from the data records and then plotted to visualize the effort distribution pattern. Table 18 and Figure 7 illustrate the Perfect set, whereas Table 19 and Figure 8 illustrate the Missing 2 set.

Table 18: Average Effort Percentages - Perfect Set by Productivity Types

    Productivity Type   Requirement   Arch&Design   Code&Unit Test   Integration & QT
    ISM                 11.56%        27.82%        35.63%           24.99%
    MP                  20.56%        15.75%        28.89%           34.80%
    PLN                 16.22%        12.27%        50.78%           20.73%
    RTE                 15.47%        26.65%        26.71%           31.17%
    SCI                  7.38%        39.90%        32.05%           20.67%
    SCP                 10.80%        45.20%        20.34%           23.66%
    SYS                 17.61%        21.10%        28.75%           32.54%
    VC                  18.47%        23.60%        31.32%           26.61%

Figure 7: Effort Distribution Pattern - Perfect Set by Productivity Types (line plot of the Table 18 averages across the four activity groups, with the COCOMO II averages for comparison)

The average effort percentages are very similar to those from the application domains. This is expected since several productivity types are essentially the same as their application domain counterparts (for instance, SCI = Scientific and Simulation Systems, PLN = Systems for Planning and Support Activities, and SCP = Sensor Control and Processing). Again, the gaps between the minimum and maximum average percentages are easy to spot in the plot, and the average percentages of the productivity types spread across these ranges in the same even fashion.

Table 19: Average Effort Percentages - Missing 2 Set by Productivity Types

    Productivity Type   Requirement   Arch&Design   Code&Unit Test   Integration & QT
    ISM                 12.34%        25.32%        32.81%           29.53%
    MP                  22.43%        15.06%        26.11%           36.40%
    PLN                 14.99%        14.15%        49.23%           21.62%
    RTE                 15.43%        24.02%        28.84%           31.70%
    SCI                  7.40%        34.98%        30.47%           27.15%
    SCP                 12.50%        29.07%        24.11%           34.32%
    SYS                 15.74%        20.87%        32.61%           30.79%
    VC                  18.02%        21.36%        30.02%           30.59%

Figure 8: Effort Distribution Pattern - Missing 2 Set by Productivity Types (line plot of the Table 19 averages across the four activity groups, with the COCOMO II averages for comparison)

The percentage distributions from the Missing 2 set are essentially the same as those from the Perfect set except that 1) the gap is smaller for Integration & Qualification Testing and Requirements & Planning, and 2) most of the productivity types suggest around 30% effort for Code & Unit Testing while PLN pushes that to almost 50%. Regardless of these differences, both the Perfect set and the Missing 2 set produce favorable results and provide a solid foundation for the further statistical analyses.

ANOVA and T-Test: Similar to the results for application domains, the ANOVA results for productivity types support the hypothesis that the differences between effort distribution patterns by productivity type cannot be dismissed as noise. Table 20 summarizes the ANOVA results.

Table 20: ANOVA Results - Productivity Types

    Activity Group                              Result (Perfect Data Set)   Result (Missing 2 Data Set)
    Plan & Requirements                         Weak Reject                 Reject
    Architecture & Design                       Reject                      Reject
    Code & Unit Testing                         Reject                      Reject
    Integration and Qualification Testing       Weak Reject                 Reject

The T-Test results offer even better evidence that the effort distribution patterns by productivity type are very different from the COCOMO averages, with more disagreements in the Code & Unit Testing activity group. These disagreements provide strong support for using productivity types to further enhance the COCOMO II model in terms of effort distribution.

Table 21: T-Test Results - Productivity Types

    Plan & Requirements (COCOMO average 6.5%)
        Perfect set: all types reject except ISM, SCI, and SCP.
        Missing 2 set: all types reject except SCI.
    Architecture & Design (COCOMO average 39.3%)
        Perfect set: all types reject except SCI and SCP.
        Missing 2 set: all types reject except SCI.
    Code & Unit Testing (COCOMO average 30.8%)
        Perfect set: only PLN, RTE, and SCP reject.
        Missing 2 set: only MP, PLN, and SCP reject.
    Integration and Qualification Testing (COCOMO average 23.4%)
        Perfect set: only MP, RTE, and SYS reject.
        Missing 2 set: all types reject except PLN and SCI.

4.3 Data Analysis of Project Size

Application Domains

Since the data includes other possible sources of variation in effort distribution, such as size, a study of project size was performed to provide a more in-depth look at effort distribution by application domain. As discussed in Chapter 3, the data from each application domain is divided into size groups, providing at least five data points for each size group.

Using the average effort percentages from each size group, we can observe possible effects of size on the effort distribution patterns. In this study, three size groups are used to divide the application domains: 0 to 10 KSLOC, 10 to 32 KSLOC, and 32+ KSLOC. ANOVA is also performed for each domain at the 90% confidence level; it measures how strongly project size drives variability in the effort distribution patterns, so that this strength can be compared against the variability attributable to application domains. The following tables show the project size analysis results for the Communication and Mission Management application domains from the Perfect set.

Table 22: Effort Distribution by Size Groups - Communication (Perfect)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     50.5%    17.3%    21.5%
    10 to 32             --     25.2%    31.1%    25.1%
    32+                  --     27.4%    31.1%    27.8%
    ANOVA results: REQ - Can't Reject; ARCH - Reject; CODE - Reject; INT&QT - Can't Reject

Figure 9: Effort Distribution by Size Groups - Communication (Perfect) (bar chart of the Table 22 percentages by size group)

Table 23: Effort Distribution by Size Groups - Mission Management (Perfect)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     23.4%    27.1%    32.9%
    10 to 32             --     14.8%    26.2%    43.5%
    32+                  --     15.6%    32.0%    38.2%
    ANOVA results: REQ - Can't Reject; ARCH - Can't Reject; CODE - Can't Reject; INT&QT - Can't Reject

Figure 10: Effort Distribution by Size Groups - Mission Management (Perfect) (bar chart of the Table 23 percentages by size group)

In addition to the two application domains from the Perfect set, the Command & Control and Sensor Control & Processing application domains from the Missing 2 set are analyzed with the same size groups. Note that no duplicate analysis is done for the Communication and Mission Management domains even though there are enough data points for them in the Missing 2 set; the results from the Perfect set are used for those two domains. The following tables and figures provide the results for the two additional application domains.

Table 24: Effort Distribution by Size Groups - Command & Control (Missing 2)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     23.2%    28.6%    22.0%
    10 to 32             --     20.6%    38.6%    25.5%
    32+                  --     24.6%    35.4%    21.6%
    ANOVA results: REQ - Can't Reject; ARCH - Can't Reject; CODE - Can't Reject; INT&QT - Can't Reject

Figure 11: Effort Distribution by Size Groups - Command & Control (Missing 2) (bar chart of the Table 24 percentages by size group)

Table 25: Effort Distribution by Size Groups - Sensor Control (Missing 2)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     45.8%    14.6%    29.8%
    10 to 32             --     28.1%    29.1%    36.2%
    32+                  --     23.6%    29.0%    39.2%
    ANOVA results: REQ - Can't Reject; ARCH - Weak Reject; CODE - Reject; INT&QT - Can't Reject

Figure 12: Effort Distribution by Size Groups - Sensor Control and Processing (Missing 2) (bar chart of the Table 25 percentages by size group)

The following observations can be drawn from the analysis results: 1) Code & Unit Testing and Integration & Qualification Testing effort appear to increase as size grows in the Communication domain. 2) Nothing noteworthy emerges from the Mission Management and Command & Control domains. 3) In the Sensor Control & Processing domain, Architecture & Design effort appears to decrease as size grows while Code & Unit Testing and Integration & Qualification Testing effort increase. 4) No uniform trend was found across the analyzed domains, so no conclusion can be drawn from them.

Although the analysis results from these application domains provide limited indication of whether project size influences effort distribution patterns, a fair number of other application domains have not been analyzed at all. The main reason for their absence is that there are not enough data points to divide the domain into size groups. There are two scenarios in which an application domain is dropped for lack of data points: 1) the overall number of data points is below the minimum threshold of 15 (5 for each size group); or 2) there are not enough data points in one or more size groups. For instance, Command and Control has 15 data points, but only 2 of them are smaller than 10 KSLOC. There were attempts to bypass the second scenario by using different divisions of size groups, but none of them yielded more analyzable application domains; the current division is a better choice than most of the alternatives.

Productivity Types

The same project size analysis is performed for the productivity types. The size groups are 0 to 10 KSLOC, 10 to 36 KSLOC, and 36+ KSLOC, and again a 90% confidence level is used for the ANOVA test measuring the variability strength of project size. Two productivity types, Real Time Embedded (RTE) and Vehicle Control (VC), have enough data points in the Perfect set. The following tables and figures summarize the results.

Table 26: Effort Distribution by Size Groups - RTE (Perfect)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     34.7%    23.8%    28.5%
    10 to 36             --     26.3%    31.9%    30.2%
    36+                  --     12.4%    22.5%    37.9%
    ANOVA results: REQ - Reject; ARCH - Weak Reject; CODE - Reject; INT&QT - Can't Reject

Figure 13: Effort Distribution by Size Groups - RTE (Perfect) (bar chart of the Table 26 percentages by size group)

Table 27: Effort Distribution by Size Groups - VC (Perfect)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     26.7%    31.5%    12.6%
    10 to 36             --     23.6%    32.6%    29.7%
    36+                  --     20.6%    29.9%    37.5%
    ANOVA results: REQ - Can't Reject; ARCH - Can't Reject; CODE - Can't Reject; INT&QT - Weak Reject

Figure 14: Effort Distribution by Size Groups - VC (Perfect) (bar chart of the Table 27 percentages by size group)

Three other productivity types can be analyzed from the Missing 2 set: Mission Processing (MP), Scientific and Simulation Systems (SCI), and Sensor Control and Processing (SCP). Their results are illustrated in the following tables and figures.

Table 28: Effort Distribution by Size Groups - MP (Missing 2)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     16.9%    27.4%    35.1%
    10 to 36             --     17.0%    21.7%    41.7%
    36+                  --     12.5%    29.1%    32.6%
    ANOVA results: REQ - Can't Reject; ARCH - Can't Reject; CODE - Can't Reject; INT&QT - Weak Reject

Figure 15: Effort Distribution by Size Groups - MP (Missing 2) (bar chart of the Table 28 percentages by size group)

Table 29: Effort Distribution by Size Groups - SCI (Missing 2)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     53.4%    26.5%    17.8%
    10 to 36             --     25.6%    21.5%    43.6%
    36+                  --     29.2%    36.7%    24.7%
    ANOVA results: REQ - Reject; ARCH - Reject; CODE - Can't Reject; INT&QT - Reject

Figure 16: Effort Distribution by Size Groups - SCI (Missing 2) (bar chart of the Table 29 percentages by size group)

Table 30: Effort Distribution by Size Groups - SCP (Missing 2)

    Size Group (KSLOC)   REQ    ARCH     CODE     INT&QT
    0 to 10              --     30.2%    19.5%    34.7%
    10 to 36             --     27.0%    29.3%    32.8%
    36+                  --     30.7%    26.5%    37.1%
    ANOVA results: REQ - Can't Reject; ARCH - Can't Reject; CODE - Weak Reject; INT&QT - Can't Reject

Figure 17: Effort Distribution by Size Groups - SCP (Missing 2) (bar chart of the Table 30 percentages by size group)

From these results, the only detected trends are: 1) in the Real Time Embedded productivity type, Architecture & Design effort decreases with size while Integration & Qualification Testing effort increases; and 2) in the Vehicle Control productivity type, Integration & Qualification Testing effort increases with size while both Planning & Requirements and Architecture & Design effort drop. No additional points of interest were found in the three productivity types from the Missing 2 set. Although the number of productivity types that can be analyzed is higher than the number of application domains, the overall quality of the analysis results is not much better. Again, the lack of data points is likely the main factor inhibiting the results, and increasing the data count seems to be the only strategy for improving this analysis.

4.4 Data Analysis of Personnel Capability

Application Domains

Unlike the project size analysis, the personnel rating analysis does not lack data points: the data points are not divided into smaller groups, and each application domain can be studied with all of its data points together. Following the guideline described in Section 3.5.1, each data point is assigned an overall personnel rating. For each activity group of each application domain, the personnel ratings are plotted against the effort percentages, and a fitted regression line represents the correlation between personnel rating and effort distribution.
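Such a fit can be computed with ordinary least squares; the sketch below (illustrative column and function names) reports the slope and R^2 of personnel rating versus effort percentage for one activity group of one domain, skipping a domain whose ratings are all identical, as is the case for Spacecraft Bus.

    from scipy import stats

    def rating_regression(df, domain, activity):
        sub = df[df["domain"] == domain].dropna(subset=["personnel_rating", activity])
        if sub["personnel_rating"].nunique() < 2:
            return None   # all ratings identical: no meaningful regression line
        fit = stats.linregress(sub["personnel_rating"], sub[activity])
        return {"slope": fit.slope, "r_squared": fit.rvalue ** 2}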

The following table summarizes the results; note that Spacecraft Bus is excluded because its personnel ratings are all the same.

Table 31: Personnel Rating Analysis Results - Application Domains

    Business
        Perfect set: regression lines have a notable slope for Architecture & Design and Code & Unit Testing (R^2 about 30%); flat for the other activity groups (R^2 < 2%).
        Missing 2 set: R^2 drops for all activity groups and the regression lines become more horizontal.
    Command & Control
        Perfect set: flat regression lines for all activity groups; low R^2, below 10%.
        Missing 2 set: same as in the Perfect set.
    Communication
        Perfect set: flat regression lines for all activity groups; low R^2, below 5%.
        Missing 2 set: same as in the Perfect set.
    Control and Display
        Perfect set: strong correlation with notable regression lines; high R^2, above 80%.
        Missing 2 set: flatter regression lines and a significant drop of R^2 (below 10%).
        Remark: the personnel ratings are essentially identical in the Perfect set, creating the strong correlation; once they differ in the Missing 2 set, the regression lines flatten.
    Mission Management
        Perfect set: flat regression lines for all activity groups; low R^2, below 10%.
        Missing 2 set: same as in the Perfect set.
    Mission Planning
        Perfect set: notable regression line for Requirements & Planning; the others are poor (R^2 below 10%).
        Missing 2 set: improved regression lines for Code & Unit Testing and Integration & Qualification Testing.
    Sensor Control and Processing
        Perfect set: notable regression lines (fair R^2, around 20%) except for Code & Unit Testing.
        Missing 2 set: flatter regression lines with R^2 dropping below 10%.
    Simulation
        Perfect set: good regression lines for Architecture & Design and Code & Unit Testing (R^2 around 30%).
        Missing 2 set: same as in the Perfect set.
    Weapons Delivery & Control
        Perfect set: strong correlation for Code & Unit Testing and Integration & Qualification Testing (R^2 around 50%); good regression line for Requirements & Planning (R^2 around 35%) and fair for Architecture & Design (R^2 around 18%).
        Missing 2 set: much flatter regression lines and a significant drop of R^2 (below 10%).
    Additional remarks noted in the table: some domains have few data points, so apparently significant results may be noise; for several domains the number of data points nearly doubles from the Perfect set to the Missing 2 set.

As shown in the table above, most application domains produce below-par regression results that argue against using personnel rating as an extra factor in the effort distribution model.

The only strong result is from Weapons Delivery & Control, yet the significant drop of R^2 in the Missing 2 set eliminates it as favorable evidence. In summary, the analyses of project size and personnel rating provide more information about the application domains' characteristics and behaviors with respect to effort distribution patterns. However, because of the limitations in either data point counts or the coefficient-of-determination values, neither factor can be used with high confidence to further enhance the domain-based effort distribution model. Thus, neither will be included in the model architecture if application domains are used as the final domain breakdown.

Productivity Types

Mimicking the approach used for the application domains, the following table lays out the results of analyzing the productivity types with respect to personnel ratings.

Table 32: Personnel Rating Analysis Results - Productivity Types

    ISM (Infrastructure or Middleware)
        Perfect set: strong correlation for Architecture & Design and Code & Unit Testing (R^2 above 60%), but poor for Requirements & Planning and Integration & Qualification Testing (R^2 below 3%).
        Missing 2 set: all activity groups produce flatter regression lines; R^2 drops below 3%.
        Remark: the number of data points doubles from the Perfect set to the Missing 2 set.
    MP (Mission Processing)
        Perfect set: all activity groups produce flat regression lines.
        Missing 2 set: same as the Perfect set.
        Remark: the number of data points doubles from the Perfect set to the Missing 2 set.
    PLN (Planning and Support Activities)
        Perfect set: most activity groups produce good regression lines (R^2 above 30%) while Integration & Qualification Testing produces a poor one (R^2 below 3%).
        Missing 2 set: Requirements & Planning and Code & Unit Testing stay good (R^2 around 25%); Architecture & Design becomes flatter (R^2 drops below 1%); however, Integration & Qualification Testing improves (R^2 increases to 13%).
    RTE (Real-Time Embedded)
        Perfect set: all activity groups produce flat regression lines.
        Missing 2 set: worse results than the Perfect set.
    SCI (Scientific)
        Perfect set: all activity groups produce flat regression lines.
        Missing 2 set: worse results than the Perfect set.
    SCP (Sensor Control and Processing)
        Perfect set: all activity groups produce flat regression lines except Code & Unit Testing, which has a relatively better line (R^2 reaches 18%).
        Missing 2 set: worse results than the Perfect set.
    SYS (System)
        Perfect set: good regression results for Architecture & Design and Code & Unit Testing (R^2 above 22%); poor for the others (R^2 below 2%).
        Missing 2 set: worse results than the Perfect set; Architecture & Design becomes flatter (R^2 drops to 16%).
    VC (Vehicle Control)
        Perfect set: good regression results for Requirements & Planning and Integration & Qualification Testing (R^2 above 30%); poor for the others (R^2 below 2%).
        Missing 2 set: slight change in Integration & Qualification Testing (R^2 drops to 38%).
    Remark noted for one type in the table: the number of data points doubles from the Perfect set to the Missing 2 set.

Similar to the application domains, there is no strong evidence to support using personnel ratings in the effort distribution model. Most productivity types show poor correlation between personnel ratings and effort percentages, as indicated by the low R^2 values. Again, productivity types provide no better conclusion than application domains, and eliminating project size and personnel rating as additional factors for the effort distribution model seems reasonable. Overall, the analysis of productivity types yields a body of comparable evidence that can be used to decide which breakdown is best for the final model. The full details of this comparison are covered in the next section, after which a final decision can be made.

4.5 Comparison of Application Domains and Productivity Types

In order to build the domain-based effort distribution model, one of the domain breakdowns must be selected as the primary domain definition set to be supported by the model. This section describes the comparison analysis used to evaluate the two domain breakdowns. The analysis determines the better breakdown along the following dimensions: 1) effort distribution patterns and trends; 2) statistical test results; and 3) other non-tangible advantages. Before exploring the detailed comparisons, note that the actual number of analyzed data points differs slightly between application domains and productivity types.

This is mainly because more application domains reach the minimum number of records to analyze (10 domains with 5 or more data points) than productivity types (8 types with 5 or more data points). In total, 122 and 223 data points are analyzed for application domains in the Perfect and Missing 2 sets respectively, versus 118 and 219 for productivity types. Since the Missing 2 set is a predicted set that does not fully represent the actual distribution, the comparison analysis is conducted solely on the results produced from the Perfect set.

Effort Distribution Patterns: For the effort distribution patterns, there are three factors to investigate for application domains and productivity types:
1) How big are the gaps between the minimum and the maximum average effort percentages for each activity group? A wider gap indicates larger differences between domains or types and therefore stronger evidence for using that breakdown in the model.
2) How are the trend lines spread within these gaps? Evenly distributed trend lines indicate that the differences between domains or types are themselves well spread, which makes the breakdown more useful.
3) How different is the general shape of the trend lines between application domains and productivity types? If the general shape is the same, there is no reason to prefer a separate breakdown.
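The first factor is a simple spread computation. Given a table of average effort percentages shaped like Table 13 or Table 18 (one row per domain or type, one column per activity group), the sketch below reports the minimum, maximum, and gap for each activity group; the function name is illustrative.

    import pandas as pd

    def percentage_gaps(avg_pct: pd.DataFrame) -> pd.DataFrame:
        # avg_pct: one row per domain/type, one column per activity group.
        return pd.DataFrame({"Min": avg_pct.min(),
                             "Max": avg_pct.max(),
                             "Diff": avg_pct.max() - avg_pct.min()})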

The following two tables summarize the size of the gaps between the minimum and maximum average effort percentages for all activity groups. Notice that the Plan & Requirements gap for productivity types is almost half the size of the gap for application domains, and there is a similar difference for Integration & Qualification Testing, where application domains have a much larger gap than productivity types. On the other hand, productivity types show a wider gap than application domains for Code & Unit Testing; however, that difference is not as large as the ones observed for Plan & Requirements and Integration & Qualification Testing, so the advantage it gives productivity types is limited.

Table 33: Effort Distribution Patterns Comparison

                           Plan & Requirements             Architecture & Design
                           Min       Max       Diff        Min       Max       Diff
    Application Domain     7.78%     33.04%    25.26%      12.45%    45.74%    33.29%
    Productivity Type      7.38%     20.56%    13.18%      12.27%    45.20%    32.93%

Table 34: Effort Distribution Patterns Comparison

                           Code & Unit Testing             Integration and QT
                           Min       Max       Diff        Min       Max       Diff
    Application Domain     22.29%    44.32%    22.03%      16.30%    41.29%    24.99%
    Productivity Type      20.34%    50.78%    30.44%      20.67%    34.80%    14.13%

The trend lines from both application domains and productivity types are spread fairly evenly within the gaps for almost all activity groups. Most application domains are below 22% for Plan & Requirements, while Spacecraft Bus goes beyond 30%. Most productivity types fall between 25% and 35% for Code & Unit Testing, while PLN reaches 50% and SCP drops to about 20%. In general, the spread suggests that both application domains and productivity types are good candidates for producing distinct effort distribution patterns. As for the general shapes of the trend lines, the two breakdowns are quite different: application domains show wide gaps for all activity groups, whereas productivity types show large gaps for Architecture & Design and Code & Unit Testing while sitting closer together for Plan & Requirements and Integration & Qualification Testing.

Statistical Test Results: The statistical test results are straightforward. The tables below compare application domains and productivity types in terms of ANOVA and T-Test results.

Table 35: ANOVA Results Comparison

    Activity Group                              Application Domain   Productivity Type
    Plan & Requirements                         Reject               Reject
    Architecture & Design                       Reject               Reject
    Code & Unit Testing                         Reject               Reject
    Integration and Qualification Testing       Reject               Reject

Table 36: T-Test Results Comparison

    Plan & Requirements (COCOMO average 6.5%)
        Application domains: all reject except Control and Display, Sensor Control, and Simulation.
        Productivity types: all reject except ISM, SCI, and SCP.
    Architecture & Design (COCOMO average 39.3%)
        Application domains: all reject except Control and Display, Sensor Control, and Simulation.
        Productivity types: all reject except SCI and SCP.
    Code & Unit Testing (COCOMO average 30.8%)
        Application domains: only Mission Planning rejects.
        Productivity types: only PLN, RTE, and SCP reject.
    Integration and Qualification Testing (COCOMO average 23.4%)
        Application domains: only Mission Management, Spacecraft Bus, and Weapons Delivery reject.
        Productivity types: only MP, RTE, and SYS reject.

In terms of ANOVA results, application domains are more favorable because of their low P-values for all activity groups, which support the hypothesis at a 95% confidence level. On the other hand, productivity types perform very well in the T-Test, producing more rejections against the COCOMO II averages, especially in Code & Unit Testing.

Non-tangible advantages: There are three areas in which the characteristics and behaviors of the domain breakdowns are studied: 1) Identifiability: how are domains or types identified, and is the identification consistent and easy? 2) Availability: how early in the software lifecycle can the domain or type be identified? 3) Supportability: how much data can be collected for each breakdown now and in the future?

Application domains are identified based on a project's primary functionalities or capabilities. Since most projects start with higher-level requirements that focus on functionalities or capabilities, the application domain can easily be identified from the initial operational description, making it available before further requirements analysis and/or design. This early identification also contributes to shrinking the Cone of Uncertainty. Additionally, because application domains are categorized by functionality, the domain definitions are more precise and easier to understand.

Many people can relate to the functionalities when drawing boundaries between domains, which eliminates confusion and overhead between application domains. In contrast, identifying a productivity type requires both identification of the application domain and estimation of the productivity rate. The additional estimation may delay the overall identification, since there is no straightforward reference for attaching a productivity rate to a given application domain; projects from the same application domain may have a wide range of productivity rates that depend on additional estimation parameters such as size and personnel information. Moreover, because a given productivity type may span several application domains covering a range of functionalities or capabilities, its definition may not be easily interpreted by project managers unfamiliar with the concept. In summary, application domains are more uniform in terms of definition clarity and determining factors, both of which revolve around project functionality or capability, whereas productivity types may depend on functionality, size, personnel information, and other parameters needed to estimate the productivity rate. Lastly, application domains will have more data support, since they are widely used by many organizations as standard meta-information for software projects and are attached to data collection surveys. In contrast, productivity types are fairly novel, many people are not familiar with them, and project data are therefore not collected with productivity type information.

4.6 Conclusion of Data Analyses

After analyzing the application domains and productivity types, three important findings emerge: 1) effort distribution patterns are impacted by project domains; 2) project size was not confirmed as a source of effort distribution variation; and 3) personnel capability was not confirmed as a source of effort distribution variation. With these findings, I am confident that the domain-based effort distribution model is a necessary alternative for enhancing the overall effort distribution guidance currently available in the COCOMO II model. Having compared the application domains and productivity types, I have concluded that application domains are more relevant and usable as the key domain definition set (or breakdown structure) to be supported by the domain-based effort distribution model, as they are more uniform to use and have better data support than the productivity types.

CHAPTER 5: DOMAIN-BASED EFFORT DISTRIBUTION MODEL

This chapter presents detailed information about the domain-based effort distribution model. Section 5.1 provides a comprehensive description of the model, including its inputs and outputs, general structure, and key components. Section 5.2 outlines an implementation instance of the model and its connection to a copy of the COCOMO II model implementation, and provides a simple guide to using the implemented tool to estimate the effort distribution of a sample project. Section 5.3 compares the results of the domain-based effort distribution against the COCOMO II effort distribution using sample projects.

5.1 Model Description

As depicted in the following figure, the domain-based effort distribution model takes project effort (in person-months) and application domain as its primary inputs and produces a suggested effort distribution guideline as its main output. It is designed as an extension of the COCOMO II model and follows a similar reporting style for the output effort distribution: effort distribution is reported in tabular form as development phase (or activity group), phase effort percentage, and phase effort in person-months.

Figure 18: Domain-based Effort Distribution Model Structure

The suggested effort distribution is the product of total project effort and average effort percentages. The average effort percentages are determined using a lookup table of application domains and activity groups, shown below. If no application domain is provided with the project, the suggested effort distribution falls back to the standard COCOMO II Waterfall effort distribution.

Table 37: Average Effort Percentages Table for the Domain-Based Model

    Application Domain                Requirement   Arch & Design   Code & Unit Test   Integration & QT
    Business                          20.98%        22.55%          24.96%             31.51%
    Command & Control                 21.04%        22.56%          33.73%             22.66%
    Communications                    14.95%        30.88%          28.54%             25.62%
    Control & Display                 14.72%        34.80%          24.39%             26.09%
    Mission Management                15.40%        17.78%          28.63%             38.20%
    Mission Planning                  17.63%        12.45%          44.32%             25.60%
    Sensors Control and Processing     7.78%        45.74%          22.29%             24.19%
    Simulation                        10.71%        39.11%          30.80%             19.38%
    Spacecraft Bus                    33.04%        20.66%          30.00%             16.30%
    Weapons Delivery and Control      11.50%        17.39%          29.82%             41.29%

The application domains listed above have a sufficient number of data points in the Perfect set, and their average effort percentages are calculated from the Perfect set. Additionally, the domains with enough data points only in the Missing 2 set, namely Infrastructure or Middleware and Tool & Tool Systems, can be added. These can be marked as predicted domains, and their limited data support must be kept in mind when using the suggested effort distribution.

5.2 Model Implementation

In order to demonstrate how the model works, I developed an implementation instance of the model. This instance is built on top of a web-based COCOMO II tool that was developed earlier for demonstration.

The tool runs on an Apache web server, uses a MySQL database, and is written in PHP and JavaScript. The mathematical formulation used for this implementation is as follows. From the COCOMO II model, the total project effort in person-months (PM) is computed as

    PM = A \times Size^{E} \times \prod_{i} EM_i    (EQ. 5)

where

    E = B + 0.01 \times \sum_{j} SF_j    (EQ. 6)

EM denotes the COCOMO II effort multipliers and SF the scale factors. For each supported application domain AD_k, the phase efforts are computed as

    PM_{REQ}(AD_k)  = PM \times p_{REQ}(AD_k)     (EQ. 7)
    PM_{ARCH}(AD_k) = PM \times p_{ARCH}(AD_k)    (EQ. 8)
    PM_{CUT}(AD_k)  = PM \times p_{CUT}(AD_k)     (EQ. 9)
    PM_{INT}(AD_k)  = PM \times p_{INT}(AD_k)     (EQ. 10)

where the percentage p for each equation comes from Table 37.
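As a compact illustration of EQ. 5 through EQ. 10, the Python sketch below computes the total effort and splits it by domain percentages, falling back to the COCOMO II Waterfall percentages when no domain is given. It is not the PHP implementation: the function names are illustrative, only two domains from Table 37 are included for brevity, and A and B are the published COCOMO II.2000 constants.

    from math import prod

    A, B = 2.94, 0.91                               # COCOMO II.2000 constants

    COCOMO_WATERFALL = {"REQ": 6.5, "ARCH": 39.3, "CUT": 30.8, "INT": 23.4}
    DOMAIN_PERCENTAGES = {                          # subset of Table 37
        "Business":                     {"REQ": 20.98, "ARCH": 22.55, "CUT": 24.96, "INT": 31.51},
        "Weapons Delivery and Control": {"REQ": 11.50, "ARCH": 17.39, "CUT": 29.82, "INT": 41.29},
    }

    def total_effort_pm(ksloc, effort_multipliers, scale_factors):
        # EQ. 5-6: PM = A * Size^E * prod(EM), with E = B + 0.01 * sum(SF).
        exponent = B + 0.01 * sum(scale_factors)
        return A * ksloc ** exponent * prod(effort_multipliers)

    def distribute_effort(pm, domain=None):
        # EQ. 7-10: phase effort = PM * (average percentage for the domain) / 100,
        # falling back to the COCOMO II Waterfall percentages without a domain.
        pct = DOMAIN_PERCENTAGES.get(domain, COCOMO_WATERFALL)
        return {phase: pm * p / 100.0 for phase, p in pct.items()}

    # Example: a 24.2 PM Weapons Delivery and Control project.
    # distribute_effort(24.2, "Weapons Delivery and Control")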

Figure 19 captures the main screen for a sample project. The general project information is displayed in the top portion, including project name, application domain, operating environment, development method, scale factors, and schedule constraints. Modules can be added to the project using the add-module button; each module requires a language, labor rate, size, and EAF. Size can be entered in three modes: adapted code, function point conversion, or simple SLOC. The EAF is calculated by setting values for 16 effort multipliers from the COCOMO II model. The estimation results are displayed at the bottom: effort, schedule, productivity, cost, and staffing are calculated from the project input, and three levels of estimates are produced (optimistic, most likely, and pessimistic). An effort distribution link is available next to the Estimation Results label; clicking it brings up the effort distribution screen.

Figure 19: Project Screen of the Domain-based Effort Distribution Tool

The following figure shows the suggested domain-based effort distribution for a Weapons Delivery and Control project. The total project effort is 24.2 PM, and the suggested percentage and resulting PM for each activity group are listed in the bottom portion of the display area. A column graph is also produced to illustrate the effort distribution visually. Both the COCOMO II Waterfall and MBASE effort distributions are available for the user to compare results.

Figure 20: Effort Results from the Domain-based Effort Distribution Tool

5.3 Comparison of Domain-Based Effort Distribution and COCOMO II Effort Distribution

This section validates and compares the results produced by the domain-based effort distribution against the results from the COCOMO II model.

Data Source: Three sample projects are used for this comparison. They come from the original COCOMO II calibration data, which contains sufficient information including domain information, project size, COCOMO II driver values, and schedule constraints. The sample projects also include most of the actual effort distribution data, in PM, for each activity group or waterfall phase. The details of these sample projects are summarized in the following table. Note that there are no requirements or planning effort data for projects 51 and 62; in fact, project 49 is the only data point with requirements and planning effort data. The requirements and planning effort data were not required in the earlier data collection survey, so most projects were submitted without them.

Table 38: Sample Project Summary (project ID, application domain, size in KSLOC, total effort in PM, and actual phase effort in PM for REQ, ARCH, CUT, and INT&QT)

    ID   Application Domain      Notes
    49   Command and Control     Actual effort data available for all four activity groups.
    51   Communications          REQ effort: NA.
    62   Simulation              REQ effort: NA.

Comparison Steps: The procedure for the comparison analysis is straightforward. For each sample project, the following steps are performed to collect a final result for evaluating the performance of the domain-based effort distribution model:
1) Create a project with the necessary information (domain, size, name, etc.) using the tool, and set the COCOMO II driver values accordingly.

2) Collect the output effort distributions from both the COCOMO II Waterfall model and the domain-based effort distribution model.
3) Compare the estimated effort distributions with the project's actual effort distribution.
4) Calculate the error of each estimated effort distribution and determine which model produces the better results.

Analysis Results: Using the tool, estimated total efforts are produced by the COCOMO II model for the three sample projects.

Table 39: COCOMO II Estimation Results (actual effort in PM, estimated effort in PM, and estimation error for projects 49, 51, and 62)

Next, a series of comparison tables contrasts the COCOMO II Waterfall effort distribution results with the results produced by the domain-based effort distribution model.

Table 40: Project 49 (Command and Control) Effort Distribution Estimate Comparison (estimated PM and error per activity group for the COCOMO II Waterfall and domain-based models, plus the total error for each)

Table 41: Project 51 (Communications) Effort Distribution Estimate Comparison (same layout as Table 40; the Requirements errors are not available because project 51 has no actual requirements effort data)

Table 42: Project 62 (Simulation) Effort Distribution Estimate Comparison (same layout as Table 40; the COCOMO II Waterfall Requirements estimate is 35.2 PM but no actual is available, and the total errors are 83.38% for COCOMO II versus 61.88% for the domain-based model)

In each result table, the estimated effort for each Waterfall phase or activity group is produced by the respective effort distribution model. The error for each activity group is calculated against the actual effort, and the total error is the sum of the individual errors, which serves as the final evaluation value for this comparison. The results suggest that the COCOMO II Waterfall produces slightly better results for project 49 (194% total error vs. 199%), but somewhat worse results for projects 51 and 62 (generally about 20% worse than the domain-based effort distribution results). In the COCOMO II Waterfall's defense, the missing requirements and planning phase actuals may cause its drop in performance for projects 51 and 62. The counter-argument is that the COCOMO II effort distribution does not account for requirements and planning effort in general, as it was not part of the COCOMO II model calibration (which is also why that effort data was not required in the data collection survey); this may also undermine its better result for project 49, where the good requirements and planning estimate may have been produced merely by chance. Considering only the other three activity groups, the results are 194% vs. 125%, and the domain-based effort distribution model produces a much better estimate. Another important point is that these sample projects were selected from the calibration data points for the COCOMO II model and therefore should fit the COCOMO II Waterfall better, since the domain-based effort distribution model is based on an entirely different data set. To sum up, the domain-based effort distribution model produces a better estimate if requirements and planning effort is ignored (i.e., summing only the errors from Architecture & Design, Code & Unit Testing, and Integration & Qualification Testing).
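The error bookkeeping behind Tables 40-42 can be expressed compactly. The sketch below assumes the per-phase error is the absolute estimation error relative to the actual effort and that the total error is the sum of the per-phase errors, with phases lacking actual data (such as Requirements for projects 51 and 62) skipped; the function name is illustrative.

    def distribution_errors(estimated, actual):
        # estimated / actual: dicts mapping phase -> person-months; an actual
        # value of None means no actual data was reported for that phase.
        errors = {}
        for phase, act in actual.items():
            if act is None:
                continue
            errors[phase] = abs(estimated[phase] - act) / act
        errors["Total"] = sum(errors.values())
        return errors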

Although the evaluation results suggest that the domain-based effort distribution model performs fairly well against the COCOMO II Waterfall, at least for three of the four activity groups, further validation tests are definitely needed to confirm the advantage.

102 CHAPTER 6: RESEARCH SUMMARY AND FUTURE WORKS 6.1 Research Summary The central theme of this research revolves around finding the relationship between software project information and effort distribution patterns, particular project domain, size, and personnel rating. Such a relationship was discovered for domain information, but was not for size and personnel rating. Since the domain is usually easy to define in the early stage of a project lifecycle, it can provide substantial improvement in preparing resource allocation plans for the different stages of software development. For data and domains analyzed, the hypothesis test strongly confirms such a relationship. And as a result, a new domain-based effort distribution model is drafted and we are one step further in improving the already-popular COCOMO II model for the data-supported application domains. In this research, two sets of domain breakdowns, namely application domains and productivity types, are analyzed for their correlations to effort distribution patterns. A data support of 530 project records is used for this analysis. Both visual and statistical tests are conducted to prove the significance of domain as an influential driver on effort distribution patterns. Project size and personnel rating are studied to determine additional factors that may cause distinguishable trends within domain or type. Comparison between application domains and productivity types helps to determine the domain breakdown 88

Finally, estimation results produced for several sample projects are compared against those produced by the original COCOMO II Waterfall effort distribution guidelines to test the performance of the domain-based effort distribution model. Although the results of this comparison do not strongly favor either model, it is encouraging to see reasonable numbers produced by the domain-based effort distribution model.

6.2 Future Work

The main goal after this research is to continue refining the domain-based effort distribution model. Several known improvements would be beneficial to complete:
1) Refine the overall effort distribution patterns of the currently supported application domains as more data points become available.
2) Support more application domains as more data points become available.
3) Test whether similar results emerge in sectors other than the defense industry.

Beyond improvements to existing model features, the following studies could add valuable extensions to the model:
1) Expand the model to include the schedule distribution patterns available in the current COCOMO II model's Waterfall and MBASE distribution guidelines.

2) Study the effects of operating environments, which characterize the physical constraints on software systems, and explore adding operating environment as a new dimension influencing both effort distribution patterns and schedule distribution patterns.
3) Study the relationships between COCOMO II drivers and domain information, seeking correlations that could help enhance the domain-based effort distribution model.

REFERENCES

[AFCAA, 2011] Air Force Cost Analysis Agency (AFCAA). Software Cost Estimation Metrics Manual. 2011.
[Aroonvatanaporn, 2012] Aroonvatanaporn, P. Shrinking the Cone of Uncertainty with Continuous Assessment for Software Team Dynamics in Design and Development. PhD Dissertation, USC CSSE. 2012.
[Au Yeung] Au Yeung, C. Matrix Factorization: A Simple Tutorial and Implementation in Python.
[Blom, 1958] Blom, G. Statistical Estimates and Transformed Beta Variables. John Wiley and Sons, New York. 1958.
[Boehm, 2010] Boehm, B. Future Challenges and Rewards of Software Engineers. Journal of Software Technology, Vol. 10, No. 3. October 2010.
[Boehm, 2000] Boehm, B., et al. Software Cost Estimation with COCOMO II. Prentice Hall, NY. 2000.
[Boehm, 1981] Boehm, B. Software Engineering Economics. Prentice Hall, New Jersey. 1981.
[Borysowich, 2005] Borysowich, C. Observations from a Tech Architect: Enterprise Implementation Issues & Solutions, Effort Distribution Across the Software Lifecycle. Enterprise Architecture and EAI Blog. October 2005.
[DCRC, 2005] Defense Cost and Resource Center. The DoD Software Resource Data Report: An Update. Practical Software Measurement (PSM) Users Group Conference Proceedings. July 2005.
[DoD HDBK, 2005] Department of Defense Handbook. Work Breakdown Structure for Defense Material Items: MIL-HDBK-881A. July 30, 2005.
[Digital, 1991] Digital Equipment. VAX PWS Software Source Book. Digital Equipment Corp., Maynard, Mass. 1991.
[Galorath, 2005] Galorath Inc. SEER-SEM User Manual. 2005.
[Heijistek, 2008] Heijstek, W., Chaudron, M.R.V. Evaluating RUP Software Development Process Through Visualization of Effort Distribution. EUROMICRO Conference on Software Engineering and Advanced Applications Proceedings. 2008.
[IBM, 1988] IBM Corporation. Industry Applications and Abstracts. IBM, White Plains, N.Y. 1988.
[Jensen, 1983] Jensen, R. An Improved Macrolevel Software Development Resource Estimation Model. Proceedings of the 5th ISPA Conference. April 1983. Page 88.
[Kruchten, 2003] Kruchten, P. The Rational Unified Process: An Introduction. Addison-Wesley Longman Publishing Co., Inc., Boston. 2003.
[Kultur, 2009] Kultur, Y., Kocaguneli, E., Bener, A.B. Domain Specific Phase By Phase Effort Estimation in Software Projects. International Symposium on Computer and Information Sciences. September 2009. Page 498.
[Lee, 2001] Lee, D., Seung, H.S. Algorithms for Non-negative Matrix Factorization. Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference. MIT Press. 2001.
[McConnell, 2006] McConnell, S. Software Estimation: Demystifying the Black Art. Microsoft Press. 2006. Page 62.
[Norden, 1958] Norden, P.V. Curve Fitting for a Model of Applied Research and Development Scheduling. IBM J. Research and Development, Vol. 3, No. 2. 1958.
[NAICS, 2007] North American Industry Classification System. 2007.
[O'Connor, 2003] O'Connor, J., Robertson, E. "Student's t-test". MacTutor History of Mathematics Archive, University of St Andrews. 2003.
[Pearson, 1901] Pearson, K. "On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling". Philosophical Magazine, Series 5, 50 (302).
[Port, 2005] Port, D., Chen, Z., Kruchten, P. An Empirical Validation of the RUP Hump Diagram. Proceedings of the 4th International Symposium on Empirical Software Engineering. 2005.
[PRICE, 2005] PRICE Systems. True S User Manual. 2005.
[Putnam, 1976] Putnam, L.H. A Macro-Estimating Methodology for Software Development. IEEE COMPCON 76 Proceedings. September 1976.
[Putnam, 1992] Putnam, L., Myers, W. Measures for Excellence. Yourdon Press Computing Series. 1992.
[QSM] Quantitative Software Management (QSM) Inc. SLIM-Estimate.
[Reifer, 1990] Reifer Consultants. Software Productivity and Quality Survey Report. El Segundo, Calif. 1990.
[Shapiro, 1965] Shapiro, S.S., Wilk, M.B. An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52 (3-4). 1965.
[Standish, 2009] Standish Group. Chaos Summary 2009. 2009.
[Stephens, 1974] Stephens, M.A. "EDF Statistics for Goodness of Fit and Some Comparisons". Journal of the American Statistical Association, Vol. 69, No. 347 (Sep., 1974).
[Upton, 1996] Upton, G., Cook, I. Understanding Statistics. Oxford University Press. 1996.
[Yang, 2008] Yang, Y., et al. Phase Distribution of Software Development Effort. Empirical Software Engineering and Measurement. October 2008.

APPENDIX A: DOMAIN BREAKDOWN

Application Domains:

Business Systems: Software that automates business functions, stores and retrieves data, processes orders, manages/tracks the flow of materials, combines data from different sources, or uses logic and rules to process information. Examples: Management information systems (Personnel), Financial information systems, Enterprise Resource Planning systems, Logistics systems (Order Entry, Inventory), Enterprise data warehouse, Other IT systems.

Internet: Software developed for applications that run and utilize the Internet. Typically uses web services or middleware platforms (Java, Flash) to provide a variety of functions, e.g. search, order/purchase and multimedia. Examples: Web services, Search systems like Google, Web sites (active or passive) that provide information in multimedia form (voice, video, text, etc.).

Tool and Tool Systems: Software packages and/or integrated tool environments that are used to support analysis, design, construction and test of other software applications. Examples: Integrated collection of tools for most development phases of the life cycle, Rational development environment.

Scientific Systems: Software that involves significant computational and scientific analysis. It uses algorithmic, numerical or statistical analysis to process data to produce information. Examples: Seismic survey analysis, Experiments run on supercomputers to unravel DNA.

Simulation and Modeling: Software used to evaluate scenarios and assess empirical relationships that exist between models of physical processes, complex systems or other phenomena. The software typically involves running models using a simulated clock in order to mimic real world events. Examples: Computer-in-the-loop, Guidance simulations, Environment simulations, Orbital simulations, Signal generators.

Test and Evaluation: Software used to support test and evaluation functions. This software automates the execution of test procedures and records results. Examples: Test suite execution software, Test results recording.

Training: Software used to support the education and training of system users. This software could be hosted on the operational or a dedicated training system. Examples: On-line courses, Computer based training, Computer aided instruction, Courseware, Tutorials.

Command and Control: Software that enables decision makers to manage dynamic situations and respond in real time. Software provides timely and accurate information for use in planning, directing, coordinating and controlling resources during operations. Software is highly interactive with a high degree of multi-tasking. Examples: Satellite Ground Station, Tactical Command Center, Battlefield Command Centers, Telephone network control systems, Disaster response systems, Utility power control systems, Air Traffic Control systems.

Mission Management: Software that enables and assists the operator in performing mission management activities, including scheduling activities based on vehicle, operational and environmental priorities. Examples: Operational Flight Program, Mission Computer, Flight Control Software.

Weapons Delivery and Control: Software used to select, target, and guide weapons. Software is typically complex because it involves sophisticated algorithms, failsafe functions and must operate in real-time. Examples: Target location, Payload control, Guidance control, Ballistic computations.

Communications: Software that controls the transmission and receipt of voice, data, digital and video information. The software operates in real-time or in pseudo real-time in noisy environments. Examples: Radios, Microwave controller, Large telephone switching systems, Network management.

Controls and Displays: Software that provides the interface between the user and system. This software is highly interactive with the user, e.g. screens, voice, keyboard, pointing devices, biometric devices. Examples: Heads Up Displays, Tactical 3D displays.

Infrastructure or Middleware: Software that provides a set of service interfaces for a software application to use for control, communication, event handling, interrupt handling, scheduling, security, and data storage and retrieval. This software typically interfaces to the hardware and other software applications that provide services. Examples: Systems that provide essential services across a bus, Delivery systems for service-oriented architectures, Middleware systems, Tailored operating systems and their environments.

Executive: Software used to control the hardware and operating environment and to serve as a platform to execute other applications. Executive software is typically developed to control specialized platforms where there are hard run-time requirements. Examples: Real-time operating systems, Closed-loop control systems.

Information Assurance: Software that protects other software applications from threats such as unauthorized access, viruses, worms, denial of service, and corruption of data. Includes sneak circuit analysis software. A sneak circuit is an unexpected path or logic flow within a system that, under certain conditions, can initiate an undesired function or inhibit a desired function. Examples: Intrusion prevention devices.

Maintenance and Diagnostics: Software used to perform maintenance functions including detection and diagnosis of problems. Used to pinpoint problems, isolate faults and report problems. It may use rules or patterns to pinpoint solutions to problems. Examples: Built-in-test, Auto repair and diagnostic systems.

Mission Planning: Software used for scenario generation, feasibility analysis, route planning, and image/map manipulation. This software considers the many alternatives that go into making a plan and captures the many options that lead to mission success. Examples: Route planning software, Tasking order software.

Process Control: Software that provides closed-loop feedback controls for systems that run in real-time. This software uses sophisticated algorithms and control logic. Examples: Power plant control, Oil refinery control, Petro-chemical control, Closed loop control-systems.

Sensor Control and Processing: Software used to control and manage sensor transmitting and receiving devices. This software enhances, transforms, filters, converts or compresses sensor data, typically in real-time. This software uses a variety of algorithms to filter noise, process data concurrently in real-time and discriminate between targets. Examples: Image processing software, Radar systems, Sonar systems, Electronic Warfare systems.

Spacecraft Bus: Spacecraft vehicle control software used to control and manage a spacecraft body. This software provides guidance, attitude and articulation control of the vehicle. Examples: Earth orbiting satellites, Deep space exploratory vehicles.

Spacecraft Payload: Spacecraft payload management software used to manage and control payload functions such as experiments, sensors or deployment of onboard devices. Examples: Sensors on earth orbiting satellites, Equipment on deep space exploratory vehicles.

Productivity Types:

Sensor Control and Signal Processing (SCP): Software that requires timing-dependent device coding to enhance, transform, filter, convert, or compress data signals. Ex.: Beam steering controller, sensor receiver/transmitter control, sensor signal processing, sensor receiver/transmitter test. Ex. of sensors: antennas, lasers, radar, sonar, acoustic, electromagnetic.

Vehicle Control (VC): Hardware & software necessary for the control of vehicle primary and secondary mechanical devices and surfaces. Ex: Digital Flight Control, Operational Flight Programs, Fly-By-Wire Flight Control System, Flight Software, Executive.

Real Time Embedded (RTE): Real-time data processing unit responsible for directing and processing sensor input/output. Ex: Devices such as Radio, Navigation, Guidance, Identification, Communication, Controls and Displays, Data Links, Safety, Target Data Extractor, Digital Measurement Receiver, Sensor Analysis, Flight Termination, Surveillance, Electronic Countermeasures, Terrain Awareness and Warning, Telemetry, Remote Control.

Vehicle Payload (VP): Hardware & software which controls and monitors vehicle payloads and provides communications to other vehicle subsystems and payloads. Ex: Weapons delivery and control, Fire Control, Airborne Electronic Attack subsystem controller, Stores and Self-Defense program, Mine Warfare Mission Package.

Mission Processing (MP): Vehicle onboard master data processing unit(s) responsible for coordinating and directing the major mission systems. Ex.: Mission Computer Processing, Avionics, Data Formatting, Air Vehicle Software, Launcher Software, Tactical Data Systems, Data Control and Distribution, Mission Processing, Emergency Systems, Launch and Recovery System, Environmental Control System, Anchoring, Mooring and Towing.

Command & Control (C&C): Complex of hardware and software components that allow humans to manage a dynamic situation and respond to user input in real time. Ex: Battle Management, Mission Control.

System Software (SYS): Layers of software that sit between the computing platform and applications. Ex: Health Management, Link 16, Information Assurance, Framework, Operating System Augmentation, Middleware, Operating Systems.

Telecommunications (TEL): Transmission and receipt of voice, digital, and video data on different mediums & across complex networks. Ex: Network Operations, Communication Transport.

Process Control (PC): Software that controls an automated system, generally sensor driven. Ex:

Scientific Systems (SCI): Non real time software that involves significant computations and scientific analysis. Ex: Environment Simulations, Offline Data Analysis, Vehicle Control Simulators.

Training (TRN): Hardware and software that are used for educational and training purposes. Ex: Onboard or Deliverable Training Equipment & Software, Computer-Based Training.

Test Software (TST): Hardware & software necessary to operate and maintain systems and subsystems which are not consumed during the testing phase and are not allocated to a specific phase of testing. Ex: Onboard or Deliverable Test Equipment & Software.

Software Tools (TUL): Software that is used for analysis, design, construction, or testing of computer programs. Ex: Integrated collection of tools for most development phases of the life cycle, e.g. Rational development environment.

Business Systems (BIS): Software that automates a common business function. Ex: Database, Data Distribution, Information Processing, Internet, Entertainment, Enterprise Services, Enterprise Information.

Operating Environments:

Fixed Ground: Manned and unmanned fixed, stationary land sites (buildings) with access to external power sources, backup power sources, physical access to systems, regular upgrades and maintenance to hardware and software, and support for multiple users. Examples: Computing facilities, Command and Control centers, Tactical Information centers, Communication centers.

Mobile Ground: Mobile platform that moves across the ground. Limited power sources. Computing resources limited by the platform's weight and volume constraints. Upgrades to hardware and software occur during maintenance periods. Computing system components are physically accessible. Examples: Tanks, Artillery systems, Mobile command vehicles, Reconnaissance vehicles, Robots.

Shipboard: Mobile platform that moves across or under water. Examples: Aircraft carriers, Cruisers, Destroyers, Supply ships, Submarines.

Avionics: Manned airborne platforms. Software that is complex and runs in real-time in embedded computer systems. It must often operate under interrupt control to process timelines in the nanoseconds. Examples: Fixed-wing aircraft, Helicopters.

Unmanned Airborne: Unmanned airborne platforms. Man-in-the-loop control. Examples: Remotely piloted air vehicles.

Missile: Very high-speed airborne platform with tight weight and volume restrictions. Examples: Air-to-air missiles, Strategic missiles.

Manned Space: Space vehicle used to carry or transport passengers. Severe power, weight and volume restrictions. Examples: Space shuttle, Space passenger vehicle, Manned space stations.

Unmanned Space: Space vehicle used to carry payloads into space. Severe power, weight and volume restrictions. Software in this environment is complex and real-time. Software is subject to severe resource constraints because its platform may have memory and speed limitations due to weight restrictions and radiation. Examples: Orbiting satellites (weather, communications), Exploratory space vehicles.
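To make the role of these three breakdowns concrete, the sketch below shows one hypothetical way a project record could carry the classifications defined in this appendix. The record fields and the sample values are invented for illustration; they are not fields from the dissertation's data set or the SRDR forms.

from dataclasses import dataclass

@dataclass
class ProjectRecord:
    """A project tagged with the three Appendix A classifications (illustrative only)."""
    name: str
    application_domain: str     # e.g. "Command and Control"
    productivity_type: str      # e.g. "Mission Processing (MP)"
    operating_environment: str  # e.g. "Avionics"

record = ProjectRecord(
    name="Hypothetical sample project",
    application_domain="Command and Control",
    productivity_type="Mission Processing (MP)",
    operating_environment="Avionics",
)
print(record)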

APPENDIX B: MATRIX FACTORIZATION SOURCE CODE

Matlab Program Source Code:

Main routine:

Matrix_fact function:
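The Matlab listings themselves appear only as screenshots in the original appendix and are not reproduced in this transcription. As a stand-in, the following is a minimal sketch of non-negative matrix factorization using the multiplicative-update rules of Lee and Seung [Lee, 2001], which the dissertation cites; it is written in Python (cf. [Au Yeung]) and is only an illustration of the technique. The function name echoes the caption above, but the real Main routine and Matrix_fact function, and their signatures, are not shown here, so this should not be read as the author's actual code.

import numpy as np

def matrix_fact(V, rank, iters=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (m x n) into W (m x rank) and H (rank x n)
    using the multiplicative update rules of Lee & Seung [Lee, 2001]."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H with W held fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update W with H held fixed
    return W, H

# Example: approximate a small effort-distribution-like matrix with two latent factors.
V = np.array([[0.18, 0.32, 0.35, 0.15],
              [0.22, 0.28, 0.33, 0.17],
              [0.10, 0.30, 0.40, 0.20]])
W, H = matrix_fact(V, rank=2)
print(np.round(W @ H, 2))  # reconstruction of V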

APPENDIX C: COCOMO II DOMAIN-BASED EXTENSION TOOL AND EXAMPLES

Program Screen Shots:

Opening Screen:

Start a project:

Project Scale Drivers:

Module Detail:

Module Size:

Module EAF:

Project Estimate:

Effort Distribution:
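The Project Estimate and Effort Distribution screens appear only as screenshots in the original. As a rough sketch of what the Effort Distribution step computes, the snippet below apportions a COCOMO II total-effort estimate across the four activity groups using a domain-specific percentage profile. The profile values are made-up placeholders, not the calibrated distributions from Chapter 4, and the lookup structure is an assumption about how such a tool might be organized, not the actual extension tool's implementation.

# Hypothetical domain profiles: share of total effort per activity group.
# Placeholder numbers only; not the dissertation's calibrated distributions.
DOMAIN_PROFILES = {
    "Mission Processing": {
        "Requirements & Planning": 0.12,
        "Architecture & Design": 0.24,
        "Code & Unit Testing": 0.38,
        "Integration & Qualification Testing": 0.26,
    },
}

def distribute_effort(total_pm, domain):
    """Split a COCOMO II total effort estimate (person-months) by the domain's profile."""
    profile = DOMAIN_PROFILES[domain]
    return {phase: round(total_pm * share, 1) for phase, share in profile.items()}

print(distribute_effort(total_pm=250.0, domain="Mission Processing"))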

Sample Project Results for the COCOMO II Waterfall and Domain-based Effort Distribution Model Comparison:

Project 49:

Project 51:

Project 62:

APPENDIX D: DCARC SAMPLE DATA REPORT

SRDR form DD

