Should NASA Embrace Agile Processes?

Should NASA Embrace Agile Processes? Jefferey Smith, Tim Menzies Lane Department of Computer Science West Virginia University PO Box 69, Morgantown WV, 656-69, USA; jefferey@jeffereysmith.com,tim@menzies.com Abstract As a growing number of NASA software projects tend to stray from the conventional waterfall method of software development, alternate development paradigms need to be explored and assessed. This paper examines one such paradigm, Agile Processes; and more specifically, the economics of one its key practices, pair programming. It will be shown that this core practice is only demonstrably useful in a very small number of cases. Introduction A growing trend among NASA software projects is the tendency for software development to move away from the traditional software development paradigms. Previously, NASA software has been developed by large software contractor organizations with mature and traditional process models. While this continues to be NASA s main mode of software development, there are an increasing number of cases where an alternative approach is taken. For example, the telescope controller for a current near-earth satellite mission was developed by a university-based consortium that used numerous heritage products and a loose consortium of sub-contractors. This team produced useful code, but not a fully developed traditional requirements document. The economic motivation for such non-traditional development is clear: more science can be conducted for less dollars. However, the safety implications such as the potential for loss-of-mission is unclear. In situations such as these, portions of code are being developed before the requirements are written. This then leads to requirements being written based upon the code; instead of the other way Submitted to the 7th Annual IEEE/NASA Software Engineering Workshop, Greenbelt, MD, USA, December 5-6,, Greenbelt Marriott. http://sel.gsfc.nasa.gov/website/7ieee.htm. around. These projects are clearly no longer following the traditional waterfall model of software development: Requirements -> Design -> Coding -> Testing -> Maintenance In this light, alternate software development paradigms should be explored as possible replacements for the waterfall method. One such alternate paradigm is Agile Process (AP) development. AP is iterative, incremental, self-organizing, and emergence [5]. According to the Agile Manifesto, AP is based on the idea of uncovering better ways of developing software by doing it and helping others do it [3]. In his article, Get Ready for Agile Methods, With Care, Barry Boehm states that the primary difference between AP and conventional methods is that [AP] methods derive much of their agility by relying on the tacit knowledge embodied in the team, rather than writing the knowledge down in plans []. He goes on to discuss that while AP methods risk making mistakes that could have been found by planning practices used by conventional methods, conventional methods risk that rapid change will make the plans obsolete or very expensive to keep up to date []. While AP may not be appropriate in every situation, there appears to exist cases in which AP may be advantageous to conventional Before beginning, it is important to stress the limits of our study. This work will comment specifically on an agile process practice called pair programming. We will study pair programming because it is a core part of an agile process called extreme programming, which is the most widely cited agile process in the literature. It will be shown that this core practice is only demonstrably useful in a very small number of cases. However, if another agile process

did not use pair programming, then the negative conclusions of this paper would not apply to that agile process. Agile Processes / Extreme Programming AP is an approach that has gained popularity in recent times. AP is a collection of software design practices and techniques that diverge from the heavily structured methodologies in favor of a less structured, more adaptive approach. According to its manifesto [3], AP values... Individuals and interactions over processes and tools; Working software over comprehensive documentation; Customer collaboration over contract negotiation; Responding to change over following a plan. In addition to these values, AP is based upon a set of twelve principals:. The highest priority is to satisfy the customer through early and continuous delivery of valuable software;. Welcome changing requirements, even late in development. Agile processes harness change for the customer s competitive advantage; 3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale; 4. Business people and developers must work together daily throughout the project; 5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done; 6. The most efficient and effective method of conveying information to and within a development team is faceto-face conversation; 7. Working software is the primary measure of progress; 8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely; 9. Continuous attention to technical excellence and good design enhances agility;. Simplicity the art of maximizing the amount of work not done is essential;. The best architectures, requirements, and designs emerge from self-organizing teams;. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly. percent of projects 4 35 3 5 5 5 insig marginal significant grave Figure. Ratio of different project error severities seen in 9 NASA IV&V projects 996-. There are many positive aspects of AP which might be attractive to the NASA organization: Continual interaction with the customer helps the project to become more tailored to the customer s needs Frequent releases allow the customer to see progress and working code more often Pair programming and continual testing leads to fewer errors Despite these benefits, however, there are some aspects of AP that render it inappropriate for certain types of projects. The best example of this is its approach towards documentation. While some documentation is produced, AP does not include the full documentation provided by other This is perfectly acceptable in many cases, but in the case of safety critical systems, this full documentation, like that which is produced as part of the waterfall method, is required. This doesn t meant that AP should be abandoned. It simply means that AP is not an acceptable choice in certain situations. Figure shows that out of 9 NASA IV&V projects, more than one third of these projects were not of high criticality (grave or significant). For projects such as these, AP should still be explored for possible use. A critique of AP is complicated by the broad and diverse nature of the AP movement. However, one of the most popular AP approaches, Kent Beck s extreme programming, has been thoroughly specified; and therefore, it is easier to study []. According to Beck, extreme programming has the twelve key practices shown in Figure. The precise definition of all these practices are beyond scope of this paper. However, note the key role of one of the practices: pair programming (PP). PP is the concept of two developers working together at a single machine, designing and writing code cooperatively. Figure shows the connection between XP

pair programming 4 hour week metaphor refractoring simple design coding standards testing collective ownership short release continuous nintergration on-site customer planning game Figure. The connection between extreme programming practices. From []. Practices directly effected by PP are shaded. programming practices; the shaded practices have a direct connection to PP. The centrality of PP to XP is evident from this diagram. If PP is removed, it would directly affect all of the shaded practices. AP proponents claim that by working in pairs, developers produce code at a faster rate, and with fewer errors. But the idea of developer pairs leads to two scenarios, each with their own issues: Pool: If additional developers are added to a project from a pool of developers to create the developer pairs, then does the time/error advantage outweigh the additional developer costs? NoPool: If additional developers are not added, and instead, the current developers are divided into groups of two, then does the time/error advantage outweigh the additional time needed to write code that results from the fewer number of tasks that can be worked on at one time? To answer these questions, the economic effects of PP on a software project need to be examined, and compared to the outcome of conventional 3. Equations In their paper, Müller and Padberg present an economic model of software projects for both PP and conventional Their model is based on the Net Present Value (NP V ) of the software project, which is calculated and compared for both programming The NP V of the projects is found using Equation. NP V = AssetV alue DevT ime ( + DiscountRate) The AssetV alue and DiscountRate are the same for both projects. The DiscountRate is used to simulate time to market: a lower discount rate represents a longer time to market, and a higher discount rate represents a faster time to market. In NASA s case, a fast time to market might be the equivalent to a rapidly approaching launch date. The equation for the development time, DevT ime, differs for the two development Since the PP method results in fewer errors, an allowance must be made in the development time for the conventional method to allow for the elimination of an equivalent number of errors. This additional time for quality assurance is the QAT ime. The DevT ime equation for the conventional method is: () 3 Müller/Padberg Study DevT ime = QAT ime + () P roductsize DeveloperP roductivity NumberOfDevelopers A study of this nature has been conducted previously. In their paper, Extreme Programming from an Engineering Economics Viewpoint, Müller and Padberg compare the economics of PP with those of conventional methods [4]. This section examines their work. The next section discusses our concerns with their methods, and the procedure we followed as a result. And the DevT ime equation for the PP method is: DevT ime = where DeveloperP roductivity NumberOfP airs P roductsize % + P airspeedadvantage (3) 3

Parameter Developer Productivity 35 LOC/month Developer Monthly Hours 35 hours/month Defects per KLOC Defects Eliminated by Conventional Processes 7% Product Size 6,8 LOC Developer Salary $5, Project Lead Salary $6, Asset Value $,, Parameter PairSpeedAdvantage % % 3% 4% 5% PairDefectAdvantage 5% % 5% % 5% 3% DefectRemovalTime 5h h 5h DiscountRate % 5% 5% 75% % Table. Parameters systematically varied by Müller and Padberg Table. Fixed parameters used by Müller and Padberg NumberOfP airs = NumberOfDevelopers The QAT ime require three calculations. First, the number of defects left in a typical project must be calculated using Equation 5. DP KLOC DefectsLeft = P roductsize ( DNE) (5) DP KLOC is the number of defects per thousand lines of code, and DNE is the percent of defects not eliminated by conventional techniques ( - Defects Eliminated by Conventional Processes). Once Def ectslef t has been calculated, the next step is to find the DefectDifference. This is simply the Def ectslef t multiplied by the P airdef ectadvantage, which is the percent of defects PP eliminates that conventional programming does not. DefectDifference = DefectsLeft P airdefectadvantage (6) The final step is to calculate the time required for conventional developers to remove these extra defects. This is known as the QAT ime, and is found using Equation 7. DefectRemovalT ime QAT ime = DefectDifference DeveloperMonthlyHours NumberOfDevelopers The final calculation that is required for the project model, which is the same for both the conventional and the PP method, is the development cost, DevCost. DevCost is found using Equation 8. DevCost = DevT ime (NumberOfDevelopers DeveloperSalary +P rojectleadsalary) (8) (4) (7) 3. Method (Müller/Padberg) For their study, Müller and Padberg used a set of fixed parameters, see Table, and four parameters, that they considered to be key features, for which they systematically varied their values (Table ). They used these values to calculate the NP V for the conventional method and the PP method, and they did so for both in two situations: NoPool: The number of developers was fixed. In this situation, when the conventional method is using n developers, the PP method is using n pairs. Pool: The number of developers is not fixed, as if there existed pool of developers to create pairs with. In this situation, when the conventional method is using n developers, the PP method is using n pairs, or n developers. 3.3 Results (Müller/Padberg) First, Müller and Padberg found that PP is advantageous when the number of pairs is equivalent to the number of developers, as described in situation above. In other words, n developers are more efficient than n/ pairs. Therefore, as long as there exists additional tasks that can be assigned to individual developers, PP is not the best solution. However, if the project cannot be subdivided beyond n tasks, and you have access to n developers, then PP has the potential to be beneficial in certain cases. Müller and Padberg found that PP is advantageous in this situation if three criteria are met:. the project is of small to medium size (P roductsize is not large). the project is of high quality (AssetV alue is high) 3. the need for a rapid time to market is present (DiscountRate is high) 4

a) b) c) (Pair/conventional) cost (Pair/conventional) cost (Pair/conventional) cost.5.5 79 baseline simulation 5.5.5 y values, sorted With max. developer productivity 986 5.5.5 y values, sorted With max. pair speed & defect advantage 368 5 y values, sorted Figure 3. Raw data plots of a) completely random cases, b) cases with DeveloperP roductivity set to maximum (T), and c) cases with P airspeedadvantage and P airdefectadvantage set to maximum (T). The vertical line indicates the point where PP is no longer advantageous 4 Smith/Menzies Study After examining the work done by Müller and Padberg, we felt that a wider range of attributes could be explored. By only varying four parameters, a large number of the model s attributes were not being examined as possible factors in determining the advantages of PP. 4. Method (Smith/Menzies) To correct this, the Müller and Padberg model was reimplemented, and random values within appropriate ranges (see Table 3) were generated for a larger set of attributes. Parameter Min Max PairSpeedAdvantage % 5% PairDefectAdvantage 5% 3% DefectRemovalTime (hours) 5 5 DiscountRate % % ProductSize (LOC) 5 DeveloperSalary ($) 45 65 ProjectLeaderSalary ($) 6 9 AssetValue ($) DeveloperProductivity (LOC/month) 5 NumberofDevelopers 4 DefectsPerKLOC 4 DefectsNotEliminated % 8% Table 3. Parameters and ranges used by Smith and Menzies These attributes were then input into the model and, using Equation 9, the total cost was found for both AP and conventional development T otalcost = [( DevT ime ( + DiscountRate) ) AssetV alue] (9) + DevCost Equation 9 is a modified version of Equation. This modified equation was used so that the resulting values would always be positive. This then allowed us to use the following ratio without worrying about the effects of negative numbers: T otalcostofp P T otalcostofconventional This ratio was our means of comparison between the two The random generation and calculations were repeated for, cases, the results of each being recorded. These, cases were then used with a treatment learner to find the attributes that led to an advantage for PP. Note that this approach generates very large data files. The TAR treatment learner is a tool for performing automatic sensitivity analysis of large logs looking for constraints to parameter ranges that can most improve or degrade the performance of a system [6]. TAR performs an exhaustive search over the data, and is capable of discovering treatments not easily found by hand. In this study, improve or degrade was measured according to Equation ; i.e. the impact on the ratio of PP s cost to the cost of conventional The treatment learner was applied to the test cases in two separate studies: () S: no possible treatments were ignored. That is, no attributes were ignored as possible factors that could be changed to influence the performance of PP. 5

S: select attributes that the authors considered to be unchangeable were ignored as possible treatments. These attributes were P airspeedadvantage, P airdef ectadvantage, and Def ectremovalt ime. Also, each of these studies were examined in the two situations, NoPool and Pool, described in section 3., for a total set of four studies: S/Pool, S/NoPool, S/Pool, and S/NoPool In addition to these completely random tests, two additional tests were conducted with specific values set for various attributes. These were done to examine the effects of specific attributes about which the authors wanted to know more. These tests were as follows: T: DeveloperP roductivity was set to the maximum value, 5 LOC/month, to see the distribution when developers are being the most productive. T: P airspeedadvantage and P airdef ectadvantage were set to their maximum values, 5% and 3% respectively, to see the distribution when pairs are operating at their greatest rate of advantage over individuals. 4. Results (Smith/Menzies) 4.. Summary (T and T) A summary of our results is shown in Figure 3. In those plots, pair programming is at an advantage over conventional approaches when P P Conventional >. Plot a in Figure 3 shows the output of our model from, runs across the distributions shown in Table 3. Note that in only 8% of the runs is there an advantage to PP. Plot b in Figure 3 shows the output from T. When DeveloperP roductivity was set to the maximum value, PP was advantageous in approximately % of the cases. Plot c in Figure 3 shows the output from T. From this plot it can be seen that even when the two attributes that have the most effect on the advantage of PP over conventional methods are at their highest values, still only about 37% of the cases resulted in an advantage for PP. This is the greatest advantage we found, even after using TAR to exhaustively search all the attribute ranges. 4.. Details In both S/Pool and S/NoPool it was found that increasing P airspeedadvantage and P airdef ectadvantage was the way to most improve PP s advantage over conventional This is not surprising since P airspeedadvantage and P airdef ectadvantage represent advantages that PP has over conventional Another attribute property that showed some influence in increasing the advantage of PP in S/NoPool was a high Def ectremovalt ime. This too makes sense because conventional methods are prone to more defects, and therefore, a high Def ectremovalt ime would result in conventional developers working significantly more than developer pairs. Considering that it may not be possible to simply change P airspeedadvantage, P airdef ectadvantage, and Def ectremovalt ime as needed, studies S/Pool and S/NoPool ignored these attributes as possible treatments, and looked for additional features that could lead to an advantage for PP. In S/Pool it was found that there were three things things that significantly contributed to an advantage of PP over conventional methods:. A high DiscountRate ( >.4). A small P roductsize (, to 6, LOC) 3. A high AssetV alue ( $.6M to $.M) These are the same three factors that were found by Müller and Padberg when a pool of developers was present. In S/NoPool, no significant treatments were found. Therefore, when a pool of developers is not present, or when additional tasks are present as described by Müller and Padberg in section 3.3, PP does not have an advantage over conventional 5 Conclusions We offer two conclusions. Software development approaches such as AP/XP that do not precisely pre-specify the requirements are not indicated for NASA projects with significant or grave risks. Based on the IV&V experience 996- (Figure ), this excludes around 64% of NASA software projects from using AP/XP. For the remaining projects, our model has only endorsed AP/XP over conventional approaches in a relatively small and specialized set of cases; i.e. when: the project is relatively small an abundance of developers exists a rapid development time is needed This is not to say that NASA should avoid AP/XP. However, this study concludes that a convincing case for AP/XP cannot be based on the factors modelled in this paper; i.e. increased productivity of pairs over conventional approaches productivity vs. cost lower error rates 6

Proponents of AP/XP must therefore base their case on other factors not modelled in this paper, such as (e.g.) increased performance in rapid changing environments, or decreased cost due to conventional requirements reworking to accommodate changes. Finally, it is important to reiterate the limitations of our study. We focus primarily on the AP practice of pair programming. If an AP method does not include pair programming as part of its process, then the negative results of our study are not applicable. Acknowledgements This research was conducted at West Virginia University under NASA contract NCC-979. The work was sponsored by the NASA Office of Safety and Mission Assurance under the Software Assurance Research Program led by the NASA IV&V Facility. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government. References [] K. Beck. Extreme Programming Explained: Embrace Change. Addison Wesley,. [] B. Boehm. Get ready for agile methods, with care. Computer, 35():95 3, January. [3] M. B. et. al. Manifesto for agile software development,. Available from http://www.agilemanifesto.org/. [4] M. M. Mller and F. Padberg. Extreme programming from an engineering economics viewpoint. In Proceedings of the Fourth International Workshop on Economics-Driven Software Engineering Research (EDSER),. [5] K. Schwaber,. Quote from the First eworkshop on Agile Methods. Available from http://fc-md.umd.edu/ projects/agile/summary/summary.htm. [6] T.Menzies and Y. Hu. The tar treatment learner,. Available from http://www.ece.ubc.ca/twiki/ pub/softeng/treatmentlearner/intro.pdf. 7