Systems engineering return on investment


Systems Engineering Return on Investment

by Eric C. Honour, BSSE, MSEE

A thesis submitted for the degree of Doctor of Philosophy
Defence and Systems Institute
School of Electrical and Information Engineering
January 2013


Contents

1 Introduction 1
1.1 Background 1
1.2 Goals 3
1.3 Scope 3
1.4 Research methods 4
1.5 Primary results 5
1.6 Thesis organization 6
2 Review of related research 8
2.1 Background information 8
2.2 Formative theory
2.3 Statistical research related to SE-ROI 16
2.4 Summary and findings of prior results
3 Research design 26
3.1 Research questions 26
3.2 Research activities 27
3.3 Guidance and participation 31
3.4 Ethics consideration
3.5 Observations and findings
4 Data gathering 35
4.1 Data to be gathered 36
4.2 Interview design 49
4.3 Approach to obtain data 52
4.4 Data protection 54
4.5 Interview methods 56
4.6 Demographics 58
4.7 Observations and findings 70
5 Statistical results 72
5.1 Statistical methods 73
5.2 Research question A: Correlate SE with program success 108
5.3 Research question B: Optimum SE can be predicted 131
5.4 Observations and findings 137
6 Discussion of results 140
6.1 Major findings: correlate SE with program success 140
6.2 Major findings: optimum SE can be predicted 162
6.3 Examples of use 166
6.4 Limitations on the results 169
6.5 Observations and findings 174
7 Conclusions and recommendations 177
7.1 Major findings 177
7.2 Future research indications 184
7.3 Summary 187
Appendix A Bibliography 190
Appendix B Interview instruments 195
Appendix C Developmental papers 213
C.1 Advancing an ontology 213
C.2 Practical program of research 226
C.3 Gathering data 237
C.4 Design of experiments 247
C.5 Demographics 257
C.6 Effective characterization parameters 274
C.7 Systems engineering return on investment 288
C.8 Improved correlation 307
C.9 Sizing systems engineering activities 316

List of Figures

Figure 1. Intuitive value of SE
Figure 2. Impact of front end project definition effort (from Gruhl 1992) 9
Figure 3. Definition effort is not the same as systems engineering effort 10
Figure 4. Shorter schedule for more complex UHF using better SE (from Frantz 1995) 10
Figure 5. Many engineering projects fail to meet objectives (from Miller 2000) 11
Figure 6. Implementation of SE processes resulted in statistically significant cost decrease (from Barker 2003) 13
Figure 7. Percent of total project cost spent on SE (from Kludze 2004) 14
Figures 8-9. [titles partially lost] value (from Honour 2002b) 16
Figure 10. Histogram of submissions by SE Effort / project cost (from Honour 2004) 17
Figure 11. Cost performance as a function of SE effort (from Honour 2004) 18
Figure 12. Schedule performance as a function of SE effort (from Honour 2004) 18
Figure 13. Subjective success as a function of SE effort (from Honour 2004) 18
Figure 14. SE effort required in COCOMO II for different size systems (from Boehm 2007) 20
Figure 15. Correlation of SE capability to program performance (from Elm 2008)
Figure 16. SE capabilities correlate with program performance (from Elm 2008)
Figure 17. SE capability effect for high challenge programs (from Elm 2008)
Figure 18. Seeking optimum level of SE effort within programs (from Honour 2002)
Figure 19. Cost overrun vs. systems engineering effort 61
Figure 20. Schedule overrun vs. systems engineering effort 61
Figure 21. Histograms of program "size" parameter 63
Figure 23. Problem difficulty parameter 65
Figure 26. Effective SE effort as a percent of actual program cost 68
Figure 27. SE activities effort as a percent of actual program cost 69
Figure 28. Effective SE activities effort as a percent of actual program cost 69
Figure 29. Possible nonlinear mapping of reported quality 77
Figure 31. Correlation charts of success measures without application of program characteristics 91
Figure 32. Weighting search example (original values) 95
Figure 34. Weighting search example (final values) 96
Figure 35. Adjustments due to quantitative factor 99
Figure 36. Adjustments due to subjective factor 99
Figure 37. Correlation plot: cost against SE 110
Figure 38. Correlation plot: schedule against SE 110
Figure 39. Correlation plot: overall success against SE 111
Figure 40. Correlation plot: technical quality against SE 111
Figure 41. Selecting optimum SE effort using ROI 113
Figure 42. Correlation plot: cost against MD 114
Figure 43. Correlation plot: schedule against MD 115
Figure 44. Correlation plot: overall success against MD 115
Figure 45. Correlation plot: technical quality against MD 115
Figure 46. Correlation plot: cost against RE 116
Figure 47. Correlation plot: schedule against RE 117
Figure 48. Correlation plot: overall success against RE 117
Figure 49. Correlation plot: technical quality against RE 117
Figure 50. Correlation plot: cost against SA 118
Figure 51. Correlation plot: schedule against SA 119
Figure 52. Correlation plot: overall success against SA 119
Figure 53. Correlation plot: technical quality against SA 119
Figure 54. Correlation plot: cost against SI 120
Figure 55. Correlation plot: schedule against SI 121
Figure 56. Correlation plot: overall success against SI 121
Figure 57. Correlation plot: technical quality against SI 121
Figure 58. Correlation plot: cost against VV 122
Figure 59. Correlation plot: schedule against VV 123
Figure 60. Correlation plot: overall success against VV 123
Figure 61. Correlation plot: technical quality against VV 123
Figure 62. Correlation plot: cost against TA 124
Figure 63. Correlation plot: schedule against TA 125
Figure 64. Correlation plot: overall success against TA 125
Figure 65. Correlation plot: technical quality against TA 125
Figure 66. Correlation plot: cost against SM 126
Figure 67. Correlation plot: schedule against SM 127
Figure 68. Correlation plot: overall success against SM 127
Figure 69. Correlation plot: technical quality against SM 127
Figure 70. Correlation plot: cost against TM 128
Figure 71. Correlation plot: schedule against TM 129
Figure 72. Correlation plot: overall success against TM 129
Figure 73. Correlation plot: technical quality against TM 129
Figure 74. Comparison of optimum levels with observed levels 134
Figure 75. Transformations due to program characteristics 135
Figure 76. Relative effect of each factor on correlation 143
Figure 77. Space system program characterization 166
Figure 78. Airborne training system program characterization 168


Glossary

ASEE: Adjusted systems engineering effort
AXXE: Adjusted XX effort (adjusted from XXE to correct for missing early-phase activities; see Section 5.1.5)
C: Cost compliance (also used as subscript)
C_A: Actual cost at completion
C_P: Planned cost
C_XX: Cost of the effort expended in SE activity XX
Ĉ: Cost compliance, predicted trend for all programs
Ĉ_G: Cost compliance, predicted trend for a given set of program characteristics
CMMi: Capability maturity model integrated
COCOMO: Constructive cost model
COSYSMO: Constructive systems engineering cost model
EIA: Electronics Industries Association
ESEE: Effective systems engineering effort
EXXE: Effective XX effort (corrected from AXXE to remove the specific characteristics of the program; see Section 5.1.9)
G_XX: Correction factor applied to XX effort for a given set of program characteristics
H_X: Hypothesis (used for several, with different subscript designations)
H_X0: Null hypothesis (used for several, with different subscript designations)
INCOSE: International Council on Systems Engineering
K, k: Value for a KPP
KPP: Key performance parameter
LEP: Large engineering projects
MD: Mission/purpose definition
MDE: Mission/purpose definition effort
MIT: Massachusetts Institute of Technology
N: Sample size
NDIA: National Defense Industry Association
O: Overall success (when used as subscript)
OS: Overall success
OXXE_0: Optimum XX effort for the average program (predicted from data trends)
OXXE_G: Optimum XX effort (predicted from data trends, corrected from OXXE_0 based on the given program characteristics)
PC: Program challenge
PCA: Principal component analysis
PM: Program management
PP: Percentile point ranking of a program's characteristic against all programs
QF1-QF7: Principal components (factors) of quantitative program characterization parameters
R²: Statistical Pearson's correlation coefficient
RAG: Research advisory group
RE: Requirements engineering
REE: Requirements engineering effort
RESL: Architecture and risk resolution
RFP: Request for proposal
ROI: Return on investment
RQ_X: Research question (used for several, with different subscript designations)
S: Schedule compliance
s: Sample variation in the data obtained
S_A: Actual duration
S_P: Planned duration
SA: System architecting
SAE: System architecting effort
SE: Systems engineering
SEC: Systems engineering capability
SECOE: Systems Engineering Center of Excellence
SEE: Systems engineering effort
SEQ: Systems engineering quality
SE-ROI: Systems engineering return on investment
SE%: Raw cost ratio of effort expended for total SE activity against total program cost
SF1-SF7: Principal components (factors) of subjective program characterization
SI: System integration
SIE: System integration effort
SM: Scope management
SME: Scope management effort
T: Technical quality (when used as subscript)
t: Student's t distribution statistic
TA: Technical analysis
TAE: Technical analysis effort
TC: Technical committee
TM: Technical leadership/management
TME: Technical leadership/management effort
TQ: Technical quality
U: Utility value
UHF: Universal holding fixture
UniSA: University of South Australia
VV: Verification/validation
VVE: Verification/validation effort
Weight_j: Weighting factor used to correct SE activity efforts based on one characterization factor j
XX: General indicator for the SE activities MD, RE, SA, SI, VV, TA, SM, TM (also used as subscript)
XXE: Effort expended for SE activity XX, normalized from XX% for subjective quality of original effort (see Section 5.1.4)
XXQ: Subjective quality of the effort expended for SE activity XX
XX%: Raw cost ratio of effort expended for SE activity XX against total program cost
α: Significance level (in statistics); probability of rejecting the null hypothesis wrongly; Type-I error rate; false-alarm rate
β: Probability of accepting the null hypothesis wrongly; Type-II error rate; miss rate
ε: Acceptable variation in the calculated mean
ρ: Pearson's correlation coefficient

Summary

This Systems Engineering Return on Investment (SE-ROI) research project explored the quantifiable relationships between systems engineering (SE) activities and program success. The work discovered statistically significant relationships between SE activities and three success measures: cost compliance, schedule compliance, and stakeholder overall success. SE-ROI was found to be as high as 7:1 for programs with little SE effort and 3.5:1 for median programs. Optimum SE effort for median programs is 14.4% of total program cost; the work provides an a priori estimation method to determine this optimum for specific programs based on 14 characterization parameters. These findings address a significant state-of-the-art gap in that SE effort levels have typically been based on subjective heuristics rather than quantified success parameters.

The research developed an interview methodology and interview instruments to obtain a rich set of data from completed programs. The research was supported by a Research Advisory Group of over 60 international members who evaluated the research plans, methods, and instruments during development, ensuring a robust research approach. Program interviews were performed on 51 completed programs in 16 organizations. Interview participants were the Program Manager and Lead Systems Engineer of each program. Programs were from a wide variety of both contracted and amortized development domains; had a wide range of cost, schedule and success; and evidenced SE effort levels from near-nil to large.

Through the use of Principal Component Analysis and a hill-climbing search for best correlation, the program SE effort levels were adjusted for the specific program characteristics, increasing correlations from R² = 14% to as much as R² = 80%.
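To make the ROI figures above concrete: read as a marginal return, a 3.5:1 ratio means one extra unit spent on SE avoids 3.5 units of downstream program cost. A toy calculation follows; all numbers except the 3.5:1 ratio are illustrative assumptions, not thesis data.

```python
# Toy arithmetic for the marginal SE-ROI figure quoted above.
# Only the 3.5 ratio comes from the text; costs are hypothetical.
base_cost = 100.0   # total program cost before the change (arbitrary units)
extra_se = 1.0      # one additional unit spent on SE activities
roi = 3.5           # marginal return reported for median-SE programs
savings = extra_se * roi                   # downstream cost avoided
new_cost = base_cost + extra_se - savings  # net program cost after the change
print(new_cost)     # each extra SE unit yields a net reduction of 2.5 units
```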
The high degree of resultant correlation indicates that the appropriately weighted program characterization parameters largely remove the confounding factors that usually obscure the relationship between SE effort and program success. The resulting relationships show that all SE activities correlate significantly with cost compliance, nearly all SE activities correlate with schedule compliance, and most SE activities correlate with stakeholder overall success. There is some indication of causality in qualitative and theoretical factors. While not definitive, the implication is that the level of selected SE effort is causative of the program success. If true, then the

use of the SE effort estimation method herein would result in the best available program success.

Some additional findings are also presented. The data shows no significant correlation between SE effort levels and system technical quality. There is indication that this lack of correlation is due to program emphasis on requirement thresholds rather than on stakeholder-defined technical quality. Optimizing technical leadership/management levels is shown to provide a unique benefit in simultaneously associating with cost compliance, schedule compliance, and overall stakeholder success. The work also contributed to the discovery of a commonly held SE ontology that could be expressed in eight SE activities. The worth of this ontology was evident in its easy understanding by all interview participants.
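The adjustment procedure the Summary describes, weighting program characterization parameters and hill-climbing toward the best correlation, can be sketched as below. This is an illustrative reconstruction on synthetic data, not the thesis's actual algorithm: the multiplicative weight model, the step sizes, and the data are all assumptions.

```python
import numpy as np

def r_squared(x, y):
    # Square of Pearson's correlation coefficient between two vectors.
    r = np.corrcoef(x, y)[0, 1]
    return r * r

def hill_climb_weights(effort, success, chars, steps=300, step=0.1, seed=0):
    """Greedy hill-climbing: perturb one characterization weight at a time,
    keeping the change only if it raises R^2 between the adjusted SE effort
    and the success measure."""
    rng = np.random.default_rng(seed)
    weights = np.zeros(chars.shape[1])

    def adjusted(w):
        # Hypothetical multiplicative correction for program characteristics.
        return effort * np.exp(chars @ w)

    best = r_squared(adjusted(weights), success)
    for _ in range(steps):
        trial = weights.copy()
        trial[rng.integers(chars.shape[1])] += rng.choice([-step, step])
        score = r_squared(adjusted(trial), success)
        if score > best:
            best, weights = score, trial
    return weights, best

# Synthetic demonstration: success depends on effort scaled by one
# characterization factor, so adjusting for it should raise R^2.
rng = np.random.default_rng(1)
chars = rng.normal(size=(51, 3))            # 51 programs, 3 factors
effort = rng.uniform(5.0, 20.0, size=51)    # raw SE% per program
success = effort * np.exp(chars @ np.array([0.8, 0.0, 0.0]))
success += rng.normal(scale=2.0, size=51)   # measurement noise

raw_r2 = r_squared(effort, success)
w, adj_r2 = hill_climb_weights(effort, success, chars)
```

Because the search starts from zero weights (no adjustment) and keeps only improving moves, the adjusted correlation can never be worse than the raw one; on data with a real confounding factor it is typically far higher, mirroring the 14% to 80% improvement reported above.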

Declaration

This thesis presents work carried out by myself and does not incorporate without acknowledgment any material previously submitted for a degree or diploma in any university; to the best of my knowledge it does not contain any materials previously published or written by another person except where due reference is made in the text; and all substantive contributions by others to the work presented, including jointly authored publications, are clearly acknowledged.

Eric C. Honour
8 January 2013

Acknowledgements

This work would not have been possible without the participation, encouragement, and wisdom provided by many.

The University of South Australia (UniSA) has played a significant role in the completion of this work. While the original three-phase research plan was created in 1997, its progress was slow and difficult until UniSA graciously offered its sponsorship of Phase III as a doctoral candidacy. The official nature of the research from that point served two primary purposes: (a) it provided deadlines and impetus to progress, to act in opposition to the pull of daily business that often held the research back, and (b) it opened doors to other advisors and to programs to interview that were not open to an independent researcher. In particular, I wish to acknowledge Prof. Stephen Cook, who twisted my arm to formalize this relationship and who reluctantly ended up as primary supervisor; A/Prof. Joseph Kasser, primary supervisor during the formative stages, who helped to keep the scope within control; and A/Prof. Timothy Ferris, whose detailed reviews were always insightful and helpful. Other researchers and academics on staff at the UniSA Defence and Systems Institute (DASI) often helped with encouragement, review, contacts, and tidbits of essential knowledge. In addition, the entire DASI administrative staff was a delight to work with, with a special note for the selfless and always-present help of Dale Perin.

Untold numbers of friends in the International Council on Systems Engineering (INCOSE) provided special help, encouragement, and knowledge to this effort. I can single out Dr. George Friedman, Dr. Elliot Axelband, and Dr. Azad Madni for their constant and necessary push toward successful completion. Others, who preceded me through the immense work to complete doctorates, gave me a helpful pull forward, including specifically Dr. Ricardo Valerdi and Dr. Sarah Sheard. I also thank Dr.
Bill Ewald for his encouragement and always-positive attitude toward my work. Particular thanks go to Dr. Brian Mar, who worked with me and encouraged me in the earlier phases but has since gone on to a better world.

The SE-ROI Research Advisory Group participants, over 60 strong, provided significant help by reviewing the formulation of the research plans and methods, with many timely comments and changes to improve the effort. They also often provided the necessary contacts to obtain the program interviews.

And finally, I cannot express sufficiently my gratitude to the unnamed organizations that participated in interviews, and the individuals from those organizations who were interviewed. The work could not have been done without the generous access you allowed me to your programs through proprietary boundaries. I hope that this work fulfills your expectations, and that the results advance the discipline of systems engineering to your benefit and the benefit of mankind.

This work is dedicated to

My wife, Beth, whose unstinting support over 15 years has encouraged, cautioned and prodded this success,

The systems engineering discipline, making the world better through technology,

and mostly to God, who gave me the skills and reasons to perform it.


1 Introduction

1.1 Background

The discipline of systems engineering (SE) has been recognized for 50 years as essential to the development of complex systems. Since its recognition in the 1950s (e.g. Goode 1957), SE has been applied to products as varied as ships, computers and software, aircraft, environmental control, urban infrastructure, automobiles, and many more (SE Applications TC 2000). Systems engineers have been the recognized technical leaders of complex program after complex program (Hall 1993, Frank 2000).

In many ways, however, less is understood about SE than nearly any other engineering discipline. The engineering aspects of SE rely on systems sciences; they also rely on engineering relationships in many domains to analyze product system performance. But systems engineers still struggle with the basic mathematical relationships that control the development of systems. SE guides each system development by the use of heuristics learned by each practitioner during the personal experimentation of a career. The heuristics known by each differ; one need only view the fractured development of SE standards and SE certification to see how much they differ.

As a result of this heuristic understanding of the discipline, it has in the past been nearly impossible to quantify the value of SE to programs (Sheard 2000). Yet both practitioners and managers intuitively understand that value. They typically incorporate some SE practices in every complex program. The differences in understanding, however, just as typically result in disagreement over the level and formality of the practices to include. Prescriptivists create extensive standards, handbooks, and maturity models that prescribe the practices that should be included. Descriptivists document the practices that were successfully followed on given programs. In neither case, however, are the practices based on a quantified measurement of the actual value to the program.

The intuitive understanding of the value of SE is shown in Figure 1. In traditional design, without consideration of SE concepts, the creation of a system product is focused on fixing problems during production, integration, and test. In a system thinking design, greater emphasis on the front-end system design creates easier, more rapid integration and test. The overall result promises to save both time and cost, with a higher quality system product. The primary impact of the systems engineering concepts is to reduce risk early, as also shown in Figure 1. By reducing risk early, the problems of integration and test are prevented from occurring, thereby reducing cost and shortening schedule. The challenge in understanding the value of SE is to quantify these intuitive understandings.

[Figure 1. Intuitive value of SE: effort and risk profiles over time (system design, detail design, production, integration, test), compared for traditional design and system thinking design, showing saved time and cost.]

The research program described in this thesis commenced in 1997. In an earlier phase of this research, Honour (2004) reported on survey work that indicated a correlation between the amount of SE work and the cost/schedule success of programs. Another more extensive survey performed by the National Defense Industry Association (NDIA) (Elm et al. 2007) showed levels of correlation between subjectively determined SE activities and the success of programs. A third work (Boehm et al. 2008) explored quantitative indications about SE correlations available from within software cost estimation data. Yet to date, none of these prior works have provided sufficient information to determine either the degree of correlation or the optimum level and type of SE to select based on the parameters of a program. These are the goals of this research. See chapter 2 for a more extensive discussion of prior work.

1.2 Goals

This Systems Engineering Return on Investment (SE-ROI) research was designed to gather empirical information about how systems engineering methods relate to program¹ success. In particular, the research was aimed at two results of great value to the theory and practice of SE:

1. Determining the degree of correlation between SE activities and program success.
2. Determining the optimum amount and type of SE activities based on a program's definition parameters.

The effort leading to this thesis was originally conceptualized in work that preceded this thesis work. Sections 2.2 and 3.2 describe the distinction between the prior work and this thesis.

1.3 Scope

In performing this research, the field of SE has been defined by an ontological view based on a melding of the primary current SE standards, as described later in this thesis. This ontology resulted in the definition of eight major activities that collectively encompass what is usually perceived to be "systems engineering":

Mission/purpose definition
Requirements engineering
System architecting
System integration
Verification and validation
Technical analysis
Scope management
Technical leadership/management

Of these, the first five are roughly sequential in nature, defining the typical development lifecycle. The last three are largely continuous throughout the development. Definitions of these eight activities, and a description of how they were selected, are given later in the thesis.

¹ Possible confusion exists between the terms program and project. For clarity in this report, the word project refers to the SE-ROI project. The word program refers to the system development programs whose data is gathered.
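The eight-activity ontology above, together with the XX% raw cost ratios defined in the Glossary, suggests a simple per-program record of SE effort. A minimal sketch follows; the activity costs and program total are hypothetical values for illustration only.

```python
# The eight SE activities of the ontology above, keyed by the glossary's
# two-letter abbreviations (MD, RE, SA, SI, VV, TA, SM, TM).
SE_ACTIVITIES = ("MD", "RE", "SA", "SI", "VV", "TA", "SM", "TM")

def se_cost_ratios(activity_costs, total_program_cost):
    """Raw cost ratio XX% = cost expended in activity XX / total program cost
    (per the Glossary); activities with no recorded cost count as zero."""
    return {a: activity_costs.get(a, 0.0) / total_program_cost
            for a in SE_ACTIVITIES}

# Hypothetical program: activity costs in the same units as the total cost.
costs = {"MD": 1.2, "RE": 3.0, "SA": 2.5, "SI": 4.0,
         "VV": 3.3, "TA": 1.0, "SM": 0.8, "TM": 2.2}
ratios = se_cost_ratios(costs, total_program_cost=120.0)
se_percent = sum(ratios.values())   # aggregate SE% for the program
print(round(se_percent, 4))         # 0.15, i.e. 15% of program cost on SE
```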

The results of this research are bounded by the lifecycle of a system development. Because SE applies to the primary development of systems, the data obtained has been restricted to programs in which a system development occurred, with temporal lifecycle bounds from the beginning of development to the creation of first system(s). See section 4.5 for the methods used to define the bounds of each program.

The results of this research are also bounded by the domain source of the data. The data source has been limited to the programs and organizations that made themselves available for the research. See section 4.6 for the demographics of the source organizations and programs. The results herein do not apply to SE as used outside these bounds.

1.4 Research methods

The research gathered data from real programs through an interview process. The information gathered included three major classes:

Amount and type of SE activities used on the programs
Success levels of the programs
Program characterization parameters

Prior to conducting interviews, the research project created a consistent structure for the interviews based on (a) theoretical hypotheses about the expected relationships, (b) the ontological view of systems engineering, and (c) feasible interview length. The interview design was vetted through peer review and through a process of test interviews. See section 4.2 for further discussion of the interview design.

The interviews were conducted once per program, using data extracted from program records and from responses of the interview participants. Interviews were conducted over a three-year period by the author. Interview participants were the primary program and technical leaders of each program. At the outset of each interview, initial conversations set the program bounds to be used throughout the interview; from that point, all answers were consistent with the defined program bounds. The researcher performed all recording of data.
See section 4.5 for further description of the interview methods.

After gathering interview data, the research applied rigorous statistical methods to examine the relationships in the data against the hypotheses related to the research

project's goals. Through Principal Component Analysis and a hill-climbing search for best correlation, the data was used to determine the combination of program characteristics that most affected the correlations of success versus SE activities. Initial adjustments were then made to the raw data to convert each program from its native program characteristics to a median level of characteristics. These adjustments largely removed the confounding factors that usually obscure the primary relationships. Using the medianized data, tests were then performed on the primary relationships of success versus SE activities, to determine the level and confidence of the correlations. The relationships of the program characterization parameters to the primary correlations were tested to determine the effect of each parameter on the correlations. The ROI of SE was then calculated from the correlated relationships. See section 5.1 for further discussion of the statistical methods used, and sections 5.2 and 5.3 for the specific statistical analyses against the project goals.

1.5 Primary results

The research succeeded in demonstrating the expected relationships between program success and SE activities. In addition, the results provide a quantified evaluation of those relationships that can be used in the planning of system development programs. Findings of this research are included throughout this thesis and are listed at the end of each chapter. Major findings are described in Section 7.1. The primary results include the following, all of which are rigorously supported by the work herein:

There is a quantifiable relationship between systems engineering effort levels and program success, demonstrated by high correlation coefficients well in excess of the test values at the chosen significance level.

The Return on Investment (ROI) for SE effort can be as high as 7:1 for programs expending little to no SE effort.
For programs expending a median level of SE effort, the ROI is 3.5:1.

No correlation was found between systems engineering effort levels and system technical quality.

There is an optimum amount of systems engineering effort for best program success. For a program of median characterization parameters, that optimum is 14.4% of the total program cost.

Programs typically use less systems engineering effort than is optimum for best success.

An effort estimation method is available to determine the optimal levels of SE effort for a given set of program characterization parameters. Variation in the program characterization typically changes the optimum between 8% and 19% of the total program cost.

Prior research work in this area is examined in Chapter 2. The primary results listed above are a significant extension beyond anything available in the prior work. The extension is most apparent in the specific empirical evidence that provides quantifiable, management-oriented decision information as opposed to the prior work that is largely subjective in kind. Section 2.4 specifically summarizes the findings of prior work and compares the results obtained by this research.

1.6 Thesis organization

The thesis has the following organization:

Chapter 1 Introduction (this chapter) contains a brief introduction to the research and summary of its results, with references to the chapters that follow.

Chapter 2 Review of related research describes prior work by the author and others, establishing a basis upon which the current research builds. Some findings are offered based on the research of prior works.

Chapter 3 Research design describes in detail the design of the research, including the research questions explored; the research activities performed; the guidance, participation, and organization of the research project; and the ethics considerations. Findings are offered based on the research design work.

Chapter 4 Data gathering describes the methods used to obtain data, including the approach to gain access to programs; the methods used during interviews; the protection of the data; and the demographics of the programs interviewed. Findings are offered based on the interview work and demographics.

Chapter 5 Statistical results describes the statistical methods used to analyze the gathered data and the results obtained.
The research questions are reformulated as specific hypotheses. The statistical results are used to test each hypothesis. Findings are offered based on the statistical results, along with necessary limitations.

Chapter 6 Discussion of results provides a logical analysis of the results, possible indications of causality, and their limitations, including examples of how the results can be used during or in advance of a system development program.

Chapter 7 Conclusions and recommendations completes the work by summarizing the major findings and indicating areas of possible future work.

Appendices provide supporting information including:

Bibliography
Interview instruments
Copies of developmental papers

2 Review of related research

This chapter provides a review of prior work done by the author and others, to lay a basis of knowledge for the research. For each work, this chapter provides a specific reference to the work (by reference to the Bibliography) and a summary of its contributions and findings.

2.1 Background information

Some prior work has provided a historical background of information related to the ROI of SE. Most of this work, while interesting, is anecdotal in nature because it (a) was not directed at the SE issues, (b) was based on few data points, and/or (c) was not based on a sound and declared research methodology. Yet the total of these works indicates an underlying trend that generally supports the possibility to calculate SE-ROI.

Boundary management study. A statistical research project in the late 1980s (Ancona 1990) studied the use of time in engineering projects. Ancona and Caldwell gathered data from 45 technology product development teams. Data included observation and tracking of the types of tasks performed by all project members throughout the projects. Secondary data included the degree of success in terms of product quality and marketability. Of the projects studied, 41 produced products that were later successfully marketed. The remaining four projects failed to produce a viable product.

One primary conclusion of the research was that a significant portion of the project time was spent working at the team boundaries. Project time was divided as:

Boundary management 14%
Work within team 38%
Individual work 48%

Boundary management included work that was typically done by a few individuals rather than by all members of the team. The work included efforts in classes defined as Ambassador, Task Coordinator, Scout, and Guard, indicating the role of the work with relation to the project. More important to the value of systems engineering, the research also concluded statistically that high-performing teams did more boundary management than low-performing teams. This relates to systems engineering because many of the boundary management tasks are those that are commonly performed as part of SE management.

NASA project definition. Werner Gruhl of the NASA Comptroller's office presented results (Gruhl 1992) that relate project quality metrics with a form of systems engineering effort (Figure 2). This data was developed within NASA in the late 1980s for 32 major projects over the 1970s and 1980s.

[Figure 2 is a scatter plot of program overrun against definition percent for the 32 NASA programs, with
Definition Percent = Definition $ / (Target + Definition $) and
Program Overrun = (Actual + Definition $) / (Target + Definition $).]
Figure 2. Impact of front end project definition effort (from Gruhl 1992)

The NASA data compares project cost overrun with the amount of the project spent during phases A and B of the NASA five-phase process (called by Gruhl the "definition percent"). The data shows that expending greater funds in the project definition results in significantly less cost overrun during project development. Most projects used less than 10% of funds for project definition; most projects had cost overruns well in excess

of 40%. The trend line on Gruhl's data seems to show an optimum project definition fraction of about 15%.

The NASA data, however, does not directly apply to systems engineering. In Gruhl's research, the independent variable is the percent of funding spent during NASA Phases A and B, the project definition phases. Figure 3 shows the difference between this and true systems engineering effort. It is apparent from this difference that the relationship shown in the NASA data only loosely supports any conclusion related to systems engineering.

[Figure 3 plots effort against time, contrasting Total Project Effort, Development Effort, NASA Definition Effort, and Systems Engineering Effort.]
Figure 3. Definition effort is not the same as systems engineering effort.

Impact of SE on quality and schedule. A unique opportunity occurred at Boeing as reported by Frantz (1995), in which three roughly similar systems were built at the same time using different levels of systems engineering. The three systems were Universal Holding Fixtures (UHF) used for manipulating large assemblies during the manufacture of airplanes. Each UHF was of a size on the order of 10 x 40, with accuracy on the order of thousandths of an inch. The three varied in their complexity, with differences in the numbers and types of sensors and interfaces.

[Figure 4 compares overall development time in weeks for UHF3, UHF2, and UHF1.]
Figure 4. Shorter schedule for more complex UHF using better SE (from Frantz 1995)

The three projects also varied in their use of explicit SE practices. In general, the more complex UHF also used more rigorous SE practices. Some differences in process, for

example, included the approach to stating and managing requirements, the approach to subcontract technical control, the types of design reviews, the integration methods, and the form of acceptance testing.

The primary differences noted in the results were in the subjective quality of work and the development time. Even in the face of greater complexity, the study showed that the use of more rigorous SE practices reduced the durations (a) from requirements to subcontract Request For Proposal (RFP), (b) from design to production, and (c) overall development time. Figure 4 shows the significant reduction in overall development time. It should be noted that UHF3 was the most complex system and UHF1 the least complex system. Even though it was the most complex system, UHF3 (with better SE) completed in less than half the time of UHF1.

Large engineering projects study. An international research project led by Massachusetts Institute of Technology (MIT) studied the strategic management of large engineering projects (LEP) (Miller 2000). The project reviewed the entire strategic history of 60 worldwide LEPs that included the development of infrastructure systems such as dams, power plants, road structures, and national information networks. The focus of the project was on strategic management rather than technical management. The project used both subjective and objective measures, including project goals, financial metrics and interviews with participants.

[Figure 5 shows the percent of projects meeting targets: cost targets 82%, schedule targets 72%, objective targets 45%; 18% met only some objectives and 37% failed.]
Figure 5. Many engineering projects fail to meet objectives (from Miller 2000)

The statistical results of the LEPs are shown in Figure 5. Cost and schedule targets were often not met, but technical objective targets were only met in 45% of the 60 projects. Fully 37% of the projects completely failed to meet objectives, while another 18% met only some objectives.
The project found that the most important determinant in success was a coherent, well-developed organizational structure; in other words, a structure of leadership creates greater success. Because SE usually includes a

component of technical leadership, this finding seems to indicate a significant value of SE.

The Shangri-La of ROI. A popular and often-referenced paper is Sheard and Miller (2000), which describes the difficulties in attempting to define the ROI of SE. Through observation of the then-current state of measurement, they hypothesized that: (1) There are no hard numbers. (2) There will be no hard numbers in the foreseeable future. (3) If there were hard numbers, there wouldn't be a way to apply them to your situation, and (4) If you did use such numbers, no one would believe you anyway. Sheard and Miller built the theory based on the general lack of hard numbers in the preceding decade and the level of non-use of the few hard numbers that were available. In particular, they referenced Herbsleb (1994) and Frantz (1995), the small sample sizes used, and the lack of impact of those results. They then went on to discuss how to motivate SE process improvement through means other than hard numbers.

Lessons learned from large, complex technical projects. In contrast with Sheard and Miller, another contemporary paper (Cook 2000) performed a survey of prior literature to determine what had actually been learned about SE from large, complex technical projects. After defining the basic boundaries of SE, Cook examined the NASA data from Gruhl (1992), UK Ministry of Defence (MOD) reports, UK civil software development, US civil software development, US federal software development, and aircraft development case studies, resulting in a set of guidance principles for SE practitioners, planners, and process developers. As part of this survey, Cook cites the UK Downey principles (DERA 1996), defined since the 1960s, in which 15% of the total project costs should be expended during systems definition to engender "speedier, more coherent and interactive processes". This number is also contained in MOD (1999).

Commercial systems engineering effectiveness study.
IBM Commercial Products division implemented new SE processes in their development of commercial software. While performing this implementation, they tracked the effectiveness of the change through metrics of productivity. As reported by Barker (2003), productivity metrics existed prior to the implementation and were used in cost estimation. These metrics were based on the cost per arbitrary point assigned as a part of system architecting. During the SE implementation, the actual costs of eight projects were tracked against

the original estimates of points. Three projects used prior non-SE methods, while the remaining five used the new SE methods. In the reported analysis, the data indicated that the use of SE processes improved overall project productivity when effectively combined with the project management and test processes. Cost per point for the prior projects averaged $1350, while cost per point for the projects using SE processes averaged $944, a cost reduction of 30%.

[Figure 6 tabulates the eight projects' year, points, cost ($K), SE costs (%), and cost per point.]
Figure 6. Implementation of SE processes resulted in statistically significant cost decrease (from Barker 2003)

Impact of systems engineering on complex systems. A third study was reported by Kludze (2004), showing results of a survey on the impact of SE as perceived by NASA employees and by INCOSE members. The survey contained 40 questions related to demographics, cost, value, schedule, risk, and other general effects. While most of the survey relates in some way to the value of systems engineering, one primary result stands out. Respondents were asked to indicate the percent of their most recent project cost that was expended on SE, using aggregated brackets of 0-5%, 6-10%, 11-15%, and 16% or more. Figure 7 shows the result. (It is noted that the study presented the results as shown in a continuous curve, although the actual results only support four data points.) The respondents believed that their projects most often spent between 6-10% on SE, with few projects spending more than 10%. It appears that INCOSE respondents believed their projects spent proportionately more on SE than did NASA respondents. There is, however, an anomaly in this data that is represented by the bimodal characteristic of the responses.
Many respondents indicated that their projects spent 16% or above. It is believed that this anomaly occurs because the respondents interpreted "project" to include such projects as a system design effort, in which most of the project is spent on SE.

[Figure 7 is a histogram of percent of respondents reporting SE spending in the brackets 0-5%, 6-10%, 11-15%, and 16% & above, for NASA, INCOSE, and combined responses.]
Figure 7. Percent of total project cost spent on SE (from Kludze 2004)

Project management and systems engineering in the commercial environment. This observational study (Gamgee 2006) examined 10 large programs in commercial industries (telecommunications, banking, gaming) for evidence of SE activities, comparing those activities to the success or difficulties on the projects. In the paper, SE activities include requirements development, technical design/solution, system integration & test, system implementation, and system support. Both SE activities and program success were evaluated using subjective measures. It was found that all projects with poor SE activity measures also had poor success measures. The study went further, to examine the management attitudes around the finding. The exploration showed that managers were generally aware of the finding, but often had bad experiences with increasing or improving SE. The bad experiences included overwhelming process initiatives, overemphasis on technical design, and lengthening of time-to-market.

2.2 Formative theory

Based on the background information, the author performed earlier work to determine theoretical relationships that apply to SE-ROI. In 1997, the author defined a three-phase approach to quantifying the value of SE:
Phase I: Theoretical work to predict the quantified form of the relationship
Phase II: Statistical evaluation of volunteer, subjective surveys
Phase III: Detailed interviews with programs to obtain contractual data.

The papers reported in this section and some of those in the next represent the results from Phase I and Phase II. It should be noted that this thesis is essentially Phase III of that original research plan. The Phase I work was presented in two papers, the second of which became foundational for the sequence of research activities that have led to this thesis.

Characteristics of engineering disciplines. In Honour (1999), the author explored the theory that SE is essentially inter-disciplinary in nature and therefore dependent on the underlying engineering disciplines for its value. A review of ten major engineering disciplines revealed both common and diverging natures that affected the ability to engineer systems using those disciplines. In some cases, other engineering disciplines were found also to be practicing some essential characteristics of SE (e.g. civil engineers usually develop systems, buildings and structures, using life-cycle-phased processes and inter-discipline coordination). Each discipline has representative engineered products for which it is normally responsible, but many of those products are actually system products. Finally, the paper compared each of the ten disciplines to a primary SE standard (EIA-632) to determine what value the standard offered to that discipline. A primary conclusion was that the defined SE processes usually provide significant value, but that value differed depending on the engineering discipline involved.

Toward a mathematical theory of systems engineering management. In Honour (2002b), the author provided the underlying formal theory that led to the work in this thesis, based on earlier work reported in Honour (2002a). In this theoretical work, the author explored the mathematical relationships among cost, schedule, technical value, and risk. (Technical value was further hypothesized as being comprised of size, complexity, and quality.)
Each relationship was handled by examining the end-point values in ternary, heuristic combinations. This analysis showed that the end-points always devolve to trivial cases that are usually easily evaluated. Between the trivial cases always lies some optimum with unknown value. Figure 8 shows one example of such a relationship, taking into account the limits of (a) short duration impossibilities (left end point) and (b) constant administrative cost for long durations (right end point). A primary result of this work was Figure 9, which shows the value relationship against Systems Engineering Effort (SEE), a primary research question of this thesis.
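The shape of the Figure 9 relationship can be reproduced with a small numerical sketch. The functional form and constants below are purely hypothetical, not Honour's actual model; they serve only to show an interior optimum arising between the trivial end points.

```python
# Illustrative sketch of the Figure 9 shape: expected value as a function
# of SE effort (SEE). Functional form and constants are hypothetical,
# chosen only so that benefit saturates while SE consumes a growing share
# of the budget, producing an interior optimum between the end points.
import math

def expected_value(see_percent):
    benefit = 1.0 - math.exp(-see_percent / 8.0)  # diminishing returns
    cost = see_percent / 100.0                    # share of budget spent on SE
    return benefit - cost

# Scan 0%..100% of program cost for the interior optimum.
levels = [x / 10.0 for x in range(1001)]
optimum = max(levels, key=expected_value)
```

With these made-up constants the optimum falls near 20% of program cost; the point is the shape (zero value at SEE = 0, slightly negative as SEE approaches 100%), not the particular number.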

[Figure 8 plots expected cost (C) against expected duration (D) for high, medium, and low technical value (V), showing an optimum cost/duration between the end points.]
Figure 8. Theoretical relationship: cost against schedule for different levels of technical value (from Honour 2002b)

[Figure 9 plots value (V) against SE effort (SEE) %, contrasting E(v) for SEE=0% with E(v) for better parameters.]
Figure 9. Theoretical relationship: value against SE effort (from Honour 2002b)

2.3 Statistical research related to SE-ROI

This thesis is a direct result of a series of prior research works published by the author in pursuit of the theory presented in Honour (2002b). It is also directly related to several other statistical works.

Value of SE - SECOE research progress report. The precursor empirical work to this thesis started following the theoretical work of Honour (2002b) and was first reported as interim results in Mar and Honour (2002). This was the Phase II work of the 1997 plan. The work was performed under the auspices of the INCOSE Systems Engineering Center of Excellence (SECOE). The interim report showed the path used in Phase II, defining statistical parameters of cost, duration, SE costs, and SE quality that provided the basis for the survey effort. These same parameters have carried into this current research. The interim report also established four methods of calculating success that are also used in this thesis: cost compliance, schedule compliance, subjective success, and objective technical success.

Reporting on 25 received surveys, the interim paper provided initial indications that closely matched the results later reported in Honour (2004) and in this thesis. One significant finding in the interim report, supported in later work, is that the correlations improve when the SE percent is modified by a subjective evaluation of SE quality. Following this realization, all graphs reported correlations against a modified SE Effort, in which the SE percent is factored based on SE quality.

Understanding the value of SE. The Phase II work was reported complete in Honour (2004) with a total of 43 surveys received. This work has been quoted and referenced widely to show the basic value of SE, including in the INCOSE Systems Engineering Handbook (INCOSE 2010). The surveys gathered anonymous data on the statistical parameters of cost (planned/actual), duration (planned/actual), SE costs, SE quality, objective technical success, and comparative subjective success. The data collected had sufficient variation in SE effort, SE quality, cost, and schedule to provide a good statistical basis for results. Figure 10 shows the variability in SE Effort as a percent of the total project.

[Figure 10 is a histogram of number of projects against SE Effort % = SE Quality * SE Cost / Actual Cost.]
Figure 10. Histogram of submissions by SE Effort, % project cost (from Honour 2004)

The data was subjected to statistical correlation analysis to determine the statistical relationship between the SE Effort and the various success measures. Figure 11 provides the correlation graph of cost compliance versus SE Effort, while Figure 12 provides the similar graph for schedule compliance and Figure 13 the graph for overall subjective success. The graphs showed that all three success measures have a usable level of correlation with SE Effort. The optimum level of SE Effort is not well determined due to a lack of data in the optimum region, but was reported as 15-20%.
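The modified SE Effort measure shown in Figure 10 is simple to compute. A minimal sketch, assuming illustrative argument names and an SE quality factor on a 0..1 scale (the actual survey instrument's fields and scale may differ):

```python
# Modified SE Effort from Figure 10:
#   SE Effort % = SE Quality * SE Cost / Actual Cost
# Argument names and the 0..1 quality scale are illustrative assumptions,
# not the actual survey instrument's definitions.

def se_effort_percent(se_cost, actual_cost, se_quality):
    """Quality-weighted SE effort as a percent of actual program cost."""
    return 100.0 * se_quality * se_cost / actual_cost

# Example: $1.2M of SE effort on a $10M program, with SE quality rated 0.8.
effort = se_effort_percent(se_cost=1.2e6, actual_cost=10.0e6, se_quality=0.8)
```

Here `effort` comes out as 9.6 (percent), the quantity that would then be correlated against the success measures.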

[Figure 11 plots Actual/Planned Cost against SE Effort = SE Quality * SE Cost / Actual Cost, over 0-28%.]
Figure 11. Cost performance as a function of SE effort (from Honour 2004)

[Figure 12 plots Actual/Planned Schedule against SE Effort = SE Quality * SE Cost / Actual Cost, over 0-24%.]
Figure 12. Schedule performance as a function of SE effort (from Honour 2004)

[Figure 13 plots Comparative Success against SE Effort, over 0-24%.]
Figure 13. Subjective success as a function of SE effort (from Honour 2004)

The cost and schedule graphs showed the 5% and 95% probability bounds on the data distributions as dotted lines. The variability between the bounds became significantly

less as the SE Effort increased. This fact shows a significant increase in predictability (i.e. reduction in success variance) as the SE Effort increases toward the optimum. This paper also reported an analysis of cost compliance correlated to the program size, showing that cost overruns appeared to be more prevalent for programs in the range of tens to hundreds of millions of dollars, with larger and smaller programs reporting smaller overruns.

It should be noted that Honour (2004) has nothing to report about the technical quality of the product systems. There is indication in the interim report (Mar 2002) that technical quality was subjectively measured as objective success, but that there was no correlation observed between objective success and SE Effort.

Constructive SE cost model (COSYSMO). A doctoral dissertation at the University of Southern California (Valerdi 2005) extended the well-established Constructive Cost Model (COCOMO) for software development into the field of systems engineering. The resulting Constructive Systems Engineering Cost Model (COSYSMO) addressed the basic question as to how much SE effort should be allocated for the successful development of large-scale systems. Valerdi created a mathematical model based on four system size parameters (requirements, interfaces, algorithms, and operational scenarios), one scale factor, and 14 effort multipliers. Parameter relationships were developed first by expert-level consensus estimation, then by evaluation of gathered program data. The basic form of the COSYSMO mathematical model is

    PM_NS = A * (Size)^E * product(EM_i, i = 1..14)        Equation 1

in which
    PM_NS is effort in person months (nominal schedule)
    A is a calibration constant
    Size is computed size based on four size parameters
    E is a factor for economy/diseconomy of scale (default is 1.0)
    EM_i is an effort multiplier for each of the 14 cost drivers
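Equation 1 translates directly into code. The calibration constant, scale factor, and multiplier values below are placeholders for illustration, not a published COSYSMO calibration:

```python
# COSYSMO form (Equation 1): PM_NS = A * (Size)^E * prod(EM_i, i = 1..14).
# A, E, and the effort-multiplier values here are placeholders, not a
# calibrated COSYSMO parameter set.
from math import prod

def cosysmo_effort(size, A=1.0, E=1.0, effort_multipliers=None):
    """SE effort in person-months (nominal schedule)."""
    ems = effort_multipliers if effort_multipliers is not None else [1.0] * 14
    if len(ems) != 14:
        raise ValueError("COSYSMO defines 14 cost-driver multipliers")
    return A * size ** E * prod(ems)

# With all-nominal multipliers the effort equals A * Size^E; a single
# above-nominal cost driver (e.g. 1.2) raises the estimate by 20%.
nominal = cosysmo_effort(size=120.0)
driven = cosysmo_effort(size=120.0, effort_multipliers=[1.0] * 13 + [1.2])
```

The multiplicative structure is the point: each cost driver scales the whole estimate, while the scale factor E captures economy or diseconomy of scale in size.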

The dissertation work shows that this form of equation, when applied to various actual system development projects, results in consistent calculation of SE effort as used on the projects. By extension, the work offered the COSYSMO model as a means to calculate the SE effort that should be planned for each project. Careful review of the method used, as well as personal discussions with Valerdi, however, shows that "should be" is defined in terms of the level of SE used on the sample programs rather than by an objective measure based on program success. This is an important work in the advancement of knowledge about SE value, but the work still does not provide the empirical relationship between SE effort and program success.

ROI of SE for software-intensive systems. A further exploration into the constructive cost models (Boehm 2007) tied the extensive data from the COCOMO software model into the field of SE. The COCOMO II model included one specific effort multiplier parameter, Architecture and Risk Resolution (RESL), that represented the degree to which the software design was subject to front-end architectural analysis. This type of effort is indicative of SE activities, although the work acknowledges that it is not a complete representation of SE. Through a statistical calculation of the 161 projects in the COCOMO II database, the work determined the amount of time added due to RESL for different size projects as shown in Figure 14.

Figure 14. SE effort required in COCOMO II for different size systems (from Boehm 2007)

SE effectiveness. An important extension to the work of Honour (2004) was reported in Elm (2008), in which Elm and others performed an extensive survey to obtain more detailed information about SE activities and program success. After formal development and testing of a survey instrument, the study obtained information from 64 programs, 46 of which were complete enough to contribute to the statistical work. Most survey questions were subjective in nature, but were sufficiently detailed to obtain insight into the SE activities. SE was measured by the subjective evidence of artifacts and activities defined as part of the survey; success was measured by subjective answers on a set of questions about the results of the program. Correlation of the SE capability against program success was demonstrated by the success levels for three brackets of SE capability as shown in Figure 15. The figure shows that programs evidencing higher SE capability demonstrated greater success.

Figure 15. Correlation of SE capability to program performance (from Elm 2008)

In the statistical analysis, related questions were combined to represent the level of capability in specific areas of SE activity. Each SE activity was then checked for correlation against program performance, with statistical measurement of the degree of correlation using a Gamma test. The results are shown in Figure 16. Nearly all activities tested showed a positive correlation. The negative correlation to Monitor & Control was perceived to be an inverse causal relationship, in that programs with performance difficulty tend to receive greater management monitoring and control. The significant result in this effort is to show the ranking of which SE activities have stronger correlation.
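The Gamma test used here is the Goodman-Kruskal gamma, a rank statistic suited to ordinal scales such as bracketed capability and performance levels. A minimal sketch with made-up ordinal data (not Elm's survey data):

```python
# Goodman-Kruskal gamma: (concordant - discordant) / (concordant + discordant)
# over all pairs of observations, ignoring ties; suitable for ordinal data
# such as bracketed SE capability vs. bracketed program performance.
from itertools import combinations

def goodman_kruskal_gamma(xs, ys):
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # pairs tied in either variable contribute to neither count
    return (concordant - discordant) / (concordant + discordant)

# Made-up ordinal data (1 = low .. 3 = high): higher SE capability mostly
# pairs with higher performance, so gamma comes out strongly positive.
capability  = [1, 1, 2, 2, 3, 3]
performance = [1, 2, 1, 3, 2, 3]
g = goodman_kruskal_gamma(capability, performance)
```

Gamma ranges from -1 (perfectly discordant) to +1 (perfectly concordant), which is why a negative value for Monitor & Control signals an inverse relationship.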

Figure 16. SE capabilities correlate with program performance (from Elm 2008)

In another statistical analysis of the data, Elm explored how the program challenge (PC) affected the basic correlations by segmenting the program data into low challenge and high challenge programs. The results in Figure 17 show that the correlation between performance and SE Capability (SEC) is less for high challenge programs.

Figure 17. SE capability has less effect for high challenge programs (from Elm 2008)

2.4 Summary and findings of prior results

Work on the value of SE has made significant progress in the past decade, beginning to build a basis of data that supports informed management decisions. Prior research has indicated a number of findings, but the findings do not all have equal validity.

Front-end program work. Several reports indicate that front-end program work at about 15% of the total program cost minimizes the cost overrun during system development.

Effect of SE on program success. Multiple reports show various forms of subjective linkage between SE activities and program success, always indicating that greater SE (than currently used) leads to better success. As with front-end program work, the prior research also shows that SE effort at about 15% of the total program cost minimizes the cost overrun. It has also been shown that the quality of that SE effort matters.

Technical leadership. Two reports show that better technical leadership is a strong indicator of better success. Ancona and Caldwell (Ancona 1990) focused on the aspects of leadership involving boundary management, the technical interfaces between the development team and external entities. The large engineering projects study (Miller 2000) focused on the technical organization and coordination of a project team as part of the larger strategic management.

Optimal levels. Theoretical work shows that there should exist an optimum level of SE effort; although most programs seem to operate below that optimum, too much SE effort would also be detrimental.

Program size. One study showed that the worst cost overruns appear to occur with programs of mid-range size, on the order of $100 million. Another study showed that larger software-intensive programs need a greater level of SE effort.

Parametric estimation of SE effort. The COSYSMO effort shows that parametric estimation of SE effort can create consistent predictions using a multiplicative formula of size and subjective parameters.

Program challenge. One study showed that the correlation between SE effort and program success was greater for low-challenge programs and lower for high-challenge programs.
The findings from prior research indicate that this SE-ROI project provides a significant new advance in the knowledge about value of SE. The advance is specifically in the area of providing management decision information as to how much and what kind of SE is indicated for best program success. Table 1 shows the specific advances this SE-ROI project makes against the prior research, including a series of findings not covered at all by prior work.

Table 1. Advances of this research against past work

Front-end Program Work
  Prior research: Front-end work at about 15% of total program minimizes the cost overrun.
  SE-ROI advance: SE-ROI provides specific information about the SE activities throughout the system development, rather than just the front-end work that includes SE and many other activities.

Effect of SE on Program Success
  Prior research: Subjective linkage is shown that greater SE leads to better success; SE effort at about 15% of total program cost minimizes the cost overrun; the quality of the SE effort contributes to the relationship.
  SE-ROI advance: SE-ROI provides proven empirical correlation between SE, its subordinate activities, and program success. SE-ROI shows specific empirical evidence that the optimum is 14.4% total SE for a median program. It also provides optimum values for eight subordinate SE activities, as well as a means to pre-calculate the optimum values for a given program based on its program characteristics. SE-ROI shows that the quality of each of the eight subordinate activities also matters.

Technical Leadership
  Prior research: Better technical leadership is a strong indicator of program success.
  SE-ROI advance: SE-ROI provides specific empirical values for the correlation between technical leadership and program success, showing that 3.9% technical leadership effort (of total program cost) is optimum for a median program. SE-ROI shows that technical leadership/management is unique among the eight subordinate SE activities in that it provides optimum program success simultaneously in cost, schedule, and stakeholder acceptance.

Optimal Levels
  Prior research: Theory showed that an optimum level of SE effort must exist.
  SE-ROI advance: SE-ROI provides specific values of the optimum for total SE (14.4%) and each of eight subordinate SE activities.

Program Size
  Prior research: Worst cost overruns occur in programs of mid-range size, on the order of $100 million.
  SE-ROI advance: SE-ROI showed that system size (not program size) is the most significant confounding factor in correlation of SE activity to program success, and defined program size through a combination of nine parameters.

Parametric Estimation of SE Effort
  Prior research: COSYSMO provides a consistent methodology to estimate SE effort based on the effort used by other programs.
  SE-ROI advance: SE-ROI provides a consistent methodology to estimate optimum SE effort based on program success.

Program Challenge
  Prior research: Correlation between SE effort and program success is greater for low-challenge programs.
  SE-ROI advance: SE-ROI provides empirical estimation for the amount of SE effort required based on the level of technology risk.

Findings Not Covered in Prior Work
  SE-ROI shows a significant, quantifiable Return on Investment for SE activities, usually 3.5:1.
  SE-ROI found no correlation between SE and system technical quality.
  SE-ROI found that programs typically use less SE effort than is optimum for best success.
  SE-ROI provides a quantified list of program characterization parameters that affect the relationship between SE activity and program success.
  SE-ROI demonstrated that there is a common ontology of SE that is sufficient to be meaningful.
  SE-ROI demonstrated that it is possible to effectively quantify SE effort using empirical data.
  SE-ROI demonstrated that it is possible to obtain meaningful data about SE and success through program proprietary boundaries.

Most of the prior research provides only anecdotal information about the value of SE, with subjective indications that it has value. Some prior work indicates the quantified gain that may be obtained with SE activities (30% cost reduction, 50-70% time reduction), without any indication of how much or what kind of SE is required to obtain these gains. Other work provides some quantifiable decision material (worst overruns occur for programs at about $100M size, parametric calculation of SE effort works), but again without evaluation of the objective value obtained by using SE activities. The COSYSMO work provides an excellent tool to predict the amount of SE on a program, but its results are not tied to the program success. The only quantified information about SE-ROI is obtained from the Gruhl and Honour studies, which provide indication that total SE (or total front-end work) should be on the order of 15% of the program cost to minimize cost and schedule overruns. While useful for management planning, these results give only the highest-level statistical indication.
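The headline 3.5:1 figure is a ratio of avoided program cost to SE investment. A minimal sketch of the arithmetic; the dollar figures below are invented for illustration:

```python
# SE return on investment expressed as the ratio of avoided program cost
# to the cost of the SE effort itself. The dollar figures are invented;
# the thesis reports this ratio as typically about 3.5:1.

def se_roi(cost_without_se, cost_with_se, se_investment):
    """Avoided cost per dollar invested in SE activities."""
    avoided_cost = cost_without_se - cost_with_se
    return avoided_cost / se_investment

# Invented example: $1.4M of SE effort associated with program cost
# falling from $24.9M to $20.0M, i.e. a 3.5:1 return.
ratio = se_roi(cost_without_se=24.9e6, cost_with_se=20.0e6, se_investment=1.4e6)
```

Whether the SE investment itself is counted inside the with-SE program cost is a modeling choice; the sketch treats the two program costs as all-in totals.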
The next chapter describes the research method used to take these indications down to a much deeper level, with greater resolution and with greater completeness.

3 Research design

The SE-ROI research required a formal design to ensure that the results would be acceptable to the wider SE community. This chapter describes in detail the design of the research, including the research questions explored; the research activities and their order; the guidance, participation, and organization of the research project; and the ethics considerations. Findings are offered based on the research design work.

3.1 Research questions

Based on the background work of Chapter 2, the primary research questions of the SE-ROI project are:

(RQ A) Is there a quantifiable correlation between the amount, types and quality of systems engineering efforts used during a program and the success of the program?

(RQ B) For any given program, can an optimum amount, type and quality of systems engineering effort be predicted from the quantified correlations?

Several terms in these questions require more definition. These are:

Program - Each program sought as a data point is a system development program that starts with an operational concept and ends with the first prototype system.

Systems engineering effort - The scope of systems engineering effort to be considered is based on an analysis of the existing standards (Honour & Valerdi 2006) that demonstrates the widely-agreed categories of mission/purpose definition, requirements engineering, system architecting, system implementation, technical analysis, technical leadership/management, scope management, and verification/validation.

Amount - Systems engineering effort is quantified herein in terms of the cost of SE effort applied as a fraction of the total program cost (SE%). As shown in Mar &

Honour (2002), however, this must also be qualified by a measure of the quality of the effort applied.
- Type: This research explores the eight categories of SE effort found in Honour & Valerdi (2006) as definitions of type. Statistical correlation of each type against program success is sought.
- Quality: The interview participants rate the quality of systems engineering effort against a subjective scale.
- Success: The success of a program is measured herein by four separate success parameters: (a) cost compliance with plan, (b) schedule compliance with plan, (c) overall subjective success, and (d) technical performance against quantifiable key performance parameters (KPPs).

3.2 Research activities

Prior related research work by the author provided a literature review, definitions of theoretical mathematical concepts, effective methods to obtain data, preliminary data that validated the methodology, and an ontology of systems engineering categories suitable for measurement. The research work covered in this thesis therefore started at a more advanced level than many others.

Table 2. Research activities prior to and during this thesis effort

Prior research work ("Value of SE"):
- Definition of a three-phase concept to quantify the value of SE
- Implementation of Phase I, theoretical exploration of quantified relationships
- Implementation of Phase II, informal survey to obtain initial indications

Thesis research work ("SE-ROI"):
- Implementation of Phase III, detailed program interviews to obtain extensive empirical data
- Statistical analysis of the empirical data to determine relationships and findings
- Determination of findings

The research worked through the following five major activities, which often overlapped, in some cases supporting each other simultaneously:
- Research organization
- Technical structuring
- Data gathering
- Data analysis
- Reporting
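The SE% measure defined under "Amount" above is a simple ratio over the eight activity categories. A minimal sketch; the activity names come from Section 3.1, but all cost figures are invented for illustration:

```python
# SE% = cost of SE effort / total program cost, summed over the eight
# activity categories named in Section 3.1. All dollar figures are invented.
def se_percent(activity_costs, total_program_cost):
    """Total SE effort as a percentage of total program cost."""
    return 100.0 * sum(activity_costs.values()) / total_program_cost

costs_musd = {  # $M per category, hypothetical
    "mission/purpose definition": 0.2, "requirements engineering": 0.8,
    "system architecting": 0.6, "system implementation": 0.5,
    "technical analysis": 0.4, "technical leadership/management": 0.7,
    "scope management": 0.1, "verification/validation": 0.7,
}
print(f"SE% = {se_percent(costs_musd, 40.0):.1f}%")
```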

The general plan for the research was published as a developmental paper in Honour (2006a). That paper is included herein as Appendix C.2. The following material expands on the general plan in that paper.

3.2.1 Research organization

The research organization activity provided the underlying organization and structure for the research. Tasks that were part of this activity included:
- Creation and maintenance of research plans
- Development of the Research Advisory Group (see Section 3.3)
- Monthly status reporting to the Research Advisory Group and to UniSA

This activity started with the first creation of a research plan in October 2005 and completed its creation/development tasks in October. Maintenance and reporting efforts continued throughout the research.

3.2.2 Technical structuring

The technical structuring activity provided the technical concepts and data structures necessary to start data gathering. Work in this activity involved the members of the Research Advisory Group and the UniSA supervisors. During this activity, the researcher was the primary worker while coordinating ideas and results with the advisory group and supervisors. The work created concepts and structures that are a consensus product of the advisory group. Specific goals for the activity were to create:
- Technical correlations to be tested by the research
- Data structures to obtain the necessary source data
- Access to real programs

The technical structuring activity started in late 2006 as the Research Advisory Group was assembled. It continued through the initial stages of data gathering, to allow modification of the technical concepts and data structures based on initial data.

Technical correlations to be tested. The basic research hypotheses are stated in Section 3.1. The intent of data gathering was to quantify the hypotheses in three dimensions:
- Program success, measured in cost compliance, schedule compliance, overall success, and technical quality.

- Systems engineering effort, measured in effort costs against the program total cost, in each of eight categories.
- Program characterization values (size, complexity, quality) that parameterize the expected correlation of systems engineering effort with program success.

The primary correlations included the following:
- Percent total SE effort against program success in each of cost, schedule, overall success, and technical quality (4 correlations).
- Percent subordinate SE activity effort (8 activities) against program success in each of cost, schedule, overall success, and technical quality (32 correlations).

Each of the 36 correlations was tested against the program characterization parameters to determine whether the correlations improve when adjusted by those parameters. Possible characterization parameters to be tested were drawn from the work on COSYSMO (Valerdi 2004) and the experience base of the Research Advisory Group.

Data structures. Based on the technical correlations to be tested, the researcher guided and coordinated the efforts of the Research Advisory Group to define an effective set of data to be gathered. Section 4.1 describes the technical structuring of the data, starting with the research questions and continuing into the data necessary to support them. The purpose of this task was to define data structures that could reasonably be obtained during an interview and that capture the information necessary to explore the desired technical correlations.

Access to programs. It was the responsibility of the researcher to identify programs and to negotiate access to those programs through key management individuals. Section 4.2 describes the approach used to obtain access to programs.
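The 4 + 32 = 36 correlation count above can be enumerated directly. A quick sketch; the success-measure labels follow the text, while the generic per-activity numbering is illustrative:

```python
from itertools import product

# Enumerate the correlation matrix described above: (total SE% plus eight
# per-activity SE% measures) crossed with four program success measures.
success_measures = ["cost", "schedule", "overall success", "technical quality"]
se_measures = ["total SE%"] + [f"SE activity {i} %" for i in range(1, 9)]

correlations = list(product(se_measures, success_measures))
print(len(correlations))  # (1 + 8) * 4 = 36 correlations
```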
As a result of the technical structuring work, some members of the Research Advisory Group gained sufficient interest in the research to assist in providing access within their organizations. Other program interviews occurred through the direct contacts and efforts of the researcher. The research required a sufficient number of programs to support the statistical correlations desired in the technical structure. The greatest challenge of the SE-ROI research, as it was for prior projects including COSYSMO, was to obtain data from a

sufficient number of programs. To this end, the researcher made frequent contacts with industry and government individuals seeking access to the necessary program data.

3.2.3 Data gathering

The data gathering activity obtained data from programs in accordance with the data structures defined in the technical structuring activity. See Section 4.5 for the interview methods used in data gathering and Section 4.6 for the demographic description of the programs interviewed. This activity started in March 2007, on the initial completion of the technical structuring activity, and continued until there was sufficient data for completion, in September. Obtaining data from programs was time consuming. Although individual interviews were short in research terms, the political work to obtain the interviews took many months and considerable personal contact.

3.2.4 Data analysis

The data analysis activity used the gathered data to seek correlations that support the hypotheses, using statistical methods as described in Chapter 5. As each set of data was obtained, the statistical analysis was extended based on the quality and quantity of the total data set. Initially, there was insufficient data to reliably support any correlation. With a few data sets, high-level correlations were attempted. As the number of data sets increased, more correlations were attempted in accordance with design-of-experiments methods. The interim results were provided to participating organizations. This activity started when a few data sets were obtained in September 2007 and completed when the statistical correlations were sufficient to support or reject the primary hypotheses, in March. Further data analysis continued to extend the results into secondary hypotheses and SE cost estimation methods, completing in November.

3.2.5 Reporting

The reporting activity included the generation of interim and final technical reports, in the forms of:
- A public website with summary information.
- An organization of SE practices that was vetted by the Research Advisory Group, published as an interim technical paper (see Appendix C.1).

- Interim analysis results prepared as internal data and distributed to the Research Advisory Group.
- Benchmark reports prepared as written reports to each participating organization. The reports included specific data from the organization's interviewed programs, compared with aggregate data from the research as a whole.
- Interim technical conference papers disseminating various aspects of the research as it progressed (see Appendix C).
- Final results in the form of this thesis to UniSA.
- Final results offered for publication as journal-level technical papers.

The reporting activity started from the beginning of the project in February 2006 and continued throughout the research.

3.3 Guidance and participation

The SE-ROI research required information that could only be obtained from system development organizations, and the collection of such data has proven difficult in past research. To ensure a high level of acceptance of the research results, the project sought guidance and participation from an unusually rich selection of senior people: a Research Advisory Group, numerous staff in the Defence and Systems Institute at UniSA, and senior representatives of the participating organizations.

3.3.1 Research advisory group

A Research Advisory Group was created to participate in developing the research methods and interview instruments. The Research Advisory Group:
- Provided general acceptance of the data organization,
- Built public interest in the research and its expected results, and
- Provided access to real programs in the group's parent organizations.

The Research Advisory Group comprised volunteer individuals who expressed an interest in the research and a willingness to participate in its development. The formation of this group followed the successful methods used on the COSYSMO project and documented by Valerdi (2004).
The Research Advisory Group used virtual collaboration methods (reflector, web-based repository, web-enabled

presentations, teleconferencing) coupled with face-to-face working meetings (in conjunction with conferences). The group assisted the researcher to do the following:
- Create a high-level ontological structure for SE practices to act as a basis for the research hypotheses and data.
- Create the structure and format of data gathering to support the intended hypotheses.
- Facilitate access to data from programs within their parent organizations.
- Review and discuss interim results to provide consensus guidance to the researcher. (Interim results were carefully protected to guard the security of the source data; members of the advisory group came from many competing organizations.)

The Research Advisory Group grew and changed in membership during the research. In its final form, the group included 66 representatives from 59 organizations: 31 system development companies, 9 government agencies, 10 universities, and 9 analysis or independent organizations. The organizations were based in 10 different countries.

3.3.2 Participating organizations

Sixteen different system development organizations participated in the research by providing access to their programs in the form of formal interview sessions. See Section 4.6 for statistical demographic information on the participating organizations and their projects.

3.4 Ethics considerations

The SE-ROI research involved extensive interviews with key leadership individuals for the target programs in the participating organizations. Several ethics considerations were necessary to ensure that participants engaged freely, with full knowledge of the intent of the research program, and that the confidentiality of all data was respected.

Organization selection. By the very nature of the research, organizations participated voluntarily. A large part of the success of this research was due to the researcher's reputation, which encouraged a representative number of organizations to take part.
To safeguard that reputation and to meet university ethics requirements, the comprehensive protocols described in the UniSA Ethics Approval were adhered to throughout the

research program, and the safeguards described therein were enacted to ensure the long-term protection of proprietary or sensitive data.

Program selection. The participating organizations selected within themselves the programs to be interviewed. A senior manager of the organization usually performed this selection, with the concurrence of the leadership individuals to be interviewed. Nonetheless, each interview participant was clearly a volunteer and signed a statement to that effect without coercion. As evidence of the voluntary nature, there were two cases in which an individual/program was offered by the organization for interview but declined during the interview to take part. In these two cases, the programs were not included in the data gathering.

Personal protection. While the data obtained was primarily organizational rather than personal, the interviewees interpreted that data based on their own professional judgment. Interviewees were afforded threefold protection from personal exposure as a result of their responses:
- Interviewees were very senior individuals, the leaders of major system development projects. As such, they are experienced in careful response to potentially damaging questions.
- At least two interviewees were involved in each program interview. The recorded answers were the consensus answers from the set of interviewees in the room, thereby removing individual exposure.
- Names of the interviewees were not recorded with the data, providing a blind separation of the answers from the individuals.

3.5 Observations and findings

The research design as originally envisioned in 2005 for this thesis was essentially sound; no changes were necessary during the research. As noted in Table 2 of Section 3.2, this research was a continuation of a prior research protocol, extending and deepening the results.
As such, the research design was heavily influenced by the prior research and the multi-phase plans created in the earliest phases of the work. Both the prior research and the multi-phase plans contributed to the success of the research design for this thesis.

Additional enhancements added to the research design during the formation of this phase of the SE-ROI research were useful and appropriate. Three enhancements in particular helped significantly:
- The formation of the Research Advisory Group, an idea drawn from the COSYSMO research of Valerdi (2004), contributed materially to the success of the SE-ROI project by providing visible results and access to programs. As reported by Valerdi, this researcher also found the Research Advisory Group to be a highly effective method to create a useful technical structure and to provide access to programs.
- Formal consideration of ethics issues, both for organizations and for the interviewed individuals, protected the project from potentially serious problems.
- Affiliation of the research with UniSA provided a constant impetus for forward movement, something that proved difficult in the prior phases, which were independently funded research projects.

Obtaining data from programs was a time-consuming process. Although individual interviews were short in time, lasting less than two hours each, the political work to obtain the interviews took many months and considerable personal contact.

4 Data gathering

In accordance with the research design, a first major step was to define and gather the appropriate data from real programs. The following basic types of data were needed to address the research questions stated in Section 3.1, as noted in Section 3.2.2:
- Program success, measured in cost, schedule, overall success, and technical quality.
- Systems engineering effort, measured in effort costs against the program total cost, in each of eight categories.
- Program characterization values (size, complexity, quality) that parameterize the expected correlation of systems engineering effort with program success.

Such data is not usually stored directly in customer or contractor databases. Program databases typically store some equivalent data, but that data is organized in accordance with program, customer, or contractor structures, and interpretation is needed to convert it into a common structure. For these reasons, the only effective method to obtain the required data is through an interview process with the key individuals.

This chapter describes the development of the data set to be gathered, the development of the methods and instruments used in gathering the data, and the actual process of gathering the data. The chapter also includes statistical demographic information on the raw data as gathered. Much of this material has already been published in developmental conference papers that are provided as appendices to this thesis. Where the material has already been published, this chapter refers to the papers and provides only expansion information to explain depth that was inappropriate for the conference papers. In particular, the developmental paper by Honour (2006b) summarized the methods used to gather data for this research; that paper is included here as Appendix C.3. This chapter expands on that summarization.
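The three basic data types above suggest a simple per-program record. A sketch with illustrative field names, not the actual interview instrument described later in this chapter:

```python
from dataclasses import dataclass, field

# One program's data point, mirroring the three data types listed above.
# Field names and example values are illustrative assumptions.
@dataclass
class ProgramRecord:
    cost_compliance: float        # actual / planned cost
    schedule_compliance: float    # actual / planned duration
    overall_success: float        # subjective rating
    technical_quality: float      # performance against KPPs
    se_effort: dict = field(default_factory=dict)  # category -> cost fraction
    size: float = 0.0             # characterization values
    complexity: float = 0.0
    quality: float = 0.0

    def total_se_percent(self) -> float:
        return 100.0 * sum(self.se_effort.values())

rec = ProgramRecord(1.1, 1.05, 0.8, 0.9,
                    se_effort={"requirements engineering": 0.04,
                               "system architecting": 0.03})
print(f"{rec.total_se_percent():.1f}% of program cost spent on SE")
```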

4.1 Data to be gathered

The three basic types of data cited above had to be expanded into actual raw data items that could feasibly be gathered during an interview. A first activity for the research was to identify these raw data items. Throughout this activity, the interim results were vetted with the Research Advisory Group described in Section 3.3 and with the UniSA supervisors to obtain the best consensus list of data items.

4.1.1 Observable attributes in SE management

The well-known attributes in SE management are described in Honour (2002b) as technical size, technical complexity, technical quality, cost, duration, risk, and SE effort.

Each system development program can be viewed as a stochastic process. At the beginning of the program, management choices are made that set the parameters for the stochastic process. Such choices include goals, process definitions, tool applications, personnel assignments, and many more. During the program, many internal and external factors influence the actual outcome. The resulting completed program achieves a set of values for the observable attributes. A set of similar programs would give rise to as-yet-unknown probability distributions for the attributes. All of the observable attributes cited in this section may therefore be viewed as sample values from inter-related stochastic processes.

The first three attributes (technical size, technical complexity, and technical quality) are inherent and largely orthogonal characteristics of the product system as developed by a program. Taken together, they may be considered the technical value of the system.

Technical size is an intuitive but highly elusive quantity that represents the overall size of the system. Some proposed measures of technical size include the number of requirements, the number of development organizations, the number of function points, the number of new-development items, and even (in a twist of cause-and-effect) the overall development cost
(Valerdi 2004). Technical complexity represents another intuitive attribute of the system. Size and complexity are independent characteristics: a system of any given size can be made more difficult by increasing its complexity, where complexity is usually related to the

degree of interaction of the system components. One measure of complexity was explored well by Thomas & Mog (1997) and subsequently validated on a series of NASA programs by Thomas & Mog (1998).

Technical quality is a third intuitive and independent attribute of the system. Quality is measured by comparing the actual resulting product system with the stakeholder desires. Component attributes of quality vary widely and are based on the perceptions of the stakeholders, resulting in what appears to be subjective measurement. One measure of technical quality was proposed by Honour (2001) in the form of value against a pre-agreed Objective Function. This measure was further expanded by Browning & Honour (2005) to consider life-cycle value.

The next three attributes (schedule, cost, risk) are of primary program management interest, representing inherent attributes of the development program.

Program schedule or duration is one of three attributes of the system development that is commonly used for management tracking and control. Duration is well understood, with extensive software tools available for planning and scheduling programs. For this research, the concern is with the overall development duration from concept through validation of first product(s). This duration may include activities such as operational analysis, requirements definition, system design, developmental engineering, prototyping, first article(s) production, verification, and validation.

Program cost is a second attribute of the system development that is also commonly used for management tracking and control. As with duration, program cost is well understood. The scope of program cost for this research, as with duration, is the overall development cost from concept through validation of first product(s).

Risk is a third attribute of the system development. Risk is defined in the literature in many ways.
In its basic form, risk represents variability in the stochastic processes for value, duration, and cost. Risk may exist in technical parameters, in schedule, and in cost. Some current risk definitions focus on cost, with the assumption that technical and schedule risks can be translated to cost (e.g. Langenberg 1999). As an attribute of the overall program, a single value of program risk was proposed by Honour (2001).

One other attribute is controllable by program management in their allocation of resources to a program. It is the primary independent variable in the heuristic relationships explored by this research.

Systems engineering effort (SEE) is the effort expended during the program to perform effective systems engineering tasks. SEE is a primary variable that is selectable and controllable during a system development; other values usually follow from the selection of SEE. SEE must take into account the quality of the work performed, because a group that performs systems engineering tasks poorly provides little benefit to a program, as shown in Mar & Honour (2002). SEE can be expressed as an effective percentage of the total program cost.

It should be noted that SEE is not merely the effort of those in an SE office or charging to an SE account in a program. SE efforts are generally also performed by others contributing to the program, who may be located in other organizations or charging to different accounts.

4.1.2 Heuristic relationships

In Honour (2002b), the author explored the heuristic relationships among the basic SE values by performing two-point end-value analysis of each pair-wise relationship. The heuristic relationships can be seen in that paper.

[Figure 18. Seeking optimum level of SE effort within programs (from Honour 2002). The figure plots value E(V) against SE effort as a percentage of the total project (0-100%), with curves for SE quality = 100% and SE quality = 0% and a typical operating region between them.]

Among the heuristic relationships is a primary heuristic hypothesis for the value of systems engineering, shown in Figure 18. In this graphic, value is the subjective value of the system development program, comprising an unknown function of the observable attributes and measured against the stakeholder perceptions. Value is plotted against the selected level of SEE as a percentage of the total project. The end points are trivial and obvious: (1) if the entire project is used to perform SE activities (i.e. SEE = 100%), then there are no resources to actually produce the system, and value is zero; (2) if no resources are allocated to SE activities (i.e.
SEE = 0%), then a system is produced but it likely has problems in technical quality, cost, or schedule.
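The two endpoint arguments, together with the hypothesized optimum between them (Figure 18), can be captured in a toy curve. The functional form and the 15% peak location are invented for illustration; only the endpoint behavior comes from the reasoning above:

```python
# A toy value curve satisfying the endpoint arguments above: V(0) > 0 (a
# program without SE still achieves some value), V(1) = 0 (all effort on SE,
# so no system is produced), with a maximum in between. The piecewise form
# and the assumed 15% peak location are illustrative, not research results.
def notional_value(see, v0=0.5, peak=0.15):
    """Notional program value vs. SE effort fraction `see` in [0, 1]."""
    efficiency = v0 + (1.0 - v0) * min(see / peak, 1.0)  # ramps up, then flat
    return efficiency * (1.0 - see) / (1.0 - peak)       # scaled so V(peak)=1

assert notional_value(0.0) > 0.0          # some value even with no SE
assert notional_value(1.0) == 0.0         # no system if all effort is SE
assert notional_value(0.15) > notional_value(0.0)
assert notional_value(0.15) > notional_value(0.50)
```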

The thin lines represent two hypothetical cases. The lower thin line represents a case in which resources are allocated to SE activities but are performed so poorly as to have no beneficial effect (i.e. SE quality = 0% of possible quality). In such a case, the SE resources simply take away from the available program resources. The upper thin line represents an ideal case in which all of the allocated SE resources are used at the highest absolute quality.

The thick red line represents the actual achievable value for application of effective SE resources. At low levels of SEE, the SE resources cannot be used at the highest absolute quality because the resources become split across different needs, creating inefficiencies. As SEE increases, the resources can be used in ways that more closely approximate the highest absolute quality. The resulting relationship of value to SEE therefore starts at non-zero (a program without SE can still achieve some value), grows to a maximum, then diminishes to zero at SEE = 100% (all program effort is assigned to SE, so no system is produced).

The rapid upward trend in the curve at lower values of SEE corresponds to the expectation of many systems engineers that greater application of systems engineering improves the value of a program. Most programs appear to operate somewhere within this region, leading to the widespread occurrence of this common expectation.

4.1.3 Systems engineering ontology

The systems engineering discipline employs many different vocabularies, which complicates collaborative work and information sharing. Each vocabulary derives from a specific domain of work (military, automotive, commercial software, etc.) and reflects the paradigms prevalent in that domain. To allow data gathering across domains, it was necessary to develop a common vocabulary that would be meaningful.
This necessity led to the identification of a systems engineering ontology: a description of the shared knowledge across domains. The work presented in Honour & Valerdi (2006) forms an early part of this research and sought to develop the ontology and common vocabulary to be used in the SE-ROI project. This developmental paper is also included herein as Appendix C.1.

In this effort, the author joined with others to review the primary international standards for systems engineering, seeking to identify the commonality that represents the widespread common understanding of the field. The standards reviewed included

ANSI/EIA-632, IEEE-1220, ISO-15288, CMMI™, and MIL-STD-499. The review demonstrated the issue with different vocabularies, as shown in Table 5 of Appendix C.1, in which each standard provides different words and phrases to describe the same activities.

As a result of the ontology work, the SE-ROI project identified eight systems engineering activities evidenced across the standards. These eight activities have been used throughout the research as a first-level breakdown of systems engineering.

Mission/purpose definition (MD). The starting point for the creation of a new system, or the modification of an existing system, is to define the mission or purpose of the new/changed system. This mission is typically described in the language of the system users rather than in technical language (i.e. the range of an airplane rather than the length, drag, wingspan, tank capacity, fuel rate, etc.). It usually also includes a level of quantification, to determine the acceptable values in that same user language.

Requirements engineering (RE). A long-recognized core discipline of systems engineering has been the creation and management of requirements: formal technical statements that define the capabilities, characteristics, or quality factors of a system. Generally referred to as requirements management or requirements engineering, this discipline may include efforts to define, analyze, validate, and manage the requirements. Because these efforts are so widely recognized, they appear in every standard.

System architecting (SA). The design aspect of systems engineering is to define the system in terms of its component elements and their relationships. This category of effort has come to be known as architecting, following the practice of civil engineering in which the structure, aesthetics, and relationships of a building are defined before doing the detailed engineering work to design the components.
In systems engineering, architecting takes the form of diagrams that depict the high-level concept of the system in its environment, the components of the system, and the relation of the components to each other and to the environment. Creation of the system architecture (sometimes called system design) is usually described as a process of generation and evaluation of alternatives. As a part of architecting, systems engineers define the components in terms of allocated requirements

through a process of defining lower-level requirements from the system requirements.

System integration (SI). The next system-level activity occurs following the detailed activity to design, purchase, create, and test the system components. Components may be hardware, software, or processes, and components may be so large or complex as to be treated as systems in their own right. From the higher system view, however, the development of the components is a next-level activity that is not part of the systems engineering of this system. When the components are completed, though, the systems engineering continues with the task of building the components into the envisioned system architecture. This task of system integration typically occurs through a series of assembly and test actions.

Verification and validation (VV). The standards also agree on the inclusion of system-level verification and validation in the scope of systems engineering. Component testing is not part of this activity. Verification is described as the comparison of the system (or the developmental artifacts) with its requirements through the use of examinations, analyses, demonstrations, tests, or other objective evidence. Validation is described as the comparison of the completed system (or the development artifacts such as requirements or architecture) with the intended mission or purpose of the system.

Technical analysis (TA). It is widely accepted in the standards that systems engineering is responsible for system-level technical analysis, particularly as related to assessment of system performance against the requirements. System technical analyses may include functional analysis, predictive analysis, and trade-off analysis, except when inseparable from requirements engineering or system architecting.
The standards also include performance analysis, timing analysis, capacity analysis, quality analysis, trending, sensitivity, failure modes and effects analysis, technical performance measurement, and other similar multi-disciplinary technical evaluations of the system configuration and components.

Scope management (SM). Another aspect of technical management that appears somewhat distinct in the standards is the technical definition and management of acquisition and supply issues. This area of effort applies to the contractual relationships both upward and downward. Upward relationships involve a development contract or internal definition of scope for the entire system development, which usually involves the system requirements.

Downward relationships involve the contracts or internal scope definitions for system components to be developed by others. These relationships are distinct in character from the internal team relationships covered by technical leadership/management.

Technical leadership/management (TM). All standards recognize the inherent need for technical management as a part of systems engineering, an effort required by the size of the engineering teams involved in many system design programs. It was noted earlier that the research of Frank (2000) showed that such management and leadership are widely regarded as attributes of successful systems engineers. The descriptions of these efforts in the standards differ significantly, but can often be interpreted to agree with each other. These activities encompass elements of project planning, technical progress assessment, technical control, team leadership, inter-discipline coordination, providing common language and goals, risk management, and interface management. This is distinguished from program/project management by the focus on technical goals and technical guidance.

Finally, with this ontology in place, the last step is to define systems engineering.

Systems Engineering (SE). An interdisciplinary approach and means to enable the realization of successful systems (INCOSE 1996). For the purposes of the SE-ROI research, SE is considered to be the total effort expended across the eight system-level technical activities above to define and develop a new system. This effort may be expended by systems engineers or by others, including individuals who may be outside the development organization.

The ontology work therefore created a set of definitions for use during the research.
During the actual interviews, despite the wide variation in SE terminology across domains and organizations, these definitions were accepted and understood by the participants.

4.1.4 Data lists

The systems engineering ontology of the previous section provided a scope definition for the research, but it still remained to identify the specific data elements needed for the research. As noted in Section 3.2.2, supporting the primary hypotheses of Section 3.1 requires three basic types of data, leading to three lists of data.

Program success. Measures of success vary depending on the perspective desired. Program managers typically measure success in cost and schedule terms. Quality proponents measure overall success in terms of stakeholder acceptance. Systems engineers measure success in technical terms. All of these perspectives are valid, and this research sought to determine the relationship between SE effort and success in any of these valid measures. Therefore, the decision was made to collect and correlate all of these measures.

The data to be collected was also adapted to the limitations of the interview method. (See Section 4.2 for the considerations and limitations of the method.) In some cases, ideal data elements that might have been available through a detailed document search were not available during interview. In such cases, the data elements that were available were chosen instead.

Cost success was evaluated by comparing the actual cost-at-completion with the budgeted cost, the measure called cost variance by program managers (Kerzner 2006). Measuring cost success therefore requires gathering budgeted cost and actual cost for a consistent scope of work.

Schedule success was evaluated by comparing the actual project duration with the planned schedule, the measure called schedule variance (Kerzner 2006). Measuring schedule success requires gathering planned schedule and actual project duration for a consistent scope of work.

Overall success is a subjective measure of the quality of the entire program (and resulting system) in the collective perception of the stakeholders. Overall success was measured subjectively by having the interview participants estimate the stakeholder satisfaction on a typical subjective scale.

Technical quality was also measured against the stakeholder desires, by comparing the actual technical parameters achieved with the values collectively expected/desired by the stakeholders.
The values that matter to the stakeholders are known as key performance parameters (KPPs). They frequently have known values for threshold (the least performance acceptable to stakeholders) and objective (the highest performance that has any value to stakeholders). Measuring technical quality in this way requires identifying the KPPs, then determining for each KPP the threshold, objective, and actual value achieved. In addition, it is necessary to evaluate the relative weight of each KPP against the others.
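To make the four success measures concrete, the sketch below computes each of them for a hypothetical program. The linear threshold-to-objective scoring, the clamping, and the function names are illustrative assumptions for this example, not the scoring actually used in the interviews.

```python
from dataclasses import dataclass

@dataclass
class KPP:
    weight: float      # relative importance against the other KPPs
    threshold: float   # least performance acceptable to stakeholders
    objective: float   # highest performance that has any value to stakeholders
    actual: float      # value actually achieved

def cost_success(budgeted: float, actual: float) -> float:
    """Cost variance as a fraction of budget (negative = over budget)."""
    return (budgeted - actual) / budgeted

def schedule_success(planned: float, actual: float) -> float:
    """Schedule variance as a fraction of planned duration."""
    return (planned - actual) / planned

def technical_quality(kpps: list[KPP]) -> float:
    """Weighted mean of per-KPP scores: 0.0 at threshold, 1.0 at objective,
    clamped to [0, 1].  The linear scoring is an illustrative assumption."""
    def score(k: KPP) -> float:
        s = (k.actual - k.threshold) / (k.objective - k.threshold)
        return max(0.0, min(1.0, s))
    total_w = sum(k.weight for k in kpps)
    return sum(k.weight * score(k) for k in kpps) / total_w

kpps = [KPP(weight=2.0, threshold=100.0, objective=200.0, actual=150.0),
        KPP(weight=1.0, threshold=0.90, objective=0.99, actual=0.99)]
print(cost_success(10.0, 11.0))   # → -0.1 (10% over budget)
print(technical_quality(kpps))    # → 0.6666666666666666
```

Overall success, being purely subjective, would simply be recorded from the participants' scale rating rather than computed.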

In summary, Table 3 provides a list of the data items that must be gathered from each target program to measure program success.

Table 3. Data items for program success

  Cost Success:       Budgeted cost; Actual cost
  Schedule Success:   Planned duration; Actual duration
  Overall Success:    Subjective success
  Technical Quality:  List of KPPs; KPP relative weights; KPP threshold values; KPP objective values; KPP actual values

Systems engineering effort. The research questions in Section 3.1 seek to explore the amount, types, and quality of systems engineering effort. Based on the ontology of Section 4.1.3, there are eight identified types of SE effort as well as the total SE effort. For each type of effort, the research measured the cost expended performing that effort as a fractional percent of the total program cost. Based on the research in Mar & Honour (2002), it is also necessary to qualify this amount by a subjective evaluation of the quality of the effort. These parameters allow quantifying the desired amount, types, and quality of SE effort. Table 4 summarizes the data items required to measure SE effort.

Table 4. Data items for systems engineering effort

  For each SE type (Total SE; MD Mission Definition; RE Requirements Engineering; SA System Architecting; SI System Integration; VV Verification & Validation; TA Technical Analysis; SM Scope Management; TM Technical Leadership/Management), gather:
    - Actual cost expended
    - Subjective quality of effort
  Also, for the program:
    - Actual program cost

Program characterization values. Finally, it was also expected that there would be many confounding factors in the relationship between SE effort and program success. SE is certainly not the only activity that might affect program success. Other activities such as program management, design engineering, test engineering, and production can also be assumed to have a significant effect on success.
Some of the known major confounding factors are program size, level of integration, proof difficulty, development autonomy, system complexity, team process capability,
and technology risk. (See Section for a full discussion of these factors based on the statistical data.) The variability in the relationship was demonstrated in the prior work (Honour 2004), with recognition that the variability was due to confounding factors similar to these.

To address the confounding factors, the research gathered a set of program characterization values that would quantify these factors and possibly allow correlation improvement. Some characterization values were quantitative, such as total program cost, number of system requirements, and number of system components. Other characterization values were qualitative, such as team understanding of requirements, personnel experience, and technology risk.

At the outset of this research, it was unknown which characterization parameters would actually have which effects. Part of the statistical research was envisioned to determine these effects by evaluating the quantitative change each parameter made in the correlations. Some characterization parameters had been used in prior phases of this research, or in other research work such as COSYSMO. Because of this uncertainty, the research gathered a wide set of possible characterization parameters to permit their statistical evaluation.

COSYSMO parameters were used completely, as defined by Valerdi (2005). These parameters had already proven useful in SE cost estimation, and could be expected to have a significant effect on the correlations. Other parameters were drawn from prior unpublished work by the author and from theoretical relationships perceived during the literature research and ontology work. It should be noted that there was overlap; some of the COSYSMO parameters also addressed the theoretical issues. Table 5 summarizes the list of data items chosen for program characterization. The data items include many items additional to the COSYSMO set.
While the COSYSMO parameters have a proven record behind them, some explanation of the theoretical issues behind the additional data items is appropriate:

Program environment parameters were necessary because COSYSMO did not include them. COSYSMO instead assumes, for cost estimation, that the program is at its beginning. The correlation relationships observed in the SE-ROI research might be significantly different for programs under different funding methods (i.e. amortized vs. contracted programs) or for programs in different life-cycle stages.

Table 5. Data items for program characterization

  Program Environment:
    - Funding method
    - Level of definition at start
    - Life-cycle stage
    - Engineering life-cycle phase at interview

  Quantified Values:
    COSYSMO:
      - Nbr system requirements
      - Nbr system interfaces
      - Nbr system-specific algorithms
      - Nbr operational scenarios
    Other:
      - Nbr system components
      - Nbr formal tests
      - Nbr formal test locations
      - Nbr developing organizations
      - Nbr customer agencies
      - System production quantity
      - CMMi level

  Subjective Values:
    COSYSMO:
      - Requirements understanding
      - Architecture understanding
      - Level of service requirements
      - Migration complexity
      - Technology risk
      - Documentation level
      - Nbr/diversity of installations/platforms
      - Nbr recursive levels
      - Stakeholder team cohesion
      - Personnel/team capability
      - Personnel experience/continuity
      - Process capability
      - Multisite coordination
      - Tool support
    Other:
      - Mission/purpose understanding
      - Requirements volatility
      - Requirements growth
      - Overall system complexity
      - Lead system engineer experience level

It was a necessary checkpoint to know the life-cycle phase at the time of the interview; this item later became not useful because all interviewed programs were at or near completion. The level of definition at the start of a program could vary significantly. This occurs because system development work is often spread across multiple phases, with different organizations performing each phase. It was believed that this parameter might be used to back-fit data into phases that had been performed prior to the development effort being interviewed. This perception proved true, as discussed in Section .

Quantified values were largely an attempt to evaluate the elusive attribute of system size. COSYSMO chose to make this evaluation based on four counted parameters, each of which was stratified into easy, nominal, and hard. The author had been moderately successful in a prior unpublished effort using a different set of quantified parameters based on the number of components and formal tests.
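As an illustration of the stratified size counting just described, the sketch below aggregates COSYSMO-style driver counts into a single size figure. The bin weights are placeholder values chosen for the example, not the calibrated weights from Valerdi (2005).

```python
# Illustrative COSYSMO-style size aggregation: each of the four counted
# drivers is stratified into easy/nominal/difficult bins, and size is a
# weighted sum of the counts.  Weights below are placeholders only.
DRIVERS = ("requirements", "interfaces", "algorithms", "scenarios")
WEIGHTS = {"easy": 0.5, "nominal": 1.0, "difficult": 2.5}  # placeholder values

def size(counts: dict[str, dict[str, int]]) -> float:
    """counts[driver][bin] -> number of items of that driver in that bin."""
    return sum(WEIGHTS[b] * counts.get(d, {}).get(b, 0)
               for d in DRIVERS for b in WEIGHTS)

example = {"requirements": {"easy": 20, "nominal": 50, "difficult": 5},
           "interfaces":   {"nominal": 10}}
print(size(example))  # → 20*0.5 + 50*1.0 + 5*2.5 + 10*1.0 = 82.5
```

The additional quantified parameters in Table 5 (components, formal tests, organizations, agencies) would extend the same counting scheme rather than replace it.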
The complexity measurement work of Thomas & Mog (1998) also
had some success using the number of developing organizations and number of customer agencies. These additional parameters were added from Thomas & Mog and from the prior unpublished work. It was also deemed of sufficient interest to explore whether initial production quantity (during development) or CMMi maturity level affected the SE relationships, so these parameters were added.

The ontology work showed a significant difference between mission/purpose definition and requirements engineering. While COSYSMO included requirements and architecture understanding, it did not include mission/purpose understanding. Requirements volatility and requirements growth were deemed to be checkpoints on requirements understanding. The analysis of Section eventually shows that these parameters had a high correlation with each other and were duplicative. While COSYSMO had a subjective measure for migration complexity (how difficult it is to move from a prior system version to this system version), there was no measure for overall system complexity, so a new data item was added to address this. Finally, the author believed that the impact of the lead system engineer's experience on a program might be significant in and of itself, so this parameter was also added.

4.1.5 Design of experiments variation

The statistical research used interview data to explore multiple hypotheses at once. Each interview provided data elements for the multiple hypotheses, with many data elements common to the different hypotheses. This type of statistical research is similar to the realm of design of experiments (DOE), in which an experiment is controlled for variation in the multiple hypotheses. While the SE-ROI research was not specifically controlling the elements of experimental research, it was desirable to influence the selection of programs to achieve the necessary variation.
Such variation can then allow separation of the different factors that contribute to program success. Therefore, validity of the statistical results requires variation in the programs' size, quality, systems engineering effort, and characteristics parameters. To that end, research advisory group members were approached to provide that variation. In particular, programs were sought with variation in:

- Program total cost as small as $1M and as large as $1000M
- Program success levels from failure to overwhelmingly exceeding expectations

- Program systems engineering levels from zero to over 20%
- Program systems engineering quality from poor to world class

It can be seen in Section 4.6 that all of these variations are met by the data as gathered.

Developmental analysis of the nature of this research was published in Honour (2007), included herein as Appendix C.4. The paper discusses the considerations and limitations of a DOE approach. While this research is not DOE per se, the following DOE issues were worthy of consideration in the research statistical work.

Variable result phenomena. The four success measures had wide variation across the interviewed programs.

Randomization of source factors. By obtaining interviews from a wide variety of programs from different organizations and domains, the source factors were randomized.

Replication. Each program is a replication of the experiment in which a system is developed using some form of systems engineering.

Blocking of confounding factors. It was not possible to block the confounding factors because there was no a priori control over each experiment. Instead, this research handled the confounding factors by statistical calculation of the program characteristics as noted in Section .

Orthogonality of the source factors. Orthogonality was handled by statistical analysis of the factors using Principal Component Analysis, as described in Section .

4.1.6 Sample size

The number of programs needed for the research was determined by a calculation of sample size using the Student's-t distribution, based on the statistical parameters of (a) acceptable rates for Type-I and Type-II errors, (b) sample variation in the data obtained, and (c) acceptable variation in the calculated mean, as in Desu & Raghavarao (1990):

  N ≥ s²(t_{α/2} + t_β)² / Δ²        (Equation 2)

where N is the required sample size,
  t is the Student's-t distribution statistic,
  α is the probability of rejecting the null hypothesis wrongly (Type-I error, false alarm rate),
  β is the probability of accepting the null hypothesis wrongly (Type-II error, miss rate),
  s is the sample variation in the data obtained, and
  Δ is the acceptable variation in the calculated mean.

For values of α = 0.05, β = 0.10, and Δ = 0.5s, the minimum sample size calculates as 43 programs. The error rates and variations are further reduced as the number of programs rises above this number. Further statistical work on sample size is in Section , showing that the data actually obtained provides highly significant results.

4.2 Interview design

Structured interviews. The SE-ROI project chose to use structured interviews to obtain data because this method was believed to be the most cost-effective path to obtain reasonably accurate source data. This decision was based on experience with four prior methods that all presented drawbacks to validity.

An unpublished, proprietary work by the researcher in 1994 attempted to gather similar data within one company by requiring extra entries on employee time cards. After over a year of using the extra entries, the data was still insufficient. This showed the difficulty of obtaining exact data through direct measurement.

In another unpublished work, the researcher obtained a database of program information from one government agency. In review of the data, however, it became clear that the structure under which data was recorded was different for each program, and that the structures were usually incompatible with the desired research objectives.

The prior work in Value of SE (Honour 2004) used informal, anonymous surveys. As a result, there was no means to determine the validity of the submitted data. It was likely that the pool of respondents had perceptive bias, and it was impossible to know whether the respondents understood the questions being asked. While the
data seemed to correlate well, questionable internal validity was always suspected. The original paper discussed other known limitations.

The Systems Engineering Effectiveness work (Elm 2008) used much more extensive surveys to obtain similar data for correlation analysis. The investigators did extensive work to invite surveys from appropriate individuals and companies, controlling access to the survey web site to ensure that only the solicited individuals would be likely to submit responses. Nonetheless, the source and validity of the surveys were unknown. In addition, as in the Value of SE work, it was impossible to know whether respondents understood the questions.

All of this experience indicated that a structured interview process would likely produce better data. Through such an interview process, the researcher and respondents could work together to translate the individual program data into the project data items previously described.

Interview participants. Validity of this process required both the researcher and the respondents to have sufficient experience and knowledge to accurately perform the translation. The researcher had sufficient seniority, with extensive program management and systems engineering experience, to be capable of probing beyond the initial responses to get at the true data. Standardization of the data also required interview participants with sufficient seniority to understand the systems engineering and program management aspects of program success. Therefore, respondents were chosen to be the primary program and technical leaders, usually titled the Program Manager and Lead Systems Engineer or equivalent.

Interview timing and methods. The planned form of data gathering was to use one day in a sponsoring organization to obtain data from two to four programs. Each interview would last 1.5 to 2 hours. The interview was structured around the data sheets, with the intent to obtain a full set of data at one sitting.
Data was to be obtained to the best level available during the interview.

Interview flow. Consideration was given to the mental processes necessary for participants to provide effective data as accurately as possible. The overall interview flow was selected to allow participants to work first through the most familiar data, then move to the less familiar data. By doing so, participants could learn from the
researcher the definitions used in the research and thereby provide better answers for the later, less familiar data. Therefore, the interview flow took the following steps:

Pre-interview. All interview forms were provided at least a week prior to the interview, to allow participants to familiarize themselves with the material. It was strongly suggested that participants not attempt to fill out the interview data prior to the session, to ensure the appropriate understanding of definitions.

Introduction. The researcher provided an introduction to the research and its objectives. Participants were given full information and trusted to provide program data with knowledge of the research purpose. This period also worked to develop trust in the researcher, by incorporating the signing of consent forms by each participant in accordance with the approved UniSA Ethics Approval. SE-ROI Program Information forms and Participant Consent forms are shown in Appendix B.

Program description. The participants were asked to describe the program to be used for interview.

Program scope definition. Working together, the researcher and participants defined a consistent program scope to be used throughout the interview. The scope was defined by an actual starting point, an actual ending point, and the technical features of the completed system.

Program characterization parameters. Again working together, the researcher and participants decided on appropriate program characterization parameters. During this period, the researcher's role was to standardize definitions across all interviewed programs, while the participants' role was to apply those definitions to their program. Program characterization parameters were elicited first, because (a) participants would be most familiar with them, and (b) they helped to solidify the scope definition.

Program success parameters.
In a similar way, the researcher and participants defined the program success parameters. Eliciting the success parameters was done at this early stage of the interview, again because it was something with which participants could be expected to be very familiar.

Systems engineering effort. Finally, the researcher and participants worked together to define the systems engineering effort. The SE definitions would be least familiar to the participants due to the wide variation in definitions across domains.

Interview instruments. With an understanding of the desired data lists and the desired flow of the interviews, the author next created the interview instrument ("data sheet") to be used during the interviews. The interview instruments are shown in Appendix B. The instruments strictly followed the desired flow, incorporating each of the data items from the data lists.

Because obtaining this type of interview was so difficult in prior work, it was decided to include space to record qualitative information that might help in understanding the data items during statistical evaluation. Qualitative information included other success measures that might be used on programs (interview sheet p.4), methods used in the SE activities (interview pp.5-8), tools used in the SE activities (interview pp.5-8), metrics used to evaluate SE activities (interview pp.5-8), and lessons learned from the program (interview p.9).

The research supervisors and the Research Advisory Group reviewed the draft interview instruments. The author modified the instruments based on a series of comments and suggestions by the reviewers, improving the flow and the items. Both major and minor changes were made in response to the comments. One significant addition during this time was five pages of definitions with operable hotlinks in the softcopy form.

One frequent comment during the review was concern that the interview data could not be collected within the desired time frame. Therefore, the forms were tested in several mock interviews with the research supervisors and others. During the mock interviews, the mock participants were asked to select a system development program with which they were familiar, and then to provide answers as if they were the leader of that program. In at least one case, the mock participant chose to be difficult, to purposefully extend the interview time. In all cases, the mock interviews completed in 75 to 90 minutes.
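Before moving to data access, the sample-size result of Section 4.1.6 can be cross-checked numerically. The sketch below evaluates Equation 2 using the large-sample normal approximation to the Student's-t statistics; for the parameter values quoted in the text, this reproduces the 43-program minimum.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(alpha: float, beta: float, delta_in_sd: float) -> int:
    """Equation 2: N >= s^2 (t_{alpha/2} + t_beta)^2 / delta^2, with the
    t statistics approximated by standard normal quantiles (large sample).
    delta_in_sd is the acceptable variation delta expressed in units of s."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type-I quantile
    z_b = NormalDist().inv_cdf(1 - beta)       # Type-II quantile
    return ceil(((z_a + z_b) / delta_in_sd) ** 2)

# alpha = 0.05, beta = 0.10, delta = 0.5s, as in Section 4.1.6
print(min_sample_size(alpha=0.05, beta=0.10, delta_in_sd=0.5))  # → 43
```

An exact calculation would iterate, substituting the t quantile at N − 1 degrees of freedom until N stabilizes; the normal approximation shown here is adequate at this sample size.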
4.3 Approach to obtain data

Throughout the prior research in this topic, the largest difficulty had been gaining access to program data. Organizations have a negative incentive to allow such access, because it involves opening their successes and failures to the scrutiny of people outside their control. In prior work, the researcher had some limited success in gaining
such access within some organizations. (Examples included four systems companies and two NASA organizations. Others had also indicated willingness.) The methods that appear to work involve creating a personal relationship of trust, offering a tangible benefit to the organizations, and ensuring strong data protection by following the data protection protocols recorded in the UniSA Ethics Approval and through proprietary data agreements between the researcher and the interviewed organization.

The personal relationship of trust was built through the use of the Research Advisory Group (RAG) described in Section . During the technical structuring activities (research planning, systems engineering ontology, and interview forms development), the researcher involved the interested individuals by providing frequent e-mailings of status information and by soliciting advice and help from the RAG. As the time for interviews neared, the same contacts were used to request organizations to offer themselves for interview. Because of the trust developed during the technical structuring, many of the RAG individuals became advocates for their organization to take part.

Tangible benefits were offered to organizations that made their programs available for interview and analysis. The primary incentive offered was early access to the research results in the form of benchmark reports that compared the specific programs against the aggregate gathered data. Throughout the research, these reports were issued on a regular basis to keep the information flowing. An additional, minor incentive was the willingness to take part in research that could frame the SE discipline.

The SE-ROI research used the following practices to obtain data while mitigating the risk to contributing organizations:

- Making personal contact with key individuals, working upward through the organizational structure to find a manager who could authorize the access.
The researcher's personal international reputation in the field contributed significantly to success in these contacts.

- Using the interest and successes of the RAG to facilitate access within their parent organizations. This method was successful for the COSYSMO project and was repeated for SE-ROI.

- Offering benchmark data to the source organizations for their early use prior to publication of the research final reports. Source organizations thereby gained a competitive advantage over organizations that do not have this data. It is noted that many companies spend considerable resources obtaining such benchmark data and consider it of high value.

- Offering strong proprietary data agreements coupled with strong data protection methods to ensure that source data is not publicly available. (See Section 4.4.)

- Gaining public consensus for the SE-ROI research and its methods by frequent publication of developmental results, following on the widely accepted earlier results from Honour (2004).

During the search for interviews, the researcher maintained a database of possible contacts in Excel format. Individuals were placed in the database upon expressing any form of interest in the project and then categorized as shown in Table 6. Each entry contained basic contact data for the individual, as well as free-form notes on the history of that contact and any actions needed. On a weekly basis, the researcher reviewed the contacts database for required actions to advance the project. Obtaining access to a sufficient number of organizations required time. Contacts were pursued through many paths during the time frame , with resulting interviews occurring during .

Table 6.
Contacts database categories

  Label       Meaning
  A-iviews    Interview already held, with notes for follow-up actions
  B-sched     Interview scheduled, with notes for preparatory actions
  C-ready     Organization ready to be scheduled
  D-poss      Organization possibly willing to allow interviews, with notes for further actions necessary
  E-help      Individual might be able to provide help in obtaining interviews
  R-review    Individual can assist in reviews of work and data
  Z-unlikely  Individual interested, but unlikely to provide assistance

4.4 Data protection

The data types obtained from programs were highly proprietary to the parent organizations, including key business parameters of technical success, cost, schedule
and risk. Therefore, strong protections have been and are still used on the raw data. The principal investigator maintains all interview data in accordance with proprietary data agreements with the participating organizations and the UniSA Ethics Approval. No one other than the principal investigator sees the raw interview data. Specifically, only aggregated data was provided to the RAG, because that group included participants from various, possibly competing organizations. In addition, the benchmark reports to participating organizations included only aggregated data from other organizations.

Prior to any data gathering at an organization, the researcher executed a proprietary data agreement with the organization. A standard form of the agreement was developed as part of the SE-ROI research structuring, drawing on the proprietary data agreements used by the researcher's company. However, most participating organizations required the use of their own forms. Regardless of the specific form, essential terms of the agreement allowed sufficient access to program data while ensuring that the data is not released in any way that provides attribution of the data to the source organization. The form of each agreement also allowed subsequent use of the data by the researcher as needed to develop aggregated data, and did not restrict the use of such aggregated data.

Actual practice of the SE-ROI research used the following protections to secure the data:

- Data sheets were identified only with a blind randomized code. No data was recorded on the sheets that identified the organization or the people involved.

- The key that links the blind codes to the actual organizations and programs was maintained in a single hard copy record. Only the researcher has access to the key.

- Raw interview data, even though tagged only with the blind code, is limited to the researcher.
This data was specifically not provided to the Research Advisory Group, as stated above.

- Aggregated data resulting from fewer than five source interviews was also limited to the researcher. The same practice was applied to aggregated data from fewer than three source organizations. This practice was intended to prevent inference of organizational data from the aggregated data.

- Aggregated data from one source organization may only be included in the benchmark reports provided to that source organization.

- Benchmark reports to each organization were password-protected. Password keys to the reports were created by the researcher and provided only to the specific organization.

From the time of the first interview through to the production of this thesis, there have been no breaches of the proprietary data.

4.5 Interview methods

The design of the interview instruments described in Section 4.2 defined the flow and methods to be used during the interviews. During the actual interview work, additional methods were applied.

Defining program bounds. For compatibility of the data, it was desirable to have programs that were similar in bounds. The ideal program for the research was a system development effort that:

- Started with creating the mission definition,
- Ended with completion of a single first working system, and
- Had no scope, cost, or schedule perturbations during the development.

Such an ideal program largely does not exist. The nature of system development can result in highly varied program structures. Some programs work through several developmental phases, with different teams working sequentially on the phases. The main development effort therefore might have been preceded by other phases that performed the early mission definition, requirements engineering, and/or architectural design. Often, programs change during development, with modifications to scope, cost, and schedule. Some programs create multiple prototype systems as part of the initial development.

It was therefore necessary to define the bounds of each interviewed program as part of the interview. At the beginning of each interview, the researcher helped the participants to establish a starting point, ending point, and scope of work that represented the program to be interviewed. In every interviewed program, the ending point was completion of working prototype system(s). The starting point was recorded in the data as level of definition at start.
For changes in scope, the participants were asked throughout the interview to interpret the actual program data in light of the agreed scope. Where the program made significant scope changes, the participants

evaluated the impact of those scope changes on earlier parts of the program to determine values that would have been true for the agreed scope. It was noted later during interviews that this approach to program scope definition was effective in controlling variability due to scope changes.

Interview participants. As noted earlier, the researcher planned to interview primary program and technical leaders, usually titled the Program Manager and Lead Systems Engineer or equivalent. However, others sometimes also took part and added to the knowledge. During the actual interviews, the selected individuals were always able to provide the desired data. There were always multiple respondents present during each interview, so the answers were largely created by consensus among the respondents. The interviewer's role was to provide an understanding of the definitions being used in the research; the respondents' role was to apply those definitions to the data and understanding they had of their own program. Where there were differences among the respondents, a simple Delphi process was applied to achieve consensus, similar to that described in Brown (1968).

Definitions. Throughout each interview, it was frequently useful to refer to standardized SE-ROI definitions for many of the terms being used. As noted in Section 4.1.3, definitions in the SE discipline vary widely across domains and organizations, so the standardized definitions were essential. The definitions were therefore included on the interview instrument for easy reference.

Data sources. Data was to be obtained to the best level available during the interview. In some cases, data may have been directly available from the program records that were brought to the interview by participants. In other cases, the key individuals may have interpreted data from the program records. In still other cases, data may have relied on the memory of the key individuals.
The specific method used for each data item depended on the form of records available within the project. The SE effort levels presented the greatest difficulty. In most cases, programs did not have actual data recorded at the level of detail desired. Therefore, the researcher probed the participants to identify the number of people involved in each SE activity, the durations when those people were active (considering part-time efforts as well), and the labor cost level associated with those people. The researcher helped to interpret

actual work efforts into the SE-ROI activity definitions. As a final check during each interview, the researcher added up the totals of the eight SE activities into total SE and allowed the participants to vet the total against their recorded program data and/or memory of the program.

Recording. During the interviews, the researcher recorded all data onto the interview data sheets.

4.6 Demographics

Through the data gathering effort, interviews were performed with 51 different programs from within 16 organizations. Three interviews were either terminated early or did not provide full information, leaving a set of data from 48 programs for statistical analysis. This section provides a summary of the distributions of source populations and the groups within them. This information was published partway through the data gathering as a developmental paper (Honour 2009), which is included in Appendix C.5. The information in this section provides the final demographic results.

Sources of data

SE-ROI interviews started in 2007 and continued into The data used in this research comes from two different data sets, obtained using different methods. Value of Systems Engineering ("ValueSE") data includes 44 program data points obtained during as a part of the prior project (Honour 2004). This data was obtained through voluntary, anonymous surveys using a simple data sheet. SE-ROI data includes 48 program data points obtained during as a part of the SE-ROI project. This data was obtained using interviews guided by the interview instrument designed for SE-ROI. Table 7 displays the primary demographics of the data, including funding methods, cost and schedule compliance, and systems engineering content.

Protection of the proprietary data prevents listing directly the organizations and programs that were interviewed in the SE-ROI research. The data sources can be identified generally by type or domain, including the following:
- Environment control systems development

- Navy shipbuilding programs
- Security control systems development
- Shipboard control systems development
- Avionics systems development
- Aircraft power systems development
- Army ground systems development
- Military intelligence systems development
- Commercial production control systems development
- Operational training devices systems development
- Communications systems development
- Space systems development

Interviews were performed in the United States, Australia, and Israel.

Table 7. Basic demographic data

Characteristic | ValueSE Data Set | SE-ROI Data Set
Number of organizations | Unknown | 16
Number of data points | 44 | 48
Funding method | Unknown | 39 contracted, 9 amortized
Program total cost | $1.1M - $5.6B, Median $42.5M | $600K - $1.8B, Median $14.4M
Cost compliance | (0.8):1 - (3.0):1, Median (1.2):1 | (0.6):1 - (10):1 [2], Median (1.0):1
Development schedule | 2.8 mo. - 144 mo., Median 43 mo. | 2 mo. - 120 mo., Median 35 mo.
Schedule compliance | (0.8):1 - (4.0):1, Median (1.2):1 | (0.3):1 - (2.5):1, Median (1.1):1
Percent of program used in systems engineering effort, by cost | 0.1% - 27%, Median 5.8% | 0.1% - 80% [3], Median 17.4%
Subjective assessment of systems engineering quality (scale of 1 poor to 10 world class) | Values of 1 to 10, Median 5 | Values of 1 to 10, Median 7

[2] (Table 7, Cost compliance) This outlier program had a highly excessive cost overrun, likely due to poor estimation of effort. The next largest cost overrun is 3:1.
[3] (Table 7, Percent systems engineering effort) There were three outlier points with very large SE content at 80%, 51%, and 46%. All other programs had SE content at about 30% or less. All three projects were systems whose component design was relatively simple, so that the systems engineering activities extended well into what would normally be component architecting and design.
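As an illustration of how compliance figures like those in Table 7 might be computed, the sketch below assumes that compliance is the ratio of actual to planned value, so that a (1.2):1 cost compliance means a 20% overrun. The per-program records here are hypothetical stand-ins, not the thesis data.

```python
from statistics import median

# Hypothetical per-program records: (actual cost, planned cost) in $M.
programs = [(12.0, 10.0), (9.0, 10.0), (30.0, 25.0), (14.4, 14.4), (50.0, 40.0)]

# Cost compliance expressed as (actual):(planned); 1.0 means on budget.
compliance = [actual / planned for actual, planned in programs]

median_compliance = median(compliance)  # a (1.2):1 median means 20% overrun
```

The same calculation applies to schedule compliance, substituting actual and planned durations.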

Limitations due to data source. While the gathered data was broadly based, there are nonetheless limitations to the conclusions in this research due to data sources not covered. There was a preponderance (although not a totality) of contracted systems developments and of military systems development; the results herein may be less applicable to amortized, commercial systems development. The interviewed programs all came from Western-oriented cultures; the results may be less applicable to other cultures. The total number of interviewed programs is more than sufficient to meet the desired sample sizes (48 programs vs. the required 43), but does not provide sufficient quantity to allow segmentation of the data into sub-groups for more detailed analysis.

Validation of the combined data

Using two different data sets raises the question of the compatibility of those data sets. Compatibility was checked by comparing the primary results of the two data sets: correlation of systems engineering effort with program cost and schedule compliance. Figure 19 and Figure 20 update the widely used graphics of Honour (2004) with the inclusion of the SE-ROI data. In each figure, the small red symbols represent the prior Value of Systems Engineering data, while the larger blue symbols represent the SE-ROI interview-based data. The figures show quadratic trend lines for each separate data set and for the combined set of all data. The similarity of the trend lines is striking.

To test for the compatibility of the data, the two sets of samples were compared with a Kolmogorov-Smirnov test for comparison of the distributions (Kolmogorov 1941; Smirnov 1939). In each case, the distribution of the sample set about the common polynomial was compared. (See Section for the definition of the common polynomial.) The two-sample, two-tailed test provides a p-value of against α = 0.05, indicating that the two distributions should be accepted as the same.
(The risk of an error in this decision is only 22%.) This similarity of the distributions is significant. The two data sets were obtained using different methods from different populations, and yet the distributions are similar enough to be confidently treated as the same. There appears to be an underlying process at work.
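A two-sample Kolmogorov-Smirnov comparison of the kind described can be sketched in pure Python. This sketch uses the asymptotic p-value approximation popularized by Numerical Recipes rather than whatever exact method the thesis software used, and the samples are synthetic stand-ins for the residuals about the common polynomial.

```python
import math

def ks_2samp(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov test: returns the statistic D and an
    asymptotic two-tailed p-value (Numerical Recipes approximation)."""
    a, b = sorted(sample1), sorted(sample2)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples, tracking the largest gap between the
    # two empirical cumulative distribution functions.
    while i < n1 and j < n2:
        if a[i] <= b[j]:
            i += 1
        else:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 1e-3:
        return d, 1.0  # identical samples: no evidence of difference
    # Kolmogorov distribution tail sum, truncated after 100 terms.
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, max(0.0, min(1.0, p))
```

A large p-value (well above 0.05) is the "accept as the same" outcome described in the text; a small p-value indicates the two samples come from different distributions.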

Figure 19. Cost overrun vs. systems engineering effort

Figure 20. Schedule overrun vs. systems engineering effort

Yet there is a difference in the data sets. Although the data appears to come from a similar distribution, it is apparent in Figure 19 and Figure 20 that the two data sets come from different portions of the distributions. The Value of SE data includes points with a significantly lower value of SE Effort ( = 5.4% of total program cost), while the SE-ROI data includes points with larger SE Effort ( = 15.5% of total program cost). There is no certain reason for the difference, but one possible reason is the form of data gathering. Because the Value of SE data was reported on anonymous surveys,

it is more likely to include programs that did poorly in their application of SE. Likewise, it was observed during the SE-ROI data gathering that organizations were more inclined to provide their better programs for interviews; in fact, the researcher often had to encourage organizations to include poorer programs to gain a spread of data.

Source program size characteristics

Within the SE-ROI data, the interviews include a series of questions on the size characteristics of the source programs. Figure 21 provides histograms of several key parameters, demonstrating the variety of programs interviewed. It can be seen that programs varied significantly in start point and in numbers of requirements, interfaces, algorithms, and operational scenarios.

Interviewed programs varied in the starting level of system definition from poorly defined to system architecture defined. No programs reported starting with requirements allocated into the system architecture. The primary mode was to start with performance-based requirements, although there was a reasonably Normal distribution about this mode. This starting level of definition was likely due to the preponderance of programs that were contracted in nature. In a contracted environment, the acquirer performs much of the earlier SE effort in order to create the contractual definition. Because the interview process selected organizations that performed the system development (as opposed to acquirers), this earlier effort was frequently not included in the program scope. Nonetheless, there were sufficient early-definition programs included to allow for a distribution of the earlier effort. See Section for the statistical treatment given to this early-stage effort.

Most programs were in the Development life-cycle stage, which was expected for system development programs. Regardless of life-cycle stage, however, all programs were a development of an identifiable new capability.
Even in Utilization and Support stages, some new system development occurs. There was a wide distribution of the number of requirements, but the mode was between 100 and 500 requirements. One very large shipbuilding program had over 10,000 identified requirements. The number of system-level interfaces was typically fewer than 50, with the exception of three shipbuilding programs that had on the order of 1000 identified interfaces.

Figure 21. Histograms of program size parameters

The number of system-specific algorithms varied widely, from none (for 11 interviewed programs) to greater than 50. There was a significant mode in the range of fewer than three algorithms. The number of operational scenarios was also usually few, with a significant mode in the range of fewer than three scenarios. There were four exceptions (again, shipbuilding programs) with greater than 50 scenarios.

Limitations due to source program size. The results developed in this research are limited to programs from within the populations interviewed. The results may be less applicable to:
- programs in other than the Development life-cycle stage,
- programs that include the entire development within one funded effort, and
- programs with very large or very small numbers of requirements, interfaces, algorithms, or scenarios.

Source program subjective characteristics

Some characteristics of the programs were estimated subjectively rather than quantitatively. Interview participants used an informal Delphi method to agree on the subjective values. Subjective values used a standard scale of VL=Very Low; L=Low; N=Nominal; H=High; VH=Very High.

Figure 22. Team understanding parameters

Team understanding. Several parameters shown in Figure 22 relate to the degree to which the development team understood, during the development, the problem to be solved. The specific questions were estimates of how well the team understood the mission/purpose of the system, the requirements, and the architecture.

It is interesting to note that the interview participants believed their teams generally to have better-than-average understanding of the mission/purpose, requirements, and architecture. In each question, the mode is at High understanding rather than Nominal. Distribution is essentially Normal around this mode.

Problem difficulty. Parameters shown in Figure 23 relate to the difficulty of the problem to be solved, in terms of risk, changeability, and complexity.

Figure 23. Problem difficulty parameters

Requirements volatility and requirements growth were both measures of how much change occurred in the program. As with the team understanding parameters, both of these measures have an off-center mode. System complexity exhibits a bimodal behavior, with modes at both High and Low system complexity. No explanation is available within the data for this behavior. Technology risk has the mode at Nominal, but exhibits an off-center distribution weighted toward higher risk.

Team capability and experience. Parameters shown in Figure 24 relate to the general capability and experience of the development team or key individuals.

Figure 24. Team capability and experience parameters

Stakeholder team cohesion, personnel/team capability, and lead SE experience all demonstrate modes toward the High end. As with prior findings, these seem to indicate a high confidence in the capability of the teams. In contrast, the interviewed team leaders have a Nominal-centered belief in their organizational process capability. It should be noted that nearly half of the interviewed programs were operating within organizations at Capability Maturity Model Integration (CMMI™) levels zero or one.
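Ratings on the VL-VH scale can be tallied to identify the modes that the figures display. A minimal sketch, using hypothetical ratings rather than the thesis data:

```python
from collections import Counter

SCALE = ["VL", "L", "N", "H", "VH"]  # Very Low ... Very High

# Hypothetical per-program ratings for one subjective parameter.
ratings = ["H", "N", "H", "VH", "H", "N", "L", "H", "N", "H"]

counts = Counter(ratings)
# Mode: the scale level with the highest count.
mode = max(SCALE, key=lambda level: counts.get(level, 0))
```

An off-center mode, as reported for the team understanding parameters, is simply a mode at a level other than Nominal.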

Limitations due to subjective program characteristics. The interviewed programs present a suitably wide variation in subjective characteristics, including programs from Very Low to Very High assessment in nearly all categories. While not certain, the observed skews may be representative of the larger population because they match well-known perceptions in the management literature. One large limitation is the self-selection of programs at relatively low CMMI™ levels. The results herein may be less applicable for organizations at higher CMMI™ levels.

Systems engineering effort levels

Of primary interest to this research is the amount of systems engineering effort used in the programs. In the Value of SE project, only the total systems engineering effort was obtained. In the SE-ROI project, the systems engineering effort is also categorized into eight subordinate activities.

Total systems engineering effort. Figure 25 shows a histogram of the systems engineering cost as a percent of the total program cost as reported. Counts are shown for the combined data set (Value of SE and SE-ROI) and for the SE-ROI data alone. It is visually apparent that the SE-ROI data has been obtained from programs that use a greater level of systems engineering effort than in the Value of SE project. As discussed above, this is seen as a characteristic of the interview process, in that participating organizations have incentive to guide interviews to their better programs.

As noted in Honour (2004), however, the raw percent does not represent a good measure of effective systems engineering effort. This is true because the raw percent does not take into account the quality of that effort. Following the previous work, therefore, a primary measure called the Effective SE Effort is calculated by factoring the SE costs proportionately downward based on the respondents' subjective assessment of the SE quality.
(See Section for the mathematical treatment of SE quality.) Figure 26 shows the histograms for the effective SE effort. It is still apparent that the SE-ROI data points represent a greater level of effective SE effort than the Value of SE data.

Figure 25. Total SE cost as a percent of actual program cost [4]

Figure 26. Effective SE effort as a percent of actual program cost

Effort by systems engineering activities. Also of interest is the spread of the SE effort across the eight defined categories of SE activity. Figure 27 shows the spread of program effort in the eight SE activities by cost. (This data is only available for the SE-ROI data points.) For each activity, the figure uses a range bar to show the minimum, median, and maximum levels of effort in the SE-ROI data sets. As with the overall SE effort, subjective assessments were made on each activity independently. The same data is shown in Figure 28, modified by the subjective quality of each activity.

The largest SE activities by cost are verification/validation (VV) and technical leadership/management (TM). The smallest SE activities are scope management (SM) and mission/purpose definition (MD). However, the prior discussion about starting level of system definition indicates that the early-phase activities of mission/purpose

[4] One SE-ROI data point at 80% SE effort is not shown. This outlier program involved a system whose components were so simple that a large part of the program effort was spent in managing the system architecting effort. This point, and the point shown at 51%, were later treated as outliers and removed from the analysis. The point at 46% contributed to the analysis.

definition (MD), requirements engineering (RE), and system architecting (SA) may have been aided by effort that occurred prior to the interviewed program. Again, see Section for an appropriate treatment of these early-phase SE activities.

Figure 27. SE activities effort as a percent of actual program cost

Figure 28. Effective SE activities effort as a percent of actual program cost

Limitations due to SE effort levels. The interviewed programs present a suitably wide variation in SE effort levels, ranging from near-nil to levels on the order of 30% of the total program. (Two programs with greater SE effort were later removed as

outliers.) The distribution of SE activity levels can be interpreted as representative of the larger population, subject to the corrections applied in Section .

4.7 Observations and findings

The research design continued to be essentially sound through the data gathering activities, with no surprises that caused changes to the design. The interview method proved to be an effective means to obtain the desired data, although obtaining interviews was every bit as difficult as expected. The desired data was defined in an appropriate way, and the data items identified were available during the interviews. The SE ontology of eight SE activities was clear to the interview participants, validating its conception and review. It was possible, through the data gathering methods, to obtain a representative set of programs that met the required Design of Experiments variability. The design of the structured interviews supported data gathering within a reasonable time at each organization. The use of the Research Advisory Group assisted materially in obtaining the required interviews. Coupled with a rigorous contacts database, the researcher was able to obtain sufficient interviews for significance of the data. The designed data protection methods were effective. Finally, the simple demographics of the obtained data not only proved the methodology but also provided a series of interesting observations and findings in their own right.

SE ontology. The well-researched SE ontology of terms proved to be effective in standardizing the data obtained across domains and organizations. Even though the interviewed organizations were broadly diverse, respondents were able to understand and use the common terminology derived from a merging of the various SE standards.

Interview planning. The plan for interviews worked well. The interview instrument allowed gathering complete data within the target hours of time in 48 of 51 interviews.
The methodology to define program scope early in the interview was effective in controlling variability due to scope changes. Using program leaders as interview participants was effective in obtaining the desired data, because these individuals were able to interpret their program information into the common language of the research project. Data protection. The data protection methods and other ethics considerations in the UniSA Ethics Approval proved effective. There were no data breaches during the

research. The ethics constraints were accepted as sufficient by the participating organizations, and allowed those organizations to provide interviews.

Commonality of data sets. The earlier Value of SE data and the SE-ROI data obtained through interviews appear to come from a common distribution. This allows the use of the combined data, a total of 92 programs, for comparisons where the data exists. However, it is noted that the two data sets appear to come from different regions of the distribution, with the Value of SE data representing programs with lesser SE effort ( = 5.4% of total program cost) and the SE-ROI data representing programs with higher SE effort ( = 15.5% of total program cost).

Program demographics. The interviewed programs came from a diverse set of organizations and domains, including both commercial and military system developments. While providing a breadth of values in each attribute, it is interesting to note the primary modes for each distribution:
- between 100 and 500 system-level requirements,
- fewer than 50 system-level interfaces,
- fewer than 7 system-specific algorithms, and
- fewer than 20 operational scenarios.

Team leader perceptions of their team. The interviewed team leaders reported a wide variety of perceptions about their team capabilities, but the general trends of the reported data indicate that team leaders believe:
- their team has better than average understanding of the technical problem,
- their programs are more stable than most,
- their programs have higher risk than most,
- their stakeholders are more cohesive than most,
- their personnel and team capabilities are better than most, and
- their lead systems engineer is more experienced than most.

5 Statistical results

The interview instrument was designed to allow a statistical approach toward the determination of SE value. The prior research (Chapter 2) shows anecdotal evidence of the overall value of SE, but is singularly lacking in information about how much is enough. The SE-ROI results in this chapter are based on rigorous use of statistical methods and show a far greater level of detail than has hitherto been available. Throughout this work, the key goal has been to provide management decision-level information about how much and what kind of SE is indicated to optimize a development program. To achieve this end, this project established two primary research questions as discussed in Section 3.1:

(RQ A) Is there a quantifiable correlation between the amount, types and quality of systems engineering efforts used during a program and the success of the program?

(RQ B) For any given program, can an optimum amount, type and quality of systems engineering effort be predicted from the quantified correlations?

This chapter provides the statistical methods and evidence to explore these research questions by testing specific hypotheses. Much of this material has already been published in developmental conference papers that are provided as appendices to this thesis. Where the material has already been published, this chapter refers to the papers and provides only expansion information to explain depth that was inappropriate for the conference papers.
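The hypothesis bookkeeping developed in Section 5.1 (36 RQ-A hypotheses from nine effort types crossed with four success classes, plus nine RQ-B hypotheses) can be sketched as below. The underscore-separated labels are this sketch's rendering of the thesis's subscripted H AXXY / H BXX notation.

```python
from itertools import product

# The nine SE effort types and four success classes defined in Section 5.1.
TYPES = ["SE", "MD", "RE", "SA", "SI", "VV", "TA", "SM", "TM"]
SUCCESS = ["C", "S", "O", "T"]

# 36 RQ-A hypotheses H_AXXY, one per (effort type, success class) pair.
h_a = ["H_A" + xx + y for xx, y in product(TYPES, SUCCESS)]

# Nine RQ-B hypotheses H_BXX, one per effort type.
h_b = ["H_B" + xx for xx in TYPES]
```

Together these give the 45 subsidiary hypotheses the chapter tests; for example, H ASEC is the correlation of total SE effort with cost compliance.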

5.1 Statistical methods

Creation of the hypotheses

The primary research questions of Section 3.1 are compound in form, with several subsidiary clauses. Each subsidiary clause can be formulated as a specific hypothesis to test the individual assertions in them. In RQ A, the compound forms are twofold:

"the amount, types and quality of the systems engineering efforts" - The data design supports nine different types of SE efforts, comprising each of the eight SE activities plus total SE. These nine values can be measured by amount and quality as mathematically described in sections and .

"the success of the program" - The data design supports four different measures of program success, applied to each program:
- Cost compliance
- Schedule compliance
- Overall success
- Technical quality

This leads to a set of 36 hypotheses that support RQ A, all of which are tested, of the form:

(H AXXY) There is a quantifiable correlation between the amount and quality of systems engineering type XX effort used during a program and the class Y success of the program.

where type XX is each of the following nine (see Section for definitions of these activities):

SE is total SE effort
MD is mission/purpose definition effort
RE is requirements engineering effort
SA is system architecting effort
SI is system integration effort
VV is verification/validation effort
TA is technical analysis effort

SM is scope management effort
TM is technical leadership/management effort

and where class Y success is each of the following four:

C is cost compliance
S is schedule compliance
O is overall success
T is technical quality

In RQ B, there is only one compound form:

"an optimum amount, type and quality of systems engineering effort" - The data design supports the same nine different types of SE efforts as in RQ A, comprising each of the eight SE activities plus total SE.

This leads to a set of nine hypotheses that support RQ B, all of which are tested, of the form:

(H BXX) For any given program, an optimum amount of highest-quality systems engineering type XX effort can be predicted from the quantified correlations.

Testing a total of 45 subsidiary hypotheses, 36 H AXXY plus 9 H BXX, will thus endeavor to answer the two primary research questions.

Use of program characteristics. In addition to the measures of SE amount and quality and the measures of program success, the SE-ROI research also gathered extensive program characteristics. As noted in Section 4.1.4, the purpose of the program characteristics is to handle the expected extensive confounding variables. If the program characteristics have been properly chosen and are properly used, it is expected that they can stand in for the many confounding variables. For RQ A, indications from prior research (Honour 2004) are that the primary correlations exist without any need for consideration of the confounding variables. It will be seen, however, that their use significantly improves the correlation. For RQ B, the phrase "for any given program" requires the use of program characteristics. To reject the null hypothesis, it is necessary to show that some set of

program characteristics allows prediction of an optimum level of SE activity for that program.

Null hypotheses

Using classical statistical methods, one may choose to accept the hypotheses by finding evidence to reject the null hypotheses. In this case, the hypotheses may be reworded into the opposing general form to form the null hypotheses:

(H AXXY0) No quantifiable correlation exists between the amount and quality of systems engineering type XX effort used during a program and the class Y success of the program. (36 null hypotheses, based on nine instances of type XX and four instances of class Y success as defined in Section .)

(H BXX0) For any given program, no optimum amount of highest-quality systems engineering type XX effort can be predicted from the quantified correlations. (Nine null hypotheses, based on nine instances of type XX as defined in Section .)

Treatment of SE amount

The interview data provides two possible measures of the amount of SE effort, either by labor-hours or by costs. Because of the interview methodology used to estimate the two values (see 4.5), the two values are highly correlated; in most interviews, estimation required deriving one from the other. It is therefore not desirable to use both possible measures. Of the two, the measure that is most significant to management decisions is cost. Therefore, the cost of SE efforts is used as the measure of SE amount.

Cost of SE efforts was obtained for each of the eight SE activities and also for total SE. During the interviews (see 4.5), the total SE amount was double-checked to ensure that (a) it was the sum of the eight SE activity amounts, and (b) it represented a valid total value in terms of the program. However, the SE costs are nearly meaningless by themselves; they are only useful in context of the size of the program.
Therefore, it was chosen to use as a basic parameter the SE costs as a percent of the total program cost:

    XX%_i = 100 C_XXi / C_Ai                (Equation 3)

where:
    XX%_i is the cost ratio of type XX effort for program i (percent)
    C_XXi is the actual cost expended ($) of type XX effort for program i
    C_Ai is the actual cost at completion ($) for program i

For total SE, these values are available from both the SE-ROI data and the prior ValueSE data; the values are all used in the hypothesis testing. For the eight SE activities, these values are only available from the SE-ROI data.

Limitations. The choice to use cost values rather than labor-hour values causes a possible aberration in ratios when development programs have widely varied amounts of non-labor costs. A development program building extensive equipment, for instance, may evidence a lower ratio than a program building little equipment.

Treatment of SE quality

The issue of SE quality was treated by Mar & Honour (2002) with the discovery that correlations between SE effort and program success improved with the inclusion of SE Quality (SEQ). This discovery was interpreted as representing the reality that poor SE effort does not have the same effect as excellent SE effort; cost spent on poor effort can therefore be reduced as representing a lower effective value. Following this same practice, and as described in Appendix C.6 (Honour 2010a), the SE percent (SE%) is next modified into SE Effort (SEE) as follows, reducing the SE percent when the SEQ is less than highest quality:

    XXE_i = XXQ_i XX%_i = 100 XXQ_i C_XXi / C_Ai                (Equation 4)

where, in addition to the above:
    XXQ_i is the quality of type XX effort for program i (unitless, values [0:1])

However, it should be noted that the reported values for quality (1 to 5, with half values allowed) must be transformed to use them in Equation 4. In the original work (Mar & Honour 2002), a linear transformation was used, mapping reported values of [1:10] into values of [0.1:1.0].
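Equations 3 and 4 can be sketched directly. The program values below are hypothetical, and the quality factor is assumed to be already mapped into the [0:1] range:

```python
def se_percent(c_xx, c_a):
    """Equation 3: type-XX SE cost as a percent of actual program cost."""
    return 100.0 * c_xx / c_a

def effective_se_effort(c_xx, c_a, xxq):
    """Equation 4: the SE percent scaled down by quality XXQ in [0:1]."""
    return xxq * se_percent(c_xx, c_a)

# Hypothetical program: $2M of total-SE cost on a $25M program, quality 0.7.
pct = se_percent(2.0e6, 25.0e6)                # 8.0 percent of program cost
eff = effective_se_effort(2.0e6, 25.0e6, 0.7)  # 5.6 percent effective
```

At highest quality (XXQ = 1) the effective effort equals the raw percent; lower quality proportionately discounts it.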

Because the SE-ROI data is combined with the ValueSE data, all SE-ROI reported values are transformed linearly from the range [1:5] into the range [1:10] for compatibility, then further mapped into the XXQ_i range of [0:1]. The author made a further exploration of the mapping from the reported values into XXQ_i to consider non-linear mappings using two basic mapping parameters:

$XXQ_i = 1 - (1 - v)\left(\dfrac{10 - xxq_i}{9}\right)^{c}$ (Equation 5)

Where
xxq_i is the reported quality of type XX effort (unitless, range [1:10]) for program i
v is the low XXQ_i number to which a reported value of 1 is mapped
c is a non-linear compression factor

As an example, the form of the non-linear mapping for v = 0.2 and c = 1.5 is shown in Figure 29, in which the reported quality values [1:10] are shown on the abscissa and the resulting values of XXQ are shown on the ordinate. A compression value c = 1.0 gives a linear mapping; values >1.0 bend the mapping upward and values <1.0 bend the mapping downward.

Figure 29. Possible non-linear mapping of reported quality

Using this mapping, the exploration calculated the R² correlation coefficient between program cost compliance (C_i, defined below in section 5.1.5, Equation 9) and SEE (Equation 4 above), using this one hypothesis H_ASEC as a test case for the non-linear

mapping. (See Section for the definition of R² as used in this research.) A search for the values of v and c that maximized R² found values of v = 0, c = 1.0, R² = 0.204, indicating that a linear mapping actually produced the best correlation. It is also noted that few programs reported the lowest quality values, so the difference between the Mar & Honour (2002) approach (mapping [1:10] linearly to [0.1:1.0]) and the optimum values for the H_ASEC correlation (mapping [1:10] linearly to [0:1]) is slight. For compatibility with the prior data, therefore, the reported values are mapped to XXQ_i using the simple linear form, reduced from Equation 5 using v = 0.1 and c = 1.0:

$XXQ_i = xxq_i / 10$ (Equation 6)

Limitations. This is not the only possible treatment of SEQ. While this treatment has been optimized for its effect on the primary correlation relationship, there may be other mathematical treatments that might create a better effect. None such were discovered.

Treatment of early-phase SE activities

Not all interviewed programs started at the same point in the engineering lifecycle. As described in section 4.2, the interview recorded the level of system definition at start as anywhere from a poorly-defined user problem (very early program start) to technical requirements allocated to next-level components (very late program start), with eight start-point selections between. The distribution of the programs against this data point is shown in Figure 21 of Section

This difference in program start points introduced an inequity in the treatment of the SE activities, in that many programs did not include the early-phase SE activities of MD, RE, and SA to a full extent. These activities had been performed in prior work, either within a prior program phase or perhaps within a client organization prior to contract of the interviewed program.
Without correction, the lack of these activities would introduce a back-end skew into the data for mission/problem definition effort (MDE), requirements engineering effort (REE), system architecting effort (SAE), and even systems engineering effort (SEE). To correct for this skew, an appropriate amount of early-phase effort was added to each program based on (a) the program start point and (b) the average of programs that started earlier. Table 8 tabulates the average SE activity effort XXE (see Equation 4)

for all programs by the level of start definition of each program, also tabulating the number of programs that reported starting at or earlier than each start definition level. Note that the MD activity culminates in level 4 (system mission/operations defined), so only 20 of the 48 programs can be expected to have performed significant levels of MD activity. This can be seen in the MD column, in which the MDE for programs starting within the first four levels is significantly higher than for programs starting after level 4.

Table 8. Original SE activity levels by level of start definition

From the data in Table 8, the additional amount of early-phase effort to add to each program (as shown in Table 9) was calculated by a simple subtraction:

$XXE_{add}(m) = XXE_1 - XXE_m$ (Equation 7)

Where
XXE_add(m) is the amount of activity XX effort to add to a program that started at definition level m
XXE_m is the average amount of activity XX effort for all programs that started at definition level m

Table 9. Corrections to SE activity levels by level of start definition

It should be noted that there is one anomaly in Table 9, indicated by the boxed cell at the intersection of start level 4 and REE. Referring to Table 8, it can be seen that the RE effort does not behave in the same way as the MD and SA efforts, reaching a maximum of 1.7% at start definition level 4 and then reducing for later start definitions. This appears to indicate that RE effort is somewhat related to MD and/or SA effort. The RE effort was therefore not corrected for start definitions at level 4 or earlier, the correction being applied only to later start definitions.

To properly apply the additions, a further calculation was needed to adjust the percentage basis for the modified total amount. Adding the additional effort while also correcting for the percentage basis resulted in an Adjusted XX Effort (AXXE) for each program as:

$AXXE_i = \dfrac{XXE_i + XXE_{add}(m)}{1 + TOT_{add}(m)}$ (Equation 8)

Where
AXXE_i is the adjusted amount of activity XX effort for program i
TOT_add(m) is the total amount of SE effort added to programs that started at definition level m, the sum of the added MD, RE, and SA efforts
m is the start definition level of program i
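The two-step correction can be sketched as below. Note that the precise forms of Equations 7 and 8 are reconstructed here from the surrounding definitions (the equation typography did not survive transcription), the level averages are invented for illustration, and tot_add is taken as a fraction of program cost.

```python
def early_phase_addition(avg_by_level: list, m: int) -> float:
    """Equation 7 (as reconstructed): effort to add to a program starting at
    level m, relative to the average of the earliest-starting programs."""
    return avg_by_level[0] - avg_by_level[m - 1]   # XXE_1 - XXE_m

def adjusted_effort(xxe: float, xxe_add: float, tot_add: float) -> float:
    """Equation 8 (as reconstructed): Adjusted XX Effort, with the percentage
    basis corrected for the total effort added (tot_add as a fraction)."""
    return (xxe + xxe_add) / (1.0 + tot_add)

# Invented averages of MD effort (percent) by start definition level 1..5.
md_avg = [3.0, 2.5, 1.8, 1.0, 0.2]
add = early_phase_addition(md_avg, 4)   # 3.0 - 1.0 = 2.0 percent to add
```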

Treatment of success measures

The four success measures have been described in Section and in the developmental paper in Appendix C.6 (Honour 2010a), although without mathematical explanation. This section provides the calculation method for each success measure.

Cost compliance is calculated as the simple ratio:

$C_i = C_{Ai} / C_{Pi}$ (Equation 9)

Where
C_i is the cost compliance for program i (unitless ratio)
C_Ai is the actual cost at completion ($) for program i
C_Pi is the planned cost ($) for program i

Schedule compliance is calculated as the simple ratio:

$S_i = S_{Ai} / S_{Pi}$ (Equation 10)

Where
S_i is the schedule compliance for program i (unitless ratio)
S_Ai is the actual duration (months) for program i
S_Pi is the planned duration (months) for program i

Overall success (OS_i) is given by the subjective value (1 to 5, with half-values allowed) assigned by the interview participants.

Technical quality is given by the weighted sum of the system technical KPP values against the stakeholder threshold and objective desired KPP values:

$TQ_i = \sum_{j=1}^{m_i} w_{ij} K_{ij}$ (Equation 11)

Where
TQ_i is the technical quality for program i (unitless value in the range [0:2]), interpreted as
0 = Failed technical quality
1 = Acceptable quality, meeting thresholds
2 = Exceptional quality, meeting highest objectives

m_i is the number of stakeholder KPPs for program i
w_ij is the stakeholders' utility weight for KPP j for program i, a unitless value in the range [0:1] such that the sum of all KPP weights for the program is equal to one
K_ij is the adjusted preference value for KPP j for program i (unitless value between 0 and 2), with the same interpretation as TQ_i, given by

$K_{ij} = \begin{cases} 1 + \dfrac{k_{ij} - U_{1ij}}{U_{2ij} - U_{1ij}}, & k_{ij} \text{ between } U_{1ij} \text{ and } U_{2ij} \\ \dfrac{k_{ij} - U_{0ij}}{U_{1ij} - U_{0ij}}, & k_{ij} \text{ between } U_{0ij} \text{ and } U_{1ij} \end{cases}$ (Equation 12)

Where
k_ij is the actual system value for KPP j for program i (various units)
U_Qij is the stakeholders' utility preference for KPP j for program i (various units compatible with k_ij), in which Q = 0 represents absolute failure, Q = 1 represents threshold acceptability, and Q = 2 represents the highest possible objective

Note that with these definitions, it is not possible for k_ij to be outside the range from U_0ij to U_2ij.

Limitations. These success measures are defined as above for this work. As definitions, they require no theoretical basis. Nonetheless, the names used for the success measures imply some relation to reality. That relation is subject to some limitations in the use of the results of this work.

Cost and schedule compliance are based on standard Program Management (PM) definitions for cost variance and schedule variance. It should be noted that both of these terms are used in the PM literature to apply to programs for which there have been scope changes. In this research, scope changes have been taken out during the interview process as noted in Section 4.5.

Overall success is based on coarse subjective measures across the interviewed programs. There are mathematical limitations to the use of measures with only a few discrete values. There are also psychological limitations to the use of subjective measures due to errors and fallacies in the creation of such measures.
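A sketch of the technical-quality computation of Equations 11 and 12; the KPP names, weights, and utility points below are invented for illustration, and KPPs where smaller is better would need their utility points ordered accordingly.

```python
def kpp_preference(k: float, u0: float, u1: float, u2: float) -> float:
    """Equation 12: adjusted preference K in [0, 2] for an actual KPP value k,
    piecewise-linear through failure (u0), threshold (u1), and objective (u2)."""
    if k <= u1:
        return (k - u0) / (u1 - u0)        # 0 at failure, 1 at threshold
    return 1.0 + (k - u1) / (u2 - u1)      # 1 at threshold, 2 at objective

def technical_quality(kpps: list) -> float:
    """Equation 11: TQ as the weighted sum of adjusted preferences.
    Each entry is (weight, k, u0, u1, u2); the weights must sum to one."""
    return sum(w * kpp_preference(k, u0, u1, u2) for w, k, u0, u1, u2 in kpps)

# Invented system with two KPPs: range (km) and availability (fraction).
tq = technical_quality([
    (0.6, 450.0, 300.0, 400.0, 500.0),   # range: halfway from threshold to objective
    (0.4, 0.95, 0.80, 0.95, 0.99),       # availability: exactly at threshold
])
# tq = 0.6 * 1.5 + 0.4 * 1.0 = 1.3, i.e. somewhat above acceptable quality
```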

Technical quality as used herein is only one possible measure for the true quality of the system product. Other quality measures cannot be inferred to operate with the same relationships as the technical quality defined herein.

Correlation definition

Throughout this research the interest is in the correlation between various SE activities and various success measures. Figure 30 shows one such relationship as an example.

Figure 30. Typical scatter plot of a success measure against an SE activity

The underlying theory presented in Honour (2002b) indicated a relationship between program success measures and SE effort that would include an optimum value at some point between 0% SEE and 100% SEE. To approximate this relationship in the region of interest (SEE between 0% and 30%), a second-order (quadratic) curve provides a suitable shape. Given the typical scatter plot of data in Figure 30, a quadratic curve

$\hat{y} = \hat{a} x^2 + \hat{b} x + \hat{c}$ (Equation 13)

Where
x is the abscissa value of the curve,
ŷ is the ordinate value of the curve, and
â, b̂, ĉ are the quadratic parameters,

was fitted to the sample data points using a least-squares regression following the solution of Gordon (2004) as:

$a = \dfrac{S_{01}S_{10}S_{30} - S_{11}S_{00}S_{30} - S_{01}S_{20}^2 + S_{11}S_{10}S_{20} + S_{21}S_{00}S_{20} - S_{21}S_{10}^2}{D}$ (Equation 14)

$b = \dfrac{S_{11}S_{00}S_{40} - S_{01}S_{10}S_{40} + S_{01}S_{20}S_{30} - S_{21}S_{00}S_{30} - S_{11}S_{20}^2 + S_{21}S_{10}S_{20}}{D}$ (Equation 15)

$c = \dfrac{S_{01}S_{20}S_{40} - S_{11}S_{10}S_{40} - S_{01}S_{30}^2 + S_{11}S_{20}S_{30} + S_{21}S_{10}S_{30} - S_{21}S_{20}^2}{D}$ (Equation 16)

$D = S_{00}S_{20}S_{40} - S_{10}^2 S_{40} - S_{00}S_{30}^2 + 2 S_{10}S_{20}S_{30} - S_{20}^3$ (Equation 17)

$S_{ij} = \sum x^i y^j$ (Equation 18)

Where
D is an intermediate term, the denominator,
S_ij are intermediate terms calculated from powers i, j of the sample points, and
x, y are the abscissa and ordinate values of the sample points.

The measure of greatest interest is how well this derived quadratic regression fits the data. This goodness of fit is calculated as the coefficient of determination R², based on Pearson's r, also known as the Pearson product-moment correlation coefficient, using the formula provided by Eshbach (1952):

$R^2 = 1 - \dfrac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$ (Equation 19)

Where, in addition to the above, ȳ is the mean of all values in y.

Treatment of program characteristics

As described in Section 4.1.4, the SE-ROI research purposefully gathered a wide set of program characteristics, because the actual characteristics that perform well are to be discovered as part of the research. Prior to using the parameters, they were compared with each other to orthogonalize them and reduce the number of parameters. This treatment was reported fully in the developmental paper by Honour (2010a), which is contained in Appendix C.6.
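The closed-form solution of Equations 14 through 18 and the goodness-of-fit measure of Equation 19 can be sketched directly from the moment sums; fitting points drawn from an exact quadratic recovers its parameters, which also serves as a check on the reconstructed formulas.

```python
def quad_fit(xs, ys):
    """Least-squares quadratic y = a*x^2 + b*x + c via the moment sums
    S_ij = sum(x^i * y^j) of Equations 14-18."""
    S = lambda i, j: sum(x**i * y**j for x, y in zip(xs, ys))
    S00, S10, S20, S30, S40 = S(0, 0), S(1, 0), S(2, 0), S(3, 0), S(4, 0)
    S01, S11, S21 = S(0, 1), S(1, 1), S(2, 1)
    D = S00*S20*S40 - S10**2*S40 - S00*S30**2 + 2*S10*S20*S30 - S20**3
    a = (S01*S10*S30 - S11*S00*S30 - S01*S20**2 + S11*S10*S20 + S21*S00*S20 - S21*S10**2) / D
    b = (S11*S00*S40 - S01*S10*S40 + S01*S20*S30 - S21*S00*S30 - S11*S20**2 + S21*S10*S20) / D
    c = (S01*S20*S40 - S11*S10*S40 - S01*S30**2 + S11*S20*S30 + S21*S10*S30 - S21*S20**2) / D
    return a, b, c

def r_squared(xs, ys, a, b, c):
    """Equation 19: coefficient of determination for the fitted curve."""
    ybar = sum(ys) / len(ys)
    ss_res = sum((y - (a*x*x + b*x + c))**2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar)**2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2*x*x + 3*x + 1 for x in xs]   # exact quadratic with a=2, b=3, c=1
a, b, c = quad_fit(xs, ys)           # recovers a ~ 2, b ~ 3, c ~ 1; R^2 ~ 1
```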

The interview instrument contains 45 originally gathered program characteristic parameters, 20 of which are subjective and 25 of which are quantitative size parameters. To ease the burden of the subsequent statistical calculations, it was necessary to reduce this number of parameters. Principal Component Analysis (PCA) (Pearson 1901) is a mathematical technique to evaluate the combinatorial dependence of the parameters and thereby select a smaller set of orthogonal parameters. The principles of PCA are described briefly in the paper in Appendix C.6.

Correlations among quantitative size parameters. Prior to applying the PCA, correlation tests of several parameter groups showed that they were highly correlated with each other and therefore duplicative. This analysis was applied to the three individual grades (easy, nominal, hard) of the four graded quantities (numbers of system requirements, external interfaces, unique algorithms, and operational scenarios). From Figure 5 of Appendix C.6, it can be seen that each grade of each quantity is correlated with the total of that quantity with correlation value >0.6. All of these correlations are different from zero at a significance level of 0.05. As a result of this correlation, it was determined that the 12 parameters (three grades of four graded quantities) can be fairly represented by the four totals, reducing the 25 quantitative parameters to 17.

In a similar way, the author evaluated and tested the statistical dependence of the three number of components parameters (number of unique components, number of designed components, and number of integrated components). Again, all correlations are different from zero at a significance level of 0.05. It was therefore determined that the three number of components parameters can be fairly represented by the one parameter number of integrated components, further reducing the 17 quantitative parameters to 15.
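The pairwise screening described above rests on the Pearson product-moment correlation; a minimal pure-Python version is sketched below, with invented sample counts standing in for the graded-quantity parameters.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient between two parameters."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented data: 'hard' requirements count vs. total requirements count
hard = [5, 12, 7, 30, 18, 9, 22, 3]
total = [40, 95, 60, 210, 150, 70, 180, 35]
r = pearson_r(hard, total)   # strongly positive, well above the 0.6 screen
```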
Other high correlations among the quantitative parameters were also found to exist at a significance level of 0.05, although none of these lent themselves to further reduction in the number of parameters. Funding method and life-cycle stage are negatively correlated, indicating that earlier-stage programs tend to be amortized while later-stage programs tend to be contracted.

Systems with more interfaces tend to involve more customers; they tend to have more detailed start definition, and tend to require more test locations. Systems with more system algorithms tend to require more test locations and more system components. Programs with more developing organizations tend to have more system components and more operational scenarios.

Principal component analysis - quantitative parameters. The author then applied PCA to the remaining 15 quantitative parameters to discover a set of orthogonal principal components ("factors") that covered the same region of variability. By ranking the orthogonal factors, it was determined that the first seven factors effectively model over 70% of the variability in the 15 remaining quantitative parameters. By examining the structure of the seven factors and their mathematical relationships to the 15 quantitative parameters, the following interpretation of the factors was discovered:

QF1 System size (17% of variability). Ranges from small to large. Related to a wide set of the original parameters including (in decreasing order of impact) numbers of algorithms, components, formal test locations, developing organizations, requirements, interfaces, operational scenarios, customer agencies, and more contracted-style funding.

QF2 Development methods (12% of variability). Ranges from amortized methods to contracted methods. Related to (in decreasing order of impact) lifecycle stage, customer funding method, CMMi level, number of customer agencies, and production quantity.

QF3 Level of integration (11% of variability). Ranges from system-level to subsystem-level. Related to (in decreasing order of impact) number of system interfaces, later start point with more definition, fewer customer agencies, number of formal test locations, and fewer operational scenarios.
QF4 Detail of system definition at start (10% of variability). Ranges from high-level definition to detailed definition. Related to (in decreasing order of impact) number of formal tests, later start point with more definition, and CMMi level.

QF5 Life-cycle stage (8% of variability). Ranges from development stages to production stages. Related to (in decreasing order of impact) production quantity, number of formal test locations, later reported life-cycle stage, number of system requirements, and earlier start point with less definition.

QF6 Proof difficulty (7% of variability). Ranges from easy to difficult. Related to (in decreasing order of impact) number of formal tests, number of algorithms, fewer system requirements, and number of components.

QF7 Development autonomy (6% of variability; QF1-QF7 together cover 72% of the variability). Ranges from controlled to independent. Related to (in decreasing order of impact) number of customer agencies, fewer formal tests, fewer requirements, number of components, production quantity, and amortized-style funding.

Correlations among subjective parameters. Prior to performing the PCA on the set of 20 subjective program characterization parameters, it was noted that one of the parameters is the total SE quality that is used separately as described in section 5.1.4, thereby reducing the number of subjective parameters to 19. Correlation tests on the remaining 19 subjective parameters noted the following relationships, each at a significance level of 0.05:

- Process capability correlates strongly with team continuity.
- Parameters associated with team understanding of the problem (mission/purpose understanding, requirements understanding, architecture understanding) all correlate with each other and also with team continuity.
- Parameters associated with volatility (requirements volatility, requirements growth) correlate with each other and with number of platforms. All of these also negatively correlate with requirements understanding, architecture understanding, and stakeholder team cohesion.
- Requirements growth negatively correlates with lead SE experience.
- Team capability correlates with stakeholder team cohesion and with mission/purpose understanding.
- Number of sites correlates with recursive levels in the architecture.
- Parameters associated with complexity (system complexity, recursive levels, migration complexity), level of service requirements, documentation level, and lead SE experience are all correlated.

Principal component analysis - subjective parameters. The author then applied PCA to the remaining 19 subjective parameters to discover a second set of orthogonal principal components ("factors") that covered the same region of variability as the subjective parameters. By ranking the orthogonal factors, it was again determined that

the first seven factors effectively model over 70% of the variability in the 19 remaining subjective parameters. By examining the structure of the seven factors and their mathematical relationships to the 19 subjective parameters, the following interpretation of the factors was discovered:

SF1 Team understanding of the system (21% of variability). Ranges from low to high understanding. Related to (in decreasing order of impact) requirements understanding, mission/purpose understanding, architecture understanding, team capability, less requirements volatility, team continuity, stakeholder team cohesion, and less requirements growth.

SF2 Complexity of the program/system (14% of variability). Ranges from simple to complex. Related to (in decreasing order of impact) system complexity, level of service requirements, documentation level, number of recursive levels, and migration complexity.

SF3 Difference across installation sites (9% of variability). Ranges from few to many differences. Related to (in decreasing order of impact) number of sites, number of recursive levels, tool support, requirements growth, and less stakeholder team cohesion.

SF4 Team process capability (9% of variability). Ranges from weak to strong. Related to (in decreasing order of impact) process capability, team continuity, less migration complexity, and less requirements volatility.

SF5 Need for and use of SE tools (7% of variability). Ranges from light tools to great tools. Related to (in decreasing order of impact) tool support, number of platforms, fewer service requirements, and less requirements growth.

SF6 Technology risk (6% of variability). Ranges from low risk to high risk. Related to (in decreasing order of impact) technology risk and less stakeholder team cohesion.
SF7 System applicability (5% of variability; SF1-SF7 together cover 71% of the variability). Ranges from narrow to wide applicability. Related to (in decreasing order of impact) number of platforms, requirements volatility, stakeholder team cohesion, requirements understanding, team capability, and fewer service requirements.

Limitations. The PCA methodology used to create the two sets of factors is mathematically rigorous and well established. Nonetheless, it produces a simplification of the original program characterization parameters that is subject to some limitations:

- Because the combination of factors covers just over 70% of the variability, there is still another 30% of variability that is not included.
- The factor calculation is based on the 48 data points provided. There is some likelihood that any given program might operate under conditions that are different, in which case the factors may not be correct.
- The naming of the factors is subjective, based on an interpretation of the mathematical impact on each factor of the original program characterization parameters.

Use of program characteristics in correlation

The resulting quantitative factors (QF1-QF7) and subjective factors (SF1-SF7) were then used to modify the primary correlations of SE efforts to success measures. The methodology used to apply the factors is described in Appendix C.8 (Honour 2011a). The general process is a mathematical discovery of an effective set of weights for each factor that best improves the correlation. Underlying this process is an associated hypothesis that must be tested as part of the discovery:

(H_CXX) A selected set of program characterization factors can be used to improve the correlation of type XX effort used during a program and the success of the program.

As with the primary hypotheses, this hypothesis can be accepted by testing to reject the null hypothesis:

(H_CXX0) No set of program characterization factors can be used to improve the correlation of type XX effort used during a program and the success of the program.
The work in Appendix C.8 (Honour 2011a) describes briefly the process used to:

- Demonstrate the original correlation graphs for cost compliance, schedule compliance, and overall success against SE effort, with low R² correlation coefficients due to wide scattering.

- Select the 14 quantitative factors (QF1-QF7) and subjective factors (SF1-SF7) as a representative set of program characterization factors.
- Calculate the factors for each program as a percentile point PP_i against all programs, based on: the original program characterization parameter values for that program; the mathematical transformation from program characterization parameters into factors, as given by the PCA; and percentile ranking of all resultant values for each factor.
- Modify the ASEE on each program data point into an Effective SEE (ESEE) by application of the factors in a multiplicative formula similar to Equation 1 used in COSYSMO (Valerdi 2005) as

$ESEE_i = ASEE_i \prod_j \left(1 + \left(\dfrac{PP_{ij}}{100} - 0.5\right)\dfrac{Weight_j}{100}\right)$ (Equation 20)

Where
j is an index for each of the 14 factors
PP_ij is the percentile point ranking of program i against all programs when ranked in order of program factor j, a value from 1 to 99 with a value of 50 set to the median of all programs
Weight_j is a weighting factor that determines the strength and direction of the correction given for characterization factor j. For convenience in representation, weight factors are stated as 100 times greater than actual, with correction in the formula.

- Vary incrementally the values of Weight_j in a manually controlled hill-climbing algorithm to discover the set of weights that results in the best reasonable value for the R² correlation coefficients.

The algorithm in this process is successful, increasing the R² values from typical values in the range of 10-14% to as high as 80%. The brief description in Honour (2011a), however, does not provide sufficient detail for objective evaluation of the work. The subsections below expand on the description to provide the details, focusing primarily on total SE calculations while indicating differences in the calculations for subordinate SE activities.
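Equation 20 can be sketched as below. The exact multiplicative form is reconstructed from the where-clause (the typography did not survive transcription), so treat this as an illustration of the mechanics rather than a definitive restatement; the sample weights echo Figure 34, but the ASEE value is invented.

```python
def effective_see(asee, pp, weights):
    """Equation 20 (as reconstructed): adjust ASEE by each factor's percentile
    point pp_j in [1, 99] (50 = the median) and weight_j (stated at 100x actual),
    so that programs below the median move opposite to programs above it."""
    e = asee
    for pp_j, w_j in zip(pp, weights):
        e *= 1.0 + (pp_j / 100.0 - 0.5) * (w_j / 100.0)
    return e

# A program sitting at the median of every factor is left unadjusted.
weights = [-25, -13, -30, 53, 9, 22, 30, -15, -20, -11, -21, -7, -19, -9]
unchanged = effective_see(8.0, [50] * 14, weights)   # stays 8.0
```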

Definition of correlative improvement

The primary correlations of interest compare ESEE_i to each of the four success measures C_i, S_i, OS_i and TQ_i. The level of correlation is calculated as described in Section

Although four correlations were available (one for each success measure), it was desirable to define a single value (a combined correlation measure R²_T) that could be used to measure success of the search algorithm. As shown in Section 5.2.1, the correlations to TQ_i (technical quality) are very low. A preliminary application of the weighting search method in section failed to find any combination of weights that effectively improved the correlations to TQ_i. Therefore, it was decided not to include TQ_i in the combined correlation measure.

The remaining three success measures (C_i, S_i, OS_i) all demonstrate some initial level of correlation as shown in Figure 31. Further discussion of and findings about these relationships is contained in Section 5.2.1. For the purposes of this section, it is simply noted that the uncorrected correlations exist, although the correlations are relatively low as shown in Table 10.

Figure 31. Correlation charts of success measures against ASEE without application of program characteristics

Table 10. Correlation coefficients for success measures against ASEE without application of program characteristics

Success Measure            Against    R² Correlation Coefficient
Cost compliance C_i        ASEE_i
Schedule compliance S_i    ASEE_i
Overall success OS_i       ASEE_i
Average                    ASEE_i

A suitable combined correlation measure was chosen as the average of the three correlation coefficients:

$R_T^2 = \dfrac{1}{3}\left(R_C^2 + R_S^2 + R_{OS}^2\right)$ (Equation 21)

Where
R²_T is the combined correlative improvement measure
R²_C is the correlation of ESEE_i with cost compliance C_i
R²_S is the correlation of ESEE_i with schedule compliance S_i
R²_OS is the correlation of ESEE_i with objective success OS_i

Limitation. This is obviously not the only possible combined measure. Nevertheless, if the minor hypothesis H_CXX can be accepted by test, then the validity of this particular measure can also be accepted. Other measures may provide better results; none were found to do so.

Calculation of percentile points

It is desirable that the mapping from ASEE_i to ESEE_i use the program factors (QF1-QF7, SF1-SF7) such that programs below the median for each factor affect the mapping in the opposite direction to programs above the median. Values of the factors vary widely among programs, with some programs having factor values that are orders of magnitude greater than others. Using the factor values directly would thus result in an inappropriately large effect for such programs. To create a more benign mapping, it was decided to use the percentile point ranking of each program for each factor.

The PCA calculations provided a transformation for each program from the original program characterization parameters to the 14 program factors QF1-QF7, SF1-SF7.

The 48 program values were ranked from lowest to highest for each of the 14 factors. It should be observed that this ranking was different for each factor, with programs falling in different ranks due to the differences in their program characteristics. With the ranking established, the percentile point is calculated as

$PP_{ij} = k_{ij} / 48$ (Equation 22)

Where
PP_ij is the percentile point ranking for program i against factor j
k_ij is the ordered ranking [0:48] for program i against factor j

Finally, to prevent the extreme outliers from inordinate influence, the values for PP_ij were truncated to the range [0.02:0.98]. Some experimentation with these truncation limits indicated that this level provided a reasonable optimization during the weighting search without removing an excessive number of outliers.

Limitation. As with the combined correlative improvement measure, this is obviously not the only possible mathematical approach to using the program factors. Again, however, if the minor hypothesis H_CXX can be accepted by test, then the validity of this particular calculation can also be accepted. Other calculations may provide better results; none were found to do so.

Removal of outliers and constrained programs

Some programs were clear outliers either in their characteristics or their success measures. Outliers would have an inordinate effect on the statistical treatment to follow. For some programs, the interviews indicated that they were constrained by management in either cost or schedule and therefore could not vary in these values. These programs, both outliers and constrained programs, were removed from the data for specific correlation tests as shown in Table 11. In addition to the deletions shown in the table, which were applied to calculations for total SE, the calculations for subordinate SE activities sometimes required further deletions due to outliers or constraints in that specific SE activity.
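Equation 22 and the truncation step can be sketched as follows; the ranking convention (1-based over the 48 programs) is an assumption, since the source gives the range only as [0:48].

```python
def percentile_points(values):
    """Equation 22: PP = k/48 for each program's factor value, where k is the
    program's rank among the 48 programs, truncated to [0.02, 0.98] so that
    extreme outliers cannot dominate the weighting search."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    pp = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):   # assumed 1-based ranking k
        pp[i] = rank / 48.0
    return [min(max(p, 0.02), 0.98) for p in pp]

factor_values = list(range(48))        # stand-in factor values for 48 programs
pp = percentile_points(factor_values)  # top-ranked program is truncated to 0.98
```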

Table 11. Outliers removed from statistical data

Category of outliers (number of programs removed, by correlation test: Cost / Schedule / Objective Success / Technical Quality)
Cost constrained: 7
Schedule constrained: 10
Excessive size (cost)
Excessive SE%
Excessive C_i
Excessively low TQ_i: 2
High number of test locations
High number of interfaces
High production quantity
High number of customers
High number of algorithms
Remaining programs

Weighting search

With the parameters defined, a manual, incremental, hill-climbing search was performed to discover a set of values for Weight_j that would optimize the combined correlation measure R²_T. The search was both manual and incremental because it was a simultaneous search in 14 dimensions for the 14 program factors QF1-QF7, SF1-SF7. With this many dimensions, the nature of the hills to be climbed can be very complex, with many possible sub-optimal values that represent logically incorrect combinations. At each incremental step of the search, the values were compared for logical consistency to ensure climbing the appropriate hills.

Values of Weight_j were initially changed in increments of 20, later refined to increments of 10, 5, and 1. Most progressions required adjustments in minor steps to manually control the hill climbing. Each search terminated when further incremental changes to any Weight_j value either caused (a) a decrease in R²_T, indicating the top of a hill, (b) an increase in R²_T of less than 0.001, or (c) unjustifiable control of the correlation by individual extreme data points.

Changing each Weight_j value moves all data points horizontally on the scatter graphs of success versus ESEE (i.e. success values on the ordinate do not change, but ESEE values on the abscissa do). Because each program has different PP_i for each factor, an adjustment of Weight_j for that factor causes different movement for each program, right or left, dependent on the program's characteristics.
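The manual search can be caricatured as an automated coordinate ascent over the weights, refining the step size as the search converges and stopping on gains below 0.001, mirroring the termination rules above. The objective here is a stand-in peak at an arbitrary weight vector, since the interview data itself is not reproduced.

```python
def coordinate_ascent(objective, w, steps=(20, 10, 5, 1)):
    """Incremental hill-climb over one weight at a time, refining the step
    size as the search converges (mirroring the 20/10/5/1 progression)."""
    best = objective(w)
    for step in steps:
        improved = True
        while improved:
            improved = False
            for j in range(len(w)):
                for delta in (step, -step):
                    trial = w[:]
                    trial[j] += delta
                    val = objective(trial)
                    if val > best + 1e-3:   # ignore gains below 0.001
                        w, best = trial, val
                        improved = True
    return w, best

# Stand-in objective with a single peak at an arbitrary weight vector.
target = [-25, 53, -19]
obj = lambda w: -sum((wj - tj) ** 2 for wj, tj in zip(w, target)) / 1000.0
w, best = coordinate_ascent(obj, [0, 0, 0])   # ends near the target weights
```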
Figure 32, Figure 33, and Figure 34 show an example of a typical progression, using the H_ASES correlation of schedule

113 Systems Engineering Return on Investment overrun S i versus equivalent SE effort ESEE i. Only three of the 100+ increments are shown. Each scatter chart plots S i versus ESEE i, showing both the ValueSE data (small red symbols) and the SE-ROI data (large blue symbols). The charts also show the R 2 2 correlation values for ValueSE, SE-ROI, and total, although the correlation measure R S uses only the SE-ROI data because only that data has program characteristics available. Size parameters Weight Subjective parameters Weight QF1 System Size 0 SF1 Team understanding 0 QF2 Development methods 0 SF2 Complexity of program/system 0 QF3 Level of integration 0 SF3 Differences across installation sites 0 QF4 Definition at start 0 SF4 Team process capability 0 QF5 Lifecycle stage 0 SF5 Need for and use of SE tools 0 QF6 Proof difficulty 0 SF6 Technology risk 0 QF7 Development autonomy 0 SF7 Wider system applicability 0 Schedule correlation R 2 S Combined correlative measure R 2 T Figure 32. Weighting search example (original values) Figure 32 shows the original chart with no factor adjustments. There is a significant amount of scattering due to the compounding factors. The schedule correlation is R 2 S =0.136; when combined with the values for R 2 2 C and R OS correlation measure is R T 2 = (not shown), the combined Figure 33 shows the weighting search at an interim point in which many of the large blue SE-ROI symbols have been adjusted left and right by adjusting the values of Weight j, resulting in the scatter points fitting closer to the correlation average. The combined correlation measure has now improved to R T 2 = Figure 34 shows the final state of the search, in which further incremental changes to any Weight j in any direction cause little further improvements. At this point, R T 2 =0.542 and the schedule correlation is an impressive R S 2 = The scattering has been significantly reduced and the correlation has been significantly increased. 95

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -20     SF1 Team understanding                        0
QF2 Development methods        -20     SF2 Complexity of program/system            -10
QF3 Level of integration       -20     SF3 Differences across installation sites   -10
QF4 Definition at start         40     SF4 Team process capability                 -10
QF5 Lifecycle stage             10     SF5 Need for and use of SE tools            -20
QF6 Proof difficulty             0     SF6 Technology risk                         -20
QF7 Development autonomy        10     SF7 Wider system applicability              -10

Figure 33. Weighting search example (interim values)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -25     SF1 Team understanding                      -15
QF2 Development methods        -13     SF2 Complexity of program/system            -20
QF3 Level of integration       -30     SF3 Differences across installation sites   -11
QF4 Definition at start         53     SF4 Team process capability                 -21
QF5 Lifecycle stage              9     SF5 Need for and use of SE tools             -7
QF6 Proof difficulty            22     SF6 Technology risk                         -19
QF7 Development autonomy        30     SF7 Wider system applicability               -9

Figure 34. Weighting search example (final values)

Differences when treating subordinate SE activities. The same procedure was used for each of the eight subordinate SE activities MD, RE, SA, SI, VV, TA, SM, TM. However, in the case of each SE activity it is also necessary to treat the remaining seven SE activities as additional confounding factors. For these searches, the effective level of the SE activity is thus given by:

EXXE_i = AXXE_i × Π_j [1 + (PP_ij/100 − 0.5) × Weight_XXj/100] × Π_k [1 + (PP_ik/100 − 0.5) × Weight_XXk/100]        Equation 23

Where:
XX is replaced by one of the eight SE activities MD, RE, SA, SI, VV, TA, SM, TM,
EXXE_i is the effective level of XX effort on program i,
AXXE_i is the adjusted level of XX effort on program i as in Equation 8 (percent XX effort adjusted first for XXQ, the quality of XX effort, then again for early-phase SE activities),
j and PP_ij are as before,
Weight_XXj is a weighting factor that determines the strength and direction of the correction given to XX activity for program factor j,
k is a numeric index for each of the eight SE activities, arbitrarily using the range [15-22] to separate them from the first 14 program characterization factors,
PP_ik is the percentile point ranking of program i against all programs when ranked in order of level of SE activity k, a value from 1 to 99 with the value 50 set to the median of all programs,
Weight_XXk is a weighting factor that determines the strength and direction of the correction given to SE activity XX for the confounding effect of SE activity k. The value of Weight_XXk when XX and k refer to the same activity is always set to zero; corrections are only applied for the other seven SE activities.

The weighting search for each subsidiary SE activity, therefore, was in 21 dimensions rather than 14, and searches therefore typically required more incremental steps to converge. For the early-phase SE activities (MD, RE, SA), only those programs that performed that early-phase work were included in the correlations.
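In code form, the percentile-point ranking and the correction of Equation 23 can be sketched as follows. The transcription of Equation 23 is damaged, so the multiplicative form used here - each factor scaling effort by 1 + (PP/100 − 0.5)·(Weight/100) - is an assumption consistent with the surrounding definitions (PP from 1 to 99 with 50 at the median, weights giving strength and direction of the correction).

```python
def percentile_points(values):
    """Rank each program's value against all programs on a 1-99 scale,
    with 50 assigned to the median (the PP ranking described above).
    Ties share the rank of the first matching position; assumes len >= 2."""
    order = sorted(values)
    n = len(values)
    return [round(1 + 98 * order.index(v) / (n - 1)) for v in values]

def effective_effort(axxe, pp, weights):
    """Assumed multiplicative form of Equation 23: adjust the quality-adjusted
    effort AXXE_i by each factor's percentile deviation from the median,
    scaled by that factor's weight (both on a 0-100 scale)."""
    e = axxe
    for j, w in weights.items():
        e *= 1 + (pp[j] / 100 - 0.5) * (w / 100)
    return e
```

Under this form a program at the median of every factor (PP = 50) is left unchanged, while a program at the 99th percentile of a factor weighted -25 has its effort reduced by about 12%.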

Results for total SE

As a result of the procedures of the preceding sections, the combined correlation measure for total SE was significantly improved, as shown in Table 12. Correlation coefficients rose from weak correlation levels to very strong correlation levels. The final values of the weights used for each program factor are also shown. The strong increase in correlation due to this selection of weighting factors negates the minor null hypothesis H_CSE0: there is in fact a set of program characterization factors that can be used to improve the correlation of SE effort with the success of the program. Therefore, the null hypothesis H_CSE0 is rejected and it is possible to accept the minor hypothesis H_CSE.

Table 12. Correlation improvement for total SE
(original and final values of R²_C for cost compliance C_i against ESEE_i, R²_S for schedule compliance S_i, R²_OS for overall success OS_i, and their average R²_T; numeric values not recovered in transcription)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -25     SF1 Team understanding                      -15
QF2 Development methods        -13     SF2 Complexity of program/system            -20
QF3 Level of integration       -30     SF3 Differences across installation sites   -11
QF4 Definition at start         53     SF4 Team process capability                 -21
QF5 Lifecycle stage              9     SF5 Need for and use of SE tools             -7
QF6 Proof difficulty            22     SF6 Technology risk                         -19
QF7 Development autonomy        30     SF7 Wider system applicability               -9

Interpretation of weighting values. With the final weighting values available, it is appropriate to examine the effect of each weighting value and what can be learned from the final values. In each case, the weighting value indicates the degree and direction to which ASEE_i is adjusted to create ESEE_i: when the weight is positive, ESEE is increased for programs with a larger value of the factor and decreased for programs with a smaller value of the factor. The greater the absolute value of the weight, the greater the change.
For example, with a QF1 weight of -25, the ASEE for a large program (i.e. QF1 system size greater than the median) is decreased to a lower ESEE, and the ASEE for a smaller program is increased to a higher ESEE. This adjustment worked to align the programs to better conform to the correlation average; in other words, large programs require less SE effort (as a percent of the program).

Each of the 14 weighting factors of Table 12 thus provides a quantifiable indication of the greater or lesser use of SE effort as related to the particular confounding factors included within the factor. Within the set of interviewed programs, the program characteristics were coupled with the greater or lesser use of total SE effort as shown in Figure 35 and Figure 36. In each figure, as the program factor increases toward the right-hand interpretation, the effective SE effort ESEE_i was adjusted as shown by the bar graph. Possible explanations for these relationships are contained in the discussion of results.

Figure 35. Adjustments due to quantitative factors

Figure 36. Adjustments due to subjective factors

From Figure 35 and Figure 36, it can be seen that the program factors most coupled with an increase in ESEE_i include more detailed definition at start (QF4),

higher system level of integration (QF3), more development autonomy (QF7), smaller system size (QF1), greater proof difficulty (QF6), and to a lesser extent the subjective factors of lesser program/system complexity (SF2), lesser team process capability (SF4), and lesser technology risk (SF6).

Results for subordinate SE activities

A similar process was followed for each of the subordinate SE activities, using the data appropriate to that activity to improve the correlation:

1. Adjust the XX activity effort for each program (XX%_i) by applying the subjective quality of that effort (XXQ_i) (see Section 5.1.4) and correcting for early-phase SE activity on the program, creating the adjusted XX effort AXXE_i (Equation 8).
2. Calculate the correlations between the success measures (C_i, S_i, OS_i, TQ_i) and AXXE_i. At this point, the correlations are in general very low.
3. Create the combined correlation measure for the activity, R²_TXX, by averaging the R² correlation coefficients for the first three success measures (C_i, S_i, OS_i). As with total SE effort, no significant improvement was found possible for the technical quality success measure TQ_i.
4. Using the percentile point ranking of each program against each program factor (PP_ij) and against each XX activity (PP_ik), perform a manual, incremental, hill-climbing search for the sets of values of Weight_XXj and Weight_XXk that maximize the combined correlation measure. At each increment of the search, recalculate the effective XX effort EXXE_i for each program to allow recalculation of the combined correlation measure.

The results of the eight individual searches are shown in Table 13 through Table 20. Each table is accompanied by observations on the results.

Mission/purpose definition. The correlation improvement and weights for MD are shown in Table 13.
As noted earlier in Table 8, only 20 programs performed MD activity. When further outlying programs were removed from the search as described above, only nine programs remained with valid data for the MD correlation improvement. Nonetheless, the correlations for the remaining nine programs were initially quite low and were all significantly improved by the weighting search, thereby testing against H_CMD0 and allowing the acceptance of H_CMD: that a selected set of program characterization factors can be used to improve the MD correlation.

Table 13. Correlation improvement for mission/purpose definition (MD)
(original and final R² values against EMDE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                  4     SF1 Team understanding                      -15
QF2 Development methods        -11     SF2 Complexity of program/system            -11
QF3 Level of integration        14     SF3 Differences across installation sites    10
QF4 Definition at start        -15     SF4 Team process capability                   1
QF5 Lifecycle stage            -11     SF5 Need for and use of SE tools             14
QF6 Proof difficulty             7     SF6 Technology risk                          -8
QF7 Development autonomy         9     SF7 Wider system applicability              -11

SE activity parameters                 Weight
MD Mission/Purpose Definition         Not used
RE Requirements Engineering             -43
SA System Architecting                   -5
SI System Integration                    -4
VV Verification & Validation             12
TA Technical Analysis                    11
SM Scope Management                      25
TM Technical Leadership/Management      -16

Program factors that were coupled with a significant increase in EMDE_i included lesser RE effort, greater SM effort, lesser TM effort, lesser team understanding (SF1), earlier definition at start (QF4), a level of integration (QF3) more indicative of a subsystem, and greater need for/use of SE tools (SF5).

Requirements engineering. The correlation improvement and weights for RE are shown in Table 14. There is a significant improvement in the correlation measure, thereby testing against H_CRE0 and allowing the acceptance of H_CRE: that a selected set of program characterization factors can be used to improve the RE correlation.
Program factors that were coupled with a significant increase in EREE_i included greater TM effort, lesser team understanding (SF1), lesser MD effort, lesser development autonomy (QF7), greater system size (QF1), greater SM effort, greater differences across installation sites (SF3), greater VV effort, and less wide system applicability (SF7).

It is noted that there is a quantifiable cross-connection between MD effort and RE effort; in each case, greater effort in one is coupled with lesser effort in the other.

Table 14. Correlation improvement for requirements engineering (RE)
(original and final R² values against EREE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                 33     SF1 Team understanding                      -48
QF2 Development methods         -7     SF2 Complexity of program/system             10
QF3 Level of integration         9     SF3 Differences across installation sites    26
QF4 Definition at start        -13     SF4 Team process capability                  10
QF5 Lifecycle stage            -11     SF5 Need for and use of SE tools             12
QF6 Proof difficulty             9     SF6 Technology risk                          11
QF7 Development autonomy       -38     SF7 Wider system applicability              -18

SE activity parameters                 Weight
MD Mission/Purpose Definition           -44
RE Requirements Engineering           Not used
SA System Architecting                   11
SI System Integration                     5
VV Verification & Validation             19
TA Technical Analysis                    10
SM Scope Management                      30
TM Technical Leadership/Management       49

System architecting. The correlation improvement and weights for SA are shown in Table 15. There is a significant improvement in the correlation measure, thereby testing against H_CSA0 and allowing the acceptance of H_CSA: that a selected set of program characterization factors can be used to improve the SA correlation.

Table 15. Correlation improvement for system architecting (SA)
(original and final R² values against ESAE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                 14     SF1 Team understanding                      -31
QF2 Development methods          5     SF2 Complexity of program/system             11
QF3 Level of integration        18     SF3 Differences across installation sites     4
QF4 Definition at start          7     SF4 Team process capability                   5
QF5 Lifecycle stage             -9     SF5 Need for and use of SE tools             11
QF6 Proof difficulty            19     SF6 Technology risk                          12
QF7 Development autonomy       -25     SF7 Wider system applicability              -22

SE activity parameters                 Weight
MD Mission/Purpose Definition            16
RE Requirements Engineering             -38
SA System Architecting                Not used
SI System Integration                    12
VV Verification & Validation             -3
TA Technical Analysis                     9
SM Scope Management                      29
TM Technical Leadership/Management    [value not recovered]

Program factors that were coupled with a significant increase in ESAE_i included lesser RE effort, lesser team understanding (SF1), greater SM effort, lesser development autonomy (QF7), less wide system applicability (SF7), greater proof difficulty (QF6), a level of integration (QF3) more indicative of a subsystem, and greater MD effort.

System integration. The correlation improvement and weights for SI are shown in Table 16. There is a significant improvement in the correlation measure, thereby testing against H_CSI0 and allowing the acceptance of H_CSI: that a selected set of program characterization factors can be used to improve the SI correlation.

Table 16. Correlation improvement for system integration (SI)
(original and final R² values against ESIE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                 10     SF1 Team understanding                        0
QF2 Development methods        -30     SF2 Complexity of program/system             -8
QF3 Level of integration         0     SF3 Differences across installation sites    -4
QF4 Definition at start          0     SF4 Team process capability                  24
QF5 Lifecycle stage             -1     SF5 Need for and use of SE tools              5
QF6 Proof difficulty            25     SF6 Technology risk                          -2
QF7 Development autonomy        19     SF7 Wider system applicability                0

SE activity parameters                 Weight
MD Mission/Purpose Definition           -10
RE Requirements Engineering               6
SA System Architecting                   12
SI System Integration                 Not used
VV Verification & Validation              2
TA Technical Analysis                    -5
SM Scope Management                      18
TM Technical Leadership/Management       11

Program factors that were coupled with a significant increase in ESIE_i included development methods (QF2) more indicative of amortized development, greater proof difficulty (QF6), greater team process capability (SF4), greater development autonomy (QF7), and greater SM effort.

Verification and validation.
The correlation improvement and weights for VV are shown in Table 17. There is a significant improvement in the correlation measure, thereby testing against H_CVV0 and allowing the acceptance of H_CVV: that a selected set of program characterization factors can be used to improve the VV correlation. It is noted, however, that the correlation between OS_i and EVVE_i does not improve, being less than significant in both the original value and the final value; no combination of weights was found to improve the correlation with overall success. It is also noted that the final correlation value of R²_TVV [value not recovered] makes VV one of only two SE activities that finish at less than 20% correlation.

Program factors that were coupled with a significant increase in EVVE_i included lesser development autonomy (QF7) (extreme coupling), greater TM effort (large coupling), lesser RE effort (large coupling), lesser team understanding (SF1) (large coupling), lesser MD effort, greater SM effort, greater definition at start (QF4), greater program/system complexity (SF2), and lesser team process capability (SF4).

Table 17. Correlation improvement for verification & validation (VV)
(original and final R² values against EVVE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -10     SF1 Team understanding                      -45
QF2 Development methods          2     SF2 Complexity of program/system             21
QF3 Level of integration        -2     SF3 Differences across installation sites   -10
QF4 Definition at start         22     SF4 Team process capability                 -18
QF5 Lifecycle stage             -1     SF5 Need for and use of SE tools              0
QF6 Proof difficulty            -9     SF6 Technology risk                           0
QF7 Development autonomy       -74     SF7 Wider system applicability               -4

SE activity parameters                 Weight
MD Mission/Purpose Definition           -28
RE Requirements Engineering             -48
SA System Architecting                   10
SI System Integration                     0
VV Verification & Validation          Not used
TA Technical Analysis                     0
SM Scope Management                      22
TM Technical Leadership/Management       55

Technical analysis. The correlation improvement and weights for TA are shown in Table 18. There is a significant improvement in the correlation measure, thereby testing against H_CTA0 and allowing the acceptance of H_CTA: that a selected set of program characterization factors can be used to improve the TA correlation.
Program factors that were coupled with a significant increase in ETAE_i included lesser MD effort (extreme coupling), lesser SI effort (extreme coupling), lesser development autonomy (QF7) (extreme coupling), greater program/system complexity (SF2) (large coupling), a more development-oriented lifecycle stage (QF5), and greater technology risk (SF6).

Table 18. Correlation improvement for technical analysis (TA)
(original and final R² values against ETAE_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -12     SF1 Team understanding                       -5
QF2 Development methods         -2     SF2 Complexity of program/system             40
QF3 Level of integration        12     SF3 Differences across installation sites     6
QF4 Definition at start         10     SF4 Team process capability                   4
QF5 Lifecycle stage            -26     SF5 Need for and use of SE tools             10
QF6 Proof difficulty           -12     SF6 Technology risk                          22
QF7 Development autonomy       -58     SF7 Wider system applicability                4

SE activity parameters                 Weight
MD Mission/Purpose Definition           -90
RE Requirements Engineering              -5
SA System Architecting                   10
SI System Integration                   -90
VV Verification & Validation              7
TA Technical Analysis                 Not used
SM Scope Management                       5
TM Technical Leadership/Management       14

Scope management. The correlation improvement and weights for SM are shown in Table 19. While this activity demonstrates the lowest improvement of all the SE activities, there is still a significant improvement in the correlation measure, thereby testing against H_CSM0 and allowing the acceptance of H_CSM: that a selected set of program characterization factors can be used to improve the SM correlation.

Table 19. Correlation improvement for scope management (SM)
(original and final R² values against ESME_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -10     SF1 Team understanding                       18
QF2 Development methods        -42     SF2 Complexity of program/system             27
QF3 Level of integration         0     SF3 Differences across installation sites   -17
QF4 Definition at start         -5     SF4 Team process capability                  24
QF5 Lifecycle stage             10     SF5 Need for and use of SE tools             10
QF6 Proof difficulty            10     SF6 Technology risk                          -5
QF7 Development autonomy       -10     SF7 Wider system applicability              -40

SE activity parameters                 Weight
MD Mission/Purpose Definition            -1
RE Requirements Engineering             -77
SA System Architecting                  -10
SI System Integration                    31
VV Verification & Validation              7
TA Technical Analysis                    40
SM Scope Management                   Not used
TM Technical Leadership/Management    [value not recovered]

Program factors that were coupled with a significant increase in ESME_i included lesser RE effort (extreme coupling), greater TM effort (extreme coupling), development methods (QF2) more indicative of amortized development (large coupling), greater TA effort (large coupling), less wide system applicability (SF7) (large coupling), greater SI effort, greater program/system complexity (SF2), and greater team understanding (SF1).

Technical leadership/management. The correlation improvement and weights for TM are shown in Table 20. There is a significant improvement in the correlation measure, thereby testing against H_CTM0 and allowing the acceptance of H_CTM: that a selected set of program characterization factors can be used to improve the TM correlation. In this activity, it is particularly noted that the correlations move from near nil to significant values when the program factors are applied.

Program factors that were coupled with a significant increase in ETME_i included less wide system applicability (SF7) (extreme coupling), lesser team understanding (SF1) (extreme coupling), lesser differences across installation sites (SF3) (extreme coupling), smaller system size (QF1) (extreme coupling), lesser definition at start (QF4), greater SM effort, greater TA effort, development methods (QF2) more indicative of contracted development, greater technology risk (SF6), greater SI effort, lesser SA effort, greater MD effort, greater team process capability (SF4), and a level of integration (QF3) more indicative of a system.

Table 20. Correlation improvement for technical leadership/management (TM)
(original and final R² values against ETME_i for C_i, S_i, OS_i, and their average; numeric values not recovered)

Size parameters               Weight   Subjective parameters                      Weight
QF1 System Size                -52     SF1 Team understanding                      -55
QF2 Development methods         23     SF2 Complexity of program/system             11
QF3 Level of integration       -18     SF3 Differences across installation sites   -55
QF4 Definition at start        -33     SF4 Team process capability                  18
QF5 Lifecycle stage              2     SF5 Need for and use of SE tools              7
QF6 Proof difficulty             1     SF6 Technology risk                          22
QF7 Development autonomy       -12     SF7 Wider system applicability              -60

SE activity parameters                 Weight
MD Mission/Purpose Definition            20
RE Requirements Engineering              10
SA System Architecting                  -22
SI System Integration                    22
VV Verification & Validation              9
TA Technical Analysis                    26
SM Scope Management                      27
TM Technical Leadership/Management    Not used

Decision points for correlation significance

In Section 5.2 it is repeatedly necessary to test a correlation for significance: given a certain level of correlation coefficient R² in the sample set, is that level of correlation significant to a predefined probability? Following the practice of Spiegel & Stephens (1999), as further explained by Brown (2011), it is possible to assume a null hypothesis H_0 that the true population has a Pearson's correlation of ρ = 0. The transformation from a true population correlation ρ = 0 to a sample correlation R follows a Student's t distribution as

t = R √(n − 2) / √(1 − R²),  with df = n − 2        Equation 24

Where:
t is the Student's t distribution statistic,
n is the sample size,
R is the sample Pearson's correlation coefficient,
df is the degrees of freedom.

The probability value (p-value) for each resulting value of the statistic t can be calculated, and then a test is performed to determine whether the p-value is lower than a desired threshold, chosen as α = 0.05. If the p-value is lower than the threshold, the assertion of the null hypothesis H_0 can be rejected, and the converse hypothesis H_1, that the true population correlation is nonzero (ρ ≠ 0), can be accepted.

For given values of the sample size, decision points can be pre-calculated for the value of R² that represents sufficiently high sample correlation to indicate a correlation of significance greater than 95% (α ≤ 0.05). For each sample size, the one-tailed value of the Student's t statistic for α/2 = 0.025 is calculated. From this critical value of t, the corresponding value of R² is calculated by a reversed version of Equation 24 as

R² = t² / (n − 2 + t²)        Equation 25

The decision point values for appropriate sample sizes are shown in Table 21. When the R² of any correlation is higher than the decision point shown, that correlation is significant to better than 95% probability.
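Equations 24 and 25 can be written directly in code. The critical t value itself must come from Student's t tables (or a statistics library) for df = n − 2; it is not derived here.

```python
import math

def t_statistic(r, n):
    """Equation 24: transform a sample Pearson correlation R into a
    Student's t statistic with df = n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def decision_r2(n, t_crit):
    """Equation 25 (Equation 24 solved for R^2): the sample R^2 above which
    a correlation is significant, given the critical t value for df = n - 2
    at the chosen alpha (critical t taken from tables or a stats library)."""
    return t_crit ** 2 / (n - 2 + t_crit ** 2)
```

The two functions are inverses of one another: feeding the t statistic of a sample correlation back through Equation 25 recovers R², which is a quick consistency check on both formulas.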

Table 21. R² decision values for significance at α = 0.05
(columns: sample size, critical t, decision R²; numeric values not recovered in transcription)

Research Question A: Correlate SE with program success

With the mathematical background of Section 5.1, the work can now examine Research Question A, stated originally as:

(RQ_A) Is there a quantifiable correlation between the amount, types and quality of systems engineering efforts used during a program and the success of the program?

The 36 specific hypotheses that support this research question are stated in the research design chapter. To examine this question, this section tests the 36 instances of the null hypotheses:

(H_AXXY0) No quantifiable correlation exists between the amount and quality of systems engineering type XX effort used during a program and the class Y success of the program.

In each case, the null hypothesis is tested by evaluating whether there exists a quantifiable and supportable correlation between the specific type of systems engineering effort and the specific success measure. For each relationship of equivalent SE activity effort (EXXE_i; nine XX activities) and a success measure (C_i, S_i, OS_i, TQ_i; four success measures), the scatter plot is fitted with a quadratic curve by least-squares regression. Then the Pearson correlation coefficient R² is calculated between the scatter plot data and the quadratic curve to test how well the curve fits the data. The correlation level is then tested for significance using the decision values in Table 21, indexed by the number of samples in the correlation.

If the correlation is found to be significant (i.e. greater than the decision value), then the null hypothesis is rejected and the corresponding hypothesis is thereby accepted. If the

correlation is less than the decision value, then the null hypothesis cannot be rejected, and this must be accepted as a lack of information about the corresponding hypothesis.

Developmental publication of this work was made in Honour (2010b), which is contained in Appendix C.7. However, the results shown in that developmental paper did not yet apply the program characteristics as developed above. The results in the following sections include the effects of the program characteristics, resulting in correlations that are significantly greater than those in the developmental publication.

5.2.1 SE - total systems engineering effort

The correlation plots for total SE effort (ESEE_i) against the four success measures are shown in Figure 37 through Figure 40. Each plot displays both SE-ROI data (large blue symbols) and the prior ValueSE data (small red symbols). Only the SE-ROI data is corrected for program characteristics, because the ValueSE surveys did not include information about the program characteristics. The plots also display the quadratic curves determined by the least-squares regression for all points (bold), for the SE-ROI points, and for the ValueSE points.

Table 22 summarizes the correlation significance tests for total SE effort against the four success measures, using the data from the SE-ROI data set only. The correlations against cost compliance, schedule compliance, and overall success are significant, being well in excess of the 95% probability threshold. The correlation against technical quality is marginal.

Table 23 summarizes the correlation significance tests for total SE effort against three success measures using the data from both the SE-ROI and ValueSE data sets. (The ValueSE data set did not gather information about technical quality.) Because the ValueSE data cannot be corrected for program characteristics, the correlations are somewhat lower. The decision values are also lower, due to the greater number of samples.
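The fit-and-correlate step described above - a least-squares quadratic, then R² between the data and the curve - can be sketched with the standard normal equations. This is a stdlib-only sketch; the thesis does not specify its numerical tooling, and any library least-squares routine would serve equally well.

```python
def quadfit(xs, ys):
    """Least-squares quadratic y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]           # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented normal-equation matrix for the coefficient vector (c, b, a).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination (the moment matrix is positive definite).
    for i in range(3):
        p = m[i][i]
        m[i] = [v / p for v in m[i]]
        for r in range(3):
            if r != i:
                m[r] = [vr - m[r][i] * vi for vr, vi in zip(m[r], m[i])]
    c, b, a = (row[3] for row in m)
    return a, b, c

def r_squared(xs, ys, a, b, c):
    """Fraction of variance in ys explained by the fitted curve; for a
    least-squares fit this equals the squared Pearson correlation between
    the data and the fitted values."""
    yhat = [a * x * x + b * x + c for x in xs]
    ybar = sum(ys) / len(ys)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot
```

The resulting R² is what gets compared against the Table 21 decision value for the sample size in question.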
The results are the same as with the SE-ROI data alone.

Table 22. Correlation significance tests for SE effort (SE-ROI data only)
(columns: hypothesis, success measure, samples, decision R², actual R², significant >95%; numeric values not recovered)
H_ASEC   C_i    YES
H_ASES   S_i    YES
H_ASEO   OS_i   YES
H_ASET   TQ_i   Marginal

Table 23. Correlation significance tests for systems engineering effort (all data)
(columns: hypothesis, success measure, samples, decision R², actual R², significant >95%; numeric values not recovered)
H_ASEC   C_i    YES
H_ASES   S_i    YES
H_ASEO   OS_i   YES

Figure 37. Correlation plot: cost (C_i) against SE (ESEE_i)

Figure 38. Correlation plot: schedule (S_i) against SE (ESEE_i)

Figure 39. Correlation plot: overall success (OS_i) against SE (ESEE_i)

Figure 40. Correlation plot: technical quality (TQ_i) against SE (ESEE_i)

Return on investment

Return on Investment (ROI) for additional SE effort can be calculated by comparing the incremental change in program cost for a given increase in SE effort. This could be calculated in either of two ways: (1) considering that the additional SE effort takes away from other efforts (re-allocated from within the program), or (2) considering that the additional SE effort is also additional to the program (funding for additional SE effort comes from outside the program). The data in this thesis was all drawn from programs that had allocated SE effort out of their total program budget, so it is appropriate to use the first form of calculation. The value for ROI is therefore given by:

ROI_SE = − dĈ / dESEE        Equation 26

Where:
ROI_SE is the return on investment for additional SE effort,
Ĉ is cost compliance (% total cost), the predicted trend for all programs,
ESEE is the equivalent SE cost expended (% total cost).

Figure 41 shows the form of the relationships for actual/planned cost (cost compliance) and ROI against ESEE. The curve in the figure represents the quadratic median of all programs for cost compliance (ratio of actual cost to planned cost) against ESEE, as seen in Figure 37. The dashed straight line is ROI, calculated as the negative slope of the quadratic as in Equation 26. ROI is positive at lesser values of SE activity, indicating that there is a cost benefit to increasing the SE activity. Likewise, ROI is negative at greater values of SE activity, indicating a cost benefit to decreasing the SE activity. At ROI = 0, the SE activity is at an optimum level; this occurs at the minimum point of the quadratic. Table 24 shows values for SE-ROI.

Figure 41. Selecting optimum SE effort using ROI

Table 24. Quantified systems engineering return on investment (SE-ROI)

Current SE Effort      Average Cost   ROI for Additional SE Effort
(% of Program Cost)    Overrun        (Cost Reduction Per $ Added)
0%                     53%            7.0
5%                     24%
                       15%            3.5  (median of all programs)
10%                    7%
                       3%
                       10%            -2.8

This calculation supports two strong findings. First, the monetary Return on Investment of greater systems engineering effort can be as high as 7:1 for programs using little to no current systems engineering effort. Second, the monetary Return on Investment of greater systems engineering effort for median programs is 3.5:1.
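The ROI mechanics above lend themselves to a short numeric sketch. The quadratic coefficients here are hypothetical placeholders, not the fitted values from the interview data; the sketch only illustrates how ROI, taken as the negative slope of the cost-compliance quadratic (Equation 26), is positive below the optimum, negative above it, and zero at the minimum.

```python
# Hedged sketch of Equation 26 (hypothetical coefficients, not thesis data):
# ROI_SE is the negative slope of the quadratic cost-compliance trend.

def cost_compliance(esee, a, b, c):
    """Predicted cost compliance (actual/planned cost, %) at SE effort esee."""
    return a * esee ** 2 + b * esee + c

def roi_se(esee, a, b):
    """ROI_SE = -d(cost compliance)/d(ESEE): return per unit of added SE effort."""
    return -(2 * a * esee + b)

# Hypothetical quadratic whose minimum (the optimum) sits at ESEE = 14%:
a, b, c = 0.5, -14.0, 200.0

assert roi_se(10.0, a, b) > 0            # below the optimum: more SE pays off
assert roi_se(20.0, a, b) < 0            # above the optimum: more SE is detrimental
assert roi_se(-b / (2 * a), a, b) == 0   # ROI = 0 exactly at the optimum -b/(2a)
```

The same shape underlies Table 24: ROI starts large and positive at low SE effort and turns negative once effort passes the optimum.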

5.2.2 MD - mission/purpose definition effort

The correlation plots for MD effort (EMDE_i) against the four success measures are shown in Figure 42 through Figure 45. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about subordinate SE activities.

Table 25 summarizes the correlation significance tests for MD effort against the four success measures. Because few interviewed programs had actually performed MD effort, the sample sizes are small and the corresponding R² decision values for significance are large. The correlations against cost compliance and schedule compliance are significant, being well in excess of the 95% probability threshold. The correlations against overall success and technical quality are not significant.

Table 25. Correlation significance tests for MD effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_AMDC       C_i                                                         YES
H_AMDS       S_i                                                         YES
H_AMDO       OS_i                                                        No
H_AMDT       TQ_i                                                        No

Figure 42. Correlation plot: cost (C_i) against MD (EMDE_i)

Figure 43. Correlation plot: schedule (S_i) against MD (EMDE_i)
Figure 44. Correlation plot: overall success (OS_i) against MD (EMDE_i)
Figure 45. Correlation plot: technical quality (TQ_i) against MD (EMDE_i)
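The point about small samples can be made concrete. As a sketch (not from the thesis), the minimum R² needed for significance at α = 0.05 in a two-tailed test of a Pearson correlation follows r²_crit = t²/(t² + df) with df = n - 2; the t values below are standard tabulated critical values, hardcoded to keep the sketch dependency-free.

```python
# Sketch: R^2 decision value at alpha = 0.05 versus sample size. Uses the
# identity r2_crit = t^2 / (t^2 + df) for a two-tailed t-test of a Pearson
# correlation, df = n - 2. T_CRIT_05 holds standard tabulated t values.

T_CRIT_05 = {3: 3.182, 8: 2.306, 18: 2.101, 34: 2.032}  # df -> two-tailed t at 5%

def r2_decision_value(n):
    """Smallest R^2 that is significant at alpha = 0.05 with n samples."""
    df = n - 2
    t = T_CRIT_05[df]
    return t * t / (t * t + df)

# Small samples demand a much larger R^2 before a correlation is credible:
assert round(r2_decision_value(5), 2) == 0.77    # n = 5
assert round(r2_decision_value(36), 2) == 0.11   # n = 36
assert r2_decision_value(5) > r2_decision_value(20) > r2_decision_value(36)
```

This is why correlations computed from only a handful of MD-performing programs must show a very high R² before the null hypothesis can be rejected.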

5.2.3 RE - requirements engineering effort

The correlation plots for RE effort (EREE_i) against the four success measures are shown in Figure 46 through Figure 49. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 26 summarizes the correlation significance tests for RE effort against the four success measures. The correlations against cost compliance, schedule compliance, and overall success are significant, being well in excess of the 95% probability threshold. The correlation against technical quality is not significant.

Table 26. Correlation significance tests for RE effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_AREC       C_i                                                         YES
H_ARES       S_i                                                         YES
H_AREO       OS_i                                                        YES
H_ARET       TQ_i                                                        No

Figure 46. Correlation plot: cost (C_i) against RE (EREE_i)

Figure 47. Correlation plot: schedule (S_i) against RE (EREE_i)
Figure 48. Correlation plot: overall success (OS_i) against RE (EREE_i)
Figure 49. Correlation plot: technical quality (TQ_i) against RE (EREE_i)

5.2.4 SA - system architecting effort

The correlation plots for SA effort (ESAE_i) against the four success measures are shown in Figure 50 through Figure 53. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 27 summarizes the correlation significance tests for SA effort against the four success measures. The correlations against cost compliance, schedule compliance, and overall success are significant, being well in excess of the 95% probability threshold. The correlation against technical quality is not significant.

Table 27. Correlation significance tests for SA effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_ASAC       C_i                                                         YES
H_ASAS       S_i                                                         YES
H_ASAO       OS_i                                                        YES
H_ASAT       TQ_i                                                        No

Figure 50. Correlation plot: cost (C_i) against SA (ESAE_i)

Figure 51. Correlation plot: schedule (S_i) against SA (ESAE_i)
Figure 52. Correlation plot: overall success (OS_i) against SA (ESAE_i)
Figure 53. Correlation plot: technical quality (TQ_i) against SA (ESAE_i)

5.2.5 SI - system integration effort

The correlation plots for SI effort (ESIE_i) against the four success measures are shown in Figure 54 through Figure 57. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 28 summarizes the correlation significance tests for SI effort against the four success measures. The correlations against cost compliance, schedule compliance, and overall success are significant, being well in excess of the 95% probability threshold. The correlation against technical quality is not significant.

Table 28. Correlation significance tests for SI effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_ASIC       C_i                                                         YES
H_ASIS       S_i                                                         YES
H_ASIO       OS_i                                                        YES
H_ASIT       TQ_i                                                        No

Figure 54. Correlation plot: cost (C_i) against SI (ESIE_i)

Figure 55. Correlation plot: schedule (S_i) against SI (ESIE_i)
Figure 56. Correlation plot: overall success (OS_i) against SI (ESIE_i)
Figure 57. Correlation plot: technical quality (TQ_i) against SI (ESIE_i)

5.2.6 VV - verification & validation effort

The correlation plots for VV effort (EVVE_i) against the four success measures are shown in Figure 58 through Figure 61. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 29 summarizes the correlation significance tests for VV effort against the four success measures. The correlations against cost compliance and schedule compliance are significant, being well in excess of the 95% probability threshold. The correlations against overall success and technical quality are not significant.

Table 29. Correlation significance tests for VV effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_AVVC       C_i                                                         YES
H_AVVS       S_i                                                         YES
H_AVVO       OS_i                                                        No
H_AVVT       TQ_i                                                        No

Figure 58. Correlation plot: cost (C_i) against VV (EVVE_i)

Figure 59. Correlation plot: schedule (S_i) against VV (EVVE_i)
Figure 60. Correlation plot: overall success (OS_i) against VV (EVVE_i)
Figure 61. Correlation plot: technical quality (TQ_i) against VV (EVVE_i)

5.2.7 TA - technical analysis effort

The correlation plots for TA effort (ETAE_i) against the four success measures are shown in Figure 62 through Figure 65. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 30 summarizes the correlation significance tests for TA effort against the four success measures. The correlations against cost compliance and schedule compliance are significant, being well in excess of the 95% probability threshold. The correlation against overall success is marginal. The correlation against technical quality is not significant.

Table 30. Correlation significance tests for TA effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_ATAC       C_i                                                         YES
H_ATAS       S_i                                                         YES
H_ATAO       OS_i                                                        Marginal
H_ATAT       TQ_i                                                        No

Figure 62. Correlation plot: cost (C_i) against TA (ETAE_i)

Figure 63. Correlation plot: schedule (S_i) against TA (ETAE_i)
Figure 64. Correlation plot: overall success (OS_i) against TA (ETAE_i)
Figure 65. Correlation plot: technical quality (TQ_i) against TA (ETAE_i)

5.2.8 SM - scope management effort

The correlation plots for SM effort (ESME_i) against the four success measures are shown in Figure 66 through Figure 69. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 31 summarizes the correlation significance tests for SM effort against the four success measures. The correlations against cost compliance and overall success are significant, being in excess of the 95% probability threshold. The correlations against schedule compliance and technical quality are not significant.

Table 31. Correlation significance tests for SM effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_ASMC       C_i                                                         YES
H_ASMS       S_i                                                         No
H_ASMO       OS_i                                                        YES
H_ASMT       TQ_i                                                        No

Figure 66. Correlation plot: cost (C_i) against SM (ESME_i)

Figure 67. Correlation plot: schedule (S_i) against SM (ESME_i)
Figure 68. Correlation plot: overall success (OS_i) against SM (ESME_i)
Figure 69. Correlation plot: technical quality (TQ_i) against SM (ESME_i)

5.2.9 TM - technical leadership/management effort

The correlation plots for TM effort (ETME_i) against the four success measures are shown in Figure 70 through Figure 73. Each plot displays only SE-ROI data, because the ValueSE surveys did not include information about SE activities.

Table 32 summarizes the correlation significance tests for TM effort against the four success measures. The correlations against cost compliance, schedule compliance, and overall success are significant, being well in excess of the 95% probability threshold. The correlation against technical quality is not significant.

Table 32. Correlation significance tests for TM effort

Hypothesis   Success measure   Samples   R² decision value   Actual R²   Significant (>95%)
H_ATMC       C_i                                                         YES
H_ATMS       S_i                                                         YES
H_ATMO       OS_i                                                        YES
H_ATMT       TQ_i                                                        No

Figure 70. Correlation plot: cost (C_i) against TM (ETME_i)

Figure 71. Correlation plot: schedule (S_i) against TM (ETME_i)
Figure 72. Correlation plot: overall success (OS_i) against TM (ETME_i)
Figure 73. Correlation plot: technical quality (TQ_i) against TM (ETME_i)

Summary of Research Question A

The preceding sections demonstrate the statistical basis for deciding whether to accept the 36 hypotheses H_AXXY. Table 33 summarizes the results, assigning each hypothesis one of the following three conclusions:

Accept: The correlation coefficient is high enough that the probability of the null hypothesis being true is much less than 5%; the null hypothesis is rejected and the primary hypothesis is accepted.

Marginal: The correlation coefficient is near the decision value at which the probability of the null hypothesis is 5%; the possibility of a relationship is acknowledged.

Unknown: The correlation coefficient is lower than the decision value. The null hypothesis has some probability greater than 5% of being true; no conclusion can be made about the primary hypothesis.

The results of Table 33 lead to a series of primary findings for Research Question A:

- All SE activities have a significant correlation with cost compliance.
- All SE activities (except scope management) have a significant correlation with schedule compliance.
- All SE activities (except mission/purpose definition, verification & validation, and technical analysis) have a significant correlation with overall success.
- No SE activities have a significant correlation with technical quality.
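These four findings can be checked mechanically against Table 33; the dictionary below simply transcribes the table's conclusions.

```python
# Table 33 transcribed as data: activity code -> conclusions for
# (cost compliance, schedule compliance, overall success, technical quality).

TABLE_33 = {
    "SE": ("Accept", "Accept", "Accept", "Marginal"),
    "MD": ("Accept", "Accept", "Unknown", "Unknown"),
    "RE": ("Accept", "Accept", "Accept", "Unknown"),
    "SA": ("Accept", "Accept", "Accept", "Unknown"),
    "SI": ("Accept", "Accept", "Accept", "Unknown"),
    "VV": ("Accept", "Accept", "Unknown", "Unknown"),
    "TA": ("Accept", "Accept", "Marginal", "Unknown"),
    "SM": ("Accept", "Unknown", "Accept", "Unknown"),
    "TM": ("Accept", "Accept", "Accept", "Unknown"),
}

# All activities correlate with cost compliance:
assert all(row[0] == "Accept" for row in TABLE_33.values())
# Scope management is the only schedule-compliance exception:
assert sorted(k for k, row in TABLE_33.items() if row[1] != "Accept") == ["SM"]
# MD, VV, and TA are the overall-success exceptions:
assert sorted(k for k, row in TABLE_33.items() if row[2] != "Accept") == ["MD", "TA", "VV"]
# No activity reaches significance against technical quality:
assert all(row[3] != "Accept" for row in TABLE_33.values())
```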

Table 33. Summary of correlations: SE with program success (quantifiable correlation exists with each success measure)

Activity                                 Code XX   Cost compliance   Schedule compliance   Overall success   Technical quality
                                                   (H_AXXC)          (H_AXXS)              (H_AXXO)          (H_AXXT)
Total Systems Engineering Effort         SE        Accept            Accept                Accept            Marginal
Mission/Purpose Definition Effort        MD        Accept            Accept                Unknown           Unknown
Requirements Engineering Effort          RE        Accept            Accept                Accept            Unknown
System Architecting Effort               SA        Accept            Accept                Accept            Unknown
System Integration Effort                SI        Accept            Accept                Accept            Unknown
Verification & Validation Effort         VV        Accept            Accept                Unknown           Unknown
Technical Analysis Effort                TA        Accept            Accept                Marginal          Unknown
Scope Management Effort                  SM        Accept            Unknown               Accept            Unknown
Technical Management/Leadership Effort   TM        Accept            Accept                Accept            Unknown

5.3 Research Question B: Optimum SE can be predicted

With correlations shown for every SE activity as part of examining Research Question A, the work turned to examination of Research Question B, stated originally as:

(RQ_B) For any given program, can an optimum amount, type and quality of systems engineering effort be predicted from the quantified correlations?

The nine specific hypotheses that support this research question were stated earlier. To examine this question, this section tests the nine instances of the null hypothesis:

(H_BXX0) For any given program, no optimum amount of highest-quality systems engineering type XX effort can be predicted from the quantified correlations.

In this case, the null hypothesis is tested by evaluating whether the quantified correlations allow prediction of an optimum amount of highest-quality systems engineering effort for a given program. (In each case, the EXXE_i values have already been corrected to highest quality by the use of the subjective quality measures described earlier.)

In general, this section uses the following flow of logic:

1. Select an appropriate prediction method for SE effort.
2. Demonstrate that the selected prediction method fulfills an optimum for the given data.
3. Demonstrate that the translation of the optimum for the characteristics of a given program necessarily retains the optimum.

Initial work on RQ_B was reported in Honour (2011b), which is contained in Appendix C.9. It should be noted, however, that the calculations in Honour (2011b) were performed somewhat differently than in this section. The mathematical process in this section provides more rigor than the initial work.

5.3.1 Selected prediction methodology

It is desirable to have a prediction method that is useful to projects, selecting the level of SE effort that optimizes the cost benefit of the SE activities for the effort expended. The concept of Return on Investment (ROI) provides such a cost benefit, being the relationship of the total cost reduction in a program to the incremental cost addition of further SE effort, as defined in Equation 26. It was therefore chosen to calculate the level of SE effort at which ROI = 0, and to use this value as the prediction of optimum SE effort. At lower levels of SE effort, more SE effort has a cost benefit to the program. At higher levels of SE effort, additional SE effort is detrimental. The level of SE effort where ROI = 0 therefore provides the optimum cost benefit.

Applying Equation 26 (written there just for total SE) to the more general case of any form of SE activity, the calculation of ROI is given by:

    ROI_{XX} = -\frac{\partial \hat{C}}{\partial EXXE_0}        (Equation 27)

where
    ROI_XX   is the return on investment for additional XX effort,
    \hat{C}  is cost compliance (% total cost), the predicted trend for a median program,
    EXXE_0   is the equivalent XX cost expended (% total cost) on the median program.

As with ESEE, this equation calculates the slope of the quadratic regression line on the scatter plot of cost compliance (C_i) against effective XX effort (EXXE_i). That quadratic regression line is given by a revision of Equation 13 for each SE activity as:

    \hat{C} = a_{XX} \, EXXE_0^2 + b_{XX} \, EXXE_0 + c_{XX}        (Equation 28)

where, in addition to the above,
    a_XX, b_XX, c_XX  are quadratic parameters for the XX relationship.

Substituting Equation 28 into Equation 27, and then taking the indicated derivative, results in:

    ROI_{XX} = -(2 a_{XX} \, EXXE_0 + b_{XX})        (Equation 29)

Setting this to zero and solving for EXXE_0 gives the predicted value at which the ROI goes to zero:

    OXXE_0 = -\frac{b_{XX}}{2 a_{XX}}        (Equation 30)

where
    OXXE_0  is the optimum level of equivalent XX effort EXXE (% total cost) for a median program.

5.3.2 Predictions of optimum SE - median program

Table 34 displays the calculated optimum levels of SE activities for a median program, based on the prediction methodology of the previous section. The optimum level of total SE effort is 14.4%. The table also provides optimum levels for each subordinate SE activity.

It should be noted that the sum of the SE activities is considerably greater than the value for total SE. This is appropriate because:

- The calculation for each SE activity is independent of the others, optimizing for the one activity.
- Optimizing for a single subordinate SE activity is not optimum for total SE.
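This point can be checked numerically against the per-activity optima reported in Table 34: the independently optimized subordinate activities sum to 19.5% of program cost, well above the 14.4% optimum for total SE.

```python
# Cross-check using the per-activity optima from Table 34 (% of total program
# cost): independently optimized activities claim more effort in total than
# the jointly optimal level of total SE.

activity_optima = {
    "MD": 1.3, "RE": 2.0, "SA": 3.9, "SI": 2.8,
    "VV": 2.4, "TA": 1.8, "SM": 1.4, "TM": 3.9,
}
TOTAL_SE_OPTIMUM = 14.4  # optimum for total SE effort, from Table 34

sum_of_parts = round(sum(activity_optima.values()), 1)
assert sum_of_parts == 19.5
assert sum_of_parts > TOTAL_SE_OPTIMUM
```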

Table 34. Optimum levels of SE activities for median program

Activity                                 Code XX   Optimum (% total cost)   Median of AXXE data
Total Systems Engineering Effort         SE        14.4%                    8.5%
Mission/Purpose Definition Effort        MD        1.3%                     1.6%
Requirements Engineering Effort          RE        2.0%                     0.8%
System Architecting Effort               SA        3.9%                     1.4%
System Integration Effort                SI        2.8%                     1.5%
Verification & Validation Effort         VV        2.4%                     2.0%
Technical Analysis Effort                TA        1.8%                     1.3%
Scope Management Effort                  SM        1.4%                     0.3%
Technical Management/Leadership Effort   TM        3.9%                     1.9%

The fourth column of Table 34 shows the median of the AXXE data as reported in the interviews and adjusted for missing front-end effort. It can be seen that, in nearly every case, the interviewed programs were operating at something less than the optimum values. This comparison is also shown graphically in Figure 74, in which the large diamonds are the optimum values for each SE activity while the squares are the median of all programs. Some programs used greater or lesser levels than the optimum, but the median in nearly every activity was less than optimum.

Figure 74. Comparison of optimum levels with observed levels

5.3.3 Predictions of optimum SE - any program

Looking again at the null hypothesis:

(H_BXX0) For any given program, no optimum amount of highest-quality systems engineering type XX effort can be predicted from the quantified correlations.

It has been shown that, for a median program, an optimum level of highest-quality systems engineering type XX effort can be predicted from the quantified correlations. In this section, it is shown that this amount can also be adjusted appropriately for any given program based on its program characteristics, without losing the essential optimality.

The given program can be defined by the same 14 orthogonal program characterization factors (QF1-QF7, SF1-SF7) discovered in the Principal Component Analysis and analyzed for impact earlier. Figure 75 depicts the transformations used. In the first three panes of the figure, the transformation converted the base program values for XX effort into a set of values for a median program, by moving each program value an amount determined by the characteristics PP_ij and the weighting factors Weight_xxj for each characteristic. The result was to transform each program into a median domain in which the confounding factors due to the characteristics were largely removed, as indicated by the high correlation.

Figure 75. Transformations due to program characteristics

In this section, the same transformation is applied in reverse, as shown in the last two panes of Figure 75. The predictive trend line for the median program is moved back into the values for a given program by applying the same transforming factors in reverse. In essence, this transformation converts the median data points for all programs into a set of data points specific to programs with the given characteristics. The AXXE_G values for the given program characteristics are given by the reverse of Equation 23 as:

    AXXE_G = EXXE_0 \prod_j \left[ 1 + Weight_{xxj} \, \frac{PP_j - 0.5}{100} \right]        (Equation 31)

where
    XX          is replaced by one of the nine SE activities SE, MD, RE, SA, SI, VV, TA, SM, TM,
    AXXE_G      is XX cost expended (% total cost) for the given program,
    EXXE_0      is XX cost expended (% total cost) for a median program,
    PP_j        is the estimated percentile point ranking of the given program against all programs for program factor j, a value from 0.01 to 0.99 with 0.50 set to the median of all programs,
    Weight_xxj  is the weighting factor for activity XX and program factor j as calculated through the optimization described earlier.

It is not necessary, in this inverse transformation, to consider the confounding effect of the other seven XX efforts, as was done in Equation 23, because the calculations now use the optimum values of each of those efforts rather than accepting them as external confounding factors.

This calculation can be simplified by recognizing that, for the given program, the values of PP_j are fixed by the program characteristics. In addition, the values of Weight_xxj are fixed by the optimization performed previously. Therefore, a set of constant factors for the given program can be calculated as:

    G_{XX} = \prod_j \left[ 1 + Weight_{xxj} \, \frac{PP_j - 0.5}{100} \right]        (Equation 32)

Equation 31 can then be re-expressed using the constants G_XX simply as:

    AXXE_G = G_{XX} \, EXXE_0        (Equation 33)

Mathematically, the reverse transformation from the median program to the given program combines Equation 28 with Equation 33 to give a predictive trend line for the given program characteristics:

    \hat{C}_G = a_{XX} \, G_{XX}^2 \, EXXE_0^2 + b_{XX} \, G_{XX} \, EXXE_0 + c_{XX}        (Equation 34)

where
    \hat{C}_G  is the cost compliance (% total cost), predicted trend for the given set of program characteristics.

The value for the optimum is then calculated based on the trend line that is now specific to the given program. Following the same mathematics as in Section 5.3.2, the optimum level of XX activity is:

    OXXE_G = -\frac{b_{XX} \, G_{XX}}{2 a_{XX} \, G_{XX}^2}        (Equation 35)

where
    OXXE_G  is the optimum level of XX effort (% total cost) for a given program.

5.3.4 Hypothesis test

The predictive method as applied to any given program results in the predictive trend line of Equation 34. This trend line is still in quadratic form, guaranteeing an optimum for any values other than a_XX = 0, a degenerate case. The calculation of the optimum value in Equation 35 is therefore assured for all but the degenerate case. In any case in which the correlation is high enough to accept H_AXXC (see Table 33), the resulting quadratic is of necessity not degenerate. H_AXXC has been accepted for all cases of XX. Therefore, the data does in fact support predicting an optimum amount, and the null hypothesis H_BXX0 is rejected for each case in which H_AXXC is accepted. In each such case, the hypothesis H_BXX is accepted.

Therefore, it is possible to predict an optimum level of SE activity for a given program based on the program characteristics.

5.4 Observations and findings

This chapter presents the statistical work and findings based on the data gathered during the SE-ROI research. The findings provide significant information about the major research questions, with mathematical rigor. In particular, as shown in Table 33, the following major hypotheses can be accepted with assurance greater than 95%:

- All systems engineering activities have a significant correlation with cost compliance. (H_AXXC)

- All systems engineering activities (except scope management) have a significant correlation with schedule compliance. (H_AXXS)
- All systems engineering activities (except mission/purpose definition, verification & validation, and technical analysis) have a significant correlation with overall success. (H_AXXO)
- It is possible to predict an optimum amount of highest-quality systems engineering effort for any given program. (H_BXX)
- A selected set of program characterization factors can be used to improve the correlation of type XX effort used during a program and the success of the program. (H_CXX)

However, one major hypothesis cannot be accepted from the data, resulting in the negative statement:

- No systems engineering activities have a significant correlation with technical quality. (H_AXXT0)

The data supports a statistical calculation of monetary Return on Investment showing that:

- Return on Investment of re-allocating funds to systems engineering effort can be as high as 7:1 for programs using little to no current systems engineering effort.
- Return on Investment of re-allocating funds to systems engineering effort for a median program is 3.5:1.

Another significant finding is:

- Programs typically operate at less than the optimum level of SE effort.

In addition to the major results concerning the hypotheses and ROI, the statistical work has also generated a series of other findings, as shown in Table 35.

Table 35. Summary of findings from statistical work

1. The three grades (easy, nominal, hard) of the graded quantities (numbers of system requirements, external interfaces, unique algorithms, and operational scenarios) are highly correlated with each other, indicating high statistical dependence on each other. (Correlation coefficient >0.6, α = 0.05)
2. The three "number of components" parameters (unique components, designed components, integrated components) are highly correlated with each other, indicating high statistical dependence on each other. (Correlation coefficient >0.6, α = 0.05)
3. Funding method and life-cycle stage are negatively correlated, indicating that earlier-stage programs tend to be amortized while later-stage programs tend to be contracted. (Correlation coefficient >0.3, α = 0.05)
4. Systems with more interfaces tend to involve more customers; they tend to have more detailed start definition, and tend to require more test locations. (Correlation coefficient >0.3, α = 0.05)
5. Systems with more system algorithms tend to require more test locations and more system components. (Correlation coefficient >0.3, α = 0.05)
6. Programs with more developing organizations tend to have more system components, and more operational scenarios. (Correlation coefficient >0.3, α = 0.05)
7. The seven quantitative principal components QF1-QF7 cover 72% of the variability in all quantitative parameters. (Calculations as part of PCA)
8. Process capability correlates strongly with team continuity. (Correlation coefficient >0.5, α = 0.05)
9. Parameters associated with team understanding of the problem (mission/purpose understanding, requirements understanding, architecture understanding) all correlate with each other and also with team continuity. (Correlation coefficient >0.3, α = 0.05)
10. Parameters associated with volatility (requirements volatility, requirements growth) correlate with each other and with number of platforms. All of these also negatively correlate with requirements understanding, architecture understanding, and stakeholder team cohesion. (Correlation coefficient >0.3, α = 0.05)
11. Requirements growth negatively correlates with lead SE experience. (Correlation coefficient >0.3, α = 0.05)
12. Team capability correlates with stakeholder team cohesion and with mission/purpose understanding. (Correlation coefficient >0.4, α = 0.05)
13. Number of sites correlates with recursive levels in the architecture. (Correlation coefficient >0.3, α = 0.05)
14. Parameters associated with complexity (system complexity, recursive levels, migration complexity), level of service requirements, documentation level, and lead SE experience are all correlated. (Correlation coefficient >0.3, α = 0.05)
15. The seven subjective principal components SF1-SF7 cover 71% of the variability in all subjective parameters. (Calculations as part of PCA)
16. With an appropriate set of weights, the selected program factors can adjust the Total SE values to create strong correlations between success measures and SE effort. (Correlation coefficient as high as 0.79, α = 0.05)

6 Discussion of Results

To a large extent, the statistical results of Chapter 5 speak for themselves. They show that SE activities correlate strongly with program success measures, but do not correlate strongly with the technical quality of the resulting system. They provide a method to estimate a priori an optimum level of SE effort based on program characteristics. They quantify the ROI of adding SE effort to a program with a given level of current effort.

This chapter presents additional discussion of the major findings, exploring:

- The possible causal chains for the relationships
- Additional qualitative relationships
- Interpretation of the results for use in a development program
- Methods to use the estimation
- Limitations to the research

This chapter also contains two example programs, developed as hybrids of the actual programs that were interviewed, to show how the materials might be used.

6.1 Major findings: correlate SE with program success

The statistical relationships concerning Research Question A were summarized earlier, with Table 33 presenting the major results. The statistical correlations are quite clear, succeeding against α = 0.05 in all cases marked Accept. In this section, possible causal chains for the relationships are explored.

It should be noted that the discussions in this section are largely speculation, based on intangible, qualitative factors observed during the interviews and on personal experience. They are not proven by the statistical data, which in itself can only indicate correlation, not cause.

SE - total systems engineering effort

The statistical results in Table 12 show uncorrected R²_T = 0.142, improved substantially after adjustment for program characterization factors. These correlation values are well in excess of the R² decision value for 36 valid programs as presented in Table 22. Therefore, even the uncorrected data, with confounding factors present, represents a significant correlation. After largely removing the confounding factors, the correlation is overwhelmingly significant.

Possible causal factors. Further examination of the data in Table 22 and Table 23 shows more structure. Correlations of total SE effort with cost compliance, schedule compliance, and overall success are all far in excess of the α = 0.05 level, indicating very strong evidence of a significant correlating effect. Strikingly different is the correlation of total SE effort with technical quality, which is only marginal against the α = 0.05 level. This indicates that a different causality is likely creating a different correlating effect.

It should be cautioned that causality in a statistical correlation might run in either direction. Perhaps the amount of total SE effort causes levels of program success, or perhaps a given level of program success leads to a certain amount of total SE effort. The statistical data alone cannot distinguish these. Examination of the theoretical factors, however, can indicate potential causality. Programs undertake systems engineering activities specifically to affect program success. Programs choose to allocate funds to SE because they expect the SE to improve the program success. SE activities themselves are aimed at success: better mission definition more clearly defines and informs the team as to success goals; better requirements engineering provides measures for success; architecture trade-offs are typically performed against known success factors including cost, schedule and technical factors.
These theoretical relationships tend, therefore, to support the view that SE effort affects program success rather than the reverse. Yet the correlation is different for the fourth success factor than for the first three. It is noted that the first three success measures are of significantly greater interest to program management than the fourth, as evidenced in Figure 5 and in Miller (2000). This fact seems significant in explaining the possible differences in the statistical correlations. It is possible that SE process definitions, as used widely on

160 the interviewed programs, have been strongly influenced by the program management priorities evidenced in Miller (2000). Such an influence would have skewed SE processes toward cost and schedule compliance and away from technical quality. A further explanation is available by review of the qualitative interview data. The vast majority of interview participants had no initial answer for the stakeholder Key Performance Parameters (KPP) used to calculate technical quality. These project leaders were aware of the KPP concept and were willing and able to identify KPPs during the interview. Nonetheless, their programs apparently drove all technical effort by compliance with requirements rather than by the use of measured KPPs in a Technical Performance Measurement (TPM) process. These two approaches (requirements-driven versus TPM-driven) are counter to each other. If the whole set of requirements is used as the development measure, then technical effort is driven toward minimum compliance with requirements. If a TPM process is used, then the technical effort is driven toward best satisfying the stakeholder KPPs. In only a few cases were the interviewed programs driven toward technical excellence using KPPs in a TPM process. Those few programs did well in the fourth success measure. Effects of program characterization factors. Table 12 also displays the weighting values discovered in a hill-climbing search to best improve the correlation of total SE to success. These weighting factors are also shown graphically in Figure 35 and Figure 36. These weighting values represent the combination of weights that best removes the confounding factors from the correlation. Figure 76 shows the quantitative effect on the correlation by displaying the correlation values with and without each single factor. In this bar chart, all bars have R 2 T = as the right-hand terminus, the best correlation achieved with all factors. 
The left-hand terminus of each bar is the value of R²_T achieved when that one factor is removed from the correction. The individual values from Figure 35, Figure 36 and Figure 76 can be interpreted in terms of the causal effect of each confounding factor on the relationship.
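The hill-climbing weight search described above can be sketched in outline. The adjustment form below, which scales each program's SE-effort figure by its characterization-factor values raised to candidate weights, is a hypothetical stand-in for the thesis's actual weighting scheme; the function names and data are illustrative only.

```python
import math
import random

def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def adjusted(effort, factors, w):
    """Scale each program's SE effort by its factor values raised to weights w.

    Hypothetical adjustment form: factors is a list of per-program value lists.
    """
    return [e * math.prod(f[i] ** wj for f, wj in zip(factors, w))
            for i, e in enumerate(effort)]

def hill_climb(effort, factors, success, iters=3000, seed=1):
    """Greedy hill climb: perturb one weight at a time and keep the change
    only if the correlation of adjusted effort with success improves."""
    rng = random.Random(seed)
    w = [0.0] * len(factors)
    best = r_squared(adjusted(effort, factors, w), success)
    for _ in range(iters):
        j = rng.randrange(len(w))
        step = rng.choice([-0.05, 0.05])
        w[j] += step
        score = r_squared(adjusted(effort, factors, w), success)
        if score > best:
            best = score      # keep the improving move
        else:
            w[j] -= step      # revert the worsening move
    return w, best
```

On a toy data set where success depends on effort scaled by a single "size" factor, the search lifts the raw correlation toward the adjusted one, mirroring how each weighting factor in Table 12 removes a confound.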

Systems Engineering Return on Investment

Figure 76. Relative effect of each factor on correlation

System size - Figure 35 shows that system size has a large inverse effect on the amount of SE required. Larger systems require less SE as a proportion of the total program. Figure 76 shows that this correction has a very large effect on the correlation, increasing it from to the final value of .

Development methods - Both Figure 35 and Figure 76 show that the effect of contracted versus amortized development methods is small. Contracted methods appear to require somewhat less SE than amortized methods. From observations during the interviews, those organizations using amortized methods tended to have less formal, less well-established SE methods; this fact may have contributed to the relationship.

Level of integration - Figure 35 shows that a subsystem level of integration requires significantly less SE than a higher system level of integration. This is likely due to the effect of the SE that is performed at the higher level, from which a subsystem reaps the benefit. Yet Figure 76 shows that correcting for the level of integration provides the largest single benefit, increasing the correlation from to . This indicates that the level of integration is the most important consideration in understanding the relationship between total SE and program
success. See below for a further discussion of this relationship in the context of development autonomy.

Definition at start - Figure 35 shows that the level of definition at start applies the single largest correction to the total SE quantity, and that the correction is positive. In this case, the causality may be reversed; programs that have a detailed level of definition at start are more likely to have expended significant levels of SE in the early stages prior to program start. This seems to be a self-evident, almost trivial result. Figure 76 shows that this correction is important to the relationship, increasing the correlation from to .

Life cycle stage - Both Figure 35 and Figure 76 show that the effect of development versus production life cycle stages is small. Regardless of the life cycle stage, system development appears to need consistent levels of SE.

Proof difficulty - Figure 35 shows a moderately positive correction to total SE quantity due to proof difficulty. Figure 76 shows that this is a moderately important correction to apply. The more difficult a system is to prove, the greater the amount of SE needed.

Development autonomy - Figure 35 shows that programs developed autonomously require a significantly greater total SE effort than programs developed in a higher-controlled environment. Figure 76 shows that this correction is second in importance only to level of integration, increasing the correlation from to . Although these two factors (autonomy and level of integration) are necessarily orthogonal due to the nature of the Principal Component Analysis, there is a similar effect at work in that both involve a work relationship between this program and a higher-level structure. For level of integration, the higher-level structure is the higher system; for development autonomy, the higher-level structure is the parent business effort.
Development performed independently requires a significantly greater level of SE effort than development performed in the context of a higher-level structure.

In a like manner, Figure 36 and Figure 76 show the effect of each subjective characterization parameter on the correlation. It is noted from a consideration of the two figures that the effect of the subjective factors, both on total SE effort and on the correlations, is considerably less than that of the quantitative factors. In any consideration of SE effect on program success, it is more important to accurately
identify the quantitative measures of the program than the subjective measures. Nonetheless, there are some interesting observations about the subjective factors as well.

Team understanding has an inverse effect on total SE effort; greater team understanding allows a program to proceed with lesser SE effort.

Program/system complexity also has an inverse effect on total SE effort. This appears to be true because more complex systems tend to use less SE effort, relying more on lower-level test-and-fix methods. This finding appears to be in consonance with Elm (2008), which discovered that SE capability has a less salutary effect on high-challenge programs. (See Figure 17.)

Installation differences across sites have little effect on the required SE effort.

Team process capability has a moderate effect on both the total SE effort required (Figure 36) and on the correlative relationship (Figure 76). Greater process capability allows system development with somewhat lesser SE effort. It is noted that the general SE discipline has placed very heavy emphasis on process capability in the last two decades. Nonetheless, from this data, process capability has much less effect on determining total SE effort than do the quantitative factors concerning the system and program development.

Need for and use of SE tools has very little effect on the required SE effort.

Technology risk has a moderate effect on both the total SE effort required (Figure 36) and on the correlative relationship (Figure 76). Programs with greater technology risk appear to use less SE effort, again relying on lower-level test-and-fix methods.
Wider system applicability has little effect on the required SE effort.

6.1.2 MD - mission/purpose definition effort

The number of valid programs that actually performed mission/purpose definition is small, because most of the interviewed programs started at a level of definition further advanced than the mission/purpose definition activities. As a result, the correlation decision value for only 14 programs is somewhat higher than for more programs, at R² = as presented in Table 25. Nonetheless, the statistical results in Table 13 show R²_TMD = after adjustment for program characterization factors. This correlation value is well in excess of the decision value, indicating a significant correlation.
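One way a sample-size-dependent decision value like this can be derived is with a permutation test: shuffle the success scores so that any pairing with effort is pure chance, and take a high quantile of the resulting R² values as the threshold a real correlation must exceed. This is an illustrative sketch only; the thesis's own decision values may have been derived by a different method.

```python
import random

def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def decision_value(x, y, trials=2000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of R^2 under random pairings of x and y,
    i.e. the R^2 a sample of this size can reach by chance alone."""
    rng = random.Random(seed)
    yy = list(y)
    scores = []
    for _ in range(trials):
        rng.shuffle(yy)          # destroy any real x-y association
        scores.append(r_squared(x, yy))
    scores.sort()
    return scores[int((1 - alpha) * trials) - 1]
```

The smaller the sample, the higher the chance-level R², which is why a 14-program correlation faces a stiffer decision value than the correlations computed over the larger samples.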

Possible causal factors. Mission/purpose definition activities provide the highest-level definition of program scope. Such definition is essential to cost compliance and schedule compliance, because success is predicated on a firm definition of scope. This theoretical relationship strongly implies the causal direction: the level of MD effort creates the corresponding level of cost/schedule compliance. The theoretical relationship also explains the bathtub forms of Figure 42 and Figure 43, in which too much MD effort would be counterproductive in two ways: (a) causing confusion through over-analysis, and (b) expending extra effort that could be used more effectively elsewhere.

As shown in Table 25, there is no statistically significant relationship between MD effort and either overall success or technical quality. While some degradation in these two success measures would likely happen due to a lack of firm mission/purpose definition, the effort that leads to overall success and technical quality typically occurs in later stages of the system development. This would weaken the possible relationship, resulting in the lack of statistical significance.

Effects of program characterization factors. Table 13 also displays the weighting values discovered to best improve the correlation of MD effort to success. These weighting values represent the combination of weights that best removes the confounding factors from the correlation. Again, the number of valid programs with MD information was small, so the significance of these weights has less internal validity. Most of the weights are considerably smaller than the corresponding weights for the total SE, yet a few of the program characterization factors deserve the following notice:

Level of integration has a moderately positive effect on the required MD effort, with subsystem-level integration requiring somewhat more MD effort than system-level integration.
This may be due to the necessary connection between a subsystem and its parent system, requiring more effort to understand that connection.

Definition at start has a moderately negative effect on the required MD effort, with a better-defined program requiring less MD effort.

Team understanding has a moderately negative effect on the required MD effort, in that a team that understands the system better requires less MD effort.
Need for and use of SE tools has a moderately negative effect on the required MD effort, indicating that typical SE tools are less useful for MD effort than for later efforts such as requirements engineering and system architecting.

As noted in Section , there are cross-connect factors among the subordinate SE activities. If the SE activities are not operating at their most effective effort levels, then these cross-connects are necessary to adjust for the confounding effect of the inappropriate activity levels. For the MD effort, the following cross-connects are of note:

Requirements engineering is highly correlated with MD effort, as observed from the data in Section . Having the RE effort inappropriately selected has the single largest impact on the required MD effort. If too much RE effort is used, then far less MD effort is required, and vice versa. This single factor increases the correlation of MD effort to success from R²_TMD = to R²_TMD = 0.620, highlighting the importance of having MD effort and RE effort appropriately balanced.

Scope management effort also has a significant impact on the level of MD effort required. If inappropriately greater scope management effort is used, then greater MD effort is also required. This relationship is likely due to the need to analyze the mission/purpose impacts of scope changes.

Technical leadership/management effort has a moderately negative impact on the level of MD effort required. If inappropriately lesser TM effort is used on a program, then the MD effort required becomes greater to compensate for the loss of leadership.

6.1.3 RE - requirements engineering effort

As shown in Table 14 and Table 26, the adjusted, combined correlation factor for RE effort versus program success, R²_TRE = , is well in excess of the correlation decision value of R² = 0.115, indicating a clearly strong correlation.
The individual correlation factors show significant correlations of RE effort against cost compliance, schedule compliance, and overall success; however, the correlation of RE effort against technical quality is nil.

Possible causal factors. As with mission/purpose definition, the requirements engineering activities provide the verification-level definition of program scope. Such definition is essential to cost compliance and schedule compliance, because program
success is measured by verifications against the requirements. This theoretical relationship strongly implies the causal direction: the level of RE effort creates the corresponding level of cost/schedule compliance and overall success. Greater requirements engineering contributes to cost compliance, schedule compliance, and overall stakeholder-perceived success. Again as with MD effort, this theoretical relationship also explains the bathtub forms of Figure 46, Figure 47, and Figure 48, in which too much RE effort would be counterproductive through expending extra effort that could be used more effectively elsewhere.

The lack of significance between RE effort and technical quality can be explained by the same observation made earlier concerning total SE effort: the vast majority of interview participants had no initial answer for the stakeholder Key Performance Parameters used to calculate technical quality. The interviewed programs evidenced little effort toward improving technical quality, instead driving toward minimum compliance with requirements.

Effects of program characterization factors. Table 14 also displays the weighting values discovered to best improve the correlation of RE effort to success. Most of the weights are considerably smaller than the corresponding weights for total SE, yet a few of the program characterization factors deserve the following notice:

System size has a significantly positive effect on the required RE effort. Larger systems have more requirements and need greater effort to develop and manage those requirements. This confounding factor also has a significant effect on the correlation of RE effort to program success, increasing R²_TRE from to . This indicates the importance of adjusting the RE effort appropriately to the system size.

Development autonomy has a significantly negative effect on the required RE effort.
Independent development programs can proceed with far less requirements rigor than programs operating in a larger program environment. This confounding factor has the largest single effect on the correlation of RE effort to program success, increasing R²_TRE from to . This indicates that it is exceedingly important to properly adjust the RE effort for the autonomy characteristics of the program.
Team understanding has a significantly negative effect on the required RE effort. If the team understands the system better, then less rigor is needed in the requirements. Team understanding alone increases R²_TRE from to 0.457, again indicating the importance of scoping RE effort with knowledge of the team understanding.

Differences across installation sites have a large positive effect on the required RE effort. When many sites have different needs, there is a need for more requirements engineering.

Wider system applicability has a moderately negative effect on the required RE effort. When the system can be used in a more standard way across multiple applications, then fewer requirements are needed.

If other SE activities are not operating at their most effective effort levels, then the following cross-connects concerning the RE effort are of note:

Mission/purpose definition is highly correlated with RE effort, as observed from the data in Section . If the MD effort is inappropriately selected, this has a significant impact on the required RE effort. If too much MD effort is used, then far less RE effort is required, and vice versa. This single factor increases the correlation of RE effort to success, R²_TRE, from to 0.457, again highlighting the importance of having MD effort and RE effort appropriately balanced.

Verification and validation effort has a moderately positive impact on the level of RE effort. If inappropriately large VV effort is used, then more RE effort is required. This is likely due to the strong theoretical linkage between requirements and verification.

Scope management effort has a significantly positive impact on the level of RE effort required. If inappropriately greater scope management effort is used, then greater RE effort is also required. This relationship is likely due to the need to change requirements in response to scope changes.
Technical leadership/management effort has a significantly positive impact on the level of RE effort required. If inappropriately greater TM effort is used on a program, then the RE effort also becomes greater. This may be due to increased scrutiny on the requirements as a part of the technical management.

6.1.4 SA - system architecting effort

As shown in Table 15 and Table 27, the adjusted, combined correlation factor for SA effort versus program success, R²_TSA = , is well in excess of the correlation decision value of R² = 0.108, indicating a clearly strong correlation. The individual correlation factors show significant correlations of SA effort against cost compliance, schedule compliance, and overall success; however, the correlation of SA effort against technical quality is nil.

Possible causal factors. The SA effort on a system development creates the technical solution that can comply with the requirements. An appropriate solution is essential to cost compliance and schedule compliance, because program success is measured by verifications against the requirements. This theoretical relationship strongly implies the causal direction: the level of SA effort creates the corresponding level of cost/schedule compliance and overall success. Again as with MD effort, this theoretical relationship also explains the bathtub forms of Figure 50, Figure 51, and Figure 52, in which too much SA effort would be counterproductive through expending extra effort that could be used more effectively elsewhere.

It is noted that the bathtub in Figure 52 has an optimum point (ESAE = 7%) well in excess of the optima evidenced in the previous two figures (ESAE = 4%). This feature is also explained by the theoretical relationship, in that cost and schedule compliance are created by bare compliance with requirements, while overall success in the stakeholders' perceptions also includes intangible and specialty engineering features that require more SA effort.

The lack of significance between SA effort and technical quality can be explained by the same observation as in the RE effort: the interviewed programs evidenced little effort toward improving technical quality, driving instead toward minimum compliance with requirements.
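Optimum points like those quoted for the bathtub curves can be located by fitting a curve to the (effort %, success) data and taking its stationary point. The quadratic least-squares fit below is a minimal illustration of that idea, not the curve form actually used in the thesis; all data in the example are synthetic.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def optimum_effort(e, y):
    """Least-squares fit y ~ a*e^2 + b*e + c via the normal equations;
    return the stationary point -b/(2a) of the fitted parabola."""
    m = [[sum(v ** 4 for v in e), sum(v ** 3 for v in e), sum(v ** 2 for v in e)],
         [sum(v ** 3 for v in e), sum(v ** 2 for v in e), sum(v for v in e)],
         [sum(v ** 2 for v in e), sum(v for v in e), float(len(e))]]
    rhs = [sum(v * v * w for v, w in zip(e, y)),
           sum(v * w for v, w in zip(e, y)),
           sum(y)]
    a, b, c = solve3(m, rhs)
    return -b / (2 * a)
```

For synthetic data whose success measure peaks at 7% SE effort, the fit recovers that optimum; on real program data the fitted stationary point plays the role of the ESAE optimum read off the figures.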
Effects of program characterization factors. Table 15 also displays the weighting values discovered to best improve the correlation of SA effort to success. Most of the weights are considerably smaller than the corresponding weights for total SE, yet a few of the program characterization factors deserve the following notice:

System size has a moderately positive effect on the required SA effort. Larger systems have more requirements and need greater effort to design a solution that meets those requirements.

Level of integration has a moderately positive effect on the required SA effort. The more a system is linked to a higher-level system, the more architecting effort is required to satisfy that higher-level system's needs.

Proof difficulty has a moderately positive effect on the required SA effort. The more difficult the anticipated proof, the more design effort is necessary.

Development autonomy has a significantly negative effect on the required SA effort. A more autonomous development requires much less design effort. This is likely due to the necessary interactions with a controlling authority that extend the system design effort with additional trade studies and reports.

Team understanding has a significantly negative effect on the required SA effort. If the team understands the problem and requirements, then it is likely that the team need not perform as much architecting.

Wider system applicability has a moderately negative effect on the required SA effort. As with RE effort, a system that applies to multiple applications leverages the design effort to a greater extent.

If other SE activities are not operating at their most effective effort levels, then the following cross-connects concerning the SA effort are of note:

Mission/purpose definition has a moderately positive effect on the required SA effort. If too much MD effort is used, then more SA effort is required. This is likely due to the trade study interactions between mission definition and system architecting.

Requirements engineering has a significantly negative effect on the required SA effort. Inappropriately large RE effort results in much less SA effort, likely due to design work being intertwined with the requirements work.
Scope management has a significantly positive effect on the required SA effort. Inappropriately large SM effort causes much more SA effort, likely due to the design trade studies necessary to support scope changes.

None of the factors, neither program characterization factors nor inappropriately selected SE activity levels, has a significant individual effect on the correlation of SA
effort with program success. The increase in R²_TSA from to was an effect of adjusting all factors together.

6.1.5 SI - system integration effort

As shown in Table 16 and Table 28, the adjusted, combined correlation factor for SI effort versus program success, R²_TSI = , is well in excess of the correlation decision value of R² = 0.105, indicating a clearly strong correlation. The individual correlation factors show significant correlations of SI effort against cost compliance, schedule compliance, and overall success; however, the correlation of SI effort against technical quality is nil.

Possible causal factors. The SI effort on a system development creates the system as envisioned during architecting, by assembling and testing the component parts into a system. This effort usually occurs in the later stages of development, nearing the finality of verification and validation. The theoretical relationship between SI effort and program success is weaker than for the preceding activities, because there is less time available in the program for any increased SI effort to have a salutary effect. Increasing SI effort would likely cause both cost and schedule overrun, but may have a strong positive effect on overall success in the stakeholders' perceptions. A close examination of Figure 54, Figure 55, and Figure 56 reveals that three data points at ESIE values between 7% and 8% may overly enhance the correlation, so the actual correlation would be somewhat weaker than the statistical numbers, more in consonance with theory.

The lack of significance between SI effort and technical quality can be explained by the same observation as in the RE effort: the interviewed programs evidenced little effort toward improving technical quality, driving instead toward minimum compliance with requirements.

Effects of program characterization factors. Table 16 also displays the weighting values discovered to best improve the correlation of SI effort to success.
None of the weights creates a significant adjustment to the SI effort, although some are worthy of note:
Development methods have a moderately negative effect on the required SI effort. Amortized development methods require more SI effort than do contracted development methods.

Proof difficulty has a moderately positive effect on the required SI effort. The more difficult the anticipated proof, the more integration effort is necessary to prepare for that proof.

Development autonomy has a moderately positive effect on the required SI effort. The more autonomy in the development, the more integration effort is necessary. However, it should be noted that this single factor increases R²_TSI from to 0.524, and does so largely by moving the three suspect data points far to the right. The three data points all came from the same organization, one with little autonomy in its development methods due to strong controlling clients, and one that typically uses more SE effort than is optimum in most activities. Therefore, it is likely that this relationship with development autonomy was statistically driven by this one organization and may not be generally valid.

Team process capability has a moderately positive effect on the required SI effort. The more process definition that is embedded in the organization, the more integration is necessary to meet all the demands of the processes.

Only one other SE activity has a cross-connect concerning the SI effort:

Scope management has a moderately positive effect on the required SI effort. Inappropriately large SM effort causes more SI effort, likely due to the changes to the integration plans and methods that result from scope changes.
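The suspicion that a few suspect data points drive a correlation can be checked directly by recomputing R² with those points left out. A minimal sketch of that sensitivity check, using made-up data in place of the actual program data:

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def r_squared_without(x, y, drop):
    """R^2 recomputed after removing the points at the given indices."""
    skip = set(drop)
    keep = [i for i in range(len(x)) if i not in skip]
    return r_squared([x[i] for i in keep], [y[i] for i in keep])
```

Here a weak cluster of low-effort programs plus three aligned high-effort points yields a much higher R² than the cluster alone, the signature of a correlation driven by a few influential points.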
None of the factors other than development autonomy has a significant individual effect on the correlation of SI effort with program success.

6.1.6 VV - verification & validation effort

As shown in Table 17 and Table 29, the adjusted, combined correlation factor for VV effort versus program success, R²_TVV = , is somewhat in excess of the correlation decision value of R² = 0.111, indicating that there is a correlation. This test is clear, but is less compelling than all the prior SE activities due to the lower value of R²_TVV relative to the decision value. The individual correlation factors show significant correlations of VV effort against cost compliance and schedule compliance; however, the correlations of VV effort against overall success and technical quality are both nil.
Possible causal factors. The VV effort on a system development provides the final proof that the system complies with the requirements and the mission/purpose. Because it occurs so late in the program, increasing VV effort should likely lead to both cost and schedule overruns. As with SI effort, Figure 58 and Figure 59 show that two data points at EVVE values above 6% may overly enhance the correlation, so the actual correlation would be somewhat weaker than the statistical numbers, more in consonance with theory.

The lack of significance between VV effort and overall success can be explained by consideration of the stakeholder role in verification and validation. It is during this time that stakeholders see the final system and its capabilities. The amount of VV effort during this time does little to change the actual system, and thereby little to change the stakeholders' perception of the program quality. The lack of significance between VV effort and technical quality stems from the same relationship, in that changing the level of VV effort causes little change in the parametric quality of the system against its KPPs.

Effects of program characterization factors. Table 17 also displays the weighting values discovered to best improve the correlation of VV effort to success. Most of the weights are considerably smaller than the corresponding weights for total SE, yet a few of the program characterization factors deserve the following notice:

Definition at start has a moderately positive effect on the required VV effort. A system development program that starts at a later level of definition requires more VV effort. Because the VV effort includes both the planning for and execution of testing, the positive relationship may be due to the developer's need to recreate more of the test planning that may have been done in earlier phases by other agencies.

Development autonomy has an overwhelmingly negative effect on the required VV effort.
A more autonomous development requires overwhelmingly less verification and validation. This is likely due to the ability of the organization to make its own decisions about the level of proof. Development autonomy also causes a significant individual change in the correlation of VV effort to success, changing R²_TVV from to . This highlights the importance of scoping the development autonomy when selecting the level of VV effort.
Team understanding has a significantly negative effect on the required VV effort. If the team understands the problem and requirements, then it is likely that the team need not perform as much verification and validation. This factor also causes a significant individual change in the correlation of VV effort to success, changing R²_TVV from to . It is important to properly scope the team understanding when selecting the level of VV effort.

Team process capability has a moderately negative effect on the required VV effort. Greater process compliance in earlier development activities, particularly in the integration activity, leads to smoother and easier verification and validation.

If other SE activities are not operating at their most effective effort levels, then the following cross-connects concerning the VV effort are of note:

Mission/purpose definition has a moderately negative effect on the required VV effort. If too little MD effort is used, then more VV effort is required. This is likely due to a poorer understanding of the proof necessary for the system.

Requirements engineering has a significantly negative effect on the required VV effort. Inappropriately low RE effort causes much greater VV effort, again likely due to a poor understanding of the proof necessary for the system. This single cross-connect causes a significant individual change in the correlation of VV effort to success, changing R²_TVV from to . It is important to properly balance the RE effort and VV effort in scoping total SE effort.

Scope management has a moderately positive effect on the required VV effort. Inappropriately large SM effort causes more VV effort, likely due to the test changes necessary to support scope changes.

Technical leadership/management has a significantly positive effect on the required VV effort.
If inappropriately large TM effort is applied, then VV effort is also likely to rise to prove the system not just to the requirements, but also to the technical managers.

6.1.7 TA - technical analysis effort

As shown in Table 18 and Table 30, the adjusted, combined correlation factor for TA effort versus program success, R²_TTA = , is well in excess of the correlation decision value of R² = 0.108, indicating that there is a strong correlation. The individual correlation factors show significant correlations of TA effort against cost compliance
and schedule compliance; however, the correlations of TA effort against overall success and technical quality are marginal to nil.

Possible causal factors. The TA effort on a system development provides the technical proof and guidance that the system complies with the requirements and the mission/purpose. This theoretical relationship is a strong explanation both for the correlation and for the bathtub shape of Figure 62 and Figure 63. With sufficient technical analysis, the program scope and proof operate well; with too little analysis, problems occur during proof; and with too much analysis, paralysis by analysis causes excessive effort with little benefit.

The relative lack of significance between TA effort and overall success is likely due to the indirect nature of the relationship. Technical analysis is a step removed from the actual proof, and it is often more technical in nature than the stakeholders' interest. The lack of significance between TA effort and technical quality stems from the same issue considered under RE effort, that the interviewed programs largely made little effort toward the use of TPM processes.

Effects of program characterization factors. Table 18 also displays the weighting values discovered to best improve the correlation of TA effort to success. Most of the weights are considerably smaller than the corresponding weights for total SE, yet a few of the program characterization factors deserve the following notice:

Lifecycle stage has a significantly negative effect on the required TA effort. A system development program in the production stages requires much less technical analysis than a program in the development stages, because a production-stage program can leverage existing analyses.

Development autonomy has a significantly negative effect on the required TA effort. A more autonomous development requires much less technical analysis.
This is likely related to the same issue in VV effort, due to the ability of the organization to make its own decisions about the level of proof. Development autonomy also causes a significant individual change in the correlation of TA effort to success, changing R²_TTA from to . This highlights the importance of scoping the development autonomy when selecting the level of TA effort.
- Complexity of program/system has a significantly positive effect on the required TA effort. A more complex program/system requires more technical analysis because of the need to predict performance under the uncertainty of complexity.

- Technology risk also has a moderately positive effect on the required TA effort, for the same reasons.

If other SE activities are not operating at most effective effort levels, then the following cross-connects concerning the TA effort are of note:

- Mission/purpose definition has an overwhelmingly negative effect on the required TA effort. If too little MD effort is used, then much greater TA effort is required. This is likely due to the need to perform more analysis because of a lack of understanding of the mission/purpose, and may also reflect a level of rework. This cross-connect also causes a significant individual change in the correlation of TA effort to success (R²TTA). This highlights the importance of properly balancing the MD effort with the TA effort.

- System integration has an overwhelmingly negative effect on the required TA effort. This is likely due to the inverse effect of the two activities; system interactions and capability may be proven with analysis or with integration testing. Again, this cross-connect causes a significant individual change in the correlation of TA effort to success (R²TTA). This highlights the importance of properly balancing the SI effort with the TA effort.

SM scope management effort

As shown in Table 19 and Table 31, the adjusted, combined correlation factor for SM effort versus program success, R²TWM, is somewhat in excess of the correlation decision value of R² = 0.105, indicating that there is a correlation. As with VV effort, the correlation is clear but is less compelling than for the other SE activities due to the lower value of R²TWM relative to the decision value.
The individual correlation factors show significant correlations of SM effort against cost compliance and overall success; however, the correlations of SM effort against schedule compliance and technical quality are poor to nil. As noted in Table 33, SM effort is the only SE activity that shows no significant correlation with schedule compliance. Possible causal factors. The SM effort on a system development, when properly scoped, ensures that program scope does not inappropriately drift during the program.

However, this activity often causes perturbations in the other activities because it demands additional analysis and engineering work for each scope issue considered. Proper scope management includes the cost and schedule impact in its considerations, but the relationships shown by Miller (2000) indicate that cost usually receives greater emphasis than schedule. This theoretical relationship between SM effort and program success indicates why SM would show correlation with both cost compliance and overall success, but not with schedule compliance. Even so, the correlation is not as compelling as that with other activities.

It is also noted in Figure 66 that there appears to be a statistically bipolar effect in the adjusted values for the relationship of SM effort to cost compliance; five interviewed programs were driven to ESME values in excess of 2.0% while the remaining programs clustered at less than 0.5%. The calculated correlation values seem to be driven by these five programs, allowing for less internal validity in the results for SM effort.

Effects of program characterization factors. Table 19 also displays the weighting values discovered to best improve the correlation of SM effort to success. Because of the bipolar effect of the adjusted data, the weights are less reliable than for other activities. Nonetheless, a few of the program characterization factors deserve the following notice:

- Development methods have a significantly negative effect on the required SM effort. An amortized system development program uses much greater scope management than does a contracted development program.

- Complexity of program/system has a moderately positive effect on the required SM effort. A more complex program/system requires more scope management because of the unexpected interactions caused by complexity.

- Team process capability also has a moderately positive effect on the required SM effort. Teams with higher process capability use greater scope management.
- Wider system applicability has a significantly negative effect on the required SM effort. As with prior activities, wider applicability allows the system to leverage its work for more uses, with less scope management.

If other SE activities are not operating at most effective effort levels, then the following cross-connects concerning the SM effort are of note:

- Requirements engineering has an overwhelmingly negative effect on the required SM effort. If too little RE effort is used, then much greater SM effort is required. This one relationship also causes a significant individual change in the correlation of SM effort to success (R²TWM). This highlights the importance of properly balancing the RE effort with the SM effort.

- System integration has a moderately positive effect on the required SM effort. Greater system integration is linked with larger SM effort. This was also noted in reverse: SM effort has a positive effect on the required SI effort.

- Technical analysis has a significantly positive effect on the required SM effort. More technical analysis can lead to more scope issues.

- Technical leadership/management has a significantly positive effect on the required SM effort. Greater technical management often causes more scrutiny of scope, leading to greater scope management effort.

TM technical leadership/management effort

As shown in Table 20 and Table 32, the adjusted, combined correlation factor for TM effort versus program success, R²TTM, is well in excess of the correlation decision value of R² = 0.115, indicating that there is a strong correlation. The individual correlation factors show significant correlations of TM effort against cost compliance, schedule compliance, and overall success; however, the correlation of TM effort against technical quality is nil.

Possible causal factors. The primary purpose of TM effort on a system development is to provide vision and guidance for the technical work, the leadership aspect of TM. With proper vision and guidance, a technical team can be expected to make appropriate trade-offs of technical issues against cost and schedule issues, thereby leading to better cost compliance, schedule compliance, and overall success.
In addition, it is significant in all three of Figure 70, Figure 71 and Figure 72 that the programs in the optimal region of TM effort are largely on-cost, on-schedule and with the highest levels of stakeholder overall success. This level of program success is not evident in any of the other SE activities; only TM effort has this unique level of success in its optimal regions. However, a greater level of TM effort often moves more into the management aspect and leads to over-management or micro-management, in which the technical workers

become thrashed by management issues. Such issues often put more emphasis on cost compliance, as in Miller (2000), which can be seen in Figure 70, in which even very large TM effort still manages to maintain largely effective cost control. However, Figure 71 and Figure 72 tell a different story, in which greater levels of TM effort cause severe degradation in both schedule compliance and overall success. Excessive technical management can cause significant problems in schedule compliance and in stakeholder perceptions of success.

Effects of program characterization factors. Table 20 also displays the weighting values discovered to best improve the correlation of TM effort to success. Many of the weights are large values, a fact that is unique among the SE activities. This seems to indicate that achieving the right level of TM effort is difficult, requiring a careful balance:

- System size has a significantly negative effect on the required TM effort. Larger programs require much less TM effort as a percent of the program. This one relationship also causes a significant individual change in the correlation of TM effort to success (R²TTM). This highlights the importance of properly selecting the TM effort level based on the system size.

- Development methods have a moderately positive effect on the required TM effort. Contracted developments require somewhat greater TM effort than amortized developments, likely due to the need to translate issues between the higher program and this program.

- Level of integration has a moderately positive effect on the required TM effort. Subsystem-level developments require somewhat greater TM effort than higher-level systems, again due to the interactions between a higher-level system and this system.

- Team understanding has a significantly negative effect on the required TM effort. When the team understands the problem and development, then much less vision is necessary from technical leadership.
- Differences across installation sites have a significantly negative effect on the required TM effort.

- Team process capability has a moderately positive effect on the required TM effort. Teams with higher process capability use greater technical management.

- Technology risk has a moderately positive effect on the required TM effort. When risk exists, technical leadership is required to guide the technical work and provide vision for it.

- Wider system applicability has a significantly negative effect on the required TM effort. As with prior activities, wider applicability allows the system to leverage its work for more uses, with less technical management.

If other SE activities are not operating at most effective effort levels, then several cross-connects concerning the TM effort are of note:

- Mission/purpose definition has a moderately positive effect on the required TM effort. If too much MD effort is used, then greater TM effort is required.

- System architecting has a moderately negative effect on the required TM effort. If inappropriately little SA effort is used, then greater TM effort is required.

- System integration has a moderately positive effect on the required TM effort. If system integration effort becomes inordinately large, then greater TM effort is required to guide the technical work.

- Technical analysis has a moderately positive effect on the required TM effort. As with system integration, too much technical analysis requires greater TM effort to guide.

- Scope management has a moderately positive effect on the required TM effort. Inappropriately large scope management may be indicative of shifting from technical leadership to technical over-management. This relationship was also noted from the other side in the SM effort.

Observations on Technical Leadership/management. The results in this subsection on TM effort are unique among the eight SE activities, providing additional distinction in the observations. No other SE activity, including total SE, provides the kind of on-cost, on-schedule, overall success measurements that are observed in the relationship of TM effort to success.
This finding is both significant and unique, in that properly selecting the optimal level of TM effort seems to provide the best possible success measures. Coupled with this, however, there is a caution that using excessive TM effort carries a significant penalty. Further, consideration of the program characterization factors shows that many of them have moderate to significant effect on the selection of TM

effort. Some of the factors are so significant as to be highly sensitive, essentially removing any correlation between TM effort and success if improperly used. These statistical relationships may be causal factors in the widespread industry difficulty in properly scoping the technical leadership/management activities. As program managers attempt to select appropriate levels of effort, the sensitivity of these factors causes wild swings in the success levels.

6.2 Major findings: optimum SE can be predicted

The findings of Section showed the statistical evidence to accept all nine hypotheses HBXX. This evidence was achieved by selecting a prediction methodology for optimum levels of SE activities, then demonstrating that the bounds on the selected methodology proved false the null hypotheses HBXX0. This statistical method sufficed for the hypotheses. The prediction methodology finalized in Section 5.3.3, however, offers a significant benefit itself: it provides an estimation of should-be levels of effort for SE activities. In particular, Equation 35 and Equation 32 provide the ability to estimate, based on the program characterization parameters, the optimum levels for each of:

- OSEE: total systems engineering effort
- OMDE: mission/purpose definition effort
- OREE: requirements engineering effort
- OSAE: system architecting effort
- OSIE: system integration effort
- OTAE: technical analysis effort
- OSME: scope management effort
- OTME: technical leadership/management effort

Such an estimation tool for systems engineering effort has not yet existed in the public literature. The most widely used current estimation tool is COSYSMO (Valerdi 2005), which is continually updated by its author, but COSYSMO has no basis for measurement of optimal values. Instead, COSYSMO relies on the average values actually used by current projects, adjusted by a strong set of both system size and qualitative parameters.
However, as shown in Table 34, most programs operate at much less than the optimum level of SE effort. This would indicate that following the COSYSMO model simply perpetuates the use of less SE than optimum. Estimation

using the method described herein, however, provides values that optimize the program success in a balance of cost compliance, schedule compliance, and stakeholder overall success.

Estimation methodology for optimal SE effort

The estimation methodology follows the plan of Equation 35 and Equation 32 by requiring the following steps:

1. Conceptualize the system to be developed and the development program.
2. Estimate the values for the seven quantitative program characterization factors QF1-QF7, and for the seven subjective program characterization factors SF1-SF7.
3. Convert the estimated factors into percentile points PPj for the 14 program characterization parameters by comparing them with the range of values obtained in this research.
4. Calculate the adjustment constant GXX for each of the SE activities using Equation 32 and the values of WeightXXj obtained from the optimization in this research.
5. Calculate the optimum level of XX effort for the given program using Equation 35 and the quadratic parameters aXX, bXX obtained from the correlative relationships in this research.

It should be noted that, if the percentile points PPj in step 3 are all set to the value 0.5 (median), then the result of this estimation will be the optimal values for a median program as shown in Table 34. See Section for a calculation of the typical variations from median values that occur for each program characterization factor. As can be seen in the examples of Section 6.3, this methodology is well within the capability of a system developer during program formulation. It provides an eminently usable tool for systems engineering estimation.

Evaluating program characteristics

The most difficult part of the estimation methodology in the previous section is the creation of values for the program characterization factors.
As seen in Section 6.2.3, the results are sensitive to this estimation, easily causing variation in the optimal SE

effort from 8% to 19% of the total program. It is therefore important to create reasonably accurate program characterization values, yet the effort to create them must be commensurate with the benefit gained. The most accurate valuation can be obtained by creating a full development program plan, calculating from that plan the original program characterization parameters of the interview instrument, then calculating through the Principal Component Analysis to determine the resulting values of QF1-QF7 and SF1-SF7. Unfortunately, this method has four drawbacks:

- The creation of a full development program plan during program formulation is a time-consuming and difficult process. It is often impossible to foresee all possible problems, and so the initial program plan is usually inadequate.
- The interview instrument contains 45 program characterization parameters, which is a somewhat excessive number for effective estimation efforts.
- The calculations to convert the program characterization parameters to the 14 factors somewhat obscure the meaning of the original parameters, and some of the parameters (such as the highly correlated requirements quantities) have little additional effect.
- Subjective estimation is still necessary for many of the original parameters, including (during a program formulation stage) even the quantitative parameters.

While this method undoubtedly produces the most accurate valuation, the marginal benefit in accuracy may not justify the great additional effort required. A simpler valuation method can follow the practice of COSYSMO by subjectively assessing each program characterization factor QF1-QF7, SF1-SF7 directly against a verbalized description of that factor at multiple target levels. The verbalized descriptions can be obtained from consideration of the effects of the original program characterization parameters on the PCA factors, such as the example for QF1 system size in Table 36.
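Under the simpler valuation method just described, steps 3 to 5 of the estimation methodology reduce to a lookup plus two small formulas. The sketch below is illustrative only: Equations 32 and 35 are not reproduced in this section, so the weighted-sum form for the adjustment constant and the quadratic form for the optimum are assumptions, and every numeric weight and parameter is a placeholder rather than a value from the research. Only the VL-to-VH percentile points are taken from Table 36.

```python
# Sketch of estimation steps 3-5. ASSUMPTION: Equations 32 and 35 are not
# reproduced in this section, so G_XX is modelled here as a weighted sum of
# percentile points and the optimum effort as a quadratic in G_XX, purely
# for illustration. All weights and quadratic parameters are placeholders,
# not values found by the research.

# Percentile points for the verbal levels, as given in Table 36.
LEVEL_TO_PP = {"VL": 0.15, "L": 0.30, "M": 0.50, "H": 0.70, "VH": 0.85}

PLACEHOLDER_WEIGHTS = [0.1] * 14          # Weight_XXj for one SE activity
PLACEHOLDER_A, PLACEHOLDER_B = -0.5, 2.0  # a_XX, b_XX for that activity

def optimum_effort(levels, weights, a, b):
    """Translate VL/L/M/H/VH ratings for the 14 factors into percentile
    points (step 3), combine them into an adjustment constant (step 4),
    and evaluate the assumed quadratic for the optimum effort (step 5)."""
    pp = [LEVEL_TO_PP[lvl] for lvl in levels]     # step 3
    g = sum(w * p for w, p in zip(weights, pp))   # step 4 (assumed form)
    return a * g * g + b * g                      # step 5 (assumed form)

# A median program rates every factor "M" (PP = 0.5).
median = ["M"] * 14
print(round(optimum_effort(median, PLACEHOLDER_WEIGHTS,
                           PLACEHOLDER_A, PLACEHOLDER_B), 3))
```

With the thesis's actual weights and quadratic parameters substituted in, this structure would reproduce the Table 34 median-program values described in the text.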
Typical characteristics in this table have been inferred from the values for interviewed programs that evidence QF1 values at the indicated percentile points (PP). Applying such a valuation method requires fourteen such tables describing the five levels for each of the program characterization factors. The estimation levels of VL, L, M, H, and VH can then be translated into the corresponding percentile points for use in Equation

Table 36. Typical program characterization valuation definitions ("system size"); valuation levels with percentile points and typical characteristics (some counts were lost in transcription, shown as "…"):

- VL (Very Low), PP = 0.15: no unique algorithms; a few components in the system; 1-2 formal test locations; 1-2 developing organizations; … system-level requirements; 2-8 system-level external interfaces; one or two operational scenarios; one customer agency.
- L (Low), PP = 0.30: 1-2 unique algorithms; 6-16 components in the system; 2-4 formal test locations; 1-3 developing organizations; … system-level requirements; 4-15 system-level external interfaces; 2-4 operational scenarios; a few customer agencies.
- M (Moderate), PP = 0.50: a few unique algorithms; … components in the system; 3-5 formal test locations; 2-4 developing organizations; … system-level requirements; 4-20 system-level external interfaces; 3-10 operational scenarios; 2-6 customer agencies.
- H (High), PP = 0.70: 3-20 unique algorithms; … components in the system; 4-6 formal test locations; 2-5 developing organizations; … system-level requirements; 5-25 system-level external interfaces; 5-20 operational scenarios; 3-8 customer agencies.
- VH (Very High), PP = 0.85: dozens of unique algorithms; … components in the system; 4-6 formal test locations; 4-7 developing organizations; … system-level requirements; … system-level external interfaces; … operational scenarios; 3-10 customer agencies.

Relative effects of program characteristics on estimation

As noted earlier, assigning PP = 0.50 to all characteristic factors results in the SE effort estimation values for a median program. It is useful to explore what variation is obtained by the estimation method for differing values of each program characterization factor. Table 37 shows the variation in optimal total SE effort when each factor is varied alone to different values.

Table 37. Optimal total SE effort for variation in one factor. [Tabulated effort values not recoverable from the transcription. Rows: quantitative parameters QF1 system size, QF2 development methods, QF3 level of integration, QF4 definition at start, QF5 life-cycle stage, QF6 proof difficulty, QF7 development autonomy; subjective parameters SF1 team understanding, SF2 complexity of program/system, SF3 differences across installation sites, SF4 team process capability, SF5 need for and use of SE tools, SF6 technology risk, SF7 wider system applicability. Columns: VL, L, M, H, VH.]

6.3 Examples of use

Space system development

This program covered the NASA-contracted development of a single space-certified measurement system to be placed aboard the International Space Station. Figure 77 shows the ranking of the program against the parameters. The program presented significant technology risk. While the system was not particularly large, its proof was exceedingly difficult. The NASA environment led to a highly controlled development process, removing autonomy from the program. The technology and program issues led to a high program/system complexity.

Figure 77. Space system program characterization

The values represented in Figure 77 are translated first into the scale of Very Low (VL), Low (L), Moderate (M), High (H), or Very High (VH) and then into the specific values of PPi corresponding to this scale. These estimations are then applied to the

transformation equation to determine the optimal level of effort of each SE activity and of total SE activity, with the results in Table 38. For a program of these characteristics, the optimal total SE effort (15.6%) is somewhat greater than for a median program (14.4%), due largely to the proof difficulty, high complexity, and high risk (and reduced by the lack of development autonomy). The front-end activities of mission/purpose definition (MD), requirements engineering (RE), and system architecting (SA) are all somewhat less than for a median program due to the early-program uncertainty of technology risk and complexity. The optimal value for technical leadership/management (TM) is very high, indicating the need for continual management of the technical risk. The optimum levels are shown against the actual levels used.

Table 38. Optimum and actual SE effort levels for space system program (% of program cost)

                 MD     RE     SA     SI     VV     TA     SM     TM     Total SE
Optimum level    0.5%   1.0%   2.7%   1.4%   1.9%   1.3%   1.2%   5.5%   15.6%
Actual values    0.1%   2.1%   5.0%   6.6%   0.3%   2.5%   0.2%   3.0%   19.2%

This program was a technical success, but overran its cost. Schedule was controlled by space launch events and was not allowed to vary. The success measures evidenced a cost overrun of 24%, no schedule overrun, a stakeholder success rated at 5 on a 1-to-5 scale, and 85% above minimal technical performance toward technical performance objectives. SE efforts were much greater than optimum during requirements engineering (RE), system architecting (SA), system integration (SI), and technical analysis (TA), documenting the interview-reported methodology of test-and-fix iteration. Mission/purpose definition (MD) was very low, as were scope management (SM) and technical leadership/management (TM). The resulting lack of scope definition, and of management to that scope definition, likely led to the high cost overrun.
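As a quick check on this narrative, the per-activity gaps in Table 38 can be tabulated directly; the numbers below are exactly those reported for the space-system program (the tabulation itself is a sketch, not part of the thesis's method):

```python
# Optimum vs actual SE effort, exactly as reported in Table 38.
activities = ["MD", "RE", "SA", "SI", "VV", "TA", "SM", "TM"]
optimum = [0.5, 1.0, 2.7, 1.4, 1.9, 1.3, 1.2, 5.5]  # % of program cost
actual = [0.1, 2.1, 5.0, 6.6, 0.3, 2.5, 0.2, 3.0]   # % of program cost

gaps = {a: round(act - opt, 1)
        for a, opt, act in zip(activities, optimum, actual)}
over = max(gaps, key=gaps.get)    # activity furthest above its optimum
under = min(gaps, key=gaps.get)   # activity furthest below its optimum
print(over, gaps[over])    # -> SI 5.2 (integration run far above optimum)
print(under, gaps[under])  # -> TM -2.5 (leadership far below optimum)
```

The largest overspend (SI) and the largest shortfall (TM) match the test-and-fix iteration and the scope-control weakness described in the interview.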
This program appears to have been a case of technology running wild with few scope controls.

Airborne training system development

This program covered the contracted development of a training system that was installed into a series of existing aircraft. Figure 78 shows the ranking of the program

against the parameters. This program was modifying a production system (the aircraft) but was developed primarily as a stand-alone system. It represented low technology risk and easy proof, and the development team operated with a great deal of autonomy. The team did not, however, have a high maturity of team process capability.

Figure 78. Airborne training system program characterization

As with the first example, the values represented in Figure 78 are translated into the scale of Very Low (VL), Low (L), Moderate (M), High (H), or Very High (VH), then applied to the transformation equation to determine the optimal level of effort of each SE activity and of total SE activity. The results are in Table 39.

Table 39. Optimum and actual SE effort levels for airborne training system (% of program cost)

                 MD     RE     SA     SI     VV     TA     SM     TM     Total SE
Optimum level    0.7%   1.3%   2.0%   1.2%   3.0%   1.5%   0.9%   2.6%   13.2%
Actual values    1.1%   0.3%   1.2%   1.9%   5.2%   0.3%   0.2%   2.9%   12.6%

In this case, the optimal total SE effort was somewhat less than median due to the easy proof, lack of development autonomy, and low risk (but increased by the system level of integration and the low process capability). However, due largely to the low risk and easy proof, the optimal front-end activities of mission/purpose definition (MD), requirements engineering (RE), and system architecting (SA) are all somewhat less than median. Despite the easy proof, the optimal level for verification/validation (VV) was higher than median because of the lower process capability and system-oriented level of integration. The optimum levels are shown against the actual levels used.

This program was a success by all four measures. It completed its work on cost and on schedule (no overrun in either), had a stakeholder success rated at 4 on a 1-to-5 scale, and met the objective level of all technical performance measures. Although the program used considerably less than optimal technical analysis (TA), requirements engineering (RE), and system architecting (SA), it made up for these by a greater-than-optimal use of mission/purpose definition (MD), system integration (SI), and verification/validation (VV). The technical leadership/management (TM) activity was close to optimal. From the qualitative statements during the interview, it was apparent that the TM approach was a likely source of success due to positive, proactive team development and risk management.

6.4 Limitations on the results

As with all research, there are limitations on the possible use of the results. These limitations fall in two major categories. Internal validity limitations derive from assumptions made during the research analysis, including bounds on the mathematical processes used. External validity limitations derive from the sampling process, which limits the application of the results to programs that are similar to the sample used.

Internal validity limitations

The statistical procedure used in Section 5 is essentially robust. It is based in well-known mathematical techniques and is largely used within the limits of those techniques. The data set is sufficient to support the claims of the research. Nonetheless, there are still some limitations that should be considered in the use of these results.

Causality. The statistical methods in Section 5 are carefully described without imputation of causality. Statistical data rarely implies causality; rather, it simply implies a relationship.
In Section 6, however, some indications of causality have been inferred from the data, from interview statements by participants, and from the general experience of the principal investigator. All statements of causality in this thesis should be treated with caution. Without a theoretical basis for the effect of SE on a program, such statements are only inferences from the information available.

Weak correlation. In all cases, the correlation between SE activity and the four program success measures was evaluated against a statistical confidence level of 95%

(α = 0.05). Significant correlation was reported when this criterion was met. In most cases, the observed correlation vastly exceeded the criterion. In at least two cases (VV and SM activity), the final correlation was acceptable but not overwhelming. In these cases, it would be appropriate to gather more data to determine (a) whether the correlation is truly significant, and (b) what other factors might have an effect on the correlation. See Sections and for discussion of these two weaker areas.

Lack of correlation. This research reports a lack of correlation for those relationships in which the observed data does not meet the criterion of 95% statistical confidence. (In two cases in Table 33, the relationship is reported as "Marginal" where the observed data barely meets the criterion.) When the relationship falls short of the criterion, it is reported as "Unknown". It should not be inferred from this that there is no correlation; it is simply true that the data does not support a positive confirmation of the correlation at this confidence level.

Extreme values. The breadth of values for each program characterization parameter provides a wide range of applicability of the results. However, there are cases within the observed data in which programs having extreme values did not fit the trend of the rest of the data. Section describes how programs were removed from the analysis for various extreme values. While this common statistical procedure found the underlying relationships, it still leaves in question the nature of these extreme programs. Even further, other programs exist that are not part of the interview set and that may have values even more extreme than those in the interview set. The statistical treatment used provides well-behaved results within the bounds of the normal data, but should not be used for extreme-valued programs without further statistical examination.

Edgy combinations.
Even without going to the extreme values, the set of interviewed programs is not wide enough to cover all possible combinations of the 14 program characterization factors. Particularly in the use of the estimation methodology, more data is needed before using the methodology for programs with multiply conflicting values of VL and VH. The two examples in Section 6.3 provide reasonable bounds in which there are no more than two or three edgy parameter values.

Principal component analysis (PCA). The use of PCA reported in Section relies on a rigorous mathematical treatment. This treatment showed that over 70% of

the variability of 45 parameters could be characterized by two sets of seven orthogonal parameters, thereby reducing significantly the statistical analysis workload. Nonetheless, this treatment means that the remaining ~30% of variability was no longer included in the data analysis beyond this point. This remaining variability may have discernible effects on the results.

Subjective usage. Subjective values have been used in this research in several areas, usually scored on the classical 1-to-5 Likert (1932) rating scale. The estimation methodology of Section also relies on subjective treatment of the perceived future program characteristics. As always, subjective values are suspect, relying on the perceptions of the reporting individuals at the time of the interview.

Principal component analysis (PCA) on subjective scores. The two usages of PCA reported in Section rely on a rigorous mathematical treatment. However, the second usage ("subjective parameters") applies the PCA treatment to subjective values on the 1-to-5 rating scale. PCA relies mathematically on the fine-grained variation in parameters to distinguish the orthogonal axes of variability. With parameters having such coarse-grained, discrete values, the PCA results are less reliable. Errors in the resulting orthogonal factors may have impacted the values obtained in the hill-climbing search for weighting factors.

Lack of front-end data. Only 14 of the interviewed programs had actually performed mission/purpose definition (MD) activities. Table 25 shows that only seven programs determined the relationship between MD and schedule success. While the specific statistical result in Table 25 is rigorously and well supported against the 95% confidence level, this lack of data has other consequences throughout the analysis. In essence, programs relied on work performed prior to the interviewed program; hence, the data is essentially incomplete in this regard.
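For intuition on the PCA limitation above, the "fraction of variability explained" has a closed form in the two-dimensional case. A sketch, with an invented covariance matrix rather than the research's 45-parameter data:

```python
# The PCA limitation above concerns how much variability the retained
# components explain. For intuition, the explained-variance split of a 2x2
# covariance matrix can be computed in closed form. The matrix below is
# INVENTED for illustration; it is not data from the research.

def explained_variance_2x2(cov):
    """Eigenvalues of a symmetric 2x2 covariance matrix, and the fraction
    of total variance captured by the first principal component."""
    (a, b), (_, d) = cov
    trace, det = a + d, a * d - b * b
    disc = (trace * trace - 4 * det) ** 0.5
    lam1, lam2 = (trace + disc) / 2, (trace - disc) / 2
    return lam1, lam2, lam1 / (lam1 + lam2)

lam1, lam2, ratio = explained_variance_2x2([[4.0, 1.0], [1.0, 1.0]])
print(round(ratio, 2))  # -> 0.86: first component explains ~86% of variance
```

In the research, the analogous computation over 45 parameters retained two sets of seven components covering just over 70% of the variance, leaving roughly 30% of the variability outside the subsequent analysis.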
One example of the consequential weakness can be observed in the hill-climbing weighting search of Table 13, in which the 21 weighting factors applying to MD effort were found using data points from only 14 programs. This weakness was unavoidable, simply due to the difficulty of obtaining valid program data.

Quadratic approximation. Section describes the use of quadratic regression to fit the observed data in each relationship to a quadratic equation. The use of the quadratic equation was a chosen approximation based on the theoretical work of Honour (2002b). That theoretical work performed only an end-point value analysis to indicate the general form of the relationship rather than the specific quadratic nature. No test has been performed to determine whether other forms (cubic, exponential, Rayleigh, etc.) might be a closer approximation. The results in this research show a high degree of correlation using this particular approximation, but there may be a better form.

Hill-climbing search for optimum. The search for optimum weighting values for each relationship used a controlled hill-climbing algorithm in 14 or 21 simultaneous dimensions. It is likely that the 15th/22nd-dimensional surface in each relationship has multiple local optima that could be found by searches in different directions. At each step of the algorithm, simultaneous variation in all dimensions was computationally beyond a reasonable scope of work. Therefore, the investigator controlled the search at each step to seek only those directions in which the results were logically consistent with the data. Because of these factors, it is distinctly possible that the weights discovered by the search are not truly optimal. This is more likely for those SE activities with fewer data points, such as MD effort. Further examination of the weighting factors using more data would be appropriate.

6.4.2 External validity limitations

The methods used to obtain interviews are discussed in Section 4.3, as are the demographics of the results. External validity of the results is of course dependent on the similarity of any application with the demographics of the source data. When programs are within the range of the source demographics, the results may be used with confidence; when programs are outside that range, the results may not be applicable. The following limitations in the demographics exist.

Contracted systems development.
The data set includes both contracted and amortized system developments, but the majority of the interviewed programs were in a contracted environment. While the results may be applicable to both environments, they can be more confidently applied to the contracted environment.

Military systems development. The SE discipline has been significantly developed in the military systems environment. Over half of the interviewed programs were from this environment. Consequently, the results can be more confidently applied to military systems. Nonetheless, there are sufficient non-military programs in the data set that the results can also be applied, with caution, to non-military systems.

Western-oriented cultures. All interviewed programs were from Western-oriented cultures, and all but four were English-speaking. The results may not be applicable to other cultures.

Self-selected organizations. The organizations that offered programs for interview selected themselves, based on the contacts made by the researcher and the visibility afforded through the Research Advisory Group. The researcher also made the research visible through presentation of the developmental papers at conferences. It can be inferred that the organizations chose to take part because of their interest in the SE discipline and with a hope that the benchmark reports would help them to be more successful. These motivations imply a skew in the selection of organizations that may be difficult to quantify. The use of these results in organizations that do not share the same goals may be weak.

Organization-selected programs. The programs that were interviewed were typically selected by managers within the participating organizations, with some level of negotiation with the program leaders. Program leaders who were interviewed were volunteers. The researcher had some small influence on the selection of programs by requesting programs to fit the necessary variation in parameters, often having to insist on obtaining programs that did more poorly. All of these factors influence the degree to which the interviewed programs are actually representative of all system development programs.

Program environment and size. Demographics of the interviewed program size parameters are shown in Figure 21. Interviewed programs typically had the following characteristics:
- Development life-cycle, as opposed to production or operation
- Level of start definition between mission defined and technical requirements defined
- Between 50 and 500 system-level requirements
- Between 3 and 50 system-level external interfaces
- Fewer than 7 system-specific algorithms
- Between 2 and 20 operational scenarios
The results are less applicable for programs outside these bounds.
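The applicability bounds above can be encoded as a simple screening check. This is a minimal sketch; the function and argument names are invented for illustration and do not appear in the thesis:

```python
# Screening check against the demographic bounds of the interviewed data
# set (bounds taken from the list above; the function and its argument
# names are illustrative, not part of the thesis).
def within_source_demographics(requirements, interfaces, algorithms, scenarios):
    """True when a program falls inside every demographic bound."""
    return (50 <= requirements <= 500
            and 3 <= interfaces <= 50
            and algorithms < 7
            and 2 <= scenarios <= 20)

inside = within_source_demographics(200, 10, 3, 8)     # every bound met
outside = within_source_demographics(1200, 10, 3, 8)   # 1200 requirements is out of range
```

A program failing the check is not necessarily beyond the results, but the confidence in applying them drops, as the text notes.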

Program characteristics. Demographics of the program characteristics are shown in Figure 22, Figure 23, and Figure 24. In nearly all cases, there was a suitably wide variation in program characteristics to justify the results. Some anomalies provide limitations that should be considered. Interviewed programs typically were limited to:
- Better-than-average team understanding of the system mission/purpose, in the viewpoint of the interviewed leaders
- No cases in which the interviewed leaders perceived the team to be deficient in understanding of the requirements or system architecture
- Lead systems engineers with better-than-average experience
- CMMI levels at three or below

Traditional SE methods. None of the interviewed programs reported using significant levels of Agile, Lean, or Model-Based systems engineering methods; all programs operated largely within traditional SE methods. The results should not be used for programs using these more recent, non-traditional methods.

6.5 Observations and findings

This chapter further discusses the results of the statistical work, seeking to infer causality where it may be available as well as examining the relative effects of the many factors on the results. Major results of this further examination include:
- Demonstration that the estimation methodology used in this research provides a useable tool for SE effort estimation during program formulation.
- Realization that appropriately selecting the level of technical leadership/management (TM) has the unique ability to bring high simultaneous levels of cost, schedule, and overall success rates. However, using too much TM effort correlates with rapid deterioration in both schedule and overall success.
- Discovery that the lack of correlation between SE activities and technical quality is likely due to program emphasis on requirements compliance to the exclusion of stakeholder-defined technical quality.
- An understanding that the following program characterization factors are most important in the determination of SE effort levels: level of integration (system versus subsystem), development autonomy, system size, and level of definition at start.
- Realization that, for estimation of SE effort levels, the quantitative factors (system size, development autonomy, level of integration, level of definition at start) are significantly more important than the subjective factors (team understanding, process capability, etc.).
- An understanding of the desired balance among the subordinate SE activity effort levels.
In addition to these major results, the discussion has also generated a series of other findings, as shown in Table 40.

Table 40. Summary of findings from discussion of statistical results

Nbr | Finding | Quantification
67 | Level of integration is one of two most important confounding factors to be considered in the relationship of total SE to success. | Increase in correlation
70 | Development autonomy is one of two most important confounding factors to be considered in the relationship of total SE to success. | Increase in correlation
66 | System size is an important confounding factor to be considered in the relationship of total SE to success. | Increase in correlation
68 | Level of definition at start is an important confounding factor to be considered in the relationship of total SE to success. | Increase in correlation
69 | The more difficult a system is to prove, the greater amount of SE is needed. | QF6 value of 22
71 | Development performed independently requires a significantly greater level of SE effort than development performed in the context of a higher-level structure. | QF7 value of 30
73 | Process capability has much less effect on determining total SE effort than do the quantitative factors concerning the system and program development. | Relative increase of correlation
74 | It is important to program success to have the levels of MD effort and RE effort balanced. | Increase in correlation
78 | System size, development autonomy, and the degree of team understanding are the largest determinants of the level of requirements engineering effort needed. | Relative increase of correlation
80 | Development autonomy and the degree of team understanding are the largest determinants of the level of system architecting effort needed. | Relative increase of correlation
82 | Development autonomy and the degree of team understanding are the largest determinants of the level of verification and validation effort needed. | Relative increase of correlation
85 | Lifecycle stage, development autonomy, and complexity are the largest determinants of the level of technical analysis effort needed. | Relative increase of correlation
88 | Development methods and wider system applicability are the largest determinants of the level of scope management effort needed. | Relative increase of correlation
93 | System size, team understanding, differences across installation sites, and wider system applicability are the largest determinants of the level of technical leadership/management effort needed. | Relative increase of correlation

7 Conclusions and recommendations

This chapter summarizes the thesis by highlighting the major findings and indicating future research areas that come out of this work.

7.1 Major findings

The work described in this thesis developed and executed a research program to explore the quantitative relationships between systems engineering and program success. The program created an interview instrument, based on a peer-reviewed ontology, which could gather salient data for the research. Using that instrument, the principal investigator obtained 1.5-hour interviews with the program management and systems engineering leaders of 51 system development projects. The interview data was subjected to rigorous mathematical and statistical processing to extract and test the relationships. All of the following major findings are supported by the research work. The first six findings are of highest importance to the SE discipline as a whole, and to the system development programs that use SE.

1. There is a quantifiable relationship between systems engineering effort levels and program success.

The original research question A was stated as: (RQ A) Is there a quantifiable correlation between the amount, types and quality of systems engineering efforts used during a program and the success of the program? The empirical work in this research has shown that there exists such a quantifiable relationship, with correlation coefficients well in excess of the test values for a significance level of α = 0.05.

The relationships for total SE effort are shown in Figure 37, Figure 38, and Figure 39, all of which show program success measures (cost compliance, schedule compliance, and overall success, respectively) plotted against Equivalent SE Effort (ESEE) as a percentage of program cost. In all three cases, the relationship shows poor program success at low levels of ESEE, improving to desirable program success at moderate levels of ESEE, and then again deteriorating to poor program success at higher levels of ESEE. This relationship is consistent with the theoretical work of Honour (2002b).

Similar relationships were discovered, with minor exceptions, for each of the eight subordinate SE activities. Table 33 summarized the findings that are depicted in Figure 42 through Figure 73. All eight SE activities demonstrated a statistically significant correlation against at least two of the three program success measures. All SE activities were significantly correlated with cost compliance; seven activities were significant against schedule compliance, and five activities were significant against overall success. Given the demanding significance level of α = 0.05, these results provide an overwhelming affirmation of this major finding.

Importance. This major finding acts as a warning to programs: the level of SE effort matters to the success of programs, as does the mix of that effort across the constituent activities of SE.

2. Systems engineering has a significant, quantifiable Return on Investment.

The quantified relationship between cost compliance and Total SE effort was subjected to standard financial calculations for Return on Investment, in which the Return was measured as program cost reduction and the Investment was measured as additional cost applied to Total SE effort. The results (Section ) clearly showed a quantifiable ROI for the re-allocation of funding to additional SE effort.
For programs operating at near-nil SE effort, that ROI is as high as 7:1, a program cost reduction seven times as great as the added SE cost. For programs operating near the median of the interviewed programs, the ROI is 3.5:1.

Importance. The result is compelling program management information: greater SE effort is associated with significantly less cost overrun. This statement is true for programs operating at and below the median of SE effort.
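The ROI definition above (Return = program cost reduction, Investment = additional SE cost) reduces to a simple ratio. A sketch follows, with dollar figures invented purely to reproduce the reported 3.5:1 and 7:1 values:

```python
def se_roi(cost_reduction, added_se_cost):
    """ROI as defined above: program cost reduction (the Return) divided
    by the additional SE cost that produced it (the Investment)."""
    return cost_reduction / added_se_cost

# Hypothetical dollar figures chosen to reproduce the reported ratios:
median_roi = se_roi(700_000, 200_000)      # the 3.5:1 median-program ROI
near_nil_roi = se_roi(1_400_000, 200_000)  # the 7:1 near-nil-SE ROI
```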

3. No correlation was found between systems engineering and system technical quality.

The research work included four success measures, not only three. The fourth success measure was intended to quantify the technical quality of the resulting system, using the Key Performance Parameters (KPP) believed to matter most to the stakeholders. The statistical work unfortunately showed no significant correlation between any SE activity and the system technical quality (see Table 33). A review of the qualitative interview information provided an explanation: the program leaders (PM and SE) rarely used KPPs as a driving factor in the program. Instead, the driving technical factor was usually the set of requirements defined for the development. Programs typically excelled at KPPs only when either (a) improving the KPPs carried no additional effort or cost, or (b) technical excellence was the program driver.

Importance. This major finding is a caution for the SE discipline, to take care lest SE become merely an adjunct of program management. The role of SE in a project is to monitor and guide the technical success. Today, it appears that technical requirements are the de facto measure of technical success, rather than the technical qualities that matter to the stakeholders. This definition is attractive to program managers and contracts, but does not produce the best systems.

4. There is an optimum amount of systems engineering for best program success.

The quantifiable correlations between SE effort and program success all evidence a "bathtub" behavior in which there is a clear optimum value of SE effort in each relationship. That optimum has been calculated by determining the point at which ROI goes to zero; up to this point, additional application of SE effort has a positive effect on the program, while beyond this point the ROI becomes negative.
For Total SE, the optimum amount of effort for a median program is 14.4% of the total program cost. For non-median programs, this value can vary roughly between 8% and 19% of the total program cost based on the varying program characteristics. Similar values exist for each of the eight subordinate SE activities, with similar variation for non-median programs. Optimal effort values for the median programs are shown in Table 34 and range from 1.3% to 3.9% of total program cost.
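Because the relationships are fitted as quadratics, the "ROI goes to zero" optimum corresponds, under a simplification, to the vertex of the fitted cost-overrun curve: the point where additional SE stops reducing overrun. A sketch follows, with coefficients invented to land on the reported 14.4% median optimum (the actual fitted coefficients are in the thesis, not reproduced here):

```python
def quadratic_optimum(a, b):
    """Vertex of overrun(e) = a*e**2 + b*e + c with a > 0: the effort
    level at which additional SE stops reducing overrun (marginal
    ROI reaches zero under this simplification)."""
    return -b / (2 * a)

# Illustrative coefficients chosen so the vertex lands at 14.4% of
# program cost, the optimum reported for a median program:
optimum = quadratic_optimum(0.5, -14.4)
```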

Importance. This major finding provides a useful metric for program development and for program reviews because it shows the typical levels of SE effort that are associated with the most successful programs.

5. Programs typically use less systems engineering effort than is optimum for best success.

For the median of the interviewed programs, the calculated ROI is 3.5:1. This indicates that additional total SE effort would result in a program cost reduction 3.5 times as great as the cost of the additional effort. It is apparent from this data that the median programs in the interview set operated with considerably less SE effort than is optimum. How much less can be seen in Table 34, which shows the median level of SE effort is 8.5% of total program cost against an optimum level of 14.4%. These numbers are important: for a median $14M program operating at 8.5% SE effort, the observed cost overrun was on the order of $1.5M; for a similar program using $200K greater SE effort, the cost overrun was only $1.0M (both numbers from the averaging line of Figure 37). By allocating $200K greater SE effort, such a typical program could reduce its cost by $500K.

Importance. This major finding is a caution to program managers: ensure that their program is operating at a sufficient level of SE effort, a level higher than they are accustomed to using.

6. A method is provided to estimate the optimal systems engineering effort for a given set of program characterization parameters.

The second major research question was stated as: (RQ B) For any given program, can an optimum amount, type and quality of systems engineering effort be predicted from the quantified correlations? This research has developed a systems engineering effort estimation method that is based on the quantified correlations. That method is described in five steps that use the mathematics of this research.
The method is suitable for use in very early stages of a system development to provide meaningful effort estimation even for the initial SE activities. It has been shown that the method necessarily results in a level of SE effort that is optimum for program success for the given program characteristics.

There is no guarantee that following the estimations will result in the best success for a program. Other confounding factors still exist, and the causality has yet to be proven. However, the estimation provides SE effort levels that are proven to be associated with the most successful programs in this interview data set. This associative relationship is more useful than any prior method.

Importance. This major finding provides an effective program/technical management tool that can be used in system development programs to appropriately size the SE effort.

7. For systems engineering effort estimation, some program characterization parameters are of much greater importance than others.

The mathematics of the SE effort estimation method provides great insight into the factors that are of most importance. If these factors are mis-estimated, the SE effort will be poorly estimated. The importance of the factors occurs through two effects:
- The strength of their contribution to adjustment of SE effort from median to the specific program. More important parameters create a larger adjustment. This effect can be seen in Figure 35, Figure 36, and Table 37, which show the effort adjustment due to each factor.
- The effect of the factors on the correlation. More important parameters align the programs more closely to the central tendency, thereby removing the effects of many confounding issues and improving the estimation assurance. This effect can be seen in Figure 76, showing the change in correlation due to each factor.
The comparison of the data in these figures shows the factors in Table 41 to be most important for proper estimation of SE effort levels.

Table 41. Most important factors for SE effort estimation

Factor | SE Effort Adjustment (Weight_j, unitless) | Correlation (Assurance) Improvement (R^2, unitless)
Level of Definition at Start
Development Autonomy
Level of Integration (system vs. subsystem)
System Size
Proof Difficulty
Technology Risk
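For intuition only, factor-driven adjustment of SE effort can be sketched in a COCOMO-style multiplicative form. This is an assumed form with invented multipliers, not the estimation mathematics of the thesis; only the factor names echo Table 41:

```python
# A COCOMO-style multiplicative sketch, NOT the thesis's actual estimation
# mathematics: start from the 14.4% median optimum and scale it by one
# multiplier per characterization factor. The multiplier values and the
# multiplicative form itself are illustrative assumptions.
MEDIAN_OPTIMUM_SE = 14.4  # percent of total program cost

def adjusted_se_effort(multipliers):
    """Scale the median optimum by one multiplier per factor
    (1.0 = median program; above 1.0 demands more SE effort)."""
    effort = MEDIAN_OPTIMUM_SE
    for m in multipliers.values():
        effort *= m
    return effort

# A program that is poorly defined at start but otherwise median:
estimate = adjusted_se_effort({
    "level_of_definition_at_start": 1.2,  # invented multiplier
    "development_autonomy": 1.0,
    "level_of_integration": 1.0,
    "system_size": 1.0,
})  # roughly 17.3% of program cost
```

The point the sketch illustrates is the one the text makes: mis-estimating a high-weight factor moves the whole estimate, while low-weight factors barely matter.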

Importance. This finding shows that it is necessary to take the important program factors into account when estimating program SE effort. Adjustment factors have been calculated to facilitate this.

8. Of the defined SE activities, technical leadership/management is unique in providing optimum program success simultaneously in cost, schedule, and stakeholder acceptance.

Section observed the unique nature of the technical leadership/management (TM) effort, displayed in Figure 70, Figure 71, and Figure 72: all programs in the optimal region of TM effort were largely on-cost, on-schedule, and had the highest levels of stakeholder overall success. No other SE activity claims this same level of success; in all other activities, some programs fell short of these desired goals. Therefore, it appears that using the optimal levels of TM effort is associated with this high assurance of meeting program goals. There is a caution with this finding, however, in that the same graphs show severe degradation in both schedule compliance and overall success when the TM effort becomes greater than optimum. Providing optimal TM effort is important, but it is also important not to provide too much.

Importance. This major finding indicates to program managers the area in which to most carefully emphasize proper SE effort.

9. There is a commonly held ontology of systems engineering that is sufficient to be meaningful.

The work of Honour & Valerdi (2006) demonstrated the confused nature of SE terminology by the difference of terms used in various SE standards for similar concepts. This confusion has existed for many years, due to the use of SE in various stovepipe domains that do not trade information with others. One early concern for the current work was whether it was even possible to obtain comparable data from different organizations, due to the vast difference in terminologies and program structures.
The development of the eight SE activities described in Section showed that a common vocabulary existed, indicating an underlying ontology ("shared understanding") was in fact sufficient to be meaningful. The use of those activities

during interviews was invariably meaningful to the interview participants, and every interview was able to translate data from the source organization's definitions into the common ontology of the research.

Importance. This major finding is useful in the SE discipline to bring commonality and closure to many issues of conflict about terminology and concepts. It is noted that such closure is developing even during the period of this current work, with the development of a widely supported Systems Engineering Body of Knowledge (SEBOK).

10. It is possible to effectively quantify systems engineering effort using empirical data.

One dissenting view about SE quantification was published during Phase I of the prior Value of SE work described in Section 2.3. This dissenting view was "The Shangri-La of ROI" (Sheard & Miller 2000), in which the authors wrote: "This paper shows that 1) There are no hard numbers. 2) There will be no hard numbers in the foreseeable future. 3) If there were hard numbers, there wouldn't be a way to apply them to your situation, and 4) If you did use such numbers, no one would believe you anyway." The paper was accepted with acclaim in a time when systems engineers struggled with program managers for recognition of their worth. It was widely referenced for many years, partly because the paper echoed the despair among systems engineers, and partly because the paper also provided useful indications of subjective and political methods to build a case for systems engineering.

This current work has now proven that it is in fact possible to effectively quantify SE effort using empirical data. There is nothing magical or different about SE compared with other disciplines in this regard. The author has applied the same empirical methods used in many other disciplines to create the results herein.

Importance. This major finding removes a common excuse for not doing the necessary quantification. Programs can quantify their SE effort and can relate that to the success of the program.

11. It is possible to obtain meaningful data about systems engineering and success through program proprietary boundaries.

Sheard & Miller (2000) also expressed another objection to quantifying SE:

"Data such as productivity, cost to produce systems, process improvement cost, and the like is usually considered highly confidential. Everyone would be more than happy to receive reports on industry trends and data on all the other companies, but no one is willing to provide such data. Even if the data was clearly defined, additional confidential data would be required to verify that the data had the same meaning from company to company. It is hard to imagine ever prying such numbers out of a large number of companies in a manner that has any sort of fidelity."

This has been a commonly held thought by many: that the proprietary boundaries and the sensitivity of the necessary data would completely preclude obtaining meaningful data. Three works (Valerdi 2005; Elm 2008; and this thesis) show that this fear is not true. Proprietary boundaries can be bridged with care, and sensitive data can be controlled successfully.

Importance. This major finding helps to pave the way for future studies that also may need access to proprietary data.

7.2 Future research indications

As with all research, each step forward shines the light on the path beyond. The current work has provided indications of possible research in several different directions. Any of the following areas may bear great fruit for the future of the SE discipline.

Recent SE paradigms. As noted in the limitations of Section 6.4.2, the source data for the current work included only projects that were executed using traditional SE paradigms. This constraint had not been intentional; it was a simple result of the difficulty of obtaining willing programs and the nature of the programs that became available. No programs were explicitly using Model-Based Systems Engineering, Lean, or Agile methods. A good further research project would be to apply similar empirical methods to programs using these more recent SE paradigms.
Proponents of the newer paradigms make strong claims for their worth on programs (Kennedy & Umphress 2011), and the research method demonstrated herein can provide the information to validate those claims.

Other SE domains or cultures. Another limitation to external validity, also noted in Section 6.4.2, is that all programs interviewed were within Western cultures. All but four programs were within English-speaking countries (USA, Australia). Most of the interviewed programs were for military systems. Most of the interviewed programs were within a contracted environment. A good additional research project would be to use the same empirical methods for system development programs in other system domains and other cultures, obtaining sufficient data to compare the differences with the work herein.

Validating current findings on subordinate SE activities. The current work has presented mathematically solid relationships between total SE and program success, and between subordinate SE activities and some forms of program success. In explicating the relationships, the data was transformed first through PCA and then through weights on the program characterization parameters. These transformations were often performed with minimal numbers of data elements, as noted in Section This minimal approach was necessary due to the segmentation of data into the different areas, as well as the removal of outliers and constrained programs that was described in Section The result is that many of the Weight_j values are derived based on 15-dimension or 21-dimension searches with only a few more data points than the degrees of freedom. Given the indications of the current work, a good advance may result from obtaining greater sets of data specifically on the subordinate SE activities. With larger data sets, the results herein may be further validated as well as extended into new knowledge. The values of Weight_j can be corroborated or corrected as needed.

True measures of technical quality. This work has shown a surprising lack of correlation between SE activities and a well-recognized measure of technical quality. It became apparent during interviews and in the post-analysis that most of the interviewed programs treated the set of requirements as the only measure of technical quality.
The Technical Performance Measurement method of KPPs was used by almost none of the interviewed programs, even though the method has been documented and known for decades. A fruitful area of research would be to evaluate the accuracy of different measures of technical quality. True quality is always defined in the perceptions of the stakeholders. Are requirements generally an accurate representation of that perception? How often does a system meet requirements and yet remain deficient in the stakeholder perception? What other technical quality measures exist, and how well do they reflect true quality?
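The Technical Performance Measurement idea referenced above can be made concrete with a small margin-tracking sketch. The KPP, its threshold, and all measured values are invented for illustration:

```python
# A minimal Technical Performance Measurement sketch: track a KPP's
# measured value against its required threshold across program reviews.
# The KPP name, threshold, and history values are hypothetical.
def kpp_margin(measured, threshold, higher_is_better=True):
    """Signed margin against the threshold; negative means the KPP
    is not currently met."""
    return measured - threshold if higher_is_better else threshold - measured

# Hypothetical detection-range KPP (km), threshold 100 km, three reviews:
history = [92.0, 98.5, 104.0]
margins = [kpp_margin(m, 100.0) for m in history]             # [-8.0, -1.5, 4.0]
improving = all(b > a for a, b in zip(margins, margins[1:]))  # True
```

Tracking such margins over time is what distinguishes KPP-driven quality management from mere requirements compliance at delivery.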

Qualitative analysis of the interview data. During the current work, the SE-ROI interviews gathered a significant set of qualitative information. This information was not necessary for this thesis, but was gathered simply because obtaining program information is so difficult. A review of the interview instrument in Appendix B shows that the information includes:
- Selection of key performance parameters
- Other success measures
- Methods used for each SE activity, with indications of their success
- Tools used for each SE activity, with indications of their success
- Metrics used to evaluate each SE activity, with indications of their value
- Lessons learned on the programs
- Other descriptive information captured during the interview
A useful research project would be to collate this information against the program success measures to evaluate the qualitative factors that contribute to success. Through this type of research, causal factors may be revealed that are largely hidden through statistical work alone.

Best practices and leading indicators. Best practices are those activities, methods, or tools that contribute to success under defined conditions. Leading indicators are the measurable or perceivable characteristics of a program that provide advance indications of future success or failure. The current work has identified correlative relationships, but has not provided any indication of the best practices or leading indicators that accompany the relationships. Although it has shown the level of SE effort that is associated with success, this thesis has not obtained data to show the details of what SE methods are included within that level of effort, nor what effect those methods might have on success. It has also not developed leading indicators other than the appropriate levels of early SE activity.
It would be of tremendous benefit to the SE discipline to do empirical research that correlates both best practices and leading indicators with program success, so that programs can be guided in advance toward success.

Benchmarking the SE effort estimation method. The SE effort estimation method presented herein could become an important tool for the SE discipline. The currently widely used COSYSMO tool provides good management information for estimating SE effort, but is based solely on the level of SE effort actually used in programs. This thesis has shown that the level actually used is only about 70% of the optimal level. This discrepancy is considerable, indicating that programs using COSYSMO continue to perpetuate the gap. This thesis provides data based on 51 program interviews, but many of those were extracted as outliers or constrained programs as noted earlier. The final results therefore used only the remaining program interviews (70-80 total data points, if including the Value of SE data). While statistically significant, this is a small number of data points for a method that could drive program decisions affecting billions of dollars. A good research avenue would be to use the SE effort estimation method on many trial programs, with follow-on tracking to determine its true accuracy and utility. This research might also consider merging this method with the COSYSMO method for even better benchmarking.

7.3 Summary

This research was concerned with the relationships between systems engineering and program success. The research specifically showed that relationships are evident in the data that significantly exceed the statistical bounds. The results allow calculation of the cost-based return on investment for additional SE effort, showing the SE-ROI to be significant for programs near the median of those interviewed.

The findings from the research are that the relationship between SE effort level and program success demonstrates a distinctive optimum, which can be calculated as 14.4% of total program cost for median programs. The median of interviewed programs operated at only 8.5%, significantly less than the optimum. For programs operating near the median of the interviewed programs, the ROI is 3.5:1. These findings can be considered typical of similar programs in similar domains and cultures.
The implication of these findings for SE methodologies is that most programs operate with less SE effort than is optimal for program success. For most programs, adding SE effort can be expected to significantly reduce the total development cost.
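The arithmetic behind these figures can be illustrated with a toy quadratic cost model. This is a sketch only: the curvature coefficient below is invented (tuned so the sketch reproduces the reported 3.5:1 ROI), while the 8.5% and 14.4% effort levels are the values reported above; nothing here is a curve fitted in the thesis.

```python
# Illustrative only: a hypothetical quadratic model of actual program cost
# (as a multiple of planned cost) versus SE effort fraction. The curvature
# `a` is an invented coefficient, chosen to reproduce the reported 3.5:1
# ROI; it is not a value fitted to the interview data.

def total_cost(se_fraction, a=59.3, optimum=0.144, base=1.0):
    """Actual cost multiplier, minimized when SE effort equals `optimum`."""
    return base + a * (se_fraction - optimum) ** 2

median_se = 0.085    # median SE effort observed in the interviews
optimal_se = 0.144   # calculated optimum for median programs

saved = total_cost(median_se) - total_cost(optimal_se)   # cost avoided
invested = optimal_se - median_se                        # added SE effort
roi = saved / invested                                   # cost-based ROI
print(f"added SE effort: {invested:.1%} of planned cost")
print(f"cost avoided:    {saved:.1%} of planned cost")
print(f"ROI: {roi:.1f}:1")
```

The shape of the model (a distinct minimum with cost rising on either side) matches the optimum described in the findings; only the curvature is hypothetical.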

In another finding, no significant correlation appears in the data between SE activities and system technical quality (as measured by stakeholder-based Key Performance Parameters). The implication of this finding is that current SE process definitions may over-emphasize the use of requirements as opposed to more basic technical measures of interest to the stakeholders. The conflict between KPP-based success and requirements-based success is evident.

The research also specifically supports a new SE effort estimation method that optimizes program success based on an evaluation of program characterization parameters. The SE efforts that result from the estimate represent the SE effort level associated in these data with the programs that achieved the best program success. All of these findings are applied both to total SE effort and to eight subordinate SE activities, with appropriate parameters and relationships for each activity. The effort estimation method provides values not only for the total SE effort but also for the eight activities. The implication of these findings is that this estimation method offers a more accurate level of SE effort estimation (for program success) than is otherwise currently available.

As part of the SE effort estimation method, the research specifically shows that a small set of the program characterization parameters has the greatest impact on the effort estimation, indicating the program characteristics that most drive the level of SE effort. The implication of this finding for SE methodology is to indicate which program characteristics should be considered first in tailoring the SE methodology for a program.

In examining the eight subordinate SE activities, the research specifically shows that technical leadership/management provides a unique possible benefit in its optimal association with simultaneous success in cost, schedule, and stakeholder overall success.
The implication of this finding for SE methodology is that technical leadership/management holds a pre-eminent place among the SE methodologies.

The development of the research project and the interviews followed straightforward interview methodology, but still revealed insights that can be applied to the SE discipline. A commonly held SE ontology was discovered that could be expressed in eight subordinate SE activities and that was understood by all interview participants.
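The estimation method is described above only in outline. As a hypothetical sketch of its general shape, the fragment below maps characterization-parameter scores to a total SE effort fraction and apportions that total across eight activities. Every parameter name, weight, and activity split here is an invented placeholder, not a value or category fitted in the thesis.

```python
# Hypothetical sketch of an SE effort estimator of the general shape the
# thesis describes: program characterization parameters drive a total SE
# effort fraction, which is then apportioned across eight SE activities.
# All names, weights, and splits below are invented placeholders.

ACTIVITY_SPLIT = {            # fraction of total SE effort per activity
    "activity 1": 0.10, "activity 2": 0.15, "activity 3": 0.15,
    "activity 4": 0.15, "activity 5": 0.15, "activity 6": 0.10,
    "activity 7": 0.08, "activity 8": 0.12,
}

def estimate_se_effort(params, base=0.12, weights=None):
    """Return (total SE fraction of program cost, per-activity breakdown).

    `params` maps characterization parameters to scores in [-1, 1];
    positive scores (e.g. higher complexity) push SE effort upward.
    """
    weights = weights or {"complexity": 0.04, "novelty": 0.03,
                          "team_dispersion": 0.02}
    total = base + sum(weights.get(k, 0.0) * v for k, v in params.items())
    return total, {act: total * f for act, f in ACTIVITY_SPLIT.items()}

total, breakdown = estimate_se_effort({"complexity": 0.5, "novelty": 0.2})
print(f"estimated SE effort: {total:.1%} of program cost")
```

A linear parameter model is only one plausible form; the point is the two-stage structure (characterization parameters to total effort, total effort to per-activity allocations), which matches the method as summarized above.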

Through this ontology, useful demographics of the interviewed programs are shown that can be considered representative of similar SE domains and cultures. The implication of this ontology is that the SE discipline has a common thought process that seems to span work domains and cultures. Acceptance and encouragement of this common thought process can likely help the SE discipline coalesce into greater formality.

Finally, the entire work demonstrates that it is possible to obtain meaningful and quantifiable data about systems engineering and success through empirical methods. The implication of this demonstration is that further empirical research is indeed possible, research that can observe the SE discipline to the point of formulating effective underlying theory.

Appendix A Bibliography

Ancona and Caldwell (1990). Boundary management, Research Technology Management.

Barker, B. (2003). Determining systems engineering effectiveness, Conference on Systems Integration, Stevens Institute of Technology, Hoboken, NJ.

Boehm, B., Valerdi, R., and Honour, E. (2008). The ROI of systems engineering: Some quantitative results for software-intensive systems, Systems Engineering, vol. 11, no. 3, Wiley Periodicals, Inc.

Brown, B.B. (1968). Delphi Process: A methodology used for the elicitation of opinions of experts, Rand Report No. P-3925, The RAND Corporation, Santa Monica, CA.

Brown, S. (2011). Inferences about linear correlation, [ sbrown/stat/correl.htm] accessed 19 Aug.

Browning, T.R. and Honour, E. (2008). Measuring the life-cycle value of enduring systems, Systems Engineering, vol. 11, no. 3, Wiley Interscience, Hoboken, NJ.

Cook, S. (2000). What the lessons learned from large, complex technical projects tell us about the art of systems engineering, INCOSE International Symposium, Minneapolis, MN.

DERA (1996). The MoD's Downey Procedures, DRA/LS(CS)/SYS_ENG/TG0/CMR/96/1.

Desu, M. M. and Raghavarao, D. (1990). Sample Size Methodology, Academic Press, Boston.

Elm, J., et al. (2008). A Survey of Systems Engineering Effectiveness, Special Report CMU/SEI-2008-SR-034, Carnegie Mellon Software Engineering Institute, Pittsburgh, PA.

Eshbach, O.W. (1952). Handbook of Engineering Fundamentals, John Wiley & Sons, New York.

Frank, M. (2000). Cognitive and personality characteristics of successful systems engineers, INCOSE International Symposium, Minneapolis, MN.

Frantz, W.F. (1995). The impact of systems engineering on quality and schedule: empirical evidence, NCOSE International Symposium, St. Louis, MO.

Gamgee (2006). Project management & systems engineering in the commercial environment now and the future, Systems Engineering Test & Evaluation Conference, Melbourne, SA.

Goode, H.H. and Machol, R.E. (1957). System Engineering: An Introduction to the Design of Large-Scale Systems, McGraw-Hill, New York.

Gordon, S.P. and Gordon, F.S. (2004). Deriving the quadratic regression equation using algebra, Mathematics and Computer Education, vol. 38, no. 3, Fall 2004.

Gruhl, W. (1992). Lessons learned, cost/schedule assessment guide, internal presentation, NASA Comptroller's office.

Hall, M.N. (1993). Reducing Longterm System Cost by Expanding the Role of the Systems Engineer, Proceedings of the 1993 International Symposium on Technology and Society, IEEE, Washington, DC.

Herbsleb, J., Carleton, A., Rozum, J. and Zubrow, D. (1994). Benefits of CMM-based Software Process Improvement: Initial Results, CMU/SEI-94-TR-13, Software Engineering Institute.

Honour, E.C. (1999). Characteristics of engineering disciplines, Proceedings of the 13th International Conference on Systems Engineering, University of Nevada, Las Vegas.

(2001). Optimising the value of systems engineering, Proceedings of the INCOSE International Symposium, Melbourne, Australia.

(2002a). Quantitative relationships in effective systems engineering, INCOSE_IL Conference, ILTAM, Haifa, Israel.

(2002b). Toward a mathematical theory of systems engineering management, INCOSE International Symposium, Las Vegas, NV.

(2003). Toward understanding the value of systems engineering, Conference on Systems Integration, Stevens Institute of Technology, Hoboken, NJ.

(2004). Understanding the value of systems engineering, Proceedings of the INCOSE International Symposium, Toulouse, France.

(2006a). A practical program of research to measure systems engineering return on investment (SE-ROI), INCOSE International Symposium, Orlando, FL.

(2006b). Gathering data to measure systems engineering return on investment (SE-ROI), Systems Engineering Test & Evaluation Conference, Melbourne, VIC.

(2007). Design of experiments as applied to systems engineering return on investment, Conference on Systems Engineering Research, Hoboken, NJ.

(2009). Demographics in measuring systems engineering return on investment (SE-ROI), INCOSE International Symposium, Singapore.

(2010a). Effective characterization parameters for measuring systems engineering, Conference on Systems Engineering Research, Hoboken, NJ.

(2010b). Systems engineering return on investment, INCOSE International Symposium, Chicago, IL.

(2011a). Improved correlation for systems engineering return on investment, Conference on Systems Engineering Research, Los Angeles, CA.

(2011b). Sizing systems engineering activities to optimize return on investment, INCOSE International Symposium, Denver, CO.

Honour, E.C. and Valerdi, R. (2006). Advancing an ontology for systems engineering to allow consistent measurement, Conference on Systems Engineering Research, Los Angeles, CA.

INCOSE (1996). Definition by the Terms and Definitions Working Group, accessed 30 May.

INCOSE (2010). Systems Engineering Handbook, v3.2, INCOSE-TP, San Diego, CA.

Kennedy, J.R. and Umphress, D.A. (2011). An agile systems engineering process, the missing link? CrossTalk, The Journal of Defense Software Engineering, Hill AFB, UT, May/June 2011.

Kerzner, H. (2006). Project Management, 9th Ed., John Wiley & Sons, Hoboken, NJ, p. 615.

Kludze, A.K. (2004). The impact of systems engineering on complex systems, Conference on Systems Engineering Research, University of Southern California, Los Angeles, CA.

Kolmogorov, A. (1941). Confidence limits for an unknown distribution function, Ann. Math. Stat., vol. 12.

Langenberg, I. and de Wit, F. Managing the right thing: risk management, Proceedings of the INCOSE International Symposium, Brighton, UK.

Likert, R. (1932). A technique for the measurement of attitudes, Archives of Psychology, 140.

Mar, B.L. and Honour, E.C. (2002). Value of systems engineering SECOE project report, Proceedings of the INCOSE International Symposium, Las Vegas, NV.

Miller, R., Floricel, S., and Lessard, D.R. (2000). The Strategic Management of Large Engineering Projects, MIT Press.

Ministry of Defence Smart Procurement Team (1999). The Acquisition Handbook, A Guide to Smart Procurement, Ed. 2, The Williams Lea Group, UK.

Pearson, K. (1901). On lines and planes of closest fit to systems of points in space, Philosophical Magazine, 2(6).

SE Applications Technical Committee (2000). Systems Engineering Application Profiles, Version 3.0, INCOSE.

Sheard, S. and Miller, C. (2000). The Shangri-La of ROI, Proceedings of the INCOSE International Symposium, Minneapolis, MN.

Smirnov, N.V. (1939). On the estimation of the discrepancy between empirical curves of distribution for two independent samples, Bulletin Moscow University, vol. 2.

Spiegel and Stephens (1999). Theory and Problems of Statistics, Third Edition, McGraw-Hill, p. 317ff.

Thomas, L.D. and Mog, R.A. (1997). A Quantitative Metric of System Development Complexity, Proceedings of the INCOSE International Symposium, Los Angeles, CA.

Thomas, L.D. and Mog, R.A. (1998). A Quantitative Metric of System Development Complexity: Validation Results, Proceedings of the INCOSE International Symposium, Vancouver, BC.

Valerdi, R., Miller, C., and Thomas, G. (2004). Systems engineering cost estimation by consensus, International Conference on Systems Engineering, Las Vegas, NV.

Valerdi, R. (2005). The constructive systems engineering cost model (COSYSMO), Dissertation, University of Southern California, Los Angeles, CA.

Appendix B Interview instruments

This appendix contains the formatted interview instruments as used in the research interviews. The interview data-recording sheets are 14 pages long, necessitating interview durations of 1.5 to 2 hours. The format of the data sheets is as follows:
- Page 1 provides basic information about the research and the interview process.
- Pages 2-3 gather program characterization data.
- Page 4 gathers program success data.
- Pages 5-8 gather systems engineering effort data.
- Page 9 provides a place to record qualitative information about lessons learned.
- The remaining pages provide definitions of terms to standardize the data collection.

Following the interview data sheets are two additional forms used at the beginning of each interview:
- SE-ROI Participant Information
- Participant Consent Form

All forms were provided in advance to each set of participants for a program team; however, the participants were cautioned not to attempt to create final answers prior to the interview. By providing the data sheets in advance, participants could familiarize themselves with the definitions and the types of data to be gathered. They could also prepare for the interview, including filling in tentative answers if desired.


Appendix C Developmental papers

This appendix contains a chronological portfolio of developmental papers published at various conferences while the SE-ROI research was in progress. The papers describe most of the research path followed. The primary body text of this thesis refers to the information in these papers in lieu of repeating it. The papers are included herein as partial proof of the research methodology and results, with the remainder of the proof contained in the body of this thesis.

C.1 Advancing an ontology
Honour, E.C. and Valerdi, R. (2006). Advancing an ontology for systems engineering to allow consistent measurement, Conference on Systems Engineering Research, Los Angeles, CA.

C.2 Practical program of research
Honour, E.C. (2006a). A practical program of research to measure systems engineering return on investment (SE-ROI), INCOSE International Symposium, Orlando, FL.

C.3 Gathering data
Honour, E.C. (2006b). Gathering data to measure systems engineering return on investment (SE-ROI), Systems Engineering Test & Evaluation Conference, Melbourne, VIC.

C.4 Design of experiments
Honour, E.C. (2007). Design of experiments as applied to systems engineering return on investment, Conference on Systems Engineering Research, Hoboken, NJ.

C.5 Demographics
Honour, E.C. (2009). Demographics in measuring systems engineering return on investment (SE-ROI), INCOSE International Symposium, Singapore.

C.6 Effective characterization parameters
Honour, E.C. (2010a). Effective characterization parameters for measuring systems engineering, Conference on Systems Engineering Research, Hoboken, NJ.

C.7 Systems engineering return on investment
Honour, E.C. (2010b). Systems engineering return on investment, INCOSE International Symposium, Chicago, IL.

C.8 Improved correlation
Honour, E.C. (2011a). Improved correlation for systems engineering return on investment, Conference on Systems Engineering Research, Los Angeles, CA.

C.9 Sizing systems engineering activities
Honour, E.C. (2011b). Sizing systems engineering activities to optimize return on investment, INCOSE International Symposium, Denver, CO.


IPP Learning Outcomes Report. Faculty member completing template: Greg Kim Ju, Marya Endriga (Date: 1/17/12) Page 1 IPP Learning Outcomes Report Program: Department: Psychology MA (General) Psychology Number of students enrolled in the program in Fall, 2011: 48 (Appendix A) Faculty member completing template:

More information

REGULATIONS AND CURRICULUM FOR THE MASTER S PROGRAMME IN INFORMATION ARCHITECTURE FACULTY OF HUMANITIES AALBORG UNIVERSITY

REGULATIONS AND CURRICULUM FOR THE MASTER S PROGRAMME IN INFORMATION ARCHITECTURE FACULTY OF HUMANITIES AALBORG UNIVERSITY REGULATIONS AND CURRICULUM FOR THE MASTER S PROGRAMME IN INFORMATION ARCHITECTURE FACULTY OF HUMANITIES AALBORG UNIVERSITY SEPTEMBER 2015 Indhold PART 1... 4 PRELIMINARY REGULATIONS... 4 Section 1 Legal

More information

Communication Problems in Global Software Development: Spotlight on a New Field of Investigation

Communication Problems in Global Software Development: Spotlight on a New Field of Investigation Communication Problems in Global Software Development: Spotlight on a New Field of Investigation Sébastien Cherry, Pierre N. Robillard Software Engineering Research Laboratory, École Polytechnique de Montréal

More information

The Influence of Human Resource Management Practices on the Retention of Core Employees of Australian Organisations: An Empirical Study

The Influence of Human Resource Management Practices on the Retention of Core Employees of Australian Organisations: An Empirical Study The Influence of Human Resource Management Practices on the Retention of Core Employees of Australian Organisations: An Empirical Study Janet Cheng Lian Chew B.Com. (Hons) (Murdoch University) Submitted

More information

PsyD Psychology (2014 2015)

PsyD Psychology (2014 2015) PsyD Psychology (2014 2015) Program Information Point of Contact Marianna Linz ([email protected]) Support for University and College Missions Marshall University is a multi campus public university providing

More information

Planning your research

Planning your research Planning your research Many students find that it helps to break their thesis into smaller tasks, and to plan when and how each task will be completed. Primary tasks of a thesis include - Selecting a research

More information

School of Advanced Studies Doctor Of Education In Educational Leadership With A Specialization In Educational Technology. EDD/ET 003 Requirements

School of Advanced Studies Doctor Of Education In Educational Leadership With A Specialization In Educational Technology. EDD/ET 003 Requirements School of Advanced Studies Doctor Of Education In Educational Leadership With A Specialization In Educational Technology The mission of the Doctor of Education in Educational Leadership degree program

More information

Engineering Management

Engineering Management Documentation for the Accreditation of the Study Programme: Novi Sad, 2012 Table of Contents: Standard 00. Introduction... 7 Standard 01. Structure of the Study Programme... 8 Standard 02. Purpose of the

More information

Advertising Research

Advertising Research Second Edition Advertising Research THEORY AND PRACTICE Joel J. Davis School of Journalism & Media Studies, San Diego State University Prentice Hall Boston Columbus Indianapolis New York San Francisco

More information

RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY

RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY I. INTRODUCTION According to the Common Core Standards (2010), Decisions or predictions are often based on data numbers

More information

Measurement and Metrics Fundamentals. SE 350 Software Process & Product Quality

Measurement and Metrics Fundamentals. SE 350 Software Process & Product Quality Measurement and Metrics Fundamentals Lecture Objectives Provide some basic concepts of metrics Quality attribute metrics and measurements Reliability, validity, error Correlation and causation Discuss

More information

Bachelor's Degree in Business Administration and Master's Degree course description

Bachelor's Degree in Business Administration and Master's Degree course description Bachelor's Degree in Business Administration and Master's Degree course description Bachelor's Degree in Business Administration Department s Compulsory Requirements Course Description (402102) Principles

More information

Failed By The System

Failed By The System Failed By The System The views of young care leavers on their educational experiences Barnardo s There were too many promises made but not many fulfilled, Care Leaver aged 16 Policy and Research Page 1

More information

Doctor of Education Program Handbook

Doctor of Education Program Handbook Doctor of Education Program Handbook Student Copy Revised Summer 2013 1 Introduction The Johns Hopkins University (JHU) School of Education (SOE) attracts the most innovative and progressive scholars without

More information

Measurement Information Model

Measurement Information Model mcgarry02.qxd 9/7/01 1:27 PM Page 13 2 Information Model This chapter describes one of the fundamental measurement concepts of Practical Software, the Information Model. The Information Model provides

More information

STAGE 1 COMPETENCY STANDARD FOR PROFESSIONAL ENGINEER

STAGE 1 COMPETENCY STANDARD FOR PROFESSIONAL ENGINEER STAGE 1 STANDARD FOR PROFESSIONAL ENGINEER ROLE DESCRIPTION - THE MATURE, PROFESSIONAL ENGINEER The following characterises the senior practice role that the mature, Professional Engineer may be expected

More information

FACTORS AFFECTING SUCCESSFUL IMPLEMENTATION OF NICHE MARKETING IN TEHRAN METROPOLIS

FACTORS AFFECTING SUCCESSFUL IMPLEMENTATION OF NICHE MARKETING IN TEHRAN METROPOLIS Revista Empresarial Inter Metro / Inter Metro Business Journal Spring 2012 / Vol. 8 No. 1 / p. 33 FACTORS AFFECTING SUCCESSFUL IMPLEMENTATION OF NICHE MARKETING IN TEHRAN METROPOLIS By Mehdi Noursina Management

More information

SURVEY REPORT: IMPROVING INTEGRATION OF PROGRAM MANAGEMENT AND SYSTEMS ENGINEERING

SURVEY REPORT: IMPROVING INTEGRATION OF PROGRAM MANAGEMENT AND SYSTEMS ENGINEERING 2013 SURVEY REPORT: IMPROVING INTEGRATION OF PROGRAM MANAGEMENT AND SYSTEMS ENGINEERING Results of a Joint Survey by PMI and INCOSE Whitepaper presented at the 23 rd INCOSE Annual International Symposium,

More information

Contents. viii. 4 Service Design processes 57. List of figures. List of tables. OGC s foreword. Chief Architect s foreword. Preface.

Contents. viii. 4 Service Design processes 57. List of figures. List of tables. OGC s foreword. Chief Architect s foreword. Preface. iii Contents List of figures List of tables OGC s foreword Chief Architect s foreword Preface Acknowledgements v vii viii 1 Introduction 1 1.1 Overview 4 1.2 Context 4 1.3 Purpose 8 1.4 Usage 8 2 Management

More information

The Concept of Project Success What 150 Australian project managers think D Baccarini 1, A Collins 2

The Concept of Project Success What 150 Australian project managers think D Baccarini 1, A Collins 2 The Concept of Project Success What 150 Australian project managers think D Baccarini 1, A Collins 2 1 Curtin University of Technology, Perth, Western Australia 2 Broad Construction Services, Perth, Western

More information

Fundamentals of Measurements

Fundamentals of Measurements Objective Software Project Measurements Slide 1 Fundamentals of Measurements Educational Objective: To review the fundamentals of software measurement, to illustrate that measurement plays a central role

More information

An Exploratory Investigation of the Sales Forecasting Process in the Casual Theme and Family Dining Segments of Commercial Restaurant Corporations

An Exploratory Investigation of the Sales Forecasting Process in the Casual Theme and Family Dining Segments of Commercial Restaurant Corporations An Exploratory Investigation of the Sales Forecasting Process in the Casual Theme and Family Dining Segments of Commercial Restaurant Corporations Yvette Nicole Julia Green Dissertation submitted to the

More information

Delivered in an Online Format. Revised November 1, 2014. I. Perspectives

Delivered in an Online Format. Revised November 1, 2014. I. Perspectives 1 Prospectus of the Ed.D. in Curriculum and Instruction Delivered in an Online Format Revised November 1, 2014 I. Perspectives The online Doctor of Education (Ed.D.) in Curriculum is a graduate degree

More information

California State University, Los Angeles Department of Sociology. Guide to Preparing a Masters Thesis Proposal

California State University, Los Angeles Department of Sociology. Guide to Preparing a Masters Thesis Proposal California State University, Los Angeles Department of Sociology Guide to Preparing a Masters Thesis Proposal Overview The following few pages provide you with guidelines for writing a Masters thesis proposal.

More information

C. Wohlin, "Is Prior Knowledge of a Programming Language Important for Software Quality?", Proceedings 1st International Symposium on Empirical

C. Wohlin, Is Prior Knowledge of a Programming Language Important for Software Quality?, Proceedings 1st International Symposium on Empirical C. Wohlin, "Is Prior Knowledge of a Programming Language Important for Software Quality?", Proceedings 1st International Symposium on Empirical Software Engineering, pp. 27-36, Nara, Japan, October 2002.

More information

How to gather and evaluate information

How to gather and evaluate information 09 May 2016 How to gather and evaluate information Chartered Institute of Internal Auditors Information is central to the role of an internal auditor. Gathering and evaluating information is the basic

More information

Evolving a New Software Development Life Cycle Model SDLC-2013 with Client Satisfaction

Evolving a New Software Development Life Cycle Model SDLC-2013 with Client Satisfaction International Journal of Soft Computing and Engineering (IJSCE) ISSN: 2231-2307, Volume-3, Issue-1, March 2013 Evolving a New Software Development Life Cycle Model SDLC-2013 with Client Satisfaction Naresh

More information

Criteria for the Accreditation of. MBM Programmes

Criteria for the Accreditation of. MBM Programmes Criteria for the Accreditation of MBM Programmes 1 2 1 INTRODUCTION Framework & Eligibility 1.1 This document sets out the criteria for MBM (Masters in Business & Management) programme accreditation. While

More information

How to Develop a Research Protocol

How to Develop a Research Protocol How to Develop a Research Protocol Goals & Objectives: To explain the theory of science To explain the theory of research To list the steps involved in developing and conducting a research protocol Outline:

More information

How To Evaluate The Performance Of The Process Industry Supply Chain

How To Evaluate The Performance Of The Process Industry Supply Chain Performance Evaluation of the Process Industry Supply r Chain: Case of the Petroleum Industry in India :.2A By Siddharth Varma Submitted in fulfillment of requirements of the degree of DOCTOR OF PHILOSOPHY

More information

DOCTOR OF PHILOSOPHY DEGREE. Educational Leadership Doctor of Philosophy Degree Major Course Requirements. EDU721 (3.

DOCTOR OF PHILOSOPHY DEGREE. Educational Leadership Doctor of Philosophy Degree Major Course Requirements. EDU721 (3. DOCTOR OF PHILOSOPHY DEGREE Educational Leadership Doctor of Philosophy Degree Major Course Requirements EDU710 (3.0 credit hours) Ethical and Legal Issues in Education/Leadership This course is an intensive

More information

Integrated Risk Management:

Integrated Risk Management: Integrated Risk Management: A Framework for Fraser Health For further information contact: Integrated Risk Management Fraser Health Corporate Office 300, 10334 152A Street Surrey, BC V3R 8T4 Phone: (604)

More information

Western European Insulin Delivery Devices. Market M62E-52

Western European Insulin Delivery Devices. Market M62E-52 Western European Insulin Delivery Devices Market M62E-52 Table of Contents Chapter 1 Overview and Executive Summary Overview 1-1 Introduction 1-1 Diabetes 1-1 Insulin Therapy 1-2 Insulin Delivery Devices

More information

A Privacy Officer s Guide to Providing Enterprise De-Identification Services. Phase I

A Privacy Officer s Guide to Providing Enterprise De-Identification Services. Phase I IT Management Advisory A Privacy Officer s Guide to Providing Enterprise De-Identification Services Ki Consulting has helped several large healthcare organizations to establish de-identification services

More information

Responsibility I Assessing Individual and Community Needs for Health Education

Responsibility I Assessing Individual and Community Needs for Health Education CHE Competencies Starting in the early 1990s, three national professional organizations the Society for Public Health Education, the American Association for Health Education, and the American Alliance

More information

Evolving a Ultra-Flow Software Development Life Cycle Model

Evolving a Ultra-Flow Software Development Life Cycle Model RESEARCH ARTICLE International Journal of Computer Techniques - Volume 2 Issue 4, July - Aug Year Evolving a Ultra-Flow Software Development Life Cycle Model Divya G.R.*, Kavitha S.** *(Computer Science,

More information

Financial Advisor Series ESSENTIALS OF MULTILINE INSURANCE PRODUCTS

Financial Advisor Series ESSENTIALS OF MULTILINE INSURANCE PRODUCTS Financial Advisor Series ESSENTIALS OF MULTILINE INSURANCE PRODUCTS Kirk S. Okumura FA222.02.1 This publication is designed to provide accurate and authoritative information about the subject covered.

More information

U.S. DEPARTMENT OF THE INTERIOR SENIOR LEVEL AND SCIENTIFIC AND PROFESSIONAL PERFORMANCE AGREEMENT AND APPRAISAL SYSTEM.

U.S. DEPARTMENT OF THE INTERIOR SENIOR LEVEL AND SCIENTIFIC AND PROFESSIONAL PERFORMANCE AGREEMENT AND APPRAISAL SYSTEM. ` U.S. DEPARTMENT OF THE INTERIOR SENIOR LEVEL AND SCIENTIFIC AND PROFESSIONAL PERFORMANCE AGREEMENT AND APPRAISAL SYSTEM Table of Contents Page Section I. Authority and Purpose 1 Section II. Coverage

More information

Benchmarking Software Quality With Applied Cost of Quality

Benchmarking Software Quality With Applied Cost of Quality Benchmarking Software Quality With Applied Cost of Quality Cost of Quality has remained theoretical in many treatments despite its powerful relevance in software quality. A survey-based approach makes

More information

Management. Project. Software. Ashfaque Ahmed. A Process-Driven Approach. CRC Press. Taylor Si Francis Group Boca Raton London New York

Management. Project. Software. Ashfaque Ahmed. A Process-Driven Approach. CRC Press. Taylor Si Francis Group Boca Raton London New York Software Project Management A Process-Driven Approach Ashfaque Ahmed CRC Press Taylor Si Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor St Francis Croup, an Informa business

More information

PMI Risk Management Professional (PMI-RMP ) - Practice Standard and Certification Overview

PMI Risk Management Professional (PMI-RMP ) - Practice Standard and Certification Overview PMI Risk Management Professional (PMI-RMP ) - Practice Standard and Certification Overview Sante Torino PMI-RMP, IPMA Level B Head of Risk Management Major Programmes, Selex ES / Land&Naval Systems Division

More information

Doctor of Philosophy Dissertation Guidelines

Doctor of Philosophy Dissertation Guidelines Department of International Relations University of Malta Academic year 2015-2016 Doctor of Philosophy Dissertation Guidelines Introduction The Department of International Relations runs a Ph.D. by research

More information

University of North Dakota Department of Electrical Engineering Graduate Program Assessment Plan

University of North Dakota Department of Electrical Engineering Graduate Program Assessment Plan Graduate Program Assessment Plan Mission: The mission of the master of science program is to promote critical thinking and creative skills based on the theory, principles, and techniques of electrical.

More information