A Change Impact Analysis Tool for Software Development Phase

, pp. 245-256 http://dx.doi.org/10.14257/ijseia.2015.9.9.21 A Change Impact Analysis Tool for Software Development Phase Sufyan Basri, Nazri Kama, Roslina Ibrahim and Saiful Adli Ismail Advanced Informatics School, Universiti Teknologi Malaysia, Malaysia msufyan4@live.utm.my, {mdnazri, iroslina.kl, saifuladli}@utm.my Abstract Accepting too many software change requests could contribute to expense and delay in project delivery. On the other hand rejecting the changes may increase customer dissatisfaction. Software project management might use a reliable estimation on potential impacted artifacts to decide whether to accept or reject the changes. In software development phase, an assumption that all classes in the class artifact are completely developed is impractical compared to software maintenance phase. This is due to some classes in the class artifact are still under development or partially developed. This paper extends our previous works on developing an impact analysis approach for the software development phase in automating the approach which we call it CIAT (Change Impact Analysis Tool). The significant achievements of the tool are demonstrated through an extensive experimental validation using several case studies. The experimental analysis shows improvement in the accuracy over current impact analysis results. Keywords: Software development, change impact analysis, change effort estimation, impact analysis, effort estimation 1. Introduction It is important to manage the changes in the software to meet the evolving needs of the customer and hence, satisfy them [1]. Accepting too many changes causes delay in the completion and it incurs additional cost. Rejecting the changes may cause dissatisfaction to the customers. Thus, it is important for the software project manager to make effective decisions when managing the changes during software development. One type of information that helps to make the decision is the prediction of the number of classes affected by the changes. This prediction can be done by performing change impact analysis [2]. Impact analysis is a process of identifying potential consequences of change, or estimating what needs to be modified to accomplish a change. [3-5]. The motivation behind the impact analysis activity is to identify software artifacts (i.e., requirement, design, class and test artifacts) that are potentially to be affected by a change. Impact analysis has been widely used in software maintenance phase rather than software development phase [6-9]. This is because the current developed impact analysis solutions assume that all classes or class artifacts are completely developed. Most solutions use dynamic analysis techniques [10-13] for their impact analysis implementation. Based on our previous work [6-9], we have demonstrated that there is a significant difference change impact analysis implementation between software maintenance phase and software development phase. The existence of partially developed classes in the software development phase causes problem to the current implementation of dynamic analysis technique to be used in the software development phase. Basically, the dynamic analysis technique analyses the method execution paths that are generated from source code through reverse engineering [10-13]. This analysis tends to produce inaccurate results because some method execution paths that involve partially developed classes [7,8] are not visible due to they have yet to be implemented. The inaccuracy of the ISSN: 1738-9984 IJSEIA Copyright c 2015 SERSC

generated paths from the dynamic analysis technique is indirectly led to inaccuracy of impact analysis results. This paper extends our previous works on change impact analysis approach for software development phase [6-9]. In brief, the previous approach integrates between the static analysis and dynamic analysis techniques to perform change impact analysis during the software development phase. In this paper, we have extended our work by developing a prototype tool to support the previous developed approach. This paper will give more explanation on the developed tool rather than the concept of change impact analysis approach itself. However, detail information on the change impact analysis approach can be referred to [6-9]. This paper is laid out as follow. Section 2 presents related work whereas Section 3 shows the prototype tool main screen. Section 4 and Section 5 explains our evaluation procedure and its results. Finally, Section 6 describes conclusion and future works. 2. Related Work There are two categories of impact analysis techniques which are static analysis technique and dynamic analysis technique. The static analysis technique develops a set of potential impacted classes by analyzing program static information that is generated from software artifacts (i.e., requirement, design, class and test artifacts). Conversely for the dynamic analysis technique, this technique develops a set of potential impacted classes by analyzing program dynamic information or executing code. 2.1. Static Analysis There are two most related current static analysis techniques to the new proposed approach which are the Use Case Maps (UCM) technique [14,15] and the class interactions prediction with impact prediction filters (CIP-IPF) technique [16,17]. The UCM technique [14] performs impact analysis on the functional requirements and the high level design model. This technique assumes that all the functional requirements and the high level design model are completely developed. This technique has two limitations which are; (1) there is no traceability technique used from the functional requirements and the high level design artifacts to the actual source codes. This technique only makes an assumption that the content of these two artifacts that is represented using the UCM model are reflected to the class artifacts. Any affected elements in the UCM model are indirectly reflected to the affected class artifacts, and (2) there is no dynamic analysis or source code analysis involved in this technique. Based on the precept that some of the effect of a change from a class to other class(es) may only be visible through dynamic or behavior analysis of the changed class [18,19], therefore results from this technique tend to miss some actual impacted classes. Next, the CIP-IPF technique [16, 17] uses a class interactions prediction as a model for detecting impacted classes. This technique has its strength compared to the UCM technique. Comparing to the UCM technique, this technique has traceability link detection between the requirement artifacts and the class artifacts feature. This feature is used to navigate impact of changes at the requirement level to the class artifacts. 2.2. Dynamic Analysis For the dynamic analysis techniques, we have selected two most related works to our research which are the Influence Mechanism technique [11, 13] and the Path Impact technique [12]. Basically, these techniques predict the impact set (classes or methods) based on method level analysis. The Influence Mechanism technique [11, 13] introduces the Influence Graph (IG) as a model to identify impacted classes. This technique uses the class artifacts as a source of analysis and assumes that the class artifacts are completely developed. There is a 246 Copyright c 2015 SERSC

limitation of this technique in which there is no formal mapping or traceability process from requirement artifacts or design artifacts to class artifacts. This process is important in impact analysis process as changes not only come from class artifacts but it also comes from design and/or requirement artifacts. Since design and requirement artifacts do interact among them vertically (between two different artifacts of a same type) and horizontally (between requirement and design artifacts), changes that occur to them could contribute to different affected class artifacts. In some circumstances, focusing on the source code analysis may not be able to detect those affected classes. The Path Impact technique [12] uses the Whole Path DAG (Directed Acyclic Graph) model as a model to identify impacted classes. The concept of implementation for this technique is almost similar to the Influence Mechanism technique as this technique uses the class artifacts as a source of analysis and assumes that the class artifacts are completely developed. Also, this technique performs a preliminary analysis prior to performing a detail analysis. There are two limitations of this technique; (1) the implementation is time consuming as the technique opens to a huge number of data when the analysis goes to a large application, (2) there is no formal mapping process from requirement artifacts or design artifacts to class artifacts. As described earlier, this process is important in impact analysis process as changes not only come from class artifacts but also from design and/or requirement artifacts. 3. The CIAT Change Impact Analysis Tool (will called CIAT henceforth) has two main modules, which are Class Interaction Prediction (CIP) Module and Impact Analysis Module as in Figure 1. Figure 1. System Overview The CIP module produces a CIP model which is a traceability model that depicts the interactions between all the software artifacts including requirements, design and classes. It will be used later to perform static change impact analysis and find the impacts of the change. The CIP model can be developed in any format, but for the purpose of this project we developed it using a consistent XML format. It consists of two sections, first section contains the project information and the second section contains the artifacts information. For the impact analysis module, the process begins with performing static impact analysis to find the direct and indirect affected classes. The static impact analysis will find the direct impacted classes which are the first layer of classes affected by a particular changes requirement without vertical traceability relations consideration. Then indirect impacted classes will be identified by complete traceability search through the CIP interactions to find all related classes to the changed requirement. The next process in the impact analysis module is performing vertical traceability to find the actual interactions between the fully developed classes, and improve CIP model by adding underestimated results and removing overestimated results from the CIP model. Then dynamic impact analysis is performed on the improved CIP model as in static impact analysis. Copyright c 2015 SERSC 247

Due to space limitation, this paper concentrates on impact analysis module only. Basically, the interaction between these two modules is done through an interface file named CIP. This file will be exported from Class Interactions Prediction Module and imported by Impact Analysis Module. In other words the output of the first CIP Module acts as an input for impact analysis module. Overall functionalities of this tool are; (1) To import the CIP file to the system database, (2) To acquire the change request information, and (3) To perform static and dynamic impact analysis. See Figure 2: 3.1. Implementation Steps Figure 2. Overall Functionalities of CIAT The following sub-sections explain briefly the implementation steps of this tool as shown in Figure 3: 3.1.1. Import CIP Figure 3. Main Page Form The purpose of this procedure is to import the generated CIP model file. The CIP model file can be an XML (.xml) file or a CIP (.cip) file. Both should have been developed according to the CIP descriptions in XML format. See Figure 4 below: Figure 4. Import CIP Form 248 Copyright c 2015 SERSC

3.1.2. Acquire Change Request The purpose of this step is to get the required change request information. Acquiring change request will be performed in Change Request Form (see Figure 5). There are six inputs in Change Request Form; (1) Identification Number - which is filled automatically, (2) Change Requester, (3) Requested Change, (4) Change Priority, (5) Affected Requirements, and (6) Comments (Optional). 3.1.3. Perform Impact Analysis Figure 5. Change Request Form This step identifies a set of potential impacted classes using the class interactions prediction according to change request specification. The outcome of this process is a set of initial impacted classes. There are two main sub-steps which are Class Dependency Filtration (CDF) level and the Method Dependency Filtration (MDF) level. a) Perform Class Dependency Filtration Level In this level, the class dependency filtration (CDF) process will be performed on the static impact analysis results as in Figure 6. The CDF level implements the static analysis on the initial set of potential impacted classes produced by the impact analysis process. This implementation is important by the fact that some interaction links in the initial set of potential impacted classes have no change impact value. The interaction link that has no change impact value means that if change occurs to one side of two interacting classes, the other class will not be affected. This is because the other class does not require the changed class for its implementation. Copyright c 2015 SERSC 249

Figure 6. Sample of CDF Filtration on Impact Analysis b) Perform Method Dependency Filtration (MDF) The MDF level performs another filtration on the filtered set of potential impacted classes produced by the CDF level. In brief, all method execution paths from the filtered set of potential impacted classes will be extracted and further analyzed to eliminate false impacted classes. In this work, we use the backward and forward analysis technique from [13]. This level can be considered as the dynamic analysis level as it uses the method execution paths analysis to identify impacted classes. To recap we want to make a note that there is one main disadvantage of the current dynamic impact analysis approaches from the software development phase perspective. The disadvantage is all approaches do not consider or include partially developed class analysis in its implementation. This occurs because of all classes in the software maintenance phase have been completely developed or fully developed. Thus, it is not important for these techniques to have the partially developed class analysis. From the dynamic impact analysis implementation in the software development phase, the inclusion of partially developed class analysis is an important feature. This is due to in the software development phase which some classes in the class artifacts are still under construction or partly developed exist. This inclusion is required to ensure the accuracy of the extracted or generated method execution paths from class artifacts. This accuracy indirectly contributes to the accuracy of the set of potential impacted classes results. Figure 7 shows the dynamic impact analysis output. 250 Copyright c 2015 SERSC

4. Evaluation Strategy Figure 7. Sample of Dynamic Impact Analysis The ultimate aim of the prototype development is to answer a question of Does the developed prototype tool gives an acceptable accuracy of impact analysis results than the selected current impact analysis techniques? To answer this question, we have compared the accuracy of the prototype results with the selected current impact analysis approaches. The selected approaches are; (1) the Class Interactions Prediction with Impact Prediction Filters (CIP-IPF) approach [16-17], and (2) the Path Impact approach [11]. There are five evaluation attributes used; (1) subject and case study, (2) software development process, (3) evaluation procedure, (4) evaluation metrics, and (5) hypothesis. 4.1. Subject and Case Study Three groups of final year master students of software engineering course in our school were selected. Each group composes five to six members that playing different roles in the software development activity. We have asked these groups to perform impact analysis at several specific phases based on the issued change requests. The groups used the selected three impact analysis approaches which are the CIP-IPF approach [16, 17], the Path-Impact approach [11] and our previous approach using the developed prototype tool. The subjects were given a preliminary guideline and briefing on these techniques prior to the experiment. The guideline includes thorough technique explanations and example of its implementation. 4.2. Software Development Process Several change requests were issued during the development phase of a software i.e., requirement phase, design phase, coding phase. In this experiment we have selected the Waterfall software development model [20]. To note, we aware of the latest software development model like Agile model. We will take this model in our future work. 4.3. Evaluation Procedure There are four steps in the procedure: Step 1- change request issuance: There are fifteen change requests have been issued to all groups that reflects to fifteen change requests to three software development project. In total, there were forty five change requests. Copyright c 2015 SERSC 251

Step 2- potential impacted class estimation: All groups were required to estimate potential impacted classes using the three selected impact analysis approaches. Step 3- actual impacted class identification: All groups were required to identify actual set of impacted classes through actual change implementation. Step 4- accuracy of the potential impacted class measurement: The comparison of results between step 2 and step 3 using a set of evaluation metrics were conducted. 4.4. Evaluation Metrics The following evaluation metrics [18] were used. Briefly, each prediction of impacted classes by three approaches was categorized according to four numbers: Not Predicting and Not Changing (NP-NC): number of pairs of classes correctly predicted to not be changing; Predicting and Not Changing (P-NC): number of pairs incorrectly predicted to be changing; Not Predicting and Changing (NP-C): number of classes incorrectly predicted to not be changing; and Predicting and Changing (P-C): number of classes correctly predicted to be changing Based on these numbers, we further used to calculate the following values [19, 21]; (1) Completeness value - the ratio of the actual class interactions or impacted classes that were predicted, (2) Correctness value - the ratio of the predicted class interactions that were actually interacting or impacted classes that were actually impacted, and (3) Kappa value [18] - this value reflects the accuracy or the prediction (0 is no better than random chance, 0.4-0.6 is moderate agreement, 0.6-0.8 is substantial agreement, and 0.8-1 is almost perfect agreement [14,19]). 4.5. Hypotheses A hypothesis that investigates the effectiveness of the CIAT is developed. The CIAT represents our developed tool whereby the selected current impact analysis approaches represent the independent technique (CIP-IPF- static analysis approach only; Path Impactdynamic analysis approach only). If the combination is not effective, H 0 is accepted: H 0 : CIAT does not give higher accuracy of impact analysis results than the selected current techniques results. H a : CIAT gives higher accuracy of impact analysis results than the selected current evaluation results. An Independent T-Test statistical analysis was used to validate the hypotheses. In the Stage 1, we compared Means results between the CIP-IPF approach and the CIAT whereas in the Stage 2, Means results of Path Impact technique and CIAT are compared. 5. Evaluation Results Table 1 below shows the summary of T-Test results. Table 1. T-test Results Stage Technique Means Results Analysis Stage 1 CIP-IPF 0.7927 CIAT 0.9060 Stage 2 Path Impact 0.7773 CIAT 0.9060 252 Copyright c 2015 SERSC

5.1. Stage 1 Analysis: The CIP-IPF Technique vs. CIAT The results show that CIAT means value is 0.9060 and the CIP-IPF approach value is 0.7927. This indirectly shows that CIAT value is higher than the CIP-IPF approach. Thus, the values reject the null hypothesis (H 0 : CIAT does not improve on the CIP-IPF approach results) and accept the alternate hypothesis (H a : CIAT approach gives higher accuracy of impact analysis results than the CIP-IPF approach). 5.2. Stage 2 Analysis: The Path Impact Technique vs. CIAT The results show CIAT value is 0.9060 and the Path Impact approach value is 0.7773. This shows that the CIAT value is higher than the Path Impact approach. Thus, the values reject the null hypothesis (H 0 : CIAT does not give higher accuracy of impact analysis results than the Path Impact approach) and accept the alternate hypothesis (H a : CIAT gives higher accuracy of impact analysis results than the Path Impact approach). 6. Conclusion and Future Work The main contribution of this paper is an automated prototype tool that is developed based on our previous work on change impact analysis for software development phase. To recap, the uniqueness of the approach or the prototype tool is that the element of integration between static and dynamic analysis techniques. This integration intends to overcome the challenges on impact analysis implementation using static analysis and dynamic analysis techniques. For the future works, we plan to extend the tool implementation from agile methodology perspective instead of waterfall methodology. Besides that, we will extend the experiment by including a performance metric. This metric measures the amount of time that is used to identify a set of potential impacted classes according to a particular change request compared to manual approach. Acknowledgements We would like to thank Universiti Teknologi Malaysia (UTM) and Ministry of Higher Education (MoHE) Malaysia for their financial support under the Vote No 4L067. Also to final year post-graduate students of software engineering course at the Advanced Informatics School, Universiti Teknologi Malaysia who have been involved in the study. References [1] S. L. Pfleeger and S. A. Bohner, A framework for Software Maintenance Metrics, Proceedings of the International Conference on Software Maintenance, (1990), pp. 320-327. [2] K. H. Bennet and V. T. Rajlich, Software Maintenance and Evolution: A Roadmap, Proceedings of the International Conference on the Future of Sofware Engineering, (2000), pp. 75-87. [3] G. Kotonya and I. Somerville, Requirements Engineering: Processes and Techniques, John Wiley & Sons Ltd, (1998). [4] R. S. Arnold and S. A. Bohner, Impact analysis-towards a framework for comparison, Software Maintenance, (1993). CSM-93, Proceedings, Conference, (1993) September 27-30, pp. 292-301. [5] G. Antoniol, G. Canfora and G. Casazza, Information Retrieval Models for Recovering Traceability Links between Source Code and Documentation, Proceedings of the International Conference on Software Maintenance, (2000), pp. 40-44. [6] N. Kama, A Change Impact Analysis Approach for the Software Development Phase: Evaluating an Integration Approach, International Journal of Software Engineering and Its Application, vol. 7, no. 2, (2013) March. [7] N. Kama, Integrated change impact analysis approach for the software development phase, International Journal of Software Engineering and its Applications, vol. 7, no. 2, (2013) March, pp. 293-304. [8] N. Kama Change impact analysis for the software development phase: State-of-the-art, International Journal of Software Engineering and its Applications, vol. 7, no. 2, (2013) March, pp. 235-244. Copyright c 2015 SERSC 253

[9] N. Kama and S. Basri, Considering Partially Developed Artifacts in Change Impact Analysis Implementation, Journal of Software, vol. 9, no. 8, (2014), pp. 2174-2179. [10] B. Breech, M. Tegtmeyer and L. Pollock, Integrating Influence Mechanisms into Impact Analysis for Increased Precision, Proceedings of the 22nd International Conference on Software Maintenance, (2006), pp. 55-65. [11] J. Law and G. Rothermal, Whole Program Path-Based Dynamic Impact Analysis, Proceedings of the 25th International Conference on Software Engineering (ICSE 2003), (2003), pp. 308-318. [12] B. Breech, A. Danalis, S. Shindo and L. Pollock, Online Impact Analysis via Dynamic Compilation Technology, Proceeding of the 20th IEEE International Conference on Software Maintenance, Washington, US, (2004) September 11-17. [13] J. Law and G. Rothermel, Incremental Dynamic Impact Analysis for Evolving Software Systems, Proceeding of the 14th International Symposium on Software Reliability Engineering, Washington, US, (2003) November 17-20. [14] J. Hassine, J. Rilling, J. Hewitt and R. Dssouli, Change Impact Analysis for Requirement Evolution Using Use Case Maps, Proceeding of the 8th International Workshop on Principles of Software Evolution, Washington, US, (2005) September 5. [15] M. Shiri, J. Hassine and J. Rilling, A Requirement Level Modification Analysis Support Framework, Proceeding of the 3rd International IEEE Workshop on Software Evolvability, Montreal, Canada, (2007) October 1. [16] N. Kama, T. French and M. Reynolds, Design Patterns Consideration in Class Interactions Prediction Development, International Journal of Advanced Science and Technology, vol. 28, no. 6, (2011). [17] N. Kama and F. Azli, Requirement Level Impact Analysis with Impact Prediction Filter, Proceeding of the 4th International Conference on Software Technology and Engineering, Phuket Thailand, (2012) September 1-2. [18] M. Lindvall and K. Sandahl, How Well Do Experienced Software Developers Predict Software Changes, Journal of Systems and Software, vol. 43, no. 1, (1998). [19] J. Cohen, A Coefficient of Agreement for Nominal Scales, Journal of Educational and Psychological Measurement, vol. 20, no. 1, (1960). [20] I. Sommerville, Software Engineering (7th Edition), Pearson Education, (2008). [21] J. R. Landis and G. G. Koch. The Measurement of Observer Agreement for Categorical Data, Journal of Biometrics, vol. 33, no. 1, (1977). Authors Sufyan Basri, obtained his first degree at Universiti Teknologi Malaysia (UTM) in Electrical Engineering (Mechatronic) in 2001, second degree in Real Time Software Engineering at the same university in 2003. she is currently pursuing his PhD at UTM and his research interest includes Software Engineering, Change Management and Effort Estimation. Nazri Kama, obtained his first degree at Universiti Teknologi Malaysia (UTM) in Management Information System in 2000, second degree in Real Time Software Engineering at the same university in 2002 and his PhD at The University of Western Australia (UWA) in Software Engineering in 2010. He has a considerable experience in a wide range on Software Engineering area. His major involvement is in software development. 254 Copyright c 2015 SERSC

Roslina Ibrahim, is currently attached to Advanced Informatics School (AIS) at Universiti Teknologi Malaysia, Kuala Lumpur. She holds a PhD in information science from Universiti Kebangsaan Malaysia and master degree in computer science. Her research interests are user acceptance of information systems, design and developments of educational games and usability evaluation. She has 15 years experiences in teaching several IT related courses both for undergraduate and postgraduate students, including IT and multimedia, Web development, audio and video in multimedia, instructional multimedia design, IT Project Management, User Experience and Interaction Design and Research methodology. Currently, she is an Acting Academic Manager for Research Students with responsibilities of managing academic matters for Doctor of Philosophy and Master of Philosophy students at AIS. Saiful Adli Ismail, obtained his first degree at Universiti Teknologi Malaysia (UTM) in Industrial Computing in 1997, second degree in Real Time Software Engineering at the same university in 2000 and currently continues his PhD at Universiti Teknologi Malaysia. Copyright c 2015 SERSC 255

256 Copyright c 2015 SERSC