An Approach for Reverse Engineering through Complementary Software Views

Niranjan Kumar
SIEMENS Technology and Services Pvt. Ltd., #84, KEONICS, Electronics City, Bangalore 560 100, India.
e-mail: Niranjan.kumar@siemens.com, kniranjansingh@gmail.com

Abstract. Reverse engineering is the process of retrieving missing software artifacts from existing source code. Static and dynamic information are important for understanding an existing software system, but they are not sufficient to obtain a high-level view of the system. This paper presents a methodology to synthesize state and communication diagrams from static and dynamic information. The purpose of this paper is to assist the reverse engineering process through complementary software views.

Keywords: Reverse engineering, Static analysis, Dynamic analysis, Design recovery.

1. Introduction

Software documentation is essential to understand the structure and behavior of an application. Software development starts with sound documentation and design, but over time the design documents and other software artifacts are neither updated nor well maintained. Understanding the existing system is essential to maintain it and to evolve it with new requirements. Reverse engineering recovers the missing software documentation, in the form of an abstract UML model, from existing source code. This enables developers to gain a better understanding of the existing source code of the system [1].

Most currently available reverse engineering tools focus on either the static or the dynamic aspects of the software, but rarely on both. Few tools support both tasks and help developers understand the relations between the behavioral and structural aspects of the software. This paper explains an approach to derive static and dynamic information and to complement it with abstract software views in the form of state and communication diagrams of the system.
These complementary software views extend the dynamic view and help in comprehending the big picture of the software system. This paper presents a case study of an ATM system to explain the reverse engineering process.

2. Reverse Engineering Approach

To maintain and enhance legacy systems, a developer needs to understand both the high-level structure (artifacts describing the top-level structure of the overall system) and the low-level artifacts (artifacts describing the class-level structure and its hierarchies). Reverse engineering of static, dynamic, state and communication diagrams from existing software can significantly facilitate software maintenance and software evolution. Static information describes the structural characteristics of the system, and dynamic information describes its dynamic characteristics, such as runtime behavior. Extracting abstract high-level static and dynamic views of a complex system is the most difficult phase of the reverse engineering process. This paper explains a methodology to extract the state diagram and the communication diagram from the static and dynamic views of a software system, which helps to obtain a top-level view of the system behavior.

Reverse engineering tools exist that help to extract static and dynamic information. EA (Enterprise Architect), a reverse engineering tool, has been used to extract the static and dynamic views of the software for the ATM (automated teller machine) case study.

Corresponding author. Elsevier Publications 2013.
Figure 1. Static software view of ATM system.

2.1 Static software views

The proposed approach to extract static software views is divided into two steps: collection of the class hierarchy and reconstruction of an abstract view of the software system. A class diagram is a graphical presentation of the static view that shows a collection of static model elements, such as classes, interfaces and types, as well as their contents and relationships. A class is the descriptor for a set of objects with similar structure, behavior and relationships. The hierarchical class diagram provides a way to discover the entire class inheritance relationship within an application [2]. The high-level class diagram exhibits aggregation or composition relationships between classes, which helps developers to extend or modify an existing design. This is particularly powerful when developers explore a large system [3]. The developer manually produces a high-level abstraction of the UML class diagram by removing or altering redundant class dependencies and by applying knowledge acquired from the existing system [2].

Static reverse engineering is performed on the code base with the EA (Enterprise Architect) tool. It has a reverse engineering tool integrated into the EA environment, which is capable of extracting a UML static model from C++/C# source code. Figure 1 shows the derived graphical representation of the static software view of the ATM system; it helps to identify the important components of an existing system.

2.2 Dynamic Software Views

Dynamic characterization is particularly important for those parts of a system that are mainly understood through their dynamic behavior, such as various kinds of controllers and drivers [4]. Dynamic reverse engineering aims at the reconstruction of a behavioral model of the system, represented through UML sequence diagrams.
Dynamic reverse engineering is also performed with the EA tool, whose integrated reverse engineering facility is capable of extracting a UML dynamic model from C#/C++ source code. The developer determines use cases for the important features of an application. Each specific sequence of communications in a use case is called a scenario diagram. The sequence diagram for the ATM transaction use case, shown in figure 2, was extracted from the pre-existing source code using the automated EA tool. It shows the message flow among the objects of the ATM application. From the diagram it can be observed that the objects react differently depending on the current event and pass events on to other objects. Dynamic diagrams can be used to illustrate the runtime behavior of an object-oriented system.
Figure 2. Dynamic view of ATM transaction scenario for ATM system.

3. Complementary Software Views

Static models depict classes, inheritance relationships, and aggregation relationships. Object-oriented models are complex to comprehend because of behavioral characteristics such as polymorphism and late binding. Software design is about behavior, and behavior is dynamic. The emphasis is therefore on dynamic analysis to understand the behavior of a software system. A static model is inadequate without an associated dynamic model. On the other hand, dynamic models do not represent the structure of software components and their relationships. To understand the system, we therefore need to integrate the static and dynamic information. We extracted communication diagrams and state diagrams to extend the visualization of the dynamic behavior.

3.1 Extract state diagram

State diagrams give an abstract description of the behavior of a system. This behavior is analyzed and represented as a series of events that can occur in one or more possible states [5]. A state diagram thus represents the objects of a single class and tracks the different states of those objects during their lifetime. It fills the gap between the static source code and the runtime behavior of the executable program.

We demonstrate below how to extract the state diagram from a set of sequence diagrams. The extraction is performed by running the system, with manual intervention. A state trace is extracted from the sequence diagram by selecting the participating object; human intervention is required to turn this trace into a state diagram. Traversing the lifeline (the vertical line) of the object from top to bottom gives the order in which events occur. These events are of two types: send events and receive events. A send event starts at the base of the message arrow, where the message departs from the lifeline of the sending object. A receive event starts at the head of the message arrow, where the arrow hits the lifeline of the receiving object. Each incoming call of an object is considered an incoming event, and each outgoing call is considered a resultant action in response to the previous event. An action is a set of message interactions performed between objects to achieve a task. Interaction operands such as opt, loop, alternatives and break are part of an action and can be grouped together as a single action [4]. The resultant actions are mapped as transitions between states, and the abstract states between these transitions are identified manually. The logic is iterated until all actions are coupled with states.

Extraction of state transitions and states can be done in the following steps [4]:
1. Manually identify the start and exit points of states.
2. Construct an event table/state table to identify the states and transition points.
3. Create a transition table and map the transition points to the associated states.
4. Merge interaction operands as part of a single resultant action.
5. Iterate until all transition actions are coupled with states.

Table 1 is the state table prepared to identify states such as CardVerification and PINVerification for the given ATM use case scenario. It is derived by applying the state transition logic [6] shown in figure 3.

Figure 3. State transition logic.
Input Condition: represented by the virtual condition required to perform a state transition.
Output Condition: represented by the outcome of the transition action.
Action: represented by all the available actions on the input condition.
State Name: the name defined for each of the states.

Table 1. Identify state table.
State Table
State Name | Input Condition(s) | Action(s) | Output Condition(s)
CardVerification | InsertCard | IsCardVerified | CardInvalid/CardVerified
PINVerification | EnterPINDetails | IsPINVerified | PINValid/PINInvalid Error

A state transition function f(), which takes the system from state A to state B (A → B), consists of the source code that is executed between the entry and exit points of method f() when it is called in the execution context represented by state A.
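The extraction steps above can be sketched in code. The following minimal Python sketch assumes that a sequence diagram is available as an ordered list of messages. The participant names Customer, ATM and Bank, the Message type and the three helper functions are hypothetical; the mapping from incoming events to state names stands in for the manual identification of abstract states, and interaction operands are not modeled.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    name: str


def event_trace(messages, obj):
    """Traverse the lifeline of `obj` from top to bottom.

    Each incoming call is an event; the outgoing calls that follow it are
    grouped as the resultant action for that event.
    """
    trace = []
    for msg in messages:
        if msg.receiver == obj:
            trace.append((msg.name, []))      # receive event
        elif msg.sender == obj and trace:
            trace[-1][1].append(msg.name)     # send event, part of the action
    return trace


def state_table(trace, state_names):
    """Couple each (event, actions) pair with a manually chosen state name."""
    return [(state_names[event], event, actions) for event, actions in trace]


def transition_table(rows):
    """Map (current state, next input) to the next state for consecutive rows."""
    return {
        (current[0], following[1]): following[0]
        for current, following in zip(rows, rows[1:])
    }


# Assumed scenario; only the event and action names are taken from Table 1.
scenario = [
    Message("Customer", "ATM", "InsertCard"),
    Message("ATM", "Bank", "IsCardVerified"),
    Message("Customer", "ATM", "EnterPINDetails"),
    Message("ATM", "Bank", "IsPINVerified"),
]
# Manual step: the abstract state entered on each incoming event.
state_names = {
    "InsertCard": "CardVerification",
    "EnterPINDetails": "PINVerification",
}

rows = state_table(event_trace(scenario, "ATM"), state_names)
transitions = transition_table(rows)
```

For the ATM lifeline in this assumed scenario, rows holds one entry per state, and transitions holds a single entry that leads from CardVerification to PINVerification on the input EnterPINDetails.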
The state transition function f() thus contributes to the process of changing from state A to state B [7,8]. A state transition table was prepared for the given use case scenario, in which the combination of the current state (e.g. CardVerification) and an input (e.g. EnterPINDetails) gives the next state (e.g. PINVerification) [9].

Table 2. State transition table.
State Transition Table
Input Condition Display Message/Idle State CardVerification Transaction Initiation InsertCard EnterPINDetails PINVerification

The state diagram extracted from the above scenario diagram represents the initial and final specification of the dynamic behavior, as shown in figure 4.

Figure 4. State diagram of transaction for ATM system.

3.2 Extract communication diagram

A sequence diagram represents the object interaction in terms of time sequence and the sequence of messages exchanged, but it does not represent object relationships. The communication diagram in figure 5 is derived from the sequence diagram for the ATM transaction use case. These diagrams have been generated using EA, a tool for designing UML diagrams. The communication diagram explains how the objects in the static model collaborate to execute the ATM transaction use case. It combines information derived from the class, sequence and use case diagrams and so describes both the static structure and the dynamic behavior of the system. It focuses on the relationships between the objects and shows the relationships among the instances in a sequential manner. Communication diagrams are very useful for visualizing the way several objects collaborate to get a job done and for comparing a dynamic model with a static model of the system [10]. In figure 5, the sequence of messages is shown through a numbering scheme and can be used to assess the object dependencies of the system.
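One way to picture the derivation of a communication diagram from a sequence diagram is the following minimal Python sketch. It numbers the messages in their time order and groups them by the pair of objects they connect, so that each group corresponds to one link of the communication diagram. The object names and the communication_links helper are hypothetical, and this is not the procedure that EA applies internally.

```python
from collections import defaultdict


def communication_links(messages):
    """Group numbered messages by the unordered pair of objects they connect."""
    links = defaultdict(list)
    for number, (sender, receiver, name) in enumerate(messages, start=1):
        link = tuple(sorted((sender, receiver)))
        links[link].append(f"{number}: {name} ({sender} -> {receiver})")
    return dict(links)


# Assumed ATM transaction scenario as (sender, receiver, message) triples.
scenario = [
    ("Customer", "ATM", "InsertCard"),
    ("ATM", "Bank", "IsCardVerified"),
    ("Customer", "ATM", "EnterPINDetails"),
    ("ATM", "Bank", "IsPINVerified"),
]

links = communication_links(scenario)
for (first, second), labels in links.items():
    print(f"{first} -- {second}: " + "; ".join(labels))
```

The number of links attached to an object gives a rough indication of its dependencies, while the message numbers preserve the ordering that the sequence diagram expressed through vertical position.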
Figure 5. Communication diagram of transaction for ATM system.

4. Conclusion

Reverse engineering tools that help a software engineer understand the static structure and the dynamic aspects of software together are rarely available. The reverse engineering technique proposed in this paper was developed to help the engineer attain the reverse engineering goal. Since this approach uses both static and dynamic information, it can answer a large range of questions about an application. Visualization of the runtime behavior is extended one step further, to the composition of state and communication diagrams.

References
[1] Chikofsky, E. J. and Cross, J. H., Reverse engineering and design recovery: A taxonomy, IEEE Software, 7, January 1990.
[2] Yazmin Angelica Ibanez Garcia, Complexity boundaries for full satisfiability of restricted UML class diagrams, October 2009.
[3] Fenglian Xu and Alex Wood, Reverse Engineering UML class and sequence diagrams from Java code with IBM Rational Software Architect, IBM Technical Library, United Kingdom, 2008. URL: http://www.ibm.com/developerworks/rational/library/08/0610_xu-wood/index.html
[4] Tarja Systä and Kai Koskimies, Extracting state diagrams from legacy systems, Object-Oriented Technologys, Lecture Notes in Computer Science, Volume 1357, pp. 272-273, January 1997 1998.
[5] State Diagram, http://en.wikipedia.org/wiki/state_diagram
[6] Finite state machine, http://en.wikipedia.org/wiki/finite_state_machine
[7] Neil, Krill, Shaukat and Mike Holcombe, Automated discovery of state transitions and their functions in source code, Software Testing, Verification and Reliability, 2007.
[8] D. H. A. van Zeeland, Reverse-engineering state machine diagrams from legacy C-code, Master's thesis, Department of Mathematics & Computer Science, http://alexandria.tue.nl/extra1/afstversl/wsk-i/zeeland2009.pdf (March 2009).
[9] State Transition Table, http://en.wikipedia.org/wiki/state_transition_table
[10] Robert C. Martin, UML Tutorial: Collaboration Diagrams, Engineering Notebook Column, November/December 1997.
[11] Enterprise Architect, http://www.sparxsystems.com/