OUR WORK takes place in the process described in [2]



Similar documents
A Reverse Inheritance Relationship for Improving Reusability and Evolution: The Point of View of Feature Factorization

Object-Oriented Design Lecture 4 CSU 370 Fall 2007 (Pucella) Tuesday, Sep 18, 2007

Quotes from Object-Oriented Software Construction

Masters programmes in Computer Science and Information Systems. Object-Oriented Design and Programming. Sample module entry test xxth December 2013

Thomas Jefferson High School for Science and Technology Program of Studies Foundations of Computer Science. Unit of Study / Textbook Correlation

Computing Concepts with Java Essentials

Programming and Software Development CTAG Alignments

An Exception Monitoring System for Java

Introduction to Java

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Case studies: Outline. Requirement Engineering. Case Study: Automated Banking System. UML and Case Studies ITNP090 - Object Oriented Software Design

Design by Contract beyond class modelling

Chapter 6 Experiment Process

Model-driven Engineering for Requirements Analysis

A Meeting Room Scheduling Problem

Programming the Flowoid NetFlow v9 Exporter on Android

Java Application Developer Certificate Program Competencies

SERENITY Pattern-based Software Development Life-Cycle

UML-based Test Generation and Execution

Graph-Grammar Based Completion and Transformation of SDL/UML-Diagrams

Dynamic Adaptability of Services in Enterprise JavaBeans Architecture

Object Oriented Design

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013

Using semantic properties for real time scheduling

Java Generation from UML Models specified with Alf Annotations

CS 111 Classes I 1. Software Organization View to this point:

Limitations of Data Encapsulation and Abstract Data Types

The Java Series. Java Essentials I What is Java? Basic Language Constructs. Java Essentials I. What is Java?. Basic Language Constructs Slide 1

Journal of Information Technology Management SIGNS OF IT SOLUTIONS FAILURE: REASONS AND A PROPOSED SOLUTION ABSTRACT

Software Development. Chapter 7. Outline The Waterfall Model RISKS. Java By Abstraction Chapter 7

National University of Ireland, Maynooth MAYNOOTH, CO. KILDARE, IRELAND. Testing Guidelines for Student Projects

a. Inheritance b. Abstraction 1. Explain the following OOPS concepts with an example

A METHOD FOR REWRITING LEGACY SYSTEMS USING BUSINESS PROCESS MANAGEMENT TECHNOLOGY

Effect of Using Neural Networks in GA-Based School Timetabling

Adaptation of the ACO heuristic for sequencing learning activities

Two Flavors in Automated Software Repair: Rigid Repair and Plastic Repair

Processing and data collection of program structures in open source repositories

Requirements by Contracts allow Automated System Testing

Meeting Scheduling with Multi Agent Systems: Design and Implementation

Integration of Application Business Logic and Business Rules with DSL and AOP

UML Profile For Software Product Lines

Towards Integrating Modeling and Programming Languages: The Case of UML and Java

Understanding SOA Migration Using a Conceptual Framework

Chapter 6: Programming Languages

Software Visualization Tools for Component Reuse

Glossary of Object Oriented Terms

Adapting C++ Exception Handling to an Extended COM Exception Model

java.util.scanner Here are some of the many features of Scanner objects. Some Features of java.util.scanner

Structural Design Patterns Used in Data Structures Implementation

Contents. Introduction and System Engineering 1. Introduction 2. Software Process and Methodology 16. System Engineering 53

VISUALIZATION APPROACH FOR SOFTWARE PROJECTS

Fundamentals of Java Programming

Java 6 'th. Concepts INTERNATIONAL STUDENT VERSION. edition

Java Program Coding Standards Programming for Information Technology

Chapter 1: Key Concepts of Programming and Software Engineering

A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

Java Interview Questions and Answers

Prediction of Software Development Modication Eort Enhanced by a Genetic Algorithm

JDK 1.5 Updates for Introduction to Java Programming with SUN ONE Studio 4

CSC408H Lecture Notes

Checking Access to Protected Members in the Java Virtual Machine

PROBLEM SOLVING SEVENTH EDITION WALTER SAVITCH UNIVERSITY OF CALIFORNIA, SAN DIEGO CONTRIBUTOR KENRICK MOCK UNIVERSITY OF ALASKA, ANCHORAGE PEARSON

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

Java (12 Weeks) Introduction to Java Programming Language

Traceability Patterns: An Approach to Requirement-Component Traceability in Agile Software Development

Enumerated Types in Java

BCS Certificate in Systems Modelling Techniques Syllabus

Evaluation of Students' Modeling and Programming Skills

Software Testing. Motivation. People are not perfect We make errors in design and code

Dsc+Mock: A Test Case + Mock Class Generator in Support of Coding Against Interfaces

What Questions Developers Ask During Software Evolution? An Academic Perspective

Multi-objective Design Space Exploration based on UML

An Approach for Generating Concrete Test Cases Utilizing Formal Specifications of Web Applications

Functional Size Measurement of Multi-Layer Object- Oriented Conceptual Models

Usability metrics for software components

THE BCS PROFESSIONAL EXAMINATIONS Diploma. April 2006 EXAMINERS REPORT. Systems Design

Use Case Modeling. Software Development Life Cycle Training. Use Case Modeling. Set A: Requirements Analysis Part 3: Use Case Modeling

Outline. 1 Denitions. 2 Principles. 4 Implementation and Evaluation. 5 Debugging. 6 References

Automatic generation of fully-executable code from the Domain tier of UML diagrams

Automatic Test Data Synthesis using UML Sequence Diagrams

STORM. Simulation TOol for Real-time Multiprocessor scheduling. Designer Guide V3.3.1 September 2009

Coordinated Visualization of Aspect-Oriented Programs

Supporting Software Development Process Using Evolution Analysis : a Brief Survey

Introduction to Java Applications Pearson Education, Inc. All rights reserved.

Recovery of Metrics by using Reverse Engineering

Composing Concerns with a Framework Approach

Lecture J - Exceptions

Standard for Software Component Testing

Scanner. It takes input and splits it into a sequence of tokens. A token is a group of characters which form some unit.

Leveraging UML for Security Engineering and Enforcement in a Collaboration on Duty and Adaptive Workflow Model that Extends NIST RBAC

Computer Programming I

Chapter 5. Recursion. Data Structures and Algorithms in Java

An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

A Tool for Generating Partition Schedules of Multiprocessor Systems

Towards the Use of Sequence Diagrams as a Learning Aid

DEFINING CONTRACTS WITH DIFFERENT TOOLS IN SOFTWARE DEVELOPMENT

An Automated Model Based Approach to Test Web Application Using Ontology

Part I. Multiple Choice Questions (2 points each):

PART-A Questions. 2. How does an enumerated statement differ from a typedef statement?

Transcription:

Reverse-engineering of UML 2.0 Sequence Diagrams from Execution Traces Romain Delamare, Benoit Baudry IRISA / INRIA Rennes Campus Universitaire de Beaulieu Avenue du Général Leclerc 35042 Rennes Cedex France {romain.delamare,benoit.baudry@irisa.fr Yves Le Traon France Télécom R&D MAPS / EXA 2 avenue Pierre Marzin 22307 Lannion Cedex France yves.letraon@francetelecom.com Abstract To fully understand the behavior of a program, it is crucial to have efficient techniques to reverse dynamic views of the program. In this paper, we focus on the reverse engineering of UML 2.0 sequence diagrams showing loops and ernatives from execution traces. To build these complete sequence diagrams, we need to capture the systems state through dynamic analysis. We propose to build state vectors through trace analysis and we precisely discuss how the state of an object-oriented system can be captured. We also present an adaptable trace analysis tool that we have developed to experiment the ideas presented in this work. during the trace analysis. The contribution of this paper consists in proposing an approach to capture the program state during the trace analysis in order to precisely reverse engineer sequence diagrams. We also present and adaptable tool for trace analysis. The rest of the paper is organised as follows : Section II introduces our approach; Section III explains how to catch the program state; Section IV discusses the tracing methods that can be used to retrieve the needed informations; Finaly Sect. V concludes the discussion. I. INTRODUCTION SOFTWARE maintenance is a major concern in the industry as software are getting more and more complex and bigger. Due to high level languages and mechanism such as inheritance or polymorphism it is very difficult to understand the behavior of a program just by reading its source code. Moreover the constant evolution of softwares and technologies often makes documentations obsolete and difficult to maintain. Reverse-engineering can help maintainers understanding a complex system by retrieving models and documentation from a program. This is the process defined as redocumentation in []. UML is a good target language for the reverse engineering of models since it is largely used and offers different diagrams. The reverse-engineering of UML static diagrams like class diagrams has been studied and is now implemented in several tools. Static views of the system allow engineers to understand its structure but it does not show the behavior of the software. To fully understand its behavior, dynamic models are needed, such as sequence diagrams or statecharts. But for now, little work has been done on the reverseengineering of UML dynamic diagrams. Our work is inspired by [2]. We focus on trace analysis for the reverse engineering of sequence diagrams. Simple sequence diagrams can be retrieved by a simple analysis of the messages exchanged between the objects. We concentrate on the generation of UML 2.0 sequence diagrams which model loops and ernatives. To retrieve those informations we need informations about the state of the system II. REVERSING SEQUENCE DIAGRAMS OUR WORK takes place in the process described in [2] and focuses on the two first steps of the process. The first step consists in tracing the program to produce basic sequence diagrams and the second step consists in combining the sequence diagrams obtained in the first step to obtain high level sequence diagrams. High level sequence diagrams are opposed to basic (or simple) sequence diagrams. High level sequence diagrams describe a more complex behavior by using the UML 2.0 fragments. Those fragments show ernatives ( fragment) or loops (loop fragement). Basic sequence diagrams do not show those kind of behavior. To obtain high level sequence diagrams we annote basic sequence diagrams with informations on the state of the program before and after each message. We then use this state informations to combine the sequence diagrams and obtain a high level sequence diagram with the combination operators from UML 2.0. This method is our main contribution and it is detailed in the rest of the section. Although generating basic sequence diagrams is easy, generating high level sequence diagrams is a complex task. A basic sequence diagram can be directly extracted from an execution trace. Figure shows an example of Java code with two classes. If an instance of A executes the method, then it calls on an instance of B; so the sequence diagram corresponding to that execution must have a message from the instance of A to the instance of B, as shown in

public class A { B b; public void () { b.(); {a.state= public class B { public void () { Fig.. An example of Java code {a.state= Fig. 3. A sequence diagram annoted with state vectors. loop Fig. 2. A sequence diagram obtained by tracing the example from Fig. Fig. 2. On the other hand, composing basic sequence diagrams to generate a higher level sequence diagram is a difficult task : the detection of iterations or optional messages is not obvious, even with many basic sequence diagrams. We propose to use state vectors for the generation of high level sequence diagrams. A state vector is a vector of variables that characterizes the system state. State vectors are useful to detect iteration or conditional message. State vectors can be extracted at runtime before and after each message. Then we can combine several basic sequence diagrams. With state vectors we can detect when two different sequence diagrams are in the same state, or we can detect when there is an iteration on a single sequence diagram. In Sect. III we detail how to catch the system state in a state vector. Capturing the program state is not simple and we propose a precise analysis ot the state notion in Sect. III. A sequence diagram can be annoted with state vectors before and after each message. Then this sequence diagram can be transformed to show loops or ernatives. For instance, Fig. 3 shows a sequence diagram annoted with state vectors. The sequence of the message and can be iterated as the vector state before the sending of and the state vector after the sending of are the same. The message can also be iterated as it does not change the state of the system. Fig. 4. 3. A sequence diagram obtained by transforming the diagram of Fig. Figure 4 shows a transformation of the sequence diagram of Fig. 3 with the UML 2.0 loop and fragments. Different sequence diagrams can be combined to obtain a more detailed sequence diagram. For instance, Fig. 5 is the result of a second trace analysis of the program traced in Fig. 3. We can combine those two annoted sequence diagrams to obtain the sequence diagram from Fig. 6. The combination of sequence diagrams, as well as the detection of loops and ernatives, can be automated. Although our method has similarities with the work of [3], they are very different. Both methods rely on state vectors but in [3] the goal is to transform sequence diagrams into statecharts. Moreover in [3] the authors deduce the state vectors from preconditions and post-conditions on the message whereas we extract the state vectors from trace executions. In the next section we define the state vectors we want to

public class Boiler { private int temperature; m4 {a.state= {a.state= public static void main(string[] args) { if (temperature > 80) if (temperature < 50) else Fig. 5. An other annoted sequence diagram. Fig. 7. An example of a Java class with a state characterized by a large domain attribute. loop characterized by a set of attributes. We can reasonably suppose that if a statechart can describe the behaviour of the system, then a set of attributes that characterizes the program can be found. The attributes of the state vector can be of several different types, and they must be considered differently. Mainly we can divide the different attribute types into three groups : the primitive values on a small domain, the primitive values on a large domain and the values that refer to an object. Fig. 6. A sequence diagram obtained by combining the diagrams from Fig. 3 and 5. capture and how to capture them. III. CAPTURING THE PROGRAM STATE THROUGH TRACE ANALYSIS CAPTURING the state of a program is an essential part of our approach as it is based on state vectors. Therefore we must have precise and consistent informations about the state of the program at several points of the program. We propose to catch the program state during the execution before and after each message. We also propose a methodology to extract the desired sequence diagrams. State Vector: A state vector is a vector of values of attributes of the system that characterizes the state of the program. This implies that the state of the program can be m4 A. Small domain attributes The attributes with a primitive value on a small domain are essentially boolean, character or enumeration attributes. They all have a few different values. Boolean and enumeration attributes often characterize the state of the program, but character attributes or rarely used for the state of the program. Small domain attributes are easy to capture. Comparing their values is enough to distinguish two states. So the trace must only store the value of those attributes at each considered event of the execution. B. Large domain attributes The attributes with a primitive value on a large domain, like integers or floats, are very difficult to catch. Mainly two cases can occur : the attribute takes only a few different values the attribute takes its values on a very large part of the domain If they take only a few different values then they can be considered as primitive values on short domains : comparing their values is enough to distinguish two states. But if they take a lot of different values the state is harder to catch. Instead of being characterized by the exact value of the attributes like short domain attributes, the state of the program can be characterized by the intervals containing the values of the attributes. For instance, Fig. 7 shows a Java class representing a boiler. An instance of the class Boiler can be in three different states :

A -b: B C -d: D -a -b -d B -att: int -a: A D state state2 state3 Fig. 8. A UML class diagram. Although the class A and C both have an attribute referencing an object, they must be catched differently. when the temperature is greater than 80 degrees, when the temperature is between 50 and 80 degrees and when the temperature is less than 50. Distinguishing the state is difficult : the temperature can have two different values in the same state. To solve this problem we must know the intervals of interest in regard of the state. Either those intervals must be specified by the user, either we must infer them. C. Object reference attributes Attributes referencing an object can describe the state vector in two different ways : the state of the program is characterized by the attributes of the referenced object the state of the program is characterized by the real type of the referenced object If the state of the program is characterized by the attributes of the referenced object, we must capture the values of those attributes. For example in Fig. 8, the class A has an attribute b of type B. The state of the program is characterized by the attributes of B, so those attributes must be captured. To avoid cross-reference problems and to keep the state vector simple we suggest to capture only the primitive attributes of the object. Figure 8 shows an example of cross-reference : an instance of A and an instance of B can reference each other. The state of the program can be characterized by the real type of the object. This is often the case when using specific design patterns like the state pattern. In this pattern the state of an object is characterized by the real type of an attribute declared with an abstract type. For instance in Fig. 8 the class C has an object reference attribute declared with type D. The class D is an abstract class extended by three different classes : the state of C is characterized by the real type of its attribute d. To handle the case of state characterized by the real type of an object, we propose to extract the name of the real type as the relevant information for the attribute declared with an abstract type. The state pattern can be handled easily like this. D. Methodology In order to produce relevant sequence diagrams the state vector must be precise and easy to compare with other state vectors. A state vector with all the attributes of all objects of the program does not characterize the state of the program properly. So the state vector must only contain the object attributes that characterize the state of the program. To extract relevant state vectors we propose an interactive approach and detail a methodology. This method is based on a sequence of exchanges between the user and the tracing tool. The user specifies which attributes must be part of the state vector. This cannot be easily done with little knowledge of the system, but we propose a methodology to avoid this problem. Our methodology consists in several interactions between the user and the tracing tool. At first the user will likely not know which attributes to choose. She can either try to guess the relevant attributes or select all the attributes. The first result has little chance to be meaningful. Then the user must try and remove some attributes from the state vector. If the result is worse then the removed attribute should be put back in the vector state, otherwise the user must continue and try to remove other attributes. This method will quickly converge to a minimal state vector where all the attributes characterizing the state are present, and only those attributes. IV. TRACING OBJECT ORIENTED PROGRAMS THE DIFFICULT part of that method is to determine whether a result is better than an other or not. First a result showing a new ernative or iteration is better because it describes a more precise behavior. Also if after removing an attribute from the vector state the result has not changed, it is likely that the attribute does not characterize the state of the program. In the worst case the result is a basic sequence diagram without the UML 2.0 fragments. If the user cannot find a correct state vector either because of the difficulty of the task or the lack of correct state vector the method neither improve nor worsen the sequence diagram. This means that if a result differs from the basic sequence diagram, then this result is better. A second contribution of our work is to develop an adaptable tool for execution trace analysis. This tool is based on the use of a debugging tool called JTracor that is developped at IRISA. It traces Java program and was initially used for code coverage [4] and fault localization [5] purposes. We have extended JTracor for reverse-engineering. Figure 9 shows the UML class diagram of JTracor. It relies on the Java Debug Interface (JDI), an API provided by Sun Microsystems. The class TraceProvider executes the program and at each event (e.g. method entry or exit, exception) it calls a method on its listeners. The listeners are instance of a class implementing the interface Trace. The trace produced depends on the implementation of Trace. For example CodeCoverageTrace

Java Debug Interface (JDI) TraceProvider #listeners: Trace +execute() +addlistener(trace) <<interface>> Trace +handlestep() +handlecall() +handleret() +handleexception() +handleend() CodeCoverageTrace SequenceDiagramTrace Fig. 9. UML class diagram of JTracor produces code coverage statistics. This allows JTracor to be used in many different cases. The SequenceDiagramTrace class generates sequence diagrams with informations on state vectors. The user must specify the classes and the method to appear in the sequence diagram. She must also specify for each class the attributes whose value must be catched in the state vectors. At each method entry, a message is created in the sequence diagram. The values of the watched attributes of the source object and the class object are read and put in the state vector. When the method returns, the attributes of the target are read again. Exceptions are handled too, because method exit may occur when an exception is thrown. [3] J. Whittle and J. Schumann, Generating statechart designs from scenarios, in proceedings of the 22 nd International Conference on Software Engineering, 2000. [4] C. Nebut, F. Fleurey, Y. Le Traon, and J.-M. Jézéquel, Requirements by contracts allow automated system testing, in Proc. of the 4 th. IEEE International Symposium on Software Reliability Engineering (ISSRE 03), 2003. [5] B. Baudry, F. Fleurey, and Y. Le Traon, Improving test suites for efficient fault localization, in Proceedings of the 28 th International Conference on Software Engineering (ICSE 06). ACM, 2006. V. CONCLUSION OUR WORK consists in generating UML 2.0 sequence diagrams from execution traces, as described in [2]. We propose a method based on state vectors that allows loops and ernatives detection within a single sequence diagrams and combination of several sequence diagrams. The state vectors are retrieved at runtime and we present a methodology to obtain meaningful sequence diagrams by finding a precise state vector. For trace analysis we use a tool developed at IRISA, JTracor, and we are currently experimenting with several different situations. Future works include more experimentations and some case-studies to confirm our methodology and our hypothesis. REFERENCES [] E. J. Chikofsky and J. H. Cross II, Reverse engineering and design recovery: a taxonomy, IEEE Software, 990. [2] Y.-G. Guéhéneuc and T. Ziadi, Automated reverse-engineering of UML 2.0 dynamic models, in proceedings of the 6 th ECOOP Workshop on Object-Oriented Reengineering, 2005.