Eclipse 3.5 - A Case Study in Software Visualization Based on Static Analysis



Similar documents
VISUALIZATION APPROACH FOR SOFTWARE PROJECTS

Percerons: A web-service suite that enhance software development process

Software Analysis Visualization

Program Understanding with Code Visualization

Software Engineering & Architecture

Comprensione del software - principi base e pratica reale 5

BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs

Génie Logiciel et Gestion de Projets. Evolution

VisCG: Creating an Eclipse Call Graph Visualization Plug-in. Kenta Hasui, Undergraduate Student at Vassar College Class of 2015

Quality prediction model for object oriented software using UML metrics

Component visualization methods for large legacy software in C/C++

The Class Blueprint A Visualization of the Internal Structure of Classes

Exploiting Dynamic Information in IDEs Eases Software Maintenance

EVALUATING METRICS AT CLASS AND METHOD LEVEL FOR JAVA PROGRAMS USING KNOWLEDGE BASED SYSTEMS

Visualization of bioinformatics workflows for ease of understanding and design activities

A Tool for Visual Understanding of Source Code Dependencies

CodeCrawler An Extensible and Language Independent 2D and 3D Software Visualization Tool

Program Understanding in Software Engineering

Software Visualization Tools for Component Reuse

Software Bugs and Evolution: A Visual Approach to Uncover Their Relationship

Modifying first person shooter games to perform real time network monitoring and control tasks

Tool Support for Inspecting the Code Quality of HPC Applications

IMPROVING JAVA SOFTWARE THROUGH PACKAGE STRUCTURE ANALYSIS

Chap 4. Using Metrics To Manage Software Risks

Software Visualization and Model Generation

A Practical Approach to Software Quality Management visualization

Visualization of Software Metrics Marlena Compton Software Metrics SWE 6763 April 22, 2009

Processing and data collection of program structures in open source repositories

Bayesian Inference to Predict Smelly classes Probability in Open source software

The software developers view on product metrics A survey-based experiment

Object Oriented Design

Analyzing Java Software by Combining Metrics and Program Visualization

How To Calculate Class Cohesion

Definitions. Software Metrics. Why Measure Software? Example Metrics. Software Engineering. Determine quality of the current product or process

GSPIM: Graphical Visualization Tool for MIPS Assembly

Exploring the Differing Usages of Programming Language Features in Systems Developed in C++ and Java

A Visualization Approach for Bug Reports in Software Systems

Coordinated Visualization of Aspect-Oriented Programs

Visualizing Software Architecture Evolution using Change-sets

BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests

Visual Analysis Tool for Bipartite Networks

A hybrid approach for the prediction of fault proneness in object oriented design using fuzzy logic

VISUALIZATION TECHNIQUES OF COMPONENTS FOR LARGE LEGACY C/C++ SOFTWARE

Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment

Baseline Code Analysis Using McCabe IQ

Gadget: A Tool for Extracting the Dynamic Structure of Java Programs

EvoSpaces: 3D Visualization of Software Architecture

II. TYPES OF LEVEL A.

An Introduction to Software Visualization. Visualization. Types of Software Visualization. Course Overview

Automatic software measurement data collection for students

SourceMeter SonarQube plug-in

Keywords Class level metrics, Complexity, SDLC, Hybrid Model, Testability

A Survey Paper on Software Architecture Visualization

Chapter 24 - Quality Management. Lecture 1. Chapter 24 Quality management

An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases

TOWARDS PERFORMANCE MEASUREMENT AND METRICS BASED ANALYSIS OF PLA APPLICATIONS

Software Visualization

A Framework of Model-Driven Web Application Testing

Software Visualization

Development of a 3D tool for visualization of different software artifacts and their relationships. David Montaño Ramírez

VISUALIZATION. Improving the Computer Forensic Analysis Process through

Empirical study of software quality evolution in open source projects using agile practices

Software Development Kit

Visualization methods for patent data

Hierarchical Data Visualization. Ai Nakatani IAT 814 February 21, 2007

Vragen en opdracht. Complexity. Modularity. Intra-modular complexity measures

Enterprise Integration: operational models of business processes and workflow systems *

Understanding Software Static and Dynamic Aspects

International Journal of Software Engineering and Knowledge Engineering c World Scientific Publishing Company

Research Article An Empirical Study of the Effect of Power Law Distribution on the Interpretation of OO Metrics

DAHLIA: A Visual Analyzer of Database Schema Evolution

Comprehending Software Architecture using a Single-View Visualization

Dynamic Analysis. The job of the reverse engineer is similar to the one of the doctor, as they both need to reason about an unknown complex system.

Visually Encoding Program Test Information to Find Faults in Software

The Use of Information Visualization to Support Software Configuration Management *

RETRATOS: Requirement Traceability Tool Support

An Interactive Visualization Tool for the Analysis of Multi-Objective Embedded Systems Design Space Exploration

What Questions Developers Ask During Software Evolution? An Academic Perspective

Tracking the Evolution of Object-Oriented Quality Metrics on Agile Projects

JRefleX: Towards Supporting Small Student Software Teams

Towards Collaborative Requirements Engineering Tool for ERP product customization

Péter Hegedűs, István Siket MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary

A Framework for Dynamic Software Analysis & Application Performance Monitoring

WebSphere Business Modeler

Measurement, Analysis with Visualization for Better Reliability

Big Data: Rethinking Text Visualization

Software visualization. Why visualization?

Detecting Defects in Object-Oriented Designs: Using Reading Techniques to Increase Software Quality

Design methods. List of possible design methods. Functional decomposition. Data flow design. Functional decomposition. Data Flow Design (SA/SD)

Simulating the Structural Evolution of Software

White Coats: Web-Visualization of Evolving Software in 3D

Software Evolution Analysis & Visualization

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

Extracting Facts from Open Source Software

Contents. Introduction and System Engineering 1. Introduction 2. Software Process and Methodology 16. System Engineering 53

Call Graph Based Metrics To Evaluate Software Design Quality

BCS HIGHER EDUCATION QUALIFICATIONS Level 6 Professional Graduate Diploma in IT. March 2013 EXAMINERS REPORT. Software Engineering 2

Structural Complexity Evolution in Free Software Projects: A Case Study

VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc.,

Development/Maintenance/Reuse: Software Evolution in Product Lines

Transcription:

MetricAttitude: A Visualization Tool for the Reverse Engineering of Object Oriented Software ABSTRACT Michele Risi Dipartimento di Matematica e Informatica Università di Salerno Fisciano, Italy mrisi@unisa.it In this paper, we present a visualization approach for the reverse engineering of object-oriented (OO) software systems and its implementation in MetricAttitude, an Eclipse Rich Client Platform application. The goal of our proposal is to ease both the comprehension of a subject system and the identification of fault-prone classes. The approach graphically represents a suite of object-oriented design metrics (e.g., Weighted Methods per Class) and traditional codesize metrics (e.g., Lines Of Code). To assess the validity of MetricAttitude and its underlying approach, we have conducted a case study on the framework Eclipse 3.5. The study has provided indications about the tool scalability, interactivity, and completeness. The results also suggest that our proposal can be successfully used in the identification of fault-prone classes. Categories and Subject Descriptors H.5.2 [Information Interfaces and Presentation]: Graphical User Interface; I.3.6 [Computer Graphics]: Methodology and Techniques Interaction Techniques Keywords Program Comprehension, Reverse Engineering, Software Maintenance, Metrics, Software Visualization 1. INTRODUCTION Many activities in the software engineering field involve existing software. Software maintenance, testing, quality assurance, reuse, and integration are only few examples of software processes that involve existing systems [1]. A crucial issue of these processes entails the reverse engineering of these systems. The term reverse engineering has been defined in the context of software engineering as: the process of analyzing a subject system to (i) identify the system s components and their inter-relationships and (ii) create representations of the Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AVI 12 May 21-25, 2012, Capri Island, Italy Copyright 2012 ACM 978-1-4503-1287-5/12/05...$10.00. Giuseppe Scanniello Dipartimento di Matematica e Informatica Università della Basilicata Potenza, Italy giuseppe.scanniello@unibas.it system in another form or at a higher level of abstraction [2]. To summarize, this process entails the gathering of information in software artifacts (not only source code). On the basis of that information, a reverse engineering process builds more understandable representations of software artifacts. In the reverse engineering field, software visualization [3], [4] is extensively and successfully explored and used. Researchers have proposed approaches/techniques and supporting tools based on 2D and 3D environments [5], [6], [7], [8]. These approaches visualize either static information, performing static analysis (e.g., [9]), or dynamic information, relying on the execution of the system (e.g., [10]). Dynamic information is affected by the execution scenarios that should be as much as possible representative of the system s typical usage [11]. This concern is the weakness of visualization approaches based on dynamic analysis. Whatever is the information to be visualized, a relevant concern of visualization is the way information is presented. The choice of a proper visualization is straightforward, when a well known and highly adopted representation is used, e.g., the notations of the UML [12]. In other cases, a visualization is the essence of a technique because it has to highlight relevant information at a proper level of detail. The kind of representation adopted impacts on the usefulness of the technique [1]. In this paper 1, we present an approach for software visualization based on static analysis. The approach provides a mental picture by viewing an object oriented (OO) software system by means of polymetric views [9], i.e., lightweight software visualizations enriched with software metrics. The approach has been implemented in a prototype of a supporting tool, intended as an Eclipse Rich Client Platform (RCP). We named this prototype MetricAttitude. To assess the validity of both the approach and the prototype, we have conducted a case study on the framework Eclipse 3.5. The case study provided indications about the tool scalability, interactivity, and completeness. Further, to assess whether our proposal eases the identification of fault-prone classes, we have studied the visual representation of Eclipse 3.5 with respect to data associated with changes in response to bug reports in a public repository. The main contributions of the paper are: A visualization approach for the reverse engineering of OO software. It is based on OO metrics and traditional code-size metrics; 1 Please read the paper on-screen or as a color-printed paper version, we make extensive use of color pictures.

A prototype of a supporting system. To increase the comprehension of a subject system, it implements several views, filtering, and display/editing operations; A novel static analysis based approach to recover relationships among classes. Relationships are used in the visualization together with the metrics collected on the system; A case study on a large open source software system implemented in Java. The results of this study reveled that the prototype has the following main characteristics: scalability, interactivity, and completeness; The bug reports of Eclipse 3.5 has been used to assess whether MetricAttitude and the underlying approach can be used to identify fault-prone classes. The remainder of the paper is organized as follows: in Section 2, we review previous software visualization approaches and techniques. In Section 3, we describe our approach, while we present MetricAttitude in Section 4. In Section 5, we outline the preliminary results of the conducted empirical evaluation. Finally, Section 6 concludes and discusses possible future directions for our research. 2. RELATED WORK There are a number of approaches/techniques for software visualization based on Synthetic Natural Environment (SNE) (e.g. [13], [14], [15], [16], [17]). In the first subsection, we discuss these techniques. In the literature, visualization techniques based on graph representations and on polymetric views have been also proposed [9], [18], [19]. We present some of them in the second subsection. Due to the huge number of software visualization techniques, a deep and exhaustive discussion of related work is not possible and it is also out of the scope of the paper. An extensive taxonomy of software visualization, with several examples and tools is available in [20]. 2.1 SNE based Techniques Among the SNE based techniques, the city metaphor is one of the most used (e.g., [13], [14], [15], [16]). Wettel and Lanza [21] propose such a kind of metaphor for the comprehension of object oriented software systems. Classes are represented as buildings and packages as districts. The authors implement the metaphor in the CodeCity tool. In a more recent paper, the same authors [22] present a controlled experiment to assess the validity of the metaphor and CodeCity. The experiment was conducted with software professionals. The results indicate that the tool leads to a statistically significant improvement in terms of task correctness. The results also indicate the task completion time statistically decrease when using the tool. Graham et al. [23] propose a solar system metaphor. Suns represent a package, planets are classes, and orbits represent the inheritance level within the package. Martínez et al. [24] suggest a metaphor based on landscape. The main objective of that proposal is to visually represent a number of aspects related to software development processes. Ghandar et al. [25] propose a jigsaw puzzle metaphor. Each software component is a piece of a jigsaw puzzle. The system complexity is represented as a pattern on the surface of the pieces. Erra and Scanniello [8] present an approach based on a forest metaphor to ease the comprehension of OO software. A software is depicted as a forest of trees. Each tree is a class. The trunk, the branches, the leaves, and their color represent characteristic of the class (e.g., a method is a branch of the tree). CodeTrees is the name of the tool implementing the metaphor. Several are the differences between our proposal and the approaches discussed above. The most remarkable one is that our approach enriches lightweight visualizations with metrics information and relationships among classes. 2.2 Polymetric and Graph Views Lanza and Ducasse [9] present the polymetric views. These views are lightweight visualizations enriched with software metrics. This approach presents a number of similarities with our approach, e.g., the structure of hierarchies is enriched with software metrics information. Differently, our approach is able to (i) visual represent more metrics at the same time and (ii) show instance level relationships and structure of hierarchies in the same view. In this paper, we also propose a novel static analysis based technique to recover relationships among classes. The validity of our tool and the underlying approach is assessed on a large open source software system. Reniers et al. [26] present SolidSX, an integrated tool for visualizing large software compound attributed graphs. Our approach is different because it is able to visually represent more metrics in the same view. The software engineer can apply filters to hide useless information for the problem at the hand. 3. THE APPROACH It is essential to decide which goals have to pursuit, when defining a new visualization approach for the reverse engineering of a subject system [9]. We have focused on the following goals of primary importance for our proposal: Getting an overview of the system in terms of size, complexity, and structure. Locating the most important classes and inheritance hierarchies. This allows comprehending the classes and hierarchies that represent a core part of the system. Identifying classes that exchange a huge number of messages and make strong usage of delegations. These classes could require refactoring operations to improve the quality of the subject system. Identifying exceptional classes in terms of size and/or complexity compared to all classes. These classes may be candidate for a further inspection or for the application of refactoring. Suggesting the presence of design patterns or occasions where design patterns could be introduced. The approach provides a large-scale understanding of a software system visualizing all classes together and handles class instance level relationships and structure of hierarchies. We also provide a fine-grained representation of individual classes (e.g., the name of the class and its methods).

In the following, we describe the selected metrics and the static analysis based approach we defined to get relationships among classes. The section concludes presenting the mapping proposed between the graphical properties of classes (or more generally software entities) and the selected metrics as well as the representations used for class instance level relationships and structure of hierarchies. 3.1 Metrics A reverse engineering process should not only present a list of problematic classes or subsystems [9], but its result should also facilitate the identification of valuable piece of information for the quality attributes (i.e., complexity, efficiency, re-usability, testability, and maintainability) of the subject system [27]. To this end, software metrics have been proposed [28], [29], [30], [31]. The concept of software metrics is well established in the software engineering community [32]. Recently, an increasing interest in OO metrics [33] has been showed thanks to the increasing popularity of OO analysis and design methodologies. We considered the most popular OO metrics firstly presented by Chidamber and Kemerer [34] and more traditional code-size metrics: Weighted Methods per Class (WMC). It is defined as being the number of all member functions and operators defined in each class. WMC measures the complexity of individual classes. Depth of Inheritance Tree (DIT). It measures the number of ancestors of a class. DIT primarily evaluates efficiency and re-usability. Number Of Children (NOC). It gives the number of direct descendants for a class. NOC is used to assess efficiency, re-usability, and testability. Coupling Between Object classes (CBO). It represents the number of classes to which a given class is coupled. CBO evaluates efficiency and re-usability. Response For a Class (RFC). It measures the number of methods that can be potentially executed in response to a message received by an object of that class. This metric evaluates understandability, maintainability, and testability. Flow Info (FI). It is computed on fan-in and fan-out as follows: L (fan in fan out) 2. L is a codesize metric. We used the LOC metric. Fan-in measures the number of methods called in a software component. We considered here a class as a component. Conversely, Fan-out is defined as the number of local flows out of a component. As for Fan-in, we considered a class as a component. FI is employed to measure testability. Number of Message Sends (NMS). It measures the number of messages from a method. This metric is designed to be a unbiased measure of the size of a method. The value for a class is achieved by summing the NMS values of all the methods in the class. This metric is used to evaluate re-usability and testability. Number of Methods (NM). It represents the number of method of a class and is used to determine the complexity. Lines Of Code (LOC). It indicates the number of all nonempty and non-comment lines of the class and its methods. LOC is useful to estimate maintainability. Number of Comments (NC). It represents the number of comment lines of the body of a class and its methods. NC can be used to evaluate understandability, re-usability, and maintainability. As shown in [27] and recalled above, these design metrics provide useful information on software quality attributes (e.g., complexity and maintainability). 3.2 Reverse Engineering the Subject System We highlight here the new static analysis approach proposed to identify relationships among abstract classes, interfaces, concrete classes, and inner classes. Our proposal has been inspired from the work presented by Sundaresan et al. in [35]. Several are the differences with respect to that approach. The most remarkable one is that we work on Java source code, while byte-code is used in [35]. Our choice makes our approach more flexible and easily modifiable for subject systems implemented with programming languages different from Java. We first build a call graph, where each node represents an abstract class, an interface, a concrete class, or an inner class. For simplicity reasons, we will refer to them as class or software entity. We will distinguish among abstract class, interface, concrete class, and inner class only if needed. Directed edges in the call graph are invocations from a method in a class to another method in the same class or in a different class. Special attention is needed when adding an edge representing a call to a virtual method (i.e., a method without a concrete implementation). We did that by using an approximation of run-time types of receivers. In particular, we find all the possible targets of a method call of a class by looking down the class hierarchy of the receiver. Directed edges are added from the caller to each possible target receiver. We called this kind of edges as Virtual Delegation. At run-time only a call to a target method is present. This is the difference between the type resolution at run-time and an approximation of run-time types. We observed on the system used in the case study (i.e., Eclipse 3.5) that only in a few cases the number of possible target receivers is greater than one. For example, the class FileHandle through the method handlenotification calls the virtual method exists of the interface ChainedHandle. The found target methods are those implemented in the FileHandle and LinkedResourceHandle classes, that implement the ChainedHandle interface. In the case study, similar conditions happened in 132 cases out of 682. There could be also the case that no target receivers are found using an approximation of run-time types. Then, we add a directed edge from the caller to the class containing the virtual method. This kind of directed edge is named Abstract Delegation. An example of this edge and how it can be used in the identification of a design pattern is shown in Figure 1. Directed edges in the call graph are also added on the basis of the access to public fields. Accesses to public fields are managed similarly to the invocations from a method in a class to another method either in the same class or in another class. For space limits, details are not provided.

Figure 1: Virtual Delegation and Abstract Delegation in a command design pattern in Eclipse 3.5 3.3 Graphical Representation of Software Entities and their Relationships The chosen mapping and graphical representations are inspired by well known software visualization techniques (e.g. [9],[18]) and the experience the authors gained in the program comprehension field and software visualization and development. We also tried as much as possible to find intuitive mappings to ease the comprehension of both high and low experienced software engineers. Each software entity is represented with a node, which is graphically represented with an ellipse and a rectangle placed just below. As shown in Figure 2 and Table 1, the properties of ellipse summarize OO metrics, while the properties of the rectangle traditional code-size metrics. Table 1 also shows how the kind of relationships among classes are graphically represented. The dimension of ellipse denotes WMC. The convention is that the larger the ellipse, the higher the metric value is. We used the flattening of ellipse to represent DIT. The more the flattening, the higher the metric value is. The color of the border of ellipse is used to denote NOC. The color is between light red and dark red. The higher the metric value, the darker the border is. We used the thickness of the border of ellipse to represent CBO. The larger the thickness, the higher the metric value is. To avoid loosing information about NOC, a thin border is drawn even if CBO is 0. The color of ellipse denotes RFC. The color ranges between black to white. Dark gray represents a smaller metric measurement than light gray. The color of rectangle denotes FI. Similar to RFC, the color ranges between black and white. The lower the metric value, the darker the color is. We used the thickness of the border of rectangle to represent NMS. The larger the thickness, the higher the metric value is. We used the color of the border of rectangle for NM. The color ranges between light and dark red. Light red represents a smaller metric measurement than dark red. The height of rectangle denotes LOC. The convention is that higher the metric value, the more the height of the rectangle is. The width of rectangle is used for NC. The higher the metric value, the more the width is. Relationships between classes are showed as lines: simple, arrowed, and hollow. These lines can assume different colors (see Table 1). With regard to the instance level relationships, we also considered the thickness of the lines associated. More the thickness, the larger the number of links between a source class to a target one is. Figure 2: Visual properties of nodes OO Metrics WMC (Weighted Methods per Class) Size of ellipse DIT (Depth of Inheritance Tree) Flatten the ellipse NOC (Number Of Children) Color of ellipse border CBO (Coupling Between Object classes) Thickness of ellipse border RFC (Response For a Class) Color of ellipse Traditional Code-size Metrics FI (Flow Info) Color of rectangle NMS (Number of Message Sends) Thickness of rectangle border NM (Number of Methods) Color of rectangle border LOC (Lines Of Code) Height of rectangle NC (Number of Comments) Width of rectangle Relationships Hierarchy Hollow triangle shape Delegation Blue arrowed line Field Access Cyan arrowed line Abstract Delegation Orange arrowed line Virtual Delegation Green arrowed line Method Call Loop Blue line Field Access Loop Cyan line Table 1: Metrics and visual mapping 4. METRICATTITUDE The approach has been implemented in a prototype of a supporting system intended as an Eclipse RCP 2 application. We named this prototype as MetricAttitude. It is composed of three main software components. The former component enables the selection of the subject system and visualizes it. The second component extracts the metrics, while the latter builds the graphical representation of the subject system (i.e., call graph). To improve the readability of the graphical representation of the subject system. Relationships among classes can be filtered. Figure 3 shows all the filters available for the relationships between pairs of software entities. Filters can be combined to better handle the reversers engineering task at the hand. Our tool also provides metrics based filters. Ellipses and rectangle can be shown separately or together. It is also possible to show the subject system in terms of a 2 A sample video of the execution of the approach and the prototype is available at http://www.dmi.unisa.it/ people/risi/www/metricattitude/. The prototype is also available for downloading.

Figure 3: Filters implemented for the relationships among software entities Figure 4: Graphical layouts for disposing the elements of a subject systems UML class diagram. Figure 1 shows an excerpt of the class diagram obtained from Eclipse 3.5. MetricAttitude also implements various graphical layouts for disposing the graphical elements of the subject system. The implemented layout algorithms are those shown in Figure 4. For each entity, the tool also shows: the name, the fields, and the methods. The values of all the metrics are shown as a tooltip when the mouse pointer is over the associated graphical object. The technologies used to implement MetricAttitude are: JDT (Java Development Tools) Core 3. It is included in the Eclipse SDK and offers a built-in incremental Java compiler to run and debug code which still contains unresolved errors. JDT Core also provides a model for Java source code. The model offers API for navigating elements like package fragments, compilation units, binary classes, types, methods, fields, and so on. Although JDT Core implements a number of features, our system exploits the incremental Java compiler to make independent the subject system from the libraries it uses. Our prototype also uses the Java 3 www.eclipse.org/jdt/core/ source code model to identify class instance level relationships (as shown in Section 3.2) and structure of hierarchies. JGraphT 4. It is a Java library for graph visualization. It supports drag and drop operations, selection modes, and display/editing options. The graphical appearance of the nodes can be easily modified and used. To automatically position nodes, JGraphT implements a number of layout algorithms. Examples of available layouts are: hierarchical, organic, and tree. JGraphT easily allows defining new layouts. This is was one of the reasons because we chose that library. The prototype uses that technology to visualize the subject system. Eclipse RCP 5.This technology allows developers to use the Eclipse platform to create flexible and extensible desktop applications. The Eclipse RCP applications are build upon a plug-in architecture. We used this technology because it easily allows integrating Java based technologies. The use of the Eclipse RCP technology made the prototype platform independent. 5. EVALUATION To validate MetricAttitude, we conducted a case study on the core of the framework Eclipse 3.5. It is an open source multi-language software development environment. The Eclipse framework is implemented in Java and has an extensible plug-in architecture. It has been also used in the past to assess the validity of various software visualization tools (e.g., [21]). The core of the framework has: 331 classes, 2313 methods, and 1580 attributes. The number of lines of code is 73809, while the number of line comments is 31427. Figure 5 shows Eclipse 3.5 as it appears in MetricAttitude. There is a software entity that represent the core part of the subject system. This entity is that on the top of the class hierarchy. The height and width of the rectangle associated to this entity indicates high values for LOC and NC. Further, the dimension of this rectangle is greater than the other rectangles. As suggested by the thickness of the rectangle border the represented software entity is an abstract class or an interface. We inspected the source code to confirm that assumption. It is an interface, whose name is IResources. Figure 1 shows an excerpt of the class diagram of Eclipse 3.5. It represent a command design pattern of Eclipse 3.5. There is an Abstract Delegation from the class BuildManager to the ICommand interface. MetricAttitude adds a Virtual Delegations to the concrete class BuildClass. This result gives credit to the fact that our approach could suggest the presence of design patterns or occasions where design patterns could be introduced. The use of MetricAttitude on Eclipse 3.5 leads to the following considerations: Scalability. The tool scales up well in terms of the size of the software systems to be visualized. On a very large software systems (i.e., Eclipse) the tool did not slow down. The good scalability is due to the use of both JDT Core for the reverse engineering of Eclipse and JGraphT for its visualization. 4 www.jgrapht.org/ 5 www.eclipse.org/rcp/

Figure 5: The framework Eclipse 3.5 in MetricAttitude with the Tree Layout Navigation and Interactivity. The tool enables the visualization of the name of each class through a label associated to the corresponding node. For each node, the tool also shows attributes and methods. Furthermore, double-clicking on the node the source code can be inspected or modified. The modified source code is automatically compiled and the graphical representation of the subject system is modified according to the modifications performed. This is possible due to the use of JDT Core. Several layouts and filtering operations are also available to improve interactivity. The tool provides moving and selection modes as well as display/editing options. Completeness. A number of features for analyzing single nodes are available. For example, the tool visualizes the values of all metrics, the kind of node (abstract class, interface, concrete class, and inner class), the fields, and the signatures. The approach and MetricAttitude also provide a fair amount of information for an overview of the subject system. In fact, nodes can be graphically compared each others to get indications about the relevance of abstract class, interface, concrete class, and inner class. 5.1 Identification of Fault-Prone Classes The goal of this part of the evaluation is to analyze whether MetricAttitude provides support to the identification of faultprone classes. The perspective of the study is both from the point of view of the researcher, evaluating whether the visualization approach is able to identify fault-prone classes, and from the point of view of the project manager, who wants to evaluate the possibility of adopting the proposed approach within his/her own organization. Mappings between 43 bug reports and the changes that implemented the fixes were extracted from their repository and bug tracking system (available at bugs.eclipse.org). Past changes in Eclipse provide us with actual changes in the code done (i.e., classes and methods) in response to a bug. The total number of (non unique) methods changed for the 43 bugs was 144. Changed methods for a bug could be in more than one class. In particular, the classes involved in the change methods were 54. Based on our choices and assumptions, the evaluation consists of a case study where our tool and the underlying approach are used to ease the localization of classes associated with past changes. We have studied the classes that include these changes looking at the metrics empirical validated in [28], [29], [30], and [31]. The results of these studies are summarized in Table 2. The symbol + denotes that the effect of the metric was significant in the prediction of faultproneness class in direct way, while in inverse way. The symbol ++ means that the metric was a more useful predictor. The symbol indicates that the metric was more useful in the built model, but in inverse way. The symbol 0 denotes that the effect of the metric was not significant. A blank entry means that the effect of the metric was not investigated. As shown in Table 2, only a subset of the metrics used in our approach has been empirical validated on commercial and open source software systems to determine whether they can be used as fault predictors. To identify the classes involved in the change methods, we implemented a new feature in MetricAttitude. This feature highlights these classes in yellow. Figure 6 shows the framework Eclipse 3.5 (by applying the radial layout) and its

Metric [28] [29] [30] [31] WMC (Weighted Methods per Class) + + ++ ++ DIT (Depth of Inheritance Tree) ++ + 0 NOC (Number Of Children) ++ 0 CBO (Coupling Between Object) + ++ + + RFC (Response For a Class) ++ ++ + LOC (Lines Of Code) ++ + Table 2: Fault predictor metrics Figure 6: A view of the Framework Eclipse 3.5 to identify fault-prone classes changed classes that implemented the fixes extracted from the public repository used. The visualization of the subject system indicates that the entities with the larger graphical objects (both considering the dimensions of ellipses and rectangles) are highlighted in yellow, so suggesting that these entities are fault-prone classes. This result gives strength to the validity of our approach and further confirming the metrics as good fault predictors. Due to the preliminary nature of this study further investigations are needed. 5.2 Threats to Validity To comprehend the strengths and limitations of the empirical evaluation of our approach, the threats that could affect the validity of the results and their generalization are presented and discussed here. Despite our efforts to mitigate as many threats to validity as possible, some are unavoidable. The second part of our evaluation is based on reenactment with user actions being simulated. Using actual developers could result to different results. We plan to conduct studies with developers in the future. The software systems used in the study represent another threat. It was chose primarily because of the availability of data and previous studies. We cannot estimate the effect of attributes, such as, programming language, domain, and so on. With regard to the system studied, other possible threats are related to the fact that it is open source. This kind of software is usually developed by volunteers and the development process used is different from those applied in the industry. 6. CONCLUSION AND FUTURE WORK Software visualization allows studying multiple aspects of complex problems like in reverse engineering, when comprehension tasks are performed. The available techniques and approaches are often too simplistic and lack visual cues to correctly interpret software systems [9]. In other cases, visualizations are too complex to be useful in practice. In this paper, we have presented a visualization approach based on static analysis that easily shows complex information in OO software. Our proposal provides a mental picture by viewing a software through several views based on OO metrics and traditional code-size metrics (e.g, LOC) and relationships among classes (e.g., delegation and inheritance). To ease the comprehension of a subject system, we have chosen an intuitive mappings between the visual representation of classes and the ten metrics selected. We implemented the approach in MetricAttitude. The current implementation of this tool only supports software written Java. However, the prototype can be extended to software written in different programming languages. For example, the use of ADT and CDT with respect to JDT will make our prototype suitable for ADA and C/C++, respectively. To assess the validity of the approach and the supporting tool, we have conducted a preliminary empirical investigation as a case study. This investigation was performed in two steps on the framework Eclipse 3.5. In the first step, we assessed the supporting tool with respect to: scalability, navigation and interactivity, and completeness. In the second step, we verified whether our approach is useful to identify fault-prone classes. To this end, data associated with changes in response to bug reports were used. For both the steps of the evaluation, we achieved promising results. The main future direction for our research consists in conducting users studies to evaluate the effectiveness of our proposal in the execution of maintenance or software comprehension tasks. These studies will also enable us to identify possible concerns in the actual use of the tool. Finally, future work will be devoted to replicate the study presented here with commercial software systems. 7. REFERENCES [1] G. Canfora and M. Di Penta. New frontiers of reverse engineering. In Future of Software Engineering, FOSE, pages 326 341. IEEE Computer Society, 2007. [2] E. J. Chikofsky and J. H. Cross II. Reverse engineering and design recovery: A taxonomy. IEEE Software, 7:13 17, 1990. [3] J. Stasko, J. Domingue, B. Price, and M. Brown. Software visualization: programming as a multimedia experience. 1998. [4] S. Diehl, editor. Software Visualization, International Seminar Dagstuhl Castle, Revised Lectures, volume 2269 of Lecture Notes in Computer Science. Springer, 2002. [5] R. Wettel and M. Lanza. Program comprehension through software habitability. In In Proc. of the International Conference on Program Comprehension, ICPC, pages 231 240, 2007. [6] S. Ducasse and M. Lanza. The class blueprint: Visually supporting the understanding of classes. IEEE Trans. Software Eng., 31(1):75 90, 2005.

[7] A. Marcus, L. Feng, and Maletic J. I. Comprehension of software analysis data using 3d visualization. In International Conference/Workshop on Program Comprehension, IWPC, pages 105 114, 2003. [8] U. Erra and G. Scanniello. Towards the visualization of software systems as 3d forests: the codetrees environment. ACM Symposium On Applied Computing, page to appear, 2012. [9] M. Lanza and S. Ducasse. Polymetric views - a lightweight visual approach to reverse engineering. IEEE Trans. on Software Engineering, 29:782 795, 2003. [10] R. J. Walker, G. C. Murphy, B. Freeman-Benson, D. Wright, D. Swanson, and J. Isaak. Visualizing dynamic software system information through high-level models. In In Proc. of the 13th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, OOPSLA, pages 271 283. ACM, 1998. [11] M. Salah, S. Mancoridis, G. Antoniol, and M. Di Penta. Scenario-driven dynamic analysis for comprehending large software systems. In European Conference on Software Maintenance and Reengineering, CSMR, pages 71 80. IEEE Computer Society, 2006. [12] OMG. Unified modeling language (UML) specification, version 2.0. Technical report, Object Management Group, July 2005. [13] C. Knight and M. Munro. Virtual but visible software. In In Proc. of the IEEE International Conference on Information Visualization, pages 198 205, 2000. [14] S. M. Charters, C. Knight, N. Thomas, and M. Munro. Visualisation for informed decision making; from code to components. In In Proc. of the 14th International Conference on Software Engineering and Knowledge Engineering, SEKE, pages 765 772. ACM, 2002. [15] T. Panas, R. Berrigan, and J. Grundy. A 3d metaphor for software production visualization. In In Proc. of the 7th International Conference on Information Visualization, pages 314 319. IEEE Computer Society, 2003. [16] B. Kot, B. Wuensche, J. Grundy, and J. Hosking. Information visualisation utilising 3d computer game engines case study: a source code comprehension tool. In In Proc. of the 6th ACM SIGCHI New Zealand chapter s international conference on Computer-human interaction: making CHI natural, CHINZ, pages 53 60. ACM, 2005. [17] A. R. Teyseyre and M. R. Campo. An overview of 3d software visualization. IEEE Trans. on Visualization and Computer Graphics, 15:87 105, January 2009. [18] M. Lanza. Codecrawler - polymetric views in action. In In Proc. of the IEEE International Conference on Automated Software Engineering, ASE, pages 394 395. IEEE Computer Society, 2004. [19] P. Martin, G. Harald, and F. Michael. Towards an integrated view on architecture and its evolution. Electr. Notes Theor. Comput. Sci., 127(3):183 196, 2005. [20] B. A. Price, R. M. Baecker, and I. S. Small. A principled taxonomy of software visualization. Journal of Visual Lang. and Computing, 4(3):211 266, 1993. [21] R. Wettel and M. Lanza. Codecity: 3d visualization of large-scale software. In Companion of the 30th international conference on Software engineering, ICSE Companion, pages 921 922. ACM, 2008. [22] R. Wettel, M. Lanza, and R. Robbes. Software Systems as Cities: A Controlled Experiment. In In Proc. of International Conference on Software Engineeering), ICSE, page to be published, 2011. [23] H. Graham, H. Y. Yang, and R. Berrigan. A solar system metaphor for 3d visualisation of object oriented software metrics. In In Proc. of the Australasian symposium on Information Visualisation, APVis, pages 53 59. Australian Computer Society, Inc., 2004. [24] A. Martínez, J. D. Cosín, and C. P. García. A landscape metaphor for visualization of software projects. In In Proc. of the 4th ACM symposium on Software visualization, SoftVis, pages 197 198, New York, NY, USA, 2008. ACM. [25] A. Ghandar, A. S. M. Sajeev, and X. Huang. Pattern puzzle: a metaphor for visualizing software complexity measures. In In Proc. of the Asia-Pacific Symposium on Information Visualisation, APVis, pages 221 224. Australian Computer Society, Inc., 2006. [26] D. Reniers, L. Voinea, and A. Telea. Visual exploration of program structure, dependencies and metrics with solidsx. In In Proc. of IEEE Vissoft, pages 1 4. IEEE, 2011. [27] L. H. Rosenberg and L. E. Hyatt. Software Quality Metrics for Object-Oriented Environments, 1997. [28] V. R. Basili, L. C. Briand, and W. L. Melo. A validation of object-oriented design metrics as quality indicators. IEEE Trans. on Software Engineering, 22:751 761, October 1996. [29] T. Gyimothy, R. Ferenc, and I. Siket. Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans. on Software Engineering, 31:897 910, 2005. [30] P. Yu, T. Systä, and H. A. Müller. Predicting fault-proneness using oo metrics: An industrial case study. In In Proc. of the 6th European Conference on Software Maintenance and Reengineering, CSMR, pages 99 107, Washington, DC, USA, 2002. IEEE Computer Society. [31] R. Subramanyam and M. S. Krishnan. Empirical analysis of ck metrics for object-oriented design complexity: Implications for software defects. IEEE Trans. on Software Engineering, 29:297 310, April 2003. [32] T. DeMarco. Controlling Software Projects: Management, Measurement and Estimation. Yourdon, New York, NY, 1982. [33] M. Lanza and R. Marinescu. Object-Oriented Metrics in Practice. Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-oriented Systems. Springer Verlag, 2010. [34] S. R. Chidamber and C. F. Kemerer. A metrics suite for object oriented design. IEEE Trans. on Software Engineering, 20:476 493, 1994. [35] V. Sundaresan, L. Hendren, C. Razafimahefa, R. Vallée-Rai, P. Lam, E. Gagnon, and C. Godin. Practical virtual method call resolution for java. SIGPLAN Not., 35:264 280, 2000.