Modernized and Maintainable Code
Frank Weil, Ph.D.
UniqueSoft, LLC
UniqueSoft is a provider of next-generation software development tools and services specializing in modernizing legacy software using highly automated tools and techniques to achieve superior results. UniqueSoft helps its clients through the various phases of a major transformation of their legacy systems, including analyzing the full system, formulating a low-risk and high-value modernization strategy, and executing on the elements of that strategy. In this white paper, we describe the UniqueSoft process and tool for reengineering client software systems to create a modern, easy-to-maintain, and high-quality system in a cost-effective manner.

1. Overview
2. Objectives of Reengineering Projects
3. Code Maintainability
4. UniqueSoft Legacy Modernization Capabilities
5. Summary
1. Overview

UniqueSoft is a provider of next-generation software development tools and services specializing in modernizing legacy software using highly automated tools and techniques to achieve superior results. UniqueSoft helps its clients through the various phases of a major transformation of their legacy systems, including analyzing the full system, formulating a low-risk and high-value modernization strategy, and executing on the elements of that strategy. The modernized code is maintainable and understandable.

Legacy software has almost always built up a considerable amount of technical debt. Change is costly and risky due to poorly understood features and dependencies, implementation details of legacy platforms interwoven with business logic, defects that are difficult to find and fix, features left in the code that are no longer needed, obsolete documentation and test cases, and lost knowledge of architecture and design decisions. In short, legacy code is rarely in a state that makes it easy to maintain.

When code is migrated to a new language or platform, the resulting code should be more maintainable, not less. Unfortunately, most code translation tools make this problem worse, not better. This white paper discusses how the UniqueSoft reengineering process and tool make it possible to modernize applications by migrating them and making them maintainable and understandable.

2. Objectives of Reengineering Projects

While each reengineering project is unique, the objectives of the projects can typically be categorized along three dimensions: migration, maintainability, and understanding. These objectives are independent of whether the reengineering process is manual or automated, and they are largely independent of each other. Examples of projects related to the objectives are summarized in Table 1.
Table 1: Reengineering Project Objectives

Objective | Project Goal | Example Projects
Migration | Platform migration | Client-server to cloud; mainframe to Linux; Struts to Spring
Migration | Database migration | MySQL or Access to Oracle
Migration | Language translation | COBOL to C#; C to Java
Maintainability | Design improvement | Modularize application; remove anti-patterns
Maintainability | Code improvement | Remove dead / replicated code; apply coding standards
Understanding | Business rule extraction | Document transaction processing rules from COBOL systems
Understanding | System discovery | Uncover features, architecture, and dependencies in the legacy code base

The overall objective of migration is to move the behavior of the software from one enabling technology to another. For example, the legacy system may have a client-server architecture and the goal is to make it cloud enabled; it may currently run on a mainframe and the goal is to move it to a Linux-based server; it may have been built originally with a MySQL database and the goal is to move it to a high-performance Oracle implementation; it may be written in an older language such as COBOL and the goal is to move it to a more modern language such as Java; or it may be written in a procedural language and the goal is to move it to a high-performance rules engine.
The common theme is that the objective of these projects is not to change the application behavior, but rather to change how the application is written and deployed.

The overall objective of maintainability is to improve some aspect of the software. Typical improvements are to restructure the code to reduce its complexity, improve its modularity (e.g., group the code implementing a feature into a single module), reduce the coupling between modules, minimize replicated code, and remove dead code. Other maintainability improvements relate to discovering and eliminating anti-patterns in the code, such as decision statements in which some values are not covered by branches, calls to external library functions whose return values are not checked, or use of unsafe functions such as the C strcat routine.

The overall objective of understanding involves uncovering information in the legacy system. Two kinds of information can be extracted: surface and deep.

Surface information about legacy software is what can be extracted purely from the code itself. That is, it is independent of what the code is supposed to do or why. Examples of surface information include the number of lines of code, the number of files, the directory structure, complexity measures, the calls made to external libraries, and the databases accessed.

Deep information about legacy software relates to what the code does in the terminology of the stakeholders, how it is architected, and the various dependencies between the modules of the system. For example, one may want to understand what features are implemented in a system, where in the code those features are implemented, what business rules are realized, and how those business rules relate to and depend on each other.

3. Code Maintainability

As mentioned above, code should not be migrated at the expense of its maintainability.
At the completion of the modernization process, the code should be easier to maintain, not harder. The IEEE Standard Glossary of Software Engineering Terminology gives a simple definition of maintainability:

"The ease with which a software system or component can be modified to correct faults, improve performance or other attributes, or adapt to a changed environment."

Code that is easy to maintain is readable, is documented, follows coding standards and best practices for the implementation language, is well structured and modular, and is testable (and tested). Each of these qualities is refined in the following tables, which list contributing factors to maintainability, why the factors are important, and examples of characteristics they should have.

Table 2: Code Readability

Factor | Importance | Characteristics
Consistent and informative naming | Program intent is clearer | User-directed, customized naming
Good comments | Program intent is clear; design choices are documented | Relevant comments are preserved and collated
Native language paradigms | Efficiency, understandability, correctness, code size | Translation of program intent, not a template-based, line-by-line translation
Native environment use (direct instantiation to new environment, no extra runtime libraries) | Efficiency, understandability, correctness, code size | Direct instantiation to the new environment, no extra runtime libraries
Table 3: Documented Code

Factor | Importance | Characteristics
Updated requirements | Understand the overall behavior of the system; form the basis of new development; allow incorrect/inconsistent/outdated behavior to be identified | ITU standard Use Case Maps (UCMs) capturing full behavioral requirements of existing code
Updated architecture diagram | Understand the current system architecture; identify elements to be migrated | UML Composite Structure (Architecture) diagrams from the source code
What features exist | Understand what features are currently implemented in the system from a business perspective; allow identification of unneeded features; form the basis of new documentation | Hierarchical feature model in business terminology
Mapping of features to code | Understand where to make changes | Full interactive mapping of features and code
File/module/data dependencies | Understand impact of changes | Full interactive Design Structure Matrices (DSMs)
Control flow diagrams | Understand program structure and sequencing | Full interactive control flow and data flow diagrams; Message Sequence Charts (MSCs)
Call hierarchy diagram | Understand call stack and program sequencing | Full interactive call hierarchy diagrams
Resource usage diagrams | Understand external resource usage | Flow diagrams based on external database tables; interactive resource usage diagrams for different types of resources

Table 4: Well-Structured and Modular Code

Factor | Importance | Characteristics
Low feature diffusion | Localized changes; ease of feature enhancements | Reduce feature diffusion
Low coupling | Lower impact of changes; reduced defects due to unintended consequences of changes; lower cost of migration | Reduce coupling
High cohesion | Lower impact of changes; lower cost of migration | Increase cohesion
Reasonable file/module size | Faster compile times; faster editing; ease of distributing maintenance work | Reduce file/module sizes
Lack of replicated code | Less code to maintain; localized changes; ease of understanding; faster compile times; improved testability | Identify and remove replicated code, including parameterized code
Lack of dead code | Ease of understanding; faster compile times; lower maintenance times; improved quality | Identify and remove dead code
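To make the "identify and remove dead code" entry in Table 4 concrete, the following is a minimal sketch in Python. It is not the UniqueSoft tool or its method. The function name find_unreferenced_functions and the SAMPLE source are hypothetical, and the heuristic is an assumption chosen for brevity: a function is flagged when its name never appears as an identifier or attribute anywhere in the same file. A real analysis would also need to account for cross-file references, dynamic dispatch, and functions that reference only themselves.

```python
import ast


def find_unreferenced_functions(source):
    """Return the sorted names of functions defined in `source`
    whose names are never referenced within that same source text."""
    tree = ast.parse(source)

    # Every function (sync or async) defined anywhere in the file.
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    # Every identifier or attribute name that is used anywhere in the file.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return sorted(defined - referenced)


# Hypothetical input used only for illustration.
SAMPLE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

if __name__ == "__main__":
    print(find_unreferenced_functions(SAMPLE))  # prints ['unused']
```

The sketch only reports candidates; deciding whether a flagged function is truly dead (for example, an entry point called from another system) remains a review step.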
Table 5: Coding Standards and Best Practices

Factor | Importance | Characteristics
General language standards | Consistency of code; improved quality; portability | Consistent rule-based application of coding standards
Company-specific programming standards | Consistency of code; conformance to external constraints | Customizable, consistent rule-based application of coding standards
Absence of anti-patterns | Improved quality; improved performance; improved maintainability | Customizable identification and automated elimination of anti-patterns

Table 6: Testable and Tested Code

Factor | Importance | Characteristics
Low complexity | Improved testability; improved quality; ease of maintenance | Identify code that has high complexity and refactor it to reduce the complexity
Comprehensive tests | Improved quality; confidence in migration efforts | Create comprehensive tests from the behavioral requirements (UCMs)
Regression testing | Confidence that changes do not have unintended consequences | Create comprehensive tests from the code itself; test harness for executing tests

4. UniqueSoft Legacy Modernization Capabilities

The UniqueSoft reengineering tool enables the creation of maintainable, modernized code by providing capabilities beyond those typically available in commercial off-the-shelf (COTS) tools. By using a high degree of automation to apply advanced techniques throughout the legacy modernization process, the UniqueSoft reengineering process can modernize a legacy code base while still providing the maintainability factors described above.

The phases of the UniqueSoft reengineering process (Discovery, Transformation, and Instantiation) allow the analyst to understand the structure and intent of the system (features, architecture, design, dependencies, metrics), make selected changes to the system to make it better (more maintainable, more understandable, improved testability), and migrate the code to make native use of the target environment (implementation language, deployment frameworks, customer-specific libraries).
A partial list of the interactive and GUI-driven capabilities of the UniqueSoft reengineering tool is given in Table 7. An in-depth discussion of this tool is provided in other white papers.
Table 7: UniqueSoft Modernization Capabilities

Phase | Capability
Discovery | Size, complexity, and modularity metrics
Discovery | File and directory structure
Discovery | Database dependencies
Discovery | Library dependencies
Discovery | Dependencies between files and modules
Discovery | Control flows
Discovery | Understand what features exist and where they are implemented
Discovery | Uncover system architecture
Discovery | Interactive system exploration
Discovery | Uncover business rules in user terminology
Discovery | Extract requirements
Transformation | Reduce file sizes
Transformation | Reduce duplicated code
Transformation | Eliminate dead code
Transformation | Reduce feature diffusion
Transformation | Selectively apply transformations
Transformation | Examine and be able to roll back individual changes
Transformation | Examine effect on metrics of individual changes
Transformation | Rename identifiers for readability
Transformation | Create test cases
Instantiation | Translate the code to new languages
Instantiation | Custom coding standards
Instantiation | Feature-driven development
Instantiation | Native support for the chosen deployment environment
Instantiation | Native language constructs

5. Summary

The UniqueSoft reengineering process and tool make it possible to transform a legacy software system into a modern, understandable, and maintainable asset. Advanced automation and a powerful transformation system deliver the advantages of a manual rewrite at a fraction of the cost while avoiding the pitfalls of one-size-fits-all translation tools. Reengineering can be applied across a wide variety of domains to accomplish project goals, and these techniques and the tool readily scale to projects comprising many millions of lines of legacy code.

UniqueSoft, LLC
Corporate Headquarters
1530 E. Dundee Road, Suite 100
Palatine, IL 60074 USA
+1 847-963-1777
info@uniquesoft.com