Detecting Defects in Object-Oriented Designs: Using Reading Techniques to Increase Software Quality

Current Research Team:
  Prof. Victor R. Basili
  Forrest Shull, Ph.D.
  Guilherme H. Travassos, D.Sc. (1)
  Jeffrey Carver, Graduate Research Assistant

{basili,travassos,carver}@cs.umd.edu
fshull@fc-md.umd.edu

Department of Computer Science
Experimental Software Engineering Group
Fraunhofer Center - Maryland

This work is partially supported by UMIACS and by NSF grant CCR9706151
(1) On leave from the Federal University of Rio de Janeiro - /Computer Science and System Engineering Department, partially supported by CAPES - Brazil

Evolving a Process for Inspecting OO Designs

Outline:
  Reading Techniques
  Families of Reading Techniques
  Lessons Learned from the Initial Study
  The Road Ahead
  Observational Studies
  Lessons Learned from the Observational Studies
  Open Questions
Reading Techniques

Why read?
  Software practitioners are taught how to write, but typically not how to read, software documents.
  Many software processes assume that practitioners know how to read such documents, i.e. how to find the information they need effectively.
    e.g. inspection process: read a document to find defects

Reading Techniques

For what?
  To deal with complete, consistent, unambiguous, and correct documents across the software process.

[Figure: "Increasing Quality!" Elements shown: Domain Knowledge, General Requirements, Software Artifacts, Other Domain. Defect labels shown: omission, inconsistency, incorrect fact, extraneous, ambiguity.]

Software Artifacts:
  Requirements Documents, Use-cases, Scenario descriptions
  Design Diagrams
  Source Code
Reading Techniques

Why read?
  Reading serves: understanding, reusing, analyzing, constructing, maintaining, testing, reasoning

Looking for defects:

  Defect                    General Description
  Omission                  Necessary information about the system has been omitted from the software artifact.
  Incorrect Fact            Some information in the software artifact contradicts information in the requirements document or the general domain knowledge.
  Inconsistency             Information within one part of the software artifact is inconsistent with other information in the software artifact.
  Ambiguity                 Information within the software artifact is ambiguous, i.e. any of a number of interpretations may be derived that should not be the prerogative of the developer doing the implementation.
  Extraneous Information    Information is provided that is not needed or used.

Reading Techniques

Software reading techniques attempt to increase the effectiveness of inspections by providing procedural guidelines that individual reviewers can use to examine (or "read") a given software artifact and identify defects.

There is empirical evidence that software reading is a promising technique for increasing the effectiveness of inspections on different types of software artifacts, not just source code.
Families of Reading Techniques

[Figure: taxonomy of Reading Technology, general form]
  PROBLEM SPACE
    General Goal: Construction, Analysis
    Specific Goal: Maintenance, Reuse, Traceability, Defect Detection, Usability
    Document (artifact): Requirements, Design, Code, Test Plan, Design, Requirements, Code, User Interface
    Notation Form: Use-Cases, Project Source Code, White Box Framework, Black Box Framework, Code Library, SCR, English, Screen Shot
  SOLUTION SPACE
    Family: Scope-based, Defect-based, Perspective-based, Usability-based
    Technique: System Wide, Task Oriented, Omission, Inconsistent, Incorrect, Ambiguity, Expert, Novice, Error, Developer, Tester, User

[Figure: taxonomy of Reading Technology, extended with OO design reading]
  PROBLEM SPACE
    General Goal: Analysis
    Specific Goal: Defect Detection, Usability
    Document (artifact): Design, Requirements, Code, User Interface
    Notation Form: OO Diagrams, SCR, English, Screen Shot
  SOLUTION SPACE
    Family: Traceability, Defect-based, Perspective-based, Usability-based
    Technique: Horizontal, Vertical, Omission, Ambiguity, Incorrect, Inconsistent, Expert, Novice, Error, Developer, Tester, User
Goal: To develop a family of reading techniques that can be used to read OO high-level design artifacts against
  1. themselves, and
  2. requirements descriptions and use-cases
within a domain, in order to identify defects among them.

Abstractions of Information: tracing the semantic consistency of pairs of documents
Uses of Information: detect defects
Procedures: for detecting defects in design diagrams

UML Artifacts:

[Figure: sequence diagram "Receive Monthly Report"]
  Objects: A Lender : Specified Lender; Fanny May : Loan Arranger; Loan : Loan; Borrower : Borrower
  Messages: receive a monthly report; monthly_report(lender, loans, borrowers); verify_report(); identify_report_format(); look_for_a_lender(lender); update(lender); look_for_a_loan(loan); new_lender(name,contact,phone_number); update_loan(lender, borrower); new_loan(lender, borrowers)

[Figure: Loan state diagram]
  States: Good, Late, Default
  Transitions: monthly report informing payment on time [ payment time <= due time ]; monthly report informing late payment [ payment time > due time + 10 ]

[Figure: use-case diagram]
  Labels: Specified Lender, Receive Reports, Investor, Investment Request, Generate Reports, Fanny May, Loan Analyst

[Figure: class diagram]
  Monthly Report
  Lender: name : text; id : text; contact : text; phone_number : number
  Bundle: active time period : date; profit : number; estimated risk : number; total : number; loan analyst : id_number; discount_rate : number; investor_name : text; date_sold : date
  Borrower: name : text; id : number; risk : number; status : text; risk(); set_status_good(); set_status_late(); set_status_default(); borrower_status(); set_status()
  Loan: amount : number; interest rate : number; settlement data : date; term : date; status : text; original_value : number; principal_original : number; risk(); set_status_default(); set_status_late(); set_status_good(); discount_rate(); borrowers(); principal_remaining()
  Further operations shown: risk(); calculate_profit(); cost()
  Multiplicities shown: 1.*, 1..*, 1, 0..1
  Loan Arranger
[Figure: class diagram, continued]
  Operations listed under Loan Arranger: rec_monthly_report(); inv_request(); generate reports(); identify_report_format(); verify_report(); look_for_a_lender(); look_for_a_loan(); identify_loan_by_criteria(); manually_select_loans(); optimize_bundle(); calculate_new_bundle(); identify_asked_report(); aggregate_bundles(); aggregate_loans(); aggregate_borrowers()
  Further operations shown: aggregate_lenders(); format_report(); show_report()
  Fixed_Rate Loan: risk(); principal_remaining()
  Variable_Rate Loan: principal_remaining : number; risk(); principal_remaing()

Loan-Arranger Requirements Specification
Jan. 8, 1999

Background

Banks generate income in many ways, often by borrowing money from their depositors at a low interest rate, and then lending that same money at a higher interest rate in the form of bank loans. However, property loans, such as mortgages, typically have terms of 15, 25 or even 30 years. For example, suppose that you purchase a $150,000 house with a $50,000 down payment and borrow a $100,000 mortgage from National Bank for thirty years at 5% interest. That means that National Bank gives you $100,000 to pay the balance on your house, and you pay National Bank back at a rate of 5% per year over a period of thirty years. You must pay back both principal and interest. That is, the initial principal, $100,000, is paid back in 360 installments (once a month for 30 years), with interest on the unpaid balance. In this case the monthly payment is $536.82.

Although the income from interest on these loans is lucrative, the loans tie up money for a long time, preventing the banks from using their money for other transactions. Consequently, the banks often sell their loans to consolidating organizations such as Fannie Mae and Freddie Mac, taking less long-term profit in exchange for freeing the capital for use in other ways.
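The $536.82 monthly payment quoted in the Background example can be reproduced with the standard annuity formula for a fully amortizing fixed-rate loan. The sketch below is only a check of that arithmetic; the function name `monthly_payment` is hypothetical, and the use of this formula with monthly compounding is an assumption, since the requirements excerpt does not say how the figure was derived.

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed monthly payment for a fully amortizing loan.

    Assumption (not stated in the requirements excerpt): interest is
    compounded monthly at annual_rate / 12 and payments are made monthly.
    """
    n = years * 12            # number of monthly installments
    r = annual_rate / 12      # monthly interest rate
    if r == 0:
        return principal / n  # no interest: principal split evenly
    return principal * r / (1 - (1 + r) ** -n)


if __name__ == "__main__":
    # The example from the Background text: $100,000 at 5% for 30 years.
    print(f"{monthly_payment(100_000, 0.05, 30):.2f}")
```

With the values from the example (360 installments, 5% per year) the result rounds to the figure given in the text.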
Loan Arranger Classes Description

Class name: Fixed_Rate Loan
Category: Logical View
Documentation: A fixed rate loan has the same interest rate over the entire term of the mortgage
External Documents:
Export Control: Public
Cardinality: n
Hierarchy:
  Superclasses: Loan
Public Interface:
  Operations: risk, principal_remaining
State machine: No
Concurrency: Sequential
Persistence: Persistent

Operation name: risk
Public member of: Fixed_Rate Loan
Return Class: float
Documentation:
  take the average of the risks' sum of all borrowers related to this loan
  if the average risk is less than 1 round up to 1
  else if the average risk is less than 100 round up to the nearest integer
  otherwise round down to 100
Concurrency: Sequential

Loan State Diagram
[Figure: transition shown: monthly report informing late payment [ due time < payment time < due time + 10 ]]
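The documentation of the `risk` operation in the class description above is the kind of text a reader has to interpret during an inspection. One possible reading is sketched below. It assumes that "the average of the risks' sum" means the arithmetic mean of the borrowers' risk values; the function name `fixed_rate_loan_risk` and the handling of an empty borrower list are illustrative choices, not part of the sample artifact.

```python
import math


def fixed_rate_loan_risk(borrower_risks: list[float]) -> float:
    """One interpretation of the 'risk' operation of Fixed_Rate Loan.

    Assumption: the risk is the arithmetic mean of the borrowers' risks,
    clamped to the range [1, 100] and rounded up to an integer in between.
    """
    if not borrower_risks:
        # The class description does not cover a loan without borrowers.
        raise ValueError("a loan needs at least one borrower")
    average = sum(borrower_risks) / len(borrower_risks)
    if average < 1:
        return 1.0                        # round up to 1
    if average < 100:
        return float(math.ceil(average))  # round up to the nearest integer
    return 100.0                          # round down to 100


if __name__ == "__main__":
    print(fixed_rate_loan_risk([0.2, 0.4]))   # average below 1
    print(fixed_rate_loan_risk([10, 11]))     # average between 1 and 100
    print(fixed_rate_loan_risk([150, 200]))   # average of 100 or more
```

Writing the rule out this way also surfaces questions a reviewer might raise, such as why an operation that always yields a whole number declares `float` as its return class.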
Target Artifacts:
  Requirements Specification
    Requirements Descriptions
    Use-Cases
  High Level Design
    Class Diagrams
    Class Descriptions
    State Machine Diagrams
    Interaction Diagrams (Sequence)

  Vertical reading: design artifacts read against the requirements specification
  Horizontal reading: design artifacts read against each other

Looking for defects:

  Type of Defect            Description
  Omission                  One or more design diagrams that should contain some concept from the general requirements or from the requirements document do not contain a representation for that concept.
  Incorrect Fact            A design diagram contains a misrepresentation of a concept described in the general requirements or requirements document.
  Inconsistency             A representation of a concept in one design diagram disagrees with a representation of the same concept in either the same or another design diagram.
  Ambiguity                 A representation of a concept in the design is unclear, and could cause a user of the document (developer, low-level designer, etc.) to misinterpret or misunderstand the meaning of the concept.
  Extraneous Information    The design includes information that, while perhaps true, does not apply to this domain and should not be included in the design.

Table 1: Types of software defects, and their specific definitions for OO designs
Description:
  Fall/98 course - CMSC 435
  http://www.cs.umd.edu/projects/softeng/eseg/manual/tbr_package/
  Students organized in 15 teams (14 teams of 3 students and 1 team of 2 students)
  System: financial database application (a system responsible for organizing the loans held by a financial consolidating organization, and for bundling them for resale to investors)
  Problem features: small system with a low number of classes but with some design complexity due to non-functional performance requirements
  Target programming languages: C++ or Java
  Goal: to evaluate the first version of the reading techniques

Size measures for the inspected design:

  Class Name             Attrs.  WMC  DIT  NOC  CBO  State Dgm. exists? / Seq1..Seq5 Contains? (marks as listed)
  Property                  5      0    0    0    1
  Borrower                  2      2    0    0    1
  Lender                    3      1    0    0    2   yes
  Loan                      3      3    0    2    4   yes yes
  Fixed Rate Loan           0      1    1    0    4
  Adjustable Rate Loan      0      1    1    0    4
  Bundle                    5      2    0    0    2   yes yes
  Investment Request        4      1    0    0    1   yes
  Loan Arranger             0     15    0    0    4   yes yes yes yes yes
  Financial Org.            1      0    0    0    2   yes yes yes yes yes yes
  Loan Analyst              1     12    0    0    3   yes yes yes yes yes

Attrs. = number of attributes, WMC = Weighted Methods/Class, DIT = Depth of Inheritance, NOC = Number of Children, CBO = Coupling Between Objects
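Two of the size measures in the table, DIT and NOC, follow directly from the inheritance structure. The sketch below shows how they could be computed from a parent map. The map itself is an assumption inferred from the table (Loan has NOC 2; Fixed Rate Loan and Adjustable Rate Loan have DIT 1) rather than taken from the design document, single inheritance is assumed, and the function names are hypothetical.

```python
from typing import Optional

# Assumed inheritance structure: class name -> superclass (None for roots).
# Only part of the inspected design is listed here.
PARENTS: dict[str, Optional[str]] = {
    "Loan": None,
    "Fixed Rate Loan": "Loan",
    "Adjustable Rate Loan": "Loan",
    "Bundle": None,
    "Borrower": None,
}


def depth_of_inheritance(cls: str, parents: dict[str, Optional[str]]) -> int:
    """DIT: number of superclass links from cls up to its root class."""
    depth = 0
    while parents[cls] is not None:
        cls = parents[cls]
        depth += 1
    return depth


def number_of_children(cls: str, parents: dict[str, Optional[str]]) -> int:
    """NOC: number of classes that name cls as their direct superclass."""
    return sum(1 for parent in parents.values() if parent == cls)


if __name__ == "__main__":
    for name in PARENTS:
        print(name, depth_of_inheritance(name, PARENTS),
              number_of_children(name, PARENTS))
```

Under the assumed parent map, the values for Loan and its two subclasses agree with the DIT and NOC columns of the table.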
The inspection process with OO reading techniques:

  Reader 1: looking for consistency (horizontal reading)
  Reader 2: looking for consistency (horizontal reading)
  Reader 3: looking for traceability (vertical reading)

  Meet as a team to discuss a comprehensive defect list. Each reader is an expert in a different aspect.
  Final list of all defects sent to designer for repairing.

Subset of metrics collected in the feasibility study

  When Collected              Metrics
  Before the study            a) Details on subjects' amount of experience with requirements, design, and code
  After individual review     b) Time spent on review (in minutes)
                              c) Opinion of effectiveness of technique (measured by what percentage of the defects in the document they thought they had found)
                              d) How closely they followed the techniques (measured on a 3-point scale)
                              e) Number and type of defects reported
  After team meeting          f) Time spent (in minutes)
                              g) Usefulness of different perspectives (open-ended question)
  In post-hoc interviews      h) How closely they followed technique (open-ended question, to corroborate d)
                              i) Were the techniques practical, would they use some or all of them again (open-ended question)
Reading Technique for Class diagrams x Class descriptions

For each class modeled in the class diagram, do:
1) Read the class descriptions to find the description associated with the class.
   Underline with a yellow pen the portion of the class descriptions corresponding to the class.
   Verify if:
   1.1) All the attributes are described, with basic types associated
        Underline them with a blue pen
   1.2) All the behaviors and conditions are described
        Underline them with a green pen
   1.3) All the class inheritance relationships are described
        Draw a box around them with a yellow pen, if there are any
   1.4) All the class relationships (association, aggregation and composition) are described, with multiplicity indication
        Circle each multiplicity indication with a blue pen if it is correct.

First version of a horizontal reading technique. The emphasis is on syntactic checking, that is, checking that the OO notation in both documents agrees.

Some results from the experiment:
  Design defects were defined by extending an already existing defect taxonomy.
  It is important to measure the effectiveness (number of defects found):
    Subjects using vertical reading tended, on average, to report slightly more defects of omission and incorrect fact (i.e. the types of defects uncovered by comparisons against the requirements) than those using horizontal reading (6.8 versus 5.4 defects).
    Subjects using horizontal reading tended to report more defects of ambiguity and inconsistency (i.e. the types of defects uncovered by examination of the design diagrams themselves) than subjects using vertical reading (5.3 versus 2.9).
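The syntactic checks in steps 1.1 and 1.2 of the first-version horizontal technique compare two views of the same class, so they can be illustrated as a small comparison routine. The sketch below is not part of the technique as given to subjects, which was a manual pen-and-paper procedure. The `ClassView` structure, the function `compare_views`, and the sample data (a Borrower class whose description deviates from the diagram, and a Loan class without a description) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassView:
    """One class as it appears in a single document (diagram or description)."""
    attributes: dict[str, str] = field(default_factory=dict)  # name -> basic type
    operations: set[str] = field(default_factory=set)


def compare_views(diagram: dict[str, ClassView],
                  descriptions: dict[str, ClassView]) -> list[tuple[str, str]]:
    """Return (class name, finding) pairs for a reviewer to judge.

    Findings are candidate discrepancies only; deciding whether each one
    is a real defect remains the reader's job.
    """
    findings = []
    for name, in_diagram in diagram.items():
        described = descriptions.get(name)
        if described is None:
            findings.append((name, "no class description found"))
            continue
        # Step 1.1: every attribute is described and has a basic type.
        for attr, attr_type in in_diagram.attributes.items():
            if attr not in described.attributes:
                findings.append((name, f"attribute '{attr}' is not described"))
            elif described.attributes[attr] != attr_type:
                findings.append((name, f"attribute '{attr}': type '{attr_type}' in diagram, "
                                       f"'{described.attributes[attr]}' in description"))
        # Step 1.2: every behavior is described.
        for op in sorted(in_diagram.operations - described.operations):
            findings.append((name, f"operation '{op}' is not described"))
    return findings


# Hypothetical sample data for illustration.
diagram = {
    "Borrower": ClassView({"name": "text", "id": "number", "risk": "number",
                           "status": "text"}, {"risk", "set_status"}),
    "Loan": ClassView(),
}
descriptions = {
    "Borrower": ClassView({"name": "text", "id": "text", "risk": "number"}, {"risk"}),
}

if __name__ == "__main__":
    for class_name, finding in compare_views(diagram, descriptions):
        print(f"{class_name}: {finding}")
```

A routine like this only covers the notation-level agreement that the first version emphasizes; it says nothing about whether the described attributes and behaviors make sense for the class.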
Some results from the experiment:
  Developers agreed that:
    using heuristics to construct OO artifacts is good
    using some kind of OO reading technique is worthwhile
  Some developers said that they would like to use the same techniques again, but that the mechanisms used to instrument them should be improved.
  The study allowed us to identify weaknesses in the first version of the techniques, which have led to a second version.
  It was possible to demonstrate that such reading techniques can be used as part of design inspections, and do help reviewers detect defects.

The Road Ahead:
1. Improving the first version of the reading techniques using the subjects' feedback by:
     inserting semantic concerns into the reading techniques to improve the detection of defects (incorrect fact, ambiguity and extraneous information)
     inserting construction concerns into the reading techniques to allow developers to argue about design reasoning
Reading Technique for Class diagrams x Class descriptions

2) Read the class descriptions to find the description associated with each class on the class diagram.
   Take a look at the class description. Can this description be used to describe the class that you are considering at this time? Is it using an adequate abstraction level?
   Box the class name in the class description with a blue pen and write in the same number given to the class on the class diagram.
   Mark found classes with a blue symbol (*) on the class diagram.
   Make good use of this time and verify if:
     All the attributes are described along with basic types
       Can this class encapsulate all these attributes?
       Are the attributes associated with feasible/possible basic types?
       Does it make sense to have these attributes in the class description?
       Is any attribute missing from one of the two documents, or does any seem extraneous?
       Write down missing and extraneous attributes with a yellow pen
     All the behaviors and constraints are described
       Can this class encapsulate all these behaviors?
       Is any behavior missing from one of the two documents, or does any seem extraneous?
       Are the behavior descriptions using the same abstraction level?
       Do the behaviors use the class attributes to accomplish the procedure?
       Are the constraints reachable using the attributes and behaviors of the class?
       Do the constraints make sense for this class?
       Do the constraints depend on some class attributes? Were they described?
       Write down missing/extraneous behaviors with a green pen
       Write down missing/extraneous constraints with a yellow pen

The Road Ahead:
2. Assessment issues:
     Better metrics for assessing inspection effectiveness, by looking for a way to take account of false positives (i.e. items that are reported by reviewers but which should really not be considered defects)
     Developing qualitative as well as quantitative means of assessment. Qualitative methods would help us understand the process as well as the results.
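As an illustration of what accounting for false positives could look like, the sketch below separates a reviewer's reported items into confirmed defects and false positives and derives two ratios. This is one possible bookkeeping scheme, not a metric proposed in these slides; the function name `inspection_summary`, the ratio names, and the sample identifiers are all hypothetical.

```python
def inspection_summary(reported: set[str], actual_defects: set[str]) -> dict[str, float]:
    """Summarize one reviewer's report against a set of confirmed defects.

    Assumption: some authority (e.g. the designer or an expert) has
    decided which reported items are real defects.
    """
    true_found = reported & actual_defects       # reported and confirmed
    false_positives = reported - actual_defects  # reported but not defects
    return {
        "true_defects_found": len(true_found),
        "false_positives": len(false_positives),
        # Share of the confirmed defects that the reviewer found.
        "detection_rate": len(true_found) / len(actual_defects) if actual_defects else 0.0,
        # Share of the reviewer's reports that were real defects.
        "report_precision": len(true_found) / len(reported) if reported else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical data: three reported items, four confirmed defects.
    summary = inspection_summary({"D1", "D2", "X1"}, {"D1", "D2", "D3", "D4"})
    for key, value in summary.items():
        print(key, value)
```

Counting only the number of items reported would credit this reviewer with three defects, while the summary makes visible that one of the three was a false positive.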
The Road Ahead

Observational Study: we use the term "observational" to define a setting in which an experimental subject performs some task while the experimenter gathers data about what exactly the subject does.

Purpose: to collect data about how the particular task is accomplished
  Observational data
    Time taken
    Problems encountered
  Inquisitive data
    Influence of prior knowledge

Lessons Learned from the Observational Studies

  Observational data
    Time taken: the time for applying the techniques is influenced by the prior experience of the reader, the complexity of the information, the order of reading steps, and the levels of abstraction
    Problems encountered: organizational and semantic issues, format of software artifacts
  Inquisitive data
    Influence of prior knowledge:
      domain knowledge seemed to be necessary to support horizontal reading
      basic OO knowledge seemed to be necessary
The Road Ahead:
3. Improving the defect taxonomy
     making use of expert opinion (where experts include skilled practitioners and researchers) to understand whether the taxonomy captures the important types of defects in OO designs
4. Empirically validating the reading techniques
     doing experiments with specified projects and real development
     Experiment: http://www.cs.umd.edu/class/fall1999/cmsc735/
5. Packaging and replicating the experiments in other places
     identifying the applicability of such reading techniques in different development environments
     Technical Report: ftp://ftp.cs.umd.edu/pub/papers/papers/ncstrl.umcp/cs-TR-4070/CS-TR-4070.ps.Z

Open Questions:
1. What is the role of domain knowledge for these two sets of reading techniques?
     Horizontal x Vertical reading
2. What is the adequate level of automated support that should be provided for such techniques?
     Some steps are repetitive and mechanical for the reader
     Clerical activities need to be identified