SOFTWARE PRACTICE AND EXPERIENCE, VOL. 23(10), 1157-1174 (OCTOBER 1993)

Static Analysis of Exception Handling in Ada

CARL F. SCHAEFER AND GARY N. BUNDY
The MITRE Corporation, 7525 Colshire Drive, McLean, VA 22102, U.S.A.

Received 15 October 1992; Revised 30 April 1993
0038-0644/93/101157-18$14.00 © 1993 by John Wiley & Sons, Ltd.

SUMMARY

Since the signature of an Ada subprogram does not specify the set of exceptions that the subprogram can propagate, computing the set of exceptions that a subprogram may encounter is not a trivial task. This is a source of error in large Ada systems: for example, a subprogram may not be prepared to handle an exception propagated from another subprogram several layers lower in the call-tree. In a large system, the number of paths in exceptional processing is so great that it is unlikely that testing will uncover all errors in inter-procedural exception handling. Nor are compilers or code inspections likely to locate all such errors. Exception handling is an area where static analysis has a high potential payoff for systems with high reliability requirements. We discuss fundamental notions in computing exception propagation and describe an analysis tool that has proved to be effective in detecting inconsistencies in the exception-handling code of Ada applications.

KEY WORDS: Static analysis; Exception handling; Ada

INTRODUCTION

Ada has several powerful features whose goal is to facilitate the development of dependable computing systems. Among these is exception handling. Ada provides specific syntax for detecting, signaling, and responding to extraordinary situations, typically thought of as (but not limited to) situations in which an error has occurred. Unfortunately, the addition of exception handling code often increases the complexity of a system, particularly if the exception handling code does not follow strict guidelines governing the use and handling of exceptions. Although Ada's exception mechanism can facilitate the development of clear and correct code, undisciplined use can result in exceptional control flow paths similar to those created by nonlocal, dynamically-bound gotos. Thus exception handling code can itself be an abundant source of defects [1,2].

Defects are typically detected by code inspections or by testing. However, as Cui [3] points out, the exception handling portion of a system is often the most poorly tested. Perhaps this unfortunate situation reflects the view that the most frequently occurring paths should consume the bulk of the testing budget, inappropriate as this may be for a safety-critical system. Moreover, forcing certain exceptional paths may require elaborate test set-ups (consider, for example, what is required to force the software to detect a hardware error while the software is already responding to a
different error). In any case, testing of all exceptional paths, like testing of all nominal paths, is not practical.

Inspection does have an advantage over testing: it is generally believed that the cost to fix a defect discovered earlier in the development cycle is lower than the cost to fix the same defect discovered later in the development cycle [4]. However, because the propagation of exceptions can be so complex, detection of defects in exception handling code by inspection can be a labor-intensive and error-prone process. Shortcomings of both testing and code reviews are discussed by Parnas [5].

Automated static analysis is a useful complement to testing and code inspections. Because the analysis is automated, it does not require extensive manual effort on the part of the inspector, and because the analysis does not actually execute the code, all control flows can be analyzed. Adherence to a set of guidelines can be verified, and problems with the implementation can be identified and corrected during development, when they are less costly to fix. The output of a static analysis tool need not be absolutely unambiguous to be useful; if the analysis is not refined enough, the tool may cite false positives along with instances that are certainly defects. Even in such cases, the output of the analysis tool constitutes a defined set of red flags which can then be refined by manual techniques. We have found automated static analysis to be useful in identifying defects in the exception handling code of a large Ada system.

The next section of this paper describes exception handling in Ada and develops certain notions useful in computing the propagation of exceptions. Following this, we describe a tool for analyzing exception handling in Ada code and discuss the use of the tool to detect violations of application-specific design and coding guidelines.
EXCEPTION HANDLING IN ADA

Overview

Ada [6] provides a means by which a program can respond to the occurrence of some extraordinary situation (an exception) by abandoning the normal flow of control in favor of an alternative flow of control. When the extraordinary situation occurs, the appropriate exception is said to be raised; this initiates the alternative flow of control. Control is passed to a specially designated segment of code called a handler. After the code in a handler has executed, the normal flow of control is resumed. However, normal flow of control resumes from the point at which the handler terminates, not from the point at which the exception was originally raised. Thus, in Goodenough's terminology, Ada exceptions are escape exceptions [7].

Ada predefines certain exceptions; these correspond to extraordinary situations such as division by zero, an illegal array reference, dynamic storage exhaustion, an attempt to open a file that does not exist, and so forth. The Ada run-time system detects the corresponding exceptional situations and raises the appropriate predefined exception implicitly. In Figure 1, for example, if Y < 1 - X then the predefined exception Constraint_Error will be raised on attempting to assign the value X + Y to Y, whose subtype is Positive, and control will be passed to the handler for Constraint_Error, causing the message "Constraint Error raised" to be printed.

[Figure 1. Implicit raise of Constraint_Error]

Exception handlers may occur at the end of a subprogram (as will be explained below, there are other contexts in which exception handlers may occur), after the reserved word exception. If procedure P did not have an exception handler for Constraint_Error, then the Ada run-time system would search for the most recently activated subprogram on the dynamic call stack that did contain a handler for Constraint_Error, and if such a handler were found, control would be passed to that handler. If no subprogram on the dynamic call stack contained a handler for Constraint_Error, then the entire program would terminate. Propagation of Ada exceptions will be explained in more detail shortly.

[Figure 2. Programmer-defined exception]

Ada also permits a programmer to define exceptions. Programmer-defined exceptions are raised explicitly with the raise statement. The raise statement may also be
used to raise a predefined exception, but this is generally considered poor practice [8]. In Figure 2, procedure Delete_Directory may raise the exception Directory_Not_Empty. In contrast to the previous example, Delete_Directory does not contain a handler for Directory_Not_Empty; it is expected that a caller of Delete_Directory will contain an appropriate handler.

In general, a handler is part of a handler set; a given exception is caught (named) by at most one handler in any given handler set. Ada also allows the programmer to specify a catch-all handler (written "when others"), which catches every exception that is not caught by another handler in the handler set. Within a handler and within a block inside a handler, but nowhere else, the simple raise statement, consisting solely of the reserved word raise, is allowed; the effect of the simple raise statement is to reraise whatever exception caused control to be passed to the handler containing the simple raise statement. Figure 3 illustrates the simple raise statement and the when others handler.

Computing the set of propagated exceptions

We are interested in computing the set of exceptions that a code segment can propagate and in determining what exceptions a handler can propagate when it catches a given exception. For this, we need to develop several basic notions. First, we need a term for a construct whose execution, activation, or elaboration can occasion the raising of an exception; we will call such a construct a chunk. Secondly, we need a relation which specifies the paths along which propagation of exceptions is possible; this is the dynamic parent relation. Thirdly, we need a precise definition of what exceptions a given handler catches. Fourthly, we need a term for the union of all exceptions that can be raised in a chunk and all exceptions that can be propagated to a chunk; this is the encounters relation.
Finally, to account for the meaning of the simple raise statement, we define the context of a chunk. With these notions, we can define the propagates relation and three relations that determine which exceptions a handler propagates when it catches a given exception: maps, passes, and stops.

[Figure 3. Simple raise statement]

The Ada language reference manual [6] (ARM) has a term for a construct that can have exception handlers; such a construct is a frame. However, Ada does not have a term for a construct whose elaboration or execution can cause an exception to be raised. We use the term chunk for such a construct. More precisely, a chunk is one of the following:

(a) the sequence of statements in a package body, subprogram body, task body or block statement
(b) a package specification, a package body declarative part, subprogram declarative part, task body declarative part, block declarative part
(c) an accept statement
(d) a handler.

In Figure 4, each of the boxed segments is a chunk. A1 and B1 are subprogram declarative parts. A2 and B2 are sequences of statements in subprogram bodies. A3, A4, B4, B5, B6, and B8 are handlers. B3 and B7 are sequences of statements in block statements.

A handler set is a set of exception handlers associated statically with a chunk by the syntax rules of Ada. In Figure 4, the handler set {A3,A4} is associated with A2. Likewise, {B4} is associated with B3, {B5,B6} with B2, and {B8} with B7. If a chunk has no associated handlers, we speak of the null handler set. By the rules of Ada, the null handler set is semantically equivalent to the following handler set:

    exception
       when others => raise;

The handler set associated with a package specification, the declarative part of a package body, a subprogram declarative part, a block statement declarative part, an accept statement, or a handler is always the null handler set. The handler set associated with one of the other kinds of chunk may be non-null.

Catches is a relation between handlers and exceptions. A handler H catches an exception E explicitly if H names E.
In Figure 4, A3 and B5 each catch Channel_not_available explicitly. A handler H that is a member of the handler set S catches an exception E implicitly if H is the when others handler and if there is no handler in S that catches E explicitly. In Figure 4, A4 and B6 each catch all exceptions other than Channel_not_available implicitly, whereas B4 and B8 each catch all exceptions implicitly.

The actual propagation path of an exception is determined by the previous execution history of the program; however, it is possible to characterize statically the set of possible propagation paths covering all execution histories. For this purpose, we introduce the dynamic parent relation, which holds between chunks. The following cases define the dynamic parent relation:

1. A chunk declaring a package specification P is the (single) dynamic parent of P.
2. A chunk declaring a package body P is the (single) dynamic parent of the declarative part of P, the sequence of statements of P, and each handler in the handler set of the package body of P.
3. A chunk containing a call to subprogram S is the dynamic parent of the declarative part of S, the sequence of statements of S, and each handler in the handler set of S.
4. A chunk containing a block statement B is the (single) dynamic parent of the declarative part of B, the sequence of statements of B, and each handler in the handler set of B.
5. A chunk declaring a task object T, or creating a task object T by means of an allocator, is the (single) dynamic parent of the declarative part of T.
6. If a package specification declares a task object T, or creates a task object T by means of an allocator, then the sequence of statements of the package body (if this package body exists) is the (single) dynamic parent of the declarative part of T.
7. A chunk containing an entry call corresponding to an accept statement A is the dynamic parent of A.
8. A chunk (task body) containing an accept statement A is the dynamic parent of A.
9. Library package specifications and bodies, and the main subprogram, have no dynamic parent.

[Figure 4. Chunks]

In Figure 4, B2 is the dynamic parent of A1, A2, A3, and A4 since B2 contains a call to the subprogram Allocate_channel. B2 is the dynamic parent of B3 and B4, since B3 is a block statement contained in B2 and B4 is in B3's handler set. Likewise, B6 is the dynamic parent of B7 and B8, since B7 is a block statement contained in B6 and B8 is in B7's handler set.

To characterize the set of exceptions that can be active in a chunk, either by reason of having been propagated to the chunk or by reason of having been raised within the chunk, we introduce the encounters relation.
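The handler-set, catches, and dynamic-parent notions introduced so far can be written down as plain lookup tables. The following Python sketch is an illustration only, not the tool's implementation: the table contents are transcribed from the text's description of Figure 4, while the function names (`handler_set_of`, `catches`, `handler_for`, `dynamic_children`) and the data layout are assumptions made for this example.

```python
# Facts transcribed from the text's description of Figure 4.
CNA = "Channel_not_available"

HANDLER_SETS = {            # chunk -> handlers statically associated with it
    "A2": ["A3", "A4"],
    "B3": ["B4"],
    "B2": ["B5", "B6"],
    "B7": ["B8"],
}
NAMED = {"A3": {CNA}, "B5": {CNA}}   # handlers that name exceptions
OTHERS = {"A4", "B4", "B6", "B8"}    # `when others` handlers
DYNAMIC_PARENT = {                   # child chunk -> its dynamic parents
    "A1": {"B2"}, "A2": {"B2"}, "A3": {"B2"}, "A4": {"B2"},
    "B3": {"B2"}, "B4": {"B2"},
    "B7": {"B6"}, "B8": {"B6"},
}


def handler_set_of(handler):
    """Return the handler set that `handler` is a member of."""
    for handlers in HANDLER_SETS.values():
        if handler in handlers:
            return handlers
    raise KeyError(handler)


def catches(handler, exc):
    """True if `handler` catches `exc`, explicitly or implicitly."""
    if exc in NAMED.get(handler, set()):
        return True                  # explicit: the handler names exc
    if handler not in OTHERS:
        return False
    # implicit: `when others`, and no handler in the same set names exc
    return not any(exc in NAMED.get(h, set()) for h in handler_set_of(handler))


def handler_for(chunk, exc):
    """Handler in the chunk's handler set that catches exc, or None.

    None covers the null handler set: the exception is simply propagated.
    """
    for h in HANDLER_SETS.get(chunk, []):
        if catches(h, exc):
            return h
    return None


def dynamic_children(parent):
    """All chunks of which `parent` is a dynamic parent."""
    return {c for c, parents in DYNAMIC_PARENT.items() if parent in parents}
```

With these tables, `catches("A4", CNA)` is false because A3 names that exception in the same handler set, whereas `catches("B8", CNA)` is true because B8 is the only handler associated with B7.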
If a chunk contains a simple raise statement, then the chunk is a handler or a block statement nested in a handler and the set of exceptions that is raised within the chunk depends on the exception that was caught by the handler. For this reason, encounters is a relation between (chunk, exception) pairs and exceptions, rather than simply a relation between chunks and exceptions. A (chunk, exception) pair (C,E') encounters an exception E if

(a) E is a predefined exception and the Ada run-time system could raise E when executing a statement in C, or
(b) C contains a statement of the form "raise E;", or
(c) C contains a simple raise statement ("raise;") and E = E', or
(d) C is the dynamic parent of some chunk C', and C' propagates E (the relation propagates is defined below).

A chunk that is not a handler and is not inside a handler cannot contain a simple raise statement. In this case, what (C,E') encounters is not dependent on the context exception (E'). Where encounters is not dependent on the context exception, we may say (C,*) encounters E.

In Figure 4, (A2,*) encounters Channel_not_available since it contains an explicit raise of this exception. Similarly, (A4,*), (B4,*), and (B8,*) encounter Fatal_error since each of these chunks contains an explicit raise of this exception. (A3, Channel_not_available) encounters Channel_not_available since it
catches this exception explicitly and contains a simple raise statement. Note that the fact that a handler catches an exception does not imply that the handler encounters the exception; in Figure 4, handler B8 catches all exceptions, but it encounters only Fatal_error.

The simple raise statement can occur immediately in a handler or in the sequence of statements of a block statement nested in a handler. The meaning of a simple raise statement is the intersection of the set of exceptions caught by the most closely enclosing handler and the set of exceptions encountered by the chunk the handler is associated with. To formalize this, we introduce a "context of" relation between chunks. If C is a chunk in which the simple raise statement is not permitted, then C is the context of itself. Otherwise, C is a handler or C is a sequence of statements in a block statement enclosed directly or indirectly by a handler. If C is a handler, then there is a chunk C' that is associated with the handler set that C is a member of, and this C' is the context of C. If C is the sequence of statements in a block statement inside a handler, then the context of the (unique) dynamic parent of C is also the context of C. In Figure 4, A2 is the context of A3 and A4; B2 is the context of B5, B6, and B7; B7 is the context of B8; and B3 is the context of B4. For chunks other than A3, A4, B4, B5, B6, B7, and B8, each chunk is its own context.

We can now define the propagates relation. First, we give the simpler, special case that applies when chunk C is the context of itself. In this case, C propagates exception E if

(a) (C,E) encounters E, and
(b) there is no handler in the handler set associated with C that catches E.

The more general formulation covers handlers and blocks within handlers, as well as other chunks.
A chunk C propagates exception E if there is an exception E' and a chunk C' such that

(a) C' is the context of C, and
(b) (C',E') encounters E', and
(c) (C,E') encounters E, and
(d) if C is a handler, then C catches E', and
(e) there is no handler in the handler set associated with C that catches E.

The special case given previously can be derived from the general formulation by letting C' = C and E' = E.

As an illustration of the general formulation, consider that in Figure 4, chunk A3 propagates Channel_not_available since

(a) A2 is the context of A3, and
(b) (A2, Channel_not_available) encounters Channel_not_available (more generally, (A2,*) encounters Channel_not_available), and
(c) A3 is a handler and A3 catches Channel_not_available, and
(d) (A3, Channel_not_available) encounters Channel_not_available, and
(e) there is no handler in the handler set associated with A3 (this is the null handler set) that catches Channel_not_available.

It is sometimes convenient to speak of the set of exceptions propagated by a subprogram, including its declarative part, its sequence of statements, and its handlers. Speaking less precisely, we will say that a program unit U propagates the exception E if the declarative part of U propagates E, or the sequence of statements of U propagates E, or any of the handlers in the handler set of U propagates E.
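The mutual dependence of encounters and propagates can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions, not the tool described later in this paper: it uses plain fixed-point iteration (a swapped-in technique chosen for brevity), it ignores implicit raises of predefined exceptions, and it does not model block statements nested in handlers. The `Chunk` record, the `ANY` placeholder, the `others_reachable` flag, and the `Caller`/`Caller_h` chunks are all hypothetical; only the A1 to A4 facts follow the text's description of Figure 4.

```python
from dataclasses import dataclass, field
from typing import Optional

ANY = "<unspecified>"  # stands for "some exception the analysis cannot name"


@dataclass
class Chunk:
    raises: set = field(default_factory=set)      # exceptions in explicit `raise E;`
    reraises: bool = False                        # contains a simple `raise;`
    children: list = field(default_factory=list)  # chunks it is a dynamic parent of
    handlers: list = field(default_factory=list)  # its handler set
    names: Optional[set] = None                   # handlers only: exceptions named
    others: bool = False                          # handlers only: `when others`
    context: Optional[str] = None                 # handlers only: owner of the handler set


def catches(chunks, h, exc):
    handler = chunks[h]
    if exc in handler.names:
        return True
    siblings = chunks[handler.context].handlers
    return handler.others and not any(exc in chunks[s].names for s in siblings)


def propagated(chunks, others_reachable=True):
    """Map each chunk name to the set of exceptions it propagates."""
    prop = {name: set() for name in chunks}

    def encounters(name, ctx_exc=None):
        chunk = chunks[name]
        found = set(chunk.raises)
        if chunk.reraises and ctx_exc is not None:
            found.add(ctx_exc)                    # `raise;` reraises the caught exception
        for child in chunk.children:
            found |= prop[child]                  # propagated up from dynamic children
        return found

    changed = True
    while changed:                                # sets only grow, so this terminates
        changed = False
        for name, chunk in chunks.items():
            if chunk.names is None:               # not a handler: its own context
                new = {e for e in encounters(name)
                       if not any(catches(chunks, h, e) for h in chunk.handlers)}
            else:                                 # handler: null handler set
                caught = {e for e in encounters(chunk.context)
                          if catches(chunks, name, e)}
                if chunk.others and others_reachable:
                    caught.add(ANY)               # treat `when others` as reachable
                new = set()
                for e in caught:
                    new |= encounters(name, e)
            if new != prop[name]:
                prop[name], changed = new, True
    return prop


CNA, FATAL = "Channel_not_available", "Fatal_error"
chunks = {
    "A1": Chunk(),
    "A2": Chunk(raises={CNA}, handlers=["A3", "A4"]),
    "A3": Chunk(names={CNA}, reraises=True, context="A2"),
    "A4": Chunk(names=set(), others=True, raises={FATAL}, context="A2"),
    # Hypothetical caller that also calls itself recursively.
    "Caller": Chunk(children=["A1", "A2", "A3", "A4", "Caller", "Caller_h"],
                    handlers=["Caller_h"]),
    "Caller_h": Chunk(names={CNA}, context="Caller"),
}
prop = propagated(chunks)
```

Here `prop["A3"]` contains Channel_not_available, matching the illustration above, and the recursive `Caller` chunk does not prevent termination because each pass can only enlarge the propagated sets.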
There are three additional relations that are useful in characterizing the behavior of a handler. These are maps, passes, and stops.

Maps is a relation between (handler, exception) pairs and exceptions; it characterizes the set of exceptions that a handler can propagate given that the handler catches a particular exception. The (handler, exception) pair (H,E') maps to E if there is a chunk C such that

(a) C is the context of H, and
(b) (C,E') encounters E', and
(c) H catches E', and
(d) (H,E') encounters E (being a handler, H propagates what it encounters).

Passes is a relation between handlers and exceptions. A handler passes an exception E if it can propagate E when it catches E. More precisely, H passes E if (H,E) maps to E; thus, passes is a subset of maps.

Stops is a relation between handlers and exceptions. A handler stops an exception E if it propagates nothing when it catches E. More precisely, H stops E if there is a chunk C such that

(a) C is the context of H, and
(b) (C,E) encounters E, and
(c) H catches E, and
(d) there is no exception E' such that (H,E) maps to E'.

Figure 5 illustrates the basic relations with a compact, though complex, example. The example assumes that the declarations of exceptions E1, E2, E3, and E4 are directly visible. We will sketch the reasoning behind the detailed assertions regarding the handler H4 in Figure 5. Since H4 calls procedure P1, H4 is the dynamic parent of C1, H1, and H2. As a result, since H1 propagates E1 and H2 propagates E3, (H4,*) encounters E1 and E3. In addition, (H4,*) encounters E4 since this exception is explicitly raised in the statements of H4. H4, being a handler and therefore having the null handler set, propagates everything it encounters; hence H4 propagates E1, E3, and E4.
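Once encounters and catches are known, the three handler relations reduce to simple set tests. The Python sketch below is illustrative only: the facts about H4 and C2 are the ones stated in the text about Figure 5, the chunk `Cs` and handler `Hs` are hypothetical additions that show a handler which stops an exception, and the table and function names are assumptions of this example.

```python
# Facts about H4 and C2 as stated in the text; Cs and Hs are hypothetical.
CONTEXT = {"H4": "C2", "Hs": "Cs"}            # handler -> its context chunk
CATCHES = {("H4", "E3"), ("Hs", "E2")}        # (handler, exception) pairs
RAISED = {                                    # context-independent encounters
    "C2": {"E3"},
    "H4": {"E1", "E3", "E4"},
    "Cs": {"E2"},
    "Hs": set(),
}
RERAISES = set()                              # chunks containing a simple `raise;`


def encounters(chunk, ctx_exc):
    """Exceptions that the pair (chunk, ctx_exc) encounters."""
    extra = {ctx_exc} if chunk in RERAISES else set()
    return RAISED[chunk] | extra


def maps(handler, caught):
    """Set of exceptions E such that (handler, caught) maps to E."""
    ctx = CONTEXT[handler]
    if caught not in encounters(ctx, caught):  # the context must encounter it
        return set()
    if (handler, caught) not in CATCHES:       # and the handler must catch it
        return set()
    return encounters(handler, caught)         # a handler propagates what it encounters


def passes(handler, exc):
    return exc in maps(handler, exc)


def stops(handler, exc):
    ctx = CONTEXT[handler]
    return (exc in encounters(ctx, exc)
            and (handler, exc) in CATCHES
            and not maps(handler, exc))
```

Under these tables, `maps("H4", "E3")` yields E1, E3, and E4, so H4 passes E3, while the hypothetical handler `Hs` stops E2 because it encounters nothing.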
To see that (H4,E3) maps to E1, E3, and E4, consider that

(a) C2 is the context of H4, and
(b) (C2,E3) encounters E3 (more generally, (C2,*) encounters E3), and
(c) H4 catches E3, and
(d) (H4,E3) encounters E1, E3, and E4.

Finally, since (H4,E3) maps to E1, E3, and E4, we may say that H4 passes E3. In addition to the annotations in Figure 5, we may say, less precisely, that procedure P1 propagates E1 and E3, and procedure P2 propagates E1, E3, and E4.

With these definitions, it is fairly straightforward to implement algorithms for computing the set of all exceptions that a chunk propagates and all propagation paths. The main challenge is dealing with recursion; this is discussed below, under the heading "Implementation details".

ANALYSIS TOOL

The goal of our analysis is to make it easier to identify defects in exception handling code. A code construct might be classified as a defect because it causes a program to behave incorrectly, or because it makes the program more difficult to maintain,
or because it is a violation of a specific set of guidelines. Since computing exception propagation paths by manual methods is tedious and error-prone, the computation would seem ideally suited for automated analysis. However, the construction of such a static analysis tool is itself challenging. Ada has several features, including renaming declarations, operator-overloading, and direct visibility to declarations in external packages ("use-visibility"), that make the semantic analysis of Ada source a complicated task.

[Figure 5. An example of the basic relations]

An Ada compiler must, of course, determine the meaning of each identifier in an Ada program, and some compiler-vendors make this semantic information available through a program interface to the intermediate form produced by the compiler. We used the interface to the intermediate form produced by the Sun Ada compiler (Sun is a registered trademark of Sun Microsystems, Inc.). This intermediate form is a variant of DIANA (Descriptive Intermediate Attributed Notation for Ada) [9]. DIANA is a high-level intermediate form; it does not reflect optimizing transformations, and, in general, it is possible to reconstruct the source by traversing the DIANA. However, the meaning of each identifier has been resolved in the DIANA.

Although DIANA provides the semantic information needed to compute the relations needed in our analysis, it is not a particularly efficient representation. For the Ada applications we have been analyzing, one line of source text expands to roughly 160 bytes of DIANA. Multiple traversals of the DIANA would be too expensive, whereas packing the entire analysis into a single traversal would complicate the algorithms excessively. We therefore developed a pre-analysis step, in which the information pertinent to exception handling is extracted from the DIANA and stored in a more compact extracted form. For the application code we have analyzed, the extracted form is one-ninth the size of the full DIANA.

Logically, the analysis falls into two parts: computation of the general relations described above (dynamic parent, catches, encounters, propagates, stops, maps, passes) and identification of violations of system-specific guidelines (examples of which are discussed below). Our analysis capability generally follows this division: the general relations are computed by compiled code that takes the extracted form as input and produces a number of human-readable files; the system-specific violations are derived from these files using UNIX utilities (awk, grep, sort, and join; UNIX is a registered trademark of Unix System Laboratories, Inc.).

[Figure 6. Analysis process]

The full process (Figure 6) consists of five steps, the first four of which are automated:

1. Compilation: transforms Ada source into DIANA.
2. Pre-analysis: filters the DIANA, producing the extracted form that contains
information pertinent to exception handling. If the source for a system is compiled into multiple libraries, the extracted forms for the multiple libraries are merged into one extracted form at the conclusion of the pre-analysis phase.
3. System-independent analysis: using information in the extracted form, computes relations that are generally useful in analysis of Ada exception handling, producing text files.
4. System-dependent analysis: using the relations in the text files, identifies violations of system-specific design guidelines.
5. Since the automated analysis may, in general, identify some false positives (locations in the code that are not actually defects), some manual verification of the tool outputs may be necessary. In any case, repair of the defects is not automated.

Extracted form

The extracted form is a directed graph. In general, node and attribute kinds in the extracted form have a close correspondence to DIANA nodes and attributes, but several simplifying transformations have been applied. In particular, DIANA sequences are transformed to sets in the extracted form and each of the following is collected into a single attribute:

(a) all explicit raises in a set of statements
(b) all calls made in a set of statements
(c) all function calls made in a set of declarations.

DIANA nodes and attributes not pertinent to an exception handling view have no correspondence in the extracted form.

System-independent analysis

The analysis tool generates text file representations of the encounters, catches, propagates, maps, passes, and stops relations (the tool computes the "context of" relation but does not produce a text file representation; to do so would be a trivial modification). In addition, several other text files are generated:

1. A list of each (chunk, exception) pair (C1,E1) such that C1 contains a raise statement that raises E1 (either explicitly or implicitly via the simple raise statement).
2. A list of each (chunk, chunk, exception) triple (C1,C2,E1) such that a declaration in C1 calls a function C2, and C2 propagates E1. This is the case in which E1 propagates from C2 through the declarative part C1.
3. A list of each (subprogram, exception) pair (S1,E1) such that
   (a) S1 propagates E1, and
   (b) S1 encounters E1, and
   (c) S1 does not raise E1, and
   (d) no block contained directly or indirectly in S1 raises E1, and
   (e) there is no handler in the handler set associated with S1 that contains an explicit raise of E1.
   This is the case in which S1 propagates E1 without there being an explicit raise of E1 in the text of S1.
4. A list of each (chunk, chunk, exception) triple (C1,C2,E1) such that C1 propagates E1, C2 calls C1, the declaration of E1 is visible to C1, and the declaration of E1 is not visible to C2. This is the case in which E1 is propagated out of scope.

Finally, several text files containing information of a more general sort are produced:

1. A list of each (exception, exception) pair (E1,E2) such that E1 renames E2.
2. A file containing the name, internal identification number, and source location of each chunk and exception.
3. A call tree (the dynamic parent relation less pairs in which a chunk is the dynamic parent of a declarative part or of a handler).

Simplifications

The goal of our analysis was to identify portions of a program that are causes of incorrect program behavior or that violate design guidelines. However, we did not intend the results of the analysis to drive an automated correction of the software. This allowed us to make several assumptions that reduce the complexity of computation and the completeness of the results, without significantly diminishing the value of the analysis.

The tool makes several compromises between completeness of analysis and ease of construction. On the one hand, the tool considers all raise statements in a chunk to be reachable; in this respect the tool is too stringent, since it may find non-existent problems. On the other hand, the tool does not identify all contexts in which an Ada-predefined exception could be raised; in this respect the tool is too lax, since it may fail to find some real defects.

The tool does not evaluate static expressions and does not perform dataflow analysis. Furthermore, it treats a sequence of statements as a set of statements. As a result, the tool may assume that a certain chunk raises an exception even if the raise statement is not reachable.
For example, a chunk containing the statement

    if X = X then
       raise E1;
    else
       raise E2;
    end if;

is considered to raise E1 and E2, even though the raise of E2 is unreachable (assuming that X is not a function call with side-effects). Likewise, a chunk containing the sequence

    raise E1;
    raise E2;

is also considered to raise E1 and E2, even though the raise of E2 is not reachable.

Correctly identifying each potential raise of an Ada-predefined exception would require a sophisticated analysis such as that performed by the optimizer of an Ada compiler. Consider the Ada-predefined exception Constraint_Error. The Ada run-time system will raise Constraint_Error when there is an attempt to assign to a variable a value that is outside the range of values declared for the variable, when there is an attempt to index an array with a value that is outside the range of values declared for the index subtype of the array, when there is an attempt to dereference a null access value, and in several other circumstances. The effort required to identify
possible raises of Constraint_Error, Numeric_Error and Tasking_Error would far outweigh the added diagnostic value of the analysis tool. Possible raises of Storage_Error, Program_Error, and the IO exceptions (Status_Error, Mode_Error, Name_Error, Use_Error, Device_Error, End_Error, and Data_Error) would not be as difficult to identify, and future versions of the analysis tool may do this.

There is one special case in which the tool will add a raise of an Ada-predefined exception to the encounters relation. If a handler H catches an Ada-predefined exception E explicitly (e.g. "when Constraint_Error =>"), and if H contains a simple raise statement, then the pair ((H,E), E) is considered to be a member of the encounters relation. It follows that if H catches Constraint_Error explicitly and contains a simple raise statement, then H will propagate Constraint_Error, even though there may be no explicit raise of Constraint_Error anywhere in the program. The following example illustrates this:

    begin
       X := Y * Z;
    exception
       when Constraint_Error =>
          Log_Error;
          raise;
    end; -- Handler is assumed to raise Constraint_Error

The tool makes one additional simplifying assumption: it treats every others handler as reachable (more formally, for a chunk C that is the context of an others handler H, the tool assumes that there is some exception E such that C encounters E and H catches E). Thus for the following code:

    B1: begin
       X := Y * Z;
    exception
       when others =>
          raise Overflow;
    end B1;

the tool will report that the handler (and, loosely speaking, the block B1) propagates Overflow. With respect to others handlers, the tool acts as if every chunk can encounter some unspecified exception, and this assumption partially corrects the tool's failure to consider implicit raises of Ada-predefined exceptions.

Implementation details

Two interesting issues arose in implementing the analysis tool. The first issue concerns the caching of computations.
The definitions of encounters and propagates are mutually recursive. As a consequence, when the set of propagated exceptions has been computed for a chunk C, it is advantageous to save the result of that computation so that it can be used in computing the set of exceptions encountered by any dynamic parent of C. On the other hand, a certain amount of recomputation is unavoidable in computing the maps relation. If a handler, or a block statement
nested in a handler, contains a simple raise statement, then it is necessary to recompute the maps relation for each exception that is both caught by the handler and encountered by the context of the handler. Our implementation saves the set of propagated exceptions for all chunks other than handlers and block statements; for handlers and block statements, the propagated exceptions are recomputed each time they are needed. The set of encountered exceptions, as opposed to the set of propagated exceptions, is always recomputed for all chunks. Since the extracted form provides ready access to the set of exceptions immediately raised in a chunk and the set of dynamic children of a chunk, this recomputation is not very complex.

The second issue concerns recursion. If there is recursion in the dynamic parent relation (as there will be, for example, in the presence of recursive subprogram calls), then the implementation must find some way to terminate the mutually recursive computation of the encounters and propagates relations. Our implementation keeps a stack to mirror the current traversal through the dynamic parent relation. This stack allows the computation to detect recursion. Suppose the set of exceptions encountered by a chunk C1 is currently being computed; then C1 is the top of the stack (the element most recently added). Suppose that C1 is the dynamic parent of some chunk C0 and that the set of exceptions propagated by C0 has not yet been computed. Furthermore, suppose that C0 is already on the stack. Instead of pushing C0 on the stack a second time, our implementation caches the null set as the set of exceptions propagated by C0, marks as temporary all chunks on the stack from C1 to C0, and marks C0 for recomputation. When the processing of C1 is complete, its computed set of propagated exceptions is cached (provided it is not a handler or a block), and it is popped off the stack.
For C1, the cached result is a partial result since the exceptions propagated from C0 have not yet been included. When a chunk that is marked for recomputation is popped off the stack, a second pass is initiated. When processing in the second pass reaches the point of recursion, the previously cached partial computation for C0 is available and is unioned with the set of exceptions encountered by C1. At the conclusion of the second pass, the set of propagated exceptions has been completely computed for the chunk that was marked for recomputation as well as for each chunk for which the marked chunk is directly or indirectly the dynamic parent.

USE OF THE ANALYSIS TOOL

The analysis tool is intended to be used in conjunction with a set of design and coding guidelines that provide an application-specific discipline to the use of exception handling. Typically, such guidelines would distinguish different classes of exceptions and would attempt to answer at least these questions for each class:

1. Where may an exception belonging to the class be raised?
2. How (implicitly or explicitly) may a handler catch an exception belonging to the class?
3. What action (stop, pass, or map to a different class of exception) may a handler take when it has caught an exception belonging to the class?

The appropriate handling of a class of exceptions may depend on the context. For example, the guidelines may prescribe that a handler in the immediate context of a raise must pass an exception of medium severity while a handler at a higher level in the call tree must either stop the exception or map it to one of higher severity.
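The three handler actions named in question 3 can be seen side by side in a small Ada program. The following is a minimal sketch, not taken from the applications analyzed: the names Handler_Actions, Minor_Fault, Major_Fault, Fail, Stop_It, Pass_It and Map_It are hypothetical, and the library unit is written with its Ada 95 name Ada.Text_IO (under Ada 83 the unit is simply Text_IO).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Handler_Actions is

   Minor_Fault : exception;   --  hypothetical lower-severity class
   Major_Fault : exception;   --  hypothetical higher-severity class

   --  Hypothetical subprogram that always raises Minor_Fault.
   procedure Fail is
   begin
      raise Minor_Fault;
   end Fail;

   --  Stop: the handler completes normally, so nothing is propagated.
   procedure Stop_It is
   begin
      Fail;
   exception
      when Minor_Fault =>
         Put_Line ("Stop_It: Minor_Fault stopped");
   end Stop_It;

   --  Pass: the simple raise statement re-raises the caught exception.
   procedure Pass_It is
   begin
      Fail;
   exception
      when Minor_Fault =>
         Put_Line ("Pass_It: passing Minor_Fault");
         raise;
   end Pass_It;

   --  Map: the handler raises a different exception in place of the
   --  one it caught.
   procedure Map_It is
   begin
      Fail;
   exception
      when Minor_Fault =>
         Put_Line ("Map_It: mapping Minor_Fault to Major_Fault");
         raise Major_Fault;
   end Map_It;

begin
   Stop_It;

   begin
      Pass_It;
   exception
      when Minor_Fault =>
         Put_Line ("Main: caught Minor_Fault from Pass_It");
   end;

   begin
      Map_It;
   exception
      when Major_Fault =>
         Put_Line ("Main: caught Major_Fault from Map_It");
   end;
end Handler_Actions;
```

In the terms used in this paper, Stop_It propagates nothing, Pass_It propagates Minor_Fault, and Map_It propagates Major_Fault. A guideline of the kind described above would state which of these three shapes is permitted for each class of exception at each level of the call tree.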
We have run the analysis tool on more than one million lines of code. It has proved to be effective in detecting a range of violations of design and coding guidelines. Among the types of anomalies the tool has detected are the following:

(a) An exception is propagated to a subprogram that is not capable of handling it, either because the declaration of the exception is not visible to the subprogram or because proper handling of the exception would require access to hidden implementation details.
(b) An exception indicating fatal severity is stopped prematurely or is erroneously mapped to an exception of lower severity.
(c) A handler explicitly catches an exception that its associated chunk cannot encounter (if nothing that the handler catches can be encountered by its associated chunk, then the entire handler is dead code).
(d) An exception that, by design, is supposed to be handled explicitly, is passed implicitly through one or more layers of subprogram calls (either because it is not caught by any handler or because it is caught implicitly and propagated by the simple raise statement).

Some errors that the tool detected were due to simple oversights and could be corrected with trivial fixes. For example, one application had the reasonable guideline that a severe exception should not be stopped except by a handler at the highest level in the call-tree. A handler violating this guideline might be fixed simply by inserting the simple raise statement. However, although the fix may in this instance be easy, detecting the defect in the first place may be very difficult without the assistance of automated analysis tools. Furthermore, if not corrected, such a defect could result in the system's continuing to execute in an undefined state.

Other errors were more subtle. One subtle defect is the erroneous mapping of a more severe exception to a less severe exception.
Many such mappings are due to unanticipated propagations of exceptions from subprograms called in handlers. Consider the handler,

    exception
       when Data_Err =>
          Log_Error;
          Send_Alert;
          raise Data_Err;
    end;

At first glance, it would appear that the handler will always pass the exception Data_Err. However, suppose that the procedure Send_Alert can itself propagate the exception Msg_Processing_Failed. Then under certain conditions, the handler has the effect of mapping Data_Err to Msg_Processing_Failed. If Msg_Processing_Failed is less severe than Data_Err, then the handler would be considered erroneous. A possible fix for this problem is to nest the call to Send_Alert in its own block, as follows:

    exception
       when Data_Err =>
          Log_Error;
          begin
             Send_Alert;
          exception
             when others =>
                null;
          end;
          raise Data_Err;
    end;

Although this fix ensures that the outer handler passes the exception Data_Err, it may not always be appropriate. If Send_Alert propagates an exception whose severity is higher than the severity of Data_Err, then the inner handler will potentially mask a severe exception to allow a less severe exception to be propagated. This example points out the importance of thoroughly understanding the behavior of subprograms that are called from handlers.

Another subtle source of defects is the implicit passing of exceptions. Prohibiting subprograms from passing certain classes of exceptions implicitly generally makes programs more comprehensible. If this guideline is adhered to, then for any given subprogram, a programmer can determine all the exceptions that can be propagated to the subprogram simply by examining the text of immediately called subprograms. Consider three procedures P1, P2, and P3, where P1 calls P2 and P2 calls P3. Suppose that P3 raises the exception Invalid_Msg, and that P1 and P2 (and the handlers in their handler sets) do not raise Invalid_Msg. If P2 passes Invalid_Msg implicitly (and is thereby inconsistent with the hypothesized design guidelines), then P1 may also pass Invalid_Msg implicitly even though the programmer who codes P1 examines the text of all immediately called subprograms to determine which exceptions can be propagated to P1. Thus, there may be a cascade of errors resulting from one inconsistency with a guideline. If the programmer responsible for P2 ultimately fixes P2 by explicitly passing Invalid_Msg or by mapping Invalid_Msg to another exception, then the code for P1 may also have to be changed. On the other hand, if the fix is to stop Invalid_Msg in P2, then the code of P1 need not be changed.
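The P1, P2, P3 scenario can be reproduced directly. The following is a minimal sketch under the assumptions stated in the text: the enclosing procedure Cascade_Demo and the printed message are hypothetical, and Ada.Text_IO is the Ada 95 name of the Ada 83 unit Text_IO. Neither the text of P1 nor that of P2 mentions Invalid_Msg, yet the exception arrives at the caller of P1.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Cascade_Demo is

   Invalid_Msg : exception;

   --  P3 is the only place where Invalid_Msg is raised.
   procedure P3 is
   begin
      raise Invalid_Msg;
   end P3;

   --  P2 has no handler, so it passes Invalid_Msg implicitly.
   procedure P2 is
   begin
      P3;
   end P2;

   --  The author of P1 reads only the text of P2, sees no mention of
   --  Invalid_Msg, and therefore writes no handler for it.
   procedure P1 is
   begin
      P2;
   end P1;

begin
   P1;
exception
   when Invalid_Msg =>
      Put_Line ("Invalid_Msg reached the caller of P1");
end Cascade_Demo;
```

Running the program prints the message from the outermost handler. If P2 were changed to stop Invalid_Msg in a handler of its own, P1 would need no change, as noted above; if P2 were changed to map Invalid_Msg to another exception, the handler at the outermost level would have to be revisited.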
CONCLUSION

Ada's exception handling mechanism offers the programmer considerable help in constructing correct and intelligible programs. Perhaps its principal benefit is the clarity that results from separating code for normal processing from code for exceptional processing. It reduces the need for explicit checking of return codes on each subprogram call and effects the alternate flow of control without the need for deeply nested conditionals or, worse, explicit gotos. Nevertheless, a programmer must exercise care in designing and coding the exception handling of an Ada program, as with any powerful language feature.

One aspect of Ada's exception handling mechanism, in particular, permits the unwary programmer to construct faulty and unintelligible programs: the signature of an Ada subprogram, while specifying the types and modes of its parameters, does not constrain the set of exceptions which the subprogram may propagate. Discussing exception handling from a general, language-independent point of view, Goodenough⁷ argues that the set of exceptions that a subprogram can propagate should be part of the signature of the subprogram, and subprograms in CLU¹⁰ are subject to this
restriction. Nevertheless, the designers of Ada explicitly rejected this alternative on efficiency grounds,¹¹ and they may well have struck the correct compromise. However, it remains true that it can be remarkably difficult to predict the dynamic behavior of Ada programs under exceptional processing from an examination of the program text. Particularly difficult to analyze is the behavior of handlers which call subprograms that may themselves propagate exceptions.

It is for this reason that static analysis tools have such a high potential payoff in verifying Ada exception handling code. In the case of the application code we have analyzed, the static analysis tool was able to detect inconsistencies which, having escaped the notice of code inspectors, could only have been detected with an unrealistic number of test cases. As noted above, a static analysis tool need not be an unfailing oracle to be valuable; the tool may identify a number of instances of a suspicious construct which may turn out, on further analysis, to be correct. Simply identifying a set of potential errors (we have used the term 'red flags') is of significant value.

ACKNOWLEDGEMENTS

Thanks to Diane Mularz and Shari Lawrence Pfleeger for suggestions that improved this paper, and to Chuck Howell for his many ideas on exception handling.

REFERENCES

1. C. Howell and D. Mularz, 'Exception handling in large Ada systems', Washington Ada Symposium Proceedings, June 1991, pp. 90-101.
2. C. Howell, D. Mularz and G. Bundy, 'Exception handling, or when bad things happen to good programs', 14th International Conference on Software Engineering Tutorial, May 1992.
3. Qian Cui and John Gannon, 'Data-oriented exception handling', IEEE Trans. Software Engineering, 18, (5), 393-401 (1992).
4. Michael E. Fagan, 'Advances in software inspections', IEEE Trans. Software Engineering, SE-12, (7), 744-751 (1986).
5. David L. Parnas, A.
John van Schouwen and Shu Po Kwan, 'Evaluation of safety-critical software', Communications of the ACM, 33, (6), 636-648 (1990).
6. Reference Manual for the Ada Programming Language, ANSI/Military Standard MIL-STD-1815A-1983, U.S. Department of Defense, January 1983.
7. J. B. Goodenough, 'Exception handling: issues and a proposed notation', Communications of the ACM, 18, (12), 683-696 (1975).
8. Software Productivity Consortium, Ada Quality and Style, Van Nostrand Reinhold, 1989.
9. G. Goos, W. A. Wulf, A. Evans, Jr. and K. J. Butler, DIANA: An Intermediate Language for Ada, Springer-Verlag, 1983.
10. B. Liskov and A. Snyder, 'Exception handling in CLU', IEEE Trans. Software Engineering, 5, (6), 546-558 (1979).
11. J. D. Ichbiah et al., 'Rationale for the design of the Ada programming language', SIGPLAN Notices, 14, (6), Part B, (1979).