

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 30, NO. 1, JANUARY 2004

Technology for Testing Nondeterministic Client/Server Database Applications

Gwan-Hwan Hwang, Sheng-Jen Chang, and Huey-Der Chu

Abstract: The execution of a client/server application involving database access requires a sequence of database transaction events (or T-events), called a transaction sequence (or T-sequence). A client/server database application may have nondeterministic behavior in that multiple executions thereof with the same input may produce different T-sequences. In this paper, we present a framework for testing all possible T-sequences of a client/server database application. We first show how to define a T-sequence in order to provide sufficient information to detect race conditions between T-events. Second, we design algorithms to change the outcomes of race conditions in order to derive race variants, which are prefixes of other T-sequences. Third, we develop a prefix-based replay technique for race variants derived from T-sequences. We prove that our framework can derive all the possible T-sequences in cases where every execution of the application terminates. A formal proof and an analysis of the proposed framework are given. We describe a prototype implementation of the framework and present experimental results obtained from it.

Index Terms: Concurrent programming, reachability testing, client/server, database management system.

1 INTRODUCTION

GENERALLY speaking, a client/server program is a type of distributed program (or concurrent program). Let P be a distributed program. Multiple executions of P with the same input may produce different results; this is called nondeterministic behavior [1], [2]. Because of this, when testing P with input X (which is a sequence of inputs for the processes in P), a single execution is insufficient to determine the correctness of P with X.
Even if P with input X has been executed successfully many times, it is possible that a future execution of P with X will produce incorrect results. An execution of a distributed program exercises a sequence of synchronization events called a synchronization sequence (or SYN-sequence). Examples of process synchronization include P and V primitives applied to a shared semaphore, monitor-entry procedures, send/receive message primitives, and general sharing of memory [3], [4]. A distributed program (or concurrent program) exhibits nondeterministic behavior because different executions of P with the same input X may produce distinct SYN-sequences.

The meanings of some terms are not standard. In this paper, a test of P with input X means executing P with input X once to obtain a SYN-sequence and checking the result of the execution. Duplicating tests means that different tests of P with input X produce the same SYN-sequence. Exhaustive testing of P with input X means conducting enough tests to exercise all feasible SYN-sequences of P with input X.

In this paper, we focus on the problem of testing client/server SQL database applications that exhibit nondeterministic behavior.

G.-H. Hwang is with the Department of Information and Computer Education, National Taiwan Normal University, 162, Hoping E RD. Sec. 1, Taipei, Taiwan. E-mail: ghhwang@ice.ntnu.edu.tw.
S.-J. Chang is with Telecommunications Laboratories, Chunghwa Telecom Co., Ltd., No. 12, Lane 551, Min-Tsu Road Sec. 5 Yang-Mei, Taoyuan, Taiwan. E-mail: s_jchang@cht.com.tw.
H.-D. Chu is with the Department of Management Information Systems, Takming College, #56, Huan Shan Rd., Sec. 1, Ney Hwu, Taipei, Taiwan. E-mail: jchu@mis.takming.edu.tw.
Manuscript received 1 Oct. 2002; revised 6 Nov. 2003; accepted 14 Nov. Recommended for acceptance by J. Offutt. For information on obtaining reprints of this article, please send to: tse@computer.org, and reference IEEECS Log Number.
The client interacts with the servers using SQL [5], a powerful set-oriented language consisting of a few commands; it was created as a language for databases that adhere to the relational model [6]. Consider a client/server SQL application P with input X. X contains the input for each client as well as the initial state of each SQL database. During the execution of P, clients send SQL transactions to database servers concurrently. Whenever two or more SQL transactions share the same portion of data in the same database server, a race condition may occur. Executing these SQL transactions in different orders, i.e., with different race outcomes, may produce different results. Thus, our target programs are distributed programs in which multiple clients race for data in relational databases.

We present a framework that uses the reachability testing scheme to test SQL client/server applications. Our scheme provides a way of overcoming the nondeterministic behavior. If every execution of P with input X terminates, reachability testing of P with input X can accomplish exhaustive testing of P with input X and, thus, can determine the correctness of P with X. However, existing methods for reachability testing can only be applied to the testing of shared-memory [7] and asynchronous-message-passing [8], [9], [10] distributed programs.

First, we design the data structure of the SYN-sequence for race analysis of SQL transactions. It is the transaction sequence (T-sequence), which consists of transaction events (T-events). The T-sequence can be used to perform a prefix-based replay in reachability testing. A prefix-based replay starts the execution at a state other than the initial state. We believe that the T-sequence is the most general form of the defined SYN-sequences [7], [8], [11].

© 2004 IEEE. Published by the IEEE Computer Society.

Since the SQL transaction has the most complicated data-access model, the T-sequence can also handle other types of synchronization model, including shared-memory, semaphore, monitor, and message-passing models.

Second, we design new algorithms to derive race variants from a T-sequence. Because the data-access models of shared memory and message passing are far simpler than that of an SQL database, the algorithms for deriving race variants presented in [7], [8], [9], [10] can no longer be applied. We present two algorithms in this paper. The first one is a modified version of the algorithm presented in [7]. It generates a race-variant diagram (RV diagram) to derive race variants. The second is a novel algorithm that analyzes the order of T-events in a T-sequence to generate race variants. Formal proofs and complexity analyses are given for both algorithms. The experimental results show that the second algorithm is much more efficient than the first one, and we consider it a significant breakthrough in reachability-testing technology.

Third, we develop a scheme for performing prefix-based replay for an SQL client/server database program. We present two approaches: the first one is based on the intrinsic synchronization mechanism of an SQL database server, and the other is based on thread synchronization.

We have implemented the entire architecture and have performed many experiments on it. The experimental results show that the scheme we propose in this paper is a practical method for testing client/server database applications. In addition, the code we implemented for the experiments can be obtained via the Internet (see Section 7).

This paper is organized as follows: Section 2 surveys previous work on database testing and several approaches to the testing of distributed programs. Section 3 discusses how to perform a race analysis for SQL transactions.
In Section 4, we present two approaches for deriving race variants from a T-sequence. Section 5 presents two approaches for prefix-based replay. Section 6 presents the results of experiments and Section 7 concludes this paper.

2 PREVIOUS WORK AND TESTING APPROACHES FOR DISTRIBUTED PROGRAMS

No techniques targeted specifically toward the nondeterministic behavior of database applications have been described in the software-testing research literature. However, some studies have investigated techniques for generating automated and semiautomated tests for database systems and applications. The work of Davies et al. [12] on data-constraint testing investigated the feasibility of automatically producing test data for a database, both for load testing and for establishing the integrity of the stored data. They proposed a scheme to check whether the database can store, retrieve, update, and delete records correctly. The work in [13] proposed a testing technique to help assure that database programs meet a user's specification. Some other work has assessed the performance of database management systems rather than tested applications for correctness [14], [15], [16], [17].

The only related work is that by Chu and Dobson [18], which involved a client/server database program that exhibited nondeterministic behavior, but they did not propose a solution. However, there has been a lot of work on testing technologies for ordinary nondeterministic concurrent programs that do not involve database access. The remainder of this section provides a brief overview of this work.

[Fig. 1. Reachability testing.]

Nondeterministic testing (also called multiple-execution testing) of a concurrent program P involves the following steps: 1) Select a set of inputs of P, and 2) for each selected input X, execute P with X many times and examine the result of each execution.
Nondeterministic testing of P with input X has two major problems: some feasible SYN-sequences of P with input X may be executed many times, and some feasible sequences may never be executed [7], [19].

Deterministic testing of a concurrent program P involves the following steps: 1) Select a set of tests, each of the form (X,S), where X and S are an input and a SYN-sequence of P, respectively, and 2) for each selected test (X,S), perform a deterministic execution of P with input X according to S. The forced execution determines whether S is feasible for P with input X. Deterministic testing of P allows the use of SYN-sequences selected according to the implementation and specification of P. However, deterministic testing has additional problems to solve. One major problem is deciding which pairs of inputs and SYN-sequences to select for a concurrent program. A number of methods for solving this problem have been proposed [20], [21], [22].

Fig. 1 illustrates the concept of reachability testing. Assume that S is the SYN-sequence of an execution of P with input X. Reachability testing of P with input X and SYN-sequence S involves the following steps:

1. Use S to derive a set of prefixes of other feasible SYN-sequences of P with input X. Such prefixes (race variants of S) are derived by changing the outcome of race conditions in S. An execution which follows a race variant of S will always exercise a SYN-sequence which is different from S.

2. For each new race variant derived in Step 1, perform a prefix-based replay of P with input X and the race variant to execute and collect an additional SYN-sequence for P with input X. The prefix-based replay of P with a race variant R includes two phases:

a. Replay phase: replay all the synchronization events in R.
b. Monitor phase: subsequent synchronization events are recorded after replaying R.

3. For each new SYN-sequence collected in Step 2, repeat Steps 1 and 2.

Note that R is not a complete SYN-sequence of an execution of P. Thus, after replaying R, we have to record the subsequent synchronization events to obtain a SYN-sequence of an execution of P.

3 RACE ANALYSIS FOR SQL DATABASE TRANSACTIONS

The concurrent queries of several clients to the same database may access the same portion of data. The majority of modern SQL servers are transaction servers; the client invokes remote procedures that reside on the server with an SQL database engine. The remote procedures on the server execute a group of SQL statements called a transaction, and the SQL statements contained within this group either all succeed or all fail as a unit. Also, a transaction locks the accessed data automatically. A special case of a transaction is one with only a single SQL statement.

All process synchronizations in a parallel program were modeled in [11] as operations on shared data (or shared memory). That work showed that this characterization of process interactions is not restrictive, since many communication and synchronization primitives can be reduced to operations on shared data. In particular, message passing can be modeled as communication through a shared port or mailbox. The work in [7] showed how to perform reachability testing of concurrent programs using read and write operations on shared memory. Let P be a concurrent program using read and write operations. Each shared variable in P is assigned a version number, which is initialized to zero and increased by one after each write operation on it. An execution of P involves two types of synchronization events: read and write.
A read event is denoted as R(U,V), which refers to a read operation on variable U with the version number being V, and a write event is denoted as W(U,V), which refers to a write operation on variable U with the resulting version number being V. Thus, if W(A,1) and R(A,1) are issued by different processes in the concurrent program, we can determine that W(A,1) and R(A,1) have a race condition and that W(A,1) happened before R(A,1). In [9], Bechini and Tai presented algorithms for vector timestamps developed to determine the happened-before relations between events of an execution of a message-passing program. The algorithms can be applied to perform race analysis. In [10], Lei and Tai improved the work in [8]. It showed how to derive all race variants for asynchronous message-passing programs for efficient searching and insertion without using prefix-based testing.

However, compared to read/write access to shared memory, the data-sharing model of SQL is far more complicated. First, the data manipulation language of SQL includes the instructions INSERT, DELETE, UPDATE, SELECT, JOIN, STORED PROCEDURE, TRIGGER, RULE, and CURSOR. Second, a single SQL statement may refer to multiple tables which contain multiple rows (as opposed to the shared-memory model, in which each read or write operation only accesses a single shared variable). Third, the shared-memory model has a fixed number of shared variables which are declared and initialized before program execution. With the INSERT and DELETE operations in SQL, however, the number of rows in any table varies depending on the SQL transaction operations of the clients. We can treat read/write access to shared memory as a special case of the data-access model of SQL transactions in which each row represents a shared variable.

[Table 1. Car]

Consider the essential data structure of an SQL database. Data are stored as rows in tables. A primary key is a field whose value must be unique for each row.
It seems that the most precise way to identify the accessed rows in an SQL transaction is to store the table names as well as the primary keys of the accessed rows. However, we will show that there are some situations in which storing the primary key of each accessed row of an SQL transaction does not help to detect the race condition. We provide three examples to illustrate this point.

The first example considers the case where two concurrent database transactions are the SQL statements INSERT and SELECT, respectively. Assume the table name is car and that column id is the primary key of this table (see Table 1). The two SQL transactions issued by two concurrent clients are T1 and T2, where T1 is

    BEGIN_TRANSACTION
    INSERT INTO car VALUES (5, LS430, Lexus, 3,500,000)
    END_TRANSACTION

and T2 is

    BEGIN_TRANSACTION
    SELECT * FROM car WHERE car.car_price > 600,000
    END_TRANSACTION

Transaction T1 inserts a row (5, LS430, Lexus, 3,500,000) into table car, and transaction T2 selects rows from car for which car_price > 600,000. If T1 is executed prior to T2, then the primary keys of the rows accessed by T1 and T2 are {5} and {1, 2, 5}, respectively. By observing the primary-key information in the two T-events, it is obvious that the two clients race because they access the same row, i.e., the row with primary key 5. However, if T2 is executed prior to T1, then the primary keys of the rows accessed by T1 and T2 are {5} and {1, 2}, respectively. Since {5} ∩ {1, 2} = ∅, the primary-key information in the T-events is not sufficient for identifying the race condition. However, since it is possible for the two SQL transactions to access the same row in the database, we should still consider that the two SQL transactions exhibit a race condition.
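The order dependence in this first example can be reproduced with a small sketch. The table contents below are hypothetical, since Table 1 is not reproduced here; only the primary keys and the selection results described in the text are modeled, and the helper names `t1_insert`, `t2_select`, and `run` are assumptions made for illustration.

```python
# Hypothetical contents of table `car`: primary key -> car_price.
# Rows 1 and 2 exceed 600,000, as the text implies; row 3 is made up.
INITIAL_CAR = {1: 1_200_000, 2: 650_000, 3: 400_000}


def t1_insert(table):
    """T1: insert the row with primary key 5; return the accessed keys."""
    table[5] = 3_500_000
    return {5}


def t2_select(table):
    """T2: select rows with car_price > 600,000; return the accessed keys."""
    return {key for key, price in table.items() if price > 600_000}


def run(order):
    """Execute the transactions in the given order on a fresh copy of the table."""
    table = dict(INITIAL_CAR)
    return {txn.__name__: txn(table) for txn in order}


t1_first = run([t1_insert, t2_select])
t2_first = run([t2_select, t1_insert])

# T1 before T2: the key sets overlap on row 5, so the race is visible.
print(t1_first["t1_insert"] & t1_first["t2_select"])
# T2 before T1: the key sets are disjoint, so the race is hidden.
print(t2_first["t1_insert"] & t2_first["t2_select"])
```

The recorded key sets differ between the two orders even though the transactions are identical, which is why a recorded key list alone cannot decide whether the two transactions race.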

The second example is where the two concurrent database transactions are SELECT and UPDATE. The two database transactions issued by two concurrent clients are T3 and T4, where T3 is

    BEGIN_TRANSACTION
    SELECT * FROM car WHERE car.car_price > 600,000
    END_TRANSACTION

and T4 is

    BEGIN_TRANSACTION
    UPDATE car SET car_price = 550,000 WHERE car_name = "Altis"
    END_TRANSACTION

If T3 is executed prior to T4, then the primary keys of the rows accessed by T3 and T4 are {1, 2} and {2}, respectively. By observing the primary-key information in the two T-events, it is obvious that the two clients race because they access the same row. However, if T4 is executed prior to T3, then the primary keys of the rows accessed by T4 and T3 are {2} and {1}, respectively. In this case, the primary-key information in the T-events does not exhibit the race condition.

The above two examples show that it is inappropriate to use the primary keys of accessed rows in database transactions to decide whether two SQL transactions exhibit a race condition. This is because the accessed rows are determined by Boolean predicates in the WHERE clause of the SQL statement that are not related to the primary key.

Consider the following example where the Boolean predicates are bound to the primary key. The two database transactions issued by two concurrent clients are T5 and T6, where T5 is

    BEGIN_TRANSACTION
    SELECT * FROM car WHERE car.id in (1,3)
    END_TRANSACTION

and T6 is

    BEGIN_TRANSACTION
    UPDATE car SET car_price = 71,532 WHERE car.id=2
    END_TRANSACTION

Irrespective of whether T5 executes prior to T6 or T6 executes prior to T5, the primary keys of the rows accessed by T5 and T6 are {1, 3} and {2}, respectively. Since {1, 3} ∩ {2} = ∅, there is no data race between the two transactions.
Using the above three examples, we can design the data structures of the T-event and T-sequence so that they contain enough information to perform a race analysis. Let the client/server application P contain clients CL1, CL2, ..., CLn, and the SQL database comprise DB1, DB2, ..., DBm among all the database transaction servers, where n>1 and m>0 (note that there may be more than one database in a database server). An execution of P exercises a sequence of SQL transactions (the T-sequence). More specifically, the T-sequence of P is denoted as (T1, ..., Ti, ..., Tx), where Ti, 1 ≤ i ≤ x, is a T-event. We define the T-event as the following data structure:

    Ta = {Database_name, Database_transaction_order,
          Client_name, Client_transaction_order,
          [Table_name_1, Operation_1, Primary_Key_List_1],
          ...
          [Table_name_i, Operation_i, Primary_Key_List_i],
          ...
          [Table_name_s, Operation_s, Primary_Key_List_s]}

Ta is a T-event. The header of a T-event records the name of the database and the name of the requesting client, which here are Database_name and Client_name, respectively. Each database is associated with a transaction number, which is initialized to one and increased by one after each transaction is performed on this database. Database_transaction_order records the order of this transaction relative to all transactions executed on the database by different clients. Client_transaction_order indicates the order of this transaction relative to all transactions executed by the client.

After the header, there is a series of accessed table records. We have mentioned that a single SQL statement may access multiple rows in multiple tables. For the race analysis, we translate each SQL statement into multiple accessed table records. Each accessed table record is represented by Table_name_i, Operation_i, and Primary_Key_List_i, 1 ≤ i ≤ s. Table_name_i is the table name. We summarize the row operations as INSERT, READ, UPDATE, and DELETE instructions.
Note that these four instructions are operations on rows rather than SQL statements. Combinations of these four operations can represent any SQL data-manipulation statement, including static and dynamic SQL [5]. Primary_Key_List_i has one of the following two forms:

1. {*} indicates that the primary keys of the accessed rows depend on the state of the database and the WHERE clause in the SQL statement (e.g., SQL transactions T2, T3, and T4 shown above [footnote 1]). It means that the operation may act on any row in the table.

2. If the primary keys of the accessed rows do not depend on the state of the database and the WHERE clause in the SQL statement, Primary_Key_List_i is a set of primary keys of the form {k1, k2, ...} which records the primary keys of the accessed rows (e.g., SQL transactions T1, T5, and T6 shown above).

Example 1 (Fig. 2) shows an SQL transaction with five SQL statements and its corresponding T-event.

In the following, we define a race relation of two T-events (see Fig. 3). Two T-events are said to exhibit a transaction race if they can access the same portion of data in the same database, and if the results of their execution may differ with their order of execution.

Footnote 1. Note that transactions T1 and T6 are different from the transactions in Example 1.
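As an illustration of the T-event layout described above, the following is a minimal sketch in Python. The class names, the `ANY` sentinel used for the {*} form, and the database and client names `DB1`, `CL1`, and `CL2` are assumptions made for this example and do not come from the paper's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

ANY = "*"  # stands for the {*} form of Primary_Key_List (assumed encoding)


@dataclass(frozen=True)
class AccessRecord:
    """One accessed table record: [Table_name, Operation, Primary_Key_List]."""
    table: str
    operation: str  # one of INSERT, READ, UPDATE, DELETE
    keys: Union[str, FrozenSet[int]]  # ANY, or a fixed set of primary keys


@dataclass(frozen=True)
class TEvent:
    """Header fields followed by the accessed table records."""
    database: str
    database_order: int  # order among all transactions on this database
    client: str
    client_order: int    # order among this client's own transactions
    records: Tuple[AccessRecord, ...]


# T1 (INSERT with primary key 5): the key list is fixed by the statement.
t1 = TEvent("DB1", 1, "CL1", 1, (AccessRecord("car", "INSERT", frozenset({5})),))

# T2 (SELECT ... WHERE car_price > 600,000): the accessed rows depend on the
# database state, so the key list is {*}.
t2 = TEvent("DB1", 2, "CL2", 1, (AccessRecord("car", "READ", ANY),))

print(t1)
print(t2)
```

Keeping the records immutable makes T-events safe to store in sets or use as graph vertices, which is convenient for the race analysis that follows.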

[Fig. 2. Example 1.]

[Fig. 3. Definition 1.]

Referring to Definition 1, conditions 1-3 mean that Tu and Tv are two SQL transactions which act on the same table in a database, but are issued by different clients. Condition 4 means that the operations are not both READ, and condition 5 determines whether the two operations issued by the two clients may act on the same row. Note that {*} means that an operation is able to act on any row in a table.

In the above discussion, we have provided examples which illustrate that a race may occur between INSERT and READ or between READ and UPDATE; two INSERT operations may also race. Consider the case where two clients C1 and C2 are trying to insert a row with the same primary key into the same table in a database. If Client C1 executes prior to Client C2, then the SQL transaction of C2 will fail; conversely, C1 will fail if C2 executes first. The execution order of the two INSERT operations changes the execution results. Another case is when the two operations are DELETE. Consider the case where two clients are attempting to delete the same row from the same table. The first client will delete the row successfully, after which the second client will receive a reply from the database server stating that it cannot delete the row. The client may perform another transaction if it cannot successfully delete the row from the table. Thus, it is obvious that the order of execution of the transactions issued by the two clients affects the results. This is also a type of race between two transactions.

4 DERIVE RACE VARIANTS FROM A TRANSACTION SEQUENCE

Section 3 defined the format of a T-sequence and showed how to detect a transaction race between two T-events in a T-sequence. This section presents algorithms for deriving race variants from a T-sequence. Let the client/server application P contain clients CL1, CL2, ..., CLn, and the SQL database comprise DB1, DB2, ..., DBm among all the database transaction servers, where n>1 and m>0. Let TS be a feasible T-sequence of P with input X. X contains the input for each client and the initial state of each database. For convenience, we define the following notation: TS(CLi) is the ordered set of the T-events of Client CLi, and TS(CLi, j) is the jth T-event of Client CLi.

4.1 The Race Graph of a T-Sequence

[Fig. 4. Algorithm 1.]

Before we describe how to derive race variants for a T-sequence, we show how to construct a directed partial-ordered graph of a T-sequence (see Fig. 4). We call this the race graph of a T-sequence. The race graph is used to determine if there exists a race condition whose outcome can be changed. The vertices in a race graph represent T-events.
An edge added between two T-events e1 and e2 represents that e1 must happen before e2. We use the following example to illustrate Algorithm 1 (Fig. 4). Assume there are two databases, DB1 and DB2: TA1 and TA2 are tables of DB1; TA3 is a table of DB2; there are three clients C1, C2, and C3; and TS includes the following T-events:

    TS(C1,1) = {DB1,1,C1,1,[TA1,read,*], [TA2,read,*]},
    TS(C1,2) = {DB2,1,C1,2,[TA3,insert,4]},
    TS(C2,1) = {DB1,2,C2,1,[TA1,insert,5]},
    TS(C3,1) = {DB2,2,C3,1,[TA4,insert,5]},
    TS(C2,2) = {DB2,3,C2,2,[TA3,read,*], [TA4,read,*]},
    TS(C2,3) = {DB1,3,C2,3,[TA1,delete,*], [TA2,read,*]},
    TS(C1,3) = {DB1,4,C1,3,[TA1,read,*], [TA2,read,*]}.

We summarize the race relation between the T-events in TS in Table 2. After Steps 1 and 2 of Algorithm 1, we have the graph shown in Fig. 5a. After Step 3, we obtain the final race graph, as shown in Fig. 5b.

Given a T-sequence TS, we denote the race graph of TS as Race-graph(TS). In the following, we give definitions related to the race graph, which will be referred to in the algorithms for deriving race variants (see Figs. 7 and 8). Note that "e1 happened before e2" means not only that e1 occurred before e2, but also that there is a causality relation between e1 and e2 [23].

[Table 2. Race Relationships between T-Events in T-Sequence TS. Note that "yes" means there exists a transaction race between two T-events.]
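The race relation for this running example can be recomputed with a short sketch. Definition 1 (Fig. 3) is not reproduced in the text, so the predicate below is an assumption that follows the five conditions as paraphrased in Section 3: same database, same table, different clients, not both READ, and key lists that may overlap. The tuple encoding and function names are likewise hypothetical, and the output is what this sketch computes rather than a transcription of Table 2.

```python
from itertools import combinations

ANY = "*"  # the {*} form of Primary_Key_List

# (database, database order, client, client order, accessed table records)
TS = [
    ("DB1", 1, "C1", 1, [("TA1", "read", ANY), ("TA2", "read", ANY)]),
    ("DB2", 1, "C1", 2, [("TA3", "insert", {4})]),
    ("DB1", 2, "C2", 1, [("TA1", "insert", {5})]),
    ("DB2", 2, "C3", 1, [("TA4", "insert", {5})]),
    ("DB2", 3, "C2", 2, [("TA3", "read", ANY), ("TA4", "read", ANY)]),
    ("DB1", 3, "C2", 3, [("TA1", "delete", ANY), ("TA2", "read", ANY)]),
    ("DB1", 4, "C1", 3, [("TA1", "read", ANY), ("TA2", "read", ANY)]),
]


def keys_may_overlap(a, b):
    """{*} may touch any row; otherwise the two key sets must intersect."""
    return a == ANY or b == ANY or bool(a & b)


def races(u, v):
    """Assumed race predicate, paraphrasing conditions 1-5 of Definition 1."""
    if u[0] != v[0] or u[2] == v[2]:  # different database, or same client
        return False
    return any(
        ta == tb
        and not (oa == "read" and ob == "read")
        and keys_may_overlap(ka, kb)
        for ta, oa, ka in u[4]
        for tb, ob, kb in v[4]
    )


def label(event):
    return f"TS({event[2]},{event[3]})"


# Orient each racing pair by its database transaction order.
race_pairs = []
for u, v in combinations(TS, 2):
    if races(u, v):
        first, second = sorted((u, v), key=lambda e: e[1])
        race_pairs.append((label(first), label(second)))

for pair in race_pairs:
    print(pair)
```

Under these assumptions, TS(C1,2) and TS(C3,1) do not race because they touch different tables, while TS(C3,1) precedes TS(C2,2) and TS(C2,1) precedes TS(C1,3), which agrees with the relations the text uses later.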

[Fig. 5. An example of the race graph.]

If f is a feasible prefix of a T-sequence TS, then f corresponds to a feasible intermediate execution state which can reach the execution state of TS. We use Fig. 6 to illustrate Definition 3. P1 is a feasible prefix of TS, but P2 is not a feasible prefix of TS because TS(C2,1) happened before TS(C1,3). Since TS(C1,3) is in P2, TS(C2,1) should also be included in P2 to make P2 a feasible prefix of TS.

[Fig. 6. An example illustrating the definition of feasible prefix.]

4.2 Generate an RV Diagram to Derive Race Variants

In this paper, we propose two schemes for deriving race variants from a T-sequence TS, in Sections 4.2 and 4.3, respectively. In the first scheme, we construct the RV diagram for TS, which is a tree in which the path from the root to a node represents a feasible prefix or a race variant of TS. The nodes in the RV diagram for TS are generated by considering all the possible interleavings of T-events. It is an n-ary tree if the number of concurrent clients is n. For a node in the RV diagram for TS, the path from the root node of the diagram to this node is a totally ordered sequence of T-events.

The RV diagram scheme was first proposed in [7], which uses the version number of variable access to determine if there exists a different race outcome in the nodes of the RV diagram. However, the transaction order is insufficient for race analysis in a T-sequence. As shown in the example of Section 4.1 (Fig. 5b), two T-events TS(C1,2) = {DB2, 1, C1, 2, [TA3, insert, 4]} and TS(C3,1) = {DB2, 2, C3, 1, [TA4, insert, 5]} access the same database with different transaction orders, but do not race. The reordering of these two T-events cannot change the execution result. Thus, the algorithm for generating race variants from the read-write sequence presented in [7] cannot be applied to the T-sequence.
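The feasible-prefix condition from Section 4.1, on which the RV diagram relies, can be sketched as a closure check. Definition 3 (Fig. 8) is not reproduced in the text, so this sketch assumes that a set of T-events is a feasible prefix when it contains every T-event that happened before one of its members. The edge list is also an assumption: it combines each client's own transaction order with race pairs ordered by database transaction order, and the two sample prefixes are hypothetical rather than the P1 and P2 of Fig. 6.

```python
# Assumed happened-before edges for the running example (Ti,j = TS(Ci,j)).
EDGES = [
    ("T1,1", "T1,2"), ("T1,2", "T1,3"),  # order of client C1
    ("T2,1", "T2,2"), ("T2,2", "T2,3"),  # order of client C2
    ("T1,1", "T2,1"), ("T1,1", "T2,3"),  # races on DB1
    ("T2,1", "T1,3"), ("T2,3", "T1,3"),
    ("T1,2", "T2,2"), ("T3,1", "T2,2"),  # races on DB2
]


def missing_predecessors(prefix, edges=EDGES):
    """Return events that happened before a member of prefix but are absent."""
    members = set(prefix)
    return {a for a, b in edges if b in members and a not in members}


def is_feasible_prefix(prefix, edges=EDGES):
    """A prefix is treated as feasible when no predecessor is missing."""
    return not missing_predecessors(prefix, edges)


good = {"T1,1", "T1,2"}
bad = {"T1,1", "T1,2", "T1,3"}

print(is_feasible_prefix(good))
print(is_feasible_prefix(bad), sorted(missing_predecessors(bad)))
```

Checking only direct edges is enough here: if every member has all of its direct predecessors in the set, then by induction all of its ancestors are in the set as well.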
Since the version number is insufficient for determining whether race conditions are changed, the RV diagram defined in [7] is insufficient. Algorithm 2 (see Fig. 9) is our new algorithm for generating an RV diagram based on the race graph. Each node N in the RV diagram for TS contains a client-transaction-order vector and a race graph:

Client-transaction-order vector: (I1, I2, ..., In), where Ij, 1 ≤ j ≤ n, is the order number of the last T-event in the jth client that is executed for the generated node N.

Race graph: This represents a race variant if N is a race-variant node; otherwise, it represents a feasible prefix of the T-sequence if N is a prefix node.

[Fig. 7. Definition 2.]

[Fig. 8. Definition 3.]

[Fig. 9. Algorithm 2.]

Algorithm 2 (Fig. 9) shows how to derive race variants from a T-sequence. It generates an RV diagram to simulate the issuing of T-events by clients. Each generation of a child node represents an execution of a T-event by a client. A prefix node corresponds to a feasible prefix of TS which does not have different race outcomes. A race-variant node contains a sequence of T-events which have different race outcomes. A node that is marked is either a race-variant node or a prefix node which does not need to generate child nodes anymore.

We use the T-sequence TS in the running example of Section 4.1 (Fig. 5b) to illustrate Algorithm 2. Fig. 10 is part of the RV diagram generated from TS. Note that TS(Ci,j) is abbreviated to Ti,j in the figure. There are two race-variant nodes. Consider the prefix node with client transaction order (1,1,0). In Step 3 of Algorithm 2, observe the following:

j=1: This generates a prefix node whose client transaction order is (2,1,0). Since there is already a prefix node with client transaction order (2,1,0), this prefix node is labeled marked.

j=2: This generates a node whose client transaction order is (1,2,0), which adds TS(C2,2) to its own race graph. In the target T-sequence TS, TS(C3,1) happened before TS(C2,2), but TS(C3,1) is not in its race graph. Thus, it is a race-variant node.

j=3: This generates a prefix node whose client transaction order is (1,1,1).

[Fig. 10. An example of the RV diagram.]

Fig. 11 shows all the race variants derived by Algorithm 2 with the T-sequence TS in the running example of Section 4.1 (Fig. 5b) as input. A total of 10 race variants are derived: RV2-1 to RV2-10. It is easy to see that some race variants are feasible prefixes of others. For example, RV2-3 is a feasible prefix of both RV2-6 and RV2-10; RV2-10 is a feasible prefix of RV2-6. As a result, prefix-based replay of the three race variants RV2-3, RV2-6, and RV2-10 may generate the same T-sequence. This means that duplicate tests are possible during the testing process. We discuss this issue in more detail in Section 4.5.

4.3 Analyzing Edges in a Race Graph to Derive Race Variants

In this section, we propose another algorithm for deriving race variants. Instead of generating an RV diagram, it only has to analyze the edges in the race graph of a T-sequence. The algorithm tries to reverse the direction of the edges so as to derive race variants of a T-sequence. The analysis of the edges is based on the in-edges of each T-event in the race

9 HWANG ET AL.: TECHNOLOGY FOR TESTING NONDETERMINISTIC CLIENT/SERVER DATABASE APPLICATIONS 9 Fig. 12. In-edges and out-edges of a T-event in a race graph. analysis and investigating the duplication of the same tests by the two algorithms. Fig. 11. Race graphs of the race variants derived by Algorithms 2 and 3. graph (see Definition 4 (Fig. 13) for the definitions of the inedge and out-edge). We use Fig. 12 to illustrate Definition 4. The in-edges of T-event TS(2,2) are (TS(1,1), TS(2,2)) and (TS(1,2), TS(2,2)). The out-edges of T-event TS(2,2) are (TS(2,2), TS(3,1)) and (TS(2,2), TS(3,3)). With Definition 4, we now present Algorithm 3 (see Fig. 14). For each T-event, if it is with k in-edges, the algorithm generates 2 k 1 subsets of in-edges and tries to reverse them so as to generate race variants. Also, after reversing some edges, a cycle means that the sequence is infeasible, i.e., e 1 happened before e 2 and e 2 happened before e 1 is not possible. We use Fig. 15 to illustrate Algorithm 3. In Step 4 of Algorithm 3, assume E is the second event of Client 2. There are two in-edges of E. We show the three cases for i ¼ 1, i ¼ 2, and i ¼ 3. Note that i is the index variable in Step 4. Fig. 11 shows all the race variants derived by Algorithm 3 with the T-sequence TS in Section 4.1 (Fig. 5b) as input. Note that TS(Ci,j) is abbreviated to Ti,j. There is a total of six race variants derived: RV3-1 to RV3-6. However, Algorithm 2 derives 10 race variants from the same T-sequence. This motivates us to study the correctness of the two algorithms and the differences between them. In Section 4.4, we prove that reachability testing with the two algorithms to derive race variants can execute all the possible T-sequences if the client/server program always terminates. In Section 4.5, we compare the two algorithms. 
We focus on the running time 4.4 The Correctness of Algorithms 2 and 3 In this section, we investigate whether the two algorithms can perform the exhaustive testing for a client/server program. We first need two additional definitions (see Figs. 17 and 18). In Step 3 of Algorithm 4 (see Fig. 19), it first adds a T-event e and its edges to G MCFP if e is a not-happenedbefore node in both RG 1 and RG 2. Then, it removes e from RG 1 and RG 2. This process continues until there does not exist any T-event which is a not-happened-before node in both RG 1 and RG 2. The algorithm shows one method of deriving the MCFP from two T-sequences, from which we can easily know that the MCFP of two T-sequences is unique. Fig. 16 shows an example for the definition of the MCFP. In the example, the MCFP has three T-events. In the extreme case, the MCFP of two T-sequences has no T-event. Now, we show why all the feasible T-sequences can be derived if we use Algorithms 2 and 3 to derive race variants in reachability testing (see Theorem 1 and Theorem 2). Theorem 1. Assume that every execution of a client server program P with input X terminates. According to Algorithm 2, reachability testing of P with input X derives and executes all feasible T-sequences of P with input X. Proof. Refer to Fig. 1. Assume we have a T-sequence S of P with input X. Since every execution of P with input X terminates, we assume that the number of T-events in S is n, where n is a positive integer, which we denote in the following as jsj ¼n. Assume a feasible T-sequence S 0 of P with input X. We prove that conducting the process shown in Fig. 1 by applying Algorithm 2 to derive race variants can always perform a testing of P with X and S 0. Let T M be the MCFP of S and S 0, and let jt M j¼m,mis a positive integer. It is trivial that m jsj and m js 0 j. Fig. 13. Definition 4.

Reachability testing will first invoke Algorithm 2 with S as its input (see Fig. 1). As the nodes in the RV diagram are generated by considering all the possible interleavings of the T-events of S, the generated RV diagram must contain at least one prefix node whose race graph is the same as the race graph of $T_M$. This is because $T_M$ is also a feasible prefix of S. We denote this node $N_M$. In the following, we prove that at least one child node of $N_M$ is a feasible prefix of $S'$.

Let $T_{S-M}$ be the set of T-events in S but not in $T_M$, and let $T_{S'-M}$ be the set of T-events in $S'$ but not in $T_M$. Without loss of generality, let e be a not-happened-before event in Race-graph($T_{S'-M}$). First, $T_M + e$ is a feasible prefix of $S'$. Second, e cannot be a not-happened-before event in Race-graph($T_{S-M}$): if e were a not-happened-before event in both Race-graph($T_{S-M}$) and Race-graph($T_{S'-M}$), then we would have $e \in T_M$. According to Step 3 of Algorithm 2, a child node is generated that has Race-graph($T_M + e$) as its race graph. Since there must exist an event that happened before e in Race-graph(S), this child node is a race-variant node of the generated RV diagram.

Then, reachability testing conducts a prefix-based replay of $T_M + e$. Assume that this produces a T-sequence $S_1$. It is trivial that $T_M + e$ is a feasible prefix of both $S_1$ and $S'$. Let $T_{M+1}$ be the MCFP of $S_1$ and $S'$. We easily see that $|T_{M+1}| \ge |T_M| + 1 = m + 1$. Again, applying Algorithm 2 to $S_1$ derives a race variant of size $|T_{M+1}| + 1$. Repeating the above process, we eventually derive a T-sequence that is equal to $S'$ in the prefix-based replay. □

Fig. 14. Algorithm 3.
Fig. 15. An example to illustrate Algorithm 3.
Fig. 16. Two examples illustrating the definition of MCFP.
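The MCFP used in the proof above can be computed by the peeling procedure that the text describes for Algorithm 4. The following Java sketch is illustrative only and is not the authors' implementation: the class and method names are hypothetical, a race graph is assumed to be given as a map from each T-event to the set of events with an edge into it, and a not-happened-before node is assumed to be a node with no remaining in-edge. Only the node set of the MCFP is returned; its edges are omitted for brevity.

```java
import java.util.*;

/** Illustrative sketch of the MCFP derivation (hypothetical names, assumed graph encoding). */
final class McfpSketch {

    /** Convenience helper: the set of events with an edge into some event. */
    static Set<String> preds(String... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    /**
     * Each graph maps an event id to the ids of the events that happened before it.
     * Every event must appear as a key. Returns the events of the common prefix.
     */
    static Set<String> mcfp(Map<String, Set<String>> rg1, Map<String, Set<String>> rg2) {
        // Work on copies so that the callers' graphs are left untouched.
        Map<String, Set<String>> g1 = copy(rg1);
        Map<String, Set<String>> g2 = copy(rg2);
        Set<String> prefix = new LinkedHashSet<>();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String e : new ArrayList<>(g1.keySet())) {
                // Assumption: "not-happened-before" means no remaining in-edge.
                if (g2.containsKey(e) && g1.get(e).isEmpty() && g2.get(e).isEmpty()) {
                    prefix.add(e);
                    remove(g1, e);
                    remove(g2, e);
                    progress = true;
                }
            }
        }
        return prefix;
    }

    private static Map<String, Set<String>> copy(Map<String, Set<String>> g) {
        Map<String, Set<String>> c = new LinkedHashMap<>();
        g.forEach((k, v) -> c.put(k, new HashSet<>(v)));
        return c;
    }

    /** Deletes event e and every edge leaving it. */
    private static void remove(Map<String, Set<String>> g, String e) {
        g.remove(e);
        for (Set<String> p : g.values()) {
            p.remove(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical T-sequences: S orders T1,1 -> T2,1 -> T3,1; S' orders T1,1 -> T3,1 -> T2,1.
        Map<String, Set<String>> s = new HashMap<>();
        s.put("T1,1", preds());
        s.put("T2,1", preds("T1,1"));
        s.put("T3,1", preds("T2,1"));
        Map<String, Set<String>> sPrime = new HashMap<>();
        sPrime.put("T1,1", preds());
        sPrime.put("T3,1", preds("T1,1"));
        sPrime.put("T2,1", preds("T3,1"));
        System.out.println(mcfp(s, sPrime));
    }
}
```

In this hypothetical example only the first T-event is common to both sequences, so the loop stops after one removal; two sequences that disagree on their first event would yield an empty MCFP, which is the extreme case mentioned in Section 4.4.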

Fig. 17. Definition 5.
Fig. 18. Definition 6.
Fig. 19. Algorithm 4.

Theorem 2. Assume that every execution of a client/server program P with input X terminates. With Algorithm 3 used to derive race variants, reachability testing of P with input X derives and executes all feasible T-sequences of P with input X.

Proof. Refer to Fig. 1. Assume that we have a T-sequence S of P with input X. Since every execution of P with input X terminates, $|S| = n$, where n is a positive integer. Assume a feasible T-sequence $S'$ of P with input X. We prove that conducting the process in Fig. 1, with Algorithm 3 applied to derive race variants, can always perform a test of $S'$. Let $T_M$ be the MCFP of S and $S'$, and let $|T_M| = m$, where m is a positive integer. It is trivial that $m \le |S|$ and $m \le |S'|$.

Reachability testing will first invoke Algorithm 3 with S as its input (see Fig. 1). As in the proof of Theorem 1, let $T_{S-M}$ be the set of T-events in S but not in $T_M$, and let $T_{S'-M}$ be the set of T-events in $S'$ but not in $T_M$. Without loss of generality, let e be a not-happened-before event in Race-graph($T_{S'-M}$). It is trivial that e cannot be a not-happened-before event in Race-graph($T_{S-M}$). Then, there must be some in-edges to e from the events in Race-graph($T_{S-M}$) (see Fig. 20). By changing the directions of the in-edges of e from the nodes in Race-graph($T_{S-M}$), we can obtain a feasible prefix $T_M + e$ of $S'$. Because Algorithm 3 has no knowledge of $T_M$ in the process of deriving race variants, Step 4 of Algorithm 3 changes the direction of the in-edges of each event of S in every combination, i.e., in $2^k - 1$ ways for an event with k in-edges. The change of in-edge directions of e in S that derives $T_M + e$ must be one of these $2^k - 1$ cases (see Fig. 20).

Then, reachability testing conducts a prefix-based replay of $T_M + e$. Assume that this produces a T-sequence $S_1$. It is trivial that $T_M + e$ is a feasible prefix of both $S_1$ and $S'$. Let $T_{M+1}$ be the MCFP of $S_1$ and $S'$. We easily see that $|T_{M+1}| \ge |T_M| + 1 = m + 1$. Again, applying Algorithm 3 to $S_1$ derives a race variant of size $|T_{M+1}| + 1$. Repeating the above process, we eventually derive a T-sequence that is equal to $S'$ in the prefix-based replay. □

Fig. 20. Changing the direction of some of the in-edges of e to obtain a feasible prefix of $S'$.
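The enumeration that the proof above relies on, namely trying every nonempty subset of the in-edges of an event and discarding reversals that create a cycle, can be sketched in Java as follows. This is a hedged illustration, not the authors' Algorithm 3: the class, the Edge type, and the method names are hypothetical, the example graph is invented, and the sketch stops at the cycle check without reproducing how Fig. 14 turns a reversed graph into a race variant. Cycle detection here uses Kahn's topological-sort algorithm, which is our choice for the sketch.

```java
import java.util.*;

/** Illustrative sketch of reversing subsets of in-edges (hypothetical names). */
final class EdgeReversalSketch {

    /** Directed edge meaning "from happened before to". */
    static final class Edge {
        final String from;
        final String to;
        Edge(String from, String to) { this.from = from; this.to = to; }
        Edge reversed() { return new Edge(to, from); }
        @Override public boolean equals(Object o) {
            return o instanceof Edge && ((Edge) o).from.equals(from) && ((Edge) o).to.equals(to);
        }
        @Override public int hashCode() { return Objects.hash(from, to); }
        @Override public String toString() { return from + "->" + to; }
    }

    /**
     * Reverses every nonempty subset of the given in-edges of one event and
     * returns the resulting edge sets that remain acyclic.
     */
    static List<Set<Edge>> reversals(Set<String> events, Set<Edge> edges, List<Edge> inEdges) {
        List<Set<Edge>> feasible = new ArrayList<>();
        int k = inEdges.size();
        for (int mask = 1; mask < (1 << k); mask++) { // the 2^k - 1 nonempty subsets
            Set<Edge> g = new HashSet<>(edges);
            for (int b = 0; b < k; b++) {
                if ((mask & (1 << b)) != 0) {
                    Edge in = inEdges.get(b);
                    g.remove(in);
                    g.add(in.reversed());
                }
            }
            if (isAcyclic(events, g)) {
                feasible.add(g); // a cycle would make the ordering infeasible
            }
        }
        return feasible;
    }

    /** Kahn's algorithm: the graph is acyclic iff every node can be peeled off. */
    static boolean isAcyclic(Set<String> events, Set<Edge> edges) {
        Map<String, Integer> indeg = new HashMap<>();
        for (String v : events) {
            indeg.put(v, 0);
        }
        for (Edge e : edges) {
            indeg.merge(e.to, 1, Integer::sum);
        }
        Deque<String> ready = new ArrayDeque<>();
        indeg.forEach((v, d) -> { if (d == 0) ready.add(v); });
        int removed = 0;
        while (!ready.isEmpty()) {
            String v = ready.poll();
            removed++;
            for (Edge e : edges) {
                if (e.from.equals(v) && indeg.merge(e.to, -1, Integer::sum) == 0) {
                    ready.add(e.to);
                }
            }
        }
        return removed == events.size();
    }

    public static void main(String[] args) {
        // Invented graph: two clients, two T-events each; T2,2 has two cross-client in-edges.
        Set<String> events = new HashSet<>(Arrays.asList("T1,1", "T1,2", "T2,1", "T2,2"));
        Edge in1 = new Edge("T1,1", "T2,2");
        Edge in2 = new Edge("T1,2", "T2,2");
        Set<Edge> edges = new HashSet<>(Arrays.asList(
                new Edge("T1,1", "T1,2"), new Edge("T2,1", "T2,2"), in1, in2));
        System.out.println(reversals(events, edges, Arrays.asList(in1, in2)).size());
    }
}
```

In the invented graph, reversing only the edge from T1,1 closes a cycle through the program-order edge of Client 1, so that subset is rejected, while the other two subsets stay acyclic. The bitmask loop makes the $2^k - 1$ factor in the running-time bound of Algorithm 3 directly visible.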

4.5 Analysis of Algorithms 2 and 3

The analysis of the algorithms is divided into two parts. One is the complexity analysis of Algorithms 1, 2, and 3 (Complexity Analyses 1-3). The other is the analysis of duplicated tests during reachability testing (Theorems 3-9).

Complexity Analysis 1: The running time of the algorithm for deciding whether $T_u$ and $T_v$ exhibit a transaction race in Definition 1 is $O(m \cdot n \cdot M_{PKL}^2)$, where m and n are the numbers of accessed table records in $T_u$ and $T_v$, respectively, and $M_{PKL}$ is the maximal number of elements in the Primary_Key_List.

Proof. For each pair of [Table_name_ui, Operation_ui, Primary_Key_List_ui] in $T_u$ and [Table_name_vj, Operation_vj, Primary_Key_List_vj] in $T_v$, $1 \le i \le m$, $1 \le j \le n$, it takes $O(M_{PKL}^2)$ time to check whether Primary_Key_List_ui $\cap$ Primary_Key_List_vj is the empty set. Since there are $m \cdot n$ pairs of accessed table records to be checked, the running time is $O(m \cdot n \cdot M_{PKL}^2)$. □

Complexity Analysis 2: Assume that the T-sequence TS is from the execution of P with n clients. Let $|TS|$, K, $M_{ATR}$, and $M_{PKL}$ be the number of T-events in TS, the number of edges between different clients in the race graph of TS, the maximal number of accessed table records in the T-events of TS, and the maximal number of elements in Primary_Key_List, respectively. The running time for Algorithm 1 to generate the race graph of TS is $O(M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + K)$. The running time for Algorithm 2 to derive race variants from TS is $O(|TS|^2 \cdot K + n^{|TS|} + M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2)$.

Proof. Algorithm 1 first needs $O(M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|(|TS| - 1)/2) = O(M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2)$ time to compute all the transaction race relationships between the $|TS|$ T-events. Steps 1, 2, and 3 of Algorithm 1 are bounded by $O(|TS|)$, $O(|TS|)$, and $O(K)$, respectively. Thus, the running time of Algorithm 1 is $O(M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + |TS| + |TS| + K) = O(M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + K)$.
For Algorithm 2, the running time for identifying the happened-before relations between the $|TS|$ T-events is $O(|TS|^2 \cdot K)$. The RV diagram for TS is an n-ary tree with a maximal depth of $|TS|$, because each expansion of a child node from a prefix node simulates the execution of one T-event and the total number of T-events is $|TS|$. The number of nodes in an n-ary tree with depth $|TS|$ is $(n^{|TS|+1} - 1)/(n - 1)$, and $O((n^{|TS|+1} - 1)/(n - 1)) = O(n^{|TS|})$. The running time of Algorithm 2 is bounded by the time to generate the race graph and the number of nodes in the generated RV diagram. Hence, the running time of Algorithm 2 is $O(|TS|^2 \cdot K + n^{|TS|} + M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + K) = O(|TS|^2 \cdot K + n^{|TS|} + M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2)$. □

Complexity Analysis 3: The running time for Algorithm 3 to derive race variants from TS is $O(|TS|^2 \cdot 2^K + M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + K)$. Note that $|TS|$, K, $M_{ATR}$, and $M_{PKL}$ are defined in Complexity Analysis 2.

Proof. Assume that the $|TS|$ T-events are $V_1, V_2, \ldots, V_{|TS|}$. Let $k_i$ be the number of in-edges of $V_i$. Then, we have $K = k_1 + k_2 + \cdots + k_{|TS|}$. In addition to generating the race graph for TS, the running time of Algorithm 3 is bounded by the time for executing Step 4 for each T-event, i.e., $O(|TS| \cdot (2^{k_i} - 1))$ for $V_i$, where the factor $O(|TS|)$ is the cost of checking whether there is a loop in $G'$. Thus, the running time of Step 4 is $|TS|(2^{k_1} - 1) + |TS|(2^{k_2} - 1) + \cdots + |TS|(2^{k_{|TS|}} - 1) < |TS| \cdot 2^K + |TS| \cdot 2^K + \cdots + |TS| \cdot 2^K = |TS|^2 \cdot 2^K$. The running time is therefore $O(|TS|^2 \cdot 2^K + M_{ATR}^2 \cdot M_{PKL}^2 \cdot |TS|^2 + K)$. □

According to Fig. 11, the numbers of race variants generated by Algorithms 2 and 3 may differ. That is, given the same T-sequence, Algorithms 2 and 3 may derive different sets of race variants. We can divide the race variants into six groups. For example, among the race variants derived by Algorithm 2, RV2-3 is a feasible prefix of both RV2-6 and RV2-10, and RV2-10 is a feasible prefix of RV2-6 in group 5. A similar situation also holds in groups 1 and 6.

Theorem 3. Assume that RV1 and RV2 are two race variants derived by Algorithm 2 from a T-sequence. It is possible that RV1 is a feasible prefix of RV2.

Proof. This comes directly from the example shown in Fig. 11, where RV2-3 is a feasible prefix of both RV2-6 and RV2-10. □

Theorem 4. Assume two race variants RV1 and RV2. If RV1 is a feasible prefix of RV2, then the prefix-based replays of RV1 and RV2 with the same input may produce the same T-sequence.

Proof. Assume that the prefix-based replays of RV1 and RV2 produce two feasible T-sequences T1 and T2, respectively. T1 is prefixed with RV1. The T-events of T1 after RV1 are obtained in the monitor phase of the prefix-based replay. Since the execution of T-events in the monitor phase is not controlled, any feasible sequence of T-events may be produced. Since T2 is also prefixed with RV1 (because T2 is prefixed with RV2 and RV2 is prefixed with RV1), it is possible that T1 is equal to T2. □

Theorem 3 and Theorem 4 show that it is possible to duplicate the same test if we employ Algorithm 2 to derive race variants. However, it is impossible for Algorithm 3 to duplicate the same test in this situation (see Theorem 5).

Theorem 5. Assume a T-sequence TS, and that RV1 and RV2 are two race variants derived by applying Algorithm 3. It is impossible that RV1 is a feasible prefix of RV2.

Proof. According to the operation of Algorithm 3, we consider the following two cases:

1) We assume that RV1 and RV2 are generated by altering the in-edges of the same node in Step 4 of Algorithm 3. If the node is e, then e must be a T-event in both RV1 and RV2. However, according to Step 4 of Algorithm 3, the in-edges of e in RV1 and the in-edges of e in RV2 must be different. Thus, it is impossible for RV1 to be equivalent to RV2 in this case.

2) We assume that RV1 and RV2 are generated by altering the in-edges of different nodes in Step 4 of Algorithm 3. Assume that RV1 is generated by altering the in-edges of node e. If RV1 is a feasible prefix of RV2, e must also be a T-event of RV2. However, since Algorithm 3 changes the in-edges of e to derive RV1, and RV2 is derived by altering the in-edges of another node, the in-edges of e must be different in RV1 and RV2, so it is impossible for RV1 to be a feasible prefix of RV2.

It is clear that each case leads to a contradiction. □

We have shown in Theorem 5 that it is impossible for Algorithm 3 to derive, from the same T-sequence, two race variants of which one is the other's feasible prefix. Also, in the example shown in Fig. 11, none of the race variants derived by Algorithm 3 is another's feasible prefix. However, this does not mean that performing reachability testing with Algorithm 3 to derive race variants will not duplicate the same test. Consider the race graph G of a T-sequence shown in Fig. 21. By changing the in-edges of two T-events E1 and E2 in G, we can derive at least two race variants RV1 and RV2. Fig. 21 also shows that it is possible for the two T-sequences obtained by prefix-based replay of RV1 and RV2 to be identical. Since each prefix-based replay of a race variant conducts a test, Algorithm 3 cannot avoid duplication of the same test either.

Fig. 21. A race graph G and two of its race variants RV1 and RV2.

The example shown in Fig. 21 motivates Theorem 6. In addition to Theorem 4, Theorem 6 shows another situation in which the prefix-based replays of two race variants may produce the same T-sequence.

Theorem 6. Assume two race variants RV1 and RV2. Let M be the MCFP of RV1 and RV2.
If (set of nodes in Race-graph(RV1-M) $\cap$ set of nodes in Race-graph(RV2-M)) is the empty set, where the "-" in (RV1-M) denotes a set difference, then the prefix-based replays of RV1 and RV2 may produce the same T-sequence.

Proof. The prefix-based replay of a race variant includes two steps: 1) controlling the execution of the concurrent program by following the execution order specified in the race variant, and then 2) executing the concurrent program without any control and recording the order of the executed synchronization events. Consider the prefix-based replay of RV1. Because we do not control the execution of the concurrent program in Step 2, it is possible to produce a T-sequence that has RV2 as its feasible prefix. □

Following Theorem 6, Theorem 7 discusses a situation in which it is impossible for the prefix-based replays of two race variants to produce the same T-sequence.

Theorem 7. Assume two race variants RV1 and RV2. Let M be the MCFP of RV1 and RV2. If (set of nodes in Race-graph(RV1-M) $\cap$ set of nodes in Race-graph(RV2-M)) is not the empty set, then the prefix-based replays of RV1 and RV2 produce different T-sequences.

Proof. We first assume $e \in$ (set of nodes in Race-graph(RV1-M) $\cap$ set of nodes in Race-graph(RV2-M)). Note that e is a T-event. Let N1 and N2 be the sets of not-happened-before nodes in Race-graph(RV1-M) and Race-graph(RV2-M), respectively. According to the definition of MCFP, we have $N1 \cap N2 = \emptyset$. We consider the following two cases:

1) Assume that node e is in one of N1 and N2. Without loss of generality, we assume e is in N1. Since $N1 \cap N2 = \emptyset$, e is in Race-graph(RV2-M-N2). The replays of RV1 and RV2 cannot produce the same T-sequence because one of the nodes in N2 must occur before e in RV2.

2) Assume that e is in neither N1 nor N2 (i.e., e is in Race-graph(RV1-M-N1) and Race-graph(RV2-M-N2)).
It is impossible for the replays of RV1 and RV2 to produce the same T-sequence because there must exist a node $x \in N1$ that occurred before e in RV1 and a node $y \in N2$ that occurred before e in RV2. Since $N1 \cap N2 = \emptyset$, we have $x \ne y$. □

Theorem 8. Assume a T-sequence TS, and that RV1 and RV2 are two race variants derived by applying Algorithm 2. It is possible that (set of nodes in Race-graph(RV1-M) $\cap$ set of nodes in Race-graph(RV2-M)) is the empty set.

Proof. This comes directly from the example shown in Fig. 11. Assume that M is the MCFP of RV2-3 and RV2-10. Actually, M is RV2-3. Then, (set of nodes in Race-graph(RV2-3-M) $\cap$ set of nodes in Race-graph(RV2-10-M)) is the empty set. □

Theorem 9. Assume a T-sequence TS, and that RV1 and RV2 are two race variants derived by applying Algorithm 3. It is possible that (set of nodes in Race-graph(RV1-M) $\cap$ set of nodes in Race-graph(RV2-M)) is the empty set.

Proof. This comes directly from the example shown in Fig. 21. The set of nodes in Race-graph(RV1-M) is {E1} and the set of nodes in Race-graph(RV2-M) is {E2}. Their intersection is the empty set. □

With Theorems 6 and 9, we can conclude that it is also possible to duplicate the same test if we employ Algorithm 3 to derive the race variants during reachability testing.

From the above discussion, we conclude that both Algorithms 2 and 3 may conduct the same test more than once. However, the probability of Algorithm 3 duplicating the same test seems lower. Fig. 11 is an illustrative example. The experimental results in Section 6 also demonstrate this.

5 PREFIX-BASED REPLAY FOR DATABASE TRANSACTIONS

In addition to the implementation of Algorithms 2 and 3, we have also implemented a prefix-based replay technique for race variants based on T-sequences. The prefix-based replay of a race variant includes two steps: 1) controlling the execution of the program by following the execution order specified in the race variant, and then 2) executing the program without any control and recording the order of the executed T-events. Note that the two steps are the replay and monitor phases we mentioned in Section 2. After the two steps, the program has been executed once. This procedure conducts a test and produces a T-sequence. For example, see Fig. 21. RV1 is the race variant to be replayed in the first step, and we obtain six T-events in the second step. This produces a T-sequence with nine T-events.

A scheme for performing monitoring and replaying of concurrent shared-memory programs has been presented previously [11].
The scheme cannot perform prefix-based replay of a race variant because it cannot switch between the monitoring and replay phases during execution. Prefix-based replay must first replay the execution order according to the target race variant and then monitor the execution of the concurrent program, which together form its two steps. A method has been proposed [8] for performing prefix-based replay of a race variant, but it can only be applied to concurrent software that uses shared memory or semaphores to perform process synchronization. In this paper, we propose a scheme that performs prefix-based replay for an SQL client/server program.

Fig. 22 shows how to modify the original client/server program to perform prefix-based replay of a race variant. First of all, some variables must be added to the modified program to control its replaying and monitoring. They can be divided into two classes: 1) those stored in the tables of databases and 2) local variables in each client process. These variables should be initialized before performing the prefix-based replay. For each T-event E in the original program, an entry protocol and an exit protocol must be inserted before and after E, respectively.

Fig. 22. How to modify the source program to perform prefix-based replay.

In this paper, we present two methods for performing prefix-based replay based on the data structure of the T-sequence defined in this paper. For the replay phase in Step 1 and the monitor phase in Step 2 of the prefix-based replay, we need a mechanism to synchronize the execution of concurrent clients. We present entry and exit protocols based on two types of synchronization mechanism: one is based on the built-in synchronization functions of the SQL database system, and the other uses Java thread synchronization. We first present the entry and exit protocols based on the built-in SQL database synchronization mechanism.
Note that Client 1 to Client n are processes executed on different machines. Clients executed on the same machine can be treated as a special case. Some variables are added to the databases and to the client processes. The variables added to the databases are as follows:

- We add a table Transaction_order to each database. There is only one row in this table. The primary key is DTO, which represents the transaction order of the database in replaying and monitoring. The initial value of DTO is 1.
- We add a new database called Un_replay_client, which has only one table, Un_finish_replay_client. There is only one row in this table, and its primary key is UFRC. This variable represents the number of clients that have not finished the replay phase. The initial value of UFRC is the total number of clients in the application.

The local variables added to each client process are as follows:

- Mode: This variable indicates whether the client is in Monitor or Replay mode. Its initial value is Replay.
- Number of replay events: This constant represents the number of T-events that the client should replay. Assume that RV is the race variant that is going to be replayed. The constant is initialized to the number of T-events of the client in RV.
- Client transaction order: This variable records the order of the T-event that the client is going to replay or monitor. Its initial value is 1.

The following local procedures are invoked in the client process:

- LockDB(Name of Database): This function locks the database Name of Database. While a database is locked by a client, all the database transactions issued by other clients are detained until the client calls UnLockDB(Name of Database). Almost all implementations of database servers support a way to lock the database, such as the setTransactionIsolation member function of the Connection object in Java Database Connectivity [24].
- UnLockDB(Name of Database): This function unlocks the database Name of Database.
- My_next_DB_transaction_order(i): This function returns the database transaction order according to the client transaction order i.

Fig. 23. The entry protocol based on built-in database synchronization.
Fig. 24. The exit protocol based on built-in database synchronization.

Figs. 23 and 24 show the pseudocode of the entry and exit protocols, respectively, for the prefix-based replay of a race variant.

In the second case, Client 1 to Client n are threads on the same machine, which means that they can use shared memory to synchronize themselves. This case is based on the thread synchronization of the Java programming language [25], and its entry and exit protocols are very similar to the protocols shown in Figs. 23 and 24. However, the following code fragment, which uses the built-in database synchronization to protect the access to variable Un_finish_replay_client:

    Lock_DB(Un_replay_Client);
    Num_UFRC = Select UFRC From Un_replay_client.Un_finish_replay_client;
    Update Un_replay_client.Un_finish_replay_client Set UFRC = Num_UFRC - 1;
    UnLock_DB(Un_replay_Client);

is replaced with the following Java subroutine:

    synchronized private void UFRC() {
        Un_finish_replay_client = Un_finish_replay_client - 1;
    }

Also, the Lock_DB(DB) and UnLock_DB(DB) functions are replaced with wait(DB.sem) and signal(DB.sem), which are Java subroutines that implement the binary semaphore operations defined in [4]. DB.sem is a semaphore with an integer value of either 0 or 1. Figs. 25 and 26 are the entry and exit protocols based on Java thread synchronization.

Fig. 25. The entry protocol based on Java thread synchronization. For ease of understanding, we still show the pseudocode rather than real Java source code.
Fig. 26. The exit protocol based on Java thread synchronization.

6 EXPERIMENTS

This section presents the results from the experimental testing of three programs:

- Program 1 is a client/server program with three clients, where each client issues three SQL transactions.
- Program 2 is a client/server program with three clients, where each client issues five SQL transactions.
- Program 3 is a client/server application extracted from [18]. There are three clients, which issue three, five, and four SQL transactions, respectively.

We test each example program using the following schemes:

- ME is multiple-execution testing with random delays during execution.
- RT_1 is reachability testing using Algorithm 2 to generate the race variants, with Figs. 23 and 24 as the entry and exit protocols for prefix-based replay of a race variant.
- RT_2 is reachability testing using Algorithm 2 to generate the race variants, with Figs. 25 and 26 as the entry and exit protocols for prefix-based replay of a race variant.
- RT_3 is reachability testing using Algorithm 3 to generate the race variants, with Figs. 23 and 24 as the entry and exit protocols for prefix-based replay of a race variant.
- RT_4 is reachability testing using Algorithm 3 to generate the race variants, with Figs. 25 and 26 as the entry and exit protocols for prefix-based replay of a race variant.

All the client processes and database servers are executed on PCs with 800-MHz Intel Pentium III processors and the MS Windows 2000 operating system. Client programs and database servers are located in the same network segment (with a bandwidth of 100 Mbps). The database system is Microsoft SQL Server 7.0 [26]. The client application programs and the code implementing reachability testing are compiled and executed in Java Runtime Environment, Standard Edition (build 1.3.0_01) [27].

Fig. 27a shows the experimental results from a multiple-execution test of Program 1. It is obvious that multiple-execution testing duplicates a lot of tests since, from 1,000 tests, an average of only 5.2 different T-sequences are derived. We repeat the testing 10 times. Fig. 27b shows the experimental results from using schemes RT_1, RT_2, RT_3, and RT_4 to test Program 1. Each test is repeated three times. We discover that using Algorithm 2 to derive race variants tends to duplicate more tests than using Algorithm 3.

Fig. 27c shows the time required to perform tests ME, RT_1, RT_2, RT_3, and RT_4 on Program 1. This figure shows that using Java thread synchronization (Figs. 25 and 26) to perform prefix-based replay is much more efficient than synchronizing clients with the database synchronization mechanism (Figs. 23 and 24). Consider the schemes that use the SQL database synchronization mechanism, i.e., ME, RT_1, and RT_3. Although the multiple-execution testing scheme required the least time to perform a test (i.e., 8.8 seconds), it could only derive an average of six different T-sequences from 1,000 tests. Note that the time to perform a test includes initializing the tables in the database and the execution of all clients.

Fig. 27d focuses on two points. The first is the time required to generate race variants from a T-sequence, i.e., the execution times of Algorithms 2 and 3. The second is the number of duplicate T-sequences during reachability testing. The figure shows that Algorithm 2 is slightly faster than Algorithm 3. However, considering the number of duplicated tests, Algorithm 2 is much worse than Algorithm 3. In this example, Algorithm 3 (RT_3 and RT_4) duplicates fewer than five tests among 24 T-sequences, whereas Algorithm 2 (RT_1 and RT_2) duplicates more than 80 tests.

The experimental results of Programs 2 and 3 are shown in Figs. 28 and 29, respectively. From these experimental results, we draw the following conclusions. First, the multiple-execution testing scheme is not a practical way of testing SQL client/server programs since a tremendous number of the tests are duplicated. Also, most T-sequences are never exercised.
Second, although Algorithm 2 can always exhaustively execute all possible tests, this comes with the duplication of many tests. Algorithm 3 duplicates fewer tests than Algorithm 2, with the result that, although Algorithm 3 is slightly slower than Algorithm 2, reachability testing using Algorithm 3 to generate race variants is more efficient than using Algorithm 2. Third, using the intrinsic synchronization mechanism in an SQL server to perform prefix-based replay is much slower than using Java thread synchronization. Finally, RT_4 provided the best performance in all of our experiments.

Fig. 27. Experimental results of Program 1. (a) Using ME to test Program 1. (b) Reachability testing of Program 1. (c) Time required to perform reachability testing of Program 1. (d) Time required to generate the race variants for Program 1.
Fig. 28. Experimental results of Program 2. (a) Using ME to test Program 2. (b) Reachability testing of Program 2. (c) Time required to perform reachability testing of Program 2. (d) Time required to generate the race variants for Program 2.
Fig. 29. Experimental results of Program 3. (a) Using ME to test Program 3. (b) Reachability testing of Program 3. (c) Time required to perform reachability testing of Program 3. (d) Time required to generate the race variants for Program 3.

7 CONCLUSION

In this paper, we have provided a framework for exhaustive testing of a client/server database application that exhibits nondeterministic behavior. We prove theoretically that the proposed scheme can derive all the possible T-sequences of a database application if each execution of the application terminates, and our experimental results support this. In the key technology of reachability testing (i.e., deriving race variants from a SYN-sequence), instead of generating an RV diagram (Algorithm 2), we have developed a new algorithm (Algorithm 3) that is based on the edge analysis of the race graph of the target SYN-sequence. Our theoretical analysis suggests that Algorithm 3 duplicates fewer tests, which is confirmed by the results of our experiments.
Algorithm 3 can be applied to the data-access model of database applications, which is the most general type of model; hence, the algorithm can be applied to any data-access model for reachability testing. Its Java byte code can be obtained at bashful.ice.ntnu.edu.tw/~ghhwang/rv_gen.zip. We consider the algorithm to be a significant breakthrough in reachability-testing technology. To test a concurrent program having a huge or infinite number of feasible T-sequences, we can combine reachability testing with some strategies for selecting T-sequences. There is room for further investigation.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous referees for a number of useful suggestions to improve the paper. G.-H. Hwang and S.-J. Chang's work was supported in part by the Republic of China National Science Council under grant E and ROC MOE/NSC program for promoting academic excellence of universities under grant 89-E-FA

REFERENCES

[1] C.E. McDowell and D.P. Helmbold, Debugging Concurrent Programs, ACM Computing Surveys, vol. 21, no. 4, Dec
[2] K.C. Tai and R.H. Carver, Testing of Distributed Programs, Parallel and Distributed Computing Handbook, A.Y. Zomaya, ed., chapter 33, McGraw-Hill, 1996.

[3] A. Dinning, A Survey of Synchronization Methods for Parallel Computers, Computer, July
[4] A. Silberschatz, P. Baer Galvin, and G. Gagne, Operating System Concepts, sixth ed. John Wiley & Sons, June
[5] Int'l Organization for Standardization, Information Technology Database Languages-SQL-Part 1: Framework (SQL/Framework), ISO/IEC : 1999 and Information Technology Database Languages-SQL-Part 2: Foundation (SQL/Foundation), ISO/IEC : 1999,
[6] R.A. Elmasri and S.B. Navathe, Fundamentals of Database Systems, third ed. Addison-Wesley, Jan
[7] G.-H. Hwang, K.C. Tai, and T.L. Huang, Reachability Testing: An Approach To Testing Concurrent Software, Int'l J. Software Eng. and Knowledge Eng., vol. 5, no. 4, pp , Dec
[8] K.-C. Tai, Reachability Testing of Asynchronous Message-Passing Programs, Proc. Second Int'l Workshop Software Eng. for Parallel and Distributed Systems,
[9] A. Bechini and K.-C. Tai, Timestamps for Programs Using Messages and Shared Variables, Proc. Int'l Conf. Distributed Computing Systems, pp , May
[10] Y. Lei and K.-C. Tai, Efficient Reachability Testing of Asynchronous Message-Passing Programs, Proc. Eighth IEEE Int'l Conf. Eng. for Complex Computer Systems, pp , Dec
[11] T.J. LeBlanc and J.M. Mellor-Crummey, Debugging Parallel Programs with Instant Replay, IEEE Trans. Computers, vol. 36, no. 4, pp , Apr
[12] R.A. Davies, R.J.A. Beynon, and B.F. Jones, Automating the Testing of Databases, Proc. First Int'l Workshop Automated Program Analysis, Testing and Verification, June
[13] D. Chays, S. Dan, P.G. Frankl, F.I. Vokolos, and E.J. Weyuker, A Framework for Testing Database Applications, Proc. ACM Int'l Symp. Software Testing and Analysis,
[14] M.J. Carey, D.J. DeWitt, and J.F. Naughton, The OO7 Benchmark, Proc ACM SIGMOD Int'l Conf. Management of Data, pp , May
[15] J. Gray, P. Sundaresan, S. Englert, K. Baclawski, and P.J. Weinberger, Quickly Generating Billion-Record Synthetic Databases, SIGMOD Record (ACM Special Interest Group on Management of Data), vol. 23, no. 2, pp , June
[16] D. Slutz, Massive Stochastic Testing of SQL, Proc. Conf. Very Large Databases, pp , Aug
[17] Trans. Processing Performance Council, TPC-Benchmark C,
[18] H. Chu and J. Dobson, Towards Quality Programming in the Automated Testing of Client/Server Application, Proc. Pacific Northwest Software Quality Conf. 98 and Proc. Int'l Conf. Software Quality 98, Oct
[19] D. Helmbold and D. Luckham, Debugging ADA Tasking Programs, IEEE Software, vol. 2, no. 2, pp ,
[20] R.N. Taylor, A General-Purpose Algorithm for Analyzing Concurrent Programs, Comm. ACM, vol. 21, no. 7, July
[21] M. Young and R.N. Taylor, Combining Static Concurrency Analysis with Symbolic Execution, IEEE Trans. Software Eng., vol. 14, no. 10, Oct
[22] C.E. McDowell, A Practical Algorithm for Static Analysis of Parallel Programs, J. Parallel and Distributed Computing, vol. 6, pp ,
[23] L. Lamport, Time, Clocks, and the Ordering of Events in a Distributed System, Comm. ACM, vol. 21, no. 7, pp , July
[24] M. Gruber, Mastering SQL, Sybex, Jan
[25] G. Steele, J. Gosling, and G. Bracha, Java(TM) Language Specification, second ed. B. Joy, ed., Addison-Wesley, June
[26] S. Bjeletich, G. Mable, and D.W. Solomon, Microsoft SQL Server 7.0 Unleashed, Sams, first ed., May
[27] Sun Microsystem, The Source for Java(TM) Technology, java.sun.com,

Gwan-Hwan Hwang received the BS and MS degrees while in the Department of Computer Science and Information Engineering at National Chiao-Tung University, in 1991 and 1993, respectively, and the PhD degree while in the Department of Computer Science at National Tsing-Hua University, HsinChu, Taiwan, in He has been an assistant professor in the Department of Information and Computer Education at National Taiwan Normal University, Taiwan, since His research interests include concurrent program testing, parallelizing compilers, Internet security, thin-client technologies, and groupware.

Sheng-Jen Chang received the BBA degree while in the Department of Management Information Systems at National Cheng-Chi University in 2000 and the MBA degree while in the Department of Information Management at National Chi-Nan University in 2002, respectively. He has been an associate researcher in the Telecommunication Laboratories of Chunghwa Telecom Co., Ltd., Taiwan, since His research interests include concurrent program testing and database system testing.

Huey-Der Chu received the PhD degree (1998) from the Centre for Software Reliability at the University of Newcastle upon Tyne, England, funded by the National Science Council in Taiwan. He is an associate professor at Takming College in Taiwan, as well as the executive director of the Chinese Software Quality Association. His current research interests are in software process improvement, knowledge management, and quality management.

The PageRank Citation Ranking: Bring Order to the Web

The PageRank Citation Ranking: Bring Order to the Web The PageRank Citation Ranking: Bring Order to the Web presented by: Xiaoxi Pang 25.Nov 2010 1 / 20 Outline Introduction A ranking for every page on the Web Implementation Convergence Properties Personalized

More information

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &

More information

Teradata SQL Assistant Version 13.0 (.Net) Enhancements and Differences. Mike Dempsey

Teradata SQL Assistant Version 13.0 (.Net) Enhancements and Differences. Mike Dempsey Teradata SQL Assistant Version 13.0 (.Net) Enhancements and Differences by Mike Dempsey Overview SQL Assistant 13.0 is an entirely new application that has been re-designed from the ground up. It has been

More information

1. INTRODUCTION TO RDBMS

1. INTRODUCTION TO RDBMS Oracle For Beginners Page: 1 1. INTRODUCTION TO RDBMS What is DBMS? Data Models Relational database management system (RDBMS) Relational Algebra Structured query language (SQL) What Is DBMS? Data is one

More information

PEER-TO-PEER (P2P) systems have emerged as an appealing

PEER-TO-PEER (P2P) systems have emerged as an appealing IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 21, NO. 4, APRIL 2009 595 Histogram-Based Global Load Balancing in Structured Peer-to-Peer Systems Quang Hieu Vu, Member, IEEE, Beng Chin Ooi,

More information

The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

More information

Postgres Plus xdb Replication Server with Multi-Master User s Guide

Postgres Plus xdb Replication Server with Multi-Master User s Guide Postgres Plus xdb Replication Server with Multi-Master User s Guide Postgres Plus xdb Replication Server with Multi-Master build 57 August 22, 2012 , Version 5.0 by EnterpriseDB Corporation Copyright 2012

More information

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification Introduction Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification Advanced Topics in Software Engineering 1 Concurrent Programs Characterized by

More information

Double guard: Detecting Interruptions in N- Tier Web Applications

Double guard: Detecting Interruptions in N- Tier Web Applications Vol. 3, Issue. 4, Jul - Aug. 2013 pp-2014-2018 ISSN: 2249-6645 Double guard: Detecting Interruptions in N- Tier Web Applications P. Krishna Reddy 1, T. Manjula 2, D. Srujan Chandra Reddy 3, T. Dayakar

More information

IPv4 and IPv6: Connecting NAT-PT to Network Address Pool

IPv4 and IPv6: Connecting NAT-PT to Network Address Pool Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(5):547-553 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Intercommunication Strategy about IPv4/IPv6 coexistence

More information

Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

More information

Data Networks Project 2: Design Document for Intra-Domain Routing Protocols

Data Networks Project 2: Design Document for Intra-Domain Routing Protocols Data Networks Project 2: Design Document for Intra-Domain Routing Protocols Assigned: Wed, 30 May 2007 Due: 11:59pm, Wed, 20 June 2007 1 Introduction You have just joined Bisco Systems, a networking equipment

More information

D61830GC30. MySQL for Developers. Summary. Introduction. Prerequisites. At Course completion After completing this course, students will be able to:

D61830GC30. MySQL for Developers. Summary. Introduction. Prerequisites. At Course completion After completing this course, students will be able to: D61830GC30 for Developers Summary Duration Vendor Audience 5 Days Oracle Database Administrators, Developers, Web Administrators Level Technology Professional Oracle 5.6 Delivery Method Instructor-led

More information

Mutual Exclusion using Monitors

Mutual Exclusion using Monitors Mutual Exclusion using Monitors Some programming languages, such as Concurrent Pascal, Modula-2 and Java provide mutual exclusion facilities called monitors. They are similar to modules in languages that

More information

MonitorExplorer: A State Exploration-Based Approach to Testing Java Monitors

MonitorExplorer: A State Exploration-Based Approach to Testing Java Monitors Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 MonitorExplorer: A State Exploration-Based Approach to Testing Java Monitors Y. Lei, R. Carver, D. Kung,

More information

Extending Data Processing Capabilities of Relational Database Management Systems.

Extending Data Processing Capabilities of Relational Database Management Systems. Extending Data Processing Capabilities of Relational Database Management Systems. Igor Wojnicki University of Missouri St. Louis Department of Mathematics and Computer Science 8001 Natural Bridge Road

More information

An Overview of Distributed Databases

An Overview of Distributed Databases International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 2 (2014), pp. 207-214 International Research Publications House http://www. irphouse.com /ijict.htm An Overview

More information

Online, Asynchronous Schema Change in F1

Online, Asynchronous Schema Change in F1 Online, Asynchronous Schema Change in F1 Ian Rae University of Wisconsin Madison ian@cs.wisc.edu Eric Rollins Google, Inc. erollins@google.com Jeff Shute Google, Inc. jshute@google.com ABSTRACT Sukhdeep

More information

Index Terms Domain name, Firewall, Packet, Phishing, URL.

Index Terms Domain name, Firewall, Packet, Phishing, URL. BDD for Implementation of Packet Filter Firewall and Detecting Phishing Websites Naresh Shende Vidyalankar Institute of Technology Prof. S. K. Shinde Lokmanya Tilak College of Engineering Abstract Packet

More information

Dr Markus Hagenbuchner markus@uow.edu.au CSCI319. Distributed Systems

Dr Markus Hagenbuchner markus@uow.edu.au CSCI319. Distributed Systems Dr Markus Hagenbuchner markus@uow.edu.au CSCI319 Distributed Systems CSCI319 Chapter 8 Page: 1 of 61 Fault Tolerance Study objectives: Understand the role of fault tolerance in Distributed Systems. Know

More information

Glossary of Object Oriented Terms

Glossary of Object Oriented Terms Appendix E Glossary of Object Oriented Terms abstract class: A class primarily intended to define an instance, but can not be instantiated without additional methods. abstract data type: An abstraction

More information

6.852: Distributed Algorithms Fall, 2009. Class 2

6.852: Distributed Algorithms Fall, 2009. Class 2 .8: Distributed Algorithms Fall, 009 Class Today s plan Leader election in a synchronous ring: Lower bound for comparison-based algorithms. Basic computation in general synchronous networks: Leader election

More information

1 File Processing Systems

1 File Processing Systems COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.

More information

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Samarjit Chakraborty Computer Engineering and Networks Laboratory Swiss Federal Institute of Technology (ETH) Zürich March

More information

A Labeling Algorithm for the Maximum-Flow Network Problem

A Labeling Algorithm for the Maximum-Flow Network Problem A Labeling Algorithm for the Maximum-Flow Network Problem Appendix C Network-flow problems can be solved by several methods. In Chapter 8 we introduced this topic by exploring the special structure of

More information

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems SOFT 437 Software Performance Analysis Ch 5:Web Applications and Other Distributed Systems Outline Overview of Web applications, distributed object technologies, and the important considerations for SPE

More information

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database

More information

SQL Server. 2012 for developers. murach's TRAINING & REFERENCE. Bryan Syverson. Mike Murach & Associates, Inc. Joel Murach

SQL Server. 2012 for developers. murach's TRAINING & REFERENCE. Bryan Syverson. Mike Murach & Associates, Inc. Joel Murach TRAINING & REFERENCE murach's SQL Server 2012 for developers Bryan Syverson Joel Murach Mike Murach & Associates, Inc. 4340 N. Knoll Ave. Fresno, CA 93722 www.murach.com murachbooks@murach.com Expanded

More information

Network (Tree) Topology Inference Based on Prüfer Sequence

Network (Tree) Topology Inference Based on Prüfer Sequence Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

More information

ibolt V3.2 Release Notes

ibolt V3.2 Release Notes ibolt V3.2 Release Notes Welcome to ibolt V3.2, which has been designed to deliver an easy-touse, flexible, and cost-effective business integration solution. This document highlights the new and enhanced

More information

Coverability for Parallel Programs

Coverability for Parallel Programs 2015 http://excel.fit.vutbr.cz Coverability for Parallel Programs Lenka Turoňová* Abstract We improve existing method for the automatic verification of systems with parallel running processes. The technique

More information

(debajit@seas.upenn.edu) (khadera@seas.upenn.edu) (ayesham2@seas.upenn.edu) April 19, 2007

(debajit@seas.upenn.edu) (khadera@seas.upenn.edu) (ayesham2@seas.upenn.edu) April 19, 2007 MMS MAIL SYSTEM CIS 505 PROJECT, SPRING 2007 Debajit Adhikary Khader Naziruddin Ayesha Muntimadugu (debajit@seas.upenn.edu) (khadera@seas.upenn.edu) (ayesham2@seas.upenn.edu) April 19, 2007 1 Design MMS

More information

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Why? A central concept in Computer Science. Algorithms are ubiquitous. Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

More information

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR 1 2 2 3 In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR The uniqueness of the primary key ensures that

More information

Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison

Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison Tomi Janhunen 1, Ilkka Niemelä 1, Johannes Oetsch 2, Jörg Pührer 2, and Hans Tompits 2 1 Aalto University, Department

More information

Union-Find Algorithms. network connectivity quick find quick union improvements applications

Union-Find Algorithms. network connectivity quick find quick union improvements applications Union-Find Algorithms network connectivity quick find quick union improvements applications 1 Subtext of today s lecture (and this course) Steps to developing a usable algorithm. Define the problem. Find

More information

Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances

Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances Jialin Zhang Tsinghua University zhanggl02@mails.tsinghua.edu.cn Wei Chen Microsoft Research Asia weic@microsoft.com Abstract

More information

Computationally Complete Spiking Neural P Systems Without Delay: Two Types of Neurons Are Enough

Computationally Complete Spiking Neural P Systems Without Delay: Two Types of Neurons Are Enough Computationally Complete Spiking Neural P Systems Without Delay: Two Types of Neurons Are Enough Rudolf Freund 1 and Marian Kogler 1,2 1 Faculty of Informatics, Vienna University of Technology Favoritenstr.

More information

Replication on Virtual Machines

Replication on Virtual Machines Replication on Virtual Machines Siggi Cherem CS 717 November 23rd, 2004 Outline 1 Introduction The Java Virtual Machine 2 Napper, Alvisi, Vin - DSN 2003 Introduction JVM as state machine Addressing non-determinism

More information

Patterns of Information Management

Patterns of Information Management PATTERNS OF MANAGEMENT Patterns of Information Management Making the right choices for your organization s information Summary of Patterns Mandy Chessell and Harald Smith Copyright 2011, 2012 by Mandy

More information

1.264 Lecture 15. SQL transactions, security, indexes

1.264 Lecture 15. SQL transactions, security, indexes 1.264 Lecture 15 SQL transactions, security, indexes Download BeefData.csv and Lecture15Download.sql Next class: Read Beginning ASP.NET chapter 1. Exercise due after class (5:00) 1 SQL Server diagrams

More information

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24. Data Federation Administration Tool Guide SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package 7 2015-11-24 Data Federation Administration Tool Guide Content 1 What's new in the.... 5 2 Introduction to administration

More information

A Dynamic Programming Approach for Generating N-ary Reflected Gray Code List

A Dynamic Programming Approach for Generating N-ary Reflected Gray Code List A Dynamic Programming Approach for Generating N-ary Reflected Gray Code List Mehmet Kurt 1, Can Atilgan 2, Murat Ersen Berberler 3 1 Izmir University, Department of Mathematics and Computer Science, Izmir

More information

Security Test s i t ng Eileen Donlon CMSC 737 Spring 2008

Security Test s i t ng Eileen Donlon CMSC 737 Spring 2008 Security Testing Eileen Donlon CMSC 737 Spring 2008 Testing for Security Functional tests Testing that role based security functions correctly Vulnerability scanning and penetration tests Testing whether

More information

Report on the Train Ticketing System

Report on the Train Ticketing System Report on the Train Ticketing System Author: Zaobo He, Bing Jiang, Zhuojun Duan 1.Introduction... 2 1.1 Intentions... 2 1.2 Background... 2 2. Overview of the Tasks... 3 2.1 Modules of the system... 3

More information

Efficient Recovery of Secrets

Efficient Recovery of Secrets Efficient Recovery of Secrets Marcel Fernandez Miguel Soriano, IEEE Senior Member Department of Telematics Engineering. Universitat Politècnica de Catalunya. C/ Jordi Girona 1 i 3. Campus Nord, Mod C3,

More information

Firewall Policy Change-Impact Analysis

Firewall Policy Change-Impact Analysis 15 Firewall Policy Change-Impact Analysis ALEX X LIU, Michigan State University Firewalls are the cornerstones of the security infrastructure for most enterprises They have been widely deployed for protecting

More information

Oracle Database 10g Express

Oracle Database 10g Express Oracle Database 10g Express This tutorial prepares the Oracle Database 10g Express Edition Developer to perform common development and administrative tasks of Oracle Database 10g Express Edition. Objectives

More information

Find-The-Number. 1 Find-The-Number With Comps

Find-The-Number. 1 Find-The-Number With Comps Find-The-Number 1 Find-The-Number With Comps Consider the following two-person game, which we call Find-The-Number with Comps. Player A (for answerer) has a number x between 1 and 1000. Player Q (for questioner)

More information

Satisfiability Checking

Satisfiability Checking Satisfiability Checking SAT-Solving Prof. Dr. Erika Ábrahám Theory of Hybrid Systems Informatik 2 WS 10/11 Prof. Dr. Erika Ábrahám - Satisfiability Checking 1 / 40 A basic SAT algorithm Assume the CNF

More information

A Systematic Approach. to Parallel Program Verication. Tadao TAKAOKA. Department of Computer Science. Ibaraki University. Hitachi, Ibaraki 316, JAPAN

A Systematic Approach. to Parallel Program Verication. Tadao TAKAOKA. Department of Computer Science. Ibaraki University. Hitachi, Ibaraki 316, JAPAN A Systematic Approach to Parallel Program Verication Tadao TAKAOKA Department of Computer Science Ibaraki University Hitachi, Ibaraki 316, JAPAN E-mail: takaoka@cis.ibaraki.ac.jp Phone: +81 94 38 5130

More information

Programma della seconda parte del corso

Programma della seconda parte del corso Programma della seconda parte del corso Introduction Reliability Performance Risk Software Performance Engineering Layered Queueing Models Stochastic Petri Nets New trends in software modeling: Metamodeling,

More information

How to Implement Multi-way Active/Active Replication SIMPLY

How to Implement Multi-way Active/Active Replication SIMPLY How to Implement Multi-way Active/Active Replication SIMPLY The easiest way to ensure data is always up to date in a 24x7 environment is to use a single global database. This approach works well if your

More information

On the Relationship between Classes P and NP

On the Relationship between Classes P and NP Journal of Computer Science 8 (7): 1036-1040, 2012 ISSN 1549-3636 2012 Science Publications On the Relationship between Classes P and NP Anatoly D. Plotnikov Department of Computer Systems and Networks,

More information

Performance Tuning for the Teradata Database

Performance Tuning for the Teradata Database Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document

More information

Comparing SQL and NOSQL databases

Comparing SQL and NOSQL databases COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2015 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations

More information

AUTOMATED TEST GENERATION FOR SOFTWARE COMPONENTS

AUTOMATED TEST GENERATION FOR SOFTWARE COMPONENTS TKK Reports in Information and Computer Science Espoo 2009 TKK-ICS-R26 AUTOMATED TEST GENERATION FOR SOFTWARE COMPONENTS Kari Kähkönen ABTEKNILLINEN KORKEAKOULU TEKNISKA HÖGSKOLAN HELSINKI UNIVERSITY OF

More information

MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS

MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS Tao Yu Department of Computer Science, University of California at Irvine, USA Email: tyu1@uci.edu Jun-Jang Jeng IBM T.J. Watson

More information

Concepts of Database Management Seventh Edition. Chapter 7 DBMS Functions

Concepts of Database Management Seventh Edition. Chapter 7 DBMS Functions Concepts of Database Management Seventh Edition Chapter 7 DBMS Functions Objectives Introduce the functions, or services, provided by a DBMS Describe how a DBMS handles updating and retrieving data Examine

More information

A single minimal complement for the c.e. degrees

A single minimal complement for the c.e. degrees A single minimal complement for the c.e. degrees Andrew Lewis Leeds University, April 2002 Abstract We show that there exists a single minimal (Turing) degree b < 0 s.t. for all c.e. degrees 0 < a < 0,

More information

A NETWORK CONSTRUCTION METHOD FOR A SCALABLE P2P VIDEO CONFERENCING SYSTEM

A NETWORK CONSTRUCTION METHOD FOR A SCALABLE P2P VIDEO CONFERENCING SYSTEM A NETWORK CONSTRUCTION METHOD FOR A SCALABLE P2P VIDEO CONFERENCING SYSTEM Hideto Horiuchi, Naoki Wakamiya and Masayuki Murata Graduate School of Information Science and Technology, Osaka University 1

More information

Concurrent Programming

Concurrent Programming Concurrent Programming Principles and Practice Gregory R. Andrews The University of Arizona Technische Hochschule Darmstadt FACHBEREICH INFCRMATIK BIBLIOTHEK Inventar-Nr.:..ZP.vAh... Sachgebiete:..?r.:..\).

More information

Database Replication with MySQL and PostgreSQL

Database Replication with MySQL and PostgreSQL Database Replication with MySQL and PostgreSQL Fabian Mauchle Software and Systems University of Applied Sciences Rapperswil, Switzerland www.hsr.ch/mse Abstract Databases are used very often in business

More information

Reinforcement Learning of Task Plans for Real Robot Systems

Reinforcement Learning of Task Plans for Real Robot Systems Reinforcement Learning of Task Plans for Real Robot Systems Pedro Tomás Mendes Resende pedro.resende@ist.utl.pt Instituto Superior Técnico, Lisboa, Portugal October 2014 Abstract This paper is the extended

More information

Whitepapers on Imaging Infrastructure for Research Paper 1. General Workflow Considerations

Whitepapers on Imaging Infrastructure for Research Paper 1. General Workflow Considerations Whitepapers on Imaging Infrastructure for Research Paper 1. General Workflow Considerations Bradley J Erickson, Tony Pan, Daniel J Marcus, CTSA Imaging Informatics Working Group Introduction The use of

More information

Distributed Database Management Systems

Distributed Database Management Systems Distributed Database Management Systems (Distributed, Multi-database, Parallel, Networked and Replicated DBMSs) Terms of reference: Distributed Database: A logically interrelated collection of shared data
