Korat: Automated Testing Based on Java Predicates
Chandrasekhar Boyapati, Sarfraz Khurshid, Darko Marinov (2002)
MIT Laboratory for Computer Science
Presented by Nicola Vermes
Outline
- Motivation
- Korat
- Finitization
- State space
- Search
- Example
- How it works
- Test
- Efficiency
- Results
- Alloy Analyzer
- The tool
- Demo
- Conclusion
Motivation
- Testing can be tedious:
  - Test data generation (objects under test)
  - Test cases (method inputs)
  - Test oracles (correctness checks)
- Tools can help:
  - JUnit: automates test execution
  - JML: specifications can be used as test oracles
  - Korat: combines JUnit and JML, and generates test cases and inputs
Motivation: example
- Linked lists of 1..N elements
- Method to test: remove an element
- N possible lists; for each list, list.size possible inputs for the remove method
- (N^2 + N)/2 possible (list, node_to_remove) pairs
- And this is just a simple linked list
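The pair count can be checked by brute force. The following is a minimal sketch, assuming one list per length 1..N and one removal input per node; the class and method names are illustrative only.

```java
// Illustrative only: counts (list, node_to_remove) pairs for lists of length 1..n.
final class PairCount {
    // Enumerate explicitly: one list per length, one removal candidate per node.
    static long countPairs(int n) {
        long pairs = 0;
        for (int length = 1; length <= n; length++) {
            for (int node = 0; node < length; node++) {
                pairs++;
            }
        }
        return pairs;
    }

    // Closed form from the slide: (N^2 + N) / 2.
    static long closedForm(int n) {
        return ((long) n * n + n) / 2;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println("N=" + n + ": " + countPairs(n) + " pairs, closed form " + closedForm(n));
        }
    }
}
```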
Korat: overview
- Korat can automatically create:
  - objects, from class invariants
  - inputs for the method under test, from its preconditions
  - test oracles, from the method's postconditions
- Key idea: efficiently search for valid objects/inputs within the set of all possible cases
- The user only has to specify how large this set can be
Finitization: the idea
- Bounds the size of the cases/inputs:
  - Defines the number of objects to create for each class
  - Defines the range of each primitive-typed field
- Example: linked list (nodes, size)
  - The user specifies how many nodes to create and the range of the size field (int)
  - Korat then generates all valid lists whose size is within the range, using the created objects
Finitization: some concepts
- Class domain: the set of objects of one class
- A field can point to objects from a set of classes
- Field domain: the set of values a field can take
- A field domain is a union of some class domains
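The concepts above can be sketched as plain data. This is a hypothetical model, not Korat's real finitization API: the names `classDomain`, `intDomain`, `fieldDomain` and `finList` are invented for illustration, objects are represented by their names, and allowing null in the root and next domains is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data model for a finitization; this is NOT Korat's real API.
final class FinitizationSketch {
    // Class domain: a fixed, ordered set of objects of one class (represented by names).
    static List<Object> classDomain(String prefix, int count) {
        List<Object> objects = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            objects.add(prefix + i);
        }
        return objects;
    }

    // Domain for a primitive range, e.g. the legal values of the size field.
    static List<Object> intDomain(int min, int max) {
        List<Object> values = new ArrayList<>();
        for (int v = min; v <= max; v++) {
            values.add(v);
        }
        return values;
    }

    // Field domain: a union of class domains, optionally including null.
    static List<Object> fieldDomain(boolean includeNull, List<List<Object>> classDomains) {
        List<Object> domain = new ArrayList<>();
        if (includeNull) {
            domain.add(null);
        }
        for (List<Object> classDomain : classDomains) {
            domain.addAll(classDomain);
        }
        return domain;
    }

    // Finitization of a linked list: how many nodes to create, and the range of size.
    static Map<String, List<Object>> finList(int numNodes, int minSize, int maxSize) {
        List<Object> nodes = classDomain("N", numNodes);
        Map<String, List<Object>> fieldDomains = new LinkedHashMap<>();
        fieldDomains.put("List.root", fieldDomain(true, List.of(nodes)));
        fieldDomains.put("List.size", intDomain(minSize, maxSize));
        fieldDomains.put("Node.next", fieldDomain(true, List.of(nodes)));
        return fieldDomains;
    }

    public static void main(String[] args) {
        finList(3, 0, 3).forEach((field, domain) -> System.out.println(field + " -> " + domain));
    }
}
```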
State space: the idea
- Each field of each object has an ID
- Candidate vector: position ID holds the value of the field with that ID
- Each field can take a finite set of values
- State space: all combinations of the possible field values
- The class invariant, boolean repOk(), checks whether a candidate vector represents a valid structure
State space: the problem
- Even with finitization, the state space can be very large
- Korat must use an efficient algorithm to find all valid structures
- Example: linked list of size two, with nodes A and B
  - Fields: List { Node root; int size; } Node { Node next; }
  - State space size: 18 (the two next pointers can each be A, B or null; root can be A or B; size is fixed at 2: 3*3*2 = 18)
  - Valid lists: 2 (root -> A -> B -> null and root -> B -> A -> null)
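As a baseline, the two-node example is small enough to enumerate exhaustively. The sketch below is not how Korat searches; it simply tries all 18 candidates and filters them with a hand-written invariant. The integer encoding of nodes (A = 0, B = 1, null = -1) and all names are assumptions made for illustration.

```java
// Brute-force baseline for the two-node example; names are illustrative, not Korat's API.
final class BruteForceStateSpace {
    static final int NULL = -1, A = 0, B = 1;
    static final int[] ROOT_DOMAIN = {A, B};
    static final int[] SIZE_DOMAIN = {2};
    static final int[] NEXT_DOMAIN = {NULL, A, B};

    // Class invariant: the nodes reachable from root form an acyclic chain of exactly `size` nodes.
    static boolean repOk(int root, int size, int[] next) {
        boolean[] visited = new boolean[next.length];
        int count = 0;
        for (int cur = root; cur != NULL; cur = next[cur]) {
            if (visited[cur]) {
                return false; // cycle
            }
            visited[cur] = true;
            count++;
        }
        return count == size;
    }

    // Tries every combination of field values; returns {total candidates, valid candidates}.
    static int[] enumerate() {
        int total = 0, valid = 0;
        for (int root : ROOT_DOMAIN) {
            for (int size : SIZE_DOMAIN) {
                for (int aNext : NEXT_DOMAIN) {
                    for (int bNext : NEXT_DOMAIN) {
                        total++;
                        if (repOk(root, size, new int[] {aNext, bNext})) {
                            valid++;
                        }
                    }
                }
            }
        }
        return new int[] {total, valid};
    }

    public static void main(String[] args) {
        int[] result = enumerate();
        System.out.println("candidates: " + result[0] + ", valid: " + result[1]);
    }
}
```

The count of valid candidates matches the two lists named on the slide; every other candidate is either too short or contains a cycle.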
Search
- A backtracking algorithm prunes portions of the state space that contain only invalid structures
- Korat does not generate isomorphic structures
- The objects in each class domain are ordered
- The values in each field domain are ordered consistently with the class domains
- The candidate vector stores, for each field, the index of its current value in the field domain order
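Linked list: example
- Objects: List, A, B
- Candidate vector (CV) layout:
  - index 0: List.root, field domain {A, B}, domain indices [0, 1]
  - index 1: List.size, field domain {2}, domain indices [0]
  - index 2: A.next, field domain {null, A, B}, domain indices [0, 1, 2]
  - index 3: B.next, field domain {null, A, B}, domain indices [0, 1, 2]
- Invalid candidate: CV = [0,0,1,0], i.e. {A, 2, A, null}
- Valid candidate: CV = [0,0,2,0], i.e. {A, 2, B, null}
- The class invariant determines whether a candidate is valid
- Good practice: provide data structures with a boolean repOk() predicate that checks the invariant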
Backtracking
- Initialize the CV with all 0s
- Keep track of the fields accessed during repOk(), ordered by first access
- If repOk() returns false:
  - Next CV: increment the index of the last accessed field
  - If the index exceeds the domain size, reset it to 0 and increment the index of the previously accessed field
  - This avoids exploring many structures in the state space
- If repOk() returns true: skip all the isomorphic structures
Linked list: backtracking
- Candidate vector layout: [List.root, List.size, A.next, B.next], with field domains {A, B}, {2}, {null, A, B}, {null, A, B}
- CV = [0,0,1,0]: fields accessed = [root, size, A.next]
- B.next is never read, so every CV = [0,0,1,*] is invalid: [0,0,1,1] and [0,0,1,2] are pruned
- Increment the index of the last accessed field (A.next): [0,0,1+1,0]
- CV = [0,0,2,0]: fields accessed = [root, size, A.next, B.next]
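The backtracking steps can be reproduced on the two-node example with a small simulation. This is a sketch under stated assumptions, not Korat's implementation: nodes are encoded as integers (A = 0, B = 1, null = -1), the `read` method stands in for an instrumented getter, the invariant is hand-written, and nonisomorphism pruning is deliberately left out, so both isomorphic lists are reported. After a valid candidate it simply continues with the same increment rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simulation of the search on the two-node list; illustrative names, no isomorphism pruning.
final class BacktrackingSketch {
    static final int NULL = -1;
    // Field domains, indexed like the candidate vector: root, size, A.next, B.next.
    static final int[][] DOMAINS = {
        {0, 1},        // List.root: A, B
        {2},           // List.size
        {NULL, 0, 1},  // A.next: null, A, B
        {NULL, 0, 1},  // B.next: null, A, B
    };

    final int[] cv = new int[DOMAINS.length];          // candidate vector, initially all 0s
    final List<Integer> accessed = new ArrayList<>();  // fields in order of first access
    final List<int[]> valid = new ArrayList<>();
    int explored = 0;

    // Stand-in for an instrumented getter: records the first access, returns the value.
    int read(int field) {
        if (!accessed.contains(field)) {
            accessed.add(field);
        }
        return DOMAINS[field][cv[field]];
    }

    // Invariant: acyclic chain from root with exactly `size` nodes; returns false early.
    boolean repOk() {
        int root = read(0);
        int size = read(1);
        Set<Integer> seen = new HashSet<>();
        int cur = root;
        while (cur != NULL) {
            if (!seen.add(cur)) {
                return false; // cycle
            }
            cur = read(2 + cur); // next field of node `cur`
        }
        return seen.size() == size;
    }

    void search() {
        while (true) {
            accessed.clear();
            explored++;
            if (repOk()) {
                valid.add(cv.clone());
            }
            // Increment the last accessed field; on overflow reset it and move to the previous one.
            while (!accessed.isEmpty()) {
                int last = accessed.get(accessed.size() - 1);
                if (++cv[last] < DOMAINS[last].length) {
                    break;
                }
                cv[last] = 0;
                accessed.remove(accessed.size() - 1);
            }
            if (accessed.isEmpty()) {
                return; // every accessed field overflowed: search is complete
            }
        }
    }

    public static void main(String[] args) {
        BacktrackingSketch s = new BacktrackingSketch();
        s.search();
        System.out.println("explored " + s.explored + " of 18 candidates");
        for (int[] v : s.valid) {
            System.out.println("valid CV = " + Arrays.toString(v));
        }
    }
}
```

Because fields that repOk() never reads are never incremented, the search runs repOk() on fewer candidates than the full state space contains.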
Nonisomorphism
- A further optimization
- Two candidates are isomorphic if their graphs of reachable objects are isomorphic
- When a valid structure is found, all structures isomorphic to it are skipped
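To make the definition concrete, the sketch below compares two linked lists by relabelling the reachable nodes in order of first visit. It only illustrates what "isomorphic" means for this structure; Korat avoids generating isomorphic candidates during the search rather than comparing them afterwards. The integer node encoding (A = 0, B = 1, null = -1) and the method name are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrates the isomorphism definition for singly linked lists; not Korat's mechanism.
final class IsomorphismSketch {
    static final int NULL = -1;

    // Walks the list from root and relabels each node by its order of first visit.
    // Two lists are isomorphic exactly when their canonical forms are equal.
    static List<Integer> canonicalForm(int root, int[] next) {
        Map<Integer, Integer> label = new HashMap<>();
        List<Integer> form = new ArrayList<>();
        int cur = root;
        while (cur != NULL) {
            Integer known = label.get(cur);
            if (known != null) {
                form.add(known); // back edge: the walk closes a cycle
                return form;
            }
            label.put(cur, label.size());
            form.add(label.get(cur));
            cur = next[cur];
        }
        form.add(NULL); // the chain ends in null
        return form;
    }

    public static void main(String[] args) {
        // root -> A -> B -> null versus root -> B -> A -> null
        List<Integer> first = canonicalForm(0, new int[] {1, NULL});
        List<Integer> second = canonicalForm(1, new int[] {NULL, 0});
        System.out.println("isomorphic: " + first.equals(second));
    }
}
```

Under this definition the two valid lists of the two-node example differ only in node identity, so a search that skips isomorphic structures reports just one of them.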
Instrumentation
- Monitors the executions of repOk() to learn the order of field accesses
- Source-to-source translation:
  - Adds a special getter/setter for each field
  - Replaces each field access with a get/set call
- The approach is similar to the observer pattern
- Javassist
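A hand-written approximation of what the translated classes might look like is shown below. The real translation is generated automatically; the class names (`AccessLog`, `InstrNode`, `InstrList`) and the getter naming are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Records the order in which fields are first read (plays the role of the observer).
final class AccessLog {
    private final List<String> order = new ArrayList<>();

    void touch(String fieldId) {
        if (!order.contains(fieldId)) {
            order.add(fieldId);
        }
    }

    List<String> order() {
        return order;
    }
}

final class InstrNode {
    private final String name;
    private final AccessLog log;
    private InstrNode next;

    InstrNode(String name, AccessLog log) {
        this.name = name;
        this.log = log;
    }

    InstrNode getNext() { log.touch(name + ".next"); return next; } // added getter
    void setNext(InstrNode value) { next = value; }                  // added setter
}

final class InstrList {
    private final AccessLog log;
    private InstrNode root;
    private int size;

    InstrList(AccessLog log) { this.log = log; }

    InstrNode getRoot() { log.touch("root"); return root; }
    int getSize() { log.touch("size"); return size; }
    void setRoot(InstrNode value) { root = value; }
    void setSize(int value) { size = value; }

    // repOk after translation: every field read goes through a getter.
    boolean repOk() {
        InstrNode first = getRoot();
        int expected = getSize();
        Set<InstrNode> seen = new HashSet<>();
        for (InstrNode cur = first; cur != null; cur = cur.getNext()) {
            if (!seen.add(cur)) {
                return false; // cycle
            }
        }
        return seen.size() == expected;
    }
}

final class InstrumentationDemo {
    // Builds the invalid candidate {A, 2, A, null} and returns the observed access order.
    static List<String> accessOrderForSelfLoop() {
        AccessLog log = new AccessLog();
        InstrNode a = new InstrNode("A", log);
        InstrNode b = new InstrNode("B", log); // part of the candidate, but never reached
        a.setNext(a);
        b.setNext(null);
        InstrList list = new InstrList(log);
        list.setRoot(a);
        list.setSize(2);
        list.repOk();
        return log.order();
    }

    public static void main(String[] args) {
        System.out.println(accessOrderForSelfLoop()); // prints [root, size, A.next]
    }
}
```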
Testing methods
- Inputs are generated from preconditions
- Correctness is checked against postconditions
- Annotations are written in JML
- Korat generates all (structure, input) pairs that satisfy the class invariant and the method precondition
- Correctness check: the class invariant and the method postcondition must hold after the call
- If they do not, the method is incorrect and Korat provides a counterexample
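The test loop can be sketched for the remove example as follows. This is an illustration only: the pre- and postconditions are hand-written boolean checks standing in for JML annotations, the lists are built directly instead of being generated from repOk(), and every class and method name is an assumption.

```java
// Illustrative test loop; hand-written checks stand in for JML annotations.
final class RemoveTestSketch {
    static final class Node {
        Node next;
    }

    static final class SList {
        Node root;
        int size;

        // Class invariant: acyclic chain from root with exactly `size` nodes.
        boolean repOk() {
            int count = 0;
            for (Node cur = root; cur != null; cur = cur.next) {
                if (++count > size) {
                    return false; // too long, or a cycle
                }
            }
            return count == size;
        }

        boolean contains(Node target) {
            for (Node cur = root; cur != null; cur = cur.next) {
                if (cur == target) {
                    return true;
                }
            }
            return false;
        }

        // Method under test. Precondition: target is a node of this list.
        void remove(Node target) {
            if (root == target) {
                root = target.next;
            } else {
                Node prev = root;
                while (prev.next != target) {
                    prev = prev.next;
                }
                prev.next = target.next;
            }
            size--;
        }
    }

    // Builds a valid list with `length` fresh nodes.
    static SList build(int length) {
        SList list = new SList();
        for (int i = 0; i < length; i++) {
            Node n = new Node();
            n.next = list.root;
            list.root = n;
            list.size++;
        }
        return list;
    }

    static Node nodeAt(SList list, int index) {
        Node cur = list.root;
        for (int i = 0; i < index; i++) {
            cur = cur.next;
        }
        return cur;
    }

    // Runs remove on every (list, node) pair up to maxLength; returns {cases, failures}.
    static int[] runAll(int maxLength) {
        int cases = 0, failures = 0;
        for (int length = 1; length <= maxLength; length++) {
            for (int index = 0; index < length; index++) {
                SList list = build(length);
                Node target = nodeAt(list, index);
                if (!(list.repOk() && list.contains(target))) {
                    continue; // precondition not met: not a legal test input
                }
                int oldSize = list.size;
                list.remove(target);
                cases++;
                boolean post = list.size == oldSize - 1 && !list.contains(target);
                if (!(list.repOk() && post)) {
                    failures++; // this (list, node) pair would be the counterexample
                }
            }
        }
        return new int[] {cases, failures};
    }

    public static void main(String[] args) {
        int[] result = runAll(3);
        System.out.println("cases: " + result[0] + ", failures: " + result[1]);
    }
}
```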
Efficiency
- The performance of the backtracking algorithm depends heavily on the repOk() implementation
- repOk() should return false as early as possible, so that large portions of the state space are pruned
- If repOk() always accesses all fields, even for invalid structures, nothing is pruned
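The effect can be measured on the two-node linked list (root in {A, B}, size in {2}, each next in {null, A, B}). The sketch below runs the same simplified backtracking loop with two invariants: one that reads fields only when needed, and one that reads every field first. It is an illustration with invented names and an integer node encoding (A = 0, B = 1, null = -1), not Korat's implementation, and it omits nonisomorphism pruning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;
import java.util.function.Predicate;

// Compares a lazy and an eager repOk under the same simplified backtracking loop.
final class RepOkLazinessSketch {
    static final int NULL = -1;
    // Field domains: root {A, B}, size {2}, A.next and B.next {null, A, B}.
    static final int[][] DOMAINS = {{0, 1}, {2}, {NULL, 0, 1}, {NULL, 0, 1}};

    // Lazy: reads a field only when the check needs it, and fails as early as possible.
    static boolean lazyRepOk(IntUnaryOperator read) {
        int root = read.applyAsInt(0);
        int size = read.applyAsInt(1);
        int count = 0;
        for (int cur = root; cur != NULL; cur = read.applyAsInt(2 + cur)) {
            if (++count > size) {
                return false; // too many nodes, or a cycle
            }
        }
        return count == size;
    }

    // Eager: reads every field up front, then performs the same check on the copies.
    static boolean eagerRepOk(IntUnaryOperator read) {
        int[] values = new int[DOMAINS.length];
        for (int f = 0; f < values.length; f++) {
            values[f] = read.applyAsInt(f);
        }
        return lazyRepOk(i -> values[i]);
    }

    // Returns the number of candidates on which repOk had to be executed.
    static int countExplored(Predicate<IntUnaryOperator> repOk) {
        int[] cv = new int[DOMAINS.length];
        List<Integer> accessed = new ArrayList<>();
        int explored = 0;
        while (true) {
            accessed.clear();
            explored++;
            repOk.test(field -> {
                if (!accessed.contains(field)) {
                    accessed.add(field);
                }
                return DOMAINS[field][cv[field]];
            });
            while (!accessed.isEmpty()) {
                int last = accessed.get(accessed.size() - 1);
                if (++cv[last] < DOMAINS[last].length) {
                    break;
                }
                cv[last] = 0;
                accessed.remove(accessed.size() - 1);
            }
            if (accessed.isEmpty()) {
                return explored;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("lazy repOk explored:  " + countExplored(RepOkLazinessSketch::lazyRepOk));
        System.out.println("eager repOk explored: " + countExplored(RepOkLazinessSketch::eagerRepOk));
    }
}
```

With the eager variant every field is always in the accessed list, so the loop degenerates into a plain counter over the whole state space.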
Experimental results
- All finitization parameters are set to size
- State space sizes are rounded down to the nearest power of two
- Example (BinaryTree, 8 nodes):
  - # fields = 1 + 8*2 = 17 (root + 8 nodes with left/right)
  - # elements = 1 + 8 = 9 (null + nodes)
  - 2^53 < 9^17 < 2^54
- The number of nonisomorphic BinaryTrees is known: Korat generates exactly that number of structures
- The pruning is effective: compare the number of candidates with the size of the state space
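The rounding in the BinaryTree example can be checked with exact integer arithmetic. A small sketch, with illustrative method names:

```java
import java.math.BigInteger;

// Checks the state-space bound for 17 fields that each range over 9 values.
final class StateSpaceBound {
    // State space size when each of `fields` fields can take `values` values.
    static BigInteger stateSpace(int values, int fields) {
        return BigInteger.valueOf(values).pow(fields);
    }

    // Largest k with 2^k <= n, i.e. n rounded down to a power of two.
    static int floorLog2(BigInteger n) {
        return n.bitLength() - 1;
    }

    public static void main(String[] args) {
        BigInteger space = stateSpace(9, 17);
        System.out.println("9^17 = " + space + ", rounded down to 2^" + floorLog2(space));
    }
}
```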
Alloy Analyzer
- A tool for analyzing Alloy models
- Alloy: a declarative, first-order-logic language based on relations
- Relations, sets and quantifiers model a structure and its constraints
- The analyzer translates the model into a boolean formula, runs a SAT solver, and maps the solution back to an instance of the model
Korat vs. Alloy Analyzer
- Korat is more efficient
- Korat learns from repOk(): the fields it accesses guide the pruning
- Alloy generates some isomorphic structures
The tool
- Generates only structures/inputs for a method
- It does NOT:
  - generate the finitization skeleton automatically
  - use JML
  - run the method under test
- As a result, it is not very usable in practice
- Demo
Korat: conclusion
- The basic idea is good
- Efficient, but still not applicable to large structures
- What about the correctness of the repOk() method itself?
- The tool needs to be improved to become more usable
Questions / Discussion