# Enumerating Answers to First-Order Queries over Databases of Low Degree

Save this PDF as:

Size: px
Start display at page:

## Transcription

4 3. EVALUATION ALGORITHMS In this section, we present our algorithms for counting, testing, and enumerating the solutions to a query (see Sections 3.3, 3.4, and 3.5. They all build on the same preprocessing algorithm which runs in pseudo-linear time and which essentially reduces the input to a quantifier-free query over a suitable signature (see Section 3.2. However, before presenting these algorithms, we start with a very simple case. 3.1 Warming up As a warm-up for working with classes of structures of low degree, we first consider the simple case of queries which we call connected conjunctive queries, and which are defined as follows. A conjunction is a query γ which is a conjunction of relational atoms and potentially negated unary atoms. Note that the query of Example 1 is not a conjunction as it has a binary negated atom. With each conjunction γ we associate a query graph H γ. This is the undirected graph whose vertices are the variables x 1,..., x k of γ, and where there is an edge between two vertices x i and x j iff γ contains a relational atom in which both x i and x j occur. We call the conjunction γ connected if its query graph H γ is connected. A connected conjunctive query is a query q( x of the form ȳ γ( x, ȳ, where γ is a connected conjunction in which all variables of x, ȳ occur (here, ȳ = 0 is allowed. The next simple lemma implies that over a class of structures of low degree, connected conjunctive queries can be evaluated in pseudo-linear time. It will be used in several places throughout this paper: in the proof of Proposition 8, and in the proofs for our counting and enumeration results in Sections 3.3 and 3.5. LEMMA 7. There is an algorithm which, at input of a structure A and a connected conjunctive query q( x computes q(a in time O( q n d h( q, where n = dom(a, d = degree(a, and h is a computable function. PROOF. Let q( x be of the form ȳ γ( x, ȳ, for a connected conjunction γ. Let k = x be the number of free variables of q, let l = ȳ, and let r = k + l. Note that since γ is connected, for every tuple c γ(a the following is true, where a is the first component of c. All components c of c belong to the r-neighbourhood N A r (a of a in dom(a. Thus, q(a is the disjoint union of the sets S a := { b q(n A r (a : the first component of b is a }, for all a dom(a. For each a dom(a, the set S a can be computed as follows: (1 Initialise S a :=. (2 Compute N A r (a. Since A has degree d, this neighbourhood s domain contains at most d r+1 elements of dom(a. Thus, by using breadthfirst search, N A r (a can be computed in time O(d h( q, for a computable function h. (3 Use a brute-force algorithm that enumerates all k-tuples b of elements in N A r (a whose first component is a. For each such tuple b, use a brute-force algorithm that checks whether N A r (a = q( b. If so, insert b into S a Note that the number of considered tuples b is d (r+1(k 1. And checking whether N A r (a = q( b can be done in time O( γ d (r+1l : for this, enumerate all l-tuples c of elements in N A r (a and take time O( γ to check whether γ( x, ȳ is satisfied by the tuple ( b, c. Thus, we are done after O( γ d (r2 steps. In summary, we can compute q(a = a A Sa in time O(n q d h( q, for a computable function h. As an immediate consequence we obtain the following: PROPOSITION 8. Let C be a class of structures of low degree. Given a structure A C and a connected conjunctive query q, one can compute q(a in pseudo-linear time. PROOF. We use the algorithm provided in Lemma 7. To see that the running time is as claimed, we use the assumption that C is of low degree: for every δ > 0 there is an m δ N 1 such that every structure A C of cardinality A m δ has degree(a A δ. For a given ε > 0 we let δ := ε and define h( q nε := m δ. Then, every A C with A n ε has degree(a A ε/h( q. Thus, on input of A and q, the algorithm from Lemma 7 has running time O( q A 1+ε if A n ε and takes time bounded by O( q n 1+h(q ε otherwise. The method of the proof of Proposition 8 above will be used for several times in the paper. 3.2 Quantifier elimination and normal form In this section, we make precise the quantifier elimination approach that is at the heart of the preprocessing phase of the query evaluation algorithms of our paper. A signature is binary if all its relation symbols have arity at most 2. A coloured graph is a finite relational structure over a binary signature. PROPOSITION 9. There is an algorithm which, at input of a structure A and a first-order query ϕ( x, produces a binary signature τ (containing, among other symbols, a binary relation symbol E, a coloured graph G of signature τ, an FO(τ-formula ψ( x, and a mapping f such that the following is true for k = x, n = dom(a, d = degree(a and h some computable function: 1. ψ is quantifier-free. Furthermore, ψ is of the form (ψ 1ψ 2, where ψ 1 states that no distinct free variables of ψ are connected by an E-edge, and ψ 2 is a positive boolean combination of unary atoms. 2. τ and ψ are computed in time and space h( ϕ n d h( ϕ. Moreover, τ h( ϕ and ψ h( ϕ. 3. G is computed in time and space h( ϕ n d h( ϕ. Moreover, degree(g d h( ϕ. 4. f is an injective mapping from dom(a k to dom(g k such that f is a bijection between ϕ(a and ψ(g. Furthermore, on input of any tuple v ψ(g, the tuple f 1 ( v can be computed in time and space O(k 2. Using time O(n d h( ϕ and space O(n 2, we can furthermore construct a data structure such that, on input of any ā dom(a k, f(ā can be computed in time O(k 2. The proof of Proposition 9 is long and technical and of a somewhat different nature than the results we now describe. It is postponed to Section 4. However, this proposition is central in the proofs of the results below. 4

5 3.3 Counting Here we consider the problem of counting the number of solutions to a query on low degree structures. A generalised conjunction is a conjunction of relational atoms and negated relational atoms (hence, also atoms of arity bigger than one may be negated, and the query of Example 1 is a generalised conjunction. EXAMPLE 10. Before moving to the formal proof of Theorem 4, consider again the query q from Example 1. Recall that it computes the pairs of blue-red nodes that are not connected by an edge. To count its number of solutions over a class of structures of low degree we can proceed as follows. We first consider the query q (x, y returning the set of blue-red nodes that are connected. In other words, q is B(x R(y E(x, y. Notice that this query is a connected conjunction. Hence, by Proposition 8 its answers can be computed in pseudo-linear time and therefore we can also count its number of solutions in pseudo-linear time. It is also easy to compute in pseudo-linear time the number of pairs of blue-red nodes. The number of answers to q is then the difference between these two numbers. The proof sketch for Theorem 4 goes as follows. Using Proposition 9 we can assume modulo a pseudo-linear preprocessing that our formula is quantifier-free and over a binary signature. Each connected component is then treated separately and we return the product of all the results. For each connected component we eliminate the negated symbols one by one using the trick illustrated in Example 10. The resulting formula is then a connected conjunction that is treated in pseudo-linear time using Proposition 8. LEMMA 11. There is an algorithm which, at input of a coloured graph G and a generalised conjunction γ( x, computes γ(g in time O(2 m γ n d h( γ, where h is a computable function, m is the number of negated binary atoms in γ, n = dom(g, and d = degree(g. PROOF. By induction on the number m of negated binary atoms in γ. The base case for m=0 is obtained as follows. We start by using O( γ steps to compute the query graph H γ and to compute the connected components of H γ. In case that H γ is connected, we can use Lemma 7 to compute the entire set γ(g in time O( γ n d h( γ, for a computable function h. Thus, counting γ(g can be done within the same time bound. In case that γ is not connected, let H 1,..., H l be the connected components. For each i {1,..., l} let x i be the tuple obtained from x by removing all variables that do not belong to H i. Furthermore, let γ i( x i be the conjunction of all atoms or negated unary atoms of γ that contain variables in H i. Note that γ( x is equivalent to l i=1 γi( xi, and l γ(g = γ i(g. Since each γ i is connected, we can compute γ i(g in time O( γ i n d h( γ i by using the algorithm of Lemma 7. We do this for each i {1,..., l} and output the product of the values. In summary, we are done in time O( γ n d h( γ for the base case m = 0. For the induction step, let γ be a formula with m+1 negated binary atoms. Let R(x, y be a negated binary atom of γ, and let i=1 γ 1 be such that γ = γ 1 R(x, y, and let γ 2 := γ 1 R(x, y. Clearly, γ(g = γ 1(G γ 2(G. Since each of the formulas γ 1 and γ 2 has only m negated binary atoms, we can use the induction hypothesis to compute γ 1(G and γ 2(G each in time O(2 m γ n d h( γ. The total time used for computing γ(g is thus O(2 m+1 γ n d h( γ. By using Proposition 9, we can lift this to arbitrary structures and first-order queries: PROPOSITION 12. There is an algorithm which at input of a structure A and a first-order query ϕ( x computes ϕ(a in time h( ϕ n d h( ϕ, for a computable function h, where n = dom(a and d = degree(a. PROOF. We first use the algorithm of Proposition 9 to compute the according graph G and the quantifier-free formula ψ( x. This takes time h( ϕ n d h( ϕ for a computable function h. And we also know that ψ h( ϕ. By Proposition 9 we know that ϕ(a = ψ(g. Next, we transform ψ( x into disjunctive normal form γ i( x, i I such that the conjunctive clauses γ i exclude each other (i.e., for each v ψ(g there is exactly one i I such that v γ i(g. Clearly, this can be done in time O(2 ψ. Each γ i has length at most ψ, and I 2 ψ. Obviously, ψ(g = i I γi(g. We now use, for each i I, the algorithm from Lemma 11 to compute the number s i = γ i(g and output the value s = i I si. By Lemma 11 we know that for each i I the computation of s i can be done in time O(2 m γ i ñ d h 0( γ i, where m is the number of binary atoms in γ, ñ = dom(g, d = degree(g, and h 0 is some computable function. By Proposition 9 we know that ñ h( ϕ n d h( ϕ and d d h( ϕ. Since also γ i ψ h( ϕ, the computation of s i, for each i I, takes time h 1( ϕ n d h1( ϕ, for some computable function h 1 (depending on h and h 0. To conclude, since I 2 ψ, the total running time for computing ϕ(a = i I si is h2( ϕ n dh 2( ϕ, for a suitably chosen computable function h 2. Hence, we meet the required bound. Theorem 4 is an immediate consequence of Proposition 12 (following the arguments of the proof of Proposition Testing Here we consider the problem of testing whether a given tuple is a solution to a query. By Proposition 9 it is enough to consider quantifier-free formulas. Those are treated using the lazy array initialisation technique mentioned in Section 2. PROPOSITION 13. There is an algorithm which at input of a structure A and a first-order query ϕ( x, has a preprocessing phase of time h( ϕ n d h( ϕ in which it computes a data structure such that, on input of any ā dom(a k for k = x, it can be tested in time h( ϕ whether ā ϕ(a, where h is a computable function, n = dom(a, and d = degree(a. 5

6 PROOF. We first use the algorithm of Proposition 9 to compute the graph G, the quantifier-free formula ψ( x and the data structure for function f. For some computable function h, this takes time and space h( ϕ n d h( ϕ, and furthermore, ψ h( ϕ and degree(g d h( ϕ. Note that G h( ϕ n d h( ϕ. By construction, we furthermore know for all ā dom(a k that ā ϕ(a f(ā ψ(g. Recall from Proposition 9 that ψ( x is a quantifier-free formula built from atoms of the form E(y, z and C(y for unary relation symbols C. Thus, checking whether a given tuple v dom(g k belongs to ψ(g can be done easily, provided that one can check whether unary atoms C(u and binary atoms E(u, u hold in G for given nodes u, u of G. To enable checking whether E(u, u holds in G, we construct the following data structure. W.l.o.g. we assume that dom(g = {1,..., ñ} for ñ := dom(g G h( ϕ n d h( ϕ. We use an (ñ ñ-array A E that is initialised to 0. By looping through all edges E(u, u of G, we then update the array entry A E[u, u ] to 1. This way, using time O( G, we ensure that for all nodes u, u of G we have A E[u, u ] = 1 if E(u, u holds in G, and A E[u, u ] = 0 otherwise. In a similar way, within time O( G we can build, for each unary relation symbol C, a 1-dimensional array A C of length ñ such that for all nodes u of G we have A C[u] = 1 if C(u holds in G, and A C[u] = 0 otherwise. All these arrays are constructed in time O( G, which is O(h( ϕ n d h( ϕ. This completes the preprocessing phase. Note that using the arrays A E and A C, testing whether a given tuple v dom(g k belongs to ψ(g can be done in time O( ψ, since each atomic statement of ψ can be checked in constant time by a simple look-up of the according array entry. Finally, the testing algorithm works as follows. Given a tuple ā dom(a k, we first construct v := f(ā and then check whether v ψ(g. Building v := f(ā can be done in time O(k 2 (see Proposition 9, and checking whether v ψ(g requires time O( ψ, which is O(h( ϕ. Hence, we meet the required bound for testing. Theorem 5 is an immediate consequence of Proposition 13 (following the arguments of the proof of Proposition Enumeration Here we consider the problem of enumerating the solutions to a given query. We first illustrate the proof of Theorem 6 with our running example. EXAMPLE 14. Consider again the query q of Example 1. In order to enumerate q with constant delay over a class of low degree we proceed as follows. During the preprocessing phase we precompute those blue nodes that contribute to the answer set, i.e. such that there is a red node not connected to it. This is doable in pseudo-linear time because our class has low degree and each blue node is connected to few red nodes. We call green the resulting nodes. We then order the green nodes and the red nodes in order to be able to iterate through them with constant delay. Finally, we compute the binary function skip(x, y associating to each green node x and red node y such that E(x, y the smallest red node y such that y < y and E(x, y, where < is the order on red nodes precomputed above. From Proposition 8 it follows that computing skip can be done in pseudo-linear time. It is crucial here that the domain of skip has pseudo-linear size. The enumeration phase now goes as follows: We iterate through all green nodes. For each of them we iterate through all red nodes. If there is no edge between them, we output the result and continue with the next red node. If there is an edge, we apply skip to this pair and the process continues with the resulting red node. Note that the new red node immediately yields an answer. Note also that all the red nodes that will not be considered are safely skipped as they are linked to the current green node. The proof of Theorem 6 can be sketched as follows. By Proposition 9 it is enough to consider quantifier-free formulas looking for tuples of nodes that are disconnected and have certain colours. Hence the query q described in Example 1 corresponds to the binary case. For queries of larger arities we proceed by induction on the arity. By induction we can enumerate the answers of the query projecting out the last variable from the initial query. For each tuple ū obtained by induction, we iterate through all the red nodes that are a potential completion. We then proceed as in Example 14. If the current red node a is not connected to ū, then ūa forms an answer and we proceed to the next red node. If a is connected to ū then we need to jump in constant time to the next red node that yields an answer. This is done by precomputing a suitable function skip that depends on the arity of the query and is slightly more complex that the one described in Example 14. The design and computation of this function is the main technical originality of the proof. We now turn to the technical details that are summarised in the next proposition. PROPOSITION 15. There is an algorithm which at input of a structure A and a first-order query ϕ( x enumerates ϕ(a with delay h(ϕ after a preprocessing of time h(ϕ n d h(ϕ, where n = dom(a, d = degree(a, and h is a computable function. PROOF. The proof is by induction on the number k := x of free variables of ϕ. In case that k = 0, the formula ϕ is a sentence, and we are done using Theorem 2. In case that k > 0 we proceed as follows. We first use the algorithm of Proposition 9 to compute the according coloured graph G and the quantifier-free formula ψ( x. This takes time g( ϕ n d g( ϕ for a computable function g. And we know that ψ g( ϕ, that G has degree d d g( ϕ, and that dom(g has ñ elements, where ñ g( ϕ n d g( ϕ. From Item 1 of Proposition 9 we know that the formula ψ( x is of the form (ψ 1 ψ 2, where ψ 1 states that no distinct free variables of ψ are connected by an E-edge and ψ 2 is a positive boolean combination of unary atoms. We let x = (x 1,..., x k. In case that k = 1, ψ(x 1 = ψ 2(x 1 is a positive boolean combination of unary atoms. We can use Lemma 7 for each unary atom in order to compute ψ(g in time O( ψ ñ d g( ψ for a computable function g. This is time h(ϕ n d h(ϕ, for a computable function h, and can thus be done during the preprocessing phase. In the enumeration phase, we then simply loop through the list of all elements in v ψ(g, compute ā := f 1 ( v, and output ā. Due to Item 4 of Proposition 9, the delay is O(k 2, and we are done. The case for k 2 requires much more elaborate constructions. We let x k 1 := (x 1,..., x k 1. To enable enumeration of ψ(g (and hence also ϕ(a, by translating each result v ψ(g to ā := f 1 ( v, we first transform ψ into a normal form j J θj( x such that the formulas θ j exclude each other (i.e., for each v ψ(g there is exactly one j J such that v θ j(g, and each θ j( x is 6

7 of the form φ j( x k 1 P j(x k γ( x, γ( x := k 1 i=1 where ( E(xi, x k E(x k, x i, P j(x k is a boolean combination of unary atoms regarding x k, and φ j( x k 1 is a formula with only k 1 free variables. Note that the transformation into this normal form can be done easily, using the particularly simple form of the formula ψ. Clearly, we can enumerate ψ(g by enumerating θ j(g for each j J. In the following, we therefore restrict attention to the enumeration of θ j(g for a fixed j J. For θ j we shortly write θ( x = φ( x k 1 P (x k γ( x. We let θ ( x k 1 := x k θ( x. By the induction hypothesis (since θ only has k 1 free variables, we can enumerate θ (G with delay h(θ after a preprocessing phase of time h(θ ñ d h(θ. Since P (x k is a boolean combination of unary atoms on x k, we can use Lemma 7 to compute P (G in time O( P ñ d g( P for a computable function g. Afterwards, we have available a list of all nodes v of G that belong to P (G. In the following, we will write P to denote the linear ordering of P (G induced by this list, and we write first P for the first element in this list, and next P for the successor function, such that for any node v P (G, next P (v is the next node in P (G in this list (or the value void, if v is the last node in the list. We extend the signature of G by a unary relation symbol P and a binary relation symbol next, and let Ĝ be the expansion of G where P is interpreted by the set P (G and next is interpreted by the successor function next P (i.e., next(v, v is true in Ĝ iff v = next P (v. Note that Ĝ has degree ˆd = d+2, which is d g( ϕ for a computable function g. We now start the key idea of the proof, i.e., the function that will help us skipping over irrelevant nodes. To this end consider the first-order formulas E 1,..., E k defined inductively as follows, where E (x, y is an abbreviation for (E(x, y E(y, x. The reason for defining these formulas will become clear only later on, in the proof of Claim 1. E 1(u, y := P (y E (u, y, and E i+1(u, y := E i(u, y z z v ( E (z, u next(z, z E (v, z E i(v, y. A simple induction shows that for E i(u, y to hold, y must be at distance 3(i < 3i from u. In our enumeration algorithm we will have to test, given nodes u, v dom(g, whether (u, v E k (Ĝ. Since E k is a firstorder formula, Theorem 5 implies that, after preprocessing time g ( E k ñ ˆd g ( E k (for some computable function g, testing membership in E k (Ĝ, for any given (u, v dom(g2, is possible within time g ( E k. By our knowledge on the formula E k and the size of the parameters ñ and ˆd we know that the preprocessing time is bounded by g ( ϕ n d g ( ϕ, for a suitable computable function g, and that each membership test can be done in time g ( ϕ. The last step of the precomputation phase computes the function skip that associates to each node y P (G and each set V of at most k 1 nodes that are related to y via E k, the smallest (according to the order P of P (G element z P y in P (G that is not connected by an E-edge to any node in V. More precisely: For any node y P (G and any set V with 0 V < k and (v, y E k (Ĝ for all v V, we let skip(y, V := min{z P (G : y P z and v V : (v, z E(G and (z, v E(G}, respectively, skip(y, V := void if no such z exists. Notice that the nodes of V are related to y via E k and hence are at distance < 3k from y. Hence for each y, we only need to consider at most ˆd (3k2 such sets V. For each set V, skip(y, V can be computed by running consecutively through all nodes z P y in the list P (G and test whether ( E(z, v E(v, z holds for some v V. To perform the latter test in constant time, we precompute an (ñ ñ-array A E such that A E[z, v] = 1 if (z, v E G, and A E[z, v] = 0 otherwise, in the same way as in the proof of Theorem 5. Since V k and each v V is of degree at most d in G, the value skip(y, V can be found in time O(k 2 d. Therefore, the entire skip-function can be computed, and stored in an array, in time O(ñ ˆd (3k2 g ( ϕ k 2 d, which is time g( ϕ n d g( ϕ for a suitable computable function g. During the enumeration phase we will use this array such that for given y and V, the value skip(y, V can be looked-up within constant time. We are now done with the preprocessing phase. Altogether it took 1. the time to compute ψ and G, which is g( ϕ n d g( ϕ, for a computable function g 2. the time to compute j J θj, which is g( ϕ, for a computable function g 3. for each j J and θ := θ j, it took (a the preprocessing time for enumerating θ (G, which is h(θ ñ d h(θ, for the computable function h in the Proposition s statement (b the time for computing P (G, which is g( ϕ n d g( ϕ, for a computable function g (c the preprocessing time for testing membership in E k (Ĝ and for producing the array A E, which can be done in time g( ϕ n d g( ϕ, for a computable function g (d and the time for computing the skip-function, which is g( ϕ n d g( ϕ, for a computable function g. It is straightforward to see that, by suitably choosing the computable function h, all the preprocessing steps can be done within time h(ϕ n d h(ϕ. We now turn to the enumeration phase. We will describe how to enumerate θ(g with delay h(ϕ. Note that this will immediately lead to the desired enumeration algorithm for ϕ(g by doing the following: Loop through all j J to enumerate θ j(g; however, instead of outputting a tuple v θ j(g, compute the tuple ā := f 1 ( v and output ā. In the rest of this proof, we will therefore restrict attention to enumerating θ(g. We first describe the enumeration algorithm, then analyse its running time, and finally prove that it outputs, without repetition, all the tuples in θ(g. 7

8 Algorithm for enumerating θ(g: 1. Let ū be the first output produced in the enumeration of θ (G. If ū = void then STOP with output void, else let (u 1,..., u k 1 = ū and goto line Let y := first P be the first element in the list P (G. 3. Let V := {v {u 1,..., u k 1 } : (v, y E k (Ĝ}. 4. Let z := skip(y, V. 5. If z void then OUTPUT (ū, z and goto line If z = void then 7. Let ū be the next output produced in the enumeration of θ (G. 8. If ū = void then STOP with output void, else let ū := ū and goto line Let y := next P (z. 10. If y = void then goto line 7, else goto line 3. Note that the algorithm never outputs any tuple more than once. Before proving that this algorithm enumerates exactly the tuples in θ(g, let us first show that it operates with delay at most h(ϕ. By the induction hypothesis, the execution of line 1 and and each execution of line 7 takes time at most h(θ. Furthermore, each execution of line 3 takes time (k 1 g ( E k. Concerning the remaining lines of the algorithm, each execution can be done in time O(1. Furthermore, before outputting the first tuple, the algorithm executes at most 5 lines (namely, lines 1 5; note that by our choice of the formula θ we know that when entering line 5 before outputting the first tuple, it is guaranteed that z void, hence an output tuple is generated. Between outputting two consecutive tuples, the algorithm executes at most 12 lines (the worst case is an execution of lines 9, 10, 3, 4, 5, 6, 7, 8, 2, 3, 4, 5; again, by our choice of the formula θ, at the last execution of line 5 it is guaranteed that z void, hence an output tuple is generated. Therefore, by suitably choosing the function h, we obtain that the algorithm enumerates with delay at most h(ϕ. Concerning the correctness of the output, let us first show that every tuple (ū, z that is produced as an output, does belong to θ(g: Recall that θ( x = φ( x k 1 P (x k γ( x. We know that ū θ (G, and thus, in particular, φ(ū is satisfied. Furthermore, if the tuple (ū, z is output in line 5 (this is the only OUTPUT instruction present in the algorithm, we know that V = {v {u 1,..., u k 1 } : (v, y E k (Ĝ} and z = skip(y, V. Hence, z belongs to P (G, and we know that z is not connected by an E-edge to any node in V. By the next claim (Claim 1 we obtain that z is not connected by an E-edge to any node in {u 1,..., u k 1 }, and hence γ(ū, z is satisfied and the tuple (ū, z belongs to θ(g. This is the key of our enumeration algorithm. CLAIM 1. Let U be a set of at most k 1 nodes of G. Let y P (G, let V := {v U : (v, y E k (Ĝ}, and let z := skip(y, V void. Then, z = min{w P (G : y P w and u U : (u, w E(G and (w, u E(G}. PROOF OF CLAIM 1. By definition of skip(y, V we have z = min{z P (G : y P z and v V : (v, z E(G and (z, v E(G}. Since V U, we thus know that z P min{w P (G : y P w and u U : (u, w E(G and (w, u E(G}. All that remains to be done is to show that for all u U \ V we have (u, z E(G and (z, u E(G. In case that z = y, this is true because U \ V only contains vertices that are not connected to y by an E k -edge, and hence also not connected to y by an E-edge. In case that z y, we know that y < P z, and thus y P z < P z for the immediate predecessor z of z, i.e., the node z P (G with next P (z = z. For contradiction, assume that for some u U \ V we have (u, z E(G or (z, u E(G. Thus, E (z, u is satisfied in Ĝ. Also, next(z, z is satisfied in Ĝ. Since y P z < P z (i.e., z is skipped by skip(y, V, we furthermore know that z is connected by an E-edge to some node v V, i.e., E (v, z is true in Ĝ. Assume now that also E k 1 (v, y is true in Ĝ. Recalling the definition of the formula E k, note that we thus have found witnesses showing that E k (u, y holds in Ĝ, i.e., (u, y Ek(Ĝ. This, however, implies that u V, contradicting the assumption that u U \ V. To conclude the proof of Claim 1 it remains to show that E k 1 (v, y is indeed true in Ĝ, i.e., (v, y Ek 1(Ĝ. To this end, for all j k, let V j := {v V : (v, y E j(ĝ}. Clearly, V k = V. By the choice of the formulas E j we know that V 1 V k. Moreover, it is straightforward to see that the definition of the formulas E 1,..., E k ensures that the following is true: if V j = V j+1, for some j < k, then V j = = V k. Thus, there is a j k such that ( : V 1 V j = = V k = V. Since V U, U k 1, and u U \ V, we know that V k 2. Hence, ( implies that j k 1. Thus, V k 1 = V. Since v V, we therefore obtain that v belongs to V k 1, i.e., E k 1 (v, y is true in Ĝ. This completes the proof of Claim 1. To finish the proof of Proposition 15, we need to verify that every tuple in θ(g is eventually output by the algorithm. Let (u 1,..., u k be an arbitrary tuple in θ(g. Then, in particular, for ū := (u 1,..., u k 1, we have that ū θ (G, Thus, by the induction hypothesis, the enumeration algorithm for θ will eventually output the tuple ū. Let z 1 < P < P z m be an ordered list of all elements such that the enumeration algorithm for θ(g outputs the tuple (ū, z i for i {1,..., m}. Clearly, it suffices to show that u k is one of the z i s. Let U := {u 1,..., u k 1 }. Since (ū, u k θ(g we, in particular, have that u k P (G and u k is not connected to any u U by an E-edge. In case that u k P z 1, by construction of the algorithm we know that z 1 = skip(y, V for y := first P. By Claim 1, we furthermore know that z 1 is the minimum element w P (G with y P w, which is not connected to any u U by an E-edge. Thus, z 1 P u k, and hence u k = z 1. In case that z i 1 < P u k P z i, we know for y := next P (z i 1 that z i = skip(y, V. Since y P u k, we obtain from Claim 1 that z i P u k, and hence u k = z i. 8

9 The case that z m < P u k cannot occur since, by construction of the algorithm, the following is true: Either z m is the largest element w.r.t. P or for the element y := next P (z m, we have skip(y, V = void, whereas according to the definition of the skipfunction, skip(y, V would have to be an element P u k. This concludes the proof of Proposition 15 Theorem 6 follows immediately from Proposition 15 (following again the arguments of the proof of Proposition PROOF OF QUANTIFIER ELIMINATION AND NORMAL FORM This section is devoted to the proof of Proposition 9. The proof consists of several steps, the first of which relies on a transformation of ϕ( x into an equivalent formula in Gaifman normal form, i.e., a boolean combination of basic-local sentences and formulas that are local around x. A formula λ( x is r-local around x (for some r 0 if every quantifier is relativized to the r- neighbourhood of x. A basic-local sentence is of the form y 1 y l 1 i<j l dist(y i, y j > 2r l θ(y i, where θ(y is r-local around y. By Gaifman s well-known theorem we obtain an algorithm that transforms an input formula ϕ( x into an equivalent formula in Gaifman normal form [9]. The rest of the proof can be sketched as follows. Basic-local sentences can be evaluated on structures of low degree in pseudolinear time by Theorem 2, so it remains to treat formulas that are local around their free variables. By the Feferman-Vaught Theorem (cf., e.g. [17], we can further decompose local formulas into formulas that are local around one of their free variables. The latter turns out to have a small answer set that can be precomputed in pseudo-linear time. The remaining time is used to compute the structures useful for reconstructing the initial answers from their components. We now give the details. PROOF OF PROPOSITION 9. Step 1: transform ϕ( x into a local formula ϕ ( x. We first transform ϕ( x into an equivalent formula ϕ G ( x in Gaifman normal form. For each basic-local sentence χ occurring in ϕ G ( x, check whether A = χ and let χ := true if A = χ and χ := false if A = χ. Let ϕ ( x be the formula obtained from ϕ G ( x by replacing every basic-local sentence χ occurring in ϕ G ( x with χ. By using Gaifman s theorem and Theorem 2, all this can be done in time and space O(h( ϕ n d h( ϕ, for a computable function h. Clearly, for every ā dom(a k we have A = ϕ (ā iff A = ϕ(ā. Note that there is a number r 0 such that ϕ ( x is r-local around x, and this number can easily be computed given ϕ G ( x. Step 2: transform ϕ ( x into a disjunction P P ψ P ( x. Let x = (x 1,..., x k. A partition of the set {1,..., k} is a list P = (P 1,..., P l with 1 l k such that i=1 = P j {1,..., k}, for every j {1,..., l}, P 1 P l = {1,..., k}, P j P j =, for all j, j {1,..., l} with j j, min P j < min P j+1, for all j {1,..., l 1}. Let P be the set of all partitions of {1,..., k}. Clearly, P k!. For each P = (P 1,..., P l P and each j l let x Pj be the tuple obtained from x by deleting all those x i with i P j. For every partition P = (P 1,..., P l P let ϱ P ( x be an FO(σ-formula stating that each of the following is true: 1. The r-neighbourhood around x in A is the disjoint union of the r-neighbourhoods around x Pj for j l. I.e., δ P ( x := dist(x i, x i > 2r+1. 1 j<j l (i,i P j P j 2. For each j l, the r-neighbourhood around x Pj in A is connected, i.e., satisfies the formula γ Pj ( x Pj := dist(x i, x i 2r+1. E P j P j such that the graph (P j,e is connected (i,i E Note that the formula ϱ P ( x := δ P ( x l j=1 γp j ( xp j is r- local around x. Furthermore, ϕ ( x obviously is equivalent to the formula ( P P ϱp ( x ϕ ( x. Using the Feferman-Vaught Theorem (see e.g. [17], we can, for each P = (P 1,..., P l P, compute a decomposition of ϕ ( x into r-local formulas ϑ P,j,t( x Pj, for j {1,..., l} and t T P, for a suitable finite set T P, such that the formula ( ϱ P ( x ϕ ( x is equivalent to ϱ P ( x t T P ( ϑp,1,t( x P1 ϑ P,l,t ( x Pl which, in turn, is equivalent to ψ P ψ P,1 := δ P ( x and ψ P,2 := ( l j=1 γ Pj ( x Pj := (ψ P,1 ψ P,2, where t T P ( l j=1 ϑ P,j,t( x Pj. In summary, ϕ ( x is equivalent to P P ψ P ( x, and for every tuple ā dom(a k with A = ϕ (ā, there is exactly one partition P P such that A = ψ P (ā (since A = ϱ P (ā is true for only one such P P. Step 3: defining G, f, and ψ. We define the domain G of G to be the disjoint union of the sets A and V, where A := dom(a, and V consists of a dummy element v, and an element v ( b,ι for each b A 1 A k such that A = γ Pj ( b for P j := {1,..., b } and for each injective mapping ι : {1,..., b } {1,..., k}. Note that the first item ensures that the r-neighbourhood around b in A is connected. The second item ensures that we can view ι as a description telling us that the i-th component of b shall be viewed as an assignment for the variable x ι(i (for each i {1,..., b }. We let f be the function from A k to V k defined as follows: For each ā A k let P = (P 1,..., P l be the unique element in P such that A = ϱ P (ā. For each j {1,..., l}, we write ā Pj for the tuple obtained from ā by deleting all those a i with i P j. Furthermore, we let ι Pj be the mapping from {1,..., P j } to {1,..., k} such that for any i {1,..., P j }, ι(i is the i-th smallest element of P j. Then, f(ā := ( v (āp1,ι P1,..., v (āpl,ι Pl, v,..., v, where the number of v -components is (k l. It is straightforward to see that f is injective. 9

10 We let τ 1 be the signature consisting of a unary relation symbol C, and a unary relation symbol C ι for each injective mapping ι : {1,..., s} {1,..., k} for s {1,..., k}. In G, the symbol C is interpreted by the singleton set {v }, and each C ι is interpreted by the set of all nodes v ( b,ˆι V with ˆι = ι. We let E be a binary relation symbol which is interpreted in G by the set of all tuples (v ( b,ι, v ( c,ˆι V 2 such that there are elements b A in b and c A in c such that dist A (b, c 2r+1. For each P = (P 1,..., P l P, each j {1,..., l}, and each t T P we let C P,j,t be a unary relation symbol which, in G, is interpreted by the set of all nodes v ( b,ι V such that ι = ι Pj and A = ϑ P,j,t( b. We let τ 2 be the signature consisting of all the unary relation symbols C P,j,t. We let ȳ = (y 1,..., y k be a tuple of k distinct variables, and we define ψ 1(ȳ to be the FO(E-formula ψ 1(ȳ := E(y j, y j. 1 j,j k with j j For each P = (P 1,..., P l we let ψ P (ȳ be the FO(τ 1 τ 2- formula defined as follows: ψ P (ȳ := ( l j=1 t T P C ιpj (y j ( l j=1 C P,j,t(y j. ( k j=l+1 C (y j It is straightforward to verify that the following is true: (1 For every ā A k with A = ψ P (ā, we have G = (ψ 1 ψ P (f(ā. (2 For every v G k with G = (ψ 1 ψ P ( v, there is a (unique tuple ā A k with v = f(ā, and for this tuple we have A = ψ P (ā. Finally, we let ψ(ȳ := ( ψ 1(ȳ ψ 2(ȳ with ψ 2(ȳ := P P ψ P (ȳ. It is straightforward to see that f is a bijection between ϕ(a and ψ(g. In summary, we now know that items 1 and 2, as well as the first statement of item 4 of Proposition 9 are true. To achieve the second statement of item 4, we use additional binary relation symbols F 1,..., F k which are interpreted in G as follows: Start by initialising all of them to the empty set. Then, for each v = v ( b,ι V and each j {1,..., b }, add to F G ι(j the tuple (v, a, where a is j-th component of b. This completes the definition of G and τ, letting τ := τ 1 τ 2 {E, F 1,..., F k }. Using the relations F 1,..., F k of G, in time and space O(k we can, upon input of v = v ( b,ι V compute the tuple b and the mapping ι (for this, just check for all i {1,..., k} whether node v has an outgoing F i-edge. Using this, it is straightforward to see that upon input of v ψ(g, the tuple f 1 ( v A k can be computed in time and space O(k 2, thus proving the second statement of item 4. Step 4: proving item 3. First of all, note that for each v ( b,ι V, the tuple b is of the form (b 1,..., b s A s for some s k, such that all components of the tuple belong to the ˆr-neighbourhood N Aˆr (b 1 of b 1 in A, for ˆr := k(2r+1. Since A has degree d, N Aˆr (b 1 contains at most dˆr+1 elements of A. And by using breadth-first search, N Aˆr (b 1 can be computed in time d O(ˆr+1. Thus, the set V, along with the relations C, C ι and F 1,..., F k of G, can be computed as follows: Start by letting V := {v } and initialising all relations to the empty set. Let C G := {v }. Then, for each a A, compute the ˆr-neighbourhood N Aˆr (a of a in A, and compute (by a brute-force algorithm, for each s {1,..., k}, the set of all s-tuples b of elements from this neighbourhood, which satisfy the following: The first component of b is a, and N Aˆr (a = γ Pj ( b for P j = {1,..., s}. For each such tuple b do the following: For each injective mapping ι : {1,..., s} {1,..., k} add to V a new element v ( b,ι, add this element to the relation Cι G, and for each j {1,..., s}, add to F G ι(j the tuple (v ( b,ι, a, where a is the j-th component of b. This way, the domain G = A V of G, along with the relations C ι and F 1,..., F k of G, can be computed in time and space O(h( ϕ n d h( ϕ, for a computable function h. For computing the unary relations C P,j,t of G, start by initialising all of them to the empty set. For each v ( b,ι V do the following: Compute (by using the relations F 1,..., F k the tuple b and the mapping ι. Let a be the first component of b. Compute the r-neighbourhood Nr Ã (a of a in A, for r := ˆr+r. For each P = (P 1,..., P l P, each j {1,..., l} such that ι Pj = ι, and each t T P, check whether Nr Ã (a = θ P,t,j( b. If so, add the element v ( b,ι to the relation C P,j,t of G. (This is correct, since the formula θ P,j,t is r-local around its free variables, and the radius of the neighbourhood is large enough. This way, G s relations C P,j,t can be computed in time and space O(h( ϕ n d h( ϕ, for a computable function h. To compute the E-relation of G, note that for all tuples (v ( b,ι, v ( c,ˆι E G, we have dist A (a, c j (2k+1(2r+1, for all components c j of c, where a is the first component of b. Thus, the E-relation of G can be computed as follows: Start by initialising this relation to the empty set. For each v ( b,ι V do the following: Compute (by using the relations F 1,..., F k the tuple b. Let a be the first component of b. Compute the r -neighbourhood N A r (a of a in A, for r := r + (2k+1(2r+1. Use a brute-force algorithm to compute all tuples c of elements in N A r r(a, such that c k and N A r (a = γp j ( c for Pj = {1,..., c }. Check if there are components b of b and c of c such that dist N A r (a (b, c 2r+1. If so, add to E G the tuple (v ( b,ι, v ( c,ˆι for each injective mapping ˆι : {1,..., c } {1,..., k}. This way, the E-relation of G can be computed in time and space O(h( ϕ n d h( ϕ, for a computable function h. In summary, we obtain that G is computable from A and ϕ within the desired time and space bound. To finish the proof of item 3, we need to give an upper bound on the degree of G. As noted above, (v ( b,ι, v ( c,ˆι E G implies that dist A (a, c j r for r := (2k+1(2r+1, for all components c j of c, where a is the first component of b. Thus, for each fixed v ( b,ι V, the number of elements v ( c,ˆι such that (v ( b,ι, v ( c,ˆι E G is at most k! k N A r (a s k! N A r (a k+1 k! d (r +1(k+1. s=1 Thus, since E G is symmetric, its degree is 2k!d (r +1(k+1. Similarly, for each tuple (v ( b,ι, a F G i (with i {1,..., k} we know that a is the ι 1 (i-th component of b and each com- 10

11 ponent of b belongs to the ˆr-neighbourhood of a in A, for ˆr = k(2r+1. Thus, for each fixed a A, the number of elements v ( b,ι V such that (v ( b,ι, a F G i is at most k! k s=1 N Aˆr (a s k! d (ˆr+1(k+1. In summary, we thus obtain that G is of degree at most d h( ϕ for a computable function h. Step 5: proving the third statement of item 4. For ā A k we know that f(ā := ( v (āp1,ι P1,..., v (āpl,ι Pl, v,..., v, for the unique partition P = (P 1,..., P l P such that A = ϱ P (ā. The number of v -components in f(ā is (k l. To compute the partition P for a given tuple ā = (a 1,..., a k, we can proceed as follows: Construct an undirected graph H with vertex set {1,..., k}, where there is an edge between i j iff dist A (a i, a j 2r+1. Compute the connected components of H. Let l be the number of connected components of H. For each j {1,..., l} let P j be vertex set of the j-th connected component, such that min P j < min P j+1 for all j {1,..., l 1}. Once the edge set of H is constructed, all further steps of this algorithm can be performed in time O(k 2. After having constructed the partition P = (P 1,..., P l, further O(k 2 steps suffice to construct the tuples ā P1,..., ā Pl, the mappings ι P1,..., ι Pl, and the according tuple f(ā. Hence, we can compute f(ā in time O(k 2, provided that the edge set of H can be computed in time O(k 2. To enable the fast computation of the edge set of H, we precompute an (n n-array D such that for any two elements a, b A = {1,..., n} we have D[a, b] = 1 if a b and dist A (a, b 2r+1, and D[a, b] = 0 otherwise. Once D is available, we can compute the edge set of H in time O(k 2 by checking, for all i, j {1,..., k} whether D[a i, a j] = 1. To precompute the array D, we can proceed as follows: In the standard RAM-model, when starting the RAM D is initialised to 0 for free (in a model where this initialisation is not for free, we can avoid initialisation by using the lazy array technique described in Section 2. For each a A, we use time d O(2r+1 to compute the (2r+1-neighbourhood of a in A, and for each element b a in this neighbourhood we let D[a, b] := 1. In summary, the computation of the array D thus is done in time O(n d h( ϕ for a computable function h. This completes the proof of Proposition CONCLUSION For classes of databases of low degree, we presented an algorithm which enumerates the answers to first-order queries with constant delay after pseudo-linear preprocessing. An inspection of the proof shows that the constants involved are non-elementary in the query size. In the bounded degree case the constants are triply exponential in the query size [13]. In the (unranked tree case the constants are provably non-elementary [8] (modulo some complexity assumption. We do not know what is the situation for classes of low degree. It would also be interesting to know whether we can enumerate the answers to a query using the lexicographical order (as it is the case over structures of bounded expansion [14]. Finally it would be interesting to know whether the memory assumption that we use for our RAMs can be avoided. 6. REFERENCES [1] G. Bagan. MSO queries on tree decomposable structures are computable with linear delay. In Proc. 20th International Workshop/15th Annual Conference of the EACSL (CSL 06, pages , [2] G. Bagan, A. Durand, and E. Grandjean. On acyclic conjunctive queries and constant delay enumeration. In Proc. 21st International Workshop/16th Annual Conference of the EACSL (CSL 07, pages , [3] J. Brault-Baron. A negative conjunctive query is easy if and only if it is beta-acyclic. In Proc. 26th International Workshop/21st Annual Conference of the EACSL (CSL 12, pages , [4] B. Courcelle. Linear delay enumeration and monadic second-order logic. Discrete Applied Mathematics, 157(12: , [5] A. Durand and E. Grandjean. First-order queries on structures of bounded degree are computable with constant delay. ACM Transactions on Computational Logic, 8(4, [6] Z. Dvořák, D. Král, and R. Thomas. Deciding First-Order Properties for Sparse Graphs. In Proc. 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS 10, pages , [7] J. Flum and M. Grohe. Parameterized Complexity Theory. Springer-Verlag, [8] M. Frick and M. Grohe. The complexity of first-order and monadic second-order logic revisited. Annals of Pure and Applied Logic, 130(1-3:3 31, [9] H. Gaifman. On local and nonlocal properties. In J. Stern, editor, Logic Colloquium 81, pages North-Holland, [10] E. Grandjean and F. Olive. Graph properties checkable in linear time in the number of vertices. Journal of Computer and System Sciences, 68(3: , [11] M. Grohe. Generalized model-checking problems for first-order logic. In Proc. 18th Annual Symposium on Theoretical Aspects of Computer Science (STACS 01, pages 12 26, [12] M. Grohe, S. Kreutzer, and S. Siebertz. Deciding first-order properties of nowhere dense graphs. In Proc. 46th Symposium on Theory of Computing (STOC 14, Preliminary version: CoRR abs/ (2013. [13] W. Kazana and L. Segoufin. First-order query evaluation on structures of bounded degree. Logical Methods in Computer Science, 7(2, [14] W. Kazana and L. Segoufin. Enumeration of first-order queries on classes of structures with bounded expansion. In Proc. 32nd ACM Symposium on Principles of Database Systems (PODS 13, pages , [15] W. Kazana and L. Segoufin. Enumeration of monadic second-order queries on trees. ACM Transactions on Computational Logic, 14(4, [16] S. Kreutzer and A. Dawar. Parameterized complexity of first-order logic. Electronic Colloquium on Computational Complexity (ECCC, 16:131, [17] J. A. Makowsky. Algorithmic uses of the Feferman-Vaught Theorem. Annals of Pure and Applied Logic, 126(1-3: , [18] B. M. E. Moret and H. D. Shapiro. Algorithms from P to NP. Volume 1: Design & Efficiency. Benjamin/Cummings, [19] M. Yannakakis. Algorithms for acyclic database schemes. In Proc. 7th International Conference on Very Large Data Bases (VLDB 81, pages 82 94,

### Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

### Outline. Database Theory. Perfect Query Optimization. Turing Machines. Question. Definition VU , SS Trakhtenbrot s Theorem

Database Theory Database Theory Outline Database Theory VU 181.140, SS 2016 4. Trakhtenbrot s Theorem Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 4.

### Bounded Treewidth in Knowledge Representation and Reasoning 1

Bounded Treewidth in Knowledge Representation and Reasoning 1 Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien Luminy, October 2010 1 Joint work with G.

### MAT2400 Analysis I. A brief introduction to proofs, sets, and functions

MAT2400 Analysis I A brief introduction to proofs, sets, and functions In Analysis I there is a lot of manipulations with sets and functions. It is probably also the first course where you have to take

### SETS, RELATIONS, AND FUNCTIONS

September 27, 2009 and notations Common Universal Subset and Power Set Cardinality Operations A set is a collection or group of objects or elements or members (Cantor 1895). the collection of the four

### The Recovery of a Schema Mapping: Bringing Exchanged Data Back

The Recovery of a Schema Mapping: Bringing Exchanged Data Back MARCELO ARENAS and JORGE PÉREZ Pontificia Universidad Católica de Chile and CRISTIAN RIVEROS R&M Tech Ingeniería y Servicios Limitada A schema

### Tree-representation of set families and applications to combinatorial decompositions

Tree-representation of set families and applications to combinatorial decompositions Binh-Minh Bui-Xuan a, Michel Habib b Michaël Rao c a Department of Informatics, University of Bergen, Norway. buixuan@ii.uib.no

### Introduction to Logic in Computer Science: Autumn 2006

Introduction to Logic in Computer Science: Autumn 2006 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam Ulle Endriss 1 Plan for Today Now that we have a basic understanding

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### (LMCS, p. 317) V.1. First Order Logic. This is the most powerful, most expressive logic that we will examine.

(LMCS, p. 317) V.1 First Order Logic This is the most powerful, most expressive logic that we will examine. Our version of first-order logic will use the following symbols: variables connectives (,,,,

### Satisfiability Checking

Satisfiability Checking First-Order Logic Prof. Dr. Erika Ábrahám RWTH Aachen University Informatik 2 LuFG Theory of Hybrid Systems WS 14/15 Satisfiability Checking Prof. Dr. Erika Ábrahám (RWTH Aachen

### A first step towards modeling semistructured data in hybrid multimodal logic

A first step towards modeling semistructured data in hybrid multimodal logic Nicole Bidoit * Serenella Cerrito ** Virginie Thion * * LRI UMR CNRS 8623, Université Paris 11, Centre d Orsay. ** LaMI UMR

### Discuss the size of the instance for the minimum spanning tree problem.

3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can

### Locality from Circuit Lower Bounds

Locality from Circuit Lower Bounds Matthew Anderson, Dieter Van Melkebeek, Nicole Schweikardt, Luc Segoufin To cite this version: Matthew Anderson, Dieter Van Melkebeek, Nicole Schweikardt, Luc Segoufin.

### Large induced subgraphs with all degrees odd

Large induced subgraphs with all degrees odd A.D. Scott Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, England Abstract: We prove that every connected graph of order

### Exponential time algorithms for graph coloring

Exponential time algorithms for graph coloring Uriel Feige Lecture notes, March 14, 2011 1 Introduction Let [n] denote the set {1,..., k}. A k-labeling of vertices of a graph G(V, E) is a function V [k].

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three

Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

### R u t c o r Research R e p o r t. Boolean functions with a simple certificate for CNF complexity. Ondřej Čepeka Petr Kučera b Petr Savický c

R u t c o r Research R e p o r t Boolean functions with a simple certificate for CNF complexity Ondřej Čepeka Petr Kučera b Petr Savický c RRR 2-2010, January 24, 2010 RUTCOR Rutgers Center for Operations

### THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER

THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER JASPER DEANTONIO Abstract. This paper is a study of the Turing Degrees, which are levels of incomputability naturally arising from sets of natural numbers.

### 1 Definitions. Supplementary Material for: Digraphs. Concept graphs

Supplementary Material for: van Rooij, I., Evans, P., Müller, M., Gedge, J. & Wareham, T. (2008). Identifying Sources of Intractability in Cognitive Models: An Illustration using Analogical Structure Mapping.

### Monitoring Metric First-order Temporal Properties

Monitoring Metric First-order Temporal Properties DAVID BASIN, FELIX KLAEDTKE, SAMUEL MÜLLER, and EUGEN ZĂLINESCU, ETH Zurich Runtime monitoring is a general approach to verifying system properties at

### This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

3. Axioms of Set theory Before presenting the axioms of set theory, we first make a few basic comments about the relevant first order logic. We will give a somewhat more detailed discussion later, but

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### CARDINALITY, COUNTABLE AND UNCOUNTABLE SETS PART ONE

CARDINALITY, COUNTABLE AND UNCOUNTABLE SETS PART ONE With the notion of bijection at hand, it is easy to formalize the idea that two finite sets have the same number of elements: we just need to verify

### o-minimality and Uniformity in n 1 Graphs

o-minimality and Uniformity in n 1 Graphs Reid Dale July 10, 2013 Contents 1 Introduction 2 2 Languages and Structures 2 3 Definability and Tame Geometry 4 4 Applications to n 1 Graphs 6 5 Further Directions

### Math 3000 Running Glossary

Math 3000 Running Glossary Last Updated on: July 15, 2014 The definition of items marked with a must be known precisely. Chapter 1: 1. A set: A collection of objects called elements. 2. The empty set (

### Bounded-width QBF is PSPACE-complete

Bounded-width QBF is PSPACE-complete Albert Atserias 1 and Sergi Oliva 2 1 Universitat Politècnica de Catalunya Barcelona, Spain atserias@lsi.upc.edu 2 Universitat Politècnica de Catalunya Barcelona, Spain

### Sets and Cardinality Notes for C. F. Miller

Sets and Cardinality Notes for 620-111 C. F. Miller Semester 1, 2000 Abstract These lecture notes were compiled in the Department of Mathematics and Statistics in the University of Melbourne for the use

### Linear Codes. Chapter 3. 3.1 Basics

Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length

### Generating models of a matched formula with a polynomial delay

Generating models of a matched formula with a polynomial delay Petr Savicky Institute of Computer Science, Academy of Sciences of Czech Republic, Pod Vodárenskou Věží 2, 182 07 Praha 8, Czech Republic

### Partitioning edge-coloured complete graphs into monochromatic cycles and paths

arxiv:1205.5492v1 [math.co] 24 May 2012 Partitioning edge-coloured complete graphs into monochromatic cycles and paths Alexey Pokrovskiy Departement of Mathematics, London School of Economics and Political

### EMBEDDING COUNTABLE PARTIAL ORDERINGS IN THE DEGREES

EMBEDDING COUNTABLE PARTIAL ORDERINGS IN THE ENUMERATION DEGREES AND THE ω-enumeration DEGREES MARIYA I. SOSKOVA AND IVAN N. SOSKOV 1. Introduction One of the most basic measures of the complexity of a

### Chapter 4, Arithmetic in F [x] Polynomial arithmetic and the division algorithm.

Chapter 4, Arithmetic in F [x] Polynomial arithmetic and the division algorithm. We begin by defining the ring of polynomials with coefficients in a ring R. After some preliminary results, we specialize

### Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques. My T. Thai

Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques My T. Thai 1 Overview An overview of LP relaxation and rounding method is as follows: 1. Formulate an optimization

### Week 5: Binary Relations

1 Binary Relations Week 5: Binary Relations The concept of relation is common in daily life and seems intuitively clear. For instance, let X be the set of all living human females and Y the set of all

### Discrete Mathematics, Chapter 5: Induction and Recursion

Discrete Mathematics, Chapter 5: Induction and Recursion Richard Mayr University of Edinburgh, UK Richard Mayr (University of Edinburgh, UK) Discrete Mathematics. Chapter 5 1 / 20 Outline 1 Well-founded

### ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical and Mathematical Sciences 2012 1 p. 43 48 ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS I nf or m at i cs L. A. HAYKAZYAN * Chair of Programming and Information

### Factoring & Primality

Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

### Network (Tree) Topology Inference Based on Prüfer Sequence

Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### Mean Ramsey-Turán numbers

Mean Ramsey-Turán numbers Raphael Yuster Department of Mathematics University of Haifa at Oranim Tivon 36006, Israel Abstract A ρ-mean coloring of a graph is a coloring of the edges such that the average

### Resource Allocation with Time Intervals

Resource Allocation with Time Intervals Andreas Darmann Ulrich Pferschy Joachim Schauer Abstract We study a resource allocation problem where jobs have the following characteristics: Each job consumes

### Lecture 7: Approximation via Randomized Rounding

Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining

### Degree Hypergroupoids Associated with Hypergraphs

Filomat 8:1 (014), 119 19 DOI 10.98/FIL1401119F Published by Faculty of Sciences and Mathematics, University of Niš, Serbia Available at: http://www.pmf.ni.ac.rs/filomat Degree Hypergroupoids Associated

### Lecture 2: Universality

CS 710: Complexity Theory 1/21/2010 Lecture 2: Universality Instructor: Dieter van Melkebeek Scribe: Tyson Williams In this lecture, we introduce the notion of a universal machine, develop efficient universal

### Finite Sets. Theorem 5.1. Two non-empty finite sets have the same cardinality if and only if they are equivalent.

MATH 337 Cardinality Dr. Neal, WKU We now shall prove that the rational numbers are a countable set while R is uncountable. This result shows that there are two different magnitudes of infinity. But we

### Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison

Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison Tomi Janhunen 1, Ilkka Niemelä 1, Johannes Oetsch 2, Jörg Pührer 2, and Hans Tompits 2 1 Aalto University, Department

### INTRODUCTORY SET THEORY

M.Sc. program in mathematics INTRODUCTORY SET THEORY Katalin Károlyi Department of Applied Analysis, Eötvös Loránd University H-1088 Budapest, Múzeum krt. 6-8. CONTENTS 1. SETS Set, equal sets, subset,

### The degree, size and chromatic index of a uniform hypergraph

The degree, size and chromatic index of a uniform hypergraph Noga Alon Jeong Han Kim Abstract Let H be a k-uniform hypergraph in which no two edges share more than t common vertices, and let D denote the

### FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z

FACTORING POLYNOMIALS IN THE RING OF FORMAL POWER SERIES OVER Z DANIEL BIRMAJER, JUAN B GIL, AND MICHAEL WEINER Abstract We consider polynomials with integer coefficients and discuss their factorization

### arxiv:1112.0829v1 [math.pr] 5 Dec 2011

How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly

### Mathematical Logic. Tableaux Reasoning for Propositional Logic. Chiara Ghidini. FBK-IRST, Trento, Italy

Tableaux Reasoning for Propositional Logic FBK-IRST, Trento, Italy Outline of this lecture An introduction to Automated Reasoning with Analytic Tableaux; Today we will be looking into tableau methods for

### Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011

Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011 A. Miller 1. Introduction. The definitions of metric space and topological space were developed in the early 1900 s, largely

### 4 Basics of Trees. Petr Hliněný, FI MU Brno 1 FI: MA010: Trees and Forests

4 Basics of Trees Trees, actually acyclic connected simple graphs, are among the simplest graph classes. Despite their simplicity, they still have rich structure and many useful application, such as in

### What s next? Reductions other than kernelization

What s next? Reductions other than kernelization Dániel Marx Humboldt-Universität zu Berlin (with help from Fedor Fomin, Daniel Lokshtanov and Saket Saurabh) WorKer 2010: Workshop on Kernelization Nov

### Testing LTL Formula Translation into Büchi Automata

Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland

### We give a basic overview of the mathematical background required for this course.

1 Background We give a basic overview of the mathematical background required for this course. 1.1 Set Theory We introduce some concepts from naive set theory (as opposed to axiomatic set theory). The

### 3. Equivalence Relations. Discussion

3. EQUIVALENCE RELATIONS 33 3. Equivalence Relations 3.1. Definition of an Equivalence Relations. Definition 3.1.1. A relation R on a set A is an equivalence relation if and only if R is reflexive, symmetric,

### OHJ-2306 Introduction to Theoretical Computer Science, Fall 2012 8.11.2012

276 The P vs. NP problem is a major unsolved problem in computer science It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute to carry a \$ 1,000,000 prize for the

### An example of a computable

An example of a computable absolutely normal number Verónica Becher Santiago Figueira Abstract The first example of an absolutely normal number was given by Sierpinski in 96, twenty years before the concept

### Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS

Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS 1.1.25 Prove that the Petersen graph has no cycle of length 7. Solution: There are 10 vertices in the Petersen graph G. Assume there is a cycle C

### Lecture 17 : Equivalence and Order Relations DRAFT

CS/Math 240: Introduction to Discrete Mathematics 3/31/2011 Lecture 17 : Equivalence and Order Relations Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT Last lecture we introduced the notion

### This chapter describes set theory, a mathematical theory that underlies all of modern mathematics.

Appendix A Set Theory This chapter describes set theory, a mathematical theory that underlies all of modern mathematics. A.1 Basic Definitions Definition A.1.1. A set is an unordered collection of elements.

### each college c i C has a capacity q i - the maximum number of students it will admit

n colleges in a set C, m applicants in a set A, where m is much larger than n. each college c i C has a capacity q i - the maximum number of students it will admit each college c i has a strict order i

### A Sublinear Bipartiteness Tester for Bounded Degree Graphs

A Sublinear Bipartiteness Tester for Bounded Degree Graphs Oded Goldreich Dana Ron February 5, 1998 Abstract We present a sublinear-time algorithm for testing whether a bounded degree graph is bipartite

### x < y iff x < y, or x and y are incomparable and x χ(x,y) < y χ(x,y).

12. Large cardinals The study, or use, of large cardinals is one of the most active areas of research in set theory currently. There are many provably different kinds of large cardinals whose descriptions

### Notes on Complexity Theory Last updated: August, 2011. Lecture 1

Notes on Complexity Theory Last updated: August, 2011 Jonathan Katz Lecture 1 1 Turing Machines I assume that most students have encountered Turing machines before. (Students who have not may want to look

### Relations Graphical View

Relations Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry Introduction Recall that a relation between elements of two sets is a subset of their Cartesian product (of ordered pairs). A binary

### A Turán Type Problem Concerning the Powers of the Degrees of a Graph

A Turán Type Problem Concerning the Powers of the Degrees of a Graph Yair Caro and Raphael Yuster Department of Mathematics University of Haifa-ORANIM, Tivon 36006, Israel. AMS Subject Classification:

### Composing Schema Mappings: Second-Order Dependencies to the Rescue

Composing Schema Mappings: Second-Order Dependencies to the Rescue Ronald Fagin IBM Almaden Research Center fagin@almaden.ibm.com Phokion G. Kolaitis UC Santa Cruz kolaitis@cs.ucsc.edu Wang-Chiew Tan UC

### Graphs without proper subgraphs of minimum degree 3 and short cycles

Graphs without proper subgraphs of minimum degree 3 and short cycles Lothar Narins, Alexey Pokrovskiy, Tibor Szabó Department of Mathematics, Freie Universität, Berlin, Germany. August 22, 2014 Abstract

### JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS. Received December May 12, 2003; revised February 5, 2004

Scientiae Mathematicae Japonicae Online, Vol. 10, (2004), 431 437 431 JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS Ondřej Čepeka and Shao Chin Sung b Received December May 12, 2003; revised February

### It is not immediately obvious that this should even give an integer. Since 1 < 1 5

Math 163 - Introductory Seminar Lehigh University Spring 8 Notes on Fibonacci numbers, binomial coefficients and mathematical induction These are mostly notes from a previous class and thus include some

### NP-Completeness I. Lecture 19. 19.1 Overview. 19.2 Introduction: Reduction and Expressiveness

Lecture 19 NP-Completeness I 19.1 Overview In the past few lectures we have looked at increasingly more expressive problems that we were able to solve using efficient algorithms. In this lecture we introduce

### Constraint Satisfaction Problems and Descriptive Complexity

1 Constraint Satisfaction Problems and Descriptive Complexity Anuj Dawar University of Cambridge AIM workshop on Universal Algebra, Logic and CSP 1 April 2008 2 Descriptive Complexity Descriptive Complexity

### princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

### CHAPTER 7 GENERAL PROOF SYSTEMS

CHAPTER 7 GENERAL PROOF SYSTEMS 1 Introduction Proof systems are built to prove statements. They can be thought as an inference machine with special statements, called provable statements, or sometimes

### From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back Babak Salimi 1 and Leopoldo Bertossi 2 1 Carleton University, School of Computer Science, Ottawa, Canada bsalimi@scs.carleton.ca

### Ph.D. Thesis. Judit Nagy-György. Supervisor: Péter Hajnal Associate Professor

Online algorithms for combinatorial problems Ph.D. Thesis by Judit Nagy-György Supervisor: Péter Hajnal Associate Professor Doctoral School in Mathematics and Computer Science University of Szeged Bolyai

### Analysis of Algorithms, I

Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, February 26, 2015 Outline 1 Recap 2 Representing graphs 3 Breadth-first search (BFS) 4 Applications

### NP-Completeness and Cook s Theorem

NP-Completeness and Cook s Theorem Lecture notes for COM3412 Logic and Computation 15th January 2002 1 NP decision problems The decision problem D L for a formal language L Σ is the computational task:

### No: 10 04. Bilkent University. Monotonic Extension. Farhad Husseinov. Discussion Papers. Department of Economics

No: 10 04 Bilkent University Monotonic Extension Farhad Husseinov Discussion Papers Department of Economics The Discussion Papers of the Department of Economics are intended to make the initial results

### Introduction to Algorithms Review information for Prelim 1 CS 4820, Spring 2010 Distributed Wednesday, February 24

Introduction to Algorithms Review information for Prelim 1 CS 4820, Spring 2010 Distributed Wednesday, February 24 The final exam will cover seven topics. 1. greedy algorithms 2. divide-and-conquer algorithms

### Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras

Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture No. # 31 Recursive Sets, Recursively Innumerable Sets, Encoding

### Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

### POLYNOMIAL RINGS AND UNIQUE FACTORIZATION DOMAINS

POLYNOMIAL RINGS AND UNIQUE FACTORIZATION DOMAINS RUSS WOODROOFE 1. Unique Factorization Domains Throughout the following, we think of R as sitting inside R[x] as the constant polynomials (of degree 0).

### Intersection Dimension and Maximum Degree

Intersection Dimension and Maximum Degree N.R. Aravind and C.R. Subramanian The Institute of Mathematical Sciences, Taramani, Chennai - 600 113, India. email: {nraravind,crs}@imsc.res.in Abstract We show

### it is easy to see that α = a

21. Polynomial rings Let us now turn out attention to determining the prime elements of a polynomial ring, where the coefficient ring is a field. We already know that such a polynomial ring is a UF. Therefore

### 1 if 1 x 0 1 if 0 x 1

Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

### Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Kousha Etessami U. of Edinburgh, UK Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 6) 1 / 13 Overview Graphs and Graph

### Institut für Informatik Lehrstuhl Theoretische Informatik I / Komplexitätstheorie. An Iterative Compression Algorithm for Vertex Cover

Friedrich-Schiller-Universität Jena Institut für Informatik Lehrstuhl Theoretische Informatik I / Komplexitätstheorie Studienarbeit An Iterative Compression Algorithm for Vertex Cover von Thomas Peiselt

### Complexity and Completeness of Finding Another Solution and Its Application to Puzzles

yato@is.s.u-tokyo.ac.jp seta@is.s.u-tokyo.ac.jp Π (ASP) Π x s x s ASP Ueda Nagao n n-asp parsimonious ASP ASP NP Complexity and Completeness of Finding Another Solution and Its Application to Puzzles Takayuki

### Diagonalization. Ahto Buldas. Lecture 3 of Complexity Theory October 8, 2009. Slides based on S.Aurora, B.Barak. Complexity Theory: A Modern Approach.

Diagonalization Slides based on S.Aurora, B.Barak. Complexity Theory: A Modern Approach. Ahto Buldas Ahto.Buldas@ut.ee Background One basic goal in complexity theory is to separate interesting complexity

### Foundations of Artificial Intelligence

Foundations of Artificial Intelligence 7. Propositional Logic Rational Thinking, Logic, Resolution Wolfram Burgard, Bernhard Nebel and Martin Riedmiller Albert-Ludwigs-Universität Freiburg Contents 1 Agents