Non Uniform Generation of Combinatorial Objects

Size: px
Start display at page:

Transcription

1 Non Uniform Generation of Combinatorial Objects Frank Weinberg and Markus E. Nebel Fachbereich Informatik, Technische Universität Kaiserslautern, Gottlieb-Daimler-Straße 48, D Kaiserslautern, Germany {f Abstract In this article we present a method to generate random objects from a large variety of combinatorial classes according to a given distribution. Given a description of the combinatorial class and a set of sample data our method will provide an algorithm that generates objects of size n in worst-case runtime O(n 2 ) (O(n log(n)) can be achieved at the cost of a higher average-case runtime), with the generated objects following a distribution that closely matches the distribution of the sample data. 1 Introduction When testing an algorithm one is often confronted with the problem of generating suitable input data. Since the possible inputs can usually be modeled as combinatorial classes the desire for procedures that generate the objects from these classes arises. It is however usually neither feasible to test algorithms with all possible inputs nor does one know which of the possible inputs are the most interesting ones. Hence it is common practice to test with random input data. This immediately leads to the problem of generating random objects from a combinatorial class, usually of a preliminarily fixed size in order to get results which can be easily compared with each other. Because of its importance this problem has been studied for quite some time and many results have been achieved both in terms of procedures for specific classes (see [3] or [15] for examples,) as well as more general approaches ([9], [10]). A restriction common to all of these procedures is that they do not allow to specify a distribution among the objects of the considered combinatorial class, i.e. objects are generated according to a uniform distribution. This usually is no problem as long as only the correctness of the tested algorithm is considered. If, however, we want to do simulations to gather information about reallife behaviour of our algorithms (e.g. concerning runtime) we need to generate objects according to distributions observed on real-life data. For those combinatorial classes which can be described by context-free grammars there exist algorithms to train the distribution of a given set of objects into the grammar and use the resulting stochastic context-free grammar to generate more data with a distribution similar to the one of the sample data. However, they do not allow the user to fix the length of these sequences. ([13]) In this paper we will present a method to combine the advantages of both approaches, that is to generate objects of a previously fixed length according to the distribution of a given set of sample data. The general proceeding of our method is in two steps: First we equip the productions of a context-free grammar with probabilities such that the induced distribution on the generated language models the distribution of the sample data as closely as possible. Methods for doing so are well known; one of them is presented in Section 2.2. After being trained this way, the (now weighted) context-free grammar is translated into an admissible specification. Admissible specifications, presented in Section 2.1 and extended for our purpose in Section 3.1, are a framework for specifying combinatorial classes, that immediately 1

2 yields generating algorithms for the specified classes. The translation is presented in Sections 3.2 and Known Results and Basic Definitions 2.1 Random Generation In order to generate objects from combinatorial classes we wil use unranking algorithms, that is, algorithms that, given k and n, generates the k-th object of size n from the class according to a previously fixed order. Obviously by drawing k at random this can be used to generate random objects. We do this by extending an approach introduced by Flajolet, Zimmermann and Van Cutsem in [4] for the (uniform) random generation and adapted to the problem of unranking by Martínez and Molinero. Their results are collected in [10]. The basic principle of the approach is to decompose a combinatorial class into simpler classes (or, viewed the other way around, construct more complex classes from simpler ones) by using so-called admissible constructions. Formally: Definition 2.1 ([5]). Objects of size 0 are called neutral objects or tags, a class consisting of a single neutral object ɛ is called a neutral class. We will denote neutral classes by E (E, E 1, E 2,... if we need to distinguish multiple neutral classes containing the objects, ɛ 1, ɛ 2,... respectively). A class consisting of a single object of size 1 is called an atomic class. We will denote atomic classes by Z (Z a, Z b,... to distinguish the classes containing the atoms a, b,...). Definition 2.2 ([5]). The counting sequence of a combinatorial class A is the sequence of integers (a n ) n 0 where a n = card(a n ) is the number of objects in class A that have size ( = number of atoms) n. Definition 2.3 ([5]). Assume that Φ is a construction that associates to a finite collection of classes B, C,... a new class A := Φ[B, C,...], in a finitary way: each A n depends on finitely many of the {B i }, {C j },... Then Φ is admissible iff the counting sequence (a n ) of A only depends on the counting sequences (b i ), (c j ),... of B, C,... In order to represent context-free grammars we will need two well-known admissible constructions. Definition 2.4 ([5]). If A 1,..., A k are combinatorial classes and ɛ 1,..., ɛ k are neutral objects, the combinatorial sum or disjoint union A A k of A 1,..., A k is defined as A A k := (E 1 A 1 )... (E k A k ), where denotes set theoretic union. If A 1 and A 2 are combinatorial classes, the cartesian product A 1 A 2 of A 1 and A 2 is defined as A 1 A 2 := {(α 1, α 2 ) α i A i }. The size (α 1, α 2 ) is defined to be α 1 + α 2. Now to be able to come up with unranking algorithms we need to specify an ordering on the objects of the same size that belong to the considered combinatorial class. We will do this recursively, following the structure of the class admissible specification: Definition 2.5 ([10]). Neutral and atomic classes contain only one element, hence there is only one possible ordering. 2

3 If B = A A k and β, β B n β < B n β (β (A i ) n and β (A j ) n and i < j) or (β, β (A i ) n and β < (Ai) n β ). If B = A 1 A 2 and β = (α 1, α 2 ), β = (α 1, α 2) B n β < B n β α 1 < α 1 or (j = α 1 = α 1 and α 1 < (A1) j α 1) or (α 1 = α 1 and α 2 < (A2) n j α 2). The order used above for products is called lexicographic order. It leads to unranking algorithms with a worst-case time complexity in O(n 2 ), n the size of the objects to be generated. This worstcase complexity can be improved to O(n log n) by using the boustrophedonic order at the cost of increasing average-case runtime. Details can be found in [10]. Now the actual unranking algorithms are quite straightforward: For neutral and atomic classes we simply return the single object from the class, if the parameters of the call are correct ([10]). Given a disjoint union we first determine the correct subclass by succesively adding up their sizes until this sum would exceed the given rank and then compute the rank among this class by subtracting the sizes of the precedent classes (Algorithm 1). Algorithm 1 Unranking of disjoint unions ([10]) object function unrank(a A n, n: size, i: rank) c := 0; j := 1; d := size(a j, n); while c + d i do c := c + d; j := j + 1; d := size(a j, n); end while return unrank(a j, n, i c); Note 2.1. We do not return the tags that would be added according to Definition 2.4, since this behaviour is usually more convenient in practice and if one actually needs the tags, they can easily be added by changing the specification accordingly. In cartesian products the objects are primarily ordered according to the sizes of their constituent objects, hence we first determine these sizes in the same way we determined the correct subclass in unions. Having found these sizes and the rank among all objects with the same sizes, the actual ranks in the subcalsses can easily be determined using div and mod (Algorithm 2). The sizes required for the algorithms above can be computed using the following recursion: 1 A is atomic and n = 1, 0 A is atomic and n 1, 1 A is neutral and n = 0, size(a, n) := (1) 0 A is neutral and n 0, k i=1 size(b i, n) A = B B k, size(b, i) size(c, n i) A = B C. n i=0 3

4 Algorithm 2 Unranking of cartesian products ([10]) object function unrank(a B, n: size, i: rank) c := 0; j := 0; d := size(a, j) size(b, n j); while c + d i do c := c + d; j := j + 1; d := size(a, j) size(b, n j); end while i := i c; b := size(b, n j); α := unrank(a, j, i div b); β := unrank(b, n j, i mod b); return (α, β); 2.2 Weighted Context-Free Grammars Combinatorial classes can be modeled as formal languages. We will use this analogy in Section 3 to transfer results on how to train weights for formal languages onto combinatorial classes. Hence we will introduce the required definitions and results. We assume the reader has basic knowledge of the notions concerning context-free languages. An introduction can be found in [7]. Definition 2.6 ([11]). A weighted context-free grammar (WCFG) is a 5-tuple G = (I, T, R, S, W ), where I (resp. T ) is an alphabet (finite set) of intermediate (resp. terminal) symbols (I and T are disjoint), S I is called axiom, R I (I T ) is a finite set of production rules and W : R R + is a mapping such that each rule f R is equipped with a weight w f := W (f). In the sequel we will write A w f α instead of f = (A, α) R, w f = W (f). If G is a WCFG, G is a stochastic context-free grammar (SCFG) iff the additional restrictions f R : W (f) (0, 1], A I : w f = 1, f R,Q(f)=A hold, where Q(f) denotes the premise of the production f, that is, the first component A of a production rule (A, α) R. (We will call the second component the conclusion of (A, α).) Words are generated as for usual context-free grammars, the product of the weights of the used production rules gives the weight of a parse tree, the sum of the weights of all possible full-parse trees of a word w gives the weight of w. We will write W ( ) resp. W (w) for these weights. Definition 2.7. If G, G are WCFGs, G and G are said to be word-equivalent iff L(G) = L(G ) and for each word w L(G) : W (w) = W (w). If additionally there exists a bijection f from the full-parse trees of w in G to the full-parse trees of w in G such that for each tree : W ( ) = W (f( )) we will call G and G (derivation-)equivalent. A WCFG G is called loop-free, iff there exists no nonempty derivation A + A, A I. It is called ɛ-free, iff (A, ɛ) R, A S and (A, α 1 Sα 2 ) R, where ɛ denotes the empty word. The following result shows that given a CFG G and a set of full-parse trees to that grammar distributed according to some unknown distribution of L(G) we can model the distribution by setting the weights on the grammar to relative frequencies with which the productions occur in the set of parse trees. This is based on the concept of maximum likelihood as introduced by Fisher in ([1]). 4

5 Definition 2.8 ([14]). For a given treebank, i.e., for a non-empty and finite corpus of full-parse trees, the treebank grammar G = (I, T, R, S, W ) is a SCFG defined by 1. I (resp. T ; R) consist of exactly the intermediate symbols (resp. terminal symbols; productions) appearing in the treebank; S is the root of all the full-parse trees, 2. p = (A, α) R : W (p) = f(r) p =(A,α ) P f(p ), where f(p) denotes the number of occurrences of p in the set of full-parse trees. 5

6 Theorem 2.1 ([14]). Let f t be a non-empty and finite corpus of full-parse trees and let G be the treebank grammar read off from f t. Then the weights induced on the full-parse trees by G are a maximum-likelihood estimate of the probability model of the unweighted context-free grammar G corresponding to G on f t. 3 Results 3.1 Weighting as an Admissible Construction In order to encode the distribution into the combinatorial class we will equip the objects from our combinatorial class with weights. For practical reasons we restrict ourselves to integral weights, modeling an object with weight λ as λ copies of the respective element. This can easily be achieved by introducing a new admissible construction: Definition 3.1. If A is a combinatorial class and λ is an integer, the weighting of A by λ, λa, is defined as λa := A A. } {{ } λ times We will call two objects from a combinatorial class copies of the same object iff they only differ in the tags added by weighting operations. In order to unrank (resp. count) the objects of a class whose specification involves weighting, we could replace any weighting constructions by unions according to Definition 3.1, and use Algorithm 1 (resp. equation (1)). It is, however, especially if λ is large, more convenient to use a specialised algorithm, that takes into account that all the classes united are equal (Algorithm 3). For the size we find: size(λa, n) = λ size(a, n). Algorithm 3 Unranking of weighted classes object function unrank(λa, n: size, i: rank) return unrank(a, n, i mod size(a, n)); Since weighting a class can be replaced by a disjoint union the results from [10] concerning worst-case time complexity hold for classes whose specification involves weighting, meaning the unranking algorithms generated will have a runtime in O(n 2 ). To conclude the introduction of this new construction we will have a look at the behaviour of weighted classes with respect to union and product as introduced in Section 2.1: Lemma 3.1. If A 1,..., A n, B are combinatorial classes, B = A A n and A i contains λ copies of an object α, then B contains λ copies of (ɛ i, α). If A 1,..., A n, B are combinatorial classes, B = A 1... A n and each A i contains λ i copies of an object α i, then B contains n i=1 λ i copies of β = (α 1,..., α n ). Proof. Immediate from Definition Reweighting WCFGs Since the weighing of classes allows only integral weights while the training as described in Section 2.2 introduces rational weights we have to reweight the grammars before we can transform them into admissible secifications. As we will see in Section 3.3, the combinatorial objects we want to unrank in the end correspond to the full-parse trees in our grammar, objects of the same size corresponding to full-parse trees of words of the same length. Hence we have to make sure, that the relative weights of full-parse trees 6

7 of words of the same length do not change, in order to guarantee, that the resulting unranking algorithms will produce the objects according to the distribution trained into the grammar. If all productions are scaled by the same factor c, the weight of a derivation will scale by c k, k being the number of steps in the derivation. Hence if all derivations of words of the same length involve the same number of steps, this operation will maintain the relative weights of these words. This is e.g. the case, if the grammar is in Chomsky normal form: Theorem 3.2. Let G be a WCFG in Chomsky normal form with all the weights appearing in G being rational; let c be the common denominator of the weights appearing in G, and G be derived from G by multiplying all weights by c. Then all the weights in G are integral and for w 1, w 2 L(G), w 1 = w 2 and i a full-parse tree of w i, i {1, 2}, in G and G holds. W ( 1 ) W ( 2 ) = W ( 1) W ( 2 ) Proof. The first property follows immediately from the definition of c. If w i = 0, i {1, 2}, then w 1 = w 2 = ɛ and 1 = 2 consists of the single rule S ɛ. Thus both fractions evaluate to 1 and the second property holds. Now let w i 1, i {1, 2}. Since G is in Chomsky normal form, i involves w i 1 applications of productions of the form A BC, A, B, C I, and w i applications of productions of the form A a, A I, a T. Thus W ( i ) = W ( i ) c 2 wi 1, i {1, 2}, and the second property follows immediately. If we are given a non-ambiguous SCFG, we can preprocess it in order to be able to use Theorem 3.2: Theorem 3.3 ([8]). If G is a SCFG, there exists a SCFG G in Chomsky normal form that is word-equivalent to G, and G can be effectively constructed from G. The construction given in [8] assumes that G is ɛ-free. It can however be extended to non-ɛ-free grammars by adding an additional step after the intermediate grammar G 1 has been created: For each production A λ1 ɛ remove this production and for each instance of A in the conclusion of a production B λ2 β 1 Aβ 2 add a new rule B λ1 λ2 β 1 β 2. If A occurs multiple times in a single conclusion, there has to be one new production for each combination of As removed, the weight being λ 2 λ l 1 if l instances of A are deleted. In principle the former two theorems already suffice for many practical purposes. However the transformation of Theorem 3.3 provides only word-equivalence while in some cases derivationequivalence is needed. Hence we will present another method for reweighting. Instead of scaling all productions by the same factor and ensuring that derivations are of equal length for words of the same size, we will scale the productions according to how much they contribute to the length of the word, that is, productions lengthening the word by k will be scaled by c k. Since we consider context-free grammars, the lengthening of a production is given by length of conclusion 1. However, this rule leads to productions with a conclusion of length 1 not being reweighted, hence we have to assure, that all those productions already have integral weights in order to get a grammar with only integral weights after reweighting. Additionally productions with a conclusion of length 0, i.e. ɛ-productions, would be rescaled by 1 c and thus they have to be avoided completely. Obeying these restrictions would however make it impossible to generate words of length 0 or 1 with appropriate weights. Thus, to circumvent this problem, we will modify the required structure of the grammar a bit: 7

8 The axiom does not occur in any conclusion and any production having the axiom as premise has a conclusion of length 0 or 1. All other productions obey the restrictions stated above. This assures, that each derivation includes exactly one application of a production having the axiom as premise and, unless the word generated is the empty word, this application doesn t influence the length of the sentential form. Thus these productions can be reweighted seperately. Formally: Definition 3.2. If G = (I, T, R, S, W ) is a WCFG, G is said to be in reweighting normal form (RNF) iff 1. G is loop-free and ɛ-free; 2. A α R, A = S : α 1; 3. A α R, A S : α > 1 or W (A α) N; 4. A I α (I T ) : A α R. The last condition, that any intermediate symbol occurs as premise of at least one production, is not required for reweighting. It is, however, necessary for the translation of the grammar into an admissible specification later on. Theorem 3.4. Let G be a WCFG in RNF with all the weights appearing in G being rational; let s be the common denominator of the weights of productions with premise S in G, c be the common denominator of the weights of the remaining productions and G be derived from G by multiplying the weights of productions with source S by s, and by multiplying the weights of productions A α, A S by c α 1. Then all the weights in G are integral and for w 1, w 2 L(G), w 1 = w 2 and i a full-parse tree of w i, i {1, 2}, in G and G holds. W ( 1 ) W ( 2 ) = W ( 1) W ( 2 ) Proof. Since A α R, A S implies α > 1 the first property follows immediately from the definition of s and c. If w i = 0, i {1, 2}, then w 1 = w 2 = ɛ and 1 = 2 consists of the single rule S ɛ. Thus both fractions evaluate to 1 and the second property holds. Now let w i 1, i {1, 2}. Since G is in RNF i involves 1 application of a production of the form S α, α = 1 and k applications of productions of the form A i,j α i,j, A i,j S. Thus k j=1 ( α i,j 1) = w i, i {1, 2}, and we find W ( i ) = W ( i ) s n j=1 c αj 1 = W ( i ) s c wi, i {1, 2}, and the second property follows immediately. Now, what remains to be done in this section is to transform arbitrary WCFGs to equivalent grammars in reweighting normal form. The proof of the following theorem describes how this can be done. Theorem 3.5. If G is a loop-free, ɛ-free WCFG, there exists a WCFG G in RNF that is equivalent to G and G can be effectively constructed from G. Proof. Let CH denote the set of all quadruples (A, A 1 A 2... A n, λ 0 λ 1... λ n, B) with A, A 1,..., A n I, B I T, λ i R such that A λ0 λ A 1 λ 1... An n B (for n = 0, we get (A, ɛ, λ, B) CH whenever A λ B is a rule with B I T and n = 1 results in (A, ɛ, 1, A) CH for A N). Since G is loop-free, CH is finite. Now define G as follows: G contains the same terminals as G. The nonterminals are those from G, a new start symbol S and the set CH. The rules are as follows: (A, γ, λ, B) 1 B for (A, γ, λ, B) CH, 8

9 A λ0λ1 λn w 0 (A 1, γ 1, λ 1, B 1 )w 1 (A 2, γ 2, λ 2, B 2 )... (A n, γ n, λ n, B n )w n for A λ0 α = w 0 A 1 w 1 A 2... A n w n a rule from G with α 2, w i T and A i I, S λ (S, γ, λ, a) for (S, γ, λ, a) CH, S λ ɛ, if S λ ɛ is a rule in G. From this construction it is obvious that G satisfies conditions 1 3 of Definition 3.2. Condition 4 can be obtained by removing all nonterminals that violate it along with the productions containing them. Since these symbols and productions can not appear in any full-parse tree this removal does not influence derivation equivalence. Since CH can easily be computed from G, G can be constructed effectively. It remains to be shown that G and G are actually derivation-equivalent. Claim. For each full-parse tree : S w, w T in G there exists a full-parse tree : S w in G with W ( ) = W ( ). Proof. If w = ɛ then consists of the single rule S λ λ ɛ and consisting of the single rule S ɛ is a full-parse tree for w in G with W ( ) = W ( ). If w = 1, consists of the rules S λ 1 (S, γ, λ, a) and (A, γ, λ, a) a and consisting of the rules that led to the inclusion of (S, γ, λ, a) in CH is a full-parse tree for w in G with W ( ) = W ( ). Otherwise consists of one instance of the rule S 1 λ S and several groups of rules A 0λ 1 λ n w 0 (A 1, γ 1, λ 1, B 1 )w 1 (A 2, γ 2, λ 2, B 2 )... (A n, γ n, λ n, B n )w n, (A i, γ i, λ i, B i ) 1 B i. For each of these groups there is in G a production A λ0 α = w 0 A 1 w 1 A 2... A n w n and if A i B i the set of productions that led to the inclusion of (A i, γ i, λ i, B i ) in CH. Additionally by construction of CH and G the product of weights is the same for both groups. Thus by combining the groups of productions from G derived from the groups of productions in we get a full-parse tree for w in G with W ( ) = W ( ). Claim. For each full-parse tree : S w, w T in G there exists a full-parse tree : S w in G with W ( ) = W ( ). Proof. By applying the argumentation from the above proof in reverse. Together the proofs of the two claims give the bijection required for derivation-equivalence. Neither of the requirements in Theorem 3.5 may be dropped without losing derivation-equivalence: Concerning ɛ-freeness consider the set of productions {S ɛ, S A, A ɛ} where ɛ L(G) has two different full-parse trees while in an ɛ-free grammar ɛ can only have the single derivation S ɛ. Concerning loop-freeness note that if A + A is a loop in G and A appears in a derivation of w L(G), w has infinitely many full-parse trees in G while in a loop-free grammar any word w has only finitely many parse trees. To transform a grammar to a word-equivalent loop-free, ɛ-free grammar, the ideas used in the proof of Theorem 3.3 can be applied. Note, however, that eliminating loops with a weight 1, will result in infinite weights. 9

10 3.3 Transforming WCFGs into Admissible Specifications To get the transformation from WCFGs to admissible specifications, we note how the elements of grammars can be modeled by elements of specifications: The empty word is modeled as neutral class. Terminal symbols are modeled as atomic classes, containing the respective symbol. Intermediate symbols are modeled as classes that contain all words that can be derived from the symbol. Productions are modeled as classes containing the cartesian product of all symbols of the conclusion. Weights of productions are modeled by weighting the respective classes. Multiple productions with the same source are modeled as disjoint union of the classes modeling the (weighted) productions. Formalizing this, we get: Theorem 3.6. Let G = (I, T, R, S, W ) be a WCFG with integral weights. For each A I let {A λ A,j α A,j,1... α A,j,nA,j 1 j n A } denote the complete set of productions with premise A, n A being the number of such productions. Let T be constructed as follows: For each a T add an atomic class Z a. For each A I add a class A. For each A λ A,j α A,j,1... α A,j,nA,j G, 1 j n A, add a class A j and an equation A j = β A,j,1... β A,j,nA,j where β A,j,k = Z a if α A,j,k = a T and β A,j,k = B if α A,j,k = B I. If n A,j = 0 add the equation A j = E instead of A j = β A,j,1... β A,j,nA,j. For each A I add an equation A = λ A,1 A A, λ A,nA A A,nA. Then T is an admissible specification and for each w L(G) there exists a bijection f between the full-parse trees of w in G and those objects in S w that contain the atomic classes Z a1... Z a w in the same order as the terminal symbols a i appear in w such that S w contains W ( ) copies of f( ). Proof. The first property follows immediately from the construction of T. Considering the second property we will first show the following lemma: Lemma 3.7. Let T be constructed as in Theorem 3.6, = A w, w L((I, T, R, A, W )), λ = W ( ). Then there are λ copies of an object o A w that contains the atomic classes Z a1... Z a w in the same order as the terminal symbols a i appear in w. Proof. Let n denote the number of steps in the derivation of w, A µ α 1... α k be the first production used in, B be the class introduced for this production and ɛ B be the tag associated with B in the equation for A. n = 1: Then λ = µ and w = α 1... α k and by construction of T : (α 1,..., α k ) B and thus 1 i λ : (ɛ B, (ɛ i, (α 1,..., α k ))) A where the ɛ i are introduced during the weighting operation λb. Hence the Lemma holds. 10

11 n > 1: Assume the Lemma holds for all derivations of lengths < n. The derivations α i wi, where w i T, w = w 1... w k, all have lengths < n hence if α i I we may set o i to the object corresponding to the derivation α i wi. If α i T we set o i = α i. Then by construction of T : (o 1,..., o k ) B and thus 1 i λ : (ɛ B, (ɛ i (o 1,..., o k ))) A where the ɛ i are introduced during the weighting operation λb. Using the induction hypothesis and Lemma 3.1 the Lemma follows. It remains to be shown that conversely each object o S models a derivation: From the construction of T, we get that each o A, A added for A I, must be of the form o = (ɛ j, (ɛ l, (o 1,..., o k ))), where each of the o i, 1 i k, is either a terminal symbol or o i B i, B i added for B i I and o i is smaller than o. Hence assuming that for smaller o i there exist corresponding derivations, we get a derivation corresponding to o by concatenating A α 1... α k, α i = o i if o i T, α i = B i if o i B i with the derivations for those o i not in T. The property follows by induction. It is easy to verify that the above transformation can also be applied inversely to get a CFG from an admissible specification involving only disjoint unions and cartesian products. Additionally the admissible construction of sequence generation can also be transformed into productions of a CFG, translating R = SEQ(S) into R ɛ RS. 4 Example of Application As a simple example we have used our method to generate random RNA secondary structures. These are typically modelled as correctly parenthesized words over the alphabet {[,, ]}, where corresponding brackets represent bases paired with each other and denotes an unpaired base. (A more thorough introduction to the topic can be found in [2].) Using a simple grammar for these words and training it using the data from [16] we arrived at the following: 1 : S R, 0.31 : R T, 0.69 : T, 0.69 : R T R, 0.31 : T [R]. Since this grammar is loop-free and ɛ-free, we can use the procedure from Theorem 3.5 to transform it to RNF. We find CH = {(S, ɛ, 1, S), (S, ɛ, 1, R), (S, R, 0.31, T ), (S, RT, 0.21, ), (R, ɛ, 1, R), (R, ɛ, 0.31, T ), (R, T, 0.21, ), (T, ɛ, 1, T ), (T, ɛ, 0.69, )} and thus after reducing get the new grammar 1 : S (S, ɛ, 1, R), 0.69 : R (T, ɛ, 1, T )(R, ɛ, 1, R), 0.31 : S (S, R, 0.31, T ), 0.21 : R (T, ɛ, 1, T )(R, ɛ, 0.31, T ), 0.31 : S (S, RT, 0.31, ), 0.14 : R (T, ɛ, 1, T )(R, T, 0.21, ), 0.31 : T [(R, ɛ, 1, R)], 0.48 : R (T, ɛ, 0.69, )(R, ɛ, 1, R), 0.10 : T [(R, ɛ, 0.31, T )], 0.15 : R (T, ɛ, 0.69, )(R, ɛ, 0.31, T ), 0.07 : T [(R, T, 0.21, )], 0.10 : R (T, ɛ, 0.69, )(R, T, 0.21, ), 1 : (A, γ, λ, B) B, (A, γ, λ, B) CH, B S. Since replacing each of the (A, γ, λ, B) by B in the conclusions does not lead to two of the rules becoming identical, this transformation is derivation equivalent for our grammar and gives the more readable version 1 : S R, 0.69 : R T R, 0.48 : R R, 0.31 : T [R], 0.31 : S T, 0.21 : R T T, 0.15 : R T, 0.10 : T [T ], 0.31 : S, 0.14 : R T, 0.10 : R, 0.07 : T [ ]. 11

12 Applying Theorem 3.4 we find the smallest common denominators to be s = 100 and c = 100 and thus the reweighted set of productions is: 1 : S R, 69 : R T R, 48 : R R, 3100 : T [R], 31 : S T, 21 : R T T, 15 : R T, 1000 : T [T ], 31 : S, 14 : R T, 10 : R, 700 : T [ ]. Applying Theorem 3.6, we get the following specification: S 1 = R, R 1 = T R, R 4 = Z R, T 1 = Z [ R Z ], S 2 = T, R 2 = T T, R 5 = Z T, T 2 = Z [ T Z ], S 3 = Z, R 3 = T Z, R 6 = Z Z, T 3 = Z [ Z Z ], S = 100S S S 3, R = 69R R R R R R 6, T = 3100T T T 3. We have implemented the resulting algorithm as well as an algorithm for uniform generation of RNA secondary structures in the computer algebra system Maple and used these implementations to generate 1000 objects of size 150 with each of the algorithms. The resulting ratios of unpaired bases are shown in Table 1. Structures Ratio of unpaired bases Generated unweighted 45,1% Genrated weighted 53,2% From database 52,2% Table 1: Ratio of unpaired bases for the generated structures. As was to be expected, the structures generated with the weighted algorithm closely match the structures in the database concerning this parameter. Concerning other parameters, such as the free energy of the molecules, the results generated by our weighted unranking algorithm aren t as good. This is due to the fact, that the grammar we started with is too simple (i.e. it has too few free variables) to model these parameters and hence there is no way to train their properties into this grammar. This problem can however be circumvented by using a more sophisticated grammar. In [12] such a grammar is given for RNA secondary structures and it is shown that applying our approach to this grammar yields an algorithm that generates secondary structures with free energies distributed very similar to the distribution observed for real RNA molecules. 5 Concluding Remarks In this paper we have shown, how to get algorithms for the random generation of combinatorial objects of a fixed size according to an arbitrary distribution, given a context-free grammar that models the class or an admissible specification of the class that can be transformed into such a grammar and given the distribution either as weights on the grammar or as a set of sample objects, distributed accordingly. We have done so by extending the popular framework of admissible specifications of combinatorial classes to also include the specification of weights, allowing for the generic creation of algorithms that generate objects from combinatorial classes according to a non-uniform distribution and by furthermore showing, how weighted context-free grammars can be transformed into admissible specifications, hence allowing to use the well-known procedures for training the weights on context-free grammars to be used for the training of weights in admissible specifications. However, we had to restrict ourselves to use only integral weights. Hence the weights returned by the training algorithms had to be rescaled, which in turn required a rather complex transformation of the grammar. 12

13 This transformation is however, as well as all other steps involved, easily automateable. Thus it appears to be a sensible task, to develop a tool that, given a specification or grammar and a set of sample objects, generates the corresponding unranking algorithm. The algorithms presented generate objects of size n with a worst-case runtime in O(n 2 ). In their work [6], Flajolet, Fusy and Pivoteau show how algorithms for the uniform random generation of combinatorial objects that have a worst-case runtime linear in the size of the object generated can be automatically created from admissible specifications if one weakens the restriction for the algorithms to create objects of exact size n to the creation of objects with expected size n. We believe it to be possible to combine their results with the results presented here in order to get algorithms for the weighted generation that yield a runtime linear in the size of the created objects. It remains to be seen, however, how exactly this can be achieved. References [1] John Aldrich. R. A. Fisher and the Making of Maximum Likelihood Statistical Science, 12(3): , [2] Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological sequence analysis. Cambridge University Press, [3] M. C. Er. Enumerating ordered trees lexicographically. The Computer Journal, 28(5): , [4] P. Flajolet, P. Zimmermann, and B. V. Cutsem. A calculus for the random generation of combinatorial structures. Theoretical Computer Science, 132(1-2):1 35, [5] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, [6] Philippe Flajolet, Éric Fusy, and Carine Pivoteau. Boltzmann sampling of unlabelled structures Extended abstract. [7] Michael A. Harrison. Introduction to Formal Language Theory. Addison-Wesley, [8] T. Huang and K.S. Fu. On stochastic context-free languages. Inf. Sc., 3: , [9] Jens Liebehenschel. Lexikographische Generierung, Ranking und Unranking kombinatorischer Objekte: Eine Average-Case Analyse. PhD thesis, Johann Wolfgang Goethe-Universität Frankfurt am Main, [10] Xavier Molinero. Ordered Generation of Classes of Combinatorial Structures. PhD thesis, Universitat Politècnica de Catalunya, [11] Markus E. Nebel. On a Statistical Filter for RNA Secondary Structures. Technical report, Frankfurter Informatik-Berichte, [12] Markus E. Nebel and Anika Scheid. Random generation of rna secondary structures according to native distributions. Submitted, [13] Yann Ponty, Michel Termier, and Alain Denise. GenRGenS: software for generating random genomic sequences and structures. Bioinformatics, 22(12): , [14] Detlef Prescher. A tutorial on the expectation-maximization algorithm including maximum-likelihood estimation and em training of probabilistic context-free grammars. prescher/papers/bib/2003em.prescher.pdf, [15] F. Ruskey. Generating t-ary trees lexicographically. SIAM J. Comp., 7(4): , [16] J. Wuyts, P. De Rijk, Y. Van de Peer, T. Winkelmans, and R. De Wachter. The european large subunit ribosomal rna database. Nucleic Acids Res., 29: ,

CHAPTER 7 GENERAL PROOF SYSTEMS

CHAPTER 7 GENERAL PROOF SYSTEMS 1 Introduction Proof systems are built to prove statements. They can be thought as an inference machine with special statements, called provable statements, or sometimes

Elementary Number Theory We begin with a bit of elementary number theory, which is concerned

CONSTRUCTION OF THE FINITE FIELDS Z p S. R. DOTY Elementary Number Theory We begin with a bit of elementary number theory, which is concerned solely with questions about the set of integers Z = {0, ±1,

Chapter 4, Arithmetic in F [x] Polynomial arithmetic and the division algorithm.

Chapter 4, Arithmetic in F [x] Polynomial arithmetic and the division algorithm. We begin by defining the ring of polynomials with coefficients in a ring R. After some preliminary results, we specialize

Chapter 7 Uncomputability

Chapter 7 Uncomputability 190 7.1 Introduction Undecidability of concrete problems. First undecidable problem obtained by diagonalisation. Other undecidable problems obtained by means of the reduction

Some facts about polynomials modulo m (Full proof of the Fingerprinting Theorem)

Some facts about polynomials modulo m (Full proof of the Fingerprinting Theorem) In order to understand the details of the Fingerprinting Theorem on fingerprints of different texts from Chapter 19 of the

The countdown problem

JFP 12 (6): 609 616, November 2002. c 2002 Cambridge University Press DOI: 10.1017/S0956796801004300 Printed in the United Kingdom 609 F U N C T I O N A L P E A R L The countdown problem GRAHAM HUTTON

Automata and Computability. Solutions to Exercises

Automata and Computability Solutions to Exercises Fall 25 Alexis Maciel Department of Computer Science Clarkson University Copyright c 25 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011

Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011 A. Miller 1. Introduction. The definitions of metric space and topological space were developed in the early 1900 s, largely

Notes on Complexity Theory Last updated: August, 2011. Lecture 1

Notes on Complexity Theory Last updated: August, 2011 Jonathan Katz Lecture 1 1 Turing Machines I assume that most students have encountered Turing machines before. (Students who have not may want to look

INTRODUCTORY SET THEORY

M.Sc. program in mathematics INTRODUCTORY SET THEORY Katalin Károlyi Department of Applied Analysis, Eötvös Loránd University H-1088 Budapest, Múzeum krt. 6-8. CONTENTS 1. SETS Set, equal sets, subset,

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

I GROUPS: BASIC DEFINITIONS AND EXAMPLES Definition 1: An operation on a set G is a function : G G G Definition 2: A group is a set G which is equipped with an operation and a special element e G, called

We give a basic overview of the mathematical background required for this course.

1 Background We give a basic overview of the mathematical background required for this course. 1.1 Set Theory We introduce some concepts from naive set theory (as opposed to axiomatic set theory). The

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

CHAPTER 5. Number Theory. 1. Integers and Division. Discussion

CHAPTER 5 Number Theory 1. Integers and Division 1.1. Divisibility. Definition 1.1.1. Given two integers a and b we say a divides b if there is an integer c such that b = ac. If a divides b, we write a

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

Integer roots of quadratic and cubic polynomials with integer coefficients

Integer roots of quadratic and cubic polynomials with integer coefficients Konstantine Zelator Mathematics, Computer Science and Statistics 212 Ben Franklin Hall Bloomsburg University 400 East Second Street

Mathematics Review for MS Finance Students

Mathematics Review for MS Finance Students Anthony M. Marino Department of Finance and Business Economics Marshall School of Business Lecture 1: Introductory Material Sets The Real Number System Functions,

Statistical RNA Secondary Structure Sampling Based on a Length-Dependent SCFG Model

Statistical RNA Secondary Structure Sampling Based on a Length-Dependent SCFG Model Anika Scheid and Markus E. Nebel Department of Computer Science, University of Kaiserslautern, P.O. Box 3049, D-67653

it is easy to see that α = a

21. Polynomial rings Let us now turn out attention to determining the prime elements of a polynomial ring, where the coefficient ring is a field. We already know that such a polynomial ring is a UF. Therefore

Full and Complete Binary Trees

Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

MA651 Topology. Lecture 6. Separation Axioms.

MA651 Topology. Lecture 6. Separation Axioms. This text is based on the following books: Fundamental concepts of topology by Peter O Neil Elements of Mathematics: General Topology by Nicolas Bourbaki Counterexamples

THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER

THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER JASPER DEANTONIO Abstract. This paper is a study of the Turing Degrees, which are levels of incomputability naturally arising from sets of natural numbers.

Linear Codes. Chapter 3. 3.1 Basics

Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

Continued Fractions. Darren C. Collins

Continued Fractions Darren C Collins Abstract In this paper, we discuss continued fractions First, we discuss the definition and notation Second, we discuss the development of the subject throughout history

CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e.

CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e. This chapter contains the beginnings of the most important, and probably the most subtle, notion in mathematical analysis, i.e.,

Testing LTL Formula Translation into Büchi Automata

Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

3. Equivalence Relations. Discussion

3. EQUIVALENCE RELATIONS 33 3. Equivalence Relations 3.1. Definition of an Equivalence Relations. Definition 3.1.1. A relation R on a set A is an equivalence relation if and only if R is reflexive, symmetric,

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS ASHER M. KACH, KAREN LANGE, AND REED SOLOMON Abstract. We construct two computable presentations of computable torsion-free abelian groups, one of isomorphism

Extension of measure

1 Extension of measure Sayan Mukherjee Dynkin s π λ theorem We will soon need to define probability measures on infinite and possible uncountable sets, like the power set of the naturals. This is hard.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

A Propositional Dynamic Logic for CCS Programs

A Propositional Dynamic Logic for CCS Programs Mario R. F. Benevides and L. Menasché Schechter {mario,luis}@cos.ufrj.br Abstract This work presents a Propositional Dynamic Logic in which the programs are

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

Formal Languages and Automata Theory - Regular Expressions and Finite Automata - Samarjit Chakraborty Computer Engineering and Networks Laboratory Swiss Federal Institute of Technology (ETH) Zürich March

This chapter is all about cardinality of sets. At first this looks like a

CHAPTER Cardinality of Sets This chapter is all about cardinality of sets At first this looks like a very simple concept To find the cardinality of a set, just count its elements If A = { a, b, c, d },

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions Marcel B. Finan Arkansas Tech University c All Rights Reserved First Draft February 8, 2006 1 Contents 25

CS 103X: Discrete Structures Homework Assignment 3 Solutions

CS 103X: Discrete Structures Homework Assignment 3 s Exercise 1 (20 points). On well-ordering and induction: (a) Prove the induction principle from the well-ordering principle. (b) Prove the well-ordering

Introduction to Automata Theory. Reading: Chapter 1

Introduction to Automata Theory Reading: Chapter 1 1 What is Automata Theory? Study of abstract computing devices, or machines Automaton = an abstract computing device Note: A device need not even be a

The Dirichlet Unit Theorem

Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if

A simple and fast algorithm for computing exponentials of power series

A simple and fast algorithm for computing exponentials of power series Alin Bostan Algorithms Project, INRIA Paris-Rocquencourt 7815 Le Chesnay Cedex France and Éric Schost ORCCA and Computer Science Department,

Cartesian Products and Relations

Cartesian Products and Relations Definition (Cartesian product) If A and B are sets, the Cartesian product of A and B is the set A B = {(a, b) :(a A) and (b B)}. The following points are worth special

P (A) = lim P (A) = N(A)/N,

1.1 Probability, Relative Frequency and Classical Definition. Probability is the study of random or non-deterministic experiments. Suppose an experiment can be repeated any number of times, so that we

9.2 Summation Notation

9. Summation Notation 66 9. Summation Notation In the previous section, we introduced sequences and now we shall present notation and theorems concerning the sum of terms of a sequence. We begin with a

MODULAR ARITHMETIC. a smallest member. It is equivalent to the Principle of Mathematical Induction.

MODULAR ARITHMETIC 1 Working With Integers The usual arithmetic operations of addition, subtraction and multiplication can be performed on integers, and the result is always another integer Division, on

CHAPTER 5. Product Measures

54 CHAPTER 5 Product Measures Given two measure spaces, we may construct a natural measure on their Cartesian product; the prototype is the construction of Lebesgue measure on R 2 as the product of Lebesgue

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

Local periods and binary partial words: An algorithm

Local periods and binary partial words: An algorithm F. Blanchet-Sadri and Ajay Chriscoe Department of Mathematical Sciences University of North Carolina P.O. Box 26170 Greensboro, NC 27402 6170, USA E-mail:

x if x 0, x if x < 0.

Chapter 3 Sequences In this chapter, we discuss sequences. We say what it means for a sequence to converge, and define the limit of a convergent sequence. We begin with some preliminary results about the

HOMEWORK 5 SOLUTIONS. n!f n (1) lim. ln x n! + xn x. 1 = G n 1 (x). (2) k + 1 n. (n 1)!

Math 7 Fall 205 HOMEWORK 5 SOLUTIONS Problem. 2008 B2 Let F 0 x = ln x. For n 0 and x > 0, let F n+ x = 0 F ntdt. Evaluate n!f n lim n ln n. By directly computing F n x for small n s, we obtain the following

COUNTING INDEPENDENT SETS IN SOME CLASSES OF (ALMOST) REGULAR GRAPHS

COUNTING INDEPENDENT SETS IN SOME CLASSES OF (ALMOST) REGULAR GRAPHS Alexander Burstein Department of Mathematics Howard University Washington, DC 259, USA aburstein@howard.edu Sergey Kitaev Mathematics

Chapter 10. Abstract algebra

Chapter 10. Abstract algebra C.O.S. Sorzano Biomedical Engineering December 17, 2013 10. Abstract algebra December 17, 2013 1 / 62 Outline 10 Abstract algebra Sets Relations and functions Partitions and

Degrees that are not degrees of categoricity

Degrees that are not degrees of categoricity Bernard A. Anderson Department of Mathematics and Physical Sciences Gordon State College banderson@gordonstate.edu www.gordonstate.edu/faculty/banderson Barbara

Factoring & Primality

Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

Catalan Numbers. Thomas A. Dowling, Department of Mathematics, Ohio State Uni- versity.

7 Catalan Numbers Thomas A. Dowling, Department of Mathematics, Ohio State Uni- Author: versity. Prerequisites: The prerequisites for this chapter are recursive definitions, basic counting principles,

Introduction to computer science

Introduction to computer science Michael A. Nielsen University of Queensland Goals: 1. Introduce the notion of the computational complexity of a problem, and define the major computational complexity classes.

EMBEDDING COUNTABLE PARTIAL ORDERINGS IN THE DEGREES

EMBEDDING COUNTABLE PARTIAL ORDERINGS IN THE ENUMERATION DEGREES AND THE ω-enumeration DEGREES MARIYA I. SOSKOVA AND IVAN N. SOSKOV 1. Introduction One of the most basic measures of the complexity of a

CHAPTER 3 Numbers and Numeral Systems

CHAPTER 3 Numbers and Numeral Systems Numbers play an important role in almost all areas of mathematics, not least in calculus. Virtually all calculus books contain a thorough description of the natural,

Kolmogorov Complexity and the Incompressibility Method

Kolmogorov Complexity and the Incompressibility Method Holger Arnold 1. Introduction. What makes one object more complex than another? Kolmogorov complexity, or program-size complexity, provides one of

University of Oslo MAT2 Project The Banach-Tarski Paradox Author: Fredrik Meyer Supervisor: Nadia S. Larsen Abstract In its weak form, the Banach-Tarski paradox states that for any ball in R, it is possible

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

Automata on Infinite Words and Trees

Automata on Infinite Words and Trees Course notes for the course Automata on Infinite Words and Trees given by Dr. Meghyn Bienvenu at Universität Bremen in the 2009-2010 winter semester Last modified:

Notes 11: List Decoding Folded Reed-Solomon Codes

Introduction to Coding Theory CMU: Spring 2010 Notes 11: List Decoding Folded Reed-Solomon Codes April 2010 Lecturer: Venkatesan Guruswami Scribe: Venkatesan Guruswami At the end of the previous notes,

CONTRIBUTIONS TO ZERO SUM PROBLEMS

CONTRIBUTIONS TO ZERO SUM PROBLEMS S. D. ADHIKARI, Y. G. CHEN, J. B. FRIEDLANDER, S. V. KONYAGIN AND F. PAPPALARDI Abstract. A prototype of zero sum theorems, the well known theorem of Erdős, Ginzburg

C H A P T E R Regular Expressions regular expression

7 CHAPTER Regular Expressions Most programmers and other power-users of computer systems have used tools that match text patterns. You may have used a Web search engine with a pattern like travel cancun

Notes on Determinant

ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory

Structural properties of oracle classes

Structural properties of oracle classes Stanislav Živný Computing Laboratory, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK Abstract Denote by C the class of oracles relative

How many numbers there are?

How many numbers there are? RADEK HONZIK Radek Honzik: Charles University, Department of Logic, Celetná 20, Praha 1, 116 42, Czech Republic radek.honzik@ff.cuni.cz Contents 1 What are numbers 2 1.1 Natural

SOLUTIONS TO ASSIGNMENT 1 MATH 576

SOLUTIONS TO ASSIGNMENT 1 MATH 576 SOLUTIONS BY OLIVIER MARTIN 13 #5. Let T be the topology generated by A on X. We want to show T = J B J where B is the set of all topologies J on X with A J. This amounts

1 Introduction to Counting

1 Introduction to Counting 1.1 Introduction In this chapter you will learn the fundamentals of enumerative combinatorics, the branch of mathematics concerned with counting. While enumeration problems can

Fractions and Decimals

Fractions and Decimals Tom Davis tomrdavis@earthlink.net http://www.geometer.org/mathcircles December 1, 2005 1 Introduction If you divide 1 by 81, you will find that 1/81 =.012345679012345679... The first

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

3. Axioms of Set theory Before presenting the axioms of set theory, we first make a few basic comments about the relevant first order logic. We will give a somewhat more detailed discussion later, but

FUNCTIONAL ANALYSIS LECTURE NOTES: QUOTIENT SPACES

FUNCTIONAL ANALYSIS LECTURE NOTES: QUOTIENT SPACES CHRISTOPHER HEIL 1. Cosets and the Quotient Space Any vector space is an abelian group under the operation of vector addition. So, if you are have studied

The Halting Problem is Undecidable

185 Corollary G = { M, w w L(M) } is not Turing-recognizable. Proof. = ERR, where ERR is the easy to decide language: ERR = { x { 0, 1 }* x does not have a prefix that is a valid code for a Turing machine

o-minimality and Uniformity in n 1 Graphs

o-minimality and Uniformity in n 1 Graphs Reid Dale July 10, 2013 Contents 1 Introduction 2 2 Languages and Structures 2 3 Definability and Tame Geometry 4 4 Applications to n 1 Graphs 6 5 Further Directions

Classification of Cartan matrices

Chapter 7 Classification of Cartan matrices In this chapter we describe a classification of generalised Cartan matrices This classification can be compared as the rough classification of varieties in terms

Mathematical Induction. Lecture 10-11

Mathematical Induction Lecture 10-11 Menu Mathematical Induction Strong Induction Recursive Definitions Structural Induction Climbing an Infinite Ladder Suppose we have an infinite ladder: 1. We can reach

1 Sets and Set Notation.

LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

Universal hashing. In other words, the probability of a collision for two different keys x and y given a hash function randomly chosen from H is 1/m.

Universal hashing No matter how we choose our hash function, it is always possible to devise a set of keys that will hash to the same slot, making the hash scheme perform poorly. To circumvent this, we

On Integer Additive Set-Indexers of Graphs

On Integer Additive Set-Indexers of Graphs arxiv:1312.7672v4 [math.co] 2 Mar 2014 N K Sudev and K A Germina Abstract A set-indexer of a graph G is an injective set-valued function f : V (G) 2 X such that

3. Mathematical Induction

3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

ON THE COEFFICIENTS OF THE LINKING POLYNOMIAL

ADSS, Volume 3, Number 3, 2013, Pages 45-56 2013 Aditi International ON THE COEFFICIENTS OF THE LINKING POLYNOMIAL KOKO KALAMBAY KAYIBI Abstract Let i j T( M; = tijx y be the Tutte polynomial of the matroid

Integer Factorization using the Quadratic Sieve

Integer Factorization using the Quadratic Sieve Chad Seibert* Division of Science and Mathematics University of Minnesota, Morris Morris, MN 56567 seib0060@morris.umn.edu March 16, 2011 Abstract We give

You know from calculus that functions play a fundamental role in mathematics.

CHPTER 12 Functions You know from calculus that functions play a fundamental role in mathematics. You likely view a function as a kind of formula that describes a relationship between two (or more) quantities.

Quotient Rings and Field Extensions

Chapter 5 Quotient Rings and Field Extensions In this chapter we describe a method for producing field extension of a given field. If F is a field, then a field extension is a field K that contains F.

Pricing of Limit Orders in the Electronic Security Trading System Xetra

Pricing of Limit Orders in the Electronic Security Trading System Xetra Li Xihao Bielefeld Graduate School of Economics and Management Bielefeld University, P.O. Box 100 131 D-33501 Bielefeld, Germany

Computational complexity theory

Computational complexity theory Goal: A general theory of the resources needed to solve computational problems What types of resources? Time What types of computational problems? decision problem Decision

Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances

Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances Jialin Zhang Tsinghua University zhanggl02@mails.tsinghua.edu.cn Wei Chen Microsoft Research Asia weic@microsoft.com Abstract

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

13 Infinite Sets. 13.1 Injections, Surjections, and Bijections. mcs-ftl 2010/9/8 0:40 page 379 #385

mcs-ftl 2010/9/8 0:40 page 379 #385 13 Infinite Sets So you might be wondering how much is there to say about an infinite set other than, well, it has an infinite number of elements. Of course, an infinite

Degree Hypergroupoids Associated with Hypergraphs

Filomat 8:1 (014), 119 19 DOI 10.98/FIL1401119F Published by Faculty of Sciences and Mathematics, University of Niš, Serbia Available at: http://www.pmf.ni.ac.rs/filomat Degree Hypergroupoids Associated

Finite Automata and Formal Languages

Finite Automata and Formal Languages TMV026/DIT321 LP4 2011 Lecture 14 May 19th 2011 Overview of today s lecture: Turing Machines Push-down Automata Overview of the Course Undecidable and Intractable Problems

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy Kim S. Larsen Odense University Abstract For many years, regular expressions with back referencing have been used in a variety

Some Polynomial Theorems. John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.

Some Polynomial Theorems by John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.com This paper contains a collection of 31 theorems, lemmas,

Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

Reading 13 : Finite State Automata and Regular Expressions

CS/Math 24: Introduction to Discrete Mathematics Fall 25 Reading 3 : Finite State Automata and Regular Expressions Instructors: Beck Hasti, Gautam Prakriya In this reading we study a mathematical model