# Non Uniform Generation of Combinatorial Objects


Frank Weinberg and Markus E. Nebel

Fachbereich Informatik, Technische Universität Kaiserslautern, Gottlieb-Daimler-Straße 48, D Kaiserslautern, Germany

## Abstract

In this article we present a method to generate random objects from a large variety of combinatorial classes according to a given distribution. Given a description of the combinatorial class and a set of sample data, our method provides an algorithm that generates objects of size $n$ in worst-case runtime $O(n^2)$ ($O(n \log n)$ can be achieved at the cost of a higher average-case runtime), with the generated objects following a distribution that closely matches the distribution of the sample data.

## 1 Introduction

When testing an algorithm one is often confronted with the problem of generating suitable input data. Since the possible inputs can usually be modeled as combinatorial classes, the desire arises for procedures that generate the objects of these classes. It is, however, usually not feasible to test algorithms with all possible inputs, nor does one know which of the possible inputs are the most interesting ones. Hence it is common practice to test with random input data. This immediately leads to the problem of generating random objects from a combinatorial class, usually of a fixed size chosen in advance so that the results can easily be compared with each other.

Because of its importance this problem has been studied for quite some time, and many results have been achieved both in terms of procedures for specific classes (see [3] or [15] for examples) and in terms of more general approaches ([9], [10]). A restriction common to all of these procedures is that they do not allow one to specify a distribution on the objects of the considered combinatorial class, i.e. objects are generated according to the uniform distribution. This usually is no problem as long as only the correctness of the tested algorithm is considered.
If, however, we want to run simulations to gather information about the real-life behaviour of our algorithms (e.g. concerning runtime), we need to generate objects according to distributions observed on real-life data. For those combinatorial classes which can be described by context-free grammars there exist algorithms to train the distribution of a given set of objects into the grammar and to use the resulting stochastic context-free grammar to generate more data with a distribution similar to that of the sample data. However, they do not allow the user to fix the length of the generated sequences ([13]).

In this paper we present a method that combines the advantages of both approaches, that is, it generates objects of a previously fixed length according to the distribution of a given set of sample data. Our method proceeds in two steps. First we equip the productions of a context-free grammar with probabilities such that the induced distribution on the generated language models the distribution of the sample data as closely as possible. Methods for doing so are well known; one of them is presented in Section 2.2. After being trained this way, the (now weighted) context-free grammar is translated into an admissible specification. Admissible specifications, presented in Section 2.1 and extended for our purpose in Section 3.1, are a framework for specifying combinatorial classes that immediately yields generating algorithms for the specified classes. The translation is presented in Sections 3.2 and 3.3.

## 2 Known Results and Basic Definitions

### 2.1 Random Generation

In order to generate objects from combinatorial classes we will use unranking algorithms, that is, algorithms that, given $k$ and $n$, generate the $k$-th object of size $n$ from the class according to a previously fixed order. Obviously, by drawing $k$ at random this can be used to generate random objects. We do this by extending an approach introduced by Flajolet, Zimmermann and Van Cutsem in [4] for (uniform) random generation and adapted to the problem of unranking by Martínez and Molinero. Their results are collected in [10].

The basic principle of the approach is to decompose a combinatorial class into simpler classes (or, viewed the other way around, to construct more complex classes from simpler ones) by using so-called admissible constructions. Formally:

**Definition 2.1** ([5]). Objects of size 0 are called neutral objects or tags; a class consisting of a single neutral object $\epsilon$ is called a neutral class. We will denote neutral classes by $\mathcal{E}$ (or $\mathcal{E}_1, \mathcal{E}_2, \dots$ if we need to distinguish multiple neutral classes containing the objects $\epsilon_1, \epsilon_2, \dots$ respectively). A class consisting of a single object of size 1 is called an atomic class. We will denote atomic classes by $\mathcal{Z}$ ($\mathcal{Z}_a, \mathcal{Z}_b, \dots$ to distinguish the classes containing the atoms $a, b, \dots$).

**Definition 2.2** ([5]). The counting sequence of a combinatorial class $\mathcal{A}$ is the sequence of integers $(a_n)_{n \ge 0}$ where $a_n = \operatorname{card}(\mathcal{A}_n)$ is the number of objects in class $\mathcal{A}$ that have size (= number of atoms) $n$.

**Definition 2.3** ([5]). Assume that $\Phi$ is a construction that associates to a finite collection of classes $\mathcal{B}, \mathcal{C}, \dots$ a new class $\mathcal{A} := \Phi[\mathcal{B}, \mathcal{C}, \dots]$ in a finitary way: each $\mathcal{A}_n$ depends on finitely many of the $\{\mathcal{B}_i\}, \{\mathcal{C}_j\}, \dots$ Then $\Phi$ is admissible iff the counting sequence $(a_n)$ of $\mathcal{A}$ only depends on the counting sequences $(b_i), (c_j), \dots$ of $\mathcal{B}, \mathcal{C}, \dots$
In order to represent context-free grammars we will need two well-known admissible constructions.

**Definition 2.4** ([5]). If $\mathcal{A}_1, \dots, \mathcal{A}_k$ are combinatorial classes and $\epsilon_1, \dots, \epsilon_k$ are neutral objects, the combinatorial sum or disjoint union $\mathcal{A}_1 + \dots + \mathcal{A}_k$ of $\mathcal{A}_1, \dots, \mathcal{A}_k$ is defined as

$$\mathcal{A}_1 + \dots + \mathcal{A}_k := (\mathcal{E}_1 \times \mathcal{A}_1) \cup \dots \cup (\mathcal{E}_k \times \mathcal{A}_k),$$

where $\cup$ denotes set-theoretic union. If $\mathcal{A}_1$ and $\mathcal{A}_2$ are combinatorial classes, the cartesian product $\mathcal{A}_1 \times \mathcal{A}_2$ of $\mathcal{A}_1$ and $\mathcal{A}_2$ is defined as

$$\mathcal{A}_1 \times \mathcal{A}_2 := \{(\alpha_1, \alpha_2) \mid \alpha_i \in \mathcal{A}_i\}.$$

The size $|(\alpha_1, \alpha_2)|$ is defined to be $|\alpha_1| + |\alpha_2|$.

To be able to come up with unranking algorithms we need to specify an ordering on the objects of the same size that belong to the considered combinatorial class. We will do this recursively, following the structure of the class's admissible specification:

**Definition 2.5** ([10]).

- Neutral and atomic classes contain only one element, hence there is only one possible ordering.

- If $\mathcal{B} = \mathcal{A}_1 + \dots + \mathcal{A}_k$ and $\beta, \beta' \in \mathcal{B}_n$:
  $$\beta <_{\mathcal{B}_n} \beta' \iff \big(\beta \in (\mathcal{A}_i)_n \text{ and } \beta' \in (\mathcal{A}_j)_n \text{ and } i < j\big) \text{ or } \big(\beta, \beta' \in (\mathcal{A}_i)_n \text{ and } \beta <_{(\mathcal{A}_i)_n} \beta'\big).$$
- If $\mathcal{B} = \mathcal{A}_1 \times \mathcal{A}_2$ and $\beta = (\alpha_1, \alpha_2)$, $\beta' = (\alpha_1', \alpha_2') \in \mathcal{B}_n$:
  $$\beta <_{\mathcal{B}_n} \beta' \iff |\alpha_1| < |\alpha_1'| \text{ or } \big(j = |\alpha_1| = |\alpha_1'| \text{ and } \alpha_1 <_{(\mathcal{A}_1)_j} \alpha_1'\big) \text{ or } \big(\alpha_1 = \alpha_1' \text{ and } \alpha_2 <_{(\mathcal{A}_2)_{n-j}} \alpha_2'\big).$$

The order used above for products is called lexicographic order. It leads to unranking algorithms with a worst-case time complexity in $O(n^2)$, where $n$ is the size of the objects to be generated. This worst-case complexity can be improved to $O(n \log n)$ by using the boustrophedonic order, at the cost of an increased average-case runtime. Details can be found in [10].

The actual unranking algorithms are now quite straightforward. For neutral and atomic classes we simply return the single object from the class, provided the parameters of the call are correct ([10]). Given a disjoint union, we first determine the correct subclass by successively adding up the sizes of the subclasses until this sum would exceed the given rank, and then compute the rank within this subclass by subtracting the sizes of the preceding classes (Algorithm 1).

**Algorithm 1** Unranking of disjoint unions ([10])

    object function unrank(A_1 + ... + A_k, n: size, i: rank)
        c := 0; j := 1; d := size(A_j, n);
        while c + d ≤ i do
            c := c + d; j := j + 1; d := size(A_j, n);
        end while
        return unrank(A_j, n, i − c);

**Note 2.1.** We do not return the tags that would be added according to Definition 2.4, since this behaviour is usually more convenient in practice; if one actually needs the tags, they can easily be added by changing the specification accordingly.

In cartesian products the objects are primarily ordered according to the sizes of their constituent objects, hence we first determine these sizes in the same way we determined the correct subclass in unions. Having found these sizes and the rank among all objects with the same sizes, the actual ranks in the subclasses can easily be determined using div and mod (Algorithm 2).
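The sizes required for the algorithms above can be computed using the following recursion:

$$\operatorname{size}(\mathcal{A}, n) := \begin{cases} 1 & \mathcal{A} \text{ is atomic and } n = 1, \\ 0 & \mathcal{A} \text{ is atomic and } n \neq 1, \\ 1 & \mathcal{A} \text{ is neutral and } n = 0, \\ 0 & \mathcal{A} \text{ is neutral and } n \neq 0, \\ \sum_{i=1}^{k} \operatorname{size}(\mathcal{B}_i, n) & \mathcal{A} = \mathcal{B}_1 + \dots + \mathcal{B}_k, \\ \sum_{i=0}^{n} \operatorname{size}(\mathcal{B}, i) \cdot \operatorname{size}(\mathcal{C}, n - i) & \mathcal{A} = \mathcal{B} \times \mathcal{C}. \end{cases} \tag{1}$$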
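Recursion (1) and the two unranking procedures (Algorithms 1 and 2) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the dictionary encoding `SPEC`, the class names and the toy class `W` (all words over $\{a, b\}$) are assumptions made for the example, and objects are flattened to tuples of atoms instead of nested pairs.

```python
from functools import lru_cache

# Hypothetical encoding (not from the paper): a specification maps class names
# to ("eps",), ("atom", symbol), ("union", [names]) or ("prod", left, right).
SPEC = {
    "E": ("eps",),
    "Za": ("atom", "a"),
    "Zb": ("atom", "b"),
    "L": ("union", ["Za", "Zb"]),   # a single letter
    "LW": ("prod", "L", "W"),       # a letter followed by a word
    "W": ("union", ["E", "LW"]),    # W = E + L x W: all words over {a, b}
}

@lru_cache(maxsize=None)
def size(name, n):
    """Number of objects of size n in class `name`, following recursion (1)."""
    kind = SPEC[name]
    if kind[0] == "eps":
        return 1 if n == 0 else 0
    if kind[0] == "atom":
        return 1 if n == 1 else 0
    if kind[0] == "union":
        return sum(size(part, n) for part in kind[1])
    _, left, right = kind
    total = 0
    for i in range(n + 1):
        a = size(left, i)
        if a:  # skip zero terms so that recursive specifications terminate
            total += a * size(right, n - i)
    return total

def unrank(name, n, i):
    """Object of rank i (0-based) among the objects of size n in `name`."""
    if not 0 <= i < size(name, n):
        raise ValueError("rank out of range")
    kind = SPEC[name]
    if kind[0] == "eps":
        return ()
    if kind[0] == "atom":
        return (kind[1],)
    if kind[0] == "union":          # Algorithm 1; tags omitted as in Note 2.1
        c = 0
        for part in kind[1]:
            d = size(part, n)
            if c + d > i:
                return unrank(part, n, i - c)
            c += d
    assert kind[0] == "prod"        # Algorithm 2, lexicographic order
    _, left, right = kind
    c, j = 0, 0
    d = size(left, j) * size(right, n - j)
    while c + d <= i:
        c += d
        j += 1
        d = size(left, j) * size(right, n - j)
    b = size(right, n - j)
    return unrank(left, j, (i - c) // b) + unrank(right, n - j, (i - c) % b)

# Enumerate all words of size 3 by unranking every rank in turn.
words = ["".join(unrank("W", 3, i)) for i in range(size("W", 3))]
```

Drawing the rank uniformly at random, for instance with `random.randrange(size("W", n))`, turns the unranking procedure into a uniform random generator for the class.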

**Algorithm 2** Unranking of cartesian products ([10])

    object function unrank(A × B, n: size, i: rank)
        c := 0; j := 0; d := size(A, j) · size(B, n − j);
        while c + d ≤ i do
            c := c + d; j := j + 1; d := size(A, j) · size(B, n − j);
        end while
        i := i − c; b := size(B, n − j);
        α := unrank(A, j, i div b); β := unrank(B, n − j, i mod b);
        return (α, β);

### 2.2 Weighted Context-Free Grammars

Combinatorial classes can be modeled as formal languages. We will use this analogy in Section 3 to transfer results on how to train weights for formal languages onto combinatorial classes. Hence we introduce the required definitions and results here. We assume the reader has basic knowledge of the notions concerning context-free languages. An introduction can be found in [7].

**Definition 2.6** ([11]). A weighted context-free grammar (WCFG) is a 5-tuple $G = (I, T, R, S, W)$, where $I$ (resp. $T$) is an alphabet (finite set) of intermediate (resp. terminal) symbols ($I$ and $T$ are disjoint), $S \in I$ is called the axiom, $R \subseteq I \times (I \cup T)^*$ is a finite set of production rules and $W : R \to \mathbb{R}^+$ is a mapping such that each rule $f \in R$ is equipped with a weight $w_f := W(f)$.

In the sequel we will write $A \xrightarrow{w_f} \alpha$ instead of $f = (A, \alpha) \in R$, $w_f = W(f)$. If $G$ is a WCFG, $G$ is a stochastic context-free grammar (SCFG) iff the additional restrictions

$$\forall f \in R : W(f) \in (0, 1], \qquad \forall A \in I : \sum_{f \in R,\ Q(f) = A} w_f = 1$$

hold, where $Q(f)$ denotes the premise of the production $f$, that is, the first component $A$ of a production rule $(A, \alpha) \in R$. (We will call the second component the conclusion of $(A, \alpha)$.)

Words are generated as for usual context-free grammars. The product of the weights of the production rules used gives the weight of a parse tree; the sum of the weights of all possible full-parse trees of a word $w$ gives the weight of $w$. We will write $W(\Delta)$ resp. $W(w)$ for these weights.

**Definition 2.7.** If $G$, $G'$ are WCFGs, $G$ and $G'$ are said to be word-equivalent iff $L(G) = L(G')$ and for each word $w \in L(G)$: $W(w) = W'(w)$.
If additionally there exists a bijection $f$ from the full-parse trees of $w$ in $G$ to the full-parse trees of $w$ in $G'$ such that for each tree $\Delta$: $W(\Delta) = W'(f(\Delta))$, we will call $G$ and $G'$ (derivation-)equivalent.

A WCFG $G$ is called loop-free iff there exists no nonempty derivation $A \Rightarrow^+ A$, $A \in I$. It is called $\epsilon$-free iff there is no production $(A, \epsilon) \in R$ with $A \neq S$ and no production $(A, \alpha_1 S \alpha_2) \in R$, where $\epsilon$ denotes the empty word.

The following result shows that, given a CFG $G$ and a set of full-parse trees for that grammar distributed according to some unknown distribution on $L(G)$, we can model the distribution by setting the weights of the grammar to the relative frequencies with which the productions occur in the set of parse trees. This is based on the concept of maximum likelihood as introduced by Fisher (see [1]).

**Definition 2.8** ([14]). For a given treebank, i.e., for a non-empty and finite corpus of full-parse trees, the treebank grammar $G = (I, T, R, S, W)$ is an SCFG defined by

1. $I$ (resp. $T$; $R$) consists of exactly the intermediate symbols (resp. terminal symbols; productions) appearing in the treebank; $S$ is the root of all the full-parse trees,
2. $\forall p = (A, \alpha) \in R : W(p) = \dfrac{f(p)}{\sum_{p' = (A, \alpha') \in R} f(p')}$,

where $f(p)$ denotes the number of occurrences of $p$ in the set of full-parse trees.
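The relative-frequency weights of Definition 2.8 are simple to compute. The following sketch assumes a hypothetical tree encoding that is not part of the paper: a full-parse tree is a tuple `(label, child, ...)` and terminals are plain strings, with `"."` standing in for a terminal symbol. Exact fractions are used so that the weights of each premise sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction

def productions(tree):
    """Yield (premise, conclusion) for every inner node of a parse tree.

    A tree is a tuple (label, child, ...); leaves are plain strings.
    """
    label, *children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)

def treebank_weights(treebank):
    """Weight of each production = its count / count of all productions
    with the same premise (Definition 2.8)."""
    counts = Counter(p for tree in treebank for p in productions(tree))
    totals = Counter()
    for (premise, _), f in counts.items():
        totals[premise] += f
    return {p: Fraction(f, totals[p[0]]) for p, f in counts.items()}

# Toy treebank with two full-parse trees (illustrative data only).
TREEBANK = [
    ("S", ("R", ("T", "."))),
    ("S", ("R", ("T", "."), ("R", ("T", ".")))),
]
WEIGHTS = treebank_weights(TREEBANK)
```

In this toy treebank the premise `R` occurs three times, twice with conclusion `T` and once with conclusion `T R`, so the two productions receive the weights 2/3 and 1/3.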

**Theorem 2.1** ([14]). Let $f_t$ be a non-empty and finite corpus of full-parse trees and let $G$ be the treebank grammar read off from $f_t$. Then the weights induced on the full-parse trees by $G$ are a maximum-likelihood estimate of the probability model of the unweighted context-free grammar $G'$ corresponding to $G$ on $f_t$.

## 3 Results

### 3.1 Weighting as an Admissible Construction

In order to encode the distribution into the combinatorial class we will equip the objects of our combinatorial class with weights. For practical reasons we restrict ourselves to integral weights, modeling an object with weight $\lambda$ as $\lambda$ copies of the respective element. This can easily be achieved by introducing a new admissible construction:

**Definition 3.1.** If $\mathcal{A}$ is a combinatorial class and $\lambda$ is an integer, the weighting of $\mathcal{A}$ by $\lambda$, $\lambda\mathcal{A}$, is defined as

$$\lambda\mathcal{A} := \underbrace{\mathcal{A} + \dots + \mathcal{A}}_{\lambda \text{ times}}.$$

We will call two objects from a combinatorial class copies of the same object iff they only differ in the tags added by weighting operations.

In order to unrank (resp. count) the objects of a class whose specification involves weighting, we could replace all weighting constructions by unions according to Definition 3.1 and use Algorithm 1 (resp. equation (1)). It is, however, more convenient, especially if $\lambda$ is large, to use a specialised algorithm that takes into account that all the united classes are equal (Algorithm 3). For the size we find:

$$\operatorname{size}(\lambda\mathcal{A}, n) = \lambda \cdot \operatorname{size}(\mathcal{A}, n).$$

**Algorithm 3** Unranking of weighted classes

    object function unrank(λA, n: size, i: rank)
        return unrank(A, n, i mod size(A, n));

Since weighting a class can be replaced by a disjoint union, the results from [10] concerning worst-case time complexity hold for classes whose specification involves weighting, meaning that the generated unranking algorithms will have a runtime in $O(n^2)$.

To conclude the introduction of this new construction we will have a look at the behaviour of weighted classes with respect to union and product as introduced in Section 2.1:

**Lemma 3.1.**
If $\mathcal{A}_1, \dots, \mathcal{A}_n, \mathcal{B}$ are combinatorial classes, $\mathcal{B} = \mathcal{A}_1 + \dots + \mathcal{A}_n$ and $\mathcal{A}_i$ contains $\lambda$ copies of an object $\alpha$, then $\mathcal{B}$ contains $\lambda$ copies of $(\epsilon_i, \alpha)$. If $\mathcal{A}_1, \dots, \mathcal{A}_n, \mathcal{B}$ are combinatorial classes, $\mathcal{B} = \mathcal{A}_1 \times \dots \times \mathcal{A}_n$ and each $\mathcal{A}_i$ contains $\lambda_i$ copies of an object $\alpha_i$, then $\mathcal{B}$ contains $\prod_{i=1}^{n} \lambda_i$ copies of $\beta = (\alpha_1, \dots, \alpha_n)$.

*Proof.* Immediate from Definition 2.4.

### 3.2 Reweighting WCFGs

Since the weighting of classes allows only integral weights, while the training described in Section 2.2 introduces rational weights, we have to reweight the grammars before we can transform them into admissible specifications. As we will see in Section 3.3, the combinatorial objects we want to unrank in the end correspond to the full-parse trees in our grammar, with objects of the same size corresponding to full-parse trees of words of the same length. Hence we have to make sure that the relative weights of full-parse trees of words of the same length do not change, in order to guarantee that the resulting unranking algorithms will produce the objects according to the distribution trained into the grammar.

If all productions are scaled by the same factor $c$, the weight of a derivation will scale by $c^k$, $k$ being the number of steps in the derivation. Hence if all derivations of words of the same length involve the same number of steps, this operation will maintain the relative weights of these words. This is the case, for example, if the grammar is in Chomsky normal form:

**Theorem 3.2.** Let $G$ be a WCFG in Chomsky normal form with all the weights appearing in $G$ being rational; let $c$ be the common denominator of the weights appearing in $G$, and let $G'$ be derived from $G$ by multiplying all weights by $c$. Then all the weights in $G'$ are integral and for $w_1, w_2 \in L(G)$ with $|w_1| = |w_2|$ and $\Delta_i$ a full-parse tree of $w_i$, $i \in \{1, 2\}$, in $G$ and $G'$,

$$\frac{W(\Delta_1)}{W(\Delta_2)} = \frac{W'(\Delta_1)}{W'(\Delta_2)}$$

holds.

*Proof.* The first property follows immediately from the definition of $c$. If $|w_i| = 0$, $i \in \{1, 2\}$, then $w_1 = w_2 = \epsilon$ and $\Delta_1 = \Delta_2$ consists of the single rule $S \to \epsilon$. Thus both fractions evaluate to 1 and the second property holds. Now let $|w_i| \ge 1$, $i \in \{1, 2\}$. Since $G$ is in Chomsky normal form, $\Delta_i$ involves $|w_i| - 1$ applications of productions of the form $A \to BC$, $A, B, C \in I$, and $|w_i|$ applications of productions of the form $A \to a$, $A \in I$, $a \in T$. Thus $W'(\Delta_i) = W(\Delta_i) \cdot c^{2|w_i| - 1}$, $i \in \{1, 2\}$, and the second property follows immediately.

If we are given a non-ambiguous SCFG, we can preprocess it in order to be able to use Theorem 3.2:

**Theorem 3.3** ([8]). If $G$ is an SCFG, there exists an SCFG $G'$ in Chomsky normal form that is word-equivalent to $G$, and $G'$ can be effectively constructed from $G$.

The construction given in [8] assumes that $G$ is $\epsilon$-free.
It can, however, be extended to non-$\epsilon$-free grammars by adding an additional step after the intermediate grammar $G_1$ has been created: for each production $A \xrightarrow{\lambda_1} \epsilon$, remove this production and, for each instance of $A$ in the conclusion of a production $B \xrightarrow{\lambda_2} \beta_1 A \beta_2$, add a new rule $B \xrightarrow{\lambda_1 \lambda_2} \beta_1 \beta_2$. If $A$ occurs multiple times in a single conclusion, there has to be one new production for each combination of $A$s removed, the weight being $\lambda_2 \lambda_1^l$ if $l$ instances of $A$ are deleted.

In principle the former two theorems already suffice for many practical purposes. However, the transformation of Theorem 3.3 provides only word-equivalence, while in some cases derivation-equivalence is needed. Hence we will present another method for reweighting. Instead of scaling all productions by the same factor and ensuring that derivations are of equal length for words of the same size, we will scale the productions according to how much they contribute to the length of the word, that is, productions lengthening the word by $k$ will be scaled by $c^k$. Since we consider context-free grammars, the lengthening caused by a production is given by the length of its conclusion minus 1.

However, under this rule productions with a conclusion of length 1 are not reweighted, hence we have to ensure that all those productions already have integral weights in order to get a grammar with only integral weights after reweighting. Additionally, productions with a conclusion of length 0, i.e. $\epsilon$-productions, would be rescaled by $1/c$ and thus have to be avoided completely. Obeying these restrictions would, however, make it impossible to generate words of length 0 or 1 with appropriate weights. Thus, to circumvent this problem, we will modify the required structure of the grammar a bit:

- The axiom does not occur in any conclusion, and any production having the axiom as premise has a conclusion of length 0 or 1.
- All other productions obey the restrictions stated above.

This ensures that each derivation includes exactly one application of a production having the axiom as premise and that, unless the word generated is the empty word, this application does not influence the length of the sentential form. Thus these productions can be reweighted separately. Formally:

**Definition 3.2.** If $G = (I, T, R, S, W)$ is a WCFG, $G$ is said to be in reweighting normal form (RNF) iff

1. $G$ is loop-free and $\epsilon$-free;
2. $A \to \alpha \in R$, $A = S$ : $|\alpha| \le 1$;
3. $A \to \alpha \in R$, $A \neq S$ : $|\alpha| > 1$ or $W(A \to \alpha) \in \mathbb{N}$;
4. $\forall A \in I\ \exists \alpha \in (I \cup T)^* : A \to \alpha \in R$.

The last condition, that any intermediate symbol occurs as premise of at least one production, is not required for reweighting. It is, however, necessary for the translation of the grammar into an admissible specification later on.

**Theorem 3.4.** Let $G$ be a WCFG in RNF with all the weights appearing in $G$ being rational; let $s$ be the common denominator of the weights of productions with premise $S$ in $G$, let $c$ be the common denominator of the weights of the remaining productions, and let $G'$ be derived from $G$ by multiplying the weights of productions with premise $S$ by $s$ and the weights of productions $A \to \alpha$, $A \neq S$, by $c^{|\alpha| - 1}$. Then all the weights in $G'$ are integral and for $w_1, w_2 \in L(G)$ with $|w_1| = |w_2|$ and $\Delta_i$ a full-parse tree of $w_i$, $i \in \{1, 2\}$, in $G$ and $G'$,

$$\frac{W(\Delta_1)}{W(\Delta_2)} = \frac{W'(\Delta_1)}{W'(\Delta_2)}$$

holds.

*Proof.* Since $A \to \alpha \in R$, $A \neq S$ implies $|\alpha| > 1$ or $W(A \to \alpha) \in \mathbb{N}$, the first property follows immediately from the definition of $s$ and $c$. If $|w_i| = 0$, $i \in \{1, 2\}$, then $w_1 = w_2 = \epsilon$ and $\Delta_1 = \Delta_2$ consists of the single rule $S \to \epsilon$. Thus both fractions evaluate to 1 and the second property holds. Now let $|w_i| \ge 1$, $i \in \{1, 2\}$. Since $G$ is in RNF, $\Delta_i$ involves one application of a production of the form $S \to \alpha$, $|\alpha| = 1$, and $k$ applications of productions of the form $A_{i,j} \to \alpha_{i,j}$, $A_{i,j} \neq S$.
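The production with premise $S$ yields a sentential form of length 1 and each further production lengthens it by $|\alpha_{i,j}| - 1$, thus $\sum_{j=1}^{k} (|\alpha_{i,j}| - 1) = |w_i| - 1$, $i \in \{1, 2\}$, and we find

$$W'(\Delta_i) = W(\Delta_i) \cdot s \cdot \prod_{j=1}^{k} c^{|\alpha_{i,j}| - 1} = W(\Delta_i) \cdot s \cdot c^{|w_i| - 1}, \quad i \in \{1, 2\},$$

and the second property follows immediately.

What remains to be done in this section is to transform arbitrary WCFGs into equivalent grammars in reweighting normal form. The proof of the following theorem describes how this can be done.

**Theorem 3.5.** If $G$ is a loop-free, $\epsilon$-free WCFG, there exists a WCFG $G'$ in RNF that is equivalent to $G$, and $G'$ can be effectively constructed from $G$.

*Proof.* Let $CH$ denote the set of all quadruples $(A, A_1 A_2 \dots A_n, \lambda_0 \lambda_1 \cdots \lambda_n, B)$ with $A, A_1, \dots, A_n \in I$, $B \in I \cup T$, $\lambda_i \in \mathbb{R}$ such that $A \xrightarrow{\lambda_0} A_1$, $A_1 \xrightarrow{\lambda_1} A_2$, $\dots$, $A_n \xrightarrow{\lambda_n} B$ are rules of $G$ (for $n = 0$, we get $(A, \epsilon, \lambda, B) \in CH$ whenever $A \xrightarrow{\lambda} B$ is a rule with $B \in I \cup T$, and $n = -1$ results in $(A, \epsilon, 1, A) \in CH$ for $A \in I$). Since $G$ is loop-free, $CH$ is finite. Now define $G'$ as follows: $G'$ contains the same terminals as $G$. The nonterminals are those from $G$, a new start symbol $S'$ and the set $CH$. The rules are as follows:

- $(A, \gamma, \lambda, B) \xrightarrow{1} B$ for $(A, \gamma, \lambda, B) \in CH$,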

- $A \xrightarrow{\lambda_0 \lambda_1 \cdots \lambda_n} w_0 (A_1, \gamma_1, \lambda_1, B_1) w_1 (A_2, \gamma_2, \lambda_2, B_2) \dots (A_n, \gamma_n, \lambda_n, B_n) w_n$ for $A \xrightarrow{\lambda_0} \alpha = w_0 A_1 w_1 A_2 \dots A_n w_n$ a rule from $G$ with $|\alpha| \ge 2$, $w_i \in T^*$ and $A_i \in I$,
- $S' \xrightarrow{\lambda} (S, \gamma, \lambda, a)$ for $(S, \gamma, \lambda, a) \in CH$,
- $S' \xrightarrow{\lambda} \epsilon$, if $S \xrightarrow{\lambda} \epsilon$ is a rule in $G$.

From this construction it is obvious that $G'$ satisfies conditions 1 to 3 of Definition 3.2. Condition 4 can be obtained by removing all nonterminals that violate it along with the productions containing them. Since these symbols and productions cannot appear in any full-parse tree, this removal does not influence derivation-equivalence. Since $CH$ can easily be computed from $G$, $G'$ can be constructed effectively. It remains to be shown that $G$ and $G'$ are actually derivation-equivalent.

**Claim.** For each full-parse tree $\Delta' : S' \Rightarrow^* w$, $w \in T^*$, in $G'$ there exists a full-parse tree $\Delta : S \Rightarrow^* w$ in $G$ with $W(\Delta) = W'(\Delta')$.

*Proof.* If $w = \epsilon$ then $\Delta'$ consists of the single rule $S' \xrightarrow{\lambda} \epsilon$, and $\Delta$ consisting of the single rule $S \xrightarrow{\lambda} \epsilon$ is a full-parse tree for $w$ in $G$ with $W(\Delta) = W'(\Delta')$. If $|w| = 1$, $\Delta'$ consists of the rules $S' \xrightarrow{\lambda} (S, \gamma, \lambda, a)$ and $(S, \gamma, \lambda, a) \xrightarrow{1} a$, and $\Delta$ consisting of the rules that led to the inclusion of $(S, \gamma, \lambda, a)$ in $CH$ is a full-parse tree for $w$ in $G$ with $W(\Delta) = W'(\Delta')$. Otherwise $\Delta'$ consists of one instance of a rule with premise $S'$ and several groups of rules

$$A \xrightarrow{\lambda_0 \lambda_1 \cdots \lambda_n} w_0 (A_1, \gamma_1, \lambda_1, B_1) w_1 (A_2, \gamma_2, \lambda_2, B_2) \dots (A_n, \gamma_n, \lambda_n, B_n) w_n, \qquad (A_i, \gamma_i, \lambda_i, B_i) \xrightarrow{1} B_i.$$

For each of these groups there is in $G$ a production $A \xrightarrow{\lambda_0} \alpha = w_0 A_1 w_1 A_2 \dots A_n w_n$ and, if $A_i \neq B_i$, the set of productions that led to the inclusion of $(A_i, \gamma_i, \lambda_i, B_i)$ in $CH$. Additionally, by construction of $CH$ and $G'$, the product of weights is the same for both groups. Thus by combining the groups of productions from $G$ derived from the groups of productions in $\Delta'$ we get a full-parse tree $\Delta$ for $w$ in $G$ with $W(\Delta) = W'(\Delta')$.

**Claim.** For each full-parse tree $\Delta : S \Rightarrow^* w$, $w \in T^*$, in $G$ there exists a full-parse tree $\Delta' : S' \Rightarrow^* w$ in $G'$ with $W(\Delta) = W'(\Delta')$.

*Proof.* By applying the argument of the above proof in reverse.
Together the proofs of the two claims give the bijection required for derivation-equivalence.

Neither of the requirements in Theorem 3.5 may be dropped without losing derivation-equivalence. Concerning $\epsilon$-freeness, consider the set of productions $\{S \to \epsilon, S \to A, A \to \epsilon\}$, where $\epsilon \in L(G)$ has two different full-parse trees, while in an $\epsilon$-free grammar $\epsilon$ can only have the single derivation $S \to \epsilon$. Concerning loop-freeness, note that if $A \Rightarrow^+ A$ is a loop in $G$ and $A$ appears in a derivation of $w \in L(G)$, then $w$ has infinitely many full-parse trees in $G$, while in a loop-free grammar any word $w$ has only finitely many parse trees.

To transform a grammar into a word-equivalent loop-free, $\epsilon$-free grammar, the ideas used in the proof of Theorem 3.3 can be applied. Note, however, that eliminating loops with a weight $\ge 1$ will result in infinite weights.
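The reweighting step of Theorem 3.4 can be sketched as follows. The dictionary representation of rules and the small grammar `RULES` are hypothetical examples chosen for illustration; the sketch assumes the grammar is already in RNF and uses the least common denominators for $s$ and $c$.

```python
from fractions import Fraction
from math import lcm  # available from Python 3.9

def reweight(rules, axiom):
    """Reweight an RNF grammar as in Theorem 3.4.

    rules maps (premise, conclusion) to a Fraction weight, where the
    conclusion is a tuple of symbols. Axiom productions are multiplied by s,
    all others by c ** (len(conclusion) - 1).
    """
    s = lcm(*(w.denominator for (a, _), w in rules.items() if a == axiom))
    c = lcm(*(w.denominator for (a, _), w in rules.items() if a != axiom))
    out = {}
    for (a, alpha), w in rules.items():
        scaled = w * (s if a == axiom else c ** (len(alpha) - 1))
        if scaled.denominator != 1:
            raise ValueError("grammar is not in RNF")
        out[(a, alpha)] = int(scaled)
    return out

# Hypothetical ambiguous grammar in RNF over the single terminal "a".
RULES = {
    ("S'", ("A",)): Fraction(1, 2),
    ("S'", ("a",)): Fraction(1, 2),
    ("A", ("a", "A")): Fraction(1, 2),
    ("A", ("A", "a")): Fraction(1, 6),
    ("A", ("a", "a")): Fraction(1, 3),
}
NEW = reweight(RULES, "S'")

def tree_weight(rules, used):
    """Weight of a parse tree given the list of productions it uses."""
    weight = 1
    for production in used:
        weight *= rules[production]
    return weight

# The two full-parse trees of the word "aaa".
TREE_1 = [("S'", ("A",)), ("A", ("a", "A")), ("A", ("a", "a"))]
TREE_2 = [("S'", ("A",)), ("A", ("A", "a")), ("A", ("a", "a"))]
```

For this grammar $s = 2$ and $c = 6$; the two trees of `aaa` have weights 1/12 and 1/36 before reweighting and 6 and 2 afterwards, so their ratio of 3 is preserved.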

### 3.3 Transforming WCFGs into Admissible Specifications

To get the transformation from WCFGs to admissible specifications, we note how the elements of grammars can be modeled by elements of specifications:

- The empty word is modeled as a neutral class.
- Terminal symbols are modeled as atomic classes, containing the respective symbol.
- Intermediate symbols are modeled as classes that contain all words that can be derived from the symbol.
- Productions are modeled as classes containing the cartesian product of all symbols of the conclusion.
- Weights of productions are modeled by weighting the respective classes.
- Multiple productions with the same premise are modeled as the disjoint union of the classes modeling the (weighted) productions.

Formalizing this, we get:

**Theorem 3.6.** Let $G = (I, T, R, S, W)$ be a WCFG with integral weights. For each $A \in I$ let $\{A \xrightarrow{\lambda_{A,j}} \alpha_{A,j,1} \dots \alpha_{A,j,n_{A,j}} \mid 1 \le j \le n_A\}$ denote the complete set of productions with premise $A$, $n_A$ being the number of such productions. Let $\mathfrak{T}$ be constructed as follows:

- For each $a \in T$ add an atomic class $\mathcal{Z}_a$.
- For each $A \in I$ add a class $\mathcal{A}$.
- For each $A \xrightarrow{\lambda_{A,j}} \alpha_{A,j,1} \dots \alpha_{A,j,n_{A,j}} \in R$, $1 \le j \le n_A$, add a class $\mathcal{A}_j$ and an equation $\mathcal{A}_j = \beta_{A,j,1} \times \dots \times \beta_{A,j,n_{A,j}}$, where $\beta_{A,j,k} = \mathcal{Z}_a$ if $\alpha_{A,j,k} = a \in T$ and $\beta_{A,j,k} = \mathcal{B}$ if $\alpha_{A,j,k} = B \in I$. If $n_{A,j} = 0$ add the equation $\mathcal{A}_j = \mathcal{E}$ instead.
- For each $A \in I$ add an equation $\mathcal{A} = \lambda_{A,1} \mathcal{A}_1 + \dots + \lambda_{A,n_A} \mathcal{A}_{n_A}$.

Then $\mathfrak{T}$ is an admissible specification, and for each $w \in L(G)$ there exists a bijection $f$ between the full-parse trees of $w$ in $G$ and those objects in $\mathcal{S}_{|w|}$ that contain the atoms of $\mathcal{Z}_{a_1}, \dots, \mathcal{Z}_{a_{|w|}}$ in the same order as the terminal symbols $a_i$ appear in $w$, such that $\mathcal{S}_{|w|}$ contains $W(\Delta)$ copies of $f(\Delta)$.

*Proof.* The first property follows immediately from the construction of $\mathfrak{T}$. Considering the second property we will first show the following lemma:

**Lemma 3.7.** Let $\mathfrak{T}$ be constructed as in Theorem 3.6, $\Delta = A \Rightarrow^* w$, $w \in L((I, T, R, A, W))$, $\lambda = W(\Delta)$.
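Then there are $\lambda$ copies of an object $o \in \mathcal{A}_{|w|}$ that contains the atoms of $\mathcal{Z}_{a_1}, \dots, \mathcal{Z}_{a_{|w|}}$ in the same order as the terminal symbols $a_i$ appear in $w$.

*Proof.* Let $n$ denote the number of steps in the derivation of $w$, let $A \xrightarrow{\mu} \alpha_1 \dots \alpha_k$ be the first production used in $\Delta$, let $\mathcal{B}$ be the class introduced for this production and let $\epsilon_{\mathcal{B}}$ be the tag associated with $\mathcal{B}$ in the equation for $\mathcal{A}$.

$n = 1$: Then $\lambda = \mu$ and $w = \alpha_1 \dots \alpha_k$, and by construction of $\mathfrak{T}$: $(\alpha_1, \dots, \alpha_k) \in \mathcal{B}$ and thus $\forall\, 1 \le i \le \lambda : (\epsilon_{\mathcal{B}}, (\epsilon_i, (\alpha_1, \dots, \alpha_k))) \in \mathcal{A}$, where the $\epsilon_i$ are introduced during the weighting operation $\lambda\mathcal{B}$. Hence the Lemma holds.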

$n > 1$: Assume the Lemma holds for all derivations of length $< n$. The derivations $\alpha_i \Rightarrow^* w_i$, where $w_i \in T^*$, $w = w_1 \dots w_k$, all have length $< n$; hence if $\alpha_i \in I$ we may set $o_i$ to the object corresponding to the derivation $\alpha_i \Rightarrow^* w_i$. If $\alpha_i \in T$ we set $o_i = \alpha_i$. Then by construction of $\mathfrak{T}$: $(o_1, \dots, o_k) \in \mathcal{B}$ and thus $\forall\, 1 \le i \le \mu : (\epsilon_{\mathcal{B}}, (\epsilon_i, (o_1, \dots, o_k))) \in \mathcal{A}$, where the $\epsilon_i$ are introduced during the weighting operation $\mu\mathcal{B}$. Using the induction hypothesis and Lemma 3.1 the Lemma follows.

It remains to be shown that conversely each object $o \in \mathcal{S}$ models a derivation. From the construction of $\mathfrak{T}$ we get that each $o \in \mathcal{A}$, $\mathcal{A}$ added for $A \in I$, must be of the form $o = (\epsilon_j, (\epsilon_l, (o_1, \dots, o_k)))$, where each of the $o_i$, $1 \le i \le k$, is either a terminal symbol or $o_i \in \mathcal{B}_i$, $\mathcal{B}_i$ added for $B_i \in I$, and $o_i$ is smaller than $o$. Hence, assuming that for the smaller $o_i$ there exist corresponding derivations, we get a derivation corresponding to $o$ by concatenating $A \to \alpha_1 \dots \alpha_k$, with $\alpha_i = o_i$ if $o_i \in T$ and $\alpha_i = B_i$ if $o_i \in \mathcal{B}_i$, with the derivations for those $o_i$ not in $T$. The property follows by induction.

It is easy to verify that the above transformation can also be applied inversely to get a CFG from an admissible specification involving only disjoint unions and cartesian products. Additionally, the admissible construction of sequence generation can also be transformed into productions of a CFG, translating $\mathcal{R} = \mathrm{SEQ}(\mathcal{S})$ into $R \to \epsilon \mid RS$.

## 4 Example of Application

As a simple example we have used our method to generate random RNA secondary structures. These are typically modelled as correctly parenthesized words over the alphabet $\{[, \bullet, ]\}$, where corresponding brackets represent bases paired with each other and $\bullet$ denotes an unpaired base. (A more thorough introduction to the topic can be found in [2].) Using a simple grammar for these words and training it using the data from [16], we arrived at the following:

$$\begin{array}{lll} 1 : S \to R, & 0.31 : R \to T, & 0.69 : T \to \bullet, \\ 0.69 : R \to T R, & 0.31 : T \to [R]. & \end{array}$$
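Since this grammar is loop-free and $\epsilon$-free, we can use the procedure from Theorem 3.5 to transform it to RNF. We find

$$CH = \{(S, \epsilon, 1, S), (S, \epsilon, 1, R), (S, R, 0.31, T), (S, RT, 0.21, \bullet), (R, \epsilon, 1, R), (R, \epsilon, 0.31, T), (R, T, 0.21, \bullet), (T, \epsilon, 1, T), (T, \epsilon, 0.69, \bullet)\}$$

and thus, after reducing, get the new grammar

$$\begin{array}{ll} 1 : S' \to (S, \epsilon, 1, R), & 0.69 : R \to (T, \epsilon, 1, T)(R, \epsilon, 1, R), \\ 0.31 : S' \to (S, R, 0.31, T), & 0.21 : R \to (T, \epsilon, 1, T)(R, \epsilon, 0.31, T), \\ 0.31 : S' \to (S, RT, 0.31, \bullet), & 0.14 : R \to (T, \epsilon, 1, T)(R, T, 0.21, \bullet), \\ 0.31 : T \to [(R, \epsilon, 1, R)], & 0.48 : R \to (T, \epsilon, 0.69, \bullet)(R, \epsilon, 1, R), \\ 0.10 : T \to [(R, \epsilon, 0.31, T)], & 0.15 : R \to (T, \epsilon, 0.69, \bullet)(R, \epsilon, 0.31, T), \\ 0.07 : T \to [(R, T, 0.21, \bullet)], & 0.10 : R \to (T, \epsilon, 0.69, \bullet)(R, T, 0.21, \bullet), \\ 1 : (A, \gamma, \lambda, B) \to B, & (A, \gamma, \lambda, B) \in CH,\ B \neq S. \end{array}$$

Since replacing each of the $(A, \gamma, \lambda, B)$ by $B$ in the conclusions does not lead to two of the rules becoming identical, this transformation is derivation-equivalent for our grammar and gives the more readable version (with the axiom again written as $S$)

$$\begin{array}{llll} 1 : S \to R, & 0.69 : R \to T R, & 0.48 : R \to \bullet R, & 0.31 : T \to [R], \\ 0.31 : S \to T, & 0.21 : R \to T T, & 0.15 : R \to \bullet T, & 0.10 : T \to [T], \\ 0.31 : S \to \bullet, & 0.14 : R \to T \bullet, & 0.10 : R \to \bullet \bullet, & 0.07 : T \to [\bullet]. \end{array}$$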
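As a check on the listed weights, each one arises in the construction of Theorem 3.5 as the weight of the original production multiplied by the weights of the chains that were contracted. Three of them, worked out from the trained grammar above and rounded to two decimals:

$$\begin{aligned} R \to \bullet R &: \quad 0.69 \cdot 0.69 \cdot 1 = 0.4761 \approx 0.48, \\ R \to T T &: \quad 0.69 \cdot 1 \cdot 0.31 = 0.2139 \approx 0.21, \\ T \to [T] &: \quad 0.31 \cdot 0.31 = 0.0961 \approx 0.10. \end{aligned}$$

In the first line the factors are the weight of $R \to T R$, the weight of the chain $T \to \bullet$ and the weight 1 of the empty chain from $R$ to $R$.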

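Applying Theorem 3.4 we find the smallest common denominators to be $s = 100$ and $c = 100$, and thus the reweighted set of productions is:

$$\begin{array}{llll} 100 : S \to R, & 69 : R \to T R, & 48 : R \to \bullet R, & 3100 : T \to [R], \\ 31 : S \to T, & 21 : R \to T T, & 15 : R \to \bullet T, & 1000 : T \to [T], \\ 31 : S \to \bullet, & 14 : R \to T \bullet, & 10 : R \to \bullet \bullet, & 700 : T \to [\bullet]. \end{array}$$

Applying Theorem 3.6, we get the following specification:

$$\begin{array}{llll} \mathcal{S}_1 = \mathcal{R}, & \mathcal{R}_1 = \mathcal{T} \times \mathcal{R}, & \mathcal{R}_4 = \mathcal{Z}_\bullet \times \mathcal{R}, & \mathcal{T}_1 = \mathcal{Z}_{[} \times \mathcal{R} \times \mathcal{Z}_{]}, \\ \mathcal{S}_2 = \mathcal{T}, & \mathcal{R}_2 = \mathcal{T} \times \mathcal{T}, & \mathcal{R}_5 = \mathcal{Z}_\bullet \times \mathcal{T}, & \mathcal{T}_2 = \mathcal{Z}_{[} \times \mathcal{T} \times \mathcal{Z}_{]}, \\ \mathcal{S}_3 = \mathcal{Z}_\bullet, & \mathcal{R}_3 = \mathcal{T} \times \mathcal{Z}_\bullet, & \mathcal{R}_6 = \mathcal{Z}_\bullet \times \mathcal{Z}_\bullet, & \mathcal{T}_3 = \mathcal{Z}_{[} \times \mathcal{Z}_\bullet \times \mathcal{Z}_{]}, \end{array}$$

$$\mathcal{S} = 100\mathcal{S}_1 + 31\mathcal{S}_2 + 31\mathcal{S}_3, \quad \mathcal{R} = 69\mathcal{R}_1 + 21\mathcal{R}_2 + 14\mathcal{R}_3 + 48\mathcal{R}_4 + 15\mathcal{R}_5 + 10\mathcal{R}_6, \quad \mathcal{T} = 3100\mathcal{T}_1 + 1000\mathcal{T}_2 + 700\mathcal{T}_3.$$

We have implemented the resulting algorithm, as well as an algorithm for the uniform generation of RNA secondary structures, in the computer algebra system Maple and used these implementations to generate 1000 objects of size 150 with each of the algorithms. The resulting ratios of unpaired bases are shown in Table 1.

| Structures | Ratio of unpaired bases |
| --- | --- |
| Generated unweighted | 45.1% |
| Generated weighted | 53.2% |
| From database | 52.2% |

Table 1: Ratio of unpaired bases for the generated structures.

As was to be expected, the structures generated with the weighted algorithm closely match the structures in the database concerning this parameter. Concerning other parameters, such as the free energy of the molecules, the results generated by our weighted unranking algorithm are not as good. This is due to the fact that the grammar we started with is too simple (i.e. it has too few free variables) to model these parameters, and hence there is no way to train their properties into this grammar. This problem can, however, be circumvented by using a more sophisticated grammar. In [12] such a grammar is given for RNA secondary structures, and it is shown that applying our approach to this grammar yields an algorithm that generates secondary structures with free energies distributed very similarly to the distribution observed for real RNA molecules.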
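The weighted specification can be turned into a generator along the lines of Algorithms 1 to 3. The Python sketch below is not the Maple implementation mentioned above. The `SPEC` encoding and function names are assumptions; the ASCII character `"."` stands in for the unpaired-base symbol, and objects are returned as plain strings. Each class maps to its weighted productions, so the union, weighting and product steps are combined in `unrank` and `unrank_seq`.

```python
import random
from functools import lru_cache

# Reweighted productions from the text; "." stands for an unpaired base.
SPEC = {
    "S": [(100, ("R",)), (31, ("T",)), (31, (".",))],
    "R": [(69, ("T", "R")), (21, ("T", "T")), (14, ("T", ".")),
          (48, (".", "R")), (15, (".", "T")), (10, (".", "."))],
    "T": [(3100, ("[", "R", "]")), (1000, ("[", "T", "]")),
          (700, ("[", ".", "]"))],
}

@lru_cache(maxsize=None)
def size(sym, n):
    """Weighted number of objects of size n; symbols not in SPEC are atoms."""
    if sym not in SPEC:
        return 1 if n == 1 else 0
    return sum(w * seq_size(seq, n) for w, seq in SPEC[sym])

@lru_cache(maxsize=None)
def seq_size(seq, n):
    """Size of the cartesian product of the symbols in seq (recursion (1))."""
    if len(seq) == 1:
        return size(seq[0], n)
    total = 0
    for j in range(n + 1):
        a = size(seq[0], j)
        if a:  # skip zero terms so that the recursion terminates
            total += a * seq_size(seq[1:], n - j)
    return total

def unrank(sym, n, i):
    """Object of rank i (0-based) among the weighted objects of size n."""
    if sym not in SPEC:
        return sym
    c = 0
    for w, seq in SPEC[sym]:              # Algorithm 1 over the productions
        d = w * seq_size(seq, n)
        if c + d > i:                     # Algorithm 3: drop the copy index
            return unrank_seq(seq, n, (i - c) % seq_size(seq, n))
        c += d
    raise ValueError("rank out of range")

def unrank_seq(seq, n, i):
    """Algorithm 2 applied to the product of the symbols in seq."""
    if len(seq) == 1:
        return unrank(seq[0], n, i)
    head, rest = seq[0], seq[1:]
    c, j = 0, 0
    d = size(head, j) * seq_size(rest, n - j)
    while c + d <= i:
        c += d
        j += 1
        d = size(head, j) * seq_size(rest, n - j)
    b = seq_size(rest, n - j)
    return unrank(head, j, (i - c) // b) + unrank_seq(rest, n - j, (i - c) % b)

def random_structure(n, rng=random):
    """Draw a rank uniformly at random and unrank it."""
    return unrank("S", n, rng.randrange(size("S", n)))

structure = random_structure(30, random.Random(0))
```

Because every object occurs as many times as its weight, drawing the rank uniformly yields structures with probability proportional to their weight. The ratio of unpaired bases in a sample can then be estimated by counting the `"."` characters in the generated strings.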
## 5 Concluding Remarks

In this paper we have shown how to obtain algorithms for the random generation of combinatorial objects of a fixed size according to an arbitrary distribution. The inputs are a context-free grammar that models the class (or an admissible specification of the class that can be transformed into such a grammar) and the distribution, given either as weights on the grammar or as a set of sample objects distributed accordingly.

We have done so by extending the popular framework of admissible specifications of combinatorial classes to also include the specification of weights, which allows for the generic creation of algorithms that generate objects from combinatorial classes according to a non-uniform distribution. Furthermore, we have shown how weighted context-free grammars can be transformed into admissible specifications, so that the well-known procedures for training the weights of context-free grammars can be used for training the weights in admissible specifications.

However, we had to restrict ourselves to integral weights. Hence the weights returned by the training algorithms had to be rescaled, which in turn required a rather complex transformation of the grammar.

13 This transformation is however, as well as all other steps involved, easily automateable. Thus it appears to be a sensible task, to develop a tool that, given a specification or grammar and a set of sample objects, generates the corresponding unranking algorithm. The algorithms presented generate objects of size n with a worst-case runtime in O(n 2 ). In their work [6], Flajolet, Fusy and Pivoteau show how algorithms for the uniform random generation of combinatorial objects that have a worst-case runtime linear in the size of the object generated can be automatically created from admissible specifications if one weakens the restriction for the algorithms to create objects of exact size n to the creation of objects with expected size n. We believe it to be possible to combine their results with the results presented here in order to get algorithms for the weighted generation that yield a runtime linear in the size of the created objects. It remains to be seen, however, how exactly this can be achieved. References [1] John Aldrich. R. A. Fisher and the Making of Maximum Likelihood Statistical Science, 12(3): , [2] Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological sequence analysis. Cambridge University Press, [3] M. C. Er. Enumerating ordered trees lexicographically. The Computer Journal, 28(5): , [4] P. Flajolet, P. Zimmermann, and B. V. Cutsem. A calculus for the random generation of combinatorial structures. Theoretical Computer Science, 132(1-2):1 35, [5] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, [6] Philippe Flajolet, Éric Fusy, and Carine Pivoteau. Boltzmann sampling of unlabelled structures Extended abstract. [7] Michael A. Harrison. Introduction to Formal Language Theory. Addison-Wesley, [8] T. Huang and K.S. Fu. On stochastic context-free languages. Inf. Sc., 3: , [9] Jens Liebehenschel. 
Lexikographische Generierung, Ranking und Unranking kombinatorischer Objekte: Eine Average-Case Analyse. PhD thesis, Johann Wolfgang Goethe-Universität Frankfurt am Main, [10] Xavier Molinero. Ordered Generation of Classes of Combinatorial Structures. PhD thesis, Universitat Politècnica de Catalunya, [11] Markus E. Nebel. On a Statistical Filter for RNA Secondary Structures. Technical report, Frankfurter Informatik-Berichte, [12] Markus E. Nebel and Anika Scheid. Random generation of rna secondary structures according to native distributions. Submitted, [13] Yann Ponty, Michel Termier, and Alain Denise. GenRGenS: software for generating random genomic sequences and structures. Bioinformatics, 22(12): , [14] Detlef Prescher. A tutorial on the expectation-maximization algorithm including maximum-likelihood estimation and em training of probabilistic context-free grammars. prescher/papers/bib/2003em.prescher.pdf, [15] F. Ruskey. Generating t-ary trees lexicographically. SIAM J. Comp., 7(4): , [16] J. Wuyts, P. De Rijk, Y. Van de Peer, T. Winkelmans, and R. De Wachter. The european large subunit ribosomal rna database. Nucleic Acids Res., 29: ,
