Global Optimization in Type Theory. Roland Zumkeller



revision 3


In memory of my grandfather Wolfgang Merkel


Contents

Thank You!
List of Figures
Introduction
Chapter 1. Preliminaries
    Type Theory
    Lists
    Polynomials
Chapter 2. Numbers
    Integers
    Dyadics
    Rationals
    Reals
Chapter 3. Intervals
    Arithmetic
    Irrational Functions
    The Dependency Problem
    Branch and Bound
Chapter 4. Bernstein Polynomials
    The Bernstein Bases
    Differentiation and Integration
    Bounding
    Several Variables
    Examples
Chapter 5. Function Sets
    Taylor Models
    Chebyshev Balls
    Polynomial Models
Conclusion
Appendix A. Flyspeck
Appendix B. CoRN
    When Strong Extensionality is Too Strong
Bibliography
Index
Errata and Changes


Thank You!

I'm deeply indebted to many people without whom this thesis would never have seen daylight.

First I would like to thank Gilles Dowek for his class on programming languages at Polytechnique and for encouraging me to undertake graduate studies. His clear way of both answering and asking questions has always been a model for me and tremendously stimulated my interest in academic research.

Many thanks are due to my advisor, Benjamin Werner, whose comments and moral support were essential to me. Among many other lessons, he taught me that there is always a shorter, more elegant way of saying things.

Thanks to Thomas Hales for patiently answering many questions about his proof of the Kepler conjecture. His work gave the original motivation for this thesis and I'm deeply honoured that he accepted to be on the committee.

Thanks to Tobias Nipkow, who let me work in his team at TU Munich as an undergraduate student. He not only introduced me to the pleasures of functional programming, but also allowed me a first glance at formal proofs.

Thanks to Bruno Barras and Hugo Herbelin for having kept their office doors always open. It was a privilege to have such easy access to enlightening explanations on Coq and beyond.

Thanks to Georges Gonthier for many generous explanations on a variety of formalization questions. His unique perspective encouraged me to pay a lot more attention to many details.

Thanks to Nathalie Revol for important feedback and advice, as well as for offering to read a draft version of this thesis.

Thanks to César Muñoz for welcoming me at NASA's NIA institute in December. I greatly appreciated this stimulating work environment.

Thanks to Bas Spitters, Milad Niqui, and Russell O'Connor. They taught me a lot about constructive mathematics and real numbers. Visiting their team at Nijmegen was a very enjoyable experience.

Thanks to Assia Mahboubi and Guillaume Melquiond. They not only helped me to lessen my ignorance of several mathematical topics, but were also pleasant office neighbors.

Thanks to Sean McLaughlin for many discussions, about computer science and other topics.

Thanks to Catherine Moreau and Martine Thirion, whose friendliness made me feel at home at LIX and in the INRIA/Microsoft Joint Lab.

Thanks to Philipp, Wassim, Fethallah, Arash, Domingo, and others for lasting friendship.

Thanks to Marwa.

Thanks to my parents and family.

List of Figures

0.1 A seventeenth-century wine merchant
0.2 The face-centered cubic packing
    Transitions in the computation of the binary gcd
    Taylor approximations of the logarithm function


Introduction

Optimization as a mathematical discipline is concerned with effectively determining the minimum or the maximum of a given function. This problem arises in a great variety of fields, ranging from chemistry to pure mathematics. Scientific progress has often been driven by practical problems. Beyond academia, engineers use optimization methods to make sensible choices for physical parameters, and the capitalist economy poses the question of how to put limited resources to maximal benefit.

Figure 0.1. A seventeenth-century wine merchant

It was a very practical problem that led Johannes Kepler to an important observation about the behaviour of a function around a maximum. In the autumn of 1613, just after his second wedding, he had a number of wine barrels delivered to his cellar. A few days later, as was customary at the time, the merchant came to measure the diagonal of each barrel with a gauging-rod. This was how he determined the volume and therefore the price to pay. But was he right? Kepler was

astonished by this method "without consideration for the geometrical shape, without further reflection or calculation"¹ and as district mathematician in the Austrian city of Linz he felt it was his duty to examine its validity. He thus took on the task of analyzing the problem in the most rigorous manner imaginable, which at the time meant to follow the style of the ancient Greek mathematicians. The result of this effort is his treatise Nova stereometria doliorum vinariorum [Kep11]. Following a description of a method to calculate the volumes of rotational bodies, its second part gives an answer to a related question: among all cylinders of the same diagonal, which one has the largest volume? He shows by elementary means that this requires a ratio of √2 : 1 between base diameter and height. Incidentally, this is almost what the Austrian coopers of the time were used to producing; their barrels came in different sizes, but all with a ratio a little less (due to curvature) than 3 : 2. Kepler lauds them for coming so close to the optimal value, excusing their small error by the following remark: "Near a maximal value the neighborhoods on both sides show only an insensible decrease at the beginning."² He did not use this fact to determine the maximum (he found it by elementary means instead), but at a time when analysis was still behind the horizon of mathematical knowledge this was a remarkable observation; it is often seen as the first time in history that the connection between extremum and tangent was made. A few decades later, Fermat stated explicitly the well-known necessary criterion of zero derivatives for local extrema, later generalized by Euler to functions of several variables [Eul55]. This so-called derivative check is of utmost importance for many optimization algorithms, as we will see in section 3.4.

Was the measurement method used by the wine merchants accurate or not?
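With modern calculus, not yet available to Kepler, the optimal ratio can be recovered in a few lines. For a cylinder with diagonal d, base radius r, and height h:

    d² = h² + (2r)²
    V(h) = π r² h = (π/4) (d² − h²) h
    V′(h) = (π/4) (d² − 3h²) = 0  ⟹  h = d/√3,  2r = d √(2/3)

and therefore 2r : h = √2 : 1.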
In the third part of his treatise Kepler gives the answer: since their method consisted solely of measuring the diagonal, its validity depends on the ratio between base diameter and height. This means that the scales used in Linz were valid for Austrian barrels with their ratio of almost √2 : 1, but not for imported ones: barrels imported from Hungary were longer, the ones coming from the Rhineland shorter. However, the local wine merchants used the same gauging-rods and scales for all barrels indiscriminately. Since the Austrian barrel had (almost) maximal volume for a fixed diagonal, they overestimated the volume of all foreign barrels and therefore paid too much for them.

Optimization as a Means of Verification. By determining that the wine merchants of Linz systematically incurred financial loss when buying foreign barrels, Kepler solved an optimization problem. A large portion of his treatise consisted of

¹ "... ohne Rücksicht auf die geometrische Gestalt, ohne weitere Überlegung oder Rechnung ..." [Kep11]
² "Circa maximum vero utrinque circumstantes decrementa habent initio insensibilia."

a rigorous proof that was meant to verify that his answer ("the Austrian ones") was correct. This connection between optimization and verification seems to be universal and unchanged since the days of Kepler: solving an optimization problem usually aims at verifying that a given function respects some bound (whether it be known in advance or not). However, two aspects of optimization have largely evolved over the last few centuries:

The manual work of Kepler writing down an ingenious proof has been replaced by more and more systematic methods. Calculus has greatly contributed to their development and eventually led to the design of powerful algorithms requiring a computer to be executed effectively.

Furthermore, the notion of "rigorous" has progressed significantly since the beginning of the 20th century. During the renaissance it had still meant to follow the style of the ancient Greeks, namely to derive new truths by deduction from axioms, using syllogisms. Today the greatest rigor requires a formal proof, i.e. a very detailed proof that can be checked mechanically by a machine.

This thesis is about formal correctness proofs of optimization algorithms: we present definitions and proofs in type theory, materialized in the Coq proof assistant [Cdt04]. The aim of our work is to machine-check proofs against the definitions and obtain as a result certified algorithms that solve optimization problems, minimizing the risk of software bugs.

The Problem. Optimization in itself is a wide field: the problem variables can be real numbers, integers or others; constraints on them can be box-formed, linear, more complex, or absent; we can be interested in local or in global extrema. To clarify our focus, we first state the problem.

Problem 0.1. Given a box X = [a₁; b₁] × ⋯ × [aₙ; bₙ] ⊆ Rⁿ and a function f : Rⁿ → R, find an interval Y such that min_X f ∈ Y. The width of Y should not exceed some given precision ε.
Depending on the nature of f, different methods apply:

(1) Interval arithmetic can be used if f has a so-called interval extension. This is in particular the case if f is continuous. The method can be made more efficient if f is also differentiable. This method is explained in chapter 3.

(2) If f is polynomial, it can be converted to the Bernstein basis, in which efficient algorithms provide a way to determine the minimum. They will be presented, along with detailed proofs, in chapter 4.

(3) Often f can be efficiently approximated by a polynomial, e.g. using Taylor's theorem if f is smooth. Depending on the nature of f, different approximation theorems apply. This approach is described in chapter 5.

It is hard to make a precise statement about which method is preferable for what kind of problem. However, interval arithmetic tends to be efficient in low dimension (typically 3) and for small values of ε, while the methods of chapter 5 aim to deal with problems of higher dimensions, but with less precision.

Applications. At the beginning it was mentioned that global optimization is useful in a variety of fields. However, there is one particularly interesting application that has been our initial motivation. In 1611, Johannes Kepler (and this is already the second time we encounter him) stated a conjecture on sphere packings.

Theorem 0.2. The maximal density of sphere packings in 3-space is π/√18.

The density π/√18 is attained by the so-called hexagonal-close and the face-centered cubic (see figure 0.2) packings, which correspond to the way most people would intuitively stack oranges, leading to a pyramid shape.

Figure 0.2. The face-centered cubic packing

A proof of this conjecture had long been sought, and the problem has a very prominent history [Hal06]. It was solved only in 1998, when Thomas Hales came up with a proof [Hal04]. However, this proof has one feature that makes it difficult to judge its correctness: it consists not only of several hundred pages of pencil-and-paper mathematics, but also has a computational part: a good number of non-trivial algorithms are executed many times, in order to check a finite (but large) number of sub-cases arising in the proof. In this it is similar to the proof of the four colour theorem, which had a similar status until it was entirely formalized in Coq [Gon05]. Thomas Hales began an effort³ to develop a formal proof for the Kepler conjecture. So far, the graph enumeration algorithms and linear programming techniques used in the proof have been formalized and proved correct [NBS06]. Our work aims at making progress on the third component, a long list of real inequalities occurring in the proof.
Here is one example:

³ Flyspeck is an acronym that stands for "formal proof of the Kepler conjecture".

∀x : R⁶ within suitable bounds (cf. lemma A.5),

    … < π/2 + arctan( (x₁x₃ + x₂x₅ − x₁x₄ − x₃x₆ + x₄x₆ − x₂ (−x₂ + x₁ + x₃ − x₅ + x₄ + x₆))
          / √( 4x₂ ( x₂x₅ (−x₂ + x₁ + x₃ − x₅ + x₄ + x₆)
                   + x₁x₄ (x₂ − x₁ + x₃ + x₅ − x₄ + x₆)
                   + x₃x₆ (x₂ + x₁ − x₃ + x₅ + x₄ − x₆)
                   − x₁x₃x₅ − x₂x₃x₄ − x₂x₁x₆ − x₅x₄x₆ ) ) )

A proof of this inequality can be found in appendix A (lemma A.5). However, this inequality is among the easier ones! A list of the roughly one thousand such inequalities occurring in the proof of the Kepler conjecture can be found in [McL]. Before they can be fed into a general optimization procedure they are subject to dimension reduction as described in [Hal08]: arguments inspired by geometrical insight are used to break down each inequality into several cases of lower dimension.

What is new? The main original contributions of this thesis are enumerated at the beginning of chapters 4 and 5. The first three chapters, on the other hand, contain little original work, but aim to present their material in a manner that is up to the requirements of formal proof.


CHAPTER 1

Preliminaries

A formal proof distinguishes itself from an informal one not only by the level of detail, but also by the fact that it is given in a precise, defined language. Ideally, proofs written in such a language can be checked by a proof assistant. We have chosen to use the proof assistant Coq [Cdt04], for its expressive language and because it accommodates computational proofs well [GL02, Gré].

1.1. Type Theory

While proofs in formal languages are machine-checkable, informal presentations can be more intuitive. Our presentation here takes a hybrid approach. While closely sticking to formal definitions, we stop short of (with the exception of chapter 2) displaying concrete Coq syntax. However, many notations and conventions that we found to increase clarity are borrowed from formal proof and functional programming. This chapter describes them in a manner appealing to the reader's intuition. More rigorous presentations can be found in [Cdt04].

To begin with, we try to avoid ambiguous language such as "the function x² + 2". Assuming x : R is declared, x² + 2 has the type R, not R → R. Coq would reject it in a place where a function is expected, so we avoid it. Instead, we prefer writing "the function x ↦ x² + 2" (in some presentations this would be written λx. x² + 2). Function application is written by simple juxtaposition, so we write f x instead of f(x). It has lower precedence than any infix operator, so f x + y stands for (f x) + y, not f (x + y). For definitions, f x := … is a shorthand for f := x ↦ …. Also, we implicitly use the fact that the function spaces A × B → C and A → B → C are isomorphic. We may thus write f x y for f (x, y) and vice versa. This extends to partial application: if f : A × B → C then f x denotes y ↦ f (x, y). Finally, partial application may be used with infix operators, so (+ a) denotes x ↦ x + a, and (a +) denotes x ↦ a + x. Indeed x + y = (+ y) x = (x +) y.

Universally quantified statements are written as ∀x : A. P x, meaning P x holds for all x of type A.
If A is clear from the context or the position x occurs in, we write simply ∀x. P x.

The power set (the set of all subsets) of a set A is denoted 𝒫 A.
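In concrete Coq syntax, which the informal notation above deliberately abstracts from, these conventions look roughly as follows (a sketch; the names f and g are illustrative only, not taken from the development):

```coq
(* Sketch of the conventions above in concrete Coq syntax. *)
Require Import Reals.
Open Scope R_scope.

(* "the function x ↦ x² + 2" is a lambda abstraction: *)
Definition f : R -> R := fun x => x * x + 2.

(* Currying: a two-argument function is a function returning a function,
   so a partial application is itself a function. *)
Definition g : R -> R -> R := fun x y => x + y.
Definition add_two : R -> R := g 2.   (* corresponds to (2 +) *)

Check (g 2 3).   (* full application, of type R *)
Check (g 2).     (* partial application, of type R -> R *)
```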

Point-wise operations on functions are denoted by a dot above the operation symbol: f +̇ g stands for x ↦ f x + g x.

Multiple-Dispatch Overloading. It is in fact possible to use the same symbol for different operations and associate it with different meanings according to the type. For example, + can be made to stand both for addition on the reals and for point-wise addition (allowing us to omit the dot on top of +). Recent work added Haskell-style type classes to Coq, in order to allow this so-called overloading [SO08]. It can also be achieved by encoding the operations in records and using canonical structures, a trick due to Georges Gonthier (personal communication):

Variables A B : Type.
Variable AAmul : A → A → A.
Variable BBmul : B → B → B.
Variable ABmul : A → B → B.
Variable BAmul : B → A → B.

Record mul_class (R : Type) : Type :=
  MulClass { mul_src : Type; mul_op : mul_src → R }.
Notation "x ∗ y" := (mul_op x y).

Record A_mul_class (R : Type) : Type :=
  AMulClass { A_mul_src : Type; A_mul_op : A → A_mul_src → R }.
Record B_mul_class (R : Type) : Type :=
  BMulClass { B_mul_src : Type; B_mul_op : B → B_mul_src → R }.

Canonical Structure A_mul R S := MulClass (A_mul_op R S).
Canonical Structure B_mul R S := MulClass (B_mul_op R S).
Canonical Structure A_mul_A := AMulClass AAmul.
Canonical Structure B_mul_B := BMulClass BBmul.
Canonical Structure B_mul_A := BMulClass BAmul.
Canonical Structure A_mul_B := AMulClass ABmul.

Now if a : A and b : B, then all four of a ∗ a : A, a ∗ b : B, b ∗ a : B, and b ∗ b : B typecheck. In the case where A can be injected into B, one might want to do this with Coq's

coercion mechanism. However, coercions are inserted sequentially in arguments, so multiple dispatching is not possible.

1.2. Lists

Time and again we will manipulate lists of objects. Given a type A, the type Aᵏ denotes k-tuples of type A. The values x₁, …, xₖ denote the components of x = (x₁, …, xₖ). A* is the type of lists of arbitrary length, i.e. ⋃ₖ Aᵏ. We use the terms list, tuple and vector interchangeably, with a preference for the latter two if something about the length is known.

The operation ++ denotes list concatenation: if x : Aʲ and y : Aᵏ then x ++ y := (x₁, …, xⱼ, y₁, …, yₖ).

The operation of adding a single element in front of a list is written with ";": if a : A and x : Aᵏ, then a ; x := (a, x₁, …, xₖ). Adding a single element at the end of a list is denoted by ".": x . a := (x₁, …, xₖ, a).

The first and the last element of a list x : Aᵏ are x₁ and xₖ. The operations removing the first (resp. last) element of a list yield (x₂, …, xₖ) (resp. (x₁, …, xₖ₋₁)).

The reversed list is noted rev (x₁, …, xₖ) := (xₖ, …, x₁). We note (a)ᵏ for (a, …, a), with a repeated k times. Finally, |x| denotes the length of a list x.

1.3. Polynomials

Polynomials have been formalized many times, so we shall not attempt to repeat this task here. Instead, we list basic properties that we reasonably expect to hold, along with some notations. For formal proofs the reader is referred to e.g. [CoR, Mah06].

The term "polynomial" in this section refers to polynomials in the power basis 1, X, X², …. A different one, the Bernstein basis, will be introduced in chapter 4, where the contents of this section will serve as a tool.

Assumption 1.1. Let C be a ring, whose elements we will call the coefficients.

Definition 1.2. A polynomial is a list of coefficients, i.e. of type C*, also noted C⟨1⟩.
Evaluation is defined by:

    ⟦·⟧ : C⟨1⟩ → C → C
    ⟦()⟧ x := 0
    ⟦c ; p⟧ x := c + x · ⟦p⟧ x

Furthermore, we shall permit ourselves to write p x as a shorthand for ⟦p⟧ x.
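For concreteness, here is how this evaluation function might look in Coq, instantiated with C = Z (a sketch; the name eval is illustrative, not from the thesis):

```coq
(* Sketch: polynomial evaluation as defined above, with coefficients in Z. *)
Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

Fixpoint eval (p : list Z) (x : Z) : Z :=
  match p with
  | [] => 0
  | c :: q => c + x * eval q x
  end.

(* (1, 2, 3) represents 1 + 2X + 3X²; at x = 2 it evaluates to 17. *)
Compute eval [1; 2; 3] 2.
```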

In many mathematical texts (e.g. in probability theory) the term "variable" has a somewhat mystical standing. In formal type theory, a variable is a symbol bound either by a function abstraction, i.e. x ↦ …, or a quantifier, i.e. ∀x. …. How does the variable of a polynomial relate to this? The answer is given by carefully distinguishing between the polynomial (a list of coefficients), e.g. (1, 2, 3), and its associated function ⟦(1, 2, 3)⟧ = x ↦ 1 + 2x + 3x². We shall thus never speak of "the polynomial 1 + 2x + 3x²", since the status of x in it is unclear (e.g., is it the same as 1 + 2y + 3y²?). Nevertheless, since such abusive language is often found to be convenient, we provide a formal definition that allows the same briefness.

Definition 1.3 (Zero, one, and variable).

    0 := ()
    1 := (1)
    X := (0, 1)
    X_{a;b} := (a, b − a)

Note that ⟦X⟧ = id. With this definition 1 + 2·X + 3·X² = (1, 2, 3) is a polynomial, and ⟦1 + 2·X + 3·X²⟧ = x ↦ 1 + 2x + 3x². As far as type theory is concerned, X is not a variable, but a defined constant of type C⟨1⟩. The same is true for 0 and 1.

Furthermore, we assume the domain [0; 1] for all polynomials. If other domains are of interest, the variables can easily be stretched: the polynomial (a, b − a) has the same range on [0; 1] as X on [a; b].

Addition and multiplication of polynomials will be noted + and ·.

Lemma 1.4. The set of polynomials C⟨1⟩ with + and · forms a ring.

Definition 1.5. Composition of two polynomials is defined by:

    ∘ : C⟨1⟩ → C⟨1⟩ → C⟨1⟩
    () ∘ q := ()
    (c ; p) ∘ q := c + q · (p ∘ q)

Lemma 1.6. ⟦p ∘ q⟧ = ⟦p⟧ ∘ ⟦q⟧

Definition 1.7. The differentiation operator on power basis polynomials is defined as:

    ∂ : Cᵏ⁺¹ → Cᵏ
    ∂ := p ↦ (p₁, 2p₂, …, k·pₖ)

Lemma 1.8 (Linearity of differentiation). For any α : R, ∂(α · p) = α · ∂p.

Lemma 1.9 (Chain Rule). ∂(p ∘ q) = (∂p ∘ q) · ∂q

Lemma 1.10 (Product Rule). ∂(p · q) = ∂p · q + p · ∂q

Definition 1.11. Assuming that C has characteristic zero, the integration operator on power basis polynomials is defined as:

    S : C → Cᵏ → Cᵏ⁺¹
    S := c ↦ p ↦ c ; (p₁, p₂/2, …, pₖ/k)

Note that S c p is x ↦ c + ∫₀ˣ p(ξ) dξ in traditional notation.

Theorem 1.12 (Fundamental theorem of calculus for polynomials). For any polynomial p : Cᵏ⁺¹, S (⟦p⟧ 0) (∂p) = p.

Theorem 1.13. For any polynomial p : C⟨1⟩, ∂(S c p) = p.

Corollary 1.14. For any two polynomials p, p′ : C⟨1⟩, if ∂p = ∂p′ and ⟦p⟧ 0 = ⟦p′⟧ 0, then p = p′.

Proof. We have S (⟦p⟧ 0) (∂p) = S (⟦p′⟧ 0) (∂p′) and thus, by applying theorem 1.12 on both sides, p = p′.

Lemma 1.15. For any α : R, (S c p) ∘ (α · X) = S c (∂(α · X) · (p ∘ (α · X)))

Proof. By corollary 1.14.

Several Variables. Since our coefficients C were required to be a ring (assumption 1.1) and the polynomials C⟨1⟩ form a ring (lemma 1.4) themselves, polynomials can be used as coefficients of polynomials. This is a standard trick to obtain multivariate polynomials [Mah06].

Definition 1.16 (Multivariate polynomials).

    C⟨0⟩ := C
    C⟨n + 1⟩ := (C⟨n⟩)⟨1⟩

Our previous remarks on carefully using the notion "variable" and not relying on variable names apply here too. In contrast to C[X, Y], which is often seen, C⟨2⟩ solely refers to the number of variables. For p : C⟨n⟩ we will use the notation p x to abbreviate p x₁ … xₙ (note that in this latter term p x₁ : C⟨n − 1⟩). Furthermore C⟨n⟩ can take a

different meaning if C is replaced by certain special symbols: T⟨n⟩, Q⟨n⟩, P⟨n⟩ and d will be defined in chapter 5. Note that, alternatively, we could have defined C⟨n + 1⟩ := (C⟨1⟩)⟨n⟩ (this is reminiscent of the two ways to define Church numerals).
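The differentiation operator ∂ of definition 1.7 translates directly to Coq. The sketch below instantiates C = Z (integration would additionally require the divisions of definition 1.11 to be defined); the names are illustrative only:

```coq
(* Sketch of the differentiation operator ∂ of definition 1.7, for C = Z. *)
Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

(* "deriv_from k p" scales the successive coefficients of p by k, k+1, ... *)
Fixpoint deriv_from (k : Z) (p : list Z) : list Z :=
  match p with
  | [] => []
  | c :: q => k * c :: deriv_from (k + 1) q
  end.

(* ∂(p₀, p₁, ..., pₖ) = (p₁, 2·p₂, ..., k·pₖ): drop p₀, scale the rest. *)
Definition deriv (p : list Z) : list Z :=
  match p with
  | [] => []
  | _ :: q => deriv_from 1 q
  end.

(* ∂(1 + 2X + 3X²) = 2 + 6X, i.e. (2, 6). *)
Compute deriv [1; 2; 3].
```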

CHAPTER 2

Numbers

Today, in most pieces of software used and developed, numbers are treated as a commodity. Modern microprocessors offer instructions to compute with integers and floating-point numbers, a choice which is reflected in most programming languages. However, being finite, these machine numbers are only approximations of the mathematical objects they are usually supposed to model, namely N, Z, Q, or R. A program can only be expected to be correct if the results of all its intermediate computations are machine-representable. If they are not (and often this is not obvious), a computation either aborts with an error or, much worse, gives a wrong result.

Luckily, this problem can be overcome. The solution consists of generalizing the notion of machine-representable: N and Z can be implemented by finite digit lists (section 2.1), Q by pairs of integers representing fractions (section 2.3), and R by Cauchy sequences or digit streams (section 2.4). These representations are no longer finite approximations of the intended mathematical objects, but coincide precisely with them. This is indeed a necessity for the purpose of formalizing mathematics: if addition on Z were defined in such a way that a sum of positive numbers wrapped around to 0, it could not be part of an ordered ring. The same is true if ⊥ + x = ⊥, where ⊥ denotes an error element.

2.1. Integers

Before we look into more efficient representations of large numbers, we will briefly describe how proof assistants commonly represent N and Z. The emphasis here is on the accuracy of the models, so efficiency considerations are secondary.

2.1.1. Unary Representation. In the Coq standard library N is implemented by an inductive definition:

Inductive nat := O | S of nat.

The object O represents the number 0 and S the successor function. Note that S does not compute a value from an argument. It is only a primitive symbol (a constructor), out of which, together with O, objects of type nat can be constructed.
Accordingly, the number 5 is represented by S (S (S (S (S O)))) and 2³² by a term S (S (… O)) with 2³² applications of S, consuming an amount of memory linear in the size of the number.

Addition and multiplication are defined by structural recursion on the first argument:

Fixpoint plus a b := if a is S a' then S (plus a' b) else b.
Fixpoint mult a b := if a is S a' then plus b (mult a' b) else O.

Although this representation perfectly models the set N and is often convenient for reasoning, it quickly becomes unusable for computations involving larger numbers: the time complexity of addition is linear in the size of its first argument.

2.1.2. Binary Representation. The Coq standard library therefore offers another type, representing positive natural numbers more efficiently:

Inductive positive := xi of positive | xo of positive | xh.

The constructors xi and xo represent the functions (x ↦ 2x + 1) and (x ↦ 2x), while xh stands for 1. Again, they are constructors, so they don't compute. The first two can be noted more conveniently in postfix notation:

Notation "p 1" := (xi p).
Notation "p 0" := (xo p).

In this binary representation memory usage is only logarithmic in the size of the number: xh 0 1 stands for 5, and xh followed by 32 occurrences of 0 stands for 2³². On the other hand, the successor function is not represented directly by a constructor, so it has to be defined by structural recursion:

Fixpoint Psucc x :=
  match x with
  | p 1 => (Psucc p) 0
  | p 0 => p 1
  | xh => xh 0
  end.

Note that Coq does not use machine integers to perform these operations. It just evaluates each function step-by-step, following its definition, e.g.:

Psucc (xh 0 1) ⇝ (Psucc (xh 0)) 0 ⇝ xh 1 0

The slightly more technical definitions of addition and multiplication in this representation can be found, along with many others, in the Coq standard library. Although these are considerably faster than in the unary representation, they are still slower than the primitive instructions provided by microprocessors. On the other hand, when computations on positive are performed, the CPU is used only to perform symbolic manipulations, not as a calculator. In such a setting hardware bugs will more likely lead to crashes than to erroneous results [Nic94].
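To give the flavor of these more technical definitions, here is a sketch of binary addition, written in the same informal notation as Psucc above and close in spirit to (but not identical with) the standard library's definition; the names add and add_carry are illustrative:

```coq
(* Sketch: binary addition via two mutually recursive functions,
   one without and one with a pending carry. *)
Fixpoint add x y :=
  match x, y with
  | p 1, q 1 => (add_carry p q) 0   (* 1 + 1 produces a carry *)
  | p 1, q 0 => (add p q) 1
  | p 1, xh  => (Psucc p) 0
  | p 0, q 1 => (add p q) 1
  | p 0, q 0 => (add p q) 0
  | p 0, xh  => p 1
  | xh,  q 1 => (Psucc q) 0
  | xh,  q 0 => q 1
  | xh,  xh  => xh 0
  end
with add_carry x y :=
  match x, y with
  | p 1, q 1 => (add_carry p q) 1
  | p 1, q 0 => (add_carry p q) 0
  | p 1, xh  => (Psucc p) 1
  | p 0, q 1 => (add_carry p q) 0
  | p 0, q 0 => (add p q) 1
  | p 0, xh  => (Psucc p) 0
  | xh,  q 1 => (Psucc q) 1
  | xh,  q 0 => (Psucc q) 0
  | xh,  xh  => xh 1
  end.
```

Each step processes one binary digit of both arguments, so addition runs in time logarithmic in the size of the numbers.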
Finally, the entire set of integers Z is modeled by the inductive definition

Inductive Z := Z0 | Zpos of positive | Zneg of positive.
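Relating positive back to the unary nat of subsection 2.1.1 makes the denotations of xi and xo explicit (a sketch in the same informal notation; the name nat_of_pos is illustrative):

```coq
(* Sketch: the number denoted by a positive, as a unary nat. *)
Fixpoint nat_of_pos (p : positive) : nat :=
  match p with
  | q 1 => 2 * nat_of_pos q + 1   (* xi q denotes 2q + 1 *)
  | q 0 => 2 * nat_of_pos q       (* xo q denotes 2q     *)
  | xh  => 1
  end.
```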

The standard library provides implementations of all common integer operations on this type.

2.1.3. Machine Integers. For the many situations where worries about hardware bugs, standard libraries, and compilers are to be dismissed as paranoid, Coq also provides primitive 31-bit machine integers (the 32nd bit is used for bookkeeping by the garbage collector) [Spi06]:

Inductive digits := D0 | D1.
Inductive int31 := I31 of digits & digits & … & digits.   (* 31 digit fields *)

This type comes with operations computing modulo 2³¹, e.g.:

add31 : int31 → int31 → int31
mul31 : int31 → int31 → int31

When the arguments to these functions are closed terms (i.e. concrete numbers, without variables in their bits), fast machine operations are used to perform the computation in a single step. In short, int31 is a very fast implementation of Z/2³¹Z that can be used in Coq terms.

2.1.4. Arbitrary-Size Natural Numbers. As mentioned in the introduction of this chapter, implementations of algorithms based (for example) on N can only be expected to be correct on all possible inputs if the numbers used actually model the set N accurately, rather than being only an approximation like Z/2³¹Z. While the binary representation of subsection 2.1.2 offers such an accurate implementation, it processes only a single binary digit in one step of computation. In contrast, the machine integers int31 are fast, but implement Z/2³¹Z, not N.

If we want to achieve both good performance and an accurate representation of N, a little more work is necessary: integers of arbitrary size (limited only by machine memory) can be implemented by gluing several machine integers int31 together. It is possible to implement operations on such numbers that are much faster than those on objects of type positive.
This has been of particular interest for verified primality tests that have been carried out in Coq [GTW06]. The numbers processed by these algorithms are indeed very large, meaning several thousand decimal digits. The implementation of arbitrary-size integers developed for this purpose, the library BigN, is particularly adapted to this situation: not only does it use Karatsuba's fast algorithm for multiplication; it also represents a number as

a tree, providing access in logarithmic time to all the last digits. This makes it possible to implement integer square root and division very efficiently.

While these tree-integers have proved useful for the treatment of huge prime numbers, they come with an associated overhead: the tree needs to be kept in balance, and simple operations such as addition need to recurse over a tree instead of a list, which is slightly more expensive. It will turn out that the integers used in our development will typically be large, but not huge. Therefore this overhead does not pay off in our case, and we achieved better overall performance by using the more traditional flavor of arbitrary-size integers represented as lists [Knu69].

Let us now turn to the Coq implementation: we begin by defining the type t of medium (i.e. large, but not huge) arbitrary-size integers as a list of int31 digits. We place it in a module medn and the variable names m and n are declared to be of type medn.t.

Definition t := list int31.
Implicit Types m n : t.

An arbitrary-precision integer can easily be constructed from a single int31 digit:

Definition of_int31 i : t := i :: nil.

It is explained what number an object of type t represents by converting it to type Z (cf. subsection 2.1.2):

Fixpoint to_Z n :=
  if n is n0 :: n then Int31.phi n0 + Int31.base * to_Z n else 0.

The function Int31.phi converts int31 to Z. Our arbitrary-size numbers can be seen as numbers in base Int31.base = 2³¹. Note that to_Z is not injective. Indeed, leading¹ zeros cause an ambiguity in the representation: the objects (42) and (42; 0) stand for the same number. A normalization function that removes any leading zeros is provided²:

Fixpoint norm n :=
  if n is i :: n then
    let n := norm n in
    if n is (_ :: _) then i :: n
    else if i == 0 then nil else i :: nil
  else nil.

¹ Since in our representation the list begins with the digit of least weight, the zero is actually trailing the list. However, the term "leading zero" is so common that we stick to it.

² The "let n := … in" in the definition of norm may seem a little odd here, since n is used only once (the second occurrence could be replaced by an as-bound pattern variable). This is a workaround for a peculiarity of Coq's interpretation of default branches in match/if: in certain cases the matched value is textually copied inside the branch, which would evaluate it twice when the term is normalized.
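The natural next operation on this representation is schoolbook addition with carry propagation. The following is a hypothetical sketch of it, written over Z digits rather than int31 so that the primitive carry operations can be ignored; all names are illustrative, not taken from the library.

```coq
(* Hypothetical sketch: addition of digit lists in base 2^31, least
   significant digit first, with digits modeled as Z instead of int31. *)
Require Import ZArith List.
Import ListNotations.
Open Scope Z_scope.

Definition base : Z := 2 ^ 31.

(* Propagate a carry through the remaining digits of a single number. *)
Fixpoint prop_carry (c : Z) (m : list Z) : list Z :=
  match m with
  | [] => if Z.eqb c 0 then [] else [c]
  | d :: m' => (c + d) mod base :: prop_carry ((c + d) / base) m'
  end.

(* Digit-wise addition; the carry c is always 0 or 1. *)
Fixpoint add_carry (c : Z) (m n : list Z) : list Z :=
  match m, n with
  | [], _ => prop_carry c n
  | _, [] => prop_carry c m
  | d :: m', e :: n' =>
      (c + d + e) mod base :: add_carry ((c + d + e) / base) m' n'
  end.

Definition add (m n : list Z) : list Z := add_carry 0 m n.
```

Since each digit is handled by a constant number of machine operations, this runs in time linear in the number of base-2³¹ digits, rather than in the number of bits as for positive.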


More information

Intellectual Need and Problem-Free Activity in the Mathematics Classroom

Intellectual Need and Problem-Free Activity in the Mathematics Classroom Intellectual Need 1 Intellectual Need and Problem-Free Activity in the Mathematics Classroom Evan Fuller, Jeffrey M. Rabin, Guershon Harel University of California, San Diego Correspondence concerning

More information

Out of the Tar Pit. February 6, 2006

Out of the Tar Pit. February 6, 2006 Out of the Tar Pit Ben Moseley Peter Marks February 6, 2006 Abstract Complexity is the single major difficulty in the successful development of large-scale software

More information

Communicating Sequential Processes

Communicating Sequential Processes Communicating Sequential Processes C. A. R. Hoare June 21, 2004 C. A. R. Hoare, 1985 2004 This document is an electronic version of Communicating Sequential Processes, first published in 1985 by Prentice

More information

Steering User Behavior with Badges

Steering User Behavior with Badges Steering User Behavior with Badges Ashton Anderson Daniel Huttenlocher Jon Kleinberg Jure Leskovec Stanford University Cornell University Cornell University Stanford University {dph,

More information

Robust Set Reconciliation

Robust Set Reconciliation Robust Set Reconciliation Di Chen 1 Christian Konrad 2 Ke Yi 1 Wei Yu 3 Qin Zhang 4 1 Hong Kong University of Science and Technology, Hong Kong, China 2 Reykjavik University, Reykjavik, Iceland 3 Aarhus

More information

Basics of Compiler Design

Basics of Compiler Design Basics of Compiler Design Anniversary edition Torben Ægidius Mogensen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF COPENHAGEN Published through c Torben Ægidius Mogensen 2000 2010

More information

OPRE 6201 : 2. Simplex Method

OPRE 6201 : 2. Simplex Method OPRE 6201 : 2. Simplex Method 1 The Graphical Method: An Example Consider the following linear program: Max 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2

More information

On Understanding Types, Data Abstraction, and Polymorphism

On Understanding Types, Data Abstraction, and Polymorphism 1 Computing Surveys, Vol 17 n. 4, pp 471-522, December 1985 On Understanding Types, Data Abstraction, and Polymorphism Luca Cardelli AT&T Bell Laboratories, Murray Hill, NJ 07974 (current address: DEC

More information

On Understanding Types, Data Abstraction, and Polymorphism

On Understanding Types, Data Abstraction, and Polymorphism On Understanding Types, Data Abstraction, and Polymorphism LUCA CARDELLI AT&T Bell Laboratories, Murray Hill, N. J. 07974 PETER WEGNER Department of Computer Science, Brown University, Providence, R. I.

More information

Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods

Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods SIAM REVIEW Vol. 45,No. 3,pp. 385 482 c 2003 Society for Industrial and Applied Mathematics Optimization by Direct Search: New Perspectives on Some Classical and Modern Methods Tamara G. Kolda Robert Michael

More information

Foundations of Data Science 1

Foundations of Data Science 1 Foundations of Data Science John Hopcroft Ravindran Kannan Version /4/204 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the notes

More information

Prime Number Races. Andrew Granville and Greg Martin. Table 1. The number of primes of the form 4n + 1and4n + 3uptox.

Prime Number Races. Andrew Granville and Greg Martin. Table 1. The number of primes of the form 4n + 1and4n + 3uptox. Prime Number Races Andrew Granville and Greg Martin. INTRODUCTION. There s nothing quite like a day at the races...the quickening of the pulse as the starter s pistol sounds, the thrill when your favorite

More information

Abstract. 1. Introduction. Butler W. Lampson Xerox Palo Alto Research Center David D. Redell Xerox Business Systems

Abstract. 1. Introduction. Butler W. Lampson Xerox Palo Alto Research Center David D. Redell Xerox Business Systems Experience with Processes and Monitors in Mesa 1 Abstract Butler W. Lampson Xerox Palo Alto Research Center David D. Redell Xerox Business Systems The use of monitors for describing concurrency has been

More information

System Programming in a High Level Language. Andrew D Birrell

System Programming in a High Level Language. Andrew D Birrell System Programming in a High Level Language Andrew D Birrell Dissertation submitted for the degree of Doctor of Philosophy in the University of Cambridge, December 1977. Except where otherwise stated in

More information

Orthogonal Bases and the QR Algorithm

Orthogonal Bases and the QR Algorithm Orthogonal Bases and the QR Algorithm Orthogonal Bases by Peter J Olver University of Minnesota Throughout, we work in the Euclidean vector space V = R n, the space of column vectors with n real entries

More information

Understanding and Writing Compilers

Understanding and Writing Compilers Understanding and Writing Compilers A do-it-yourself guide Richard Bornat Middlesex University, London. First published 1979. Internet edition 2007; corrected 2008. Copyright c 1979,

More information

What s Sophisticated about Elementary Mathematics?

What s Sophisticated about Elementary Mathematics? What s Sophisticated about Elementary Mathematics? Plenty That s Why Elementary Schools Need Math Teachers illustrated by roland sarkany By Hung-Hsi Wu Some 13 years ago, when the idea of creating a cadre

More information

A Tutorial on Support Vector Machines for Pattern Recognition

A Tutorial on Support Vector Machines for Pattern Recognition c,, 1 43 () Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. A Tutorial on Support Vector Machines for Pattern Recognition CHRISTOPHER J.C. BURGES Bell Laboratories, Lucent Technologies

More information

JCR or RDBMS why, when, how?

JCR or RDBMS why, when, how? JCR or RDBMS why, when, how? Bertil Chapuis 12/31/2008 Creative Commons Attribution 2.5 Switzerland License This paper compares java content repositories (JCR) and relational database management systems

More information



More information

How to Make Ad Hoc Proof Automation Less Ad Hoc

How to Make Ad Hoc Proof Automation Less Ad Hoc How to Make Ad Hoc Proof Automation Less Ad Hoc Georges Gonthier Microsoft Research Beta Ziliani MPI-SWS Aleksandar Nanevski IMDEA Software Institute

More information