Computation of 2700 billion decimal digits of Pi using a Desktop Computer
Fabrice Bellard

Feb 11, 2010 (4th revision)

This article describes some of the methods used to get the world record of the computation of the digits of π using an inexpensive desktop computer.

1 Notations

We assume that numbers are represented in base B, with B = 2^64. A digit in base B is called a limb. M(n) is the time needed to multiply two n-limb numbers. We assume that M(Cn) is approximately C·M(n), i.e. that M(n) is mostly linear, which is the case when handling very large numbers with the Schönhage-Strassen multiplication [5]. log(n) means the natural logarithm of n, and log_2(n) = log(n)/log(2). SI and binary prefixes are used (i.e. 1 TB = 10^12 bytes, 1 GiB = 2^30 bytes).

2 Evaluation of the Chudnovsky series

2.1 Introduction

The digits of π were computed using the Chudnovsky series [10]

  \frac{1}{\pi} = 12 \sum_{n=0}^{\infty} \frac{(-1)^n (6n)!\,(A + Bn)}{(n!)^3 (3n)!\, C^{3n+3/2}}

with

  A = 13591409, \quad B = 545140134, \quad C = 640320.

It was evaluated with the binary splitting algorithm. The asymptotic running time is O(M(n) log(n)^2) for an n-limb result. It is worse than the O(M(n) log(n)) asymptotic running time of the Arithmetic-Geometric Mean algorithms, but it has better locality and many improvements can reduce its constant factor.
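To make the series concrete, here is a minimal Python sketch (an illustration added to this transcription, not the program described in the article) that sums the series term by term with plain integers scaled by a power of 10; each term contributes roughly 14 additional correct digits, so only a handful of terms are needed for the 50 digits requested below.

```python
# Illustrative sketch only: direct term-by-term evaluation of the Chudnovsky
# series with Python integers scaled by 10**(prec+10).
from math import isqrt

A, B, C = 13591409, 545140134, 640320

def pi_chudnovsky_naive(prec):
    scale = 10 ** (prec + 10)            # 10 guard digits
    total, n = 0, 0
    num, den = 1, 1                      # num/den = (6n)! / ((n!)^3 (3n)! C^(3n))
    while True:
        t = num * (A + B * n) * scale // den
        if t == 0:                       # term fell below the working precision
            break
        total += -t if n % 2 else t
        n += 1
        num *= 24 * (2 * n - 1) * (6 * n - 5) * (6 * n - 1)
        den *= n ** 3 * C ** 3
    # 1/pi = 12*sum / C^(3/2), hence pi = C^(3/2) / (12*sum)
    return isqrt(C ** 3 * scale * scale) * scale // (12 * total)

print(str(pi_chudnovsky_naive(50))[:51])   # 314159265358979323846... (decimal point omitted)
```

The naive summation is only practical for small precisions; the binary splitting described next turns the whole sum into a balanced tree of integer multiplications, which is what makes huge precisions tractable.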
Let S be defined as

  S(n_1, n_2) = \sum_{n=n_1+1}^{n_2} a_n \prod_{k=n_1+1}^{n} \frac{p_k}{q_k}.

We define the auxiliary integers

  P(n_1, n_2) = \prod_{k=n_1+1}^{n_2} p_k

  Q(n_1, n_2) = \prod_{k=n_1+1}^{n_2} q_k

  T(n_1, n_2) = S(n_1, n_2)\, Q(n_1, n_2).

P, Q and T can be evaluated recursively with the following relations, defined for m such that n_1 < m < n_2:

  P(n_1, n_2) = P(n_1, m)\, P(m, n_2)

  Q(n_1, n_2) = Q(n_1, m)\, Q(m, n_2)

  T(n_1, n_2) = T(n_1, m)\, Q(m, n_2) + P(n_1, m)\, T(m, n_2).

Algorithm 1 is deduced from these relations.

Algorithm 1 Compute1_PQT(n_1, n_2)
  if n_1 = n_2 - 1 then
    return (p_{n_2}, q_{n_2}, a_{n_2} p_{n_2})
  else
    m <- floor((n_1 + n_2)/2)
    (P_1, Q_1, T_1) <- Compute1_PQT(n_1, m)
    (P_2, Q_2, T_2) <- Compute1_PQT(m, n_2)
    return (P_1 P_2, Q_1 Q_2, T_1 Q_2 + P_1 T_2)
  end if

For the Chudnovsky series we can take

  a_n = (-1)^n (A + Bn)
  p_n = (2n-1)(6n-5)(6n-1)
  q_n = n^3 C^3 / 24.

We then get

  \pi \approx \frac{C^{3/2}\, Q(0, N)}{12\, T(0, N) + 12 A\, Q(0, N)}.
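A direct Python transcription of Algorithm 1 and of the final formula above (again an illustrative sketch: the constants and the a_n, p_n, q_n are those given in the text, while the ~14 digits-per-term estimate and the isqrt-based handling of C^{3/2} are conveniences of the sketch):

```python
from math import isqrt

A, B, C = 13591409, 545140134, 640320

def a(n): return -(A + B * n) if n % 2 else A + B * n
def p(n): return (2 * n - 1) * (6 * n - 5) * (6 * n - 1)
def q(n): return n ** 3 * (C ** 3 // 24)        # C^3 is divisible by 24

def compute1_pqt(n1, n2):
    if n1 == n2 - 1:                            # base case: a single term
        return p(n2), q(n2), a(n2) * p(n2)
    m = (n1 + n2) // 2
    P1, Q1, T1 = compute1_pqt(n1, m)
    P2, Q2, T2 = compute1_pqt(m, n2)
    return P1 * P2, Q1 * Q2, T1 * Q2 + P1 * T2

def pi_binary_splitting(prec):
    N = prec // 14 + 2                          # each term adds ~14.18 digits
    _, Q, T = compute1_pqt(0, N)
    scale = 10 ** prec
    # pi ~ C^(3/2) Q(0,N) / (12 T(0,N) + 12 A Q(0,N))
    return isqrt(C ** 3 * scale * scale) * Q // (12 * T + 12 * A * Q)

print(pi_binary_splitting(60))                  # 3141592653589793... (last digits may be off)
```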
2.2 Improvements

2.2.1 Common factor reduction

The result is not modified if common factors are removed from the numerator and the denominator, as in Algorithm 2.

Algorithm 2 Compute2_PQT(n_1, n_2)
  if n_1 = n_2 - 1 then
    return (p_{n_2}, q_{n_2}, a_{n_2} p_{n_2})
  else
    m <- floor((n_1 + n_2)/2)
    (P_1, Q_1, T_1) <- Compute2_PQT(n_1, m)
    (P_2, Q_2, T_2) <- Compute2_PQT(m, n_2)
    d <- gcd(P_1, Q_2)
    P_1 <- P_1/d, Q_2 <- Q_2/d
    return (P_1 P_2, Q_1 Q_2, T_1 Q_2 + P_1 T_2)
  end if

The gcd (Greatest Common Divisor) is computed by explicitly storing the factorization of P and Q into small prime factors at every step. The factorization is computed using a sieve [3]. In [3] the sieve takes O(N log(N)) memory, which is very large. Our implementation only uses O(N^{1/2} log(N)) memory, which is negligible compared to the other memory uses.

Since

  \frac{(6n)!}{(n!)^3 (3n)!} = \prod_{k=1}^{n} \frac{24 (2k-1)(6k-5)(6k-1)}{k^3}

is an integer, it is useful to note that the final denominator Q(0, N) of the sum divides C^{3N} once all the common factors have been removed. So the maximum size of the final T(0, N) can easily be evaluated: here it is log(C)/log(C/12) ≈ 1.22 times the size of the final result.

Another idea to reduce the memory size and time is to use a partial factorization of P, Q and T. It is described in [13].

2.2.2 Precomputed powers

In the Chudnovsky series, Q contains a power which can be precomputed separately. So we redefine S as

  S(n_1, n_2) = \sum_{n=n_1+1}^{n_2} \frac{a_n}{c^{\,n-n_1}} \prod_{k=n_1+1}^{n} \frac{p_k}{q_k}.

The recursive relation for T becomes

  T(n_1, n_2) = T(n_1, m)\, Q(m, n_2)\, c^{\,n_2-m} + P(n_1, m)\, T(m, n_2).

For the Chudnovsky formula, we choose

  a_n = (-1)^n (A + Bn)
  p_n = 24 (2n-1)(6n-5)(6n-1)
  q_n = n^3
  c = C^3.

c = C^3/24 would also have been possible, but we believe the overall memory usage would have been larger. c^n only needs to be computed for log_2(N) exponents, so the added memory is kept small.
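Going back to Algorithm 2, the same recursion with the common-factor reduction looks as follows (a sketch: Python's generic math.gcd stands in for the sieve-based factorizations of section 2.2.1, and a(n), p(n), q(n) are the functions of the previous sketch, not the precomputed-power variants):

```python
from math import gcd

def compute2_pqt(n1, n2):
    if n1 == n2 - 1:
        return p(n2), q(n2), a(n2) * p(n2)
    m = (n1 + n2) // 2
    P1, Q1, T1 = compute2_pqt(n1, m)
    P2, Q2, T2 = compute2_pqt(m, n2)
    d = gcd(P1, Q2)               # factor common to P1 and Q2
    P1, Q2 = P1 // d, Q2 // d     # reduce before forming the products
    return P1 * P2, Q1 * Q2, T1 * Q2 + P1 * T2
```

The reduction changes P and Q individually, but the ratio T/Q is preserved, so compute2_pqt can replace compute1_pqt in the previous sketch and yields the same value of π with smaller intermediate integers.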
2.3 Multi-threading

When the operands are small enough, different parts of the binary splitting recursion are executed in different threads to optimally use the CPU cores. When the operands get bigger, a single thread is used, but the large multiplications are multi-threaded.

2.4 Restartability

When operands are stored on disk, each step of the computation is implemented so that it is restartable: if the program is stopped (for example because of a power loss or a hardware failure), it can be restarted from a known synchronization point.

3 Verification

3.1 Binary result

The usual way is to get the same result using a different formula; it is then highly unlikely that a software or hardware bug modifies the result in the same way. Our original plan was to use the Ramanujan formula, which is very similar to the Chudnovsky one but less efficient (8 digits per term instead of 14 digits per term). However, a hard disk failure on a second PC doing the verification delayed it and we did not want to spend more time on it.

Instead of redoing the full computation, we decided to only compute the last binary digits of the result using the Bailey-Borwein-Plouffe algorithm [8], with a formula we found which is slightly faster (and easier to prove) than the original BBP formula [9]. Of course, verifying the last digits only demonstrates that it is very unlikely that the overall algorithm is incorrect; hardware errors could still have modified the previous digits.
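As an illustration of the idea, here is a sketch of digit extraction using the original BBP formula [8], π = Σ_{k≥0} 16^{-k} (4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)); the record computation used a different, slightly faster formula [9] and exact arithmetic at positions far beyond what this double-precision toy can reach.

```python
def bbp_hex_digits(d, ndigits=6):
    """Hexadecimal digits of pi starting d places after the hexadecimal point."""
    def series(j):
        # fractional part of sum_{k>=0} 16^(d-k) / (8k+j)
        s = 0.0
        for k in range(d + 1):
            s = (s + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
        k = d + 1
        while True:
            t = 16.0 ** (d - k) / (8 * k + j)
            if t < 1e-17:
                break
            s = (s + t) % 1.0
            k += 1
        return s
    x = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
    out = []
    for _ in range(ndigits):
        x *= 16
        out.append("%x" % int(x))
        x -= int(x)
    return "".join(out)

print(bbp_hex_digits(0))        # 243f6a  (pi = 3.243f6a88... in hexadecimal)
print(bbp_hex_digits(10 ** 5))  # hex digits starting 100000 places after the point
```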
5 πb N R/B N (truncation to take the high N limbs) where C 2 /B N C 3/2 and C 1 /B N is the result of the binary splitting. If we assume that C 1 and C 2 are random like, then the last digits of the result of the multiplication depend on all the input digits of C 1 and C 2. More precisely, if there is an additive error eb k in C 1 with B < e < B and 0 k < N, then R = (C 1 + eb k )C 2 = C 1 C 2 + ec 2.B k The fact that C 2 has random like digits ensures that digits near the position N in R are modified (multiplying ec 2 by B k can be seen as a left shift by k limbs of ec 2 ). So it remains to prove that the multiplication itself was done without error. To do it, we check the 2N limb result R (before doing the truncation) modulo a small number p : c 1 C 1 mod p c 2 C 2 mod p r 1 c 1 c 2 mod p r 2 R mod p The check consists in verifying that r1 = r2. p is chosen as a prime number so that the modulus depends on all the limbs of the input numbers. Its size was fixed to 64 bit which gives a negligible probability of not catching an error. 3.2 Base conversion The binary to base 10 conversion also needs to be verified. The common way is to convert back the result from base 10 to binary and to verify that it matches the original binary result. We implemented this method, but it would have doubled the computation time. Instead we did a more efficient (but less direct) verification: Let R be the N limb floating point binary result and R 2 be the base 10 result expressed as a N 2 limb floating point number in base B 2 (B 2 = in our case). Assuming that R 2 is normalized between 1/B 2 and 1, we want to verify that RB N 2 2 = R 2B N 2 2 neglecting possible rounding errors. Doing the full computation would need converting back R 2 B N 2 2 to integer, which does not save time. So we check the equality modulo a small prime p. We compute r 1 = RB N 2 2 mod p the usual way (cost of O(M(N))). r 2 = R 2 B N 2 2 mod p is evaluated by considering R 2B N 2 2 as a polynomial in B 2 evaluated modulo p using the Horner algorithm. It has a cost of O(N) assuming that single word operations have a running time of O(1). Then the check consists in verifying that r 1 = r 2. 5
4 Fast multiplication algorithm

4.1 Small sizes

For small sizes (a few limbs), the naive O(n^2) method is used. For larger sizes (up to a few tens of limbs), the Karatsuba method is used (O(n^{log_2(3)}) running time).

4.2 Large sizes (operands fit in RAM)

For larger sizes, the multiplication is implemented as a cyclic convolution of polynomials. The cyclic convolution is implemented with floating point Discrete Fourier Transforms (DFT) [5]. Instead of doing more complicated real-valued DFTs, we do complex DFTs and use a dyadic multiplication instead of the point-wise multiplication. Negacyclic convolutions use a similar trick [2].

To reduce the memory usage, large multiplications giving a result of 2N limbs are split into two multiplications modulo B^N - 1 (cyclic convolution) and modulo B^N + 1 (negacyclic convolution). The two results are combined using the Chinese Remainder Theorem (CRT) to get a multiplication modulo B^{2N} - 1.

Although it is possible to design the floating point DFT convolution with guaranteed error bounds [4] [1], it severely reduces its performance. Instead of doing that, we use the fact that the input numbers are random-like and we use a DFT limb size found empirically. We systematically use a fast modulo (B - 1) checksum for the cyclic convolution to detect most rounding errors. We made the assumption that any rounding errors left would be detected by the verification steps.

A DFT of size N = n_1 n_2 is evaluated by doing n_2 DFTs of size n_1, N multiplications by complex exponentials (usually called twiddle factors) and n_1 DFTs of size n_2. The method can be applied recursively [6]. When N is small, we use n_1 or n_2 = 2, 3, 5 or 7, which gives well known Fast Fourier Transform algorithms using decimation in time or decimation in frequency. Using other factors than 2 allows N to be chosen very close to the actual size of the result, which is important to reduce the memory usage and to increase the speed. When N is large, a better CPU cache usage is obtained by choosing factors so that each DFT fits in the L1 cache. The smaller DFTs are evaluated using several threads to use all the CPU cores. Since the task is mostly I/O bound, it does not scale linearly with the number of CPUs.

Another important detail is how to compute the complex exponentials. For the small DFTs fitting in the L1 cache, a cache is used to store all the possible values. The twiddle factors of the large DFTs are evaluated on the fly because saving them would use too much memory; this comes from the fact that the binary splitting uses multiplications of very different sizes. The twiddle factors are evaluated using multiplication trees, ensuring there is little error propagation.
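The following toy sketch shows the core of the approach, multiplication as a convolution evaluated through a complex floating-point FFT. It deliberately omits the real-input dyadic trick, the modulo (B - 1) checksum and the CRT splitting described above, and uses tiny 4-digit limbs so that double-precision rounding is harmless at these sizes:

```python
import cmath

def fft(a, invert=False):
    n = len(a)                         # n is a power of two here
    if n == 1:
        return a
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def multiply_fft(x, y, base=10 ** 4):
    def limbs(v):                      # split into base-10^4 limbs, least significant first
        out = []
        while v:
            out.append(v % base)
            v //= base
        return out or [0]
    a, b = limbs(x), limbs(y)
    n = 1
    while n < len(a) + len(b):         # zero-pad so the cyclic convolution does not wrap
        n *= 2
    fa = fft([complex(v) for v in a] + [0j] * (n - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (n - len(b)))
    prod = fft([u * v for u, v in zip(fa, fb)], invert=True)
    result, carry = 0, 0
    for i, c in enumerate(prod):
        t = int(round(c.real / n)) + carry     # divide by n to complete the inverse FFT
        carry, digit = divmod(t, base)
        result += digit * base ** i
    return result

assert multiply_fft(3 ** 500, 7 ** 400) == 3 ** 500 * 7 ** 400
```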
The intermediate complex values are stored using 80-bit floating point numbers, which are natively supported by x86 CPUs. All the other floating point computations use 64-bit floating point numbers. All the operations are vectorized using the x86 SSE3 instruction set.

4.3 Huge sizes (operands are on disk)

4.3.1 Main parameters

For huge sizes (hundreds of gigabytes as of year 2009), the main goal was to reduce the amount of disk I/O. A key parameter is what we call the DFT expansion ratio r, which is the ratio between the DFT word size and the DFT limb size. For example, for DFTs of 2^27 complex samples, the expansion ratio is about 4.5. The DFT expansion ratio cannot be smaller than two because the result of the convolution needs at least twice the size of the DFT limb.

A ratio close to 2 can be obtained by using large DFT word sizes. With the hardware of year 2009, it is easier to do it using a Number Theoretic Transform (NTT), which is a DFT in Z/nZ where n is chosen so that there are primitive roots of unity of order N, N being the size of the NTT. A possible choice is n = 2^k + 1, which gives the Schönhage-Strassen multiplication algorithm. An efficient implementation is described in [7]. Another choice is to use several small primes n and to combine them with the CRT. The advantage is that the same multi-threading and cache optimizations as for the floating point DFT are reused, and that it is possible to find an NTT size N very close to the needed result size. The drawback is that the computation involves modular arithmetic on small numbers, which is not as efficient as floating point arithmetic on x86 CPUs of year 2009. However, it turns out that the latest x86 CPUs are very fast at doing 64 bit by 64 bit multiplications giving a 64 bit result. We improved the NTT speed by using the fact that most modular multiplications are modular multiplications by a constant. This trick is attributed to Victor Shoup and is described in section 4.2 of [11].

We chose eight 64-bit moduli, which gives a DFT expansion ratio of about 2.2 for NTTs of a few terabytes. Using more moduli makes it possible to get closer to 2, but it complicates the final CRT step.
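A sketch of that final CRT step: each exact convolution coefficient is recovered from its residues modulo the chosen primes. The three word-sized NTT-friendly primes below are illustrative stand-ins for the eight 64-bit moduli actually used (pow(x, -1, m) computes a modular inverse and needs Python 3.8+):

```python
from functools import reduce

PRIMES = [2013265921, 1811939329, 469762049]    # example primes of the form k*2^j + 1

def crt(residues, moduli):
    M = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)            # standard CRT recombination
    return x % M

coeff = 123456789012345678901234567             # exact coefficient, below the product of the primes
assert crt([coeff % p for p in PRIMES], PRIMES) == coeff
```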
4.3.2 NTT convolution

The NTT size N is split in two factors n_1 and n_2. n_2 is chosen as large as possible with the constraint that an NTT of size n_2 can be done in memory. The method is discussed in the mass storage convolution section of [14]. The resulting convolution is split in 3 steps:

1. n_2 NTTs of size n_1 on both operands, multiplication by the twiddle factors;

2. n_1 NTTs of size n_2 on both operands, pointwise multiplication, n_1 inverse NTTs of size n_2;

3. multiplication by the twiddle factors, n_2 NTTs of size n_1 on the result.

Step (2) involves only linear disk accesses, which are fast. Steps (1) and (3) involve more disk seeks as n_1 gets bigger, because the NTTs must be done using samples separated by n_2 samples. If n_1 is too big, it is split again in two steps, which gives a 5-step convolution algorithm.

For the 3-step algorithm, the amount of disk I/O is 12r + 4, assuming two input operands of unit size and a DFT expansion ratio of r. The needed disk space, assuming the operands are destroyed, is 4r + 1. If the disk space is limited, we use the same CRT trick as for the floating point DFT by doing two multiplications modulo B^N - 1 and B^N + 1. The amount of disk I/O is then 12r + 10, but the needed disk space is lowered to 2r + 2. The disk space can be further reduced at the expense of more I/O by using more than two moduli.

5 Other operations

5.1 Division

The evaluation of the Chudnovsky formula involves a lot of integer divisions because of the GCD trick, so having a good division algorithm is critical. We implemented the one described in [1] using the concept of middle product. Then the floating point inversion has a cost of 2 multiplications and the division a cost of 2.5 multiplications.

5.2 Base conversion

The usual base conversion algorithm for huge numbers involves recursively dividing by powers of the new base. The running time is D(n/2) log_2(n) [1] when n is a power of two, where D(n) is the time needed to divide a 2n-limb number by an n-limb number. We implemented a better algorithm by using the Bernstein scaled remainder tree and the middle product concept, as suggested in exercise 1.32 of [1]. The running time becomes M(n/2) log_2(n).
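For reference, the classical recursive conversion mentioned above is only a few lines (a sketch over Python integers; a real implementation would precompute the powers of 10 and, as described here, replace the divisions with the scaled remainder tree and middle products):

```python
def to_decimal(x, ndigits):
    # write x (with 0 <= x < 10**ndigits) as exactly ndigits decimal characters
    if ndigits <= 32:
        return str(x).zfill(ndigits)
    lo_digits = ndigits // 2
    hi, lo = divmod(x, 10 ** lo_digits)         # split on a power of the new base
    return to_decimal(hi, ndigits - lo_digits) + to_decimal(lo, lo_digits)

n = 7 ** 2000
assert to_decimal(n, len(str(n))) == str(n)
```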
6 Pi computation parameters

We describe the various parameters used to compute 2700 billion decimal digits of π.

6.1 Global memory use

The full π computation, including the base conversion, uses a total memory of 6.42 times the size of the final result. Here the result is 1.12 TB big, hence at least 7.2 TB of mass storage is needed. It is possible to further reduce the memory usage by using more moduli in the large multiplications (see section 4.2).

6.2 Hardware

A standard desktop PC was used, with a Core i7 CPU at 2.93 GHz. This CPU contains 4 physical cores. We put only 6 GiB of RAM in it to reduce the cost. It means the amount of RAM is about 170 times smaller than the size of the final result, so the I/O performance of the mass storage is critical.

Unfortunately, the RAM had no ECC (Error Correcting Code), so random bit errors could be neither corrected nor detected. Since the computation lasted more than 100 days, such errors were likely [12]. Fortunately, the computations included verification steps which could detect such errors.

For the mass storage, five 1.5 TB hard disks were used. The aggregated peak I/O speed is about 500 MB/s.

6.3 Computation time

Computation of the binary digits: 103 days
Verification of the binary digits: 13 days
Conversion to base 10: 12 days
Verification of the conversion: 3 days
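The figures of sections 6.1 and 6.2 can be reproduced with a few lines of arithmetic (the 2700 billion digit count and the 6.42 factor come from the text; the rest is unit conversion):

```python
import math

digits = 2.7e12                                # decimal digits computed
result_bytes = digits * math.log2(10) / 8      # binary size of the result
print(result_bytes / 1e12)                     # ~1.12 TB, the stated result size
print(6.42 * result_bytes / 1e12)              # ~7.2 TB of mass storage needed
print(result_bytes / (6 * 2 ** 30))            # ~174: RAM is about 170x smaller than the result
```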
7 Ideas for future work

Here are some potentially interesting topics:

- Compare the running time and memory usage of the evaluation of the Chudnovsky series described here with the algorithm described in [13].
- See if the Schönhage-Strassen multiplication is faster than the NTT with small moduli when using mass storage.
- Speed up the computations using a network of PCs and/or many-core CPUs.
- Lower the memory usage of the π computation.

8 Conclusion

We showed that a careful implementation of the latest arbitrary-precision arithmetic algorithms allows computations on numbers in the terabyte range with inexpensive hardware resources.

References

[1] Richard Brent and Paul Zimmermann, Modern Computer Arithmetic, version 0.4, November 2009.

[2] Richard Crandall, Integer convolution via split-radix fast Galois transform, Feb 1999.

[3] Hanhong Xue, gmp-chudnovsky.c, program to compute the digits of Pi using the GMP library.

[4] Richard P. Brent, Colin Percival, and Paul Zimmermann, Error bounds on complex floating-point multiplication, Mathematics of Computation, 76(259), 2007.

[5] Arnold Schönhage and Volker Strassen, Schnelle Multiplikation großer Zahlen, Computing 7: 281-292, 1971.

[6] James W. Cooley and John W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comput. 19: 297-301, 1965.

[7] Pierrick Gaudry, Alexander Kruppa, Paul Zimmermann, A GMP-based Implementation of Schönhage-Strassen's Large Integer Multiplication Algorithm, ISSAC '07.

[8] David H. Bailey, Peter B. Borwein, and Simon Plouffe, On the Rapid Computation of Various Polylogarithmic Constants, Mathematics of Computation 66(218), April 1997.

[9] Fabrice Bellard, A new formula to compute the n-th binary digit of Pi, 1997.

[10] D. V. Chudnovsky and G. V. Chudnovsky, Approximations and complex multiplication according to Ramanujan, in Ramanujan Revisited, Academic Press Inc., Boston, 1988.

[11] Tommy Färnqvist, Number Theory Meets Cache Locality - Efficient Implementation of a Small Prime FFT for the GNU Multiple Precision Arithmetic Library, 2005,
2005/rapporter05/farnqvist_tommy_05091.pdf.

[12] Bianca Schroeder, Eduardo Pinheiro, Wolf-Dietrich Weber, DRAM Errors in the Wild: A Large-Scale Field Study, SIGMETRICS/Performance '09.

[13] Howard Cheng, Guillaume Hanrot, Emmanuel Thomé, Eugene Zima, Paul Zimmermann, Time- and Space-Efficient Evaluation of Some Hypergeometric Constants, ISSAC '07.

[14] Jörg Arndt, Matters Computational, draft version of December 31.

Document history

Revision 1: Corrected reference to exercise in [1].
Revision 2: More detail about the sieve in the common factor reduction. Corrected typos.
Revision 3: Added reference to [13]. Changed notations for the binary splitting algorithm. Removed invalid claim regarding the size constraints of the Schönhage-Strassen multiplication. Corrected running time of the base conversion. Corrected typos.
Revision 4: Added reference to [14].