An Overview of Integer Factoring Algorithms
Manindra Agrawal
IITK / NUS

The Problem
Given an integer $n$, find all its prime divisors as efficiently as possible.
A Difficult Problem
No efficient algorithm (that is, one taking time $(\log n)^c$) is known for the problem.
The fastest known algorithm takes time $\exp(c (\log n)^{1/3} (\log\log n)^{2/3})$ with $c \approx 1.9$.
With this, we can factor 140-digit numbers in reasonable time.
It is believed that no efficient algorithm exists.

Useful in Cryptography
The security of the RSA cryptosystem is based on the hardness of factoring.
Several other cryptosystems rely on this problem as well.
We present an overview of the known factoring algorithms.

#1: Trial Division
Divide $n$ by all primes up to $\sqrt{n}$, starting from 2, and collect all divisors.
A very simple algorithm.
Takes time $\exp(\tfrac{1}{2} \log n) = L(1, \tfrac{1}{2})$.
Notation: denote $\exp(c (\log n)^{\varepsilon} (\log\log n)^{1-\varepsilon})$ as $L(\varepsilon, c)$.
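The trial-division loop can be sketched as follows. This is a minimal illustration, not the author's code: the function name `trial_division` is hypothetical, and for simplicity it tries 2 and then every odd candidate rather than only primes. That is harmless, because a composite candidate can never divide what is left once its smaller prime factors have been removed.

```python
def trial_division(n: int) -> list:
    """Return the prime factors of n, with multiplicity, in increasing order."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    d = 2
    # Only candidates up to sqrt(n) are needed: a composite n always has
    # a prime divisor no larger than sqrt(n).
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2  # after 2, try odd candidates only
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors
```

For example, `trial_division(8051)` returns `[83, 97]`. The loop runs up to about $\sqrt{n}$ times, which matches the $\exp(\tfrac{1}{2}\log n)$ bound.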
#2: Pollard's Rho Method
1. Randomly select $x_0 \in \{1, 2, \ldots, n-1\}$, and compute $x_i = x_{i-1}^2 + 1 \pmod{n}$ for $i = 1, 2, \ldots$
2. Compute $\gcd(x_i - x_{2i}, n)$ until a factor is found.
Discovered by J. Pollard in 1975.
Takes time $L(1, \tfrac{1}{4})$.
Used to factorize the eighth Fermat number $2^{2^8} + 1$, a 78-digit number.

Pollard's Rho Shape
[Figure: the sequence $x_0, x_1, x_2, \ldots$ forms a tail leading into a cycle that closes where $x_t = x_m$, giving the shape of the Greek letter rho.]
Analysis
Let $p$ be the smallest prime factor of $n$, so $p \le \sqrt{n}$.
The sequence $x_0, x_1, x_2, \ldots$ behaves randomly modulo $p$.
So the probability that $x_t = x_m \pmod{p}$ for a given pair $t < m$ is roughly $1/p$, and a collision is expected once about $\sqrt{p}$ terms have been generated.
Notice that if $x_t = x_m \pmod{p}$, then $x_{t+k} = x_{m+k} \pmod{p}$ for all $k > 0$.
Therefore, there exists an $s \le m$ with $x_s = x_{2s} \pmod{p}$.
Again using randomness of the sequence, with probability at least ½, $x_s \ne x_{2s} \pmod{n}$.
Therefore, $p \mid \gcd(x_s - x_{2s}, n) < n$.
For good probability of success, we need to generate roughly $\sqrt{p} \le n^{1/4}$ of the $x_i$'s.
So the time complexity is $\exp(\tfrac{1}{4} \log n)$.
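The rho iteration and the comparison of $x_s$ with $x_{2s}$ can be sketched as below. The function name `pollard_rho` and the parameters `max_restarts` and `seed` are illustrative assumptions. Restarting with a fresh $x_0$ when the gcd comes out as $n$, and handling even $n$ separately, are additions not spelled out in the slides.

```python
import math
import random
from typing import Optional


def pollard_rho(n: int, max_restarts: int = 20, seed: Optional[int] = None) -> Optional[int]:
    """Try to find a nontrivial factor of a composite n; return None on failure."""
    if n % 2 == 0:
        return 2
    rng = random.Random(seed)
    for _ in range(max_restarts):
        slow = rng.randrange(1, n)  # plays the role of x_s
        fast = slow                 # plays the role of x_{2s}
        g = 1
        while g == 1:
            slow = (slow * slow + 1) % n   # one step of x -> x^2 + 1
            fast = (fast * fast + 1) % n   # two steps for the fast copy
            fast = (fast * fast + 1) % n
            g = math.gcd(slow - fast, n)
        if g != n:
            return g  # the collision happened modulo a proper divisor of n
        # g == n: the sequences collided modulo every prime factor at once,
        # so try again with a new starting value.
    return None
```

Only two sequence values are kept in memory at any time, which is the practical appeal of comparing $x_s$ with $x_{2s}$ instead of storing the whole sequence.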
#3: Pollard's p-1 Method
1. Fix a factor base = set of all primes $\le B$.
2. Compute $m = \prod_{q \text{ prime},\, q \le B} q^{\log n}$.
3. Compute $\gcd(a^m - 1, n)$ for a random $a$.
Discovered by J. Pollard in 1974.
Takes time $O(B (\log n)^2)$.
Works if a prime $p \mid n$ and $p-1$ has no prime divisor greater than $B$.

Fermat's Little Theorem
If $p$ is prime then for all $a$ with $\gcd(a, p) = 1$, $a^{p-1} = 1 \pmod{p}$.
In other words, the set of numbers $\{ a \mid 0 < a < p \}$ forms a group of size $p-1$ under multiplication modulo $p$.
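A minimal sketch of the three steps of the p-1 method follows. The names `primes_up_to` and `pollard_p_minus_1` are hypothetical. Two choices are assumptions made for the illustration: the exponent $\log n$ is taken as the bit length of $n$, and the base $a$ defaults to 2 rather than being random, so that the result is reproducible. The product $m$ is never formed explicitly; the base is raised to one prime power at a time.

```python
import math
from typing import Optional


def primes_up_to(bound: int) -> list:
    """Sieve of Eratosthenes: all primes <= bound."""
    is_prime = [True] * (bound + 1)
    primes = []
    for q in range(2, bound + 1):
        if is_prime[q]:
            primes.append(q)
            for multiple in range(q * q, bound + 1, q):
                is_prime[multiple] = False
    return primes


def pollard_p_minus_1(n: int, bound: int, a: int = 2) -> Optional[int]:
    """Return a nontrivial factor of n, or None if this bound and base fail."""
    g = math.gcd(a, n)
    if 1 < g < n:
        return g
    e = n.bit_length()  # assumed reading of the exponent "log n"
    x = a % n
    for q in primes_up_to(bound):
        x = pow(x, q ** e, n)  # after the loop, x = a^m mod n
    g = math.gcd(x - 1, n)
    return g if 1 < g < n else None
```

For $n = 299 = 13 \cdot 23$ and $B = 3$ the call `pollard_p_minus_1(299, 3)` returns 13, because $13 - 1 = 2^2 \cdot 3$ has only small prime factors while $23 - 1 = 2 \cdot 11$ does not. With $B = 11$ both $p - 1$ values divide $m$, the gcd equals $n$, and the function returns `None`; this is one way the method can fail.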
Analysis
Suppose a prime $p \mid n$ and $p-1$ has no prime factor greater than $B$.
This implies that $p-1 \mid m$.
So, by Fermat's Little Theorem, $p$ divides $a^m - 1$.
So it might be found when computing $\gcd(a^m - 1, n)$.
Useful only for a subset of numbers $n$.

#4: Elliptic Curve Method
The previous method works only for those $n$ with a prime divisor $p$ such that $p-1$ is a product of small primes.
It is always true that some number $m$ close to $p$ will have this property.
So if we can work with a group of size $m$, instead of $p-1$, the method will work for all numbers.
Elliptic Curves
Elliptic curve $E(a,b)$ has the following form: $y^2 = 4x^3 - ax - b$, with $a^3 - 27b^2 \ne 0$.
The set of points on an elliptic curve forms a group under addition.
We consider elliptic curves modulo $n$.
The number of points on an elliptic curve modulo a prime $p$ ($= \#E_p(a,b)$) is between $p + 1 - 2\sqrt{p}$ and $p + 1 + 2\sqrt{p}$.

[Figure: the curve $y^2 = 4x^3 - 4x$ with points A, B, C, -C, E and F marked. Addition on the curve: A + B = C; E + F = O, the point at infinity.]
Algorithm
1. Fix a factor base = set of all primes $\le B$.
2. Compute $m = \prod_{q \text{ prime},\, q \le B} q^{\log n}$.
3. Choose a random $a$ and $b$ with $a^3 - 27b^2 \ne 0 \pmod{n}$.
4. Choose a random point $P$ on the elliptic curve $E_n(a,b)$.
5. Attempt to compute a factor of $n$ from $mP = O$ (the zero for addition).

Analysis
Similar to Pollard's p-1 method.
If a prime $p \mid n$ and $\#E_p(a,b)$ has no prime divisor $> B$, then $n$ can be factored.
This works for all numbers, since $\#E_p(a,b)$ is randomly distributed between $p + 1 - 2\sqrt{p}$ and $p + 1 + 2\sqrt{p}$.
A careful analysis shows the running time to be $L(\tfrac{1}{2}, 1)$, much better than the earlier methods!
Used to factor the tenth and eleventh Fermat numbers: $2^{2^{10}} + 1$ (309 digits) and $2^{2^{11}} + 1$ (617 digits).
Fastest known algorithm for most numbers.
Discovered by H. Lenstra in 1987.

#5: Fermat's Method
1. Compute $m = \lfloor \sqrt{n} \rfloor$.
2. For $d = 1, 2, 3, \ldots$ do:
   i. Let $x = m + d$ and test if $x^2 - n$ is a perfect square.
   ii. If yes, let $y^2 = x^2 - n$ and factor $n$ using $\gcd(n, x+y)$.
Discovered by P. Fermat in the 17th century.
Works fast if $n$ has two factors close to $\sqrt{n}$.
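Fermat's method is short enough to write out directly. The sketch below is illustrative only; the name `fermat_factor` is hypothetical. It departs from the steps above in two small ways: it starts at the smallest $x$ with $x^2 \ge n$ so that perfect squares are handled, and it returns the pair $(x - y, x + y)$ directly, which is valid because $n = x^2 - y^2 = (x-y)(x+y)$.

```python
import math


def fermat_factor(n: int) -> tuple:
    """Return (s, t) with s * t == n and s <= t, for odd n >= 3."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    x = math.isqrt(n)
    if x * x < n:
        x += 1  # smallest x with x^2 >= n
    while True:
        y_squared = x * x - n
        y = math.isqrt(y_squared)
        if y * y == y_squared:  # x^2 - n is a perfect square
            return (x - y, x + y)
        x += 1
```

For $n = 5959$ the search tries $x = 78, 79, 80$ and stops at $80^2 - 5959 = 441 = 21^2$, giving $5959 = 59 \cdot 101$. For a prime $n$ the loop only stops at the trivial pair $(1, n)$ after about $n/2$ steps, which is why the method is useful only when two factors lie close to $\sqrt{n}$.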
Analysis
Suppose $n = k (k + t)$ with $t$ small compared to $k$.
Then $m = \lfloor \sqrt{n} \rfloor = \lfloor k (1 + t/k)^{1/2} \rfloor \approx k + \tfrac{1}{2}t$.
Notice that with $x = k + \tfrac{1}{2}t$, $x^2 - n = k^2 + kt + \tfrac{1}{4}t^2 - k^2 - kt = (\tfrac{1}{2}t)^2$.
So the right $x$ will be quickly found.

#6: Dixon's Method
Proposed by J. Dixon in 1981.
A simple version of the Morrison-Brillhart method.
Based on Fermat's method.
Aims to find $x$ and $y$ such that $x^2 = y^2 \pmod{n}$.
Algorithm
Data Collection Step:
1. Fix a factor base = set of primes $\le B$.
2. Randomly choose a number $v$ and compute $u = v^2 \pmod{n}$.
3. If $u$ has all prime factors $\le B$, store the pair $(v, u)$.
Do this until about $B$ pairs have been stored.
Data Analysis Step:
1. Let $p_1, p_2, \ldots, p_t$ be the primes $\le B$.
2. Let $u_i = p_1^{e_{i,1}} \cdot p_2^{e_{i,2}} \cdots p_t^{e_{i,t}}$ for every stored $u_i$.
3. Let vector $w_i = [\, e_{i,1} \; e_{i,2} \; \cdots \; e_{i,t} \,]$.
4. Find a linear dependency amongst these vectors over $F_2$: $\sum_i \beta_i w_i = 0 \pmod{2}$.
5. Compute $x = \prod_i v_i^{\beta_i}$.
6. Compute $y = (\prod_i u_i^{\beta_i})^{1/2}$.
7. Factor $n$ as $\gcd(n, x+y)$.
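The two steps can be combined into a small working sketch. It is a simplified illustration under stated assumptions: the names `dixon`, `smooth_exponents` and `primes_up_to` are hypothetical, smoothness is tested by plain trial division, and the linear algebra over $F_2$ is done incrementally (each new exponent vector is reduced against the earlier ones as soon as it is stored) instead of as a separate phase after all pairs are collected.

```python
import math
import random
from typing import Optional


def primes_up_to(bound: int) -> list:
    is_prime = [True] * (bound + 1)
    primes = []
    for q in range(2, bound + 1):
        if is_prime[q]:
            primes.append(q)
            for multiple in range(q * q, bound + 1, q):
                is_prime[multiple] = False
    return primes


def smooth_exponents(u: int, primes: list) -> Optional[list]:
    """Exponent vector of u over `primes`, or None if u is not smooth."""
    if u == 0:
        return None
    exponents = []
    for p in primes:
        e = 0
        while u % p == 0:
            u //= p
            e += 1
        exponents.append(e)
    return exponents if u == 1 else None


def dixon(n: int, bound: int, max_trials: int = 100_000,
          seed: Optional[int] = None) -> Optional[int]:
    rng = random.Random(seed)
    primes = primes_up_to(bound)
    relations = []  # stored pairs (v_i, exponent vector w_i)
    pivots = {}     # leading bit -> (parity mask, bitmask of combined relations)
    for _ in range(max_trials):
        v = rng.randrange(2, n)
        g = math.gcd(v, n)
        if g > 1:
            return g  # lucky draw: v already shares a factor with n
        exponents = smooth_exponents(v * v % n, primes)
        if exponents is None:
            continue
        relations.append((v, exponents))
        mask = sum((e & 1) << j for j, e in enumerate(exponents))  # w_i mod 2
        combo = 1 << (len(relations) - 1)
        while mask:  # Gaussian elimination over F_2 on bitmasks
            top = mask.bit_length() - 1
            if top not in pivots:
                pivots[top] = (mask, combo)
                break
            mask ^= pivots[top][0]
            combo ^= pivots[top][1]
        else:
            # mask reduced to 0: the relations flagged in combo are dependent
            x, totals = 1, [0] * len(primes)
            for k, (v_k, e_k) in enumerate(relations):
                if combo >> k & 1:
                    x = x * v_k % n
                    totals = [s + e for s, e in zip(totals, e_k)]
            y = 1
            for p, s in zip(primes, totals):
                y = y * pow(p, s // 2, n) % n  # every total s is even
            g = math.gcd(x + y, n)
            if 1 < g < n:
                return g
    return None
```

For example, `dixon(84923, bound=30, seed=1)` is expected to return 163 or 521. A dependency can give a trivial gcd (when $x = \pm y \pmod{n}$); the sketch then simply keeps collecting pairs until another dependency appears.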
Analysis
Over the integers, all entries of $\sum_i \beta_i w_i$ are multiples of 2.
So, $\prod_i u_i^{\beta_i} = p_1^{\sum_i \beta_i e_{i,1}} \cdot p_2^{\sum_i \beta_i e_{i,2}} \cdots p_t^{\sum_i \beta_i e_{i,t}}$ is a perfect square.
Since $v_i^2 = u_i \pmod{n}$, we get $x^2 = \prod_i v_i^{2\beta_i} = \prod_i u_i^{\beta_i} = y^2 \pmod{n}$.

Analysis
How quickly can we find the required number of pairs?
Observation:
If $B$ is small, we need to find only a few pairs. But the chances of finding one pair are small.
If $B$ is large, we need to find many pairs. But the chances of finding one pair are high.
Analysis
What is the best value of $B$?
It turns out to be $L(\tfrac{1}{2}, 1/\sqrt{2}) = \exp(\tfrac{1}{\sqrt{2}} (\log n)^{1/2} (\log\log n)^{1/2})$.
With this value, the running time is $L(\tfrac{1}{2}, \sqrt{2}) = \exp(\sqrt{2}\, (\log n)^{1/2} (\log\log n)^{1/2})$.
Not as good as the elliptic curve method.

#7: Quadratic Sieve
Proposed by C. Pomerance in 1981.
A combination of Fermat's method and Dixon's method.
Does the Data Collection step cleverly to reduce time.
The best value of $B$ becomes $L(\tfrac{1}{2}, \tfrac{1}{2})$.
The running time reduces to $L(\tfrac{1}{2}, 1)$.
Outperforms the elliptic curve method on the large numbers that are used in cryptography.
Used to factor the 129-digit RSA challenge in 1994:
RSA-129 = 1143 81625 75788 88676 69235 77997 61466 12010 21829 67212 42362 56256 18429 35706 93524 57338 97830 59712 35639 58705 05898 90751 47599 29002 68795 43541

The Sieving Idea
The $v_i$'s to be tested are chosen from the range $[\sqrt{n}, \sqrt{n} + a]$.
For each $v_i$, we check if $v_i^2 - n$ has all prime divisors $\le B$.
For a prime $q \le B$, if $q$ divides $v^2 - n$, then it will also divide $(v + kq)^2 - n$ and $(kq - v)^2 - n$ for all integers $k$.
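The divisibility observation above means the smooth values of $v^2 - n$ can be located by sieving over the whole range at once, instead of trial-dividing each value separately. A minimal sketch follows, with assumptions labeled: the name `sieve_smooth` and its parameters are hypothetical, the roots of $x^2 = n \pmod{q}$ are found by brute force (reasonable only for small $q$), and each hit is divided by $q$ as many times as possible.

```python
import math


def sieve_smooth(n: int, span: int, primes: list) -> list:
    """Return every v in [ceil(sqrt(n)), ceil(sqrt(n)) + span] such that
    v^2 - n factors completely over `primes`."""
    start = math.isqrt(n)
    if start * start < n:
        start += 1
    # residual[i] starts as v^2 - n for v = start + i and is divided down.
    residual = [v * v - n for v in range(start, start + span + 1)]
    for q in primes:
        # Brute-force roots of x^2 = n (mod q); an assumption for small q.
        roots = [r for r in range(q) if (r * r - n) % q == 0]
        for r in roots:
            first = (r - start) % q  # index of the first v congruent to r mod q
            for i in range(first, span + 1, q):
                while residual[i] and residual[i] % q == 0:
                    residual[i] //= q
    # Entries that were reduced to 1 are smooth over the factor base.
    return [start + i for i, rest in enumerate(residual) if rest == 1]
```

With $n = 15347$, `span=99` and the factor base `[2, 17, 23, 29]`, the result includes $v = 124$, $126$, $127$ and $195$; for instance $127^2 - 15347 = 782 = 2 \cdot 17 \cdot 23$. The inner loops touch only every $q$-th entry, which is where the saving over per-value trial division comes from.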
The Sieving Idea
So, for each $q \le B$, do the following:
Solve the equation $x^2 = n \pmod{q}$ to obtain two solutions, say $\alpha$ and $\beta$.
For every $v$ in the range $[\sqrt{n}, \sqrt{n} + a]$ that is of the form $\alpha + kq$ or $\beta + kq$, divide the value $v^2 - n$ by $q$ as many times as possible.
Once all $q$'s are finished, the values that have become 1 correspond to the useful $v$'s.

The Time Complexity of Factoring
A number of algorithms have time complexity $L(\tfrac{1}{2}, c)$ for constants $c$.
This led to the belief that the optimal complexity for factoring is $L(\tfrac{1}{2}, c)$ for some $c \le 1$.
And then the Number Field Sieve appeared
#8: Number Field Sieve
Proposed by J. Pollard in 1988 and improved by C. Pomerance, H. Lenstra and others.
A generalization of the quadratic sieve to number fields.
The running time is $L(\tfrac{1}{3}, 1.923)$.
Used to factor the ninth Fermat number $2^{2^9} + 1$ (155 digits) and RSA-130 (in 1996).

Number Field Sieve Idea
Select a small degree $d$.
Find a polynomial $f(x)$ of degree $d$ and a number $m$ such that (1) $m \approx n^{1/d}$ and (2) $n$ divides $f(m)$.
Let $\alpha$ be a root of $f(x)$ over the complex numbers.
Consider the ring $Z[\alpha]$, consisting of all complex numbers that can be written as $\sum_j c_j \alpha^j$ with $c_j$ integers.
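The slides do not say how $f$ and $m$ are found. One standard choice, shown here purely as an assumption for illustration, is the base-$m$ expansion: take $m = \lfloor n^{1/d} \rfloor$ and use the base-$m$ digits of $n$ as the coefficients $c_j$, so that $f(m) = n$ exactly. The function names below are hypothetical.

```python
def integer_root(n: int, d: int) -> int:
    """Largest integer r with r**d <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // d + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def base_m_polynomial(n: int, d: int) -> tuple:
    """Return (m, coeffs) where coeffs[j] is c_j and sum(c_j * m**j) == n."""
    m = integer_root(n, d)
    if m < 2:
        raise ValueError("n is too small for this degree")
    coeffs = []
    rest = n
    while rest:
        rest, digit = divmod(rest, m)  # peel off base-m digits, lowest first
        coeffs.append(digit)
    return m, coeffs


def evaluate(coeffs: list, x: int) -> int:
    """Evaluate the polynomial with coefficients coeffs at x (Horner's rule)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

For $n = 45113$ and $d = 3$ this gives $m = 35$ and $f(x) = x^3 + x^2 + 28x + 33$, and indeed $f(35) = 45113$. The coefficients are all smaller than $m \approx n^{1/d}$, which fits the remark that the numbers handled by the number field sieve are smaller than in the quadratic sieve.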
Define a map $\Psi$ from $Z[\alpha]$ to $Z/nZ$, the ring of residues modulo $n$, as: $\Psi(\sum_j c_j \alpha^j) = \sum_j c_j m^j \pmod{n}$.
Clearly, $0 = \Psi(f(\alpha)) = f(m) = 0 \pmod{n}$, and so $\Psi$ is a ring homomorphism.
Now find a sequence of pairs $(u_i, v_i)$ and a sequence of exponents $\beta_i$ such that:
1. $\prod_i (u_i - m v_i)^{\beta_i}$ is a square in $Z$.
2. $\prod_i (u_i - \alpha v_i)^{\beta_i}$ is a square in $Z[\alpha]$.

Then,
$$
\begin{aligned}
x^2 &= \prod_i (u_i - m v_i)^{\beta_i} \\
    &= \prod_i \Psi(u_i - \alpha v_i)^{\beta_i} \pmod{n} \\
    &= \Psi\Big(\prod_i (u_i - \alpha v_i)^{\beta_i}\Big) \pmod{n} \\
    &= \Psi(g^2(\alpha)) \pmod{n} \\
    &= [\Psi(g(\alpha))]^2 \pmod{n} \\
    &= g(m)^2 = y^2 \pmod{n}.
\end{aligned}
$$
Pairs $(u_i, v_i)$ satisfying the first condition can be found as before.
The second condition is trickier. Using some additional ideas, it can be done.
Time complexity reduces because the numbers that we work with are now smaller ($\approx n^{1/d}$ instead of $n^{1/2}$).

#9: Shor's Algorithm
Proposed by P. Shor in 1994.
Works on quantum computers only.
The running time is $L(0, 3) = (\log n)^3$!!
If one can build quantum computers, factoring would become easy.
As of now, we do not know
Remarks
The Number Field Sieve is the fastest known general-purpose algorithm.
Is that the best possible?
Perhaps not.
Where does one find better algorithms?