Library (versus Language) Based Parallelism in Factoring: Experiments in MPI Dr. Michael Alexander Dr. Sonja Sewera Talk 2007-10-19 Slide 1 of 20
Primes
Definitions
  Prime: a whole number n > 1 is a prime number if the only numbers that divide n are 1 and n itself
  Composite number: a whole number with factors other than 1 and itself
  Prime factor: a factor that is prime
  Relatively prime (two whole numbers): not sharing a common whole-number divisor > 1
Slide 2 of 20
Prime Factorization
Unique Factorization Theorem
Theorem: Every $n \in \mathbb{N}$ with $n > 1$ can be constructed in exactly one way, up to rearrangement, as a product of distinct prime factors $p_i$ with prime factor orders $\alpha_i$:
  $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$
Proof (existence): If $n$ is prime, then it is already written as a product of primes. If $n$ is not prime, then $n = n_1 n_2$ with $2 \le n_1 < n$ and $2 \le n_2 < n$. By induction, $n_1$ and $n_2$ can be constructed as products of primes, so $n$ can be as well.
Slide 3 of 20
Prime Factorization II
Factorization is well suited for cryptography
  One-way function (easy to compute, computationally hard to reverse)
  Uniqueness property of the factors $p_i$
RSA cryptosystem outline:
  Prime factors $p$, $q$ with $n = p \cdot q$
  $e$, $d$ chosen to satisfy (1), see [7]:
    $e \cdot d \bmod \varphi(n) = 1$    (1)
  $\varphi$: Euler's totient function
  Public key: tuple $(e, n)$; private key: $(d, n)$
  Encrypt message $m$ with $c = m^e \bmod n$; decrypt ciphertext $c$ with $m = c^d \bmod n$    (2)
Slide 4 of 20
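The outline above can be tried with toy numbers. The following is a minimal sketch, assuming small textbook primes (p = 61, q = 53, e = 17) chosen purely for illustration; the function names `make_keys`, `encrypt` and `decrypt` are hypothetical and not taken from the talk. Real RSA uses much larger primes and padding.

```python
from math import gcd

def make_keys(p, q, e):
    """Build a toy RSA key pair from primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)            # Euler's totient of n = p * q
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    d = pow(e, -1, phi)                # solves e * d mod phi(n) = 1 (Python 3.8+)
    return (e, n), (d, n)              # (public key, private key)

def encrypt(m, public_key):
    e, n = public_key
    return pow(m, e, n)                # c = m^e mod n

def decrypt(c, private_key):
    d, n = private_key
    return pow(c, d, n)                # m = c^d mod n

public_key, private_key = make_keys(61, 53, 17)
ciphertext = encrypt(65, public_key)
print(private_key, decrypt(ciphertext, private_key))
```

The modular inverse is computed with Python's three-argument `pow`, which stands in for the extended Euclidean algorithm.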
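Prime Factorization III
RSA factoring problem
  Finding the $e$-th root of $c = m^e \bmod n$ by factoring the modulus $n$
  $\varphi$ is a multiplicative function; for distinct primes $p$, $q$:
    $\varphi(p \cdot q) = \varphi(p) \cdot \varphi(q)$    (3)
  $\varphi(p) = p - 1$, as all positive whole numbers $< p$ are relatively prime to $p$
  Substituting $\varphi(p)$ with $(p - 1)$:
    $\varphi(p \cdot q) = (p - 1) \cdot (q - 1)$    (4)
  Calculate $d$ via (1), then recover $m$ via (2)
Slide 5 of 20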
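The attack path on this slide can be sketched in a few lines: factor n, rebuild the totient from (p - 1)(q - 1), derive d, and decrypt. This is only an illustration with a toy modulus; the function name `recover_message` and the numbers (n = 3233, e = 17, m = 65) are assumptions, and the naive factor search is only feasible because n is tiny.

```python
def recover_message(c, e, n):
    """Recover m from c = m^e mod n by factoring n (toy-sized n only)."""
    # Naive search for the smallest nontrivial divisor of n.
    p = next(k for k in range(2, n) if n % k == 0)
    q = n // p
    phi = (p - 1) * (q - 1)      # phi(p * q) = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # e * d mod phi(n) = 1 (Python 3.8+)
    return pow(c, d, n)          # m = c^d mod n

n, e = 3233, 17
c = pow(65, e, n)                # ciphertext as seen by the attacker
print(recover_message(c, e, n))
```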
Factoring Algorithms
Factoring problems are hard problems
  No polynomial-time classical (non-quantum) algorithm, i.e. one running in $O(b^k)$, is known, with $b$ as the bit length of the composite
  Shor's Algorithm (quantum) runs in $O(b^3)$
  A proposed factorization can be checked in polynomial time (the problem is in NP); whether the problem is in P is unknown
  Brute force for e.g. a 2048-bit key length: checking all candidate factors from 1 to $2^{2048} - 1$
Slide 6 of 20
Factoring Algorithms II
Incomplete list of factoring algorithms (see [8])
  Brent's Algorithm
  Direct Search Factorization (Trial Division): divide $n$ by every prime $\le \sqrt{n}$; the primes that divide without remainder are factors (proof: see [9])
  Dixon's Factorization Method
  Elliptic Curve Factorization Method
  Euler's Factorization Method
  (General) Number Field Sieve
  Pollard's ρ (improved by Brent)
  Quadratic Sieve
  Shor's Algorithm, ...
Slide 7 of 20
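Direct search factorization is short enough to sketch in full. The version below is an illustrative assumption, not the talk's Objective-C or C code: it divides by 2 and then by every odd candidate rather than by a precomputed list of primes, which gives the same result because composite candidates can no longer divide the remainder once their prime factors have been removed. The name `trial_division` is hypothetical.

```python
def trial_division(n):
    """Return the prime factors of n (with multiplicity) in ascending order."""
    factors = []
    d = 2
    while d * d <= n:                 # candidates up to sqrt of the remainder
        while n % d == 0:             # d divides without remainder: a factor
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2       # after 2, try odd candidates only
    if n > 1:                         # whatever is left is itself prime
        factors.append(n)
    return factors

print(trial_division(13082761331670030))
```

The example input is the product of the first 14 primes, so the run terminates almost immediately; for composites with two large prime factors the loop becomes infeasible.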
Factoring Experiments
Synopsis
  Experiments were carried out with trial division and with variations of the Brent version of Pollard's ρ algorithm, with and without MPI
Environments
  Experiments: Gescher Cluster (pre-WU Cluster)
  Experiments: Xen virtual machine environment
  Staging/experiments: Mac Pro with ICC and MPI; WU Cluster test.q queue
  Data runs: WU Cluster; LAM/MPI 7.2, GCC4; Sun GridEngine scheduler
Slide 8 of 20
Staging Notes
ICC flags are simple
  -O3 -parallel plus architecture flag (?) finds loop parallelism (100% CPU over all cores)
OpenMP is convenient
  With -openmp, adding #pragma omp parallel for to the main loop reaches 100% CPU on each of the 4 Mac Pro cores, without any modifications other than to the init parameters
MPI is comparatively hard
Slide 9 of 20
Pollard's ρ, Version by Brent
Proposed by Pollard [6], followed by Brent [3]
Based on detecting cycles in a pseudo-random sequence mod $n$, typically generated by a polynomial of the form
  $f(x) = x^2 + a \bmod n$
and on the birthday paradox:
  $P(x \bmod s = y \bmod s) = 0.5$ after about $1.177 \sqrt{s}$ numbers have been chosen
Little implicit parallelism, high sequential component
Upper bound on running time for finding a prime factor $p$:
  $O\left(p^{1/2} (\log n)^2\right)$
Slide 10 of 20
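The birthday-paradox estimate can be checked numerically. The sketch below assumes uniformly random draws from s equally likely values and uses the constant $\sqrt{2 \ln 2} \approx 1.177$ for a collision probability of 0.5; the function names are hypothetical.

```python
from math import ceil, log, sqrt

def collision_probability(k, s):
    """Exact probability that k uniform draws from s values contain a repeat."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (s - i) / s     # the i-th draw must avoid i earlier values
    return 1.0 - p_distinct

def draws_for_half(s):
    """Approximate number of draws for a 50% collision chance."""
    return ceil(sqrt(2 * log(2) * s))

k = draws_for_half(365)
print(k, round(collision_probability(k, 365), 4))
```

For Pollard's ρ the relevant s is the unknown prime factor p, which is why the expected work grows with $\sqrt{p}$ rather than with n.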
Pollard's ρ, Version by Brent II
Algorithms in pseudo-code
Pollard's ρ [6]
  x := 2; y := x; d := 1;
  while d = 1 do
  begin
    x := f(x); y := f(f(y));
    d := GCD(|x - y|, n)
  end;
  if d = n then {failure} else {success}.
Version by Brent [3]
  y := x0; r := 1; q := 1;
  repeat
    x := y;
Slide 11 of 20
Pollard's ρ, Version by Brent III
    for i := 1 to r do y := f(y);
    k := 0;
    repeat
      ys := y;
      for i := 1 to min(m, r - k) do
      begin
        y := f(y); q := q × |x - y| mod N
      end;
      G := GCD(q, N); k := k + m
    until (k >= r) or (G > 1);
    r := 2 × r
  until G > 1;
  if G = N then
    repeat
      ys := f(ys); G := GCD(|x - ys|, N)
    until G > 1;
  if G = N then {failure} else {success}.
Slide 12 of 20
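Both pseudo-code listings translate almost line by line into runnable code. The following is a sketch under stated assumptions, not the implementation used in the experiments: the polynomial is f(x) = x² + a mod n with a = 1, the start value is 2, the batch size m = 100 is an arbitrary choice, and the names `pollard_rho` and `brent_rho` are hypothetical. Both functions return a nontrivial factor or `None` on failure.

```python
from math import gcd

def pollard_rho(n, a=1, x0=2):
    """Pollard's rho with Floyd cycle detection (y moves twice as fast as x)."""
    x = y = x0
    d = 1
    while d == 1:
        x = (x * x + a) % n
        y = (y * y + a) % n
        y = (y * y + a) % n
        d = gcd(abs(x - y), n)
    return None if d == n else d

def brent_rho(n, a=1, x0=2, m=100):
    """Brent's variant: power-of-two cycle search, GCDs batched m at a time."""
    def f(v):
        return (v * v + a) % n

    y, r, q, g = x0, 1, 1, 1
    x = ys = y
    while g == 1:
        x = y
        for _ in range(r):                     # advance y by r steps
            y = f(y)
        k = 0
        while k < r and g == 1:
            ys = y                             # remember start of this batch
            for _ in range(min(m, r - k)):
                y = f(y)
                q = q * abs(x - y) % n         # accumulate differences
            g = gcd(q, n)                      # one GCD per batch
            k += m
        r *= 2
    if g == n:                                 # batch overshot: redo step by step
        while True:
            ys = f(ys)
            g = gcd(abs(x - ys), n)
            if g > 1:
                break
    return None if g == n else g

print(pollard_rho(8051), brent_rho(8051))
```

The batching of differences into q is what saves most of the GCD computations in Brent's version; the final backtracking loop from ys only runs when a batch happened to absorb all prime factors at once.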
Approach to parallelization
Different polynomials for each worker node -> processor count bound by number of primes + slack
Probability that the values mod $c$ are still distinct after $k$ steps, with $C$ processors running different sequences:
  $\exp\left(-\frac{C k^2}{2c}\right)$
With a speed-up of $O(\sqrt{C})$
Slide 13 of 20
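The square-root speed-up follows from solving the no-collision estimate for k. The sketch below assumes the estimate has the form exp(-C k² / (2c)), with c the number of possible values and C the number of processors, and asks how many steps per processor are needed before the no-collision probability drops to 0.5; the function names are hypothetical.

```python
from math import exp, log, sqrt

def no_collision_probability(k, c, C):
    """Estimated probability of no collision after k steps on each of C processors."""
    return exp(-C * k * k / (2 * c))

def steps_for_half(c, C):
    """Steps per processor until the no-collision probability falls to 0.5."""
    return sqrt(2 * log(2) * c / C)

c = 10**6
for C in (1, 4, 16):
    speed_up = steps_for_half(c, 1) / steps_for_half(c, C)
    print(C, round(steps_for_half(c, C), 1), round(speed_up, 2))
```

Quadrupling the processor count only halves the steps per processor, which is why the approach scales as the square root of C rather than linearly.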
Implementation
Experimental implementations of
  Direct Search Factorization, multiple, in Objective-C and C
  Pollard's ρ in C++
GNU Multiple Precision Arithmetic Library (GMP)
  factorize.c, included with GMP, as foundation
  Arithmetic functions changed to GMP
MPI augmented with the Master-Worker pattern
  mpz_get_str (buf,10,t);
  ...
  MPI_Send(buf,256,MPI_CHAR,Status.MPI_SOURCE,PIPE_MSG,MPI_COMM_WORLD);
Slide 14 of 20
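The master-worker idea, with one polynomial per worker, can be illustrated without MPI. The sketch below is an assumption-laden stand-in: threads from Python's standard library replace MPI ranks, each worker runs a simple Pollard ρ with its own constant a, and the master returns the first factor reported. The names `rho_worker` and `master` and the step limit are hypothetical, and Python threads give no real speed-up for this CPU-bound loop; only the communication pattern is shown.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from math import gcd

def rho_worker(n, a, max_steps=100_000):
    """Pollard's rho with polynomial x^2 + a; returns a factor or None."""
    x = y = 2
    for _ in range(max_steps):
        x = (x * x + a) % n
        y = (y * y + a) % n
        y = (y * y + a) % n
        d = gcd(abs(x - y), n)
        if d == n:                    # this polynomial failed
            return None
        if d > 1:                     # nontrivial factor found
            return d
    return None

def master(n, workers=4):
    """Hand each worker a different polynomial; return the first factor found."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(rho_worker, n, a) for a in range(1, workers + 1)]
        for future in as_completed(futures):
            factor = future.result()
            if factor is not None:
                return factor
    return None

print(master(8051))
```

In the MPI setting the `future.result()` step corresponds to the master receiving a factor string from whichever worker finishes first.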
Data Runs
Brent without MPI
Queue: node.q
Composite: 999999999342000000180561999973231526002321511667880 89724222147059023818699172316046
Prime factors: {3689965796251, 999999999877, 999999999961, 999999999989, 999999999857, 999999999899, 999999999863, 999999999959, 999999999937}
Slide 15 of 20
Data Runs II
[Figure: runtime (axis ticks 0 to 8000) versus number of processors (axis ticks 2 to 12)]
Slide 16 of 20
Data Runs III
Brent with MPI
Queue: node.q
Composite: 13082761331670030
14 prime factors: {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43}
Slide 17 of 20
Data Runs IV
Good agreement with Brent's speed-up prediction
[Figure: runtime (axis ticks 5000 to 35000) plotted against axis ticks 2 to 12; label: primes]
Slide 18 of 20
Sources
[1] Alexander, M. Netzwerke und Netzwerksicherheit. Heidelberg: Hüthig Telekommunikation, 2006.
[2] Barnes, C. Integer Factorization Algorithms. Technical Report, Department of Physics, Oregon State University, 2004.
[3] Brent, R. An Improved Monte Carlo Factorization Algorithm. Nordisk Tidskrift for Informationsbehandlung (BIT) 20, 176-184, 1980.
[4] Chen, W. Discrete Mathematics. Notes, Macquarie University Dept. of Mathematics, 2002.
[5] Sewera, S. Prime factorization and parallelization of Pollard's rho algorithm. Bachelor Thesis, WU Wien, 2007.
Slide 19 of 20
[6] Pollard, J. M. A Monte Carlo Method for Factorization. Nordisk Tidskrift for Informationsbehandlung (BIT) 15, 331-334, 1975.
[7] Rivest, R., Shamir, A. and Adleman, L. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, Vol. 21 (2), pp. 120-126, 1978.
[8] Weisstein, E. Prime Factorization Algorithms. From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/PrimeFactorizationAlgorithms.html
[9] Weisstein, E. Direct Search Factorization. From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/DirectSearchFactorization.html
Slide 20 of 20