Cryptography and Network Security Spring 2012 http://users.abo.fi/ipetre/crypto/ Lecture 7: Public-key cryptography and RSA Ion Petre Department of IT, Åbo Akademi University 1
Some unanswered questions on symmetric cryptosystems Key management: changing the secret key or establishing one is nontrivial Change the keys two users share (should be done reasonably often) Establish a secret key with somebody you do not know and cannot meet in person: (e.g., visiting secure websites such as e-shops) This could be done via a trusted Key Distribution Center (details in a future lecture) Can (or should) we really trust the KDC? What good would it do after all to develop impenetrable cryptosystems, if their users were forced to share their keys with a KDC that could be compromised by either burglary or subpoena? Diffie, 1988 Digital signatures 2
A breakthrough idea Rather than having a secret key that the two users must share, each users has two keys One key is secret and he is the only one who knows it The other key is public and anyone who wishes to send him a message uses that key to encrypt the message Diffie and Hellman first (publicly) introduced the idea in 1976 this was radically different than all previous efforts NSA claims to have known it since mid-1960s! Communications-Electronic Security Group (the British counterpart of NSA) documented the idea in a classified report in 1970 3
A word of warning Public-key cryptography complements rather than replaces symmetric cryptography There is nothing in principle to make public-key crypto more secure than symmetric crypto Public-key crypto does not make symmetric crypto obsolete: it has its advantages but also its (major) drawbacks such as speed Due to its low speed, it is mostly confined to key management and digital signatures 4
The idea of public-key cryptography The concept was proposed in 1976 by Diffie and Hellman although no practical way to design such a system was suggested Each user has two keys: one encryption key that he makes public and one decryption key that he keeps secret Clearly, it should be computationally infeasible to determine the decryption key given only the encryption key and the cryptographic algorithm Some algorithms (such as RSA) satisfy also the following useful characteristic: Either one of the two keys can be used for encryption the other one should then be used to decrypt the message First we will investigate the concept with no reference yet to practical design of a public-key system 5
Essential steps in public-key encryption Each user generates a pair of keys to be used for encryption and decryption Each user places one of the two keys in a public register and the other key is kept private If B wants to send a confidential message to A, B encrypts the message using A s public key When A receives the message, she decrypts it using her private key Nobody else can decrypt the message because that can only be done using A s private key Deducing a private key should be infeasible If a user wishes to change his keys generate another pair of keys and publish the public one: no interaction with other users is needed 6
Bob sends an encrypted message to Alice 7
Some notation The public key of user A will be denoted KU A The private key of user A will be denoted KR A Encryption method will be a function E Decryption method will be a function D If B wishes to send a plain message X to A, then he sends the cryptotext Y=E(KU A,X) The intended receiver A will decrypt the message: D(KR A,Y)=X 8
A first attack on the public-key scheme authenticity Immediate attack on this scheme: An attacker may impersonate user B: he sends a message E(KU A,X) and claims in the message to be B A has no guarantee this is so This was guaranteed in classical cryptosystems simply through knowing the key (only A and B are supposed to know the symmetric key) The authenticity of user B can be established as follows: B will encrypt the message using his private key: Y=E(KR B,X) This shows the authenticity of the sender because (supposedly) he is the only one who knows the private key The entire encrypted message serves as a digital signature Note: this may not be the best possible solution: ideally, digital signatures should be rather small so that one can preserve many of them over a long period of time Better schemes will be presented a couple of lectures on 9
A scheme to authenticate the sender of the message 10
Encryption and authenticity Still a drawback: the scheme on the previous slide authenticate but does not ensure security: anybody can decrypt the message using B s public key One can provide both authentication and confidentiality using the public-key scheme twice: B encrypts X with his private key: Y=E(KR B,X) B encrypts Y with A s public key: Z=E(KU A,Y) A will decrypt Z (and she is the only one capable of doing it): Y=D(KR A,Z) A can now get the plaintext and ensure that it comes from B (he is the only one who knows his private key): decrypt Y using B s public key: X=E(KU B,Y) 11
Secrecy and authentication using public-key schemes 12
Applications for public-key cryptosystems 1. Encryption/decryption: sender encrypts the message with the receiver s public key 2. Digital signature: sender signs the message (or a representative part of the message) using his private key 3. Key exchange: two sides cooperate to exchange a secret key for later use in a secret-key cryptosystem 13
Requirements for public-key cryptosystems Generating a key pair (public key, private key) is computationally easy Encrypting a message using a known key (his own private or somebody else s public) is computationally easy Decrypting a message using a known key (his own private or somebody else s public) is computationally easy Knowing the public key, it is computationally infeasible for an opponent to deduce the private key Knowing the public key and a ciphertext, it is computationally infeasible for an opponent to deduce the private key Useful extra feature: encryption and decryption can be applied in any order: E( KU A, D(KR A,X) ) =D(KR A, E( KU A, X) ) 14
Designing a public-key cryptosystem Computationally easy usually means polynomial-time algorithm Computationally infeasible more difficult to define Usually means super-polynomial-time algorithms, e.g., exponential-time algorithms Classical complexity analysis (worst-case complexity or average-case complexity) are worthless in cryptography one should make sure a problem is difficult for virtually all inputs and not just in the worse or in the average case Public-key cryptosystems usually rely on difficult math functions rather than S-P networks as classical cryptosystems One-way function: easy to calculate in one direction, infeasible to calculate in the other direction (i.e., the inverse is infeasible to compute) Trap-door function: difficult function that becomes easy if some extra information is known Aim: find a trap-door one-way function for encryption decryption will be the inverse 15
RSA One of the first proposals on implementing the concept of public-key cryptography was that of Rivest, Shamir, Adleman 1977: RSA The RSA scheme is a block cipher in which the plaintext and the ciphertext are integers between 0 and n-1 for some fixed n Typical size for n is 1024 bits (or 309 decimal digits) To be secure with today s technology size should between 1024 and 2048 bits Idea of RSA: it is a difficult math problem to factorize (large) integers Choose p and q odd primes, n=pq Choose integers d,e such that M ed =M mod n, for all M<n Plaintext: block of k bits, where 2 k <n 2 k+1 can be considered a number M with M<n Encryption: C=M e mod n Decryption: C d mod n = M de mod n = M Public key: KU={e,n} Private key: KR={d,n} Questions: How do we find d,e? How do we find large primes? Answer: Number Theory! 16
Motto for our introduction to Number Theory The Devil said to Daniel Webster: "Set me a task I can't carry out, and I'll give you anything in the world you ask for." Daniel Webster: "Fair enough. Prove that for n greater than 2, the equation a n + b n = c n has no non-trivial solution in the integers." They agreed on a three-day period for the labour, and the Devil disappeared. At the end of three days, the Devil presented himself, haggard, jumpy, biting his lip. Daniel Webster said to him, "Well, how did you do at my task? Did you prove the theorem?' "Eh? No... no, I haven't proved it." "Then I can have whatever I ask for? Money? The Presidency?' "What? Oh, that of course. But listen! If we could just prove the following two lemmas " The Mathematical Magpie, Clifton Fadiman 17
Notions of number theory Fermat s little theorem: if p is prime and a is positive integer not divisible by p, then a p-1 1 mod p Corollary: For any positive integer a and prime p, a p a mod p Comments: Example: This is a first step in our quest to find M ed =M mod n not quite enough though Fermat s little theorem provides a necessary condition for an integer p to be prime the condition is not sufficient We will turn this theorem into a (probabilistic) test for primality p=5, a=3, 3 5 =243=3 mod 5 p=5, a=10, 10 5 =100000=10 mod 5 = 0 mod 5 Fermat s theorem, as useful as it will turn out to be, it does not provide us with integers d,e we are looking for Euler s theorem (a refinement of Fermat s) does 18
Euler s totient function Euler s function associates to any positive integer n a number φ(n): the number of positive integers smaller than n and relatively prime to n Example: φ(37)=36 φ(p)=p-1, for any prime p φ(35)=24: {1,2,3,4,6,8,9,11,12,13,16,17,18,19,22,23,24,26,27,29,31,32,33,34} Easy to see that for any two distinct primes p,q, φ(pq)=(p-1)(q-1) All numbers smaller than pq are relatively primes with pq except for multiples of p (q-1 of them) and multiples of q (p-1 of them): pq-(q-1)-(p-1)=(p-1)(q-1) Euler s theorem: for any relatively prime integers a,n we have a φ(n) 1 mod n Corollary: For any integers a,n we have a φ(n)+1 a mod n Corollary: Let p,q be two odd primes and n=pq. Then: φ(n)=(p-1)(q-1) For any integer m with 0<m<n, m (p-1)(q-1)+1 m mod n For any integers k,m with 0<m<n, m k(p-1)(q-1)+1 m mod n 19
Back to RSA Euler s theorem provides us the numbers d,e such that M ed =M mod n We have to choose d,e such that ed=kφ(n)+1 for some k Equivalently, d e -1 mod φ(n) To calculate the modular inverse of an interger: the extended Euclid s algorithm! see Lecture 5 The RSA scheme Key generation Choose two odd primes p,q keep private. Compute n=pq make public Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 make public Compute d e -1 mod φ(n) keep private Private key is {d,n} Public key is {e,n} Encryption Plaintext: block of k bits, where 2 k <n 2 k+1 can be considered a number M with M<n Ciphertext: C=M e mod n Decryption: Ciphertext: C Plaintext: C d mod n = M de mod n = M 20
Example Key generation Select primes p=17, q=11 Compute n=pq=187 Compute φ(n)=(p-1)(q-1)=160 Select e=7 Compute d: d=23 (use the extended Euclid s algorithm) KU={7,187} KR={23,187} Encrypt M=88: 88 7 mod 187 88 7 mod 187 = [ (88 4 mod 187)(88 2 mod 187) (88 mod 187) ] = 11 Decrypt C=11: 11 23 mod 187 M=11 23 mod 187= [ (11 16 mod 187)(11 4 mod 187) (11 2 mod 187)(11 mod 187)] 11 2 mod 187 =121 11 4 mod 187= 121 2 mod 187=55 11 8 mod 187=55 2 mod 187= 33 11 16 mod 187=33 2 mod 187=154 M=154 x 55 x 121 x 11 mod 187 = 88 RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 21
Computational aspects RSA implementation Fast modular exponentiation Take each step in turn and discuss how can it be implemented efficiently For encryption and decryption we must be able to do quick modular exponentiations two ideas are useful: (ab mod n) = [(a mod n)(b mod n)] (mod n) To compute x 16 mod n we do not have to do 15 multiplication but only 4: compute x 2 mod n, x 4 mod n, x 8 mod n, x 16 mod n Apply this to compute quickly any exponent, not just powers of 2 RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 22
Fast modular exponentiation Square-and-multiply algorithm Input: n,x,b (b is in base 2 (b k-1,,b 1,b 0 ), b 0 Output: x b mod n 1. z=1 2. for i=k-1 downto 0 3. z=z 2 mod n 4. if b i =1 then z=zx mod n Complexity O(r 3 ), where r=[log 2 n] Example: encrypt 9726 with KU={3533,11413}: 9726 3533 mod 11413 3533=(1,1,0,1,1,1,0,0,1,1,0,1) Ciphertext: 5761 RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M i b i z 11 1 9726 10 1 9726 2 x9726=2659 9 0 2659 2 =5634 8 1 5634 2 x9726=9167 7 1 9167 2 x9726=4958 6 1 4958 2 x9726=7783 i b i z 5 0 7783 2 =6298 4 0 6298 2 =4629 3 1 4629 2 x9726=10185 2 1 10185 2 x9726=105 1 0 105 2 =11025 0 1 11025 2 x9726=5761 23
Computational aspects RSA implementation Key generation The highlighted part in the algorithm is easy to implement Generate a series of random numbers and test each against φ(n) for relative primality Testing whether or not two integers are relatively prime and finding a modular inverse can be done with the extended Euclid s algorithm Very few tests are needed before a usable e is found: the probability that two random numbers are relatively prime is 0.6 RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 24
Computational aspects RSA implementation Key generation No practical techniques to yield large prime numbers Procedure: generate random odd numbers and test whether that integer is prime Testing whether or not an integer n is a prime is a difficult problem ( primality is difficult ) There has been a long standing question in math whether or not primality can be tested in polynomial deterministic time Answer (2002): YES! Manindra Agrawal, Neeraj Kayal and Nitin Saxena, PRIMES is in P, Ann. of Math. (2), 160:2 (2004) 781--793. Drawback: high complexity O(log 12 n f(log log n)), where f is a polynomial RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 25
Miller-Rabin primality test Faster methods of testing primality exist they are all probabilistic Such an algorithm can give two answers to the question Is n prime? 1. No, it is not 2. n is probably prime The probability can be made arbitrarily large Other algorithms may give precise answer but with low probability they may take a long time to finish Most popular primality test: Miller- Rabin, based on Fermat s little theorem 26
Miller-Rabin primality test Fermat s little theorem: if p is prime and a is positive integer not divisible by p, then a p-1 1 mod p Idea of the Miller-Rabin test: We need to test if the odd integer n is prime: test the equality in Fermat s little theorem for n and a random a A speedup may be done so that we do not have to compute all powers of a details bellow n-1 is even, i.e., of the form n-1=2 k q, with k>0, q odd: k and q easy to find Choose an integer a such that 1<a<n-1 Compute modulo n the values a 2j q, 0 j q: a q, a 2q,, a 2k-1 q, a 2k q By Fermat s theorem, if n is prime, then the last value in the sequence is 1 the sequence may have some other 1s, consider the first 1 in the sequence Case 1: the first number in the sequence is 1 then all other powers are also 1 Case 2: some number a 2j q in the sequence is 1 in this case a 2j-1 q = n-1 mod n 0 = (a 2j q -1) mod n = (a 2j-1 q 1) (a 2j-1 q + 1) mod n, i.e., n divides (a 2j-1 q 1) or (a 2j-1 q + 1) Since we took the first 1 in the sequence, it follows that n divides (a 2j-1 q + 1): a 2j-1 q = n-1 mod n The test: if either the first element in the sequence is 1, or some other element is n-1, then n could be prime. Otherwise n is certainly not prime 27
Miller-Rabin primality test TEST(n) 1. n-1=2 k q: compute k and q 2. Select a random integer a, 1<a<n-1 3. If a q mod n=1 then return probably prime 4. For j=0 to k-1 do 5. If a 2j q mod n = n-1, then return probably prime 6. Return not a prime Question: for how many integers a does the test fail? Failure: n is not prime but the algorithm return probably prime Answer: for at most (n-1)/4 integers a with 1 a n-1 Thus, the probability of failure is at most ¼ Practical implementation: Repeatedly invoke TEST(n) using random choices for a If TEST(n) return at least once not a prime, then n is not a prime If t executions of TEST(n) return probably prime, then the probability that n is indeed a prime is larger than 1-4 -t t=10 gives probability larger than 0.999999 28
Computational aspects RSA implementation Key generation To choose primes p,q we generate random numbers p,q on the desired scale of magnitude and test the primality with Miller-Rabin Question: How many trials should we expect to do before we find a prime? Distribution of primes Prime number theorem: for any integer x, the primes near x are spaced in average one every log(x) integers On average we have to test log(x) integers before we find a prime reject immediately even integers and integers ending in 5 Correct rate: we need to test in average 0.4 log(x) integers before we find a prime of the order of x Example: if we look for a prime on the order of magnitude 2 200 we need to do in average 55 trials, order of magnitude 2 1024 : in average 284 trials RSA scheme Key generation Choose primes p,q Compute n=pq Choose e, 1<e<φ(n) with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 29
Attacking RSA Brute force attacks: try all possible private keys As in the other cases defend using large keys: nowadays integers between 1024 and 2048 bits Mathematical attacks Factor n into its two primes p,q: this is a hard problem for large n Challenges by RSA Labs to factorize large integers Last solved challenge: 768 bits (2009) Determine φ(n) directly without first determining p,q: this math problem is equivalent to factoring Determine d directly, without first determining φ(n): this is believed to be at least as difficult as factoring Suggestions for design The larger the keys, the better but also the slower the algorithm Choosing p,q badly may weaken the algorithm p,q should differ in length by only a few bits: for a 1024-bit key, p,q should be on the order of magnitude 10 75 to 10 100 p-1 and q-1 should both contain a large prime factor gcd(p-1,q-1) should be small d should be larger than n 1/4 RSA scheme Key generation Choose primes p,q Compute n=pq Choose e with gcd(φ(n),e)=1 Compute d e -1 mod φ(n) Private key is {d,n} Public key is {e,n} Encryption C=M e mod n Decryption: C d mod n = M de mod n = M 30
Attacks on RSA Timing attacks: determine a private key by keeping track of how long a computer takes to decipher a message (ciphertext-only attack) this is essentially an attack on the fast exponentiation algorithm but can be adapted for any other algorithm Whenever a bit is 1 the algorithm has more computations to do and takes more time Countermeasures: Ensure that all exponentiations take the same time before returning a result: degrade performance of the algorithm Add some random delay: if there is not enough noise the attack succeeds Blinding: multiply the ciphertext by a random number before performing exponentiation in this way the attacker does not know the input to the exponentiation algorithm. (implemented in the commercial products from RSA Data Security Inc.) Decryption M=C d mod n is modified as follows: Generate a secret random number r between 0 and n-1 Compute C =C(r e ) mod n where e is the public exponent Compute M =(C d ) mod n with the ordinary exponentiation Compute M=M r -1 mod n Reported performance penalty: 2 to 10% Square-and-multiply algorithm Input: n,x,b (b is in base 2 (b k-1,,b 1,b 0 ) Output: x b mod n 1. z=1 2. for i=k-1 downto 0 3. z=z 2 mod n 4. if b i =1 then z=zx mod n 31
Pseudo-random number generators Essential in RSA (and elsewhere) to be able to generate pseudorandom numbers A sequence of numbers is random if they have uniform distribution and are independent (no value can be deduced from the others) We generally use algorithmic techniques to generate such numbers they will not be independent and thus not random The whole point is to make them look random, i.e., make them pass many test of randomness Three tests to be used in evaluating a pseudo-random number generator The function should be full-period generating function: generate all numbers in its range before repeating The generated sequence should appear random: pass many statistical tests The function should implement efficiently with 32-bit arithmetic 32
Pseudo-random number generators The most widely used technique is the linear congruential method (Lehmer 1951) X n+1 =(ax n +c) mod m One should be very careful in choosing constants a, c, m: a=c=1 is bad choice! Value of m should be as large as possible: usually close to 2 31, very often chosen to be the prime number 2 31-1; in this case one can take c=0 There are very few good choices for a: for m= 2 31-1 only a handful of choices are advisable very often used is a=7 5 =16807 X n+1 =16807 X n mod (2 31-1) Using this in cryptography needs extra care: If the attacker finds one single value, then he will be able to compute all subsequent values Idea: restart the sequence often, using the clock as seed (initial value) 33
Cryptographically generated pseudo-random numbers Idea: use cryptographic primitives to generate pseudo-random numbers One possibility: Use a counter and encrypt each value for the counter, e.g., with DES the cryptotext will be the key Stronger version: instead of a counter use a PRNG (pseudo-random number generator) Technique can be made stronger using a more sophisticated scheme and 3DES, see ANSI X9.17 PRNG 34
Another speed-up in RSA implementation Operations modulo big integers become more time-consuming as the integers grows bigger Efficient implementation: use Chinese Remainder Theorem (CRT) In its simplest formulation, CRT essentially says that if n=pq, then instead of addition/difference/multiplication modulo n one can perform the same modulo p and modulo q and then compute the result mod n Big advantage because the modules are much smaller 35