Fast Variants of RSA




Dan Boneh (dabo@cs.stanford.edu)
Hovav Shacham (hovav@cs.stanford.edu)

Abstract

We survey four variants of RSA designed to speed up RSA decryption. These variants are backwards compatible in the sense that a system using one of these variants can interoperate with a system using standard RSA.

1 Introduction

RSA [11] is the most widely deployed public key cryptosystem. It is used for securing web traffic, e-mail, and some wireless devices. Since RSA is based on arithmetic modulo large numbers, it can be slow in constrained environments. For example, 1024-bit RSA decryption on a small handheld device such as the Palm III can take as long as 40 seconds. Similarly, on a heavily loaded web server, RSA decryption significantly reduces the number of SSL requests per second that the server can handle. Typically, one improves RSA's performance using special-purpose hardware. Current RSA coprocessors can perform as many as 10,000 RSA decryptions per second (using a 1024-bit modulus) and even faster processors are coming out.

In this paper we survey four simple variants of RSA that are designed to speed up RSA decryption in software. Throughout the paper we focus on a 1024-bit RSA modulus. We emphasize backwards compatibility: a system using one of these variants for fast RSA decryption should be able to interoperate with systems that are built for standard RSA; moreover, existing Certificate Authorities must be able to respond to a certificate request for a variant-RSA public key.

The security of these variants is an open problem. We cannot show that an attack on any of these variants would imply an attack on the standardized version of RSA (as described, e.g., in ANSI X9.31). Therefore, when using these variants, one can only rely on the fact that so far none of them has been shown to be weak. In other words, "Use at your own risk."

We begin the paper with a brief review of RSA. We then describe the following variants for speeding up RSA decryption:

Batch RSA [8]: do a number of RSA decryptions for approximately the cost of one.
Multi-factor RSA [6, 14]: use a modulus of the form $N = pqr$ or $N = p^2 q$.
Rebalanced RSA [17]: speed up RSA decryption by shifting most of the work to the encrypter.

The RSA trapdoor permutation is used for both public key encryption and digital signatures. Since the exact application of RSA is orthogonal to the discussion in this paper, we use terminology consistent with the application to public key encryption. All the RSA variants we discuss apply equally well to digital signatures.

1.1 Review of the basic RSA system

We review the basic RSA public key system; refer to [10] for more information. We describe three constituent algorithms: key generation, encryption, and decryption.

Key generation: The key generation algorithm takes a security parameter $n$ as input. We use $n = 1024$ as the standard security parameter. One generates two $(n/2)$-bit primes, $p$ and $q$, and sets $N \leftarrow pq$. Next, one picks some small value $e$ that is relatively prime to $\varphi(N) = (p-1)(q-1)$. The value $e$ is called the encryption exponent, and is usually chosen as $e = 3$ or $e = 65537$. The RSA public key consists of the two integers $\langle N, e \rangle$. The RSA private key is an integer $d$ satisfying $e \cdot d = 1 \bmod \varphi(N)$. Typically, one sends the public key $\langle N, e \rangle$ to a Certificate Authority (CA) to obtain a certificate for it.

Encryption: To encrypt a message $X$ using an RSA public key $\langle N, e \rangle$, one first formats the bit-string $X$ to obtain an integer $M$ in $\mathbb{Z}_N = \{0, \ldots, N-1\}$. This formatting is often done using the PKCS #1 standard [1, 9]. The ciphertext is then computed as $C \leftarrow M^e \bmod N$. (Other methods for formatting $X$ prior to encryption are described elsewhere in this issue.)

Decryption: To decrypt a ciphertext $C$ the decrypter uses its private key $d$ to compute $M$, the $e$th root of $C$ in $\mathbb{Z}_N$, given by $C^d \bmod N$. Since both $d$ and $N$ are large numbers (each 1024 bits long) this is a lengthy computation for the decrypter. The formatting operation from the encryption algorithm is then reversed to obtain the original bit-string $X$ from $M$. Unless $d$ is taken as a large number (on the order of $N$), the RSA system is insecure [3, 17].

It is standard practice to employ the Chinese Remainder Theorem (CRT) for RSA decryption. Rather than compute $M \leftarrow C^d \pmod{N}$, one evaluates:

$$M_p \leftarrow C_p^{d_p} \pmod{p} \qquad M_q \leftarrow C_q^{d_q} \pmod{q}$$

Here $C_p = C \bmod p$, $C_q = C \bmod q$, $d_p = d \bmod (p-1)$, and $d_q = d \bmod (q-1)$. Then one uses the CRT to calculate $M$ from $M_p$ and $M_q$. This is approximately four times as fast as evaluating $C^d \bmod N$ directly [10, p. 613].
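The key generation, encryption, and CRT decryption steps reviewed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the primes are tiny, message formatting (PKCS #1) is omitted, and the function names (`rsa_keygen`, `decrypt_plain`, `decrypt_crt`, `crt_pair`) are hypothetical, not taken from any library.

```python
# Toy RSA with and without CRT decryption (illustrative sizes only, no padding).

def crt_pair(m_p, m_q, p, q):
    """Combine residues mod p and mod q into the unique value in [0, p*q)."""
    q_inv = pow(q, -1, p)                  # q^{-1} mod p (Python 3.8+)
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q

def rsa_keygen(p, q, e):
    """Return (public key <N, e>, private data (p, q, d))."""
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)                    # raises ValueError if gcd(e, phi) != 1
    return (p * q, e), (p, q, d)

def decrypt_plain(c, priv):
    """Direct decryption: one exponentiation modulo N."""
    p, q, d = priv
    return pow(c, d, p * q)

def decrypt_crt(c, priv):
    """CRT decryption: two half-size exponentiations, then recombination."""
    p, q, d = priv
    d_p, d_q = d % (p - 1), d % (q - 1)
    m_p = pow(c % p, d_p, p)
    m_q = pow(c % q, d_q, q)
    return crt_pair(m_p, m_q, p, q)

pub, priv = rsa_keygen(1009, 1013, 65537)  # assumed toy primes
N, e = pub
M = 123456                                 # already an integer in Z_N
C = pow(M, e, N)
assert decrypt_crt(C, priv) == decrypt_plain(C, priv) == M
```

Both decryption routines return the same value; the CRT version works with exponents and moduli of half the bit length, which is where the speedup cited from [10] comes from.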
2 Batch RSA

Fiat [8] observed that, when using small public exponents $e_1$ and $e_2$, it is possible to decrypt two ciphertexts for approximately the price of one. Suppose $C_1$ is a ciphertext obtained by encrypting some $M_1$ using the public key $\langle N, 3 \rangle$, and $C_2$ is a ciphertext for some $M_2$ using $\langle N, 5 \rangle$. To decrypt, we must compute $C_1^{1/3}$ and $C_2^{1/5} \bmod N$. Fiat observed that by setting $A = (C_1^5 \cdot C_2^3)^{1/15}$ we obtain:

$$C_1^{1/3} = \frac{A^{10}}{C_1^3 \cdot C_2^2} \quad\text{and}\quad C_2^{1/5} = \frac{A^6}{C_1^2 \cdot C_2} \tag{1}$$

At the cost of computing a single 15th root and some additional arithmetic, we are able to decrypt both $C_1$ and $C_2$. Computing a 15th root takes the same time as a single RSA decryption.

This batching technique is only worthwhile when the public exponents $e_1$ and $e_2$ are small (e.g., 3 and 5). Otherwise, the extra arithmetic required is too expensive. Also, one can only batch-decrypt ciphertexts encrypted using the same modulus and distinct public exponents. This is essential. In fact, it is known [12, Appendix A] that one cannot apply such algebraic techniques to batch the decryption of two ciphertexts encrypted with the same key (e.g., of $C_1^{1/3}$ and $C_2^{1/3}$).

Fiat generalized the above observation to the decryption of a batch of $b$ RSA ciphertexts. We have $b$ distinct and pairwise relatively prime public keys $e_1, \ldots, e_b$, all sharing a common modulus $N$. Furthermore, we have $b$ encrypted messages $C_1, \ldots, C_b$, where $C_i$ is encrypted using the exponent $e_i$. We wish to compute $M_i = C_i^{1/e_i}$ for $i = 1, \ldots, b$. Fiat describes this $b$-batch process using a binary tree. For small values of $b$ ($b \le 8$), one can use a direct generalization of (1). One sets $e \leftarrow \prod_i e_i$ and $A_0 \leftarrow \prod_i C_i^{e/e_i}$ (where the indices range over $1, \ldots, b$). Then one calculates $A \leftarrow A_0^{1/e} = \prod_{i=1}^{b} C_i^{1/e_i}$. For each $i$, one uses the CRT to find a number $x_i$ satisfying $x_i = 1 \bmod e_i$ and $x_i = 0 \bmod e_j$ (for $j \ne i$). Then

$$M_i = C_i^{1/e_i} = \frac{A^{x_i}}{C_i^{(x_i - 1)/e_i} \cdot \prod_{j \ne i} C_j^{x_i/e_j}} \tag{2}$$

This $b$-batch requires $b$ modular inversions; Fiat's tree-based method requires $2b$ modular inversions, but fewer auxiliary multiplications.

2.1 Improving the performance of batch RSA

In [12] the authors show how to use batch RSA within the Apache web server to improve the performance of the SSL handshake. This requires changing the web server architecture. They also describe several natural improvements to batch RSA. We mention a few of these improvements here.

Batch division: Modular inversion is much slower than modular multiplication. We use a trick due to Montgomery to compute all $b$ inversions in the batch algorithm for the cost of a single inversion with a few more multiplications. The idea is: to invert $x$ and $y$ we compute $\alpha \leftarrow (xy)^{-1}$ and then set $x^{-1} \leftarrow y \cdot \alpha$ and $y^{-1} \leftarrow x \cdot \alpha$, obtaining inverses of both numbers at the cost of a single modular inverse and some additional multiplications. More generally, we use the following fact [5, p. 481]:

Fact. Let $x_1, \ldots, x_n$ be elements of $\mathbb{Z}_N$. All $n$ inverses $x_1^{-1}, \ldots, x_n^{-1}$ can be obtained at the cost of one inversion and $3n - 3$ multiplications.

Consequently, only a single modular inversion is required for the entire batching procedure.

Global Chinese Remainder: In Section 1.1 we mentioned that RSA decryption uses the CRT to speed up the computation of $C^d \bmod N$. This idea extends naturally to batch decryption. We run the batching algorithm modulo $p$, and again modulo $q$, then use the CRT on each of the $b$ pairs $\langle C_i^{1/e_i} \bmod p,\ C_i^{1/e_i} \bmod q \rangle$ to obtain the $b$ decryptions $M_i = C_i^{1/e_i} \bmod N$.

Simultaneous Multiple Exponentiation: Simultaneous multiple exponentiation [10, §14.6] is a method for calculating $a^u \cdot b^v \bmod m$ without first evaluating $a^u$ and $b^v$. It requires approximately as many multiplications as does a single exponentiation with the larger of $u$ or $v$ as exponent. Such products of exponents are a large part of the batching algorithm. Simultaneous multiple exponentiation cuts the time required to perform them by close to 50%.
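To make equation (1) and the batch-division trick concrete, the following Python sketch decrypts one ciphertext under $e = 3$ and one under $e = 5$ with a single 15th-root exponentiation and a single modular inversion. It is a toy under stated assumptions: the primes are tiny and chosen so that 15 is invertible modulo $\varphi(N)$, the messages are assumed coprime to $N$, no CRT or padding is used, and the names `batch_inverse` and `batch_decrypt_3_5` are hypothetical.

```python
# Toy two-ciphertext batch RSA (exponents 3 and 5) with Montgomery-style batch inversion.

def batch_inverse(xs, n):
    """Invert every element of xs modulo n using one modular inversion."""
    prefix = [1]
    for x in xs:
        prefix.append(prefix[-1] * x % n)      # prefix[i] = x_0 * ... * x_{i-1}
    inv = pow(prefix[-1], -1, n)               # the only modular inversion
    out = [0] * len(xs)
    for i in range(len(xs) - 1, -1, -1):
        out[i] = inv * prefix[i] % n           # equals x_i^{-1}
        inv = inv * xs[i] % n                  # now the inverse of x_0 * ... * x_{i-1}
    return out

def batch_decrypt_3_5(c1, c2, p, q):
    """Recover (c1^{1/3}, c2^{1/5}) mod p*q using a single 15th root."""
    n = p * q
    root15 = pow(15, -1, (p - 1) * (q - 1))    # exponent that takes 15th roots
    a = pow(pow(c1, 5, n) * pow(c2, 3, n) % n, root15, n)   # A = (C1^5 * C2^3)^{1/15}
    den1 = pow(c1, 3, n) * pow(c2, 2, n) % n   # denominator of C1^{1/3} in (1)
    den2 = pow(c1, 2, n) * c2 % n              # denominator of C2^{1/5} in (1)
    inv1, inv2 = batch_inverse([den1, den2], n)
    return pow(a, 10, n) * inv1 % n, pow(a, 6, n) * inv2 % n

p, q = 1013, 1019                              # assumed toy primes; p-1 and q-1 share no factor with 15
n = p * q
m1, m2 = 4242, 31337
c1, c2 = pow(m1, 3, n), pow(m2, 5, n)
recovered = batch_decrypt_3_5(c1, c2, p, q)
assert recovered == (m1, m2)
```

The sketch uses the prefix-product form of the batch-inversion idea; it is not tuned to hit the exact $3n - 3$ multiplication count stated in the Fact.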

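One common way to realize simultaneous multiple exponentiation is a shared square-and-multiply pass over the bits of both exponents, sometimes called Shamir's trick. The sketch below is a minimal version of that idea, not necessarily the variant used in [12]; the function name `simul_exp` is hypothetical.

```python
# Compute a^u * b^v mod m with one pass of squarings shared by both exponents.

def simul_exp(a, u, b, v, m):
    ab = a * b % m                         # precomputed product for bit pattern (1, 1)
    result = 1
    top = max(u.bit_length(), v.bit_length())
    for i in range(top - 1, -1, -1):       # scan bits from most to least significant
        result = result * result % m       # one squaring per bit position
        bit_u = (u >> i) & 1
        bit_v = (v >> i) & 1
        if bit_u and bit_v:
            result = result * ab % m
        elif bit_u:
            result = result * a % m
        elif bit_v:
            result = result * b % m
    return result % m

m = 1000003                                # arbitrary modulus for the check
assert simul_exp(7, 1234, 11, 987, m) == pow(7, 1234, m) * pow(11, 987, m) % m
```

Two separate exponentiations would each perform their own squarings; here the squarings are shared and each bit position costs at most one extra multiplication.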
2.2 Performance of batch RSA

Table 1 lists the running time for standalone batch-RSA decryption, using OpenSSL 0.9.5 on a machine with a 750 MHz Pentium III and 256 MB RAM, running Debian Potato. In all experiments, the smallest possible values for the encryption exponents $e_i$ were used.

| Batch size | 768-bit key | 1024-bit key | 2048-bit key |
|---|---|---|---|
| (unbatched) | 4.67 | 8.38 | 52.96 |
| 2 | 3.09 | 5.27 | 29.43 |
| 4 | 1.93 | 3.18 | 16.41 |
| 8 | 1.55 | 2.42 | 10.81 |

Table 1: RSA decryption time, in milliseconds, as a function of batch and key size

With standard 1024-bit keys, batching improves performance significantly. With $b = 4$, RSA decryption is accelerated by a factor of 2.6; with $b = 8$, by a factor of almost 3.5. Note that a batch size of more than eight is probably not useful for common applications, as waiting for many decryption requests to be queued can significantly increase latency.

We also consider the batch-RSA performance as a component of a larger system: a web server handling SSL traffic. An architecture for such a system is described in [12]; the challenge is to choose, from amongst the queued requests, the batch to perform. Table 2 gives the number of SSL handshakes per second that the batch-RSA web server can handle, when bombarded with concurrent HTTP HEAD requests by a test client. Here "server load" is the number of simultaneous connections the client makes to the server. Under heavy load, batch RSA can improve the number of SSL handshakes per second by a factor of approximately 2.5.

| Batch size | Server load 16 | Server load 32 | Server load 48 |
|---|---|---|---|
| (unbatched) | 105 | 98 | 98 |
| 2 | 149 | 141 | 134 |
| 4 | 218 | 201 | 187 |
| 8 | 274 | 248 | 227 |

Table 2: SSL handshakes per second as a function of batch size. 1024-bit keys.

2.3 The Downside of Batch RSA

Batch RSA can lead to a significant improvement in RSA decryption time. Nevertheless, there are a few difficulties with using the batching technique:

When using batch RSA, the decryption server must maintain at least as many RSA certificates as there are distinct keys in a batch. Unfortunately, current CAs charge per certificate regardless of the public key in the certificate.

For optimal performance, batching requires RSA public keys with very small public exponents ($e = 3, 5, 7, 11, \ldots$). There are no known attacks on the resulting system, but RSA as usually deployed uses a larger public exponent ($e = 65537$).

3 Multi-factor RSA

The second RSA variant is based on modifying the structure of the RSA modulus. Here there are two proposals. The first, patented by Compaq [6], uses a modulus of the form $N = pqr$. When $N$ is 1024 bits, each prime is approximately 341 bits. We refer to this as multi-prime RSA. The second, proposed by Takagi [14] and patented by NTT [15], uses RSA moduli of the form $N = p^2 q$ and leads to an even greater speedup.

We begin with multi-prime RSA. We describe key generation, encryption, and decryption. We then discuss the performance of the scheme and analyze its security.

Key generation: The key generation algorithm takes as input a security parameter $n$ and an additional parameter $b$. It generates an RSA public/private key pair as follows:

Step 1: Generate $b$ distinct primes $p_1, \ldots, p_b$, each $\lfloor n/b \rfloor$ bits long. Set $N \leftarrow \prod_{i=1}^{b} p_i$. For a 1024-bit modulus we can use at most $b = 3$ (i.e., $N = pqr$), for security reasons discussed below.
Step 2: Pick the same $e$ used in standard RSA public keys, namely $e = 65537$. Then compute $d = e^{-1} \bmod \varphi(N)$. As usual, we must ensure that $e$ is relatively prime to $\varphi(N) = \prod_{i=1}^{b} (p_i - 1)$.

The public key is $\langle N, e \rangle$; the private key is $d$.

Encryption: Given a public key $\langle N, e \rangle$, the encrypter encrypts exactly as in standard RSA.

Decryption: Decryption is done using the Chinese Remainder Theorem (CRT). Let $d_i = d \bmod (p_i - 1)$. To decrypt a ciphertext $C$, one first computes $M_i = C^{d_i} \bmod p_i$ for each $i$, $1 \le i \le b$. One then combines the $M_i$'s using the CRT to obtain $M = C^d \bmod N$. The CRT step takes negligible time compared to the $b$ exponentiations.

Performance: We compare the decryption work using the above scheme to the work done when decrypting a normal RSA ciphertext. Recall that standard RSA decryption using CRT requires two full exponentiations modulo $(n/2)$-bit numbers. In multi-prime RSA, decryption requires $b$ full exponentiations modulo $(n/b)$-bit numbers. Using basic algorithms, computing $x^d \bmod p$ takes time $O(\log d \cdot \log^2 p)$. When $d$ is on the order of $p$ the running time is $O(\log^3 p)$.
Therefore, the speedup of multi-prime RSA over standard RSA is simply:

$$\frac{2 \cdot (n/2)^3}{b \cdot (n/b)^3} = b^2/4$$

For 1024-bit RSA, we can use at most $b = 3$ (i.e., $N = pqr$), which gives a speedup of approximately 2.25 over standard RSA.

Security: The security of multi-factor RSA depends on the difficulty of factoring integers of the form $N = p_1 \cdots p_b$ for $b > 2$. The fastest known factoring algorithm (the number field sieve) does not take advantage of this special structure of $N$. However, one has to make sure that the prime factors of $N$ do not fall within the capabilities of the Elliptic Curve Method (ECM), which is analyzed in [13]. Currently, 256-bit prime factors are considered within the bounds of ECM, since the work to find such factors is within range of the work needed for the RSA-512 factoring project. Consequently, for 1024-bit moduli one should not use more than three factors.
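The multi-prime scheme described in this section can be sketched as follows for $b = 3$. This is a toy under stated assumptions: three tiny primes stand in for 341-bit primes, there is no padding, and the helper names (`crt`, `multiprime_keygen`, `multiprime_decrypt`) are hypothetical. The sketch stores the reduced exponents $d_i$ rather than $d$ itself, which is a convenience and not something the text prescribes.

```python
# Toy multi-prime RSA: b exponentiations modulo the individual primes, then CRT.
from math import prod

def crt(residues, moduli):
    """Chinese Remainder Theorem for pairwise coprime moduli."""
    n = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        n_i = n // m
        total += r * n_i * pow(n_i, -1, m)     # n_i * n_i^{-1} is 1 mod m, 0 mod the others
    return total % n

def multiprime_keygen(primes, e=65537):
    phi = prod(p - 1 for p in primes)
    d = pow(e, -1, phi)                        # requires gcd(e, phi) = 1
    d_i = [d % (p - 1) for p in primes]        # reduced exponents d_i = d mod (p_i - 1)
    return (prod(primes), e), (list(primes), d_i)

def multiprime_decrypt(c, priv):
    primes, d_i = priv
    parts = [pow(c % p, d, p) for p, d in zip(primes, d_i)]   # M_i = C^{d_i} mod p_i
    return crt(parts, primes)

pub, priv = multiprime_keygen([1009, 1013, 1019])   # assumed toy primes
N, e = pub
M = 123456789
C = pow(M, e, N)
assert multiprime_decrypt(C, priv) == M
assert 3 ** 2 / 4 == 2.25                      # the b^2/4 estimate for b = 3
```

Encryption is unchanged from standard RSA; only the decrypter needs to know that $N$ has three prime factors.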

3.1 Multi-power RSA: $N = p^{b-1} q$

One can further speed up RSA decryption using moduli of the form $N = p^{b-1} q$ where $p$ and $q$ are $n/b$ bits each [14]. When $N$ is 1024 bits long we can use at most $b = 3$, i.e., $N = p^2 q$. The two primes $p$, $q$ are then each 341 bits long.

Key generation: The key generation algorithm takes as input a security parameter $n$ and an additional parameter $b$. It generates an RSA public/private key pair as follows:

Step 1: Generate two distinct $\lfloor n/b \rfloor$-bit primes, $p$ and $q$, and compute $N \leftarrow p^{b-1} \cdot q$.
Step 2: Use the same public exponent $e$ used in standard RSA public keys, namely $e = 65537$. Compute $d \leftarrow e^{-1} \bmod (p-1)(q-1)$.
Step 3: Compute $r_1 \leftarrow d \bmod (p-1)$ and $r_2 \leftarrow d \bmod (q-1)$.

The public key is $\langle N, e \rangle$; the private key is $\langle p, q, r_1, r_2 \rangle$.

Encryption: Same as in standard RSA.

Decryption: To decrypt a ciphertext $C$ using the private key $\langle p, q, r_1, r_2 \rangle$ one does:

Step 1: Compute $M_1 \leftarrow C^{r_1} \bmod p$ and $M_2 \leftarrow C^{r_2} \bmod q$; thus $M_1^e = C \bmod p$ and $M_2^e = C \bmod q$.
Step 2: Using Hensel lifting [5, p. 137] construct an $M_1'$ such that $(M_1')^e = C \bmod p^{b-1}$. Hensel lifting is much faster than a full exponentiation modulo $p^{b-1}$.
Step 3: Using CRT, compute an $M \in \mathbb{Z}_N$ such that $M = M_1' \bmod p^{b-1}$ and $M = M_2 \bmod q$. Then $M = C^d \bmod N$, a proper decryption of $C$.

Comment. Hensel lifting in Step 2 requires a modular inversion. However, some accelerator cards do not provide support for modular inversion. The API to these cards typically does modular inversion using an exponentiation: $x^{-1} = x^{p^d - p^{d-1} - 1} \pmod{p^d}$. Unfortunately, using an exponentiation to do Hensel lifting greatly diminishes the gains of this method over the multi-prime approach.

Performance: We compare the work required to decrypt using multi-power RSA to that required for standard RSA. For multi-power RSA, decryption takes two full exponentiations modulo $(n/b)$-bit numbers, and $b - 2$ Hensel liftings. Since the Hensel-lifting time is negligible, we focus on the time for the two exponentiations.
As noted before, a full exponentiation is cubic in the size of the modulus, so the speedup of multi-power RSA over standard RSA is simply:

$$\frac{2 \cdot (n/2)^3}{2 \cdot (n/b)^3} = b^3/8$$

For 1024-bit RSA, $b$ should again be at most three (i.e., $N = p^2 q$), giving a speedup of approximately 3.38 over standard RSA.
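The three decryption steps for $N = p^2 q$ can be illustrated with a single Newton-style Hensel step applied to $f(x) = x^e - C$. The sketch below is a toy under stated assumptions: the primes are tiny, the message is assumed coprime to $p$, no padding is used, and the names `hensel_lift_root`, `multipower_keygen`, and `multipower_decrypt` are hypothetical. It follows the general Hensel lemma rather than the specific procedure of [14].

```python
# Toy multi-power RSA (N = p^2 * q) with one Hensel lifting step.

def hensel_lift_root(m1, c, e, p):
    """Given m1 with m1^e = c (mod p), return m with m^e = c (mod p^2)."""
    p2 = p * p
    residual = (pow(m1, e, p2) - c) % p2              # f(m1); divisible by p
    deriv = e * pow(m1, e - 1, p) % p                 # f'(m1) mod p
    deriv_inv = pow(deriv, -1, p)                     # the modular inversion noted in the Comment
    return (m1 - residual * deriv_inv) % p2           # Newton step x - f(x)/f'(x)

def multipower_keygen(p, q, e=65537):
    d = pow(e, -1, (p - 1) * (q - 1))
    return (p * p * q, e), (p, q, d % (p - 1), d % (q - 1))

def multipower_decrypt(c, pub, priv):
    n, e = pub
    p, q, r1, r2 = priv
    m1 = pow(c % p, r1, p)                            # Step 1: root mod p
    m2 = pow(c % q, r2, q)                            #         root mod q
    m1_lifted = hensel_lift_root(m1, c, e, p)         # Step 2: lift to a root mod p^2
    p2 = p * p
    h = (m2 - m1_lifted) * pow(p2, -1, q) % q         # Step 3: CRT over p^2 and q
    return m1_lifted + p2 * h

pub, priv = multipower_keygen(1009, 1013)             # assumed toy primes
N, e = pub
M = 123456789                                         # coprime to p in this example
C = pow(M, e, N)
assert multipower_decrypt(C, pub, priv) == M
```

The lift only needs the inverse of $f'(M_1)$ modulo $p$, because the residual is already a multiple of $p$; this is why the step is cheap compared with a full exponentiation modulo $p^2$.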

Security: The security of multi-power RSA depends on the difficulty of factoring integers of the form $N = p^{b-1} q$. As for multi-prime RSA, one has to make sure that the prime factors of $N$ do not fall within the capabilities of ECM. Consequently, for 1024-bit moduli one can use at most $b = 3$, i.e., $N = p^2 q$. We note that, although the Lattice Factoring Method (LFM) of Boneh, Durfee, and Howgrave-Graham [4] is designed to factor integers of the form $N = p^u q$ for large $u$, it cannot factor integers of the form $N = p^2 q$ when $N$ is 1024 bits long.

4 Rebalanced RSA

In standard RSA, encryption and signature verification are much less processor-intensive than decryption and signature generation. In some applications, one would like to have the reverse behavior. For example, when a cell phone needs to generate an RSA signature that will later be verified on a server, one would like signing to be easier than verifying. Similarly, for SSL, web browsers (doing encryption) typically have idle cycles to burn whereas web servers (doing decryption) are overloaded.

In this section we describe a variant of RSA that enables us to rebalance the difficulty of encryption and decryption. It is based on a proposal by Wiener [17] (see also [2]). Note that we cannot simply speed up RSA decryption by using a small value of $d$, since as soon as $d$ is less than $N^{0.292}$ RSA is insecure [17, 3]. As before, we describe key generation, encryption, and decryption.

Key generation: The key generation algorithm takes two security parameters $n$ and $k$ where $k \le n/2$. It generates an RSA key as follows:

Step 1: Generate two distinct $(n/2)$-bit primes $p$ and $q$ with $\gcd(p-1, q-1) = 2$. Compute $N \leftarrow pq$.
Step 2: Pick two random $k$-bit values $r_1$ and $r_2$ such that $\gcd(r_1, p-1) = 1$ and $\gcd(r_2, q-1) = 1$ and $r_1 = r_2 \bmod 2$.
Step 3: Find a $d$ such that $d = r_1 \bmod (p-1)$ and $d = r_2 \bmod (q-1)$.
Step 4: Compute $e \leftarrow d^{-1} \bmod \varphi(N)$.

The public key is $\langle N, e \rangle$; the private key is $\langle p, q, r_1, r_2 \rangle$.

Steps 3 and 4 require some explanation. First, we explain how to find $d$ in Step 3. One usually uses the Chinese Remainder Theorem (CRT).
Unfortunately, $p - 1$ and $q - 1$ are not relatively prime (they are both even) and consequently the theorem does not apply. However, $(p-1)/2$ is relatively prime to $(q-1)/2$. Furthermore, $r_1 = r_2 \bmod 2$. Let $a = r_1 \bmod 2$. Then using CRT we can find an element $d'$ such that

$$d' = \frac{r_1 - a}{2} \pmod{\tfrac{p-1}{2}} \quad\text{and}\quad d' = \frac{r_2 - a}{2} \pmod{\tfrac{q-1}{2}}$$

Now, observe that the required $d$ in Step 3 is simply $d = 2d' + a$. Indeed, $d = r_1 \bmod (p-1)$ and $d = r_2 \bmod (q-1)$.

In Step 4, we must justify why $d$ is invertible modulo $\varphi(N)$. Recall that $\gcd(r_1, p-1) = 1$ and $\gcd(r_2, q-1) = 1$. It follows that $\gcd(d, p-1) = 1$ and $\gcd(d, q-1) = 1$. Consequently $\gcd(d, (p-1)(q-1)) = 1$. Hence, $d$ is invertible modulo $\varphi(N) = (p-1)(q-1)$.
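Steps 2 through 4, including the halved-moduli CRT used to find $d$, can be sketched as follows. This is a toy under stated assumptions: the primes are tiny and fixed in advance with $\gcd(p-1, q-1) = 2$, $k = 5$ stands in for $k = 160$, and the function name `rebalanced_keygen` is hypothetical. The sketch also returns $d$ alongside the private key, purely so the congruences can be checked.

```python
# Toy rebalanced-RSA key generation for given primes p and q.
import random
from math import gcd

def rebalanced_keygen(p, q, k, rng):
    assert gcd(p - 1, q - 1) == 2

    def pick(modulus):
        """Random k-bit value coprime to the (even) modulus, hence odd."""
        while True:
            r = rng.getrandbits(k) | (1 << (k - 1)) | 1   # force top bit and low bit
            if gcd(r, modulus) == 1:
                return r

    r1, r2 = pick(p - 1), pick(q - 1)                     # Step 2
    a = r1 % 2                                            # equals r2 % 2
    hp, hq = (p - 1) // 2, (q - 1) // 2                   # relatively prime halves
    x, y = ((r1 - a) // 2) % hp, ((r2 - a) // 2) % hq
    d_half = x + hp * ((y - x) * pow(hp, -1, hq) % hq)    # CRT: d' mod hp*hq
    d = 2 * d_half + a                                    # Step 3
    e = pow(d, -1, (p - 1) * (q - 1))                     # Step 4
    return (p * q, e), (p, q, r1, r2), d

rng = random.Random(1)                                    # fixed seed for repeatability
pub, priv, d = rebalanced_keygen(1019, 1013, 5, rng)      # assumed toy primes
N, e = pub
p, q, r1, r2 = priv
assert d % (p - 1) == r1 % (p - 1) and d % (q - 1) == r2 % (q - 1)
assert e * d % ((p - 1) * (q - 1)) == 1
M = 424242
assert pow(pow(M, e, N), d, N) == M
```

With realistic sizes the resulting $e$ is typically about as long as $N$, while the exponents $r_1$ and $r_2$ kept in the private key are only $k$ bits long.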

Typically, we take $k = 160$, although other larger values are acceptable. Note that $e$ is very large, on the order of $N$. This is unlike standard RSA, where $e$ typically equals 65537. All CAs we tested were willing to generate certificates for such RSA public keys.

Encryption: Encryption using the public key $\langle N, e \rangle$ is identical to encryption in standard RSA. The only issue is that since $e$ is much larger than in standard RSA, the encrypter must be willing to accept such public keys. At the time of this writing all browsers we tested were willing to accept such keys, except Microsoft's Internet Explorer (IE). IE allows a maximum of 32 bits for $e$.

Decryption: To decrypt a ciphertext $C$ using the private key $\langle p, q, r_1, r_2 \rangle$ one does:

Step 1: Compute $M_1 \leftarrow C^{r_1} \bmod p$ and $M_2 \leftarrow C^{r_2} \bmod q$.
Step 2: Using the CRT compute an $M \in \mathbb{Z}_N$ such that $M = M_1 \bmod p$ and $M = M_2 \bmod q$. Note that $M = C^d \bmod N$. Hence, the resulting $M$ is a proper decryption of $C$.

Performance: We compare the work required to decrypt using the above scheme to that required using standard RSA. Recall that decryption time for standard RSA with CRT is dominated by two full exponentiations modulo $(n/2)$-bit numbers. In the scheme presented above, the bulk of the decryption work is in the two exponentiations in Step 1, but in each of these the exponent is only $k$ bits long. Since modular exponentiation takes time linear in the exponent's bit-length, we get a speedup of $(n/2)/k$ over standard RSA. For a 1024-bit modulus and 160-bit exponent, this is a factor of 3.20.

Security: It is an open problem whether RSA using values of $d$ as above is secure. Since $d$ is large, the usual small-$d$ attacks [17, 3] do not apply. We present the best known attack on the scheme.

Lemma. Let $\langle N, e \rangle$ be an RSA public key with $N = pq$. Let $d \in \mathbb{Z}$ be the corresponding RSA private exponent satisfying $d = r_1 \bmod (p-1)$ and $d = r_2 \bmod (q-1)$ with $r_1 < r_2$. If $r_1$ is $m$ bits long we assume that $r_1 \ne r_2 \bmod 2^{m/2}$. Then given $\langle N, e \rangle$ an adversary can expose the private key $d$ in time $O(\sqrt{r_1} \log r_1)$.

Proof. We know that $e = (r_1)^{-1} \bmod (p - 1)$.
Suppose $r_1$ is $m$ bits long. Write $r_1 = A \cdot 2^{m/2} + B$ where $A$, $B$ are in $[0, 2^{m/2}]$. Pick a random $g \in \mathbb{Z}_N$ and define the polynomial

$$G(x) = \prod_{i=0}^{2^{m/2}} \left( g^{e \cdot 2^{m/2} \cdot i} \cdot x - g \right)$$

Note that this polynomial has degree $2^{m/2}$. Next, observe that $G(g^{eB}) = 0 \bmod p$. This follows since one of the factors in the product above is

$$g^{e \cdot 2^{m/2} \cdot A} \cdot g^{eB} - g = g^{e r_1} - g = 0 \pmod{p}$$

Since $r_1 \ne r_2 \bmod 2^{m/2}$ it follows that $G(g^{eB}) \ne 0 \bmod q$. Hence, $\gcd(N, G(g^{eB}))$ gives a nontrivial factor of $N$. Hence, if we evaluate $G(x) \bmod N$ at $x = g^{ej}$ for $j = 0, \ldots, 2^{m/2}$, at least one of these values will expose the factorization of $N$. Evaluating a polynomial of degree $2^{m/2}$ at $2^{m/2}$ values can be done in time $2^{m/2} \cdot m/2$ using FFT methods [16]. This algorithm requires $\tilde{O}(2^{m/2})$ space. Hence, in time at most $O(\sqrt{r_1} \log r_1)$ we can factor $N$.

The above attack shows that, to obtain security of $2^{80}$, we must make both $r_1$ and $r_2$ be at least 160 bits long. This explains our choice of parameter sizes for $r_1$ and $r_2$.

5 Conclusions

We surveyed four variants of RSA designed to speed up RSA decryption and be backwards-compatible with standard RSA. Table 3 gives the speedup factors for each of these variants using a 1024-bit RSA modulus. Batch RSA is fully backwards-compatible, but requires the decrypter to obtain and manage multiple public keys and certificates. The two multi-factor RSA techniques are promising in that they are fully backwards compatible. The rebalanced RSA method gives a large speedup, but only works with peer applications that properly implement standard RSA, and so are willing to accept RSA certificates with a large encryption exponent $e$. Currently, IE rejects all RSA certificates where $e$ is more than 32 bits long. Multi-factor RSA and rebalanced RSA can be used together to give an additional speedup. Finally, all these techniques are orthogonal to work in improving the performance of the fundamental number-theoretic algorithms (e.g., modular multiplication and exponentiation) on which RSA is built.

| Method | Speedup | Comment |
|---|---|---|
| Batch RSA | 2.64 | Requires multiple certificates |
| Multi-prime | 2.25 | |
| Multi-power | 3.38 | |
| Rebalanced | 3.20 | Incompatible with Internet Explorer |

Table 3: Comparison of RSA variants

Acknowledgments

The authors thank Ari Juels for his comments on preliminary versions of this paper.

References

[1] M. Bellare and P. Rogaway. Optimal Asymmetric Encryption. In A. De Santis, ed., Proceedings of Eurocrypt 1994, vol. 950 of LNCS, pp. 92-111. Springer-Verlag, May 1994.
[2] D. Boneh. Twenty Years of Attacks on the RSA Cryptosystem. Notices of the American Mathematical Society, 46(2):203-213, Feb. 1999.
[3] D. Boneh and G. Durfee. Cryptanalysis of RSA with Private Key d Less than $N^{0.292}$. IEEE Trans. Information Theory, 46(4):1339-1349, Jul. 2000.
[4] D. Boneh, G. Durfee, and N. Howgrave-Graham. Factoring $N = p^r q$ for Large r. In M. Wiener, ed., Proceedings of Crypto 99, vol. 1666 of LNCS, pp. 326-337. Springer-Verlag, Aug. 1999.

[5] H. Cohen. A Course in Computational Algebraic Number Theory, vol. 138 of Graduate Texts in Mathematics. Springer-Verlag, 1996.
[6] T. Collins, D. Hopkins, S. Langford, and M. Sabin. Public Key Cryptographic Apparatus and Method. US Patent #5,848,159. Jan. 1997.
[7] T. Dierks and C. Allen. RFC 2246: The TLS Protocol, Version 1. Jan. 1999.
[8] A. Fiat. Batch RSA. In G. Brassard, ed., Proceedings of Crypto 1989, vol. 435 of LNCS, pp. 175-185. Springer-Verlag, Aug. 1989.
[9] RSA Labs. Public Key Cryptography Standards (PKCS), Number 1.
[10] A. Menezes, P. Van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press, 1997.
[11] R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Commun. ACM, 21(2):120-126. Feb. 1978.
[12] H. Shacham and D. Boneh. Improving SSL Handshake Performance via Batching. In D. Naccache, ed., Proceedings of RSA 2001, vol. 2020 of LNCS, pp. 28-43. Springer-Verlag, Apr. 2001.
[13] R. Silverman and S. Wagstaff Jr. A Practical Analysis of the Elliptic Curve Factoring Algorithm. Math. Comp. 61(203):445-462. Jul. 1993.
[14] T. Takagi. Fast RSA-type Cryptosystem Modulo $p^k q$. In H. Krawczyk, ed., Proceedings of Crypto 1998, vol. 1462 of LNCS, pp. 318-326. Springer-Verlag, Aug. 1998.
[15] T. Takagi and S. Naito. Scheme for fast realization of encryption, decryption and authentication. US Patent #6,396,926. Mar. 1999.
[16] J. Turk. Fast Arithmetic Operations on Numbers and Polynomials. In H. Lenstra, Jr. and R. Tijdeman, eds., Computational Methods in Number Theory, Part I, vol. 154 of Mathematical Centre Tracts. Mathematisch Centrum, Amsterdam, 1982.
[17] M. Wiener. Cryptanalysis of Short RSA Secret Exponents. IEEE Trans. Information Theory 36(3):553-558. May 1990.