Fractal-Structured Karatsuba's Algorithm for Binary Field Multiplication: FK

*The authors are working at the Institute of Mathematics, The Academy of Sciences of DPR Korea.
**Address: Un Jong district, Kwahakdong Number, Pyongyang, DPR Korea

Abstract: In this paper we report a software implementation of binary field multiplication based on a new fractal-structured algorithm, which in practice is much faster than the current methods for this multiplication [,6,4,5]. Our software implementation shows that the new method achieves a surprising speed-up. For example, our program on a Pentium Ⅲ-533, compiled without compiler optimization from C source code edited in Microsoft Visual C++ 6.0, takes only 7.04 µs per multiplication in the field GF(2^43). We point out that the algorithm is also suitable for hardware implementation.

1. Survey of results

The study of speeding up arithmetic operations in the binary field GF(2^n) has recently been pursued very actively because of its practical importance [-]. Whereas many successful results on hardware architectures for high-speed implementation of binary field multiplication (BM) have been reported, there has been no remarkable advance in its software implementation since Montgomery's method [2] [- ].

In this paper we report a software implementation of BM based on a new fractal-structured algorithm, which in practice is much faster than the current methods for multiplication [,6,4,5]. For example, in GF(2^43) this algorithm is about 7 times faster than Montgomery multiplication [2] and more than 0 times faster than the improved standard multiplication [6].

Our algorithm is based on Karatsuba's recursive subdivision (KR) method [] for polynomial multiplication (PM).
In practical implementations of PM, although the KR method performs fewer multiplications, each of smaller size, at every recursive step, it needs so much overhead that it has been regarded as rather inefficient and has been disregarded until now [3]. We have greatly reduced this overhead by simulating the KR process with the structure of the Sierpinski triangle, a typical example of a fractal, and by determining the combinatorial structure with which the auxiliary buffers and additions needed in every recursive step are embedded into the result of the entire process.

2. Basis of the algorithm

In this paper we employ the polynomial basis to represent elements of the binary finite field GF(2^n). Let $f(x)$ denote the defining polynomial of the binary field GF(2^n). The computation of the product $c(x) = a(x)\,b(x) \bmod f(x)$ of $a, b \in GF(2^n)$ can then be divided into a PM $a(x)\,b(x)$ and a reduction of the result of the PM modulo $f(x)$. As is well known, the cost of the reduction is so small that it is negligible in comparison with the cost of the multiplication.

Karatsuba's method for computing the PM, which is the main object in speeding up BM, is as follows. To keep the description simple, we henceforth suppose that the length of the binary finite field is $n = 2^k$ ($k \in \mathbb{N}$). Since the degrees of $a(x)$ and $b(x)$ do not exceed $n-1$, we can write

$$a(x) = a_1(x)\,x^{n/2} + a_2(x), \qquad \max(\deg a_1, \deg a_2) < n/2,$$
$$b(x) = b_1(x)\,x^{n/2} + b_2(x), \qquad \max(\deg b_1, \deg b_2) < n/2,$$

and then the PM $a(x)\,b(x)$ of size $n$ is computed by

$$a(x)\,b(x) = a_1 b_1\,x^{n} + \left[(a_1 + a_2)(b_1 + b_2) + a_1 b_1 + a_2 b_2\right] x^{n/2} + a_2 b_2,$$

that is, it is converted into a combination of the three PMs $a_1 b_1$, $a_2 b_2$ and $(a_1 + a_2)(b_1 + b_2)$, each of size at most $n/2$. In this paper we call this conversion a Karatsuba recursive subdivision (KR) process.

By the KR process, when $n = 2^k$, $a(x) = \sum_{i=0}^{n-1} a_i x^i$ and $b(x) = \sum_{i=0}^{n-1} b_i x^i$, the product $a(x)\,b(x)$ is obtained by computing $3^k$ products of the type

$$\left(\sum_{u=1}^{t} a_{i_u}\right)\left(\sum_{v=1}^{t} b_{i_v}\right)$$

of unit size and combining them. Henceforth we call these products of the least size basic products.

What is needed is to determine, or index, these basic products and to find the structure of the scheme in which they take part. Treating this by a purely recursive, linear construction seems rather complicated. Instead we consider it in the plane, arranging the $3^k$ basic products of the KR process on the cells of the pre-fractal figure obtained by the $k$-th application of Sierpinski's system of iterated functions.

To describe our method of arranging them, we first introduce some notation and definitions. For simplicity we denote the basic product $\left(\sum_{u=1}^{t} a_{i_u}\right)\left(\sum_{v=1}^{t} b_{i_v}\right)$ by $(i_1, \ldots, i_t)$. For $r \in \mathbb{Z}$ we define

$$(i_1, \ldots, i_t) \oplus r := (i_1 + r, \ldots, i_t + r), \qquad (i_1, \ldots, i_t) \circ (j_1, \ldots, j_t) := (i_1 + j_1, \ldots, i_t + j_t),$$

and we denote the set consisting of all basic products in the $k$-th step by $S_k$. Given two sets $A$, $B$ composed of basic products, we define the operation

$$A \circ B := \{\, x \circ y \mid x \in A,\ y \in B,\ \exists\, r \in \mathbb{Z},\ y = x \oplus r \,\}$$

and, for $r \in \mathbb{Z}$, $A \oplus r := \{\, x \oplus r \mid x \in A \,\}$.

Simulating the KR process by a Sierpinski triangle is based on the following fact.

[Lemma 1] For any $k$,

$$S_{k+1} = S_k \sqcup \{S_k \oplus 2^k\} \sqcup \{S_k \circ \{S_k \oplus 2^k\}\},$$

where the symbol $\sqcup$ represents the disjoint sum. (Proof omitted.)

From this lemma we can construct three mappings $\omega_1$, $\omega_2$, $\omega_3$ between $S_k$ and $S_{k+1}$, associated with the elements $x$, $x \oplus 2^k$ and $x \circ (x \oplus 2^k)$, where $x \in S_k$, respectively. This system of mappings corresponds to the iterated function system of a Sierpinski triangle.
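The recursive structure stated in Lemma 1 can be illustrated by enumerating the basic products explicitly. The Python sketch below is not the authors' code. It assumes that each basic product is identified by the tuple of coefficient indices that are summed in it, and that the product coming from the $(a_1 + a_2)(b_1 + b_2)$ branch is represented by merging the index tuple of a low-half product with that of its shifted copy. The function name `basic_products` and this tuple representation are illustrative choices, not notation from the paper.

```python
def basic_products(k):
    """Index tuples of the basic products for operands with 2**k coefficients.

    Assumption (illustrative reading of Lemma 1): a basic product
    (sum of a_i for i in I) * (sum of b_i for i in I) is represented
    by the sorted tuple I of coefficient indices.
    """
    if k == 0:
        return [(0,)]                      # single coefficient product a_0 * b_0
    prev = basic_products(k - 1)
    shift = 2 ** (k - 1)
    low = prev                             # products from the low halves
    high = [tuple(i + shift for i in x) for x in prev]   # same products, shifted
    # Products from (a_1 + a_2)(b_1 + b_2): each one pairs a low-half
    # product with its shifted copy, so the index tuples are merged.
    mixed = [x + y for x, y in zip(low, high)]
    return low + high + mixed


if __name__ == "__main__":
    for k in range(4):
        products = basic_products(k)
        print(k, len(products))            # the count grows as 3**k
```

Under these assumptions the three parts `low`, `high` and `mixed` are pairwise disjoint, which mirrors the disjoint sum in the lemma, and the number of basic products triples at every step.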
Using these mappings, $S_{k+1}$ can be placed completely on a $(k+1)$-size Sierpinski triangle: the basic products corresponding to $S_k \subset S_{k+1}$ are distributed on the upper left triangle of size $k$ of the $(k+1)$-size Sierpinski triangle, the basic products corresponding to $S_k \oplus 2^k$ on the upper right triangle, and those corresponding to $S_k \circ \{S_k \oplus 2^k\}$ on the lower small triangle. The exact inverse of this procedure indexes all basic products.

Next we must determine to which positions the cell corresponding to a basic product is added in the block representation of the polynomial of degree at most $2n - 2 = 2^{k+1} - 2$ obtained as the result of the $n$-size multiplication $a(x)\,b(x)$. This corresponds to finding the total multiplication result $a(x)\,b(x)$ from the arrangement of the basic products considered in the discussion above, i.e. from the $k$-size pre-fractal cells in a Sierpinski triangle.

To do so, we proceed as follows. We add the three $(k-1)$-size triangles in the $k$-size Sierpinski triangle together, cell by corresponding cell, and put the result (i.e. $S_{k-1} + S_{k-1} \oplus 2^{k-1} + S_{k-1} \circ \{S_{k-1} \oplus 2^{k-1}\}$) in place of $S_{k-1} \circ \{S_{k-1} \oplus 2^{k-1}\}$ in the $k$-size Sierpinski triangle. The same procedure is then iterated for each of the three $(k-1)$-size pre-fractal triangles. The following theorem shows that this procedure makes it possible to determine the arrangement of the basic products in the product $a(x)\,b(x)$.

[Theorem 1] The result of projecting the $k$-size cells obtained by the above procedure onto the bottom side of the Sierpinski triangle is the product $a(x)\,b(x)$, where projection means XOR-addition at the corresponding position. (Proof omitted.)

Consequently, when a finite field GF(2^n) is given, we first construct a pre-Sierpinski triangle of size $k$ composed of indexed basic products and then obtain the arrangement according to [Theorem 1]. We call this procedure the construction of a multiplication system for the finite field GF(2^n). We can then compute the product of any two elements $a$, $b$ of the field by computing the basic products and combining them by the constructed operation system. This requires neither needless additions, which would cancel under XOR-addition, nor delays caused by intermediate buffers. If the PMs of some size $v$ are computed in advance and obtained by table look-up [], the computation of the basic products becomes simpler.

3. Analysis of FK implementation

FK is designed so that, for a given problem size $n$, the number $t$ of steps of the KR procedure to be carried out, the table size $v$ for table look-up, and the word size $w$ are selected to be optimal. In the following lemma the cost of the FK implementation is estimated by the number of bit-XOR operations.

[Lemma 2] [7] The cost of BM in FK is as follows:

t 3 3 Q( t, w, v) = + + ( + w + ) w 4 wv t

The selection of $v$, $w$ and $t$ used in FK depends on the following fact.

[Lemma 3] [7] The cost of PM in FK is optimized when

4 log t = m log 3, log v 3 log w
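For orientation, the overall computation discussed above (a polynomial multiplication by the KR process followed by reduction modulo the defining polynomial) can be sketched in Python. This is a plain recursive Karatsuba over GF(2), not the fractal-structured FK scheme and not the authors' C code. The function names, the representation of polynomials as integer bit masks, and the example field GF(2^8) with defining polynomial x^8 + x^4 + x^3 + x + 1 are assumptions chosen for illustration only.

```python
def clmul_schoolbook(a, b):
    """Reference carry-less product in GF(2)[x]; polynomials are int bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in GF(2)[x] is XOR
        a <<= 1
        b >>= 1
    return r


def clmul_karatsuba(a, b, n):
    """Karatsuba product of two polynomials of degree < n (n a power of two)."""
    if n == 1:
        return a & b        # basic product of two single coefficients
    h = n // 2
    mask = (1 << h) - 1
    a_hi, a_lo = a >> h, a & mask
    b_hi, b_lo = b >> h, b & mask
    hi = clmul_karatsuba(a_hi, b_hi, h)
    lo = clmul_karatsuba(a_lo, b_lo, h)
    mid = clmul_karatsuba(a_hi ^ a_lo, b_hi ^ b_lo, h)
    # hi * x^n + (mid + hi + lo) * x^(n/2) + lo, with + realised as XOR
    return (hi << n) ^ ((mid ^ hi ^ lo) << h) ^ lo


def reduce_mod(c, f):
    """Reduce polynomial c modulo f by repeated shifted XOR."""
    deg_f = f.bit_length() - 1
    while c.bit_length() - 1 >= deg_f:
        c ^= f << (c.bit_length() - 1 - deg_f)
    return c


def field_mul(a, b, f, n):
    """Multiply a and b in GF(2^n) defined by f: Karatsuba PM, then reduction."""
    return reduce_mod(clmul_karatsuba(a, b, n), f)


if __name__ == "__main__":
    F = 0x11B               # x^8 + x^4 + x^3 + x + 1 (assumed example field)
    print(hex(field_mul(0x57, 0x83, F, 8)))
```

The recursion performs three half-size products per level, which is where the 3^k basic products come from. The sketch allocates intermediate results at every level, which is exactly the kind of overhead the FK arrangement is intended to avoid.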
The following table shows the costs of different algorithms for BM and an experimental comparison of FK with the standard multiplication and Montgomery multiplication. (In FK, when n = 43, the table size v = 8 and the word size w = 3 are selected.) From the practical viewpoint, we have estimated the costs on the supposition that all algorithms refer to 8-bit multiplication tables [, 6], and we carried out the experiment in GF(2^43) on a Pentium Ⅲ computer with an Intel CPU, compiled without compiler optimization from C source code edited in Microsoft Visual C++ 6.0.

Columns: Algorithm; Cost; Speed of implementation in GF(2^43) (seconds per million multiplications); Rate of speed enhancement

Improved standard multiplication [6]: 34S + 3S 4 0.7
Montgomery multiplication [2]: 6 S + S 4 7
FK: log 3 S + 8S + 34 9 6

Table 1. Comparison of the costs of FK with those of Montgomery and improved standard multiplication, and of their implementation speeds

4. Conclusion

In this paper we have proposed an algorithm, FK, based on simulating Karatsuba's procedure for polynomial multiplication by a Sierpinski triangle, which implements multiplication in binary fields very efficiently. Practice has shown that FK is much faster than standard or Montgomery multiplication. FK has been recognized to be very effective for hardware implementation, too. We are now using FK as a special routine for the basic operations of elliptic curve cryptosystems over binary fields.

References

[1] I. F. Blake, G. Seroussi and N. P. Smart, Elliptic Curves in Cryptography, Cambridge, U.K.: Cambridge Univ. Press, 1999.
[2] Ç. K. Koç and T. Acar, Montgomery Multiplication in GF(2^k), Designs, Codes and Cryptography, 14, 57-69, 1998.
[3] A. Halbutoğulları and Ç. K. Koç, Parallel Multiplication in GF(2^k) using Polynomial Residue Arithmetic, Designs, Codes and Cryptography, 20, 155-173, 2000.
[4] M. Aydos, T. Yanık and Ç. K. Koç, High-Speed Implementation of an ECC-based Wireless Authentication Protocol on an ARM Microprocessor, IEE Proc. Commun., 148, 5, 273-279, 2001.
[5] R. Katti and J. Brennan, Low Complexity Multiplication in a Finite Field Using Ring Representation, IEEE Trans. Computers, 52, 4, 418-427, 2003.
[6] Y. Han, P. C. Leong, P. C. Tan and J. Zhang, Fast Algorithms for Elliptic Curve Cryptosystems over Binary Finite Field, Asiacrypt'98, 75-84, 1999.
[7] S. I. Kim, G. H. Kim and C. S. S, A Generalized Recursive Subdivision Method for High-Speed Implementation of Binary Field Multiplication, Science of Information,, -3, 2005.
[8] W. Geiselmann, A New Representation of Elements of Finite Fields GF(2^m) Yielding Small Complexity Arithmetic Circuits, IEEE Trans. Computers, 51, 12, 1460-1461, 2002.
[9] C. H. Kim, S. Oh and J. Lim, A New Hardware Architecture for Operations in GF(2^n), IEEE Trans. Computers, 51, 1, 90-92, 2002.
[10] M. Elia, et al., On the Inherent Space Complexity of Fast Parallel Multipliers for GF(2^m), IEEE Trans. Computers, 51, 3, 346-351, 2002.
[11] H. Wu, Bit-Parallel Finite Field Multiplier and Squarer Using Polynomial Basis, IEEE Trans. Computers, 51, 7, 750-758, 2002.
[12] A. Satoh and K. Takano, A Scalable Dual-Field Elliptic Curve Cryptographic Processor, IEEE Trans. Computers, 52, 4, 449-460, 2003.
[13] R. Crandall and C. Pomerance, Prime Numbers: A Computational Perspective, Springer, p. 434, 2001.
[14] A. Reyhani-Masoleh and M. A. Hasan, Fast Normal Basis Multiplication Using General Purpose Processors, IEEE Trans. Computers, 52, 11, 1379-1390, 2003.
[15] D. Hankerson, J. Lopez and A. Menezes, Software Implementation of Elliptic Curve Cryptography over Binary Fields, CHES 2000, 1-24, 2000.
[16] A. Reyhani-Masoleh and M. A. Hasan, Efficient Multiplication Beyond Optimal Normal Bases, IEEE Trans. Computers, 52, 4, 428-439, 2003.