Meaningful electronic signatures based on an automatic indexing method

Maxime Wack, Ahmed Nait-Sidi-Moh*, Sid Lamrous, Nathanael Cottin
Systems and Transports Laboratory, University of Technology of Belfort-Montbéliard, 90010 Belfort Cedex, France
{maxime.wack, ahmed.nait-sidi-moh, sid.lamrous}@utbm.fr, nathanael.cottin@escae.fr

ABSTRACT. Legal information certification and secured storage, combined with the electronic signature of documents, are of great interest when the security and conservation of digital documents are at stake. These new and evolving technologies offer powerful abilities, such as identification, authentication and certification, which contribute to increasing the global security of the conservation of, and access to, legal digital archives. However, currently used cryptographic and hash coding concepts cannot intrinsically enclose cognitive information about both the signer and the signed content. An evolution of these technologies may therefore be necessary to achieve full-text searches within hundreds or thousands of electronically signed documents. This article aims at describing a possible model, along with associated processes, to create and make use of these new electronic signatures, called meaningful electronic signatures as opposed to traditional electronic signatures based on bit per bit computation.

1. Introduction

The EESSI final report (EES, 1999), which inspired the European Parliament and Council 1999/93/CE directive (EPC, 1999), validated the technical and juridical acceptability of an electronic signature linked with a digital content. This electronic signature would be considered, depending on security and environmental criteria, as a strict equivalent to a handwritten signature. Since the French law of March 13, 2000 (OJ, 2000), a legal value can be attributed to electronically generated, signed and stored documents as long as they meet common criteria, such as fidelity and long-term conservation, listed by the French Civil Code. As far as electronic signature applied to digital information is concerned, professionals express their wish to take part in this necessary evolution of enterprise document management, especially in terms of signature processes and of the archival and retrieval of signed documents.
We thus propose in this article a possible model which handles the construction of electronic signatures in accordance with the legal context previously mentioned (Hasz, 2006). Electronic signature has become a widely used way of authenticating digital information. It benefits from numerous research works on asymmetric cryptography and hashcoding. Basically, electronic signature reproduces the old wax seals used in antiquity (NIST, 2000). A seal may be compared with a secret signature key which should be the sole possession of the signatory (the entity that signs the information). However, a main difference between seals and electronic signatures must be pointed out: whereas a seal is affixed on the support of the information (the paper for example), an electronic signature is generated using the information itself. As a consequence, a seal remains the same (which makes it possible to identify its owner), but an electronic signature depends on the information it refers to and therefore differs from one piece of information to another. A given source information must always produce a unique electronic signature, and two different sources must lead to two different electronic signatures (using a single hashcoding algorithm along with a single private key, called signature key).

To each signature key corresponds a unique public key, called verification key. This verification key verifies the signature authenticity and the information integrity (we suppose that the hashcoding process is collision-resistant). The identity of the signatory cannot be proved until a legally approved link can be given between the verification key and the signatory, such as a public-key certificate. The generation of electronic signatures relies on asymmetric cryptography applied to hash coded data. Contrary to common cryptography, electronic signature does not aim at preserving information confidentiality but rather data authenticity and non-repudiation (Kaeo, 1999).

* Corresponding author. Email address: ahmed.nait-sidi-moh@utbm.fr

2. Motivation

2.1. A new hashcoding approach

Hashcoding (Menezes, 2001), which creates digital fingerprints from data flows, generates a fixed-length message from any source flow. The result is independent from the source flow length. Let $h(s)$ be a secured hashcoding function used on a source flow $s$. To be eligible for traditional electronic signature construction, this function must answer the following requirements:

- A given source flow $s$ must always produce the same hash code.
- Two different source flows $s$ and $s'$ must produce different results (strong collision resistance).
- A hash code value does not hold any computable information which could be used to recreate its source flow (non-reversibility).

These requirements may be mathematically expressed as follows:

1. $h(s) = h(s') \Rightarrow s = s'$ and $h(s) \neq h(s') \Rightarrow s \neq s'$
2. $h(s) = d \Rightarrow p\left(h^{-1}(d) = s\right) \approx 0$

where $d$ refers to the digital fingerprint produced by $h(s)$ and $p$ is a probability. Indeed, the second logical equation indicates that although reconstructing $s$ from $d$ is theoretically possible, it appears to be computationally infeasible. As these algorithms directly work on bit flows (and are destructive), they do not provide ways to get a rough view of the hashed source data.
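As an illustration of the requirements listed above for a traditional hash function, the following minimal sketch uses SHA-256 from the Python standard library. The choice of SHA-256 and the name `fingerprint` are assumptions made for this example only; the article itself discusses the MD and SHA families in general.

```python
import hashlib


def fingerprint(source: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of a source flow (illustrative helper)."""
    return hashlib.sha256(source).hexdigest()


short_flow = b"contract"
long_flow = b"contract" * 1000

# A given source flow always produces the same hash code.
assert fingerprint(short_flow) == fingerprint(short_flow)

# The result has a fixed length, independent from the source flow length.
assert len(fingerprint(short_flow)) == len(fingerprint(long_flow)) == 64

# These two different source flows produce different hash codes.
assert fingerprint(short_flow) != fingerprint(long_flow)
```

The digest gives no readable hint about the hashed text, which is precisely the limitation that motivates the meaningful trace discussed in this article.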
In the particular case of pure textual documents (and mainly taking legal considerations into account), it may be of great importance to get the main ideas of a document. The impact of the proposed technology is that one can grasp the ideas of the original document from its hash code, which is not the case with traditional hash functions, where the relation between a document and its hash is not reversible. Lawyers would find this method of particular interest for retrieving information related to a document starting from its fingerprint. A meaningful signature may advantageously supplement other document retrieval methods as it would increase the recall level [2]. As Losee et al. (Losee, 2003) claim, a high recall level is desirable in environments such as law or academia, where the cost of missing a relevant document may be very high. Therefore, lawyers may want to consider electronic signatures as proof elements by themselves (e.g. when the signed content has been destroyed, either accidentally or on purpose), which traditional hashing techniques do not allow.

Thus, rethinking hashcoding, we suggest making use of the words of the document to compute the hash code prior to creating the electronic signature (with the help of a private key). Although our hashcoding technique may be used in place of traditional hashcoding systems using real numbers, we intentionally use integers, as we consider that collision resistance is not of great interest here (documents with a legal scope).

[1] Data deciphering or decrypting leads to getting comprehensible information from a ciphered text. However, decryption is used when the entity does not own the appropriate key and thus tries to break the ciphered text.
[2] This retrieval performance indicator refers to the probability that a document is retrieved given that it is relevant to a given search query.

2.2. Related work

In the framework of information systems, and in order to ensure the security of electronic interchanges, a certain number of hashcoding algorithms have been proposed and implemented, such as MD2 (Kaliski, 1992), MD4 (Rivest, 1991) and RIPEMD (Dobbertin, 1996) (Preneel, 1997). For example, (Preneel, 1995) considers the security of message authentication code (MAC) algorithms. In this study, a new generic construction is proposed for transforming any secure hash function of the MD4 family into a secure MAC of equal or smaller bit length and comparable speed. Rivest proposed two very fast hash functions which are motivated by their use for RSA data security, namely MD4 and MD5 (Rivest, 1992a, 1992b). As the successor to MD5, the SHA (Secure Hash Algorithm) family (NIST, 1995) is a set of related cryptographic hash functions. The SHA algorithms were designed by the National Security Agency (NSA) and published as a US government federal information processing standard. They are used for the protection of sensitive unclassified information, and private and commercial organizations are encouraged to employ them.

Some of the most widely used algorithms to generate electronic signatures (mainly SHA-1 and MD5) have been proved to be no longer collision-resistant (Wang, 2005) and new versions have been submitted to the community. As said previously, secured hashcoding algorithms are designed to make sure (in theory) that the original message cannot be reconstructed from its fingerprint. As this type of hash code does not give valuable information on the source flow, it must be used when the hash code has to be transmitted to a third party that needs to know the existence of an information without knowing its content (during a timestamping process for example), or when the electronic signature is generated on a multipart document which includes multimedia content (images, sound, video) and text formatting elements (extra tags used to format the document on screen or for printing). In particular, such an electronic signature is computationally but not semantically linked to the source message.
We suggest hereafter a new hashcoding technique which applies to textual documents and is based on the spatial representation of words (Lamrous et al., 1997).

3. Meaningful electronic signature creation process

Electronic signature is of great importance as far as the signed content refers to a juridical context. However, the WYSIWYS («What You See Is What You Sign») paradigm is limited when the document to be signed encloses dynamic content (macros or dynamic fields for example), holds secrets (by means of image steganography techniques) or needs a non-secured (i.e. non-approved) proprietary viewer. Our scope is therefore strictly limited to textual information (which makes up most juridical documents). Associating a semantic with the electronic signature makes sure that the source document and the generated electronic signature are strongly linked, and also that the source data used to create the signature are solely made of static textual elements (without any visualisation or printing information, etc.) which do not need any particular viewing process. Generating this kind of electronic signature makes it possible to efficiently retrieve information from archives while limiting full-text searches, which are time and processor consuming.

3.1. Indexing with hash-code

Whereas common electronic signature generation processes rely on bit-related hashcoding, we define the notion of trace. A trace gathers all lemmas [3] along with a set of calculated parameters that provide information on the distribution and position of words within the document to be signed (Lamrous, 1999). This trace dramatically reduces the size of the original document, and one can roughly recreate a document from its trace. Traces can also be compared to determine the correlation between documents. It may be possible to gain confidence in the similarity of two documents by analyzing their first lemmas: the more identical parameter values, the more confidence in their similarity, considering that a trace is unique and collision-resistant.

As a trace may enclose many parameters (and thus may not be applicable to electronic signature generation, because of the time consumption and the size of the generated trace before the cryptographic process), we have worked on reducing the size of our traces on the one hand and keeping significant meaning on the other hand. Therefore, we focused on the lemmas found to be meaningful by the Lamrous algorithm (Lamrous, 1999), followed by their traceability (a reduced set of parameters which does not distort the distribution of these lemmas within the document). This process allows constructing a certain number of signature varieties, independently from languages and domains.

Finally, to simplify the trace generation process, an extra step is performed before the trace algorithm applies: all accentuated characters (such as those of the French language) are replaced by their non-accentuated equivalent (é, ê, è become e) and special characters are removed (parentheses, percent signs, etc.). The whole content is then lowercased to avoid case-sensitive comparison tests. A list of parameters, called traceability, is computed for each significant lemma. It contains the following fields:

- The lemma occurrence, or pertinence degree of the lemma.
- Its barycentre.
- Its variance.

[3] Lemma: common part of a word which can be declined to produce related words: «worker» and «work» are both issued from the «work» lemma.

3.1.1. Lemma distribution, barycentre computation

We suggest focusing on the textual spatial distribution signal of each lemma. However, keeping an exhaustive list of every position of every lemma would increase the trace size. A representative value of this list, called barycentre, is created instead. A barycentre can be described as a harmonic centre within the whole textual representation of the document. This metric value varies from 0 to 100: a value near 0 means that the lemma appears mostly at the beginning of the document and a value which approaches 100 indicates that the lemma is concentrated at the end of the document. A lemma with a barycentre of 50 is quite ambiguous as it may be interpreted either as a uniform distribution or as a concentration around the centre of the document.
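The preliminary normalisation step described above (accent removal, special character removal, lowercasing) can be sketched as follows. This is only an illustration under stated assumptions: the function name `normalise` is hypothetical, and replacing special characters by a space (rather than deleting them outright) is a choice made here so that adjacent words do not merge; the article does not specify this detail.

```python
import re
import unicodedata


def normalise(text: str) -> str:
    """Strip accents, drop special characters and lowercase a text (illustrative)."""
    # Decompose accented characters into a base letter plus combining marks,
    # then discard the combining marks ("é" becomes "e").
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        char for char in decomposed if not unicodedata.combining(char)
    )
    # Assumption: every character that is neither alphanumeric nor whitespace
    # is replaced by a space.
    cleaned = re.sub(r"[^A-Za-z0-9\s]", " ", without_accents)
    # Lowercase and collapse runs of whitespace.
    return " ".join(cleaned.lower().split())
```

For instance, `normalise("Également, la Fumée (50 %)")` returns `"egalement la fumee 50"`, which matches the accent-free, lowercase form of the lemmas handled by the trace algorithm.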
The variance of the lemma helps the interpretation: it is used to measure the coverage rate of the lemma within the document, as it evaluates the distance between the declinations of the lemma (known as the user question). The smaller this distance, the stronger the connection between the subject and the question. The position numbers (ranks) of the lemma are considered as the basic source data. To make results uniform, they are converted to a 0 to 100 value in comparison with the size of the document. Let X be a lemma with an occurrence value of 1. Its computed position is its rank divided by the number of words of the document, multiplied by 100; in the example considered, it is about 43. This computed value refers to the lemma location within the textual representation of the document.

Hereafter is a graphical example which illustrates a barycentre computation for a given lemma having an occurrence greater than 1 (it appears more than once in the document). The purpose is to associate an average but representative value with the location of this lemma in the document. Let t be a textual document represented by an axis graduated from 0 to 100 (barycentre minimum and maximum values). A lemma distribution is represented on this axis as follows:

(Figure: textual representation t, positions of lemma X, and first barycentre position of X, i.e. the average value of the different positions of X.)

The previous figure illustrates the particular case of an average value of word positions which leads to an inappropriate vision of the barycentre: the single left-side value pulls the average away from the area where the lemma is concentrated. It thus appears suitable to assign different multiplicative coefficients depending on the average value of the standard deviations: a levelling which depends on the distance between this average value and the individual positions has to be performed. Therefore, let:

$m = \frac{1}{n} \sum_{i=1}^{n} x_i$

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - m)^2$

where:
- $x_i$: the different positions of X (word), for every value of $i$ from 1 to $n$;
- $n$: occurrence value of the X lemma.

One uses a Gaussian function $f$ that associates low weights with distant points, with $f \in [0, 1]$. More precisely, the closer $x_i$ is to the first average $m$, the greater $f(x_i)$ must be (close to 1), and conversely. The corresponding barycentre value can then be calculated:

$\text{Barycentre} = \sum_{i=1}^{n} f(x_i)\, x_i$

Based on this barycentre value applied to lemmas, a new gravity centre is obtained, closer to the area where the lemma is concentrated.

3.1.2. Pertinence degree of the lemma

We studied the curve expressing lemma occurrences as a function of their corresponding barycentre values, as depicted by figure 1.

Figure 1. Lemma occurrences as a function of their gravity centre within a given document textual representation (with a vertical information recovery line and a horizontal information relevance line).

One can point out the symmetrical aspect of the curve with respect to a vertical median called the information recovery line. Relevant lemmas are numerous and located near this line: this is the case of the "smoke" lemma, which is uniformly quoted. Therefore the document is more likely to deal with smoke (and its derived words, such as smoking).
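The barycentre computation described earlier (positions scaled to a 0 to 100 range, mean, variance, Gaussian weighting) can be sketched as follows. Two points are assumptions of this sketch and are not stated in the article: the width of the Gaussian is taken equal to the variance of the positions, and the weighted sum is divided by the sum of the weights so that the result stays within the 0 to 100 range. All function names are hypothetical.

```python
import math


def scaled_positions(positions, document_size):
    """Convert word ranks into values between 0 and 100."""
    return [100.0 * position / document_size for position in positions]


def mean_and_variance(xs):
    """Return the mean m and the variance of the scaled positions."""
    n = len(xs)
    m = sum(xs) / n
    variance = sum((x - m) ** 2 for x in xs) / n
    return m, variance


def barycentre(xs):
    """Gaussian-weighted average of the positions (illustrative sketch)."""
    m, variance = mean_and_variance(xs)
    if variance == 0.0:
        # All occurrences share the same position: nothing to weight.
        return m
    # Assumption: f(x) = exp(-(x - m)^2 / (2 * variance)), which lies in (0, 1]
    # and gives low weights to positions far from the first average m.
    weights = [math.exp(-((x - m) ** 2) / (2.0 * variance)) for x in xs]
    # Assumption: normalise by the sum of the weights.
    return sum(w * x for w, x in zip(weights, xs)) / sum(weights)


# Hypothetical lemma: one early occurrence, four occurrences near the end.
xs = scaled_positions([10, 700, 720, 740, 760], 1000)
first_average, _ = mean_and_variance(xs)
weighted = barycentre(xs)
```

With these hypothetical positions, `weighted` is larger than `first_average`: the isolated early occurrence receives a low weight, so the barycentre moves towards the cluster where the lemma is actually concentrated.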

Other noticeable, numerous lemmas located far from the vertical median represent local pieces of information. Finally, a particular point appears at the top of the information recovery line. It represents the "of" lemma of the English language. The combination of its high occurrence value and its close location to the vertical median makes it non-relevant (or "empty"): it is necessarily a frequently and uniformly employed word. A class of empty words may thus be defined: it encloses empty words such as "the", "of", "it", etc., usually grouped around the median and with a high occurrence value (above the horizontal information relevance line). The baseline of the curve indicates lemmas mentioned once (frequency equals 1), called hapax [4], which is the most common case. The vertical and horizontal lines (information recovery line and information relevance line respectively) are used as an entry point for calculating the gradual weighting of words within a document. The median is fixed whereas the relevance line value depends on the number of lemmas extracted from the text. An empirical approach has been proposed by Lamrous (Lamrous, 1999) from an experimental corpus. The latter was constituted of very different textual representations of numerous documents, taken from a variety of domains and of all sizes. However, all of these documents were written in the French language. Generalities can be derived from this experimental work and expressed as an algorithm which can determine the relevance degree of each lemma in accordance with the whole text, based on the spatial representation of figure 2. A similar model must be applied for any other language.

Figure 2. Spatial representation example of an article dealing with smoking (Text 3: Article (Smoking); relevant lemmas window).

The relevance degree varies from 0 to 100. Any lemma valued 100 is necessarily located within the relevance window determined using the algorithm proposed by Lamrous. These lemmas are used to generate the hash code value. This hash value is then used to create the electronic signature value. It is crucial to take into account the pertinence degree of the lemma compared to the corpus in order to refine the fourth parameter (relevance degree).
Let's take the example of a given lemma, say "concept", being relevant but repeated and scattered in a set of documents. This lemma becomes globally irrelevant and is considered as an empty word because of its lack of discrimination power. Moreover, our working context makes it tricky to index all lemmas based on a corpus which gathers all document corpuses, due to the multiplicity of the languages used. This parameter may thus cause ambiguities. In our specific case of generating meaningful electronic signatures, the fourth parameter computed by the algorithm seems sufficient to provide the cryptographic computation with a meaningful fingerprint created from a list of text-based (contrary to corpus-based) relevant lemmas.

3.1.3. Working process evaluation

This experiment relies on the analysis of a textual example. Our goal is twofold: it consists, on the one hand, in evaluating the automatic relevance degree attribution process and, on the other hand, in pointing out the lexicometric characteristics of the text.

[4] From the Greek expression hapax legomenon, which means "thing said once".

Our sample text is an excerpt from a medical journal. It focuses on smoking and related cardio-vascular diseases. Here are some indicators:

- Total number of words of the text: 3848
- Number of different words: 908
- Highest appearance degree: 6

The following table [5] shows the results obtained on a French document which deals with smoking when the algorithm is applied on significant words. Its columns are: A: lemma; B: lemma appearance degree; C: lemma barycentre; D: word variance.

Table 1. Uncompressed trace results: 405 octets. Lemmas of column A: cigarette, infarctus, malade, egalement, atherogene, atherosclerose, myocarde, myocardique, important, effets, facteur, deces, hyperte, effet, montre, etudes, fumee, vasculaire (the values of columns B, C and D are not reproduced here).

The techniques described before intervene in our meaningful electronic signature generation process. Using classical conservative (lossless) compression algorithms, our trace (see table 1) can be encoded to produce a sequence of binary values. The global size of the trace may thereby also be reduced (to less than 405 octets). This sequence of 0 and 1 bit values, called encoded trace, is used as the hash code value input of the signature generation process. Contrary to common hashcoding algorithms, our encoded trace has a variable length depending on the size of the document (and consequently on the number of extracted lemmas). However, one can extrapolate that the encoded trace size, plotted against the size of the source document, reaches an asymptotic line. Therefore, our traces should not become very heavy and remain acceptable for our signature generation purpose. The evaluation of the asymptotic line coefficient and benchmarks are currently in progress and will be published soon.

[5] This table has not been translated, as word frequency differs from one language to another.
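The encoding of a trace with a lossless compression algorithm, as described above, can be sketched as follows. The serialisation format (JSON), the compressor (zlib) and the trace values are all assumptions chosen for illustration; the article does not name a specific compression algorithm, and the numbers below are hypothetical, not taken from table 1.

```python
import json
import zlib


def encode_trace(trace):
    """Serialise a trace deterministically and compress it without loss."""
    # sort_keys and fixed separators give the same byte sequence for the
    # same trace, which matters if the result feeds a signature process.
    serialised = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return zlib.compress(serialised.encode("utf-8"), 9)


def decode_trace(encoded):
    """Recover the original trace from its encoded form."""
    return json.loads(zlib.decompress(encoded).decode("utf-8"))


# Hypothetical trace: lemma -> [occurrence, barycentre, variance].
trace = {
    "cigarette": [3, 25.0, 400.0],
    "fumee": [2, 60.5, 120.0],
}
encoded = encode_trace(trace)
assert decode_trace(encoded) == trace
```

Because the compression is lossless, the lemmas and their parameters can be read back from the encoded trace, which is what distinguishes it from a traditional hash code. For very small traces the compressed form is not necessarily shorter than the serialised one.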

3.2. Integration into electronic signature

Back to general considerations, even if our hashcoding process reflects a rough overview of the signed content, it does not provide any trusted information about the signatory. A non-refutable link between the signature (private) key (respectively the corresponding public verification key) and an identifier of the signatory must be presented to the electronic signature verifier. One of the most widely employed standards, called Public Key Infrastructure or PKI (Adams, 1999), relies on a pyramidal infrastructure which testifies to this link by providing signatories with a signature key along with an electronic certificate. This certificate basically mentions the signatory identity (first name, last name), contact information (home address, home phone number, work company, work address, work phone number) and the signatory's verification key (which can be publicly distributed). This information is sealed by the Certificate Authority (CA) that generates and delivers the signatory's certificate: the certificate is sealed with the issuing CA's electronic signature. Trusting the certificate information consists in trusting the signing CA and the current state of the certificate (see figure 3).

Figure 3. Digital certificate life cycle (states: Generated, Activated, Suspended, Revoked, Expired).

As each certificate remains active for a limited period of time (it mentions beginning and end of validity dates) and may be revoked (in case of loss or compromise), the verifier must make sure that:

1. The issuing CA is recognized as a trusted third party by courts of law.
2. The issuing CA's signature key is not compromised.
3. The signatory's certificate is valid and in its activation period (neither suspended nor revoked).

Once the verifier gets confidence in the certificate information, the signatory's verification key can be taken out and the electronic signature verified. The use of our algorithm is reflected in the internal certificate structure. Figure 4 is an example of an X.509v3 certificate (Housley, 2002). The signature algorithm field of the certificate owner (signatory), which would otherwise read for example SHA1withRSA, indicates that the Lamrous algorithm is employed instead of SHA-1 (in this particular example). The issuing CA signature algorithm used to seal the certificate is not modified, as the owner's public key sequence of bits must be signed along with identity and validity (textual) information.
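The three checks that the verifier must perform before using the signatory's verification key can be sketched as follows. This is a simplified model under stated assumptions: the `Certificate` class, its fields and the sets of trusted and compromised issuers are hypothetical, and no real X.509 parsing or cryptographic verification is performed.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Certificate:
    """Simplified, hypothetical view of a signatory's certificate."""
    issuer: str
    state: str            # "generated", "activated", "suspended", "revoked" or "expired"
    not_before: datetime  # beginning of validity
    not_after: datetime   # end of validity


def is_acceptable(cert, trusted_issuers, compromised_issuers, now):
    """Apply the three verifier checks listed above."""
    # 1. The issuing CA is recognized as a trusted third party.
    issuer_trusted = cert.issuer in trusted_issuers
    # 2. The issuing CA's signature key is not compromised.
    issuer_key_safe = cert.issuer not in compromised_issuers
    # 3. The certificate is within its validity dates and currently
    #    activated (neither suspended nor revoked).
    in_period = cert.not_before <= now <= cert.not_after
    active = cert.state == "activated"
    return issuer_trusted and issuer_key_safe and in_period and active


cert = Certificate("Example CA", "activated",
                   datetime(2006, 1, 1), datetime(2007, 1, 1))
assert is_acceptable(cert, {"Example CA"}, set(), datetime(2006, 6, 1))
```

Only when all three checks succeed would the verifier extract the verification key and verify the electronic signature itself.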

Figure 4. X.509v3 certificate internal structure (certificate information: certificate version, serial number, signature algorithm, issuer identity, validity period, owner identity, owner public key, optional fields; followed by the issuer's signature algorithm and signature value).

4. Evaluation

4.1. Trace word vector relevance evaluation

We voluntarily apply different types of noise to a given word of the trace. Let's take the example of "cigarette" from table 1 to demonstrate that:

- Our method is suitable to produce collision-resistant hash codes.
- The traces of similar documents are very close: this property is useful when one needs to prove plagiarism of documents protected by law, even when the original document has been lost or destroyed.

                 Occurrences   Barycentre   Variance
Original text    4             30,88        76,88
Noise #1         4             30,9         767,4
Noise #2         4             30,90        76,48
Noise #3         5             3,47         745,8
Noise #4         3             9,67         86,68
Noise #5         4             30,87        76,7

Table 2. "cigarette" barycentre and variance vectors depending on various noise types.

Figure 5. A simulation result (occurrence, barycentre and variance values for each noise type, Noise 1 to Noise 5).

4.2. Documents similarity measurement

The interpretation of document traces by lawyers may raise the plagiarism question. In order to quantify the similarity degree we suggest the following calculation method:

Let $D_1$ and $D_2$ be two documents very close to each other in terms of trace value (and which thus make use of the same relevant words). We associate to $D_1$ and $D_2$ two triplet vectors $V_1 = \{o_1, b_1, v_1\}$ and $V_2 = \{o_2, b_2, v_2\}$, where $o_i$, $b_i$, $v_i$ respectively refer to the occurrence, barycentre and variance values of document $i$ such that:

- Occurrence is the average value of the occurrences of all words which take part in the trace.
- Barycentre is the average value of the barycentres of all words which take part in the trace.
- Variance is the average value of the variances of all words which take part in the trace.

The similarity is obtained as follows:

$\mathrm{sim}(D_1, D_2) = \dfrac{{}^tD_1 \cdot D_2}{\sqrt{({}^tD_1 \cdot D_1)\,({}^tD_2 \cdot D_2)}} = \dfrac{\sum_{i=1}^{3} D_{1,i}\, D_{2,i}}{\sqrt{\sum_{i=1}^{3} D_{1,i}^2}\ \sqrt{\sum_{i=1}^{3} D_{2,i}^2}}$

where $D_j$ stands for the triplet vector $V_j$ associated with document $D_j$ and ${}^tD_j$ is the transpose of this vector. A similarity value near 1 refers to a slight difference between these two documents and thus to a small angle value: $D_1$ and $D_2$ are then very similar.

5. Conclusion

We suggested in this article a method for generating secured hash codes, as an alternative to classical hash coding techniques based on bit computation (such as MD5 or SHA). This method, which relies on the relevance degree of selected words within a textual document, allows creating a meaningful trace used as a fingerprint. This trace is a reduced representation of the text it refers to and can thus offer a rough view of the document content. Such a hashcoding method is suitable in terms of size and computation performance and thus appears eligible for electronically signing legal documents. It may replace existing hash coding methods on textual documents.

One can appreciate the possibilities of this method as the generated trace becomes the thesaurus of the signed document. As archiving processes usually rely on detached signatures (signatures separated from the signed content), we are now able to pre-classify documents without knowing their exact contents, with the help of the signature thesaurus. However, the proposed meaningful signatures should not be used directly by data search engines, as it would be very time consuming to decipher each signature to get its enclosed thesaurus. It may instead be valuable to keep this thesaurus next to the signature and to perform data searches on the thesaurus.
Once a document appears to be relevant (based on given search criteria), it is selected along with its signature. This integration within the processes of archival trusted third parties, as well as a deeper evaluation of our hashcoding process and of archival system performance, will be discussed in other communications.

7. Bibliography

Adams C., Farrell S., RFC 2510: Internet X.509 Public Key Infrastructure Certificate Management Protocols, March 1999.

Dobbertin H., Bosselaers A., Preneel B., RIPEMD-160, a strengthened version of RIPEMD, Fast Software Encryption, LNCS vol. 1039, D. Gollmann Ed., pp. 71-82, 1996.

European Electronic Signature Standardization Initiative (EESSI), Final Report of the EESSI Expert Team, July 1999.

European Parliament and Council 1999/93/CE directive.

Hasz B., Nait-Sidi-Moh A., Wack M., Lamrous S., Cottin N., Signature Signifiante, submitted to the European Patent Office, Patentlaan 2, 2280 HV Rijswijk (ZH), 2006.

Housley R., Polk W., Ford W., Solo D., RFC 3280: Internet X.509 Public Key Infrastructure, Certificate and Certificate Revocation List (CRL) Profile, April 2002.

Kaeo M., Designing Network Security, Macmillan Technical Publishing, USA, 1999.

Kaliski Jr B. S., RFC 1319: The MD2 Message-Digest Algorithm, RSA Laboratories, April 1992.

Lamrous S.-A., Modélisation et réalisation d'un système prototype interactif de recherche d'information multimédia à forte composante textuelle, PhD thesis, University of Technology of Compiègne, 1999.

Lamrous S.-A., Trigano P., Organisation des bases documentaires, vers une exploitation optimale, Document Numérique, Hermès ed., Volume 4/1997, 1997.

Losee R. L., Church Jr L., Information Retrieval with Distributed Databases: Analytic Models of Performance, IEEE Transactions on Parallel and Distributed Systems, 2003.

Menezes A. J., Van Oorschot P. C., Vanstone S. A., Handbook of Applied Cryptography, CRC Press, USA, February 2001.

National Institute of Standards and Technology (NIST), Secure Hash Standard (SHS), Federal Information Processing Standards Publication, FIPS PUB 180-1, April 1995.

National Institute of Standards and Technology (NIST), Digital Signature Standard (DSS), Federal Information Processing Standards Publication, FIPS PUB 186-2, January 2000.

Official Journal, March 14, 2000.

Preneel B., van Oorschot P. C., MDx-MAC and Building Fast MACs from Hash Functions, Proc. Crypto '95, Springer-Verlag LNCS, August 1995.

Preneel B., Bosselaers A., Dobbertin H., The cryptographic hash function RIPEMD-160, CryptoBytes, vol. 3, No. 2, pp. 9-14, 1997.

Rivest R. L., The MD4 message digest algorithm, Proc. Crypto '90, LNCS 537, Springer-Verlag, 1991.

Rivest R. L., RFC 1320: The MD4 Message-Digest Algorithm, MIT Laboratory for Computer Science and RSA Data Security, April 1992.

Rivest R. L., RFC 1321: The MD5 Message-Digest Algorithm, Internet Activities Board, Internet Privacy Task Force, April 1992.

Wang X., Yu H., How to Break MD5 and Other Hash Functions, Advances in Cryptology, Eurocrypt 2005, Lecture Notes in Computer Science Vol. 3494, R. Cramer ed., Springer-Verlag, 2005.