The EigenTrust Algorithm for Reputation Management in P2P Networks

Sepandar D. Kamvar, Stanford University, sdkamvar@stanford.edu
Mario T. Schlosser, Stanford University, schloss@db.stanford.edu
Hector Garcia-Molina, Stanford University, hector@db.stanford.edu

ABSTRACT

Peer-to-peer file-sharing networks are currently receiving much attention as a means of sharing and distributing information. However, as recent experience shows, the anonymous, open nature of these networks offers an almost ideal environment for the spread of self-replicating inauthentic files. We describe an algorithm to decrease the number of downloads of inauthentic files in a peer-to-peer file-sharing network that assigns each peer a unique global trust value, based on the peer's history of uploads. We present a distributed and secure method to compute global trust values, based on Power iteration. By having peers use these global trust values to choose the peers from whom they download, the network effectively identifies malicious peers and isolates them from the network. In simulations, this reputation system, called EigenTrust, has been shown to significantly decrease the number of inauthentic files on the network, even under a variety of conditions where malicious peers cooperate in an attempt to deliberately subvert the system.

Categories and Subject Descriptors
C.2.4 [Computer-Communication Networks]: Distributed Systems - Distributed applications; H.3.3 [Information Systems]: Information Storage and Retrieval - Selection; H.2.7 [Information Systems]: Database Management - Security, integrity and protection

General Terms
Algorithms, Performance, Theory

Keywords
Peer-to-Peer, reputation, distributed eigenvector computation

1. INTRODUCTION

Peer-to-peer file-sharing networks have many benefits over standard client-server approaches to data distribution, including improved robustness, scalability, and diversity of available data.
However, the open and anonymous nature of these networks leads to a complete lack of accountability for the content a peer puts on the network, opening the door to abuses of these networks by malicious peers.

Copyright is held by the author/owner(s). WWW2003, May 20-24, 2003, Budapest, Hungary. ACM 1-58113-680-3/03/0005.

Attacks by anonymous malicious peers have been observed on today's popular peer-to-peer networks. For example, malicious users have used these networks to introduce viruses such as the VBS.Gnutella worm, which spreads by making a copy of itself in a peer's Gnutella program directory, then modifying the Gnutella.ini file to allow sharing of .vbs files [19]. Far more common have been inauthentic file attacks, wherein malicious peers respond to virtually any query providing decoy files that are tampered with or do not work.

It has been suggested that the future development of P2P systems will depend largely on the availability of novel methods for ensuring that peers obtain reliable information on the quality of resources they are receiving [6]. In this context, attempting to identify malicious peers that provide inauthentic files is superior to attempting to identify inauthentic files themselves, since malicious peers can easily generate a virtually unlimited number of inauthentic files if they are not banned from participating in the network.

We present such a method wherein each peer $i$ is assigned a unique global trust value that reflects the experiences of all peers in the network with peer $i$. In our approach, all peers in the network participate in computing these values in a distributed and node-symmetric manner with minimal overhead on the network. Furthermore, we describe how to ensure the security of the computations, minimizing the probability that malicious peers in the system can lie to their own benefit. And finally, we show how to use these values to identify peers that provide material deemed inappropriate by the users of a peer-to-peer network, and effectively isolate them from the network.

2. DESIGN CONSIDERATIONS

There are five issues that are important to address in any P2P reputation system.

1. The system should be self-policing. That is, the shared ethics of the user population are defined and enforced by the peers themselves and not by some central authority.
2. The system should maintain anonymity. That is, a peer's reputation should be associated with an opaque identifier (such as the peer's Gnutella username) rather than with an externally associated identity (such as a peer's IP address).
3. The system should not assign any profit to newcomers. That is, reputation should be obtained by consistent good behavior through several transactions, and it should not be advantageous for malicious peers with poor reputations to continuously change their opaque identifiers to obtain newcomers status.
4. The system should have minimal overhead in terms of computation, infrastructure, storage, and message complexity.
5. The system should be robust to malicious collectives of peers who know one another and attempt to collectively subvert the system.
3. REPUTATION SYSTEMS

An important example of successful reputation management is the online auction system eBay [9]. In eBay's reputation system, buyers and sellers can rate each other after each transaction, and the overall reputation of a participant is the sum of these ratings over the last 6 months. This system relies on a centralized system to store and manage these ratings.

In a distributed environment, peers may still rate each other after each transaction, as in the eBay system. For example, each time peer $i$ downloads a file from peer $j$, it may rate the transaction as positive ($tr(i, j) = 1$) or negative ($tr(i, j) = -1$). Peer $i$ may rate a download as negative, for example, if the file downloaded is inauthentic or tampered with, or if the download is interrupted. As in the eBay model, we may define a local trust value $s_{ij}$ as the sum of the ratings of the individual transactions that peer $i$ has downloaded from peer $j$: $s_{ij} = \sum tr_{ij}$.

Equivalently, each peer $i$ can store the number of satisfactory transactions it has had with peer $j$, $sat(i, j)$, and the number of unsatisfactory transactions it has had with peer $j$, $unsat(i, j)$. Then, $s_{ij}$ is defined:

$$s_{ij} = sat(i, j) - unsat(i, j) \qquad (1)$$

Previous work in P2P reputation systems [6, 1] has all been based on similar notions of local trust values. The challenge for reputation systems in a distributed environment is how to aggregate the local trust values $s_{ij}$ without a centralized storage and management facility. While each of the previous systems cited above addresses this issue, each suffers from one of two drawbacks. Either it aggregates the ratings of only a few peers and doesn't get a wide view of a peer's reputation, or it aggregates the ratings of all the peers and congests the network with system messages asking for each peer's local trust values at every query.

We present here a reputation system that aggregates the local trust values of all of the users in a natural manner, with minimal overhead in terms of message complexity.
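As a minimal sketch of the bookkeeping behind Equation 1, the following Python class counts satisfactory and unsatisfactory transactions per peer and derives $s_{ij}$ from them. The class name `LocalTrust` and its methods are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict


class LocalTrust:
    """Transaction ratings kept by one peer i (hypothetical helper)."""

    def __init__(self):
        self.sat = defaultdict(int)    # sat(i, j): satisfactory downloads from j
        self.unsat = defaultdict(int)  # unsat(i, j): unsatisfactory downloads from j

    def rate(self, peer_j, satisfactory):
        """Record one transaction: tr(i, j) = +1 if satisfactory, else -1."""
        if satisfactory:
            self.sat[peer_j] += 1
        else:
            self.unsat[peer_j] += 1

    def s(self, peer_j):
        """Local trust value s_ij = sat(i, j) - unsat(i, j) (Equation 1)."""
        return self.sat[peer_j] - self.unsat[peer_j]


lt = LocalTrust()
for outcome in (True, True, False):  # two good downloads, one bad one
    lt.rate("j", outcome)
print(lt.s("j"))
```

A peer that was never rated simply has a local trust value of zero in this sketch, which matches the sum-of-ratings definition.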
Our approach is based on the notion of transitive trust: A peer will have a high opinion of those peers who have provided it authentic files. Moreover, the peer is likely to trust the opinions of those peers, since peers who are honest about the files they provide are also likely to be honest in reporting their local trust values.

We show that the idea of transitive trust leads to a system where global trust values correspond to the left principal eigenvector of a matrix of normalized local trust values. We show how to perform this eigenvector computation in a distributed manner with just a few lines of code, where the message complexity is provably bounded and empirically low. Most importantly, we show that this system is highly effective in decreasing the number of unsatisfactory downloads, even when up to 70% of the peers in the network form a malicious collective in an attempt to subvert the system.

4. EIGENTRUST

In this section, we describe the EigenTrust algorithm. In EigenTrust, the global reputation of each peer $i$ is given by the local trust values assigned to peer $i$ by other peers, weighted by the global reputations of the assigning peers. In Section 4.1, we show how to normalize the local trust values in a manner that leads to an elegant probabilistic interpretation and an efficient algorithm for aggregating these values. In Section 4.2, we discuss how to aggregate the normalized trust values in a sensible manner. In Section 4.3, we discuss the probabilistic interpretation of the local and global trust values. In Section 4.4 through Section 4.6, we present an algorithm for computing the global trust values.

4.1 Normalizing Local Trust Values

In order to aggregate local trust values, it is necessary to normalize them in some manner. Otherwise, malicious peers can assign arbitrarily high local trust values to other malicious peers, and arbitrarily low local trust values to good peers, easily subverting the system. We define a normalized local trust value, $c_{ij}$, as follows:

$$c_{ij} = \frac{\max(s_{ij}, 0)}{\sum_j \max(s_{ij}, 0)} \qquad (2)$$

This ensures that all values will be between 0 and 1.
(Notice that if $\sum_j \max(s_{ij}, 0) = 0$, then $c_{ij}$ is undefined. We address this case in Section 4.5.) There are some drawbacks to normalizing in this manner. For one, the normalized trust values do not distinguish between a peer with whom peer $i$ did not interact and a peer with whom peer $i$ has had poor experience. Also, these $c_{ij}$ values are relative, and there is no absolute interpretation. That is, if $c_{ij} = c_{ik}$, we know that peer $j$ has the same reputation as peer $k$ in the eyes of peer $i$, but we don't know if both of them are very reputable, or if both of them are mediocre.

However, we are still able to achieve substantially good results despite the drawbacks mentioned above. We choose to normalize the local trust values in this manner because it allows us to perform the computation that we describe below without renormalizing the global trust values at each iteration (which is prohibitively costly in a large distributed environment) and leads to an elegant probabilistic model.

4.2 Aggregating Local Trust Values

We wish to aggregate the normalized local trust values. A natural way to do this in a distributed environment is for peer $i$ to ask its acquaintances about their opinions about other peers. It would make sense to weight their opinions by the trust peer $i$ places in them:

$$t_{ik} = \sum_j c_{ij} c_{jk} \qquad (3)$$

where $t_{ik}$ represents the trust that peer $i$ places in peer $k$ based on asking his friends. We can write this in matrix notation: If we define $C$ to be the matrix $[c_{ij}]$ and $\vec{t}_i$ to be the vector containing the values $t_{ik}$, then $\vec{t}_i = C^T \vec{c}_i$. (Note that $\sum_j t_{ij} = 1$ as desired.)

This is a useful way to have each peer gain a view of the network that is wider than his own experience. However, the trust values stored by peer $i$ still reflect only the experience of peer $i$ and his acquaintances. In order to get a wider view, peer $i$ may wish to ask his friends' friends ($\vec{t}_i = (C^T)^2 \vec{c}_i$). If he continues in this manner ($\vec{t}_i = (C^T)^n \vec{c}_i$), he will have a complete view of the network after a large number $n$ of iterations (under the assumptions that $C$ is irreducible and aperiodic, which we guarantee in practice and address in Section 4.5).
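The normalization of Equation 2 and the "ask your friends" step of Equation 3 can be sketched in a few lines of Python. This is an illustration under assumptions: the function names `normalize` and `ask_friends` and the sample matrix `S` are made up for the example, and the undefined case is simply returned as `None` rather than resolved.

```python
def normalize(s_row):
    """Equation 2: c_ij = max(s_ij, 0) / sum_j max(s_ij, 0)."""
    clipped = [max(s, 0) for s in s_row]
    total = sum(clipped)
    if total == 0:
        return None  # undefined when peer i trusts nobody
    return [x / total for x in clipped]


def ask_friends(C, i):
    """Equation 3: t_ik = sum_j c_ij * c_jk, i.e. t_i = C^T c_i."""
    n = len(C)
    return [sum(C[i][j] * C[j][k] for j in range(n)) for k in range(n)]


# Hypothetical local trust values s_ij for three peers.
S = [[0, 3, 1],
     [2, 0, -1],
     [1, 1, 0]]
C = [normalize(row) for row in S]
t0 = ask_friends(C, 0)
print(C[1])     # the negative rating of peer 2 is clipped to zero
print(sum(t0))  # the aggregated trust values of peer 0 still sum to 1
```

Note how peer 1's negative score for peer 2 becomes indistinguishable from "no interaction" after clipping, which is exactly the first drawback named above.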
Fortunately, if $n$ is large, the trust vector $\vec{t}_i$ will converge to the same vector for every peer $i$. Namely, it will converge to the left principal eigenvector of $C$. In other words, $\vec{t}$ is a global trust vector in this model. Its elements, $t_j$, quantify how much trust the system as a whole places in peer $j$.

4.3 Probabilistic Interpretation

It is useful to note that there exists a straightforward probabilistic interpretation of this method, similar to the Random Surfer model of [12]. If an agent were searching for reputable peers, it could crawl the network using the following rule: at each peer $i$, it will crawl to peer $j$ with probability $c_{ij}$. After crawling for a while in this manner, the agent is more likely to be at reputable peers than unreputable peers. The stationary distribution of the Markov chain
defined by the normalized local trust matrix $C$ is our global trust vector $\vec{t}$.

    $\vec{t}^{(0)} = \vec{e}$;
    repeat
        $\vec{t}^{(k+1)} = C^T \vec{t}^{(k)}$;
        $\delta = \|\vec{t}^{(k+1)} - \vec{t}^{(k)}\|$;
    until $\delta < \epsilon$;

Algorithm 1: Simple non-distributed EigenTrust algorithm

4.4 Basic EigenTrust

In this section, we describe the basic EigenTrust algorithm, ignoring for now the distributed nature of the peer-to-peer network. That is, we assume that some central server knows all the $c_{ij}$ values and performs the computation. In Section 4.6, we describe how the computation may be performed in a distributed environment.

We simply wish to compute $\vec{t} = (C^T)^n \vec{e}$ for large $n$, where we define $\vec{e}$ to be the $m$-vector representing a uniform probability distribution over all $m$ peers, $e_i = 1/m$. (In Section 4.2, we said we wish to compute $\vec{t} = (C^T)^n \vec{c}_i$, where $\vec{c}_i$ is the normalized local trust vector of some peer $i$. However, since they both converge to the principal left eigenvector of $C$, we may use $\vec{e}$ instead.) At the most basic level, the algorithm would proceed as in Algorithm 1.

4.5 Practical Issues

There are three practical issues that are not addressed by this simple algorithm: a priori notions of trust, inactive peers, and malicious collectives.

A priori notions of trust. Often, there are some peers in the network that are known to be trustworthy. For example, the first few peers to join a network are often known to be trustworthy, since the designers and early users of a P2P network are likely to have less motivation to destroy the network they built. It would be useful to incorporate such notions of trust in a natural and seamless manner. We do this by defining some distribution $\vec{p}$ over pre-trusted peers (see footnote 1). (For example, if some set of peers $P$ are known to be trusted, we may define $p_i = 1/|P|$ if $i \in P$, and $p_i = 0$ otherwise.) We use this distribution $\vec{p}$ in three ways. First of all, in the presence of malicious peers, $\vec{t} = (C^T)^n \vec{p}$ will generally converge faster than $\vec{t} = (C^T)^n \vec{e}$, so we use $\vec{p}$ as our start vector. We describe the other two ways to use this distribution $\vec{p}$ below.

Inactive Peers.
If peer $i$ doesn't download from anybody else, or if it assigns a zero score to all other peers, $c_{ij}$ from Equation 2 will be undefined. In this case, we set $c_{ij} = p_j$. So we redefine $c_{ij}$ as:

$$c_{ij} = \begin{cases} \dfrac{\max(s_{ij}, 0)}{\sum_j \max(s_{ij}, 0)} & \text{if } \sum_j \max(s_{ij}, 0) \neq 0; \\ p_j & \text{otherwise} \end{cases} \qquad (4)$$

That is, if peer $i$ doesn't know anybody, or doesn't trust anybody, he will choose to trust the pre-trusted peers.

Malicious Collectives. In peer-to-peer networks, there is potential for malicious collectives to form [8]. A malicious collective is a group of malicious peers who know each other, who give each other high local trust values and give all other peers low local trust values in an attempt to subvert the system and gain high global trust values. We address this issue by taking

$$\vec{t}^{(k+1)} = (1 - a) C^T \vec{t}^{(k)} + a \vec{p} \qquad (5)$$

where $a$ is some constant less than 1. This is equivalent to setting the opinion vector for all peers to be $\vec{c}_i = (1 - a) \vec{c}_i + a \vec{p}$, breaking collectives by having each peer place at least some trust in the peers in $P$ that are not part of a collective. Probabilistically, this is equivalent to saying that the agent that is crawling the network by the probabilistic model given in Section 4.3 is less likely to get stuck crawling a malicious collective, because at each step, he has a certain probability of crawling to a pre-trusted peer. Notice that this also makes the matrix $C$ irreducible and aperiodic, guaranteeing that the computation will converge. The modified algorithm is given in Algorithm 2.

    $\vec{t}^{(0)} = \vec{p}$;
    repeat
        $\vec{t}^{(k+1)} = C^T \vec{t}^{(k)}$;
        $\vec{t}^{(k+1)} = (1 - a) \vec{t}^{(k+1)} + a \vec{p}$;
        $\delta = \|\vec{t}^{(k+1)} - \vec{t}^{(k)}\|$;
    until $\delta < \epsilon$;

Algorithm 2: Basic EigenTrust algorithm

Footnote 1: The idea of pre-trusted peers is also used in [2], where the computation of the trust metric is performed relative to a seed of trusted accounts.

It should be emphasized that the pre-trusted peers are essential to this algorithm, as they guarantee convergence and break up malicious collectives. Therefore, the choice of pre-trusted peers is important. In particular, it is important that no pre-trusted peer be a member of a malicious collective. This would compromise the quality of the algorithm.
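To make Algorithm 2 and Equation 4 concrete, here is a small centralized sketch in plain Python. It is an illustration under assumptions: the paper does not fix a value for $a$ in this section, so `a=0.15`, the tolerance `eps`, the iteration cap, and the four-peer example are all chosen only for demonstration, and the residual is measured in the L1 norm.

```python
def normalized_row(s_row, p):
    """Equation 4: normalize a row of local trust; fall back to p if it is all zero."""
    clipped = [max(s, 0) for s in s_row]
    total = sum(clipped)
    if total == 0:
        return list(p)  # inactive peer: trust the pre-trusted peers
    return [x / total for x in clipped]


def eigentrust(C, p, a=0.15, eps=1e-10, max_iter=1000):
    """Algorithm 2: iterate t <- (1 - a) * C^T t + a * p until the change is below eps."""
    n = len(C)
    t = list(p)  # start vector t^(0) = p
    for _ in range(max_iter):
        nxt = [(1 - a) * sum(C[j][i] * t[j] for j in range(n)) + a * p[i]
               for i in range(n)]
        delta = sum(abs(x - y) for x, y in zip(nxt, t))  # L1 residual
        t = nxt
        if delta < eps:
            break
    return t


# Peers 0 and 1 are honest and rate each other; peers 2 and 3 form a
# collective that only rates itself. Peer 0 is the single pre-trusted peer.
S = [[0, 4, 0, 0],
     [2, 0, 0, 0],
     [0, 0, 0, 9],
     [0, 0, 9, 0]]
p = [1.0, 0.0, 0.0, 0.0]
C = [normalized_row(row, p) for row in S]
t = eigentrust(C, p)
print([round(x, 4) for x in t])
```

In this toy network the collective receives no global trust at all, because no honest or pre-trusted peer ever assigns it a positive local trust value; the high ratings its members give each other carry no weight.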
To keep pre-trusted peers out of malicious collectives, the system may choose only a very small number of pre-trusted peers (for example, the designers of the network). A thorough investigation of different methods of choosing pre-trusted peers is an interesting research area, but it is outside of the scope of this paper.

4.6 Distributed EigenTrust

Here, we present an algorithm where all peers in the network cooperate to compute and store the global trust vector, and the computation, storage, and message overhead for each peer are minimal.

In a distributed environment, the first challenge that arises is how to store $C$ and $\vec{t}$. In previous sections, we suggested that each peer could store its local trust vector $\vec{c}_i$. Here, we also suggest that each peer store its own global trust value $t_i$. (For presentation purposes, we ignore issues of security for the moment and allow peers to store their own trust values. We address issues of security in Section 5.)

In fact, each peer can compute its own global trust value:

$$t_i^{(k+1)} = (1 - a)\left(c_{1i} t_1^{(k)} + \dots + c_{ni} t_n^{(k)}\right) + a p_i \qquad (6)$$

Inspection will show that this is the component-wise version of $\vec{t}^{(k+1)} = (1 - a) C^T \vec{t}^{(k)} + a \vec{p}$. Notice that, since peer $i$ has had limited interaction with other peers, many of the components in Equation 6 will be zero. This lends itself to the simple distributed algorithm shown in Algorithm 3.

It is interesting to note two things here. First of all, only the pre-trusted peers need to know their $p_i$. This means that pre-trusted peers may remain anonymous; nobody else needs to know that they are pre-trusted (see footnote 2). Therefore, the pre-trusted peers maintain anonymity as pre-trusted peers.

Footnote 2: Recall that, for the moment, we assume that peers are honest and may report their own trust values, including whether or not they are a pre-trusted peer. The secure version is presented in Section 5.

(One may imagine that pre-trusted peers may be identified because they have high global trust values. However, simulations show that, while the
pre-trusted peers have above average $t_i$ values, they rarely have the highest values of $t_i$.)

    Definitions:
        $A_i$: set of peers which have downloaded files from peer $i$
        $B_i$: set of peers from which peer $i$ has downloaded files
    Algorithm:
    Each peer $i$ do {
        Query all peers $j \in A_i$ for $t_j^{(0)} = p_j$;
        repeat
            Compute $t_i^{(k+1)} = (1 - a)(c_{1i} t_1^{(k)} + c_{2i} t_2^{(k)} + \dots + c_{ni} t_n^{(k)}) + a p_i$;
            Send $c_{ij} t_i^{(k+1)}$ to all peers $j \in B_i$;
            Compute $\delta = |t_i^{(k+1)} - t_i^{(k)}|$;
            Wait for all peers $j \in A_i$ to return $c_{ji} t_j^{(k+1)}$;
        until $\delta < \epsilon$;
    }

Algorithm 3: Distributed EigenTrust Algorithm.

Secondly, in most P2P networks, each peer has limited interaction with other peers. There are two benefits to this. First, the computation $t_i^{(k+1)} = (1 - a)(c_{1i} t_1^{(k)} + c_{2i} t_2^{(k)} + \dots + c_{ni} t_n^{(k)}) + a p_i$ is not intensive, since most $c_{ji}$ are zero. Second, the number of messages passed is small, since $A_i$ and $B_i$ are small. In the case where a network is full of heavily active peers, we can enforce these benefits by limiting the number of local trust values $c_{ij}$ that each peer can report.

4.7 Algorithm Complexity

The complexity of the algorithm is bounded in two ways. First, the algorithm converges fast: For a network of 1000 peers after 100 query cycles (refer to Section 7.1 for a description of how we simulate our system), Figure 1 depicts the residual $\|\vec{t}^{(k+1)} - \vec{t}^{(k)}\|_1$. Clearly, the algorithm has converged after less than 10 iterations, i.e., the computed global trust values do not change significantly any more after a low number of iterations. In the distributed version of our algorithm, this corresponds to less than 10 exchanges of updated trust values among peers. The reason for the fast convergence of the EigenTrust algorithm is discussed in [1].

[Figure 1: EigenTrust convergence. Vertical axis: residual; horizontal axis: iterations.]

Second, we can specifically limit the number of local trust values that a peer reports. In the modified version of EigenTrust, each peer reports a subset of its total set of local trust values. Preliminary simulations have shown this scheme to perform comparably to the algorithm presented here, where peers report all of their local trust values.

5. SECURE EIGENTRUST

In the algorithm presented in the previous section, each peer $i$ computes and reports its own trust value $t_i$. Malicious peers can easily report false trust values, subverting the system. We combat this by implementing two basic ideas. First, the current trust value of a peer must not be computed by and reside at the peer itself, where it can easily become subject to manipulation. Thus, we have a different peer in the network compute the trust value of a peer. Second, it will be in the interest of malicious peers to return wrong results when they are supposed to compute any peer's trust value. Therefore, the trust value of one peer in the network will be computed by more than one other peer.

[Figure 2: Two-dimensional CAN hash space, with peer 1's unique ID hashed to points in the regions of peers 2, 3 and 6.]

In the secure version of the distributed trust algorithm, $M$ peers (dubbed score managers of a peer $i$) compute the trust value of a peer $i$. If a peer needs the trust value of peer $i$, it can query all $M$ score managers for it. A majority vote on the trust value then settles conflicts arising from a number of malicious peers being among the score managers and presenting faulty trust values as opposed to the correct one presented by the non-malicious score managers.

To assign score managers, we use a distributed hash table (DHT), such as CAN [13] or Chord [18]. DHTs use a hash function to deterministically map keys such as file names into points in a logical coordinate space. At any time, the coordinate space is partitioned dynamically among the peers in the system such that every peer covers a region in the coordinate space. Peers are responsible for storing (key, value) pairs the keys of which are hashed into a point that is located within their region.

In our approach, a peer's score manager is located by hashing a unique ID of the peer, such as its IP address and TCP port, into a point in the DHT hash space.
The peer which currently covers this point as part of its DHT region is appointed as the score manager of that peer. All peers in the system which know the unique ID of a peer can thus locate its score manager. We can modify our initial algorithm such that it can be executed by score managers.

As an example, consider the CAN in Figure 2. Peer 1's unique ID, $ID_1$, is mapped into points covered by peers 2, 3 and 6, respectively, by hash functions $h_1$, $h_2$ and $h_3$. Thus, these peers become peer 1's score managers.
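The lookup and voting steps can be illustrated with a short Python sketch. It makes several simplifying assumptions that are not in the paper: the hash functions are modelled as SHA-256 over an index-prefixed ID, the coordinate space is a one-dimensional 32-bit ring rather than a multi-dimensional CAN space, peers are placed on the ring by hashing their names, and the helper names (`h`, `build_ring`, `score_managers`, `majority_value`) are hypothetical.

```python
import hashlib
from bisect import bisect_left
from collections import Counter


def h(m, peer_id):
    """Stand-in for the m-th hash function: 32 bits of SHA-256('m:peer_id')."""
    digest = hashlib.sha256(f"{m}:{peer_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(peer_ids):
    """Place peers on a ring; each covers the region ending at its position."""
    return sorted((h("pos", pid), pid) for pid in peer_ids)


def score_managers(peer_id, ring, M=3):
    """Peers whose regions cover the M hash points of peer_id (may repeat)."""
    keys = [pos for pos, _ in ring]
    managers = []
    for m in range(M):
        idx = bisect_left(keys, h(m, peer_id)) % len(ring)  # wrap around the ring
        managers.append(ring[idx][1])
    return managers


def majority_value(reports):
    """Trust value reported by a strict majority of managers, else None."""
    value, count = Counter(reports).most_common(1)[0]
    return value if count > len(reports) / 2 else None


ring = build_ring([f"peer{k}" for k in range(8)])
managers = score_managers("10.0.0.1:6346", ring)
print(managers)
print(majority_value([0.2, 0.2, 0.9]))  # one faulty manager is outvoted
```

Because the mapping depends only on the unique ID and the current partition of the space, any peer that knows the ID arrives at the same managers, which is the property the text relies on.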
To cope with the inherent dynamics of a P2P system, we rely on the robustness of a well-designed DHT. For example, when a score manager leaves the system, it passes on its state (i.e., trust values or ongoing trust computations) to its neighbor peer in the DHT coordinate space. DHTs also introduce replication of data to prevent loss of data (in this case, trust values) in case a score manager fails.

5.1 Algorithm Description

Here we describe the secure algorithm to compute a global trust vector. We will use these definitions: Each peer has a number $M$ of score managers, whose DHT coordinates are determined by applying a set of one-way secure hash functions $h_0, h_1, \dots, h_{M-1}$ to the peer's unique identifier. $pos_i$ are the coordinates of peer $i$ in the hash space. Since each peer also acts as a score manager, it is assigned a set of daughters $D_i$; the set contains the indexes of peers whose trust value computation is covered by the peer. As a score manager, peer $i$ also maintains the opinion vector $\vec{c}_d$ of its daughter peer $d$ (where $d \in D_i$) at some point in the algorithm. Also, peer $i$ will learn $A_i^d$, which is the set of peers which downloaded files from its daughter peer $d$: It will receive trust assessments from these peers referring to its daughter peer $d$. Finally, peer $i$ will get to know the set $B_i^d$, which denotes the set of peers which its daughter peer $d$ downloaded files from: Upon kicking off a global trust value computation, its daughter peer $d$ is supposed to submit its trust assessments on other peers to its score manager, providing the score manager with $B_i^d$.

    foreach peer $i$ do
        Submit local trust values $\vec{c}_i$ to all score managers at positions $h_m(pos_i)$, $m = 1 \dots M-1$;
        Collect local trust values $\vec{c}_d$ and sets of acquaintances $B_i^d$ of daughter peers $d \in D_i$;
        Submit daughter $d$'s local trust values $c_{dj}$ to score managers $h_m(pos_d)$, $m = 1 \dots M-1$, $\forall j \in B_i^d$;
        Collect acquaintances $A_i^d$ of daughter peers;
        foreach daughter peer $d \in D_i$ do
            Query all peers $j \in A_i^d$ for $c_{jd} p_j$;
            repeat
                Compute $t_d^{(k+1)} = (1 - a)(c_{1d} t_1^{(k)} + c_{2d} t_2^{(k)} + \dots + c_{nd} t_n^{(k)}) + a p_d$;
                Send $c_{dj} t_d^{(k+1)}$ to all peers $j \in B_i^d$;
                Wait for all peers $j \in A_i^d$ to return $c_{jd} t_j^{(k+1)}$;
            until $|t_d^{(k+1)} - t_d^{(k)}| < \epsilon$;
        end
    end

Algorithm 4: Secure EigenTrust Algorithm

Upsides of the secure algorithm in terms of increased security and reliability include:

Anonymity. It is not possible for a peer at a specific coordinate to find out the peer ID for whom it computes the trust values; hence malicious peers cannot increase the reputation of other malicious peers.

Randomization. Peers that enter the system cannot select at which coordinates in the hash space they want to be located (this should be a property of a well-designed DHT); hence it is not possible for a peer to, for example, compute the hash value of its own ID and locate itself at precisely this position in the hash space to be able to compute its own trust value.

Redundancy. Several score managers compute the trust value for one peer. To assign several score managers to a peer, we use several multi-dimensional hash functions. Peers in the system still take over a particular region in the coordinate space, yet now there are several coordinate spaces, each of which is created by one multi-dimensional hash function. A peer's unique ID is thus mapped into a different point in every multi-dimensional hash space.

5.2 Discussion

A couple of points are important to note here. First, the issue of secure score management in P2P networks is an important problem, with implications for reputation management, incentive systems, and P2P micropayment schemes, among others. An extended discussion of secure score management in P2P networks, and various concrete score management schemes (including a variant of the one presented above), are given in [2]. The main contribution of this work is not in the secure score management scheme, but rather in the core EigenTrust algorithm. We discuss the secure score management scheme because some secure score management scheme is essential to the EigenTrust algorithm.
However, it is important to note that the core EigenTrust algorithm may be used with many different secure score management schemes.

Second, the secure protocols proposed here and in [2] describe how to use large collections of entities to mitigate singular or group-based manipulation of the protocol. These protocols are not secured in the traditional sense; rather, we can show that the probability is small that a peer is able to get away with misreporting a score. This is discussed further in [2].

6. USING GLOBAL TRUST VALUES

There are two clear ways to use these global trust values in a peer-to-peer system. The first is to isolate malicious peers from the network by biasing users to download from reputable peers. The second is to incent peers to share files by rewarding reputable peers.

Isolating Malicious Peers. When peer $i$ issues a query, the system may use the trust values $t_j$ to bias the user towards downloading from more reputable peers. One way to do this would be to have each peer download from the most highly trusted peer who responds to its query. However, such a policy leads to the most highly trusted peers being overloaded, as shown in Section 7. Furthermore, since reputation is built upon sharing authentic files, this policy does not enable new peers to build up reputation in the system.

A different strategy is to select the peers from whom to download probabilistically based on their trust values. In particular, we can make the probability that a peer will download a file from responding peer $j$ be directly proportional to the trust value $t_j$ of peer $j$. Such a policy limits the number of unsatisfactory downloads on the network, while balancing the load in the network and allowing newcomers to build reputation. The experiments in Section 7 validate this.

It should be noted here that peers may easily choose to bias their choice of download by a convex combination of the global trust values and their own local trust assessments of other peers, using the trust values given by the vector $\vec{t}_{personal} = d\,\vec{t} + (1 - d)\,\vec{c}_i$, where $d$ is a constant between 0 and 1.
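Both ideas in this section, trust-proportional source selection and the personalized convex combination, fit in a few lines of Python. This is a sketch under assumptions: the function names are illustrative, and the fallback to a uniform choice when every responder has zero trust is a choice made for the example, not something this section specifies.

```python
import random


def choose_source(responders, t, rng=random):
    """Pick a responder with probability proportional to its trust value t[j]."""
    weights = [t[j] for j in responders]
    if sum(weights) == 0:
        return rng.choice(responders)  # assumed fallback: nobody is trusted yet
    return rng.choices(responders, weights=weights, k=1)[0]


def personalized(t, c_i, d):
    """t_personal = d * t + (1 - d) * c_i for a constant d between 0 and 1."""
    return [d * tj + (1 - d) * cj for tj, cj in zip(t, c_i)]


rng = random.Random(0)              # seeded only to make the demo repeatable
t = {1: 0.0, 2: 0.5, 3: 0.5}        # hypothetical global trust values
picks = [choose_source([1, 2, 3], t, rng) for _ in range(1000)]
print(picks.count(1), picks.count(2), picks.count(3))
print(personalized([0.5, 0.5], [1.0, 0.0], d=0.5))
```

With purely proportional selection, a responder whose trust value is zero is never chosen while any trusted responder is available, so a deployment that wants newcomers to build reputation needs to reserve some probability for them.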
With such a personalized trust vector, a peer can avoid downloading from a peer that has given it bad service, even if that peer gives the rest of the network good service.

Incenting Freeriders to Share. Secondly, the system may reward peers with high trust values. For example, reputable peers may be rewarded with increased connectivity to other reputable peers, or greater bandwidth. Rewarding highly trusted peers has a
twofold effect. First, it gives users an incentive to share files, since a high global trust value may only be achieved by sharing authentic files. In the current Gnutella network, less than 7% of the peers are responsible for over 50% of the files, and as many as 25% of peers on the network share no files at all [16]. Incentives based on trust values should reduce the number of free riders on peer-to-peer networks. Some such incentives are discussed in [11].

Second, rewarding highly trusted peers gives non-malicious peers an incentive to delete inauthentic files that they may have accidentally downloaded from malicious peers, actively keeping the network tidy. This makes it more difficult for inauthentic files to replicate in the system.

7. EXPERIMENTS

In this section, we will assess the performance of our scheme as compared to a P2P network where no reputation system is implemented. We shall demonstrate the scheme's performance under a variety of threat models.

7.1 Simulation

Our findings are based on simulations of a P2P network model which we shall explain briefly in the following.

Network model. We consider a typical P2P network: Interconnected, file-sharing peers are able to issue queries for files, peers can respond to queries, and files can be transferred between two peers to conclude a search process. When a query is issued by a peer, it is propagated by broadcast with hop-count horizon throughout the network (in the usual Gnutella way); peers which receive the query forward it and check if they are able to respond to it. We interconnect peers by a power-law network, a type of network prevalent in real-world P2P networks [15].

Node model. Our network consists of good nodes (normal nodes, participating in the network to download and upload files) and malicious nodes (adversarial nodes, participating in the network to undermine its performance). In our experiments, we consider different threat models, where a threat model describes the behavior of a malicious peer in the network. Threat models will be described in more detail later on.
Note also that, based on the considerations in Section 4.5, some good nodes in the network are appointed as highly trusted nodes.

Content distribution model. Interactions between peers, i.e., which queries are issued and which queries are answered by given peers, are computed based on a probabilistic content distribution model. The detailed model will not be described here; it is presented in [17]. Briefly, peers are assumed to be interested in a subset of the total available content in the network, i.e., each peer initially picks a number of content categories and shares files only in these categories. Reference [7] has shown that files shared in a P2P network are often clustered by content categories. Also, we assume that within one content category files with different popularities exist, governed by a Zipf distribution.

When our simulator generates a query, it does not generate a search string. Instead, it generates the category and rank (or popularity) of the file that will satisfy the query. The category and rank are based on Zipf distributions. Each peer that receives the query checks if it supports the category and if it shares the file. Files are assigned probabilistically to peers at initialization based on file popularity and the content categories the peer is interested in (that is, peers are likely to share popular files, even if they have few files). The number of files shared by peers and other distributions used in the model are taken from measurements in real-world P2P networks [16].

Simulation execution. The simulation of a network proceeds in simulation cycles: Each simulation cycle is subdivided into a number of query cycles. In each query cycle, a peer $i$ in the network may be actively issuing a query, inactive, or even down and not responding to queries passing by. Upon issuing a query, a peer waits for incoming responses, selects a download source among those nodes that responded and starts downloading the file. The latter two steps are repeated until a peer has properly received a good copy of the file that it has been looking for (see footnote 3).
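The query generation step described above can be sketched as follows. The details are assumptions made for illustration: the full content model is in [17] and is not reproduced here, the Zipf exponent of 1 is arbitrary, category and rank are drawn independently, and `zipf_weights` and `generate_query` are hypothetical names.

```python
import random


def zipf_weights(n, exponent=1.0):
    """Normalized Zipf probabilities for ranks 1..n (weight proportional to 1/rank^exponent)."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def generate_query(n_categories, n_files, rng):
    """Return (category, rank) of the file that would satisfy a query."""
    category = rng.choices(range(n_categories), weights=zipf_weights(n_categories))[0]
    rank = rng.choices(range(1, n_files + 1), weights=zipf_weights(n_files))[0]
    return category, rank


rng = random.Random(42)  # seeded so the demo is repeatable
queries = [generate_query(n_categories=20, n_files=100, rng=rng) for _ in range(5)]
print(queries)
```

Generating a (category, rank) pair instead of a search string keeps the simulator simple: a receiving peer only needs to check whether it supports the category and holds the file of that rank.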
Upon the conclusion of each simulation cycle, the global trust value computation is kicked off. Statistics are collected at each node; in particular, we are interested in the number of authentic and inauthentic uploads and downloads of each node. Each experiment is run several times and the results of all runs are averaged. We run an experiment until we see convergence to a steady state (to be defined in the descriptions of the experiments); initial transient states are excluded from the data.

The base settings that apply for most of the experiments are summarized in Table 1. The settings represent a fairly small network to make our simulations tractable. However, we have experimented with larger networks in some instances and our conclusions continue to hold. That is, schemes that do well in a small setting do proportionately as well as the network is scaled up. Also note that our settings describe a pessimistic scenario with a powerful adversary: Malicious peers connect to the most highly connected peers when joining the network (see Section 7.3), they respond to the top 20% of queries received and thus have a large bandwidth, they are able to communicate among themselves in most of our threat models, and they make up a significant fraction of the network in most of our experiments. Yet, our experiments indicate that our scheme works well in such a hostile scenario, and thus will also work in less hostile environments.

As metrics, we are particularly interested in the number of inauthentic file downloads versus the number of authentic file downloads: If the computed global trust values accurately reflect each peer's actual behavior, the number of inauthentic file downloads should be minimized.

Before we consider the strengths of our scheme in suppressing inauthentic downloads in a P2P network, we examine if it leads to unwanted load imbalance in the network. In the following section, we also give a precise definition of how we use global trust values in downloading files.
7.2 Load Distribution in a Trust-based Network

In P2P networks, a natural load distribution is established by peers with more content and higher bandwidth being able to respond to more queries and thus having a higher likelihood of being chosen as download source for a file transfer. In our scheme, a high global trust value additionally contributes to a peer's likelihood of being chosen as download source. Possibly, this might lead a peer into a vicious circle of accumulating reputation by responding to many queries, thus being chosen even more frequently as download source in the future, thus accumulating even more reputation. In a non-trust based system, this situation does not occur: among the responding peers, one is usually picked at random as download source, somewhat balancing the load in the network. In the following, we are interested in integrating load-distributing randomization into our scheme. In the experiment in Figures 3 and 4, we study the load distribution performance of a network in which our scheme is activated.

Footnote 3: In Section 7.2 we consider two different ways of choosing download sources from those nodes that respond to a query and compare their performance in one of our experiments.

Table 1: Simulation settings

Network
  # of good peers: 60
  # of malicious peers: 42
  # of pre-trusted peers: 3
  # of initial neighbors of good peers: 2
  # of initial neighbors of malicious peers: 10
  # of initial neighbors of pre-trusted peers: 10
  Time-to-live for query messages: 7

Content Distribution
  # of distinct files at good peer i: file distribution in [16]
  set of content categories supported by good peer i: Zipf distribution over 20 content categories
  # of distinct files at good peer i in category j: uniform random distribution over peer i's total number of distinct files
  top % of queries for most popular categories and files malicious peers respond to: 20%
  top % of queries for most popular categories and files pre-trusted peers respond to: 5%
  % of time peer i is up and processing queries: uniform random distribution over [0%, 100%]
  % of time pre-trusted peer i is up and processing queries: 1
  % of up-time good peer i issues queries: uniform random distribution over [0%, 50%]
  % of up-time pre-trusted peer i issues queries: 1

Peer Behavior
  % of download requests in which good peer i returns inauthentic file: 5%
  % of download requests in which malicious peer i returns inauthentic file: % (varied in Section 7.3)
  download source selection algorithm: probabilistic algorithm (varied in Section 7.2)
  probability that peer with global trust value 0 is selected as download source: 10%

Simulation
  # of simulation cycles in one experiment: 30
  # of query cycles in one simulation cycle: 50
  # of experiments over which results are averaged: 5

We consider two different trust-based algorithms for selecting download sources among peers responding to a query, a deterministic algorithm and a probabilistic algorithm. If $\{t_0, t_1, \ldots, t_{R-1}\}$ are the trust values of peers responding to a query, the deterministic and probabilistic algorithms proceed as follows.

Deterministic algorithm. Choose the peer with the highest trust value $t_{max}$ among the peers responding to a query as download source.

Probabilistic algorithm. Choose peer $i$ as download source with probability $t_i / \sum_{j=0}^{R-1} t_j$. With a probability of 10%, select a peer $j$ that has a trust value $t_j = 0$. If a download returns an inauthentic file, delete the peer from the list of responding peers and repeat the algorithm.

To give new peers in the network, which start with a global trust value of 0, the chance of building up reputation, the probabilistic algorithm assigns a fixed 10% chance to download from the group of responding peers with trust value 0. Otherwise, new peers might never be chosen as download source, depriving them of the chance to become a trusted member of the network. Based on our experience, a probability of 10% strikes a balance between granting malicious peers (which might also have a trust value of 0) too high a chance of uploading inauthentic files and allowing new peers to prove themselves as download sources of authentic files.

We compare these download source selection algorithms to a network where no reputation system is deployed, i.e., among peers responding to a query a peer is picked as download source entirely at random. We examine the load distribution in these networks. We do not assume the existence of any malicious peers in this experiment (see footnote 4).

Footnote 4: Malicious peers would not impact the load distribution among good peers, since downloading peers keep trying until they have found an authentic copy of a file (assuming they have enough bandwidth to do so); hence malicious peers would add inauthentic uploads to the network, but not change anything about the number of authentic uploads from good peers.

[Figure 3: Load distribution in a network using deterministic download source selection versus a non-trust based network. Horizontal axis: peer (1 to 20); vertical axis: peer load share; series: random download source selection, deterministic trust-based download source selection. The load distribution is heavily skewed; peer 2 will eventually accumulate all reputation in the network.]

[Figure 4: Load distribution in a network using probabilistic download source selection versus a non-trust based network. Horizontal axis: peer (1 to 20); vertical axis: peer load share; series: random download source selection, probabilistic trust-based download source selection. The load distribution does not deviate too much from the load distribution in a network based on random, non-trust based download source selection and is thus close to the natural load distribution in a normal Gnutella network.]

Setup. We simulate a network consisting of 20 good peers, no pre-trusted peers and no malicious peers. Other than that, the standard settings in Table 1 apply. After running queries on the system for 20 query cycles, the load distribution is measured in Figures 3 and 4: for each peer 1 to 20 in the network, we depict its load share, i.e., the number of its uploads after a full run of the experiment divided by the total number of uploads in the entire network. The load distribution in a network using the deterministic download source selection algorithm is compared to the load distribution in a network using no reputation system at all in Figure 3, whereas a system employing the probabilistic download source selection algorithm is compared to the non-trust based network in Figure 4.

Discussion. Always choosing the responding peer with the highest global trust value as download source leads to a vast load imbalance in the network: popular peers do not stop accumulating trust value and gain further popularity. In Figure 3, peer 2 will eventually become the download source for virtually all queries that it is able to answer. Also note that in each experiment we ran, a different peer turned out to be the most trusted peer. Choosing download sources probabilistically yields only a slight deviation in terms of the individual load share of each peer from the case where trust values are not used to select download sources among responding peers, therefore leading to a much better natural load distribution in the network. In Figure 4, peer 2 becomes the download source for 8% of all queries in the system, and many other peers participate in sharing the load, mainly determined by the number and popularity of the files the peers share. Our measurements also show that the efficiency in suppressing inauthentic downloads does not vary between the two approaches. Thus, for the remaining experiments we use the probabilistic peer selection algorithm.

7.3 Threat Models

[Figure 5: Reduction of inauthentic downloads by basing download source selection on global trust values in a network where independent malicious peers are present. Horizontal axis: fraction of malicious peers (0% to 70%); vertical axis: fraction of inauthentic downloads; series: non-trust based, trust based. Upon activation of our reputation scheme, the number of inauthentic downloads in the network is significantly decreased to around 10% of all downloads in the system; malicious peers in the network are virtually banned from uploading inauthentic files.]

We now evaluate the performance of our system in suppressing inauthentic downloads. We will consider several strategies of malicious peers to cause inauthentic uploads even when our scheme is activated. In short, malicious peers operating under threat model A simply try to upload inauthentic files and assign high trust values to any other malicious peer they get to interact with while participating in the network. In threat model B, malicious peers know each other upfront and deterministically give high local trust values to each other. In threat model C, malicious peers try to get some high local trust values from good peers by providing authentic files in some cases when selected as download sources.
Under threat model D, one group of malicious peers in the network provides only authentic files and uses the reputation they gain to boost the trust values of another group of malicious peers that only provides inauthentic files.

Table 2: Threat models and associated experiments

Threat Model | File Upload Behavior | Local Trust Behavior | Figure
A | Always upload inauthentic files. | Assign trust to peers which upload inauthentic files. | 5
B | Always upload inauthentic files. | Assign trust to previously known malicious peer to form malicious collective. | 6
C | Upload inauthentic files in f% of all cases. | Assign trust to previously known malicious peer to form malicious collective. | 7, 8
D | Upload authentic files. | Assign equal trust share to all type B nodes in the network. | 9

We start our experiments considering the simplest threat model, where malicious peers are not initially aware of other malicious peers and simply upload inauthentic files.

Threat Model A. Individual Malicious Peers. Malicious peers always provide an inauthentic file when selected as download source. Malicious peers set their local trust values to be $s_{ij} = inauth(j) - auth(j)$, i.e., malicious peers value inauthentic file downloads instead of authentic file downloads.

Setup. We simulate a network consisting of 63 good nodes, 3 of which are highly trusted nodes, applying the standard settings from Table 1. In each experiment, we add a number of malicious peers to the network such that malicious nodes make up between 0% and 70% of all nodes in the network. For each fraction in steps of 10% we run experiments and depict the results in Figure 5. Upon joining the network, malicious peers connect to the 10 most highly connected peers already in the network in order to receive as many queries travelling through the network as possible. In practice, P2P protocols such as the Gnutella protocol enable nodes to crawl the network in search of highly connected nodes. We run the experiments on a system where download sources are selected probabilistically based on our global trust values and on a system where download sources are chosen randomly from the set of peers responding to a query. Bars depict the fraction of inauthentic files downloaded in one simulation cycle versus the total number of files downloaded in the same period of time. The results are averaged over the last 10 query cycles in each experiment.

Discussion. In the absence of a reputation system, malicious peers succeed in inflicting many inauthentic downloads on the network. Yet, if our scheme is activated, malicious peers receive high local trust values only from other malicious peers, and even that only occasionally, since malicious peers have to happen to get acquainted with each other through a file exchange. Because of their low trust values, malicious peers are rarely chosen as download source, which minimizes the number of inauthentic file downloads in the network. We observed a 10% fraction of inauthentic downloads, mostly due to the fact that good nodes make mistakes once in a while and upload inauthentic files (for example, by not deleting a downloaded inauthentic file from their shared folders). Even if no malicious peers are present in the network, downloads are evaluated as inauthentic in 5% of all cases; this accounts for mistakes users make when creating and sharing a file, e.g., by providing the wrong meta-data or creating and sharing an unreadable file.

Note that, because our current secure algorithm uses majority vote, a cooperating malicious collective that comprises over 40% of the network will be able to influence the assignment of global trust values in the network during their computation. This is not represented in Figure 5, which assumes that the trust values are computed correctly. However, it is unlikely that over 40% of the peers in a network are in a single malicious collective, unless the malicious collective is a result of pseudospoofing (a.k.a. the Sybil attack [8]), where a single adversary initiates thousands of peers onto the network. This type of attack can be avoided by imposing a cost of entry into the network. For example, a peer wishing to enter the network may be required to solve a puzzle that a computer cannot solve [3, 5]. Currently, YAHOO! requires a user to read some text from a JPEG file in order to open a YAHOO! Mail account.

Thus, knowing that our scheme is present in a system, malicious peers know that they have to gain a somewhat high local trust value in order to be considered as download sources. Therefore, we will examine strategies on how malicious peers can increase their global trust value despite uploading inauthentic files. Since malicious peers cannot expect to receive any high local trust values from non-malicious peers, they can try to increase their global trust value by teaming up as a malicious collective. In the experiment depicted in Figure 6, we vary the number of malicious peers in the network to assess their impact on the network's performance when they are aware of each other and form a malicious collective.

[Figure 6: Trust-based reduction of inauthentic downloads in a network where a fraction of peers forms a malicious collective and always uploads inauthentic files. Horizontal axis: fraction of malicious peers (0% to 70%); vertical axis: fraction of inauthentic downloads; series: non-trust based, trust based. Forming a malicious collective does not boost the trust values of malicious peers significantly; they are still virtually banned from uploading inauthentic files, similar to Figure 5.]

Threat Model B. Malicious Collectives. Malicious peers always provide an inauthentic file when selected as download source. Malicious peers form a malicious collective by assigning a single trust value of 1 to another malicious peer in the network. Precisely, if $M$ denotes the set of malicious peers in the network, each peer $i \in M$ sets

\[
s_{\mathrm{peer}_i,\mathrm{peer}_j} =
\begin{cases}
1 & \text{if } j = i + 1 \\
1 & \text{if } i = |M| \text{ and } j = 0 \\
0 & \text{else}
\end{cases}
\]

which resembles a malicious chain of mutual high local trust values.
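To illustrate why such a chain matters and how pre-trusted peers counteract it, the sketch below runs a small power iteration on a five-peer toy network. It assumes the damped update t <- (1 - a) * C^T t + a * p from the paper's earlier sections (not restated in this section), where C holds normalized local trust values and p is the distribution over pre-trusted peers. The matrix entries, the value a = 0.2, the two-peer chain, and the function name `global_trust` are made up for illustration.

```python
def global_trust(C, p, a, iterations=1000):
    """Iterate t <- (1 - a) * C^T t + a * p, starting from uniform t."""
    n = len(C)
    t = [1.0 / n] * n
    for _ in range(iterations):
        t = [(1 - a) * sum(C[i][j] * t[i] for i in range(n)) + a * p[j]
             for j in range(n)]
    return t

# Peers 0-2 are good (peer 0 is pre-trusted); peers 3 and 4 are malicious
# and form a two-peer chain: each assigns all of its trust to the other.
C = [
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.4, 0.1, 0.0],  # peer 1 gives a little trust to peer 3
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
p = [1.0, 0.0, 0.0, 0.0, 0.0]  # all pre-trust is placed on peer 0

no_pretrust = global_trust(C, p, a=0.0)    # plain power iteration
with_pretrust = global_trust(C, p, a=0.2)  # damped toward pre-trusted peers
malicious_share_without = no_pretrust[3] + no_pretrust[4]
malicious_share_with = with_pretrust[3] + with_pretrust[4]
```

In this toy network, trust that leaks into the chain never leaves it when a = 0, so the collective ends up holding essentially all of the trust. With a = 0.2, a fixed share of trust is returned to the pre-trusted peer in every step, and the collective's share stays at roughly one tenth.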
In terms of the probabilistic interpretation of our scheme, malicious peers form a collective out of which a random surfer or agent, once it has entered the collective, will not be able to escape, thus boosting the trust values of all peers in the collective.

Setup. We proceed exactly as in the previously described experiment, albeit with malicious nodes operating under threat model B. As shown in Figure 6, we run the experiments on a system where download sources are selected based on our global trust values and on a system where download sources are chosen randomly from the set of peers responding to a query.

Discussion. Our system performs well even if a majority of malicious peers is present in the network at a prominent place. The experiment clearly shows that forming a malicious collective does not decisively boost the global trust values of malicious peers: these peers are tagged with a low trust value and thus rarely chosen as download source. The system manages to break up malicious collectives through the presence of pre-trusted peers (see Section 4.4): if pre-trusted peers were not present in the network, forming a malicious collective would in fact heavily boost the trust values of malicious nodes. Under the presence of pre-trusted peers, the local trust values of malicious peers are significantly lower than those of good peers already after one simulation cycle. This minimizes the number of inauthentic downloads, and the numbers are virtually equal to the numbers in Figure 5, where peers do not form a malicious collective. For example, with 40% of all peers in a network being malicious, around 87% of all file downloads will end up in downloading an inauthentic version of the file in a normal, non-trusted network. Upon activation of our scheme, around 10% of all file downloads return an inauthentic file. Forming a malicious collective obviously does not increase the global trust values of malicious peers sufficiently for them to have an impact on the network.

This leaves malicious peers with one choice: they have to increase their local trust values by receiving positive local trust values from at least some good and trusted peers in the network. In the experiment in Figure 7, we consider a strategy for malicious peers that is built on the idea that malicious peers try to get some positive local trust values from good peers.

[Figure 7: Trust-based reduction of inauthentic downloads in a network where a fraction of peers forms a malicious collective and returns authentic files with certain probabilities. Horizontal axis: f%; vertical axis: fraction of inauthentic downloads; series: non-trust based, trust-based. When malicious peers partly provide authentic uploads, they receive more positive local trust values and will be selected as download sources more often, also increasing their chances to upload inauthentic files. Yet, uploading authentic files may be associated with a cost for malicious peers.]

[Figure 8: Inauthentic downloads versus authentic uploads provided by malicious peers with trust-based and non-trust based download source selection. Horizontal axis: authentic uploads by malicious peers; vertical axis: inauthentic downloads; series: trust-based, non-trust based. When malicious peers provide authentic files in more than 20% of the cases when selected as download source, the increase in authentic files uploaded by malicious peers exceeds the increase in inauthentic downloads in the network, hence possibly coming at a higher cost than benefit for malicious peers.]

Threat Model C. Malicious Collectives with Camouflage. Malicious peers provide an inauthentic file in f% of all cases when selected as download source. Malicious peers form a malicious collective as described above.

Setup. We simulate a network consisting of 53 good peers, 3 of which are pre-trusted peers, and 20 type C malicious peers, applying the standard settings in Table 1.
In each experiment, we apply a different setting of parameter f in threat model C such that the probability that malicious peers return an authentic file when selected as download source varies from 0% to 90%. We run experiments for each setting of parameter f in steps of 10%. Running the experiments on both a non-trust based system and on our system yields Figure 7. Bars depict the fraction of inauthentic files downloaded in one simulation cycle divided by the total number of files downloaded in the same period of time.

Discussion. Malicious peers that operate under threat model C attempt to gain positive local trust values from some peers in the network by sometimes providing authentic files. Thus, they will not be assigned zero trust values by all peers in the network, since some peers will receive an authentic file from them. This in turn provides them with higher global trust values and more uploads, a fraction of which will be inauthentic. Figure 7 shows that malicious peers have maximum impact on the network when providing 50% authentic files: 28% of all download requests return inauthentic files then. However, this strategy comes at a cost for malicious peers: they have to provide some share of authentic files, which is undesirable for them. First of all, they try to prevent the exchange of authentic files on the network, and in this strategy they have to participate in it; second, maintaining a repository of authentic files requires a certain maintenance overhead.

Figure 8 depicts the trade-off between authentic (horizontal axis) and inauthentic (vertical axis) downloads. Each scenario from Figure 7 is represented by one data point in Figure 8. For example, consider the fourth dark bar in Figure 7, corresponding to f = 30% and our reputation scheme in place. In this scenario, malicious peers provide 185 authentic downloads and 5 inauthentic ones in a particular run (see footnote 5). The value (185, 5) is plotted in Figure 8 as the fourth data point (left to right) on the lower curve, representing the case when our reputation scheme is used. The points on each curve represent increasing f values, from left to right.
In Figure 8, malicious nodes would like to operate in the upper left quadrant, providing a high number of inauthentic downloads and a low number of authentic downloads. However, the file sharing mechanism in place constrains malicious nodes to operate along one of the curves shown. Without our reputation scheme (top curve), malicious nodes can set f to a small value and move to the upper left quadrant. On the other hand, with our scheme, malicious peers have no good choices. In particular, increasing f beyond 20% does not make much sense to malicious peers, since the incremental authentic uploads they have to host outnumber the increase in inauthentic downloads. Moreover, for all settings of parameter f below 50%, malicious peers will lose all positive local trust values assigned by other peers in the long run, since on average they provide more inauthentic than authentic files.

Notice that the lines cross at the lower right-hand side. This does not show that the non-trust-based scheme works better for high values of f. Rather, it shows that, when the trust-based scheme is implemented, malicious peers must upload more authentic files in order to be able to upload the same number of inauthentic files. This is the desired behavior.

Footnote 5: More precisely, we run 30 query cycles, exclude the first 15 query cycles, and count the number of inauthentic and authentic downloads. We execute a second run, and add the numbers from both runs.
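The claim that malicious peers lose all positive local trust for f below 50% can be checked with a short expectation argument. The following is a sketch under two assumptions taken from the paper's earlier definitions rather than from this section: a good peer i scores peer j as the number of satisfactory minus the number of unsatisfactory downloads, and negative scores are clipped to zero when local trust values are normalized. Here f is read as the probability of an authentic upload, as in the Setup and Discussion of threat model C, and n (an illustrative symbol) is the number of downloads peer i has made from malicious peer j.

```latex
\[
\mathbb{E}[s_{ij}] = n f - n (1 - f) = n\,(2f - 1),
\qquad
c_{ij} = \frac{\max(s_{ij}, 0)}{\sum_{k} \max(s_{ik}, 0)} .
\]
```

For f < 1/2 the expected score is negative and drifts further below zero as n grows, so max(s_ij, 0), and with it c_ij, is eventually 0. For f > 1/2 the expected score grows linearly in n, which is why camouflage only pays off once most uploads are authentic.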
[Figure 9: Inauthentic downloads versus authentic uploads provided by malicious peers with trust-based and non-trust based download source selection in a network populated by type D and type B peers. Horizontal axis: authentic malicious uploads; vertical axis: inauthentic downloads; series: trusted, non-trusted. As with threat model C, malicious peers have to provide a number of authentic uploads in order to increase their global trust values. Yet, as compared to Figure 8, fewer authentic uploads by malicious peers are necessary to achieve equal numbers of inauthentic downloads in the network: 5 inauthentic downloads cost 4 authentic uploads with this strategy as compared to more than 1 authentic uploads with threat model C.]

The previous experiment has shown that malicious peers can increase their impact by partly concealing their malicious identity. Yet over time, their malicious identity will be uncovered and they lose their impact on the network. In the experiment in Figure 9, we consider a team effort strategy that malicious peers can use to work around this drawback. Two different types of malicious peers are present in the network: malicious nodes of type B and of type D.

Threat Model D. Malicious Spies. Malicious peers answer .5% of the most popular queries and provide a good file when selected as download source. Malicious peers of type D assign trust values of 1 to all malicious nodes of type B in the network. Precisely, if $M_B$ and $M_D$ denote the set of malicious type B peers and type D peers in the network, respectively, each peer $i \in M_D$ sets

\[
s_{\mathrm{peer}_i,\mathrm{peer}_j} =
\begin{cases}
\frac{1}{|M_B|} & \text{if } \mathrm{peer}_j \in M_B \\
0 & \text{else}
\end{cases}
\]

Setup. We simulate a network consisting of 63 good peers, 3 of which are pre-trusted peers, and 40 (39%) malicious peers, divided into two groups of malicious type B and type D peers. Otherwise, the standard settings from Table 1 apply. In each experiment, we consider a different number of type B and type D peers. Configurations considered are:

I. 40 type B, 0 type D peers
II. 39 type B, 1 type D peer
III. 36 type B, 4 type D peers
IV. 35 type B, 5 type D peers
V. 30 type B, 10 type D peers
VI. 25 type B, 15 type D peers
VII. 20 type B, 20 type D peers
VIII. 15 type B, 25 type D peers
IX. 10 type B, 30 type D peers
X. 5 type B, 35 type D peers

From left to right, we plot these data points in a graph that depicts the number of inauthentic file downloads versus the number of authentic file uploads provided by malicious peers, as in the previous experiment.

Discussion. Malicious peers establish an efficient division of labor in this scheme: type D peers act as normal peers in the network and try to increase their global trust value, which they will in turn assign to malicious nodes of type B providing inauthentic files. The malicious nature of type D peers will not be uncovered over time since these peers do not provide inauthentic files; hence they can continue to increase the global trust values of type B peers in the network. An interesting configuration for malicious peers would be configuration I: malicious peers provide a fairly low number of authentic downloads (around 1), yet achieve almost the same number of inauthentic downloads in the network as in other configurations with a higher share of authentic downloads by malicious peers. In any configuration though, our scheme performs better than a system without trust-based download source selection. Also, this strategy would probably be the strategy of choice for malicious peers in order to attack a trust-based network: for example, by hosting 5 authentic file uploads in this strategy malicious peers achieve around 5 inauthentic file downloads as opposed to about 25 inauthentic file downloads in the previous strategy, given the same effort on providing authentic uploads.

7.3.1 Other Threat Models

In this section, we discuss two slightly more nuanced threat models.

Threat Model E. Sybil Attack. An adversary initiates thousands of peers on the network. Each time one of the peers is selected for download, it sends an inauthentic file, after which it is disconnected and replaced with a new peer identity.

Discussion.
This threat scenario simply takes advantage of the fact that the fudge-factor that allows previously unknown users to obtain a reputation can be abused. Essentially, because there is no cost to create a new ID, the adversary can dominate that pool (with ghost identities). Because 10% of all traffic goes to the unknown pool, the malicious entity can behave arbitrarily without fear of losing reputation. To make matters worse, this kind of attack will prevent good peers from being able to garner a good reputation (they are so outnumbered that they will almost never be selected). However, this threat scenario can be averted by imposing a cost on creating a new ID, as discussed in Section 7.3 and [3]. For example, if a user must read the text off of a JPEG (or solve some other captcha [5]), it will be costly for a single adversary to create thousands of users.

Threat Model F. Virus-Disseminators (variant of threat model C). A malicious peer sends one virus-laden (inauthentic) copy of a particular file every 100th request. At all other times, the authentic file is sent.

Discussion. This is a threat scenario that is not addressed by EigenTrust. EigenTrust greatly reduces but does not completely eliminate corrupt files on a P2P network. This is useful on a file-sharing network where executables are not shared. If executables are introduced that have the potential to do great damage, then malicious peers can develop strategies to upload a few of them. But it should be noted that no reputation system to date claims to completely eliminate all corrupt files on a P2P network in an efficient manner. It should also be noted that the main problem on today's P2P networks is not the distribution of malicious executables (i.e., viruses), but rather the flooding of the network with inauthentic files. This is likely because today's P2P networks are mostly used to trade digital media, and relatively few users make use of these networks to share executables.

8. RELATED WORK

An overview of many key issues in reputation management is given in [14]. Trust metrics on graphs have been presented in [2] and [4].
Beth et al. [4] also use the notion of transitive trust, but their approach is quite different from ours. Reputation systems for P2P networks in particular are presented in [6] and [1], and are largely based on notions similar to our local trust values. The contribution of this work is that it shows how to aggregate the local trust assessments of all peers in the network in an efficient, distributed manner that is robust to malicious peers.

9. CONCLUSION

We have presented a method to minimize the impact of malicious peers on the performance of a P2P system. The system computes a global trust value for a peer by calculating the left principal eigenvector of a matrix of normalized local trust values, thus taking into consideration the entire system's history with each single peer. We also show how to carry out the computations in a scalable and distributed manner. In P2P simulations, using these trust values to bias downloads has been shown to reduce the number of inauthentic files on the network under a variety of threat scenarios. Furthermore, rewarding highly reputable peers with better quality of service gives non-malicious peers an incentive to share more files and to self-police their own file repository for inauthentic files.

Acknowledgements

We would like to thank the reviewers of this paper for detailed and insightful comments. This paper is supported in part by the Research Collaboration between NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation and CSLI, Stanford University (research project on Concept Bases for Lexical Acquisition and Intelligently Reasoning with Meaning).

10. REFERENCES

[1] K. Aberer and Z. Despotovic. Managing Trust in a Peer-2-Peer Information System. In Proceedings of the 10th International Conference on Information and Knowledge Management (ACM CIKM), New York, USA, 2001.
[2] Advogato's Trust Metric (White Paper). http://www.advogato.org/trust-metric.html.
[3] T. Aura, P. Nikander, and J. Leiwo. DoS-resistant authentication with client puzzles. In 8th International Workshop on Security Protocols, 2000.
[4] T. Beth, M. Borcherding, and B. Klein. Valuation of trust in open networks. In Proc. 3rd European Symposium on Research in Computer Security ESORICS 94, pages 3-18, 1994.
[5] Captcha Project. http://www.captcha.net.
[6] F. Cornelli, E. Damiani, S. D. C. D. Vimercati, S. Paraboschi, and S. Samarati. Choosing Reputable Servents in a P2P Network. In Proceedings of the 11th World Wide Web Conference, Hawaii, USA, May 2002.
[7] A. Crespo and H. Garcia-Molina. Semantic Overlay Networks. Submitted for publication, 2002.
[8] J. Douceur. The Sybil Attack. In First IPTPS, March 2002.
[9] eBay website. www.ebay.com.
[10] T. H. Haveliwala and S. D. Kamvar. The second eigenvalue of the Google matrix. Technical report, Stanford University, 2003.
[11] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina. Incentives for Combatting Freeriding on P2P Networks. Technical report, Stanford University, 2003.
[12] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical report, Stanford Digital Library Technologies Project, 1998.
[13] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proceedings of ACM SIGCOMM, August 2001.
[14] P. Resnick, R. Zeckhauser, E. Friedman, and K. Kuwabara. Reputation Systems. Communications of the ACM, 43(12):45-48, 2000.
[15] M. Ripeanu and I. Foster. Mapping the Gnutella Network - Macroscopic Properties of Large-scale P2P Networks and Implications for System Design. In Internet Computing Journal 6(1), 2002.
[16] S. Saroiu, P. K. Gummadi, and S. D. Gribble. A Measurement Study of Peer-to-Peer File Sharing Systems. In Proceedings of Multimedia Computing and Networking 2002 (MMCN '02), San Jose, CA, USA, January 2002.
[17] M. T. Schlosser and S. D. Kamvar. Simulating P2P Networks. Technical report, Stanford University, 2003.
[18] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 149-160. ACM Press, 2001.
[19] VBS.Gnutella Worm. http://securityresponse.symantec.com/avcenter/venc/data/vbs.gnutella.html.
[20] B. Yang, S. D. Kamvar, and H. Garcia-Molina. Secure Score Management for P2P Systems. Technical report, Stanford University, 2003.