Crowds: Anonymity for Web Transactions




Michael K. Reiter and Aviel D. Rubin
AT&T Labs Research

In this paper we introduce a system called Crowds for protecting users' anonymity on the world-wide-web. Crowds, named for the notion of "blending into a crowd," operates by grouping users into a large and geographically diverse group (crowd) that collectively issues requests on behalf of its members. Web servers are unable to learn the true source of a request because it is equally likely to have originated from any member of the crowd, and even collaborating crowd members cannot distinguish the originator of a request from a member who is merely forwarding the request on behalf of another. We describe the design, implementation, security, performance, and scalability of our system. Our security analysis introduces degrees of anonymity as an important tool for describing and proving anonymity properties.

Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: General (security and protection); C.2.2 [Computer-Communication Networks]: Network Protocols (applications); K.4.1 [Computers and Society]: Public Policy Issues (privacy); K.4.4 [Computers and Society]: Electronic Commerce (security)

General Terms: Security

Additional Key Words and Phrases: anonymous communication, world-wide-web

1. INTRODUCTION

"Every man should know that his conversations, his correspondence, and his personal life are private."
Lyndon B. Johnson, president of the United States, 1963-69

The lack of privacy for transactions on the world-wide-web, or the Internet in general, is a well-documented fact [Brier 1997; Miller 1997]. While encrypting communication to and from web servers (e.g., using SSL [Hickman and Elgamal 1995]) can hide the content of the transaction from an eavesdropper (e.g., an Internet service provider, or a local system administrator), the eavesdropper can still learn the IP addresses of the client and server computers, the length of the data being exchanged, and the time and frequency of exchanges. Encryption also does little to protect the privacy of the client from the server.
A web server can record the Internet addresses at which its clients reside, the servers that referred the clients to it, and the times and frequencies of accesses by its clients. With additional effort, this information can be combined with other data to invade the privacy of clients even further. For example, by automatically fingering the client computer shortly after an access and comparing the idle time for each user of the client computer with the server access time, the server administrator can often deduce the exact user with high likelihood. Some consequences of such privacy abuses are described in [Miller 1997].

In this paper we introduce a new approach for increasing the privacy of web transactions and a system, called Crowds, that implements it. Our approach is based on the idea of "blending into a crowd," i.e., hiding one's actions within the actions of many others. To execute web transactions in our model, a user first joins a "crowd" of other users. The user's request to a web server is first passed to a random member of the crowd. That member can either submit the request directly to the end server or forward it to another randomly chosen member, and in the latter case the next member chooses to submit or forward independently. When the request is eventually submitted, it is submitted by a random member, thus preventing the end server from identifying its true initiator. Even crowd members cannot identify the initiator of the request, since the initiator is indistinguishable from a member that simply forwards a request from another.

In studying the anonymity properties provided by this simple mechanism, we introduce the notion of degrees of anonymity. We argue that the degree of anonymity provided against an attacker can be viewed as a continuum, ranging from no anonymity to complete anonymity and having several interesting points in between. We informally define these intermediate points, and for our Crowds mechanism described above, we refine these definitions and prove anonymity properties for our system. We expect these definitions and proofs to yield insights into proving anonymity properties for other approaches, as well.

An intriguing property of Crowds is that a member of a crowd may submit requests initiated by other users. This has both negative and positive consequences. On the negative side, the user may be incorrectly suspected of originating that request. On the positive side, this property suggests that the mere availability of Crowds offers the user some degree of deniability for her observed browsing behavior, if it is possible that she was using Crowds.
Moreover, if Crowds becomes widely adopted, then the presumption that the computer from which a request is received is the computer that originated the request will become decreasingly valid (and thus decreasingly utilized).

The anonymity provided by Crowds is subject to some caveats. For example, Crowds obviously cannot protect a user's anonymity if the content of her web transactions reveals her identity to the web server (e.g., if the user submits her name and credit card number in a web form). More subtly, Crowds can be undermined by executable web content that, if downloaded into the user's browser, can open network connections directly from the browser to web servers, thus bypassing Crowds altogether and exposing the user to the end server. In today's browsers, such executable content takes the form of Java applets and ActiveX controls. Therefore, when using Crowds, it is recommended that Java and ActiveX be disabled in the browser, which can typically be done via a simple preferences menu in the browser.

The rest of this paper is structured as follows. In Section 2, we more precisely state the anonymity goals of our system and introduce the notion of degrees of anonymity. This gives us sufficient groundwork to compare our approach to other approaches to anonymity in Section 3. We describe the basic Crowds mechanism in Section 4 and analyze its security in Section 5. We describe the performance and scalability of our system in Sections 6 and 7, respectively. We discuss crowd membership in Section 8, the system's user interface in Section 9, and the obstacles that firewalls present to wide scale adoption of Crowds in Section 10. We conclude in Section 11.

Fig. 1. Degrees of anonymity, in order along the spectrum: absolute privacy, beyond suspicion, probable innocence, possible innocence, exposed, provably exposed. Degrees range from absolute privacy, where the attacker cannot perceive the presence of communication, to provably exposed, where the attacker can prove the sender, receiver, or their relationship to others.

2. GOALS

2.1 Anonymity

As discussed in [Pfitzmann and Waidner 1987], there are three types of anonymous communication properties that can be provided: sender anonymity, receiver anonymity, and unlinkability of sender and receiver. Sender anonymity means that the identity of the party who sent a message is hidden, while its receiver (and the message itself) might not be. Receiver anonymity similarly means that the identity of the receiver is hidden. Unlinkability of sender and receiver means that though the sender and receiver can each be identified as participating in some communication, they cannot be identified as communicating with each other.

A second aspect of anonymous communication is the attackers against which these properties are achieved. The attacker might be an eavesdropper that can observe some or all messages sent and received, collaborations consisting of some senders, receivers, and other parties, or variations of these [Pfitzmann and Waidner 1987].

To these two aspects of anonymous communication, we add a third: the degree of anonymity. As shown in Figure 1, the degree of anonymity can be viewed as an informal continuum. For simplicity, below we describe this continuum with respect to sender anonymity, but it can naturally be extended to receiver anonymity and unlinkability as well. On one end of the spectrum is absolute privacy: absolute sender privacy against an attacker means that the attacker can in no way distinguish the situations in which a potential sender actually sent communication and those in which it did not. That is, sending a message results in no observable effects for the attacker.
On the other end of the spectrum is provably exposed: the identity of a sender is provably exposed if the attacker can not only identify the sender of a message, but can also prove the identity of the sender to others.

For the purposes of this paper, the following three intermediate points of this spectrum are of interest, listed from strongest to weakest.

Beyond suspicion: A sender's anonymity is beyond suspicion if though the attacker can see evidence of a sent message, the sender appears no more likely to be the originator of that message than any other potential sender in the system.

Probable innocence: A sender is probably innocent if, from the attacker's point of view, the sender appears no more likely to be the originator than to not be the originator. This is weaker than beyond suspicion in that the attacker may have reason to expect that the sender is more likely to be responsible than any other potential sender, but it still appears at least as likely that the sender is not responsible.

Possible innocence: A sender is possibly innocent if, from the attacker's point of view, there is a nontrivial probability that the real sender is someone else.

It is possible to describe these intermediate points for receiver anonymity and sender/receiver unlinkability, as well. When necessary, we define these intermediate points more precisely.

Which degree of anonymity suffices for a user obviously depends on the user and her circumstances. Probable innocence sender anonymity should prevent many types of attackers from acting on their suspicions (therefore avoiding many abuses, e.g., cited in [Miller 1997]) due to the high probability that those suspicions are incorrect. However, if the user wishes to avoid any suspicion whatsoever (including even suspicions not sufficiently certain for the attacker to act upon), then she should insist on beyond suspicion sender anonymity.

The default degree of anonymity on the web for most information and attackers is exposed, as described in Section 1. All recent versions of Netscape Navigator and Internet Explorer are configured to automatically identify the client computer to web servers, by passing information including the IP address and the host platform in request headers.

2.2 What Crowds achieves

As described in Section 1, our system consists of a dynamic collection of users, called a crowd. These users initiate web requests to various web servers (and receive replies from them), and thus the users are the senders and the servers are the receivers. We consider the anonymity properties provided to an individual user against three distinct types of attackers:

A local eavesdropper is an attacker who can observe all (and only) communication to and from the user's computer.

Collaborating crowd members are other crowd members that can pool their information and even deviate from the prescribed protocol.

The end server is the web server to which the web transaction is directed.

The above descriptions are intended to capture the full capabilities of each attacker.
For example, collaborating members and the end server cannot eavesdrop on communication between other members. Similarly, a local eavesdropper cannot eavesdrop on messages other than those sent or received by the user's computer. A local eavesdropper is intended to model, e.g., an eavesdropper on the local area network of the user, such as an administrator monitoring web usage at a local firewall. However, if the same LAN also serves the end server, then the eavesdropper is effectively global, and we provide no protections against it.

The security offered against each of these types of attackers is summarized in Table 1 and justified in the remainder of the paper. As indicated by the omission of an "unlinkability of sender and receiver" column from this table, our system serves primarily to hide the sender or receiver from the attacker. In this table, $n$ denotes the number of members in the crowd (for the moment we treat this as static) and $p_f > 1/2$ denotes the probability of forwarding, i.e., when a crowd member receives a request, the probability that it forwards the request to another member, rather than submitting it to the end server. ($p_f$ is explained more fully in Section 4.)

Table 1. Anonymity properties provided by Crowds

Attacker: local eavesdropper
    Sender anonymity:   exposed
    Receiver anonymity: $P(\text{beyond suspicion}) \to 1$ as $n \to \infty$

Attacker: $c$ collaborating members, $n \ge \frac{p_f}{p_f - 1/2}(c + 1)$
    Sender anonymity:   probable innocence; $P(\text{absolute privacy}) \to 1$ as $n \to \infty$
    Receiver anonymity: $P(\text{absolute privacy}) \to 1$ as $n \to \infty$

Attacker: end server
    Sender anonymity:   beyond suspicion
    Receiver anonymity: N/A

The boldface claims in the table (i.e., probable innocence sender anonymity against collaborating members and beyond suspicion sender anonymity against the end server) are guarantees. The probability of beyond suspicion receiver anonymity against a local eavesdropper, on the other hand, only increases to one asymptotically as the crowd size increases to infinity. Put another way, if the local eavesdropper is sufficiently lucky, then it observes events that expose the receiver of a web request, and otherwise the receiver is beyond suspicion. However, the probability that it views these events decreases as a function of the size of the crowd. Similarly, a sender's assurance of absolute privacy against collaborating members also holds asymptotically with probability one as crowd size grows to infinity (for a constant number of collaborators). Thus, if the collaborators are unlucky, users achieve absolute privacy. We provide a more careful treatment of these notions in Section 5.

Of course, against an attacker that is comprised of two or more of the attackers described above, our system yields degrees of sender and receiver anonymity that are the minimum among those provided against the attackers present. For example, if a local eavesdropper and the end server to which the user's request is destined collaborate in an attack, then our techniques achieve neither sender anonymity nor receiver anonymity. Another caveat is that all of the claims of sender and receiver anonymity in this section, and their justifications in the remainder of this paper, require that neither message contents themselves nor a priori knowledge of sender behavior give clues to the sender's or receiver's identity.
2.3 What Crowds does not achieve

Crowds makes no effort to defend against denial-of-service attacks by rogue crowd members. A crowd member could, e.g., accept messages from other crowd members and refuse to pass them along. In our system, such denial-of-service can result from malicious behavior, but typically does not result if (the process representing) a crowd member fails benignly or leaves the crowd. As a result, these attacks are detectable. More difficult to detect are active attacks where crowd members substitute wrong information in response to web requests that they receive from other crowd members. Such attacks are inherent in any system that uses intermediaries to forward unprotected information, but fortunately they cannot be utilized to compromise anonymity directly.

3. RELATED WORK

There are two basic approaches previously proposed for achieving anonymous web transactions. The first approach is to interpose an additional party (a proxy) between the sender and receiver to hide the sender's identity from the receiver. Examples of such proxies include the Anonymizer (http://www.anonymizer.com/) and the Lucent Personalized Web Assistant [Gabber et al. 1997] (http://lpwa.com). Crowds provides protection against a wider range of attackers than proxies do. In particular, proxy-based systems are entirely vulnerable to a passive attacker in control of the proxy, since the attacker can monitor and record the senders and receivers of all communication. Our system presents no single point at which a passive attack can cripple all users' anonymity. In addition, a proxy is typically a single point of failure; i.e., if the proxy fails, then anonymous browsing cannot continue. In Crowds, no single failure discontinues all ongoing web transactions.

A second approach to achieving anonymous web transactions is to use a mix [Chaum 1981]. A mix is actually an enhanced proxy that, in addition to hiding the sender from the receiver, also takes measures to provide sender and receiver unlinkability against a global eavesdropper. It does so by collecting messages of equal length from senders, cryptographically altering them (typically by decrypting them with its private key), and forwarding the messages to their recipients in a different order. These techniques make it difficult for an eavesdropper to determine which output messages correspond to which input messages. A natural extension is to interpose a sequence of mixes between the sender and receiver [Chaum 1981]. A sequence of mixes can tolerate colluding mixes, as any single correctly-behaving mix server in the sequence prevents an eavesdropper from linking the sender and receiver. Mixes have been implemented to support many types of communication, for example electronic mail (e.g., [Gulcu and Tsudik 1996]), ISDN service [Pfitzmann et al. 1991], and general synchronous communication (including web browsing) [Syverson et al. 1997].

The properties offered by Crowds are different from those offered by mixes.
As described above, Crowds provides (probable innocence) sender anonymity against collaborating crowd members. In contrast, in the closest analog to this attack in typical mix systems (i.e., a group of collaborating mix servers), mixes do not provide sender anonymity but do ensure sender and receiver unlinkability [Pfitzmann and Waidner 1987]. Another difference is that mixes provide sender and receiver unlinkability against a global eavesdropper. Crowds does not provide anonymity against global eavesdroppers. However, our intention is for a crowd to span multiple administrative domains, where the existence of a global eavesdropper is unlikely. Another difference is that mixes typically rely on public key encryption, the algebraic properties of which have been exploited to break some implementations [Pfitzmann and Pfitzmann 1990].

Crowds' unique properties admit very efficient implementations in comparison to mixes. With mixes, the length of a message routed through a mix network grows proportionally to the number of mixes through which it is routed, and the mix network must pad messages to fixed lengths and generate decoy messages to foil traffic analysis. Moreover, in a typical mix implementation, routing a message through a sequence of $n$ mixes (here $n$ counts mixes, not crowd members) incurs a cost of $n$ public key encryptions and $n$ private key decryptions on the critical path of the message, which are comparatively expensive operations. Thus, since the unlinkability provided by mixes is tolerant of up to $n - 1$ mixes colluding, increasing $n$ improves anonymity but hurts performance. Privacy in Crowds can similarly be enhanced by increasing the average number of times a request is forwarded among members before being submitted to the end server, but this should impact performance less because there are no public/private key operations, no inflation of message transmission lengths (beyond a small, constant-size header), and no decoy messages needed.

Another performance advantage of Crowds is that since each user actively participates in the function of the crowd, the throughput of a crowd grows as a function of the number of users. In fact, we show in Section 7 that a crowd can scale almost limitlessly (in theory), in the sense that the load on each user's computer is expected to remain roughly constant as new users join the crowd. With a fixed network of mixes, the load of each server increases proportionally to the number of users, with a resulting linear decrease in throughput.

4. CROWD OVERVIEW

As discussed previously, a crowd can be thought of as a collection of users. A user is represented in a crowd by a process on her computer called a jondo (pronounced "John Doe" and meant to convey the image of a faceless participant). The user (or a local administrator) starts the jondo on the user's computer. When the jondo is started, it contacts a server called the blender to request admittance to the crowd. If admitted, the blender reports to this jondo the current membership of the crowd and information that enables this jondo to participate in the crowd. We defer further discussion of the blender and crowd membership maintenance to Section 8.

The user selects this jondo as her web proxy by specifying its host name and port number in her web browser as the proxy for all services. Thus, any request coming from the browser is sent directly to the jondo (see footnote 1). Upon receiving the first user request from the browser, the jondo initiates the establishment of a random path of jondos that carries its user's transactions to and from their intended web servers. More precisely, the jondo picks a jondo from the crowd (possibly itself) at random, and forwards the request to it.
When this jondo receives the request, it flips a biased coin to determine whether or not to forward the request to another jondo; the coin indicates to forward with probability $p_f$. If the result is to forward, then the jondo selects a random jondo and forwards the request to it, and otherwise the jondo submits the request to the end server for which the request was destined. So, each request travels from the user's browser, through some number of jondos, and finally to the end server. A possible set of such paths is shown in Figure 2. In this figure, the paths are 1 → 5 → server; 2 → 6 → 2 → server; 3 → 1 → 6 → server; 4 → 4 → server; 5 → 4 → 6 → server; and 6 → 3 → server. Subsequent requests initiated at the same jondo follow the same path (except perhaps going to a different end server), and server replies traverse the same path as the requests, only in reverse.

A pseudocode description of a jondo is presented in Figure 3. This figure describes a thread of execution that is executed per received request. This description uses client-server terminology, where one jondo is a client of its successor on the path.

Footnote 1: The services that must be proxied include Gopher, FTP, HTTP and SSL. Otherwise, e.g., FTP requests triggered by downloading a web page would not go through the crowd, and would thus reveal the user's IP address to the end server. Java and ActiveX should be disabled in the browser as well, because a Java applet or ActiveX control embedded in a retrieved web page could connect back to its server directly and reveal the user's IP address to that server.
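The path-formation rule described above can be illustrated with a short simulation. This is only a sketch under simplifying assumptions: jondos are plain integers, no network traffic or encryption is modeled, and the names build_path and submitter_counts are hypothetical helpers that do not appear in the Crowds implementation.

```python
import random


def build_path(initiator, jondos, p_f, rng):
    """Return the jondos a request visits, beginning with the initiator.

    Hypothetical helper: models only the random choices, not the protocol.
    """
    path = [initiator]
    # The initiator always hands the request to a jondo chosen uniformly
    # at random from the crowd (possibly itself).
    path.append(rng.choice(jondos))
    # Each jondo that receives the request forwards it again with
    # probability p_f; otherwise it submits to the end server.
    while rng.random() < p_f:
        path.append(rng.choice(jondos))
    return path


def submitter_counts(n, p_f, trials, seed=0):
    """Count how often each of n jondos is the one that submits to the server,
    for paths that all start at jondo 0."""
    rng = random.Random(seed)
    jondos = list(range(n))
    counts = [0] * n
    for _ in range(trials):
        # The last jondo on the path is the one the end server sees.
        counts[build_path(0, jondos, p_f, rng)[-1]] += 1
    return counts


# Every path starts at jondo 0, yet the submitting jondo is spread
# roughly evenly over the whole crowd.
print(submitter_counts(n=6, p_f=0.75, trials=60000))
```

Because every hop is chosen uniformly at random, the jondo that finally submits the request is uniformly distributed over the crowd regardless of who initiated the path, which is what the end server observes.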

Fig. 2. Paths in a crowd (the initiator and web server of each path are labeled the same)

For each path, indicated by a path id, the value next[path_id] is the next jondo on the path. To assign next jondos for paths, each jondo maintains a set Jondos of jondos that it believes to be active (itself included). When it chooses to direct the path to another jondo, it selects the next jondo uniformly at random from this set (lines 6, 16, and 26); i.e., $\leftarrow_R S$ denotes selection from the set $S$ uniformly at random. Subsequent sections shed greater light on the operation of a jondo and the pseudocode description of Figure 3.

For technical reasons, it is convenient for the jondo at each position in a path to hold a different path identifier for the path. That is, if a jondo receives a request marked with path_id from its predecessor in a path, then it replaces path_id with a different path identifier stored in translate[path_id] before forwarding the request to its successor (if a jondo). This enables a jondo that occupies multiple positions on a path to act independently in each position: if the path id remained the same along the path, then the jondo would behave identically each time it received a message on the path, resulting in an infinite loop. Path identifiers should be unique; in our present implementation, new_path_id() (lines 5 and 15) returns a random 128-bit value.

Omitted from the description in Figure 3 is the fact that all communication between any two jondos is encrypted using a key known only to the two of them. Encryption keys are established as jondos join the crowd, as is discussed in Section 8.

    (1)  client, request ← receive_request()
    (2)  if (client = browser)
    (3)      sanitize(request)                          /* strip cookies and identifying headers */
    (4)      if (my_path_id = ⊥)                        /* if my_path_id is not initialized... */
    (5)          my_path_id ← new_path_id()
    (6)          next[my_path_id] ←_R Jondos
    (7)      forward_request(my_path_id)
    (8)  else                                           /* client is a jondo */
    (9)      path_id ← remove_path_id(request)          /* remove "incoming" path_id */
    (10)     if (translate[path_id] = ⊥)                /* "incoming" path_id is new */
    (11)         coin ← coin_flip(p_f)                  /* tails with probability p_f */
    (12)         if (coin = heads)
    (13)             translate[path_id] ← "submit"
    (14)         else
    (15)             translate[path_id] ← new_path_id() /* set "outgoing" path_id */
    (16)             next[translate[path_id]] ←_R Jondos /* select next jondo at random */
    (17)     if (translate[path_id] = "submit")
    (18)         submit_request()
    (19)     else
    (20)         forward_request(translate[path_id])

    (21) subroutine forward_request(out_path_id)
    (22)     send out_path_id ‖ request to next[out_path_id]
    (23)     reply ← await_reply(∞)                     /* wait for reply or recognizable jondo failure */
    (24)     if (reply = "jondo failed")                /* jondo failed */
    (25)         Jondos ← Jondos \ {next[out_path_id]}  /* remove the jondo */
    (26)         next[out_path_id] ←_R Jondos           /* assign a new random jondo for this path */
    (27)         forward_request(out_path_id)           /* try again */
    (28)     else                                       /* received reply from jondo */
    (29)         send reply to client

    (30) subroutine submit_request()
    (31)     send request to destination(request)       /* send to destination web server */
    (32)     reply ← await_reply(timeout)               /* wait for reply, timeout, or server failure */
    (33)     send reply to client                       /* send reply or error message to client */

Fig. 3. Pseudocode description of a jondo

5. SECURITY ANALYSIS

In this section we consider the question of what information an attacker can learn about the senders and receivers of web transactions, given the mechanisms we described in Section 4. The types of attackers we consider were described in Section 2. Our analysis begins with the two attackers for which analysis is more straightforward, namely a local eavesdropper and the end server. This is followed by an analysis of crowd security versus collaborating jondos.

5.1 Local eavesdropper

Recall that a local eavesdropper is an attacker that can observe all (and only) communication emanating from an individual user's computer.
When this user initiates a request, the fact that she did so is exposed to the local eavesdropper, since we make no effort to hide correlations between inputs to and outputs from the initiating computer. That is, the local eavesdropper observes that a request output by the user's computer did not result from a corresponding input. Thus, we offer no sender anonymity against a local eavesdropper.

The mechanisms we described do, however, typically prevent a local eavesdropper from learning the intended receiver of a request, because every message forwarded on a path, except for the final request to the end server, is encrypted. Thus, while the eavesdropper is able to view any message emanating from the user's computer, it only views a message submitted to the end server (or equivalently a plaintext message containing the end server's address) if the user's jondo ultimately submits the user's request itself. Since the probability that the user's jondo ultimately submits the request is $1/n$, where $n$ is the size of the crowd when the path was created, the probability that the eavesdropper learns the identity of the receiver decreases as a function of crowd size. Moreover, when the user's jondo does not ultimately submit the request, the local eavesdropper sees only the encrypted address of the end server, which we suggest yields receiver anonymity that is (informally) beyond suspicion. Thus, $P(\text{beyond suspicion}) \to 1$ as $n \to \infty$ for receiver anonymity.

5.2 End servers

We now consider the security of our system against an attack by the end server only. Because the web server is the receiver, obviously receiver anonymity is not possible against this attacker. However, the anonymity for the path initiator is quite strong. In particular, since the path initiator first forwards to another jondo when creating its path (see Section 4), the end server is equally likely to receive the initiator's requests from any crowd member. That is, from the end server's perspective, all crowd members are equally likely to have initiated the request, and so the actual initiator's sender anonymity is beyond suspicion.

It is interesting to note that this result, as opposed to that for collaborating jondos below, does not depend on $p_f$ (the probability of forwarding; see Section 4). Indeed, increasing expected path length offers no additional assurance of anonymity against an end server.

5.3 Collaborating jondos

Consider a set of collaborating (corrupted) jondos in the crowd. A single malicious jondo is simply a special case of this attacker, and our analysis applies to this case as well. Because each jondo can observe plaintext traffic on a path routed through it, any such traffic, including the address of the end server, is exposed to this attacker. The question we consider here is whether the attacker can determine who initiated the path.

To be precise, consider any path that is initiated by a non-collaborating member and on which a collaborator occupies a position. The goal of the collaborators is to determine the member that initiated the path.
Assuming that the contents of the communication do not suggest an initiator, the collaborators have no reason to suspect any member other than the one from which they immediately received it, i.e., the member immediately preceding the first collaborator on the path. All other noncollaborating members are each equally likely to be the initiator, but are also obviously less likely to be the initiator than the collaborators' immediate predecessor. We now analyze how confident the collaborators can be that their immediate predecessor is in fact the path initiator.

Let $H_k$, $k \ge 1$, denote the event that the first collaborator on the path occupies the $k$th position on the path, where the initiator itself occupies the 0th position (and possibly others), and define $H_{k+} = H_k \vee H_{k+1} \vee H_{k+2} \vee \cdots$. Let $I$ denote the event that the first collaborator on the path is immediately preceded on the path by the path initiator. Note that $H_1 \Rightarrow I$, but the converse is not true, because the initiating jondo might appear on the path multiple times. Given this notation, the collaborators now hope to determine $P(I \mid H_{1+})$, i.e., given that a collaborator is on the path, what is the probability that the path initiator is the first collaborator's immediate predecessor? Refining our intuition from Section 2, we say that the path initiator has probable innocence if this probability is at most 1/2.

Definition 5.1. The path initiator has probable innocence (with respect to sender anonymity) if $P(I \mid H_{1+}) \le 1/2$.

In order to yield probable innocence for the path initiator, certain conditions must be met in our system. In particular, let $p_f > 1/2$ be the probability of forwarding in the system (see Section 4), let $c$ denote the number of collaborators in the crowd, and let $n$ denote the total number of crowd members when the path is formed. The theorem below gives a sufficient condition on $p_f$, $c$, and $n$ to ensure probable innocence for the path initiator.

Theorem 5.2. If $n \ge \frac{p_f}{p_f - 1/2}(c + 1)$, then the path initiator has probable innocence against $c$ collaborators.

Proof. We want to show that $P(I \mid H_{1+}) \le 1/2$ if $n \ge \frac{p_f}{p_f - 1/2}(c + 1)$. First note that

\[ P(H_i) = \left( \frac{p_f (n - c)}{n} \right)^{i-1} \left( \frac{c}{n} \right). \]

This is due to the fact that in order for the first collaborator to occupy the $i$th position on the path, the path must first wander to $i - 1$ noncollaborators (each time with probability $\frac{n - c}{n}$), each of which chooses to forward the path with probability $p_f$, and then to a collaborator (with probability $\frac{c}{n}$). The next two facts follow immediately from this.

\[ P(H_{2+}) = \frac{c}{n} \sum_{k=1}^{\infty} \left( \frac{p_f (n - c)}{n} \right)^{k} = \frac{c}{n} \cdot \frac{\frac{p_f (n - c)}{n}}{1 - \frac{p_f (n - c)}{n}} = \frac{p_f \, c \, (n - c)}{n^2 - p_f \, n (n - c)} \]

\[ P(H_{1+}) = \frac{c}{n} \sum_{k=0}^{\infty} \left( \frac{p_f (n - c)}{n} \right)^{k} = \frac{c}{n} \cdot \frac{1}{1 - \frac{p_f (n - c)}{n}} = \frac{c}{n - p_f (n - c)} \]

Other probabilities we need are $P(H_1) = \frac{c}{n}$, $P(I \mid H_1) = 1$, and $P(I \mid H_{2+}) = \frac{1}{n - c}$. The last of these follows from the observation that if the first collaborator on the path occupies only the second or higher position, then it is immediately preceded on the path by any noncollaborating member with equal likelihood. Now, $P(I)$ can be captured as

\[ P(I) = P(H_1) P(I \mid H_1) + P(H_{2+}) P(I \mid H_{2+}) = \frac{c \, (n - p_f n + c p_f + p_f)}{n^2 - p_f \, n (n - c)}. \]

Then, since $I \Rightarrow H_{1+}$, we get

\[ P(I \mid H_{1+}) = \frac{P(I \wedge H_{1+})}{P(H_{1+})} = \frac{P(I)}{P(H_{1+})} = \frac{n - p_f (n - c - 1)}{n}. \]

So, if $n \ge \frac{p_f}{p_f - 1/2}(c + 1)$, then $P(I \mid H_{1+}) \le \frac{1}{2}$.

As a result of Theorem 5.2, if $p_f = \frac{3}{4}$, then probable innocence is guaranteed as long as $n \ge 3(c + 1)$. More generally, Theorem 5.2 implies a tradeoff between the length of paths (i.e., performance) and ability to tolerate collaborators.
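The bound in Theorem 5.2 can be checked with a few lines of exact arithmetic. The sketch below evaluates the closed form $P(I \mid H_{1+}) = (n - p_f(n - c - 1))/n$ from the proof and the smallest crowd size that satisfies the theorem's condition. The function names are hypothetical, and passing $p_f$ as a Fraction is an assumption made only to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def prob_predecessor_is_initiator(n, c, p_f):
    """P(I | H_1+) = (n - p_f * (n - c - 1)) / n, the closed form from the proof."""
    return (n - p_f * (n - c - 1)) / n


def min_crowd_size(c, p_f):
    """Smallest integer n satisfying n >= p_f / (p_f - 1/2) * (c + 1)."""
    if p_f <= Fraction(1, 2):
        raise ValueError("Theorem 5.2 assumes p_f > 1/2")
    return math.ceil(p_f / (p_f - Fraction(1, 2)) * (c + 1))


p_f = Fraction(3, 4)
c = 2
n = min_crowd_size(c, p_f)
# With p_f = 3/4 the condition reduces to n >= 3(c + 1), so n is 9 here.
print(n, prob_predecessor_is_initiator(n, c, p_f))
# One member fewer and the probability rises above 1/2.
print(n - 1, prob_predecessor_is_initiator(n - 1, c, p_f))
```

With $p_f = 3/4$ and $c = 2$ the threshold is $n = 9$, where the probability is exactly 1/2; at $n = 8$ it is 17/32, so the initiator no longer has probable innocence.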
That is, by making the probability of forwarding high, the fraction of collaborators that can be tolerated approaches half of the crowd. On the other hand, making the probability of forwarding close to one-half decreases the fraction of collaborators that can be tolerated.

The value of $P(H_{1+})$ derived in the proof of Theorem 5.2 shows that $P(H_{1+}) \to 0$ as $n \to \infty$ if $c$ and $p_f$ are held constant. Assuming that collaborators cannot observe a path on which they occupy no positions, it follows that $P(\text{absolute privacy}) \to 1$ for sender anonymity and receiver anonymity. The rate of this growth, however, can be slow if $p_f$ is large.

5.3.1 Timing attacks. So far the analysis of security against collaborating jondos has not taken timing attacks into account. The possibility of timing attacks in our system results from the structure of HTML, the language in which web pages are written. An HTML page can include a URL (e.g., the address of an image) that, when the page is retrieved, causes the user's browser to automatically issue another request.² It is the immediate nature of these requests that poses the greatest opportunity for timing attacks by collaborating jondos. Specifically, the first collaborating jondo on a path, upon returning a web page on that path containing a URL that will be automatically retrieved, can time the duration until it receives the request for that URL. If the duration is sufficiently short, then this could reveal that the collaborator's immediate predecessor is the initiator of the request.

In our present implementation, we eliminate such timing attacks as follows. When a jondo receives an HTML reply to a request that it either received directly from a user's browser or submitted directly to an end server, i.e., the jondo is either the user's (i.e., the path initiator) or the last jondo on the path, it parses the HTML page to identify all URLs that the user's browser will automatically request as a result of receiving this reply. The last jondo on the path requests these URLs and sends them back along the same path on which the original request was received.
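The parsing step can be sketched as follows. This is a minimal Python illustration using the standard library's `html.parser`, not the Crowds implementation (which is written in Perl); the class and function names are hypothetical. The tag and attribute sets follow the examples given in footnote 2, and the `content` attribute of `<meta>` tags is deliberately left out of this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tags whose listed attribute makes the browser fetch a URL automatically
# (subset taken from the examples in footnote 2).
SRC_TAGS = {"embed", "frame", "iframe", "img", "script"}
BACKGROUND_TAGS = {"body", "table", "tr", "td"}


class EmbeddedURLFinder(HTMLParser):
    """Collects URLs that a browser would request without user action."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # HTMLParser lowercases tag and attribute names
        url = None
        if tag in SRC_TAGS:
            url = attr.get("src")
        elif tag == "input" and (attr.get("type") or "").lower() == "image":
            url = attr.get("src")
        elif tag in BACKGROUND_TAGS:
            url = attr.get("background")
        if url:
            # Resolve relative references against the page's own URL.
            self.urls.append(urljoin(self.base_url, url))


def embedded_urls(html, base_url):
    """Return the automatically requested URLs of a page, in document order."""
    finder = EmbeddedURLFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.urls
```

For a page at http://example.com/dir/page.html containing `<img src="/a.png">`, the sketch yields http://example.com/a.png. Ordinary links such as `<a href=...>` are ignored, because the browser requests them only after explicit user action.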
The user's jondo, upon receiving requests for these URLs from the user's browser, does not forward these requests on the path, but rather simply waits for the URLs' contents to arrive on the path and then feeds them to the browser. In this way, other jondos on the path never see the requests that are generated by the browser, and thus cannot glean timing information from them.

Note that misbehavior by the last jondo on the path (or any intermediate jondo) can result only in a denial of service, and not in a successful timing attack. In particular, if an attacking jondo inserts an embedded URL into the returning page, the user's jondo will identify it and expect the URL contents to arrive, but will not forward the request for the URL that the user's browser initiates.

This mechanism prevents jondos other than the user's from observing requests automatically generated due to the retrieval of a page. Therefore, all requests observable by attacking jondos are generated by explicit user action. It is conceivable that a user's response to a page (e.g., clicking on a contained URL), if sufficiently rapid, could reveal to the jondo in the first position on the path that its predecessor is the initiator of the path, in a way similar to how an automatic request might. However, the user's response would need to be extremely fast, typically within a fraction of a second of viewing the page, to risk revealing this information. We expect that such response times are uncharacteristic of human browsing, and can be made even less so by educating users of this risk. If, however, this presumption turns out to be incorrect, the user's jondo could insert a random delay per user-generated request, thereby decreasing the chances of revealing this information to virtually zero.

² These URLs are contained in, for example, the src attributes of <embed>, <frame>, <iframe>, <img>, <input type=image>, and <script> tags, the background attributes of <body>, <table>, <tr> and <td> tags, the content attributes of <meta> tags, and others.

The primary drawback of our present approach to defending against timing attacks is that it is not easily compatible with some web technologies. For example, web pages that contain executable scripts, e.g., written in JavaScript, can make it difficult for a jondo to identify in advance the URLs that a browser will automatically request as a result of interpreting those pages. One way to address this is for the user's jondo to delay requests received from the browser immediately after feeding the browser a page containing JavaScript. A more foolproof defense, which we recommend, is for the user to disable JavaScript in the browser when browsing via Crowds; this can be done easily via a preference menu in most browsers.

Another technology that presents some difficulties is SSL, a protocol by which web pages can be encrypted during transport. To enable both the user's jondo and the last jondo on the path to parse SSL-retrieved pages, the SSL connection to the web server must be made by the last jondo on the path. In this case, HTTP communication is not protected from jondos on the path, but is protected from other eavesdroppers because all communication between jondos is encrypted. At the time of this writing, however, SSL is not supported by Crowds.

5.3.2 Static paths. Early in the design of Crowds, we were tempted to make paths much more dynamic than they are in the present system, e.g., by having a jondo use a different path for each of its users, per time period, or even per user request. The advantages of more dynamic paths include the potential for better performance via load balancing among the crowd.
In this section, however, we caution that dynamic paths tend to decrease the anonymity properties provided by the system against collaborating jondos. The reason is that the probable innocence offered by Theorem 5.2 vanishes if the collaborators are able to link many distinct paths as being initiated by the same jondo. Collaborating jondos might be able to link paths initiated by the same unknown jondo based on related path content or timing of communication on paths. To prevent this, we made paths static, so the attacker simply does not have multiple paths to link to the same jondo.

To see why multiple linked paths initiated by the same jondo could compromise its user's anonymity, note that collaborating jondos have a higher probability of receiving each path initiation message (i.e., the first request on the path) from the initiator of the path than from any other individual member (see the proof of Theorem 5.2). Multiple paths initiated by the same user's jondo therefore pinpoint that jondo as the one from which the collaborators most often receive the initiating messages. Put another way, if the collaborators identify paths $P_1, \ldots, P_k$ from the same (unknown) initiator, then the expected number of paths on which the first collaborator is directly preceded by the path initiator is $\mu = k\left(\frac{n - p_f(n-c-1)}{n}\right)$. By Chernoff bounds, the probability that the first collaborator is immediately preceded by the initiator on substantially fewer of these paths is small: the first collaborator is immediately preceded by the path initiator on fewer than $(1-\delta)\mu$ paths with probability at most $e^{-\mu\delta^2/2}$ (see [Motwani and Raghavan 1995, Theorem 4.2]). Thus, the initiator would be identified with high probability.

Again, it is for this reason that a jondo sets up one path for all its users' communications, and this path is altered only under two circumstances. First, a path is altered when failures are detected in the path. More specifically, paths are only rerouted when the failure of a jondo is unmistakably detected, i.e., when the jondo executes a fail-stop failure [Schlichting and Schneider 1983]. In our present implementation, such failures are detected by the TCP/IP connection to the jondo breaking or being refused; a jondo does not reroute a path based on simply timing out on the subsequent jondo in the path (see line 23 of Figure 3). While this increases our sensitivity to denial-of-service attacks (see Section 2.3), it strengthens our promise of anonymity to the user. A reasonable question, however, is whether a malicious jondo on a path can feign its own failure in hopes that the path will be rerouted through a collaborator, yielding information that incriminates the path initiator. Fortunately, the answer is no. If a jondo in a path fails (or appears to fail), the path remains the same up until the predecessor of that faulty jondo, who reroutes the remainder of the path randomly (line 26 of Figure 3). Since the collaborating jondos cannot distinguish whether that predecessor is the originator or not, the random choices made by that predecessor yield no additional information to the collaborators.

The second circumstance in which paths are altered is when new jondos join the crowd. The motivation for rerouting paths is to protect the anonymity of a joining jondo: if existing paths remained static, then the joiner's new path can be easily attributed to the new jondo when it is formed. Thus, to protect joiners, all jondos forget all paths after new jondos join, and re-establish paths from scratch.
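The path-linking argument earlier in this section can be made concrete with a small numeric sketch. The Python below is illustrative only: the function names and the example parameters are assumptions, not values taken from Crowds. It computes the expected number $\mu$ of linked paths on which the initiator directly precedes the first collaborator, and the Chernoff bound $e^{-\mu\delta^2/2}$ on seeing fewer than $(1-\delta)\mu$ such paths.

```python
import math


def expected_direct_paths(k, n, c, p_f):
    """mu = k * (n - p_f*(n - c - 1)) / n for k linked paths."""
    return k * (n - p_f * (n - c - 1)) / n


def chernoff_lower_tail(mu, delta):
    """Upper bound exp(-mu * delta**2 / 2) on P[X < (1 - delta) * mu]."""
    if not 0 < delta <= 1:
        raise ValueError("delta must satisfy 0 < delta <= 1")
    return math.exp(-mu * delta ** 2 / 2)


if __name__ == "__main__":
    # Assumed example: n = 18, c = 5, p_f = 3/4, which sits exactly at the
    # probable-innocence boundary of Theorem 5.2 for a single path.
    mu = expected_direct_paths(k=100, n=18, c=5, p_f=0.75)
    print(mu, chernoff_lower_tail(mu, delta=0.5))
```

With these assumed values, 100 linked paths give $\mu = 50$, and the bound on the initiator appearing as direct predecessor on fewer than 25 of them is $e^{-6.25}$, roughly 0.002. A crowd that offers probable innocence on one path therefore offers very little once many paths can be linked.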
To avoid exposing path initiators to the attack described previously in this section, joins are grouped into infrequent scheduled events called join commits (see Section 8). Once a join commit occurs, existing paths are forgotten, and the newly joined jondos are enabled to participate in the crowd. Batching many joins into a single join commit limits the number of times that paths are rerouted and thus the number of paths vulnerable to linkage by collaborators. Moreover, each user is alerted when a join commit occurs and is cautioned against continuing to browse content related to what she was browsing prior to the commit, lest collaborators are attempting to link paths based on that content.

6. PERFORMANCE

In this section we describe the performance of Crowds 1.0. As discussed in Section 3, performance is one of the motivating factors behind the design of Crowds and, we believe, a strength of our approach relative to mixes [Chaum 1981] (though there are few published performance results for mix implementations to which to compare our results). And, while Crowds' performance is already encouraging, it could be improved further by re-implementing it in a compiled language such as C. Crowds 1.0 is implemented in Perl 5 (a partially interpreted language), which we chose for its rapid prototyping capabilities and its portability across Unix and Microsoft platforms.

Results of performance tests on our implementation are shown in Figures 4 and 5. In these tests, the source of requests was a Netscape 3.01 browser configured to allow a maximum of 4 simultaneous network connections. The crowd consisted of four jondos, each executing on a separate, moderately loaded 150 MHz Sparc 20 running SunOS 4.1.4. The web server was a fairly busy 133 MHz SGI workstation running Irix 5.3 and an Apache web server. All of these computers are located in AT&T Labs, and thus are in close network proximity to one another.

[Figure 4 plot: response latency (msecs) against path length and page size (kbytes); the plotted values are tabulated below.]

Path      Page size (kbytes)
length       0      1      2      3      4      5
1          288    247    264    294    393    386
2          573    700    900   1157   1369   1384
3          692    945   1113   1316   1612   1748
4          814   1004   1191   1421   1623   1774
5          992   1205   1446   1620   1870   2007

Fig. 4. Response latency (msecs) as a function of path length and page size

Figure 4 shows the mean latency in milliseconds of retrieving web pages of various sizes (containing no embedded URLs) for various path lengths. Each number indicates the average duration beginning when the user's jondo receives the request from the browser and ending when the page has been written back to the browser. In this figure, the path length is the number of appearances of jondos on the path. That is, if a jondo appears $k$ times on a path, then this jondo contributes $k$ to the total path length. So, for example, in Figure 2, the paths initiated by jondos 1, 4, and 6 are each of length two, and the paths initiated by 2, 3, and 5 are each of length three.

One observation we can make from Figure 4 is that the latency sharply increases when the path length increases from one to two. The primary reason for the sharp increase is that a path length of two is the first length at which encryption of page contents takes place. In a path of length one (which would be employed only if there were one crowd member), the user's jondo acts as a simple proxy between the browser and end server, to strip away identifying information from HTTP headers. In a path of length two, however, both the request and reply are passed, and encrypted, between the jondos on the path. To slow the growth of this latency as the path gets longer, this encryption is performed using a path key, which is a key shared among all jondos on a path. A path key is created by the jondo initiating the path, and each jondo on a path forwards it to the next jondo by encrypting the path key with a key it shares with the next jondo (see Section 8). The existence of a path key enables requests to be encrypted at the jondo initiating the path, decrypted by the last jondo in the path, and passed by intermediate jondos without encrypting or decrypting the requests. Similarly, replies are encrypted at the last jondo in the path, and decrypted only at the jondo where the path was initiated. The cryptographic operations are performed using an efficient stream cipher, allowing some of the encrypting and decrypting streams for the reply to be generated while the jondos are waiting for the reply from the web server. However, since even this cipher is implemented in Perl for portability, it remains a bottleneck in our implementation.

Figure 5 shows the mean latency in milliseconds of retrieving, via paths of various different lengths, pages containing URLs that are automatically retrieved by the browser (see Section 5.3.1). In these tests, each embedded URL is the address of a 1-kilobyte image resident on the same server as the page that referenced it. Each number indicates the average duration beginning when the user's jondo receives the initial request from the browser and ending when the jondo finishes writing the page and all of the images on the page to the browser. It is clear from Figure 5 that the number of images considerably impacts the latency of responses. Though this is to be expected in general, this effect is particularly pronounced in our implementation, and is due primarily to encryption costs.
Moreover, returning images on the path has the effect of serializing their retrieval, which further increases the latency over that achieved by modern browsers alone (which use several network connections to retrieve multiple images concurrently).

Because paths (and thus path lengths) are established randomly at run time, the user cannot choose her path length to predict the request latency she experiences. However, the expected path length can be influenced by modifying the value $p_f$ (i.e., the probability that a jondo forwards to another jondo versus submitting to the end server) at all jondos. Specifically, if $n > 1$, the expected length of a path is

$$(1-p_f)\sum_{k=0}^{\infty}(k+2)(p_f)^k = (1-p_f)\left[\sum_{k=0}^{\infty}k(p_f)^k + 2\sum_{k=0}^{\infty}(p_f)^k\right] = (1-p_f)\left[\frac{p_f}{(1-p_f)^2} + \frac{2}{1-p_f}\right] = \frac{p_f}{1-p_f} + 2$$

This suggests that multiple types of crowds should exist: those employing a small $p_f$ for better performance but less resilience to collaborating jondos (see Theorem 5.2), and those using a large $p_f$ to increase security with a cost to performance.

Performance seen in practice may differ from Figures 4 and 5, depending on the platforms running jondos and the speed of network connectivity between jondos. In particular, a jondo connected to the Internet via a slow modem link considerably impacts latencies on paths that use it. Again, this suggests multiple types of crowds, namely ones containing only jondos connected via fast links, and ones allowing jondos connected via slower links.

[Figure 5 plot: response latency (msecs) against path length and number of 1-kbyte images; the plotted values are tabulated below.]

Path      Number of 1-kbyte images
length       5     10     15     20     25
1         2069   4200   5866   7219   8557
2         3313   4915   6101   8195  10994
3         4127   5654   7464   9611  11809
4         4122   6840   8156  10380  11823
5         4508   7644   9388  11889  13438

Fig. 5. Response latency (msecs) as a function of path length and number of embedded images

7. SCALE

The numbers in Section 6 give little insight into how performance is affected as crowd size grows. We do not have sufficient resources to measure the performance of a crowd involving hundreds of computers, each simultaneously issuing requests. However, in this section we make some simple analytic arguments to show that the performance should scale well.

The measure of scale that we evaluate is the expected total number of appearances that each jondo makes on all paths at any point in time. For example, if a jondo occupies two positions on one path and one position on another, then it makes a total of three appearances on these paths. Theorem 7.1 says that each jondo's expected number of appearances on paths is virtually constant as a function of the size of the crowd. This suggests that crowds should be able to grow quite large.

Theorem 7.1. In a crowd of size $n$, the expected total number of appearances that any jondo makes on all paths is $O\left(\frac{1}{(1-p_f)^2}\left(1 + \frac{1}{n}\right)\right)$.

Proof. Let $n$ be the size of the crowd. To compute the load on a jondo, say $J$, we begin by computing the distribution of the number of appearances made by $J$ on each path. Let $R_i$, $i > 0$, denote the event that this path reaches $J$ exactly $i$ times (not counting the first if $J$ initiated the path). Also, define $R_0$ as follows:

$$P(R_0) = (1-p_f)\sum_{k=0}^{\infty}(p_f)^k\left(\frac{n-1}{n}\right)^k = (1-p_f)\left(\frac{1}{1 - p_f\frac{n-1}{n}}\right)$$

Intuitively, $P(R_0)$ is the probability that the path, once it has reached $J$, will never reach $J$ again. Then, we have

$$P(R_1) = \frac{1}{n}P(R_0)\sum_{k=0}^{\infty}(p_f)^k\left(\frac{n-1}{n}\right)^k = (1-p_f)\left(\frac{1}{n}\right)\left(\frac{1}{1 - p_f\frac{n-1}{n}}\right)^2$$

$$P(R_2) = \frac{1}{n}p_f P(R_1)\sum_{k=0}^{\infty}(p_f)^k\left(\frac{n-1}{n}\right)^k < (1-p_f)\left(\frac{1}{n}\right)^2\left(\frac{1}{1 - p_f\frac{n-1}{n}}\right)^3$$

$$P(R_i) = \frac{1}{n}p_f P(R_{i-1})\sum_{k=0}^{\infty}(p_f)^k\left(\frac{n-1}{n}\right)^k < (1-p_f)\left(\frac{1}{n}\right)^i\left(\frac{1}{1 - p_f\frac{n-1}{n}}\right)^{i+1}$$

From this, the expected number of appearances that $J$ makes on a path formed by another jondo is bounded from above by:

$$\sum_{k=0}^{\infty}k\,(1-p_f)\left(\frac{1}{n}\right)^k\left(\frac{1}{1 - p_f\frac{n-1}{n}}\right)^{k+1} = (1-p_f)\left(\frac{n}{n - p_f(n-1)}\right)\frac{n - p_f(n-1)}{(-1 + n - p_f(n-1))^2}$$

$$< \frac{n - p_f(n-1)}{(-1 + n - p_f(n-1))^2} = \frac{1}{(1-p_f)(n-1)} + \frac{1}{((1-p_f)(n-1))^2} < \frac{2}{(1-p_f)^2(n-1)}$$

Therefore, the expected number of appearances that $J$ makes on all paths is bounded from above by:

$$\frac{2n}{(1-p_f)^2(n-1)} = \frac{2}{(1-p_f)^2}\left(1 + \frac{1}{n-1}\right)$$

8. CROWD MEMBERSHIP

The membership maintenance procedures of a crowd are those procedures that determine who can join the crowd and when they can join, and that inform members of the crowd membership. We discuss mechanisms for maintaining crowd membership in Section 8.1, and policies regarding who can join a crowd in Section 8.2.

8.1 Mechanism

There are many schemes that could be adopted to manage membership of the crowd. Existing group membership protocols, tolerant either of benign (e.g., [Cristian 1991; Ricciardi and Birman 1991; Moser et al. 1991]) or malicious [Reiter 1996b] faults, can be used for maintaining a consistent view of the membership among all jondos, and the members could use voting to determine whether an authenticated prospective member should be admitted to the crowd. Indeed, a similar approach has been adopted in prior work on secure process groups [Reiter et al. 1994]. While providing robust distributed solutions, these approaches have the disadvantages of incurring significant overhead and of providing semantics that are arguably too strong for the application at hand. In particular, a hallmark of these approaches is a guaranteed consistent view of the group membership among the group members, whereas it is unclear whether such a strong guarantee is required here.

In our present implementation we have therefore opted for a simpler, centralized solution. Membership in a crowd is controlled and reported to crowd members by a server called the blender. To make use of the blender (and thus the crowd), the user must establish an account with the blender, i.e., an account name and password that the blender stores. When the user starts a jondo, the jondo and the blender use this shared password to authenticate each other's communication. As a result of that communication (and if the blender accepts the jondo into the crowd; see Section 8.2), the blender adds the new jondo (i.e., its IP address, port number, and account name) to its list of members, and reports this list back to the jondo. In addition, the blender generates and reports back a list of shared keys, each of which can be used to authenticate another member of the crowd. The blender then sends each key to the other jondo that is intended to share it (encrypted under the account password for that jondo) and informs the other jondo of the new member. At this point all members are equipped with the data they need for the new member to participate in the crowd.
However, to protect itself from attacks described in Section 5.3.2, the new member refrains from doing so until it receives a join commit message from the blender. This is discussed further in Section 8.2.

Each member maintains its own list of the crowd membership. This list is initialized to that received from the blender when the jondo joins the crowd, and is updated when the jondo receives notices of new or deleted members from the blender. The jondo can also remove jondos from its list of crowd members, if it detects that those jondos have failed (see line 25 of Figure 3). This allows for each jondo's list to diverge from others if different jondos have detected different failures in the crowd. This appears to have little qualitative effect on our security analysis of Section 5, unless attackers are able to prevent communications between correct jondos to the extent that each removes the correct jondos from its list of members.

A disadvantage of this approach to membership maintenance is that the blender is a trusted third party for the purposes of key distribution and membership reporting. Techniques exist for distributing trust in such a third party among many third party replicas, in a way that the corruption of some fraction of the replicas can be tolerated (e.g., [Deswarte et al. 1991; Gong 1993; Reiter 1996a]). In its present, non-replicated form, however, the blender is best executed on a secure computer, e.g., with login access available only at the console. Even though it is a trusted third party for some functions, note that users' HTTP communication is not routed through the blender, and thus a passive attack on the blender does not immediately reveal users' web transactions (unlike the Anonymizer; see Section 3). Moreover, the failure of the blender does not interfere with ongoing web transactions (again unlike the Anonymizer). We anticipate that in future versions of Crowds, jondos will establish shared keys using Diffie-Hellman key exchange [Diffie and Hellman 1976], where the blender serves only to distribute the Diffie-Hellman public keys of crowd members. This will eliminate the present reliance on the blender for key generation.

8.2 Policy

It is important in light of Section 5 that some degree of control over crowd membership be maintained. First, if anyone can add arbitrarily many jondos to a crowd, then a single attacker could launch enough collaborating jondos so that $n < \frac{p_f}{p_f - 1/2}(c + 1)$, at which point Theorem 5.2 no longer offers protection. Second, since joins cause paths to be re-routed (see Section 5.3.2), if joins are allowed to occur frequently and without controls, then paths may be re-routed sufficiently frequently to allow collaborating jondos to mount the correlation attack described in Section 5.3.2.

In our present implementation, the blender serves as the point at which joins to the crowd are controlled. To address the latter concern, the blender batches joins together so they occur in one scheduled, discrete event called a join commit. The schedule of join commits is a configurable parameter of the blender, but we envision that one commit per day should typically suffice. The blender informs all crowd members of the join commit, at which point all newly joined members are enabled to participate in the crowd and all old members reset their paths, as described in Section 5.3.2.

The need to limit the number of collaborators that join the crowd suggests that two different types of crowds will exist. The first type would consist of a relatively small (e.g., 10 to 30) collection of individuals who, based on personal knowledge of each other, agree to form a crowd together. Each member would be allowed to include at most one jondo in the crowd. More precisely, each person would be given one account, and only one jondo per account would be allowed.
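The admission rules described so far (accounts set up in advance, at most one jondo per account, and joins batched into join commits) can be sketched as a small data structure. The Python below is a hypothetical illustration, not the Crowds blender: the class and method names are invented, and password authentication, shared-key generation, and notification of existing members are all omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Jondo:
    account: str  # account name registered with the blender
    host: str     # IP address of the jondo
    port: int


@dataclass
class Blender:
    accounts: set                                  # accounts set up by the administrator
    members: dict = field(default_factory=dict)    # account -> Jondo, active in the crowd
    pending: dict = field(default_factory=dict)    # account -> Jondo, awaiting a join commit

    def request_join(self, jondo):
        """Queue a jondo for the next join commit; enforce one jondo per account."""
        if jondo.account not in self.accounts:
            return False
        if jondo.account in self.members or jondo.account in self.pending:
            return False
        self.pending[jondo.account] = jondo
        return True

    def join_commit(self):
        """Admit all queued jondos at once and return the new membership list.

        In the scheme described in the text, every existing member would also
        discard its path at this point; that step is not modeled here.
        """
        self.members.update(self.pending)
        self.pending.clear()
        return list(self.members.values())
```

In this sketch a jondo that has been accepted by `request_join` does not appear in `members` until `join_commit` runs, which mirrors the rule that a new member refrains from participating until the commit.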
Each member's personal knowledge of the other members enables her to trust that sufficiently few members collaborate to ensure that $n \ge \frac{p_f}{p_f - 1/2}(c + 1)$.

The second type of crowd would be a much larger public crowd, admitting members that might not be known to a substantial fraction of the present membership. The privacy offered by the crowd against collaborating members would rely on the size of the crowd being so large that an attack aimed at making $n < \frac{p_f}{p_f - 1/2}(c + 1)$ would require considerable effort to go undetected. That is, by limiting each user to one account (e.g., the blender administrator sets up an account for a user only after receiving a written, notarized request from that user) and each account to one jondo, and by monitoring and limiting the number of jondos on any one network (using IP address), the attacker would be forced to launch jondos using many different identities and on many different networks to succeed.

9. USER INTERFACE

In our present implementation, there are several ways in which a user interacts with her jondo, i.e., the jondo that serves as the HTTP proxy of her browser.

(1) The user can issue a crowd query by appending ?crowd? to the end of any URL that she requests. This returns a list of all of the active jondos in the crowd, according to her jondo. The information includes each jondo's account name, its IP address and its port number. An example is shown in Figure 6.

(2) When a user is browsing via the crowd, the word Crowd: is prepended to the title of each page. Thus, a user can check whether or not she is using the crowd by looking at the title of the documents in the browser. Of course, a web server could add this word to the title of any document to fool the user, and so this alone should not be relied upon.

(3) Crowds offers other informational pages to the user via the browser, similar to Figure 6. For example, Crowds alerts the user when a join commit occurs.

Fig. 6. Crowd query: A crowd query shows the jondos that are available, and indicates which one is acting as the user's HTTP proxy for the browser.

The savvy Crowds user can also fine-tune her jondo's behavior by way of a configuration file that defines the behavior of her jondo. This configuration file includes, for example, parameter settings to allow or disallow the passage of cookies, i.e., data that a web server can download to the user's browser and that the user's browser will include in subsequent requests to that server. By default, a jondo strips all cookies out of requests it receives from browsers in order to better protect its user's privacy, but the jondo can be configured to let cookies pass. Other configurable parameters of a jondo include, for example, the host and port of the crowd blender, and the account name and password under which the jondo requests