Internet Anonymity and the Design Process - A Practical Approach

anon.next: A Framework for Privacy in the Next Generation Internet

Matthew Wright
Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX, USA
mwright@uta.edu, http://isec.uta.edu/

Abstract. Systems for anonymity on the Internet are inherently slow; multi-hop paths may traverse continents in an effort to remove the linkability between source and destination. Next generation Internet infrastructures are currently being investigated, notably through the NSF GENI project. In such an infrastructure, there is an opportunity to build anonymity directly into the network so that it is faster and more efficient than overlay-based approaches. We propose anon.next, a basic architecture for this kind of network-embedded anonymity system meant to be tested on the GENI infrastructure. In anon.next, anonymizing proxies are controlled by ISPs and have information about how to build paths that are both efficient and privacy-preserving. This paper presents the design choices that we would face in developing this system and the challenges for determining the privacy it would provide.

1 Introduction

Anonymity research has led to a variety of practical system designs and anonymity systems that are in use today [1]. These systems are effective against many kinds of attacks on privacy, but they have substantially slower network performance than direct connections and cannot protect against more powerful attackers.

These systems have made reasonable design choices but face fundamental limitations. As they are overlay networks, an anonymized connection going through multiple proxies must pass through several Internet connections before reaching its destination, and reply traffic faces the same overhead. Any optimizations in this framework are therefore limited at best.

Recently, there has been a great deal of interest among researchers in new Internet architectures. In particular, the U.S. National Science Foundation has a major long-term initiative in Future Internet Design (FIND; see http://www.netsfind.net/). Related to this initiative is the development of GENI, the Global Environment for Network Innovations (http://www.geni.net), a virtual laboratory for internetworking research. The next-generation Internet is expected to have many features, such as security mechanisms, quality-of-service, availability, and application support. This also presents a substantial opportunity for anonymity and privacy researchers to develop ideas for privacy-enhanced systems in these future networks.

In this paper, we propose to investigate ways of embedding anonymous communications proxies into the new Internet architectures, such that these networks can provide efficient and effective protection from traffic analysis. More specifically, we will consider a simple but promising notion: that proxies much like existing anonymizing proxies can be placed in key locations throughout the network to provide strong protection from traffic analysis. This will mean that the privacy of users' communications will be protected while providing high-speed connections in a way that is not possible with today's frameworks.

The promise of this idea is clear: by removing significant overheads in the creation of the path of proxies and not requiring communications to cross the entire network multiple times, we greatly lower the delays as compared to an overlay system. Further, consolidation of traffic may offer opportunities to provide greater mixing, by which traffic from different connections is briefly stored and reordered to confuse the eavesdropper. Mixing is largely beyond the capability of today's overlay systems, as the traffic levels are not high enough; low traffic means long periods of storage at the proxies, while higher traffic levels are too expensive for volunteer operators to sustain. While embedded anonymizing proxies would add to the network's infrastructure costs, they could be more cost-efficient than overlay proxies, and the costs could be charged to users. The addition of proxies can substantially improve protection against powerful eavesdroppers by ensuring that sufficient mixing occurs in the system.

In this initial investigation, we will consider a variety of possibilities for the placement of proxies in the network and the algorithms for selecting proxies for anonymous communication paths.
We will use analysis and simulation to determine the performance and security consequences of different choices, and we will also consider the practicality of these choices with respect to the cost of the proxies and overhead in the network. The most promising choices will be studied further in future research, in which we will investigate the potential for mixing and cover traffic to further prevent traffic analysis.

2 Background

Anonymous communications have been studied in detail since 1981, when Chaum presented the idea of a mix [2], a proxy that buffers messages and reorders them before sending them out. Mixes should be put in a series, or a path, in which layers of encryption are removed in stages to protect messages from being tracked. Most of the research in anonymity has been based on this simple idea, and much of that work has focused on the systems aspects of making anonymity practical and efficient for real network users.

Designers of anonymity systems, including the commercial Freedom network [3] and the currently popular Tor [1], have made substantial compromises in security to allow for acceptable performance and overheads. Essentially, unlike mixes, they do not buffer or reorder messages. The creators of Tor plainly state that end-to-end timing correlation is likely to be effective against their system [1]. Recent efforts in timing analysis have shown high rates of attacker success for tracking communications, even when all users have identical, constant-rate traffic patterns [4].

Despite the security compromises these systems have made, they are still slow. While this has been difficult to quantify, it seems clear that sending and receiving packets over multiple randomly selected overlay hops will be inherently slower than a direct connection. Each intermediate connection, for example, is subject to possible congestion. This means that the chance of congestion somewhere on the path is much higher than in a direct connection.

One approach to solving this problem would be to choose servers that are well-placed in the network to provide the best network performance. Doing this in a naive way, however, makes the system vulnerable to attackers with only modest resources [5]. While secure ways of improving network performance may exist, such improvements will be inherently limited by the need to connect to servers at the edge of the network.

In this paper, we propose a means to remedy the limitations on the performance-security tradeoff in current anonymity systems by embedding proxies into the network structure. This idea holds promise but leads to a number of important questions for investigation. In such an investigation, we would seek the answers to the two questions most relevant to understanding the feasibility of this approach. First, where should we put the proxies? Second, how do we select paths between proxies? Of course, these two questions are linked, and we will need to address them together.

Research funding agencies in both the U.S. and Europe are calling for new efforts in Internet design. This presents a unique opportunity to consider the addition of network services, including protection from traffic analysis.
The main novel aspect of this proposed effort is to place proxies inside the network architecture, with more direct routes between proxies, in an effort to reduce the overheads of providing anonymity while likely improving the security of the system. We envision these proxies being attached to routers, in that they will have short, direct links to routers where end nodes typically are not attached.

We now describe some of the challenges that are critical to the design of the anon.next system.

Placement of Proxies. Placement of the proxies in the Internet involves choosing logical, rather than physical, locations. The proxies could, for example, be attached to the edge routers in the network or to routers in the core of the network. If we place all the inter-proxy routing intelligence in the proxies themselves, such proxies could sit at the core without requiring extra work from heavily loaded routers. However, there may be benefits from having the proxies get information from the routers to improve their routing decisions. In this case, it could become quite expensive to place the proxies at core routers. Placement also affects the number of proxies. As more proxies are added to the system, each proxy sees a smaller share of the traffic, so the amount of mixing among different traffic flows may be reduced. However, many proxies may be needed to handle realistic loads.
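
The tension between the number of proxies and the amount of mixing can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that total traffic is split evenly across proxies and that each proxy behaves like a simple threshold mix that flushes once a fixed batch of messages has arrived. The function name, the batch size, and the traffic figures are hypothetical and are not part of the anon.next design.

```python
def expected_batch_wait(total_msgs_per_sec: float, num_proxies: int, batch_size: int) -> float:
    """Average time in seconds for one proxy to collect a full batch.

    Assumption (illustrative only): traffic is spread evenly over the
    proxies, so each proxy sees total_msgs_per_sec / num_proxies.
    """
    if total_msgs_per_sec <= 0 or num_proxies <= 0 or batch_size <= 0:
        raise ValueError("all arguments must be positive")
    per_proxy_rate = total_msgs_per_sec / num_proxies
    # With a steady arrival rate, collecting batch_size messages takes
    # batch_size / rate seconds on average.
    return batch_size / per_proxy_rate


if __name__ == "__main__":
    # Hypothetical load: 100,000 messages per second network-wide,
    # with each proxy flushing after 100 messages.
    for n in (10, 100, 1000, 10000):
        wait = expected_batch_wait(100_000, n, 100)
        print(f"{n:>6} proxies: about {wait:.2f} s to fill a batch")
```

Under these assumptions, the time to fill a batch grows linearly with the number of proxies, which is one way to see why adding proxies for capacity can work against mixing unless traffic is consolidated.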

Selection of Proxies on the User's Path. Routing between proxies can take many forms, and there may be a tension between performance and security. Purely random selection of proxies has the best security properties, but can lead to very long paths and may provide little to no benefit over systems like Tor. Using entirely performance-driven selection can lead to paths that use only the nearest proxies, or proxies that split the distance between the end points. These paths may be easier to eavesdrop on; selecting paths that avoid reusing the same network service provider or Internet exchange may be critical to privacy [5].

Proxy Discovery. Selecting proxies assumes that the nodes doing selection will know about most of the proxies in the Internet. This leads to a substantial challenge in the secure distribution of proxy information. While complete information about all proxies is more secure, as it can help prevent statistical attacks on path selection, keeping complete and up-to-date information may be prohibitively expensive. We expect that a combination of extensive local knowledge and less complete knowledge of more distant proxies may be appropriate in this system.

Inter-proxy Connection Properties. To best pick paths that provide good performance, particularly without being tricked by attackers, there will need to be a means of testing the connections between proxies. Simple measurements of latency and bandwidth are certainly possible, and we hypothesize that nodes do not have the ability to affect results much except to make them worse than reality. This is unlikely to provide much benefit to the attacker.

Against a system such as this, there are a number of attacks that must be considered. Here, we mention a few of the most likely or most critical.

Attacks Based on Latency. Hopper et al. study this possibility extensively on Tor [6].
The main issue is that the set of possible initiators can be greatly reduced by estimating the latency between the initiator and the responder. There are ways to mitigate this attack. First, having a large number of users makes a practical attack difficult, as there will be too many possible initiators even after the attack. Second, we may be able to limit the attack's effectiveness by putting users into latency classes. The client or the first proxy can estimate the round trip time (RTT) to the responder and add small delays to make the average RTT one of a relatively few values. With many users, the number of such values can be large enough to accommodate reasonable variation without high delays.

Attacks Based on Biased Path Selection. If we assume that not every proxy can know every other proxy in the system, then the paths could be subject to bias by the attacker providing misinformation. A structured P2P system can be used to create a secure distributed directory service, as proposed by Nambiar and Wright [7]. In a next-generation anonymity system, we believe that the structure of the directory service must be at least partially associated with the relative network distance between peers. This way, peers that are close to each other can be easily substituted in a path without major changes to performance. Creating and evaluating a system design that does this is a major new challenge.

Attacks Based on Leaks Due to Path Selection. As pointed out by Mittal and Borisov, using a system like Salsa can lead to information leaks [8]. We also note that leaks can come from location if paths are chosen using latency as a consideration. The worst case is that these leaks build on each other, so that the combination of attacks is substantially more powerful at identifying the initiator. We will aim in our design to make these leaks substantially overlap, so that the information gained through one type of attack is approximately the same as the information gained from the other. Creating path selection algorithms that meet this goal, and finding ways to evaluate them, will be an important and difficult task.

Attacks on Privacy Using Denial of Service. Denial of service attacks can be dangerous to the privacy of users, as shown by Borisov et al. [9]. The main problem in a system like Tor is that an attacker can block some paths in an attempt to get the initiator to use paths controlled by the attacker. As pointed out by Borisov et al., a reputation system could have an effect on such an attack by making denial of service attacks cost the attacker chances to be on a path [9]. Tailoring a reputation system to the proposed scenario and demonstrating its effectiveness are important to protect against this class of attacks, as well as to provide useful information for selecting paths in the system.

Intersection and Predecessor Attacks. Intersection and predecessor attacks require relatively strong attackers with either a substantial fraction of malicious nodes or a powerful eavesdropper who can see a large fraction of anonymized traffic.
Since attacks have been shown to be successful against Tor and AN.ON with weaker attacker models, we believe that a reasonable approach is to focus on these attacks while limiting the ability of the attacker to control nodes and observe traffic. For the latter, we will aim to keep paths diverse and have them pass through multiple Internet exchanges.

3 Conclusions

This paper presents anon.next, a system of anonymizing proxies for the next generation Internet. With ongoing efforts to design and evaluate new Internet architectures, it is an exciting time to investigate novel privacy-preserving infrastructures for these networks. In this paper, we have discussed some of the key challenges around the design of one such infrastructure. There is a tremendous amount of additional work to be done in this area, and we encourage privacy researchers to start thinking more about the design challenges and privacy pitfalls involved in such an undertaking.

References

1. Dingledine, R., Mathewson, N., Syverson, P.: Tor: The next-generation onion router. In: Proc. 13th USENIX Security Symposium. (Aug. 2004)
2. Chaum, D.: Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms. Communications of the ACM 24(2) (Feb. 1981) 84-88
3. Back, A., Goldberg, I., Shostack, A.: Freedom 2.0 security issues and analysis. Zero-Knowledge Systems, Inc. white paper (Nov. 2000)
4. Levine, B.N., Reiter, M., Wang, C., Wright, M.: Timing analysis in low-latency mix systems. In: Proc. Financial Cryptography (FC). (Feb. 2004)
5. Murdoch, S.J., Zieliński, P.: Sampled traffic analysis by internet-exchange-level adversaries. In: Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007). (June 2007)
6. Hopper, N., Vasserman, E.Y., Chan-Tin, E.: How much anonymity does network latency leak? ACM Transactions on Information and System Security (forthcoming 2009)
7. Nambiar, A., Wright, M.: Salsa: a structured approach to large-scale anonymity. In: Proc. ACM Conference on Computer and Communications Security (CCS 06). (Oct. 2006)
8. Mittal, P., Borisov, N.: Information leaks in structured peer-to-peer anonymous communication systems. In: Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS 2008). (October 2008)
9. Borisov, N., Danezis, G., Mittal, P., Tabriz, P.: Denial of service or denial of security? How attacks on reliability can compromise anonymity. In: Proceedings of CCS 2007. (October 2007)