TRUST AND REPUTATION IN PEER-TO-PEER NETWORKS




TRUST AND REPUTATION IN PEER-TO-PEER NETWORKS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Sergio Marti
May 2005

© Copyright by Sergio Marti 2005. All Rights Reserved.

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Hector Garcia-Molina (Principal Adviser)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Mary Baker

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Rajeev Motwani

Approved for the University Committee on Graduate Studies.


Abstract

The increasing availability of high-bandwidth Internet connections and low-cost, commodity computers in people's homes has stimulated the use of resource-sharing peer-to-peer networks. These systems employ scalable mechanisms that allow anyone to offer content and services to other system users. However, the open accessibility of these systems makes them vulnerable to malicious users wishing to poison the system with corrupted data, harmful services, and worms. Because of this danger, users must be wary of the quality and validity of the resources they access.

To mitigate the adverse behavior of unreliable or malicious peers in a network, researchers have suggested using reputation systems. Yet our understanding of how to incorporate an effective reputation system into an autonomous network is limited. This thesis categorizes and evaluates the components and mechanisms necessary to build robust, effective reputation systems for use in decentralized autonomous networks. Borrowing techniques from game theory and economic analysis, we begin with high-level models in order to understand general trends and properties of reputation systems and their effect on a user's behavior and experience. We then closely examine the effects of limited reputation sharing through simulations based on large-scale measurements from actual, operating P2P networks. Finally, we propose new mechanisms for improving message routing throughput in decentralized networks of untrusted peers: one geared towards structured DHTs (SPROUT) and two other complementary mechanisms for mobile ad hoc networks (Watchdog and Pathrater).

Acknowledgements

I would like to thank my advisor Hector Garcia-Molina for his unending patience and guidance. I appreciate his great passion for research, which is only matched by his strong commitment to his students. I am always amazed that, regardless of his many duties and projects, Hector would make himself available to provide feedback and insight on my work. Not only is Hector a wonderful advisor but also a caring friend.

I am also deeply grateful for the opportunity to have had Mary Baker as my advisor when I first came to Stanford. Her professionalism and enthusiasm for research inspired me to pursue my Ph.D. Mary's devotion to excellence is exemplified in the work of her students.

My experience at Stanford has been joyful and enlightening, and I am grateful to the members of both the Mosquitonet and Database groups for their insights, constructive criticism, and friendship. I would especially like to thank my co-authors TJ Giuli, Kevin Lai, and Prasanna Ganesan. I also thank Rajeev Motwani for agreeing to be a member of my reading committee.

Finally, I must thank my friends and family for their encouragement and support. In particular, I am grateful to my parents for their love and for instilling in me a deep sense of academic pride. And most of all, to my wife Wendy, whose patience and love have kept me going, even when I doubted myself. From proofreading my papers to preparing tasty treats, Wendy is always there for me.

Contents

Abstract
Acknowledgements
1 Introduction
  1.1 Research Contributions and Thesis Outline
2 Taxonomy of Trust
  2.0.1 Taxonomy Overview
  2.1 Terms and Definitions
  2.2 Assumptions and Constraints
    2.2.1 User Behavior
    2.2.2 Threat Model
    2.2.3 Environmental Limitations
  2.3 Gathering Information
    2.3.1 System Identities
    2.3.2 Information Sharing
    2.3.3 Dealing with Strangers
  2.4 Reputation Scoring and Ranking
    2.4.1 Inputs
    2.4.2 Outputs
    2.4.3 Peer Selection
  2.5 Taking Action
    2.5.1 Incentives
    2.5.2 Punishment
  2.6 Miscellaneous
    2.6.1 Resource Reputation
    2.6.2 Social Networks
  2.7 Conclusion
3 Agent Strategies Under Reputation
  3.1 Definitions and Dimensions
    3.1.1 Game Setup and Rules
    3.1.2 Knowledge-space
    3.1.3 Player-space
    3.1.4 Price-space
    3.1.5 eBay Scenario
  3.2 Strategy-Independent Analysis
    3.2.1 Single Transaction Payoff
    3.2.2 Social Optimum
  3.3 Selfish Analysis
    3.3.1 Zero Knowledge
    3.3.2 Perfect Knowledge
  3.4 Perfect History
    3.4.1 Basic Reputation-based Strategies
    3.4.2 Independent Decisions for MB-1S/VP
    3.4.3 Independent Decisions for 1B-MS/FP
  3.5 Related Work
  3.6 Future Directions
    3.6.1 Variably-valuated Goods
    3.6.2 Malicious Sellers
    3.6.3 Costly Signaling
  3.7 Conclusion
4 Modeling Reputation and Incentives
  4.1 Assumptions and Definitions
    4.1.1 Utility
    4.1.2 Time
  4.2 Formal Model
    4.2.1 Incentive Schemes
    4.2.2 Currency Scenarios
    4.2.3 Trust
  4.3 Analysis
    4.3.1 Trust over Time
    4.3.2 Utility over Time
  4.4 Simulation Details
  4.5 Simulation Results
    4.5.1 Base Population
    4.5.2 NR and MTPP
    4.5.3 Trust vs. Capacity
    4.5.4 Single-Peer Experiments
  4.6 Variations on the Model
    4.6.1 Profit Trust Factor
    4.6.2 Additional Trust Models
    4.6.3 Tying Service to Reputation
  4.7 Generalized Model of Trust and Profit
  4.8 Discussion
    4.8.1 Credits and Economic Stimulation
  4.9 Related Work
  4.10 Conclusion
5 P2P Reputation System Metrics
  5.1 System Model
    5.1.1 Authenticity
  5.2 Threat Models
    5.2.1 Document-based Threat Model
    5.2.2 Node-based Threat Model
  5.3 Reputation Systems
    5.3.1 Identity
  5.4 Metrics
    5.4.1 Efficiency
    5.4.2 Effectiveness
    5.4.3 Load
    5.4.4 Message Traffic
    5.4.5 Threat-Reputation Distance
  5.5 Simulation Details
  5.6 Results
    5.6.1 Local Reputation System
    5.6.2 Voting System
    5.6.3 Node-based Threat Model
  5.7 Statistical Analysis of Reputation Systems
  5.8 Equations
  5.9 Empirical Estimations
  5.10 Long-Term Reputation System Performance
    5.10.1 Random Base Case
    5.10.2 Select-Best/Weighted Ideal Case with Threshold
    5.10.3 Weighted Ideal Case without Threshold
    5.10.4 Select-Best Ideal Case without Threshold
    5.10.5 Select-Best/Weighted Local Reputation System with Threshold
    5.10.6 Weighted Local System without Threshold
    5.10.7 Select-Best Local System
  5.11 Comparison of Statistical Analysis to Simulation Results
  5.12 Related Work
  5.13 Conclusion
6 SPROUT: P2P Routing with Social Networks
  6.1 Trust Model
    6.1.1 Trust Function
    6.1.2 Path Rating
  6.2 Social Path Routing Algorithm
    6.2.1 Optimizations
  6.3 Results
    6.3.1 Simulation Details
    6.3.2 Algorithm Evaluation
    6.3.3 Calculating Trust
    6.3.4 Number of Friends
    6.3.5 Comparison to Gnutella-like Networks
    6.3.6 Latency Comparisons
    6.3.7 Message Load
  6.4 Related and Future Work
  6.5 Conclusion
7 Mitigating MANET Misbehavior
  7.1 Assumptions and Background
    7.1.1 Definitions
    7.1.2 Physical Layer Characteristics
    7.1.3 Dynamic Source Routing (DSR)
  7.2 Watchdog and Pathrater
    7.2.1 Watchdog
    7.2.2 Pathrater
  7.3 Methodology
    7.3.1 Movement and Communication Patterns
    7.3.2 Misbehaving Nodes
    7.3.3 Metrics
  7.4 Simulation Results
    7.4.1 Network Throughput
    7.4.2 Routing Overhead
    7.4.3 Effects of False Detection
  7.5 Related Work
  7.6 Future Work
  7.7 Conclusion
8 Conclusion and Future Work
A Proof of Long-Term Reputation Damage
  A.1 Error Bounds
  A.2 Improved Approximation
B Unique Maximum of Segregated Schedule
C Optimal Schedule
D Mathematical Derivation of Economic Model
  D.1 Utility over Time
  D.2 Generalized Trust over Time (σ(t, p) = 1)
Bibliography

List of Tables

2.1 Breakdown of reputation system components
3.1 Parameter descriptions with sample values
3.2 General payoff matrix
3.3 Payoff matrix for fixed $2-priced goods with valuation $3 and cost $1
3.4 Payoff matrix for variable-priced p goods for default v = $3 and c = $1
3.5 Payoff matrix for fixed $2-priced goods with valuation $3, cost $1, and maliciousness factor $1
4.1 Trust and profit parameters and default values
4.2 Simulation parameters and default values
4.3 Definition of generalized model terms
5.1 Simulation statistics and metrics
5.2 Configuration parameters and default values
5.3 Distributions and their parameters with default values
6.1 SPROUT vs. Chord
6.2 Evaluating lookahead and MHD
7.1 Maximum and minimum network throughput obtained by any simulation at 40% misbehaving nodes with all features enabled
7.2 Maximum and minimum overhead obtained by any simulation at 40% misbehaving nodes with all features enabled
7.3 Comparison of the number of false positives between the 0-second and 60-second pause time simulations; averages taken from the simulations with all features enabled

List of Figures

2.1 Representation of primary identity scheme properties
3.1 Number of transactions until gain from single defection equals loss from lowered reputation k
3.2 Optimal number of cooperations/defections as a function of total sales
3.3 Relative utility error between optimal schedule and ±1 C/D
3.4 Relative utility error between optimal schedule using weak approximation and ±1 C/D
4.1 Relationship between a peer's profit rate and the number of peers in the network
4.2 Representation of a reputation system's role in a trading network. Transaction observations update peer reputations maintained in the trust vector. Reputation information is then used by peers in transactions to improve expected utility.
4.3 A peer's trust rating over time
4.4 Convergence of T as t → ∞. Note the log-scale x-axis. C_B = 0 in both.
4.5 A peer's utility over time. Initial trust T(0) = 0.01. Higher is better.
4.6 A peer's utility over time. Initial trust T(0) = 0.0035
4.7 Minimum capacity needed for a good peer to (eventually) generate positive profit (using default π_gt, k_v, and k_c) is approximately 0.035 (for default parameters)
4.8 Capacity distribution for base population
4.9 Trust and utility values for default population after 200 turns
4.10 Distribution of credits in base population at turn 200
4.11 Trust and utility for base population after 1000 turns
4.12 Trust and utility for NR=400 after 1000 turns
4.13 Trust and utility for NR=1 after 1000 turns
4.14 Trust and utility for MTPP=2 after 1000 turns
4.15 Utility for MTPP=3 after 1000 turns
4.16 Comparing the analytical and simulation results for the convergence of T as t → ∞ as a function of C = C_G. Note the log-scale x-axis.
4.17 Comparing the analytical and simulation results of trust over time for new good peers
4.18 Comparing the analytical and simulation results of trust over time. MTPP=1
4.19 Comparing utility over time for a new good peer. MTPP=2
4.20 Comparing utility over time for a new bad peer. MTPP=1
4.21 Effects of varying trust factor σ
4.22 Comparison of ratio trust model to differential trust model. T(0) = 0.01
4.23 π_gt w.r.t. T for various functions of T
4.24 Effects of sample π_gt w.r.t. varying functions of T
4.25 Steady-state trust as a function of C_B. C = 1
4.26 Steady-state profit as a function of C_B. C = 1
4.27 Effects of varying σ(t, p)
5.1 Sample document and matching query
5.2 Efficiency for varying ρ_0. Lower value is better; 1 is optimal.
5.3 Varying selection threshold values
5.4 Efficiency comparison
5.5 Relative message traffic of Friends-First and maximum Friend-Cache utilization w.r.t. cache size
5.6 Efficiency of voting reputation system w.r.t. varying quorum weight
5.7 Efficiency of the voting reputation system w.r.t. Friend-Cache size
5.8 Effects of front nodes on efficiency
5.9 Efficiency of two reputation systems with the random algorithm as a function of π_B
5.10 Average load on well-behaved nodes as a function of p_B
5.11 Distribution of load on good nodes (and their corresponding number of files shared)
5.12 Efficiency comparison of local and ideal reputation systems under the node-based threat model
5.13 Efficiency comparison of reputation systems with uniformly distributed node threat ratings
5.14 Comparison of the local reputation system with ρ_T of 0.0 and 0.15 and the base case over time
5.15 Comparison of the local reputation system with both Weighted and Select-Best variants and a selection threshold of 0.0 and 0.15 and the base case over time
5.16 Comparison of the efficiency of the reputation systems over time
5.17 Expected steady-state system behavior
6.1 Performance of SPROUT and AC in different-size Small World networks
6.2 Performance of SPROUT and AC for different trust functions and varying f
6.3 Performance of SPROUT and AC for varying r
6.4 Performance as a function of a node's degree. Club Nexus data.
6.5 Performance of SPROUT and AC for different uniform networks with varying degrees
6.6 Performance of SPROUT and AC versus unstructured flooding
6.7 Latency measurements for SPROUT vs. AC w.r.t. network size. Lower is better.
6.8 Distribution of load (in fraction of routes) for augmented Chord and SPROUT
7.1 Example of a route request
7.2 Watchdog in action
7.3 Node A does not hear B forward packet 1 to C, because B's transmission collides at A with packet 2 from the source S
7.4 Node A believes that B has forwarded packet 1 on to C, though C never received the packet due to a collision with packet 2
7.5 Overall network throughput as a function of the fraction of misbehaving nodes in the network
7.6 Routing overhead as a ratio of routing packet transmissions to data packet transmissions, plotted against the fraction of misbehaving nodes
7.7 Comparison of network throughput between the regular Watchdog and a Watchdog that reports no false positives


Chapter 1

Introduction

Previously, the ability to both send and receive large amounts of digital content and data was limited to large institutions with the funds and resources to install and manage high-speed networks and fast server machines. However, the increasing availability of high-bandwidth Internet connections and low-cost, commodity computers in people's homes allows regular home users to quickly communicate and share data with each other. This spread of computing resources has stimulated the use of resource-sharing peer-to-peer (P2P) networks. These systems employ a simple, scalable mechanism that allows anyone to offer content and services to other users, as well as search for and request resources from the network.

What distinguishes P2P systems from other distributed systems is their focus on full user autonomy. Typically, distributed systems consist of computers managed by a single organization or hierarchy. Devising an efficient architecture that spans many networked machines is much simpler when all machines can be monitored and controlled by a single operator. However, in pure P2P architectures there are no centralized services or control mechanisms dictating the actions of other nodes. Each user decides what computing resources he will contribute, as well as when and for how long. The architecture is

designed to handle large numbers of nodes joining and abruptly leaving the network. In addition, these systems emphasize equality and balancing the load across nodes. This flexibility, self-determination, and low participation cost encourages a much larger number of participants, which, in turn, greatly increases the number and value of the services provided by the system to all.

The most important contribution of peer-to-peer system research is providing an architecture that allows a group of users spread throughout the Internet to cheaply and efficiently connect their commodity computing resources into one massive system, usable by all. The implications for rapid prototyping and deployment of new services by small teams of developers without large amounts of capital are astounding. Already we see P2P systems that handle a plethora of applications, ranging from grid computing to data storage to digital preservation.

However, current media attention to peer-to-peer systems is concentrated on the legal issues of copyright infringement that plague popular file-sharing applications. Users have discovered P2P networks to be an efficient and cheap method of transmitting digital content. However, these transmissions are being done without the consent of the legal owners of the content. No legally acceptable solution to content distribution using P2P technology is deployed today. If such a solution existed, both content owners/creators and consumers would benefit greatly.

To understand the potential impact of P2P systems, we must step back and chronicle the evolution of media distribution. Currently, the cost of setting up and managing traditional media distribution channels is too great for individual content creators to overcome, resulting in a few large monopolistic companies that control all development and distribution of media, such as music, movies, and books.
These companies decide what media is produced based primarily on what can be marketed for maximum profit, not artistic merit. This filtering severely limits the public's access to new and diverse content and ideas.

The evolution of the World Wide Web has greatly helped independent artists and authors reach a larger segment of the population. Artists can now distribute or sell their work in digital form from their websites, circumventing the packaging, transportation, and retail costs of CDs, DVDs, and books. The Web has also enabled the sale of all kinds of material goods by ordinary people on a global scale. The best example of this is the auction site eBay [42], which allows any individual to advertise and auction items to people all over the world. Not only has the Web created new distribution channels for digital content, but it provides a cheap solution for global advertising of physical items.

Although the Web has lowered the cost of distribution and marketing, it does impose costs that are still too great for many users. Websites that distribute songs or movies require large amounts of bandwidth to serve all their customers, and bandwidth costs money. Running a commercial website with the necessary computing resources to handle sales and distribution for a vast number of customers is still beyond the capacity of most individuals. This need for technical capital has resulted in the emergence of large companies that specialize in digital content distribution. These new electronic distribution middlemen, such as eBay and Apple's iTunes [8], are once again in a position of power over the content creators. They decide what is sold and what they charge for access to their service. Many eBay merchants are unhappy with the fees they must pay eBay to use its services. Every increase in fees results in sellers leaving eBay as they lose the already slim profit margins they maintained [86].

A new distribution revolution is needed. This revolution is coming in the form of P2P networks. When content can be transferred between customers without involving a single centralized server, the computational and bandwidth burdens on the content creator or owner are removed.
The cost of distribution would be much lower for the content owner and the distribution channels could no longer be monopolized by a small group of middlemen. The result

would mean lower prices for consumers and increased profits for producers. Merchants who have left eBay (or never used it) due to its increasing fees may welcome a pure P2P-commerce solution where no fees are collected and all sellers participate equally.

Unfortunately, both producers and consumers are hesitant to use P2P networks for distribution. P2P technology is not yet mature enough to support a secure and safe method for purchasing content through these systems. The primary hurdles are: providing an efficient, secure mechanism for purchasing content; a universally accepted method for verifying content authenticity and ownership; and ways to prevent or mitigate attacks on the system by malicious users. These attacks include defrauding customers and stealing their money, intentionally modifying content to damage its owner and/or creator, and using content distribution to infect computers with worms or viruses. Without a secure payment system that prevents or punishes malicious attackers, P2P technology is not yet a viable distribution medium.

These worries have appeared before whenever a new distribution channel emerged, most recently with e-commerce over the World Wide Web. Each time, methods and practices were developed to combat malicious activity and instill confidence in consumers and sellers alike. These mechanisms have proven successful. In 2004 Americans spent approximately $115 billion on online purchases, up over 25% from the previous year [66, 130, 134]. eBay alone posted 2004 revenues of $3.3 billion [135]. The success of eBay is of special relevance because eBay is a hybrid peer-to-peer system: although certain functions such as indexing and auction management are operated by a centralized server, distribution of goods and payment are handled directly between the buyers and sellers.

Now researchers are working fervently to develop the secure payment, digital rights management, auditing, and enforcement mechanisms peer-to-peer systems need in order to allow users to confidently purchase and distribute all kinds of content. A major component in detecting and mitigating malicious attacks will be the reputation system. Online trading and auction systems, such as eBay, employ reputation systems as a means of distinguishing well-behaved, productive users from selfish or malicious peers. Reputation systems provide users with a summarized (perhaps imperfect) history of another peer's transactions. Users rely on this information to decide to what extent they should trust an unknown peer before they themselves have interacted with him/her. Scholars and researchers have adopted reputation systems as a useful mechanism for detecting, containing, and discouraging misbehavior in P2P networks.

Unfortunately, the lack of a centralized trusted entity capable of monitoring user behavior and enforcing rules complicates the design of mechanisms for detecting and preventing malicious behavior in autonomous environments. However, it is this challenge that most inspires the work presented in this thesis, as well as the research field of security for peer-to-peer systems. Secure solutions will encourage more users to engage in larger-valued transactions through the flexible and efficient commercial medium of P2P systems. This growth will drive the burgeoning economy of digital goods and services. Reputation systems are necessary if P2P systems are to revolutionize content and information distribution as much as, if not more than, the World Wide Web did, lowering the cost of distribution once again.

1.1 Research Contributions and Thesis Outline

This thesis presents a top-down exploration of designing reputation systems for autonomous, decentralized computer systems. After an introductory decomposition and

survey of the research field, we present high-level models of the relationship between reputation and user behavior in typical trading systems. We then focus on P2P networks, using detailed simulations to investigate the characteristics of basic system design decisions. Finally, we present two novel applications of trust and reputation for routing security in different autonomous networks. The following thesis outline describes the content of each chapter and touches on the major findings or research contributions discussed therein.

Chapter 2 lays out an overview of the area of reputation system research geared towards peer-to-peer networks. We decompose peer-to-peer reputation systems into separate components. Each component must provide certain properties or capabilities in order for the whole system to function. Designing mechanisms that achieve these properties in an autonomous, transient network yields the most interesting research problems. In addition to defining terms used throughout the thesis, this chapter discusses in detail related work in this vast field of research. Later chapters briefly describe related research that is more closely tied to the results presented in each chapter.

The next two chapters study reputation in general systems where resources or commodity goods are exchanged. Although the examples used for illustration focus on online trade, the resulting conclusions are applicable to many economic systems. Chapters 3 and 4 present theoretical models for how reputation affects user behavior and utility, each applying a different approach at a different granularity. These models provide a framework for evaluating reputation algorithms using economic metrics, which we then use to analyze high-level implementation issues. Based on these studies, we propose guidelines for reputation system designers. Chapter 3 applies elementary game theory to explore agent strategies on a microeconomic scale.
Chapter 4 expands these ideas to a macroeconomic mathematical model for expected user performance in a large-scale online trading system. Our mathematical model is then compared to simulation results.
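To give a flavor of the game-theoretic view of trading, the sketch below computes a player's expected payoff for cooperating versus defecting against a partner whose probability of cooperating is estimated from reputation. The payoff values are a standard Prisoner's Dilemma-style example chosen for illustration; they are not taken from the models in Chapters 3 and 4.

```python
# Illustrative one-shot cooperate/defect game. The payoff matrix is
# an assumed example, NOT the payoffs used in the thesis models.

# PAYOFF[(my_move, their_move)] -> my payoff
PAYOFF = {
    ("C", "C"): 3,   # mutual cooperation
    ("C", "D"): 0,   # I cooperate, they defect (I am cheated)
    ("D", "C"): 5,   # I defect on a cooperator (short-term gain)
    ("D", "D"): 1,   # mutual defection
}

def expected_payoff(my_move, p_cooperate):
    """Expected payoff of my_move when the partner cooperates with
    probability p_cooperate (e.g., estimated from reputation)."""
    return (p_cooperate * PAYOFF[(my_move, "C")]
            + (1 - p_cooperate) * PAYOFF[(my_move, "D")])

# Against a peer with a strong reputation (90% chance of cooperating):
print(round(expected_payoff("C", 0.9), 2))
print(round(expected_payoff("D", 0.9), 2))
```

In a single one-shot game like this, defection dominates regardless of the partner's reputation, which is precisely why repeated interaction matters: a reputation system lets future lost payoffs punish present defection.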

In Chapter 5, we look closely at using limited reputation sharing in unstructured peer-to-peer resource-sharing networks. We propose several performance metrics (such as message traffic, load, and efficiency) that allow us to evaluate and compare reputation systems. Through detailed simulations of multiple variations on the basic reputation system, we quantify the effects of certain system properties and design choices. Our study demonstrates that even a small amount of reputation information collecting and sharing can vastly improve a peer's ability to locate and fetch valid resources, even when faced with large-scale whitewashing and collusion by malicious peers. In addition, certain methods for calculating reputation and ranking peers may perform equally well in terms of detecting and avoiding malicious peers, yet have vastly different effects on load balancing.

The following two chapters each present specific protocols and mechanisms that exploit reputation information to improve message routing performance in two types of networks that differ both in their physical medium and in their structure. Chapter 6 proposes the SPROUT protocol for incorporating existing social network information and services into a structured P2P network in order to improve the reliability of message transmission. Using our model of social trust, we show that SPROUT can improve expected message delivery by 50%. Chapter 7 concentrates on the issue of trust in ad hoc wireless routing. The Watchdog mechanism uses the inherent broadcast nature of wireless transmission to detect when packets are not being forwarded correctly by eavesdropping on next-hop transmissions. The reputation of nodes along a path is incremented or decremented based on the message throughput. These reputations are used when selecting new paths as nodes move around.
Simulations show Watchdog improves routing throughput by up to 27% under high mobility when 40% of the nodes fail to route correctly. Finally, we give our concluding comments in Chapter 8.
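The Watchdog bookkeeping described above can be sketched in a few lines. This is a simplification under our own assumptions (class names, the timeout interface, and the misbehavior threshold are ours, not the protocol's): a node remembers each packet it hands to a next hop, listens for that hop to retransmit it, and tallies a failure when no retransmission is overheard in time.

```python
# Simplified Watchdog bookkeeping (illustrative sketch; names and
# the 0.5 threshold are assumptions, not the actual protocol).

class Watchdog:
    def __init__(self, misbehave_threshold=0.5):
        self.pending = {}    # packet_id -> next hop awaiting confirmation
        self.stats = {}      # node -> [overheard_ok, failed]
        self.threshold = misbehave_threshold

    def sent(self, packet_id, next_hop):
        """Record a packet handed to next_hop for forwarding."""
        self.pending[packet_id] = next_hop

    def overheard(self, packet_id):
        """The next hop's retransmission was overheard: count a success."""
        hop = self.pending.pop(packet_id, None)
        if hop is not None:
            self.stats.setdefault(hop, [0, 0])[0] += 1

    def timeout(self, packet_id):
        """No retransmission overheard in time: count a failure."""
        hop = self.pending.pop(packet_id, None)
        if hop is not None:
            self.stats.setdefault(hop, [0, 0])[1] += 1

    def is_suspect(self, node):
        """Flag nodes whose observed failure rate crosses the threshold."""
        ok, failed = self.stats.get(node, [0, 0])
        total = ok + failed
        return total > 0 and failed / total >= self.threshold

wd = Watchdog()
for pid in range(10):
    wd.sent(pid, "nodeB")
    (wd.overheard if pid < 3 else wd.timeout)(pid)
print(wd.is_suspect("nodeB"))  # nodeB dropped 7 of 10 packets
```

A route-selection layer would then prefer paths avoiding suspect nodes, which is how the per-node tallies feed back into routing decisions.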

Chapter 2

Taxonomy of Trust: Categorizing P2P Reputation Systems

The development of any complex computer architecture can be a challenge. This is especially true of a complex distributed algorithm that is run by autonomous, untrusted agents, yet is expected to be relatively reliable, efficient, and secure. Such is the task of designing a complete reputation system for use in peer-to-peer networks. To accomplish the task, it is necessary to break it down into separate, simpler problems of constructing mechanisms that each provide a specific set of functions or properties, allowing developers to divide and conquer the problem of reputation system design.

Our primary goal in this chapter is to provide a useful taxonomy of the field of peer-to-peer reputation design. To accomplish this goal, we identify the three basic components of a reputation system, break them down into the necessary separate mechanisms, and categorize the properties we feel the mechanisms need to provide in order for the reputation system to fulfill its function. For each mechanism we list possible design choices proposed by the research community. In the process, we give examples of research in the area of trust and reputation. A variety of research papers and implementations are referenced to illustrate ideas and provide the reader avenues for further investigation. We often draw on work done by the Peers research group [1] at Stanford University and do not claim to present a complete survey of the research area. We feel this overview will be of particular interest to those who are unfamiliar with the breadth of issues relating to reputation system design for peer-to-peer networks. Taxonomies related to trust and reputation systems (either in part or as a whole) have been proposed by others (e.g., Daswani [33] and O'Hara et al. [101]) and will be discussed in the text when appropriate.

Table 2.1: Breakdown of Reputation System Components

                         Reputation Systems
  Information Gathering | Scoring and Ranking   | Response
  ----------------------+-----------------------+-----------
  Identity Scheme       | Good vs. Bad Behavior | Incentives
  Info. Sources         | Quantity vs. Quality  | Punishment
  Info. Aggregation     | Time-dependence       |
  Stranger Policy       | Selection Threshold   |
                        | Peer Selection        |

2.0.1 Taxonomy Overview

The following section defines the terms we use throughout the thesis. We begin our taxonomy by classifying the common assumptions and constraints that guide reputation system design in Section 2.2. These assumptions include expected user behavior, as well as the goals of adversaries in the system and their capabilities. How effectively a reputation system can deal with adversaries may be constrained by the technical limitations imposed on the implementation by the target system environment. These issues determine the necessary properties and powers of the reputation system.

Next, we break down the functionality of a reputation system into the three components shown in Table 2.1. In general, a reputation system assists agents in choosing

a reliable peer (if possible) to transact with when one or more peers have offered the agent a service or resource. To provide this function, a reputation system collects information on the transactional behavior of each peer (information gathering), scores and ranks the peers based on expected reliability (scoring and ranking), and allows the system to take action against malicious peers while rewarding contributors (response). Each component requires separate system mechanisms (listed in Table 2.1). For each mechanism we study the possible desired properties and then discuss the implementation limitations and trade-offs that may prevent some of the properties from being met. In the discussion we will reference existing solutions and research to illustrate how different mechanism designs achieve certain properties within the given system constraints. The three functionalities (gathering, scoring, and response) are covered in turn in Sections 2.3, 2.4, and 2.5.

2.1 Terms and Definitions

Before discussing the various taxonomies, we would like to define certain terms we will be using throughout the thesis:

Peer: A single active entity in any system or network of autonomous entities. In general, a peer in a system is associated with a specific user and his/her representation in a network. However, in some systems it is possible for a single human user to control multiple network entities with different identities (as used in Sybil attacks [38]). Also, a user's computer may be compromised by a worm or trojan horse, and consequently the computer may behave differently in the network than the user intended. The user may even be unaware that the computer is misbehaving. Therefore, we distinguish between a user and the user's representation(s) or node(s) in the network. At times, we will use the term node, agent

or even user (when not considering compromised clients) synonymously with peer. For instance, in Chapter 3 we use the term agent, following the tradition of the field of game theory.

Transactions: Peer-to-peer systems are defined by interactions between autonomous agents, or peers. These interactions may include swapping files, storing data, answering queries, or using remote CPU cycles. In addition, money may be exchanged when purchasing a desired resource. We refer to all such interactions generally as transactions between two parties.

Cooperate/Defect: When well-behaved peers carry out transactions correctly, we say they cooperate. Bad peers, however, may at times attempt to cheat or defraud another peer, in which case they defect on the transaction. We will use these terms (when applicable) when discussing general system/peer behavior.

Structured vs. Unstructured: P2P network architectures tend to be categorized as either structured or unstructured, depending on how the overlay topology is formed. Structured networks use a specific protocol to assign network IDs and establish links to new peers, and are exemplified by the class of systems called Distributed Hash Tables (DHTs) (e.g., [127, 113, 118]). In purely unstructured topologies, new users connect randomly to other peers. A hybrid approach is to designate certain peers as supernodes (or ultrapeers) that form an unstructured network among themselves, with all other peers connecting to the supernodes. Such an organization is used in most popular file-sharing systems (e.g., [56, 74]). However, for simplicity, we will classify supernode networks as unstructured networks [139].

Strangers: Peers that appear to be new to the system. They have not interacted with other peers, and therefore no trust information is available about them.

Adversary: A general term for agents that wish to harm other peers

or the system, or that act in ways contrary to acceptable behavior. This may include accessing restricted information, corrupting data, maliciously attacking other nodes in the network, or attempting to take down the system's services.

2.2 Assumptions and Constraints

The driving force behind reputation system design is providing a service that severely mitigates misbehavior while imposing minimal cost on well-behaved users. To that end, it is important to understand the requirements imposed on system design by each of the following: the behavior and expectations of typical good users, the goals and attacks of adversaries, and the technical limitations resulting from the environment where the system is deployed. We discuss each of these here. The choices made here will impact the necessary mechanism properties discussed in Sections 2.3, 2.4, and 2.5.

2.2.1 User Behavior

A system designer must build a system that is accessible to its intended users, provides the level of functionality they require, and does not hinder or burden them to the point of driving them away. Therefore, it is important to anticipate allowable user behavior and meet users' needs, regardless of added system complexity. Examples of user behavior and requirements that affect distributed mechanism design include:

Node churn: The rate at which peers enter and leave the network, as well as how gracefully they disconnect, affects many areas, from network routing to content availability. Higher levels of churn require increased data replication, redundant routing paths, and topology repair protocols [60]. A node's lifetime in the system determines how much information can be collected for the purpose of

computing its reputation, as well as how long that information remains useful.

Reliability: For most applications, users require certain guarantees on the reliability or availability of system services. For example, a distributed data storage application would want to guarantee that data stored by a user will always be available to that user with high probability, and that it will persist in the network (even if temporarily offline) with a much higher probability [81]. The situation is more difficult in peer-to-peer networks, where adversaries actively attempt to corrupt the content peers provide. Group auditing techniques may help detect or prevent data loss [87].

Privacy: Along with reliability, users that store data in an untrusted distributed system would also want to protect the content from being accessed by unauthorized users. One solution is to encrypt all data before storing it [81]. However, in some applications access to unencrypted data is necessary for processing. Separating sensitive data from subject identities, or using legally binding strict privacy policies, may be sufficient [115, 6, 7].

Anonymity: As a specific application of privacy, users may only be willing to participate if a certain amount of anonymity is guaranteed. This requirement may vary from no anonymity at all, to hiding one's real-world identity behind a pseudonym, to requiring that an agent's actions be completely disconnected from both his real-world identity and his other actions. Obviously, a reputation system would be infeasible under the last requirement.

2.2.2 Threat Model

The two primary types of adversaries in peer-to-peer networks are selfish peers and malicious peers. They are distinguished primarily by their goals in the system. Selfish peers wish to use system services while contributing minimal or no resources

themselves. A well-known example of selfish peers is freeriders [5] in file-sharing networks such as Kazaa and Gnutella. To minimize their cost in bandwidth and CPU utilization, freeriders refuse to share files in the network. The goal of malicious peers, on the other hand, is to cause harm to either specific targeted members of the network or the system as a whole. To accomplish this goal, they are willing to spend any amount of resources (though we can consider malicious peers with constrained resources a subclass of malicious peers). Examples include distributing corrupted audio files on music-sharing networks to discourage piracy [98] or disseminating virus-infected files for notoriety [12].

Reputation system designers usually target a certain type of adversary. For instance, incentive schemes that encourage cooperation may work well against selfish peers but be ineffective against malicious peers. The number or fraction of peers that are adversaries also impacts design. Byzantine protocols, for example, assume fewer than a third of the peers are misbehaving [21]. The work presented in this thesis tackles both selfish and malicious peers, although some sections may focus on a single type of adversary.

Adversarial Powers

Next, a designer must decide what techniques he expects adversaries to employ against the system and build in mechanisms to combat those techniques. The following list briefly describes the more general techniques available to adversaries.

Traitors: Some malicious peers may behave properly for a period of time in order to build up a strongly positive reputation, then begin defecting. This technique is effective when increased reputation gives a peer additional privileges, allowing malicious peers to do extra damage to the system when they defect. An example of traitors is eBay merchants who participate in many small transactions to build up a high positive reputation, and then defraud

one or more buyers on a high-priced item. Traitors may also be the computers of well-behaved users that have been compromised by a virus or trojan horse. These machines will act to further the goals of the malicious user who subverted them.

Collusion: In many situations, multiple malicious peers acting together can cause more damage than each acting independently. This is especially true in peer-to-peer reputation systems, where covert affiliations are untraceable and the opinions of unknown peers impact one's decisions. Most research devoted to defeating collusion assumes that if a group of peers collude, they act as a single unit, each peer being fully aware of the information and intent of every other colluding peer [87].

Front peers: Also referred to as moles [45], these malicious colluding peers always cooperate with others in order to increase their reputation. They then provide misinformation to promote actively malicious peers. This form of attack is particularly difficult to prevent in an environment where there are no pre-existing trust relationships and peers have only the word and actions of others to guide their interactions [93] (see Sec. 5.6.2).

Whitewashers: Peers that purposefully leave and rejoin the system under a new identity in an attempt to shed the bad reputation they accumulated under their previous identity [83]. Whitewashers are discussed in depth in later sections and chapters (see Sec. 2.3.3 and Chp. 5).

Denial of Service (DoS): Whether conducted at the application layer or the network layer, denial-of-service attacks usually involve the adversary bringing to bear large amounts of resources to completely disrupt service usage. Using Internet worms, however, malicious users are able to minimize their own personal

resource usage while amplifying the damage done through Distributed DoS attacks. Much work has been done on detecting, managing, and preventing DoS attacks. P2P-specific approaches include [34, 35, 55] in unstructured networks and [21] in DHT networks. Not only would we like reputation systems to detect DoS attackers, but such attacks could also be used against the reputation mechanism itself.

As we discuss different mechanisms, we will reference these tactics and explain how certain system properties can help against them. Most of the existing research does not claim to handle malicious peers that mount all of these attacks at once. In fact, much of the work focuses solely on independent selfish peers. While Chapter 3 deals solely with the simplest case of selfish peers, the following chapters (particularly Chapter 5) study in depth the issues surrounding malicious peers that use all of these adversarial techniques.

2.2.3 Environmental Limitations

The primary division among system component architectures is centralized versus decentralized. Implementing certain functionality at a single trusted entity can simplify mechanism design and provide a more efficient system. As we will see, some component properties can only be attained using the management and auditing capabilities afforded by a single point of trust. Of course, centralization also has several drawbacks. It may be infeasible to find a single entity that all agents trust. A centralized server becomes a single point of failure as well as a bottleneck. Providing performance and robustness requires the controlling entity to unilaterally invest large sums of money. It also makes for a single point of attack by adversaries, whether by infiltration, subversion, or DoS attacks.

Between purely centralized and purely decentralized lies a spectrum of hybrid architectures. For simplicity, we will refer to proposed mechanisms as centralized if they

require one entity (or a small number of entities) trusted by all users to handle some service for the entire system, even if that entity need only be available intermittently rather than continuously. Otherwise, the mechanism is decentralized.

2.3 Gathering Information

The first component of a reputation system is responsible for collecting information on the behavior of peers, which will be used to determine how trustworthy they are (either on an absolute scale or relative to the other peers).

2.3.1 System Identities

Associating a history of behavior with a particular agent requires a sufficiently persistent identifier. Therefore, our first concern is the type of identities employed by the peers in the system. There are several properties an identity scheme may have, not all of which can be met with a single design. In fact, some properties are in direct conflict with each other. The properties we focus on are:

Anonymity: As previously mentioned in Section 2.2.1, the level of anonymity offered by an identity scheme can vary from using real-world identities to preventing any correlation of actions as coming from the same agent. Most peer-to-peer networks, such as Kazaa [74], use simple, user-generated pseudonyms. Since peers connect directly to one another, their IP addresses are public, providing the closest association between an agent's actions and his real-world identity. To hide their IP addresses, users can employ redirection schemes such as onion routing [128]. A P2P-specific solution using anonymizing tunnels is Tarzan [47]. Frequently changing pseudonyms and routing tunnels disassociates the user's actions from each other.
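One common way to make a pseudonym hard to forge (a technique used in many deployed systems, offered here only as an illustration of a persistent identifier, not as something the thesis prescribes) is to derive the identifier from a public key: the peer can later prove ownership of the ID by signing challenges with the matching private key. A minimal sketch, with random bytes standing in for a real public key:

```python
import hashlib
import os

def make_pseudonym(public_key_bytes):
    """Derive a self-certifying node ID as the hash of a public key.
    A peer proves ownership of the ID by signing challenges with the
    matching private key (signing itself is omitted in this sketch)."""
    return hashlib.sha256(public_key_bytes).hexdigest()[:40]

# Stand-in for a real public key; an actual system would use RSA/ECDSA.
fake_key = os.urandom(32)
node_id = make_pseudonym(fake_key)

# The same key always maps to the same ID, so the identity persists
# across sessions. But generating a fresh key yields a fresh ID, which
# is exactly what makes whitewashing cheap unless identity creation
# carries some cost.
assert make_pseudonym(fake_key) == node_id
```

Note the tension this illustrates: the scheme binds a behavior history to a key, yet says nothing about the user behind it, which is why stranger policies and entry costs (discussed later) remain necessary.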