Symmetric Subgame Perfect Equilibria in Resource Allocation


Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence

Ludek Cigler
Ecole Polytechnique Fédérale de Lausanne, Artificial Intelligence Laboratory, CH-1015 Lausanne, Switzerland
ludek.cigler@epfl.ch

Boi Faltings
Ecole Polytechnique Fédérale de Lausanne, Artificial Intelligence Laboratory, CH-1015 Lausanne, Switzerland
boi.faltings@epfl.ch

Abstract

We analyze symmetric protocols to rationally coordinate on an asymmetric, efficient allocation in infinitely repeated N-agent, C-resource allocation problems. Bhaskar (2000) proposed one way to achieve this in 2-agent, 1-resource allocation games: Agents start by symmetrically randomizing their actions, and as soon as they each choose different actions, they start to follow a potentially asymmetric convention that prescribes their actions from then on. We extend the concept of convention to the general case of infinitely repeated resource allocation games with N agents and C resources. We show that for any convention, there exists a symmetric subgame perfect equilibrium which implements it. We present two conventions: bourgeois, where agents stick to the first allocation; and market, where agents pay for the use of resources, and observe a global coordination signal which allows them to alternate between different allocations. We define the price of anonymity of a convention as the ratio between the maximum social payoff of any (asymmetric) strategy profile and the expected social payoff of the convention. We show that while the price of anonymity of the bourgeois convention is infinite, the market convention decreases this price by reducing the conflict between the agents.

1 Introduction

In many situations, agents have to coordinate their use of some resource. One wireless channel can only be used by one device, one parking slot may only be occupied by one vehicle, etc. The problem is that often, the agents have identical preferences: everyone prefers to access rather than yield.
Similarly, everyone prefers a parking slot closest to the building entrance. However, if multiple agents try to use one resource simultaneously, they collide and everyone loses.

Copyright 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Consider a simple example: two agents who want to access a single resource. We can describe the problem as a game. Both agents have two actions: yield (Y) and access (A). If agent α yields, it gets a payoff of 0. When agent α accesses the resource while the other agent yields, it gets a payoff of 1. But if both agents access the resource at the same time, they both incur a cost γ < 0. The normal form of such a game looks as follows:

          Y         A
    Y   0, 0      0, 1
    A   1, 0      γ, γ

This is a symmetric game, but the two efficient Nash equilibria (NE) are asymmetric: either one agent yields and the other one accesses the resource, or vice versa. The only symmetric rational outcome is the mixed NE where both agents access the resource with probability $\Pr(A) := \frac{1}{|\gamma| + 1}$. However, this mixed equilibrium is not efficient, because the expected payoff both agents get is 0.

Asymmetric equilibria of symmetric games are undesirable for two reasons. First, they are not fair: in our example, only one agent can access the resource. Second, coordinating on an asymmetric equilibrium is difficult. Imagine that the agents are all identical and anonymous, i.e. they can observe neither their own identity nor the identity of any other agent. We cannot prescribe a different strategy for each of the agents.

In our previous work (Cigler and Faltings 2011), we addressed the fairness issue. We considered a special case of a resource allocation problem. We proposed to use a global coordination signal and multiagent learning to reach a symmetric, fair and efficient wireless channel allocation (Wang et al. (2011) later implemented this approach in an actual wireless network and achieved throughput 3x higher than standard ALOHA protocols).
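The inefficiency of the symmetric mixed NE in the 2-agent, 1-resource game above can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names are hypothetical and chosen only for this illustration.

```python
def mixed_access_probability(gamma: float) -> float:
    """Access probability 1 / (|gamma| + 1) of the symmetric mixed NE."""
    if gamma >= 0:
        raise ValueError("the collision payoff gamma must be negative")
    return 1.0 / (abs(gamma) + 1.0)


def one_shot_payoff_of_access(p_other: float, gamma: float) -> float:
    """Expected payoff of A when the opponent accesses with probability p_other."""
    return (1.0 - p_other) * 1.0 + p_other * gamma


def expected_social_payoff(p: float, gamma: float) -> float:
    """Sum of both agents' expected payoffs when each accesses with probability p."""
    exactly_one = 2.0 * p * (1.0 - p)  # one agent earns 1, the other earns 0
    both = p * p                       # both agents earn gamma
    return exactly_one * 1.0 + both * 2.0 * gamma


if __name__ == "__main__":
    gamma = -0.5
    p = mixed_access_probability(gamma)
    print(p, one_shot_payoff_of_access(p, gamma), expected_social_payoff(p, gamma))
```

At the mixed-NE probability, accessing and yielding both yield an expected payoff of 0 (up to floating-point error), so the expected social payoff is 0 as well, whereas either asymmetric pure NE gives a social payoff of 1.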
The advantage is that the coordination signal is not specific to the game. It does not have to tell the agents which action to take. As an example, the agents may just observe noise on some frequency. The disadvantage of our previous approach, though, was that it was not rational for the agents to adopt this algorithm: an agent who decided to always access the resource could force everyone else out and achieve the highest payoff.

In this paper, we consider the rationality issue. We propose a distributed algorithm to find an allocation of a set of resources which is not only symmetric and fair, but also rational. We draw inspiration from the works of Bhaskar (2000) and Kuzmics, Palfrey, and Rogers (2010) on symmetric equilibria for symmetric repeated games.

Assume that agents play an infinitely repeated game, and they discount future payoffs with a common discount factor 0 < δ < 1. A strategy for an agent is a mapping from any history of the play to a probability distribution over the actions. Our goal is to find a symmetric subgame perfect equilibrium. A subgame perfect equilibrium is a strategy profile (a vector of strategies for every agent) which is a NE in any history, including those that cannot occur on the equilibrium path.

The problem is that the agents start with a common history. In order to ever use different actions, they need to randomize; in order to randomize, they need to be indifferent between a set of actions. We can split the game in two stages: symmetric play, when all the agents use the same actions, and asymmetric play from then on. We call the first round of the game where agents differ an asynchrony. After reaching asynchrony, agents can proceed in different ways, depending on which action they took in the asynchrony round. We call the strategy profile that the players adopt after an asynchrony a convention. The agents who have successfully accessed a resource alone in that round are winners, and all the other agents are losers. The convention can prescribe a different strategy for the winners and for the losers.

As an example, for the 2-agent, 1-resource allocation game, Bhaskar (2000) describes the following two conventions:

Bourgeois: Agents keep using the action they played in the last round;
Egalitarian: Agents play the action of their opponent from the last round.

Some form of convention is necessary to achieve and maintain coordination even in games where agents don't have conflicting preferences (Crawford and Haller (1990), Goyal and Janssen (1996)). We will assume that a convention is rational, i.e. every agent is playing their best response actions.

The social payoff depends on how fast the agents reach asynchrony. When there is a big difference between the winner and loser payoff, the losers will fight back harder, so they will play their most preferred action with higher probability. In the egalitarian convention, the payoffs to the loser are (given high enough δ) the same as the winner's.
Therefore, the agents will be indifferent between being a winner and a loser and they will reach asynchrony faster.

How much social payoff do we lose by requiring the agents to be identical and anonymous, and requiring them to play only symmetric strategies? For a given convention, we can calculate the social payoff E the agents get if we implement it as a symmetric SPE. That way, we can define its price of anonymity. It is the ratio between the highest expected social payoff of any (potentially asymmetric) strategy profile and E.

Definition 1. Let $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_N)$ be a symmetric strategy vector for the resource allocation game of N agents and C resources. We define the price of anonymity of strategy vector σ as follows:

$$R(\sigma) := \frac{\max_{\tau} E(\tau)}{E(\sigma)}$$

where $E(\sigma)$ is the social expected payoff (the sum of individual expected payoffs) when agents adopt strategy σ, and $\max_{\tau} E(\tau)$ is the maximum social payoff of any strategy vector, symmetric or asymmetric. For a given resource allocation game, we can also define its price of anonymity as

$$R := \inf_{\sigma} R(\sigma)$$

where we minimize over all symmetric strategy profiles for the given game.

Our main contributions are the following:

- We show that in the infinitely repeated resource allocation game with discounting of N agents and C resources, for any convention, there exists a symmetric subgame perfect equilibrium that reaches this convention.
- We give a closed form expression to calculate the subgame perfect equilibrium for the bourgeois convention, and show that for a small number of resources C, this convention leads to zero expected payoff. This means that the price of anonymity of the bourgeois convention is ∞.
- We present the market convention. It is based on the idea that agents can observe a common coordination signal, and they can reach a different resource allocation for each signal. We show that compared to the bourgeois convention, it improves the expected payoff. Its price of anonymity is therefore finite.

This paper is structured as follows: In Section 2 we present our resource allocation game.
We show that for any convention there exists a symmetric subgame perfect equilibrium which implements this convention. In Section 3 we present two concrete examples of a convention, the bourgeois and market conventions, and discuss their properties. Finally, Section 4 concludes.

2 Resource Allocation Game

Definition 2. A resource allocation game $G_{N,C}$ is a game of N agents. Each agent i can access one of C resources. It chooses its action $a_i$ from $A_i = \{Y, A_1, A_2, \ldots, A_C\}$, where action $a_i = Y$ means to yield, and action $a_i = A_c$ means to access resource c. Because all resources are identical, we can define a special meta-action $a_i = A$. To take action A means to choose to access, and then to choose the resource uniformly at random. The payoff function for agent i is defined as follows:

$$u_i(a_1, \ldots, a_i, \ldots, a_N) := 0 \quad \text{if } a_i = Y \tag{1}$$

$$u_i(a_1, \ldots, a_i, \ldots, a_N) := \begin{cases} 1 & \text{if } a_i \neq Y,\ \forall j \neq i,\ a_j \neq a_i \\ \gamma < 0 & \text{otherwise} \end{cases} \tag{2}$$

This game has a set of pure strategy NEs where C agents each access a different resource and the remaining N − C agents do not access. There is also a symmetric mixed strategy NE in which each agent decides to access some resource with probability

$$\Pr(a_i \neq Y) := \min\left\{ C\left(1 - \sqrt[N-1]{\frac{|\gamma|}{1 + |\gamma|}}\right),\ 1 \right\} \tag{3}$$

and then chooses the resource to access uniformly at random. Note that for high enough values of C, all agents will choose to access some resource.

As mentioned before, the pure strategy NEs are efficient, but neither symmetric nor fair. For a small number of resources C, the mixed strategy NE leads to an expected payoff of 0. Therefore, it is not socially efficient. To improve the efficiency, we will follow the approach of Bhaskar and Kuzmics et al. (see Section 1).

We consider an infinitely repeated version of the resource allocation game $G_{N,C}$. We assume that the agents discount their future payoffs with a discount factor 0 < δ < 1. Agents have information about which resources have been occupied in the last round, but they cannot observe who occupied them.

When N identical agents use symmetric strategies to play the repeated game, in order to reach an asymmetric outcome of a single shot game they need to distinguish themselves. One way to do this is as follows: Whenever an agent is the only one to access a resource, it becomes the winner. The other agents (those who yielded or collided) are losers. We call this event asynchrony. The problem is that there can only be as many winners as there are resources. We won't distinguish between the losers.

Another way to distinguish them is as follows: We assume that the agents can observe in each round of the game a global coordination signal, which is an integer $k \in \{1, 2, \ldots, K\}$, chosen uniformly at random. This signal is the same for all the agents. They can condition their strategy on this signal. For different coordination signals, there can be different sets of winners and losers.

Definition 3. A strategy σ of an agent α is a function from the coordination signal to a probability distribution over actions,

$$\sigma : \{1, 2, \ldots, K\} \to \Delta(\{A, Y\}) \tag{4}$$

A deterministic strategy selects for each signal either A, or Y.

Definition 4. Let r be a round of the infinitely repeated game $G_{N,C}$. We say round r is an asynchrony, if there exists an agent α who:

1. accesses some resource alone in round r,
2. and has not accessed any resource alone in previous rounds.

We call all such agents winners. All the other agents (who have not accessed any resource alone so far) are losers.

Definition 5. A signal-based convention (or simply convention) ξ in a game $G_{N,C}$ is a set of (mixed) strategies that the agents adopt after an asynchrony round. The strategies for the winners and for the losers are potentially different. Suppose there are $n_w$ winners. The convention leads to an expected payoff $w_\xi(n_w)$ for the winners, and $l_\xi(n_w)$ for the losers.

Figure 1 gives an example of a game play of N = 4 agents, C = 2 resources and K = 2 signals. If in round t the agents observe a signal $k_t$, the convention adopted by the agents in this example prescribes that if an agent accesses a resource alone in round t, it becomes its winner and will access the same resource in every round $t' > t$ in which the signal $k_{t'} = k_t$.

    Round      1    2    3    4    5    6    7    8
    Signal     1    2    2    2    1    2    1    2
    Agent 1    1    0    2    0    1    0    1*   0
    Agent 2    1    0    1*   1    1    1    0    1
    Agent 3    2*   1    0    0    2    0    2    0
    Agent 4    0    1    2    2*   1    2    0    2

Figure 1: Example of a game play for N = 4 agents, C = 2 and K = 2 (an entry 0 means the agent yields, an entry c means it accesses resource c). The asynchrony rounds are 1, 3, 4 and 7. Once an agent accesses a resource alone, it will keep accessing that resource every time the same signal is observed. The first round when an agent accesses a resource alone (and becomes the winner) is marked with an asterisk. In the rest of the game, agent 1 will keep accessing resource 1 when the signal is 1. Agent 2 will access resource 1 when the signal is 2. Agent 3 will access resource 2 when the signal is 1. Finally, agent 4 will access resource 2 when the signal is 2. This way, the agents are no longer anonymous and have identified their roles with the signal and the resource they access.

Definition 6. Let ξ be a convention for game $G_{N,C}$. Let σ be a deterministic strategy of an agent α.
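Assume that for each signal $k \in \{1, \ldots, K\}$, every other agent takes action A with probability $p_k$. Let $p = (p_1, p_2, \ldots, p_K)$ be a vector of these probabilities. We define expected payoff functions $E_A$ and $E_Y$ when agent α takes actions A and Y:

$$\begin{aligned} E_A(p, \sigma, k) := {} & \sum_{c=1}^{C} \Big[ \Pr(\alpha \text{ wins} \,\&\, n_w = c \mid A)\, w_\xi(c) + \Pr(\alpha \text{ loses} \,\&\, n_w = c \mid A)\, \big(\gamma + l_\xi(c)\big) \Big] \\ & + \Pr(n_w = 0 \mid A) \Big[ \gamma + \frac{\delta}{K} \Big( E_A(p, \sigma, k) + \sum_{l=1,\, l \neq k}^{K} E_{\sigma(l)}(p, \sigma, l) \Big) \Big] \end{aligned} \tag{5}$$

$$\begin{aligned} E_Y(p, \sigma, k) := {} & \sum_{c=1}^{C} \Pr(n_w = c \mid Y)\, l_\xi(c) \\ & + \Pr(n_w = 0 \mid Y)\, \frac{\delta}{K} \Big( E_Y(p, \sigma, k) + \sum_{l=1,\, l \neq k}^{K} E_{\sigma(l)}(p, \sigma, l) \Big) \end{aligned} \tag{6}$$

Lemma 1. For any strategy σ and signal k, the functions $E_A$ and $E_Y$ are continuous in $p \in [0, 1]^K$.

Proof. The probabilities $\Pr(n_w = c \mid A)$ and $\Pr(n_w = c \mid Y)$ are continuous in p. The functions $E_A$ and $E_Y$ are sums of products of continuous functions, so they must be themselves continuous.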
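To make Definition 4 and the game play of Figure 1 concrete, the following sketch replays the eight rounds of that figure and reports the asynchrony rounds and the resulting winners. It is only an illustration: the data layout, the encoding of yielding as 0, and the function name are assumptions made here, not part of the paper.

```python
from collections import Counter

# Data of Figure 1: one signal per round, one action per agent and round.
# Assumed encoding: 0 = yield, c = access resource c.
SIGNALS = [1, 2, 2, 2, 1, 2, 1, 2]
ACTIONS = {
    1: [1, 0, 2, 0, 1, 0, 1, 0],
    2: [1, 0, 1, 1, 1, 1, 0, 1],
    3: [2, 1, 0, 0, 2, 0, 2, 0],
    4: [0, 1, 2, 2, 1, 2, 0, 2],
}


def find_asynchronies(signals, actions):
    """Return (asynchrony rounds, winners) following Definition 4.

    winners maps each agent to the (signal, resource) pair of its first
    solo access; rounds are numbered from 1 as in Figure 1.
    """
    winners = {}
    asynchrony_rounds = []
    for t, signal in enumerate(signals):
        # How many agents access each resource in this round.
        load = Counter(a[t] for a in actions.values() if a[t] != 0)
        found_new_winner = False
        for agent, a in sorted(actions.items()):
            alone = a[t] != 0 and load[a[t]] == 1
            if alone and agent not in winners:
                winners[agent] = (signal, a[t])
                found_new_winner = True
        if found_new_winner:
            asynchrony_rounds.append(t + 1)
    return asynchrony_rounds, winners


if __name__ == "__main__":
    rounds, winners = find_asynchronies(SIGNALS, ACTIONS)
    print(rounds)   # [1, 3, 4, 7]
    print(winners)
```

The replay assigns agent 1 to resource 1 under signal 1, agent 2 to resource 1 under signal 2, agent 3 to resource 2 under signal 1, and agent 4 to resource 2 under signal 2, which matches the roles described in the caption of Figure 1.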

Lemma 2. Functions $E_A$ and $E_Y$ are well-defined for any σ, k and $p \in [0, 1]^K$.

Proof. For fixed p, σ, γ and δ the functions $E_A$, $E_Y$ each define a system of K linear equations. We can write this system as $A E_\sigma = b$, where $E_\sigma$ is a vector of the corresponding payoff functions $E_{\sigma(k)}$, and $b \in \mathbb{R}^K$. The matrix A is defined as

$$A := I - \frac{\delta}{K} \big( \Pr(n_w = 0 \mid \sigma(1)), \ldots, \Pr(n_w = 0 \mid \sigma(K)) \big)^T \cdot \mathbf{1}^T \tag{7}$$

where I is a K × K unit matrix, the vector of probabilities is taken as a column vector, and $\mathbf{1}^T$ is a K-dimensional row vector of all ones. This system of equations has a unique solution if the matrix A is non-singular. This is equivalent to saying that $\det(A) \neq 0$. The matrix A is diagonally dominant, that is $|a_{ii}| > \sum_{j=1, j \neq i}^{K} |a_{ij}|$. This is because 0 < δ < 1, and all the probabilities $\Pr(n_w = 0 \mid \sigma(k)) \leq 1$. It is known that diagonally dominant matrices are non-singular (Taussky (1949)). Therefore, a unique solution $E_\sigma$ of the system exists and the functions $E_A$, $E_Y$ are well-defined.

Suppose that given the probability vector p, there is a deterministic best-response strategy $\sigma_p$ for agent α.

Theorem 1. If the functions $E_A(p, \sigma_p, k)$ and $E_Y(p, \sigma_p, k)$ are well-defined and continuous in any $p_k$, there exists a probability vector $p = (p_1, p_2, \ldots, p_K)$ such that when for signal k, every agent accesses a resource with probability $p_k$, agents play a symmetric subgame perfect equilibrium of the infinitely repeated resource allocation game.

Proof. Fix γ, δ, σ and $p_l$ for all $l \in \{1, \ldots, K\}$, $l \neq k$. Let $p_k = 0$. If $E_Y \geq E_A$, everyone is best off playing Y and it is a symmetric best-response. If not, then let $p_k = 1$. If in this case $E_A \geq E_Y$, everyone is best off playing A and again this is a symmetric best-response. Finally, if both $E_Y < E_A$ for $p_k = 0$, and $E_Y > E_A$ for $p_k = 1$, then from the fact that both functions are well-defined and continuous for $0 \leq p_k \leq 1$, they must intersect for some $0 < p_k < 1$. For such $p_k$, the agents are indifferent between actions A and Y. Therefore, it is a symmetric best-response when all agents play A with probability $p_k$.
We now know that for any coordination signal k, there exists a symmetric best-response given any set of strategies $\sigma(l)$ for the other coordination signals $l \neq k$. Therefore, there must exist a probability vector p such that for all coordination signals it is a symmetric best-response to access with these probabilities. This p defines a symmetric subgame perfect equilibrium of the infinitely repeated game.

2.1 Calculating the Equilibrium

While the symmetric subgame perfect equilibrium is guaranteed to exist, in order to actually play it, the agents need to be able to calculate it. It is not always possible to obtain a closed form of the probability of accessing a resource. Therefore, we will show how to calculate the equilibrium strategy numerically.

Let p be a probability vector, σ a strategy and k a signal. Let $p^0 := (p_1, p_2, \ldots, p_k = 0, \ldots, p_K)$, i.e. the vector p with $p_k$ set to 0. Let $p^1 := (p_1, p_2, \ldots, p_k = 1, \ldots, p_K)$. From Theorem 1 we know that either $E_Y(p^0, \sigma, k) > E_A(p^0, \sigma, k)$, or $E_A(p^1, \sigma, k) > E_Y(p^1, \sigma, k)$, or the two functions intersect for some $0 \leq p_k \leq 1$. Furthermore, we know that $E_A(p^0, \sigma, k) = w_\xi(c)$, since the probability of successfully claiming a resource is 1 when everyone else yields, and also $E_Y(p^0, \sigma, k) = 0$. Therefore, $E_A(p^0, \sigma, k) > E_Y(p^0, \sigma, k)$ iff $w_\xi(c) > 0$. W.l.o.g., we will assume that $w_\xi(c) > 0$. Algorithm 1 then shows how to calculate the probability vector.

Algorithm 1: Calculating the equilibrium probabilities

    for each subset S ⊆ {1, 2, ..., K} do
        Let Σ be a system of equations.
        For all i ∉ S, Σ contains two equations for E(p, σ, i):
            one corresponding to E_A(p, σ, i), one to E_Y(p, σ, i).
        For all j ∈ S, we set p_j := 1. Σ contains only one equation
            for E(p, σ, j), corresponding to E_A(p, σ, j).
        So Σ is a system of 2K − |S| equations with 2K − |S| variables.
        Solve numerically the system of equations Σ.
        if there exists a solution to Σ for which, for all i ∉ S, 0 ≤ p_i ≤ 1 then
            We have found a solution; break
        end if
    end for

3 Conventions

In the previous section, we have shown that we can find a symmetric way to reach any convention, provided the agents access the resources with a certain probability. We have also shown how to calculate the resource access probability in every stage of the game. In this section, we would like to show specific examples of the conventions that agents can adopt, and discuss their properties.

3.1 Bourgeois Convention

The bourgeois convention is the simplest one. Once an agent has accessed a resource successfully for the first time, it will keep accessing it forever. We say that the agent has claimed the resource. We don't need any coordination signal to implement it, so we assume that K = 1.

For N agents and C resources, we will describe the decision problem from the point of view of agent α. Let c be the number of resources that have not been claimed yet, and $n_c := N - C + c$ the number of agents who have not claimed a resource yet. We define $E(c, \tau_{-\alpha})$ as the expected payoff of the best response strategy for agent α given the strategies $\tau_{-\alpha}$ of all the opponents.

Lemma 3. For any $\tau_{-\alpha}$ and $c \geq 1$, $E(c, \tau_{-\alpha}) \geq 0$.

Proof. No matter what the strategy of the opponents is, if agent α chooses to always yield, its payoff will be 0.
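The numerical step of Section 2.1 rests on the sign-change argument in the proof of Theorem 1: if accessing is strictly better at $p_k = 0$ and yielding is strictly better at $p_k = 1$, continuity guarantees a crossing in between. The sketch below is not Algorithm 1 itself; it is a one-dimensional bisection, shown on the one-shot game where the crossing can be compared with the closed form of Equation (3). All function names are hypothetical, and $E_Y = 0$ is used because yielding pays 0 in the one-shot game.

```python
from typing import Callable


def find_indifference(gap: Callable[[float], float],
                      lo: float = 0.0, hi: float = 1.0,
                      tol: float = 1e-12) -> float:
    """Bisection on a continuous gap(p) = E_A(p) - E_Y(p).

    Mirrors the case split in the proof of Theorem 1: if yielding is
    already a best response at lo, or accessing still is at hi, that
    endpoint is returned; otherwise the sign change is bracketed.
    """
    if gap(lo) <= 0.0:
        return lo
    if gap(hi) >= 0.0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def one_shot_gap(p: float, n_agents: int, n_resources: int, gamma: float) -> float:
    """E_A - E_Y in the one-shot game when every opponent accesses with probability p."""
    alone = (1.0 - p / n_resources) ** (n_agents - 1)
    return alone * 1.0 + (1.0 - alone) * gamma


def closed_form(n_agents: int, n_resources: int, gamma: float) -> float:
    """Access probability of Equation (3)."""
    ratio = abs(gamma) / (1.0 + abs(gamma))
    return min(n_resources * (1.0 - ratio ** (1.0 / (n_agents - 1))), 1.0)


if __name__ == "__main__":
    for n, c in [(4, 1), (4, 2), (4, 5)]:
        numeric = find_indifference(lambda p: one_shot_gap(p, n, c, -0.5))
        print(n, c, numeric, closed_form(n, c, -0.5))
```

Bisection is chosen here only because it needs nothing beyond continuity (Lemma 1) and a sign change; solving the full system Σ of Algorithm 1 would require a multidimensional solver.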

Lemma 4. If the opponents' strategies $\tau_{-\alpha}$ are such that agent α is indifferent in every round between yielding and accessing, $E(c, \tau_{-\alpha}) = 0$ for all $c \geq 1$.

Proof. If agent α is indifferent between actions Y and A in every round, that means that it is indifferent between a strategy that prescribes Y in every round and any other strategy. The (expected) payoff of the strategy which prescribes always Y is 0. Therefore, the expected payoff of any other strategy must be 0 as well.

For the purpose of our problem, all the unclaimed resources are identical. Therefore the only parameter of the agent strategy is the probability with which it decides to access (the resource itself is then chosen uniformly at random). Lemma 4 shows a necessary condition for agent α to be indifferent. The following lemma shows a sufficient condition:

Lemma 5. Assume at round r there are c unclaimed resources. Then there exists a unique $0 \leq p_c^* \leq c$ such that if all opponents who haven't claimed any resource yet play A with probability

$$p_c^* = c \left( 1 - \sqrt[n_c - 1]{\frac{|\gamma|}{|\gamma| + \frac{1}{1 - \delta}}} \right),$$

agent α is indifferent between yielding and accessing.

Proof. From Lemma 4 we know that when agent α is indifferent, it must be that $E(c, \tau_{-\alpha}) = 0$ for all $c \geq 1$. The expected profit to agent α from playing A (and then following a best-response strategy with zero payoff) is

$$E_A(c, \tau_{-\alpha}) = \left(1 - \frac{p_c}{c}\right)^{n_c - 1} \frac{1}{1 - \delta} + \left[ 1 - \left(1 - \frac{p_c}{c}\right)^{n_c - 1} \right] \gamma \tag{8}$$

Here $p_c$ is the probability with which the opponents access. We want $E_A(c, \tau_{-\alpha}) = E_Y(c, \tau_{-\alpha}) = 0$. This holds if $p_c = p_c^*$ as defined in the statement of the lemma. Function $E_A$ is decreasing in $p_c$ on the interval [0, c], while function $E_Y$ is constantly 0. Therefore, the intersection is unique on the interval [0, c].

Lemma 6. Assume that all the opponents who haven't claimed any resource access a resource with probability $p_c < p_c^*$. Then it is a best response for agent α to access.

Proof. The probability that agent α successfully claims a resource after playing A is

$$\Pr(\text{claim some resource} \mid A) := \left(1 - \frac{p_c}{c}\right)^{n_c - 1} \tag{9}$$

This probability increases as $p_c$ decreases.
Therefore the expected profit of accessing is increasing, whereas the profit of yielding stays 0.

Theorem 2. Define an agent's strategy τ as follows: If there are c unclaimed resources, play A with probability $p_c := \min(1, p_c^*)$ (where $p_c^*$ is defined in Lemma 5). Then a joint strategy profile $\tau = (\tau_1, \tau_2, \ldots, \tau_N)$ where $\forall i$, $\tau_i = \tau$, is a subgame perfect equilibrium of the infinitely repeated resource allocation game.

Proof. If $p_c^* < 1$, any agent is indifferent between playing Y and playing A, and therefore will happily follow strategy τ. If $1 = p_c < p_c^*$, it is a best response for any agent to play A, just as the strategy τ prescribes.

Theorem 3. For all $c \leq N$, if $p_c = p_c^*$, $E(c, \tau_{-\alpha}) = 0$.

Proof. We will proceed by induction. For c = 0, the expected payoff is trivially $E(0, \tau_{-\alpha}) = 0$, because there are no free resources. Assume that for all $j < c$, $E(j, \tau_{-\alpha}) = 0$, and that $p_c = p_c^*$. If agent α plays Y, the expected payoff is clearly 0 (it will be 0 now and 0 in the future from the induction hypothesis). If agent α plays A, the expected payoff is

$$E_A(c, \tau_{-\alpha}) := \left(1 - \frac{p_c}{c}\right)^{n_c - 1} \frac{1}{1 - \delta} + \left[ 1 - \left(1 - \frac{p_c}{c}\right)^{n_c - 1} \right] \gamma + \delta \sum_{j=0}^{c} q_{cj} E(j) \tag{10}$$

Because of the way $p_c^*$ is defined, and from the induction hypothesis $E(j, \tau_{-\alpha}) = 0$ for $j < c$, we get

$$E_A(c, \tau_{-\alpha}) = \delta q_{cc} E(c, \tau_{-\alpha}) = \delta q_{cc} \max\{E_A(c, \tau_{-\alpha}), E_Y(c, \tau_{-\alpha})\} \tag{11}$$

Since $\delta q_{cc} < 1$, it must be that $E_A(c, \tau_{-\alpha}) = 0$.

Theorem 4. If $p_c < p_c^*$, $E(c, \tau_{-\alpha}) > 0$.

Proof. From Lemma 6 we know that when $p_c < p_c^*$, it is a best response to access, so $E(c, \tau_{-\alpha}) = E_A(c, \tau_{-\alpha})$. From Lemma 3 we know that for all j, $E(j) \geq 0$. If $p_c < p_c^*$, from the definition of $E_A(c, \tau_{-\alpha})$ (Equation 10) we see that $E(c, \tau_{-\alpha}) > 0$.

Theorem 4 shows that if we have enough resources so that $p_c^* > 1$, the expected payoff for the agents, even when they access all the time, will be positive.

Let us now look at the price of anonymity for the bourgeois convention (as defined in Definition 1). The highest social payoff any strategy profile τ can achieve in an N-agent, C-resource allocation game ($N \geq C$) is

$$\max_{\tau} E(\tau) := \frac{C}{1 - \delta}. \tag{12}$$

This is achieved when in every round, every resource is accessed by exactly one agent.
Such a strategy profile is obviously asymmetric. If each agent knew which part of the bourgeois convention to play at the beginning of the game, this convention would be socially efficient. However, when the agents are anonymous, they have to learn which part of the convention they should play through randomization. For the bourgeois convention (when C is small), this randomization wipes out all the efficiency gains. Therefore, its price of anonymity is infinite.
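The indifference condition of Lemma 5 and the resulting infinite price of anonymity can be checked with a few lines of code. This is a minimal sketch under the reading that Equation (8) is $E_A = (1 - p_c/c)^{n_c - 1} \cdot \frac{1}{1-\delta} + [1 - (1 - p_c/c)^{n_c - 1}]\gamma$; the function names are hypothetical, and treating a zero expected social payoff as an infinite ratio is a convention chosen here to match Definition 1.

```python
import math


def indifference_probability(c: int, n_c: int, gamma: float, delta: float) -> float:
    """p_c^* of Lemma 5 for c unclaimed resources and n_c >= 2 unassigned agents."""
    if n_c < 2:
        raise ValueError("at least two competing agents are needed")
    q = abs(gamma) / (abs(gamma) + 1.0 / (1.0 - delta))
    return c * (1.0 - q ** (1.0 / (n_c - 1)))


def bourgeois_payoff_of_access(p: float, c: int, n_c: int,
                               gamma: float, delta: float) -> float:
    """Equation (8): claim the resource forever if alone, pay gamma otherwise."""
    alone = (1.0 - p / c) ** (n_c - 1)
    return alone / (1.0 - delta) + (1.0 - alone) * gamma


def price_of_anonymity(max_social: float, expected_social: float) -> float:
    """Ratio of Definition 1; infinite when the symmetric payoff is zero."""
    return math.inf if expected_social <= 0.0 else max_social / expected_social


if __name__ == "__main__":
    gamma, delta = -0.5, 0.9
    p_star = indifference_probability(c=1, n_c=4, gamma=gamma, delta=delta)
    print(p_star, bourgeois_payoff_of_access(p_star, 1, 4, gamma, delta))
    print(price_of_anonymity(1.0 / (1.0 - delta), 0.0))
```

With N = 4 agents and a single resource, the indifference probability is below 1, so the agents randomize and the payoff of accessing is 0 up to floating-point error. Lowering the opponents' access probability below $p_c^*$ makes the payoff of accessing positive, as Lemma 6 states.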

ex-post fair effiient rational C&F 11 ) 1 no Bourgeois no no Egalitarian 2 Market? Table 1: Properties of onventions Figure 2: Market onvention: Prie of anonymity for γ = 0.5 and varying δ. Figure 3: Market onvention: Prie of anonymity for δ = 0.9 and varying γ. 3.2 Market Convention We saw that the bourgeois onvention leads to zero expeted payoff for a small number of resoures. We would like to improve the expeted payoff here. We assume the following: Agents an observe K 1 oordination signals. Agents have a dereasing marginal utility when they aess a resoure more often. They pay a fixed prie per eah suessful aess, to the point that eah agent prefers to aess a resoure only for one signal out of K. In pratie, this ould be implemented by a entral authority whih observes the onvergene rate of the agents, and dynamially inreases or dereases the prie to ahieve onvergene. Suh assumptions define what we all market onvention, in whih the winners only aess their laimed resoure for the signals they observed when they first laimed it. We know that we an implement this onvention for C 1 resoures using symmetri play see Setion 2). We an also use Algorithm 1 to alulate the aess probabilities. Here we will look speifially at the expeted payoff of the market onvention in the ase of N agents and 1 resoure. When eah agent only aesses the resoure for one signal, we need K = N signals to make sure everyone gets to aess one. In the N-agent, 1-resoure ase, imagine there are still n agents playing and N n) agents who have already laimed the resoure for some signal. Imagine that the n agents observe one of the n signals for whih no resoure has been laimed. Assume that all agents aess the resoure with probability p n. 
The expected payoff of accessing a resource for agent α is

$$E_A(p_n, n) := (1 - p_n)^{n-1} \left( 1 + \frac{\delta}{N} \cdot \frac{1}{1 - \delta} \right) + \left[ 1 - (1 - p_n)^{n-1} \right] \left[ \gamma + \frac{\delta n}{N - \delta (N - n)} E_A(p_n, n) \right] \qquad (13)$$

The expected payoff of yielding for agent α is

$$E_Y(p_n, n) := (n - 1)\, p_n (1 - p_n)^{n-2} E(n - 1) + \left[ 1 - (n - 1)\, p_n (1 - p_n)^{n-2} \right] \frac{\delta n}{N - \delta (N - n)} E_Y(p_n, n) \qquad (14)$$

When $p_n = 1$, all other agents access as well, so accessing a resource always leads to a collision and the payoff is negative. When $p_n = 0$, no other agent accesses, so accessing a resource always claims it and the payoff is positive. So in equilibrium, the agents should be indifferent between accessing and yielding. Therefore, we want to find $p_n$ such that $E_A(p_n, n) = E_Y(p_n, n) = E(n)$. Finding a closed-form expression for $p_n$ is difficult, but we can use Algorithm 1 to calculate this probability, as well as the expected payoff $E(n)$, numerically.

Figures 2 and 3 show the price of anonymity (as defined in Definition 1) of the market convention for varying discount factor δ and varying cost of collision γ, respectively. From Section 3.1, we saw that the price of anonymity of the bourgeois convention for C = 1 is ∞. In contrast, for the market convention this price is in both cases finite and relatively small.

3.3 Convention Properties

We compare the properties of the following conventions: C&F'11, a channel allocation algorithm presented in (Cigler and Faltings 2011); bourgeois, presented in Section 3; egalitarian, presented in Section 1; and market, presented in this work. We compare the conventions according to the following properties:

Ex-post fairness: Is the expected payoff to all agents the same even after asynchrony?
Efficiency: Does the convention maximize social welfare among all possible conventions?
Rationality: Is it an equilibrium for the agents to adopt the convention?

Table 1 summarizes the properties of the conventions.

Notes to Table 1: ¹ Fair asymptotically, as N → ∞. ² Only for 2-agent, 1-resource games.

The C&F'11 convention is only approximately ex-post fair. Its fairness improves as the number of coordination signals increases, but some agents might have a worse payoff than others. On the other hand, it is efficient, at least with no discounting (δ = 1). However, it is not rational.

The bourgeois convention is neither fair nor efficient (in fact, the expected payoff to the agents is 0 for a small number of resources). It is rational though, since the agents are indifferent between being a winner and a loser.

The egalitarian convention is fair, efficient and rational. However, it only works for games of 2 agents and 1 resource.

Finally, the market convention is fair and rational. It is clearly more efficient than the bourgeois convention. Nevertheless, finding the most efficient convention remains an open problem.

4 Conclusions

In this paper, we considered the problem of equilibrium selection in the infinitely repeated resource allocation game with discounting of N agents and C resources. We assumed that the agents are identical, and that they use symmetric strategies. We based our work on the idea of Bhaskar (2000): we let the agents play a symmetric mixed strategy, after which they adopt a certain convention. We showed that for any convention, there exists a symmetric subgame perfect equilibrium that implements it. We presented two such conventions for the repeated resource allocation game: the bourgeois and the market convention.
We defined the price of anonymity as the ratio between the expected social payoff of the best asymmetric strategy profile and the expected social payoff of a given symmetric strategy profile. We showed that while the price of anonymity for the bourgeois convention is infinite (at least for a small number of resources), the price of anonymity of the market convention is finite and relatively small.

In future work, we would like to investigate whether there exist more efficient conventions than the market convention (i.e. conventions with a smaller price of anonymity). In general, finding an optimal convention is an NP-hard problem (Balan, Richards, and Luke 2011), but for a more restricted set of infinitely repeated resource allocation games, we might be able to find the optimal convention, similar to the Thue-Morse sequence (Richman 2001) used by Kuzmics, Palfrey, and Rogers (2010) in the Nash demand game.

Acknowledgements

We are particularly thankful to David Parkes for giving the first author the unique opportunity to spend a few weeks in his lab at Harvard. David's openness and unparalleled knowledge is what really helped to originate this work. We would also like to thank Kate Larson for reading the draft of this paper and helping make the theoretical analysis much more readable.

References

Balan, G.; Richards, D.; and Luke, S. 2011. Long-term fairness with bounded worst-case losses. Autonomous Agents and Multi-Agent Systems 22(1):43–63.
Bhaskar, V. 2000. Egalitarianism and efficiency in repeated symmetric games. Games and Economic Behavior 32(2):247–262.
Cigler, L., and Faltings, B. 2011. Reaching correlated equilibria through multi-agent learning. In The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, 509–516. International Foundation for Autonomous Agents and Multiagent Systems.
Crawford, V. P., and Haller, H. 1990. Learning how to cooperate: Optimal play in repeated coordination games. Econometrica 58(3):571–595.
Goyal, S., and Janssen, M. 1996.
Can we rationally learn to coordinate? Theory and Decision 40(1):29–49.
Kuzmics, C.; Palfrey, T.; and Rogers, B. 2010. Symmetric players in repeated games: Theory and evidence.
Richman, R. 2001. Recursive binary sequences of differences. Complex Systems 13(4):381–392.
Taussky, O. 1949. A recurring theorem on determinants. The American Mathematical Monthly 56(10):672–676.
Wang, L.; Wu, K.; Hamdi, M.; and Ni, L. M. 2011. Attachment learning for multi-channel allocation in distributed OFDMA networks. Parallel and Distributed Systems, International Conference on 0:520–527.