A Game Theoretical Approach to Gateway Selections in Multi-domain Wireless Networks

Yang Song, Starsky H.Y. Wong and Kang-Won Lee
IBM Research, Hawthorne, NY
Email: {yangsong, hwong, kangwon}@us.ibm.com

Abstract

We consider a coalition network where multiple groups are interconnected via wireless links. Gateway nodes are designated by each domain to achieve network-wide interoperability. Due to the inter-domain communication cost, the optimal gateway selection for one single domain depends on the gateway selections of other domains and vice versa. In this paper, we investigate the interactions of gateway selections by multiple domains from a potential game perspective. The equilibrium inefficiency in terms of price of stability is characterized under various conditions. In addition, we examine the well-established equilibrium selective learning algorithm B-logit and show that B-logit is a special case of a general family of algorithms, denoted by Γ collectively. A novel learning algorithm named MAX-logit is proposed, which retains the favorable equilibrium selection property with a provably faster convergence speed than any other algorithm in Γ, and can be applied to many other applications of potential games. Simulation results show that MAX-logit can improve the convergence speed of B-logit by up to 33.85%.

This work will appear in The 17th Annual International Conference on Mobile Computing and Networking, Las Vegas, Nevada, 2011. This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

I. INTRODUCTION

We investigate the interoperability issue in coalition networks where multiple groups of nodes (or domains) are connected via wireless links (e.g., a mixture of IEEE 802.11, WiMAX, satellite links, Unmanned Aerial Vehicle (UAV), 3G, 4G, etc.). For example, in military operations (disaster recovery scenarios), troops of multiple countries (police departments and fire rescue teams) need to form a wireless communication backbone to facilitate mutual information exchange and dissemination. While a global link such as satellite or UAV is usually deployed to achieve network-wide connectivity for mission-critical tasks, one key obstacle that hinders the interoperability of coalition networks lies in the heterogeneity of multiple domains in terms of different communication technologies, protocols, policies, and incompatible packet formats, which prevents two nodes in different domains from communicating directly even in close geographic proximity. Therefore, in order to enable inter-domain communications among heterogeneous domains, a common paradigm is to designate a gateway node for each administrative domain, which serves as a translator, and to collectively establish a connected inter-domain communication backbone that facilitates (secure and compatible) information exchange, as proposed in [1], [2]. Designating gateway nodes also enhances the manageability and controllability of coalition networks by enforcing security and routing policies for inter-domain communications. With this hierarchical structure, inter-domain packets are first delivered to the designated gateway node in the source node's domain, then forwarded to the destination domain via the established inter-domain wireless backbone, and finally reach the destination node, as illustrated in Figure 1.

Fig. 1. A coalition network with multiple autonomous domains.

Finding the optimal set of gateway nodes in coalition networks is challenging for several reasons.
First, due to the absence of a trusted central authority in coalition networks, decentralized algorithms that rely only on local information and observations are desired. Second, although the domains collaboratively form a communication backbone, each domain is inclined to designate the gateway node for its own unilateral benefit, regardless of the potential adverse impact on the overall network performance. Therefore, it is important to examine whether a unanimous agreement on gateway selections exists. In addition, we are interested in quantifying the performance degradation of such equilibrium solutions, compared with the globally optimal gateway selection, in order to understand the impact of autonomy and lack of coordination in coalition networks. Third, each administrative domain in the coalition network may be reluctant to share its private information, e.g., intra-domain node topology, with other domains due to security and privacy concerns. It is thus favorable to design distributed mechanisms that can achieve an efficient solution while the private intra-domain structure of each domain remains unrevealed. The objective of this work is to address the aforementioned challenges.

II. SYSTEM MODEL

Let $M$ be the set of domains in the coalition network. For domain $m \in M$, denote the set of nodes in the domain as $N_m$. We assume that $|M| \ge 2$ and $|N_m| \ge 2$ to avoid triviality. Denote $g_m^i = 1$ if node $i$ is selected as the gateway node for domain $m$ and $g_m^i = 0$ otherwise. Let $\hat{i}_m = \arg\max_{i \in N_m} g_m^i$ be the selected gateway node in domain $m$. For ease of exposition, we consider the scenario where each domain selects one of its nodes as the gateway. Denote $g_m = \{g_m^1, g_m^2, \ldots, g_m^{|N_m|}\}$ as the gateway selection strategy of domain $m$. Let $s = \{g_1, g_2, \ldots, g_{|M|}\}$ be the joint gateway selection profile of the whole network, i.e., the collection of gateway selection strategies of all domains. For notational brevity, we will also use $\hat{i}_m(g_m)$ or $\hat{i}_m(s)$ to denote the gateway node designated by gateway selection strategy $g_m$ or gateway selection profile $s$, respectively. Denote $g_{-m}$ as the gateway selection strategies of all domains other than domain $m$. We will therefore also write $s = \{g_m, g_{-m}\}$ for the gateway selection profile of the network when domain $m$ is of interest. We use $S$ to represent the set of all possible gateway selection profiles of the network, where $|S| = \prod_{m=1}^{|M|} |N_m|$, i.e., all nodes can be designated as the gateway node (see footnote 1).

Denote $c(i,j) \ge 0$ as the symmetric link cost associated with a pair of nodes $i$ and $j$, e.g., Euclidean distance. For mission-critical applications where always-on network-wide connectivity is stringently required, a global link, e.g., UAV, satellite, 3G/4G, is deployed in the network to facilitate reliable inter-domain communications with a fixed yet possibly expensive cost, denoted by $C$. Note that if nodes $i, j$ are out of communication range, we have $c(i,j) = C$ due to the availability of the global link. It should be noted that if node $i$ transmits packets to $j$ via the global link, the transmission is still viewed as one hop. We assume that each domain has a decision module to select its gateway node based on its local information and observation.
More specifically, we consider a practical scenario where each domain only knows its intra-domain information in terms of topology and costs, as well as its one-hop link costs to neighboring domains, while the global network topology and link costs are unknown due to the lack of observability. Define

$$\tilde{c}(i,j) \triangleq \min\big(c(i,j),\, C\big) \tag{1}$$

where $i, j$ are the gateway nodes of two domains. Therefore, for each autonomous domain, say $m$, the gateway node $\hat{i}_m$ should be selected to minimize

$$U_m(g_m, g_{-m}) = \sum_{i \ne \hat{i}_m,\, i \in N_m} c(i, \hat{i}_m) + \sum_{n \ne m,\, n \in M} \tilde{c}(\hat{i}_m, \hat{i}_n) \tag{2}$$

where $\tilde{c}(\hat{i}_m, \hat{i}_n) = C$ is assumed by domain $m$ if no direct link from $\hat{i}_m$ to $\hat{i}_n$ is observed, except the global link (see footnote 2). The first part of (2) is defined as the intra-domain cost for domain $m$, and the second part is the locally observed inter-domain cost from domain $m$'s viewpoint.

Footnote 1: It can be verified that our analysis applies to scenarios where each domain is restricted to designate the gateway from a subset of its nodes.

Footnote 2: A multi-hop path may exist which has a smaller aggregate cost than $C$. However, domain $m$ is only aware of its one-hop neighbors due to the lack of global information.

Associated with each gateway selection profile $s \in S$, a physical communication graph, denoted by PCG($s$), can be obtained (at domain level), where the vertices are the set of gateways specified by $s$, and the weighted edges are the communication links with costs, including the global link. In Figure 2, we illustrate the PCG of a coalition network with three domains where nodes $a$, $b$, $c$ are gateway nodes and all the possible communication links among gateway nodes are labeled. We assume throughout the paper that the network topology varies at a slower time scale than the gateway selection process.

Fig. 2. The physical communication graph (PCG) of a coalition network with three domains.

Note that there might exist multiple paths between a pair of gateway nodes in PCG($s$); however, the network's backbone communication cost should only account for those links that are actually traversed by inter-domain traffic.
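The per-domain objective in (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: nodes are represented by planar coordinates, the link cost is the Euclidean distance, and the names `capped_cost`, `domain_cost`, `domains`, and `profile` are hypothetical helpers introduced here, not part of the paper.

```python
import math

def capped_cost(c, C, a, b):
    """Eq. (1): direct link cost between two gateways, capped by the global link cost C."""
    return min(c(a, b), C)

def domain_cost(m, profile, domains, c, C):
    """Eq. (2): intra-domain cost of domain m plus its locally observed inter-domain cost."""
    gw = profile[m]
    # Intra-domain part: every non-gateway node of domain m reaches the gateway.
    intra = sum(c(i, gw) for i in domains[m] if i != gw)
    # Inter-domain part: one (capped) link from m's gateway to each other gateway.
    inter = sum(capped_cost(c, C, gw, profile[n]) for n in domains if n != m)
    return intra + inter

# Hypothetical two-domain layout; node identifiers are planar coordinates.
domains = {"A": [(0.0, 0.0), (1.0, 0.0)], "B": [(4.0, 0.0), (5.0, 0.0)]}
profile = {"A": (1.0, 0.0), "B": (4.0, 0.0)}   # chosen gateway per domain
cost_a = domain_cost("A", profile, domains, math.dist, C=500.0)
```

With a large global link cost the direct gateway-to-gateway distance is used; lowering `C` below that distance makes the cap in (1) take effect.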
In this paper, we assume that the underlying backbone inter-domain routing algorithm satisfies the following: (i) the path with minimum overall cost is selected; (ii) if multiple paths with the same cost exist, the one with the minimum number of hops is selected; and (iii) if two paths have the same cost and the same number of hops, the tie is broken randomly. We can thus establish the desired cost-efficient communication backbone by finding the minimum cost path for each pair of gateway nodes $\hat{i}_m$ and $\hat{i}_n$. In other words, the desired minimum cost backbone is an undirected graph (also a spanning subgraph of PCG), denoted by MCG($s$), where the set of edges is the union of minimum cost paths for all pairs of gateway nodes. Note that MCG($s$) may not be a tree in general. For example, Figure 3 illustrates two different MCG realizations for the same PCG structure as in Figure 2, with different values of $c(a,c)$.

Fig. 3. Two different MCGs of Figure 2 when $c(a,b) = c(b,c) = 1$: (a) $c(a,c) = 1.5$; (b) $c(a,c) = 3$.

From the network's perspective, it is desirable to obtain the optimum gateway selection profile $s^*$ which minimizes the

network cost function given by

$$R(s) = \sum_{m} \sum_{i \ne \hat{i}_m,\, i \in N_m} c(i, \hat{i}_m) + \sum_{(\hat{i}_m, \hat{i}_n) \in \mathrm{MCG}(s)} \tilde{c}(\hat{i}_m, \hat{i}_n). \tag{3}$$

The first part of (3) is the aggregate intra-domain cost of all domains, and the second part of (3) is the cost of backbone communication links in the associated graph MCG($s$).

III. GATEWAY SELECTION GAME OF MULTIPLE AUTONOMOUS DOMAINS

A. Game Formulation

Note that the cost of a single domain $m$, given by (2), depends not only on its own gateway selection but also on the decisions of other domains. Since the domains are coupled by the inter-domain communication cost, we formulate the interactions of gateway selections by multiple domains as a gateway selection game, where each autonomous domain is a player with the objective function (2), and the strategy space for domain $m$ is $N_m$. Note that when a domain $m$ selects its gateway node unilaterally, it may increase the cost of other domains, as indicated by (2), which will trigger a new gateway update. This procedure iterates until a common agreement is reached. An important question is whether this iterative gateway selection process will eventually reach a steady state. In other words, we are interested in whether the gateway selection game has a Nash equilibrium (see footnote 3), since an absence of equilibrium states indicates that the network will oscillate and a common agreement can never be expected. In addition, we are interested in the performance of such equilibrium points, if they exist, in terms of the overall network performance, in order to characterize the performance deterioration due to the autonomy of multiple domains. We address these two questions in the rest of this section.

B. Game Analysis

Let us first introduce the concept of the constructed complete graph, which is a crucial component in our equilibrium analysis. Given a set of gateway nodes $s$, PCG($s$) depicts the domain-level structure consisting of all available communication links among gateway nodes. For every pair of gateways that possesses multiple links in PCG($s$), we eliminate redundant links by keeping only the link with the smallest cost.
See Figure 4 for an example. The constructed graph is hence a complete graph with link cost $\tilde{c}(a,b)$ defined in (1). We denote such a graph as the constructed complete graph, a.k.a. CCG($s$), for the given gateway selection profile $s$.

Fig. 4. Obtaining the constructed complete graph (CCG) from the physical communication graph (PCG).

Footnote 3: From an engineering perspective, we only focus on pure Nash equilibrium in this paper.

Next, we show that the gateway selection game falls into the category of potential games, where the existence of a Nash equilibrium can be established.

Theorem 1: The gateway selection game has a Nash equilibrium, which minimizes, either locally or globally, the following function

$$F(s) = \sum_{m} \sum_{i \ne \hat{i}_m,\, i \in N_m} c(i, \hat{i}_m) + \sum_{(\hat{i}_m, \hat{i}_n) \in \mathrm{CCG}(s)} \tilde{c}(\hat{i}_m, \hat{i}_n). \tag{4}$$

Proof: Without loss of generality, let us assume domain $m$ is updating its gateway selection unilaterally, given the gateway selections of other domains, i.e., $g_{-m}$. We calculate the difference of function $F$ for two strategies $g_m$ and $g'_m$, with $\hat{i}'_m = \hat{i}_m(g'_m)$, and obtain

$$\begin{aligned}
\Delta F &= F(g_m, g_{-m}) - F(g'_m, g_{-m}) \\
&= \sum_{i \ne \hat{i}_m,\, i \in N_m} c(i, \hat{i}_m) + \sum_{n \ne m} \sum_{i \ne \hat{i}_n,\, i \in N_n} c(i, \hat{i}_n) - \sum_{i \ne \hat{i}'_m,\, i \in N_m} c(i, \hat{i}'_m) - \sum_{n \ne m} \sum_{i \ne \hat{i}_n,\, i \in N_n} c(i, \hat{i}_n) \\
&\quad + \sum_{n \ne m,\, n \in M} \big( \tilde{c}(\hat{i}_m, \hat{i}_n) - \tilde{c}(\hat{i}'_m, \hat{i}_n) \big) \\
&= U_m(g_m, g_{-m}) - U_m(g'_m, g_{-m}).
\end{aligned} \tag{5}$$

Note that we utilize the property that when a single domain $m$ switches its gateway strategy from $g_m$ to $g'_m$, the intra-domain costs of other domains, as well as the links in CCG($s$) that are not incident to domain $m$, are unchanged. We stress that (5) is valid for any $m$, $g_m$, and $g'_m$. Therefore, the gateway selection game is an exact potential game with a potential function given by (4). It is worth noting that every Nash equilibrium in the gateway selection game, where no domain can improve its own performance by deviating unilaterally, corresponds to a local or global minimizer of the potential function $F(s)$. The existence of a Nash equilibrium follows from the results of [3].
However, in multiple domain gateway selection games, the stable Nash equilibrium solution may not be unique. In other words, depending on the initial configuration, the gateway selection game may reach different equilibria which yield significantly different performance in terms of overall network cost. To capture the efficiency loss in games with rational players, the concepts of price of anarchy [4] and price of stability [5] have been introduced in the literature. They are defined as the performance ratio of the worst Nash equilibrium to the global optimal solution, and of the best Nash equilibrium to the global optimal solution, respectively. Since our goal is to design policies and mechanisms to improve the equilibrium performance for multiple autonomous domains in the coalition network, we focus on the price of stability of gateway selection games in this paper.
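The exact potential property in (5) and the existence of a pure Nash equilibrium can be checked by brute force on a small instance. The following is a minimal sketch, assuming a hypothetical three-domain layout with Euclidean link costs and a hypothetical global link cost `C`; the helper names (`U`, `F`, `swap`, `profiles`, `is_exact_potential`, `nash_equilibria`) are illustrative only.

```python
import itertools
import math

C = 5.0  # hypothetical global link cost
DOMAINS = {  # hypothetical node coordinates per domain
    "A": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "B": [(4.0, 0.0), (6.0, 0.0)],
    "C": [(3.0, 4.0), (3.0, 6.0)],
}

def capped(a, b):
    return min(math.dist(a, b), C)          # eq. (1)

def intra(m, gw):
    return sum(math.dist(i, gw) for i in DOMAINS[m] if i != gw)

def U(m, s):                                # eq. (2), cost of domain m
    return intra(m, s[m]) + sum(capped(s[m], s[n]) for n in DOMAINS if n != m)

def F(s):                                   # eq. (4), each CCG edge counted once
    edges = itertools.combinations(DOMAINS, 2)
    return sum(intra(m, s[m]) for m in DOMAINS) + sum(capped(s[m], s[n]) for m, n in edges)

def swap(s, m, gw):
    t = dict(s)
    t[m] = gw
    return t

def profiles():
    names = list(DOMAINS)
    for combo in itertools.product(*(DOMAINS[m] for m in names)):
        yield dict(zip(names, combo))

def is_exact_potential(tol=1e-9):
    # Check (5): a unilateral change alters F and U_m by the same amount.
    return all(
        abs((F(s) - F(swap(s, m, gw))) - (U(m, s) - U(m, swap(s, m, gw)))) <= tol
        for s in profiles() for m in DOMAINS for gw in DOMAINS[m]
    )

def nash_equilibria(tol=1e-12):
    # A profile is a pure Nash equilibrium if no domain gains by deviating alone.
    return [
        s for s in profiles()
        if all(U(m, s) <= U(m, swap(s, m, gw)) + tol for m in DOMAINS for gw in DOMAINS[m])
    ]
```

Calling `nash_equilibria()` lists every pure equilibrium of the toy instance, and `min(profiles(), key=F)` gives the potential minimizer, which is always among them.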

Theorem 2: For any gateway selection game with two players, the best Nash equilibrium is the global network optimum solution, i.e., the price of stability is 1.

Proof: The result can be shown by contradiction and is omitted due to space constraints.

We next show that the result in Theorem 2 also applies to multiple domain gateway selection games ($|M| \ge 3$) if the following condition is satisfied.

Condition 1 (Triangle Inequality): We say the link cost metric $c(a,b) \ge 0$ satisfies the triangle inequality if $c(a,b) \le c(a,c) + c(c,b)$.

Note that several link cost metrics are known to satisfy the triangle inequality, e.g., hop count and Euclidean distance. In addition, network embedding techniques have been proposed to convert general routing metrics into simple Euclidean distances in the new metric space, e.g., [6], [7], [8].

Theorem 3: If the link cost metric $c(a,b)$ satisfies the triangle inequality, the price of stability is always 1.

Proof: When we obtain CCG from PCG, only the links which will not be utilized in MCG are removed. Moreover, CCG can be viewed as a constructed graph with new link cost $\tilde{c}(a,b) = \min(c(a,b), C)$. It can be easily verified that if the original link cost metric $c(a,b)$ satisfies the triangle inequality, so does the new link cost metric $\tilde{c}(a,b)$. Therefore, the induced MCG is the same as CCG, i.e., the potential function of (4) is identical to the network cost function of (3), which completes the proof since the best Nash equilibrium corresponds to the global minimizer of the potential function.

Theorem 3 reveals that for multiple domain gateway selection games, the global network optimum solution is one of the Nash equilibria if the triangle inequality is satisfied. Unfortunately, this result does not hold otherwise, as shown in the following theorem.

Theorem 4: If the triangle inequality does not hold, the price of stability of an $|M|$-player gateway selection game is at most $(1+\delta)$, where

$$\delta = \frac{C\left(\tfrac{1}{2}|M|^2 + 1 - \tfrac{3}{2}|M|\right)}{\min_{m \in M} \min_{g_m} \sum_{i \ne \hat{i}_m(g_m),\, i \in N_m} c\big(i, \hat{i}_m(g_m)\big)}. \tag{6}$$

Proof: Denote $\hat{s}$ as the global minimizer of the potential function of (4).
Note that for any feasible gateway selection profile $s \in S$, we have MCG($s$) $\subseteq$ CCG($s$). Therefore, we obtain $F(\hat{s}) \ge R(\hat{s})$. Denote $s^*$ as the global optimum gateway selection profile which minimizes (3). Since $\hat{s}$ is the Nash equilibrium minimizing (4), we have

$$R(\hat{s}) \le F(\hat{s}) \le F(s^*). \tag{7}$$

We observe that

$$F(s^*) = \sum_{m} \sum_{i \ne \hat{i}_m(s^*),\, i \in N_m} c\big(i, \hat{i}_m(s^*)\big) + \sum_{(\hat{i}_m(s^*), \hat{i}_n(s^*)) \in \mathrm{MCG}(s^*)} \tilde{c}\big(\hat{i}_m(s^*), \hat{i}_n(s^*)\big) + \sum_{\substack{(\hat{i}_m(s^*), \hat{i}_n(s^*)) \notin \mathrm{MCG}(s^*) \\ (\hat{i}_m(s^*), \hat{i}_n(s^*)) \in \mathrm{CCG}(s^*)}} \tilde{c}\big(\hat{i}_m(s^*), \hat{i}_n(s^*)\big)$$

where the last term is the cost of the links that are removed from CCG($s^*$) when generating MCG($s^*$). Since CCG($s^*$) is a complete graph, the number of links is given by $|M|(|M|-1)/2$. Moreover, MCG($s^*$) is a connected graph and hence the number of its links is at least $|M|-1$. Since each removed link costs at most $C$, we obtain

$$\begin{aligned}
F(s^*) &\le R(s^*) + C\left(\frac{|M|(|M|-1)}{2} - (|M|-1)\right) \\
&\le R(s^*)\left(1 + \frac{C\left(\frac{|M|(|M|-1)}{2} - (|M|-1)\right)}{\min_m \sum_{i \ne \hat{i}_m(s^*),\, i \in N_m} c\big(i, \hat{i}_m(s^*)\big)}\right) \\
&= R(s^*)\left(1 + \frac{C\left(\tfrac{1}{2}|M|^2 + 1 - \tfrac{3}{2}|M|\right)}{\min_m \sum_{i \ne \hat{i}_m(s^*),\, i \in N_m} c\big(i, \hat{i}_m(s^*)\big)}\right).
\end{aligned}$$

In tandem with (7), we have

$$R(\hat{s}) \le R(s^*)\left(1 + \frac{C\left(\tfrac{1}{2}|M|^2 + 1 - \tfrac{3}{2}|M|\right)}{\min_m \sum_{i \ne \hat{i}_m(s^*),\, i \in N_m} c\big(i, \hat{i}_m(s^*)\big)}\right)$$

which completes the proof.

The denominator of (6) reflects the minimum intra-domain cost of all domains in the network. Intuitively, if all domains are dense, the dominant component of the network cost is the intra-domain cost and hence distributed local gateway selection can lead to a close-to-optimal solution. If the number of domains, i.e., $|M|$, increases, the performance gap becomes larger due to the impact of the autonomy of multiple domains. It is also worth noting that when $|M| = 2$, we have $\delta = 0$ and the price of stability is 1, which agrees with our previous result in Theorem 2.

IV. EQUILIBRIUM SELECTIVE LEARNING IN GATEWAY SELECTION GAMES

A. γ-logit Learning Algorithms

In the previous section, we have shown that multiple Nash equilibria may exist in the gateway selection game, which possess remarkably different network performance, and the one with the smallest network cost is desired. Recently, a simple learning algorithm named binary logit, or B-logit for short, has attracted significant attention in the potential game theory and networking communities, e.g., [9], [10], [11], [12], due to its favorable property of equilibrium selection.
The procedure of the B-logit algorithm [9] is summarized as follows.

B-logit: For every time slot $t$:

1. Randomly select one of the players, say $m$, to update its gateway selection while the other domains remain unchanged. Denote the current gateway selection of domain $m$ as $g_m(t)$.
2. Domain $m$ randomly selects a node in its domain as the gateway candidate. Denote the candidate gateway selection strategy by $\tilde{g}_m$.
3. Domain $m$ updates as

$$\Pr\big(g_m(t+1) = \tilde{g}_m\big) = \frac{\exp\big(-U_m(\tilde{g}_m, g_{-m}(t))/\tau\big)}{\exp\big(-U_m(\tilde{g}_m, g_{-m}(t))/\tau\big) + \exp\big(-U_m(g_m(t), g_{-m}(t))/\tau\big)} \tag{8}$$

and

$$\Pr\big(g_m(t+1) = g_m(t)\big) = 1 - \Pr\big(g_m(t+1) = \tilde{g}_m\big) \tag{9}$$

where $\tau$ is a small positive constant, a.k.a. the smoothing factor of the algorithm.

It has been shown that as $\tau \to 0$, the B-logit algorithm concentrates on the global minimizer of the potential function in any potential game with arbitrarily high probability [9], [10]. At each step, B-logit computes the value of (2) at most once, with local information only. This reduced complexity, in tandem with the desirable equilibrium selection property, has fostered the deployment of B-logit in networking areas such as network coding [13], channel and power allocation in wireless mesh networks [10], and MIMO interference networks [14], among many others. In this paper, we investigate an important yet unanswered aspect of B-logit, i.e., the convergence speed of B-logit to the desired best Nash equilibrium.

In the following, we first show that B-logit is essentially a special case of a general family of learning algorithms, denoted by γ-logit (or Γ collectively), parameterized by γ. Next, we propose a novel learning algorithm, MAX-logit, in Γ which also retains the favorable property of equilibrium selection. More importantly, we prove that MAX-logit possesses the fastest convergence rate compared with any other γ-logit algorithm, including B-logit. Our key observation is that all γ-logit learning algorithms achieve the best Nash equilibrium asymptotically by generating aperiodic, irreducible, reversible Markov chains with the same steady state distribution yet different stochastic kernels.
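One B-logit time slot, as described by the steps and by (8) and (9), can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the dictionary-based profile, and the `cost(m, profile)` callback standing in for $U_m$ are assumptions made here, and candidates are assumed to be drawn uniformly. The probability in (8) is rewritten as a logistic function of the cost difference so that very small values of $\tau$ do not overflow.

```python
import math
import random

def accept_prob(u_new, u_cur, tau):
    """Eq. (8), rewritten as 1 / (1 + exp((u_new - u_cur) / tau)) for numerical stability."""
    x = (u_new - u_cur) / tau
    if x >= 0:
        e = math.exp(-x)          # underflows harmlessly to 0.0 for large x
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def b_logit_step(profile, domains, cost, tau, rng):
    """One time slot: pick a domain, pick a candidate gateway, switch with probability (8)."""
    m = rng.choice(sorted(domains))          # randomly selected player
    candidate = rng.choice(domains[m])       # random gateway candidate in domain m
    trial = dict(profile)
    trial[m] = candidate
    p_switch = accept_prob(cost(m, trial), cost(m, profile), tau)
    # Eq. (9): otherwise the domain keeps its current gateway.
    return trial if rng.random() < p_switch else profile

# Hypothetical toy game: node labels are integers and both domains pay |a - b|.
toy_domains = {"A": [0, 1, 2], "B": [2, 5]}
toy_cost = lambda m, s: abs(s["A"] - s["B"])
rng = random.Random(0)
state = {"A": 0, "B": 5}
for _ in range(500):
    state = b_logit_step(state, toy_domains, toy_cost, 0.01, rng)
```

In this toy game every non-optimal profile has an improving unilateral move, so with a small smoothing factor the run settles on the profile with zero cost.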
The optimality in convergence rate of MAX-logit is proven by investigating the mixing rate of the underlying Markov chain. We first provide the general structure of a γ-logit algorithm, parameterized by γ, as follows.

γ-logit: γ-logit shares the same structure as B-logit except in (8), where the probability is calculated as

$$\Pr\big(g_m(t+1) = \tilde{g}_m\big) = \frac{\exp\big(-U_m(\tilde{g}_m, g_{-m}(t))/\tau\big)}{\gamma(s, s')} \tag{10}$$

where $s = \{g_m(t), g_{-m}(t)\}$ and $s' = \{\tilde{g}_m, g_{-m}(t)\}$ are two gateway selection profiles in $S$, and γ satisfies

1. Symmetry: $\gamma(s, s') = \gamma(s', s)$, $\forall s \in S, s' \in S$;
2. Feasibility: $\gamma(s, s') \ge \max\big(\exp(-U_m(s)/\tau),\, \exp(-U_m(s')/\tau)\big)$.

Denote the collection of all γ-logit algorithms as Γ. It is straightforward to observe that B-logit is a special case in Γ with $\gamma(s, s') = \gamma(s', s) = \exp(-U_m(s)/\tau) + \exp(-U_m(s')/\tau)$.

Lemma 1: Every γ-logit algorithm in Γ is equilibrium selective, i.e., it converges to the global minimizer of the potential function asymptotically.

Proof: The proof is straightforward by verifying that

$$\pi(s) = \frac{\exp\big(-F(s)/\tau\big)}{\sum_{s' \in S} \exp\big(-F(s')/\tau\big)}$$

satisfies the detailed balance equation, and is omitted.

Conceptually, the state space of a γ-logit algorithm is the Cartesian product of complete graphs $K_{|N_m|}$, $m = 1, \ldots, |M|$. While all algorithms in Γ share the identical state structure and steady state distribution, for two realizations of γ the underlying transition probability matrices, denoted by $P(\gamma)$, are noticeably different. In the next section, we investigate the convergence rate of B-logit, or γ-logit in general, by examining the transition probability matrix $P(\gamma)$ from a mixing time perspective.

B. Mixing Time Analysis of γ-logit Learning Algorithms

For an arbitrary γ-logit algorithm, the associated probability transition matrix $P(\gamma)$ is an $|S|$-by-$|S|$ matrix, and each element can be written as

$$P_{i,j}(\gamma) \triangleq \Pr(s_i \to s_j) = \frac{1}{|M|} \frac{1}{|N_m|} \frac{\exp\big(-U_m(s_j)/\tau\big)}{\gamma(s_i, s_j)}$$

if $s_i \in S$ and $s_j \in S$ differ at only player $m$, i.e., only the gateway selections of domain $m$ are different. Otherwise, $P_{i,j}(\gamma) = 0$, $\forall s_i \ne s_j$, and

$$P_{i,i}(\gamma) = 1 - \sum_{s_j \ne s_i} P_{i,j}(\gamma).$$

Denote the eigenvalues of $P(\gamma)$ in decreasing order as $\lambda_k(P(\gamma))$, $k = 1, \ldots, |S|$.
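The transition matrix $P(\gamma)$ and the detailed balance claim of Lemma 1 can be verified numerically on a tiny instance. The sketch below assumes a hypothetical two-domain game whose nodes sit on a line, with absolute difference as the link cost; `C`, `TAU`, `DOMAINS`, and all function names are illustrative choices rather than values from the paper.

```python
import itertools
import math

C, TAU = 5.0, 1.0                         # hypothetical global link cost and smoothing factor
DOMAINS = [[0.0, 1.0, 2.0], [4.0, 6.0]]   # hypothetical node positions, one list per domain
STATES = list(itertools.product(*DOMAINS))  # every joint gateway selection profile

def U(k, s):
    """Eq. (2) for domain k under profile s (a tuple of gateway positions)."""
    intra = sum(abs(i - s[k]) for i in DOMAINS[k] if i != s[k])
    inter = sum(min(abs(s[k] - s[n]), C) for n in range(len(s)) if n != k)
    return intra + inter

def F(s):
    """Potential function (4): all intra-domain costs plus each CCG edge once."""
    intra = sum(abs(i - s[k]) for k in range(len(s)) for i in DOMAINS[k] if i != s[k])
    pairs = itertools.combinations(range(len(s)), 2)
    return intra + sum(min(abs(s[a] - s[b]), C) for a, b in pairs)

def gamma_b_logit(k, s, s2):
    """The B-logit member of the family: sum of the two exponentials."""
    return math.exp(-U(k, s) / TAU) + math.exp(-U(k, s2) / TAU)

def transition_matrix(gamma):
    n = len(STATES)
    P = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(STATES):
        for j, s2 in enumerate(STATES):
            differ = [k for k in range(len(s)) if s[k] != s2[k]]
            if len(differ) == 1:          # profiles differ in exactly one domain
                k = differ[0]
                pick = 1.0 / (len(DOMAINS) * len(DOMAINS[k]))
                P[i][j] = pick * math.exp(-U(k, s2) / TAU) / gamma(k, s, s2)
        P[i][i] = 1.0 - sum(P[i])         # diagonal is still 0.0 when the sum is taken
    return P

def gibbs():
    weights = [math.exp(-F(s) / TAU) for s in STATES]
    total = sum(weights)
    return [w / total for w in weights]

def satisfies_detailed_balance(P, pi, tol=1e-12):
    n = len(pi)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
               for i in range(n) for j in range(n))
```

Any other symmetric, feasible choice of `gamma`, for instance twice the B-logit value, can be passed to `transition_matrix` and should pass the same check, which illustrates that the family shares one steady state distribution.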
By the Perron-Frobenius Theorem [15], we have

$$1 = \lambda_1(P(\gamma)) > \lambda_2(P(\gamma)) \ge \cdots \ge \lambda_{|S|}(P(\gamma)) > -1.$$

It is well understood in the literature that the mixing rate of a Markov chain to its steady state distribution is determined by the second largest eigenvalue modulus of the transition matrix $P(\gamma)$, denoted by $\mu(P(\gamma))$, i.e.,

$$\mu(P(\gamma)) = \max\big(\lambda_2(P(\gamma)),\, |\lambda_{|S|}(P(\gamma))|\big).$$

The smaller $\mu(P(\gamma))$ is, the faster the Markov chain converges to its steady state distribution [16], [17]. Therefore, we are

interested in finding the optimal values of γ which retain the desired property of equilibrium selection while enjoying a provably faster convergence speed compared with any other algorithm in Γ. Next, we present a new learning algorithm in Γ, denoted by MAX-logit, as follows.

MAX-logit: MAX-logit is a γ-logit algorithm in Γ where

$$\gamma(s, s') = \max\big(\exp(-U_m(s)/\tau),\, \exp(-U_m(s')/\tau)\big). \tag{11}$$

Denote $\mu_{MAX}$ as the second largest eigenvalue modulus associated with the MAX-logit algorithm. We next present a key result of our paper.

Theorem 5: Denote $\mu(P(\gamma))$ as the second largest eigenvalue modulus induced by an arbitrary γ-logit algorithm in Γ. We have $\mu_{MAX} \le \mu(P(\gamma))$.

Before we prove Theorem 5, we need to establish an important lemma which is crucial in our mixing rate analysis.

Lemma 2: For any γ-logit learning algorithm in Γ, we have $\lambda_2(P(\gamma)) \ge 0$ and $\lambda_2(P(\gamma)) \ge |\lambda_{|S|}(P(\gamma))|$.

Proof: For an arbitrary γ-logit learning algorithm in Γ, we have

$$P_{i,i}(\gamma) \ge \sum_{m=1}^{|M|} \frac{1}{|M|} \frac{1}{|N_m|} \ge \frac{1}{\max_m |N_m|} \triangleq \alpha. \tag{12}$$

Define the conductance [18] of the state space $S$ as

$$h \triangleq \min_{A \subset S,\ \pi(A) \le 1/2} \frac{\sum_{s_i \in A,\, s_j \notin A} \pi(s_i) P_{i,j}(\gamma)}{\pi(A)} \tag{13}$$

where $A \subset S$ is a subset of the state space $S$ and $\pi(A) \triangleq \sum_{s_i \in A} \pi(s_i)$. We know that when the smoothing factor $\tau$ is small, the γ-logit learning algorithm will concentrate arbitrarily close to the best Nash equilibrium, denoted by $\hat{s}$, in the steady state. Therefore, $A = \{s_i, \forall s_i \ne \hat{s}\}$ is a feasible partition, i.e., $\pi(A) \le 1/2$. We denote $\pi(s_i) = \epsilon$, $\forall s_i \ne \hat{s}$, for notational brevity. By the definition in (13), we have

$$\begin{aligned}
h &\le \frac{\sum_{s_i \in A} \pi(s_i) \Pr(s_i \to \hat{s})}{\pi(A)} \\
&\le \frac{(|N_1|-1)\frac{1}{|M|}\frac{1}{|N_1|}\epsilon + \cdots + (|N_{|M|}|-1)\frac{1}{|M|}\frac{1}{|N_{|M|}|}\epsilon}{(|S|-1)\epsilon} \\
&= \frac{1 - \frac{1}{|M|}\sum_{m=1}^{|M|} \frac{1}{|N_m|}}{\prod_{m=1}^{|M|} |N_m| - 1} \\
&\le \frac{1 - \frac{1}{\max_{m=1,\ldots,|M|} |N_m|}}{2\max_{m=1,\ldots,|M|} |N_m| - 1}
\end{aligned}$$

where we utilize the fact that $|M| \ge 2$ and $|N_m| \ge 2$, $\forall m$, since otherwise the solution is trivial. In light of (12), we attain

$$h \le \frac{\alpha(1-\alpha)}{2-\alpha} < \alpha.$$

By invoking Cheeger's Inequality [19], we have

$$\lambda_2(P(\gamma)) \ge 1 - 2h > 1 - 2\alpha \ge 0 \;\Rightarrow\; \lambda_2(P(\gamma)) > 1 - 2\alpha. \tag{14}$$

Next, we proceed to investigate the smallest eigenvalue, $\lambda_{|S|}(P(\gamma))$.
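Since $P_{i,i}(\gamma) \ge \alpha$, $W \triangleq P(\gamma) - \alpha I$ is a nonnegative matrix with a row sum of $1 - \alpha$, where $I$ denotes the identity matrix of dimension $|S|$-by-$|S|$. Define $\rho(W)$ as the spectral radius of matrix $W$ and $\lambda_k(W)$, $k = 1, \ldots, |S|$, as the eigenvalues of $W$ in decreasing order. By Theorem 8.1.22 in [15], we have

$$|\lambda_k(W)| \le \rho(W) = 1 - \alpha, \quad k = 1, \ldots, |S|. \tag{15}$$

Note that the transition matrix $P(\gamma)$ is not Hermitian in general. To facilitate our analysis, we define $\pi = \{\pi(s_i), s_i \in S\}$ as the vector containing all steady state probabilities, and $\Pi = \mathrm{diag}(\pi)$. Define

$$\tilde{P}(\gamma) = \Pi^{1/2} P(\gamma) \Pi^{-1/2}.$$

We can see that $\tilde{P}(\gamma)$ is Hermitian. In addition, since $\Pi^{1/2}$ is nonsingular, $P(\gamma)$ and $\tilde{P}(\gamma)$ are similar and hence share the same spectrum [15], i.e.,

$$\lambda_k(\tilde{P}(\gamma)) = \lambda_k(P(\gamma)), \quad k = 1, \ldots, |S|.$$

Similarly, we define

$$\tilde{W} = \Pi^{1/2} W \Pi^{-1/2} = \tilde{P}(\gamma) - \alpha I$$

which is also Hermitian and shares the same spectrum as $W$. Since $\tilde{P}(\gamma)$ and $\alpha I$ are both Hermitian and commute, we have

$$\lambda_k(W) = \lambda_k(\tilde{W}) = \lambda_k(\tilde{P}(\gamma)) - \alpha = \lambda_k(P(\gamma)) - \alpha, \quad \forall k$$

by the definition of eigenvalues. Therefore, by (15), we have

$$\lambda_{|S|}(P(\gamma)) - \alpha = \lambda_{|S|}(W) \ge -(1 - \alpha).$$

Finally, we have

$$\lambda_{|S|}(P(\gamma)) \ge -1 + 2\alpha. \tag{16}$$

Therefore, if $\lambda_{|S|}(P(\gamma)) \ge 0$, we have $\lambda_2(P(\gamma)) \ge |\lambda_{|S|}(P(\gamma))|$. If $\lambda_{|S|}(P(\gamma)) < 0$, in light of (14) and (16), we have $\lambda_2(P(\gamma)) > |\lambda_{|S|}(P(\gamma))|$, which completes the proof of Lemma 2.

Lemma 2 suggests that when comparing $\mu(P(\gamma))$, we only need to focus on the second largest eigenvalue $\lambda_2(P(\gamma))$. Next, we proceed to provide the proof of Theorem 5.

Proof of Theorem 5: Denote $P^*$ and $P(\gamma)$ as the probability transition matrices induced by MAX-logit and an arbitrary γ-logit learning algorithm in Γ, respectively. Define

$$\tilde{P}^* = \Pi^{1/2} P^* \Pi^{-1/2} = \tilde{P}(\gamma) - \tilde{\Delta}$$

and

$$\tilde{\Delta} = \Pi^{1/2} \Delta\, \Pi^{-1/2}$$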

where $\Delta = P(\gamma) - P^*$. It is worth noting that for each off-diagonal element in $\Delta$, we have

$$\Delta_{i,j} = P_{i,j}(\gamma) - P^*_{i,j} \le 0$$

and

$$\Delta_{i,i} = \sum_{j \ne i} \big(P^*_{i,j} - P_{i,j}(\gamma)\big) \ge 0.$$

Therefore, $\tilde{\Delta}$ is a diagonally dominant matrix with nonnegative real eigenvalues [15], implying that $\tilde{\Delta}$ is a positive semidefinite (PSD) matrix since $\tilde{\Delta}$ is Hermitian. By utilizing Theorem 4.3.3 in [15], we have

$$\lambda_2(P^*) = \lambda_2(\tilde{P}^*) \le \lambda_2(\tilde{P}(\gamma)) = \lambda_2(P(\gamma)),$$

which completes the proof by invoking Lemma 2.

Therefore, the proposed MAX-logit algorithm enjoys a provably faster convergence rate than any other algorithm in Γ, including B-logit. Our theoretical results are validated in the next section.

V. PERFORMANCE EVALUATION

A. Simulation Setting

To illustrate our theoretical results, we consider the following scenario for our simulation. The coalition network consists of $|M|$ domains where each domain has $N$ nodes. For each domain, we randomly deploy its nodes in a round area with radius 15m, centered at a random point within the square field of 1000 × 1000m. To demonstrate our price of stability results in Theorem 3 and Theorem 4, we consider two types of link cost in the simulation. The first is the Euclidean distance cost, which is a representative metric satisfying the triangle inequality and has been utilized extensively in geographic routing and network embedding schemes. In this case, both B-logit and MAX-logit will converge to the best Nash equilibrium, which coincides with the optimum solution. Next, we consider random link costs where the triangle inequality is violated. More specifically, we randomly select $p\%$ of the links in the network and add a random error which is uniformly distributed between 0% and 5% of their original Euclidean distance link cost. This enables us to compare different scenarios in a unified setting, i.e., by setting $p = 0$, the triangle inequality is satisfied. The global link cost is set to $C = 500$ for all network scenarios.
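The difference between the B-logit and MAX-logit update rules can be explored on a toy instance before looking at the simulation results. The sketch below is illustrative only: the three-domain layout on a line, the values of `C` and `TAU`, the starting profile, and all function names are assumptions introduced here. With the choice of γ in (11), the switching probability in (10) reduces to $\min\big(1, \exp(-(U_m(s') - U_m(s))/\tau)\big)$, which is never smaller than the B-logit probability in (8) for the same move.

```python
import itertools
import math
import random

C, TAU = 5.0, 0.5                                     # hypothetical parameters
DOMAINS = [[0.0, 1.0, 2.0], [4.0, 6.0], [9.0, 10.0, 12.0]]  # hypothetical node positions
STATES = list(itertools.product(*DOMAINS))

def U(k, s):
    """Eq. (2) for domain k under profile s."""
    intra = sum(abs(i - s[k]) for i in DOMAINS[k] if i != s[k])
    inter = sum(min(abs(s[k] - s[n]), C) for n in range(len(s)) if n != k)
    return intra + inter

def F(s):
    """Potential function (4)."""
    intra = sum(abs(i - s[k]) for k in range(len(s)) for i in DOMAINS[k] if i != s[k])
    pairs = itertools.combinations(range(len(s)), 2)
    return intra + sum(min(abs(s[a] - s[b]), C) for a, b in pairs)

def accept_b(u_new, u_cur, tau):
    """B-logit switching probability, eq. (8)."""
    x = (u_new - u_cur) / tau
    return math.exp(-x) / (1.0 + math.exp(-x)) if x >= 0 else 1.0 / (1.0 + math.exp(x))

def accept_max(u_new, u_cur, tau):
    """MAX-logit switching probability, eq. (10) with gamma from eq. (11)."""
    x = (u_new - u_cur) / tau
    return 1.0 if x <= 0 else math.exp(-x)

def hitting_time(accept, seed, max_iters=5000):
    """Iterations until the potential minimizer is first reached (capped at max_iters)."""
    rng = random.Random(seed)
    target = min(STATES, key=F)
    s = max(STATES, key=F)                # start from the worst profile
    for t in range(max_iters):
        if s == target:
            return t
        k = rng.randrange(len(DOMAINS))   # domain chosen to update
        trial = list(s)
        trial[k] = rng.choice(DOMAINS[k]) # candidate gateway
        trial = tuple(trial)
        if rng.random() < accept(U(k, trial), U(k, s), TAU):
            s = trial
    return max_iters

def average_hitting_time(accept, runs=200):
    return sum(hitting_time(accept, seed) for seed in range(runs)) / runs
```

Comparing `average_hitting_time(accept_max)` with `average_hitting_time(accept_b)` gives an empirical feel for the speed-up on this toy layout; the numbers depend on the assumed parameters and are not comparable to the results reported in the paper.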
We consider $|M| = 2, 3, 4$ in our evaluation with a varying number of nodes in each domain, where the global network optimum solution is attained via exhaustive search and serves as the performance benchmark.

B. Euclidean Distance Cost

We first consider the Euclidean distance cost scenarios. The global optimum gateway selection profile is denoted by $s^*$. MAX-logit and B-logit are executed to iteratively update gateway selections with only local observation and information, from the same initial configuration. For both algorithms, we set the smoothing factor $\tau = 0.0001$ to ensure that convergence to the best Nash equilibrium is achieved with sufficiently high probability. Figures 5, 6 and 7 depict sample runs for the 2 domains, 3 domains, and 4 domains scenarios where each domain contains 20 nodes. We observe that both MAX-logit and B-logit converge to the network optimum solution gradually in all three cases. Moreover, our proposed MAX-logit algorithm converges significantly faster than B-logit to reach the global network optimum solution. To further investigate the convergence rate improvement by MAX-logit, we compare the average convergence speed of MAX-logit and B-logit over 5000 sample runs for a given number of domains and nodes.

TABLE I
CONVERGENCE RATE IMPROVEMENT BY MAX-LOGIT WHEN p = 0

Nodes per domain    2 domains    3 domains    4 domains
5 nodes             16.06%       4.5%         33.85%
10 nodes            5.00%        9.81%        8.55%
20 nodes            11.96%       0.19%        0.36%
30 nodes            5.87%        16.46%       17.60%

Table I presents the average reduction in the number of iterations needed to reach the network optimum solution, comparing MAX-logit and B-logit, over 5000 sample runs. It can be observed that MAX-logit converges to the network optimum solution up to 33.85% faster than B-logit. It is also interesting to note that the improvement diminishes when the number of nodes increases in each domain.
The reason is that, in such scenarios, the intra-domain cost becomes more dominant in the overall network cost and both B-logit and MAX-logit only need to concentrate on the few combinations of nodes that minimize each domain's intra-domain cost. Therefore, with the reduced feasible solution set, the B-logit algorithm performs reasonably well in finding the optimum solution and the convergence speed improvement by MAX-logit becomes smaller.

C. Random Cost

Next, we consider the scenarios where the triangle inequality is violated by setting $p = 50$, i.e., 50% of the links in the network are associated with a random link cost. Figure 8, Figure 9, and Figure 10 illustrate the trajectories of MAX-logit and B-logit in sample runs for the 2 domains, 3 domains, and 4 domains scenarios, where violations of the triangle inequality are observed. In the 2 domains scenario, as suggested by Theorem 2, MAX-logit and B-logit still converge to the global optimum solution as the iterations evolve. In the 3 domains and 4 domains scenarios, however, both MAX-logit and B-logit converge to the best Nash equilibrium, which is different from $s^*$ (since the optimum solution is not a Nash equilibrium and is thus unstable). We also numerically calculate the price of stability upper bound in (6), labeled as BOUND in Figure 9 and Figure 10. In both scenarios, the proposed MAX-logit algorithm converges noticeably faster than B-logit.

We compare the average percentage reduction in the number of iterations needed to reach the best Nash equilibrium, comparing MAX-logit and B-logit, over 5000 sample runs. Since the best Nash equilibrium is not the network optimum solution in general, we use the following rule as the criterion for convergence. For each sample run, we set a sufficiently large number, i.e., 2000 in our simulations, as

8 Global network ost 3400 3300 300 3100 3000 0 0 40 60 80 100 Fig. 5. Sample run for domains with 40 nodes. Global network ost 5000 4500 4000 3500 3000 500 0 50 100 150 00 Fig. 8. Sample run for domains with 40 nodes and random link ost. Global network ost 6500 6000 5500 5000 0 50 100 150 00 Fig. 6. Sample run for 3 domains with 60 nodes. Global network ost 6000 5500 5000 4500 BOUND 4000 0 50 100 150 00 Fig. 9. Sample run for 3 domains with 60 nodes and random link ost. Global network ost 10500 10000 9500 9000 8500 8000 0 50 100 150 00 Fig. 7. Sample run for 4 domains with 80 nodes. Global network ost 1000 11000 10000 9000 8000 BOUND 7000 6000 0 50 100 150 00 Fig. 10. Sample run for 4 domains with 80 nodes and random link ost. Nodes per domain domains 3 domains 4 domains 5 nodes 1.84% 4.46% 7.38% 10 nodes 1.00% 1.44% 1.56% 0 nodes 9.54% 9.13% 5.47% 30 nodes 1.90% 1.93%.4% TABLE II CONVERGENCE RATE IMPROVEMENT BY MAX-LOGIT WHEN p = 50. the maximum number of iterations eah algorithm exeutes. We average the global network osts of the last 00 iteration steps as the onvergene point, and denote this value by κ. We onsider the algorithm onverges to the best Nash equilibrium at iteration t, if for (t, t + 100 onseutive iterations steps, the global network ost onsistently remains in a small neighborhood ofκ±. Table II presents the perentage of redution on the number of iterations required to onverge, omparing MAX-logit and B-logit, where a similar trend of improvement degradation while number of nodes inreases is observed, as in the Eulidean distane senarios. VI. CONCLUSIONS In this work, we investigate the interations of gateway seletion by multiple domains in a oalition network. Within a potential game framework, the existene and ineffiieny of Nash equilibrium in the gateway seletion game are analyzed and quantified. In order to ahieve the best Nash equilibrium, equilibrium seletive learning algorithms are studied. 
We show that the well-established B-logit algorithm is a special case of a general family of algorithms denoted by γ-logit, or Γ collectively. In addition, we propose a novel learning algorithm named MAX-logit, which retains the favorable equilibrium selection property while enjoying a provably faster convergence rate than any other algorithm in Γ. Our results are substantiated via simulations.

REFERENCES

[1] C.-K. Chau, J. Crowcroft, K.-W. Lee, and S. H. Y. Wong, "Inter-domain routing for mobile ad hoc networks," ACM MobiArch, 2008.
[2] A. Durresi, M. Durresi, and L. Barolli, "Heterogeneous multi domain network architecture for military communications," Proceedings of the Third International Conference on Complex, Intelligent and Software Intensive Systems, 2009.
[3] D. Monderer and L. Shapley, "Potential games," Journal of Games and Economic Behavior, vol. 14, pp. 124-143, 1996.
[4] E. Koutsoupias and C. H. Papadimitriou, "Worst-case equilibria," STACS, 1999.
[5] E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, and T. Roughgarden, "The price of stability for network design with fair cost allocation," IEEE FOCS, 2004.
[6] R. Kleinberg, "Geographic routing using hyperbolic space," IEEE INFOCOM, 2007.
[7] A. Cvetkovski and M. Crovella, "Hyperbolic embedding and routing for dynamic graphs," IEEE INFOCOM, 2009.
[8] F. Papadopoulos, D. Krioukov, M. Boguna, and A. Vahdat, "Greedy forwarding in dynamic scale-free networks embedded in hyperbolic metric spaces," IEEE INFOCOM, 2010.
[9] G. Arslan, J. Marden, and J. Shamma, "Autonomous vehicle-target assignment: A game theoretical formulation," ASME Journal of Dynamic Systems, Measurement and Control, pp. 584-596, 2007.
[10] Y. Song, C. Zhang, and Y. Fang, "Throughput maximization in multi-channel wireless mesh access networks," IEEE ICNP, 2007.
[11] J. Marden and J. Shamma, "Revisiting log-linear learning: Asynchrony, completeness and payoff-based implementation," in submission, http://ecee.colorado.edu/marden/publications.html.
[12] J. Marden, G. Arslan, and J. Shamma, "Cooperative control and potential games," IEEE Transactions on Systems, Man and Cybernetics. Part B: Cybernetics, 2009.
[13] J. Marden and M. Effros, "The price of selfishness in network coding," NetCod, 2009.
[14] G. Arslan, F. Demirkol, and Y. Song, "Equilibrium efficiency improvement in MIMO interference systems: A decentralized stream control approach," IEEE Transactions on Wireless Communications, 2007.
[15] R. Horn and C. Johnson, Matrix Analysis. Cambridge University Press, 1986.
[16] S. Boyd, P. Diaconis, and L. Xiao, "Fastest mixing Markov chain on a graph," SIAM Review, vol. 46, pp. 667-689, 2004.
[17] R. Montenegro and P. Tetali, Mathematical Aspects of Mixing Times in Markov Chains. NOW Publisher, 2005.
[18] F. Chung, Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, 1997.
[19] J. Cheeger, "A lower bound for the smallest eigenvalue of the Laplacian," Problems in Analysis, pp. 195-199, 1970.