Stochastic Games on a Multiple Access Channel




Prashant N and Vinod Sharma
Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560012, India
Email: prashant2406@gmail.com, vinod@ece.iisc.ernet.in
arXiv:1210.7859v1 [cs.SY] 29 Oct 2012

Abstract

We consider a scenario where N users try to access a common base station. Associated with each user is its channel state and a finite queue which varies with time. Each user chooses his power and the admission control variable in a dynamic manner so as to maximize his expected throughput. The throughput of each user is a function of the actions and states of all users. The scenario considers the situation where each user knows his channel and buffer state but is unaware of the states and actions taken by the other users. We consider the scenario when each user is saturated (i.e., always has a packet to transmit) as well as the case when each user is unsaturated. We formulate the problem as a Markov game and show connections with strategic form games. We then consider various throughput functions associated with the multiple user channel and provide algorithms for finding these equilibria.

Keywords: Multiple access channel, Stochastic games, Stationary policies, Strategic form games, Nash equilibria, Potential games.

I. INTRODUCTION

There has been a tremendous growth of wireless communication systems over the last few years. The success of wireless systems is primarily due to the efficient use of their resources. The users are able to obtain their quality of service efficiently in a time varying radio channel by adjusting their own transmission powers. Distributed control of resources is an interesting area of study since its alternative involves high system complexity and large infrastructure due to the presence of a central controller. Noncooperative game theory [1] is a natural tool to design and analyze wireless systems with distributed control of resources.

Scutari et al. [3], [4] analyzed competitive maximization of transmission rate and mutual information on the multiple access channel subject to power and other constraints. Heikkinen [5] analyzed distributed power control problems via potential games, while Lai et al. [2] applied a game theoretic framework to the resource allocation problem in fading multiple access channels. Altman et al. [6] studied the problem of maximizing the throughput of saturated users (a user always has a packet to transmit) who have a Markov modelled channel and are subjected to power constraints. They considered both the centralized scenario where the base station chooses the transmission power levels for all users as well as the decentralized scenario where each user chooses its own power level based on the condition of its radio channel. Altman et al. [7] later considered the problem of maximizing the throughput of users in a distributed manner subject to both power and buffer constraints. The decentralized scenario in [6] and the distributed resource allocation problem in [7] were analyzed as constrained Markov games with independent state information, i.e., no user knows the other users' states. The proof of existence of the equilibrium policies for such games was given in [8]. An algorithm which guaranteed convergence to the equilibrium policies for two users for any throughput function of the two users, and an algorithm which guaranteed convergence to the equilibrium policies for N users when their throughput functions are identical, were provided in [9].

Our work is closely related to the above mentioned work. When restricted to the objective functions in [6], [7] our problem is exactly the same; however, we present an alternative view of constrained Markov games with independent state information. With this view we connect the theory to strategic form games [10]. The existence of equilibrium policies follows directly from this viewpoint. This includes both the saturated as well as the unsaturated scenario considered in [6] and [7] respectively.
We also show that the algorithm which guaranteed convergence to the equilibrium policies for N users can be extended to cases where the throughput functions of the users may be different. Besides presenting a unified view of both the saturated as well as the unsaturated problem, we also consider the case where the base station uses successive interference cancellation rather than a regular matched filter. Here we formulate both the non-cooperative and the cooperative (team problem) setup and find the equilibria for both problems.

The paper is structured as follows. In Section II we present the system model for both the saturated (no buffer constraints) and the unsaturated (both buffer and power constraints) scenario. In Section III we set up the problem as a constrained Markov game with independent state information and define the so called equivalent strategic form game. Here we provide a proof of existence of equilibrium policies and define the idea of a pure strategy and potential function for Markov games. In Section IV we consider various throughput functions associated with the multiple access channel. In Section V we develop algorithms to compute these equilibrium policies. Section VI presents numerical results and Section VII concludes the paper.

II. SYSTEM MODEL

We consider a scenario where a set $\mathcal{N} = \{1, \dots, N\}$ of users access the base station through a channel simultaneously. Time is divided into slots. The channel for user $i$ is modelled as an ergodic Markov chain $k_i[n]$ taking values from a finite index set $\mathcal{K}_i = \{0, 1, 2, \dots, k_i^m\}$. The channel gain for user $i$ in index $k_i$ is $h_i(k_i)$, where the function $h_i : \mathcal{K}_i \to [0, 1]$. We assume $h_i(0) = 0$. The transition probability of user $i$ going from channel index $k_i$ to $k_i'$ is $P^i_{k_i k_i'}$. We assume that in each time slot each user knows his channel index perfectly but does not know the channel index of the other users.

Each user has a set of power indexes $\mathcal{L}_i = \{0, 1, 2, \dots, l_i^m\}$ where $l_i^m$ is the largest power index. The power invested by user $i$ at time $n$ is given by the function $p_i : \mathcal{L}_i \to \mathbb{R}$ with the property that $p_i(0) = 0$, i.e., there is no power invested by user $i$ at power index $l_i = 0$. Let $l_i[n]$ represent the power index followed by user $i$ at time $n$.

For the unsaturated case each user $i$ has a queue of finite length $q_i^m$. Denote $\mathcal{Q}_i = \{0, 1, 2, \dots, q_i^m\}$. Let $\gamma_i[n]$ packets arrive in the queue at time slot $n$ from the higher layers, where $\{\gamma_i[n], n \ge 0\}$ are independent and identically distributed (i.i.d.) with distribution $\tau_i$. In each time slot a user may transmit at most one packet from its queue if it is not empty. Let $d_i[n] \in \mathcal{D}_i = \{0, 1\}$ be the admission control variable for user $i$, where $d_i[n] = 1$ denotes accepting all packets from the upper layer and $d_i[n] = 0$ denotes rejecting all packets. The incoming packets are accepted until the buffer is full; the remaining packets are dropped. We assume that a user has no information about the queues of other users. If $q_i[n]$ and $w_i[n]$ denote the number of packets in the queue and the number of departures from the buffer in slot $n$, then the queue dynamics are given as

$$q_i[n+1] = \min\left(\left[q_i[n] + d_i[n]\gamma_i[n] - w_i[n]\right]^+,\ q_i^m\right). \tag{1}$$

In time slot $n$ the state $x_i[n]$ and the action $a_i[n]$ of user $i$ are defined as

$$x_i[n] = (k_i[n], q_i[n]), \qquad a_i[n] = (l_i[n], d_i[n]). \tag{2}$$

The set of states $\mathcal{X}_i$ and the set of actions $\mathcal{A}_i$ of user $i$ are $\mathcal{X}_i = \mathcal{K}_i \times \mathcal{Q}_i$ and $\mathcal{A}_i = \mathcal{L}_i \times \mathcal{D}_i$ respectively. The set of states (actions) of the users other than user $i$ is denoted as $\mathcal{X}_{-i}$ ($\mathcal{A}_{-i}$), while the set of all states (actions) of all users is denoted as $\mathcal{X}$ ($\mathcal{A}$). In the following we present the details for the unsaturated case and then comment briefly on the saturated case.

A. Instantaneous throughput and cost for user $i$

The throughput obtained by user $i$ is given by the function $t_i : \mathcal{K} \times \mathcal{L} \to \mathbb{R}_+$ satisfying $t_i(k, l) = 0$ if $k_i = 0$ or $l_i = 0$, where $\mathcal{K} = \prod_{i=1}^N \mathcal{K}_i$ and $\mathcal{L} = \prod_{i=1}^N \mathcal{L}_i$. This implies that the throughput obtained by user $i$ is 0 if the channel is very bad or there is no power invested by the user. Note that the throughput of user $i$ depends on the global channel index $k$ and the global power index $l$ of all users. We define the throughput ($t_i$) and the costs ($c_i^j$) of user $i$ at time $n$ as

$$t_i(x[n], a[n]) = t_i\left(k[n],\ \left(1_{\{q_j[n] \neq 0\}}\, l_j[n];\ j \in \mathcal{N}\right)\right), \tag{3}$$

$$c_i^1(x_i[n], a_i[n]) = p_i(l_i[n]), \qquad c_i^2(x_i[n], a_i[n]) = q_i[n], \tag{4}$$

where $1_A$ represents the indicator function, which is 1 if event $A$ is true and 0 otherwise. We observe that there is a power cost and a queueing cost for user $i$; the latter arises from the stringent delay requirements which have to be met by the user.

B. Transition probability under each action

We define the transition probability $P^i_{x_i a_i x_i'}$ of user $i$ going from state $x_i$ to state $x_i'$ under the action $a_i$ as

$$P^i_{x_i a_i x_i'} = P^i_{k_i k_i'}\, P^i_{q_i a_i q_i'}, \tag{5}$$

where $P^i_{q_i a_i q_i'}$ is the transition probability of user $i$ going from queue state $q_i$ to queue state $q_i'$ under action $a_i$.

C. Saturated system

In the saturated system each user always has a packet to transmit at each time. Thus there is only a power cost for every user. The state, action and transition probability of user $i$ become $x_i[n] = k_i[n]$, $a_i[n] = l_i[n]$ and $P^i_{x_i a_i x_i'} = P^i_{k_i k_i'}$, while the instantaneous throughput and cost for user $i$ are $t_i(x[n], a[n]) = t_i(k[n], l[n])$ and $c_i^1(x_i[n], a_i[n]) = p_i(l_i[n])$ respectively.

D. Stationary policies

Let $M(G)$ be the set of probability measures over a set $G$. A stationary policy for user $i$ is a function $u_i : \mathcal{X}_i \to M(\mathcal{A}_i)$. The value $u_i(a_i \mid x_i)$ represents the probability of user $i$ taking action $a_i$ when it is in state $x_i$. We denote the set of stationary policies for user $i$ as $U_i$ and the set of all stationary multipolicies as $U = \prod_{i=1}^N U_i$. The set of stationary multipolicies of all users other than user $i$ is denoted as $U_{-i}$.

E. Expected time-average rate, costs and constraints

Let $x_0 := x[0]$ represent the initial state of all users. Given a stationary multipolicy $u$ for all players, $P_u^{x_0}$ denotes the distribution of the stochastic process $(x[n], a[n])$. The expectation with respect to this distribution is denoted as $E_u^{x_0}$. We now define the time-average expected rate as

$$T_i(u) := \limsup_{T \to \infty} \frac{1}{T} \sum_{n=1}^{T} E_u^{x_0}\left(t_i(x[n], a[n])\right), \tag{6}$$

where the expected time-average costs are subject to the constraints

$$C_i^k(u_i) := \limsup_{T \to \infty} \frac{1}{T} \sum_{n=1}^{T} E_u^{x_0}\left(c_i^k(x_i[n], a_i[n])\right) \le \bar{C}_i^k, \tag{7}$$

where $\bar{C}_i^1 = \bar{P}_i$ and $\bar{C}_i^2 = \bar{Q}_i$. In the saturated scenario $k = 1$; otherwise $k \in \{1, 2\}$. A policy $u_i$ is called $i$-feasible if it satisfies $C_i^k(u_i) \le \bar{C}_i^k$ for all $k$, and a multipolicy $u$ is called feasible if it is $i$-feasible for all users $i \in \mathcal{N}$.
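The queue recursion in equation (1) can be illustrated with a minimal sketch. The function names and the decision rules `admit` and `transmit` are hypothetical stand-ins for a policy and are not part of the paper; the only rule taken from the text is that at most one packet departs per slot and only from a non-empty queue.

```python
def queue_step(q, d, gamma, w, q_max):
    """One step of equation (1): q[n+1] = min([q + d*gamma - w]^+, q_max)."""
    return min(max(q + d * gamma - w, 0), q_max)


def simulate_queue(q0, q_max, arrivals, admit, transmit):
    """Run equation (1) over a fixed arrival sequence.

    admit(q) -> d in {0, 1} and transmit(q) -> bool are hypothetical
    decision rules; they are assumptions made for this illustration.
    """
    q, path = q0, [q0]
    for gamma in arrivals:
        d = admit(q)
        # At most one packet departs per slot, and only if the queue is non-empty.
        w = 1 if (q > 0 and transmit(q)) else 0
        q = queue_step(q, d, gamma, w, q_max)
        path.append(q)
    return path


# Always admit, always transmit, buffer of size 3.
path = simulate_queue(q0=0, q_max=3, arrivals=[2, 2, 0, 1, 0],
                      admit=lambda q: 1, transmit=lambda q: True)
print(path)
```

With these inputs the second slot would bring the queue to 4 packets without the cap, so the `min` with `q_max` is what models dropped packets.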

III. GAME THEORETIC FORMULATION

Each user $i$ chooses a stationary policy $u_i \in U_i$ so as to maximize his expected average reward $T_i(u)$. However, $T_i(u)$ also depends on the stationary policies of the other users, leading to a noncooperative game. We denote the above formulation as a constrained Markov game [8], [11],

$$\Gamma_{cmg} = \left[\mathcal{N}, (\mathcal{X}_i), (\mathcal{A}_i), (P^i), (t_i), (c_i^k), (\bar{C}_i^k)\right],$$

where the elements of the tuple are as defined previously. Let $[u_{-i}, v_i]$ denote the multipolicy where users $k \neq i$ use stationary policy $u_k$ while user $i$ uses policy $v_i$. We now define the Constrained Nash Equilibrium (CNE).

Definition 1: A multipolicy $u \in U$ is called a CNE if for each player $i \in \mathcal{N}$ and for any $v_i \in U_i$ such that $[u_{-i}, v_i]$ is feasible,

$$T_i(u) \ge T_i(u_{-i}, v_i). \tag{8}$$

An $i$-feasible policy $u_i$ is called an optimal response of player $i$ against a multipolicy $u_{-i}$ of the other users if (8) holds for any other $i$-feasible policy $v_i$. In this paper we limit ourselves to stationary CNE as against general history dependent Nash equilibria. These are easy to implement and are usually the subject of study. It is shown in [8] that stationary Nash equilibria are also Nash equilibria in the general class of policies, although they may only be a proper subset.

A. Calculation of optimal response

Denote the transition probability of user $i$ going from state $x_i$ to state $y_i$ under the policy $u_i$ as

$$P^i_{x_i u_i y_i} = \sum_{a_i \in \mathcal{A}_i} u_i(a_i \mid x_i)\, P^i_{x_i a_i y_i}. \tag{9}$$

Define the immediate reward for user $i$, when user $i$ has state $x_i$ and takes action $a_i$ and the other users use multipolicy $u_{-i}$, as

$$R_i^{u_{-i}}(x_i, a_i) = \sum_{(x_{-i}, a_{-i})} \left[\prod_{l \neq i} u_l(a_l \mid x_l)\, \pi_l^{u_l}(x_l)\right] t_i(x, a), \tag{10}$$

where $\pi_l^{u_l}(x_l)$ is the steady state probability of user $l$ being in state $x_l$ when it uses policy $u_l$. Given the stationary policy $u_i \in U_i$ define the occupation measure as

$$z_i(x_i, a_i) = \pi_i^{u_i}(x_i)\, u_i(a_i \mid x_i). \tag{11}$$

The occupation measure $z_i(x_i, a_i)$ for user $i$ is the steady-state probability of the user being in state $x_i \in \mathcal{X}_i$ and using action $a_i \in \mathcal{A}_i$. Given the occupation measure $z_i$, the stationary policy $u_i$ is

$$u_i(a_i \mid x_i) = \frac{z_i(x_i, a_i)}{\sum_{a_i' \in \mathcal{A}_i} z_i(x_i, a_i')}, \qquad (x_i, a_i) \in \mathcal{X}_i \times \mathcal{A}_i. \tag{12}$$

Then the time-average expected rate and costs under the multipolicy $u$ are

$$T_i(u) = \sum_{(x_i, a_i)} R_i^{u_{-i}}(x_i, a_i)\, z_i(x_i, a_i), \tag{13}$$

$$C_i^k(u_i) = \sum_{(x_i, a_i)} c_i^k(x_i, a_i)\, z_i(x_i, a_i). \tag{14}$$

B. Best response of player $i$

Let all users other than user $i$ use the multipolicy $u_{-i}$. Then user $i$ has an optimal stationary best response policy which is independent of the initial state $x_0$ [8]. Let the set of optimal stationary policies of user $i$ be denoted as $BR(u_{-i})$. We can compute the elements of this set from the following linear program: find $z_i = [z_i(x_i, a_i)]$, $(x_i, a_i) \in \mathcal{X}_i \times \mathcal{A}_i$, that maximizes

$$T_i(u) = \sum_{(x_i, a_i)} R_i^{u_{-i}}(x_i, a_i)\, z_i(x_i, a_i), \tag{15}$$

subject to

$$\sum_{(x_i, a_i)} \left[1_{y_i}(x_i) - P^i_{x_i a_i y_i}\right] z_i(x_i, a_i) = 0, \qquad \forall\, y_i \in \mathcal{X}_i, \tag{16}$$

$$C_i^k(u_i) = \sum_{(x_i, a_i)} c_i^k(x_i, a_i)\, z_i(x_i, a_i) \le \bar{C}_i^k, \qquad k \in \{1, 2\}, \tag{17}$$

$$\sum_{(x_i, a_i)} z_i(x_i, a_i) = 1, \qquad z_i(x_i, a_i) \ge 0, \quad \forall\, (x_i, a_i) \in \mathcal{X}_i \times \mathcal{A}_i. \tag{18}$$

Note that the above linear program can be modified for the saturated scenario simply by choosing $k = 1$. The linear program for the saturated scenario can be presented in a much simpler form [6]. The constraints (16)-(18) are referred to in matrix form as $A_i^{us} z_i \le b_i^{us}$.

C. Equivalent strategic form game

In this section we show that the above Markov game is equivalent to a usual strategic form (nonstochastic) game. We use this equivalence to show existence of the CNE, and also to provide algorithms to find them and show their convergence. Define a strategic form game $\Gamma_E = \langle \mathcal{N}, \{V_i\}_{i \in \mathcal{N}}, \{r_i\}_{i \in \mathcal{N}} \rangle$ where $V_i := \{1, 2, \dots, v_i^m\}$. Each point $v_i \in V_i$ corresponds to an endpoint $[z_i(x_i, a_i)];\ (x_i, a_i) \in \mathcal{X}_i \times \mathcal{A}_i$ of the polyhedron formed by the constraints $A_i^{us} z_i \le b_i^{us}$ and will be denoted as $v_i := [v_i(x_i, a_i)];\ (x_i, a_i) \in \mathcal{X}_i \times \mathcal{A}_i$. The utility function $r_i : V \to \mathbb{R}$, where $V = \prod_{i=1}^N V_i$, is defined as

$$r_i(v) = r_i(v_1, v_2, \dots, v_N) := \sum_{(x_i, a_i)} R_i^{v_{-i}}(x_i, a_i)\, v_i(x_i, a_i), \tag{19}$$

where

$$R_i^{v_{-i}}(x_i, a_i) := \sum_{(x_{-i}, a_{-i})} \left[\prod_{l \neq i} v_l(x_l, a_l)\right] t_i(x, a). \tag{20}$$
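The relationship between an occupation measure and the quantities in equations (12)-(14) can be sketched in a few lines. The dictionary representation, the function names and the numbers below are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not solve the linear program (15)-(18).

```python
def policy_from_occupation(z):
    """Equation (12): u(a|x) = z(x,a) / sum over a' of z(x,a').

    z maps (state, action) pairs to steady-state probabilities.
    States with zero total mass are skipped, since the policy there
    does not affect the averages.
    """
    mass = {}
    for (x, a), p in z.items():
        mass[x] = mass.get(x, 0.0) + p
    return {(x, a): p / mass[x] for (x, a), p in z.items() if mass[x] > 0}


def time_average(z, f):
    """Equations (13)-(14): sum over (x,a) of f(x,a) * z(x,a)."""
    return sum(f[xa] * p for xa, p in z.items())


# Hypothetical two-state, two-action occupation measure (sums to 1).
z = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.375}
reward = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 2.0}

u = policy_from_occupation(z)
rate = time_average(z, reward)
print(u[(1, 1)], rate)
```

Because both the rate and the costs are linear in the occupation measure, the best response problem becomes a linear program in $z_i$, which is why the text can work with endpoints of the constraint polyhedron.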

Let $\lambda_i$ be a mixed strategy for player $i$. Denote the set of mixed strategies of player $i$ as $\Delta(V_i)$. The expected utility of player $i$ when all players use the strategy tuple $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_N)$ is given as $r_i(\lambda) := E_\lambda(r_i)$, where $E_\lambda(\cdot)$ denotes expectation with respect to the global mixed strategy $\lambda$. Define the set of optimal strategies for player $i$, when the other players use strategy $\lambda_{-i}$, as

$$BR(\lambda_{-i}) = \left\{ \lambda_i^* : \lambda_i^* \in \arg\max_{\lambda_i} r_i(\lambda_i, \lambda_{-i}) \right\}. \tag{21}$$

D. Existence of Nash Equilibrium

The following proposition establishes a connection between any global multipolicy $u$ for the constrained Markov game $\Gamma_{cmg}$ and some global mixed strategy $\lambda$ in the equivalent strategic form game $\Gamma_E$.

Proposition 1: Given any multipolicy $u_{-i}$ of the players other than $i$, there exists a $u_i^* \in BR(u_{-i})$ if and only if there exist $\lambda_{-i}$ for the players other than $i$ and a $\lambda_i^* \in BR(\lambda_{-i})$ such that $T_i(u_i^*, u_{-i}) = r_i(\lambda_i^*, \lambda_{-i})$.

Proof: Refer to [11].

The existence of a CNE for the constrained Markov game $\Gamma_{cmg}$ follows from the above proposition.

Theorem 1: There exists a CNE for the constrained Markov game $\Gamma_{cmg}$.

Proof: There exists a mixed strategy Nash equilibrium for the equivalent strategic form game $\Gamma_E$ [1]; let it be denoted by $\lambda^*$. It follows that $r_i(\lambda_i^*, \lambda_{-i}^*) \ge r_i(\lambda_i, \lambda_{-i}^*)$ for all $\lambda_i$ and all $i \in \mathcal{N}$. From Proposition 1 we can find an equivalent $u^*$ for $\lambda^*$ such that

$$T_i(u_i^*, u_{-i}^*) = r_i(\lambda_i^*, \lambda_{-i}^*) \ge r_i(\lambda_i, \lambda_{-i}^*) = T_i(u_i, u_{-i}^*), \qquad \forall\, u_i \in U_i,\ i \in \mathcal{N}.$$

This proves that $u^*$ is a CNE.

E. Potential Games

We first define the idea of a pure strategy and a pure strategy Nash equilibrium (PSNE) for the constrained Markov game $\Gamma_{cmg}$.

Definition 2: A policy $u_i$ for player $i$ is called a pure policy or pure strategy of the constrained Markov game $\Gamma_{cmg}$ if the mixed strategy $\lambda_i$ corresponding to this policy is a pure strategy. We say that a constrained Markov game $\Gamma_{cmg}$ has a PSNE if the equivalent strategic form game has a PSNE.

Definition 3: A strategic form game $\Gamma$ is called a potential game if there exists a function $r : V \to \mathbb{R}$ such that for all $i \in \mathcal{N}$,

$$r_i(v_i, v_{-i}) - r_i(\hat{v}_i, v_{-i}) = r(v_i, v_{-i}) - r(\hat{v}_i, v_{-i}), \qquad \forall\, v_i, \hat{v}_i \in V_i,\ v_{-i} \in V_{-i}.$$

$\Gamma_{cmg}$ is a potential game if the corresponding $\Gamma_E$ is a potential game.
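To make Definition 3 concrete, the sketch below checks the potential condition by brute force on a small finite game and then runs round-robin best responses until no player can strictly improve. The two-player game, the function names and the tie-breaking rule are assumptions made for this illustration. The termination argument is the standard one for finite exact potential games (every strict improvement raises the potential), not a result specific to this paper.

```python
from itertools import product


def is_exact_potential(utils, potential, sizes):
    """Check Definition 3 over every profile and every unilateral deviation.

    utils[i] and potential map a strategy profile (a tuple) to a number.
    """
    for v in product(*(range(s) for s in sizes)):
        for i, s in enumerate(sizes):
            for alt in range(s):
                w = v[:i] + (alt,) + v[i + 1:]
                lhs = utils[i][v] - utils[i][w]
                rhs = potential[v] - potential[w]
                if abs(lhs - rhs) > 1e-9:
                    return False
    return True


def best_response_dynamics(utils, sizes, start):
    """Round-robin best responses until no player strictly improves."""
    v = tuple(start)
    changed = True
    while changed:
        changed = False
        for i, s in enumerate(sizes):
            best = max(range(s), key=lambda a: utils[i][v[:i] + (a,) + v[i + 1:]])
            candidate = v[:i] + (best,) + v[i + 1:]
            # Keep the current strategy on ties so the loop cannot cycle.
            if utils[i][candidate] > utils[i][v] + 1e-12:
                v = candidate
                changed = True
    return v


# Hypothetical 2x2 game: each utility is the potential plus a term that
# depends only on the opponent's strategy, so Definition 3 holds.
sizes = (2, 2)
potential = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}
utils = [
    {v: potential[v] + 3.0 * v[1] for v in potential},  # player 0
    {v: potential[v] - 1.0 * v[0] for v in potential},  # player 1
]

print(is_exact_potential(utils, potential, sizes))
print(best_response_dynamics(utils, sizes, start=(0, 1)))
```

Starting from `(0, 0)` instead, the dynamics stay put: this game has two pure equilibria, which mirrors the remark in the numerical results that a game may have multiple equilibria.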
Consider the class of strategic form games

$$\Xi := \left\{ \Gamma(k) = \langle \mathcal{N}, \{\mathcal{L}_i\}_{i \in \mathcal{N}}, \{t_i(k)\}_{i \in \mathcal{N}} \rangle : k \in \mathcal{K} \right\}. \tag{22}$$

Lemma 1: If $\Gamma(k)$ is a potential game for each $k \in \mathcal{K}$, then the constrained Markov game $\Gamma_{cmg}$ is a potential game.

Proof: Refer to [11]. An example is also given in [11].

IV. THROUGHPUT FUNCTIONS

The base station may use a regular matched filter or a successive interference cancellation (SIC) filter. We assume that each user is aware of the filter adopted at the base station to decode the users' transmissions. The two cases result in different throughput functions for the users, which we characterize in the following subsections.

A. Regular matched filter

When the base station uses a regular matched filter, the received packet of any user is decoded by treating the signals of the other users as noise. In this case, the throughput function for user $i$ is

$$t_i^{in}(k, l) = \log_2\left(1 + \frac{h_i(k_i)\, p_i(l_i)}{N_0 + \sum_{j=1, j \neq i}^{N} h_j(k_j)\, p_j(l_j)}\right). \tag{23}$$

Note that $t_i^{in}(k, l)$ is an upper bound for the throughput of user $i$. On the other hand, the users may want to maximize the aggregate throughput in a decentralized manner. In this case the joint objective function when they use action $a \in \mathcal{A}$ at state $x \in \mathcal{X}$ is

$$t^{s}(k, l) = \sum_{i=1}^{N} t_i^{in}(k, l). \tag{24}$$

The interference cancellation Markov game is $\Gamma_{cmg}$ with $t_i = t_i^{in}$ and the sum throughput Markov game is $\Gamma_{cmg}$ with $t_i = t^{s}$. They are denoted as $\Gamma^{in}_{cmg}$ and $\Gamma^{s}_{cmg}$ respectively. These throughput functions were considered in [7], [6].

B. Successive Interference Cancellation

When the base station uses a successive interference cancellation filter, it decodes the data of the users in a predefined order at each time slot. Given an ordering scheme on the set of users $\mathcal{N}$, the received packet of user $i$ is decoded after cancelling out, from the received transmission, the decoded transmissions of the other users lying below user $i$ in the predefined order. We assume perfect cancellation of the decoded signal from the received transmission [7]. We first show how to choose the decoding order for each time slot.
We define the Endpoint SIC schemes, where the decoding order is fixed for all time slots. Using these, we then define the Randomized SIC schemes, where the decoding order for each time slot is chosen randomly from some distribution. We assume that the distribution is known to all users but they do not know the decoding order at each time slot.

1) Endpoint SIC schemes: Here the decoding order is the same for each time slot $n$. Given the set of users $\mathcal{N}$, define the $m$-th permutation set of $\mathcal{N}$ as the ordered set $\sigma_{\mathcal{N}}(m)$, where $m$ indexes one of the $N!$ possible permutations. Let $B_i(m)$ denote the set of players who are indexed above user $i$ in the set $\sigma_{\mathcal{N}}(m)$. We define the $m$-th utility function of user $i$ as

$$t_i^{m}(k, l) = \log_2\left(1 + \frac{h_i(k_i)\, p_i(l_i)}{N_0 + \sum_{j \in B_i(m)} h_j(k_j)\, p_j(l_j)}\right). \tag{25}$$

The above utility function for player $i$ indicates that all users indexed below user $i$ in the set $\sigma_{\mathcal{N}}(m)$ are decoded before user $i$ and their signals are cancelled out from the received signal, after which the signal of user $i$ is decoded. The $m$-th endpoint SIC Markov game is $\Gamma_{cmg}$ with $t_i = t_i^{m}$ for all $i \in \mathcal{N}$ and is denoted as $\Gamma^{m}_{cmg}$.

2) Randomized SIC schemes: Here the decoding order is chosen at each time slot $n$ according to a probability distribution. Though each user knows the probability distribution at each time slot $n$, he does not know the exact decoding order. If a probability mass function $\alpha = \{\alpha(m)\}$ over the set $\{1, 2, \dots, N!\}$ is chosen, then the utility function of user $i$ is

$$t_i^{\alpha}(k, l) = \sum_{m=1}^{N!} \alpha(m)\, t_i^{m}(k, l). \tag{26}$$

The $\alpha$-randomized SIC Markov game is $\Gamma_{cmg}$ with $t_i = t_i^{\alpha}$ for all $i \in \mathcal{N}$ and is denoted as $\Gamma^{\alpha}_{cmg}$. Note that a randomization $\alpha$ such that $\alpha(m) = 1$ for some $m$ corresponds to the endpoint game $\Gamma^{m}_{cmg}$. In the next subsection we find randomizations $\alpha$ for which $\Gamma^{\alpha}_{cmg}$ has a pure strategy Nash equilibrium.

3) Randomized games with PSNE: In this section we construct randomizations $\alpha$ for which the resulting randomized games have PSNEs. Take a partition $P_1, P_2, \dots, P_k$ of the set $\mathcal{N}$ where $1 \le k \le N$. Let $s(p_a) = P_1, P_2, \dots, P_k$ denote this particular partition of the set $\mathcal{N}$, where $p_a$ indexes the partition. Let $s(p_a, p_e) = (P_{e_1} P_{e_2} \cdots P_{e_k})$ denote the ordered set formed by the $p_e$-th permutation of the partition sets $P_1, P_2, \dots, P_k$. Note that $1 \le p_e \le k!$. Define the support set $S(p_a, p_e)$ as

$$S(p_a, p_e) := \left\{ m : \sigma_{\mathcal{N}}(m) = \sigma_{P_{e_1}}(m_1)\, \sigma_{P_{e_2}}(m_2) \cdots \sigma_{P_{e_k}}(m_k),\ 1 \le m_1 \le |P_{e_1}|!,\ \dots,\ 1 \le m_k \le |P_{e_k}|! \right\},$$

where $\sigma_G(m)$ refers to the $m$-th permutation of the set $G$. The set $S(p_a, p_e)$ contains all the permutations $m$ for which the randomization $\alpha$ (to be defined next) has a positive value, i.e., $\alpha(m) > 0$ for all $m \in S(p_a, p_e)$. We now define the randomization $\alpha(p_a, p_e)$ as

$$\alpha(m) = \begin{cases} \dfrac{1}{|P_1|!\, |P_2|! \cdots |P_k|!}, & m \in S(p_a, p_e), \\ 0, & \text{otherwise.} \end{cases} \tag{27}$$
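The throughput functions (23), (25), (26) and the randomization (27) can be sketched as follows. One reading is assumed throughout and flagged in the code: $B_i(m)$ is taken to be the set of users listed before $i$ in the tuple $\sigma_{\mathcal{N}}(m)$, which appears consistent with the throughput ordering reported in the numerical results but is an interpretation, not a statement from the paper. The function names and the `signal` dictionary (holding $h_j(k_j)p_j(l_j)$ per user) are likewise hypothetical.

```python
from itertools import permutations
from math import log2


def matched_filter_throughput(i, signal, noise=1.0):
    """Equation (23): all other users are treated as noise."""
    interference = sum(s for j, s in signal.items() if j != i)
    return log2(1.0 + signal[i] / (noise + interference))


def sic_throughput(i, order, signal, noise=1.0):
    """Equation (25) for one decoding order.

    Assumption: B_i(m) is the set of users listed before i in `order`.
    """
    interference = sum(signal[j] for j in order[:order.index(i)])
    return log2(1.0 + signal[i] / (noise + interference))


def randomization(blocks):
    """Equation (27) for an ordered partition such as [(1,), (2, 3)].

    Returns {decoding order: probability}; the keys form the support set.
    """
    orders = [()]
    for block in blocks:
        orders = [o + p for o in orders for p in permutations(block)]
    return {o: 1.0 / len(orders) for o in orders}


def randomized_throughput(i, alpha, signal, noise=1.0):
    """Equation (26): average of (25) over the random decoding order."""
    return sum(p * sic_throughput(i, o, signal, noise) for o, p in alpha.items())


signal = {1: 1.0, 2: 1.0, 3: 1.0}          # hypothetical h_j(k_j) * p_j(l_j)
alpha = randomization([(1,), (2, 3)])      # user 1 first, then {2, 3} in random order
print(alpha)
print(randomized_throughput(2, alpha, signal))
```

The number of orders generated for an ordered partition is the product of the factorials of the block sizes, so the uniform weight `1 / len(orders)` matches the constant in (27).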
The following example shows the construction for $\mathcal{N} = \{1, 2, 3\}$.

Example 1: $\mathcal{N} = \{1, 2, 3\}$. The permutation sets of $\mathcal{N}$ are $\sigma_{\mathcal{N}}(1) = (1,2,3)$, $\sigma_{\mathcal{N}}(2) = (1,3,2)$, $\sigma_{\mathcal{N}}(3) = (2,1,3)$, $\sigma_{\mathcal{N}}(4) = (2,3,1)$, $\sigma_{\mathcal{N}}(5) = (3,1,2)$ and $\sigma_{\mathcal{N}}(6) = (3,2,1)$.

The possible partitions of the set $\mathcal{N}$, indexed to match the ordered sets below, are $s(1) = \{1\},\{2\},\{3\}$, $s(2) = \{1\},\{2,3\}$, $s(3) = \{2\},\{1,3\}$, $s(4) = \{3\},\{1,2\}$ and $s(5) = \{1,2,3\}$.

The ordered sets formed by the corresponding permutations of the partitions are $s(1,1) = (\{1\}\{2\}\{3\})$, $s(1,2) = (\{1\}\{3\}\{2\})$, $s(1,3) = (\{2\}\{1\}\{3\})$, $s(1,4) = (\{2\}\{3\}\{1\})$, $s(1,5) = (\{3\}\{1\}\{2\})$, $s(1,6) = (\{3\}\{2\}\{1\})$, $s(2,1) = (\{1\}\{2,3\})$, $s(2,2) = (\{2,3\}\{1\})$, $s(3,1) = (\{2\}\{1,3\})$, $s(3,2) = (\{1,3\}\{2\})$, $s(4,1) = (\{3\}\{1,2\})$, $s(4,2) = (\{1,2\}\{3\})$ and $s(5,1) = (\{1,2,3\})$.

The support sets resulting from the above ordered sets are $S(1,1) = \{1\}$, $S(1,2) = \{2\}$, $S(1,3) = \{3\}$, $S(1,4) = \{4\}$, $S(1,5) = \{5\}$, $S(1,6) = \{6\}$, $S(2,1) = \{1,2\}$, $S(2,2) = \{4,6\}$, $S(3,1) = \{3,4\}$, $S(3,2) = \{2,5\}$, $S(4,1) = \{5,6\}$, $S(4,2) = \{1,3\}$ and $S(5,1) = \{1,2,3,4,5,6\}$.

The above support sets lead to the following randomizations:

TABLE I
RANDOMIZATIONS WITH PSNE

α(p_a, p_e)   α(1)   α(2)   α(3)   α(4)   α(5)   α(6)
α(1, 1)       1      0      0      0      0      0
α(1, 2)       0      1      0      0      0      0
α(1, 3)       0      0      1      0      0      0
α(1, 4)       0      0      0      1      0      0
α(1, 5)       0      0      0      0      1      0
α(1, 6)       0      0      0      0      0      1
α(2, 1)       1/2    1/2    0      0      0      0
α(2, 2)       0      0      0      1/2    0      1/2
α(3, 1)       0      0      1/2    1/2    0      0
α(3, 2)       0      1/2    0      0      1/2    0
α(4, 1)       0      0      0      0      1/2    1/2
α(4, 2)       1/2    0      1/2    0      0      0
α(5, 1)       1/6    1/6    1/6    1/6    1/6    1/6

The next theorem shows that the randomizations constructed in this section lead to games which have PSNEs.

Theorem 2: Any Markov game $\Gamma^{\alpha}_{cmg}$ with $\alpha = \alpha(p_a, p_e)$ has a pure strategy Nash equilibrium.

Proof: Refer to [11].

C. Sum capacity utility function

We define the sum capacity utility function as

$$t^{sc}(k, l) = \log_2\left(1 + \frac{\sum_{i=1}^{N} h_i(k_i)\, p_i(l_i)}{N_0}\right). \tag{28}$$

For any probability distribution $\alpha$ we have

$$\sum_{i=1}^{N} t_i^{\alpha}(k, l) = t^{sc}(k, l).$$

We can interpret the sum capacity utility function as the aggregate sum throughput that each user maximizes in a decentralized manner when the base station is using a SIC decoder.
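The identity between the summed SIC throughputs and the sum capacity can be checked numerically. For a fixed decoding order the terms of (25) telescope: the user at position $j$ contributes $\log_2\big((N_0 + S_j)/(N_0 + S_{j-1})\big)$, where $S_j$ is the total received power of the first $j$ users in the order, so the sum is $\log_2\big((N_0 + S_N)/N_0\big)$ for every order and hence for every mixture $\alpha$. The sketch below verifies this for all six orders with three users; the signal values are made up, and the convention that a user is interfered with by the users listed before it is an assumption.

```python
from itertools import permutations
from math import log2


def sic_rates(order, signal, noise=1.0):
    """Per-user rates of equation (25) for one decoding order.

    Assumption: a user sees interference from the users listed before it.
    """
    rates, interference = {}, 0.0
    for i in order:
        rates[i] = log2(1.0 + signal[i] / (noise + interference))
        interference += signal[i]
    return rates


def sum_capacity(signal, noise=1.0):
    """Equation (28)."""
    return log2(1.0 + sum(signal.values()) / noise)


signal = {1: 0.5, 2: 2.0, 3: 1.25}   # hypothetical values of h_i(k_i) * p_i(l_i)
gaps = [abs(sum(sic_rates(order, signal).values()) - sum_capacity(signal))
        for order in permutations(signal)]
print(max(gaps))
```

The individual rates do change with the order; only their sum is invariant, which is why the choice of $\alpha$ redistributes throughput among users without changing the total at a given state and action.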
The sum capacity Markov game is $\Gamma_{cmg}$ with $t_i = t^{sc}$ for all $i \in \mathcal{N}$ and is denoted as $\Gamma^{sc}_{cmg}$.

V. ALGORITHMS

In this section we give the algorithms to compute the CNE for the Markov games $\Gamma^{in}_{cmg}$, $\Gamma^{sc}_{cmg}$, $\Gamma^{s}_{cmg}$, and $\Gamma^{\alpha}_{cmg}$ whenever $\alpha = \alpha(p_a, p_e)$ for some partition $s(p_a)$ of $\mathcal{N}$ and permutation $p_e$ of the partition sets. Algorithm 1 is used to compute the Nash equilibrium for the first three Markov games, while Algorithm 2 is used to compute the equilibrium for the randomized game $\Gamma^{\alpha}_{cmg}$. Note that Algorithm 1 was considered in [6], where its proof for identical interest throughput functions (i.e., $\Gamma^{s}_{cmg}$) was also given. We extend the proof for $\Gamma^{s}_{cmg}$ to cases where the throughput functions of the users may be different.

Algorithm 1
  Initialize multipolicy $u^0 \in U$
  for all $1 \le i \le N$ do
    Compute $u_i^k \in BR(u_{-i})$ by solving the LP using the simplex algorithm, where $u_{-i} = (u_1^k, u_2^k, \dots, u_{i-1}^k, u_{i+1}^{k-1}, \dots, u_N^{k-1})$.
    if $T_i(u_i^k, u_{-i}) = T_i(u_i^{k-1}, u_{-i})$ then
      set the updated value $u_i^k := u_i^{k-1}$
    end if
  end for
  if $u^k = u^{k-1}$ then
    stop
  else
    increment $k$ and go to step 2 (the for loop)
  end if
  $u^k$ is the CNE

We now define the restriction of $\Gamma_{cmg}$ which is used in Algorithm 2. Given any set $S \subseteq \mathcal{N}$ of users and a policy $u_i^0$ for all $i \in \mathcal{N} \setminus S$, we define the restriction of $\Gamma_{cmg}$ on the set $S$ as the constrained Markov game in which the users in $S$ participate in the game $\Gamma_{cmg}$ while the users in $\mathcal{N} \setminus S$ use the predefined policy $u_i^0$. We denote the restricted game as $\Gamma_{cmg}(S)$. Let $s(p_a, p_e) = (P_{e_1} P_{e_2} \cdots P_{e_k})$ denote the ordered set formed by the $p_e$-th permutation of the partition $s(p_a) = P_1, P_2, \dots, P_k$. We compute the PSNE for the game $\Gamma^{\alpha}_{cmg}$ induced by the partition $p_a$ and permutation $p_e$.

Algorithm 2
  Initialize multipolicy $u^0 \in U$
  for all $1 \le j \le k$ do
    if user $i \in P_{e_l}$ where $l < j$ then
      Set $u_i = u_i^*$
    end if
    if user $i \in P_{e_l}$ where $l > j$ then
      Set $u_i = u_i^0$
    end if
    Compute $u_i^k$ for all $i \in P_{e_j}$ by running Algorithm 1 on the restricted Markov game $\Gamma_{cmg}(P_{e_j})$.
    Set $u_i^* = u_i^k$ for all $i \in P_{e_j}$.
  end for
  $u_i^*,\ i \in \mathcal{N}$ is the required PSNE.

The convergence of Algorithms 1 and 2 is proved in [11].

VI. NUMERICAL RESULTS

The channel model considered is the BF-FSMC model [7]. The channel transition probabilities are $P_{0,0} = 1/2$, $P_{0,1} = 1/2$, $P_{k^m, k^m - 1} = 1/2$, $P_{k^m, k^m} = 1/2$, and $P_{k,k} = 1/3$, $P_{k,k-1} = 1/3$, $P_{k,k+1} = 1/3$ for $1 \le k \le k^m - 1$. The channel gain and the power function are $h_i(k_i) = k_i / k_i^m$ and $p_i(l_i) = l_i$ respectively. The following parameters are fixed for all users: $k^m = 3$, $l^m = 5$, $q^m = 10$, $\bar{P} = 2$ and $\bar{Q} = 5$. The arrivals $\gamma_i[n]$ have a Poisson distribution with rate 0.3, and $N_0 = 1$.
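As a small check on the channel model just described, the sketch below builds the transition matrix for $k^m = 3$ and approximates its stationary distribution by power iteration. The function names and the iteration count are arbitrary choices made for this illustration; the mean channel gain it prints is a by-product of the model, not a figure reported in the paper.

```python
def bf_fsmc_matrix(k_max):
    """Transition matrix of the birth-death channel described above."""
    n = k_max + 1
    P = [[0.0] * n for _ in range(n)]
    P[0][0] = P[0][1] = 0.5
    P[k_max][k_max - 1] = P[k_max][k_max] = 0.5
    for k in range(1, k_max):
        P[k][k - 1] = P[k][k] = P[k][k + 1] = 1.0 / 3.0
    return P


def stationary(P, iterations=2000):
    """Power iteration pi <- pi P, started from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]
    return pi


P = bf_fsmc_matrix(3)
pi = stationary(P)
gains = [k / 3 for k in range(4)]                  # h(k) = k / k_max
mean_gain = sum(p * h for p, h in zip(pi, gains))
print(pi, mean_gain)
```

Detailed balance for this birth-death chain gives stationary weights proportional to (1, 1.5, 1.5, 1), so the iteration should settle near (0.2, 0.3, 0.3, 0.2). These are the $\pi$ values that enter the immediate reward (10) when a user's policy does not affect its channel.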
The throughputs obtained at the equilibria for the various games are tabulated in Table II in the user order {1, 2, 3}. Note that the randomized game with $\alpha = \alpha(2, 1)$ has multiple equilibria. Please refer to [11] for the optimal policies.

TABLE II
OPTIMAL USER THROUGHPUT

Game / System Model                  Saturated                Unsaturated
Γ^in_cmg                             .5263, .5263, .5263      .4649, .4649, .4649
Γ^α_cmg, α = α(1, 1)                 1.0644, .6969, .5068     .6949, .5649, .4649
Γ^α_cmg, α = α(4, 2)                 .8836, .8836, .5082      .6299, .6299, .4649
Γ^α_cmg, α = α(5, 1)                 .7566, .7566, .7566      .5749, .5749, .5749
Γ^α_cmg, α = α(2, 1)                 1.0644, .6035, .5987     .6949, .5149, .5149
  (second saturated equilibrium)     1.0644, .5987, .6035
Γ^s_cmg                              1.6139                   1.3959
Γ^sc_cmg                             2.2789                   1.7246

VII. CONCLUSIONS

We have considered decentralized scheduling of a wireless channel by multiple users. The users may be saturated or unsaturated. The decoder at the base station may employ a matched filter or successive interference cancellation. The users know only their own channel states. The system is modelled as a constrained Markov game with independent state information. We have proved the existence of equilibrium policies and provided algorithms to find these policies. For this, we first convert the Markov game into an equivalent strategic form game.

ACKNOWLEDGEMENTS

The authors would like to thank Professor Altman for interesting discussions about the paper.

REFERENCES

[1] Z. Han, D. Niyato, W. Saad, T. Basar, and A. Hjorungnes, Game Theory in Wireless and Communication Networks.
[2] L. Lai and H. El Gamal, "The water-filling game in fading multiple-access channels," IEEE Trans. Inform. Theory, vol. 54, no. 5, pp. 2110-2122, May 2008.
[3] G. Scutari, D. P. Palomar, and S. Barbarossa, "Optimal Linear Precoding Strategies for Wideband Noncooperative Systems Based on Game Theory, Part I: Nash Equilibria," IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1230-1249, March 2008.
[4] G. Scutari, D. P. Palomar, and S. Barbarossa, "Optimal Linear Precoding Strategies for Wideband Noncooperative Systems Based on Game Theory, Part II: Algorithms," IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1250-1267, March 2008.
[5] T. Heikkinen, "A potential game approach to distributed power control and scheduling," Computer Networks, vol. 50, no. 13, pp. 2295-2311, September 2006.
[6] E. Altman, K. Avrachenkov, G. Miller, and B. Prabhu, "Uplink dynamic discrete power control in cellular networks," INRIA Technical Report 5818, 2006.
[7] E. Altman, K. Avrachenkov, N. Bonneau, M. Debbah, R. El-Azouzi, and D. Menasche, "Constrained Stochastic Games in Wireless Networks," Globecom, 2008.
[8] E. Altman, "Constrained cost coupled stochastic games with independent state information," to be published in Operations Research Letters, vol. 36, no. 2, Mar. 2008.
[9] E. Altman, K. Avrachenkov, N. Bonneau, M. Debbah, R. El-Azouzi, and D. Menasche, "Constrained stochastic games: Dynamic control in wireless networks," Tech. Report, 2007, www-net.cs.umass.edu/sadoc/mdp.
[10] D. Monderer and L. S. Shapley, "Potential games," Games and Economic Behavior, vol. 14, no. 1, pp. 124-143, May 1996.
[11] N. Prashant and Vinod Sharma, "Stochastic Games on the Multiple Access Channel," submitted to arXiv pre-print server, October 2012.