Capacity Limits of MIMO Channels
Tutorial (4G Systems)
Markku Juntti

Contents
1. Introduction
2. Review of information theory
3. Fixed MIMO channels
4. Fading MIMO channels
5. Summary and Conclusions
References

Centre for Wireless Communications (CWC)
1. Introduction

The use of multiple antennas can provide gain due to:
- antenna gain: more receive antennas, more power is collected
- interference gain: interference nulling by beamforming (array gain); interference averaging (towards zero) due to independent observations
- diversity gain against fading: receive diversity and transmit diversity.

An information-theoretic model of the multiple-input multiple-output (MIMO) channel is considered.
MIMO Channel Model

Assume N_T transmit and N_R receive antennas; this is called an N_T × N_R MIMO system.
Fading radio channels are modeled as frequency-flat and are
- fixed or time-varying
- known in the transmitter and/or the receiver (perfect channel state information, CSI), or a priori unknown.

[Figure: MIMO channel model with inputs x_1(n), …, x_{N_T}(n), channel gains h_{1,1}(n), …, h_{N_R,N_T}(n), and outputs y_1(n), …, y_{N_R}(n).]
2. Review of Information Theory

Information theory (IT) has its origins in analyzing the limits of communication.
Information theory answers two fundamental questions in communication theory:
- What is the ultimate data compression rate? Answer: entropy.
- What is the ultimate data transmission rate? Answer: channel capacity.
Basic Concepts

Assume a discrete-valued random variable (RV) X with probability mass function p(x).
The average information or entropy of RV X:
    H(X) = -\sum_x p(x) \log p(x) = -E[\log p(X)] = E[\log(1/p(X))].
Joint entropy of RVs X and Y:
    H(X,Y) = -\sum_x \sum_y p(x,y) \log p(x,y) = -E[\log p(X,Y)].
Conditional entropy of RV Y given X:
    H(Y|X) = \sum_x p(x) H(Y|X=x) = -\sum_x \sum_y p(x,y) \log p(y|x) = -E[\log p(Y|X)].
Chain rule:
    H(X,Y) = H(X) + H(Y|X).
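The definitions above can be checked numerically. The sketch below uses a made-up 2×2 joint pmf (the numbers are illustrative, not from the slides) and verifies the chain rule H(X,Y) = H(X) + H(Y|X).

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a probability mass function (zero terms are ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint pmf p(x, y) on a 2x2 alphabet (rows: x, columns: y).
p_xy = np.array([[0.25, 0.25],
                 [0.40, 0.10]])

p_x = p_xy.sum(axis=1)           # marginal p(x)
H_X = entropy(p_x)
H_XY = entropy(p_xy)
H_Y_given_X = H_XY - H_X         # chain rule: H(Y|X) = H(X,Y) - H(X)

print(H_X, H_XY, H_Y_given_X)    # H(X) = 1 bit here, since p(x) = (0.5, 0.5)
```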
Mutual Information

Mutual information is the relative entropy between the joint distribution and the product of the marginals:
    I(X;Y) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x) p(y)} = E\left[\log \frac{p(X,Y)}{p(X) p(Y)}\right].
Equivalently,
    I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y) = I(Y;X).
It is a measure of the information one random variable (say, X) contains about the other (Y):
- I(X;Y) = 0 if and only if X and Y are independent.
- If Y = X: I(X;X) = H(X).
Differential entropy is used for continuous RVs.
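The two special cases above can be verified with a short computation; the pmfs are made-up illustrations.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf (zeros ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(p_xy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), from a joint pmf matrix."""
    return entropy(p_xy.sum(axis=1)) + entropy(p_xy.sum(axis=0)) - entropy(p_xy)

# Independent X and Y: the joint factorizes, so I(X;Y) = 0.
p_ind = np.outer([0.3, 0.7], [0.5, 0.5])
# Y = X: all probability mass on the diagonal, so I(X;X) = H(X).
p_eq = np.diag([0.3, 0.7])

print(mutual_information(p_ind))   # ~0
print(mutual_information(p_eq))    # ~H(X) = entropy([0.3, 0.7])
```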
Gaussian RVs

For multivariate, real-valued Gaussian RVs X_1, X_2, …, X_n with mean vector µ and covariance matrix K, the differential entropy is
    h(X_1, X_2, …, X_n) = (1/2) \log[(2\pi e)^n \det(K)].
The Gaussian distribution maximizes the entropy over all distributions with the same covariance:
    h(X_1, X_2, …, X_n) \le (1/2) \log[(2\pi e)^n \det(K)]
for any RVs X_1, X_2, …, X_n, with equality if and only if they are Gaussian.
Channel Capacity

[Figure: information-theoretic model of a communication system — message W → encoder → X^n → channel p(y|x) → Y^n → decoder → estimate of message Ŵ.]

Channel capacity:
    C = \max_{p(x)} I(X;Y).
Code rate R is achievable if there exists a sequence of (2^{nR}, n) codes such that P_{e,max} → 0 as n → ∞.
Gaussian Channel

The Gaussian channel: Y_i = X_i + Z_i, with noise Z_i ~ N(0, \sigma_N^2) and input power \sigma_S^2.
Channel capacity:
    C = \max_{p(x)} I(X;Y) = (1/2) \log(1 + \gamma),  where \gamma = \sigma_S^2 / \sigma_N^2.
Capacity per time unit (2W samples per second):
    C = W \log(1 + P / (N_0 W)).
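The bandlimited capacity formula is a one-liner; the numbers below are made up to give a round answer.

```python
import numpy as np

def awgn_capacity_bits_per_sec(P, N0, W):
    """Shannon capacity C = W * log2(1 + P / (N0 * W)) of a bandlimited AWGN channel."""
    return float(W * np.log2(1.0 + P / (N0 * W)))

# Hypothetical numbers: W = 1 MHz, SNR = P / (N0 W) = 15, so C = W * log2(16).
C = awgn_capacity_bits_per_sec(P=15.0, N0=1e-6, W=1e6)
print(C)   # 4e6 bits/s, since log2(16) = 4
```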
Parallel Gaussian Channels

[Figure: k parallel Gaussian channels, Y_i = X_i + Z_i with Z_i ~ N(0, \sigma_{N,i}^2), i = 1, …, k.]

Capacity:
    C = \sum_{i=1}^{k} (1/2) \log(1 + \sigma_{S,i}^2 / \sigma_{N,i}^2) = \sum_{i=1}^{k} (1/2) \log(1 + \gamma_i).
Optimal transmission:
    X ~ N(0, diag[\sigma_{S,1}^2, \sigma_{S,2}^2, …, \sigma_{S,k}^2]),
with the powers chosen by water-filling.
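A minimal water-filling sketch: the power on channel i is max(0, µ − σ_{N,i}^2), and the water level µ is found here by bisection. The noise variances and total power are made-up example numbers.

```python
import numpy as np

def waterfill(noise, P_total):
    """Water-filling over parallel Gaussian channels.

    Returns powers P_i = max(0, mu - noise_i) with sum(P_i) = P_total,
    where the water level mu is found by bisection.
    """
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + P_total
    for _ in range(100):                       # bisection on the water level
        mu = 0.5 * (lo + hi)
        if np.maximum(0.0, mu - noise).sum() > P_total:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.0, mu - noise)

# Three subchannels with noise variances 1, 2, 5 and total power 3 (illustrative).
noise = np.array([1.0, 2.0, 5.0])
P = waterfill(noise, 3.0)
C = 0.5 * np.log2(1.0 + P / noise).sum()
print(P, C)   # the worst channel (noise 5) receives no power at this budget
```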
3. Fixed MIMO Channels

Signal x_i(n) is transmitted at time interval n from antenna i (i = 1, 2, …, N_T).
Signal y_j(n) is received at time interval n at antenna j (j = 1, 2, …, N_R):
    y_j(n) = \sum_{i=1}^{N_T} h_{j,i}(n) x_i(n) + \eta_j(n),
where h_{j,i}(n) is the complex channel gain with E[|h_{j,i}(n)|^2] = 1.

[Figure: MIMO channel model with inputs x_1(n), …, x_{N_T}(n), gains h_{1,1}(n), …, h_{N_R,N_T}(n), and outputs y_1(n), …, y_{N_R}(n).]
Matrix Formulation of the MIMO Channel Model

The signal received at all antennas:
    y(n) = H(n) x(n) + \eta(n),
where
    x(n) = [x_1(n) x_2(n) … x_{N_T}(n)]^T ∈ C^{N_T},
    y(n) = [y_1(n) y_2(n) … y_{N_R}(n)]^T ∈ C^{N_R},
and H(n) ∈ C^{N_R × N_T} is the channel matrix with (j,i)th entry h_{j,i}(n).
Noise Model and Power Constraint

The noise vector
    \eta(n) = [\eta_1(n) \eta_2(n) … \eta_{N_R}(n)]^T ∈ C^{N_R}
satisfies \eta(n) ~ CN(0, \sigma_N^2 I).
The transmitted signal satisfies the average power constraint:
    E[x^H(n) x(n)] = \sum_{i=1}^{N_T} E[|x_i(n)|^2] = \sum_{i=1}^{N_T} \sigma_{S,i}^2 \le \sigma_S^2.
Singular Value Decomposition

The MIMO model is a special case of parallel Gaussian channels.
The channel transfer matrix has the singular value decomposition (SVD)
    H = U \Lambda^{1/2} V^H,
where U ∈ C^{N_R × N_R} and V ∈ C^{N_T × N_T} are unitary matrices, and \Lambda^{1/2} ∈ R^{N_R × N_T} is a diagonal matrix of the singular values of H.
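The decomposition can be demonstrated numerically. The sketch below draws a random complex matrix as a stand-in for H (the model itself specifies no particular realization) and checks that U and V are unitary and that H is reconstructed.

```python
import numpy as np

rng = np.random.default_rng(0)
N_R, N_T = 3, 4

# Random complex channel matrix H with unit-variance entries (illustrative).
H = (rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)           # H = U diag(s) V^H; s holds the singular values
Lambda_half = np.zeros((N_R, N_T))    # diagonal N_R x N_T matrix of singular values
Lambda_half[:len(s), :len(s)] = np.diag(s)

assert np.allclose(U @ U.conj().T, np.eye(N_R))       # U unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(N_T))     # V unitary
assert np.allclose(U @ Lambda_half @ Vh, H)           # reconstruction
print(s)   # at most min(N_R, N_T) = 3 positive singular values
```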
Equivalent Channel Model

Let
    \tilde{x}(n) = V^H x(n),  \tilde{y}(n) = U^H y(n),  \tilde{\eta}(n) = U^H \eta(n).
Since U and V are unitary:
    E[\tilde{x}^H(n) \tilde{x}(n)] \le \sigma_S^2,  \tilde{\eta}(n) ~ CN(0, \sigma_N^2 I).
Equivalent channel model:
    \tilde{y}(n) = \Lambda^{1/2} \tilde{x}(n) + \tilde{\eta}(n),
where \Lambda^{1/2} is a diagonal matrix of size N_R × N_T.
These are independent parallel Gaussian channels; capacity is achieved with a Gaussian input and water-filling.
Derivation of Channel Capacity

The rank of matrix H is rank(H) \le \min(N_R, N_T); the number of positive singular values equals rank(H).
The capacity of the MIMO AWGN channel:
    C = \sum_{i=1}^{rank(H)} \log(1 + \lambda_i \sigma_{S,i}^2 / \sigma_N^2) = \sum_{i=1}^{rank(H)} \log(1 + \lambda_i \gamma_i),  \gamma_i = \sigma_{S,i}^2 / \sigma_N^2,
where the signal powers are solved via water-filling:
    \sigma_{S,i}^2 = \max(0, \mu - \sigma_N^2 / \lambda_i),  i = 1, 2, …, rank(H),
and \mu is chosen so that the power constraint is satisfied:
    \sum_{i=1}^{rank(H)} \sigma_{S,i}^2 \le \sigma_S^2.
MIMO Channel Capacity for a Full-Rank Channel Matrix

No CSI at the transmitter (and full-rank H):
    C = \log \det(I_{N_R} + (\gamma / N_T) H H^H),  \gamma = \sigma_S^2 / \sigma_N^2.
CSI at the transmitter (and full-rank H):
    C = \max_Q \log \det(I_{N_R} + (1/\sigma_N^2) H Q H^H),
where Q is the covariance matrix of the input vector x satisfying the power constraint tr(Q) \le \sigma_S^2.
With no CSI at the transmitter the input is isotropic, Q = (\sigma_S^2 / N_T) I, which yields the first formula.
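Both formulas can be evaluated for a concrete channel. A sketch, assuming i.i.d. complex Gaussian entries for H and γ = 10 (made-up values): the no-CSI case uses equal power, and the CSI case water-fills the total SNR γ over the eigenmodes λ_i of H H^H. Transmitter CSI can only help, so the second number is never smaller.

```python
import numpy as np

def capacity_no_csit(H, gamma):
    """C = log2 det(I + (gamma/N_T) H H^H): equal power across transmit antennas."""
    N_R, N_T = H.shape
    M = np.eye(N_R) + (gamma / N_T) * H @ H.conj().T
    return float(np.log2(np.linalg.det(M).real))

def capacity_csit(H, gamma):
    """Water-fill the total SNR gamma over the eigenmodes lambda_i of H H^H."""
    lam = np.linalg.eigvalsh(H @ H.conj().T)
    lam = lam[lam > 1e-12]                    # keep the positive eigenmodes
    lo, hi = 0.0, gamma + (1.0 / lam).max()
    for _ in range(100):                      # bisection for the water level mu
        mu = 0.5 * (lo + hi)
        if np.maximum(0.0, mu - 1.0 / lam).sum() > gamma:
            hi = mu
        else:
            lo = mu
    p = np.maximum(0.0, mu - 1.0 / lam)
    return float(np.log2(1.0 + p * lam).sum())

rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
c0, c1 = capacity_no_csit(H, gamma=10.0), capacity_csit(H, gamma=10.0)
print(c0, c1)   # c1 >= c0: transmitter CSI can only help
```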
4. Fading MIMO Channels

The channels are usually assumed to be ergodic: fading is fast enough and visits all realizations so many times that
- the sample average equals the theoretical mean
- the sample covariance equals the theoretical covariance.

[Figure: a fading process over time — a long observation interval (ergodic) versus a short one (non-ergodic).]
Fading Channel Model with Perfect Receiver CSI

The effective channel output is the pair of the actual channel output y and the channel realization H, so the mutual information conditioned on the channel realization satisfies
    I(x; y, H) = I(x; H) + I(x; y | H) = I(x; y | H),
since I(x; H) = 0.
Assuming that the channel is memoryless (independent channel state for each transmission), the capacity equals the mean of the mutual information:
    C = E_H[\log \det(I_{N_R} + (\gamma / N_T) H H^H)].
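The expectation over H has no elementary closed form, but a Monte Carlo sketch is straightforward; i.i.d. Rayleigh fading with E|h_ji|² = 1 and the trial counts below are assumptions for illustration.

```python
import numpy as np

def ergodic_capacity(N_R, N_T, gamma, n_trials=2000, seed=0):
    """Monte Carlo estimate of C = E_H[log2 det(I + (gamma/N_T) H H^H)]
    for i.i.d. Rayleigh fading with unit-variance entries (a sketch, not a closed form)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_trials):
        H = (rng.standard_normal((N_R, N_T))
             + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
        M = np.eye(N_R) + (gamma / N_T) * H @ H.conj().T
        total += np.log2(np.linalg.det(M).real)
    return total / n_trials

# The ergodic capacity grows roughly linearly in the number of antennas (N_R = N_T).
caps = {n: ergodic_capacity(n, n, gamma=10.0) for n in (1, 2, 4)}
print(caps)
```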
Capacity Evaluation

The evaluation of the fading MIMO channel capacity is complicated. Available tools:
- Wishart distribution and Laguerre polynomials [Telatar 1999]
- bounds [Foschini & Gans 1998]
- Monte Carlo computer simulations
- random matrix theory: the mutual information tends to a Gaussian distribution (still under development).
Example: N × N MIMO System

[Figure: two plots for a fading channel with receiver CSI and N_R = N_T. Left: capacity (bits per symbol) vs. SNR (0–20 dB) for 1, 2, 4, 8, 16, and 32 antennas. Right: capacity (bits per symbol) vs. number of antennas (1–32) at SNR = 0, 10, and 20 dB.]

The capacity curves are shifted upwards by introducing more antennas.
The capacity increases linearly with the number of antennas.
Non-Ergodic Channels

The channels are not always ergodic: fading can be so slow that it undergoes only some of its realizations. The random process then becomes non-ergodic.

[Figure: a fading process over time with ergodic and non-ergodic observation intervals marked.]
Example

[Figure: a random switch selects between two AWGN channels, one supporting 1 bit / use and the other 2 bits / use.]

Select one of the two channels with equal probability, and then keep it fixed.
The average mutual information is 1.5 bits / channel use. However, with probability 0.5 that rate is not supported.
The achievable rate is 1 bit / channel use.
Channel capacity ≠ the average maximum mutual information.
Example: Random and Fixed Channel

A simple example: generate a channel realization, and keep it fixed during the whole transmission.
There is a positive probability of an arbitrarily bad channel realization.
However small the rate, the channel realization may not be able to support it, regardless of the length of the code word.
The Shannon capacity of this non-ergodic channel is zero.
Again, the Shannon capacity is not equal to the average mutual information.
Outage Probability

In non-ergodic channels, the capacity is measured by the probability of outage for a given rate R:
    P_out(R) = \inf_{Q: Q \ge 0, tr(Q) \le \sigma_S^2} \Pr[I(x; y) < R]
             = \inf_{Q: Q \ge 0, tr(Q) \le \sigma_S^2} \Pr[\log \det(I_{N_R} + (1/\sigma_N^2) H Q H^H) < R].
This is often called capacity versus outage.
The set-up is encountered in real-time applications with transmission delay constraints.
A similar approach is also applicable for delay-constrained communications in ergodic channels.
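A Monte Carlo sketch of the outage probability for i.i.d. Rayleigh fading, using the isotropic input Q = (σ_S²/N_T) I rather than the infimum over Q, so it upper-bounds the true outage; the rate, SNR, and trial count are made-up illustration values.

```python
import numpy as np

def outage_probability(R, N_R, N_T, gamma, n_trials=5000, seed=0):
    """Monte Carlo estimate of Pr[log2 det(I + (gamma/N_T) H H^H) < R]
    for i.i.d. Rayleigh fading with an isotropic input (an upper bound on
    the true outage probability, which infimizes over the input covariance Q)."""
    rng = np.random.default_rng(seed)
    outages = 0
    for _ in range(n_trials):
        H = (rng.standard_normal((N_R, N_T))
             + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
        info = np.log2(np.linalg.det(
            np.eye(N_R) + (gamma / N_T) * H @ H.conj().T).real)
        outages += (info < R)
    return outages / n_trials

# More antennas provide diversity: outage at a fixed rate drops sharply.
p1 = outage_probability(R=2.0, N_R=1, N_T=1, gamma=10.0)
p2 = outage_probability(R=2.0, N_R=2, N_T=2, gamma=10.0)
print(p1, p2)
```

For the 1×1 case the estimate can be checked analytically: |h|² is exponential, so P_out = Pr[log2(1 + 10|h|²) < 2] = 1 − e^(−0.3) ≈ 0.26.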
5. Summary and Conclusions

AWGN MIMO channels are an extension of parallel Gaussian channels.
Another example of parallel channels: channels on different frequencies.
Introducing both multiple transmit and receive antennas is equivalent to an increase in bandwidth; the linear capacity increase then becomes natural:
    C = \log \det(I_{N_R} + (\gamma / N_T) H Q H^H).
Fading AWGN MIMO Channel

Ergodic channels:
- The channel experiences all its states several times.
- No delay constraints and/or fast fading.
- Capacity equals the average mutual information:
    C = E_H[\log \det(I_{N_R} + (\gamma / N_T) H H^H)].
- Capacity increases linearly with N_R = N_T.
Non-ergodic channels:
- Capacity does not equal the average mutual information.
- Capacity versus outage probability.
Research Challenges

Capacity of selective channels:
- time-selective
- frequency-selective
- with no or imperfect channel state information in the transmitter and the receiver.
Optimal signal structures (coding and modulation) for real use, with issues like:
- amount of training vs. non-coherent detection
- transceiver complexity constraints
- limited bandwidth of a non-ideal feedback channel.
References
1. T. M. Cover & J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991. ISBN 0-471-06259-6.
2. E. Telatar, "Capacity of multi-antenna Gaussian channels," European Transactions on Telecommunications, vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.
3. G. J. Foschini & M. J. Gans, "On limits of wireless communications in a fading environment when using multiple antennas," Wireless Personal Communications, vol. 6, pp. 311-335, 1998.
4. T. L. Marzetta & B. M. Hochwald, "Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading," IEEE Transactions on Information Theory, vol. 45, no. 1, pp. 139-157, Jan. 1999.
5. I. E. Telatar & D. N. C. Tse, "Capacity and mutual information of wideband multipath fading channels," IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1384-1400, July 2000.
6. M. Medard, "The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel," IEEE Transactions on Information Theory, vol. 46, no. 3, pp. 933-945, May 2000.
7. M. Medard & R. G. Gallager, "Bandwidth scaling for fading multipath channels," IEEE Transactions on Information Theory, vol. 48, no. 4, pp. 840-852, April 2002.
8. V. G. Subramanian & B. Hajek, "Broad-band fading channels: signal burstiness and capacity," IEEE Transactions on Information Theory, vol. 48, no. 4, pp. 809-827, April 2002.