Polarization codes and the rate of polarization

Erdal Arıkan, Emre Telatar
Bilkent U., EPFL
Sept 10, 2008

Channel Polarization

Given a binary-input DMC $W$, take i.i.d. uniformly distributed inputs $(X_1, \dots, X_n) \in \{0,1\}^n$, in one-to-one correspondence with binary data $(U_1, \dots, U_n) \in \{0,1\}^n$ through a map $G_n : F_2^n \to F_2^n$, i.e., $X^n = G_n(U^n)$. Observe that the $U_i$ are also i.i.d., uniform on $\{0,1\}$.

[Figure: block diagram $U^n \in F_2^n \to G_n \to X^n \in F_2^n \to W \to Y^n$.]

Channel Polarization

$I(U^n; Y^n) = I(X^n; Y^n) = \sum_i I(X_i; Y_i) = \sum_i I(W) = n\,I(W)$

$I(U^n; Y^n) = \sum_i I(U_i; Y^n \mid U^{i-1})$

Notation: $I(P)$ denotes the mutual information between the input and output of a channel $P$ when the input is uniformly distributed.

Channel Polarization

$I(U^n; Y^n) = I(X^n; Y^n) = \sum_i I(X_i; Y_i) = n\,I(W) = \sum_i I(U_i; Y^n \mid U^{i-1})$

So the numbers $I(U_i; Y^n \mid U^{i-1})$ have the same total as the $I(X_i; Y_i)$, but their empirical distribution can look very different.

[Figure: the empirical distribution $\#\{i : I(X_i; Y_i) \le \zeta\}/n$ is a single step at $\zeta = I(W)$; the empirical distribution $\#\{i : I(U_i; Y^n \mid U^{i-1}) \le \zeta\}/n$ is plotted for $n = 1, 2, 2^4, 2^8, 2^{12}, 2^{20}$, and its mass moves toward $\zeta = 0$ and $\zeta = 1$ as $n$ grows.]

Notation: $I(P)$ denotes the mutual information between the input and output of a channel $P$ when the input is uniformly distributed.
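
Looking ahead to the channel-splitting recursion introduced under "Polarization HowTo" below, the histograms above are easy to reproduce numerically for the binary erasure channel: a BEC with erasure probability $\epsilon$ splits into BECs with erasure probabilities $2\epsilon - \epsilon^2$ and $\epsilon^2$, so all $n = 2^l$ values $I(U_i; Y^n \mid U^{i-1})$ can be enumerated exactly. The following minimal sketch (not part of the slides; the choice $W = \mathrm{BEC}(0.5)$ and the threshold $\zeta = 0.1$ are illustrative assumptions) tabulates the fractions of near-perfect and near-useless indices.

```python
# Sketch: empirical distribution of I(U_i; Y^n | U^{i-1}) for W = BEC(eps).
# For the BEC the synthesized channels are again BECs, and the -/+ transforms
# act on the erasure probability as eps^- = 2*eps - eps**2, eps^+ = eps**2.

def bec_mutual_informations(eps, l):
    """Return the I-values of the 2^l synthesized channels."""
    erasure_probs = [eps]
    for _ in range(l):
        erasure_probs = [e for p in erasure_probs for e in (2*p - p*p, p*p)]
    return [1.0 - p for p in erasure_probs]

if __name__ == "__main__":
    eps, zeta = 0.5, 0.1            # illustrative channel BEC(0.5), threshold
    for l in (1, 4, 8, 12, 16):
        I_vals = bec_mutual_informations(eps, l)
        n = len(I_vals)
        frac_hi = sum(v > 1 - zeta for v in I_vals) / n
        frac_lo = sum(v < zeta for v in I_vals) / n
        print(f"n = 2^{l:2d}: near-1 fraction {frac_hi:.3f}, "
              f"near-0 fraction {frac_lo:.3f}")
```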

Channel Polarization

We say channel polarization takes place if it is the case that almost all of the numbers $I(U_i; Y^n \mid U^{i-1})$ are near the extremal values, i.e., if
$\frac{1}{n}\#\{\, i : I(U_i; Y^n \mid U^{i-1}) \approx 1 \,\} \to I(W)$  and  $\frac{1}{n}\#\{\, i : I(U_i; Y^n \mid U^{i-1}) \approx 0 \,\} \to 1 - I(W)$.

Channel Polarization

If polarization takes place and we wish to communicate at rate R:
- Pick $n$, and $k = nR$ good indices $i$ such that $I(U_i; Y^n \mid U^{i-1})$ is high,
- let the transmitter set $U_i$ to uncoded binary data for the good indices, and set $U_i$ to random but publicly known values for the rest,
- let the receiver decode the $U_i$ successively: $U_1$ from $Y^n$; $U_i$ from $(Y^n, \hat U^{i-1})$.
One would expect this scheme to do well as long as $R < I(W)$.

Polarization HowTo

Given two copies of a binary input channel $P : F_2 \to Y$, combine them as in the figure below ($X_1 = U_1 + U_2$, $X_2 = U_2$) to generate two channels $P^- : F_2 \to Y^2$ and $P^+ : F_2 \to Y^2 \times F_2$ with
$P^-(y_1 y_2 \mid u_1) = \sum_{u_2} \tfrac{1}{2}\, P(y_1 \mid u_1 + u_2)\, P(y_2 \mid u_2)$,
$P^+(y_1 y_2 u_1 \mid u_2) = \tfrac{1}{2}\, P(y_1 \mid u_1 + u_2)\, P(y_2 \mid u_2)$.

[Figure: $U_1$ and $U_2$ feed the two copies of $P$ through an XOR: $X_1 = U_1 + U_2$ enters the first copy, producing $Y_1$; $X_2 = U_2$ enters the second copy, producing $Y_2$.]
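
As a concrete companion to these formulas, here is a small sketch (not from the slides) that builds $P^-$ and $P^+$ from a binary-input channel given as a transition table P[x][y] and evaluates $I(\cdot)$ with a uniform input; the table format and the BSC(0.11) test channel are illustrative assumptions.

```python
# Sketch of the basic transform P -> (P-, P+) for a binary-input channel
# given as a transition table P[x][y].  P- has outputs (y1, y2); P+ has
# outputs (y1, y2, u1), following the formulas on the slide above.

from itertools import product
from math import log2

def split(P):
    ys = list(P[0].keys())
    Pminus = {0: {}, 1: {}}
    Pplus = {0: {}, 1: {}}
    for u1, (y1, y2) in product((0, 1), product(ys, ys)):
        Pminus[u1][(y1, y2)] = 0.5 * sum(P[u1 ^ u2][y1] * P[u2][y2]
                                         for u2 in (0, 1))
    for u2, (y1, y2, u1) in product((0, 1), product(ys, ys, (0, 1))):
        Pplus[u2][(y1, y2, u1)] = 0.5 * P[u1 ^ u2][y1] * P[u2][y2]
    return Pminus, Pplus

def mutual_information(P):
    """I(P): mutual information with a uniformly distributed input."""
    ys = P[0].keys()
    return sum(0.5 * P[x][y] * log2(2 * P[x][y] / (P[0][y] + P[1][y]))
               for x in (0, 1) for y in ys if P[x][y] > 0)

if __name__ == "__main__":
    delta = 0.11                        # illustrative: BSC with crossover 0.11
    bsc = {0: {0: 1 - delta, 1: delta}, 1: {0: delta, 1: 1 - delta}}
    Pm, Pp = split(bsc)
    print(mutual_information(Pm), mutual_information(bsc), mutual_information(Pp))
```

Running it prints three numbers satisfying $I(P^-) \le I(P) \le I(P^+)$ with $I(P^-) + I(P^+) = 2\,I(P)$, matching the identities on the next slide.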

Polarization HowTo

Observe that
$\begin{bmatrix} U_1 \\ U_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ over $F_2$.
With independent, uniform $U_1, U_2$,
$I(P^-) = I(U_1; Y_1 Y_2)$,  $I(P^+) = I(U_2; Y_1 Y_2 U_1)$.
Thus
$I(P^-) + I(P^+) = I(U_1 U_2; Y_1 Y_2) = I(X_1 X_2; Y_1 Y_2) = I(X_1; Y_1) + I(X_2; Y_2) = 2\,I(P)$,
and $I(P^-) \le I(P) \le I(P^+)$.
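
A worked example (illustrative, not on the slides): if $P$ is a BEC with erasure probability $\epsilon$, then $P^-$ is a BEC with erasure probability $2\epsilon - \epsilon^2$ and $P^+$ is a BEC with erasure probability $\epsilon^2$, so
$I(P^-) = (1-\epsilon)^2 \;\le\; 1-\epsilon = I(P) \;\le\; 1-\epsilon^2 = I(P^+)$,
and indeed
$I(P^-) + I(P^+) = (1-\epsilon)^2 + (1-\epsilon^2) = 2(1-\epsilon) = 2\,I(P)$.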

Polarization HowTo

What we can do once, we can do many times: Given $W$,
- Duplicate $W$ and obtain $W^-$ and $W^+$.
- Duplicate $W^-$ ($W^+$), and obtain $W^{--}$ and $W^{-+}$ ($W^{+-}$ and $W^{++}$).
- Duplicate $W^{--}$ ($W^{-+}$, $W^{+-}$, $W^{++}$) and obtain $W^{---}$ and $W^{--+}$ ($W^{-+-}$, $W^{-++}$, $W^{+--}$, $W^{+-+}$, $W^{++-}$, $W^{+++}$).
- ...

Polarization HowTo

We now have a one-to-one transformation $G_n$ between $U_1, \dots, U_n$ and $X_1, \dots, X_n$ for $n = 2^l$. Furthermore, the $n = 2^l$ quantities
$I(W^{b_1, \dots, b_l})$, $b_j \in \{+, -\}$,
are exactly the $n$ quantities
$I(U_i; Y^n \mid U^{i-1})$, $i = 1, \dots, n$,
whose empirical distribution we seek. This suggests:
- Set $B_1, B_2, \dots$ to be i.i.d., $\{+,-\}$-valued, uniformly distributed random variables,
- define the random variable $I_l = I(W^{B_1, \dots, B_l})$,
- study the distribution of $I_l$ (a small simulation sketch follows below).
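
A minimal simulation sketch of this random process (not from the slides), again for the illustrative choice $W = \mathrm{BEC}(\epsilon)$, where the recursion has the closed form $\epsilon \to \epsilon^2$ on a '+' step and $\epsilon \to 2\epsilon - \epsilon^2$ on a '$-$' step:

```python
# Sketch: sample the process I_l = I(W^{B_1,...,B_l}) for W = BEC(eps).
# Each step picks '+' or '-' with probability 1/2.

import random

def sample_I(eps, l, rng):
    for _ in range(l):
        eps = eps * eps if rng.random() < 0.5 else 2 * eps - eps * eps
    return 1.0 - eps

if __name__ == "__main__":
    rng = random.Random(0)
    eps, l, trials = 0.5, 20, 100_000     # illustrative parameters
    samples = [sample_I(eps, l, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    frac_extreme = sum(s < 0.01 or s > 0.99 for s in samples) / trials
    print(f"E[I_l] ~ {mean:.3f}   (should stay near I(W) = {1 - eps})")
    print(f"fraction of samples within 0.01 of 0 or 1: {frac_extreme:.3f}")
```

The printed mean stays near $I(W)$, as the martingale property on the next slide predicts, while the samples themselves pile up near 0 and 1.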

Polarization HowTo

Properties of the process $I_l$:
- $I_0 = I(W)$ is a constant.
- $I_l$ lies in $[0, 1]$, so is bounded.
- Conditional on $B_1, \dots, B_l$, and with $P := W^{B_1, \dots, B_l}$, $I_{l+1}$ can take only two values, $I(P^-)$ and $I(P^+)$, each with probability $\tfrac12$, so
  $E[I_{l+1} \mid B_1, \dots, B_l] = \tfrac{1}{2}[I(P^-) + I(P^+)] = I(P) = I_l$,
and we see that $\{I_l\}$ is a martingale.

Polarization HowTo

Bounded martingales converge almost surely. Thus $I_\infty := \lim_l I_l$ exists with probability one. So we know that the empirical distribution we seek exists; it is the distribution of $I_\infty$, and $E[I_\infty] = I_0 = I(W)$.
Almost sure convergence of $\{I_l\}$ implies $I_{l+1} - I_l \to 0$ a.s., and $|I_{l+1} - I_l| = \tfrac{1}{2}[I(P^+) - I(P^-)]$. But $I(P^+) - I(P^-) < \epsilon$ implies $I(P) \notin (\delta, 1 - \delta)$. (The BEC and BSC turn out to be extremal among all binary input channels.) Thus $I_\infty$ is $\{0, 1\}$-valued!

[Figure: $I(P^+) - I(P^-)$ plotted against $I(P)$; the gap is bounded away from 0 unless $I(P)$ is near 0 or 1.]

Polarization Rate

We have seen that polarization takes place. How fast?
Idea 0: We have seen in the $I(P^+) - I(P^-)$ vs. $I(P)$ graph that the BSC is least polarizing. So we should be able to obtain a lower bound on the speed of polarization by studying the BSC. However, if $P$ is a BSC, $P^+$ and $P^-$ are not. Thus the estimate based on this idea turns out to be too pessimistic.
Instead, we introduce an auxiliary quantity
$Z(P) = \sum_y \sqrt{P(y \mid 0)\, P(y \mid 1)}$
as a companion to $I(P)$.
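
For concreteness (not from the slides), $Z(P)$ is a one-line computation from a transition table; the BSC(0.11) and BEC(0.5) examples below are illustrative assumptions.

```python
# Sketch: the Bhattacharyya parameter Z(P) of a binary-input channel
# given as a transition table P[x][y].

from math import sqrt

def bhattacharyya(P):
    return sum(sqrt(P[0][y] * P[1][y]) for y in P[0])

if __name__ == "__main__":
    delta = 0.11                                  # illustrative: BSC(0.11)
    bsc = {0: {0: 1 - delta, 1: delta}, 1: {0: delta, 1: 1 - delta}}
    eps = 0.5                                     # illustrative: BEC(0.5)
    bec = {0: {0: 1 - eps, "?": eps, 1: 0.0}, 1: {0: 0.0, "?": eps, 1: 1 - eps}}
    print(bhattacharyya(bsc))   # 2*sqrt(delta*(1-delta)) ~ 0.626
    print(bhattacharyya(bec))   # equals eps = 0.5
```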

Polarization Rate

Properties of $Z(P)$:
- $Z(P) \in [0, 1]$.
- $Z(P) \approx 0$ iff $I(P) \approx 1$; $Z(P) \approx 1$ iff $I(P) \approx 0$.
- $Z(P^+) = Z(P)^2$.
- $Z(P^-) \le 2 Z(P)$.

[Figure: the region of achievable pairs $(I(P), Z(P))$.]

Note that $Z(P)$ is the Bhattacharyya upper bound on the probability of error for uncoded transmission over $P$. Thus, we can choose the good indices on the basis of $Z(P)$; the sum of the $Z$'s of the chosen channels will upper bound the block error probability. This suggests studying the polarization rate of $Z$.
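
A small construction sketch (not from the slides): for $W = \mathrm{BEC}(\epsilon)$ the recursion is exact, $Z^+ = Z^2$ and $Z^- = 2Z - Z^2$, so the $Z$'s of all $n = 2^l$ synthesized channels can be enumerated, the $nR$ smallest selected as the good indices, and their sum reported as the block error bound just described. The parameters $\epsilon = 0.5$, $l = 14$, $R = 0.4$ are illustrative assumptions.

```python
# Sketch: Z-based code construction and block error bound for W = BEC(eps),
# where Z(BEC(eps)) = eps and the -/+ recursion on Z is exact.

def bec_bhattacharyyas(eps, l):
    """Z-parameters of the 2^l channels synthesized from BEC(eps)."""
    zs = [eps]
    for _ in range(l):
        zs = [z for p in zs for z in (2*p - p*p, p*p)]   # (Z^-, Z^+) per channel
    return zs

if __name__ == "__main__":
    eps, l, rate = 0.5, 14, 0.4            # illustrative: BEC(0.5), n = 2^14, R = 0.4
    zs = sorted(bec_bhattacharyyas(eps, l))
    k = int(rate * len(zs))                # number of information (good) indices
    print(f"n = 2^{l}, k = {k}, block error bound = {sum(zs[:k]):.3e}")
```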

Polarization Rate

Given a binary input channel $W$, just as for $I_l$, define $Z_l = Z(W^{B_1, \dots, B_l})$. We know that $\Pr(Z_l \to 0) = I(W)$. It turns out that when $Z_l \to 0$ it does so fast:

Theorem. For any $\beta < 1/2$, $\lim_{l \to \infty} \Pr(Z_l < 2^{-2^{\beta l}}) = I(W)$.

This means that for any $\beta < 1/2$, as long as $R < I(W)$ the error probability of polarization codes decays to 0 faster than $2^{-n^\beta}$ (since $n = 2^l$, the threshold $2^{-2^{\beta l}}$ is exactly $2^{-n^\beta}$).
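
A numerical look at the theorem (not from the slides), again for $W = \mathrm{BEC}(\epsilon)$: since $Z_l$ underflows ordinary floating point almost immediately, the sketch tracks $L_l = -\log_2 Z_l$ (a '+' step doubles $L$, a '$-$' step maps $L$ to $L - \log_2(2 - Z)$) and estimates $\Pr(Z_l < 2^{-2^{\beta l}})$ by Monte Carlo. The parameters ($\epsilon = 0.5$, $\beta = 0.4$) are illustrative; for finite $l$ the estimated probability approaches the limit $I(W)$ from below.

```python
# Sketch: Monte-Carlo estimate of Pr[Z_l < 2^(-2^(beta*l))] for W = BEC(eps),
# working with L = -log2(Z) to avoid underflow.

import random
from math import log2

def minus_log2_Z(eps, l, rng):
    """One sample path of L_l = -log2(Z_l) for W = BEC(eps)."""
    L = -log2(eps)
    for _ in range(l):
        if rng.random() < 0.5:
            L = 2 * L                   # '+' step: Z -> Z^2
        else:
            Z = 2.0 ** (-L)             # underflows to 0.0 for huge L, harmlessly
            L -= log2(2 - Z)            # '-' step: Z -> 2Z - Z^2 (exact for the BEC)
    return L

if __name__ == "__main__":
    rng = random.Random(1)
    eps, beta, trials = 0.5, 0.4, 50_000
    for l in (20, 40, 60):
        hits = sum(minus_log2_Z(eps, l, rng) > 2.0 ** (beta * l)
                   for _ in range(trials))
        print(f"l = {l}: Pr[Z_l < 2^(-2^(beta*l))] ~ {hits / trials:.3f}"
              f"   (limit is I(W) = {1 - eps})")
```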

Polarization Rate

Proof idea: The subset of the probability space where $Z_l(\omega) \to 0$ has probability $I(W)$. For any $\epsilon > 0$ there is an $m$ and a subset of this set, with probability at least $I(W) - \epsilon$, on which $Z_l(\omega) < 1/8$ whenever $l > m$. Intersect this set with the high probability event that $B_i = +$ and $B_i = -$ occur with almost equal frequency for $i \le l$. It is then easy to see that $\lim_l P(\log_2 Z_l \le -l) = I(W)$.

Polarization Rate

Proof idea, continued:

[Figure: sketch of $\log_2 Z_l$ versus $l$, bootstrapping over successive phases ending at $l_0$, $l_0 + l_1$, $l_0 + l_1 + l_2$, ..., with $l_1 = l_0$; each phase drives $\log_2 Z_l$ further down.]

Remarks

With the particular one-to-one mapping described here (Hadamard transform) and with successive cancellation decoding, polarization codes
- are $I(W)$ achieving,
- have encoding complexity $n \log n$ (see the sketch below),
- have decoding complexity $n \log n$,
- have probability of error decaying (roughly) like $2^{-\sqrt{n}}$.
However, for small $l$ their performance is surpassed by the usual suspects.
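
For reference (not from the slides), the $n \log n$ encoding complexity comes from a butterfly structure. The sketch below computes $x = u\,F^{\otimes l}$ with $F = \begin{bmatrix}1 & 0\\ 1 & 1\end{bmatrix}$ over $F_2$; the mapping on the slides may additionally include a bit-reversal permutation of the indices, which changes neither the code nor the complexity.

```python
# Sketch: O(n log n) butterfly computing x = u F^{(tensor l)} over F_2, n = 2^l.

def polar_encode(u):
    x = list(u)
    n = len(x)
    assert n and (n & (n - 1)) == 0, "length must be a power of two"
    step = 1
    while step < n:                        # log2(n) stages ...
        for i in range(0, n, 2 * step):
            for j in range(i, i + step):   # ... of n/2 XOR butterflies each
                x[j] ^= x[j + step]
        step *= 2
    return x

if __name__ == "__main__":
    print(polar_encode([1, 0, 1, 1]))      # n = 4 example; for n = 2: (u1^u2, u2)
```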

Remarks

Moreover:
- For symmetric channels the dummy symbols can be chosen to be constant, yielding a deterministic construction; the error probability guarantees are not based on simulation.
- They can be generalized to channels with Galois field input alphabets: a similar two-by-two construction, with the same complexity and error probability bounds.

Open Questions

Many open questions. A few:
- Improve the error probability estimate to include rate dependence.
- Polarization appears to be a general phenomenon; how general? The restriction to Galois fields must be an artifact of the construction technique.
- Polarization based on combining more than 2 channels may yield stronger codes.
- Successive cancellation is too naive; different decoding algorithms should do better.
- Numerical evidence suggests scaling laws for the distribution of $I_l$ and $Z_l$. It would be nice to have them.


Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics Undergraduate Notes in Mathematics Arkansas Tech University Department of Mathematics An Introductory Single Variable Real Analysis: A Learning Approach through Problem Solving Marcel B. Finan c All Rights

More information

1 Limiting distribution for a Markov chain

1 Limiting distribution for a Markov chain Copyright c 2009 by Karl Sigman Limiting distribution for a Markov chain In these Lecture Notes, we shall study the limiting behavior of Markov chains as time n In particular, under suitable easy-to-check

More information

Ri and. i=1. S i N. and. R R i

Ri and. i=1. S i N. and. R R i The subset R of R n is a closed rectangle if there are n non-empty closed intervals {[a 1, b 1 ], [a 2, b 2 ],..., [a n, b n ]} so that R = [a 1, b 1 ] [a 2, b 2 ] [a n, b n ]. The subset R of R n is an

More information

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

More information

3 Some Integer Functions

3 Some Integer Functions 3 Some Integer Functions A Pair of Fundamental Integer Functions The integer function that is the heart of this section is the modulo function. However, before getting to it, let us look at some very simple

More information

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan 3 Binary Operations We are used to addition and multiplication of real numbers. These operations combine two real numbers

More information

Dynamic TCP Acknowledgement: Penalizing Long Delays

Dynamic TCP Acknowledgement: Penalizing Long Delays Dynamic TCP Acknowledgement: Penalizing Long Delays Karousatou Christina Network Algorithms June 8, 2010 Karousatou Christina (Network Algorithms) Dynamic TCP Acknowledgement June 8, 2010 1 / 63 Layout

More information

E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

More information

MODULAR ARITHMETIC KEITH CONRAD

MODULAR ARITHMETIC KEITH CONRAD MODULAR ARITHMETIC KEITH CONRAD. Introduction We will define the notion of congruent integers (with respect to a modulus) and develop some basic ideas of modular arithmetic. Applications of modular arithmetic

More information

2.1 Complexity Classes

2.1 Complexity Classes 15-859(M): Randomized Algorithms Lecturer: Shuchi Chawla Topic: Complexity classes, Identity checking Date: September 15, 2004 Scribe: Andrew Gilpin 2.1 Complexity Classes In this lecture we will look

More information

Coded modulation: What is it?

Coded modulation: What is it? Coded modulation So far: Binary coding Binary modulation Will send R information bits/symbol (spectral efficiency = R) Constant transmission rate: Requires bandwidth expansion by a factor 1/R Until 1976:

More information

Introduction to Arithmetic Coding - Theory and Practice

Introduction to Arithmetic Coding - Theory and Practice Introduction to Arithmetic Coding - Theory and Practice Amir Said Imaging Systems Laboratory HP Laboratories Palo Alto HPL-2004-76 April 21, 2004* entropy coding, compression, complexity This introduction

More information

Gambling and Data Compression

Gambling and Data Compression Gambling and Data Compression Gambling. Horse Race Definition The wealth relative S(X) = b(x)o(x) is the factor by which the gambler s wealth grows if horse X wins the race, where b(x) is the fraction

More information

Review for Calculus Rational Functions, Logarithms & Exponentials

Review for Calculus Rational Functions, Logarithms & Exponentials Definition and Domain of Rational Functions A rational function is defined as the quotient of two polynomial functions. F(x) = P(x) / Q(x) The domain of F is the set of all real numbers except those for

More information

Playback Delay in On-Demand Streaming Communication with Feedback

Playback Delay in On-Demand Streaming Communication with Feedback Playback Delay in On-Demand Streaming Communication with Feedback Kaveh Mahdaviani, Ashish Khisti ECE Dept., University of Toronto Toronto, ON M5S3G4, Canada Email: {kaveh, akhisti}@comm.utoronto.ca Gauri

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 14 Non-Binary Huffman Code and Other Codes In the last class we

More information

P (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i )

P (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i ) Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =

More information

Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections Chapter 9: Hypothesis Testing Sections Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.5 The t Test 9.6 Comparing the Means of Two Normal

More information

Notes on metric spaces

Notes on metric spaces Notes on metric spaces 1 Introduction The purpose of these notes is to quickly review some of the basic concepts from Real Analysis, Metric Spaces and some related results that will be used in this course.

More information

Welcome to... Problem Analysis and Complexity Theory 716.054, 3 VU

Welcome to... Problem Analysis and Complexity Theory 716.054, 3 VU Welcome to... Problem Analysis and Complexity Theory 716.054, 3 VU Birgit Vogtenhuber Institute for Software Technology email: bvogt@ist.tugraz.at office hour: Tuesday 10:30 11:30 slides: http://www.ist.tugraz.at/pact.html

More information

1. Periodic Fourier series. The Fourier expansion of a 2π-periodic function f is:

1. Periodic Fourier series. The Fourier expansion of a 2π-periodic function f is: CONVERGENCE OF FOURIER SERIES 1. Periodic Fourier series. The Fourier expansion of a 2π-periodic function f is: with coefficients given by: a n = 1 π f(x) a 0 2 + a n cos(nx) + b n sin(nx), n 1 f(x) cos(nx)dx

More information

Lecture 2: Universality

Lecture 2: Universality CS 710: Complexity Theory 1/21/2010 Lecture 2: Universality Instructor: Dieter van Melkebeek Scribe: Tyson Williams In this lecture, we introduce the notion of a universal machine, develop efficient universal

More information

ELEC3028 Digital Transmission Overview & Information Theory. Example 1

ELEC3028 Digital Transmission Overview & Information Theory. Example 1 Example. A source emits symbols i, i 6, in the BCD format with probabilities P( i ) as given in Table, at a rate R s = 9.6 kbaud (baud=symbol/second). State (i) the information rate and (ii) the data rate

More information

Lecture 13: Martingales

Lecture 13: Martingales Lecture 13: Martingales 1. Definition of a Martingale 1.1 Filtrations 1.2 Definition of a martingale and its basic properties 1.3 Sums of independent random variables and related models 1.4 Products of

More information