Polarization codes and the rate of polarization
1 Polarization codes and the rate of polarization
Erdal Arıkan, Emre Telatar (Bilkent U., EPFL), Sept 10, 2008
2-5 Channel Polarization
Given a binary-input DMC W, consider i.i.d. uniformly distributed inputs (X_1, ..., X_n) ∈ {0,1}^n, in one-to-one correspondence with binary data (U_1, ..., U_n) ∈ {0,1}^n through a map G_n : F_2^n → F_2^n. Observe that the U_i are also i.i.d., uniform on {0,1}.
[Figure: U^n ∈ F_2^n enters G_n : F_2^n → F_2^n, producing X^n ∈ F_2^n, which is sent over n uses of W to yield Y^n.]
6-9 Channel Polarization
I(U^n; Y^n) = I(X^n; Y^n) = Σ_i I(X_i; Y_i) = Σ_i I(W) = n I(W),
I(U^n; Y^n) = Σ_i I(U_i; Y^n U^{i-1}).
Notation: I(P) denotes the mutual information between the input and output of a channel P when the input is uniformly distributed.
10-20 Channel Polarization
I(U^n; Y^n) = I(X^n; Y^n) = Σ_i I(X_i; Y_i) = n I(W), and I(U^n; Y^n) = Σ_i I(U_i; Y^n U^{i-1}).
[Figures: the fraction #{i : I(X_i; Y_i) ≤ ζ}/n plotted against ζ ∈ [0, 1] is a unit step at I(W), since every I(X_i; Y_i) equals I(W). The corresponding plots of #{i : I(U_i; Y^n U^{i-1}) ≤ ζ}/n are shown for n = 1, 2, 2^4, 2^8, 2^12, 2^20; as n grows, the quantities I(U_i; Y^n U^{i-1}) accumulate near the endpoints 0 and 1.]
Notation: I(P) denotes the mutual information between the input and output of a channel P when the input is uniformly distributed.
21-22 Channel Polarization
We say channel polarization takes place if almost all of the numbers I(U_i; Y^n U^{i-1}) are near the extremal values, i.e., if
(1/n) #{i : I(U_i; Y^n U^{i-1}) ≈ 1} → I(W)   and   (1/n) #{i : I(U_i; Y^n U^{i-1}) ≈ 0} → 1 − I(W).
23-27 Channel Polarization
If polarization takes place and we wish to communicate at rate R:
- Pick n, and k = nR good indices i, i.e., indices for which I(U_i; Y^n U^{i-1}) is high,
- let the transmitter set U_i to uncoded binary data for the good indices, and set U_i to random but publicly known values for the rest,
- let the receiver decode the U_i successively: U_1 from Y^n; U_i from (Y^n, Û^{i-1}).
One would expect this scheme to do well as long as R < I(W).
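As a concrete illustration of the transmitter side, here is a minimal sketch. It assumes the set of good indices and the one-to-one map G_n are supplied from elsewhere (how to obtain them is the subject of the rest of the talk); the block length, index set, and zero frozen values below are hypothetical choices for the example only.

```python
def build_u(data_bits, good_indices, n, frozen_value=0):
    """Place information bits on the good indices; the remaining ('frozen')
    positions carry a publicly known value shared with the receiver."""
    assert len(data_bits) == len(good_indices)
    u = [frozen_value] * n
    for bit, i in zip(data_bits, good_indices):
        u[i] = bit
    return u

def transmit(data_bits, good_indices, n, G_n):
    """Assemble U^n and apply the one-to-one map G_n to get the channel input X^n."""
    return G_n(build_u(data_bits, good_indices, n))

# Hypothetical usage with n = 8 and rate 1/2; the identity map stands in for G_n,
# it is NOT the actual polar transform.
x = transmit([1, 0, 1, 1], good_indices=[3, 5, 6, 7], n=8, G_n=lambda u: u)
print(x)
```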
28-33 Polarization HowTo
Given two copies of a binary-input channel P : F_2 → Y,
[Figure: a 2×2 circuit in which U_1 and U_2 are combined so that X_1 = U_1 + U_2 enters the first copy of P, producing Y_1, and X_2 = U_2 enters the second copy, producing Y_2.]
consider the transformation above to generate two channels P^- : F_2 → Y^2 and P^+ : F_2 → Y^2 × F_2 with
P^-(y_1 y_2 | u_1) = (1/2) Σ_{u_2} P(y_1 | u_1 + u_2) P(y_2 | u_2),
P^+(y_1 y_2 u_1 | u_2) = (1/2) P(y_1 | u_1 + u_2) P(y_2 | u_2).
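A minimal sketch of this splitting step, assuming the base channel is given as a table of transition probabilities P(y | x); the BSC with crossover probability 0.1 used at the end is just a hypothetical example.

```python
from itertools import product

def split(P, outputs):
    """Given P[(y, x)] = P(y | x) for x in {0, 1}, build the transition tables of
    the synthesized channels P-: F2 -> Y^2 and P+: F2 -> Y^2 x F2, following the
    formulas on the slide above."""
    P_minus, P_plus = {}, {}
    for u1, u2 in product((0, 1), repeat=2):
        for y1, y2 in product(outputs, repeat=2):
            w = 0.5 * P[(y1, (u1 + u2) % 2)] * P[(y2, u2)]
            # P-(y1 y2 | u1): the unknown u2 is summed out.
            P_minus[((y1, y2), u1)] = P_minus.get(((y1, y2), u1), 0.0) + w
            # P+(y1 y2 u1 | u2): u1 becomes part of the output.
            P_plus[((y1, y2, u1), u2)] = w
    return P_minus, P_plus

# Example: a BSC with crossover probability 0.1 as the base channel.
p = 0.1
bsc = {(y, x): (1 - p) if y == x else p for y, x in product((0, 1), repeat=2)}
P_minus, P_plus = split(bsc, outputs=(0, 1))

# Sanity check: both tables are proper conditional distributions for each input.
for table in (P_minus, P_plus):
    for u in (0, 1):
        total = sum(pr for (out, uu), pr in table.items() if uu == u)
        assert abs(total - 1.0) < 1e-12
```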
34-37 Polarization HowTo
Observe that
[U_1; U_2] = [1 1; 0 1] [X_1; X_2]   (arithmetic over F_2),
so (U_1, U_2) and (X_1, X_2) determine one another. With independent, uniform U_1, U_2,
I(P^-) = I(U_1; Y_1 Y_2),   I(P^+) = I(U_2; Y_1 Y_2 U_1).
Thus
I(P^-) + I(P^+) = I(U_1 U_2; Y_1 Y_2) = I(X_1; Y_1) + I(X_2; Y_2) = 2 I(P),
and
I(P^-) ≤ I(P) ≤ I(P^+).
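Here is a brute-force numeric check of these identities, assuming the base channel P is a BSC with crossover probability 0.1 (an arbitrary choice; the identities hold for any binary-input DMC).

```python
from itertools import product
from math import log2

p = 0.1                                    # BSC crossover probability (example)
P = {(y, x): (1 - p) if y == x else p      # P(y | x)
     for y, x in product((0, 1), repeat=2)}

# Joint law of (U1, U2, Y1, Y2) with U1, U2 i.i.d. uniform,
# X1 = U1 + U2 (mod 2), X2 = U2, and each Yi drawn from P given Xi.
joint = {(u1, u2, y1, y2): 0.25 * P[(y1, (u1 + u2) % 2)] * P[(y2, u2)]
         for u1, u2, y1, y2 in product((0, 1), repeat=4)}

def mutual_info(pairs):
    """I(A; B) in bits for a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), pr in pairs.items():
        pa[a] = pa.get(a, 0.0) + pr
        pb[b] = pb.get(b, 0.0) + pr
    return sum(pr * log2(pr / (pa[a] * pb[b]))
               for (a, b), pr in pairs.items() if pr > 0)

# I(P-) = I(U1; Y1 Y2): marginalize U2 out of the joint law.
minus = {}
for (u1, u2, y1, y2), pr in joint.items():
    minus[(u1, (y1, y2))] = minus.get((u1, (y1, y2)), 0.0) + pr

# I(P+) = I(U2; Y1 Y2 U1): the partner of U2 is the triple (Y1, Y2, U1).
plus = {(u2, (y1, y2, u1)): pr for (u1, u2, y1, y2), pr in joint.items()}

def h2(x):                                 # binary entropy function
    return -x * log2(x) - (1 - x) * log2(1 - x)

I_P, I_minus, I_plus = 1 - h2(p), mutual_info(minus), mutual_info(plus)
print(I_minus, I_plus)                     # I(P-) < I(P) < I(P+)
assert abs((I_minus + I_plus) - 2 * I_P) < 1e-9 and I_minus <= I_P <= I_plus
```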
38-43 Polarization HowTo
What we can do once, we can do many times. Given W:
- Duplicate W and obtain W^- and W^+.
- Duplicate W^- (and W^+), and obtain W^{--} and W^{-+} (and W^{+-} and W^{++}).
- Duplicate W^{--} (and W^{-+}, W^{+-}, W^{++}) and obtain W^{---} and W^{--+} (and W^{-+-}, W^{-++}, W^{+--}, W^{+-+}, W^{++-}, W^{+++}).
- ...
44-48 Polarization HowTo
We now have a one-to-one transformation G_n between U_1, ..., U_n and X_1, ..., X_n for n = 2^l. Furthermore, the n = 2^l quantities
I(W^{b_1, ..., b_l}),   b_j ∈ {+, −},
are exactly the n quantities
I(U_i; Y^n U^{i-1}),   i = 1, ..., n,
whose empirical distribution we seek. This suggests:
- set B_1, B_2, ... to be i.i.d., {+, −}-valued, uniformly distributed random variables,
- define the random variable I_l = I(W^{B_1, ..., B_l}),
- study the distribution of I_l.
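A small simulation sketch of this random process, assuming W is a binary erasure channel, for which the transform has the standard closed form I(P^-) = I(P)^2 and I(P^+) = 2 I(P) − I(P)^2 (a known BEC fact, not derived in this talk).

```python
import random

def sample_I_l(I0, l, trials=100_000, seed=0):
    """Sample the process I_l = I(W^{B_1,...,B_l}) for W = BEC with I(W) = I0."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        I = I0
        for _ in range(l):
            # '-' branch squares I; '+' branch maps I to 2I - I^2 (BEC closed form).
            I = I * I if rng.random() < 0.5 else 2 * I - I * I
        samples.append(I)
    return samples

samples = sample_I_l(I0=0.5, l=20)
mean = sum(samples) / len(samples)                       # stays near I(W) = 0.5
near_extremes = sum(s < 0.01 or s > 0.99 for s in samples) / len(samples)
print(f"mean = {mean:.3f}, fraction within 0.01 of an endpoint = {near_extremes:.3f}")
```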
49-52 Polarization HowTo
Properties of the process I_l:
- I_0 = I(W) is a constant.
- I_l lies in [0, 1], so the process is bounded.
- Conditional on B_1, ..., B_l, and with P := W^{B_1, ..., B_l}, the variable I_{l+1} can take only the two values I(P^-) and I(P^+), each with probability 1/2, so
E[I_{l+1} | B_1, ..., B_l] = (1/2)[I(P^-) + I(P^+)] = I(P) = I_l,
and we see that {I_l} is a martingale.
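For a concrete check, take W to be a BEC, where (as in the sketch above, a standard BEC fact) I(P^-) = I(P)^2 and I(P^+) = 2 I(P) − I(P)^2; then (1/2)[I(P^-) + I(P^+)] = (1/2)[I(P)^2 + 2 I(P) − I(P)^2] = I(P), which is exactly the martingale identity.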
53-57 Polarization HowTo
Bounded martingales converge almost surely. Thus I_∞ := lim_{l→∞} I_l exists with probability one. So we know that the empirical distribution we seek exists (it is the distribution of I_∞), and E[I_∞] = I_0 = I(W).
Almost sure convergence of {I_l} implies I_{l+1} − I_l → 0 a.s., and
|I_{l+1} − I_l| = (1/2)[I(P^+) − I(P^-)].
But I(P^+) − I(P^-) < ε implies I(P) ∉ (δ, 1 − δ), i.e., I(P) must be near 0 or 1. (The BEC and BSC turn out to be extremal among all binary-input channels.) Thus I_∞ is {0, 1}-valued!
[Figure: I(P^+) − I(P^-) plotted against I(P); the gap is bounded away from 0 whenever I(P) is bounded away from 0 and 1.]
58-61 Polarization Rate
We have seen that polarization takes place. How fast?
Idea 0: We have seen in the I(P^+) − I(P^-) vs. I(P) graph that the BSC is the least polarizing channel. So we should be able to obtain a lower bound on the speed of polarization by studying the BSC. However, if P is a BSC, P^+ and P^- are not BSCs, and the estimate based on this idea turns out to be too pessimistic.
Instead, we introduce an auxiliary quantity
Z(P) = Σ_y √(P(y | 0) P(y | 1))
as a companion to I(P).
62-67 Polarization Rate
Properties of Z(P):
- Z(P) ∈ [0, 1].
- Z(P) ≈ 0 iff I(P) ≈ 1, and Z(P) ≈ 1 iff I(P) ≈ 0.
- Z(P^+) = Z(P)^2.
- Z(P^-) ≤ 2 Z(P).
[Figure: Z(P) versus I(P).]
Note that Z(P) is the Bhattacharyya upper bound on the probability of error for uncoded transmission over P. Thus we can choose the good indices on the basis of Z(P); the sum of the Z's of the chosen channels then upper-bounds the block error probability. This suggests studying the polarization rate of Z.
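A sketch of this index-selection idea, tracking an upper bound on Z for each synthesized channel with the two rules above (Z → Z^2 on a '+' step, Z → 2Z on a '-' step, capped at 1). The starting value Z(W) = 0.3, block length 2^10, and rate 0.3 are hypothetical, and the index order is simply this sketch's enumeration of sign sequences, not necessarily a particular decoder's bit ordering.

```python
def z_upper_bounds(z0, l):
    """Upper bounds on Z for the 2^l synthesized channels after l splitting steps,
    using Z(P+) = Z(P)^2 and Z(P-) <= 2 Z(P) (capped at 1, since Z lies in [0, 1])."""
    zs = [z0]
    for _ in range(l):
        zs = [min(1.0, 2 * z) for z in zs] + [z * z for z in zs]
    return zs

def choose_good_indices(z0, l, rate):
    """Pick the k = rate * 2^l channels with the smallest Z bound; the sum of their
    Z bounds upper-bounds the block error probability of the scheme."""
    zs = z_upper_bounds(z0, l)
    k = int(rate * len(zs))
    good = sorted(range(len(zs)), key=zs.__getitem__)[:k]
    return good, sum(zs[i] for i in good)

good, block_error_bound = choose_good_indices(z0=0.3, l=10, rate=0.3)
print(len(good), block_error_bound)
```

The generic bound Z(P^-) ≤ 2Z(P) is loose; for the BEC, where the exact recursion Z(P^-) = 2Z(P) − Z(P)^2 is available, the same procedure gives tighter numbers.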
68-72 Polarization Rate
Given a binary-input channel W, just as for I_l, define Z_l = Z(W^{B_1, ..., B_l}). We know that Pr(Z_l → 0) = I(W). It turns out that when Z_l → 0, it does so fast:
Theorem. For any β < 1/2,
lim_{l→∞} Pr(Z_l < 2^{−2^{βl}}) = I(W).
This means that for any β < 1/2, as long as R < I(W), the error probability of polarization codes decays to 0 faster than 2^{−n^β}.
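A quick numeric illustration of the theorem's statement, again assuming W is a BEC so that the exact recursions Z(P^+) = Z(P)^2 and Z(P^-) = 2 Z(P) − Z(P)^2 apply (standard BEC facts, not derived in this talk); with I(W) = 0.5 the estimated probability should approach 0.5 as l grows.

```python
import random

def prob_Z_below_threshold(z0, l, beta, trials=50_000, seed=1):
    """Monte Carlo estimate of Pr(Z_l < 2^(-2^(beta*l))) for W = BEC with Z(W) = z0."""
    rng = random.Random(seed)
    # Note: the threshold underflows double precision once beta*l gets large.
    threshold = 2.0 ** -(2.0 ** (beta * l))
    hits = 0
    for _ in range(trials):
        z = z0
        for _ in range(l):
            # '+' branch squares Z; '-' branch maps Z to 2Z - Z^2 (BEC closed form).
            z = z * z if rng.random() < 0.5 else 2 * z - z * z
        hits += z < threshold
    return hits / trials

for l in (10, 16, 20):
    print(l, prob_Z_below_threshold(z0=0.5, l=l, beta=0.45))
```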
73-75 Polarization Rate: Proof idea
The subset of the probability space where Z_l(ω) → 0 has probability I(W). For any ε > 0 there is an m and a subset of this set, of probability at least I(W) − ε, on which Z_l(ω) < 1/8 whenever l > m. Intersect this set with a high-probability event: that B_i = + and B_i = − occur with almost equal frequency for i ≤ l. It is then easy to see that lim_{l→∞} P(log_2 Z_l ≤ −l) = I(W).
76-80 Polarization Rate: Proof idea (cont.)
[Figures: sketches of log_2 Z_l as a function of l, with the horizontal axis marked at l_0, l_0 + l_1, l_0 + l_1 + l_2, ...; the length of each stage is determined by the previous ones, illustrating how the bound on log_2 Z_l is bootstrapped stage by stage.]
81-86 Remarks
With the particular one-to-one mapping described here (the Hadamard transform) and with successive cancellation decoding, polarization codes
- are I(W)-achieving,
- have encoding complexity n log n,
- have decoding complexity n log n,
- have probability of error decaying roughly like 2^{−√n}.
However, for small l their performance is surpassed by the usual suspects.
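A sketch of why encoding takes n log n operations: the transform factors into log_2 n stages of n/2 XORs each. This minimal version assumes the plain recursive ordering of the 2×2 kernel (an actual implementation may additionally apply a bit-reversal permutation of the indices, which does not affect the complexity).

```python
def encode(u):
    """Map U^n to X^n for n = 2^l by applying the 2x2 step (x1, x2) = (u1 + u2, u2)
    recursively; total work is O(n log n) XORs."""
    n = len(u)
    if n == 1:
        return list(u)
    half = n // 2
    # Combine the two halves with the kernel, then encode each half recursively.
    left = encode([a ^ b for a, b in zip(u[:half], u[half:])])
    right = encode(u[half:])
    return left + right

print(encode([1, 0, 1, 1]))   # a length-4 example
```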
87-89 Remarks
Moreover:
- For symmetric channels the dummy symbols can be chosen to be constant, yielding a deterministic construction; the error probability guarantees are not based on simulation.
- The construction can be generalized to channels with Galois-field input alphabets: a similar two-by-two construction, with the same complexity and error probability bounds.
90-96 Open Questions
Many open questions. A few:
- Improve the error probability estimate to include rate dependence.
- Polarization appears to be a general phenomenon; how general? The restriction to Galois fields must be an artifact of the construction technique.
- Polarization based on combining more than 2 channels may yield stronger codes.
- Successive cancellation is too naive; different decoding algorithms should do better.
- Numerical evidence suggests scaling laws for the distributions of I_l and Z_l. It would be nice to have them.