Introduction to Probability
EE 179, Lecture 15, Handout #24

Probability theory gives a mathematical characterization of experiments with random outcomes:
- coin toss
- life of a lightbulb
- binary data sequence
- Brownian motion

An event is a set of outcomes belonging to a sample space. Events must be repeatable and exhibit statistical regularity, i.e., over a large number of experiments the pattern of outcomes settles down. We define the probability of an event A as the limiting fraction of trials whose outcome belongs to A:

  P(A) = lim_{n→∞} (number of times outcome is in A) / n

Examples:
  P(roulette wheel outcome is red) = 18/38
  P(rain tomorrow) ≈ 57/365 (bogus)

EE 179, May 2, 2014                  Lecture 15, Page 1
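The limiting-relative-frequency definition can be illustrated with a short simulation (a sketch; the helper name and the choice of 100,000 spins are ours, not from the slides):

```python
import random

random.seed(0)

def relative_frequency_of_red(n):
    """Simulate n spins of an American roulette wheel (38 pockets, 18 red)
    and return the fraction of spins that land on red."""
    hits = sum(1 for _ in range(n) if random.randrange(38) < 18)
    return hits / n

estimate = relative_frequency_of_red(100_000)
exact = 18 / 38
```

For large n the estimate clusters tightly around the exact value 18/38 ≈ 0.4737, which is the statistical regularity the definition relies on.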
Mathematics of Probability: Axiomatic Approach

Random events are defined on a probability space:
- a sample space S of possible outcomes (finite or infinite)
- a family (set) of events {A_i} that are subsets of S
- a probability measure P(·) on events

The probability measure has three properties:
- P(A) ≥ 0
- P(S) = 1
- if A_i ∩ A_j is empty for i ≠ j, then P(∪_i A_i) = Σ_i P(A_i)

We formally write the probability space as the triple (S, {A_i}, P(·)). A very common notation for the sample space is Ω, and a generic outcome is ω.

This axiomatic approach was introduced by Kolmogorov around 1930. Probability has been used for thousands of years. Proverbs 16:33: "We may throw the dice, but the Lord determines how they fall."
Random Variables

Practical definition of a random variable: the numerical output of a probabilistic experiment.
- coin flip (tails = 0, heads = 1) or sum of two dice (2, 3, ..., 12)
- amount of snowfall at a location over a duration
- noise voltage at an instant, or the integral of noise over an interval

Mathematical definition: a real-valued function defined on the sample space of a probability space:

  X : Ω → R,  X(ω) ∈ R for every ω ∈ Ω

Examples:
- The sample space for a toss of two dice is {(i, j) : 1 ≤ i, j ≤ 6}. The sum is the r.v. i + j.
- For the BSC, the input is the r.v. X = x and the output is Y = y. We have derived the joint probability distribution for X and Y.

If the values of a r.v. are discrete, the r.v. is called discrete. Otherwise the r.v. is continuous or mixed.
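The "function on the sample space" view can be made concrete for the two-dice sum (an illustrative sketch; the variable names are ours):

```python
from itertools import product

# Sample space for a toss of two dice: all pairs (i, j) with 1 <= i, j <= 6.
omega = list(product(range(1, 7), repeat=2))

# The sum is a real-valued function on the sample space, i.e. a random variable.
X = {outcome: outcome[0] + outcome[1] for outcome in omega}

# Induced pmf of the sum: each of the 36 outcomes is equally likely.
pmf = {}
for outcome, value in X.items():
    pmf[value] = pmf.get(value, 0) + 1 / 36
```

Note that the random variable itself is deterministic as a function; the randomness comes entirely from which outcome ω the experiment produces.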
Probability Mass Function

The probability distribution of a discrete random variable is completely described by its probability mass function (pmf or PMF):

  P{X = x_k} = p_X(x_k), where the values of X are {x_k}

In this special case of the axioms of probability,

  p_X(x_k) ≥ 0,  Σ_k p_X(x_k) = 1

Important discrete random variables:
- Bernoulli: Ω = {0, 1}, p(1) = p, p(0) = 1 − p
- Binomial: S_n = Σ_{k=1}^{n} X_k, where the X_k are independent Bernoulli
- Geometric: p(n) = (1 − p)^{n−1} p for n = 1, 2, ... and 0 ≤ p ≤ 1
- Poisson: p(n) = (λ^n / n!) e^{−λ}, where n ≥ 0 and λ ≥ 0

Solving problems about discrete r.v.s usually requires manipulating sums (combinatorics).
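As a sanity check, the listed pmfs can be verified to sum to 1 (a sketch; the parameter values p = 0.3, λ = 2, n = 10 and the truncation lengths are arbitrary choices):

```python
import math

p, lam = 0.3, 2.0  # hypothetical parameter choices

# Bernoulli: only two outcomes.
bernoulli = (1 - p) + p

# Binomial with n = 10: sum of C(n, k) p^k (1-p)^(n-k) over k = 0..n.
binomial = sum(math.comb(10, k) * p**k * (1 - p)**(10 - k) for k in range(11))

# Geometric: p(n) = (1-p)^(n-1) p; the tail beyond 200 terms is negligible.
geometric = sum((1 - p) ** (n - 1) * p for n in range(1, 200))

# Poisson: p(n) = lam^n / n! * e^(-lam); the tail beyond 60 terms is negligible.
poisson = sum(lam**n / math.factorial(n) * math.exp(-lam) for n in range(60))
```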
Cumulative Distribution Function

For continuous random variables p(x) = 0 for all x, so we cannot use a pmf. The cumulative distribution function (cdf or CDF) can describe both discrete and continuous r.v.s. The CDF of a real-valued r.v. X is defined by

  F_X(x) = P{X ≤ x},  −∞ < x < ∞

Properties of the CDF:
- Monotone: if x_1 ≤ x_2 then F(x_1) ≤ F(x_2)
- Limits: F(−∞) = lim_{x→−∞} F(x) = 0,  F(+∞) = lim_{x→+∞} F(x) = 1
- Interval: P{a < X ≤ b} = P{X ≤ b} − P{X ≤ a} = F(b) − F(a)
- Point: P{X = x} = P{X ≤ x} − P{X < x} = F(x) − F(x⁻), where F(x⁻) = lim_{u↑x} F(u)

A random variable is continuous if its cdf F(x) is continuous for every x. Another definition of the CDF is P{X < x}; this convention is used by Russian mathematicians.
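The monotone, limit, and interval properties can be spot-checked against a concrete CDF (a sketch using F(x) = 1 − e^{−λx}, the exponential CDF; λ = 1.5 is an arbitrary choice):

```python
import math

lam = 1.5  # arbitrary rate parameter for the example

def F(x):
    """CDF of an exponential random variable with rate lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# Interval property: P{a < X <= b} = F(b) - F(a).
a, b = 0.5, 2.0
p_interval = F(b) - F(a)

# Monotonicity on this pair of points.
monotone = F(a) <= F(b)
```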
Types of CDFs

- The CDF of any discrete r.v. is an increasing staircase function.
- The CDF of a continuous r.v. is a smooth nondecreasing function.
- The CDF of a mixed r.v. is continuous between jumps; P{X = x} > 0 for some x.

Here nondecreasing means increasing but not necessarily strictly increasing.
Probability Density Function

If X is a continuous r.v., then

  P([x_1, x_2]) = P{x_1 ≤ X ≤ x_2} = F_X(x_2) − F_X(x_1)

If F_X(x) is differentiable, then

  F_X(x_2) − F_X(x_1) = ∫_{x_1}^{x_2} p_X(u) du,  where p_X(x) = dF_X(x)/dx

We call p_X(x) the probability density function (pdf or PDF) of X; p_X(x) is the probability per unit width of a narrow interval around x.
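The relation p_X(x) = dF_X/dx can be verified numerically with a central difference (a sketch, assuming a unit-rate exponential so both F and its derivative have closed forms):

```python
import math

def F(x):
    """CDF of a unit-rate exponential r.v.: F(x) = 1 - e^(-x) for x >= 0."""
    return 1.0 - math.exp(-x)

def pdf(x):
    """Its density: p(x) = e^(-x) for x >= 0."""
    return math.exp(-x)

# Central-difference approximation to dF/dx at an interior point.
x, h = 0.7, 1e-6
numeric_derivative = (F(x + h) - F(x - h)) / (2 * h)
```

The central difference agrees with the closed-form density to many decimal places, which is exactly the "probability per unit width" interpretation.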
Properties of the PDF

- Nonnegative: p(x) ≥ 0, since F(x) is nondecreasing.
- The CDF is the antiderivative of the PDF:

    F(x) = ∫_{−∞}^{x} p(u) du

- Impulses: if P{X = x_0} = p_0 > 0, then p_X(x) contains the impulse p_0 δ(x − x_0).
- Mixed r.v.: if F(x) is differentiable except at the discrete points {x_k}, then

    p(x) = p_c(x) + Σ_k p_k δ(x − x_k),

  where p_c(x) is a nonnegative continuous function and ∫ p_c(x) dx = 1 − Σ_k p_k.

Most books use f_X(x) for the pdf and p_X(x) for the pmf.
Statistics of Random Variables

The complete description of a random variable is its CDF, which specifies the probabilities of all intervals, e.g., X > x_0. To compare two r.v.s we often need single numbers (statistics) associated with each random variable. The most common statistics are:

- Mean (average, expected value):

    X̄ = E(X) = ∫_{−∞}^{∞} x p(x) dx  or  Σ_{n=−∞}^{∞} x_n p(x_n)

- Second moment:

    E(X²) = ∫_{−∞}^{∞} x² p(x) dx  or  Σ_{n=−∞}^{∞} x_n² p(x_n)

- Variance:

    Var(X) = E((X − X̄)²) = E(X²) − (E(X))²

- Median: the value X_med satisfying P{X < X_med} = P{X > X_med}
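These statistics are easy to compute for a concrete discrete r.v., e.g. the sum of two fair dice (a sketch; (6 − |v − 7|)/36 is the standard triangular pmf for the dice sum):

```python
# pmf of the sum of two fair dice, values 2..12.
pmf = {v: (6 - abs(v - 7)) / 36 for v in range(2, 13)}

# Mean and second moment as sums over the pmf.
mean = sum(v * p for v, p in pmf.items())
second_moment = sum(v * v * p for v, p in pmf.items())

# Variance via E(X^2) - (E(X))^2.
variance = second_moment - mean ** 2
```

For this symmetric pmf the mean is 7, which is also the median, and the variance works out to 35/6.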
Examples of Continuous Random Variables

A uniform random variable has a constant density on an interval. We write X ~ Unif[a, b] if p_X(x) is constant on [a, b] and 0 elsewhere:

  p_X(x; a, b) = 1/(b − a) for a ≤ x ≤ b,  0 for x < a or x > b

Examples of uniform random variables: the final position of a roulette wheel, or quantization error.

  E(X) = ∫_a^b x/(b − a) dx = x²/(2(b − a)) |_a^b = (b² − a²)/(2(b − a)) = (b + a)/2

  E(X²) = ∫_a^b x²/(b − a) dx = x³/(3(b − a)) |_a^b = (b³ − a³)/(3(b − a)) = (b² + ba + a²)/3

  Var(X) = (b² + ba + a²)/3 − (b² + 2ba + a²)/4 = (b² − 2ba + a²)/12 = (b − a)²/12
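The closed-form mean (b + a)/2 and variance (b − a)²/12 can be checked by Monte Carlo (a sketch; the interval [2, 5], the seed, and the sample size are arbitrary):

```python
import random

random.seed(1)
a, b = 2.0, 5.0
n = 200_000
samples = [random.uniform(a, b) for _ in range(n)]

# Sample mean and (biased) sample variance.
sample_mean = sum(samples) / n
sample_var = sum((s - sample_mean) ** 2 for s in samples) / n

exact_mean = (b + a) / 2       # 3.5 for [2, 5]
exact_var = (b - a) ** 2 / 12  # 0.75 for [2, 5]
```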
Examples of Continuous Random Variables (cont.)

An exponential random variable has one parameter λ:

  f(x; λ) = λ e^{−λx} for x ≥ 0,  0 for x < 0

The CDF for x ≥ 0 is

  F(x; λ) = ∫_0^x λ e^{−λu} du = −e^{−λu} |_0^x = 1 − e^{−λx}

The mean is

  ∫_0^∞ x λ e^{−λx} dx = [−x e^{−λx}]_0^∞ + ∫_0^∞ e^{−λx} dx = 0 + 1/λ = 1/λ

The variance (integrating by parts twice) is

  ∫_0^∞ (x − 1/λ)² λ e^{−λx} dx = 1/λ²
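The mean 1/λ and variance 1/λ² can be confirmed with a simple midpoint-rule integration (a sketch; λ = 0.8 and the truncation T = 50 are arbitrary, with T chosen so the neglected tail is negligible):

```python
import math

lam = 0.8  # arbitrary rate parameter

# Midpoint-rule Riemann sum over [0, T].
T, steps = 50.0, 200_000
dx = T / steps
xs = [(k + 0.5) * dx for k in range(steps)]

mean = sum(x * lam * math.exp(-lam * x) for x in xs) * dx
var = sum((x - 1 / lam) ** 2 * lam * math.exp(-lam * x) for x in xs) * dx
```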
Examples of Continuous Random Variables (cont.)

A Gaussian random variable has two parameters, µ and σ. Its pdf is

  N(x; µ, σ²) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²))

The Gaussian PDF is centered at x = µ and has its maximum value there. The mean is µ (obvious by symmetry) and the variance is σ². The inflection points of the density graph are at µ ± σ. The density decreases faster than exponentially as x → ±∞.
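The stated shape properties (peak at x = µ, symmetry about µ, inflection points at µ ± σ) can be checked numerically (a sketch with arbitrary µ = 1, σ = 2):

```python
import math

mu, sigma = 1.0, 2.0  # arbitrary parameters

def gauss(x):
    """Gaussian pdf N(x; mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

peak = gauss(mu)  # maximum value 1 / sqrt(2 pi sigma^2)
symmetric = abs(gauss(mu + 0.7) - gauss(mu - 0.7)) < 1e-12

def second_derivative(x, h=1e-4):
    """Finite-difference second derivative of the pdf."""
    return (gauss(x + h) - 2 * gauss(x) + gauss(x - h)) / h ** 2
```

The second derivative is negative at µ (a maximum) and positive outside µ ± σ, confirming the inflection points.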
Joint Random Variables

In communication systems we usually have two random signals defined on the same sample space:
- transmitted signal x(t)
- received signal y(t)

For times t_1 and t_2, the values x(t_1) and y(t_2) are joint random variables. Joint r.v.s are characterized by a joint CDF:

  F_XY(x, y) = P{X ≤ x, Y ≤ y} = P{lower-left quadrant bounded by (x, y)}

If X and Y are jointly continuous, their joint PDF is given by

  p_XY(x, y) = ∂²F_XY(x, y) / ∂x ∂y
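The mixed-partial relation between joint CDF and joint PDF can be verified numerically; here we assume two independent exponential r.v.s purely to get a closed-form F_XY (the rates are arbitrary):

```python
import math

l1, l2 = 1.0, 2.0  # arbitrary rates

def F(x, y):
    """Joint CDF of independent exponentials: product of marginal CDFs."""
    return (1 - math.exp(-l1 * x)) * (1 - math.exp(-l2 * y)) if x >= 0 and y >= 0 else 0.0

def pdf(x, y):
    """Closed-form joint density for the same pair."""
    return l1 * math.exp(-l1 * x) * l2 * math.exp(-l2 * y)

# Central-difference approximation to the mixed partial d^2 F / dx dy.
x0, y0, h = 0.5, 0.3, 1e-4
mixed = (F(x0 + h, y0 + h) - F(x0 + h, y0 - h)
         - F(x0 - h, y0 + h) + F(x0 - h, y0 - h)) / (4 * h * h)
```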
Properties of Joint PDF

- Nonnegative: p_XY(x, y) ≥ 0, so every rectangle has nonnegative probability:

    P{(X, Y) ∈ [a, b] × [c, d]} = F(b, d) − F(a, d) − F(b, c) + F(a, c) ≥ 0

- Normalization:

    ∫∫ p(x, y) dx dy = 1
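The rectangle formula can be checked with independent Unif[0, 1] r.v.s (an assumed concrete example, chosen because F_XY(x, y) = xy on the unit square; the rectangle corners are arbitrary):

```python
def F(x, y):
    """Joint CDF of independent Unif[0,1] r.v.s: clamp each coordinate to [0,1]."""
    clamp = lambda t: max(0.0, min(1.0, t))
    return clamp(x) * clamp(y)

# P{(X,Y) in [a,b] x [c,d]} via the inclusion-exclusion of CDF corner values.
a, b, c, d = 0.2, 0.7, 0.1, 0.4
rect = F(b, d) - F(a, d) - F(b, c) + F(a, c)

# For independent uniforms this equals the rectangle's area.
exact = (b - a) * (d - c)
```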