Topic 2: Scalar random variables. Definition of random variables

Topic 2: Scalar random variables
- Discrete and continuous random variables
- Probability distributions and densities (cdf, pmf, pdf)
- Important random variables
- Expectation, mean, variance, moments
- Markov and Chebyshev inequalities
- Testing the fit of a distribution to data

ES150 Harvard SEAS 1

Definition of random variables

A random variable is a function that assigns a real number, X(s), to each outcome s in a sample space Ω.
- Ω is the domain of the random variable.
- The set R_X of all values of X is its range, R_X ⊆ R.
- The notation {X ≤ x} denotes the subset of Ω consisting of all outcomes s such that X(s) ≤ x. Similarly for ≥, = and <.

To qualify as a random variable, the function must satisfy two conditions:
- The set {X ≤ x} is an event for every x.
- The events {X = ∞} and {X = −∞} have zero probability:
  P{X = ∞} = P{X = −∞} = 0
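The definition above is concrete enough to sketch in code. A minimal illustration, with an invented sample space (a single die roll) and an indicator-style random variable — both choices are purely illustrative:

```python
# Sample space for a single die roll: outcomes s are the faces 1..6.
omega = [1, 2, 3, 4, 5, 6]

# A random variable X assigns a real number to each outcome s.
# Here X(s) = 1 if the face is even, 0 otherwise (an indicator variable).
def X(s):
    return 1.0 if s % 2 == 0 else 0.0

# The range R_X is the set of all values X takes on Omega.
range_X = sorted({X(s) for s in omega})

# The event {X <= x} is the subset of outcomes s with X(s) <= x.
def event_le(x):
    return {s for s in omega if X(s) <= x}

print(range_X)        # the range here is {0.0, 1.0}
print(event_le(0.5))  # the odd faces: X(s) = 0 <= 0.5 only for them
```

Note that {X ≤ x} is a subset of Ω (a set of outcomes), not a set of values — the event notation hides the function X(s) inside it.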

Random variables

A random variable X(s): Ω → R can be discrete, continuous, or of mixed type.
- Discrete: the range R_X is discrete; it can be either finite or countably infinite, R_X = {x1, x2, ...}. The sample space Ω itself can be discrete, continuous, or a mixture of both. X(s) partitions Ω into the sets S_i = {s ∈ Ω : X(s) = x_i}.
- Continuous: the range is continuous. The sample space must also be continuous.
- Mixed type: the range is a combination of discrete values and continuous regions.

Distribution function

The distribution function of a random variable gives the probability of an event described by the random variable. It is defined as
  F_X(x) = P{X ≤ x}

Properties of F_X(x):
- 0 ≤ F_X(x) ≤ 1
- F_X(∞) = 1 and F_X(−∞) = 0
- It is a non-decreasing function of x: x1 < x2 ⟹ F_X(x1) ≤ F_X(x2)
- It is continuous from the right: F_X(x⁺) = lim_{ε→0} F_X(x + ε) = F_X(x)
- P{X > x} = 1 − F_X(x)
- P{X = x} = F_X(x) − F_X(x⁻)
- P{x1 < X ≤ x2} = F_X(x2) − F_X(x1)
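These cdf properties can be checked numerically for a concrete distribution. A small sketch using the exponential cdf F(x) = 1 − e^{−λx} (the rate λ = 2 is an arbitrary illustrative choice):

```python
import math

# CDF of an exponential random variable with rate lam (illustrative choice).
def F(x, lam=2.0):
    return 0.0 if x < 0 else 1.0 - math.exp(-lam * x)

# Non-decreasing: x1 < x2 implies F(x1) <= F(x2).
xs = [i * 0.1 for i in range(-10, 50)]
assert all(F(a) <= F(b) for a, b in zip(xs, xs[1:]))

# Limits: F(-inf) = 0 and F(+inf) = 1, checked at large finite points.
print(F(-100.0), F(100.0))

# Interval probability: P{x1 < X <= x2} = F(x2) - F(x1).
p = F(2.0) - F(1.0)
print(round(p, 6))  # e^{-2} - e^{-4}, about 0.11702
```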

The distribution function for different types of random variables
- Discrete: F_X(x) is a staircase function of x with jumps at a countable set of points {x0, x1, ...}:
  F_X(x) = Σ_k p_X(x_k) u(x − x_k)
  where p_X(x_k) is the probability of {X = x_k} and u(·) is the unit step function.
- Continuous: F_X(x) is continuous everywhere and can be written as an integral of a non-negative function:
  F_X(x) = ∫_{−∞}^{x} f_X(t) dt
  Continuity implies that at any point x, P{X = x} = F_X(x⁺) − F_X(x⁻) = 0.
- Mixed: F_X(x) has jumps on a countable set of points but is also continuous on at least one interval.

We will mostly study discrete and continuous random variables.

Discrete random variables - pmf

A discrete random variable is completely specified by its probability mass function (pmf)
  p_X(x) = P{X = x} for x ∈ R_X
- p_X(x) ≥ 0 for any x ∈ R_X
- Σ_k p_X(x_k) = 1, summing over all x_k ∈ R_X
- For any set A: P(X ∈ A) = Σ_k p_X(x_k), summing over all x_k ∈ A ∩ R_X

We write X ~ p_X(x), or simply X ~ p(x), to denote a discrete random variable X with pmf p_X(x).
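The pmf axioms and the rule P(X ∈ A) = Σ_{x_k ∈ A ∩ R_X} p_X(x_k) can be illustrated with a fair die; exact rational arithmetic makes the checks exact (the pmf and the set A are illustrative):

```python
from fractions import Fraction

# pmf of a fair die: p_X(x) = 1/6 for x in 1..6, exact via Fraction.
p = {x: Fraction(1, 6) for x in range(1, 7)}

# Axioms: non-negative, and the probabilities sum to 1 over the range R_X.
assert all(v >= 0 for v in p.values())
assert sum(p.values()) == 1

# P(X in A): sum p_X(x_k) over x_k in A intersect R_X.
A = {2, 4, 6, 8}  # 8 is outside the range R_X and contributes nothing
prob_A = sum(p.get(x, Fraction(0)) for x in A)
print(prob_A)     # 1/2
```

The point of intersecting A with R_X shows up in the `p.get(x, Fraction(0))` lookup: values outside the range simply contribute zero mass.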

Some important discrete random variables

- Bernoulli: the success or failure of an experiment (a Bernoulli trial).
  p_X(1) = p,  p_X(0) = 1 − p
  Example: flipping a biased coin.

- Binomial: the number of successes in a sequence of n independent Bernoulli trials.
  p_X(k) = (n choose k) p^k (1 − p)^(n−k) for k = 0, ..., n
  Example: the number of heads in n independent coin flips.

- Geometric: the number of trials until the first success.
  p_X(k) = (1 − p)^(k−1) p for k = 1, 2, ...
  The geometric probabilities are strictly decreasing in k.
  Example: the number of coin flips until the first head shows up.

- Poisson: the number of occurrences of an event within a certain time period or region of space.
  p_X(k) = (α^k / k!) e^{−α} for k = 0, 1, 2, ...
  where α ∈ R⁺ is the average number of occurrences.

The Poisson probabilities can approximate the binomial probabilities: if n is large and p is small, then with α = np,
  (n choose k) p^k (1 − p)^(n−k) ≈ (α^k / k!) e^{−α}
The approximation becomes exact in the limit n → ∞ with α = np held fixed.
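The binomial-to-Poisson approximation is easy to check numerically. A sketch with illustrative parameters n = 1000, p = 0.002 (so α = np = 2):

```python
import math

# Binomial pmf: (n choose k) p^k (1-p)^(n-k)
def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Poisson pmf with parameter alpha
def poisson_pmf(k, alpha):
    return alpha**k / math.factorial(k) * math.exp(-alpha)

# Large n, small p: the Poisson pmf approximates the binomial pmf.
n, p = 1000, 0.002   # illustrative; alpha = n*p = 2
alpha = n * p
for k in range(5):
    print(k, round(binom_pmf(k, n, p), 5), round(poisson_pmf(k, alpha), 5))
```

Printing the two columns side by side shows agreement to roughly three decimal places already at n = 1000; increasing n with α fixed tightens the match, as the limit statement promises.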

Continuous random variables - pdf

A continuous random variable is completely specified by its probability density function (pdf), a non-negative function f_X such that
  F_X(x) = ∫_{−∞}^{x} f_X(t) dt

Properties of f_X(x):
- f_X(x) = dF_X(x)/dx
- f_X(x) ≥ 0 for all x ∈ R
- ∫_{−∞}^{∞} f_X(x) dx = 1
- P{X ∈ A} = ∫_A f_X(x) dx for any event A ⊆ R
- P{x1 < X ≤ x2} = ∫_{x1}^{x2} f_X(x) dx

However, f_X(x) should not be interpreted as the probability at X = x. In fact, f_X is not a probability measure, since it can exceed 1.

Some important continuous random variables

- Uniform, X ~ U[a, b]:
  f_X(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise
  Example: a wireless signal x(t) = A cos(ωt + θ) has phase θ ~ U[−π, π] because of random scattering.

- Exponential, X ~ exp(λ):
  f_X(x) = λ e^{−λx}, λ > 0, x ≥ 0
  Examples: the arrival times of packets at an internet router and cell-phone call durations can be modeled as exponential random variables.

- Gaussian (normal), X ~ N(µ, σ²):
  f_X(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)), σ > 0, −∞ < x < ∞
  When µ = 0 and σ = 1, f(x) is called the standard Gaussian density.
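The defining relation F_X(x) = ∫ f_X(t) dt can be verified numerically for the exponential pdf, whose cdf has the closed form 1 − e^{−λx}. A sketch with an arbitrary rate λ = 1.5 and a simple trapezoid-rule integration:

```python
import math

lam = 1.5  # rate parameter, illustrative choice

def f(x):  # exponential pdf: lam * e^{-lam x} for x >= 0
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

# F_X(x) as the integral of f from 0 to x (pdf is 0 below 0),
# approximated by the trapezoid rule.
def F_numeric(x, steps=200000):
    h = x / steps
    total = 0.5 * (f(0.0) + f(x)) + sum(f(i * h) for i in range(1, steps))
    return total * h

# Compare with the closed-form CDF 1 - exp(-lam x).
x = 2.0
print(round(F_numeric(x), 6), round(1.0 - math.exp(-lam * x), 6))
```

The two printed values agree to the displayed precision, which is exactly the statement that the pdf integrates to the cdf.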

The Gaussian distribution is very important and is often used in EE, for example to model thermal noise in circuits and in communication and control systems. It also arises naturally from sums of independent random variables. We will study this further in a later lecture.

The Q function:
  Q(α) = P[X ≥ α] = (1/√(2π)) ∫_{α}^{∞} e^{−x²/2} dx
- Often used to calculate the error probability in communications.
- Has no closed form, but good approximations exist.
- A related function is the complementary error function
  erfc(z) = (2/√π) ∫_{z}^{∞} e^{−x²} dx = 2 Q(√2 z)
  Matlab has the command erfc(z).

- Chi-square, X ~ χ²_k:
  f_X(x) = x^{k/2 − 1} e^{−x/2} / (Γ(k/2) 2^{k/2}), x ≥ 0
  where Γ(p) := ∫_0^∞ z^{p−1} e^{−z} dz.
  Here k is called the degrees of freedom. When k is an integer,
  Γ(k) = (k − 1)! = (k − 1)(k − 2) ··· 2 · 1
  The chi-square random variable arises as the sum of squares of k i.i.d. standard Gaussian random variables:
  X = Σ_{i=1}^{k} Z_i², with Z_i ~ N(0, 1) independent
  A χ²_2 random variable (k = 2) is the same as exp(1/2).

- Rayleigh:
  f_X(x) = (x/λ²) e^{−x²/(2λ²)}, x ≥ 0
  Example: the magnitude of a wireless signal.

- Cauchy, X ~ Cauchy(λ):
  f_X(x) = (λ/π) / (λ² + x²), −∞ < x < ∞
  The Cauchy random variable arises as the tangent of a uniform random variable.
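The erfc relation gives a practical way to evaluate Q despite its lack of a closed form: Q(α) = ½ erfc(α/√2). Python's standard library exposes erfc (as Matlab does), so a minimal sketch is:

```python
import math

# Q(a) = P[X >= a] for a standard Gaussian X. No closed form, but
# erfc(z) = 2 Q(sqrt(2) z) gives Q(a) = 0.5 * erfc(a / sqrt(2)).
def Q(a):
    return 0.5 * math.erfc(a / math.sqrt(2.0))

# Sanity checks: Q(0) = 1/2 by symmetry, and Q(-a) + Q(a) = 1.
print(Q(0.0))                # 0.5
print(Q(1.0) + Q(-1.0))      # 1.0 up to rounding
print(Q(3.0))                # the tail probability beyond 3 sigma
```

This is the standard way to compute Gaussian error probabilities (e.g. bit-error rates) without numerical integration of the tail integral.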

Expectation

The expected value (also called the expectation or mean) of a random variable X is defined
- for continuous X as: E[X] = ∫_{−∞}^{∞} x f_X(x) dx
- for discrete X as: E[X] = Σ_k x_k p_X(x_k)
provided the integral or sum converges absolutely (E[|X|] < ∞).
- The mean can be thought of as the average value of X over a large number of independent repetitions of the experiment.
- E[X] is the center of gravity of the pdf, viewing f_X(x) as a distribution of mass on the real line.

Questions: find the mean of the following random variables: binomial, Poisson, uniform, exponential, Gaussian, Cauchy.

Variance and moments

Expectation of a function of X:
  E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx for continuous X
  E[g(X)] = Σ_k g(x_k) p_X(x_k) for discrete X

The variance of a random variable X is defined as
  var(X) = E[(X − E[X])²]
- The variance provides a measure of the dispersion of X around its mean.
- The variance is always non-negative.
- The standard deviation σ_X = √var(X) has the same units as X.

The k-th moment of X is defined as m_k = E[X^k]. The mean and variance can be expressed in terms of the first two moments E[X] and E[X²]:
  var(X) = E[X²] − (E[X])²
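For a discrete pmf, all of these quantities are finite sums, and the identity var(X) = E[X²] − (E[X])² can be verified directly. A sketch using the pmf of the number of heads in two fair coin flips (an illustrative choice):

```python
# Discrete pmf: number of heads in 2 fair flips (Binomial(2, 1/2)).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

# Mean E[X] and second moment E[X^2] as sums over the pmf.
mean = sum(x * p for x, p in pmf.items())
m2 = sum(x**2 * p for x, p in pmf.items())

# Variance from the definition E[(X - E[X])^2] ...
var = sum((x - mean)**2 * p for x, p in pmf.items())

# ... agrees with the moment identity E[X^2] - (E[X])^2.
print(mean, m2, var, m2 - mean**2)
```

Here E[X²] is just E[g(X)] with g(x) = x², computed by summing g(x_k) p_X(x_k) without ever finding the pmf of X².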

Properties of mean and variance

- Expectation is linear:
  E[Σ_{k=1}^{n} g_k(X)] = Σ_{k=1}^{n} E[g_k(X)]
- Let c be a constant scalar. Then
  E[c] = c          var(c) = 0
  E[X + c] = E[X] + c   var(X + c) = var(X)
  E[cX] = c E[X]      var(cX) = c² var(X)

Example: a random binary NRZ signal x = {1, −1, −1, 1, 1, −1, 1, ...}, where
  X = 1 with probability 1/2, X = −1 with probability 1/2
- Mean E[X] = 0: the signal is unbiased.
- Variance σ_X² = 1 is the average signal power.
- What happens to the mean and variance if you scale the signal to a different voltage V?

Markov and Chebyshev inequalities

For a Gaussian random variable, the mean and variance completely specify its pdf:
  X ~ N(µ, σ²) ⟹ f_X(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²))
In general, however, the mean and variance are insufficient to specify a random variable (to determine its pmf/pdf/cdf). They can, however, be used to bound probabilities of the form P[X ≥ t].

- Markov inequality: for X non-negative,
  P[X ≥ a] ≤ E[X]/a, a > 0
  This bound is useful when the right-hand side is < 1. It can be tight for certain distributions.
- Chebyshev inequality: for X with mean m and variance σ²,
  P[|X − m| ≥ a] ≤ σ²/a²
  The Chebyshev inequality is obtained by applying the Markov inequality to Y = (X − m)².
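The Chebyshev bound can be checked empirically by simulation. A sketch using a uniform random variable on [0, 1], whose mean 1/2 and variance 1/12 are known in closed form (the sample size, seed, and threshold a are illustrative):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Empirical check of Chebyshev: P[|X - m| >= a] <= sigma^2 / a^2.
# X uniform on [0, 1]: m = 1/2, sigma^2 = 1/12.
N = 100000
samples = [random.random() for _ in range(N)]
m, var = 0.5, 1.0 / 12.0

a = 0.4
empirical = sum(1 for x in samples if abs(x - m) >= a) / N
bound = var / a**2
print(round(empirical, 4), round(bound, 4))
assert empirical <= bound
```

Here the true probability is 0.2 (the two tails [0, 0.1] and [0.9, 1]) while the Chebyshev bound is about 0.52 — the bound holds but is loose, which is typical: it uses only the mean and variance, not the full distribution.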

Testing the fit of a distribution to data

Given a set of observation data, how do we determine how well a model distribution fits the data? The chi-square test:
- Partition the sample space S_X into the union of K disjoint intervals.
- Based on the modeled distribution, calculate the expected number of outcomes that fall in the k-th interval as m_k.
- Let N_k be the observed number of outcomes in interval k.
- Form the weighted difference
  D² = Σ_{k=1}^{K} (N_k − m_k)² / m_k
- If D² is small, the fit is good. If D² > t_α, reject the model. Here t_α is a predetermined threshold based on the significance level of the test; it is calculated from P[X ≥ t_α] = α, e.g. for α = 1%.
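The steps above can be sketched for testing whether a die is fair; the observed counts are invented illustrative data, and the threshold 11.07 is the standard chi-square table value for 5 degrees of freedom at α = 5%:

```python
# Chi-square goodness-of-fit statistic D^2 = sum_k (N_k - m_k)^2 / m_k.
# The six faces of the die are the K = 6 disjoint "intervals".
observed = [18, 22, 16, 25, 19, 20]   # N_k: observed counts per face (toy data)
n = sum(observed)                     # 120 rolls in total
expected = [n / 6.0] * 6              # m_k = 20 under the fair-die model

D2 = sum((N - m)**2 / m for N, m in zip(observed, expected))
print(D2)

# Compare D2 with the threshold t_alpha for a chi-square distribution with
# K - 1 = 5 degrees of freedom; at alpha = 5%, t_alpha is about 11.07.
print(D2 < 11.07)   # small D2: no evidence against the fair-die model
```

Note the m_k in the denominator: large deviations in a bin count more when few outcomes were expected there, which is what makes D² a weighted difference rather than a plain sum of squares.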