Chapter 2 - Random Variables


In this and the chapters that follow, we denote the real line as R = (-∞, ∞), and the extended real line is denoted as R⁺ ≡ R ∪ {±∞}. The extended real line is the real line with ±∞ thrown in. Put simply (and incompletely), a random variable is a function that maps the sample space S into the extended real line.

Example 2-1: In the die experiment, we assign to the six outcomes f_i the numbers X(f_i) = 10i. Thus, we have X(f_1) = 10, X(f_2) = 20, X(f_3) = 30, X(f_4) = 40, X(f_5) = 50, X(f_6) = 60.

For an arbitrary value x_0, we must be able to answer questions like "what is the probability that random variable X is less than, or equal to, x_0?" Hence, the set {ζ ∈ S : X(ζ) ≤ x_0} must be an event (i.e., the set must belong to σ-algebra F) for every x_0 (sometimes, the algebraic, non-random variable x_0 is said to be a realization of the random variable X). This leads to the more formal definition.

Definition: Given a probability space (S, F, P), a random variable X(ζ) is a function

X : S → R⁺.    (2-1)

That is, random variable X is a function that maps sample space S into the extended real line R⁺. In addition, random variable X must satisfy the two criteria discussed below.

1) Recall the Borel σ-algebra B of subsets of R that was discussed in Chapter 1 (see Example 1-9). For each B ∈ B, we must have

X⁻¹(B) ≡ {ζ ∈ S : X(ζ) ∈ B} ∈ F,  for all B ∈ B.    (2-2)

A function that satisfies this criterion is said to be measurable. A random variable X must be a measurable function.

2) P[ζ ∈ S : X(ζ) = ±∞] = 0. Random variable X is allowed to take on the values ±∞; however, it must take on the values ±∞ with a probability of zero.

These two conditions hold for most elementary applications. Usually, they are treated as mere technicalities that impose no real limitations on applications of random variables (usually, they are not given much thought). However, good reasons exist for requiring that random variable X satisfy conditions 1) and 2) listed above. In our experiment, recall that sample space S describes the set of elementary outcomes. Now, it may happen that we cannot directly observe elementary outcomes ζ ∈ S. Instead, we may be forced to use a measuring instrument (i.e., a random variable) that provides us with measurements X(ζ), ζ ∈ S. Now, for each x ∈ R, we need to be able to compute the probability P[-∞ < X(ζ) ≤ x], because [-∞ < X(ζ) ≤ x] is a meaningful, observable event in the context of our experiment/measurements. For probability P[-∞ < X(ζ) ≤ x] to exist, we must have [-∞ < X(ζ) ≤ x] as an event; that is, we must have

{ζ ∈ S : X(ζ) ≤ x} ∈ F    (2-3)

for each x ∈ R. It is possible to show that Conditions (2-2) and (2-3) are equivalent. So, while (2-2) (or the equivalent (2-3)) may be a mere technicality, it is an important technicality.

Sigma-Algebra Generated by Random Variable X

Suppose that we are given a probability space (S, F, P) and a random variable X as described above. Random variable X induces a σ-algebra σ(X) on S. σ(X) consists of all sets of the form {ζ ∈ S : X(ζ) ∈ B}, B ∈ B, where B denotes the σ-algebra of Borel subsets of R that was discussed in Chapter 1 (see Example 1-9). Note that σ(X) ⊂ F; we say that σ(X) is the sub σ-algebra of F that is generated by random variable X.

Probability Space Induced by Random Variable X

Suppose that we are given a probability space (S, F, P) and a random variable X as described above. Let B be the Borel σ-algebra introduced by Example 1-9. By (2-2), for each B ∈ B, we have {ζ ∈ S : X(ζ) ∈ B} ∈ F, so P[{ζ ∈ S : X(ζ) ∈ B}] is well defined.

This allows us to use random variable X to define (R⁺, B, P_X), a probability space induced by X. Probability measure P_X is defined as follows: for each B ∈ B, we define P_X(B) ≡ P[{ζ ∈ S : X(ζ) ∈ B}]. We say that P induces probability measure P_X.

Distribution and Density Functions

The distribution function of the random variable X(ζ) is the function

F(x) \equiv P[X(\zeta) \le x] = P[\{\zeta \in S : X(\zeta) \le x\}],    (2-4)

where -∞ < x < ∞.

Example 2-2: Consider the coin tossing experiment with P[heads] = p and P[tails] = q ≡ 1 - p. Define the random variable

X(head) = 1,  X(tail) = 0.

If x ≥ 1, then both X(head) = 1 ≤ x and X(tail) = 0 ≤ x, so that F(x) = 1 for x ≥ 1. If 0 ≤ x < 1, then X(head) = 1 > x and X(tail) = 0 ≤ x, so that F(x) = P[X ≤ x] = q for 0 ≤ x < 1. Finally, if x < 0, then both X(head) = 1 > x and X(tail) = 0 > x, so that F(x) = P[X ≤ x] = 0 for x < 0. See Figure 2-1 for a graph of F(x).
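The staircase distribution of Example 2-2 is simple enough to code directly. The sketch below (plain Python; the value p = 0.7 is an arbitrary choice for illustration, not from the notes) evaluates F(x) piecewise and checks it against an empirical estimate obtained by simulating the toss.

```python
import random

def F(x, p=0.7):
    """Distribution function of Example 2-2: X(head) = 1, X(tail) = 0."""
    q = 1.0 - p
    if x < 0:
        return 0.0        # no outcome maps at or below x
    elif x < 1:
        return q          # only X(tail) = 0 satisfies X <= x
    else:
        return 1.0        # both outcomes satisfy X <= x

def empirical_F(x, p=0.7, n=100_000):
    """Estimate P[X <= x] by simulating n independent tosses."""
    hits = sum(1 for _ in range(n) if (1 if random.random() < p else 0) <= x)
    return hits / n

for x in (-0.5, 0.0, 0.5, 1.0, 2.0):
    print(f"x = {x:4}:  F(x) = {F(x):.3f},  empirical ~ {empirical_F(x):.3f}")
```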

Figure 2-1: Distribution function for the X(head) = 1, X(tail) = 0 random variable.

Properties of Distribution Functions

First, some standard notation:

F(x⁺) ≡ limit of F(x + ε) as ε → 0, ε > 0,  and  F(x⁻) ≡ limit of F(x - ε) as ε → 0, ε > 0.

Some properties of distribution functions are listed below.

Claim #1: F(+∞) = 1 and F(-∞) = 0.    (2-5)

Proof: F(+∞) = limit of P[X ≤ x] as x → ∞ = P[S] = 1, and F(-∞) = limit of P[X ≤ x] as x → -∞ = P[{∅}] = 0.

Claim #2: The distribution function is a non-decreasing function of x. If x₁ < x₂, then F(x₁) ≤ F(x₂).

Proof: x₁ < x₂ implies that {X(ζ) ≤ x₁} ⊂ {X(ζ) ≤ x₂}. But this means that P[{X(ζ) ≤ x₁}] ≤ P[{X(ζ) ≤ x₂}], and F(x₁) ≤ F(x₂).

Claim #3: P[X > x] = 1 - F(x).    (2-6)

Proof: {X(ζ) ≤ x} and {X(ζ) > x} are mutually exclusive. Also, {X(ζ) ≤ x} ∪ {X(ζ) > x} = S. Hence, P[{X ≤ x}] + P[{X > x}] = P[S] = 1.

Claim #4: Function F(x) may have jump discontinuities. It can be shown that a jump is the only type of discontinuity that is possible for distribution F(x) (and F(x) may have a countable number of jumps, at most). F(x) must be right continuous; that is, we must have F(x⁺) = F(x). At a jump, take F to be the larger value; see Figure 2-2.

Claim #5: P[x₁ < X ≤ x₂] = F(x₂) - F(x₁).    (2-7)

Proof: {X(ζ) ≤ x₁} and {x₁ < X(ζ) ≤ x₂} are mutually exclusive. Also, {X(ζ) ≤ x₂} = {X(ζ) ≤ x₁} ∪ {x₁ < X(ζ) ≤ x₂}. Hence, P[X(ζ) ≤ x₂] = P[X(ζ) ≤ x₁] + P[x₁ < X(ζ) ≤ x₂], and P[x₁ < X ≤ x₂] = F(x₂) - F(x₁).

Claim #6: P[X = x] = F(x) - F(x⁻).    (2-8)

Proof: P[x - ε < X ≤ x] = F(x) - F(x - ε). Now, take the limit as ε → 0⁺ to obtain the desired result.

Claim #7: P[x₁ ≤ X ≤ x₂] = F(x₂) - F(x₁⁻).    (2-9)

Proof: {x₁ ≤ X ≤ x₂} = {x₁ < X ≤ x₂} ∪ {X = x₁}, so that P[x₁ ≤ X ≤ x₂] = (F(x₂) - F(x₁)) + (F(x₁) - F(x₁⁻)) = F(x₂) - F(x₁⁻).

Figure 2-2: Distributions are right continuous; at a jump at x₀, F(x₀) takes the larger value.

Figure 2-3: Distribution function for a discrete random variable.

Continuous Random Variables

Random variable X is of continuous type if F_X(x) is continuous. In this case, P[X = x] = 0; the probability is zero that X takes on a given value x.

Discrete Random Variables

Random variable X is of discrete type if F_X(x) is piece-wise constant. The distribution should look like a staircase. Denote by x_i the points where F_X(x) is discontinuous. Then F_X(x_i) - F_X(x_i⁻) = P[X = x_i] = p_i. See Figure 2-3.

Mixed Random Variables

Random variable X is said to be of mixed type if F_X(x) is discontinuous but not a staircase.

Density Function

The derivative

f(x) \equiv \frac{dF(x)}{dx}    (2-10)

is called the density function of the random variable X. Suppose F has a jump discontinuity at a point x₀. Then f(x) contains the term

\left[ F(x_0) - F(x_0^-) \right] \delta(x - x_0).    (2-11)

See Figure 2-4: a jump of value k in the distribution function at x = x₀ causes a delta function of weight k, k·δ(x - x₀), in the density function.

Suppose that X is of discrete type, taking values x_i, i ∈ I, where I is an index set. The density can be written as

f(x) = \sum_{i \in I} P[X = x_i]\, \delta(x - x_i).

Figures 2-5 and 2-6 illustrate the distribution and density, respectively, of a discrete random variable.

Properties of f

The monotonicity of F implies that f(x) ≥ 0 for all x. Furthermore, we have

f(x) = \frac{dF(x)}{dx} \quad\Longleftrightarrow\quad F(x) = \int_{-\infty}^{x} f(\xi)\, d\xi.    (2-12)

Figure 2-5: Distribution function for a discrete random variable.  Figure 2-6: Density function for a discrete random variable.

Also,

P[x_1 < X \le x_2] = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(\xi)\, d\xi.    (2-13)

If X is of continuous type, then F(x) = F(x⁻), and

P[x_1 \le X \le x_2] = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(\xi)\, d\xi.

For continuous random variables, P[x ≤ X ≤ x + Δx] ≈ f(x)Δx for small Δx.

Normal/Gaussian Random Variables

Let η and σ, σ > 0, be constants. Then

f(x) = g(x) \equiv \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[ -\frac{(x-\eta)^2}{2\sigma^2} \right]    (2-14)

is a Gaussian density function with parameters η and σ². These parameters have special meanings, as discussed below. The notation N(η; σ²) is used to indicate a Gaussian random variable with parameters η and σ². Figure 2-7 illustrates the density and distribution functions for a Gaussian random variable with η = 0 and σ = 1.

Random variable X is said to be Gaussian if its distribution function is given by

F(x) = P[X \le x] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{x} \exp\!\left[ -\frac{(y-\eta)^2}{2\sigma^2} \right] dy    (2-15)

for given η, -∞ < η < ∞, and σ, σ > 0. Numerical values for F(x) can be determined with the aid of a table. To accomplish this, make the change of variable y = (x - η)/σ in (2-15) and obtain

F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-\eta)/\sigma} \exp\!\left[ -\frac{y^2}{2} \right] dy = G\!\left( \frac{x-\eta}{\sigma} \right).    (2-16)

Function G(x) is tabulated in many reference books, and it is built in to many popular computer math packages (i.e., Matlab, Mathcad, etc.).

Uniform

Random variable X is uniform between x₁ and x₂ if its density is constant on the interval [x₁, x₂] and zero elsewhere. Figure 2-8 illustrates the density and distribution functions of a uniform random variable.
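In place of a printed table, G(x) in (2-16) can be evaluated from the error function using the identity G(x) = ½[1 + erf(x/√2)] (a standard relation, not derived in these notes). The sketch below uses Python's math.erf to compute F(x) = G((x - η)/σ) for arbitrary η and σ.

```python
import math

def G(x):
    """N(0;1) distribution function: G(x) = 0.5*(1 + erf(x/sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_cdf(x, eta=0.0, sigma=1.0):
    """F(x) for an N(eta; sigma^2) random variable, via Eq. (2-16)."""
    return G((x - eta) / sigma)

# A few tabulated-style values for the standard Gaussian.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"G({x:+.1f}) = {G(x):.5f}")

# Example: P[1 < X <= 2] for X ~ N(1; 0.25), using Claim #5.
print(gaussian_cdf(2.0, eta=1.0, sigma=0.5) - gaussian_cdf(1.0, eta=1.0, sigma=0.5))
```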

Binomial

Random variable X has a binomial distribution of order n with parameter p if it takes the integer values 0, 1, ..., n with probabilities

P[X = k] = \binom{n}{k} p^k q^{n-k}, \quad 0 \le k \le n,\; 0 \le p \le 1,\; p + q = 1.    (2-17)

Both n and p are known parameters, where p + q = 1, and

\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.    (2-18)

We say that binomial random variable X is B(n,p). The binomial density function is

f(x) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}\, \delta(x - k),    (2-19)

and the binomial distribution is

F(x) = \sum_{k=0}^{m} \binom{n}{k} p^k q^{n-k}, \quad m \le x < m+1,\; 0 \le m < n,\qquad F(x) = 1, \quad x \ge n.    (2-20)

Note that m depends on x.
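Equations (2-17), (2-18), and (2-20) translate directly into code. A minimal sketch (standard library only; math.comb supplies the binomial coefficient of (2-18), and the n = 10, p = 0.3 values are arbitrary illustrations):

```python
import math

def binom_pmf(k, n, p):
    """P[X = k] for X ~ B(n, p), Eq. (2-17)."""
    q = 1.0 - p
    return math.comb(n, k) * p**k * q**(n - k)

def binom_cdf(x, n, p):
    """F(x) = sum of P[X = k] over integers k <= x, Eq. (2-20)."""
    m = min(math.floor(x), n)
    if m < 0:
        return 0.0
    return sum(binom_pmf(k, n, p) for k in range(m + 1))

print([round(binom_pmf(k, 10, 0.3), 4) for k in range(11)])   # pmf values sum to 1
print(binom_cdf(3.5, 10, 0.3))                                 # F(3.5) uses m = 3
```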

Poisson

Random variable X is Poisson with parameter a > 0 if it takes on integer values 0, 1, 2, ... with

P[X = k] = e^{-a} \frac{a^k}{k!}, \quad k = 0, 1, 2, 3, \ldots    (2-21)

The density and distribution of a Poisson random variable are given by

f(x) = e^{-a} \sum_{k=0}^{\infty} \frac{a^k}{k!}\, \delta(x - k)    (2-22)

F(x) = e^{-a} \sum_{k=0}^{m} \frac{a^k}{k!}, \quad m \le x < m+1,\; m = 0, 1, 2, \ldots    (2-23)

Rayleigh

The random variable X is Rayleigh distributed with real-valued parameter σ, σ > 0, if it is described by the density

f(x) = \frac{x}{\sigma^2} \exp\!\left[ -\frac{x^2}{2\sigma^2} \right], \; x \ge 0;\qquad f(x) = 0, \; x < 0.    (2-24)

See Figure 2-9 for a depiction of a Rayleigh density function. The distribution function for a Rayleigh random variable is

F(x) = \int_{0}^{x} \frac{u}{\sigma^2}\, e^{-u^2/2\sigma^2}\, du, \; x \ge 0;\qquad F(x) = 0, \; x < 0.    (2-25)

To evaluate (2-25), use the change of variable y = u²/2σ², dy = (u/σ²)du, to obtain

F(x) = \int_{0}^{x^2/2\sigma^2} e^{-y}\, dy = 1 - e^{-x^2/2\sigma^2}, \; x \ge 0;\qquad F(x) = 0, \; x < 0,    (2-26)

as the distribution function for a Rayleigh random variable.

Figure 2-9: Rayleigh density function.  Figure 2-10: Exponential density function.

Exponential

The random variable X is exponentially distributed with real-valued parameter λ, λ > 0, if it is described by the density

f(x) = \lambda \exp[-\lambda x], \; x \ge 0;\qquad f(x) = 0, \; x < 0.    (2-27)

See Figure 2-10 for a depiction of an exponential density function.
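The change of variable that produced (2-26) is easy to sanity-check numerically: integrate the Rayleigh density (2-24) with a simple midpoint rule and compare with the closed form 1 - exp(-x²/2σ²). The value σ = 2 below is an arbitrary choice for illustration.

```python
import math

def rayleigh_pdf(x, sigma):
    """Rayleigh density, Eq. (2-24)."""
    return (x / sigma**2) * math.exp(-x**2 / (2.0 * sigma**2)) if x >= 0 else 0.0

def rayleigh_cdf_numeric(x, sigma, n=10_000):
    """F(x) by midpoint-rule integration of the density, Eq. (2-25)."""
    if x <= 0:
        return 0.0
    h = x / n
    return sum(rayleigh_pdf((i + 0.5) * h, sigma) for i in range(n)) * h

sigma = 2.0
for x in (1.0, 2.0, 5.0):
    closed = 1.0 - math.exp(-x**2 / (2.0 * sigma**2))   # Eq. (2-26)
    print(f"x = {x}: numeric = {rayleigh_cdf_numeric(x, sigma):.6f}, closed form = {closed:.6f}")
```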

The distribution function for an exponential random variable is

F(x) = \int_{0}^{x} \lambda e^{-\lambda y}\, dy = 1 - e^{-\lambda x}, \; x \ge 0;\qquad F(x) = 0, \; x < 0.    (2-28)

Conditional Distribution

Let M denote an event for which P[M] ≠ 0. Assuming the occurrence of event M, the conditional distribution F(x|M) of random variable X is defined as

F(x|M) \equiv P[X \le x \mid M] = \frac{P[X \le x,\, M]}{P[M]}.    (2-29)

Note that F(∞|M) = 1 and F(-∞|M) = 0. Furthermore, the conditional distribution has all of the properties of an "ordinary" distribution function. For example,

P[x_1 < X \le x_2 \mid M] = \frac{P[x_1 < X \le x_2,\, M]}{P[M]} = F(x_2|M) - F(x_1|M).    (2-30)

Conditional Density

The conditional density f(x|M) is defined as

f(x|M) \equiv \frac{d}{dx} F(x|M) = \lim_{\Delta x \to 0} \frac{P[x < X \le x + \Delta x \mid M]}{\Delta x}.    (2-31)

The conditional density has all of the properties of an "ordinary" density function.

Example 2-3: Determine the conditional distribution F(x|M) of random variable X(f_i) = 10i of the fair-die experiment, where M = {f₂, f₄, f₆} is the event "even" has occurred. First, note that X must take on values in the set {10, 20, 30, 40, 50, 60}. Hence, if x ≥ 60, then [X ≤ x] is the certain event and [X ≤ x, M] = M. Because of this,

F(x|M) = P[X ≤ x, M]/P[M] = P[M]/P[M] = 1,  x ≥ 60.

If 40 ≤ x < 60, then [X ≤ x, M] = [f₂, f₄], and

F(x|M) = P[f₂, f₄]/P[M] = (2/6)/(3/6) = 2/3,  40 ≤ x < 60.

If 20 ≤ x < 40, then [X ≤ x, M] = [f₂], and

F(x|M) = P[f₂]/P[M] = (1/6)/(3/6) = 1/3,  20 ≤ x < 40.

Finally, if x < 20, then [X ≤ x, M] = [∅], and

F(x|M) = P[{∅}]/P[M] = 0/(3/6) = 0,  x < 20.

Conditional Distribution When Event M is Defined in Terms of X

If M is an event that can be expressed in terms of the random variable X, then F(x|M) can be determined from the "ordinary" distribution F_X(x). Below, we give several examples.

As a first example, consider M = [X ≤ a], and find both F(x|M) = F(x|X ≤ a) = P[X ≤ x | X ≤ a] and f(x|M). Note that

F(x \mid X \le a) = P[X \le x \mid X \le a] = \frac{P[X \le x,\, X \le a]}{P[X \le a]}.

Hence, if x ≥ a, we have [X ≤ x, X ≤ a] = [X ≤ a] and

F(x \mid X \le a) = \frac{P[X \le a]}{P[X \le a]} = 1, \quad x \ge a.

If x < a, then [X ≤ x, X ≤ a] = [X ≤ x] and

F(x \mid X \le a) = \frac{P[X \le x]}{P[X \le a]} = \frac{F_X(x)}{F_X(a)}, \quad x < a.

At x = a, F(x|X ≤ a) would jump 1 - F_X(a⁻)/F_X(a) if F_X(a⁻) ≠ F_X(a). The conditional density is

f(x \mid X \le a) = \frac{d}{dx} F(x \mid X \le a) = \frac{f_X(x)}{F_X(a)}, \; x < a;\qquad = \left[ 1 - \frac{F_X(a^-)}{F_X(a)} \right] \delta(x-a), \; x = a;\qquad = 0, \; x > a.
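Example 2-3 can be checked mechanically from the definition (2-29). The short sketch below tabulates F(x|M) for the fair-die random variable X(f_i) = 10i with M = "even".

```python
# Fair-die experiment of Example 2-3: outcomes f_1..f_6, X(f_i) = 10*i, each with probability 1/6.
outcomes = {i: 10 * i for i in range(1, 7)}
M = {2, 4, 6}                 # the event "even"
P_M = len(M) / 6.0

def F_given_M(x):
    """F(x|M) = P[X <= x, M] / P[M], Eq. (2-29)."""
    joint = sum(1.0 / 6.0 for i, xi in outcomes.items() if xi <= x and i in M)
    return joint / P_M

for x in (15, 25, 45, 65):
    print(f"F({x}|even) = {F_given_M(x):.4f}")
# Prints 0, 1/3, 2/3, 1 -- the staircase computed in Example 2-3.
```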

As a second example, consider M = [b < X ≤ a], so that

F(x \mid b < X \le a) = \frac{P[X \le x,\, b < X \le a]}{P[b < X \le a]}.

Since [X ≤ x, b < X ≤ a] = [b < X ≤ a] for x ≥ a, = [b < X ≤ x] for b ≤ x < a, and = {∅} for x < b, we have

F(x \mid b < X \le a) = 1, \; x \ge a;\qquad = \frac{F_X(x) - F_X(b)}{F_X(a) - F_X(b)}, \; b \le x < a;\qquad = 0, \; x < b.

F(x|b < X ≤ a) is continuous at x = b. At x = a, F(x|b < X ≤ a) jumps 1 - {F_X(a⁻) - F_X(b)}/{F_X(a) - F_X(b)}, a value that is zero if F_X(a⁻) = F_X(a). The corresponding conditional density is

f(x \mid b < X \le a) = \frac{d}{dx} F(x \mid b < X \le a) = \frac{f_X(x)}{F_X(a) - F_X(b)}, \; b \le x < a;\qquad = \left[ 1 - \frac{F_X(a^-) - F_X(b)}{F_X(a) - F_X(b)} \right] \delta(x-a), \; x = a;\qquad = 0, \; \text{otherwise}.

Example 2-4: Find f(x | |X - η| ≤ kσ), where X is N(η; σ²). First, note that

F(x) = P[X \le x] = G\!\left( \frac{x-\eta}{\sigma} \right) \quad\text{and}\quad f(x) = \frac{dF}{dx} = \frac{1}{\sigma}\, g\!\left( \frac{x-\eta}{\sigma} \right),

where G is the distribution, and g is the density, for the case η = 0 and σ = 1. Also, note that the inequality |X - η| ≤ kσ implies η - kσ ≤ X ≤ η + kσ, so that

P[\,|X - \eta| \le k\sigma\,] = F(\eta + k\sigma) - F(\eta - k\sigma) = G(k) - G(-k) = 2G(k) - 1.

By the previous example, we have

F(x \mid |X - \eta| \le k\sigma) = 1, \; x \ge \eta + k\sigma;\qquad = \frac{G\!\left( \frac{x-\eta}{\sigma} \right) - G(-k)}{2G(k) - 1}, \; \eta - k\sigma \le x < \eta + k\sigma;\qquad = 0, \; x < \eta - k\sigma,

a conditional distribution that is a continuous function of x. Differentiate this to obtain

f(x \mid |X - \eta| \le k\sigma) = \frac{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[ -\frac{(x-\eta)^2}{2\sigma^2} \right]}{2G(k) - 1} = \frac{\frac{1}{\sigma} g\!\left( \frac{x-\eta}{\sigma} \right)}{2G(k) - 1}, \; |x - \eta| \le k\sigma;\qquad = 0, \; \text{otherwise},

a conditional density that contains no delta functions.

Distributions/Densities Expressed In Terms of Conditional Distributions/Densities

Let X be any random variable and define B = [X ≤ x]. Let A₁, A₂, ..., A_n be a partition of sample space S. That is,

\bigcup_{i=1}^{n} A_i = S \quad\text{and}\quad A_i \cap A_j = \{\emptyset\} \;\text{for}\; i \ne j.    (2-32)

From the discrete form of the Theorem of Total Probability discussed in Chapter 1, we have

P[B] = P[B|A₁]P[A₁] + P[B|A₂]P[A₂] + ... + P[B|A_n]P[A_n].

Now, with B = [X ≤ x], this becomes

P[X \le x] = P[X \le x \mid A_1]P[A_1] + P[X \le x \mid A_2]P[A_2] + \cdots + P[X \le x \mid A_n]P[A_n].    (2-33)

Hence,

F(x) = F(x|A_1)\,P[A_1] + F(x|A_2)\,P[A_2] + \cdots + F(x|A_n)\,P[A_n]
f(x) = f(x|A_1)\,P[A_1] + f(x|A_2)\,P[A_2] + \cdots + f(x|A_n)\,P[A_n].    (2-34)

Total Probability - Continuous Form

The probability of an event can be conditioned on an event defined in terms of a random variable. For example, let A be any event, and let X be any random variable. Then we can write

P[A \mid X \le x] = \frac{P[A,\, X \le x]}{P[X \le x]} = \frac{P[X \le x \mid A]\, P[A]}{P[X \le x]} = \frac{F(x|A)}{F(x)}\, P[A].    (2-35)

As a second example, we derive a formula for P[A | x₁ < X ≤ x₂]. Now, the conditional distribution F(x|A) has the same properties as an "ordinary" distribution. That is, we can write P[x₁ < X ≤ x₂ | A] = F(x₂|A) - F(x₁|A), so that

P[A \mid x_1 < X \le x_2] = \frac{P[x_1 < X \le x_2 \mid A]\, P[A]}{P[x_1 < X \le x_2]} = \frac{F(x_2|A) - F(x_1|A)}{F(x_2) - F(x_1)}\, P[A].    (2-36)

In general, the formulation

P[A \mid X = x] = \frac{P[A,\, X = x]}{P[X = x]}    (2-37)

may result in an indeterminate 0/0 form. Instead, we should write

P[A \mid X = x] = \lim_{\Delta x \to 0^+} P[A \mid x < X \le x + \Delta x] = \lim_{\Delta x \to 0^+} \frac{F(x+\Delta x \mid A) - F(x|A)}{F(x+\Delta x) - F(x)}\, P[A] = \lim_{\Delta x \to 0^+} \frac{[F(x+\Delta x \mid A) - F(x|A)]/\Delta x}{[F(x+\Delta x) - F(x)]/\Delta x}\, P[A],    (2-38)

which yields

P[A \mid X = x] = \frac{f(x|A)}{f(x)}\, P[A].    (2-39)

Now, multiply both sides of this last result by f(x) and integrate to obtain

\int_{-\infty}^{\infty} P[A \mid X = x]\, f(x)\, dx = P[A] \int_{-\infty}^{\infty} f(x|A)\, dx.    (2-40)

But, the area under f(x|A) is unity. Hence, we obtain

P[A] = \int_{-\infty}^{\infty} P[A \mid X = x]\, f(x)\, dx,    (2-41)

the continuous version of the Total Probability Theorem. In Chapter 1, we gave a finite dimensional version of this theorem. Compare (2-41) with the result of Theorem 1-1; conceptually, they are similar. P[A|X = x] is the probability of A given that X = x. It is a function of x. Equation (2-41) tells us to average this function over all possible values of X to find P[A].

For the Total Probability Theorem, the discrete and continuous cases can be related by the following simple, intuitive argument. We start with the discrete version (see Theorem 1.1,

Chapter 1 of class notes). For small Δ > 0 and integer i, -∞ < i < ∞, define the event

A_i \equiv [\,i\Delta < X \le (i+1)\Delta\,] = \{\zeta \in S : i\Delta < X(\zeta) \le (i+1)\Delta\}.

Note that the A_i, -∞ < i < ∞, form a partition of the sample space. Let A be an arbitrary event. By Theorem 1-1 of Chapter 1, the discrete version of the Total Probability Theorem, we can write

P[A] = \sum_i P[A \mid A_i]\, P[A_i] = \sum_i P[A \mid i\Delta < X \le (i+1)\Delta]\, P[\,i\Delta < X \le (i+1)\Delta\,] \approx \sum_i P[A \mid X = x_i]\, f(x_i)\,\Delta,

where x_i is any point in the infinitesimal interval (iΔ, (i+1)Δ]. In the limit as Δ approaches zero, the infinitesimal Δ becomes dx, and the sum becomes an integral, so that

P[A] = \int_{-\infty}^{\infty} P[A \mid X = x]\, f(x)\, dx,

the continuous version of the Theorem of Total Probability.

Example 2-5 (Theorem of Total Probability): Suppose we have a large set S_c of distinguishable coins. We do not know the probability of heads for any of these coins. Suppose we randomly select a coin from S_c (all coins are equally likely to be selected). Its probability of heads will be estimated by a method that uses experimental data obtained from tossing the selected coin. This method is developed in this and the next example.

For each coin in S_c, we model its probability of heads as a continuous random variable p, 0 ≤ p ≤ 1, described by a known density f_p(p). Density f_p(p) should peak at or near p = 1/2 and decrease as p deviates from 1/2. After all, most coins are well balanced and nearly fair. Also, note the requirements

P[p_1 < \mathbf{p} \le p_2] = \int_{p_1}^{p_2} f_{\mathbf{p}}(p)\, dp    (2-42)

\int_{0}^{1} f_{\mathbf{p}}(p)\, dp = 1.    (2-43)

Realistically, it is likely that we do not know f_p(p) exactly. We may make a "good guess"; on the other hand, we could choose a uniform density for f_p(p), a choice that implies considerable ignorance. As shown in Example 2-6, given a specific coin from S_c, it may be possible to use experimental data and estimate its probability of heads without knowledge of f_p(p).

Consider a combined, or joint, experiment of randomly selecting a coin from S_c (the first experiment) and then tossing it once (the second experiment, with sample space S = [h, t], is independent of the first experiment). For this combined experiment, the relevant product sample space is

S_c \times S = \{(\zeta, h) : \zeta \in S_c\} \cup \{(\zeta, t) : \zeta \in S_c\}.    (2-44)

For the combined experiment, the event "we get a heads", or "heads occurs", is

H \equiv \{(\zeta, h) : \zeta \in S_c\}.    (2-45)

The conditional probability of H, given that the tossed coin has probability of heads p (i.e., p = p), is simply

P[H \mid \mathbf{p} = p] = P[\{(\zeta, h) : \zeta \in S_c\} \mid \mathbf{p} = p] = p.    (2-46)

The Theorem of Total Probability can be used to express P[H] in terms of known f_p(p). Simply use (2-46) and the Theorem of Total Probability (2-41) to write

P[H] = \int_{0}^{1} P[H \mid \mathbf{p} = p]\, f_{\mathbf{p}}(p)\, dp = \int_{0}^{1} p\, f_{\mathbf{p}}(p)\, dp.    (2-47)

On the right-hand side of (2-47), the integral is called the expected value, or ensemble average, of the random variable p. Equation (2-47) is used in Example 2-6 below to compute an estimate for the probability of heads for any coin selected from S_c.

Bayes Theorem - Continuous Form

From (2-39) we get

f(x|A) = \frac{P[A \mid X = x]}{P[A]}\, f(x).    (2-48)

Now, use (2-41) and (2-48) to write

f(x|A) = \frac{P[A \mid X = x]\, f(x)}{\int_{-\infty}^{\infty} P[A \mid X = \xi]\, f(\xi)\, d\xi},    (2-49)

a result known as the continuous form of Bayes Theorem.

Often, f_X(x) is called the a-priori density for random variable X. And, f(x|A) is called the a-posteriori density conditioned on the observed event A. In an application, we might "cook up" a density f_X(x) that (crudely) describes a random quantity (i.e., variable) X of interest. To improve our characterization of X, we note the occurrence of a related event A and compute f(x|A) to better characterize X. The value of x that maximizes f(x|A) is called the maximum a-posteriori (MAP) estimate of X. MAP estimation is used in statistical signal processing and many other problems where one must estimate a quantity from observations of related random quantities. Given event A, the value x_m is the MAP estimate for X if f(x_m|A) ≥ f(x|A) for all x ≠ x_m.

Example 2-6 (Bayes Theorem & MAP estimate of probability of heads for Example 2-5): Find the MAP estimate of the probability of heads in the coin selection and tossing experiment that was described in Example 2-5. First, recall that the probability of heads is modeled as a random variable p with density f_p(p) (which we may have to guess). We call f_p(p) the a-priori ("before the coin toss experiment") density of p. Suppose we toss the coin n times and get k heads. We want to use these experimental results with our a-priori density f_p(p) to compute the conditional density

f(p \mid k \text{ heads, in a specific order, in } n \text{ tosses of a selected coin}) = f(p|A).    (2-50)

This density is called the a-posteriori density ("after the coin toss experiment") of random variable p given the experimentally observed event

A \equiv [\,k \text{ heads, in a specific order, in } n \text{ tosses of a selected coin}\,].    (2-51)

The a-posteriori density f(p|A) may give us a good idea (better than the a-priori density f_p(p)) of the probability of heads for the randomly selected coin that was tossed. Conceptually, think of f(p|A) as a density that results from using experimental data/observation A to update f_p(p). In fact, given that A occurred, the a-posteriori probability that p is between p₁ and p₂ is

\int_{p_1}^{p_2} f(p|A)\, dp.    (2-52)

Finally, the value of p that maximizes f(p|A) is the maximum a-posteriori estimate (MAP estimate) of p for the selected coin.

To find this a-posteriori density, recall that p is defined on the sample space S_c. The experiment of tossing the randomly selected coin n times is defined on the sample space (i.e., product space) S_c × Sⁿ, where S = [h, t]. The elements of S_c × Sⁿ have the form (ζ, t h t ... t h), where ζ ∈ S_c and the string of n outcomes t h t ... t h belongs to Sⁿ.

Now, given that p = p, the conditional probability of the observed event A is

P[\,k \text{ heads in a specific order in } n \text{ tosses of a specific coin} \mid \mathbf{p} = p\,] = p^k (1-p)^{n-k}.    (2-53)

Substitute this into the continuous form of Bayes rule (2-49) to obtain

f(p|A) = f(p \mid k \text{ heads, in a specific order, in } n \text{ tosses of a selected coin}) = \frac{p^k (1-p)^{n-k}\, f_{\mathbf{p}}(p)}{\int_{0}^{1} \xi^k (1-\xi)^{n-k}\, f_{\mathbf{p}}(\xi)\, d\xi},    (2-54)

a result known as the a-posteriori density of p given A. In (2-54), the quantity ξ is a dummy variable of integration.

Suppose that the a-priori density f_p(p) is smooth and slowly changing around the value p = k/n (indicating a lot of uncertainty in the value of p). Then, for large values of n, the numerator p^k(1-p)^{n-k} f_p(p) and the a-posteriori density f(p|A) have a sharp peak at p = k/n, indicating little uncertainty in the value of p. When f(p|A) is peaked at p = k/n, the MAP estimate of the probability of heads (for the selected coin) is the value p̂ = k/n.

Example 2-7: Continuing Examples 2-5 and 2-6, for σ > 0, we use the a-priori density

f_{\mathbf{p}}(p) = \frac{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[ -\frac{(p - 1/2)^2}{2\sigma^2} \right]}{2\,G\!\left( \frac{1}{2\sigma} \right) - 1}, \quad 0 \le p \le 1,    (2-55)

where G(x) is a zero mean, unit variance Gaussian distribution function (verify that there is unit area under f_p(p)). Also, use numerical integration to evaluate the denominator of (2-54). For σ = .3, n = 10 and k = 3, the a-posteriori density f(p|A) was computed, and the results are plotted on Figure 2-11. For σ = .3, n = 50 and k = 15, the calculation was performed a second time, and the results appear on Figure 2-11. For both, the MAP estimate of p is near .3, since this is where the plots of f(p|A) peak.

Figure 2-11: Plot of a-priori and a-posteriori densities.
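The computation behind Examples 2-5 through 2-7 is easy to reproduce numerically. A sketch (not the exact program used for Figure 2-11): it evaluates the posterior (2-54) on a grid for the truncated-Gaussian prior (2-55), reports the MAP estimate, and also computes P[H] from (2-47). The values σ = 0.3, n = 10, k = 3 match the first case of Example 2-7.

```python
import math

def prior(p, sigma=0.3):
    """Truncated-Gaussian a-priori density of Eq. (2-55), centered at p = 1/2 on [0, 1]."""
    g = math.exp(-(p - 0.5)**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)
    norm = math.erf(0.5 / (sigma * math.sqrt(2)))   # area of the Gaussian over [0, 1]
    return g / norm

n, k = 10, 3
dp = 0.001
grid = [i * dp for i in range(1001)]

# Unnormalized numerator of Eq. (2-54), then normalize numerically.
numer = [p**k * (1 - p)**(n - k) * prior(p) for p in grid]
Z = sum(numer) * dp
posterior = [v / Z for v in numer]

p_map = grid[max(range(len(grid)), key=lambda i: posterior[i])]
P_H = sum(p * prior(p) for p in grid) * dp          # Eq. (2-47): P[H] = E[p]

print(f"MAP estimate of p: {p_map:.3f}  (k/n = {k/n:.3f})")
print(f"P[H] under the prior: {P_H:.3f}")
```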

Expectation

Let X denote a random variable with density f(x). The expected value of X is defined as

E[X] \equiv \int_{-\infty}^{\infty} x\, f(x)\, dx.    (2-56)

Also, E[X] is known as the mean, or average value, of X. If f is symmetrical about some value η, then η is the expected value of X. For example, let X be N(η; σ²), so that

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[ -\frac{(x-\eta)^2}{2\sigma^2} \right].    (2-57)

Now, (2-57) is symmetrical about η, so E[X] = η.

Suppose that random variable X is discrete and takes on the value x_i with P[X = x_i] = p_i. Then the expected value of X, as defined by (2-56), reduces to

E[X] = \sum_i x_i\, p_i.    (2-58)

Example 2-8: Consider a B(n,p) random variable X (that is, X is binomial with parameters n and p). Using (2-58), we can write

E[X] = \sum_{k=0}^{n} k\, P[X = k] = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k}.    (2-59)

To evaluate this result, first consider the well-known, and extremely useful, binomial expansion

(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.    (2-60)

With respect to x, differentiate (2-60); then, multiply the derivative by x to obtain

n x (1 + x)^{n-1} = \sum_{k=0}^{n} k \binom{n}{k} x^k.    (2-61)

Into (2-61), substitute x = p/q, and then multiply both sides by qⁿ. This results in

\sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} = n p\, q^{n-1} \left( 1 + \frac{p}{q} \right)^{n-1} = np\,(p+q)^{n-1} = np.    (2-62)

From this and (2-59), we conclude that a B(n,p) random variable has

E[X] = np.    (2-63)

We now provide a second, completely different, evaluation of E[X] for a B(n,p) random variable. Define n new random variables

X_i = 1, if the i-th trial is a "success", 1 ≤ i ≤ n;  X_i = 0, otherwise.    (2-64)

Note that

E[X_i] = 1 \cdot p + 0 \cdot (1 - p) = p, \quad 1 \le i \le n.    (2-65)

B(n,p) random variable X is the number of successes out of n independent trials; it can be written as

X = X_1 + X_2 + \cdots + X_n.    (2-66)

With the use of (2-65), the expected value of X can be evaluated as

E[X] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + E[X_2] + \cdots + E[X_n] = np,    (2-67)

a result equivalent to (2-63).

Consider random variable X, with mean η = E[X], and constant k. Define the new random variable Y ≡ X + k. The mean value of Y is

\eta_y = E[Y] = E[X + k] = E[X] + k = \eta + k.    (2-68)

A random variable, when translated by a constant, has a mean that is translated by the same constant. If instead Y ≡ kX, the mean of Y is E[Y] = E[kX] = kη.
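The indicator-variable decomposition (2-64) through (2-67) translates directly into a simulation. The sketch below builds X as a sum of n Bernoulli indicators and compares the sample mean with np; the values n = 20 and p = 0.3 are arbitrary choices for the illustration.

```python
import random

def binomial_sample(n, p):
    """Draw X = X_1 + ... + X_n, each X_i an independent Bernoulli(p) indicator, Eqs. (2-64), (2-66)."""
    return sum(1 for _ in range(n) if random.random() < p)

n, p, trials = 20, 0.3, 50_000
samples = [binomial_sample(n, p) for _ in range(trials)]
sample_mean = sum(samples) / trials

print(f"sample mean = {sample_mean:.3f}")
print(f"E[X] = n*p  = {n * p:.3f}")   # Eqs. (2-63)/(2-67)
```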

In what follows, we extend (2-56) to arbitrary functions of random variable X. Let g(x) be any function of x, and let X be any random variable with density f(x). In Chapter 4, we will discuss transformations of the form Y = g(X); this defines the new random variable Y in terms of the old random variable X. We will argue that the expected value of Y can be computed as

E[Y] = E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx.    (2-69)

This brief heads-up note (on what is to come in Chapter 4) is used next to define certain statistical averages of functions of X.

Variance and Standard Deviation

The variance of random variable X is

\mathrm{Var}[X] \equiv E[(X - \eta)^2] = \int_{-\infty}^{\infty} (x - \eta)^2\, f(x)\, dx.    (2-70)

Almost always, variance is denoted by the symbol σ². The square root of the variance is called the standard deviation of the random variable, and it is denoted as σ. Finally, variance is a measure of uncertainty (or dispersion about the mean). The smaller (alternatively, larger) σ² is, the more (alternatively, less) likely it is for the random variable to take on values near its mean. Finally, note that (2-70) is an application of the basic result (2-69) with g(x) = (x - η)².

Example 2-9: Let X be N(η; σ²); then

\mathrm{VAR}[X] = E[(X - \eta)^2] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x - \eta)^2 \exp\!\left[ -\frac{(x-\eta)^2}{2\sigma^2} \right] dx.    (2-71)

Let y = (x - η)/σ, dx = σ dy, so that

\mathrm{Var}[X] = \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 e^{-y^2/2}\, dy = \sigma^2,    (2-72)

a result obtained by looking up the integral in a table of integrals.

Consider random variable X with mean η and variance σ² = E[{X - η}²]. Let k denote an arbitrary constant. Define the new random variable Y ≡ kX. The variance of Y is

\sigma_y^2 = E[\{Y - \eta_y\}^2] = E[k^2\{X - \eta\}^2] = k^2 \sigma^2.    (2-73)

The variance of kX is simply k² times the variance of X. On the other hand, adding constant k to a random variable does not change the variance; that is, VAR[X + k] = VAR[X].

Consider random variable X with mean η and variance σ². Often, we can simplify a problem by centering and normalizing X to define the new random variable

Y \equiv \frac{X - \eta}{\sigma}.    (2-74)

Note that E[Y] = 0 and VAR[Y] = 1.

Moments

The n-th moment of random variable X is defined as

m_n \equiv E[X^n] = \int_{-\infty}^{\infty} x^n\, f(x)\, dx.    (2-75)

The n-th moment about the mean is defined as

\mu_n \equiv E[(X - \eta)^n] = \int_{-\infty}^{\infty} (x - \eta)^n\, f(x)\, dx.    (2-76)

Note that (2-75) and (2-76) are basic applications of (2-69).

Variance in Terms of Second Moment and Mean

Note that the variance can be expressed as

\mathrm{Var}[X] = E[(X - \eta)^2] = E[X^2 - 2\eta X + \eta^2] = E[X^2] - 2\eta E[X] + E[\eta^2],    (2-77)

where the linearity of the operator E[·] has been used. Now, constants come out front of expectations, and the expectation of a constant is the constant. Hence, Equation (2-77) leads to

\sigma^2 = E[X^2] - 2\eta E[X] + \eta^2 = m_2 - \eta^2,    (2-78)

the second moment minus the square of the mean. In what follows, this formula will be used extensively.

Example 2-10: Let X be N(0; σ²) and find E[Xⁿ]. First, consider the case of n an odd integer:

E[X^n] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x^n \exp\!\left[ -\frac{x^2}{2\sigma^2} \right] dx = 0, \quad n \text{ odd},    (2-79)

since an integral of an odd function over symmetrical limits is zero. Now, consider the case n an even integer. Start with the known tabulated integral

\int_{-\infty}^{\infty} e^{-\alpha x^2}\, dx = \sqrt{\pi}\,\alpha^{-1/2}, \quad \alpha > 0.    (2-80)

Repeated differentiations with respect to α yield

first d/dα:  \int_{-\infty}^{\infty} x^2 e^{-\alpha x^2}\, dx = \frac{1}{2}\sqrt{\pi}\,\alpha^{-3/2}
second d/dα:  \int_{-\infty}^{\infty} x^4 e^{-\alpha x^2}\, dx = \frac{1 \cdot 3}{2^2}\sqrt{\pi}\,\alpha^{-5/2}
k-th d/dα:  \int_{-\infty}^{\infty} x^{2k} e^{-\alpha x^2}\, dx = \frac{1 \cdot 3 \cdot 5 \cdots (2k-1)}{2^k}\sqrt{\pi}\,\alpha^{-(2k+1)/2}.

Let n = 2k (remember, this is the case n even) and α = 1/(2σ²) to obtain

\int_{-\infty}^{\infty} x^n e^{-x^2/2\sigma^2}\, dx = \frac{1 \cdot 3 \cdot 5 \cdots (n-1)}{2^{n/2}}\sqrt{\pi}\,(2\sigma^2)^{(n+1)/2} = 1 \cdot 3 \cdot 5 \cdots (n-1)\,\sigma^{n+1}\sqrt{2\pi}.    (2-81)

From this, we conclude that the n-th moment of a zero-mean, Gaussian random variable is

m_n = E[X^n] = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x^n e^{-x^2/2\sigma^2}\, dx = 1 \cdot 3 \cdot 5 \cdots (n-1)\,\sigma^n, \quad n = 2k \; (n \text{ even});\qquad = 0, \quad n = 2k-1 \; (n \text{ odd}).    (2-82)

Example 2-11 (Rayleigh Distribution): A random variable X is Rayleigh distributed with parameter σ if its density function has the form

f_X(x) = \frac{x}{\sigma^2} \exp\!\left[ -\frac{x^2}{2\sigma^2} \right], \; x \ge 0;\qquad = 0, \; x < 0.    (2-83)

This random variable has many applications in communication theory; for example, it describes the envelope of a narrowband Gaussian noise process (as described in Chapter 9 of these notes). The n-th moment of a Rayleigh random variable can be computed by writing

E[X^n] = \int_{0}^{\infty} x^n\, \frac{x}{\sigma^2}\, e^{-x^2/2\sigma^2}\, dx = \frac{1}{\sigma^2} \int_{0}^{\infty} x^{n+1} e^{-x^2/2\sigma^2}\, dx.    (2-84)

We consider two cases, n even and n odd. For the case n odd, we have n+1 even, and (2-84) becomes

E[X^n] = \frac{1}{2\sigma^2} \int_{-\infty}^{\infty} x^{n+1} e^{-x^2/2\sigma^2}\, dx = \frac{\sqrt{2\pi}\,\sigma}{2\sigma^2} \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x^{n+1} e^{-x^2/2\sigma^2}\, dx \right].    (2-85)

On the right-hand side of (2-85), the bracket contains the (n+1)-th moment of a N(0; σ²) random variable (compare (2-82) and the right-hand side of (2-85)). Hence, we can write

E[X^n] = \frac{\sqrt{2\pi}\,\sigma}{2\sigma^2}\; 1 \cdot 3 \cdot 5 \cdots n\;\sigma^{n+1} = 1 \cdot 3 \cdot 5 \cdots n\;\sigma^{n}\sqrt{\pi/2}, \quad n = 2k+1 \; (n \text{ odd}).    (2-86)

Now, consider the case n even, n+1 odd. For this case, (2-84) becomes

E[X^n] = \frac{1}{\sigma^2} \int_{0}^{\infty} x^{n+1} e^{-x^2/2\sigma^2}\, dx.    (2-87)

Substitute y = x²/2σ², so that dy = (x/σ²)dx, in (2-87) and obtain

E[X^n] = \int_{0}^{\infty} (2\sigma^2 y)^{n/2} e^{-y}\, dy = 2^{n/2}\sigma^{n} \int_{0}^{\infty} y^{n/2} e^{-y}\, dy = 2^{n/2}\sigma^{n}\,(n/2)!    (2-88)

(note that n/2 is an integer here), where we have used the Gamma function result

\int_{0}^{\infty} y^{k} e^{-y}\, dy = \Gamma(k+1) = k!,    (2-89)

k ≥ 0 an integer. Hence, for a Rayleigh random variable X, we have determined that

E[X^n] = 1 \cdot 3 \cdot 5 \cdots n\;\sigma^{n}\sqrt{\pi/2}, \quad n \text{ odd};\qquad = 2^{n/2}\sigma^{n}\,(n/2)!, \quad n \text{ even}.    (2-90)
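The moment formula (2-90) can be sanity-checked by Monte Carlo. A Rayleigh variate is obtained by inverting the CDF (2-26): x = σ√(-2 ln U) with U uniform on (0, 1]. The sketch below compares sample moments against (2-90) for an arbitrary σ = 1.5.

```python
import math
import random

def rayleigh_sample(sigma):
    """Generate a Rayleigh variate by inverting the CDF (2-26): x = sigma*sqrt(-2 ln U)."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - random.random()))

def rayleigh_moment(n, sigma):
    """E[X^n] from Eq. (2-90)."""
    if n % 2 == 1:                                   # n odd: 1*3*5*...*n * sigma^n * sqrt(pi/2)
        odd_product = math.prod(range(1, n + 1, 2))
        return odd_product * sigma**n * math.sqrt(math.pi / 2.0)
    return 2**(n // 2) * sigma**n * math.factorial(n // 2)   # n even

sigma, trials = 1.5, 200_000
samples = [rayleigh_sample(sigma) for _ in range(trials)]
for n in (1, 2, 3):
    estimate = sum(x**n for x in samples) / trials
    print(f"n = {n}: Monte Carlo = {estimate:.3f}, Eq. (2-90) = {rayleigh_moment(n, sigma):.3f}")
```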

In particular, Equation (2-90) can be used to obtain the mean and variance

E[X] = \sigma\sqrt{\pi/2}, \qquad \mathrm{Var}[X] = E[X^2] - E[X]^2 = \left( 2 - \frac{\pi}{2} \right)\sigma^2,    (2-91)

given that X is a Rayleigh random variable.

Example 2-12: Let X be Poisson with parameter λ, so that

P[X = k] = e^{-\lambda}\frac{\lambda^k}{k!}, \; k \ge 0, \quad\text{and}\quad f(x) = e^{-\lambda}\sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\,\delta(x - k).    (2-92)

Show that E[X] = λ and VAR[X] = λ. Recall that

e^{\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}.    (2-93)

With respect to λ, differentiate (2-93) to obtain

e^{\lambda} = \sum_{k=1}^{\infty} k\,\frac{\lambda^{k-1}}{k!} = \frac{1}{\lambda}\sum_{k=1}^{\infty} k\,\frac{\lambda^{k}}{k!}.    (2-94)

Multiply both sides of this result by λe^{-λ} to obtain

\lambda = \sum_{k=1}^{\infty} k\, e^{-\lambda}\frac{\lambda^{k}}{k!} = E[X],    (2-95)

as claimed. With respect to λ, differentiate (2-94) (obtain the second derivative of (2-93)) and obtain

e^{\lambda} = \sum_{k=1}^{\infty} k(k-1)\,\frac{\lambda^{k-2}}{k!} = \frac{1}{\lambda^2}\sum_{k=1}^{\infty} k^2\,\frac{\lambda^{k}}{k!} - \frac{1}{\lambda^2}\sum_{k=1}^{\infty} k\,\frac{\lambda^{k}}{k!}.

Multiply both sides of this result by λ²e^{-λ} to obtain

\lambda^2 = \sum_{k=1}^{\infty} k^2\, e^{-\lambda}\frac{\lambda^{k}}{k!} - \sum_{k=1}^{\infty} k\, e^{-\lambda}\frac{\lambda^{k}}{k!} = \sum_{k=1}^{\infty} k^2\, P[X = k] - \sum_{k=1}^{\infty} k\, P[X = k].    (2-96)

Note that (2-96) is simply λ² = E[X²] - E[X]. Finally, a Poisson random variable has a variance given by

\mathrm{Var}[X] = E[X^2] - E[X]^2 = (\lambda^2 + \lambda) - \lambda^2 = \lambda,    (2-97)

as claimed.

Conditional Mean

Let M denote an event. The conditional density f(x|M) can be used to define the conditional mean

E[X|M] \equiv \int_{-\infty}^{\infty} x\, f(x|M)\, dx.    (2-98)

The conditional mean has many applications, including estimation theory, detection theory, etc.

Example 2-13: Let X be Gaussian with zero mean and variance σ² (i.e., X is N(0; σ²)). Let M = [X > 0]; find E[X|M] = E[X|X > 0]. First, we must find the conditional density f(x|X > 0); from previous work in this chapter, we can write

F(x \mid X > 0) = \frac{P[X \le x,\, X > 0]}{P[X > 0]} = \frac{P[X \le x,\, X > 0]}{1 - P[X \le 0]} = \frac{P[X \le x,\, X > 0]}{1 - F_X(0)} = \frac{F_X(x) - F_X(0)}{1 - F_X(0)}, \; x \ge 0;\qquad = 0, \; x < 0,

so that

f(x \mid X > 0) = 2 f_X(x), \; x \ge 0;\qquad = 0, \; x < 0.

From (2-98) we can write

E[X \mid X > 0] = \frac{2}{\sqrt{2\pi}\,\sigma} \int_{0}^{\infty} x\, e^{-x^2/2\sigma^2}\, dx.

Now, set y = x²/2σ², dy = x dx/σ², to obtain

E[X \mid X > 0] = \frac{2\sigma}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-y}\, dy = \sigma\sqrt{\frac{2}{\pi}}.
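A quick Monte Carlo confirmation of Example 2-13: draw N(0; σ²) samples, keep those greater than zero, and compare their average to σ√(2/π). The value σ = 2 is an arbitrary choice for the illustration.

```python
import math
import random

sigma, trials = 2.0, 200_000
positive = [x for x in (random.gauss(0.0, sigma) for _ in range(trials)) if x > 0]

estimate = sum(positive) / len(positive)      # sample estimate of E[X | X > 0]
theory = sigma * math.sqrt(2.0 / math.pi)     # result of Example 2-13

print(f"Monte Carlo E[X|X>0] = {estimate:.4f}")
print(f"sigma*sqrt(2/pi)     = {theory:.4f}")
```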

Tchebycheff Inequality

A measure of the concentration of a random variable near its mean is its variance. Consider a random variable X with mean η, variance σ², and density f_X(x). The larger σ², the more "spread-out" the density function, and the more probable it is to find values of X "far" from the mean. Let ε denote an arbitrary small positive number. The Tchebycheff inequality says that the probability that X is outside (η - ε, η + ε) is negligible if σ/ε is sufficiently small.

Theorem (Tchebycheff's Inequality): Consider random variable X with mean η and variance σ². For any ε > 0, we have

P[\,|X - \eta| \ge \varepsilon\,] \le \frac{\sigma^2}{\varepsilon^2}.    (2-99)

Proof: Note that

\sigma^2 = \int_{-\infty}^{\infty} (x - \eta)^2 f(x)\, dx \ge \int_{\{x : |x - \eta| \ge \varepsilon\}} (x - \eta)^2 f(x)\, dx \ge \varepsilon^2 \int_{\{x : |x - \eta| \ge \varepsilon\}} f(x)\, dx = \varepsilon^2\, P[\,|X - \eta| \ge \varepsilon\,].

This leads to the Tchebycheff inequality

P[\,|X - \eta| \ge \varepsilon\,] \le \frac{\sigma^2}{\varepsilon^2}.    (2-100)

The significance of Tchebycheff's inequality is that it holds for any random variable, and it can be used without explicit knowledge of f(x). However, the bound is very "conservative" (or "loose"), so it may not offer much information in some applications. For example, consider Gaussian X. Note that

P[\,|X - \eta| \ge 3\sigma\,] = 1 - P[\,|X - \eta| < 3\sigma\,] = 1 - \{G(3) - G(-3)\} = 2\{1 - G(3)\} \approx .0027,    (2-101)

where G(3) is obtained from a table containing values of the Gaussian integral. However, the Tchebycheff inequality gives the rather "loose" upper bound of

P[\,|X - \eta| \ge 3\sigma\,] \le 1/9.    (2-102)

Certainly, inequality (2-102) is correct; however, it is a very crude upper bound, as can be seen from inspection of (2-101).

Generalizations of Tchebycheff's Inequality

For a given random variable X, suppose that f_X(x) = 0 for x < 0. Then, for any α > 0, we have

P[X \ge \alpha] \le \frac{\eta}{\alpha}.    (2-103)

To show (2-103), note that

\eta = E[X] = \int_{0}^{\infty} x f(x)\, dx \ge \int_{\alpha}^{\infty} x f(x)\, dx \ge \alpha \int_{\alpha}^{\infty} f(x)\, dx = \alpha\, P[X \ge \alpha],    (2-104)

so that P[X ≥ α] ≤ η/α, as claimed.

Corollary: Let X be an arbitrary random variable, and let α and n be an arbitrary real number and positive integer, respectively. The random variable |X - α|ⁿ takes on only non-negative values. Hence

P[\,|X - \alpha|^n \ge \varepsilon^n\,] \le \frac{E[\,|X - \alpha|^n\,]}{\varepsilon^n},    (2-105)

which implies

P[\,|X - \alpha| \ge \varepsilon\,] \le \frac{E[\,|X - \alpha|^n\,]}{\varepsilon^n}.    (2-106)
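The comparison in (2-101)-(2-102) is reproduced below for several multiples of σ: the exact Gaussian tail probability 2{1 - G(k)} versus the Tchebycheff bound σ²/ε² = 1/k² for ε = kσ.

```python
import math

def G(x):
    """N(0;1) distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for k in (1, 2, 3, 4):
    exact = 2.0 - 2.0 * G(k)     # P[|X - eta| >= k*sigma] for Gaussian X, as in Eq. (2-101)
    bound = 1.0 / k**2           # Tchebycheff bound with eps = k*sigma, Eq. (2-99)
    print(f"k = {k}: exact Gaussian tail = {exact:.5f}, Tchebycheff bound = {bound:.5f}")
```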

The Tchebycheff inequality is a special case with α = η and n = 2.

Application: System Reliability

Often, systems fail in a random manner. For a particular system, we denote t_f as the time interval from the moment a system is put into operation until it fails; t_f is the time-to-failure random variable. The distribution F_a(t) = P[t_f ≤ t] is the probability the system fails at, or prior to, time t. Implicit here is the assumption that the system is placed into service at t = 0. Also, we require that F_a(t) = 0 for t ≤ 0. The quantity

R(t) \equiv 1 - F_a(t) = P[t_f > t]    (2-107)

is the system reliability. R(t) is the probability the system is functioning at time t > 0. We are interested in simple methods to quantify system reliability. One such measure of system reliability is the mean time before failure

\mathrm{MTBF} = E[t_f] = \int_{0}^{\infty} t\, f_a(t)\, dt,    (2-108)

where f_a = dF_a/dt is the density function that describes random variable t_f.

Given that a system is functioning at time t₀, t₀ ≥ 0, we are interested in the probability that a system fails at, or prior to, time t, where t ≥ t₀. We express this conditional distribution function as

F(t \mid t_f > t_0) = \frac{P[t_f \le t,\; t_f > t_0]}{P[t_f > t_0]} = \frac{P[t_0 < t_f \le t]}{P[t_f > t_0]} = \frac{F_a(t) - F_a(t_0)}{1 - F_a(t_0)}, \quad t \ge t_0.    (2-109)

The conditional density can be obtained by differentiating (2-109) to obtain

f(t \mid t_f > t_0) = \frac{d}{dt} F(t \mid t_f > t_0) = \frac{f_a(t)}{1 - F_a(t_0)}, \quad t \ge t_0.    (2-110)

F(t|t_f > t₀) and f(t|t_f > t₀) describe t_f conditioned on the event t_f > t₀. The quantity f(t|t_f > t₀)dt is, to first-order in dt, the probability that the system fails between t and t + dt, given that it was working at t₀.

Example 2-14: Suppose that the time-to-failure random variable t_f is exponentially distributed. That is, suppose that

F_a(t) = 1 - e^{-\lambda t}, \; t \ge 0;\; = 0, \; t < 0,\qquad f_a(t) = \lambda e^{-\lambda t}, \; t \ge 0;\; = 0, \; t < 0,    (2-111)

for some constant λ > 0. From (2-110), we see that

f(t \mid t_f > t_0) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t_0}} = \lambda e^{-\lambda (t - t_0)} = f_a(t - t_0), \quad t > t_0.    (2-112)

That is, if the system is working at time t₀, then the probability that it fails between t₀ and t depends only on the positive difference t - t₀, not on absolute t. The system does not wear out (become more likely to fail) as time progresses!

With (2-110), we define f(t|t_f > t₀) for t > t₀. However, the function

\beta(t) \equiv \frac{f_a(t)}{1 - F_a(t)},    (2-113)

known as the conditional failure rate (also known as the hazard rate), is very useful. To first order, β(t)dt (when this quantity exists) is the probability that a functioning-at-t system will fail between t and t + dt.

Example 2-15 (Continuation of Example 2-14): Assume the system has f_a and F_a as defined in Example 2-14. Substitute (2-111) into (2-113) to obtain

\beta(t) = \frac{\lambda e^{-\lambda t}}{1 - \{1 - e^{-\lambda t}\}} = \lambda.    (2-114)

That is, the conditional failure rate is the constant λ. As stated in Example 2-14, the system does not wear out as time progresses! If conditional failure rate β(t) is a constant, we say that the system is a "good as new" system. That is, it does not wear out (become more likely to fail) over time.

Examples 2-14 and 2-15 show that if a system's time-to-failure random variable t_f is exponentially distributed, then

f(t \mid t_f > t_0) = f_a(t - t_0), \quad t > t_0,    (2-115)

and

\beta(t) = \text{constant},    (2-116)

so the system is a "good as new" system. The converse is true as well. That is, if β(t) is a constant λ for a system, then the system's time-to-failure random variable t_f is exponentially distributed. To argue this, use (2-113) to write

f_a(t) = \lambda\,[\,1 - F_a(t)\,].    (2-117)

But (2-117) leads to

\frac{dF_a}{dt} + \lambda F_a(t) = \lambda.    (2-118)

Since F_a(t) = 0 for t ≤ 0, we must have

F_a(t) = 1 - e^{-\lambda t}, \; t \ge 0;\qquad = 0, \; t < 0,

so t_f is exponentially distributed.

For β(t) to be equal to a constant, failures must be truly random in nature. A constant β requires that there be no time epochs where failure is more likely (no "Year 2000"-type problems!). Random variable t_f is said to exhibit the Markov property, or it is said to be memoryless, if its conditional density obeys (2-115). We have established the following theorem.

Theorem: The following are equivalent:
1) A system is a "good-as-new" system
2) β(t) = constant
3) f(t|t_f > t₀) = f_a(t - t₀), t > t₀
4) t_f is exponentially distributed.

The previous theorem, and the Markov property, is stated in the context of system reliability. However, both have far-reaching consequences in other areas that have nothing to do with system reliability. Basically, in problems dealing with random arrival times, the time between two successive arrivals is exponentially distributed if

1) the arrivals are independent of each other, and
2) the arrival time t after any specified fixed time t₀ is described by a density function that depends only on the difference t - t₀ (the arrival time random variable obeys the Markov property).

In Chapter 9, we will study shot noise (caused by the random arrival of electrons at a semiconductor junction, vacuum tube anode, etc.), an application of the above theorem and the Markov property.
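The memoryless property established above can be observed empirically. The sketch below draws exponential failure times (the rate λ = 0.5 and conditioning time t₀ = 3 are arbitrary choices), conditions on survival past t₀, and shows that the mean remaining life matches the unconditional MTBF = 1/λ, i.e., the system is "good as new" at t₀.

```python
import random

lam, t0, trials = 0.5, 3.0, 200_000    # failure rate, conditioning time (arbitrary values)

lifetimes = [random.expovariate(lam) for _ in range(trials)]
survivors = [t for t in lifetimes if t > t0]

mtbf = sum(lifetimes) / len(lifetimes)                        # estimate of E[t_f] = 1/lambda
remaining = sum(t - t0 for t in survivors) / len(survivors)   # estimate of E[t_f - t0 | t_f > t0]

print(f"MTBF estimate                  = {mtbf:.3f}   (1/lambda = {1/lam:.3f})")
print(f"mean remaining life given t_f > t0 = {remaining:.3f}")   # same value: memoryless
```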


More information

STA 256: Statistics and Probability I

STA 256: Statistics and Probability I Al Nosedal. University of Toronto. Fall 2014 1 2 3 4 5 My momma always said: Life was like a box of chocolates. You never know what you re gonna get. Forrest Gump. Experiment, outcome, sample space, and

More information

Answer Key for California State Standards: Algebra I

Answer Key for California State Standards: Algebra I Algebra I: Symbolic reasoning and calculations with symbols are central in algebra. Through the study of algebra, a student develops an understanding of the symbolic language of mathematics and the sciences.

More information

Elements of probability theory

Elements of probability theory 2 Elements of probability theory Probability theory provides mathematical models for random phenomena, that is, phenomena which under repeated observations yield di erent outcomes that cannot be predicted

More information

Notes on the Negative Binomial Distribution

Notes on the Negative Binomial Distribution Notes on the Negative Binomial Distribution John D. Cook October 28, 2009 Abstract These notes give several properties of the negative binomial distribution. 1. Parameterizations 2. The connection between

More information

Lecture 4: BK inequality 27th August and 6th September, 2007

Lecture 4: BK inequality 27th August and 6th September, 2007 CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound

More information

Review of Intermediate Algebra Content

Review of Intermediate Algebra Content Review of Intermediate Algebra Content Table of Contents Page Factoring GCF and Trinomials of the Form + b + c... Factoring Trinomials of the Form a + b + c... Factoring Perfect Square Trinomials... 6

More information

Lectures on Stochastic Processes. William G. Faris

Lectures on Stochastic Processes. William G. Faris Lectures on Stochastic Processes William G. Faris November 8, 2001 2 Contents 1 Random walk 7 1.1 Symmetric simple random walk................... 7 1.2 Simple random walk......................... 9 1.3

More information

Polynomials and Factoring

Polynomials and Factoring 7.6 Polynomials and Factoring Basic Terminology A term, or monomial, is defined to be a number, a variable, or a product of numbers and variables. A polynomial is a term or a finite sum or difference of

More information

Math 370, Spring 2008 Prof. A.J. Hildebrand. Practice Test 2 Solutions

Math 370, Spring 2008 Prof. A.J. Hildebrand. Practice Test 2 Solutions Math 370, Spring 008 Prof. A.J. Hildebrand Practice Test Solutions About this test. This is a practice test made up of a random collection of 5 problems from past Course /P actuarial exams. Most of the

More information

Math 370, Actuarial Problemsolving Spring 2008 A.J. Hildebrand. Practice Test, 1/28/2008 (with solutions)

Math 370, Actuarial Problemsolving Spring 2008 A.J. Hildebrand. Practice Test, 1/28/2008 (with solutions) Math 370, Actuarial Problemsolving Spring 008 A.J. Hildebrand Practice Test, 1/8/008 (with solutions) About this test. This is a practice test made up of a random collection of 0 problems from past Course

More information

About the Gamma Function

About the Gamma Function About the Gamma Function Notes for Honors Calculus II, Originally Prepared in Spring 995 Basic Facts about the Gamma Function The Gamma function is defined by the improper integral Γ) = The integral is

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Zeros of a Polynomial Function

Zeros of a Polynomial Function Zeros of a Polynomial Function An important consequence of the Factor Theorem is that finding the zeros of a polynomial is really the same thing as factoring it into linear factors. In this section we

More information

BANACH AND HILBERT SPACE REVIEW

BANACH AND HILBERT SPACE REVIEW BANACH AND HILBET SPACE EVIEW CHISTOPHE HEIL These notes will briefly review some basic concepts related to the theory of Banach and Hilbert spaces. We are not trying to give a complete development, but

More information

This is a square root. The number under the radical is 9. (An asterisk * means multiply.)

This is a square root. The number under the radical is 9. (An asterisk * means multiply.) Page of Review of Radical Expressions and Equations Skills involving radicals can be divided into the following groups: Evaluate square roots or higher order roots. Simplify radical expressions. Rationalize

More information

Real Roots of Univariate Polynomials with Real Coefficients

Real Roots of Univariate Polynomials with Real Coefficients Real Roots of Univariate Polynomials with Real Coefficients mostly written by Christina Hewitt March 22, 2012 1 Introduction Polynomial equations are used throughout mathematics. When solving polynomials

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

4 Sums of Random Variables

4 Sums of Random Variables Sums of a Random Variables 47 4 Sums of Random Variables Many of the variables dealt with in physics can be expressed as a sum of other variables; often the components of the sum are statistically independent.

More information

Moving Least Squares Approximation

Moving Least Squares Approximation Chapter 7 Moving Least Squares Approimation An alternative to radial basis function interpolation and approimation is the so-called moving least squares method. As we will see below, in this method the

More information

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved.

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved. 3.4 The Binomial Probability Distribution Copyright Cengage Learning. All rights reserved. The Binomial Probability Distribution There are many experiments that conform either exactly or approximately

More information

minimal polyonomial Example

minimal polyonomial Example Minimal Polynomials Definition Let α be an element in GF(p e ). We call the monic polynomial of smallest degree which has coefficients in GF(p) and α as a root, the minimal polyonomial of α. Example: We

More information

Section 1-4 Functions: Graphs and Properties

Section 1-4 Functions: Graphs and Properties 44 1 FUNCTIONS AND GRAPHS I(r). 2.7r where r represents R & D ependitures. (A) Complete the following table. Round values of I(r) to one decimal place. r (R & D) Net income I(r).66 1.2.7 1..8 1.8.99 2.1

More information

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics Undergraduate Notes in Mathematics Arkansas Tech University Department of Mathematics An Introductory Single Variable Real Analysis: A Learning Approach through Problem Solving Marcel B. Finan c All Rights

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Normal Distribution as an Approximation to the Binomial Distribution

Normal Distribution as an Approximation to the Binomial Distribution Chapter 1 Student Lecture Notes 1-1 Normal Distribution as an Approximation to the Binomial Distribution : Goals ONE TWO THREE 2 Review Binomial Probability Distribution applies to a discrete random variable

More information

6.3 Conditional Probability and Independence

6.3 Conditional Probability and Independence 222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

Section 5.0A Factoring Part 1

Section 5.0A Factoring Part 1 Section 5.0A Factoring Part 1 I. Work Together A. Multiply the following binomials into trinomials. (Write the final result in descending order, i.e., a + b + c ). ( 7)( + 5) ( + 7)( + ) ( + 7)( + 5) (

More information

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

More information

Finance 400 A. Penati - G. Pennacchi Market Micro-Structure: Notes on the Kyle Model

Finance 400 A. Penati - G. Pennacchi Market Micro-Structure: Notes on the Kyle Model Finance 400 A. Penati - G. Pennacchi Market Micro-Structure: Notes on the Kyle Model These notes consider the single-period model in Kyle (1985) Continuous Auctions and Insider Trading, Econometrica 15,

More information

Section 3-3 Approximating Real Zeros of Polynomials

Section 3-3 Approximating Real Zeros of Polynomials - Approimating Real Zeros of Polynomials 9 Section - Approimating Real Zeros of Polynomials Locating Real Zeros The Bisection Method Approimating Multiple Zeros Application The methods for finding zeros

More information

For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )

For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i ) Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll

More information