Lecture Notes: Variance, Law of Large Numbers, Central Limit Theorem


CS244-Randomness and Computation
March 24

1 Variance

Definition, Basic Examples

The variance of a random variable is a measure of how much the value of the random variable differs from its expected value. Let $X$ be a random variable, and let $\mu = E(X)$ be its expected value. Then the variance is defined by

$$\mathrm{Var}(X) = E((X - \mu)^2).$$

A related quantity, which you can think of as the average deviation of $X$ from its mean, is the standard deviation of $X$, denoted $\sigma_X$, defined by

$$\sigma_X = \sqrt{\mathrm{Var}(X)}.$$

You might wonder why we don't use the more obvious $E(|X - \mu|)$ as a measure of the average deviation from the mean. The answer, in part, is given in the last section of these notes on the Central Limit Theorem.

Example. Bernoulli Random Variable

Let $X$ be a Bernoulli random variable that has the value 1 with probability $p$ and 0 with probability $q = 1 - p$. As we've seen, $E(X) = p$. So $(X - \mu)^2$ has the value $(1 - p)^2 = q^2$ with probability $p$, and the value $(0 - p)^2 = p^2$ with probability $q$. Thus

$$\mathrm{Var}(X) = pq^2 + qp^2 = pq(p + q) = pq,$$

and $\sigma_X = \sqrt{pq}$.
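The definition above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `variance` and `bernoulli_pmf` and the sample value `p = 0.3` are assumptions chosen for illustration. It computes $E((X - \mu)^2)$ directly from a PMF and compares the result with $pq$.

```python
import math

def variance(pmf):
    """Variance of a discrete random variable given as {value: probability}."""
    mu = sum(x * prob for x, prob in pmf.items())
    return sum(prob * (x - mu) ** 2 for x, prob in pmf.items())

def bernoulli_pmf(p):
    # Hypothetical helper: value 1 with probability p, value 0 with probability 1 - p.
    return {1: p, 0: 1 - p}

p = 0.3  # arbitrary illustrative value
var = variance(bernoulli_pmf(p))
sd = math.sqrt(var)
# The two printed variances should agree up to floating-point rounding.
print(var, p * (1 - p), sd)
```

The same `variance` function works for any finite PMF, so it can be reused for the dice examples.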

For instance, if $p = 1/2$ then $\mathrm{Var}(X) = 1/4$ and $\sigma_X = 1/2$. This represents a kind of extreme case, at least for Bernoulli random variables, of deviation from the mean. At the other extreme, if $p = 1$ then $X$ never varies (it always has the value 1), and $\mathrm{Var}(X) = \sigma_X = 0$.

Figure 1: PMFs for a fair die and two differently loaded dice

Example. Dice, loaded and unloaded.

Figure 1 shows the PMFs for three different distributions of the outcome of a single die roll. The diagram at left shows the standard uniform random variable, where each of the six outcomes has probability 1/6. The diagram in the center shows a loaded die that always results in 1 or 6, each with probability 1/2, and the diagram at right is the case where the die always results in 3 or 4, again each with probability 1/2. In all three instances, the expected value of the random variable is 3.5. For the fair die, the variance is

$$\sum_{j=1}^{6} \frac{1}{6}(j - 3.5)^2 = \frac{35}{12} \approx 2.9167,$$

so the standard deviation is the square root of this, about 1.71. For the die loaded to come out 1 or 6, the variance is just

$$\frac{1}{2}\left((1 - 3.5)^2 + (6 - 3.5)^2\right) = 6.25,$$

so the standard deviation is 2.5. Similarly, for the die loaded to come out 3 or 4, the variance is 0.25 and the standard deviation 0.5. In other words, the center diagram

is the most spread out, because its value is always quite far from the mean, and the right diagram the least spread out. The uniform distribution has values both far from the mean and close to the mean, giving a variance between those of the two loaded dice.

A (usually) easier way to compute the variance.

We can write $(X - \mu)^2 = X^2 - 2\mu X + \mu^2$, so by linearity of expectation, and the fact that $\mu$ is constant,

$$\begin{aligned}
\mathrm{Var}(X) &= E(X^2 - 2\mu X + \mu^2) \\
&= E(X^2) - E(2\mu X) + \mu^2 \\
&= E(X^2) - 2\mu E(X) + \mu^2 \\
&= E(X^2) - \mu^2.
\end{aligned}$$

Let's repeat the computation of the variance of the Bernoulli variable, using this simpler formula. Since $X$ has values 0 and 1, $X^2 = X$, so $E(X^2) = E(X) = p$. Thus

$$\mathrm{Var}(X) = E(X^2) - \mu^2 = p - p^2 = p(1 - p) = pq,$$

just as we found above. For the fair die,

$$E(X^2) = \frac{1}{6}(1 + 4 + 9 + 16 + 25 + 36) = \frac{91}{6} \approx 15.1667,$$

so the variance is $15.1667 - 3.5^2 \approx 2.9167$, which agrees with the previous example.

As another example, if we want to find $E(X^2)$ for a continuous random variable $X$, we compute

$$\int_{-\infty}^{\infty} x^2 p(x)\,dx,$$

where $p(x)$ is the probability density function. If we take $X$ to be the outcome of a spinner with values between 0 and 1, then $p$ is the uniform density that is 1 between 0 and 1 and 0 elsewhere. Thus

$$E(X^2) = \int_0^1 x^2\,dx = 1/3.$$

Since $E(X) = 1/2$, we have

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = 1/3 - (1/2)^2 = 1/12.$$

As was the case with the expected value, the variance of an infinite discrete or continuous random variable might not be defined, because the underlying infinite series or improper integral does not converge.
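The shortcut $\mathrm{Var}(X) = E(X^2) - \mu^2$ is easy to verify with exact arithmetic. The sketch below is an illustration only: the names `variance_shortcut`, `fair`, `loaded_ends` and `loaded_middle` are assumptions introduced here, and the three PMFs are the dice from the example above.

```python
from fractions import Fraction

def variance_shortcut(pmf):
    """Var(X) = E(X^2) - E(X)^2 for a PMF given as {value: probability}."""
    mean = sum(x * prob for x, prob in pmf.items())
    mean_sq = sum(x * x * prob for x, prob in pmf.items())
    return mean_sq - mean ** 2

# The three dice from the example, with exact rational probabilities.
fair = {j: Fraction(1, 6) for j in range(1, 7)}
loaded_ends = {1: Fraction(1, 2), 6: Fraction(1, 2)}
loaded_middle = {3: Fraction(1, 2), 4: Fraction(1, 2)}

for name, pmf in [("fair", fair), ("1 or 6", loaded_ends), ("3 or 4", loaded_middle)]:
    v = variance_shortcut(pmf)
    print(name, v, float(v))
```

Using `Fraction` avoids rounding, so the fair-die variance comes out as the exact fraction rather than a decimal approximation.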

2 Additivity for Independent Random Variables; Binomial Distribution

For a constant $c$ and random variable $X$ we have, by linearity of expectation,

$$\begin{aligned}
\mathrm{Var}(cX) &= E((cX)^2) - E(cX)^2 \\
&= E(c^2 X^2) - (cE(X))^2 \\
&= c^2 E(X^2) - c^2 E(X)^2 \\
&= c^2 (E(X^2) - E(X)^2) \\
&= c^2\,\mathrm{Var}(X).
\end{aligned}$$

What is the variance of the sum of two random variables? Again, by repeatedly applying the linearity of expectation,

$$\begin{aligned}
\mathrm{Var}(X + Y) &= E((X + Y)^2) - E(X + Y)^2 \\
&= E(X^2 + 2XY + Y^2) - (E(X) + E(Y))^2 \\
&= E(X^2) + 2E(XY) + E(Y^2) - (E(X)^2 + 2E(X)E(Y) + E(Y)^2) \\
&= (E(X^2) - E(X)^2) + (E(Y^2) - E(Y)^2) + 2(E(XY) - E(X)E(Y)) \\
&= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2(E(XY) - E(X)E(Y)).
\end{aligned}$$

The expression $E(XY) - E(X)E(Y)$ in the right-hand summand is called the covariance of $X$ and $Y$, and we will see it again later. If $X$ and $Y$ are independent then this expression is 0, so for independent random variables,

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$$

This does not hold in general if the random variables are not independent. In a homework problem you were asked to show that if $X$ and $Y$ represent, respectively, the smaller and larger values of the dice in a roll of two dice, then $E(X)E(Y) \ne E(XY)$, so in this case the sum of the variances is not equal to the variance of the sum.

Example. Binomial Random Variable

The binomial random variable $S_n$ counts the number of heads on $n$ coin tosses; as we've seen before,

$$P[S_n = k] = \binom{n}{k} p^k (1 - p)^{n - k}$$

if $p$ is the probability of heads. $S_n$ is itself the sum of $n$ pairwise independent copies of a Bernoulli random variable $X$, each with probability $p$. As a result, using the summation result above,

$$\mathrm{Var}(S_n) = n\,\mathrm{Var}(X) = np(1 - p),$$

and

$$\sigma_{S_n} = \sqrt{np(1 - p)}.$$

Further, let $Y = S_n/n$ be the average number of heads on $n$ tosses of a coin. Then

$$\mathrm{Var}(Y) = \mathrm{Var}(S_n)/n^2 = p(1 - p)/n.$$

3 Chebyshev Inequality

The notion of variance allows us to obtain a rough estimate of the probability that a random value differs by a certain amount from its mean. Before we state and prove this, we'll establish a simpler property called the Markov inequality: If $X$ is a positive-valued random variable with expected value $\mu$, and $c > 0$, then

$$P[X \ge c] \le \frac{E(X)}{c}.$$

Let's see why this is: We'll assume that $X$ is a discrete random variable with finitely many outcomes $0 < c_1 < c_2 < \cdots < c_n$ with probabilities $p_1, p_2, \ldots, p_n$, respectively. Let $c_i$ be the smallest of these values that is greater than or equal to $c$. Then

$$\begin{aligned}
P[X \ge c] &= p_i + p_{i+1} + \cdots + p_n \\
&= \frac{1}{c}(c p_i + c p_{i+1} + \cdots + c p_n) \\
&\le \frac{1}{c}(c_i p_i + c_{i+1} p_{i+1} + \cdots + c_n p_n) \\
&\le \frac{1}{c}(c_1 p_1 + c_2 p_2 + \cdots + c_n p_n) = \frac{E(X)}{c}.
\end{aligned}$$

Essentially the same argument works for infinite discrete and continuous random variables.

The Markov inequality by itself does not provide a great deal of information, but we will use it to show something important. Let $X$ be any random

variable with expected value $\mu$, and let $\epsilon$ be a positive number. (Think of $\epsilon$ as small.) When we apply the Markov inequality to the random variable $(X - \mu)^2$, we get

$$P[(X - \mu)^2 \ge \epsilon^2] \le \frac{E((X - \mu)^2)}{\epsilon^2} = \frac{\mathrm{Var}(X)}{\epsilon^2}.$$

Since the left-hand side is the same thing as $P[|X - \mu| \ge \epsilon]$, we can write this as

$$P[|X - \mu| \ge \epsilon] \le \frac{\mathrm{Var}(X)}{\epsilon^2}.$$

This is called Chebyshev's inequality. It provides a probability bound on how likely it is for a random variable to deviate a certain amount from its expected value.

Example. Let $S_n$ be the number of heads on $n$ tosses of a fair coin. Let's estimate the probability that the number of heads for $n = 100$ is between 40 and 60. Since $E(S_{100}) = 50$, we are asking for the complement of the probability that the number of heads is at least 61 or at most 39; in other words, that it differs by at least 11 from its mean. Since the variance of $S_{100}$ is $100/4 = 25$, Chebyshev's inequality gives

$$P[|S_{100} - 50| \ge 11] \le 25/11^2 \approx 0.2066,$$

so the probability of the number of heads being between 40 and 60 is at least 0.7934. As we shall see later, the probability is actually much closer to 1 than this. Chebyshev's inequality only gives a rough upper bound, not a close approximation. The advantage is that it applies to absolutely any random variable.

If we take $\epsilon = t\sigma_X$, then we can write the inequality as

$$P[|X - \mu| \ge t\sigma_X] \le \frac{1}{t^2}.$$

For instance, the probability of being at least two standard deviations from the mean is at most 1/4.

4 Law of Large Numbers

The numerical outcome of an experiment is a random variable $X$. Let us suppose $X$ has expected value $\mu$ and variance $\sigma^2$. If we perform $n$ independent trials of

the experiment, then the random variable $A_n = (X_1 + \cdots + X_n)/n$, giving the average of the outcomes of these trials, still has expected value $\mu$, but now the variance is $\sigma^2/n$. Thus by Chebyshev's inequality, for any $\epsilon > 0$,

$$\Pr[|A_n - \mu| > \epsilon] \le \frac{\sigma^2}{n\epsilon^2}.$$

What does this mean? Imagine that $\epsilon$ is rather small, say 0.01. Then the right-hand side is $10^4 \sigma^2/n$. If we choose $n$ to be large enough, then we can make this right-hand side as small as we like, which means that we can guarantee, with probability as large as we like, that the average $A_n$ is within 0.01 of its expected value. Put more simply, we can get as close to the mean as we like (with probability as high as we like) by repeating the experiment often enough. (Of course, we cannot make the probability exactly 1 no matter how many times we repeat the experiment, nor can we guarantee that the average will be exactly equal to the mean.)

This is the precise statement of what is often described colloquially as the law of averages. In probability theory, it is called the Law of Large Numbers.

5 Normal Approximation to Binomial Distribution

Figure 2 shows the PMFs of the random variables $S_n$ for $n = 20, 60, 100$, where $S_n$ denotes the number of heads on $n$ tosses of a coin with heads probability $p = 0.4$. These PMFs are given by the binomial probability distribution

$$P[S_n = k] = \binom{n}{k} p^k (1 - p)^{n - k}.$$

The three random variables of course all have different expected values (8, 24 and 40, respectively), so the PMFs are nonzero on different parts of the number line. By our previous calculations, the standard deviation of $S_n$ grows proportionally to the square root of $n$. As a result, the PMFs get more spread out as $n$ increases.

In Figure 3, $S_n$ is replaced by $S_n - np$, so that all three random variables have expected value 0. In Figure 4, we further change this to the random variables

$$\frac{S_n - np}{\sqrt{np(1 - p)}},$$

so that all three have variance 1.

The apparent result is that all three graphs seem to have the same basic shape, but just differ in the vertical scale. In Figure 5, the vertical scale is adjusted so

that all three have maximum value 1. All the points lie on the same smooth curve.

What is this shape? The smooth curve was drawn by plotting the graph of $y = e^{-x^2/2}$, and the crucial result illustrated by these pictures is that this shape closely approximates the binomial distribution. In other words, this famous bell curve represents a continuous probability density that is a kind of limiting case of the binomial distributions as $n$ grows large.

Let's be a little more precise about this: The function $e^{-x^2/2}$ is not itself a probability density function, because the area under the curve is not 1, but it becomes a density function when we divide by the area under the curve. (This area is $\sqrt{2\pi}$, a fact that is far from obvious.) The resulting function

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$

is called the standard normal density. Standard here means that it has mean 0 and standard deviation 1. The corresponding cumulative distribution function is

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt.$$

Since we cannot evaluate $\Phi(x)$ analytically, it has to be approximated numerically. You can compute $\Phi(x)$ in Python to high accuracy using a built-in related function erf, as

0.5 + 0.5*math.erf(x/math.sqrt(2))

Our observations above illustrate an important fact: the binomial distribution, adjusted to have mean 0 and standard deviation 1, is closely approximated by the normal distribution, especially as $n$ gets larger. Here are a few examples.

Example. Let us return to one hundred tosses of a fair coin, this time estimating the probability that the number of heads is between 45 and 55. Let $X$ be the random variable representing the number of heads, so we are asking for $P[45 \le X \le 55]$. We make the same modification as above, subtracting the expected value 50 and dividing by the standard deviation $\sqrt{100/4} = 5$. Thus we are looking for

$$P\left[-1 \le \frac{X - 50}{5} \le 1\right].$$
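The erf-based formula for $\Phi$ and the standardization step can be sketched in a few lines. This is an illustration, not the author's code: the function names `Phi` and `normal_interval` are assumptions, and the endpoints 45 and 55 are the values of $X$ whose standardized forms are $-1$ and $1$.

```python
import math

def Phi(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 + 0.5 * math.erf(x / math.sqrt(2))

def normal_interval(lo, hi, mean, sd):
    """Normal estimate of P[lo <= X <= hi] after standardizing by mean and sd."""
    return Phi((hi - mean) / sd) - Phi((lo - mean) / sd)

n, p = 100, 0.5
mean = n * p                      # expected number of heads
sd = math.sqrt(n * p * (1 - p))   # standard deviation of the number of heads
estimate = normal_interval(45, 55, mean, sd)  # equals Phi(1) - Phi(-1)
print(estimate)
```

`math.erf` is part of the Python standard library, so no external package is needed for this calculation.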

Figure 2: PMFs of binomial distribution with n = 20, 60, 100 and p = 0.4

Figure 3: The same distributions shifted to all have mean 0

Figure 4: ...and scaled horizontally to have standard deviation 1

Figure 5: The previous figure stretched vertically so that all three PMFs appear with the same height, superimposed on the graph of $e^{-x^2/2}$

Figure 6: The standard normal density $\varphi(x)$: the shaded area is $\Phi(1)$, the probability that the standard normal random variable has value less than 1.

Figure 7: The cumulative normal distribution function $\Phi(x)$.
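The rescaling illustrated in Figures 2 to 5 can also be checked numerically. The sketch below is an assumption-laden illustration: instead of normalizing each PMF to have maximum exactly 1 as in Figure 5, it multiplies the PMF by $\sigma\sqrt{2\pi}$, which is the scaling under which the points should approach the curve $e^{-z^2/2}$. The function names `binomial_pmf` and `max_gap_to_bell` are hypothetical.

```python
import math

def binomial_pmf(n, p, k):
    """P[S_n = k] for n tosses with heads probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def max_gap_to_bell(n, p):
    """Largest gap between the rescaled binomial PMF and exp(-z^2 / 2)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    gap = 0.0
    for k in range(n + 1):
        z = (k - mean) / sd  # shift to mean 0, scale to standard deviation 1
        scaled = binomial_pmf(n, p, k) * sd * math.sqrt(2 * math.pi)
        gap = max(gap, abs(scaled - math.exp(-z * z / 2)))
    return gap

# The values of n and p used in the figures.
for n in (20, 60, 100):
    print(n, max_gap_to_bell(n, 0.4))
```

The printed gaps give a rough measure of how closely the rescaled points follow the bell curve for each $n$.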

Approximating this by the standard normal distribution suggests that this probability is about

$$\Phi(1) - \Phi(-1) = 0.6827.$$

The exact value, of course, is

$$\sum_{j=45}^{55} \binom{100}{j} 2^{-100} = 0.7287,$$

so our approximation is not very impressive, only accurate to one decimal digit of precision. Part of the reason can be seen in the fact that the probability we are looking for is also equal to $P[44 < X < 56]$. One of the pitfalls of approximating a discrete distribution with a continuous one is that we don't necessarily know exactly where we should draw the lines between values of the random variable. It turns out that what works well in this situation is to use values for the continuous distribution that are halfway between the relevant values for the discrete distribution. In this case, that means we should view the problem as one of calculating

$$P[44.5 < X < 55.5].$$

This gives an estimate of

$$\Phi(1.1) - \Phi(-1.1) = 0.7287,$$

accurate to four decimal digits of precision.

6 Central Limit Theorem

The last section illustrated the fact that the sum of independent identically distributed Bernoulli random variables is approximately normally distributed. This is an instance of a much more general phenomenon: every random variable with a defined mean and variance has this property! To be more precise, let $X$ be a random variable with $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$ defined. Let $X_1, \ldots, X_n$ be independent random variables, each with the same distribution as $X$. Think of this as making $n$ independent repetitions of an experiment whose outcome is modeled by the random variable $X$. Our claim

is that the sum of the $X_i$ is approximately normally distributed. Again we adjust the mean and standard deviation to be 0 and 1; then the precise statement is

$$\lim_{n \to \infty} P\left[a < \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}} < b\right] = \Phi(b) - \Phi(a).$$

This is called the Central Limit Theorem. Earlier we saw, with the Law of Large Numbers, that the deviation of the average of $n$ independent identical random variables from its mean approaches 0 as $n$ grows larger. The Central Limit Theorem says more: it tells us how that deviation is distributed.

Example. Let's look at an experiment that was the subject of a question on the midterm: Spin two spinners, each giving a value uniformly distributed between 0 and 1, and let $X$ be the larger of the two values. We saw that the cdf of $X$ was given by $y = x^2$ between 0 and 1, and thus the pdf of $X$ is $y = 2x$ between 0 and 1, and 0 elsewhere. (See the posted exam solutions for the details.) We can then compute

$$\mu = E(X) = \int_0^1 x \cdot 2x\,dx = \frac{2}{3},$$

$$E(X^2) = \int_0^1 x^2 \cdot 2x\,dx = \frac{1}{2},$$

so

$$\sigma^2 = \mathrm{Var}(X) = \frac{1}{2} - \left(\frac{2}{3}\right)^2 = \frac{1}{18},$$

and $\sigma = 1/\sqrt{18} \approx 0.2357$.

Suppose we perform this experiment 100 times. How likely is it that the sum is greater than 65? The expected value of the sum is $200/3 \approx 66.67$. Since the distribution of the sum is approximately normal, and thus symmetric about the mean, we should expect a probability greater than one-half. How likely is it that the sum is greater than 70? Here we should expect a probability less than one-half.

We apply the Central Limit Theorem to obtain an estimate. We first try to compute

$$\Pr[0 \le X_1 + \cdots + X_{100} < 65],$$

so making our usual transformation with $\mu$ and $\sigma$ (here $n\mu = 200/3$ and $\sigma\sqrt{n} = 10/\sqrt{18}$), this is

$$\Pr\left[\frac{0 - 200/3}{10/\sqrt{18}} < \frac{X_1 + \cdots + X_{100} - 200/3}{10/\sqrt{18}} < \frac{65 - 200/3}{10/\sqrt{18}}\right].$$

The Central Limit Theorem says that this is approximately

$$\Phi\left(\frac{65 - 200/3}{10/\sqrt{18}}\right) - \Phi\left(\frac{0 - 200/3}{10/\sqrt{18}}\right).$$

The right-hand expression is a very tiny number which we can treat as 0. (Alternatively, we could just as well have used $-\infty$ in place of 0 as the lower limit in our computation.) So this gives the approximation

$$\Phi\left(\frac{65 - 200/3}{10/\sqrt{18}}\right) = 0.2398,$$

and thus the probability that the sum is greater than 65 is $1 - 0.2398 = 0.7602$, which is greater than one-half, as we expected.

If we replace 65 by 70, then an identical calculation gives

$$1 - \Phi\left(\frac{70 - 200/3}{10/\sqrt{18}}\right) \approx 1 - 0.9213 = 0.0787.$$

I simulated the experiment of 100 spinners spun twice and summed the maxima of the results. In 10,000 repetitions, I found the number of times the sum was greater than 65 to be 7594 (as compared to the predicted 7602), and the number of times the sum was greater than 70 to be 780 (as compared to the predicted 787).
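A simulation along these lines can be sketched as follows. This is not the author's program, which is not given in the notes: the function names `clt_tail` and `simulate_tail`, the use of Python's `random` module, and the seed are all assumptions, so the simulated counts will differ somewhat from the ones quoted above.

```python
import math
import random

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 + 0.5 * math.erf(x / math.sqrt(2))

def clt_tail(threshold, n, mu, sigma):
    """CLT estimate of P[X_1 + ... + X_n > threshold]."""
    z = (threshold - n * mu) / (sigma * math.sqrt(n))
    return 1 - Phi(z)

def simulate_tail(threshold, n, repetitions, rng):
    """Fraction of repetitions in which the sum of n maxima exceeds threshold."""
    hits = 0
    for _ in range(repetitions):
        # Each experiment: spin two spinners and keep the larger value.
        total = sum(max(rng.random(), rng.random()) for _ in range(n))
        if total > threshold:
            hits += 1
    return hits / repetitions

mu, sigma = 2 / 3, 1 / math.sqrt(18)
rng = random.Random(244)  # arbitrary seed, chosen only for reproducibility
for threshold in (65, 70):
    predicted = clt_tail(threshold, 100, mu, sigma)
    observed = simulate_tail(threshold, 100, 10_000, rng)
    print(threshold, predicted, observed)
```

With 10,000 repetitions the observed fractions should land within a percentage point or so of the Central Limit Theorem estimates, which is the same level of agreement reported in the text.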


MA 1125 Lecture 14 - Expected Values. Friday, February 28, 2014. Objectives: Introduce expected values. MA 5 Lecture 4 - Expected Values Friday, February 2, 24. Objectives: Introduce expected values.. Means, Variances, and Standard Deviations of Probability Distributions Two classes ago, we computed the

More information

You flip a fair coin four times, what is the probability that you obtain three heads.

You flip a fair coin four times, what is the probability that you obtain three heads. Handout 4: Binomial Distribution Reading Assignment: Chapter 5 In the previous handout, we looked at continuous random variables and calculating probabilities and percentiles for those type of variables.

More information

5. Continuous Random Variables

5. Continuous Random Variables 5. Continuous Random Variables Continuous random variables can take any value in an interval. They are used to model physical characteristics such as time, length, position, etc. Examples (i) Let X be

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Lecture 2: Discrete Distributions, Normal Distributions. Chapter 1

Lecture 2: Discrete Distributions, Normal Distributions. Chapter 1 Lecture 2: Discrete Distributions, Normal Distributions Chapter 1 Reminders Course website: www. stat.purdue.edu/~xuanyaoh/stat350 Office Hour: Mon 3:30-4:30, Wed 4-5 Bring a calculator, and copy Tables

More information

1 if 1 x 0 1 if 0 x 1

1 if 1 x 0 1 if 0 x 1 Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 18. A Brief Introduction to Continuous Probability CS 7 Discrete Mathematics and Probability Theory Fall 29 Satish Rao, David Tse Note 8 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density

Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density HW MATH 461/561 Lecture Notes 15 1 Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density and marginal densities f(x, y), (x, y) Λ X,Y f X (x), x Λ X,

More information

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Part 3: Discrete Uniform Distribution Binomial Distribution Sections 3-5, 3-6 Special discrete random variable distributions we will cover

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Normal Distribution as an Approximation to the Binomial Distribution

Normal Distribution as an Approximation to the Binomial Distribution Chapter 1 Student Lecture Notes 1-1 Normal Distribution as an Approximation to the Binomial Distribution : Goals ONE TWO THREE 2 Review Binomial Probability Distribution applies to a discrete random variable

More information

Probability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X

Probability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X Week 6 notes : Continuous random variables and their probability densities WEEK 6 page 1 uniform, normal, gamma, exponential,chi-squared distributions, normal approx'n to the binomial Uniform [,1] random

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

The Binomial Distribution

The Binomial Distribution The Binomial Distribution James H. Steiger November 10, 00 1 Topics for this Module 1. The Binomial Process. The Binomial Random Variable. The Binomial Distribution (a) Computing the Binomial pdf (b) Computing

More information

Binomial Probability Distribution

Binomial Probability Distribution Binomial Probability Distribution In a binomial setting, we can compute probabilities of certain outcomes. This used to be done with tables, but with graphing calculator technology, these problems are

More information

TEACHER NOTES MATH NSPIRED

TEACHER NOTES MATH NSPIRED Math Objectives Students will understand that normal distributions can be used to approximate binomial distributions whenever both np and n(1 p) are sufficiently large. Students will understand that when

More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

Sums of Independent Random Variables

Sums of Independent Random Variables Chapter 7 Sums of Independent Random Variables 7.1 Sums of Discrete Random Variables In this chapter we turn to the important question of determining the distribution of a sum of independent random variables

More information

UNIT I: RANDOM VARIABLES PART- A -TWO MARKS

UNIT I: RANDOM VARIABLES PART- A -TWO MARKS UNIT I: RANDOM VARIABLES PART- A -TWO MARKS 1. Given the probability density function of a continuous random variable X as follows f(x) = 6x (1-x) 0

More information

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the

More information

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away)

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away) : Three bets Math 45 Introduction to Probability Lecture 5 Kenneth Harris aharri@umich.edu Department of Mathematics University of Michigan February, 009. A casino offers the following bets (the fairest

More information

Lecture Notes 1. Brief Review of Basic Probability

Lecture Notes 1. Brief Review of Basic Probability Probability Review Lecture Notes Brief Review of Basic Probability I assume you know basic probability. Chapters -3 are a review. I will assume you have read and understood Chapters -3. Here is a very

More information

What is Statistics? Lecture 1. Introduction and probability review. Idea of parametric inference

What is Statistics? Lecture 1. Introduction and probability review. Idea of parametric inference 0. 1. Introduction and probability review 1.1. What is Statistics? What is Statistics? Lecture 1. Introduction and probability review There are many definitions: I will use A set of principle and procedures

More information

MATH 140 Lab 4: Probability and the Standard Normal Distribution

MATH 140 Lab 4: Probability and the Standard Normal Distribution MATH 140 Lab 4: Probability and the Standard Normal Distribution Problem 1. Flipping a Coin Problem In this problem, we want to simualte the process of flipping a fair coin 1000 times. Note that the outcomes

More information

MULTIVARIATE PROBABILITY DISTRIBUTIONS

MULTIVARIATE PROBABILITY DISTRIBUTIONS MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

Math 461 Fall 2006 Test 2 Solutions

Math 461 Fall 2006 Test 2 Solutions Math 461 Fall 2006 Test 2 Solutions Total points: 100. Do all questions. Explain all answers. No notes, books, or electronic devices. 1. [105+5 points] Assume X Exponential(λ). Justify the following two

More information

Normal Probability Distribution

Normal Probability Distribution Normal Probability Distribution The Normal Distribution functions: #1: normalpdf pdf = Probability Density Function This function returns the probability of a single value of the random variable x. Use

More information

MAS108 Probability I

MAS108 Probability I 1 QUEEN MARY UNIVERSITY OF LONDON 2:30 pm, Thursday 3 May, 2007 Duration: 2 hours MAS108 Probability I Do not start reading the question paper until you are instructed to by the invigilators. The paper

More information

Statistics 100A Homework 8 Solutions

Statistics 100A Homework 8 Solutions Part : Chapter 7 Statistics A Homework 8 Solutions Ryan Rosario. A player throws a fair die and simultaneously flips a fair coin. If the coin lands heads, then she wins twice, and if tails, the one-half

More information

Probability Generating Functions

Probability Generating Functions page 39 Chapter 3 Probability Generating Functions 3 Preamble: Generating Functions Generating functions are widely used in mathematics, and play an important role in probability theory Consider a sequence

More information

E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

More information

( ) is proportional to ( 10 + x)!2. Calculate the

( ) is proportional to ( 10 + x)!2. Calculate the PRACTICE EXAMINATION NUMBER 6. An insurance company eamines its pool of auto insurance customers and gathers the following information: i) All customers insure at least one car. ii) 64 of the customers

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Determine If An Equation Represents a Function

Determine If An Equation Represents a Function Question : What is a linear function? The term linear function consists of two parts: linear and function. To understand what these terms mean together, we must first understand what a function is. The

More information

Stats on the TI 83 and TI 84 Calculator

Stats on the TI 83 and TI 84 Calculator Stats on the TI 83 and TI 84 Calculator Entering the sample values STAT button Left bracket { Right bracket } Store (STO) List L1 Comma Enter Example: Sample data are {5, 10, 15, 20} 1. Press 2 ND and

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

Statistics 100A Homework 3 Solutions

Statistics 100A Homework 3 Solutions Chapter Statistics 00A Homework Solutions Ryan Rosario. Two balls are chosen randomly from an urn containing 8 white, black, and orange balls. Suppose that we win $ for each black ball selected and we

More information

Math 370, Spring 2008 Prof. A.J. Hildebrand. Practice Test 1 Solutions

Math 370, Spring 2008 Prof. A.J. Hildebrand. Practice Test 1 Solutions Math 70, Spring 008 Prof. A.J. Hildebrand Practice Test Solutions About this test. This is a practice test made up of a random collection of 5 problems from past Course /P actuarial exams. Most of the

More information

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

More information

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS

6.041/6.431 Spring 2008 Quiz 2 Wednesday, April 16, 7:30-9:30 PM. SOLUTIONS 6.4/6.43 Spring 28 Quiz 2 Wednesday, April 6, 7:3-9:3 PM. SOLUTIONS Name: Recitation Instructor: TA: 6.4/6.43: Question Part Score Out of 3 all 36 2 a 4 b 5 c 5 d 8 e 5 f 6 3 a 4 b 6 c 6 d 6 e 6 Total

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

STAT 360 Probability and Statistics. Fall 2012

STAT 360 Probability and Statistics. Fall 2012 STAT 360 Probability and Statistics Fall 2012 1) General information: Crosslisted course offered as STAT 360, MATH 360 Semester: Fall 2012, Aug 20--Dec 07 Course name: Probability and Statistics Number

More information

1 Error in Euler s Method

1 Error in Euler s Method 1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of

More information

Lies My Calculator and Computer Told Me

Lies My Calculator and Computer Told Me Lies My Calculator and Computer Told Me 2 LIES MY CALCULATOR AND COMPUTER TOLD ME Lies My Calculator and Computer Told Me See Section.4 for a discussion of graphing calculators and computers with graphing

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

7.6 Approximation Errors and Simpson's Rule

7.6 Approximation Errors and Simpson's Rule WileyPLUS: Home Help Contact us Logout Hughes-Hallett, Calculus: Single and Multivariable, 4/e Calculus I, II, and Vector Calculus Reading content Integration 7.1. Integration by Substitution 7.2. Integration

More information

Probability Distributions

Probability Distributions Learning Objectives Probability Distributions Section 1: How Can We Summarize Possible Outcomes and Their Probabilities? 1. Random variable 2. Probability distributions for discrete random variables 3.

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1. Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.

More information

Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab

Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab Monte Carlo Simulation: IEOR E4703 Fall 2004 c 2004 by Martin Haugh Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab 1 Overview of Monte Carlo Simulation 1.1 Why use simulation?

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information