A PRIMER IN PROBABILITY

This handout is intended to refresh you on the elements of probability and statistics that are relevant for econometric analysis. To help you prioritize the information you need to retain, I have marked any essential formula that is universally true.

I. Descriptive statistics

First of all, we should distinguish between two concepts: the population is the entire group that we wish to study, while the sample is the subset of the population for which we have information. Some formulas will differ slightly, depending on whether we are describing a population or a sample.

When faced with a bunch of numbers, in either a population or a sample, we often look for simple ways to summarize the data. For example, look at the results of two groups' systolic blood pressure readings:

Group A: 95, 102, 98, 104, 101, 104, 99
Group B: 131, 129, 167, 103, 142, 126, 153

We might notice two differences between these groups: first, that Group B tends to have much higher blood pressure readings; next, that Group A tends to be clustered closely together, while Group B is more spread out. Those two statements are, in essence, descriptive statistics, though in this case they are very informal observations, and they might invite the question: what exactly do you mean by "more spread out"?

We have a number of standard measurements to express characteristics of samples or populations. Measurements of central tendency express whether the numbers tend to be high or low. The most common of these are:

Mean: the average value.
Median: the middle value.
Mode: the most common value. (In practice, almost nobody uses the mode.)

The mean and median of a population will be different if the distribution is skewed, meaning that there are larger (or smaller) gaps between values at the high end than at the low end.
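As a quick check on these summary measures, here is a small sketch using Python's standard library (the handout itself uses no software; this is purely illustrative) that computes the mean, median, and population standard deviation for the two blood-pressure groups above:

```python
import statistics

# Systolic blood pressure readings from the two groups in the text
group_a = [95, 102, 98, 104, 101, 104, 99]
group_b = [131, 129, 167, 103, 142, 126, 153]

for name, data in [("A", group_a), ("B", group_b)]:
    print(f"Group {name}: mean={statistics.mean(data):.1f}, "
          f"median={statistics.median(data)}, "
          f"pop. std dev={statistics.pstdev(data):.1f}")
```

The output makes the informal observations precise: Group B's center is much higher, and its standard deviation is several times larger than Group A's.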
For example, the distribution of income is very skewed: the income of the wealthiest people differs by billions of dollars, while the income of the poorest people differs by pennies. Because of this, mean income might be a slightly misleading indicator, since a few

wealthy people can pull the average up, so that most people actually have incomes below the average. The median addresses this issue by reporting the income of the person right in the middle of the distribution.¹

The second characteristic that we might wish to describe is the spread of the distribution: whether observations are clustered closely together or spread apart. Most often, we use the variance or the standard deviation to express this concept. The standard deviation in a group measures, roughly, the typical distance between each observation and the mean; the variance is just the standard deviation, squared (or the average squared difference between the observations and their mean).

Skewness refers to whether the gaps at the top of the distribution are larger or smaller than those at the bottom. (Formally, this is calculated as the average cubed difference between the observations and their mean.) Skewness is not synonymous with "biased"; avoid saying that results are skewed unless you are certain that you are using the word correctly.

The maximum and minimum values should be self-evident. Finally, the Xth percentile refers to the value that X% of the group lies below.² For example, the median is exactly the same thing as the 50th percentile.

II. Probability

In probability, an event is something, determined by chance, that either does or does not happen. An event can be described as simple, meaning that there is only one way to achieve the outcome, or complex, meaning that there are a number of simple events that would satisfy the condition. For example, an (American) roulette wheel contains the numbers 1 through 36, plus 0 and 00. Aside from 0 and 00, half of the numbers are red, and half black. The betting board looks something like this:

[Figure: layout of the roulette betting board]

In this context, an event would be anything that you could place a bet on.
A simple event would be a bet on the number 17, since there is only one outcome that

¹ In the March 2005 Current Population Survey, mean household income was $61,905. However, 63% of households earned less than the average. The median income was $46,
² The 90th percentile in household income is $122,324: 90% of households earn less than this, and 10% earn more.

A primer in probability, p. 2

would win this bet. Betting on all odd numbers would be a complex event, since the outcomes 1, 3, 5, ..., 35 would all satisfy this condition.

Formally, let S denote the space of all possible outcomes. Any event is a subset of S. We will use letters like A and B to denote generic events, while ¬A or ¬B will denote the complement of A or B: all the things that are not part of the event. For example, if A is the event "red wins," then ¬A is the event "black or house wins." The union of two events, A ∪ B, consists of all outcomes that satisfy one event or the other (or both); the intersection, A ∩ B, consists of all outcomes satisfying both conditions. For all practical purposes, you can read A ∪ B as "A or B" and A ∩ B as "A and B." We use the following logical rules for combining ands, ors, and nots:

¬(A ∪ B) = (¬A) ∩ (¬B)
¬(A ∩ B) = (¬A) ∪ (¬B)

To put this in words: "neither A nor B happened" is equivalent to saying "A did not happen, and B did not happen"; and "it was not the case that both A and B occurred" is the same as "A did not happen, or B did not happen (or neither happened)."

Let's go back to the roulette example, where A was the event that odd wins and B was "red wins." Then we can write the following:

S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 0, 00}
A = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35}
B = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
¬A = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 0, 00}
A ∪ B = {1, 3, 5, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 23, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36}
A ∩ B = {1, 3, 5, 7, 9, 19, 21, 23, 25, 27}

A probability measure is a function P[A] that tells us the fraction of times that an event occurs. The probability measure must satisfy three properties:

0 ≤ P[A] ≤ 1
P[S] = 1
P[¬A] = 1 − P[A]
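The set algebra and the three properties can be checked mechanically. Below is a small illustrative sketch (not part of the original handout) that models the roulette events as Python sets and verifies De Morgan's laws and the properties of the probability measure, where each of the 38 outcomes is equally likely:

```python
from fractions import Fraction

# Roulette outcomes; "00" kept as a string so it stays distinct from 0
S = set(range(0, 37)) | {"00"}
A = set(range(1, 37, 2))   # "odd wins"
B = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}  # "red wins"

def P(event):
    # every outcome is equally likely on a fair wheel
    return Fraction(len(event), len(S))

# De Morgan's laws, checked directly on the sets
assert S - (A | B) == (S - A) & (S - B)
assert S - (A & B) == (S - A) | (S - B)

# the three properties of a probability measure
assert 0 <= P(A) <= 1
assert P(S) == 1
assert P(S - A) == 1 - P(A)

print(P(A), P(A & B))   # 18/38 and 10/38, shown in lowest terms
```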

In words: a probability cannot be negative, nor can it exceed one (a completely impossible event has a probability of zero, while a certain event has a probability of one); it is certain that something in the space of all possible outcomes will occur; and finally, if the chance that A happens is X, then the chance that A doesn't happen is 1 − X.

We can calculate the chance of any complex event by adding up the probabilities of the simple events that it contains. Calculating the odds in roulette is fairly simple, since there is a 1/38 chance that the ball lands on any specific number. The probability that odd wins is therefore P[A] = P[1] + P[3] + ... + P[35] = 18/38.

If we already know the probabilities that some complex events occur, and we want to calculate the chance that their union occurs (that one or the other, or both, happens), we cannot simply add the probabilities together. For example, there is an 18/38 chance that odd wins, and there is an 18/38 chance that red wins. The chance that red or odd wins is not 18/38 + 18/38 = 36/38. Look at the roulette board again: only 26 of the 38 outcomes are either red or odd, so this should be the chance that A ∪ B occurs. By simply adding P[A] to P[B], we have double-counted the outcomes that are both red and odd. A correct calculation of the probability is:

P[A ∪ B] = P[A] + P[B] − P[A ∩ B]

This is always the rule for calculating the probability of the union of events. When there is nothing in the intersection of two events, we say that the events are disjoint or mutually exclusive. For example, "even wins" and "odd wins" are mutually exclusive events, since there is no outcome satisfying both conditions. In this special case, the probability that one or the other occurs is simply their sum:

P[A ∪ B] = P[A] + P[B], if A and B are mutually exclusive events.

Finally, we should address conditional probability. Suppose that you are playing roulette, and you have placed a bet on odd.
In general, your chance of winning is 18/38, slightly under half. The wheel spins and the ball stops, but your view is obscured. However, you hear someone call out, "yes, red wins!" Even though you did not bet on red, you should be a bit excited about this news, since your chance of winning has increased: exactly half of the reds (9 of 18) are odd. If we know that event B has occurred, we can use this information to revise our expectations about A. The "probability of A conditional on B" or the "probability of A given B" is always calculated as:

P[A|B] = P[A ∩ B] / P[B]

In this case, there are nine outcomes that are both red and odd, so P[A ∩ B] = 9/38. Eighteen outcomes are red, so P[B] = 18/38. Therefore, given that the outcome is red, the chance that it is odd is P[A|B] = (9/38) / (18/38) = 9/18.

We say that two events are independent if P[A|B] = P[A]; in other words, knowing B does not help us revise our probabilities that A occurred. In this example, "red wins" and "odd wins" are not independent, since P[A] = 18/38, while P[A|B] = 9/18. (They are not independent because red winning indicates that the house has not won; that is, that 0 or 00 did not come up.)

If we want to calculate the probability that an intersection occurs (that both A and B happen), we can get this formula by rearranging the one for conditional probabilities:

P[A ∩ B] = P[A|B] × P[B], in general; and
P[A ∩ B] = P[A] × P[B], if the events are independent.

This last rule is incredibly useful. The heroine of the movie Run Lola Run places a hundred-mark bet on the number 20 on a roulette wheel. When 20 wins, she keeps her earnings on the same number, and 20 wins again. What are the odds that this occurs? The odds of any individual number winning are 1/38,³ so assuming that the spins of the roulette wheel are independent, the chance that 20 wins twice in a row is (1/38) × (1/38) = 1/1444.

When we have repeated, independent trials, this rule is convenient for calculating the probability that A and B occur. If we want to know instead the chance that A or B occurs, we have to combine several of our rules. One roulette strategy is to walk into the casino with $36, and place a $1 bet on a single number (my lucky number would be 19) for 36 spins. Since a winning bet on a single number pays off 36:1, you will come out ahead if your number comes up at least once within these 36 spins. So what is the chance that this happens?

P[(Spin 1 = 19) or (Spin 2 = 19) or ... or (Spin 36 = 19)]
= 1 − P[¬{(Spin 1 = 19) or (Spin 2 = 19) or ... or (Spin 36 = 19)}]
= 1 − P[(Spin 1 ≠ 19) and (Spin 2 ≠ 19) and ... and (Spin 36 ≠ 19)]
= 1 − P[Spin 1 ≠ 19] × P[Spin 2 ≠ 19] × ... × P[Spin 36 ≠ 19]
= 1 − (37/38) × (37/38) × ... × (37/38)
= 1 − (37/38)^36 ≈ 0.617

³ In truth, since Lola plays on a European wheel without 00, the odds are 1/37.
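The two roulette calculations above can be reproduced exactly with rational arithmetic; the sketch below (illustrative only, using the American wheel's 1/38 as in the main text rather than the footnote's 1/37) checks both the independent-spins product rule and the complement trick:

```python
from fractions import Fraction

p_single = Fraction(1, 38)          # chance any one number wins (American wheel)

# Lola's two wins in a row on 20: independent spins, so probabilities multiply
p_double = p_single * p_single
print(p_double)                     # 1/1444

# Chance that 19 comes up at least once in 36 spins:
# 1 - P[miss on every single spin]
p_at_least_once = 1 - Fraction(37, 38) ** 36
print(float(p_at_least_once))       # about 0.617
```

Note that the strategy is favorable in probability terms (you win at least once more often than not) even though the expected payoff is still against you.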

Finally, we should talk about Bayes' Rule. Suppose that you are being tested for some horrible disease. Fortunately, this disease is fairly rare: in the population overall, only 1 in 10,000 people have it, so we'll say that P[D] = 1/10,000, where D is the event "you have the disease." The test for this disease is very accurate, but not absolutely perfect. Among people who have the disease, 99.5% get a positive result, and 0.5% get a false negative; among those without, 99.9% get a correct negative, while 0.1% get a false positive. We can write the probabilities of obtaining a positive test (the event P) for these populations as P[P|D] = 0.995 and P[P|¬D] = 0.001.

You take the test, and you are shocked to obtain a positive result. Given the accuracy of the test, does this mean you are likely to die? The answer is no, in fact. Think of it this way: in a population of 10,000,000 people, we would expect that 1,000 people have the disease, while 9,999,000 do not. If everyone were to take the test, 995 diseased people would get positive results and 9,999 well people would get (false) positives. Among the 10,994 people who get positive results, only 995 have the disease, so the chance of actually having the disease, given a positive test, is only 995/10,994 ≈ 0.09. Most likely, your result was a false positive.

We have informally used Bayes' Rule to calculate the chance of having the disease, given a positive result: P[D|P]. Bayes' Rule is used when you have an unconditional probability (also called a prior probability or an ex-ante probability) that you want to revise after the arrival of some news (the final conditional probability is sometimes called a posterior probability or an ex-post probability). In general, the rule is:

P[D|P] = P[P|D] × P[D] / (P[P|D] × P[D] + P[P|¬D] × P[¬D])

In this case,

P[D|P] = (0.995)(0.0001) / ((0.995)(0.0001) + (0.001)(0.9999)) ≈ 0.09

III. Random variables

A random variable takes on a numerical value that is determined by chance.
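Returning to the disease-testing example for a moment, the Bayes' Rule arithmetic can be sketched in a few lines (an illustrative check, not part of the original handout):

```python
# Bayes' Rule for the disease-testing example: prior 1/10,000,
# sensitivity 99.5%, false-positive rate 0.1%
p_d = 1 / 10_000           # P[D], the prior
p_pos_given_d = 0.995      # P[P | D]
p_pos_given_not_d = 0.001  # P[P | not D]

# denominator: total probability of a positive test
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos

print(p_d_given_pos)   # about 0.0905: most positive results are false positives
```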
We will use X or Y to denote a generic random variable, and x or y to indicate a specific value that the variable could take. For example, we could let X be the number of heads obtained from three coin flips. When we write P[X = x], we mean P[the number of heads obtained from three coin flips = x]. X stands for the thing that we are measuring; x is a specific value that it could take.

At times, we will want to distinguish between a discrete random variable, which takes on only a limited number of values, and a continuous random variable, which can take any value within some range. The number of heads obtained from three coin tosses is discrete, since this can take only the values zero, one, two, or three; we could not obtain a non-integer number of heads. In contrast, the length of time until a light bulb fails is a continuous random variable, since this could be any non-negative value (a light bulb could last any fractional number of months).

As before, a probability distribution describes the likelihood of specific outcomes for a random variable. For simplicity, we will often write P[x] to indicate P[X = x] when there is no ambiguity; for example, P[2.718] stands for P[X = 2.718]. We will again let S denote the set of all possible outcomes for a variable.

The expected value of a random variable is its theoretical average value. For a discrete random variable, this is calculated as:

E[X] = Σ_{x∈S} x × P[x]

In other words, we add up all possible outcomes times the chance of obtaining that outcome. When flipping a coin three times, there is a 1/8 chance of obtaining zero heads; a 3/8 chance of obtaining one head; a 3/8 chance of obtaining two heads; and a 1/8 chance of obtaining three heads. Therefore, the expected number of heads is:

E[X] = 0 × (1/8) + 1 × (3/8) + 2 × (3/8) + 3 × (1/8) = 12/8

We can also take the expected value of any function of X. If X is a random variable, then G(X) is one, too. The expected value of G(X) is calculated as:

E[G(X)] = Σ_{x∈S} G(x) × P[x]

With linear functions, like G(X) = a + b × X, we can write E[G(X)] as a function of the expected value of X. The rule is:

E[a + b × X] = a + b × E[X]

This is not true of other functions, however: E[log(X)] ≠ log(E[X]).

At times, we are given additional information that allows us to revise our expectations about the value of X.
For example, someone might have revealed that not all of the coins turned up heads in our coin toss. Since this rules out the possibility that we got three heads, we should lower our expectations about the number of heads that we did get.

If we know that X is actually in T, some subset of S, the conditional expectation of X given T is:

E[X | T] = Σ_{x∈T} x × P[x] / Σ_{x∈T} P[x]

Given that the number of heads is zero, one, or two, the conditional expectation is:

E[X | X ≠ 3] = (0 × (1/8) + 1 × (3/8) + 2 × (3/8)) / ((1/8) + (3/8) + (3/8)) = (9/8) / (7/8) = 9/7

When we take the expectation of the sum of random variables, the expectation can always be broken up at the summation:

E[X + Y] = E[X] + E[Y]

The same is generally not true for the expected value of a product: E[X × Y] ≠ E[X] × E[Y]. The only time that this does work is when the two variables are independent:

E[X × Y] = E[X] × E[Y], if X and Y are independent.

The variance is the theoretical average squared difference between the outcome and its mean. For a discrete random variable, the formula is:

Var(X) = E[(X − E[X])²] = Σ_{x∈S} (x − E[X])² × P[x]

The variance in the number of heads from three coin tosses is:

Var(X) = (0 − 3/2)² × (1/8) + (1 − 3/2)² × (3/8) + (2 − 3/2)² × (3/8) + (3 − 3/2)² × (1/8)
= (9/4) × (1/8) + (1/4) × (3/8) + (1/4) × (3/8) + (9/4) × (1/8) = 24/32

Note that the variance must always be positive (or at least, non-negative), since it requires adding up a bunch of squared terms (which cannot be negative), each multiplied by a probability (which must also be non-negative). We can also calculate the variance in any function of X:

Var[G(X)] = Σ_{x∈S} (G(x) − E[G(X)])² × P[x]

With linear functions, there is again a specific rule:
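The coin-flip expectations above can be verified by brute-force enumeration. This sketch (illustrative, not part of the original handout) builds the distribution of X from the eight equally likely sequences, then computes the mean, the variance, and the conditional expectation given X ≠ 3, all in exact fractions:

```python
from fractions import Fraction
from itertools import product

# X = number of heads in three fair coin flips; build P[x] by counting
# heads across all eight equally likely sequences
seqs = list(product("HT", repeat=3))
pmf = {x: Fraction(sum(1 for s in seqs if s.count("H") == x), len(seqs))
       for x in range(4)}

mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())
print(mean, var)                      # 3/2 3/4

# conditional expectation, given that we learn X != 3
t = [x for x in pmf if x != 3]
cond = sum(x * pmf[x] for x in t) / sum(pmf[x] for x in t)
print(cond)                           # 9/7
```

Note that 24/32 in the handout's variance calculation reduces to 3/4.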

Var[a + b × X] = b² × Var(X)

In other words, adding a constant amount to the variable does not affect its spread, but scaling the variable up or down by a constant b does affect this spread. Finally, the standard deviation in a random variable is the square root of its variance.

When two random variables are observed concurrently, we might want to express whether they tend to move in the same direction, in opposite directions, or have no joint tendencies. The covariance between X and Y is calculated as:

Cov(X, Y) = E[(X − E[X]) × (Y − E[Y])] = Σ_{x∈S} Σ_{y∈T} (x − E[X])(y − E[Y]) × P[X = x, Y = y]

(Fortunately, you need to know covariance more for the concept than for calculations in practice.) If the covariance is positive, the two variables tend to move in the same direction: when X is above average, then Y is generally above average as well. If the covariance is negative, the variables tend to move in opposite directions: when X is above average, Y tends to be below average. When the covariance is zero, the variables essentially move independently.

While the sign of the covariance indicates whether the variables tend to move in the same direction, the magnitude is a bit more difficult to interpret. It tells us the size of the similarity, but it also reflects the size of the random variables themselves. (Simply doubling X will double the covariance between X and any other variable.) To adjust for the scale of the variables, we usually use the correlation:

Corr(X, Y) = Cov(X, Y) / √(Var(X) × Var(Y))

The correlation is an index that ranges between −1 and +1. A correlation of zero indicates that the variables have nothing in common; a correlation of one means that the variables are exactly the same, except that they might be measured in different scales; and a correlation of minus one means that they are exactly opposite.
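Covariance and correlation are easy to illustrate on a toy dataset. The sketch below (the numbers are hypothetical, chosen only for illustration) computes both, and checks two facts discussed here: that Cov(X, X) is the variance, and that correlation is unchanged by a linear rescaling a + b × X (for b > 0) while covariance is not:

```python
import math

# hypothetical equally likely (x, y) outcomes, chosen only to illustrate
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return mean([(x - ma) * (y - mb) for x, y in zip(a, b)])

def corr(a, b):
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

# Cov(X, X) equals Var(X)
var_x = mean([(x - mean(xs)) ** 2 for x in xs])
assert cov(xs, xs) == var_x

# rescaling X doubles the covariance but leaves the correlation alone
scaled = [3 + 2 * x for x in xs]
print(corr(xs, ys), corr(scaled, ys))   # identical values
```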
More generally, a magnitude close to one indicates that the variables are very similar, while a correlation close to zero indicates a weak relationship.

⁴ For example, temperature in Fahrenheit and temperature in Celsius are variables that would have a correlation of exactly one, since they measure exactly the same thing in different scales. The peculiar old Delisle temperature scale, where water boiled at 0 °D and froze at 150 °D, would have a correlation of negative one with either Fahrenheit or Celsius temperatures.

When taking the covariance or correlation of a linear function of some random variable, the rules are:

Cov(a + b × X, Y) = b × Cov(X, Y)
Corr(a + b × X, Y) = Corr(X, Y)

Finally, we should note that the covariance between a variable and itself is the same as the variance in that variable:

Cov(X, X) = E[(X − E[X]) × (X − E[X])] = E[(X − E[X])²] = Var(X)

These formulas change a bit when dealing with continuous random variables. With continuous random variables, the probability that X takes any specific value (exactly) is infinitesimally small: there is virtually zero chance that a light bulb burns out at exactly one particular instant.

P[X = x] ≈ 0, if X is continuous (generally).

However, there is a non-negligible chance that the variable falls within some range, so it makes sense to talk about P[a ≤ X ≤ b]. The distribution of a continuous random variable X is described by a probability density function f(x), also known as a p.d.f., which has the property:

P[a ≤ X ≤ b] = ∫_a^b f(x) dx, where f(x) is the p.d.f. of X

A familiar p.d.f. is the bell curve of the normal distribution. Formally, this curve is characterized as:

f(x) = (2πσ²)^(−1/2) × exp(−(x − µ)² / (2σ²))

where µ is the mean of the population and σ² is its variance. Graphing this function, we get:

[Figure: the bell-shaped p.d.f. of the normal distribution]

The fraction of the population whose X values lie between two points, a and b for example, is the area under this curve between a and b. Thinking back to calculus: if we are given some function f(x) and we want to know the area under this curve between two points, we integrate the function between those points. So ∫_a^b f(x) dx represents the fraction of the population whose Xs are between these values, and this is the probability that any observation, picked at random, is between these values.

The normal distribution is only one example of a p.d.f.; we will deal with several others. For all distributions, a p.d.f. must satisfy two properties:

f(x) ≥ 0
∫_{−∞}^{+∞} f(x) dx = 1

These ensure that the probability measure will have all of the necessary properties mentioned in the previous section.

For continuous random variables, the cumulative distribution function or c.d.f. is another important function. The c.d.f. gives the probability that the variable is below some specific value:

F(x) = P[X ≤ x], where F(x) is the c.d.f. of the random variable X

Clearly, this is related to the p.d.f.:

F(b) = ∫_{−∞}^b f(x) dx

The c.d.f., evaluated at some point, is the p.d.f. integrated up to that point. This implies that the p.d.f. is the derivative of the c.d.f. Also, the c.d.f. can be used to calculate the probability that X falls in some range (which is sometimes a convenient alternative to integrating the p.d.f.):

F′(x) = f(x)
P[a ≤ X ≤ b] = F(b) − F(a)

The formulas for calculating expected values, variances, and such for continuous random variables are identical to those for discrete random variables, with two changes:

1. We replace the summation (over all possible values) with an integral (over the entire range); and
2. We replace the probability P[x] with the p.d.f. f(x).
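The relationship P[a ≤ X ≤ b] = F(b) − F(a) can be demonstrated numerically. This sketch (illustrative only) integrates the standard normal p.d.f. over [−1, 1] with a crude midpoint rule and compares the result with the exact c.d.f. difference, which Python's standard library exposes through the error function `math.erf`:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def integrate(f, a, b, n=10_000):
    """Simple midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# P[-1 <= Z <= 1] for a standard normal, two ways
approx = integrate(normal_pdf, -1, 1)          # area under the p.d.f.
exact = math.erf(1 / math.sqrt(2))             # Phi(1) - Phi(-1)
print(approx, exact)                           # both about 0.6827
```

This also echoes the point made later in the handout: the normal p.d.f. has no elementary antiderivative, so in practice the integration is always done by a computer or a table.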

This gives us the formula for the expected value:

E[X] = ∫_{−∞}^{+∞} x × f(x) dx

We can also calculate the expected values of functions of X, by integrating those functions instead of just x. The rule that E[a + b × X] = a + b × E[X] holds for continuous random variables, just as it did for discrete ones. The conditional expectation is defined similarly:

E[X | a ≤ X ≤ b] = ∫_a^b x × f(x) dx / ∫_a^b f(x) dx

The formula for variance is:

Var(X) = E[(X − E[X])²] = ∫_{−∞}^{+∞} (x − E[X])² × f(x) dx

Again, the rule that Var(a + b × X) = b² × Var(X) is true for continuous random variables, as it was for discrete random variables. Covariance is calculated analogously, although we have to specify a joint p.d.f. for the two random variables. The correlation coefficient remains the covariance divided by the standard deviations in the two variables.

As a final note, we often don't have to integrate these functions in practice, and much of the time the function can't actually be integrated in closed form (for example, you cannot integrate the p.d.f. of the normal distribution, f(x) = (2πσ²)^(−1/2) × exp(−(x − µ)² / (2σ²)), without a computer approximation). In theory, we integrate things, but in practice, we rarely have to do the dirty work.

IV. Common probability distributions

When modeling probabilistic phenomena, we repeatedly rely on a handful of distributions. The binomial distribution is the most common among the discrete variables. It describes a situation where we have N repeated, independent trials that can each come up positive (with probability p) or negative (with probability 1 − p); the outcome of interest is K, the number of positive results.⁵ The number of heads obtained from N coin tosses of a fair coin (p = 1/2) would be a perfect example of a binomial distribution.

⁵ I will describe the outcomes generically as positive and negative, but this distribution applies whenever the outcome is binary.
There are other ways to describe this stylized situation: some people call the outcomes yes or no, while others will say success or failure.

In principle, we could work out the probability distribution of K. When we flip one coin, two outcomes are equally likely, and one of these is heads and one is tails. When N = 2, four sequences are equally likely: each of HH, HT, TH, and TT occurs with probability 1/4. One of these outcomes gives us zero heads, two outcomes give us one head, and one outcome gives us two heads. We can continue this process to figure out the distribution when N = 3, N = 4, and N = 5:

           N = 1   N = 2   N = 3   N = 4   N = 5
P[K = 0]    1/2     1/4     1/8    1/16    1/32
P[K = 1]    1/2     2/4     3/8    4/16    5/32
P[K = 2]     --     1/4     3/8    6/16   10/32
P[K = 3]     --      --     1/8    4/16   10/32
P[K = 4]     --      --      --    1/16    5/32
P[K = 5]     --      --      --      --    1/32

Unless you recognize a pattern involving Pascal's triangle, this becomes tedious, and even that pattern will fail if p ≠ 1/2. A formula does exist, however:

P[k] = N! / (k!(N − k)!) × p^k × (1 − p)^(N−k)

where X! (read as "X factorial") equals X × (X − 1) × (X − 2) × ... × 3 × 2 × 1, the product of the number with all positive numbers less than itself. (By convention, 0! is set equal to one.) For example: 1! = 1, 2! = 2 × 1 = 2, 3! = 3 × 2 × 1 = 6, 4! = 4 × 3 × 2 × 1 = 24, and so on.

For shorthand, we might write K ~ B(N, p) to indicate that K is a random variable from the binomial distribution with N trials and a probability p of a positive outcome in each trial. For example, suppose that Professor Leach has a class of twenty-five students; the professor assigns grades, with each student having a 0.10 chance of failing the class. The probability that exactly five of the twenty-five students fail is:

P[5] = 25! / (5!(25 − 5)!) × (0.10)^5 × (0.90)^(25−5)

(The rest is fairly straightforward, albeit messy, algebra, so I won't solve it further. However, I will point out that you can often simplify the factorial part of the problem: 25! = 25 × 24 × 23 × 22 × 21 × (20!); the 20! in the numerator cancels with the (25 − 5)! = 20! in the denominator.)

The mean of a binomial distribution is always:

E[K] = Σ_{k=0}^{N} k × N! / (k!(N − k)!) × p^k (1 − p)^(N−k) = N × p
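The messy algebra in Professor Leach's example, and the claim that the mean is N × p, can both be checked directly. In this sketch (illustrative only), `math.comb` handles the factorial ratio N!/(k!(N − k)!):

```python
from math import comb

def binom_pmf(k, n, p):
    # P[k] = C(n, k) * p**k * (1 - p)**(n - k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Professor Leach's class: 25 students, each failing with probability 0.10
p5 = binom_pmf(5, 25, 0.10)
print(round(p5, 4))    # about 0.0646

# the mean N*p, checked against the full sum over k
mean = sum(k * binom_pmf(k, 25, 0.10) for k in range(26))
print(round(mean, 6))  # 2.5
```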

(But it is not simple to show why the expression in the middle collapses to just N × p.) The variance of a binomial random variable is:

Var(K) = N × p × (1 − p)

A second common discrete distribution is the Poisson distribution, which is used to model "count data": outcomes that are non-negative integers, typically smaller ones. (These non-negative integers are, of course, the numbers we could get if we count things: zero, one, two, three, and so on.) The Poisson distribution is especially appropriate when we are counting the number of [some event] per [time period], like the number of customers arriving at a store in an hour or the number of children that a woman has [in her lifetime]. Technically, the Poisson distribution makes two assumptions:

1. For all intervals of the same length within this period, there is the same chance that an event occurs, and
2. The chance that an event occurs is independent of what has happened in the past.

While these assumptions might not hold exactly in all the situations that we model with the Poisson distribution, we often feel that they are close enough to justify it. With the Poisson distribution, the probability of having k events in the period is:

P[k] = e^(−λ) × λ^k / k!

where λ is some positive number, which reflects the average number of events that occur within the period. (In shorthand, we would write K ~ Pois(λ) for this distribution.) For example, if we know that families have 1.8 children on average, a number that comes from the March 2005 Current Population Survey, and that fertility is determined entirely by chance, then the probability of having k children is P[k] = e^(−1.8) × 1.8^k / k!.
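The theoretical Poisson probabilities for family size are easy to generate; this sketch (illustrative only) computes them for k = 0 through 5 with λ = 1.8, and verifies numerically that λ is both the mean and the variance of the distribution:

```python
from math import exp, factorial

def pois_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

# family size with lambda = 1.8 children on average
lam = 1.8
for k in range(6):
    print(k, round(pois_pmf(k, lam), 3))

# lambda is both the mean and the variance of the distribution
mean = sum(k * pois_pmf(k, lam) for k in range(100))
var = sum((k - mean) ** 2 * pois_pmf(k, lam) for k in range(100))
print(round(mean, 6), round(var, 6))  # 1.8 1.8
```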
If we calculate the probabilities from this formula, we would get the theoretical probabilities below:

[Table: number of children, theoretical probability, and observed fraction; the numerical entries did not survive transcription]

These theoretical probabilities are remarkably close (in my opinion) to the observed fractions of families of each size in the dataset, confirming that the Poisson

distribution provides a reasonable description of this random variable. (It doesn't quite capture the observed fondness for having two children. In reality, many couples probably reduce their chance of having more children at that point, which violates the assumption that the probability of having an additional event is independent of the past.) As a final note, λ is the mean number of events occurring in the time period, and it is also the variance in the number of events in the period. (Incidentally, in the CPS, the variance in family size is 1.7, which is another indicator that the variable has approximately a Poisson distribution.)

Let's move on to continuous probability distributions. The simplest is the uniform distribution, which assigns equal likelihood to any outcome between a lower limit of ℓ and an upper limit of u. (For example, people's birthdays are essentially distributed uniformly over the numbers 1 through 365.) The complete p.d.f. of the uniform distribution is:

f(x) = 0, if x < ℓ
f(x) = 1/(u − ℓ), if ℓ ≤ x ≤ u
f(x) = 0, if x > u

The shorthand notation would be X ~ U([ℓ, u]). Calculating the expected value of the uniform distribution is relatively simple:

E[X] = ∫_{−∞}^{ℓ} 0 × x dx + ∫_{ℓ}^{u} (1/(u − ℓ)) × x dx + ∫_{u}^{+∞} 0 × x dx
= (1/(u − ℓ)) × ∫_{ℓ}^{u} x dx
= (1/(u − ℓ)) × [x²/2], evaluated from x = ℓ to x = u
= (1/2) × (1/(u − ℓ)) × (u² − ℓ²)
= (1/2) × (1/(u − ℓ)) × (u − ℓ)(u + ℓ)
= (1/2) × (u + ℓ)

The expected value is the average of the two endpoints of the distribution. We could go through a similar exercise to find the variance in the uniform distribution:

Var(X) = ∫_{−∞}^{ℓ} 0 × (x − E[X])² dx + ∫_{ℓ}^{u} (1/(u − ℓ)) × (x − E[X])² dx + ∫_{u}^{+∞} 0 × (x − E[X])² dx
= (1/(u − ℓ)) × ∫_{ℓ}^{u} (x − (1/2)(u + ℓ))² dx
= (1/12) × (u − ℓ)²

(The actual algebra is somewhat tedious, but I hope you get the idea.)

The single most important continuous distribution is the normal distribution. Many variables in nature have this bell-curve distribution: height, weight, or intelligence in a population; annual rainfall in a region or the fraction of days that are overcast; the age at which a person gives birth or the age at which a person would like to retire. The p.d.f. of the normal distribution is:

f(x) = (1/√(2πσ²)) × exp(−(x − µ)² / (2σ²))

where µ is the mean and σ² the variance of X. For shorthand, we would write X ~ N(µ, σ²) to indicate this normal distribution.

In addition, the central limit theorem tells us that the sum (or average) of a number of independent, identically distributed random variables will tend to be normal, regardless of the distribution of the variables themselves. The amount of rainfall that falls in a region in a day might have some odd distribution: a 90% chance of no rain, a 7% chance of a quarter inch, and a 3% chance of one inch. This is far from normal. However, if you calculate the distribution of total rainfall over 365 days, you'll get an approximately normal distribution.

A consequence of the central limit theorem is that the binomial distribution starts to look normal as N gets large. (The total number of positive outcomes, K, is the sum of a bunch of independent random variables.) Technically, we would write that B(N, p) → N(Np, Np(1 − p)) as N → ∞. The Poisson distribution also approaches the normal distribution, since it is also the total number of events that happen in some period. The more common the event is, the more the Poisson distribution will resemble the normal distribution: technically, as λ → ∞, Pois(λ) → N(λ, λ).
For all practical purposes, I would say that there is essentially no difference between the Poisson and normal distributions when the expected number of events in the period is ten or more. With the binomial distribution, it's a bit harder to establish a rule of thumb for how large N needs to be in order to use the normal approximation, since this will depend in part on p. If N is a hundred or so, I would usually be fairly comfortable with the normality assumption; if N is just a couple of dozen, I would usually favor the binomial distribution.
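The "ten or more" rule of thumb can be given a rough numerical test. This sketch (my addition, standard library only) compares the Poisson p.m.f. with the matching normal density N(λ, λ) at every integer; with λ = 20 the two curves are nearly indistinguishable:

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for K ~ Pois(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_pdf(x, mu, var):
    """Density of N(mu, var)."""
    return math.exp(-(x - mu)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)

lam = 20  # expected number of events -- comfortably past "ten or more"
# Largest pointwise gap between Pois(lam) and its N(lam, lam) approximation:
worst = max(abs(poisson_pmf(k, lam) - normal_pdf(k, lam, lam)) for k in range(3 * lam))
print(worst)  # small relative to the peak probability of roughly 0.09
```

Rerunning with smaller λ (say, 2 or 3) makes the gap visibly larger, which is the content of the rule of thumb.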

The standard normal distribution is a normal distribution with mean zero and variance one. We can standardize any normal X by calculating:

    Z = \frac{X - \mu}{\sigma}

We could then write that Z ~ N(0, 1). (There is a tradition of using Z to denote an arbitrary standard normal random variable. Additionally, we often use φ(z) to denote the p.d.f. of the standard normal distribution, and Φ(z) to denote its c.d.f.) We usually standardize random variables in order to look up values of the c.d.f. Normal distributions with different µ's and σ's will have different c.d.f.s, and it is impossible to have a table of values for each of these functions. However, we can standardize any normal random variable, and compare this to a single table with values of the c.d.f. of the standard normal distribution.

To model the time duration until some event occurs, we often use the exponential distribution. Technically, this makes two assumptions:

1. There is a constant probability that the event occurs at any time, and
2. The probability that the event occurs is independent of the past history.

The life of an incandescent light bulb is an almost ideal example of an exponential random variable; this distribution could also be used to model the time that a person works with a particular employer, or the length of time that a patient waits for an organ transplant. To denote that the random variable T has an exponential distribution, we might write T ~ Exp(λ); the p.d.f. of this distribution is:

    f(t) = \begin{cases} 0 & \text{if } t < 0 \\ \lambda e^{-\lambda t} & \text{if } t \ge 0 \end{cases}

where the parameter λ represents the reciprocal of the average duration of the random variable T. For example, if I purchase a light bulb that advertises an (average) lifetime of 2000 hours, the p.d.f. of the duration, in hours, is f(t) = 0.0005·e^{−0.0005t}. The variance in the exponential distribution is λ⁻².
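Both tricks in this passage can be illustrated with the standard library: standardizing so that a single standard normal c.d.f. (here computed from the error function) suffices, and simulating exponential lifetimes for the 2000-hour bulb from the text. This sketch is my addition; the values µ = 100 and σ = 15 are purely illustrative, while the 0.0005 rate comes from the example above.

```python
import math
import random

def phi(z):
    """C.d.f. of the standard normal distribution, Phi(z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Standardizing: for X ~ N(100, 15^2), P(X <= 115) = Phi((115 - 100)/15) = Phi(1)
mu, sigma = 100, 15            # illustrative values only
prob = phi((115 - mu) / sigma)
print(prob)                    # about 0.84

# Exponential durations: an average lifetime of 2000 hours means lambda = 0.0005
random.seed(1)
lifetimes = [random.expovariate(0.0005) for _ in range(100_000)]
avg = sum(lifetimes) / len(lifetimes)
print(avg)                     # close to the advertised 2000 hours
```

Note that `random.expovariate` takes the rate λ directly, matching the handout's parameterization in which λ is the reciprocal of the mean duration.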
Another not-uncommon distribution is the log-normal distribution, which is often used to model outcomes that are always positive and fairly skewed, where many people have fairly low values, but a few people have substantially higher values. Income would be one excellent example: most people earn low-to-moderate

amounts, tens of thousands of dollars per year, but the range extends quite far, up into the hundreds of thousands, millions, and even multiple millions. The distance that a person has to travel to a hospital would be another: most people live fairly close, within five or ten miles of a hospital, but the range extends quite far, and a handful of people might have to travel hundreds of miles. Technically, if some variable X has the log-normal distribution, then its natural logarithm, ln(X), is normally distributed.

While the sums of random variables tend to have the normal distribution, the products of random variables tend to have the log-normal distribution. Imagine that every college graduate starts with the same base salary, maybe $40,000. Each year of his career, he randomly receives some raise: perhaps a 50% chance of a 0% raise, a 30% chance of a 3% raise, and a 20% chance of a 10% raise. Since these increases are multiplicative, we would expect the distribution of incomes after twenty years to be roughly log-normal.

V. Probability distributions for statistics

There are a handful of probability distributions that we frequently use for statistics, but which rarely show up in the real world. The first is the chi-squared distribution. Technically, if we start with a variable that has the standard normal distribution, its square has the chi-squared distribution. If we add N chi-squared variables to each other, the sum is chi-squared with N degrees of freedom:

    Z_i \sim N(0,1) \;\Rightarrow\; \sum_{i=1}^{N} Z_i^2 \sim \chi^2(N)

We will discuss these degrees of freedom later. (Generally, they are the number of observations we have, minus the number of parameters that we estimated.) We use the chi-squared distribution to describe sample variances, which are calculated from the sum of squared values of a variable.

When we have a bunch of normal random variables and we calculate their average, X̄ = Σᵢ Xᵢ/N, then the difference between this sample mean X̄ and the true mean µ, divided by the square root of the sample variance over N (that is, by √(Σᵢ(Xᵢ − X̄)²/(N(N − 1)))), has the t-distribution (with N − 1 degrees of freedom). We use the t-distribution when we want to test hypotheses about the value of an estimated parameter. If we knew the true variance in the parameter, we could standardize it and compare it to the standard normal distribution. However, when we use an estimated variance that we calculate from our sample, we compare the standardized variable to the t-distribution. In many cases, the difference is minor, especially when we have a large

sample. (As the degrees of freedom get large, the t-distribution approaches the normal distribution.) Some textbooks and teachers distinguish between small-sample tests with the t-distribution and large-sample tests with the normal distribution; technically, the small-sample test is always the only correct way to do things when we use an estimated variance, but the difference is minor.

Finally, when we have two chi-squared variables, V₁ ~ χ²(N₁) and V₂ ~ χ²(N₂), the ratio (V₁/N₁)/(V₂/N₂) has an F-distribution with N₁ degrees of freedom in the numerator and N₂ degrees of freedom in the denominator. The F-distribution is most often used to compare two estimated variances, to test if they are the same.

VI. Statistics

When dealing with data, we observe a random sample that is generated from some probability distribution. For example, if X is distributed normally with mean five and variance four, then we might draw ten observations with the values:

1.91, 2.53, 2.81, 5.28, 5.29, 5.53, 5.60, 5.73, 5.96, 7.27

or with the values:

1.60, 3.39, 3.68, 3.74, 4.62, 5.11, 5.32, 7.00, 7.54, 8.24

In statistics, the problem is that we don't know the mean and variance of the population (or other parameters that relate to the population), so we are trying to guess them. We do, of course, know the mean and variance of our samples. We calculate these with the formulas:

    \bar{X} = \hat{E}[X] = \sum_{i=1}^{N} X_i / N

    \hat{Var}(X) = \sum_{i=1}^{N} (X_i - \bar{X})^2 / (N - 1)

(In general, I will use the hat to indicate a sample variance or similar calculation, except when I'm describing the sample mean.) With the two samples above, we would calculate sample means of 4.79 and 5.02, and sample variances of 3.04 and 4.30. These are not the same as the population parameters, but they would be reasonable guesses if we didn't know the true mean and variance, which, of course, we almost never know in practice.

Statistics is all about guessing. An estimator is a formula or technique used to guess the value of some parameter.
An estimate, in contrast, is the number that we get out of that formula, given our sample. Thus, X̄ = Σᵢ Xᵢ/N is an estimator of the population mean; again, this word refers to the formula itself. With this formula, we obtain estimates of 4.79

and 5.02 on the two samples. Similarly, the formula V̂ar(X) = Σᵢ(Xᵢ − X̄)²/(N − 1) is an estimator of the population variance.

There are many ways that we could estimate any particular parameter. Some of these techniques will be silly, and some will be sensible. For example, consider three different estimators of the population mean, µ:

    \hat{\mu}_1 = \sum_{i=1}^{N} X_i / N
    \hat{\mu}_2 = X_1
    \hat{\mu}_3 = 3.14

These are all formulas we could use to guess the population mean from our sample; since they are formulas, they qualify as estimators. However, the second and third methods seem a bit silly: the second uses only one person's value of X to guess the population average, and the third uses a completely arbitrary number.

Now, we need to define some desirable properties of estimators, so we can talk about what is sensible. Let's use θ to denote some parameter that we're trying to estimate, and let θ̂ represent our estimator.

Unbiased: The estimator θ̂ is unbiased if E[θ̂] = θ; that is, we would expect it to be correct, on average.

Consistent: The estimator θ̂ is consistent if θ̂ → θ as N → ∞; that is, the estimator gets the value exactly correct as our sample gets larger and larger.

Efficient: The estimator θ̂ is efficient if it minimizes Var(θ̂); that is, it is the most precise estimator available.

The sample mean is an unbiased estimator of the population mean. This means that, on average, it will be correct. Look at the estimates we obtained from the two samples above: one was a bit below the true value, and one was a bit above. This is exactly what we expect from an unbiased estimator.

We can demonstrate formally that this estimator is unbiased. To do this, we need to calculate the expected value of µ̂₁. µ̂₁ is just a formula, some function of the values of the variable in our sample. We need to figure out the expected value of that function:

    E[\hat{\mu}_1] = E\left[ \sum_{i=1}^{N} X_i / N \right] = E[X_1/N + X_2/N + \dots + X_N/N]

Remember that we can always separate an expectation where we add or subtract components (but not where we multiply or divide, unless the variables are independent of each other):

    E[\hat{\mu}_1] = E[X_1/N + X_2/N + \dots + X_N/N] = E[X_1/N] + E[X_2/N] + \dots + E[X_N/N]

In addition, we can always move a constant (like 1/N) outside of the expectation:

    E[\hat{\mu}_1] = \frac{1}{N} E[X_1] + \frac{1}{N} E[X_2] + \dots + \frac{1}{N} E[X_N]

Now we need to apply the expectation to each of those variables. Remember that the expected value of any Xᵢ is the population mean, µ:

    E[\hat{\mu}_1] = \frac{1}{N}\mu + \frac{1}{N}\mu + \dots + \frac{1}{N}\mu = N \cdot \left( \frac{1}{N}\mu \right) = \mu

Thus, we have shown that this formula, µ̂₁, is an unbiased estimator of µ.

What about the other estimators? As it turns out, µ̂₂ is actually an unbiased estimator of the population mean: E[µ̂₂] = E[X₁] = µ. Our first observation is just as likely to be above average as below average. Even though it seems silly to discard the rest of our sample, using only one observation is unbiased. The third estimator, however, is generally biased (unless we're sampling calculations of pi, I guess): E[3.14] = 3.14 ≠ µ, in general.

What about consistency? As the sample grows larger and approaches the entire population, the sample mean should indeed approach the population mean, so the first estimator is consistent. The second is not: if we were to keep expanding the sample, this estimator wouldn't change. We'd still expect it to be above average half the time and below average half the time. Finally, the third estimator is wrong no matter how large our sample is. Usually, there's little difference between consistency and unbiasedness: if an estimator is one, then it is the other, as well.
(In fact, if we wanted to show consistency formally, we'd usually show that the estimator is unbiased, and then we'd show that the variability in the estimator gets smaller and smaller as the sample size gets larger, meaning that the estimator is perfectly precise when the sample is infinitely large.) Exceptions are usually a bit weird. Using a limited sub-sample to

estimate a mean is unbiased but not consistent; using the formula Σ(Xᵢ − X̄)²/N to estimate the variance (instead of Σ(Xᵢ − X̄)²/(N − 1)) is biased, but the amount of the bias gets smaller and smaller as the sample grows larger, and the bias disappears when the sample is infinitely large.

Finally, we should talk about efficiency, which means that we need to address the variance in an estimator. Estimates are random variables themselves: they vary with the random sample that we collected. Two different samples drawn from the same distribution, using the same estimator, will generally yield different estimates. We want to know how much we would expect the value of the estimator to vary from sample to sample, so we calculate Var(θ̂) = E[(θ̂ − E[θ̂])²] for our estimator. For example, the variance in µ̂₁ would be:

    Var(\hat{\mu}_1) = E[(\hat{\mu}_1 - E[\hat{\mu}_1])^2] = E[(\textstyle\sum X_i / N - \mu)^2]

Usually, the first steps in calculating the variance are plugging in the expected value of the estimator (the population mean, µ, in this case) and substituting the formula in place of µ̂₁. We also need to specify a bit more about the distribution of X: let's assume that each Xᵢ has a variance of σ², and that Xᵢ and Xⱼ are uncorrelated (as would be the case with a truly random sample). Now, we have to plow through the algebra:

    Var(\hat{\mu}_1) = E[(\textstyle\sum X_i / N - \mu)^2]
                     = E[(\textstyle\sum X_i / N - N \cdot (\mu/N))^2]
                     = E[(X_1/N + X_2/N + \dots + X_N/N - \mu/N - \mu/N - \dots - \mu/N)^2]
                     = E[((X_1/N - \mu/N) + (X_2/N - \mu/N) + \dots + (X_N/N - \mu/N))^2]

Now let's square terms. We'll have:

    Var(\hat{\mu}_1) = E[(X_1/N - \mu/N)^2 + (X_2/N - \mu/N)^2 + \dots + (X_N/N - \mu/N)^2
                       + 2(X_1/N - \mu/N)(X_2/N - \mu/N) + \dots + 2(X_1/N - \mu/N)(X_N/N - \mu/N)
                       + 2(X_2/N - \mu/N)(X_3/N - \mu/N) + \dots + 2(X_2/N - \mu/N)(X_N/N - \mu/N)
                       + \dots + 2(X_{N-1}/N - \mu/N)(X_N/N - \mu/N)]

Now we can break up the expectation at each summation, and we can factor a constant 1/N² out of each term:

    Var(\hat{\mu}_1) = \frac{1}{N^2} E[(X_1 - \mu)^2] + \frac{1}{N^2} E[(X_2 - \mu)^2] + \dots + \frac{1}{N^2} E[(X_N - \mu)^2]
                       + \frac{2}{N^2} E[(X_1 - \mu)(X_2 - \mu)] + \dots + \frac{2}{N^2} E[(X_1 - \mu)(X_N - \mu)]
                       + \frac{2}{N^2} E[(X_2 - \mu)(X_3 - \mu)] + \dots + \frac{2}{N^2} E[(X_2 - \mu)(X_N - \mu)]
                       + \dots + \frac{2}{N^2} E[(X_{N-1} - \mu)(X_N - \mu)]

We can now apply the expectation. In the first line of this expression, we have the variances in the variables, and E[(Xᵢ − µ)²] = σ². In the following lines, we have a bunch of covariances, and E[(Xᵢ − µ)(Xⱼ − µ)] = 0. Thus, the variance in µ̂₁ is:

    Var(\hat{\mu}_1) = \frac{1}{N^2}(\sigma^2 + \sigma^2 + \dots + \sigma^2) + \frac{2}{N^2}(0 + 0 + \dots + 0) = \frac{1}{N^2}(N\sigma^2) = \frac{\sigma^2}{N}

That's it. If we wanted to talk about the efficiency of the estimators, we would compare the variance of µ̂₁ to the variance in the other estimators:

    Var(\hat{\mu}_2) = E[(X_1 - E[X_1])^2] = E[(X_1 - \mu)^2] = \sigma^2

As long as our sample has more than one observation, N > 1, the first estimator is more efficient (more precise), since it has less variance. The variance in the third estimator is:

    Var(\hat{\mu}_3) = E[(3.14 - E[3.14])^2] = E[(3.14 - 3.14)^2] = 0

This is actually the most efficient estimator. However, since it's biased (that is, wrong), it doesn't make sense to use it. Often our criterion for the best estimator is the minimum variance unbiased estimator (sometimes called the MVUE), the most efficient of the unbiased estimators.

In general, when we're faced with an estimator, we need to figure out three things: its mean, its variance, and its distribution. The estimator inherits all of these properties from the variables used to calculate it. Once we know these characteristics, we can construct confidence intervals for our estimate and test hypotheses about the variable.
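The claims about µ̂₁ and µ̂₂ (both unbiased, but the sample mean far more efficient, with Var(µ̂₁) = σ²/N versus Var(µ̂₂) = σ²) can be checked by simulation. A sketch of my own, assuming a normal population with µ = 5 and σ = 2 and samples of size N = 10:

```python
import random
import statistics

random.seed(42)
mu, sigma, n = 5.0, 2.0, 10    # assumed population parameters and sample size
reps = 20_000                  # number of simulated samples

est1, est2 = [], []            # estimates from mu-hat-1 and mu-hat-2
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    est1.append(sum(sample) / n)   # mu-hat-1: the sample mean
    est2.append(sample[0])         # mu-hat-2: the first observation only

# Both estimators average out to roughly mu = 5 (unbiasedness)...
print(statistics.mean(est1), statistics.mean(est2))
# ...but the sample mean varies far less: sigma^2/N = 0.4 versus sigma^2 = 4.
print(statistics.variance(est1), statistics.variance(est2))
```

Each run draws 20,000 fresh samples, so the averages of the two estimators sit near 5 while their variances differ by roughly the factor of N derived above.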

A Q% confidence interval for a parameter estimate is a range of values that has a Q% chance of including the true value of the parameter. We should note that it is technically incorrect to say that there is a Q% chance that the true value of the parameter lies within this range. Since the true value is a fixed, constant, non-random quantity, it makes no sense to talk about a "chance" relating to something non-random. The parameter estimate is a random variable, since it depends on the specific values in our random sample. The confidence interval is also something random, since it is based around our parameter estimate. It is correct to think about the chance that this confidence interval does something: in this case, including the true value of the parameter.

When we want to test a hypothesis, we must define two things: the null hypothesis, which is the thing to be tested, and the alternative hypothesis, which is what we believe to be true if the null is not. For example, we might want to test whether the true value of the parameter θ could be 2.13. This would be our null hypothesis, H₀: θ = 2.13. A natural alternative is that θ is not equal to 2.13, H_A: θ ≠ 2.13. Often the alternative hypothesis is simple and innocuous like this, but sometimes we make stronger assumptions.

To test a hypothesis, we define a confidence level, which I will call Q%, and then we go through the following steps:

1. Assume that the null hypothesis is true.
2. If the null hypothesis is true, then find the distribution of the estimator (its mean, variance, and type of distribution).
3. Find the probability of obtaining this estimate, or one even further from the hypothesized value, if the null is true. This probability is called a p-value.
4. If the p-value is less than 1 − Q/100, you reject the null hypothesis; if the p-value is more than 1 − Q/100, you fail to reject the null hypothesis.

For example, suppose that we believe that the variable X is distributed normally with mean µ and variance σ² (but we don't know these values). We collect a random sample of ten observations:

1.91, 2.53, 2.81, 5.28, 5.29, 5.53, 5.60, 5.73, 5.96, 7.27

Our sample mean is 4.79, and our sample variance is 3.04. Now we want to test the hypothesis H₀: µ = 2.13 against the alternative H_A: µ ≠ 2.13. Our sample mean isn't 2.13, but we're asking the question: how unlikely would it be to observe a sample mean of 4.79 if the true mean were really 2.13?

If the null hypothesis is true, then X̄ = Σᵢ Xᵢ/N should be normally distributed (adding normal random variables gives us another normal random variable) with a mean of µ and a variance of σ²/N.
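Carrying the example through numerically, here is a sketch (my addition) that standardizes the sample mean under H₀: µ = 2.13 and computes a two-sided p-value from the standard normal c.d.f. Strictly speaking, using the estimated variance calls for the t-distribution with N − 1 = 9 degrees of freedom, but with a standardized statistic near 4.8 either reference distribution rejects decisively:

```python
import math
import statistics

data = [1.91, 2.53, 2.81, 5.28, 5.29, 5.53, 5.60, 5.73, 5.96, 7.27]
n = len(data)
xbar = statistics.mean(data)       # about 4.79
s2 = statistics.variance(data)     # about 3.04, using the (N - 1) formula

# Standardize the sample mean under H0: mu = 2.13; its estimated variance is s2/n
z = (xbar - 2.13) / math.sqrt(s2 / n)

# Two-sided p-value from the standard normal c.d.f.
p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
print(z, p_value)                  # z is near 4.8; the p-value is tiny, so reject H0
```

At any conventional confidence level (90%, 95%, 99%), this p-value leads us to reject the null hypothesis that µ = 2.13.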


More information

Example: Find the expected value of the random variable X. X 2 4 6 7 P(X) 0.3 0.2 0.1 0.4

Example: Find the expected value of the random variable X. X 2 4 6 7 P(X) 0.3 0.2 0.1 0.4 MATH 110 Test Three Outline of Test Material EXPECTED VALUE (8.5) Super easy ones (when the PDF is already given to you as a table and all you need to do is multiply down the columns and add across) Example:

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

More information

STATISTICS 8: CHAPTERS 7 TO 10, SAMPLE MULTIPLE CHOICE QUESTIONS

STATISTICS 8: CHAPTERS 7 TO 10, SAMPLE MULTIPLE CHOICE QUESTIONS STATISTICS 8: CHAPTERS 7 TO 10, SAMPLE MULTIPLE CHOICE QUESTIONS 1. If two events (both with probability greater than 0) are mutually exclusive, then: A. They also must be independent. B. They also could

More information

The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math. P (x) = 5! = 1 2 3 4 5 = 120. The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

More information

ECE302 Spring 2006 HW3 Solutions February 2, 2006 1

ECE302 Spring 2006 HW3 Solutions February 2, 2006 1 ECE302 Spring 2006 HW3 Solutions February 2, 2006 1 Solutions to HW3 Note: Most of these solutions were generated by R. D. Yates and D. J. Goodman, the authors of our textbook. I have added comments in

More information

Chapter 5. Discrete Probability Distributions

Chapter 5. Discrete Probability Distributions Chapter 5. Discrete Probability Distributions Chapter Problem: Did Mendel s result from plant hybridization experiments contradicts his theory? 1. Mendel s theory says that when there are two inheritable

More information

Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3

Solutions: Problems for Chapter 3. Solutions: Problems for Chapter 3 Problem A: You are dealt five cards from a standard deck. Are you more likely to be dealt two pairs or three of a kind? experiment: choose 5 cards at random from a standard deck Ω = {5-combinations of

More information

AMS 5 CHANCE VARIABILITY

AMS 5 CHANCE VARIABILITY AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and

More information

COMMON CORE STATE STANDARDS FOR

COMMON CORE STATE STANDARDS FOR COMMON CORE STATE STANDARDS FOR Mathematics (CCSSM) High School Statistics and Probability Mathematics High School Statistics and Probability Decisions or predictions are often based on data numbers in

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density

Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density HW MATH 461/561 Lecture Notes 15 1 Definition: Suppose that two random variables, either continuous or discrete, X and Y have joint density and marginal densities f(x, y), (x, y) Λ X,Y f X (x), x Λ X,

More information

Section 5.1 Continuous Random Variables: Introduction

Section 5.1 Continuous Random Variables: Introduction Section 5. Continuous Random Variables: Introduction Not all random variables are discrete. For example:. Waiting times for anything (train, arrival of customer, production of mrna molecule from gene,

More information

WHERE DOES THE 10% CONDITION COME FROM?

WHERE DOES THE 10% CONDITION COME FROM? 1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science Mondays 2:10 4:00 (GB 220) and Wednesdays 2:10 4:00 (various) Jeffrey Rosenthal Professor of Statistics, University of Toronto

More information

If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not

If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not 4.1 REVIEW AND PREVIEW RARE EVENT RULE FOR INFERENTIAL STATISTICS If, under a given assumption, the of a particular observed is extremely, we conclude that the is probably not. 4.2 BASIC CONCEPTS OF PROBABILITY

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

9. Sampling Distributions

9. Sampling Distributions 9. Sampling Distributions Prerequisites none A. Introduction B. Sampling Distribution of the Mean C. Sampling Distribution of Difference Between Means D. Sampling Distribution of Pearson's r E. Sampling

More information

Statistics 104: Section 6!

Statistics 104: Section 6! Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC

More information

Probability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X

Probability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X Week 6 notes : Continuous random variables and their probability densities WEEK 6 page 1 uniform, normal, gamma, exponential,chi-squared distributions, normal approx'n to the binomial Uniform [,1] random

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8. Random variables Remark on Notations 1. When X is a number chosen uniformly from a data set, What I call P(X = k) is called Freq[k, X] in the courseware. 2. When X is a random variable, what I call F ()

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Random variables, probability distributions, binomial random variable

Random variables, probability distributions, binomial random variable Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Master s Theory Exam Spring 2006

Master s Theory Exam Spring 2006 Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

More information

Answer Key for California State Standards: Algebra I

Answer Key for California State Standards: Algebra I Algebra I: Symbolic reasoning and calculations with symbols are central in algebra. Through the study of algebra, a student develops an understanding of the symbolic language of mathematics and the sciences.

More information

Statistics and Random Variables. Math 425 Introduction to Probability Lecture 14. Finite valued Random Variables. Expectation defined

Statistics and Random Variables. Math 425 Introduction to Probability Lecture 14. Finite valued Random Variables. Expectation defined Expectation Statistics and Random Variables Math 425 Introduction to Probability Lecture 4 Kenneth Harris kaharri@umich.edu Department of Mathematics University of Michigan February 9, 2009 When a large

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Probability Models.S1 Introduction to Probability

Probability Models.S1 Introduction to Probability Probability Models.S1 Introduction to Probability Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard The stochastic chapters of this book involve random variability. Decisions are

More information

The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going?

The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going? The Normal Approximation to Probability Histograms Where are we going? Probability histograms The normal approximation to binomial histograms The normal approximation to probability histograms of sums

More information

6.3 Conditional Probability and Independence

6.3 Conditional Probability and Independence 222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

More information

Discrete Structures for Computer Science

Discrete Structures for Computer Science Discrete Structures for Computer Science Adam J. Lee adamlee@cs.pitt.edu 6111 Sennott Square Lecture #20: Bayes Theorem November 5, 2013 How can we incorporate prior knowledge? Sometimes we want to know

More information

Sums of Independent Random Variables

Sums of Independent Random Variables Chapter 7 Sums of Independent Random Variables 7.1 Sums of Discrete Random Variables In this chapter we turn to the important question of determining the distribution of a sum of independent random variables

More information

FACTORING QUADRATICS 8.1.1 and 8.1.2

FACTORING QUADRATICS 8.1.1 and 8.1.2 FACTORING QUADRATICS 8.1.1 and 8.1.2 Chapter 8 introduces students to quadratic equations. These equations can be written in the form of y = ax 2 + bx + c and, when graphed, produce a curve called a parabola.

More information

Mathematics Pre-Test Sample Questions A. { 11, 7} B. { 7,0,7} C. { 7, 7} D. { 11, 11}

Mathematics Pre-Test Sample Questions A. { 11, 7} B. { 7,0,7} C. { 7, 7} D. { 11, 11} Mathematics Pre-Test Sample Questions 1. Which of the following sets is closed under division? I. {½, 1,, 4} II. {-1, 1} III. {-1, 0, 1} A. I only B. II only C. III only D. I and II. Which of the following

More information

You flip a fair coin four times, what is the probability that you obtain three heads.

You flip a fair coin four times, what is the probability that you obtain three heads. Handout 4: Binomial Distribution Reading Assignment: Chapter 5 In the previous handout, we looked at continuous random variables and calculating probabilities and percentiles for those type of variables.

More information

LOGNORMAL MODEL FOR STOCK PRICES

LOGNORMAL MODEL FOR STOCK PRICES LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as

More information

Simple Random Sampling

Simple Random Sampling Source: Frerichs, R.R. Rapid Surveys (unpublished), 2008. NOT FOR COMMERCIAL DISTRIBUTION 3 Simple Random Sampling 3.1 INTRODUCTION Everyone mentions simple random sampling, but few use this method for

More information

Chi Square Tests. Chapter 10. 10.1 Introduction

Chi Square Tests. Chapter 10. 10.1 Introduction Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square

More information

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

More information

What are the place values to the left of the decimal point and their associated powers of ten?

What are the place values to the left of the decimal point and their associated powers of ten? The verbal answers to all of the following questions should be memorized before completion of algebra. Answers that are not memorized will hinder your ability to succeed in geometry and algebra. (Everything

More information

4.1 4.2 Probability Distribution for Discrete Random Variables

4.1 4.2 Probability Distribution for Discrete Random Variables 4.1 4.2 Probability Distribution for Discrete Random Variables Key concepts: discrete random variable, probability distribution, expected value, variance, and standard deviation of a discrete random variable.

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

Basic Probability Theory II

Basic Probability Theory II RECAP Basic Probability heory II Dr. om Ilvento FREC 408 We said the approach to establishing probabilities for events is to Define the experiment List the sample points Assign probabilities to the sample

More information

36 Odds, Expected Value, and Conditional Probability

36 Odds, Expected Value, and Conditional Probability 36 Odds, Expected Value, and Conditional Probability What s the difference between probabilities and odds? To answer this question, let s consider a game that involves rolling a die. If one gets the face

More information

Factoring Polynomials

Factoring Polynomials Factoring Polynomials Hoste, Miller, Murieka September 12, 2011 1 Factoring In the previous section, we discussed how to determine the product of two or more terms. Consider, for instance, the equations

More information

MATH 21. College Algebra 1 Lecture Notes

MATH 21. College Algebra 1 Lecture Notes MATH 21 College Algebra 1 Lecture Notes MATH 21 3.6 Factoring Review College Algebra 1 Factoring and Foiling 1. (a + b) 2 = a 2 + 2ab + b 2. 2. (a b) 2 = a 2 2ab + b 2. 3. (a + b)(a b) = a 2 b 2. 4. (a

More information

Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab

Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab Monte Carlo Simulation: IEOR E4703 Fall 2004 c 2004 by Martin Haugh Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab 1 Overview of Monte Carlo Simulation 1.1 Why use simulation?

More information

calculating probabilities

calculating probabilities 4 calculating probabilities Taking Chances What s the probability he s remembered I m allergic to non-precious metals? Life is full of uncertainty. Sometimes it can be impossible to say what will happen

More information

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators... MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information