Probability, Mean and Median

In the last section, we considered (probability) density functions and discussed their relationship with cumulative distribution functions. The goal of this section is to take a closer look at densities, introduce some common distributions, and discuss the mean and median.

Recall that we define probabilities as follows:

Proportion of population for which $x$ is between $a$ and $b$ = Area under the graph of $p(x)$ between $a$ and $b$ = $\int_a^b p(x)\,dx$

The cumulative distribution function gives the proportion of the population that has values below $t$. That is,

$$P(t) = \int_{-\infty}^{t} p(x)\,dx = \text{Proportion of population having values of } x \text{ below } t.$$

When answering questions involving probabilities, both the density function and the cumulative distribution function can be used, as the next example illustrates.

Example 1: Consider the graph of the function $p(x)$.

Figure 1: The graph of the function $p(x)$ [a triangle with vertices $(0, 0)$, $(6, 0.2)$ and $(10, 0)$].

a. Explain why the function is a probability density function.
b. Use the graph to find $P(X < 3)$.
c. Use the graph to find $P(3 \le X \le 8)$.
Solution:

a. Recall that a function is a probability density function if the area under the curve is equal to 1 and all of the values of $p(x)$ are non-negative. It is immediately clear that the values of $p(x)$ are non-negative. To verify that the area under the curve is equal to 1, we recognize that the graph above can be viewed as a triangle. Its base is 10 and its height is 0.2. Thus its area is equal to $\frac{1}{2}(10)(0.2) = 1$.

b. There are two ways that we can solve this problem. Before we get started, though, we begin by drawing the shaded region.

[Figure: the graph of $p(x)$ with the region under the curve from $x = 0$ to $x = 3$ shaded.]

The first approach is to recognize that we can determine the area under the curve from 0 to 3 immediately. The shaded area is another triangle, with a base of 3 and a height of 0.1. Thus, the area is equal to $\frac{1}{2}(3)(0.1) = 0.15$.

A second approach would be to find the equations of the lines that form $p(x)$ and use the integral formula recalled at the start of this section. The first line passes through the points $(0, 0)$ and $(6, 0.2)$. Using the point-slope formula, we see that the line is given by $p(x) = \frac{1}{30}x$. The second line passes through the points $(6, 0.2)$ and $(10, 0)$. Again using the point-slope formula, we see that the line is given by $p(x) = -\frac{1}{20}x + \frac{1}{2}$. So, we have that

$$p(x) = \begin{cases} \dfrac{1}{30}x & \text{if } 0 \le x \le 6 \\[4pt] -\dfrac{1}{20}x + \dfrac{1}{2} & \text{if } 6 < x \le 10 \\[4pt] 0 & \text{otherwise} \end{cases}$$

Returning to the original question, we have that $P(X < 3)$ is given by the integral

$$\int_0^3 p(x)\,dx = P(3) - P(0).$$

On $[0, 3)$, $p(x) = \frac{1}{30}x$. Notice that $P(t) = \frac{1}{60}t^2$ for $0 \le t \le 6$. So, we have that

$$P(X < 3) = P(3) - P(0) = \frac{1}{60}(3)^2 - \frac{1}{60}(0)^2 = \frac{9}{60} = 0.15.$$
c. Again, we have two ways that we can approach this problem, and again we start by drawing the shaded region.

[Figure: the graph of $p(x)$ with the region under the curve from $x = 3$ to $x = 8$ shaded.]

If we want to use triangles, it is easiest to use the fact that the area under the curve is equal to 1. The shaded region is thus equal to one minus the two triangles on the sides. In (b), we found that the area of the left triangle is equal to 0.15. The right triangle has base 2 and height $p(8) = 0.1$, so its area is equal to $\frac{1}{2}(2)(0.1) = 0.1$. So, the area of the shaded region is $1 - 0.15 - 0.1 = 0.75$.

If instead we were to use integrals, notice that $p(x)$ changes formulas at $x = 6$. Thus, in order to compute the integral $\int_3^8 p(x)\,dx$, we need to split it into two pieces. That is,

$$\int_3^8 p(x)\,dx = \int_3^6 p(x)\,dx + \int_6^8 p(x)\,dx.$$

$$\int_3^6 p(x)\,dx = \int_3^6 \frac{x}{30}\,dx = \left.\frac{x^2}{60}\right|_3^6 = 0.45.$$

$$\int_6^8 p(x)\,dx = \int_6^8 \left(-\frac{x}{20} + \frac{1}{2}\right)dx = \left.\left(-\frac{x^2}{40} + \frac{x}{2}\right)\right|_6^8 = 0.3.$$

So, we see that the shaded area is equal to $0.45 + 0.3 = 0.75$, which agrees with the answer we found the other way.

Often, we are concerned with finding the average value of a distribution. There are two common measures that are used: the mean and the median.

The Mean

If a quantity has a density function $p(x)$, then we define the mean value of the quantity as

$$\int_{-\infty}^{\infty} x\,p(x)\,dx.$$
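The three answers to Example 1 can also be checked numerically. The following is a minimal sketch, assuming the piecewise formula for p(x) derived in part (b); the helper name `integrate` and the choice of a midpoint Riemann sum are illustrative assumptions, not part of the original notes.

```python
# Sketch: check the Example 1 answers with a midpoint Riemann sum.

def p(x):
    """Triangular density from Example 1: peak of 0.2 at x = 6, zero outside [0, 10]."""
    if 0 <= x <= 6:
        return x / 30
    if 6 < x <= 10:
        return -x / 20 + 0.5
    return 0.0


def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(p, 0, 10)    # part a: should be close to 1
prob_below_3 = integrate(p, 0, 3)   # part b: should be close to 0.15
prob_3_to_8 = integrate(p, 3, 8)    # part c: should be close to 0.75

print(total_area, prob_below_3, prob_3_to_8)
```

Because p(x) is piecewise linear, the midpoint rule is essentially exact here; for a curved density the same sketch would only give an approximation.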
Example 2: Returning to the density function given in Example 1, compute its mean.

Solution: Notice that $p(x)$ changes formulas at $x = 6$. Thus, in order to compute the integral $\int_{-\infty}^{\infty} x\,p(x)\,dx$, we will again need to split it into two pieces. Thus, we have that the mean is equal to

$$\begin{aligned} \int_{-\infty}^{\infty} x\,p(x)\,dx &= \int_0^6 x \cdot \frac{x}{30}\,dx + \int_6^{10} x\left(-\frac{x}{20} + \frac{1}{2}\right)dx \\ &= \left.\frac{x^3}{90}\right|_0^6 + \left.\left(-\frac{x^3}{60} + \frac{x^2}{4}\right)\right|_6^{10} \\ &= \frac{216}{90} + \frac{176}{60} = \frac{16}{3}. \end{aligned}$$

The Median

A median of a quantity $x$ distributed through a population is a value $T$ such that half of the population has values of $x$ less than $T$ and half of the population has values of $x$ greater than $T$. That is, $T$ satisfies the equation

$$\int_{-\infty}^{T} p(x)\,dx = \frac{1}{2},$$

where $p(x)$ is the density function of the quantity. In words, half the area under the graph of $p(x)$ lies to the left of $T$ (and half lies to the right of $T$).

Example 3: Returning to the density function given in Example 1, compute its median.

Solution: Looking at Figure 1, notice that more than half of the area occurs in the left side of the triangle (the area to the left of $x = 6$ is $\frac{1}{2}(6)(0.2) = 0.6$). Thus, the median will be a number between 0 and 6.
Since we do not need to worry about the function changing (it is given by the same formula on the interval $[0, 6]$), we have that

$$\int_0^T \frac{x}{30}\,dx = \frac{1}{2}. \quad \text{That is,} \quad \frac{T^2}{60} = \frac{1}{2}.$$

Solving for $T$, we see that $T = \sqrt{30} \approx 5.48$.

Note: We did not use $-\sqrt{30}$ for $T$, since we know that $T$ is a positive number.

There are a number of important distributions that arise in a variety of situations. Below, we list three such distributions as well as associated properties.

The first important distribution we shall consider is the uniform distribution. We introduced this distribution in the previous section. The graph of the density function is constant on the interval $[a, b]$ and zero elsewhere.

Figure 2: The density of the uniform distribution on $[a, b]$ [constant height $\frac{1}{b-a}$ between $a$ and $b$].

Uniform Distribution

The density of the uniform distribution is given by

$$p(x) = \frac{1}{b - a}, \quad \text{for } a \le x \le b.$$

The cumulative distribution function is given by

$$P(t) = \int_a^t p(x)\,dx = \frac{t - a}{b - a}, \quad \text{for } a \le t \le b.$$

Another important distribution we shall consider is the exponential distribution. The graph of the density function is characterized by exponential decay.
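Looking back at Examples 2 and 3, the mean 16/3 and the median √30 of the triangular density from Example 1 can be checked numerically. This is only a sketch under stated assumptions: the helper names `integrate` and `cdf`, the midpoint rule, and the use of bisection to solve P(T) = 1/2 are illustrative choices that do not appear in the notes.

```python
import math


def p(x):
    """Triangular density from Example 1."""
    if 0 <= x <= 6:
        return x / 30
    if 6 < x <= 10:
        return -x / 20 + 0.5
    return 0.0


def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Mean: integral of x * p(x) over the interval where p is non-zero.
mean = integrate(lambda x: x * p(x), 0, 10)


def cdf(t):
    """Numerical cumulative distribution P(t) for 0 <= t <= 10 (hypothetical helper)."""
    return integrate(p, 0, t, n=2_000)


# Median: solve cdf(T) = 1/2 by bisection on [0, 10]; cdf is increasing there.
lo, hi = 0.0, 10.0
for _ in range(50):
    mid = (lo + hi) / 2
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

print(mean, 16 / 3)            # the two values should agree closely
print(median, math.sqrt(30))   # the two values should agree closely
```

Bisection is used because it needs only the fact that the cumulative distribution function is increasing, so the same sketch would apply to densities where the equation for T cannot be solved by hand.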
Figure 3: The density of the exponential distribution for $c > 0$.

Exponential Distribution

The density of the exponential distribution is given by

$$p(x) = c\,e^{-cx}, \quad \text{for } x \ge 0 \text{ and any constant } c > 0.$$

The cumulative distribution function is given by

$$P(t) = \int_0^t p(x)\,dx = 1 - e^{-ct}, \quad \text{for } t \ge 0.$$

Example 4: Suppose that the probability density function for the wait time in line at a counter is given by

$$p(x) = \begin{cases} 0 & \text{if } x < 0 \\ k\,e^{-x/5} & \text{if } x \ge 0 \end{cases}$$

a. What is the value of the constant $k$?
b. Determine the probability that a person will wait at least 3 minutes.
c. What is the mean wait time?

Solution:

a. Comparing the form of the density function with that given in the box above, we see that $c = 1/5$. Thus, we must have that $k = 1/5$. Another way to see this would be to do the integration and solve for $k$:

$$1 = \int_0^{\infty} k\,e^{-x/5}\,dx = \lim_{b \to \infty} \int_0^b k\,e^{-x/5}\,dx = \lim_{b \to \infty} \left. -5k\,e^{-x/5} \right|_0^b = \lim_{b \to \infty} \left(5k - 5k\,e^{-b/5}\right) = 5k.$$

Dividing both sides by 5, we see that $k = 1/5$.
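b. The probability that a person will wait at least 3 minutes is given by

$$\int_3^{\infty} p(x)\,dx = \lim_{b \to \infty} \int_3^b p(x)\,dx = \lim_{b \to \infty} P(b) - P(3) = 1 - P(3) = 1 - \left(1 - e^{-3/5}\right) = e^{-3/5} \approx 0.549.$$

Here, we used the fact that $\lim_{b \to \infty} P(b) = 1$ to simplify the above expression.

c. The mean wait time is given by $\int_0^{\infty} \frac{x}{5}\,e^{-x/5}\,dx$. Using integration by parts, we have:

$$\begin{aligned} \int_0^{\infty} \frac{x}{5}\,e^{-x/5}\,dx &= \lim_{b \to \infty} \int_0^b \frac{x}{5}\,e^{-x/5}\,dx = \lim_{b \to \infty} \left( \left. -x\,e^{-x/5} \right|_0^b + \int_0^b e^{-x/5}\,dx \right) \\ &= \lim_{b \to \infty} \left. \left( -x\,e^{-x/5} - 5e^{-x/5} \right) \right|_0^b = \lim_{b \to \infty} \left( -b\,e^{-b/5} - 5e^{-b/5} + 5 \right) = 5. \end{aligned}$$

Note: In general, if $p(x) = c\,e^{-cx}$ for $x \ge 0$, then $\int_0^{\infty} x\,p(x)\,dx = \frac{1}{c}$.

The final distribution which we shall examine is the normal distribution. The graph of its density function is a bell-shaped curve which peaks at its mean, denoted by $m$. The width of the curve is determined by the standard deviation, denoted by $s$.

Figure 4: The density of the normal distribution with parameters $m$ and $s$.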
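The three answers to Example 4 (k = 1/5, a probability of e^{-3/5} of waiting at least 3 minutes, and a mean wait of 5 minutes) can be sanity-checked numerically. The sketch below makes two assumptions that are not in the notes: the improper integrals are truncated at a hypothetical cutoff `upper = 200`, beyond which the density is negligible, and the helper `integrate` uses a simple midpoint rule.

```python
import math

c = 1 / 5  # rate constant from Example 4, so that k = c = 1/5


def p(x):
    """Exponential wait-time density from Example 4."""
    return c * math.exp(-c * x) if x >= 0 else 0.0


def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


upper = 200.0  # assumed cutoff standing in for infinity; the tail beyond it is about e^(-40)

total = integrate(p, 0, upper)                        # part a: should be close to 1
prob_at_least_3 = integrate(p, 3, upper)              # part b: should be close to e^(-3/5)
mean_wait = integrate(lambda x: x * p(x), 0, upper)   # part c: should be close to 5

print(total)
print(prob_at_least_3, math.exp(-3 / 5))
print(mean_wait, 1 / c)
```

Replacing `c` with another positive constant gives a quick check of the general note that the mean of the exponential distribution is 1/c, provided the cutoff is kept large relative to 1/c.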
Normal Distribution

The density of the normal distribution is given by

$$p(x) = \frac{1}{s\sqrt{2\pi}}\,e^{-\frac{(x - m)^2}{2s^2}}, \quad \text{for } -\infty < x < \infty,$$

where $m$ is the mean of the distribution and $s$ is the standard deviation.

It is beyond the scope of this course to verify that $\int_{-\infty}^{\infty} p(x)\,dx = 1$. However, we can see that $p(x) \ge 0$ for all $x$, since $e^{-\frac{(x - m)^2}{2s^2}}$ will always be positive (and at most 1) and $\frac{1}{s\sqrt{2\pi}}$ is a positive scalar.

The normal density function does not have an elementary antiderivative. That is, a closed form of the antiderivative does not exist. But, as Figure 4 above illustrates, there is still area under the curve. To evaluate the integral, we use a calculator or a table of values.

Example 5: Lengths of human pregnancies are normally distributed with mean 268 days and standard deviation 15 days. What percentage of pregnancies last between 250 days and 280 days?

Solution: Using the fact that $m = 268$ and $s = 15$, we have that the density function is given by

$$p(x) = \frac{1}{15\sqrt{2\pi}}\,e^{-\frac{(x - 268)^2}{2(15)^2}}.$$

Finding the integral numerically, we have:

$$\text{Proportion of pregnancies lasting between 250 days and 280 days} = \frac{1}{15\sqrt{2\pi}} \int_{250}^{280} e^{-\frac{(x - 268)^2}{2(15)^2}}\,dx \approx 0.673.$$

So about 67.3% of pregnancies last between 250 days and 280 days.
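The numerical value in Example 5 can be reproduced without a table. The sketch below is one possible way to do it, not the method used in the notes: it assumes the standard identity that expresses the normal cumulative distribution function through the error function (`math.erf` in Python's standard library), and cross-checks the result with a midpoint Riemann sum. The helper names `normal_cdf`, `density`, and `integrate` are hypothetical.

```python
import math

m, s = 268.0, 15.0  # mean and standard deviation from Example 5


def normal_cdf(t, m, s):
    """Normal cumulative distribution function written in terms of the error function."""
    return 0.5 * (1 + math.erf((t - m) / (s * math.sqrt(2))))


def density(x):
    """Normal density with mean m and standard deviation s."""
    return math.exp(-((x - m) ** 2) / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))


def integrate(f, a, b, n=30_000):
    """Midpoint-rule approximation of the integral of f over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


proportion = normal_cdf(280, m, s) - normal_cdf(250, m, s)
numeric = integrate(density, 250, 280)

print(round(proportion, 3), round(numeric, 3))  # both should round to 0.673
```

The error-function route is exact up to floating-point precision, while the Riemann sum mirrors what a calculator does when it evaluates the integral numerically.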