MATHEMATICS FOR ENGINEERS
STATISTICS
TUTORIAL 4
PROBABILITY DISTRIBUTIONS

CONTENTS
Sample Space
Accumulative Probability
Probability Distributions
Binomial Distribution
Normal Distribution
Poisson Distribution

You will find a useful calculation aid for all probability distributions at this web address:
http://www.stat.vt.edu/~sundar/java/applets/distributions.html#poisson

This tutorial is a continuation of outcome 4 tutorial 1.

D.J.Dunn www.freestudy.co.uk
SAMPLE SPACE AND ACCUMULATIVE FREQUENCY

Consider the case when two dice are rolled and the object is to guess the resulting score. With n = 6 faces on each die and x = 2 dice there are $6^2 = 36$ permutations, but only 11 possible scores (2 to 12), so you would only need 11 guesses to be sure of getting it right. If we arrange all the possible results into a table we get the sample space.

                      First Die
Second Die     1     2     3     4     5     6
    1          2     3     4     5     6     7
    2          3     4     5     6     7     8
    3          4     5     6     7     8     9
    4          5     6     7     8     9    10
    5          6     7     8     9    10    11
    6          7     8     9    10    11    12

There are 36 events with equal probability so p = 1/36. The probability of getting any score is hence the product of its frequency and this probability. In cases like this we should use a frequency table as shown, and this is called a probability distribution.

Score             2      3      4      5      6      7      8      9      10     11     12
Frequency f       1      2      3      4      5      6      5      4      3      2      1
P                 1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
Accumulative P    1/36   3/36   6/36   10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36

We can see that the probability of getting any given sum is not the same. The only way to get a sum of 2 is to roll a 1 on both dice, but you can get a sum of 7 in six different ways. Plotting shows that the distribution rises and falls linearly (a triangular shape) and is symmetrical about the middle value of 7.

The probabilities always add up to 1.0, so the accumulative probability finishes at 1.0. A probability can only be between 0 and 1. An event that is certain has a probability of 1 and an event that is impossible has a probability of 0.

The rule for the probability of a given sum is $P = f\,p$, where f is the frequency of the event and p is the equal probability of any single permutation, this being $p = 1/n^r$. Here n is the number of possibilities for each event (6) and r is the number of events (2, called x above). Hence in this case

$$P = \frac{1}{n^r} \times f = \frac{1}{36} \times f = \frac{f}{36}$$

It is important to note that the distribution is not a continuous function but a set of discrete values based on the integers 1, 2, 3 ... n.
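The two-dice frequency table can also be built by listing every permutation and counting. The following is a minimal Python sketch of that idea; the function name `two_dice_distribution` and the variable `dice_table` are illustrative choices and do not come from the tutorial.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides=6):
    """Return {score: (frequency, probability, accumulative probability)}."""
    # Every (first die, second die) pair is equally likely.
    freq = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    total = sides ** 2          # 36 permutations for two six-sided dice
    table = {}
    running = 0                 # running total of frequencies
    for score in sorted(freq):
        running += freq[score]
        table[score] = (
            freq[score],
            Fraction(freq[score], total),
            Fraction(running, total),
        )
    return table


dice_table = two_dice_distribution()
for score, (f, p, acc) in dice_table.items():
    # Fractions print in lowest terms, e.g. 2/36 appears as 1/18.
    print(f"score {score:>2}: f = {f}, P = {p}, accumulative P = {acc}")
```

Exact fractions are used rather than decimals so the results can be compared directly with the table, bearing in mind that Python reduces each fraction to its lowest terms.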
If we turned the probability distribution into a bar graph with bars of width 1, the accumulative probability would be the area of the graph between 0 and the given score. The probability of getting a score of 5 or less is 10/36 from the table, and from the graph it is the pink area, this being:

(1/36 + 2/36 + 3/36 + 4/36) = 10/36

The green area represents the probability of getting a score of 6 or more and is simply found by subtracting the pink area from the total area, giving 1 - 10/36 = 26/36. The total area is always one (36/36).

WORKED EXAMPLE No. 1

When two dice are rolled once, what is the probability of guessing correctly a score of 4? What is the probability of a score of 4 or less?

From the plots or table created previously, the probability of a score of exactly 4 is 3/36 = 1/12. The probability of a score of four or less is the accumulative value 6/36 = 1/6.

BINOMIAL DISTRIBUTION

The Binomial distribution only applies to events where there are two outcomes, say win and lose or heads and tails (tossing a coin). The Binomial expansion was covered in outcome 1 and was written as:

$$(1 + x)^n = 1 + {}^{n}C_{1}\,x + {}^{n}C_{2}\,x^2 + {}^{n}C_{3}\,x^3 + {}^{n}C_{4}\,x^4 + \dots + x^n$$

The key part is the Binomial coefficient ${}^{n}C_{r}$. This is the number of different ways of getting 'r' successes when you repeat the event 'n' times. Let's revise how to evaluate ${}^{n}C_{r}$. On the top line we put the first r factors of n! and on the bottom line we put r!, e.g.

$${}^{4}C_{3} = \frac{4 \times 3 \times 2}{3 \times 2 \times 1} = 4 \qquad {}^{4}C_{2} = \frac{4 \times 3}{2 \times 1} = 6$$

If we evaluate these for all values of r we get a symmetrical distribution. The plot shows the result for n = 4. The mean is always the middle value, so the mean $\bar{r}$ is always n/2.
Let's consider the probability distribution for tossing a coin, where the probabilities of a head and of a tail are both ½ or 0.5. Consider tossing the coin 4 times (n = 4). The sample space is like this.

FLIP 1   FLIP 2   FLIP 3   FLIP 4   HEADS
  H        H        H        H        4
  H        H        H        T        3
  H        H        T        H        3
  H        H        T        T        2
  H        T        H        H        3
  H        T        H        T        2
  H        T        T        H        2
  H        T        T        T        1
  T        H        H        H        3
  T        H        H        T        2
  T        H        T        H        2
  T        H        T        T        1
  T        T        H        H        2
  T        T        H        T        1
  T        T        T        H        1
  T        T        T        T        0

Note there are $2^4 = 16$ possible results. Now we can build the frequency distribution. Note that $P = {}^{n}C_{r}/16$, so we didn't need to construct the sample space.

Heads (r)    0      1      2      3      4
nCr          1      4      6      4      1
P            1/16   4/16   6/16   4/16   1/16
Acc P        1/16   5/16   11/16  15/16  16/16

Suppose we want to know the chances of getting exactly 2 correct. Using r = 2 we see we have a probability of 6/16 = 0.375.

Suppose we want to know the chances of getting 2 or fewer guesses correct. Using r = 2 we get an accumulative probability of 11/16 = 0.6875.

Suppose we want the probability of getting 3 or more guesses correct. This would be found by subtracting the last answer from 1 to give 5/16 = 0.3125.

WORKED EXAMPLE No. 2

What is the probability of correctly calling four heads when a coin is tossed ten times?

The number of possible permutations is $2^{10} = 1024$. The probability of calling correctly 4 times is ${}^{n}C_{r}/1024$.

$${}^{10}C_{4} = \frac{10 \times 9 \times 8 \times 7}{4 \times 3 \times 2 \times 1} = 210 \qquad P = \frac{210}{1024} = 0.2051$$
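The fair-coin results can be checked numerically. This is a short sketch, assuming Python 3.8 or later for `math.comb`; the two function names are hypothetical helpers introduced only for this illustration.

```python
from math import comb


def fair_coin_probability(n, r):
    """Probability of exactly r heads in n tosses of a fair coin: nCr / 2**n."""
    return comb(n, r) / 2 ** n


def fair_coin_accumulative(n, r):
    """Probability of r or fewer heads in n tosses of a fair coin."""
    return sum(comb(n, k) for k in range(r + 1)) / 2 ** n


# Four tosses: exactly 2 heads, 2 or fewer heads, 3 or more heads.
print(fair_coin_probability(4, 2))
print(fair_coin_accumulative(4, 2))
print(1 - fair_coin_accumulative(4, 2))

# Worked Example No. 2: four heads in ten tosses.
print(round(fair_coin_probability(10, 4), 4))
```

The accumulative function simply adds the binomial coefficients up to r, which mirrors reading along the Acc P row of the table.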
UNEQUAL PROBABILITY

Without proof, when the probability of an event is not 0.5 the probability of getting r results correct out of n events is:

$$P = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$$

p is the probability of each event. In the case p = 0.5 this reduces to $P = {}^{n}C_{r}\,(0.5)^{n}$, which is the same formula already used.

If the number of tosses is large (n is large) the frequency distribution resembles a continuous graph and it is tempting to join the points as shown, but we should remember that the values of r are integers (whole numbers) and so we can never have values in between. The plot below is for n = 50.

For cases where p ≠ ½ the distribution becomes skewed. Consider the following case. A bag contains three balls numbered 1 to 3. A single ball is drawn from the bag at random and then replaced. If this is repeated 3 times we get the following sample space. Note how the pattern is constructed in three groups of 9, giving 27 permutations.

Ball drawn   Number of ones     Ball drawn   Number of ones     Ball drawn   Number of ones
  1 1 1            3              2 1 1            2              3 1 1            2
  1 1 2            2              2 1 2            1              3 1 2            1
  1 1 3            2              2 1 3            1              3 1 3            1
  1 2 1            2              2 2 1            1              3 2 1            1
  1 2 2            1              2 2 2            0              3 2 2            0
  1 2 3            1              2 2 3            0              3 2 3            0
  1 3 1            2              2 3 1            1              3 3 1            1
  1 3 2            1              2 3 2            0              3 3 2            0
  1 3 3            1              2 3 3            0              3 3 3            0

Number of times drawn r    0      1      2      3
Frequency                  8      12     6      1
Probability P              8/27   12/27  6/27   1/27

If we make n = 50 we get a curve with a peak at n/3 (for p = 1/3), and if p = 2/3 the peak is at 2n/3. All results are predicted by the equation

$$P = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$$

You might try the animation at this web address to see this in action.
http://www.stat.wvu.edu/srs/modules/binomial/binomial.html
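The three-ball sample space is small enough to enumerate by machine and compare with the general formula. The sketch below assumes Python 3.8 or later; `binomial_probability`, `draws` and `ones` are illustrative names, not part of the tutorial.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def binomial_probability(n, r, p):
    """General binomial formula: nCr * p**r * (1 - p)**(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# All ordered results of drawing a ball (numbered 1 to 3) three times with replacement.
draws = list(product([1, 2, 3], repeat=3))

# How many of the permutations contain 0, 1, 2 or 3 ones.
ones = Counter(draw.count(1) for draw in draws)

for r in range(4):
    counted = Fraction(ones[r], len(draws))                  # from the sample space
    predicted = binomial_probability(3, r, Fraction(1, 3))   # from the formula
    print(r, ones[r], counted, predicted)
```

Passing `Fraction(1, 3)` for p keeps the arithmetic exact, so the counted and predicted columns can be compared for equality rather than to a tolerance.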
WORKED EXAMPLE No. 3

Verify the four previous results for n = 3, p = 1/3.

r = 3:  $P = {}^{3}C_{3}\,(1/3)^{3}(2/3)^{0} = 1 \times \tfrac{1}{27} \times 1 = 1/27$

r = 2:  $P = {}^{3}C_{2}\,(1/3)^{2}(2/3)^{1} = 3 \times \tfrac{1}{9} \times \tfrac{2}{3} = 2/9$ or 6/27

r = 1:  $P = {}^{3}C_{1}\,(1/3)^{1}(2/3)^{2} = 3 \times \tfrac{1}{3} \times \tfrac{4}{9} = 4/9$ or 12/27

r = 0:  $P = {}^{3}C_{0}\,(1/3)^{0}(2/3)^{3} = 1 \times 1 \times \tfrac{8}{27} = 8/27$

WORKED EXAMPLE No. 4

A bag contains 3 balls numbered 1, 2 and 3. One ball is removed at random and noted and then replaced. This is repeated 5 times. What is the probability of guessing the number correctly three times out of 5?

p = 1/3, n = 5 and r = 3

$$P = {}^{n}C_{r}\, p^{r}(1 - p)^{n - r} = {}^{5}C_{3}\,(1/3)^{3}(2/3)^{2} = \frac{(5)(4)(3)}{(3)(2)(1)}\,(1/3)^{3}(2/3)^{2} = 0.1646$$

MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION

In statistics the variance is defined as

$$\sigma^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2$$

In the terminology used here x becomes r and f becomes P, the probability of r correct guesses.

$$\sigma^2 = \frac{\sum P r^2}{\sum P} - \left(\frac{\sum P r}{\sum P}\right)^2$$

The following example shows that $\sum P = 1$, so this reduces to

$$\sigma^2 = \sum P r^2 - \left(\sum P r\right)^2$$

The standard deviation σ is the square root of the variance.

Without proof, it can be shown that this reduces to $\sigma^2 = np(1 - p)$, and when p = ½, $\sigma^2 = n/4$.

The mean of the Binomial distribution when p = 1/2 is clearly the middle value, so $\bar{r} = n/2$. When p ≠ ½ we can see from the graphs that the mean is $\bar{r} = pn$ and the standard deviation is $\sigma = \sqrt{np(1 - p)}$.
WORKED EXAMPLE No. 5

A coin is flipped six times. Show that the resulting frequency distribution for correct tosses has a standard deviation of 1.225 by use of both formulae.

First by the simple method: $\sigma^2 = n/4 = 6/4 = 1.5$, so $\sigma = \sqrt{1.5} = 1.225$.

Next by the full method: $\sigma^2 = \dfrac{\sum P r^2}{\sum P} - \left(\dfrac{\sum P r}{\sum P}\right)^2$

r               0      1      2      3       4       5       6       Sum
nCr             1      6      15     20      15      6       1       64
P = nCr/2^6     1/64   6/64   15/64  20/64   15/64   6/64    1/64    ΣP = 1
P r^2           0      6/64   60/64  180/64  240/64  150/64  36/64   ΣPr^2 = 672/64
P r             0      6/64   30/64  60/64   60/64   30/64   6/64    ΣPr = 192/64

$$\sigma^2 = \frac{672/64}{1} - \left(\frac{192/64}{1}\right)^2 = 10.5 - 9 = 1.5 \qquad \sigma = \sqrt{1.5} = 1.225$$

WORKED EXAMPLE No. 6

In the last example, what is the probability of guessing correctly exactly four times, and of guessing correctly at most four times?

From the table we see the probability of guessing exactly four correct is 15/64, but the probability of guessing four or fewer is (1 + 6 + 15 + 20 + 15)/64 = 57/64. This is the accumulative value.

WORKED EXAMPLE No. 7

Samples of a product are tested to a certain standard and it is found that there is a probability of 0.2 that they fail. What is the probability of selecting 5 failures from a selection of 15? What are the mean and standard deviation for this sample?

p = 0.2, n = 15, r = 5

$$P = {}^{n}C_{r}\, p^{r}(1 - p)^{n - r} = {}^{15}C_{5}\,(0.2)^{5}(0.8)^{10} = \frac{15!}{5!\,10!}\,(0.2)^{5}(0.8)^{10} = 0.1032$$

Mean = pn = 0.2 x 15 = 3

$\sigma = \sqrt{(15)(0.2)(1 - 0.2)} = 1.549$

You will find a useful calculating aid at the following web address:
http://hyperphysics.phy-astr.gsu.edu/hbase/math/disfcn.html
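Both routes to the standard deviation can be compared in a few lines. The sketch below assumes Python 3.8 or later; `binomial_table` and `mean_and_std` are hypothetical helper names chosen for this illustration.

```python
from math import comb, sqrt


def binomial_table(n, p):
    """List of P for r = 0 .. n using nCr * p**r * (1 - p)**(n - r)."""
    return [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]


def mean_and_std(probabilities):
    """Full method: mean = sum(P r), variance = sum(P r^2) - (sum(P r))^2."""
    mean = sum(P * r for r, P in enumerate(probabilities))
    mean_square = sum(P * r * r for r, P in enumerate(probabilities))
    return mean, sqrt(mean_square - mean ** 2)


# Worked Example No. 5: six flips of a fair coin, full method against sqrt(n/4).
mean6, std6 = mean_and_std(binomial_table(6, 0.5))
print(mean6, round(std6, 3), round(sqrt(6 / 4), 3))

# Worked Example No. 7: n = 15, p = 0.2, r = 5.
n, p = 15, 0.2
p_five = binomial_table(n, p)[5]
print(round(p_five, 4), n * p, round(sqrt(n * p * (1 - p)), 3))
```

The full method works for any list of probabilities that sums to 1, so the same `mean_and_std` function could be applied to a distribution that is not binomial.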
SELF ASSESSMENT EXERCISE No. 1

1. If a coin is tossed 20 times, what is the probability of getting the call correct 5 times? (0.0148)

2. If a six sided die is tossed 10 times, what is the probability of getting the call right five times? (0.013)

3. A lottery system consists of drawing one numbered ball from a bag containing nine. This is repeated with six separate bags. What is the probability of guessing all the numbers drawn? (1/531441)

4. 20 coins are flipped, each with a probability of 0.5 that it will be heads. What is the standard deviation for the frequency distribution? (2.236)

5. A machine making electrical resistors has a probability of 0.1 that the values will fall outside the target range. What is the probability of randomly picking 20 from a batch of 100 that will be outside the target? (0.00117) What are the mean and the standard deviation for this distribution? (10 and 3)

NORMAL DISTRIBUTION CURVES

In statistics, the normal distribution is often used. In terms of probability the equation, without explanation, is given as:

$$P = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-(r - \bar{r})^2 / (2\sigma^2)}$$

The normal distribution curve is not used exclusively for events with a win/lose or yes/no result, but it does give similar results to the Binomial distribution when n is large. The same mean and standard deviation must be used in the comparison. Even for low values of n the curves are well matched, as shown in the plotted examples below with n = 10. The normal distribution is not normally used for win/lose situations unless n is 50 or larger. You can compare the Binomial and normal distributions at this web address:
http://www.ruf.rice.edu/~lane/stat_sim/binom_demo.html

The normal distribution is more widely used for cases where the standard deviation and mean are known as a result of many measurements. We then use it to predict the probability of a given value or range of values.
The normal distribution curve can be made into one that fits all eventualities. This is done by changing the mean to zero by subtracting $\bar{r}$ and making the standard deviation 1 by dividing by σ. Instead of plotting r we plot

$$z = \frac{r - \bar{r}}{\sigma}$$

As this is a standard graph, the area of the graph can be tabulated and used to solve problems. The table given here covers the area from -∞ to the value of z. Because the graph is symmetrical, other areas can be worked out as appropriate. The total area is 1.0 so the total either side of the mean is 0.5.

Tables of the Normal Distribution
Probability Content from -∞ to z
Note red area = 1 - green area

 z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9   0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
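The tabulated areas can be regenerated from the error function, which is related to the standard normal area by area(z) = ½[1 + erf(z/√2)]. The sketch below uses only the Python standard library; `standard_normal_area` and `table_row` are illustrative names, and the last printed digit may occasionally differ from a published table because of rounding.

```python
from math import erf, sqrt


def standard_normal_area(z):
    """Area under the standard normal curve from minus infinity to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def table_row(z_tenths):
    """Ten entries of one table row, e.g. z_tenths = 0.3 covers z = 0.30 to 0.39."""
    return [round(standard_normal_area(z_tenths + k / 100), 4) for k in range(10)]


print(table_row(0.0))
print(table_row(1.0))

# Areas for negative z follow from symmetry: area(-z) = 1 - area(z).
print(standard_normal_area(-1.0), 1 - standard_normal_area(1.0))
```

The symmetry line shows how an area to the left of a negative z can be obtained from the positive half of the table.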
WORKED EXAMPLE No. 8

The maximum number of people that can occupy a lift is set at 8. The total weight of 8 people chosen at random follows a normal distribution with a mean of 550 kg and a standard deviation of 150 kg. What is the probability that the total weight of 8 people exceeds 600 kg?

$\bar{r} = 550$, σ = 150

$$z = \frac{r - \bar{r}}{\sigma} = \frac{600 - 550}{150} = 0.33$$

Look in the table down the left hand column for z = 0.3 and across under 0.03. The number in the table for z = 0.33 is 0.6293. The green area to the right is 1 - 0.6293 = 0.3707. This is the probability that the weight will exceed 600 kg.

WORKED EXAMPLE No. 9

The lifetime in hours of a mass produced product is represented by the normal distribution curve with a mean of 1400 and a standard deviation of 300. What is the probability that a component taken at random will have a lifetime between 1400 and 1450 hours?

$\bar{r} = 1400$, σ = 300

First find the probability for 1450 hours.

$$z = \frac{r - \bar{r}}{\sigma} = \frac{1450 - 1400}{300} = 0.17$$

From the table P = 0.5675.

Next find the probability for 1400 hours.

$$z = \frac{r - \bar{r}}{\sigma} = \frac{1400 - 1400}{300} = 0$$

From the table P = 0.500, as expected for the mean.

The probability of the component having a lifetime between 1400 and 1450 hours is 0.5675 - 0.5 = 0.0675.
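The two worked examples above can be cross-checked without a table. This sketch assumes Python 3.8 or later for `statistics.NormalDist`, and it takes the standard deviation in Worked Example No. 9 to be 300 hours, the value implied by z = 0.17. The variable names are illustrative.

```python
from statistics import NormalDist

# Worked Example No. 8: total weight with mean 550 kg, standard deviation 150 kg.
lift = NormalDist(mu=550, sigma=150)
p_over_600 = 1 - lift.cdf(600)          # area to the right of 600 kg

# Worked Example No. 9: lifetime with mean 1400 h; sigma = 300 h is an assumption.
life = NormalDist(mu=1400, sigma=300)
p_1400_to_1450 = life.cdf(1450) - life.cdf(1400)   # area between the two values

print(round(p_over_600, 4), round(p_1400_to_1450, 4))
```

The computed values differ slightly in the third decimal place from the table answers, because the table method rounds z to two decimal places (0.33 and 0.17) before looking up the area.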
SELF ASSESSMENT EXERCISE No. 2

1. The height of adult males is normally distributed with a mean of 1.78 m and a standard deviation of 0.076 m. What is the probability of a randomly selected man having a height of less than 1.6 m? (0.0089)

2. A grinding machine produces components with a mean diameter of 30 mm. All the components are measured and the actual size logged. The standard deviation over a period of time is 0.05 mm. Assuming the normal distribution represents the actual distribution, what is the probability of a component being between 29.95 mm and 30.05 mm diameter? (0.6826) (Note this is the standard figure for the range between z = -1 and z = +1, that is, within one standard deviation of the mean.)

3. The breaking strengths of 150 spot welds were measured in newtons and grouped into bands of 20 N as shown.

Range (N)    f
160-180      2
180-200      6
200-220      10
220-240      28
240-260      50
260-280      31
280-300      15
300-320      8

Calculate the mean and the standard deviation. (Answers 251.47 N and 29.04 N)
Calculate the probability that a sample taken at random will have a strength of less than 200 N based on the normal distribution. (Answer about 4%)
Calculate the probability based on the raw data above. (Answer 5.3%)
POISSON DISTRIBUTION

Proof and derivation are not given at this level of study but students will find the derivation of this formula at the following web address.
http://en.wikipedia.org/wiki/poisson_distribution

This is a distribution representing discrete samples (the same as the Binomial) but it brings the time element into the equation. The probability distribution is given by:

$$P = \frac{\lambda^{r} e^{-\lambda}}{r!}$$

r = number of occurrences
λ = average occurrences/time interval

You will find another useful aid to calculation at this web address.
http://hyperphysics.phy-astr.gsu.edu/hbase/math/poiex.html

WORKED EXAMPLE No. 10

A business receives orders at an average rate of 1 per minute. What is the probability of getting three orders in one minute?

λ = 1, r = 3

$$P = \frac{\lambda^{r} e^{-\lambda}}{r!} = \frac{1^{3}\, e^{-1}}{(3)(2)(1)} = 0.0613 \text{ or } 6\%$$

WORKED EXAMPLE No. 11

An emergency service receives an average of 2.1 false alarms per day. What is the probability of getting four false alarms in a given day?

λ = 2.1, r = 4

$$P = \frac{\lambda^{r} e^{-\lambda}}{r!} = \frac{2.1^{4}\, e^{-2.1}}{(4)(3)(2)(1)} = 0.0992 \text{ or } 10\%$$

SELF ASSESSMENT EXERCISE No. 3

Solve all the following on the assumption that Poisson's distribution applies.

1. On average the demand for a certain product is four per week. If the stock at the beginning of each week is renewed so that there are always 6 in store, what is the probability of running out of stock in any week? (21.4%)

2. A call centre has a capacity to deal with 25 calls per minute on average. What is the probability of getting 30 calls in any one-minute period? (4.5%)

3. The average time taken for a worker to assemble a certain product is 45 minutes. There are 10 workers employed to make these assemblies. What is the probability of assembling 10 units in an hour? (8%)
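The Poisson formula is straightforward to evaluate directly. The sketch below uses only the Python standard library and assumes λ = 2.1 for Worked Example No. 11 (the value consistent with the quoted answer of about 10%); `poisson_probability` and `poisson_accumulative` are hypothetical helper names.

```python
from math import exp, factorial


def poisson_probability(lam, r):
    """P = lam**r * exp(-lam) / r! for exactly r occurrences in the interval."""
    return lam ** r * exp(-lam) / factorial(r)


def poisson_accumulative(lam, r):
    """Probability of r or fewer occurrences in the interval."""
    return sum(poisson_probability(lam, k) for k in range(r + 1))


# Worked Example No. 10: average of 1 order per minute, exactly 3 orders.
print(round(poisson_probability(1, 3), 4))

# Worked Example No. 11: average of 2.1 false alarms per day (assumed), exactly 4.
print(round(poisson_probability(2.1, 4), 4))
```

The accumulative helper is useful for "r or more" questions, since the probability of r or more occurrences is 1 minus the probability of r - 1 or fewer.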