MATH 140 Lab 4: Probability and the Standard Normal Distribution

Problem 1. Flipping a Coin

In this problem, we want to simulate the process of flipping a fair coin, first 10 times and eventually 10000 times. Each flip has two possible outcomes, H and T, where H stands for heads and T for tails. The objective is to show that, in the long run, the proportion of heads (and of tails) approaches 1/2.

(a) We can simulate the process of flipping a fair coin in MINITAB. Go to Calc > Random Data > Bernoulli. Generate 10 rows of data and store the results in column C1. Click on the box Probability of success and type 0.5. This tells MINITAB to simulate 10 coin flips in which heads and tails are equally likely. Hit OK. Now go to Calc > Column Statistics..., select C1, and compute the sum, which equals the number of 1s. Divide this number by 10. Assuming that 1 represents heads and 0 represents tails, what can you say about the probability of obtaining a head (that is, the proportion of 1s)?

(b) Repeat the same procedure, but this time generate 100 observations under exactly the same conditions and count the number of 1s.

(c) Do this again for 1000 simulated flips. Compare your results with those from parts (a) and (b).

(d) Finally, repeat the same process with 10000 flips. What can you conclude in general?
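For readers without MINITAB, the experiment in Problem 1 can be sketched in Python. This is only an illustration, not part of the lab procedure: the function name proportion_of_heads and the seed value are hypothetical, and Python's random number generator differs from MINITAB's, so the individual numbers will not match.

```python
import random

def proportion_of_heads(n_flips, p_heads=0.5, seed=None):
    """Simulate n_flips Bernoulli trials and return the proportion of 1s (heads)."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    heads = sum(1 for _ in range(n_flips) if rng.random() < p_heads)
    return heads / n_flips

# Same sample sizes as parts (a) through (d).
for n in (10, 100, 1000, 10000):
    print(n, proportion_of_heads(n, seed=140))
```

With small n the proportion can be far from 0.5; as n grows it should settle close to 0.5, which is the pattern the lab asks you to observe.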
Problem 2. Birthday Paradox

There are 20 students in a class. What is the chance that at least two of them have the same birthday? The answer is a lot higher than you might think! We will approximate this probability using simulation. To simulate birthdays, we generate random sets of data. Let the numbers 1 to 365 represent the days of the year, and generate sets of 20 random numbers between 1 and 365 (each set represents the birthdays of one group of 20 people). The idea is to see whether any number appears more than once within a set. A repeated number means that two people in that group of 20 share a birthday.

(a) To generate integers between 1 and 365 (each representing a different day of the year), choose Calc > Random Data > Integer. Generate 20 rows of data, with minimum value 1 and maximum value 365. Store the data in C1-C40. Hit OK. This will give you 40 sets of 20 randomly generated numbers between 1 and 365.

(b) For each column, we need to find out whether any two numbers are the same. We can use MINITAB to count the shared birthdays for us! Go to Stat > Tables > Tally... In the Variables box, type C1-C40. Make sure that the option Display Counts is checked. Hit OK. Now look at the Session window. The software should report the number of times each birthday occurs in each column. If the count for a given day within a column is 2 or more, then two or more people in that group share a birthday!

(c) Out of the 40 groups, what proportion contain at least two people who share a birthday? Using the simulated data, what is your estimate of the probability that at least two people share a birthday in a group of 20?

(d) Now repeat the same experiment with groups of 30 people. What do you get? This phenomenon is called the birthday paradox: even in a group as small as 23 people, the probability that at least two people share a birthday is higher than 0.5!
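The birthday simulation can also be sketched in Python and compared with the exact probability. This is a minimal sketch under the same assumptions as the lab (365 equally likely birthdays, no leap years); the function names and the seed are hypothetical and are not part of MINITAB.

```python
import random

def has_shared_birthday(group_size, rng):
    """Draw group_size birthdays from 1..365 and report whether any value repeats."""
    birthdays = [rng.randint(1, 365) for _ in range(group_size)]
    return len(set(birthdays)) < group_size

def simulated_probability(group_size, n_groups=40, seed=None):
    """Proportion of simulated groups that contain at least one shared birthday."""
    rng = random.Random(seed)
    hits = sum(has_shared_birthday(group_size, rng) for _ in range(n_groups))
    return hits / n_groups

def exact_probability(group_size):
    """1 minus the probability that all birthdays in the group are distinct."""
    p_all_distinct = 1.0
    for k in range(group_size):
        p_all_distinct *= (365 - k) / 365
    return 1 - p_all_distinct

for size in (20, 23, 30):
    print(size, simulated_probability(size, seed=140), round(exact_probability(size), 4))
```

With only 40 groups, as in the lab, the simulated proportion can differ noticeably from the exact value; raising n_groups narrows the gap.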
Problem 3. More on Probabilities!

A certain retail establishment accepts the Visa or the American Express credit card. A total of 58% of its customers carry a Visa card, 22% carry an American Express card, and 14% carry both. Use these numbers as probabilities in the exercises below, and show all work. Write in words what each event is in this problem, and then fill in the probabilities in the Venn diagram.

Event A, in words:
Event B, in words:
P(A) =
P(B) =
P(A AND B) =

[Venn diagram: sample space S containing two overlapping circles labeled A and B, with blanks for P(A), P(A AND B), and P(B).]

(a) Find the probability that a customer carries an American Express card OR a Visa card, i.e., find P(A OR B).

(b) In the Venn diagram above, shade in the area corresponding to American Express and not Visa. Then find the probability of this event.

(c) What is the probability that a customer carries exactly one of the two cards?

(d) What is the probability that at least one of four randomly chosen customers carries a Visa card?
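As a check on hand calculations, the rules used in Problem 3 can be written out in a few lines of Python. This is a sketch, not the required written work: the variable names are hypothetical, and the last line assumes that the four customers carry cards independently of one another, which the problem does not state explicitly.

```python
# Probabilities given in the problem statement.
p_visa = 0.58
p_amex = 0.22
p_both = 0.14

# Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B).
p_either = p_visa + p_amex - p_both

# American Express but not Visa: remove the overlap from P(American Express).
p_amex_only = p_amex - p_both

# Exactly one card: in the union but not in the overlap.
p_exactly_one = p_either - p_both

# At least one Visa among four customers (ASSUMES independent customers):
# complement of "none of the four carries a Visa".
p_at_least_one_visa = 1 - (1 - p_visa) ** 4

print(p_either, p_amex_only, p_exactly_one, p_at_least_one_visa)
```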
Problem 4. Rolling a Die Simulation

In this problem, we will simulate the rolling of a die. That is, we will estimate the probabilities associated with the various outcomes of throwing a fair die! Remember that the assumption is that each side (from 1 to 6) has an equal chance of 1/6 of appearing, so each side should come up about 1/6 of the time when we repeat the process a large number of times.

(a) Go to Calc > Random Data... > Integer... Generate 20 rows of data and place them in column C1. For the Minimum value, enter 1, and for the Maximum value, enter 6. This tells the computer that you want to roll the die 20 times and see the results in column C1. Go to Stat > Tables > Tally... Place C1 in the box Variables, and hit OK. Report the relative frequencies of the outcomes 1, 2, 3, ..., 6.

(b) Repeat the same procedure, but roll the die 100 times. Keep track of the relative frequency of each of the outcomes.

(c) Repeat the same procedure, but roll the die 1000 times. Keep track of the relative frequency of each of the outcomes.

(d) Now, suppose that we are interested in rolling two dice simultaneously. We want to simulate this experiment using MINITAB. First, simulate 1000 rolls in column C1 and another 1000 rolls in column C2. Because the simulated rolls are independent, we can treat each row as the outcome of rolling two dice simultaneously, even though the two columns were generated one after the other.

(e) What is the probability of the event (1,1), that is, the event of getting a 1 on the first roll and another 1 on the second roll? Think for a minute before you read on; part (f) lets you check your answer.

(f) One way to count the number of (1,1) occurrences is to go through the whole list and keep a tally for this event! However, there is a shortcut. Let's create a new column (say, C3) that is the sum of the two columns (you should know how to do this by now). Note that the event (1,1) happens only when the sum of the two rolls is 2. So, calculate the probability of getting (1,1), or getting a sum of 2, using Stat > Tables > Tally... Is this the number you were expecting to see? Why?

(g) Using the simulated rolls, find the probability that one of the two rolls is an odd number and the other is an even number.

(h) In your answer sheets, draw a table with two columns. Name the first column event and the second column probability. Now, using the simulated rolls, calculate the probabilities of all 11 possible sums of rolling two fair dice.

(i) Find the probability that the sum of the two rolls is at least 10.

(j) Find the probability that the sum of the two rolls is larger than 7.
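The two-dice experiment in parts (d) through (j) can be sketched in Python as well, alongside the exact probabilities obtained by listing all 36 equally likely pairs. This is an illustration only; the function names and the seed are hypothetical, and the simulated values will differ from those MINITAB produces.

```python
import random
from collections import Counter

def roll_two_dice(n_rolls, seed=None):
    """Return n_rolls pairs (first die, second die), each die uniform on 1..6."""
    rng = random.Random(seed)
    return [(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(n_rolls)]

def sum_relative_frequencies(rolls):
    """Relative frequency of each possible sum 2..12 in the simulated rolls."""
    counts = Counter(a + b for a, b in rolls)
    return {s: counts.get(s, 0) / len(rolls) for s in range(2, 13)}

def exact_sum_probabilities():
    """Exact probability of each sum, counting the 36 equally likely pairs."""
    counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
    return {s: counts[s] / 36 for s in range(2, 13)}

rolls = roll_two_dice(1000, seed=140)
simulated = sum_relative_frequencies(rolls)
exact = exact_sum_probabilities()
for s in range(2, 13):
    print(s, simulated[s], round(exact[s], 4))
```

Events such as "sum at least 10" follow by adding the relevant entries, for example sum(simulated[s] for s in (10, 11, 12)).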
Problem 5. Accessing the z Table in MINITAB: General Approach

We have learned how to use the z table to obtain probabilities for the standard normal curve. MINITAB has a function that calculates the same probabilities. Here, we will examine that function and compare our findings with the values given in Table II of the textbook.

To give an example, suppose that we would like to find Pr(z < 1.645). To find the answer, click on Calc > Probability Distributions > Normal... Choose Cumulative probability, enter Mean: 0 and Standard deviation: 1, then click on Input constant: and type 1.645. The answer is shown in the upper (Session) window. Note that by choosing the option Cumulative probability, we tell the computer that we want the probability, or area under the z curve, to the left of the point 1.645.

On the other hand, if you want to find z* such that Pr(z < z*) = 0.95, you can do this in MINITAB using the same menu. This time, after choosing Probability Distributions > Normal..., click on Inverse cumulative probability instead of Cumulative probability. Under Input constant: type 0.95. The corresponding z* will appear in the Session window.

What if the normal curve has a different mean and standard deviation? The good news is that you can find probabilities for any normal curve using the same function. In the same menu, you just have to type the mean and the standard deviation of interest.

Considering the above, answer the following questions:

(a) Pr(z < 2.081)

(b) Pr(z > 0.497)

(c) Pr(-2.23 < z < 1.32)

(d) Find z* such that Pr(z < z*) = 0.025.

(e) Find z* such that Pr(z > z*) = 0.0.38.

(f) Find z* such that Pr(-z* < z < z*) = 0.0.8.

(g) Find the 45th percentile of the standard normal curve.
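The cumulative and inverse cumulative calculations described in Problem 5 have direct counterparts in Python's standard library, shown here for the worked example in the text (1.645 and 0.95). This is a sketch for comparison with MINITAB and Table II, not a replacement for either; the mean of 100 and standard deviation of 15 in the last lines are made-up values used only to illustrate a non-standard normal curve.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal curve

# Cumulative probability: area to the left of 1.645.
print(z.cdf(1.645))

# Inverse cumulative probability: the point with area 0.95 to its left.
print(z.inv_cdf(0.95))

# Upper-tail and two-sided areas come from differences of cumulative areas.
print(1 - z.cdf(1.0))             # Pr(z > 1.0)
print(z.cdf(1.0) - z.cdf(-1.0))   # Pr(-1.0 < z < 1.0)

# A different normal curve: only the mean and standard deviation change.
other = NormalDist(mu=100, sigma=15)  # hypothetical values for illustration
print(other.cdf(115))                 # same area as z.cdf(1.0)
```

A percentile is an inverse cumulative probability: the p-th percentile is inv_cdf(p / 100).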