7-8: Probability and Statistics
Mat Dr. Firoz

Chapter 7 Probability

Definition: Probability is a real-valued set function $P$ that assigns to each event $A$ in the sample space $S$ a number $P(A)$, called the probability of $A$, such that the following properties hold:
a) $P(A) \ge 0$
b) $P(S) = 1$
c) $P(S) = P(A) + P(A') = 1$, so $P(A') = 1 - P(A)$, where $A'$ is the complement of $A$.
d) For events $A$ and $B$ we have $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
e) If $A$ and $B$ are mutually exclusive (meaning $A \cap B = \emptyset$, so $P(A \cap B) = P(\emptyset) = 0$), then $P(A \cup B) = P(A) + P(B)$
f) For three events $A$, $B$, and $C$ verify that $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(C \cap A) + P(A \cap B \cap C)$

Example 1. Roll a die once. There are 6 outcomes in total, namely 1, 2, 3, 4, 5, and 6. The sample space is the set $S = \{1, 2, 3, 4, 5, 6\}$. The event $E$ of rolling a 3 is the singleton set $E = \{3\}$. The probability of the event $E$ is $p(E) = p(\text{rolling a 3}) = 1/6$, because there are 6 total outcomes and 1 desired outcome.

Example 2. Roll a die once. Write the sample space. Find the following probabilities:
a) $P(E) = P$(rolling a 3 or a 5)
b) $P(E) = P$(rolling a 3 or more)
c) $P(E) = P$(rolling a number greater than 10)
d) $P(E) = P$(rolling an even number)
e) $o(E)$ = odds for the event that the roll is a 3 or a 5
Answer: a) 1/3 b) 2/3 c) 0 d) 1/2 e) 2 : 4

Example 3. Roll a die twice. Write the sample space. Find the following probabilities:
a) $P(E) = P$(rolling a sum 3 or a sum 5)
b) $P(E) = P$(rolling a sum 3 or more)
c) $P(E) = P$(rolling a sum greater than 10)
d) $P(E) = P$(rolling a sum greater than 12)
The sample space for rolling a die twice:
11 12 13 14 15 16
21 22 23 24 25 26
31 32 33 34 35 36
41 42 43 44 45 46
51 52 53 54 55 56
61 62 63 64 65 66
Answer: a) 6/36 b) 35/36 c) 3/36 d) 0

Example 4. Roll a die twice. Write the sample space. Find the following probabilities:
a) $P(E) = P$(both rolls are either a 2 or a 3)
b) $P(E) = P$(both rolls are not 2)
c) $P(E) = P$(both rolls are above 3)
d) $P(E) = P$(both are a 2 or both are a 5)
The sample space for rolling a die twice:
11 12 13 14 15 16
21 22 23 24 25 26
31 32 33 34 35 36
41 42 43 44 45 46
51 52 53 54 55 56
61 62 63 64 65 66
Answer:
a) Looking at the sample space, the outcomes in which both rolls are either a 2 or a 3 are 22, 23, 32, and 33. Thus $P$(both rolls are either a 2 or a 3) = 4/36. We could also find this probability using independence and the multiplication principle: the probability that the first outcome is a 2 or a 3 is 2/6, and the probability that the second outcome is a 2 or a 3 is again 2/6. By the multiplication principle the probability that both outcomes are a 2 or a 3 is $2/6 \times 2/6 = 4/36$.
b) $P$(both rolls are not 2) = 25/36. You may verify from the sample space that 12, 21, 22, 23, 24, 25, 26, 32, 42, 52, 62 are the outcomes containing a 2, so there are 25 outcomes with no 2s. Using independence and the multiplication principle we have $P$(both rolls are not 2) $= 5/6 \times 5/6 = 25/36$.
c) $P$(both rolls are above 3) = 9/36. Using independence and the multiplication principle we have $P$(both rolls are above 3) $= 3/6 \times 3/6 = 9/36$.
d) $P$(both are a 2 or both are a 5) $= 1/6 \times 1/6 + 1/6 \times 1/6 = 2/36$. Looking at the sample space, we have only the two outcomes 22 and 55.
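The counting arguments in Examples 3 and 4 can be checked by listing all 36 outcomes. The following is a minimal Python sketch, not part of the original notes; the helper name `prob` is hypothetical, and the thresholds 10 and 12 for Example 3 c) and d) are assumed from the stated answers 3/36 and 0.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes (first roll, second roll).
SAMPLE_SPACE = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on (first, second)."""
    favorable = sum(1 for outcome in SAMPLE_SPACE if event(outcome))
    return Fraction(favorable, len(SAMPLE_SPACE))

# Example 3: events defined by the sum of the two rolls.
p_sum_3_or_5 = prob(lambda o: sum(o) in (3, 5))
p_sum_at_least_3 = prob(lambda o: sum(o) >= 3)
p_sum_above_10 = prob(lambda o: sum(o) > 10)   # threshold 10 is an assumption
p_sum_above_12 = prob(lambda o: sum(o) > 12)   # threshold 12 is an assumption

# Example 4 c): both rolls above 3, by counting and by the multiplication principle.
p_both_above_3 = prob(lambda o: min(o) > 3)
p_by_multiplication = Fraction(3, 6) * Fraction(3, 6)

print(p_sum_3_or_5, p_sum_at_least_3, p_sum_above_10, p_sum_above_12)
print(p_both_above_3 == p_by_multiplication)
```

Fractions are printed in lowest terms, so 6/36 appears as 1/6.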
Example 5. Roll a die three times. How many outcomes do we have in the sample space? Find the following probabilities:
a) $P(E) = P$(all three rolls are either a 2 or a 3)
b) $P(E) = P$(all three rolls are not 2)
c) $P(E) = P$(all three rolls are above 3)
d) $P(E) = P$(all are a 2 or all are a 5)
Answer: a) $2/6 \times 2/6 \times 2/6 = 8/216$ b) $5/6 \times 5/6 \times 5/6 = 125/216$ c) $3/6 \times 3/6 \times 3/6 = 27/216$ d) $1/6 \times 1/6 \times 1/6 + 1/6 \times 1/6 \times 1/6 = 2/216$

Example 6. Roll a die five times. How many outcomes do we have in the sample space? Find the following probabilities:
a) $P(E) = P$(all five rolls are either a 2 or a 3)
b) $P(E) = P$(all five rolls are not 2)
c) $P(E) = P$(all five rolls are above 3)
d) $P(E) = P$(all are a 2 or all are a 5)
Answer: a) $2^5/6^5$ b) $5^5/6^5$ c) $3^5/6^5$ d) $2/6^5$

Methods of Enumeration

1. A true/false test contains 10 questions. In how many ways can a student answer the questions? If a student makes random guesses, what is the probability that the student will answer exactly 5 questions correctly? Answer: $2^{10} = 1024$, $C(10,5)/1024 = 0.2461$
2. How many three-letter words (without meaning) are possible when repetition of letters is not allowed? What is the probability that such a word will not start with a vowel? Answer: $P(26,3)$, 0.8077
3. How many three-letter words (without meaning) are possible when repetition of letters is allowed? What is the probability that such a word will not start with a vowel? Answer: $26^3$, 0.8077
4. A coin is tossed 9 times. What is the probability of getting at least 2 heads? Answer: 0.9805
5. A company has 9 women and 8 men. What is the probability that a 7-person committee will have 4 men and 3 women? Answer: 0.3023
6. If the letters in the word POKER are rearranged, what is the probability that the word will begin with a K and end with an O? Answer: 0.05
7. A computer retail store has 12 personal computers in stock. A customer wants to purchase three of the computers. Assume that of the 12 computers, 4 are defective. If the computers are selected at random, what is the probability that exactly one of the purchased computers is defective? Answer: 0.5091
8. A computer retail store has 12 personal computers in stock. A customer wants to purchase three of the computers. Assume that of the 12 computers, 4 are defective. If the computers are selected at random, what is the probability that at least one of the purchased computers is defective? Answer: 0.7455

Independent Probabilities:

Example 1. A card is chosen at random from a deck of 52 cards. It is then replaced, the deck reshuffled, and a second card is chosen. What is the probability of getting a jack and an eight?
Solution. The two draws are independent. The probability that the first card is a jack is 4/52 and that the second card is an eight is 4/52. Likewise, the probability that the first card is an eight is 4/52 and that the second card is a jack is 4/52. The probability of drawing a jack and an eight is $4/52 \times 4/52 + 4/52 \times 4/52 = 2/169$.

Exercise: A card is chosen at random from a deck of 52 cards. It is then replaced, the deck reshuffled, and a second card is chosen.
a) What is the probability of getting a jack and then an eight? Ans: 1/169
b) What is the probability of getting a diamond and then a heart? Ans: 1/16

Example 2. A family has two children. Using b to stand for boy and g for girl in ordered pairs, give each of the following.
a) the sample space
b) the event E that the family has exactly one daughter
c) the event F that the family has at least one daughter
d) the event G that the family has two daughters
e) p(E) f) p(F) g) p(G)

Example 3. A group of three people is selected at random. 1) What is the probability that all three people have different birthdays? 2) What is the probability that at least two of them have the same birthday?
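1) The probability that all three people have different birthdays is $\frac{{}_{365}P_3}{365^3} = \frac{365 \cdot 364 \cdot 363}{365 \cdot 365 \cdot 365} = 0.992$
2) The probability that at least two people have the same birthday is $1 - \frac{365 \cdot 364 \cdot 363}{365 \cdot 365 \cdot 365} = 1 - 0.992 = 0.008$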
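The birthday computation generalizes to any group size. Below is a small Python sketch under the usual assumption of 365 equally likely birthdays; the function name `all_different_birthdays` is made up for illustration.

```python
from math import perm

def all_different_birthdays(k, days=365):
    """P(k people all have different birthdays), assuming equally likely days."""
    # perm(days, k) counts ordered selections of k distinct days.
    return perm(days, k) / days**k

p_different = all_different_birthdays(3)
p_shared = 1 - p_different   # complement: at least two share a birthday

print(round(p_different, 3), round(p_shared, 3))
```

Calling `all_different_birthdays(23)` shows the well-known effect that the chance of a shared birthday passes one half with only 23 people.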
Conditional probability. A conditional probability is a probability whose sample space has been limited to only those outcomes that fulfill a certain condition. The conditional probability of an event $A$, given event $B$, is $p(A \mid B) = \frac{n(A \cap B)}{n(B)}$

Example. In a newspaper poll concerning violence on television, 600 people were asked, "What is your opinion of the amount of violence on prime time television: is there too much violence on television?" Their responses are indicated in the table below.

             Yes (Y)   No (N)   Don't know   Total
Men (M)        162       95        23         280
Women (W)      256       45        19         320
Total          418      140        42         600

Suppose we label the events in the following manner: W is the event that a response is from a woman, M is the event that a response is from a man, Y is the event that a response is yes, and N is the event that a response is no. Then the event that a woman responded yes would be written as $Y \mid W$, and $p(Y \mid W) = 256/320 = 0.8$. Use the given table to answer the following questions.
a) $p(N)$ b) $p(W)$ c) $p(N \mid W)$ d) $p(W \mid N)$ e) $p(N \cap W)$ f) $p(W \cap N)$ g) $p(Y)$ h) $p(M)$ i) $p(Y \mid M)$ j) $p(M \mid Y)$ k) $p(Y \cap M)$ l) $p(M \cap Y)$ m) $p(W \cap Y)$ n) $p(W \mid Y)$
Answer: a) 0.23 b) 0.53 c) 0.14 d) 0.32 e) 0.08 f) 0.08 g) 0.70 h) 0.47 i) 0.58 j) 0.39 k) 0.27 l) 0.27 m) 0.43 n) 0.61

Independent Events

Independent Events: Two events $A$ and $B$ are called independent if and only if $P(A \cap B) = P(A) \cdot P(B)$; otherwise $A$ and $B$ are dependent.

Example. In two tosses of a single fair coin, show that the events "a head on the first toss" and "a head on the second toss" are independent.
Solution: The sample space is $S = \{HH, HT, TH, TT\}$, the event with a head on the first toss is $A = \{HH, HT\}$, and the event with a head on the second toss is $B = \{HH, TH\}$. Now show that $P(A \cap B) = P(A) \cdot P(B)$.
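The independence check in the coin example can be carried out by direct enumeration. This is a short Python sketch, assuming the four outcomes are equally likely; the helper `P` is only an illustrative name.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: ('H','H'), ('H','T'), ('T','H'), ('T','T').
S = set(product("HT", repeat=2))
A = {o for o in S if o[0] == "H"}   # head on the first toss
B = {o for o in S if o[1] == "H"}   # head on the second toss

def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

# Independence holds exactly when P(A and B) equals P(A) * P(B).
independent = P(A & B) == P(A) * P(B)
print(P(A & B), P(A) * P(B), independent)
```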
Bayes Formula: Let $A$, $B$, $C$ be mutually exclusive events whose union is the sample space $S$. Let $E$ be an arbitrary event in $S$ such that $P(E) \ne 0$. Then
$P(A \mid E) = \frac{P(A \cap E)}{P(E)}$, $P(B \mid E) = \frac{P(B \cap E)}{P(E)}$, $P(C \mid E) = \frac{P(C \cap E)}{P(E)}$,
where $P(A \cap E) = P(E \mid A) \cdot P(A)$ and so on.

General form of Bayes Theorem (Page # 348):
$P(B_k \mid A) = \frac{P(B_k) \, P(A \mid B_k)}{\sum_{i=1}^{m} P(B_i) \, P(A \mid B_i)}$, where $k = 1, 2, 3, \ldots, m$.
The conditional probability $P(B_k \mid A)$ is often called the posterior probability of $B_k$.

Example. A company produces 1,000 refrigerators a week at three plants. Plant A produces 350 refrigerators a week, plant B produces 250 refrigerators a week, and plant C produces 400 refrigerators a week. Production records indicate that 5% of the refrigerators at plant A will be defective, 3% of those produced at plant B will be defective, and 7% of those produced at plant C will be defective. All refrigerators are shipped to a central warehouse. If a refrigerator at the warehouse is found to be defective, what is the probability that it was produced a) at plant A? b) at plant B? c) at plant C?

We consider $D$ as defective and $D'$ as non-defective. Tree diagram:

Start
  0.35  A:  0.05 D,  0.95 D'
  0.25  B:  0.03 D,  0.97 D'
  0.40  C:  0.07 D,  0.93 D'

We now answer all questions from the tree diagram.
a) $P(A \mid D) = \frac{P(A \cap D)}{P(D)} = \frac{0.35(0.05)}{0.35(0.05) + 0.25(0.03) + 0.40(0.07)}$
You now try to find
b) $P(B \mid D) = \frac{P(B \cap D)}{P(D)} = \frac{0.25(0.03)}{0.35(0.05) + 0.25(0.03) + 0.40(0.07)}$
c) $P(C \mid D) = \frac{P(C \cap D)}{P(D)} = \frac{0.40(0.07)}{0.35(0.05) + 0.25(0.03) + 0.40(0.07)}$

More examples on Bayes Theorem:
1. The Belgian 20-frank coin (B20), the Italian 500-lire coin (I500), and the Hong Kong 5-dollar coin (HK5) are approximately the same size. Coin purse one (C1) contains six of each of these coins. Coin purse two (C2) contains nine B20s, six I500s, and three HK5s. A fair four-sided die is rolled. If the outcome is {1}, a coin is selected randomly from C1. If the outcome belongs to {2, 3, 4}, a coin is selected randomly from C2. Find
a) $P(I500)$, the probability of selecting an Italian coin
b) $P(C1 \mid I500)$, the conditional probability that the coin was selected from C1, given that it was an Italian coin.
Solution: We have $P(C1) = 1/4$ and $P(C2) = 3/4$. We know that (you may draw a tree diagram) $P(I500 \mid C1) = 6/18 = 1/3$, $P(B20 \mid C1) = 6/18 = 1/3$, and $P(HK5 \mid C1) = 6/18 = 1/3$, and you can find the corresponding probabilities for C2 in the same way.
a) $P(I500) = P(I500 \mid C1) P(C1) + P(I500 \mid C2) P(C2) = 1/3 \times 1/4 + 1/3 \times 3/4 = 4/12 = 1/3$
b) $P(C1 \mid I500) = \frac{P(C1) \, P(I500 \mid C1)}{P(I500)} = \frac{1/3 \times 1/4}{1/3} = 1/4$

2. Using the same coins, coin purses, and four-sided die as in Problem 1, find
a) $P(B20)$, the probability of selecting a Belgian coin (Answer: 11/24)
b) $P(C1 \mid B20)$, the conditional probability that the coin was selected from C1, given that it was a Belgian coin. (Answer: 2/11)

3. Given two urns, suppose urn I contains 4 black and 7 white balls. Urn II contains 3 black, 1 white, and 4 yellow balls. Select an urn and then select a ball.
a) What is the probability that you obtain a black ball? (Answer: 65/176)
b) What is the probability that you obtain a ball from urn II, given that the ball is a black ball? (Answer: 33/65)

4. An absent-minded nurse is to give Mr. Brown a pill each day. The probability that the nurse forgets to administer the pill is 2/3.
If he receives the pill, the probability that Mr. Brown will die is 1/3. If he does not get the pill, the probability that he will die is 3/4. Mr. Brown died. What is the probability that the nurse forgot to give Mr. Brown the pill? (Answer: 9/11)
Solution hints: A = the nurse forgets to give the pill, B = the nurse does not forget, E = Mr. Brown dies. Now $P(A) = 2/3$, $P(E \mid A) = 3/4$, $P(E \mid B) = 1/3$, and $P(E) = P(A) P(E \mid A) + P(B) P(E \mid B)$; find $P(A \mid E)$.
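The general form of Bayes' theorem translates directly into a few lines of code. The sketch below is illustrative only: the function name `posterior` is hypothetical, and the nurse-problem inputs (2/3, 3/4, 1/3) are the values assumed to be consistent with the stated answer 9/11.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(B_k | A) for each k, given P(B_k) and P(A | B_k).

    The B_k are assumed mutually exclusive with union equal to the sample space.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(B_k) * P(A | B_k)
    total = sum(joint)                                     # P(A), total probability
    return [j / total for j in joint]

# Nurse problem: B_1 = nurse forgets, B_2 = nurse does not forget, A = Mr. Brown dies.
p_forgot_given_died = posterior(
    [Fraction(2, 3), Fraction(1, 3)],
    [Fraction(3, 4), Fraction(1, 3)],
)[0]

# Refrigerator example: plants A, B, C with defect rates 5%, 3%, 7%.
plant_given_defective = posterior(
    [Fraction(35, 100), Fraction(25, 100), Fraction(40, 100)],
    [Fraction(5, 100), Fraction(3, 100), Fraction(7, 100)],
)

print(p_forgot_given_died)
print([float(p) for p in plant_given_defective])
```

Exact fractions are used so that the result can be compared with the hand computation without rounding.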
5. A company that specializes in language tutoring lists the following information concerning its English-speaking employees: 23 speak German, 25 speak French, 31 speak Spanish, 43 speak Spanish or French, 38 speak French or German, 46 speak German or Spanish, 8 speak Spanish, French and German, and 7 office workers and secretaries speak English only. Make a Venn diagram and show all the information in it. Find the following:
a) What percent of the employees speak at least one language other than English?
b) What percent of the employees speak at least two languages other than English?
Answer: a) 88.9% b) 23.8%

6. A box of candy hearts contains 52 hearts, of which 19 are white, 10 are tan, 7 are pink, 3 are purple, 5 are yellow, 1 is orange, and 7 are green. If you select 9 pieces of candy randomly from the box, without replacement, give the probability that
a) three of the hearts are white
b) three are white, two are tan, one is pink, one is yellow, and two are green.
[a) 29.17% b) 0.87%]

7. Suppose that $P(A) = 0.7$, $P(B) = 0.5$, and $P[(A \cup B)'] = 0.1$. Find a) $p(A \cap B)$ b) $p(A \mid B)$ c) $p(B \mid A)$

8. Let A and B be the events that a person is left-eye dominant or right-eye dominant, respectively. When a person folds their hands, let C and D be the events that their left thumb and right thumb, respectively, are on top. A survey in one statistics class yielded the following table:

      A    B
C     5    7
D    14    9

If a student is selected randomly, find the following probabilities: a) $p(A \cap C)$ b) $p(A \cup C)$ c) $p(A \mid C)$ d) $p(B \mid D)$

Chapter 8 Random Variables and Statistics

Random variables of the discrete type

In probability theory, a probability distribution is called discrete if it is characterized by a probability mass function.
Thus, the distribution of a random variable $X$ is discrete, and $X$ is then called a discrete random variable, if $\sum_x p(x) = 1$, where the sum runs over all possible values $x$ of $X$. If a random variable is discrete, then the set of all values that it can assume with non-zero probability is finite or countably infinite, because the sum of uncountably many positive real numbers (which is the least upper bound of the set of all finite partial sums) always diverges to infinity.

Given a random experiment with an outcome space $S$, a function $X$ that assigns to each element $s$ in $S$ one and only one real number $X(s) = x$ is called a random variable, like a
function of $s$. The space of $X$ is the set of real numbers $\{x : X(s) = x, \ s \in S\}$, where $s \in S$ means the element $s$ belongs to $S$.

The probability mass function (pmf) $f(x)$ of a discrete random variable $X$ is a function that satisfies the following properties:
1. $f(x) > 0$, $x \in S$;
2. $\sum_{x \in S} f(x) = 1$;
3. $P(X \in A) = \sum_{x \in A} f(x)$, $A \subset S$

Example 1. Suppose that $X$ has a discrete uniform distribution on $S = \{1, 2, 3, 4, 5, 6\}$, so its pmf is $f(x) = \frac{1}{6}$, $x = 1, 2, 3, 4, 5, 6$. As a general case we may write the pmf as $f(x) = \frac{1}{m}$, $x = 1, 2, 3, 4, \ldots, m$.

Example 2. Roll a 4-sided die twice and let $X$ equal the larger of the two outcomes if they are different and the common value if they are the same. The outcome space for this experiment is $S_0 = \{(d_1, d_2) : d_1 = 1, 2, 3, 4; \ d_2 = 1, 2, 3, 4\}$, where we assume that each of these 16 points has probability $\frac{1}{16}$. Then $P(X = 1) = P[(1,1)] = \frac{1}{16}$, $P(X = 2) = P[\{(1,2), (2,1), (2,2)\}] = \frac{3}{16}$, $P(X = 3) = \frac{5}{16}$, and $P(X = 4) = \frac{7}{16}$. Looking at the pattern one can easily find the pmf $f(x) = \frac{2x - 1}{16}$, $x = 1, 2, 3, 4$.

Exercise. Let the pmf of $X$ be defined by $f(x) = \frac{x}{9}$, $x = 2, 3, 4$. Draw a) a bar graph and b) a probability histogram.

For each of the following, determine the constant $c$ so that $f(x)$ satisfies the conditions of being a pmf for a random variable $X$:
a) $f(x) = \frac{x}{c}$, $x = 1, 2, 3, 4$
b) $f(x) = cx$, $x = 1, 2, 3, 4, \ldots, 10$
c) $f(x) = c \left(\frac{1}{4}\right)^x$, $x = 1, 2, 3, 4, \ldots$
d) $f(x) = c(x + 1)^2$, $x = 0, 1, 2, 3$
e) $f(x) = \frac{x}{c}$, $x = 1, 2, 3, 4, \ldots, n$
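The pmf of Example 2 (the larger of two rolls of a 4-sided die) can be built by brute-force enumeration and compared with the formula $f(x) = (2x - 1)/16$. This is a minimal Python sketch; the variable names are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 16 equally likely points (d1, d2) of the outcome space.
outcomes = list(product(range(1, 5), repeat=2))

# X is the larger of the two outcomes (or the common value if they agree).
counts = Counter(max(d1, d2) for d1, d2 in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

# Compare with the closed form and check that the pmf sums to 1.
matches_formula = all(pmf[x] == Fraction(2 * x - 1, 16) for x in pmf)
total = sum(pmf.values())

print(pmf, matches_formula, total)
```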
Mathematical expectation

In probability theory and statistics, the expected value (or expectation value, or mathematical expectation, or mean, or first moment) of a random variable is the integral of the random variable with respect to its probability measure. For discrete random variables this is equivalent to the probability-weighted sum of the possible values. The term "expected value" can be misleading. It must not be confused with the "most probable value." The expected value is in general not a typical value that the random variable can take on. It is often helpful to interpret the expected value of a random variable as the long-run average value of the variable over many independent repetitions of an experiment.

When it exists, mathematical expectation $E$ satisfies the following properties:
a) If $c$ is a constant, $E(c) = c$
b) If $c$ is a constant and $u$ is a function, $E[c \, u(X)] = c \, E[u(X)]$
c) If $c_1$ and $c_2$ are constants and $u_1$ and $u_2$ are functions, then $E[c_1 u_1(X) + c_2 u_2(X)] = c_1 E[u_1(X)] + c_2 E[u_2(X)]$

Example 1. Let $X$ have the pmf $f(x) = \frac{x}{10}$, $x = 1, 2, 3, 4$. Find $E(X)$.
Solution: $E(X) = \sum x_i f(x_i) = 3$, verify.

Example 2. Let $X$ have the pmf $f(x) = \frac{x}{6}$, $x = 1, 2, 3$. Find the mean $E(X)$, and also $E(X^2)$, the variance $E(X^2) - (E(X))^2$, and the standard deviation.
Solution: Mean $= E(X) = \sum x_i f(x_i) = \sum x_i \left(\frac{x_i}{6}\right) = \frac{14}{6} = \frac{7}{3}$ and $E(X^2) = \sum x_i^2 f(x_i) = \frac{36}{6} = 6$.
Variance $= E(X^2) - (E(X))^2 = 6 - \left(\frac{7}{3}\right)^2 = \frac{5}{9}$ and standard deviation $\sigma = \frac{\sqrt{5}}{3}$.

Example 3. A politician can emphasize jobs or the environment in her election campaign. The voters can be concerned about jobs or the environment. A payoff matrix showing the utility of each possible outcome is shown.

                          Voters: Jobs   Voters: Environment
Candidate: Jobs                25               -10
Candidate: Environment        -15                30

The political analysts feel there is a 0.39 chance that the voters will emphasize jobs. Which strategy should the candidate adopt to gain the highest utility: a) Environment or b) Jobs? Explain mathematically.
Solution: For the environment the expected value is $E = -15(0.39) + 30(1 - 0.39) = 12.45$. On the other hand, for jobs the expected value is $E = 25(0.39) - 10(1 - 0.39) = 3.65$. So the preference will go to a) Environment (because of the higher expected value).

Exercise 1. Find the mean and standard deviation of the following:
a) $f(x) = \frac{1}{5}$, $x = 5, 10, 15, 20, 25$
b) $f(x) = 1$, $x = 5$
c) $f(x) = \frac{4 - x}{6}$, $x = 1, 2, 3$
2. Given $E(X + 4) = 10$ and $E[(X + 4)^2] = 116$, determine $Var(X + 4)$ and the mean $E(X)$.

Bernoulli trials and the Binomial distribution

A Bernoulli experiment is a random experiment, the outcome of which can be classified in but one of two mutually exclusive and exhaustive ways, say, success or failure (life or death, head or tail, 3 or not 3, etc.). The pmf of a Bernoulli trial is $f(x) = p^x (1 - p)^{1 - x}$, $x = 0, 1$, and we say that the random variable has a Bernoulli distribution. The mean of a Bernoulli trial is $E(X) = \sum_{x=0}^{1} x \, p^x (1 - p)^{1 - x} = p$, verify. The variance of a Bernoulli trial is $Var(X) = \sum_{x=0}^{1} (x - p)^2 p^x (1 - p)^{1 - x} = p(1 - p) = pq$, where $q = 1 - p$.

Example 1. In the instant lottery with 20% winning tickets, if $X$ is equal to the number of winning tickets among $n = 8$ that are purchased, the probability of purchasing two winning tickets is
$f(2) = P(X = 2) = \binom{8}{2} (0.20)^2 (1 - 0.2)^6 = 0.2936 = 29.36\%$
One may use a calculator as follows (TI): 2nd DISTR 0 (binompdf) (8, 0.20, 2) will display 0.29360128

Example 2. In the instant lottery with 20% winning tickets, if $X$ is equal to the number of winning tickets among $n = 8$ that are purchased, the probability of purchasing at most 6 winning tickets is
$P(X \le 6) = 1 - f(7) - f(8) = 1 - \binom{8}{7} (0.20)^7 (1 - 0.2) - \binom{8}{8} (0.2)^8 = 0.99991552$
One may use a calculator as follows (TI): 2nd DISTR A (binomcdf) (8, 0.20, 6) will display 0.99991552
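For readers without a TI calculator, the two lottery results can be reproduced in Python. The functions `binom_pmf` and `binom_cdf` below are hypothetical stand-ins written to mirror the argument order of the calculator's binompdf(n, p, k) and binomcdf(n, p, k); they are a sketch, not a library API.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p), summing the pmf from 0 to k."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# Example 1: exactly two winning tickets among 8, with 20% winners.
p_exactly_two = binom_pmf(8, 0.20, 2)

# Example 2: at most six winning tickets among 8.
p_at_most_six = binom_cdf(8, 0.20, 6)

print(round(p_exactly_two, 8), round(p_at_most_six, 8))
```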
Example 3. In the instant lottery with 20% winning tickets, let $X$ be equal to the number of winning tickets among $n = 8$ that are purchased. Find the probability of purchasing at least 6 winning tickets.
Hint. Find $P(X \ge 6) = P(X = 6) + P(X = 7) + P(X = 8)$ or $P(X \ge 6) = 1 - P(X \le 5) = 1 -$ binomcdf (8, 0.2, 5)

Example 4. A quiz consists of 24 multiple choice questions. Each question has 5 possible answers, only one of which is correct. If you answer the questions completely based on guessing, what is the probability that
a) You will answer exactly 4 wrong?
b) You will answer exactly 4 correctly?
c) You will answer at least 20 correctly?
d) You will answer at most 3 wrong?
e) You will answer at most 3 correctly?
Solution: The probability that you will answer one question wrong is $\frac{4}{5} = 0.8$.
a) The probability of answering exactly 4 wrong is the binomial probability $B(24, 0.8, 4)$, which is $P(X = 4) = B(24, 0.8, 4) = \binom{24}{4} (0.8)^4 (0.2)^{20} = 4.56 \times 10^{-11}$, which is almost zero. If you use a TI calculator, use binompdf (24, 0.8, 4). Check your calculator using the following code: 2nd DISTR 0 binompdf (24, 0.8, 4)
b) The probability that you will answer exactly 4 correctly is $B(24, 0.2, 4) = 0.196$
c) At least 20 correct: $P(X \ge 20)$ = 20 correct + 21 correct + 22 correct + 23 correct + 24 correct. It is easy to use the calculator with binomcdf as follows: $P(X \ge 20) = 1 -$ binomcdf (24, 0.2, 19) $= 4.79 \times 10^{-11}$.
d) At most three wrong: $P(X \le 3) =$ binomcdf (24, 0.8, 3) $= 2.25 \times 10^{-12}$
e) At most three correct: $P(X \le 3) =$ binomcdf (24, 0.2, 3) $= 0.264$

Example 5. A computer manufacturer tests a random sample of 28 computers. The probability that a computer is non-defective is 91.3%. What is the probability that:
a) Exactly 7 computers are defective? Answer: 0.0066
b) At least two computers are defective? Answer: 0.7132
c) At most two computers are defective? Answer: 0.5552
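The "at least" and "at most" answers of Example 4 can be checked numerically. The Python sketch below assumes the quiz has 24 questions with success probability 0.2 per guess; `binom_pmf` and `binom_tail` are illustrative helper names. The upper tail is summed directly instead of being computed as one minus a cdf, because for a probability near $10^{-11}$ the subtraction from 1 loses most of its significant digits in floating point.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(n, p, k):
    """P(X >= k), summed term by term to stay accurate for tiny tails."""
    return sum(binom_pmf(n, p, i) for i in range(k, n + 1))

n, p_correct = 24, 0.2   # assumed quiz size and chance of guessing correctly

exactly_4_correct = binom_pmf(n, p_correct, 4)          # part b)
at_least_20_correct = binom_tail(n, p_correct, 20)      # part c)
at_most_3_correct = 1 - binom_tail(n, p_correct, 4)     # part e), complement rule

print(exactly_4_correct, at_least_20_correct, at_most_3_correct)
```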
Example 6. A quiz consists of 10 multiple choice questions, each with 4 possible choices. For someone who makes random guesses for all of the questions, find the probability of passing if the minimum passing grade is 90%.
Solution: $P(X \ge 9) = 1 -$ binomcdf (10, 0.25, 8) $= 2.95639 \times 10^{-5}$

Example 8. A student claims that he has extrasensory perception (ESP). A coin is flipped 25 times, and the student is asked to predict the outcome in advance. He gets 20 out of 25 correct. What was the probability that he would have done at least this well if he had no ESP?
Solution: $P(X \ge 20) = 1 -$ binomcdf (25, 0.5, 19) $= 0.002038658$

Exercise 1. Toss a fair coin times. How many possible outcomes do you have? What is the probability of getting a) exactly 7 heads, b) at least 7 heads, c) at most 7 heads?

Exercise 2. A student claims that he has extrasensory perception (ESP). A coin is flipped 30 times, and the student is asked to predict the outcome in advance. He gets 5 out of 30 correct. What was the probability that he would have done at least this well if he had no ESP?

Exercise 3. A quiz consists of 0 multiple choice questions, each with 5 possible choices. For someone who makes random guesses for all of the questions, find the probability of passing if the minimum passing grade is 80%.

Exercise 4. A computer manufacturer tests a random sample of 30 computers. The probability that a computer is defective is $7\frac{7}{8}\%$. What is the probability that:
a) Exactly 7 computers are defective?
b) At least two computers are defective?
c) At most two computers are defective?

Exercise 5. In the instant lottery with 20% winning tickets, if $X$ is equal to the number of winning tickets among 0 tickets that are purchased, find the probability of purchasing a) at most 7 winning tickets, b) at least 7 winning tickets, c) no more than 6 winning tickets, d) no less than 6 winning tickets.

Exercise 6. The rates of on-time flights for commercial jets are continuously tracked by the U.S. Department of Transportation.
Recently, Southwest Air had the best rate, with 80% of its flights arriving on time. A test is conducted by randomly selecting 6 Southwest flights and observing whether they arrive on time. Find a) the probability that exactly 4 flights arrive on time, b) the probability that at least 4 flights arrive on time, c) the probability that at most 4 flights arrive on time.
Random variable of the continuous type

A random variable is a function $X$ that assigns to each element $s$ in the outcome space $S$ one and only one corresponding real number $X(s) = x$. In the continuous case the space of $X$, the set of real numbers $S = \{x : X(s) = x, \ s \in S\}$, is an interval. In the discrete case $S$ is a set of discrete points. In the continuous case we call the integrable function $f(x)$ a probability density function (pdf) if it satisfies the following:
a) $f(x) > 0$, $x \in S$
b) $\int_S f(x) \, dx = 1$
c) The probability of the event $X \in A$ is $P(X \in A) = \int_A f(x) \, dx$

Probability Distribution Function: A function $F$ is a distribution function of the random variable $X$ iff the following conditions are satisfied:
a) $F$ is non-decreasing, i.e., $F' \ge 0$, or $F(x) \le F(y)$ for all $x \le y$
b) $F$ is continuous
c) $F$ is normalized, i.e., $\lim_{x \to -\infty} F(x) = 0$; $\lim_{x \to \infty} F(x) = 1$

Example 1. Evaluate the integral Solution: 0 e 0 / 0 e / 0 0 0 / 0 b lim b 0 d

Example 2. Show that $f(x) = \frac{1}{m} e^{-x/m}$, $0 \le x < \infty$, is a probability density function.
Solution (Hint): Show that $f(x) \ge 0$ and $\int_0^{\infty} \frac{1}{m} e^{-x/m} \, dx = 1$.

Example 3. Let $Y$ be a continuous random variable with pdf $g(y) = 2y$, $0 < y < 1$, and the distribution function defined by
$G(y) = 0$ for $y < 0$; $G(y) = \int_0^y 2t \, dt = y^2$ for $0 \le y < 1$; $G(y) = 1$ for $y \ge 1$.
Find the mean $E(Y) = \int_0^1 y \, g(y) \, dy = \frac{2}{3}$ (check the integral) and the variance $Var(Y) = E(Y^2) - \mu^2 = \int_0^1 y^2 g(y) \, dy - \left(\frac{2}{3}\right)^2 = \frac{1}{18}$ (check the integral). Find also the standard deviation.

Example 4. The probability density function of a continuous random variable is given as $f(x) = 1 - |x - 1|$, $0 \le x \le 2$. Find its corresponding distribution function, mean, variance and standard deviation, and the intervals within one, two, and three standard deviations of the mean.
Solution: The distribution function is the integral of the pdf over the real line. Draw the graph of the pdf and notice that $F(0) = 0$, $F(1) = 1/2$, and $F(2) = 1$. We also notice that the distribution function is zero, i.e., $F(x) = 0$, when $x < 0$.
The distribution function over the interval $0 \le x \le 1$ is $F(x) = \int (1 - |x - 1|) \, dx = \int (1 - (1 - x)) \, dx = \int x \, dx = \frac{x^2}{2} + c$, with $c = 0$, as $F(0) = 0$.
The distribution function over the interval $1 \le x \le 2$ is $F(x) = \int (1 - |x - 1|) \, dx = \int (1 - (x - 1)) \, dx = \int (2 - x) \, dx = 2x - \frac{x^2}{2} + c$, with $c = -1$, as $F(2) = 1$.
The distribution function over the interval $x \ge 2$ is $F(x) = \int f(x) \, dx = \int 0 \, dx = c = 1$, as $F(2) = 1$.
Thus we have the probability distribution function defined as follows:
$F(x) = 0$ for $x < 0$; $F(x) = \frac{x^2}{2}$ for $0 \le x < 1$; $F(x) = 2x - \frac{x^2}{2} - 1$ for $1 \le x < 2$; $F(x) = 1$ for $x \ge 2$.
You can calculate the mean, standard deviation and variance. Look at Example 3.

Example 5. Show the following function is a probability distribution function 0 0, 0 F( )
Solution: We need to check the following properties:
a) Check that $F$ is non-decreasing: $F'(x) \ge 0$ on each piece shows that $F$ is non-decreasing.
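b) For continuity, check that the one-sided limits of $F$ agree with the value of $F$ at each breakpoint, in particular $\lim_{x \to 0^-} F(x) = \lim_{x \to 0^+} F(x) = F(0) = 0$.
c) $F$ is normalized: $\lim_{x \to -\infty} F(x) = 0$; $\lim_{x \to \infty} F(x) = 1$
The function $F(x)$ defined above is a probability distribution function.

Exercise:
1. For each of the following functions, i) find the constant $c$ so that $f(x)$ is a pdf of the random variable $X$, ii) find the distribution function $F(x) = P(X \le x)$, iii) sketch $f(x)$ and $F(x)$, and iv) find also $\mu$, $\sigma^2$, and $\sigma$.
a) $f(x) = \frac{x^3}{4}$, $0 < x < c$
b) $f(x) = \frac{3}{16} x^2$, $-c < x < c$
c) $f(x) = 4x^c$, $0 \le x \le 1$
d) $f(x) = c\sqrt{x}$, $0 \le x \le 4$
2. Sketch the graph of each of the following pdfs $f(x)$, then find and sketch the probability distribution function $F(x)$ on the real line. Review Example 4.
a) $f(x) = \frac{3}{2} x^2$, $-1 < x < 1$
b) $f(x) = \frac{1}{2}$, $-1 < x < 1$
c) $f(x) = x + 1$ for $-1 < x < 0$; $f(x) = 1 - x$ for $0 \le x < 1$

The Normal Distribution

A normal distribution of a random variable $X$ with mean $\mu$ and variance $\sigma^2$ is a statistical distribution with probability density function (pdf)
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$ (1)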
on the domain $-\infty < x < \infty$. While statisticians and mathematicians uniformly use the term "normal distribution" for this distribution, physicists sometimes call it a Gaussian distribution and, because of its curved flaring shape, social scientists refer to it as the "bell curve." De Moivre developed the normal distribution as an approximation to the binomial distribution, and it was subsequently used by Laplace in 1783 to study measurement errors and by Gauss in 1809 in the analysis of astronomical data (Havil 2003, p. 157).

The normal distribution is an extremely important probability distribution in many fields. It is a family of distributions of the same general form, differing in their location and scale parameters: the mean ("average") and standard deviation ("variability"), respectively. The standard normal distribution is the normal distribution with a mean of zero and a standard deviation of one. It is often called the bell curve because the graph of its probability density resembles a bell.

If a random variable $X$ has this distribution, we write $X \sim N(\mu, \sigma^2)$. If $\mu = 0$ and $\sigma = 1$, the distribution is called the standard normal distribution and the probability density function reduces to $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
Area under a normal curve: For a standard normal variate $z$, the normal distribution has mean zero and standard deviation one, with pdf $f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-z^2/2\right) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$. The area under the standard normal distribution curve for $Z \le z$ is $P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(-u^2/2\right) du$. This integral is difficult to evaluate without knowledge of multivariable calculus and polar coordinates. We can manage this difficulty by using standard values from table 5 of the Normal distribution at page # 43 or by using our calculator. Look at Example 3.

Important Information: All normal density curves satisfy the following property, which is often referred to as the empirical rule:
1. 68.26% of the observations fall within 1 standard deviation of the mean.
2. 95.44% of the observations fall within 2 standard deviations of the mean.
3. 99.74% of the observations fall within 3 standard deviations of the mean.
Note: Within 5 standard deviations of the mean we assume 100% of the data points.

Example 1. Find the mean and standard deviation of the normal distribution whose pdf is given as $f(x) = \frac{1}{8\sqrt{2\pi}} \exp\left[-\frac{(x - 7)^2}{128}\right]$
Solution: Compare with the standard formula of the pdf for the normal distribution and find that $\sigma = 8$, $\mu = 7$.

Example 2. Write the pdf of a normal distribution with mean 3 and variance 16.
Solution: We have $\sigma = 4$, $\mu = 3$, so the pdf of the normal distribution is given as $f(x) = \frac{1}{4\sqrt{2\pi}} \exp\left[-\frac{(x - 3)^2}{32}\right]$

Example 3. Find the area under the normal curve with mean zero and standard deviation one for the standard variate $z \le 1.24$.
Solution: From table 5a: $P(z \le 1.24) = 0.8925 = 89.25\%$. For this value choose the row with 1.2 and the column 0.04.
Using calculator: $P(z \le 1.24) = 0.8925 = 89.25\%$
The calculator code: 2nd DISTR 2 normalcdf (-5, 1.24) = 0.892512

Example 4. Find the area under the normal curve with mean zero and standard deviation one for the standard variate $z \ge 1.24$.
Solution: From table 5a: $P(z \ge 1.24) = 1 - 0.8925 = 0.1075 = 10.75\%$. For this value choose the row with 1.2 and the column 0.04.
Using calculator: $P(z \ge 1.24) = 0.1075 = 10.75\%$
The calculator code: 2nd DISTR 2 normalcdf (1.24, 5) = 0.1074875

Example 5. Find the area under the normal curve with mean zero and standard deviation one for $-0.12 \le z \le 1.24$.
Solution: From table 5a: $P(-0.12 \le z \le 1.24) = P(z \le 1.24) - P(z \le -0.12) = 44.03\%$
Using calculator: $P(-0.12 \le z \le 1.24) = 0.4403 = 44.03\%$
The calculator code: 2nd DISTR 2 normalcdf (-0.12, 1.24) = 0.44027

Example 6. Suppose $x$ is a normally distributed random variable with mean 10.2 and standard deviation 1.5. Find each of the following probabilities. a) $P(6.1 \le x \le 13.3)$ b) 9.4 3) c) 5.5 3.) d) $P(x \ge 11.6)$ e) P ( 4.4) Draw the normal curve and show the region bounded by the normal curve and the values.
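The standard normal areas in Examples 3 to 5 can also be reproduced without a table or a TI calculator. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper `normal_area` is a hypothetical name that plays the role of normalcdf(lower, upper), and the bounds $-0.12$ and $1.24$ are the values assumed to match the stated areas.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def normal_area(lower, upper, dist=Z):
    """Area under the normal curve between lower and upper."""
    return dist.cdf(upper) - dist.cdf(lower)

area_below = Z.cdf(1.24)                    # Example 3: P(z <= 1.24)
area_above = 1 - Z.cdf(1.24)                # Example 4: P(z >= 1.24)
area_between = normal_area(-0.12, 1.24)     # Example 5: P(-0.12 <= z <= 1.24)

print(round(area_below, 4), round(area_above, 4), round(area_between, 4))
```

For a non-standard normal variable, passing `NormalDist(mu, sigma)` as `dist` corresponds to the four-argument calculator form normalcdf(lower, upper, mean, sd).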
Solution: from the calculator
a) $P(6.1 \le x \le 13.3) =$ normalcdf (6.1, 13.3, 10.2, 1.5) $= 0.9775 = 97.75\%$; try b) and c) in a similar way.
d) $P(x \ge 11.6) = P\left(z \ge \frac{11.6 - 10.2}{1.5}\right) =$ normalcdf (0.9333, 5) $= 17.53\%$
Try e) yourself.

Exercise Set

1. The physical fitness of an athlete is often measured by how much oxygen the athlete takes in (which is recorded in milliliters per kilogram, ml/kg). The mean maximum oxygen uptake for elite athletes has been found to be 80 with a standard deviation of 9.2. Assume that the distribution is approximately normal.
a) What is the probability that an elite athlete has a maximum oxygen uptake of at least 75 ml/kg? Answer: 70.66%
b) What is the probability that an elite athlete has a maximum oxygen uptake of 65 ml/kg or lower? Answer: 5.15%
c) Consider someone with a maximum oxygen uptake of 6 ml/kg. Is it likely that this person is an elite athlete? Answer: No

2. The combined scores on the SAT test are normally distributed with a mean of 998 and a standard deviation of 202. If a college includes a minimum score of 800 among its requirements, what percentage of students do not satisfy that requirement? Answer: 16.35%

3. IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. Mensa is an international society that has one and only one qualification for membership: a score in the top 2% on an IQ test.
a) What IQ score should one have in order to be eligible for Mensa? Answer: hint: $(x - 100)/15 =$ invnorm(0.98), $x = 130.8$
b) In a typical region of 90,000 people, how many are eligible for Mensa? Answer: 90,000 (0.02) = 1800

4. Using diaries for many weeks, a study on the lifestyle of visually impaired students was conducted. The students kept track of many lifestyle variables, including how many hours of sleep they obtained on a typical day. Researchers found that visually impaired students averaged 9.6 hours of sleep, with a standard deviation of 2.56 hours. Assume that the number of hours of sleep for these visually impaired students is normally distributed.
a) What is the probability that a visually impaired student gets less than 6.1
hours of sleep? Answer: 8.58%
b) What is the probability that a visually impaired student gets between 6.3 and 10.35 hours of sleep? Answer: 51.65%
c) Forty percent of students get less than how many hours of sleep on a typical day? Answer: 8.95 hours

5. Healthy people have body temperatures that are normally distributed with a mean of 98.20 degrees Fahrenheit and a standard deviation of 0.62 degrees Fahrenheit.
a) If a healthy person is randomly selected, what is the probability that he or she has a body temperature above 98.9 degrees Fahrenheit? Answer: 12.94%
b) A hospital wants to select a minimum temperature for requiring further medical tests. What should that temperature be, if we want only 1% of healthy people to exceed it? Answer: hint: $(x - 98.2)/0.62 =$ invnorm(0.99), $x = 99.64$

6. The heights of a large group of people are assumed to be normally distributed. Their mean height is 68 inches, and the standard deviation is 4 inches. What percent of these people are taller than 73 inches? Answer: 10.56%

7. Suppose a population is normally distributed with a mean of 4.6 and a standard deviation of 1.3. What percent of the data will lie between 5.3 and 6.8? Answer: 24.91%

Statistics (this section will be discussed briefly in the class)

Measures of Central Tendency
Mean, Median and Mode of a set of data:
Measures of Dispersion: