The Poisson Distribution


 Alban Lang
Lecture 5: The Poisson Distribution

5.1 Introduction

Example 5.1: Drownings in Malta

The book [Mou98] cites data from the St. Luke's Hospital Gazette, on the monthly number of drownings on Malta, over a period of nearly 30 years (355 consecutive months). Most months there were no drownings. Some months there was one person who drowned. One month had four people drown. The data are given as counts of the number of months in which a given number of drownings occurred, and we repeat them here as Table 5.1.

Looking at the data in Table 5.1, we might suppose that one of the following hypotheses is true:

- Some months are particularly dangerous;
- Or, on the contrary, when one person has drowned, the surrounding publicity makes others more cautious for a while, preventing drownings;
- Or, drownings are simply independent events.

How can we use the data to decide which of these hypotheses is true? We might reasonably suppose that the first hypothesis would predict that there would be more months with high numbers of drownings than the independence hypothesis; the second
hypothesis would predict fewer months with high numbers of drownings. The problem is, we don't know how many we should expect, if independence is correct. What we need is a model: a sensible probability distribution, giving the probability of a month having a certain number of drownings, under the independence assumption. The standard model for this sort of situation is called the Poisson distribution.

Table 5.1: Monthly counts of drownings in Malta. (Columns: No. of drowning deaths per month; Frequency, i.e. No. of months observed.)

The Poisson distribution is used in situations when we observe the counts of events within a set unit of time, area, volume, length etc. For example:

- The number of cases of a disease in different towns;
- The number of mutations in given regions of a chromosome;
- The number of dolphin pod sightings along a flight path through a region;
- The number of particles emitted by a radioactive source in a given time;
- The number of births per hour during a given day.

In such situations we are often interested in whether the events occur randomly in time or space. Consider the Babyboom dataset (Table 1.2), that we saw in Lecture 1. The birth times of the babies throughout the day are shown in Figure 5.1(a). If we divide up the day into 24 one-hour intervals and
count the number of births in each hour we can plot the counts as a histogram in Figure 5.1(b). How does this compare to the histogram of counts for a process that isn't random? Suppose the 44 birth times were distributed in time as shown in Figure 5.1(c). The histogram of these birth times per hour is shown in Figure 5.1(d). We see that the nonrandom clustering of events in time causes there to be more hours with zero births and more hours with large numbers of births than the real birth times histogram.

This example illustrates that the distribution of counts is useful in uncovering whether the events might occur randomly or nonrandomly in time (or space). Simply looking at the histogram isn't sufficient if we want to ask the question whether the events occur randomly or not. To answer this question we need a probability model for the distribution of counts of random events that dictates the type of distributions we should expect to see.

5.2 The Poisson Distribution

The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a given interval of time (or space).

If we let $X$ = the number of events in a given interval, then, if the mean number of events per interval is $\lambda$, the probability of observing $x$ events in a given interval is given by

$$P(X = x) = e^{-\lambda}\,\frac{\lambda^x}{x!}, \qquad x = 0, 1, 2, 3, 4, \ldots$$

Note: $e$ is a mathematical constant, $e \approx 2.718$. There should be a button on your calculator, $e^x$, that calculates powers of $e$.

If the probabilities of $X$ are distributed in this way, we write $X \sim \text{Po}(\lambda)$. $\lambda$ is the parameter of the distribution. We say $X$ follows a Poisson distribution with parameter $\lambda$.
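The formula for $P(X = x)$ can be evaluated directly on a computer as well as on a calculator. The following is a minimal sketch in Python using only the standard library; the function name `poisson_pmf` and the choice of $\lambda = 1.8$ are illustrative assumptions, not part of the lecture.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam), i.e. exp(-lam) * lam**x / x!."""
    if x < 0:
        return 0.0  # counts cannot be negative
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Probabilities of 0, 1, ..., 5 events when the mean is 1.8 per interval
probs = [poisson_pmf(x, 1.8) for x in range(6)]
for x, p in enumerate(probs):
    print(x, round(p, 4))

# The probabilities over all x = 0, 1, 2, ... should add up to 1;
# terms beyond x = 59 are far too small to matter here.
total = sum(poisson_pmf(x, 1.8) for x in range(60))
print(total)
```

Summing the probabilities is a quick sanity check that the formula really defines a probability distribution.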
Figure 5.1: Representing the Babyboom data set (upper two panels) and a nonrandom hypothetical collection of birth times (lower two panels). Panels: (a) Babyboom data birth times; (b) histogram of Babyboom birth times; (c) nonrandom birth times; (d) histogram of nonrandom birth times. Panels (a) and (c) show birth time (minutes since midnight); panels (b) and (d) show frequency against No. of births per hour.

Note: A Poisson random variable can take on any non-negative integer value (0, 1, 2, ...). In contrast, the Binomial distribution always has a finite upper limit.
Example 5.2: Hospital births

Births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability of observing 4 births in a given hour at the hospital?

Let $X$ = No. of births in a given hour. Since (i) events occur randomly and (ii) the mean rate is $\lambda = 1.8$, we have $X \sim \text{Po}(1.8)$.

We can now use the formula to calculate the probability of observing exactly 4 births in a given hour:

$$P(X = 4) = e^{-1.8}\,\frac{1.8^4}{4!} = 0.0723$$

What about the probability of observing 2 or more births in a given hour at the hospital? We want $P(X \ge 2) = P(X = 2) + P(X = 3) + \cdots$, i.e. an infinite number of probabilities to calculate, but

$$\begin{aligned}
P(X \ge 2) &= P(X = 2) + P(X = 3) + \cdots \\
&= 1 - P(X < 2) \\
&= 1 - \bigl(P(X = 0) + P(X = 1)\bigr) \\
&= 1 - \left(e^{-1.8}\,\frac{1.8^0}{0!} + e^{-1.8}\,\frac{1.8^1}{1!}\right) \\
&= 1 - (0.16530 + 0.29754) \\
&= 0.537
\end{aligned}$$
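The two calculations in Example 5.2 can be reproduced in a few lines. This is only a sketch under the example's own assumption of $\lambda = 1.8$ births per hour; the helper name `pmf` is hypothetical.

```python
import math

lam = 1.8  # mean number of births per hour, from Example 5.2

def pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Exactly 4 births in a given hour
p_exactly_4 = pmf(4, lam)

# 2 or more births: use the complement instead of an infinite sum
p_at_least_2 = 1.0 - (pmf(0, lam) + pmf(1, lam))

print(round(p_exactly_4, 4))
print(round(p_at_least_2, 4))
```

The complement trick is what keeps the second calculation finite: only the two probabilities below the threshold need to be computed.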
Example 5.3: Disease incidence

Suppose there is a disease, whose average incidence is 2 per million people. What is the probability that a city of 1 million people has at least twice the average incidence?

Twice the average incidence would be 4 cases. We can reasonably suppose the random variable $X$ = # cases in 1 million people has Poisson distribution with parameter 2. Then

$$P(X \ge 4) = 1 - P(X \le 3) = 1 - \left(e^{-2}\,\frac{2^0}{0!} + e^{-2}\,\frac{2^1}{1!} + e^{-2}\,\frac{2^2}{2!} + e^{-2}\,\frac{2^3}{3!}\right) = 0.143$$

5.3 The shape of the Poisson distribution

Using the formula we can calculate the probabilities for a specific Poisson distribution and plot the probabilities to observe the shape of the distribution. For example, Figure 5.2 shows 3 different Poisson distributions. We observe that the distributions

(i). are unimodal;
(ii). exhibit positive skew (that decreases as $\lambda$ increases);
(iii). are centred roughly on $\lambda$;
(iv). have variance (spread) that increases as $\lambda$ increases.

5.4 Mean and Variance of the Poisson distribution

In general, there is a formula for the mean of a Poisson distribution. There is also a formula for the standard deviation, $\sigma$, and variance, $\sigma^2$.

If $X \sim \text{Po}(\lambda)$ then

$$\mu = \lambda, \qquad \sigma = \sqrt{\lambda}, \qquad \sigma^2 = \lambda$$
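The statements $\mu = \lambda$ and $\sigma^2 = \lambda$ can be checked numerically by summing over the probabilities. The sketch below does this for the three values of $\lambda$ shown in Figure 5.2; the function names and the cut-off of 200 terms are assumptions made for illustration (the neglected tail is negligibly small for these values of $\lambda$).

```python
import math

def poisson_probs(lam, x_max):
    """Return [P(X=0), ..., P(X=x_max)] using P(x) = P(x-1) * lam / x."""
    probs = [math.exp(-lam)]
    for x in range(1, x_max + 1):
        probs.append(probs[-1] * lam / x)
    return probs

def mean_and_variance(lam, x_max=200):
    """Approximate the mean and variance of Po(lam) by a truncated sum."""
    probs = poisson_probs(lam, x_max)
    mean = sum(x * p for x, p in enumerate(probs))
    variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, variance

results = {lam: mean_and_variance(lam) for lam in (3, 5, 10)}
for lam, (mean, variance) in results.items():
    print(lam, round(mean, 6), round(variance, 6), round(math.sqrt(variance), 4))
```

In each case the computed mean and variance both come out equal to $\lambda$, so the standard deviation is $\sqrt{\lambda}$.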
Figure 5.2: Three different Poisson distributions. (Panels: Po(3), Po(5), Po(10); each plots P(X) against X.)

5.5 Changing the size of the interval

Suppose we know that births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability that we observe 5 births in a given 2 hour interval?

Well, if births occur randomly at a rate of 1.8 births per 1 hour interval, then births occur randomly at a rate of 3.6 births per 2 hour interval.

Let $Y$ = No. of births in a 2 hour period. Then $Y \sim \text{Po}(3.6)$ and

$$P(Y = 5) = e^{-3.6}\,\frac{3.6^5}{5!} = 0.138$$

This example illustrates the following rule:

If $X \sim \text{Po}(\lambda)$ on 1 unit interval, then $Y \sim \text{Po}(k\lambda)$ on $k$ unit intervals.
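The rescaling rule can be sketched in code as follows. The variable names are illustrative assumptions. As an extra consistency check (not in the lecture), the probability of no births in two hours should equal the probability of no births in one hour, squared, since both hours must be empty.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

rate_per_hour = 1.8           # births per 1 hour interval
k = 2                         # number of unit intervals
lam_k = k * rate_per_hour     # mean for the 2 hour interval: 3.6

p_five_in_two_hours = poisson_pmf(5, lam_k)
print(round(p_five_in_two_hours, 4))

# Consistency check: zero births in 2 hours vs zero births in each hour
p_none_two_hours = poisson_pmf(0, lam_k)
p_none_each_hour = poisson_pmf(0, rate_per_hour) ** 2
print(p_none_two_hours, p_none_each_hour)
```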
5.6 Sum of two Poisson variables

Now suppose we know that in hospital A births occur randomly at an average rate of 2.3 births per hour and in hospital B births occur randomly at an average rate of 3.1 births per hour. What is the probability that we observe 7 births in total from the two hospitals in a given 1 hour period?

To answer this question we can use the following rule:

If $X \sim \text{Po}(\lambda_1)$ on 1 unit interval, and $Y \sim \text{Po}(\lambda_2)$ on 1 unit interval, and $X$ and $Y$ are independent, then $X + Y \sim \text{Po}(\lambda_1 + \lambda_2)$ on 1 unit interval.

So if we let $X$ = No. of births in a given hour at hospital A and $Y$ = No. of births in a given hour at hospital B, then $X \sim \text{Po}(2.3)$, $Y \sim \text{Po}(3.1)$ and $X + Y \sim \text{Po}(5.4)$, so

$$P(X + Y = 7) = e^{-5.4}\,\frac{5.4^7}{7!} = 0.120$$

Example 5.4: Disease incidence, continued

Suppose disease A occurs with incidence 1.7 per million, and disease B occurs with incidence 2.9 per million. Statistics are compiled, in which these diseases are not distinguished, but simply are all called cases of disease AB. What is the probability that a city of 1 million people has at least 6 cases of AB?

If $Z$ = # cases of AB, then $Z \sim \text{Po}(4.6)$. Thus,

$$P(Z \ge 6) = 1 - P(Z \le 5) = 1 - e^{-4.6}\left(\frac{4.6^0}{0!} + \frac{4.6^1}{1!} + \frac{4.6^2}{2!} + \frac{4.6^3}{3!} + \frac{4.6^4}{4!} + \frac{4.6^5}{5!}\right) = 0.314$$
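One way to convince yourself of the addition rule is to compute $P(X + Y = 7)$ the long way, by summing over every way of splitting 7 births between the two hospitals, and comparing with the Po(5.4) formula. The sketch below assumes, as the rule requires, that the two hospitals are independent; the names are illustrative.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam_a, lam_b = 2.3, 3.1   # births per hour at hospitals A and B
total = 7

# Shortcut: X + Y ~ Po(lam_a + lam_b)
direct = poisson_pmf(total, lam_a + lam_b)

# Long way: k births at A and (7 - k) at B, for k = 0, ..., 7
convolution = sum(
    poisson_pmf(k, lam_a) * poisson_pmf(total - k, lam_b)
    for k in range(total + 1)
)

print(round(direct, 4), round(convolution, 4))
```

Both routes give the same probability, which is what the rule asserts.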
5.7 Fitting a Poisson distribution

Consider the two sequences of birth times we saw in Section 5.1. Both of these examples consisted of a total of 44 births in 24 hour intervals. Therefore the mean birth rate for both sequences is 44/24 = 1.8333.

What would be the expected counts if birth times were really random, i.e. what is the expected histogram for a Poisson random variable with mean rate $\lambda = 1.8333$?

Using the Poisson formula we can calculate the probabilities of obtaining each possible value of $x$. (Table of $x$ against $P(X = x)$.)

Then if we observe 24 hour intervals we can calculate the expected frequencies as $24 \times P(X = x)$ for each value of $x$. (Table of $x$ against expected frequency $24 \times P(X = x)$.)

We say we have fitted a Poisson distribution to the data. This consisted of 3 steps:

(i). Estimating the parameters of the distribution from the data;
(ii). Calculating the probability distribution;
(iii). Multiplying the probability distribution by the number of observations.

Once we have fitted a distribution to the data we can compare the expected frequencies to those we actually observed from the real Babyboom dataset. We see that the agreement is quite good. (Table of $x$ against expected and observed frequencies.)

In practice we group values with low probability into one category.
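The three fitting steps can be sketched as a short function. This is an illustration only: the function name `fit_poisson`, its arguments, and the choice to group counts of 6 or more into a single category are assumptions, and the observed frequencies from the Babyboom data are not reproduced here.

```python
import math

def fit_poisson(total_events, n_intervals, x_max):
    """Return (lam, expected). expected[x] is the expected number of
    intervals containing x events for x < x_max; the final entry groups
    all counts of x_max or more into one category."""
    # Step (i): estimate the parameter from the data
    lam = total_events / n_intervals
    # Step (ii): calculate the probability distribution
    probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(x_max)]
    probs.append(1.0 - sum(probs))  # grouped tail: x_max or more events
    # Step (iii): multiply by the number of observations
    expected = [n_intervals * p for p in probs]
    return lam, expected

lam, expected = fit_poisson(total_events=44, n_intervals=24, x_max=6)
for x, freq in enumerate(expected):
    label = str(x) if x < 6 else "6+"
    print(label, round(freq, 2))
```

Because the last category absorbs the whole upper tail, the expected frequencies add up to exactly 24, the number of hours observed.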
When we compare the expected frequencies to those observed from the nonrandom clustered sequence in Section 5.1 we see that there is much less agreement. (Table of $x$ against expected and observed frequencies for the clustered sequence.)

In Lecture 9 we will see how we can formally test for a difference between the expected and observed counts. For now it is enough just to know how to fit a distribution.

5.8 Using the Poisson to approximate the Binomial

The Binomial and Poisson distributions are both discrete probability distributions. In some circumstances the distributions are very similar. For example, consider the Bin(100, 0.02) and Po(2) distributions shown in Figure 5.3. Visually these distributions are almost identical.

In general, if $n$ is large (say > 50) and $p$ is small (say < 0.1) then a Bin($n$, $p$) can be approximated with a Po($\lambda$) where $\lambda = np$.

Example 5.5: Counting lefties

Given that 5% of a population are left-handed, use the Poisson distribution to estimate the probability that a random sample of 100 people contains 2 or more left-handed people.

$X$ = No. of left-handed people in a sample of 100, so $X \sim \text{Bin}(100, 0.05)$.

Poisson approximation: $X \sim \text{Po}(\lambda)$ with $\lambda = 100 \times 0.05 = 5$.
Figure 5.3: A Binomial and Poisson distribution that are very similar. (Panels: Bin(100, 0.02) and Po(2); each plots P(X) against X.)

We want $P(X \ge 2)$:

$$\begin{aligned}
P(X \ge 2) &= 1 - P(X < 2) \\
&= 1 - \bigl(P(X = 0) + P(X = 1)\bigr) \\
&\approx 1 - \left(e^{-5}\,\frac{5^0}{0!} + e^{-5}\,\frac{5^1}{1!}\right) \\
&\approx 1 - 0.0404 \\
&\approx 0.9596
\end{aligned}$$

If we use the exact Binomial distribution we get the answer 0.9629.

The idea of using one distribution to approximate another is widespread throughout statistics and one we will meet again. Why would we use an approximate distribution when we actually know the exact distribution?

- The exact distribution may be hard to work with.
- The exact distribution may have too much detail. There may be some features of the exact distribution that are irrelevant to the questions
we want to answer. By using the approximate distribution, we focus attention on the things we're really concerned with.

For example, consider the Babyboom data, discussed in Example 5.2. We said that random birth times should yield numbers of births in each hour that are Poisson distributed. Why? Consider the births between 6 am and 7 am. When we say that the births are random, we probably mean something like this: the times are independent of each other, and have equal chances of happening at any time. Any given one of the 44 births has 24 hours when it could have happened. The probability that it happens during this hour is $p = 1/24 \approx 0.042$. The births between 6 am and 7 am should thus have about the Bin(44, 1/24) distribution. This distribution is about the same as Po(1.83), since $1.83 \approx 44 \times 1/24$.

Example 5.6: Drownings in Malta, continued

We now analyse the data on the monthly numbers of drowning incidents in Malta. Under the hypothesis that drownings have nothing to do with each other, and have causes that don't change in time, we would expect the random number $X$ of drownings that occur in a month to have a Poisson distribution. Why is that? We might imagine that there are a large number $n$ of people in the population, each of whom has an unknown probability $p$ of drowning in any given month. Then the number of drownings in a month has Bin($n$, $p$) distribution. In order to use this model, we need to know what $n$ and $p$ are. That is, we need to know the size of the population, which we don't really care about.

On the other hand, the expected (mean) number of monthly drownings is $np$, and that can be estimated from the observed mean number of drownings. If we approximate the binomial distribution by Po($\lambda$), where $\lambda = np$, then we don't have to worry about $n$ and $p$ separately. We estimate $\lambda$ as total number of drownings / number of months. The total number of drownings is 167, so we estimate $\lambda = 167/355 = 0.47$.

We show the probabilities for the different possible outcomes in the last column of Table 5.2. In the third column we show the expected number of months with a given number of drownings, assuming
the independence assumption, and hence the Poisson model, is true. This is computed by multiplying the last column by 355. After all, if the probability of no drownings in any given month is 0.625, and we have 355 months of observations, we expect $0.625 \times 355 = 221.9$ months with 0 drownings.

Table 5.2: Monthly counts of drownings in Malta, with Poisson fit. (Columns: No. of drowning deaths per month; Frequency, i.e. No. of months observed; Expected frequency; Probability under the Poisson distribution with $\lambda = 0.47$.)

We see that the observations (in the second column) are pretty close to the predictions of the Poisson model (in the third column), so the data do not give us strong evidence to reject the neutral assumption, that drownings are independent of one another, and have a constant rate in time. In Lecture 9 we will describe one way of testing this hypothesis formally.

Example 5.7: Swine flu vaccination

In 1976, fear of an impending swine flu pandemic led to a mass vaccination campaign in the US. The pandemic never materialised, but there were concerns that the vaccination may have led to an increase in a rare and serious neurological disease, Guillain-Barré Syndrome (GBS). It was difficult to determine whether the vaccine was really at fault, since GBS may arise spontaneously (about 1 person in 100,000 develops GBS in a given year) and the number of cases was small.

Consider the following data from the US state of Michigan: Out of 9 million residents, about 2.3 million were vaccinated. Of
those, 48 developed GBS between July 1976 and June 1977. We might have expected

$$2.3 \text{ million} \times 10^{-5} \text{ cases/person-year} = 23 \text{ cases}.$$

How likely is it that, purely by chance, this population would have experienced 48 cases in a single year? If $Y$ is the number of cases, it would then have Poisson distribution with parameter 23, so that

$$P(Y \ge 48) = 1 - \sum_{i=0}^{47} e^{-23}\,\frac{23^i}{i!} \approx 3.5 \times 10^{-6}.$$

So, such an extreme number of cases is likely to happen less than 1 year in 100,000. Does this prove that the vaccine caused GBS? The people who had the vaccine are people who chose to be vaccinated. They may differ from the rest of the population in multiple ways in addition to the elementary fact of having been vaccinated, and some of those ways may have predisposed them to GBS. What can we do?

The paper [BH84] takes the following approach: If the vaccine were not the cause of the GBS cases, we would expect no connection between the timing of the vaccine and the onset of GBS. In fact, though, there seemed to be a particularly large number of cases in the six weeks following vaccination. Can we say that this was more than could reasonably be expected by chance?

The data are given in Table 5.3. Each of the 40 GBS cases was assigned a time, which is the number of weeks after vaccination when the disease was diagnosed. (Thus week 1 is a different calendar week for each subject.) If the cases are evenly distributed, the number in a given week should be Poisson distributed with parameter $40/30 = 1.33$. Using this parameter, we compute the probabilities of 0, 1, 2, ... cases in a week, which we give in row 3 of Table 5.3. Multiplying these numbers by 30 gives the expected frequencies in row 4 of the table. It is clear that the observed and expected frequencies are very different. One way of seeing this is to consider the standard deviation. The Poisson distribution has SD $\sqrt{1.33} = 1.15$ (as discussed in section 5.4),
while the data have SD $s$ computed from the squared deviations from the mean, weighted by the observed frequencies, $7\,(1 - 1.33)^2 + 3\,(2 - 1.33)^2 + \cdots + 1\,(9 - 1.33)^2 + \cdots$, which is larger than 1.15.

Table 5.3: Cases of GBS, by weeks after vaccination. (Rows: # cases per week; observed frequency; probability; expected frequency.)

5.9 Derivation of the Poisson distribution (non-examinable)

This section is not officially part of the course, but is optional, for those who are interested in more mathematical detail. Where does the formula in section 5.2 come from?

Think of the Poisson distribution as in section 5.8, as an approximation to a binomial distribution. Let $X$ be the (random) number of successes in a collection of independent random trials, where the expected number of successes is $\lambda$. This will, of course, depend on the number of trials, but we show that when the number of trials (call it $n$) gets large, the exact number of trials doesn't matter. In mathematical language, we say that the probability converges to a limit as $n$ goes to infinity. But how large is large? We would like to know how good the approximation is, for real values of $n$, of the sort that we are interested in.

Let $X_n$ be the random number of successes in $n$ independent trials, where the probability of each success is $\lambda/n$. Thus, the probability of success goes down as the number of trials goes up, and the expected number of successes is always the same $\lambda$. Then

$$P\{X_n = x\} = {}^nC_x \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}.$$
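Before working through the algebra, the convergence can be seen numerically. The sketch below evaluates the binomial probability with success probability $\lambda/n$ for increasing $n$ and compares it with the Poisson formula from section 5.2. The values $\lambda = 2$, $x = 2$ and the list of $n$ are arbitrary choices for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """P(X_n = x) for X_n ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam, x = 2.0, 2
gaps = []
for n in (10, 100, 1000, 10000):
    # Success probability shrinks as lam / n, so the mean stays at lam
    gap = abs(binomial_pmf(x, n, lam / n) - poisson_pmf(x, lam))
    gaps.append(gap)
    print(n, gap)
```

The gap between the two probabilities shrinks steadily as $n$ grows, which is the limit the derivation makes precise.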
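Now, those of you who have learned some calculus at A-levels may remember the Taylor series for $e^z$:

$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots.$$

In particular, for small $z$ we have $e^{-z} \approx 1 - z$, and the difference (or error in the approximation) is no bigger than $z^2/2$. The key idea is that if $z$ is very small (as it is when $z = \lambda/n$, and $n$ is large), then $z^2$ is a lot smaller than $z$.

Using a bit of algebra, we have

$$\begin{aligned}
P\{X_n = x\} &= {}^nC_x \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n} \\
&= \frac{n(n-1)\cdots(n-x+1)}{x!}\,\frac{\lambda^x}{n^x} \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n} \\
&= \frac{\lambda^x}{x!}\,(1)\left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}.
\end{aligned}$$

Now, if we're not concerned about the size of the error, we can simply say that $n$ is much bigger than $\lambda$ or $x$ (because we're thinking of a fixed $\lambda$ and $x$, and $n$ getting large). So we have the approximations

$$1 - \frac{x-1}{n} \approx 1; \qquad \left(1 - \frac{\lambda}{n}\right)^{-x} \approx 1; \qquad \left(1 - \frac{\lambda}{n}\right)^{n} \approx \left(e^{-\lambda/n}\right)^{n} = e^{-\lambda}.$$

Thus

$$P\{X_n = x\} \approx \frac{\lambda^x}{x!}\,e^{-\lambda}.$$

Error bounds (very mathematical)

In the long run, $X_n$ has a distribution very close to the Poisson distribution defined in section 5.2. But how long is the long run? Do we need 10 trials? 1000? A billion?

If you just want the answer, it's approximately this: The error that you'll make by taking the Poisson distribution instead of the binomial is no more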
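than about $1.6\,\bar\lambda^2 / n^{3/2}$, where $\bar\lambda = \max\{\lambda, 1\}$. In Example 5.5, where $n = 100$ and $\lambda = 5$, this says the error won't be bigger than about 0.04, which is useful information, although in reality the maximum error is about 10 times smaller than this. On the other hand, if $n = 400{,}000$ (about the population of Malta), and $\lambda = 0.47$, then the error will be very much smaller still.

Let's assume that $n$ is at least $4\lambda^2$, so $\lambda \le \sqrt{n}/2$. Define the approximation error to be

$$\epsilon := \max_x \bigl|P\{X_n = x\} - P\{X = x\}\bigr|.$$

(The bars mean that we're only interested in how big the difference is, not whether it's positive or negative.) Then

$$\begin{aligned}
P\{X_n = x\} - P\{X = x\} &= \frac{\lambda^x}{x!}\,(1)\left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n} - \frac{\lambda^x}{x!}\,e^{-\lambda} \\
&= \frac{\lambda^x}{x!}\,e^{-\lambda} \left[(1)\left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(\frac{1 - \lambda/n}{e^{-\lambda/n}}\right)^{n} - 1\right].
\end{aligned}$$

If $x$ is bigger than $\sqrt{n}$, then $P\{X_n = x\}$ and $P\{X = x\}$ are both tiny; we won't go into the details here, but we will consider only $x$ that are smaller than this.

Now we have to do some careful approximation. Basic algebra tells us that if $a$ and $b$ are positive (and smaller than 1),

$$(1 - a)(1 - b) = 1 - (a + b) + ab > 1 - (a + b).$$

We can extend this to

$$(1 - a)(1 - b)(1 - c) > (1 - (a + b))(1 - c) > 1 - (a + b + c).$$

And so, finally, if $a, b, c, \ldots$ are all positive (and smaller than 1), then

$$1 > (1 - a)(1 - b)(1 - c)\cdots(1 - z) > 1 - (a + b + c + \cdots + z).$$

Thus

$$1 > \prod_{k=0}^{x-1}\left(1 - \frac{k}{n}\right) > 1 - \sum_{k=0}^{x-1}\frac{k}{n} > 1 - \frac{x^2}{2n},$$

and

$$1 > \left(1 - \frac{\lambda}{n}\right)^{x} > 1 - \frac{\lambda x}{n}.$$

Again applying some calculus, we turn this into

$$1 < \left(1 - \frac{\lambda}{n}\right)^{-x} < 1 + \frac{\lambda x}{n - \lambda x}.$$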
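We also know that

$$1 - \frac{\lambda}{n} < e^{-\lambda/n} < 1 - \frac{\lambda}{n} + \frac{\lambda^2}{2n^2},$$

which means that

$$1 - \frac{\lambda^2}{2(n^2 - \lambda n)} < \frac{1 - \lambda/n}{e^{-\lambda/n}} < 1,$$

and

$$1 - \frac{\lambda^2}{2(n - \lambda)} < \left(1 - \frac{\lambda^2}{2(n^2 - \lambda n)}\right)^{n} < \left(\frac{1 - \lambda/n}{e^{-\lambda/n}}\right)^{n} < 1.$$

Now we put together all the overestimates on one side, and all the underestimates on the other:

$$-\frac{\lambda^x}{x!}\,e^{-\lambda}\left(\frac{\lambda^2}{2(n - \lambda)} + \frac{\lambda x}{n}\right) \le P\{X_n = x\} - P\{X = x\} \le \frac{\lambda^x}{x!}\,e^{-\lambda}\,\frac{\lambda x}{n - \lambda x}.$$

So, finally, as long as $n \ge 4\lambda^2$, we get

$$\epsilon \le \max_x \frac{\lambda^{x+1}}{x!}\,e^{-\lambda}\left(\frac{\lambda}{n} + \frac{x}{n} + \frac{x}{n\,(1 - x/(2\sqrt{n}))}\right).$$

We need to find the maximum over all possible $x$. If $x < \sqrt{n}$ then this becomes

$$\epsilon \le \max_x \frac{1}{n}\,\frac{\lambda^{x+1}}{x!}\,e^{-\lambda}\,(\lambda + 3x) \le \frac{4\bar\lambda^2}{n\sqrt{2\pi n}}$$

(by a formula known as "Stirling's formula"), where $\bar\lambda = \max\{\lambda, 1\}$.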
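The approximation error $\epsilon$ can also be computed directly for a particular case. The sketch below does this for Example 5.5 ($n = 100$, $\lambda = 5$) and compares it with the quoted figure of about $1.6\,\bar\lambda^2/n^{3/2}$. It is only a numerical check of that one example, not a verification of the bound in general, and the function and variable names are assumptions.

```python
import math

def binomial_pmf(x, n, p):
    """P(X_n = x) for X_n ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

n, lam = 100, 5.0   # the setting of Example 5.5

# Largest absolute difference over every value the binomial can take
max_error = max(
    abs(binomial_pmf(x, n, lam / n) - poisson_pmf(x, lam))
    for x in range(n + 1)
)

lam_bar = max(lam, 1.0)
quoted_bound = 1.6 * lam_bar ** 2 / n ** 1.5

print(round(max_error, 4), round(quoted_bound, 4))
```

For this example the actual maximum error is roughly a tenth of the quoted figure, in line with the remark made earlier in this section.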
More informationQuadrat Sampling in Population Ecology
Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may
More informationMeasures of Central Tendency
Measures of Cetral Tedecy A studet s grade will be determied by exam grades ( each exam couts twice ad there are three exams, HW average (couts oce, fial exam ( couts three times. Fid the average if the
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationChapter 7  Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:
Chapter 7  Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More informationMeasures of Spread and Boxplots Discrete Math, Section 9.4
Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,
More informationApproximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find
1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.
More informationCS103X: Discrete Structures Homework 4 Solutions
CS103X: Discrete Structures Homewor 4 Solutios Due February 22, 2008 Exercise 1 10 poits. Silico Valley questios: a How may possible sixfigure salaries i whole dollar amouts are there that cotai at least
More informationBasic Elements of Arithmetic Sequences and Series
MA40S PRECALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic
More informationOverview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals
Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of
More information1 Computing the Standard Deviation of Sample Means
Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.
More informationThe second difference is the sequence of differences of the first difference sequence, 2
Differece Equatios I differetial equatios, you look for a fuctio that satisfies ad equatio ivolvig derivatives. I differece equatios, istead of a fuctio of a cotiuous variable (such as time), we look for
More information9.8: THE POWER OF A TEST
9.8: The Power of a Test CD91 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More information5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?
5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso
More informationExample Consider the following set of data, showing the number of times a sample of 5 students check their per day:
Sectio 82: Measures of cetral tedecy Whe thikig about questios such as: how may calories do I eat per day? or how much time do I sped talkig per day?, we quickly realize that the aswer will vary from day
More informationCovariance and correlation
Covariace ad correlatio The mea ad sd help us summarize a buch of umbers which are measuremets of just oe thig. A fudametal ad totally differet questio is how oe thig relates to aother. Stat 0: Quatitative
More informationHypergeometric Distributions
7.4 Hypergeometric Distributios Whe choosig the startig lieup for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you
More informationThe Stable Marriage Problem
The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,
More information15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011
15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes highdefiitio
More informationCS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations
CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad
More informationDescriptive Statistics
Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote
More informationSpss Lab 7: Ttests Section 1
Spss Lab 7: Ttests Sectio I this lab, we will be usig everythig we have leared i our text ad applyig that iformatio to uderstad ttests for parametric ad oparametric data. THERE WILL BE TWO SECTIONS FOR
More informationChapter 7: Confidence Interval and Sample Size
Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum
More informationCHAPTER 11 Financial mathematics
CHAPTER 11 Fiacial mathematics I this chapter you will: Calculate iterest usig the simple iterest formula ( ) Use the simple iterest formula to calculate the pricipal (P) Use the simple iterest formula
More informationStandard Errors and Confidence Intervals
Stadard Errors ad Cofidece Itervals Itroductio I the documet Data Descriptio, Populatios ad the Normal Distributio a sample had bee obtaied from the populatio of heights of 5yearold boys. If we assume
More informationConfidence intervals and hypothesis tests
Chapter 2 Cofidece itervals ad hypothesis tests This chapter focuses o how to draw coclusios about populatios from sample data. We ll start by lookig at biary data (e.g., pollig), ad lear how to estimate
More informationA Mathematical Perspective on Gambling
A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal
More information3. Covariance and Correlation
Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More information3 Basic Definitions of Probability Theory
3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio
More informationMannWhitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)
NoParametric ivariate Statistics: WilcoxoMaWhitey 2 Sample Test 1 MaWhitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo) MaWhitey (WMW) test is the oparametric equivalet of a pooled
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 007 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a uow mea µ = E(X) of a distributio by
More informationCHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should
More informationNotes on Hypothesis Testing
Probability & Statistics Grishpa Notes o Hypothesis Testig A radom sample X = X 1,..., X is observed, with joit pmf/pdf f θ x 1,..., x. The values x = x 1,..., x of X lie i some sample space X. The parameter
More informationAnalyzing Longitudinal Data from Complex Surveys Using SUDAAN
Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
More information.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth
Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,
More informationStat 104 Lecture 2. Variables and their distributions. DJIA: monthly % change, 2000 to Finding the center of a distribution. Median.
Stat 04 Lecture Statistics 04 Lecture (IPS. &.) Outlie for today Variables ad their distributios Fidig the ceter Measurig the spread Effects of a liear trasformatio Variables ad their distributios Variable:
More information1 Correlation and Regression Analysis
1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio
More informationA probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
More informationSampling Distribution And Central Limit Theorem
() Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,
More informationPractice Problems for Test 3
Practice Problems for Test 3 Note: these problems oly cover CIs ad hypothesis testig You are also resposible for kowig the samplig distributio of the sample meas, ad the Cetral Limit Theorem Review all
More informationDivide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015
CS125 Lecture 4 Fall 2015 Divide ad Coquer We have see oe geeral paradigm for fidig algorithms: the greedy approach. We ow cosider aother geeral paradigm, kow as divide ad coquer. We have already see a
More informationRecursion and Recurrences
Chapter 5 Recursio ad Recurreces 5.1 Growth Rates of Solutios to Recurreces Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer. Cosider, for example,
More informationStatistical Inference: Hypothesis Testing for Single Populations
Chapter 9 Statistical Iferece: Hypothesis Testig for Sigle Populatios A foremost statistical mechaism for decisio makig is the hypothesis test. The cocept of hypothesis testig lies at the heart of iferetial
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationUSING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR
USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR Objective:. Improve calculator skills eeded i a multiple choice statistical eamiatio where the eam allows the studet to use a scietific calculator..
More informationReview for Test 3. b. Construct the 90% and 95% confidence intervals for the population mean. Interpret the CIs.
Review for Test 3 1 From a radom sample of 36 days i a recet year, the closig stock prices of Hasbro had a mea of $1931 From past studies we kow that the populatio stadard deviatio is $237 a Should you
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationTHE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n
We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample
More information