# The Poisson Distribution


Lecture 5: The Poisson Distribution

## 5.1 Introduction

Example 5.1: Drownings in Malta

The book [Mou98] cites data from the St. Luke's Hospital Gazette, on the monthly number of drownings on Malta, over a period of nearly 30 years (355 consecutive months). Most months there were no drownings. Some months there was one person who drowned. One month had four people drown. The data are given as counts of the number of months in which a given number of drownings occurred, and we repeat them here as Table 5.1.

Table 5.1: Monthly counts of drownings in Malta. Columns: No. of drowning deaths per month; Frequency (No. months observed). [Table entries not preserved.]

Looking at the data in Table 5.1, we might suppose that one of the following hypotheses is true:

- Some months are particularly dangerous;
- Or, on the contrary, when one person has drowned, the surrounding publicity makes others more cautious for a while, preventing drownings;
- Or, drownings are simply independent events.

How can we use the data to decide which of these hypotheses is true? We might reasonably suppose that the first hypothesis would predict that there would be more months with high numbers of drownings than the independence hypothesis; the second hypothesis would predict fewer months with high numbers of drownings. The problem is, we don't know how many we should expect, if independence is correct.

What we need is a model: a sensible probability distribution, giving the probability of a month having a certain number of drownings, under the independence assumption. The standard model for this sort of situation is called the Poisson distribution.

The Poisson distribution is used in situations when we observe the counts of events within a set unit of time, area, volume, length etc. For example,

- The number of cases of a disease in different towns;
- The number of mutations in given regions of a chromosome;
- The number of dolphin pod sightings along a flight path through a region;
- The number of particles emitted by a radioactive source in a given time;
- The number of births per hour during a given day.

In such situations we are often interested in whether the events occur randomly in time or space. Consider the Babyboom dataset (Table 1.2), that we saw in Lecture 1. The birth times of the babies throughout the day are shown in Figure 5.1(a). If we divide up the day into 24 one-hour intervals and count the number of births in each hour we can plot the counts as a histogram in Figure 5.1(b). How does this compare to the histogram of counts for a process that isn't random? Suppose the 44 birth times were distributed in time as shown in Figure 5.1(c). The histogram of these birth times per hour is shown in Figure 5.1(d). We see that the non-random clustering of events in time causes there to be more hours with zero births and more hours with large numbers of births than the real birth times histogram.

This example illustrates that the distribution of counts is useful in uncovering whether the events might occur randomly or non-randomly in time (or space). Simply looking at the histogram isn't sufficient if we want to ask the question whether the events occur randomly or not. To answer this question we need a probability model for the distribution of counts of random events that dictates the type of distributions we should expect to see.

## 5.2 The Poisson Distribution

The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a given interval of time (or space). If we let X = the number of events in a given interval, then, if the mean number of events per interval is λ, the probability of observing x events in a given interval is given by

$$P(X = x) = e^{-\lambda}\,\frac{\lambda^{x}}{x!}, \qquad x = 0, 1, 2, 3, 4, \ldots$$

Note: e is a mathematical constant, e ≈ 2.718. There should be a button on your calculator, $e^x$, that calculates powers of e.

If the probabilities of X are distributed in this way, we write

$$X \sim \text{Po}(\lambda)$$

λ is the parameter of the distribution. We say X follows a Poisson distribution with parameter λ.
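As a minimal sketch, the probability formula of Section 5.2 can be written as a small Python function. The name `poisson_pmf` is an illustrative choice and does not come from the lecture notes; only the standard library is assumed.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam), i.e. exp(-lam) * lam**x / x!."""
    if x < 0:
        return 0.0  # counts cannot be negative
    return math.exp(-lam) * lam ** x / math.factorial(x)


# The probabilities over x = 0, 1, 2, ... should add up to 1.
# Summing the first 100 terms is enough for a small mean such as 3.
total = sum(poisson_pmf(x, 3.0) for x in range(100))
print(total)
```

For example, `poisson_pmf(0, 3.0)` is simply `math.exp(-3.0)`, the probability of seeing no events when the mean is 3.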

Figure 5.1: Representing the Babyboom data set (upper two panels) and a nonrandom hypothetical collection of birth times (lower two panels). (a) Babyboom data birth times (birth time in minutes since midnight); (b) histogram of Babyboom birth times (frequency against number of births per hour); (c) nonrandom birth times; (d) histogram of nonrandom birth times.

Note: A Poisson random variable can take on any non-negative integer value (0, 1, 2, ...). In contrast, the Binomial distribution always has a finite upper limit.

Example 5.2: Hospital births

Births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability of observing 4 births in a given hour at the hospital?

Let X = No. of births in a given hour.

(i) Events occur randomly
(ii) Mean rate λ = 1.8

so $X \sim \text{Po}(1.8)$.

We can now use the formula to calculate the probability of observing exactly 4 births in a given hour:

$$P(X = 4) = e^{-1.8}\,\frac{1.8^{4}}{4!} \approx 0.0723$$

What about the probability of observing 2 or more births in a given hour at the hospital? We want $P(X \ge 2) = P(X = 2) + P(X = 3) + \cdots$, i.e. an infinite number of probabilities to calculate, but

$$
\begin{aligned}
P(X \ge 2) &= P(X = 2) + P(X = 3) + \cdots \\
&= 1 - P(X < 2) \\
&= 1 - \bigl(P(X = 0) + P(X = 1)\bigr) \\
&= 1 - \left(e^{-1.8}\,\frac{1.8^{0}}{0!} + e^{-1.8}\,\frac{1.8^{1}}{1!}\right) \\
&\approx 1 - (0.1653 + 0.2975) \\
&\approx 0.537
\end{aligned}
$$
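The two calculations in Example 5.2 can be reproduced in a few lines of Python. This is only a sketch: the helper `poisson_pmf` and the variable names are assumptions made for illustration.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 1.8  # mean number of births per hour

# Exactly 4 births in an hour.
p_exactly_4 = poisson_pmf(4, lam)

# At least 2 births: use the complement, 1 - (P(X = 0) + P(X = 1)),
# instead of summing infinitely many terms.
p_at_least_2 = 1.0 - (poisson_pmf(0, lam) + poisson_pmf(1, lam))

print(round(p_exactly_4, 4), round(p_at_least_2, 4))
```

The complement trick is the important design choice here: it turns an infinite sum into two terms.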

Example 5.3: Disease incidence

Suppose there is a disease, whose average incidence is 2 per million people. What is the probability that a city of 1 million people has at least twice the average incidence?

Twice the average incidence would be 4 cases. We can reasonably suppose the random variable X = # cases in 1 million people has Poisson distribution with parameter 2. Then

$$P(X \ge 4) = 1 - P(X \le 3) = 1 - \left(e^{-2}\,\frac{2^{0}}{0!} + e^{-2}\,\frac{2^{1}}{1!} + e^{-2}\,\frac{2^{2}}{2!} + e^{-2}\,\frac{2^{3}}{3!}\right) \approx 0.143.$$

## 5.3 The shape of the Poisson distribution

Using the formula we can calculate the probabilities for a specific Poisson distribution and plot the probabilities to observe the shape of the distribution. For example, Figure 5.2 shows 3 different Poisson distributions. We observe that the distributions

(i) are unimodal;
(ii) exhibit positive skew (that decreases as λ increases);
(iii) are centred roughly on λ;
(iv) have variance (spread) that increases as λ increases.

## 5.4 Mean and Variance of the Poisson distribution

In general, there is a formula for the mean of a Poisson distribution. There is also a formula for the standard deviation, σ, and variance, σ². If $X \sim \text{Po}(\lambda)$ then

$$\mu = \lambda, \qquad \sigma = \sqrt{\lambda}, \qquad \sigma^{2} = \lambda.$$
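The mean and variance formulas can be checked numerically by summing over the probability distribution directly. The sketch below is not a proof; the function names and the truncation point `x_max=150` are assumptions chosen so that the neglected tail is negligible for the three distributions of Figure 5.2.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def mean_and_variance(lam, x_max=150):
    """Mean and variance of Po(lam), computed by direct summation over x = 0..x_max."""
    xs = range(x_max + 1)
    mean = sum(x * poisson_pmf(x, lam) for x in xs)
    variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)
    return mean, variance


for lam in (3.0, 5.0, 10.0):
    mean, variance = mean_and_variance(lam)
    # Both values should agree with lam; the standard deviation is sqrt(lam).
    print(lam, round(mean, 6), round(variance, 6), round(math.sqrt(variance), 4))
```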

Figure 5.2: Three different Poisson distributions: Po(3), Po(5) and Po(10), each plotted as P(X) against X.

## 5.5 Changing the size of the interval

Suppose we know that births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability that we observe 5 births in a given 2 hour interval?

Well, if births occur randomly at a rate of 1.8 births per 1 hour interval, then births occur randomly at a rate of 3.6 births per 2 hour interval.

Let Y = No. of births in a 2 hour period. Then $Y \sim \text{Po}(3.6)$ and

$$P(Y = 5) = e^{-3.6}\,\frac{3.6^{5}}{5!} \approx 0.138$$

This example illustrates the following rule:

If $X \sim \text{Po}(\lambda)$ on 1 unit interval, then $Y \sim \text{Po}(k\lambda)$ on k unit intervals.
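The interval rule amounts to multiplying the rate by the length of the interval before applying the formula. A short sketch, with the hypothetical helper names `scaled_rate` and `poisson_pmf`:

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def scaled_rate(rate_per_unit, k):
    """Poisson mean for an interval k units long: Po(lam) becomes Po(k * lam)."""
    return k * rate_per_unit


lam_two_hours = scaled_rate(1.8, 2)            # 3.6 births per 2 hour interval
p_five_births = poisson_pmf(5, lam_two_hours)  # P(Y = 5) for Y ~ Po(3.6)
print(round(p_five_births, 4))
```

The same function handles fractional intervals, for instance `scaled_rate(1.8, 0.5)` for a half-hour window.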

## 5.6 Sum of two Poisson variables

Now suppose we know that in hospital A births occur randomly at an average rate of 2.3 births per hour and in hospital B births occur randomly at an average rate of 3.1 births per hour. What is the probability that we observe 7 births in total from the two hospitals in a given 1 hour period?

To answer this question we can use the following rule:

If $X \sim \text{Po}(\lambda_1)$ on 1 unit interval, and $Y \sim \text{Po}(\lambda_2)$ on 1 unit interval, and X and Y are independent, then $X + Y \sim \text{Po}(\lambda_1 + \lambda_2)$ on 1 unit interval.

So if we let X = No. of births in a given hour at hospital A and Y = No. of births in a given hour at hospital B, then $X \sim \text{Po}(2.3)$, $Y \sim \text{Po}(3.1)$ and $X + Y \sim \text{Po}(5.4)$, so

$$P(X + Y = 7) = e^{-5.4}\,\frac{5.4^{7}}{7!} \approx 0.120$$

Example 5.4: Disease incidence, continued

Suppose disease A occurs with incidence 1.7 per million, and disease B occurs with incidence 2.9 per million. Statistics are compiled, in which these diseases are not distinguished, but simply are all called cases of disease "AB". What is the probability that a city of 1 million people has at least 6 cases of AB?

If Z = # cases of AB, then $Z \sim \text{Po}(4.6)$. Thus,

$$P(Z \ge 6) = 1 - P(Z \le 5) = 1 - e^{-4.6}\left(\frac{4.6^{0}}{0!} + \frac{4.6^{1}}{1!} + \frac{4.6^{2}}{2!} + \frac{4.6^{3}}{3!} + \frac{4.6^{4}}{4!} + \frac{4.6^{5}}{5!}\right) \approx 0.314.$$
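One way to see why the sum rule is plausible is to compute P(X + Y = 7) twice: once from Po(5.4) directly, and once by adding up every way of splitting 7 births between the two hospitals, treating the hospitals as independent. The sketch below does both and also evaluates Example 5.4; all names are illustrative assumptions.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam_a, lam_b = 2.3, 3.1  # births per hour at hospitals A and B

# Direct use of the rule: X + Y ~ Po(lam_a + lam_b).
direct = poisson_pmf(7, lam_a + lam_b)

# Convolution: k births at A and 7 - k at B, for k = 0..7 (independence assumed).
convolved = sum(poisson_pmf(k, lam_a) * poisson_pmf(7 - k, lam_b) for k in range(8))

# Example 5.4: at least 6 cases of "AB" when Z ~ Po(1.7 + 2.9).
p_at_least_6 = 1.0 - sum(poisson_pmf(z, 1.7 + 2.9) for z in range(6))

print(round(direct, 5), round(convolved, 5), round(p_at_least_6, 3))
```

The two routes to P(X + Y = 7) agree up to floating-point rounding.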

## 5.7 Fitting a Poisson distribution

Consider the two sequences of birth times we saw in Section 5.1. Both of these examples consisted of a total of 44 births in 24 one-hour intervals. Therefore the mean birth rate for both sequences is 44/24 ≈ 1.83 births per hour.

What would be the expected counts if birth times were really random, i.e. what is the expected histogram for a Poisson random variable with mean rate λ = 44/24 ≈ 1.83?

Using the Poisson formula we can calculate the probabilities of obtaining each possible value of x (in practice we group values with low probability into one category).

[Table: x against P(X = x); entries not preserved.]

Then if we observe 24 one-hour intervals we can calculate the expected frequencies as 24 × P(X = x) for each value of x.

[Table: x against expected frequency 24 × P(X = x); entries not preserved.]

We say we have fitted a Poisson distribution to the data. This consisted of 3 steps:

(i) Estimating the parameters of the distribution from the data
(ii) Calculating the probability distribution
(iii) Multiplying the probability distribution by the number of observations

Once we have fitted a distribution to the data we can compare the expected frequencies to those we actually observed from the real Babyboom dataset. We see that the agreement is quite good.

[Table: x, expected frequency, observed frequency; entries not preserved.]
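The three fitting steps can be sketched in Python as follows. Only the totals stated in the text (44 births in 24 hourly intervals) are used; the function name `fit_poisson`, the cut-off `x_max`, and the way the leftover probability is lumped into a final category are assumptions made for this illustration.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def fit_poisson(total_events, n_intervals, x_max):
    """Return (lam, expected frequencies for x = 0..x_max, expected frequency for x > x_max)."""
    # Step (i): estimate the parameter as the mean count per interval.
    lam = total_events / n_intervals
    # Step (ii): calculate the probability distribution.
    probs = [poisson_pmf(x, lam) for x in range(x_max + 1)]
    # Step (iii): multiply by the number of observations.
    expected = [n_intervals * p for p in probs]
    # Group all remaining low-probability values into one category.
    expected_tail = n_intervals * (1.0 - sum(probs))
    return lam, expected, expected_tail


lam_hat, expected, expected_tail = fit_poisson(44, 24, x_max=5)
for x, freq in enumerate(expected):
    print(x, round(freq, 2))
print("6 or more", round(expected_tail, 2))
```

The expected frequencies, including the grouped category, add up to the 24 observed intervals, which is a quick sanity check on any fit of this kind.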

When we compare the expected frequencies to those observed from the nonrandom clustered sequence in Section 5.1 we see that there is much less agreement.

[Table: x, expected frequency, observed frequency; entries not preserved.]

In Lecture 9 we will see how we can formally test for a difference between the expected and observed counts. For now it is enough just to know how to fit a distribution.

## 5.8 Using the Poisson to approximate the Binomial

The Binomial and Poisson distributions are both discrete probability distributions. In some circumstances the distributions are very similar. For example, consider the Bi(100, 0.02) and Po(2) distributions shown in Figure 5.3. Visually these distributions are almost identical.

In general, if n is large (say > 50) and p is small (say < 0.1) then a Bi(n, p) can be approximated with a Po(λ) where λ = np.

Example 5.5: Counting lefties

Given that 5% of a population are left-handed, use the Poisson distribution to estimate the probability that a random sample of 100 people contains 2 or more left-handed people.

X = No. of left-handed people in a sample of 100, so $X \sim \text{Bi}(100, 0.05)$.

Poisson approximation: $X \sim \text{Po}(\lambda)$ with λ = 100 × 0.05 = 5.
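The quality of the approximation can be examined numerically. The sketch below compares Bi(100, 0.02) with Po(2) value by value, then evaluates the probability asked for in Example 5.5 under both the Poisson approximation and the exact Binomial. The helper names are assumptions, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bi(n, p)."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Largest pointwise difference between Bi(100, 0.02) and Po(2).
max_gap = max(abs(binomial_pmf(x, 100, 0.02) - poisson_pmf(x, 2.0)) for x in range(101))

# Example 5.5: P(X >= 2) = 1 - P(X = 0) - P(X = 1), computed both ways.
n, p = 100, 0.05
p_poisson = 1.0 - (poisson_pmf(0, n * p) + poisson_pmf(1, n * p))
p_binomial = 1.0 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

print(round(max_gap, 4), round(p_poisson, 4), round(p_binomial, 4))
```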

Figure 5.3: A Binomial and a Poisson distribution that are very similar: Bi(100, 0.02) and Po(2), each plotted as P(X) against X.

We want $P(X \ge 2)$:

$$
\begin{aligned}
P(X \ge 2) &= 1 - P(X < 2) \\
&= 1 - \bigl(P(X = 0) + P(X = 1)\bigr) \\
&\approx 1 - \left(e^{-5}\,\frac{5^{0}}{0!} + e^{-5}\,\frac{5^{1}}{1!}\right) \\
&\approx 0.9596
\end{aligned}
$$

If we use the exact Binomial distribution we get the answer 0.9629.

The idea of using one distribution to approximate another is widespread throughout statistics and one we will meet again. Why would we use an approximate distribution when we actually know the exact distribution?

- The exact distribution may be hard to work with.
- The exact distribution may have too much detail. There may be some features of the exact distribution that are irrelevant to the questions [...]

Table 5.2: Monthly counts of drownings in Malta, with Poisson fit. Columns: No. of drowning deaths per month; Frequency (No. months observed); Expected frequency (Poisson fit); Probability. [Table entries and the fitted value of λ not preserved.]

[...] the independence assumption and hence the Poisson model is true. This is computed by multiplying the last column by 355. After all, if the probability of no drownings in any given month is 0.625, and we have 355 months of observations, we expect 0.625 × 355 ≈ 221.9 months with 0 drownings. We see that the observations (in the second column) are pretty close to the predictions of the Poisson model (in the third column), so the data do not give us strong evidence to reject the neutral assumption, that drownings are independent of one another, and have a constant rate in time. In Lecture 9 we will describe one way of testing this hypothesis formally.

Example 5.7: Swine flu vaccination

In 1976, fear of an impending swine flu pandemic led to a mass vaccination campaign in the US. The pandemic never materialised, but there were concerns that the vaccination may have led to an increase in a rare and serious neurological disease, Guillain-Barré Syndrome (GBS). It was difficult to determine whether the vaccine was really at fault, since GBS may arise spontaneously (about 1 person in 100,000 develops GBS in a given year) and the number of cases was small.

Consider the following data from the US state of Michigan: Out of 9 million residents, about 2.3 million were vaccinated. Of those, 48 developed GBS between July 1976 and June 1977. We might have expected

$$2.3 \text{ million} \times 10^{-5} \text{ cases/person-year} = 23 \text{ cases}.$$

How likely is it that, purely by chance, this population would have experienced 48 cases in a single year? If Y is the number of cases, it would then have Poisson distribution with parameter 23, so that

$$P(Y \ge 48) = 1 - \sum_{i=0}^{47} e^{-23}\,\frac{23^{i}}{i!}.$$

So, such an extreme number of cases is likely to happen less than 1 year in 100,000. Does this prove that the vaccine caused GBS? The people who had the vaccine are people who chose to be vaccinated. They may differ from the rest of the population in multiple ways in addition to the elementary fact of having been vaccinated, and some of those ways may have predisposed them to GBS. What can we do?

The paper [BH84] takes the following approach: If the vaccine were not the cause of the GBS cases, we would expect no connection between the timing of the vaccine and the onset of GBS. In fact, though, there seemed to be a particularly large number of cases in the six weeks following vaccination. Can we say that this was more than could reasonably be expected by chance?

The data are given in Table 5.3. Each of the 40 GBS cases was assigned a time, which is the number of weeks after vaccination when the disease was diagnosed. (Thus week 1 is a different calendar week for each subject.) If the cases are evenly distributed, the number in a given week should be Poisson distributed with parameter 40/30 ≈ 1.33. Using this parameter, we compute the probabilities of 0, 1, 2, ... cases in a week, which we give in row 3 of Table 5.3. Multiplying these numbers by 30 gives the expected frequencies in row 4 of the table.

It is clear that the observed and expected frequencies are very different. One way of seeing this is to consider the standard deviation. The Poisson distribution has SD $\sqrt{1.33} \approx 1.15$ (as discussed in section 5.4), while the data have SD

$$s = \sqrt{\frac{\ldots\,(0 - 1.33)^2 + 7\,(1 - 1.33)^2 + 3\,(2 - 1.33)^2 + \ldots\,(4 - 1.33)^2 + 1\,(9 - 1.33)^2 + 1\,(\ldots - 1.33)^2}{\ldots}}$$

where the entries shown as $\ldots$, and the resulting value of s, did not survive transcription.

Table 5.3: Cases of GBS, by weeks after vaccination. Rows: # cases per week; observed frequency; probability; expected frequency. [Table entries not preserved.]

## Derivation of the Poisson distribution (non-examinable)

This section is not officially part of the course, but is optional, for those who are interested in more mathematical detail. Where does the formula in section 5.2 come from?

Think of the Poisson distribution as in section 5.8, as an approximation to a binomial distribution. Let X be the (random) number of successes in a collection of independent random trials, where the expected number of successes is λ. This will, of course, depend on the number of trials, but we show that when the number of trials (call it n) gets large, the exact number of trials doesn't matter. In mathematical language, we say that the probability converges to a limit as n goes to infinity. But how large is "large"? We would like to know how good the approximation is, for real values of n, of the sort that we are interested in.

Let $X_n$ be the random number of successes in n independent trials, where the probability of each success is λ/n. Thus, the probability of success goes down as the number of trials goes up, and the expected number of successes is always the same λ. Then

$$P\{X_n = x\} = \binom{n}{x}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{n - x}.$$
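The binomial formula above can be evaluated for increasing n to watch the limit emerge numerically. The sketch below fixes λ = 2 and x = 3 (arbitrary illustrative choices, not values from the notes) and compares each binomial probability with the Poisson value; `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(x, n, p):
    """P{X_n = x} for n independent trials with success probability p."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Limiting value exp(-lam) * lam**x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam, x = 2.0, 3
limit = poisson_pmf(x, lam)

# Success probability lam / n shrinks as n grows, so the expected count stays at lam.
trial_counts = (10, 100, 1000, 10000)
gaps = [abs(binomial_pmf(x, n, lam / n) - limit) for n in trial_counts]

for n, gap in zip(trial_counts, gaps):
    print(n, gap)
```

Each tenfold increase in n shrinks the gap by roughly a factor of ten for these values, which is consistent with an error of order 1/n.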

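Now, those of you who have learned some calculus at A-levels may remember the Taylor series for $e^z$:

$$e^{z} = 1 + z + \frac{z^{2}}{2!} + \frac{z^{3}}{3!} + \cdots.$$

In particular, for small z we have $e^{-z} \approx 1 - z$, and the difference (or error in the approximation) is no bigger than $z^2/2$. The key idea is that if z is very small (as it is when z = λ/n, and n is large), then $z^2$ is a lot smaller than z.

Using a bit of algebra, we have

$$
\begin{aligned}
P\{X_n = x\} &= \binom{n}{x}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{n - x} \\
&= \frac{n(n-1)\cdots(n-x+1)}{x!}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{-x}\left(1 - \frac{\lambda}{n}\right)^{n} \\
&= \frac{\lambda^{x}}{x!}\,(1)\left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right)\left(1 - \frac{\lambda}{n}\right)^{-x}\left(1 - \frac{\lambda}{n}\right)^{n}.
\end{aligned}
$$

Now, if we're not concerned about the size of the error, we can simply say that n is much bigger than λ or x (because we're thinking of a fixed λ and x, and n getting large). So we have the approximations

$$1 - \frac{x-1}{n} \approx 1; \qquad \left(1 - \frac{\lambda}{n}\right)^{-x} \approx 1; \qquad \left(1 - \frac{\lambda}{n}\right)^{n} \approx \left(e^{-\lambda/n}\right)^{n} = e^{-\lambda}.$$

Thus

$$P\{X_n = x\} \approx \frac{\lambda^{x}}{x!}\,e^{-\lambda}.$$

### Error bounds (very mathematical)

In the long run, $X_n$ has a distribution very close to the Poisson distribution defined in section 5.2. But how long is the long run? Do we need 10 trials? 1000? A billion?

If you just want the answer, it's approximately this: The error that you'll make by taking the Poisson distribution instead of the binomial is no more [...]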

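We also know that

$$1 - \frac{\lambda}{n} < e^{-\lambda/n} < 1 - \frac{\lambda}{n} + \frac{\lambda^{2}}{2n^{2}},$$

which means that

$$1 - \frac{\lambda^{2}}{2(n^{2} - n\lambda)} < \frac{1 - \lambda/n}{e^{-\lambda/n}} < 1,$$

and

$$1 - \frac{\lambda^{2}}{2(n - \lambda)} < \left(\frac{1 - \lambda/n}{e^{-\lambda/n}}\right)^{n} < 1.$$

Now we put together all the overestimates on one side, and all the underestimates on the other, to bound $P\{X_n = x\}$ above and below by multiples of $\frac{\lambda^{x}}{x!}e^{-\lambda}$. [Displayed inequality not recoverable.]

So, finally, as long as $n \ge 4\lambda^{2}$, we get a bound on $\max \left| P\{X_n = x\} - \frac{\lambda^{x}}{x!}e^{-\lambda} \right|$. [Displayed bound not recoverable.] We need to find the maximum over all possible x. [The final displayed bound, which holds for x below a threshold and is obtained by a formula known as "Stirling's formula", is not recoverable; it is stated in terms of $\lambda^{*} = \max\{\lambda, 1\}$.]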
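The question "how long is the long run?" can also be explored empirically rather than through the analytic bound. The sketch below measures the largest gap between the binomial and Poisson probabilities over a range of x for several n. It is not the bound derived in the notes; the function name `max_abs_error`, the choice λ = 2 and the cut-off `x_max=40` are all assumptions for illustration.

```python
import math


def max_abs_error(n, lam, x_max=40):
    """Largest |Bi(n, lam/n) pmf - Po(lam) pmf| over x = 0..min(n, x_max)."""
    p = lam / n
    worst = 0.0
    for x in range(min(n, x_max) + 1):
        binom = math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
        pois = math.exp(-lam) * lam ** x / math.factorial(x)
        worst = max(worst, abs(binom - pois))
    return worst


# 10 trials, 1000 trials, 100000 trials: the error shrinks as n grows.
errors = {n: max_abs_error(n, 2.0) for n in (10, 1000, 100000)}
for n, err in errors.items():
    print(n, err)
```

For this choice of λ the maximum error at n = 1000 is already below 0.001, so the approximation is usable long before a billion trials.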


### Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

### 5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso

### Example Consider the following set of data, showing the number of times a sample of 5 students check their per day:

Sectio 82: Measures of cetral tedecy Whe thikig about questios such as: how may calories do I eat per day? or how much time do I sped talkig per day?, we quickly realize that the aswer will vary from day

### Covariance and correlation

Covariace ad correlatio The mea ad sd help us summarize a buch of umbers which are measuremets of just oe thig. A fudametal ad totally differet questio is how oe thig relates to aother. Stat 0: Quatitative

### Hypergeometric Distributions

7.4 Hypergeometric Distributios Whe choosig the startig lie-up for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you

### The Stable Marriage Problem

The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,

### 15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011

15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes high-defiitio

### CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

### Descriptive Statistics

Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote

### Spss Lab 7: T-tests Section 1

Spss Lab 7: T-tests Sectio I this lab, we will be usig everythig we have leared i our text ad applyig that iformatio to uderstad t-tests for parametric ad oparametric data. THERE WILL BE TWO SECTIONS FOR

### Chapter 7: Confidence Interval and Sample Size

Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

### CHAPTER 11 Financial mathematics

CHAPTER 11 Fiacial mathematics I this chapter you will: Calculate iterest usig the simple iterest formula ( ) Use the simple iterest formula to calculate the pricipal (P) Use the simple iterest formula

### Standard Errors and Confidence Intervals

Stadard Errors ad Cofidece Itervals Itroductio I the documet Data Descriptio, Populatios ad the Normal Distributio a sample had bee obtaied from the populatio of heights of 5-year-old boys. If we assume

### Confidence intervals and hypothesis tests

Chapter 2 Cofidece itervals ad hypothesis tests This chapter focuses o how to draw coclusios about populatios from sample data. We ll start by lookig at biary data (e.g., pollig), ad lear how to estimate

### A Mathematical Perspective on Gambling

A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

### 3. Covariance and Correlation

Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

### Maximum Likelihood Estimators.

Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

### 3 Basic Definitions of Probability Theory

3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

### Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)

No-Parametric ivariate Statistics: Wilcoxo-Ma-Whitey 2 Sample Test 1 Ma-Whitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo-) Ma-Whitey (WMW) test is the o-parametric equivalet of a pooled

### 1 Introduction to reducing variance in Monte Carlo simulations

Copyright c 007 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a uow mea µ = E(X) of a distributio by

### CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

### Output Analysis (2, Chapters 10 &11 Law)

B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

### Notes on Hypothesis Testing

Probability & Statistics Grishpa Notes o Hypothesis Testig A radom sample X = X 1,..., X is observed, with joit pmf/pdf f θ x 1,..., x. The values x = x 1,..., x of X lie i some sample space X. The parameter

### Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

### .04. This means \$1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

### Stat 104 Lecture 2. Variables and their distributions. DJIA: monthly % change, 2000 to Finding the center of a distribution. Median.

Stat 04 Lecture Statistics 04 Lecture (IPS. &.) Outlie for today Variables ad their distributios Fidig the ceter Measurig the spread Effects of a liear trasformatio Variables ad their distributios Variable:

### 1 Correlation and Regression Analysis

1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

### A probabilistic proof of a binomial identity

A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

### Sampling Distribution And Central Limit Theorem

() Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

### Practice Problems for Test 3

Practice Problems for Test 3 Note: these problems oly cover CIs ad hypothesis testig You are also resposible for kowig the samplig distributio of the sample meas, ad the Cetral Limit Theorem Review all

### Divide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015

CS125 Lecture 4 Fall 2015 Divide ad Coquer We have see oe geeral paradigm for fidig algorithms: the greedy approach. We ow cosider aother geeral paradigm, kow as divide ad coquer. We have already see a

### Recursion and Recurrences

Chapter 5 Recursio ad Recurreces 5.1 Growth Rates of Solutios to Recurreces Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer. Cosider, for example,

### Statistical Inference: Hypothesis Testing for Single Populations

Chapter 9 Statistical Iferece: Hypothesis Testig for Sigle Populatios A foremost statistical mechaism for decisio makig is the hypothesis test. The cocept of hypothesis testig lies at the heart of iferetial

### Incremental calculation of weighted mean and variance

Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

### USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR

USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR Objective:. Improve calculator skills eeded i a multiple choice statistical eamiatio where the eam allows the studet to use a scietific calculator..

### Review for Test 3. b. Construct the 90% and 95% confidence intervals for the population mean. Interpret the CIs.

Review for Test 3 1 From a radom sample of 36 days i a recet year, the closig stock prices of Hasbro had a mea of \$1931 From past studies we kow that the populatio stadard deviatio is \$237 a Should you