
2.1.5 Gaussian distribution as a limit of the Poisson distribution

A limiting form of the Poisson distribution (and many others; see the Central Limit Theorem below) is the Gaussian distribution. In deriving the Poisson distribution we took the limit of the total number of events $N \to \infty$; we now take the limit that the mean value is very large. Let's write the Poisson distribution as

$$P_n = \frac{\lambda^n e^{-\lambda}}{n!}. \tag{45}$$

Now let $x = n = \lambda(1+\delta)$ where $\lambda \gg 1$ and $\delta \ll 1$. Since $\langle n \rangle = \lambda$, this means that we will also be concerned with large values of $n$, in which case the discrete $P_n$ goes over to a continuous pdf in the variable $x$. Using Stirling's formula for $n!$:

$$x! \to \sqrt{2\pi x}\, e^{-x} x^{x} \quad \text{as } x \to \infty \tag{46}$$

we find (see footnote 7)

$$p(x) = \frac{\lambda^{\lambda(1+\delta)}\, e^{-\lambda}}{\sqrt{2\pi}\, e^{-\lambda(1+\delta)}\, [\lambda(1+\delta)]^{\lambda(1+\delta)+1/2}} = \frac{e^{\lambda\delta}\,(1+\delta)^{-\lambda(1+\delta)-1/2}}{\sqrt{2\pi\lambda}} = \frac{e^{-\lambda\delta^2/2}}{\sqrt{2\pi\lambda}}. \tag{48}$$

Footnote 7 (Maths Notes): The limit of a function like $(1+\delta)^{\lambda(1+\delta)+1/2}$ with $\lambda \gg 1$ and $\delta \ll 1$ can be found by taking the natural log, then expanding in $\delta$ to second order and using $\lambda \gg 1$:

$$\ln\left[(1+\delta)^{\lambda(1+\delta)+1/2}\right] = [\lambda(1+\delta)+1/2]\,\ln(1+\delta) = (\lambda + 1/2 + \lambda\delta)\,(\delta - \delta^2/2 + O(\delta^3)) \simeq \lambda\delta + \lambda\delta^2/2 + O(\delta^3). \tag{47}$$

Substituting back for $x$, with $\delta = (x-\lambda)/\lambda$, yields

$$p(x) = \frac{e^{-(x-\lambda)^2/(2\lambda)}}{\sqrt{2\pi\lambda}}. \tag{49}$$

This is a Gaussian, or Normal (see footnote 8), distribution with mean and variance of $\lambda$. The Gaussian distribution is the most important distribution in probability, due to its role in the Central Limit Theorem, which loosely says that the sum of a large number of independent quantities tends to have a Gaussian form, independent of the pdf of the individual measurements. The above specific derivation is somewhat cumbersome, and it will actually be more elegant to use the Central Limit Theorem to derive the Gaussian approximation to the Poisson distribution.

Footnote 8: The name "Normal" was given to this distribution by the statistician K. Pearson, who almost immediately regretted introducing the name. It is also sometimes called the Bell-curve.

2.1.6 More on the Gaussian

The Gaussian distribution is so important that we collect some properties here. It is normally written as

$$p(x) = \frac{1}{(2\pi)^{1/2}\,\sigma}\, e^{-(x-\mu)^2/2\sigma^2}, \tag{50}$$

so that $\mu$ is the mean and $\sigma$ the standard deviation. The first statement is obvious: consider $\langle x - \mu \rangle$, which must vanish by symmetry since it involves integration of an odd function. To prove the second statement, write

$$\langle (x-\mu)^2 \rangle = \frac{1}{(2\pi)^{1/2}\,\sigma}\, \sigma^3 \int_{-\infty}^{\infty} y^2 e^{-y^2/2}\, dy \tag{51}$$

and do the integral by parts. Proving that the distribution is correctly normalized is harder, but there is a clever trick, which is to extend to a two-dimensional Gaussian for two independent (zero-mean) variables $x$ and $y$:

$$p(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2}. \tag{52}$$

The integral over both variables can now be rewritten using polar coordinates:

$$\iint p(x, y)\, dx\, dy = \int p(x, y)\, 2\pi r\, dr = \frac{1}{2\pi\sigma^2} \int 2\pi r\, e^{-r^2/2\sigma^2}\, dr \tag{53}$$

and the final expression clearly integrates to

$$P(r > R) = \exp\left(-R^2/2\sigma^2\right), \tag{54}$$

so the distribution is indeed correctly normalized.

Unfortunately, the single Gaussian distribution has no analytic expression for its integral, even though this is often of great interest. As seen in the next section, we often want to know the probability of a Gaussian variable lying above some value, so we need to know the integral of the Gaussian; there are two common notations for this (assuming zero mean):

$$P(x < X\sigma) = \Phi(X); \tag{55}$$

$$P(x < X\sigma) = \frac{1}{2}\left[1 + \mathrm{erf}\left(X/\sqrt{2}\right)\right], \tag{56}$$

where the error function is defined as

$$\mathrm{erf}(y) \equiv \frac{2}{\sqrt{\pi}} \int_0^y e^{-t^2}\, dt. \tag{57}$$

A useful approximation for the integral in the limit of high $X$ is

$$P(x > X\sigma) \simeq \frac{e^{-X^2/2}}{(2\pi)^{1/2}\, X} \tag{58}$$

(which is derived by a Taylor series: $e^{-(x+\epsilon)^2/2} \simeq e^{-x^2/2}\, e^{-x\epsilon}$, and the linear exponential in $\epsilon$ can be integrated).

2.2 Tails and measures of rareness

So far, we have asked questions about the probability of obtaining a particular experimental outcome, but this is not always a sensible question. Where there are many possible outcomes, the chance of

any given one happening will be very small. This is most obvious with a continuous pdf, $p(x)$: the probability of a result in the range $x = a$ to $x = a + \delta x$ is $p(a)\,\delta x$, so the probability of getting $x = a$ exactly is precisely zero. Thus it only makes sense to ask about the probability of $x$ lying in a given range.

Figure 3: The Gaussian distribution, illustrating the area under various parts of the curve, divided in units of $\sigma$. Thus the chance of being within $1\sigma$ of the mean is 68%; 95% of results are within $2\sigma$ of the mean; 99.7% of results are within $3\sigma$ of the mean.

The most common reason for calculating the probability of a given outcome (apart from betting) is so we can test hypotheses. The lectures will give an in-depth discussion of this issue later, as it can be quite subtle. Nevertheless, it is immediately clear that we want to set possible outcomes to an experiment in some order of rareness, and we will be suspicious of a hypothesis when an experiment generates an outcome that ought to be very rare if the hypothesis were true.

Informally, we need to define a "typical" value for $x$ and some means of deciding if $x$ is "far" from the typical value. For a Gaussian, we would naturally take the mean, $\mu$, as the typical value and $(x - \mu)/\sigma$ as the distance. But how do we make this general? There are two other common measures of location:

The Mode: The value of $x$ where $p(x)$ has a maximum.

The Median: The value of $x$ such that $P(> x) = 0.5$.

For the Gaussian, both these measures are equal to the mean. In general, the median is the safest choice, since it is easy to devise pathological examples where the mode or even the mean is not well defined. Following on from the median, we can define upper and lower quartiles, which together enclose half the probability, i.e. the values $x_1$ and $x_2$, where $P(< x_1) = 0.25$ and $P(> x_2) = 0.25$.

This suggests a measure of rareness of events, which is to single out events that lie in the "tails" of the distribution, where either $P(< x) \ll 1$ or $P(> x) \ll 1$. This can be done in a "1-tailed" or a "2-tailed" manner, depending on whether we choose to count only high excursions, or to be impressed by deviations in either direction.

The area under the tail of a pdf is called a p value, to emphasise that we have to be careful with meaning. If we get, say, $p = 0.05$ this means that there is a probability of 0.05 to get a value as extreme as this one, or worse, on a given hypothesis. So we need to have a hypothesis in mind

to start with; this is called the null hypothesis, and it would typically be something non-committal, such as "there is no signal". If we get a p value that is small, this is some evidence against the null hypothesis, in which case we can claim to have detected a signal. Small values of $p$ are described as giving significant evidence against the null hypothesis: for $p = 0.05$ we would say "this result is significant at the 5% level".

For completeness, we should mention the term confidence level, which is complementary to the significance level. If $P(< x_1) = 0.05$ and $P(> x_2) = 0.05$, then we would say that, on the null hypothesis, $x_1 < x < x_2$ at 90% confidence, or $x < x_2$ at 95% confidence.

As shown later in the course, the p value is not the same as the probability that the null hypothesis is correct, although many people think this is the case. Nevertheless, when $p$ is small, you are on good grounds in disbelieving the null hypothesis.

Some of the p values corresponding to particular places in the Gaussian are listed in Table 1. The weakest evidence would be a $2\sigma$ result, which happens by chance about 5% of the time. This is hardly definitive, although it may provoke further work. But a $3\sigma$ result is much less probable. If you decide to reject the null hypothesis whenever $x - \mu > 3\sigma$, you will be wrong only one time in ... Nevertheless, discovery in some areas of science can insist on stronger evidence, perhaps at the $5\sigma$ level (1-sided $p = 2.9 \times 10^{-7}$). This is partly because $\sigma$ may not itself be known precisely, but also because many different experiments can be performed: if we search for the Higgs Boson in 100 different independent mass ranges, we are bound to find a result that is significant at about the 1% level.

Table 1: Tails of the Gaussian. Columns: $x/\sigma$, 1-tail $p$, 2-tail $p$.

Obviously, this process is not perfect. If we make an observation and get $x - \mu \ll \sigma$, this actually favours a narrower distribution than the standard one, but broader distributions are easier to detect, because the probability of an extreme deviation falls exponentially.
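The tail areas discussed above follow directly from the error-function form of equation (56). The sketch below is illustrative only: the function names `one_tail_p`, `two_tail_p` and `high_x_approx` are invented here, and the choice of $1\sigma$ to $5\sigma$ rows is an assumption rather than a reproduction of Table 1. It uses the standard-library complementary error function and compares the exact 1-tail value with the high-$X$ approximation of equation (58).

```python
import math

def one_tail_p(x_sigma):
    """P(x > X*sigma) for a zero-mean Gaussian, from eq. (56): 1 - Phi(X)."""
    return 0.5 * math.erfc(x_sigma / math.sqrt(2.0))

def two_tail_p(x_sigma):
    """P(|x| > X*sigma): twice the one-tail area, by symmetry."""
    return math.erfc(x_sigma / math.sqrt(2.0))

def high_x_approx(x_sigma):
    """Asymptotic form of the one-tail area, eq. (58)."""
    return math.exp(-0.5 * x_sigma**2) / (math.sqrt(2.0 * math.pi) * x_sigma)

# Illustrative rows (the range 1..5 sigma is an assumption, not the source table).
for X in (1, 2, 3, 4, 5):
    print(f"{X} sigma: 1-tail {one_tail_p(X):.3e}  "
          f"2-tail {two_tail_p(X):.3e}  approx {high_x_approx(X):.3e}")
```

The approximation overestimates the tail slightly but improves as $X$ grows, which is the regime where it is meant to be used.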
2.3 The likelihood

Although we have argued that the probability of a continuous variable having exactly some given value is zero, the relative probability of having any two values, $p(x = a)/p(x = b)$, is well defined. This can be extended to the idea of relative probabilities for larger datasets, where $n$ drawings are made from the pdf. We can approach this using the multinomial distribution, where we extend the binomial to something with more than two possible outcomes: e.g. we toss a six-sided die seven times, and ask what is the probability of getting three ones, two twos, a five and a six? Number the possible results of each trial, and say each occurs $n_1$ times, $n_2$ times etc., with probabilities $p_1$, $p_2$ etc., out of $N$ trials. Imagine first the case where the trials give a string of 1's, followed by 2's etc.

The probability of this happening is $p_1^{n_1} p_2^{n_2} p_3^{n_3} \cdots$. If we don't care which trials give these particular numbers, then we need to multiply by the number of ways in which such an outcome could arise. Imagine choosing the $n_1$ first, which can happen $C^{N}_{n_1}$ ways; then $n_2$ can be chosen $C^{N-n_1}_{n_2}$ ways. Multiplying all these factors, we get the simple result

$$p = \frac{N!}{n_1!\, n_2!\, n_3! \cdots}\; p_1^{n_1} p_2^{n_2} p_3^{n_3} \cdots \tag{59}$$

Now consider the approach to the continuum limit, where all the $p$'s are very small, so that the $n$'s are either 0 or 1. Using bins in $x$ of width $\delta x$, $p_i = p(x)\,\delta x$, so

$$p = N!\,(\delta x)^N \prod_{i=1}^{N} p(x_i) \equiv N!\,(\delta x)^N\, \mathcal{L}, \tag{60}$$

where $\mathcal{L}$ is the likelihood of the data. Clearly, when we compute the relative probabilities of two different datasets, this is the same as the likelihood ratio.

The likelihood can be used not only to compute the relative probabilities of two different outcomes for the same $p(x)$, but also the relative probabilities of the same outcome for two different pdfs. It is therefore a tool that is intimately involved in comparing hypotheses, as discussed more fully later in the course.

2.4 Example problems

We will now go through a number of examples where simple pdfs are applied to real astronomical problems.

2.4.1 Example: Poisson photon statistics

Typically, a star produces a large number, $N \gg 1$, of photons during a period of observation. We only intercept a tiny fraction, $p \ll 1$, of the photons which are emitted in all directions by the star, and if we collect those photons for a few minutes or hours we will collect only a tiny fraction of those emitted throughout the life of the star.

So if the star emits $N$ photons in total and we collect a fraction, $p$, of those, then

$$\begin{aligned} \lambda &= Np && \text{(the mean number detected)} \\ N &\to \infty && \text{(the mean total number emitted)} \\ p &\to 0 && \text{(probability of detection is very low)} \end{aligned} \tag{61}$$

So if we make many identical observations of the star and plot out the frequency distribution of the numbers of photons collected each time, we expect to see a Poisson distribution (strictly, this is not completely true, as it ignores photon bunching: when the radiation occupation number is high, as in a laser, photons tend to arrive in bursts).
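A quick numerical check of the limits in equation (61), and of the Gaussian limit in equation (49), can be sketched as follows. The specific numbers ($N = 10^6$ emitted photons, $p = 5 \times 10^{-6}$, and $\lambda = 100$ for the Gaussian comparison) are arbitrary illustrative choices, and the helper names are invented for this example.

```python
import math

def binomial_pmf(n, N, p):
    """Probability of detecting n of N emitted photons, each caught with probability p."""
    return math.comb(N, n) * p**n * (1.0 - p)**(N - n)

def poisson_pmf(n, lam):
    """Poisson probability, eq. (45)."""
    return math.exp(-lam) * lam**n / math.factorial(n)

def gaussian_pdf(x, lam):
    """Gaussian limit of the Poisson distribution, eq. (49)."""
    return math.exp(-(x - lam)**2 / (2.0 * lam)) / math.sqrt(2.0 * math.pi * lam)

# Assumed illustrative values: N large, p small, lambda = N p = 5.
N, p = 10**6, 5e-6
lam = N * p
worst = max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, lam)) for n in range(21))
print(f"largest |binomial - Poisson| for n = 0..20: {worst:.2e}")

# For a large mean the Poisson probability is close to the Gaussian density.
big = 100.0
print(f"Poisson P_100 = {poisson_pmf(100, big):.5f}, "
      f"Gaussian p(100) = {gaussian_pdf(100, big):.5f}")
```

With these values the binomial and Poisson probabilities differ only at the level of $p$ itself, and the Gaussian density at the peak agrees with the Poisson probability to about one part in a thousand.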

Conversely, if we make one observation and detect $n$ photons, we can use the Poisson distribution to derive the probability of getting this result for all the possible values of $\lambda$. The simplest case is when there is no source of background photons (as in e.g. gamma-ray astronomy). In that case, seeing even a single photon is enough to tell us that there is a source there, and the only question is how bright it is. Here, the problem is to estimate the mean arrival rate of photons in a given time interval, $\lambda$, given the observed $n$ in one interval. Provided $n$ is reasonably large, we can safely take the Gaussian approach and argue that $n$ will be scattered around $\lambda$ with variance $\lambda$, where $\lambda$ is close to $n$. Thus the source flux will be estimated from the observed number of photons, and the fractional error on the flux will be

$$\frac{\sigma_f}{f} = \sigma_{\ln f} = \frac{1}{\sqrt{n}}. \tag{62}$$

When $n$ is small, we have to be more careful, as discussed later in the section on Bayesian statistics.

2.4.2 Example: sky-limited detection

Typically the counts from the direction of an individual source will be much less than from the surrounding sky, and so our attempted flux measurement always includes sky photons as well as the desired photons from the object. For example, we may detect 5500 photons from an aperture centred on the source, and 5000 from the same sized aperture centred on a piece of "blank sky". Have we detected a source?

Let the counts from the aperture on the source be $N_T$, and from the same area of background sky $N_B$. $N_T$ includes some background, so our estimate of the source counts $N_S$ is (hat means "estimate of"):

$$\hat{N}_S = N_T - N_B. \tag{63}$$

The question we want to address is how uncertain is this? The counts are independent and random and so each follows a Poisson distribution, so the variance on $N_S$ is

$$\sigma_S^2 = \sigma_T^2 + \sigma_B^2 = \langle N_T \rangle + \langle N_B \rangle. \tag{64}$$

Thus in turn $\hat{\sigma}_S^2 = N_T + N_B$. If the source is much fainter than the sky, $N_S \ll N_T$, then $N_T \simeq N_B$ and the variance is approximately $2N_B$. Thus the significance of the detection, the signal-to-noise ratio, is

$$\text{Signal/Noise} = \frac{N_T - N_B}{\sqrt{N_T + N_B}} \simeq \frac{N_T - N_B}{\sqrt{2N_B}}. \tag{65}$$

So simply measuring the background and the total counts is sufficient to determine if a detection is made. In the above example, Signal/Noise $\simeq 500/\sqrt{10^4} = 5$ (strictly, slightly less, since $N_T + N_B = 10500$), what we would call a "$5\sigma$ detection".

Normally $3\sigma$ ($p \simeq 0.001$) gives good evidence for a detection, but only if the position is known in advance. When we make a survey, every pixel in an image is a candidate for a source, although most of them will be blank in reality. Thus the number of trials is very high; to avoid being swamped by false positives, surveys will normally set a threshold around $5\sigma$ (for related reasons, this is the traditional threshold used by particle physicists when searching for e.g. the Higgs Boson).
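The arithmetic of equation (65) for the worked example (5500 counts on source, 5000 on blank sky) can be sketched as below. The function name `signal_to_noise` is invented for illustration; the calculation assumes pure Poisson noise on both apertures, as in the text.

```python
import math

def signal_to_noise(n_total, n_background):
    """Eq. (65): source estimate N_T - N_B over its Poisson error sqrt(N_T + N_B)."""
    n_source = n_total - n_background          # eq. (63)
    sigma_source = math.sqrt(n_total + n_background)  # square root of eq. (64)
    return n_source / sigma_source

n_total, n_background = 5500, 5000
snr_exact = signal_to_noise(n_total, n_background)
# Faint-source approximation: N_T ~ N_B, so the variance is about 2 N_B.
snr_approx = (n_total - n_background) / math.sqrt(2 * n_background)

print(f"exact S/N  = {snr_exact:.3f}")
print(f"approx S/N = {snr_approx:.3f}")
```

The exact value comes out a little below 5, which is the "strictly, slightly less" remark in the text.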

2.4.3 Example: The distribution of superluminal velocities in quasars

Some radio sources appear to be expanding faster than the speed of light. This is thought to occur if a radio-emitting component in the quasar jet travels almost directly towards the observer at a speed close to that of light. The effect was predicted by the Astronomer Royal Lord Martin Rees in 1966 (when he was an undergraduate), and first observed in 1971.

Figure 4: Superluminal motion from a quasar nucleus.

Suppose the angle to the line of sight is $\theta$ as shown above, and that a component is ejected along the jet from the nucleus at $t = 0$. After some time $t$ the ejected component has travelled a distance $d_\parallel = t v \cos\theta$ along the line of sight. But the initial ejection is seen to happen later than $t = 0$, owing to the light travel time from the nucleus, so the observed duration is $\Delta t = t - d_\parallel/c = t\,(1 - (v/c)\cos\theta)$. In that time the component appears to have moved a distance $d_\perp = t v \sin\theta$ across the line of sight, and hence the apparent transverse velocity of the component is

$$v' = \frac{d_\perp}{\Delta t} = \frac{v \sin\theta}{1 - (v/c)\cos\theta}. \tag{66}$$

Note that although a $v/c$ term appears in this expression, the effect is not a relativistic effect. It is just due to light delay and the viewing geometry. Writing

$$\beta = v/c, \qquad \gamma = (1 - \beta^2)^{-1/2} \tag{67}$$

we find that the apparent transverse speed $\beta' = v'/c$ has a maximum value when

$$\frac{\partial \beta'}{\partial \theta} = -\frac{\beta(\beta - \cos\theta)}{(1 - \beta\cos\theta)^2} = 0, \tag{68}$$

i.e. when $\cos\theta = \beta$. Since then $\sin\theta = 1/\gamma$, we find a maximum value of $\beta' = \gamma\beta$, where $\gamma$ can be much greater than unity.

Given a randomly oriented sample of radio sources, what is the expected distribution of $\beta'$ if $\beta$ is fixed? First, note that $\theta$ is the angle to the line of sight, and since the orientation is random in three dimensions (i.e. uniform distribution over the area $dA = \sin\theta\, d\theta\, d\phi$),

$$p(\theta) = \sin\theta, \qquad 0 \le \theta \le \pi/2. \tag{69}$$

Hence,

$$p(\beta') = p(\theta) \left| \frac{d\theta}{d\beta'} \right| = p(\theta) \left| \frac{d\beta'}{d\theta} \right|^{-1} = \frac{\sin\theta\,(1 - \beta\cos\theta)^2}{\left| \beta\cos\theta - \beta^2 \right|} \tag{70}$$

where $\sin\theta$ and $\cos\theta$ are given by the equation for $\beta'$. We have chosen the limits $0 \le \theta \le \pi/2$ because in the standard model blobs are ejected from the nucleus along a jet in two opposite directions, so we should always see one blob which is travelling towards us. The limits in $\beta'$ are $0 \le \beta' \le \gamma\beta$. The expression for $p(\beta')$ in terms of $\beta'$ alone is rather messy, but simplifies for $\beta \to 1$:

$$\beta' = \frac{\sin\theta}{1 - \cos\theta} \tag{71}$$

$$p(\beta') = \sin\theta\,(1 - \cos\theta). \tag{72}$$

Squaring both sides of $\sin\theta = \beta'(1 - \cos\theta)$, using $\sin^2\theta = (1 - \cos\theta)(1 + \cos\theta)$ and rearranging gives us $(1 - \cos\theta) = 2/(1 + \beta'^2)$. Substituting this and $\sin\theta = \beta'(1 - \cos\theta)$ into equation (72) finally gives us

$$p(\beta') = \frac{4\beta'}{(1 + \beta'^2)^2}, \qquad \beta' \ge 1. \tag{73}$$

The cumulative probability for $\beta'$ is

$$P(> \beta') = \frac{2}{1 + \beta'^2}, \qquad \beta' \ge 1, \tag{74}$$

so the probability of observing a large apparent velocity, say $\beta' > 5$, is $P(\beta' > 5) \simeq 1/13$.

In fact, a much larger fraction of powerful radio quasars show superluminal motions, and it now seems likely that the quasar jets cannot be randomly oriented: there must be effects operating which tend to favour the selection of quasar jets pointing towards us, most probably due to an opaque disc shrouding the nucleus. Another physical effect is that jets pointing towards us at speeds close to $c$ have their fluxes boosted by relativistic beaming, which means they would be favoured in a survey which was selected on the basis of their radio flux. This can be avoided by choosing the sample on the basis of some flux which is not beamed, unrelated to the jet.
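The two quantitative claims of this example, the maximum apparent speed $\gamma\beta$ at $\cos\theta = \beta$ and the fraction $P(\beta' > 5) \simeq 1/13$ for randomly oriented jets with $\beta \to 1$, can be checked numerically. The sketch below is only illustrative: the function names, the choice $\beta = 0.9999$ as a stand-in for $\beta \to 1$, and the midpoint-rule integration over $p(\theta) = \sin\theta$ are assumptions made for this example.

```python
import math

def beta_apparent(beta, theta):
    """Apparent transverse speed in units of c, eq. (66) with beta = v/c."""
    return beta * math.sin(theta) / (1.0 - beta * math.cos(theta))

def max_beta_apparent(beta):
    """Maximum apparent speed gamma*beta, reached at cos(theta) = beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * beta

def prob_exceeds(threshold, beta, n_steps=200_000):
    """Fraction of jets with apparent speed above threshold, integrating
    p(theta) = sin(theta) over 0..pi/2 with the midpoint rule."""
    d_theta = (math.pi / 2.0) / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        if beta_apparent(beta, theta) > threshold:
            total += math.sin(theta) * d_theta
    return total

beta = 0.9
print(beta_apparent(beta, math.acos(beta)), max_beta_apparent(beta))

p_fast = prob_exceeds(5.0, beta=0.9999)
print(f"P(beta' > 5) = {p_fast:.4f}   1/13 = {1/13:.4f}")
```

Because the integration is over $\theta$ rather than $\beta'$, the check does not rely on the change of variables in equation (70), so agreement with $1/13$ is an independent test of equation (74).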

3 Addition of random variables to Central Limit Theorem

3.1 Probability distribution of summed independent random variables

Let us consider the distribution of the sum of two or more random variables: this will lead us on to the Central Limit Theorem, which is of critical importance in probability theory and hence astrophysics.

Let us define a new random variable $z = x + y$. What is the probability density, $p_z(z)$, of $z$? The probability of observing a value, $z$, which is greater than some value $z_1$ is

$$P(z \ge z_1) = \int_{z_1}^{\infty} d\tilde{z}\; p_z(\tilde{z}) \tag{75}$$

$$= \int_{-\infty}^{\infty} dy \int_{z_1 - y}^{\infty} dx\; p_{x,y}(x, y) \tag{76}$$

$$= \int_{-\infty}^{\infty} dx \int_{z_1 - x}^{\infty} dy\; p_{x,y}(x, y) \tag{77}$$

where the integral limits on the second line can be seen from defining the region in the $x$-$y$ plane (see Figure 5).

Figure 5: The region of integration of equation (76).

Now, the pdf for $z$ is just the derivative of this integral probability:

$$p_z(z) = -\frac{dP(> z)}{dz} = \int_{-\infty}^{\infty} dx\; p_{x,y}(x, z - x). \tag{78}$$

Or we could do the 2D integral in the opposite order, which would give

$$p_z(z) = \int_{-\infty}^{\infty} dy\; p_{x,y}(z - y, y). \tag{79}$$

If the distributions of $x$ and $y$ are independent, then we arrive at a particularly important result:

$$p_z(z) = \int_{-\infty}^{\infty} dx\; p_x(x)\, p_y(z - x) \quad \text{or} \quad \int_{-\infty}^{\infty} dy\; p_x(z - y)\, p_y(y), \quad \text{i.e.} \quad p_z(z) = (p_x * p_y)(z). \tag{80}$$

If we add together two independent random variables, the resulting distribution is a convolution of the two distribution functions.

The most powerful way of handling convolutions is to use Fourier transforms (FTs), since the FT of a convolved function $p(z)$ is simply the product of the FTs of the separate functions $p(x)$ and $p(y)$ being convolved, i.e. for the sum of two independent random variables we have:

$$\mathcal{F}(p_z) = \mathcal{F}(p_x)\, \mathcal{F}(p_y). \tag{81}$$

3.2 Characteristic functions

In probability theory the Fourier Transform of a probability distribution function is known as the characteristic function:

$$\phi(k) = \int_{-\infty}^{\infty} dx\; p(x)\, e^{ikx} \tag{82}$$

with reciprocal relation

$$p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\; \phi(k)\, e^{-ikx} \tag{83}$$

(note that other Fourier conventions may put the factor $2\pi$ in a different place). A discrete probability distribution can be thought of as a continuous one with delta-function spikes of weight $p_i$ at locations $x_i$, so here the characteristic function is

$$\phi(k) = \sum_i p_i\, e^{ikx_i} \tag{84}$$

(note that $\phi(k)$ is a continuous function of $k$). Hence in all cases the characteristic function is simply the expectation value of $e^{ikx}$:

$$\phi(k) = \left\langle e^{ikx} \right\rangle. \tag{85}$$

Part of the power of characteristic functions is the ease with which one can generate all of the moments of the distribution by differentiation:

$$m_n \equiv \langle x^n \rangle = (-i)^n \left. \frac{d^n}{dk^n} \phi(k) \right|_{k=0}. \tag{86}$$

This can be seen if one expands $\phi(k)$ in a power series:

$$\phi(k) = \left\langle e^{ikx} \right\rangle = \left\langle \sum_{n=0}^{\infty} \frac{(ikx)^n}{n!} \right\rangle = \sum_{n=0}^{\infty} \frac{\langle (ikx)^n \rangle}{n!} = 1 + ik\langle x \rangle - \frac{1}{2} k^2 \langle x^2 \rangle + \cdots. \tag{87}$$

As an example of a characteristic function let us consider the Poisson distribution:

$$\phi(k) = \sum_{n=0}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!}\, e^{ikn} = e^{-\lambda}\, e^{\lambda e^{ik}}, \tag{88}$$

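so that the characteristic function for the Poisson distribution is

$$\phi(k) = e^{\lambda(e^{ik} - 1)}. \tag{89}$$

The first moments of the Poisson distribution are:

$$m_0 = \phi(k)\big|_{k=0} = e^{\lambda(e^{ik} - 1)}\Big|_{k=0} = 1 \tag{90}$$

$$m_1 = (-i)^1 \left. \frac{d}{dk} \phi(k) \right|_{k=0} = (-i) \left. e^{\lambda(e^{ik} - 1)}\, \lambda e^{ik}\, (i) \right|_{k=0} = \lambda \tag{91}$$

$$m_2 = (-i)^2 \left. \frac{d^2}{dk^2} \phi(k) \right|_{k=0} = (-1) \left. \frac{d}{dk} \left[ e^{\lambda(e^{ik} - 1)}\, \lambda e^{ik}\, (i) \right] \right|_{k=0} = \cdots = \lambda(\lambda + 1) \tag{92}$$

which is in total agreement with results found for the mean and the variance of the Poisson distribution (see Eq. 41 and Eq. 43).

Returning to the convolution equation (80),

$$p_z(z) = \int_{-\infty}^{\infty} dy\; p_x(z - y)\, p_y(y), \tag{93}$$

we shall identify the characteristic functions of $p_z(z)$, $p_x(x)$, $p_y(y)$ as $\phi_z(k)$, $\phi_x(k)$ and $\phi_y(k)$ respectively. The characteristic function of $p_z(z)$ is then

$$\begin{aligned} \phi_z(k) &= \int_{-\infty}^{\infty} dz\; p_z(z)\, e^{ikz} \\ &= \int_{-\infty}^{\infty} dz \int_{-\infty}^{\infty} dy\; p_x(z - y)\, p_y(y)\, e^{ikz} \\ &= \int_{-\infty}^{\infty} dz \int_{-\infty}^{\infty} dy\; \left[ p_x(z - y)\, e^{ik(z - y)} \right] \left[ p_y(y)\, e^{iky} \right] \\ (\text{let } x = z - y) \quad &= \int_{-\infty}^{\infty} dx\; p_x(x)\, e^{ikx} \int_{-\infty}^{\infty} dy\; p_y(y)\, e^{iky}, \end{aligned} \tag{94}$$

which is an explicit proof of the convolution theorem for the product of Fourier transforms:

$$\phi_z(k) = \phi_x(k)\, \phi_y(k). \tag{95}$$

The power of this approach is that the distribution of the sum of a large number of random variables can be easily derived. This result allows us to turn now to the Central Limit Theorem.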
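As a concrete instance of equations (80), (84), (89) and (95), the sketch below adds two independent Poisson variables. It is a minimal illustration under stated assumptions: the means (2 and 3), the truncation of each distribution at $n = 60$, the test wavenumber $k = 0.7$, and all function names are choices made for this example. The discrete convolution of the two distributions should reproduce a Poisson distribution with mean 5, and the characteristic functions should multiply.

```python
import cmath
import math

def poisson_pmf(n, lam):
    """Poisson probability P_n for mean lam."""
    return math.exp(-lam) * lam**n / math.factorial(n)

def convolve(p, q):
    """Discrete form of eq. (80): distribution of the sum of two integer variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def char_fn(pmf, k):
    """Characteristic function of a distribution on n = 0, 1, 2, ..., eq. (84)."""
    return sum(p_n * cmath.exp(1j * k * n) for n, p_n in enumerate(pmf))

n_max = 60  # truncation point; the neglected tail is negligible for these means
p_x = [poisson_pmf(n, 2.0) for n in range(n_max + 1)]
p_y = [poisson_pmf(n, 3.0) for n in range(n_max + 1)]
p_z = convolve(p_x, p_y)

k = 0.7
lhs = char_fn(p_z, k)                     # phi_z(k)
rhs = char_fn(p_x, k) * char_fn(p_y, k)   # phi_x(k) * phi_y(k), eq. (95)
analytic = cmath.exp(5.0 * (cmath.exp(1j * k) - 1.0))  # eq. (89) with lambda = 5

print(abs(lhs - rhs), abs(lhs - analytic))
print(max(abs(p_z[n] - poisson_pmf(n, 5.0)) for n in range(21)))
```

All three printed differences should be at the level of floating-point rounding, reflecting the fact that multiplying two characteristic functions of the form (89) simply adds the means.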

3.3 The Central Limit Theorem

The most important, and general, result from probability theory is the Central Limit Theorem. It applies to a wide range of phenomena and explains why the Gaussian distribution appears so often in Nature.

In its most general form, the Central Limit Theorem states that the sum of $n$ random values drawn from a probability distribution function of finite variance, $\sigma^2$, tends to be Gaussian distributed about the expectation value for the sum, with variance $n\sigma^2$.

There are two important consequences:

1. The mean of a large number of values tends to be normally distributed regardless of the probability distribution from which the values were drawn. Hence the sampling distribution is known even when the underlying probability distribution is not. It is for this reason that the Gaussian distribution occupies such a central place in statistics. It is particularly important in applications where underlying distributions are not known, such as astrophysics.

2. Functions such as the Binomial and Poisson distributions arise from multiple drawings of values from some underlying probability distribution, and they all tend to look like the Gaussian distribution in the limit of large numbers of drawings. We saw this earlier when we derived the Gaussian distribution from the Poisson distribution.

The first of these consequences means that under certain conditions we can assume an unknown distribution is Gaussian, if it is generated from a large number of events with finite variance. For example, the height of the surface of the sea has a Gaussian distribution, as it is perturbed by the sum of random winds.

But it should be borne in mind that the number of influencing factors will be finite, so the Gaussian form will not apply exactly. It is often the case that a pdf will be approximately Gaussian in its core, but with increasing departures from the Gaussian as we move into the tails of rare events. Thus the probability of a $5\sigma$ excursion might be much greater than a naive Gaussian estimate; it is sometimes alleged that neglect of this simple point by the banking sector played a big part in the financial crash of 2008. A simple example of this phenomenon is the distribution of human heights: a Gaussian model for this must fail, since a height has to be positive, whereas a Gaussian extends to $-\infty$.

3.3.1 Derivation of the central limit theorem

Let

$$X = \frac{1}{\sqrt{n}} (x_1 + x_2 + \cdots + x_n) = \sum_{j=1}^{n} \frac{x_j}{\sqrt{n}} \tag{96}$$

be the sum of $n$ random variables $x_j$, each drawn from the same arbitrary underlying distribution function, $p_x$. In general the underlying distributions can all be different for each $x_j$, but for simplicity we shall consider only one here. The distribution of $X$'s generated by this summation, $p_X(X)$, will be a convolution of the underlying distributions.

From the properties of characteristic functions we know that a convolution of distribution functions is a multiplication of characteristic functions. The characteristic function of $p_x(x)$ is

$$\phi_x(k) = \int_{-\infty}^{\infty} dx\; p_x(x)\, e^{ikx} = 1 + i\langle x \rangle k - \frac{1}{2} \langle x^2 \rangle k^2 + O(k^3), \tag{97}$$

where in the last step we have expanded out $e^{ikx}$. Since the sum is over $x_j/\sqrt{n}$, rather than $x_j$, we scale all the moments $\langle x^p \rangle \to \langle (x/\sqrt{n})^p \rangle$. From equation (97), we see this is the same as scaling $k \to k/\sqrt{n}$. Hence the characteristic function of $X$ is

$$\Phi_X(k) = \prod_{j=1}^{n} \phi_{x_j/\sqrt{n}}(k) = \prod_{j=1}^{n} \phi_{x_j}(k/\sqrt{n}) = \left[ \phi_x(k/\sqrt{n}) \right]^n. \tag{98}$$

If we assume that $m_1 = \langle x \rangle = 0$, so that $m_2 = \langle x^2 \rangle = \sigma_x^2$ (this doesn't affect our results), then

$$\Phi_X(k) = \left[ 1 + i\frac{m_1 k}{\sqrt{n}} - \frac{\sigma_x^2 k^2}{2n} + O\!\left( \frac{k^3}{n^{3/2}} \right) \right]^n = \left[ 1 - \frac{\sigma_x^2 k^2}{2n} + O\!\left( \frac{k^3}{n^{3/2}} \right) \right]^n \to e^{-\sigma_x^2 k^2/2} \tag{99}$$

as $n \to \infty$. Note the higher terms contribute as $n^{-3/2}$ in the expansion of $\Phi_X(k)$ and so vanish in the limit of large $n$, where we have previously seen how to treat the limit of the expression $(1 + a/n)^n$.

It is however important to note that we have made a critical assumption: all the moments of the distribution, however high order, must be finite. If they are not, then the higher-order terms in $k$ will not be negligible, however much we reduce them as a function of $n$. It is easy to invent distributions for which this will be a problem: a power-law tail to the pdf may allow a finite variance, but sufficiently high moments will diverge, and our proof will fail. In fact, the Central Limit Theorem still holds even in such cases (it is only necessary that the variance be finite) but we are unable to give a simple proof of this here.

The above proof gives the characteristic function for $X$. We know that the F.T. of a Gaussian is another Gaussian, but let us show that explicitly:

$$\begin{aligned} p_X(X) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\; \Phi_X(k)\, e^{-ikX} \\ &= \frac{e^{-X^2/(2\sigma_x^2)}}{2\pi} \int_{-\infty}^{\infty} dk\; e^{(-\sigma_x^2 k^2 + X^2/\sigma_x^2 - 2ikX)/2} \\ &= \frac{e^{-X^2/(2\sigma_x^2)}}{2\pi} \int_{-\infty}^{\infty} dk\; e^{-(k\sigma_x + iX/\sigma_x)^2/2} \\ &= \frac{e^{-X^2/(2\sigma_x^2)}}{\sqrt{2\pi}\,\sigma_x} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_x} \int_{-\infty}^{\infty} dk'\; e^{-(k' + iX)^2/(2\sigma_x^2)} \\ &= \frac{e^{-X^2/(2\sigma_x^2)}}{\sqrt{2\pi}\,\sigma_x}, \end{aligned} \tag{100}$$

where $k' = k\sigma_x^2$ in the fourth line.

Thus the sum of random variables, sampled from the same underlying distribution, will tend towards a Gaussian distribution, independently of the initial distribution.

The variance of $X$ is evidently the same as for $x$: $\sigma_x^2$. The variance of the mean of the $x_j$ is then clearly smaller by a factor $n$, since the mean is $X/\sqrt{n}$.
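The two conclusions of this derivation, that $X = \sum_j x_j/\sqrt{n}$ keeps the variance $\sigma_x^2$ and that its distribution becomes Gaussian, can be checked with a small Monte Carlo experiment. The sketch below is an illustration under explicit assumptions: the underlying distribution is taken to be uniform on $[-1/2, 1/2]$ (so $\sigma_x^2 = 1/12$), with $n = 50$ terms, 20000 trials and a fixed random seed; none of these choices come from the text.

```python
import math
import random

def sample_scaled_sums(n_terms, n_trials, seed=0):
    """Draw n_trials values of X = (x_1 + ... + x_n)/sqrt(n), eq. (96),
    with each x_j uniform on [-1/2, 1/2] (zero mean, variance 1/12)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_terms)
    return [scale * sum(rng.uniform(-0.5, 0.5) for _ in range(n_terms))
            for _ in range(n_trials)]

def variance(values):
    """Mean of the squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

sigma2_x = 1.0 / 12.0
X = sample_scaled_sums(n_terms=50, n_trials=20000)
var_X = variance(X)
# For a Gaussian, about 68% of values lie within one standard deviation.
frac_1sigma = sum(abs(v) < math.sqrt(sigma2_x) for v in X) / len(X)

print(f"variance of X: {var_X:.4f}  (sigma_x^2 = {sigma2_x:.4f})")
print(f"fraction within 1 sigma: {frac_1sigma:.3f}")
```

A single uniform variable would put only about 58% of its values within one standard deviation, so a fraction near 68% is a simple sign that the sum has moved towards the Gaussian form.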

3.3.2 Measurement theory

As a corollary, by comparing equation (96) with the expression for estimating the mean from a sample of $n$ independent variables,

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \tag{101}$$

we see that the estimated mean from a sample has a Gaussian distribution with mean $m_1 = \langle x \rangle$ and standard error on the mean

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} \tag{102}$$

as $n \to \infty$. This has two important consequences.

1. This means that if we estimate the mean from a sample, we will always tend towards the true mean.

2. The uncertainty in our estimate of the mean will decrease as the sample gets bigger.

This is a remarkable result: for sufficiently large numbers of drawings from an unknown distribution function with mean $\langle x \rangle$ and standard deviation $\sigma$, we are assured by the Central Limit Theorem that we will get the measurement we want to higher and higher accuracy, and that the estimated mean of the sampled numbers will have a Gaussian distribution, with standard deviation $\sigma/\sqrt{n}$, almost regardless of the form of the unknown distribution. The only condition under which this will not occur is if the unknown distribution does not have a finite variance. Hence we see that all our assumptions about measurement rely on the Central Limit Theorem.

3.3.3 How the Central Limit Theorem works

We have seen from the above derivation that the Central Limit Theorem arises because in making many measurements and averaging them together, we are convolving a probability distribution with itself many times.

We have shown that this has the remarkable mathematical property that in the limit of large numbers of such convolutions, the result always tends to look Gaussian. In this sense, the Gaussian, or normal, distribution is the smoothest distribution which can be produced by natural processes.

We can show this by considering a non-Gaussian distribution, i.e. a top-hat, or square distribution (see Figure 6). If we convolve this with itself, we get a triangle distribution. Convolving again we get a slightly smoother distribution. If we keep going we will end up with a Gaussian distribution. This is the Central Limit Theorem and is the reason for the Gaussian's ubiquitous presence in nature.
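The top-hat example can be followed numerically. The sketch below is a rough illustration with assumed details: a unit-width top-hat sampled in 100 bins, direct (slow but simple) discrete convolution, and, as a measure of "how Gaussian" the result is, the largest absolute difference from a Gaussian with the same mean ($n/2$) and variance ($n/12$). The function names are invented for this example.

```python
import math

def convolve(p, q, dx):
    """Discrete approximation to the convolution integral of two sampled pdfs."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j * dx
    return out

def gaussian(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def max_gaussian_mismatch(n_fold, n_bins=100):
    """Largest |pdf - Gaussian| after summing n_fold top-hat variables on [0, 1)."""
    dx = 1.0 / n_bins
    top_hat = [1.0] * n_bins          # unit-area top-hat sampled at bin centres
    pdf = top_hat
    for _ in range(n_fold - 1):
        pdf = convolve(pdf, top_hat, dx)
    mean, var = n_fold / 2.0, n_fold / 12.0
    # Index k of the n-fold result sits at x = (k + n_fold/2) * dx.
    return max(abs(value - gaussian((k + 0.5 * n_fold) * dx, mean, var))
               for k, value in enumerate(pdf))

errors = [max_gaussian_mismatch(n) for n in (1, 2, 3, 4)]
for n, err in zip((1, 2, 3, 4), errors):
    print(f"{n}-fold: max |pdf - Gaussian| = {err:.3f}")
```

The mismatch drops quickly: the top-hat itself is far from Gaussian, the triangle ($n = 2$) is already within several percent, and by $n = 4$ the difference is a few percent of the peak height.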
3.4 Sampling distributions

Above we showed how the Central Limit Theorem lies at the root of our expectation that more measurements will lead to better results. Our estimate of the mean of $n$ variables is unbiased (i.e.

gives the right answer) and the uncertainty on the estimated mean decreases as $\sigma_x/\sqrt{n}$, and the distribution of the estimated, or sampled, mean has a Gaussian distribution. The distribution of the mean determined in this way is known as the "sampling distribution of the mean".

Figure 6: Repeated convolution of a distribution will eventually yield a Gaussian if the variance of the convolved distribution is finite.

How fast the Central Limit Theorem works (i.e. how small $n$ can be before the distribution is no longer Gaussian) depends on the underlying distribution. At one extreme we can consider the case when the underlying variables are all Gaussian distributed. Then the sampling distribution of the mean will always be a Gaussian, even if $n \to 1$.

But, beware! For some distributions the Central Limit Theorem does not hold. For example the means of values drawn from a Cauchy (or Lorentz) distribution,

$$p(x) = \frac{1}{\pi(1 + x^2)}, \tag{103}$$

never approach normality. This is because this distribution has infinite variance (try and calculate it and see). In fact the means are distributed like the Cauchy distribution itself. Is this a rare, but pathological example? Unfortunately not. For example the Cauchy distribution appears in spectral line fitting, where it is called the Lorentz profile (its convolution with a Gaussian gives the Voigt profile). Another example is if we take the ratio of two Gaussian variables: the resulting distribution is a Cauchy distribution. Hence, we should beware that although the Central Limit Theorem and Gaussian distribution considerably simplify probability and statistics, exceptions do occur, and one should always be wary of them.

3.5 Error propagation

If $z$ is some function of random variables $x$ and $y$, and we know the variance of $x$ and $y$, what is the variance of $z$?

Let $z = f(x, y)$. We can propagate errors by expanding $f(x, y)$ to first order around some arbitrary

values, $x_0$ and $y_0$:

$$f(x, y) = f(x_0, y_0) + (x - x_0) \left. \frac{\partial f}{\partial x} \right|_{x=x_0,\, y=y_0} + (y - y_0) \left. \frac{\partial f}{\partial y} \right|_{x=x_0,\, y=y_0} + O\!\left( (x - x_0)^2,\ (x - x_0)(y - y_0),\ (y - y_0)^2 \right). \tag{104}$$

Let us assume $x_0 = y_0 = 0$ and $\langle x \rangle = \langle y \rangle = 0$ for simplicity (the answer will be general).

The mean of $z$ is

$$\langle z \rangle = f(x_0, y_0) \tag{105}$$

and the variance is (assuming $x$ and $y$ are independent)

$$\sigma_z^2 = \left\langle (z - \langle z \rangle)^2 \right\rangle = \iint dx\, dy\; (f - \langle f \rangle)^2\, p(x)\, p(y) = \iint dx\, dy\; \left( x^2 f_x^2 + y^2 f_y^2 + 2xy f_x f_y \right) p(x)\, p(y) \tag{106}$$

where we have used the notation $f_x \equiv \left. \dfrac{\partial f}{\partial x} \right|_{x=x_0,\, y=y_0}$.

Averaging over the random variables we find for independent variables with zero mean:

$$\sigma_z^2 = \left( \frac{\partial f}{\partial x} \right)^2 \sigma_x^2 + \left( \frac{\partial f}{\partial y} \right)^2 \sigma_y^2. \tag{107}$$

This formula will allow us to propagate errors for arbitrary functions. Note again that this is valid for any distribution function, but depends on (1) the underlying variables being independent, (2) the function being differentiable and (3) the variation from the mean being small enough that the expansion is valid.

3.5.1 The sample variance

The average of the sample

$$\hat{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{108}$$

is an estimate of the mean of the underlying distribution. Given we may not directly know the variance of the summed variables, $\sigma_x^2$, is there a similar estimate of the variance of $\hat{x}$? This is particularly important in situations where we need to assess the significance of a result in terms of how far away it is from the expected value, but where we only have a finite sample size from which to measure the variance of the distribution.

We would expect a good estimate of the population variance would be something like

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2, \tag{109}$$

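where

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{110}$$

is the sample mean of $n$ values. Let us find the expected value of this sum. First we re-arrange the summation

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_i x_i^2 - \frac{2}{n^2} \sum_i \sum_k x_i x_k + \frac{1}{n^2} \sum_i \sum_k x_i x_k = \frac{1}{n} \sum_i x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 \tag{111}$$

which is the same result we found in Section 1.4: the variance is just the mean of the square minus the square of the mean. If all the $x_i$ are drawn independently then

$$\left\langle \sum_i f(x_i) \right\rangle = \sum_i \langle f(x_i) \rangle \tag{112}$$

where $f(x)$ is some arbitrary function of $x$. If $i = j$ then

$$\langle x_i x_j \rangle = \langle x^2 \rangle, \qquad i = j, \tag{113}$$

and when $i$ and $j$ are different

$$\langle x_i x_j \rangle = \langle x \rangle^2, \qquad i \ne j. \tag{114}$$

The expectation value of our estimator is then

$$\begin{aligned} \langle S^2 \rangle &= \left\langle \frac{1}{n} \sum_i x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 \right\rangle \\ &= \langle x^2 \rangle - \frac{1}{n^2} \sum_i \langle x^2 \rangle - \frac{1}{n^2} \sum_i \sum_{j \ne i} \langle x \rangle^2 \\ &= \langle x^2 \rangle - \frac{1}{n} \langle x^2 \rangle - \frac{n(n-1)}{n^2} \langle x \rangle^2 \\ &= \left( 1 - \frac{1}{n} \right) \langle x^2 \rangle - \frac{(n-1)}{n} \langle x \rangle^2 \\ &= \frac{(n-1)}{n} \left( \langle x^2 \rangle - \langle x \rangle^2 \right) = \frac{(n-1)}{n}\, \sigma_x^2. \end{aligned} \tag{115}$$

The variance is defined as $\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2$, so $S^2$ will underestimate the variance by the factor $(n-1)/n$. This is because an extra variance term, $\sigma_x^2/n$, has appeared due to the extra variance in our estimate of the mean. Since the square of the mean is subtracted from the mean of the square, this extra variance is subtracted off from our estimate of the variance, causing the underestimation. To correct for this we should change our estimate to

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{116}$$

which is an unbiased estimate of $\sigma_x^2$, independent of the underlying distribution. It is unbiased because its expectation value is always $\sigma_x^2$ for any $n$ when the mean is estimated from the sample.

Note that if the mean is known, and not estimated from the sample, this extra variance does not appear, in which case equation (109) is an unbiased estimate of the sample variance.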
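The factor $(n-1)/n$ in equation (115) can be verified exactly, without random sampling, by enumerating every possible sample from a small discrete distribution. The sketch below assumes, purely for illustration, a fair six-sided die (variance $35/12$) and samples of size $n = 3$; the function name is hypothetical. Exact rational arithmetic is used so that the comparison has no rounding error.

```python
from fractions import Fraction
from itertools import product

def expected_variance_estimates(values, n):
    """Average the 1/n estimator (eq. 109) and the 1/(n-1) estimator (eq. 116)
    over every equally likely sample of size n drawn from `values`."""
    total_biased = Fraction(0)
    total_unbiased = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        mean = Fraction(sum(sample), n)
        sum_sq = sum((Fraction(x) - mean) ** 2 for x in sample)
        total_biased += sum_sq / n
        total_unbiased += sum_sq / (n - 1)
        count += 1
    return total_biased / count, total_unbiased / count

die = [1, 2, 3, 4, 5, 6]
true_variance = Fraction(35, 12)   # <x^2> - <x>^2 for a fair die
biased, unbiased = expected_variance_estimates(die, n=3)

print("true variance:      ", true_variance)
print("<S^2> with 1/n:     ", biased)
print("<S^2> with 1/(n-1): ", unbiased)
```

The $1/n$ estimator averages to $(2/3)\,\sigma_x^2$ for $n = 3$, while the $1/(n-1)$ estimator recovers $\sigma_x^2$ exactly.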

3.5.2 Example: Measuring quasar variation

We want to look for variable quasars. We have two CCD images of one field taken some time apart and we want to pick out the quasars which have varied significantly more than the measurement error, which is unknown. In this case

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(\Delta m_i - \overline{\Delta m}\right)^2 \tag{117}$$

is the unbiased estimate of the variance of $\Delta m$. We want to keep $\overline{\Delta m} \neq 0$ (i.e. we want to measure $\overline{\Delta m}$ from the data) to allow for possible calibration errors. If we were confident that the calibration is correct we could set $\overline{\Delta m} = 0$ and return to the definition

$$\sigma^2 = \frac{1}{n}\sum_i (\Delta m_i)^2. \tag{118}$$

Suppose we find that one of the $\Delta m$, say $\Delta m_i$, is very large. Can we assess the significance of this result? One way to estimate its significance is from

$$t = \frac{\Delta m_i - \overline{\Delta m}}{S}. \tag{119}$$

If the mean is known, this is distributed as a standardised Gaussian (i.e. $t$ has unit variance) if the measurement errors are Gaussian. But if we can only estimate the mean from the data, $t$ is distributed as Student-t. The Student-t distribution looks qualitatively similar to a Gaussian distribution, but it has larger tails, due to the variations in the measured mean and variance. In other words, Student-t is the pdf that arises when estimating the mean of a Gaussian distributed population when the sample size is small.
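As an illustration of equations (117) and (119), the sketch below computes $t$ for every object in a small list of magnitude differences. The numbers are invented purely for the example and do not come from any real survey, and the function name `t_statistics` is hypothetical.

```python
import statistics


def t_statistics(delta_m):
    """Return (dm_i - mean) / S for each entry, with S from equation (117)."""
    mean_dm = statistics.mean(delta_m)
    s = statistics.stdev(delta_m)  # square root of the n - 1 variance estimate
    return [(dm - mean_dm) / s for dm in delta_m]


# Hypothetical magnitude differences between the two images (illustrative only).
delta_m = [0.01, -0.02, 0.03, 0.00, -0.01, 0.50]

t_values = t_statistics(delta_m)
most_variable = max(range(len(t_values)), key=lambda i: abs(t_values[i]))

for i, t in enumerate(t_values):
    print(i, round(t, 3))
print("largest |t| at index", most_variable)
```

Turning a value of $t$ into a significance level requires the tail probability of the Student-t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.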


Research Method (I) --Knowledge on Sampling (Simple Random Sampling) Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact

More information

4.3. The Integral and Comparison Tests

4.3. The Integral and Comparison Tests 4.3. THE INTEGRAL AND COMPARISON TESTS 9 4.3. The Itegral ad Compariso Tests 4.3.. The Itegral Test. Suppose f is a cotiuous, positive, decreasig fuctio o [, ), ad let a = f(). The the covergece or divergece

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Cooley-Tukey. Tukey FFT Algorithms. FFT Algorithms. Cooley

Cooley-Tukey. Tukey FFT Algorithms. FFT Algorithms. Cooley Cooley Cooley-Tuey Tuey FFT Algorithms FFT Algorithms Cosider a legth- sequece x[ with a -poit DFT X[ where Represet the idices ad as +, +, Cooley Cooley-Tuey Tuey FFT Algorithms FFT Algorithms Usig these

More information

, a Wishart distribution with n -1 degrees of freedom and scale matrix.

, a Wishart distribution with n -1 degrees of freedom and scale matrix. UMEÅ UNIVERSITET Matematisk-statistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 004-0-9 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that

More information

STA 2023 Practice Questions Exam 2 Chapter 7- sec 9.2. Case parameter estimator standard error Estimate of standard error

STA 2023 Practice Questions Exam 2 Chapter 7- sec 9.2. Case parameter estimator standard error Estimate of standard error STA 2023 Practice Questios Exam 2 Chapter 7- sec 9.2 Formulas Give o the test: Case parameter estimator stadard error Estimate of stadard error Samplig Distributio oe mea x s t (-1) oe p ( 1 p) CI: prop.

More information

AP Calculus BC 2003 Scoring Guidelines Form B

AP Calculus BC 2003 Scoring Guidelines Form B AP Calculus BC Scorig Guidelies Form B The materials icluded i these files are iteded for use by AP teachers for course ad exam preparatio; permissio for ay other use must be sought from the Advaced Placemet

More information