Handbook on STATISTICAL DISTRIBUTIONS for experimentalists


Internal Report SUF-PFY/96-01
Stockholm, 11 December 1996
1st revision, 31 October 1998
last modification 10 September 2007

Hand-book on STATISTICAL DISTRIBUTIONS for experimentalists

by Christian Walck
Particle Physics Group
Fysikum
University of Stockholm
Contents

1 Introduction: Random Number Generation
2 Probability Density Functions: Introduction; Moments; Errors of Moments; Characteristic Function; Probability Generating Function; Cumulants; Random Number Generation; Cumulative Technique; Accept-Reject technique; Composition Techniques; Multivariate Distributions; Multivariate Moments; Errors of Bivariate Moments; Joint Characteristic Function; Random Number Generation
3 Bernoulli Distribution: Introduction; Relation to Other Distributions
4 Beta distribution: Introduction; Derivation of the Beta Distribution; Characteristic Function; Moments; Probability Content; Random Number Generation
5 Binomial Distribution: Introduction; Moments; Probability Generating Function; Cumulative Function; Random Number Generation; Estimation of Parameters; Probability Content
6 Binormal Distribution: Introduction; Conditional Probability Density; Characteristic Function; Moments; Box-Muller Transformation; Probability Content; Random Number Generation
7 Cauchy Distribution: Introduction; Moments; Normalization; Characteristic Function; Location and Scale Parameters; Breit-Wigner Distribution; Comparison to Other Distributions; Truncation; Sum and Average of Cauchy Variables; Estimation of the Median; Estimation of the HWHM; Random Number Generation; Physical Picture; Ratio Between Two Standard Normal Variables
8 Chi-square Distribution: Introduction; Moments; Characteristic Function; Cumulative Function; Origin of the Chi-square Distribution; Approximations; Random Number Generation; Confidence Intervals for the Variance; Hypothesis Testing; Probability Content; Even Number of Degrees of Freedom; Odd Number of Degrees of Freedom; Final Algorithm; Chi Distribution
9 Compound Poisson Distribution: Introduction; Branching Process; Moments; Probability Generating Function; Random Number Generation
10 Double-Exponential Distribution: Introduction; Moments; Characteristic Function; Cumulative Function; Random Number Generation
11 Doubly Non-Central F-Distribution: Introduction; Moments; Cumulative Distribution; Random Number Generation
12 Doubly Non-Central t-distribution: Introduction; Moments; Cumulative Distribution; Random Number Generation
13 Error Function: Introduction; Probability Density Function
14 Exponential Distribution: Introduction; Cumulative Function; Moments; Characteristic Function; Random Number Generation; Method by von Neumann; Method by Marsaglia; Method by Ahrens
15 Extreme Value Distribution: Introduction; Cumulative Distribution; Characteristic Function; Moments; Random Number Generation
16 F-distribution: Introduction; Relations to Other Distributions; 1/F; Characteristic Function; Moments; F-ratio; Variance Ratio; Analysis of Variance; Calculation of Probability Content; The Incomplete Beta function; Final Formulæ; Random Number Generation
17 Gamma Distribution: Introduction; Derivation of the Gamma Distribution; Moments; Characteristic Function; Probability Content; Random Number Generation; Erlangian distribution; General case; Asymptotic Approximation
18 Generalized Gamma Distribution: Introduction; Cumulative Function; Moments; Relation to Other Distributions
19 Geometric Distribution: Introduction; Moments; Probability Generating Function; Random Number Generation
20 Hyperexponential Distribution: Introduction; Moments; Characteristic Function; Random Number Generation
21 Hypergeometric Distribution: Introduction; Probability Generating Function; Moments; Random Number Generation
22 Logarithmic Distribution: Introduction; Moments; Probability Generating Function; Random Number Generation
23 Logistic Distribution: Introduction; Cumulative Distribution; Characteristic Function; Moments; Random numbers
24 Lognormal Distribution: Introduction; Moments; Cumulative Distribution; Random Number Generation
25 Maxwell Distribution: Introduction; Moments; Cumulative Distribution; Kinetic Theory; Random Number Generation
26 Moyal Distribution: Introduction; Normalization; Characteristic Function; Moments; Cumulative Distribution; Random Number Generation
27 Multinomial Distribution: Introduction; Histogram; Moments; Probability Generating Function; Random Number Generation; Significance Levels; Equal Group Probabilities
28 Multinormal Distribution: Introduction; Conditional Probability Density; Probability Content; Random Number Generation
29 Negative Binomial Distribution: Introduction; Moments; Probability Generating Function; Relations to Other Distributions; Poisson Distribution; Gamma Distribution; Logarithmic Distribution; Branching Process; Poisson and Gamma Distributions; Random Number Generation
30 Non-central Beta-distribution: Introduction; Derivation of distribution; Moments; Cumulative distribution; Random Number Generation
31 Non-central Chi-square Distribution: Introduction; Characteristic Function; Moments; Cumulative Distribution; Approximations; Random Number Generation
32 Non-central F-Distribution: Introduction; Moments; Cumulative Distribution; Approximations; Random Number Generation
33 Non-central t-distribution: Introduction; Derivation of distribution; Moments; Cumulative Distribution; Approximation; Random Number Generation
34 Normal Distribution: Introduction; Moments; Cumulative Function; Characteristic Function; Addition Theorem; Independence of x̄ and s²; Probability Content; Random Number Generation; Central Limit Theory Approach; Exact Transformation; Polar Method; Trapezoidal Method; Center-tail method; Composition-rejection Methods; Method by Marsaglia; Histogram Technique; Ratio of Uniform Deviates; Comparison of random number generators; Tests on Parameters of a Normal Distribution
35 Pareto Distribution: Introduction; Cumulative Distribution; Moments; Random Numbers
36 Poisson Distribution: Introduction; Moments; Probability Generating Function; Cumulative Distribution; Addition Theorem; Derivation of the Poisson Distribution; Histogram; Random Number Generation
37 Rayleigh Distribution: Introduction; Moments; Cumulative Distribution; Two-dimensional Kinetic Theory; Random Number Generation
38 Student's t-distribution: Introduction; History; Moments; Cumulative Function; Relations to Other Distributions; t-ratio; One Normal Sample; Two Normal Samples; Paired Data; Confidence Levels; Testing Hypotheses; Calculation of Probability Content; Even number of degrees of freedom; Odd number of degrees of freedom; Final algorithm; Random Number Generation
39 Triangular Distribution: Introduction; Moments; Random Number Generation
40 Uniform Distribution: Introduction; Moments; Random Number Generation
41 Weibull Distribution: Introduction; Cumulative Distribution; Moments; Random Number Generation
42 Appendix A: The Gamma and Beta Functions: Introduction; The Gamma Function; Numerical Calculation; Formulæ; Digamma Function; Polygamma Function; The Incomplete Gamma Function; Numerical Calculation; Formulæ; Special Cases; The Beta Function; The Incomplete Beta Function; Numerical Calculation; Approximation; Relations to Probability Density Functions; The Beta Distribution; The Binomial Distribution; The Chi-squared Distribution; The F-distribution; The Gamma Distribution; The Negative Binomial Distribution; The Normal Distribution; The Poisson Distribution; Student's t-distribution; Summary
43 Appendix B: Hypergeometric Functions: Introduction; Hypergeometric Function; Confluent Hypergeometric Function
Mathematical Constants
Errata et Addenda
References
Index

List of Tables

Percentage points of the chi-square distribution
Extreme confidence levels for the chi-square distribution
Extreme confidence levels for the chi-square distribution (as χ²/d.f. values)
Exact and approximate values for the Bernoulli numbers
Percentage points of the F-distribution
Probability content from -z to z of Gauss distribution in %
Standard normal distribution z-values for a specific probability content
Percentage points of the t-distribution
Expressions for the Beta function B(m, n) for integer and half-integer arguments
1 Introduction

In experimental work, e.g. in physics, one often encounters problems where a standard statistical probability density function is applicable. It is often of great help to be able to handle these in different ways, such as calculating probability contents or generating random numbers. For these purposes there are excellent textbooks in statistics, e.g. the classical work of Maurice G. Kendall and Alan Stuart [1,2] or more modern textbooks such as [3] and others. Some books are particularly aimed at experimental physics or even specifically at particle physics [4,5,6,7,8]. Concerning numerical methods, a valuable reference worth mentioning is [9], which has been surpassed by a new edition [10]. Handbooks, especially [11], have also been of great help throughout.

However, when it comes to actual applications it often turns out to be hard to find detailed explanations in the literature ready for implementation. This work has been collected over many years in parallel with actual experimental work. In this way some material may be historical, and sometimes naïve, with somewhat clumsy solutions not always made in the mathematically most stringent manner. We apologize for this but still hope that it will be of interest and help for people who are struggling to find methods to solve their statistical problems in making real applications and not only learning statistics as a course. Even if one has the skill and may be able to find solutions, it seems worthwhile to have easy and fast access to formulæ ready for application. Similar books and reports exist, e.g. [12,13], but we hope the present work may compete in describing more distributions, being more complete, and including more explanations on relations given.

The material could most probably have been divided in a more logical way, but we have chosen to present the distributions in alphabetic order. In this way it is more of a handbook than a proper textbook.

After the first release the report has been modestly changed. Minor changes to correct misprints are made whenever found. In a few cases subsections and tables have been added.
These alterations are described on page 8. In October 1998 the first somewhat bigger revision was made, where in particular a lot of material on the non-central sampling distributions was added.

1.1 Random Number Generation

In modern computing Monte Carlo simulations are of vital importance and we give methods to obtain random numbers from the distributions. An earlier report dealt entirely with these matters [14]. Not all textbooks on statistics include information on this subject, which we find extremely useful. Large simulations are common in particle physics as well as in other areas, but often it is also useful to make small "toy Monte Carlo programs" to investigate and study analysis tools developed on ideal, but statistically sound, random samples.

A related and important field, which we will only mention briefly here, is how to get good basic generators for achieving random numbers uniformly distributed between zero and one. Those are the basis for all the methods described in order to get random numbers
from specific distributions in this document. For a review see e.g. [15].

From older methods, often using the so-called multiplicative congruential method or shift-generators, G. Marsaglia et al. [16] introduced in 1989 a new "universal generator" which became the new standard in many fields. We implemented this in our experiments at CERN and also made a package of routines for general use [17].

This method is still a very good choice, but later alternatives, claimed to be even better, have turned up. These are based on the same type of lagged Fibonacci sequences as used in the universal generator and were originally proposed by the same authors [18]. An implementation of this method was proposed by F. James [15], and this version was further developed by M. Lüscher [19]. A package of routines similar to the one prepared for the universal generator has been implemented for this method [20].
2 Probability Density Functions

2.1 Introduction

Probability density functions in one, discrete or continuous, variable are denoted p(r) and f(x), respectively. They are assumed to be properly normalized such that

$$\sum_r p(r) = 1 \quad\text{and}\quad \int_{-\infty}^{\infty} f(x)\,dx = 1$$

where the sum or the integral is taken over all relevant values for which the probability density function is defined.

Statisticians often use the distribution function, or as physicists more often call it the cumulative function, which is defined as

$$P(r) = \sum_{i=-\infty}^{r} p(i) \quad\text{and}\quad F(x) = \int_{-\infty}^{x} f(t)\,dt$$

2.2 Moments

Algebraic moments of order r are defined as the expectation value

$$\mu'_r = E(x^r) = \sum_k k^r p(k) \quad\text{or}\quad \int_{-\infty}^{\infty} x^r f(x)\,dx$$

Obviously $\mu'_0 = 1$ from the normalization condition and $\mu'_1$ is equal to the mean, sometimes called the expectation value, of the distribution.

Central moments of order r are defined as

$$\mu_r = E((k - E(k))^r) \quad\text{or}\quad E((x - E(x))^r)$$

of which the most commonly used is $\mu_2$, which is the variance of the distribution.

Instead of using the third and fourth central moments one often defines the coefficients of skewness $\gamma_1$ and kurtosis $\gamma_2$ by

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \quad\text{and}\quad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3$$

where the shift by 3 units in $\gamma_2$ assures that both measures are zero for a normal distribution. Distributions with positive kurtosis are called leptokurtic, those with kurtosis around zero mesokurtic and those with negative kurtosis platykurtic. Leptokurtic distributions are normally more peaked than the normal distribution while platykurtic distributions are more flat topped.

Footnote: From Greek kyrtosis = curvature, from kyrt(ós) = curved, arched, round, swelling, bulging. Sometimes, especially in older literature, $\gamma_2$ is called the coefficient of excess.
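As an illustration of the moment definitions above, the following minimal Python sketch evaluates the mean, the variance and the coefficients of skewness and kurtosis for a discrete distribution. The function name `moment_summary` and the choice of a fair die as test case are assumptions made for this example only; they are not part of the report.

```python
import math

def moment_summary(pmf):
    """Return (mean, variance, gamma_1, gamma_2) for a discrete
    distribution given as a dict {value: probability}."""
    total = sum(pmf.values())
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=1e-12):
        raise ValueError("probabilities must sum to one")

    # First algebraic moment mu'_1 (the mean).
    mean = sum(k * p for k, p in pmf.items())

    def central(r):
        # Central moment mu_r = E((k - E(k))^r).
        return sum((k - mean) ** r * p for k, p in pmf.items())

    mu2, mu3, mu4 = central(2), central(3), central(4)
    gamma1 = mu3 / mu2 ** 1.5        # coefficient of skewness
    gamma2 = mu4 / mu2 ** 2 - 3.0    # coefficient of kurtosis (excess)
    return mean, mu2, gamma1, gamma2

# Hypothetical test case: a fair six-sided die.
die = {face: 1.0 / 6.0 for face in range(1, 7)}
mean, var, gamma1, gamma2 = moment_summary(die)
```

For the die the distribution is symmetric, so the skewness vanishes, and the negative kurtosis marks it as platykurtic in the terminology above.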
2.2.1 Errors of Moments

For a thorough presentation of how to estimate errors on moments we refer to the classical books by M. G. Kendall and A. Stuart [1] (pp 228-245). Below only a brief description is given. For a sample with n observations $x_1, x_2, \ldots, x_n$ we define the moment-statistics for the algebraic and central moments $m'_r$ and $m_r$ as

$$m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r \quad\text{and}\quad m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_1)^r$$

The notations $m'_r$ and $m_r$ are thus used for the statistics (sample values) while we denote the true, population, values by $\mu'_r$ and $\mu_r$.

The mean value of the r:th and the sampling covariance between the q:th and r:th moment-statistic are given by

$$E(m'_r) = \mu'_r$$
$$\mathrm{Cov}(m'_q, m'_r) = \frac{1}{n}\left(\mu'_{q+r} - \mu'_q\mu'_r\right)$$

These formulæ are exact. Formulæ for moments about the mean are not as simple since the mean itself is subject to sampling fluctuations.

$$E(m_r) = \mu_r$$
$$\mathrm{Cov}(m_q, m_r) = \frac{1}{n}\left(\mu_{q+r} - \mu_q\mu_r + rq\,\mu_2\mu_{r-1}\mu_{q-1} - r\mu_{r-1}\mu_{q+1} - q\mu_{r+1}\mu_{q-1}\right)$$

to order $1/\sqrt{n}$ and $1/n$, respectively. The covariance between an algebraic and a central moment is given by

$$\mathrm{Cov}(m_r, m'_q) = \frac{1}{n}\left(\mu_{q+r} - \mu_q\mu_r - r\mu_{q+1}\mu_{r-1}\right)$$

to order $1/n$. Note especially that

$$V(m'_r) = \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right)$$
$$V(m_r) = \frac{1}{n}\left(\mu_{2r} - \mu_r^2 + r^2\mu_2\mu_{r-1}^2 - 2r\mu_{r-1}\mu_{r+1}\right)$$
$$\mathrm{Cov}(m'_1, m_r) = \frac{1}{n}\left(\mu_{r+1} - r\mu_2\mu_{r-1}\right)$$

2.3 Characteristic Function

For a distribution in a continuous variable x the Fourier transform of the probability density function

$$\phi(t) = E(e^{\imath xt}) = \int_{-\infty}^{\infty} e^{\imath xt} f(x)\,dx$$
is called the characteristic function. It has the properties that $\phi(0) = 1$ and $|\phi(t)| \le 1$ for all t. If the cumulative, distribution, function F(x) is continuous everywhere and $dF(x) = f(x)\,dx$ then we reverse the transform such that

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \phi(t)\,e^{-\imath xt}\,dt$$

The characteristic function is related to the moments of the distribution by

$$\phi_x(t) = E(e^{\imath tx}) = \sum_{n=0}^{\infty} \frac{(\imath t)^n E(x^n)}{n!} = \sum_{n=0}^{\infty} \frac{(\imath t)^n \mu'_n}{n!}$$

e.g. algebraic moments may be found by

$$\mu'_r = \frac{1}{\imath^r}\left(\frac{d}{dt}\right)^{r} \phi(t)\,\Big|_{t=0}$$

To find central moments (about the mean µ) use

$$\phi_{x-\mu}(t) = E\left(e^{\imath t(x-\mu)}\right) = e^{-\imath t\mu}\,\phi_x(t)$$

and thus

$$\mu_r = \frac{1}{\imath^r}\left(\frac{d}{dt}\right)^{r} e^{-\imath t\mu}\phi(t)\,\Big|_{t=0}$$

A very useful property of the characteristic function is that for independent variables x and y

$$\phi_{x+y}(t) = \phi_x(t)\cdot\phi_y(t)$$

As an example regard the sum $\sum a_i z_i$ where the $z_i$'s are distributed according to normal distributions with means $\mu_i$ and variances $\sigma_i^2$. Then the linear combination will also be distributed according to the normal distribution with mean $\sum a_i\mu_i$ and variance $\sum a_i^2\sigma_i^2$.

To show that the characteristic function in two variables factorizes is the best way to show independence between two variables. Remember that a vanishing correlation coefficient does not imply independence while the reverse is true.

2.4 Probability Generating Function

In the case of a distribution in a discrete variable r the characteristic function is given by

$$\phi(t) = E(e^{\imath tr}) = \sum_r p(r)\,e^{\imath tr}$$

In this case it is often convenient to write $z = e^{\imath t}$ and define the probability generating function as

$$G(z) = E(z^r) = \sum_r p(r)\,z^r$$
Derivatives of G(z) evaluated at z = 1 are related to factorial moments of the distribution (the superscript on G denotes the order of the derivative):

$$G(1) = 1 \quad\text{(normalization)}$$
$$G^{1}(1) = \frac{d}{dz}G(z)\Big|_{z=1} = E(r)$$
$$G^{2}(1) = \frac{d^2}{dz^2}G(z)\Big|_{z=1} = E(r(r-1))$$
$$G^{3}(1) = \frac{d^3}{dz^3}G(z)\Big|_{z=1} = E(r(r-1)(r-2))$$
$$G^{k}(1) = \frac{d^k}{dz^k}G(z)\Big|_{z=1} = E(r(r-1)(r-2)\cdots(r-k+1))$$

Lower order algebraic moments are then given by

$$\mu'_1 = G^{1}(1)$$
$$\mu'_2 = G^{2}(1) + G^{1}(1)$$
$$\mu'_3 = G^{3}(1) + 3G^{2}(1) + G^{1}(1)$$
$$\mu'_4 = G^{4}(1) + 6G^{3}(1) + 7G^{2}(1) + G^{1}(1)$$

while expressions for central moments become more complicated.

A useful property of the probability generating function is for a branching process in n steps where

$$G(z) = G_1(G_2(\ldots G_{n-1}(G_n(z))\ldots))$$

with $G_k(z)$ the probability generating function for the distribution in the k:th step. As an example see section on page 5.

2.5 Cumulants

Although not much used in physics the cumulants, $\kappa_r$, are of statistical interest. One reason for this is that they have some useful properties such as being invariant for a shift in scale (except the first cumulant which is equal to the mean and is shifted along with the scale). Multiplying the x-scale by a constant a has the same effect as for algebraic moments, namely to multiply $\kappa_r$ by $a^r$.

As the algebraic moment $\mu'_n$ is the coefficient of $(\imath t)^n/n!$ in the expansion of $\phi(t)$, the cumulant $\kappa_n$ is the coefficient of $(\imath t)^n/n!$ in the expansion of the logarithm of $\phi(t)$ (sometimes called the cumulant generating function) i.e.

$$\ln\phi(t) = \sum_{n=1}^{\infty} \frac{(\imath t)^n}{n!}\,\kappa_n$$

and thus

$$\kappa_r = \frac{1}{\imath^r}\left(\frac{d}{dt}\right)^{r} \ln\phi(t)\,\Big|_{t=0}$$

Relations between cumulants and central moments for some lower orders are as follows
$$\begin{aligned}
\kappa_1 &= \mu'_1 \\
\kappa_2 &= \mu_2 & \mu_2 &= \kappa_2 \\
\kappa_3 &= \mu_3 & \mu_3 &= \kappa_3 \\
\kappa_4 &= \mu_4 - 3\mu_2^2 & \mu_4 &= \kappa_4 + 3\kappa_2^2 \\
\kappa_5 &= \mu_5 - 10\mu_3\mu_2 & \mu_5 &= \kappa_5 + 10\kappa_3\kappa_2 \\
\kappa_6 &= \mu_6 - 15\mu_4\mu_2 - 10\mu_3^2 + 30\mu_2^3 & \mu_6 &= \kappa_6 + 15\kappa_4\kappa_2 + 10\kappa_3^2 + 15\kappa_2^3 \\
\kappa_7 &= \mu_7 - 21\mu_5\mu_2 - 35\mu_4\mu_3 + 210\mu_3\mu_2^2 & \mu_7 &= \kappa_7 + 21\kappa_5\kappa_2 + 35\kappa_4\kappa_3 + 105\kappa_3\kappa_2^2 \\
\kappa_8 &= \mu_8 - 28\mu_6\mu_2 - 56\mu_5\mu_3 - 35\mu_4^2 & \mu_8 &= \kappa_8 + 28\kappa_6\kappa_2 + 56\kappa_5\kappa_3 + 35\kappa_4^2 \\
&\quad + 420\mu_4\mu_2^2 + 560\mu_3^2\mu_2 - 630\mu_2^4 & &\quad + 210\kappa_4\kappa_2^2 + 280\kappa_3^2\kappa_2 + 105\kappa_2^4
\end{aligned}$$

2.6 Random Number Generation

When generating random numbers from different distributions it is assumed that a good generator for uniform pseudorandom numbers between zero and one exists (normally the end-points are excluded).

2.6.1 Cumulative Technique

The most direct technique to obtain random numbers from a continuous probability density function f(x) with a limited range from $x_{min}$ to $x_{max}$ is to solve for x in the equation

$$\xi = \frac{F(x) - F(x_{min})}{F(x_{max}) - F(x_{min})}$$

where ξ is uniformly distributed between zero and one and F(x) is the cumulative distribution (or as statisticians say the distribution function). For a properly normalized probability density function thus

$$x = F^{-1}(\xi)$$

The technique is sometimes also of use in the discrete case if the cumulative sum may be expressed in analytical form as e.g. for the geometric distribution.

Also for general cases, discrete or continuous, e.g. from an arbitrary histogram, the cumulative method is convenient and often faster than more elaborate methods. In this case the task is to construct a cumulative vector and assign a random number according to the value of a uniform random number (interpolating within bins in the continuous case).

2.6.2 Accept-Reject technique

A useful technique is the acceptance-rejection, or hit-miss, method where we choose $f_{max}$ to be greater than or equal to f(x) in the entire interval between $x_{min}$ and $x_{max}$ and proceed as follows

i Generate a pair of uniform pseudorandom numbers $\xi_1$ and $\xi_2$.
ii Determine $x = x_{min} + \xi_1\cdot(x_{max} - x_{min})$.
iii Determine $y = f_{max}\cdot\xi_2$.
iv If $y - f(x) > 0$ reject and go to i else accept x as a pseudorandom number from f(x).
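The cumulative technique and the hit-miss steps i to iv above can be sketched in Python as follows. This is only a minimal sketch: the function names and the test density f(x) = 2x on the interval from zero to one (for which F(x) = x² and the inverse is a square root) are assumptions chosen for illustration, and Python's standard `random.Random` stands in for the uniform generator.

```python
import math
import random

def sample_cumulative(inverse_cdf, rng):
    """Cumulative technique: return x = F^{-1}(xi) for one uniform xi."""
    return inverse_cdf(rng.random())

def sample_hit_miss(f, x_min, x_max, f_max, rng):
    """Accept-reject (hit-miss) technique following steps i-iv."""
    while True:
        xi1, xi2 = rng.random(), rng.random()   # step i
        x = x_min + xi1 * (x_max - x_min)       # step ii
        y = f_max * xi2                         # step iii
        if y <= f(x):                           # step iv: reject only if y - f(x) > 0
            return x

rng = random.Random(12345)

# Illustrative target density (an assumption): f(x) = 2x on [0, 1].
by_inverse = [sample_cumulative(math.sqrt, rng) for _ in range(20000)]
by_hit_miss = [sample_hit_miss(lambda x: 2.0 * x, 0.0, 1.0, 2.0, rng)
               for _ in range(20000)]

mean_inverse = sum(by_inverse) / len(by_inverse)
mean_hit_miss = sum(by_hit_miss) / len(by_hit_miss)
```

Both sample means should come out close to the exact mean 2/3. For this density the function covers half of the bounding rectangle, so the hit-miss loop needs about two trials per accepted value, whereas the cumulative technique uses exactly one uniform number per value.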
The efficiency of this method depends on the average value of $f(x)/f_{max}$ over the interval. If this value is close to one the method is efficient. On the other hand, if this average is close to zero, the method is extremely inefficient. If α is the fraction of the area $f_{max}\cdot(x_{max} - x_{min})$ covered by the function, the average number of rejects in step iv is $\frac{1}{\alpha} - 1$ and $\frac{2}{\alpha}$ uniform pseudorandom numbers are required on average.

The efficiency of this method can be increased if we are able to choose a function h(x), from which random numbers are more easily obtained, such that $f(x) \le \alpha h(x) = g(x)$ over the entire interval under consideration (where α is a constant). A random sample from f(x) is obtained by

i Generate in x a random number from h(x).
ii Generate a uniform random number ξ.
iii If $\xi \ge f(x)/g(x)$ go back to i else accept x as a pseudorandom number from f(x).

Yet another situation is when a function g(x), from which fast generation may be obtained, can be inscribed in such a way that a big proportion (f) of the area under the function is covered (as an example see the trapezoidal method for the normal distribution). Then proceed as follows:

i Generate a uniform random number ξ.
ii If ξ < f then generate a random number from g(x).
iii Else use the acceptance/rejection technique for $h(x) = f(x) - g(x)$ (in subintervals if more efficient).

2.6.3 Composition Techniques

Suppose f(x) may be written in the form

$$f(x) = \int g_z(x)\,dH(z)$$

where we know how to sample random numbers from the p.d.f. $g_z(x)$ and the distribution function H(z). A random number from f(x) is then obtained by

i Generate two uniform random numbers $\xi_1$ and $\xi_2$.
ii Determine $z = H^{-1}(\xi_1)$.
iii Determine $x = G_z^{-1}(\xi_2)$ where $G_z$ is the distribution function corresponding to the p.d.f. $g_z(x)$.

For more detailed information on the composition technique see [21] or [22].
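The three steps of the composition technique can be illustrated with a small Python sketch. The concrete choice made here is an assumption for the sake of the example and does not come from the report: H is taken as the uniform distribution on the interval from 1 to 2, and the conditional p.d.f. is an exponential with rate z, so that both inverses are available in closed form.

```python
import math
import random

def sample_composition(h_inverse, g_inverse, rng):
    """Composition technique: draw z from H, then x from g_z."""
    xi1, xi2 = rng.random(), rng.random()   # step i
    z = h_inverse(xi1)                      # step ii: z = H^{-1}(xi1)
    return g_inverse(z, xi2)                # step iii: x = G_z^{-1}(xi2)

def h_inverse(xi):
    # Assumed mixing distribution: z uniform on [1, 2].
    return 1.0 + xi

def g_inverse(z, xi):
    # Assumed conditional p.d.f.: exponential with rate z,
    # G_z(x) = 1 - exp(-z x), hence x = -ln(1 - xi) / z.
    return -math.log(1.0 - xi) / z

rng = random.Random(2024)
samples = [sample_composition(h_inverse, g_inverse, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

For this particular choice the exact mean is E(1/z) = ln 2, which gives a simple check of the sampled values.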
A combination of the composition and the rejection method has been proposed by J. C. Butcher [23]. Suppose f(x) can be written

$$f(x) = \sum_{i=1}^{n} \alpha_i f_i(x) g_i(x)$$

where $\alpha_i$ are positive constants, $f_i(x)$ p.d.f.'s for which we know how to sample a random number and $g_i(x)$ are functions taking values between zero and one. The method is then as follows:

i Generate uniform random numbers $\xi_1$ and $\xi_2$.
ii Determine an integer k from the discrete distribution $p_i = \alpha_i/(\alpha_1 + \alpha_2 + \ldots + \alpha_n)$ using $\xi_1$.
iii Generate a random number x from $f_k(x)$.
iv Determine $g_k(x)$ and if $\xi_2 > g_k(x)$ then go to i.
v Accept x as a random number from f(x).

2.7 Multivariate Distributions

Joint probability density functions in several variables are denoted by $f(x_1, x_2, \ldots, x_n)$ and $p(r_1, r_2, \ldots, r_n)$ for continuous and discrete variables, respectively. It is assumed that they are properly normalized, i.e. integrated (or summed) over all variables the result is unity.

2.7.1 Multivariate Moments

The generalization of algebraic and central moments to multivariate distributions is straightforward. As an example we take a bivariate distribution f(x, y) in two continuous variables x and y and define algebraic and central bivariate moments of order k, l as

$$\mu'_{kl} \equiv E(x^k y^l) = \iint x^k y^l f(x, y)\,dx\,dy$$
$$\mu_{kl} \equiv E((x - \mu_x)^k (y - \mu_y)^l) = \iint (x - \mu_x)^k (y - \mu_y)^l f(x, y)\,dx\,dy$$

where $\mu_x$ and $\mu_y$ are the mean values of x and y. The covariance is a central bivariate moment of order 1,1, i.e. $\mathrm{Cov}(x, y) = \mu_{11}$. Similarly one easily defines multivariate moments for distributions in discrete variables.

2.7.2 Errors of Bivariate Moments

Algebraic ($m'_{rs}$) and central ($m_{rs}$) bivariate moments are defined by:

$$m'_{rs} = \frac{1}{n}\sum_{i=1}^{n} x_i^r y_i^s \quad\text{and}\quad m_{rs} = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_{10})^r (y_i - m'_{01})^s$$

When there is a risk of ambiguity we write $m_{r,s}$ instead of $m_{rs}$.
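The notations $m'_{rs}$ and $m_{rs}$ are used for the statistics (sample values) while we write $\mu'_{rs}$ and $\mu_{rs}$ for the population values. The errors of bivariate moments are given by

$$\mathrm{Cov}(m'_{rs}, m'_{uv}) = \frac{1}{n}\left(\mu'_{r+u,s+v} - \mu'_{rs}\mu'_{uv}\right)$$

$$\begin{aligned}
\mathrm{Cov}(m_{rs}, m_{uv}) = \frac{1}{n}\big(&\mu_{r+u,s+v} - \mu_{rs}\mu_{uv} + ru\,\mu_{20}\mu_{r-1,s}\mu_{u-1,v} + sv\,\mu_{02}\mu_{r,s-1}\mu_{u,v-1} \\
&+ rv\,\mu_{11}\mu_{r-1,s}\mu_{u,v-1} + su\,\mu_{11}\mu_{r,s-1}\mu_{u-1,v} - u\,\mu_{r+1,s}\mu_{u-1,v} \\
&- v\,\mu_{r,s+1}\mu_{u,v-1} - r\,\mu_{r-1,s}\mu_{u+1,v} - s\,\mu_{r,s-1}\mu_{u,v+1}\big)
\end{aligned}$$

especially

$$V(m'_{rs}) = \frac{1}{n}\left(\mu'_{2r,2s} - \mu'^{\,2}_{rs}\right)$$

$$\begin{aligned}
V(m_{rs}) = \frac{1}{n}\big(&\mu_{2r,2s} - \mu_{rs}^2 + r^2\mu_{20}\mu_{r-1,s}^2 + s^2\mu_{02}\mu_{r,s-1}^2 \\
&+ 2rs\,\mu_{11}\mu_{r-1,s}\mu_{r,s-1} - 2r\,\mu_{r+1,s}\mu_{r-1,s} - 2s\,\mu_{r,s+1}\mu_{r,s-1}\big)
\end{aligned}$$

For the covariance ($m_{11}$) we get by error propagation

$$V(m_{11}) = \frac{1}{n}\left(\mu_{22} - \mu_{11}^2\right)$$
$$\mathrm{Cov}(m_{11}, m'_{10}) = \frac{\mu_{21}}{n}$$
$$\mathrm{Cov}(m_{11}, m_{20}) = \frac{1}{n}\left(\mu_{31} - \mu_{20}\mu_{11}\right)$$

For the correlation coefficient (denoted by $\rho = \mu_{11}/\sqrt{\mu_{20}\mu_{02}}$ for the population value and by r for the sample value) we get

$$V(r) = \frac{\rho^2}{n}\left\{\frac{\mu_{22}}{\mu_{11}^2} + \frac{1}{4}\left[\frac{\mu_{40}}{\mu_{20}^2} + \frac{\mu_{04}}{\mu_{02}^2} + \frac{2\mu_{22}}{\mu_{20}\mu_{02}}\right] - \left[\frac{\mu_{31}}{\mu_{11}\mu_{20}} + \frac{\mu_{13}}{\mu_{11}\mu_{02}}\right]\right\}$$

Beware, however, that the sampling distribution of r tends to normality very slowly.

2.7.3 Joint Characteristic Function

The joint characteristic function is defined by

$$\phi(t_1, t_2, \ldots, t_n) = E(e^{\imath t_1x_1 + \imath t_2x_2 + \ldots + \imath t_nx_n}) = \int\!\!\int\!\ldots\!\int e^{\imath t_1x_1 + \imath t_2x_2 + \ldots + \imath t_nx_n} f(x_1, x_2, \ldots, x_n)\,dx_1dx_2\ldots dx_n$$

From this function multivariate moments may be obtained, e.g. for a bivariate distribution algebraic bivariate moments are given by

$$\mu'_{rs} = E(x_1^r x_2^s) = \frac{\partial^{r+s}\phi(t_1, t_2)}{\partial(\imath t_1)^r\,\partial(\imath t_2)^s}\bigg|_{t_1=t_2=0}$$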
2.7.4 Random Number Generation

Random sampling from a many dimensional distribution with a joint probability density function $f(x_1, x_2, \ldots, x_n)$ can be made by the following method:

Define the marginal distributions

$$g_m(x_1, x_2, \ldots, x_m) = \int f(x_1, \ldots, x_n)\,dx_{m+1}dx_{m+2}\ldots dx_n = \int g_{m+1}(x_1, \ldots, x_{m+1})\,dx_{m+1}$$

Consider the conditional density function $h_m$ given by

$$h_m(x_m | x_1, x_2, \ldots, x_{m-1}) \equiv g_m(x_1, x_2, \ldots, x_m)/g_{m-1}(x_1, x_2, \ldots, x_{m-1})$$

We see that $g_n = f$ and that

$$\int h_m(x_m | x_1, x_2, \ldots, x_{m-1})\,dx_m = 1$$

from the definitions. Thus $h_m$ is the conditional distribution in $x_m$ given fixed values for $x_1, x_2, \ldots, x_{m-1}$. We can now factorize f as

$$f(x_1, x_2, \ldots, x_n) = h_1(x_1)\,h_2(x_2|x_1)\ldots h_n(x_n|x_1, x_2, \ldots, x_{n-1})$$

We sample values for $x_1, x_2, \ldots, x_n$ from the joint probability density function f by:

• Generate a value for $x_1$ from $h_1(x_1)$.
• Use $x_1$ and sample $x_2$ from $h_2(x_2|x_1)$.
• Proceed step by step and use previously sampled values for $x_1, x_2, \ldots, x_m$ to obtain a value for $x_{m+1}$ from $h_{m+1}(x_{m+1}|x_1, x_2, \ldots, x_m)$.
• Continue until all $x_i$:s have been sampled.

If all $x_i$:s are independent the conditional densities will equal the marginal densities and the variables can be sampled in any order.
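A minimal Python sketch of this step-by-step conditional sampling is given below for a simple two-dimensional case. The joint density f(x, y) = x + y on the unit square is an assumption chosen for illustration because both the marginal and the conditional distribution functions can be inverted in closed form: the marginal in x is x + 1/2, with distribution function (x² + x)/2, and the conditional distribution function in y is (xy + y²/2)/(x + 1/2).

```python
import math
import random

def sample_xy(rng):
    """Sample (x, y) from the assumed density f(x, y) = x + y on [0,1]x[0,1]
    by first sampling x from its marginal and then y from h_2(y | x)."""
    xi1, xi2 = rng.random(), rng.random()
    # Marginal: solve (x^2 + x) / 2 = xi1 for x in [0, 1].
    x = (-1.0 + math.sqrt(1.0 + 8.0 * xi1)) / 2.0
    # Conditional: solve (x*y + y^2/2) / (x + 1/2) = xi2 for y in [0, 1].
    y = -x + math.sqrt(x * x + 2.0 * xi2 * (x + 0.5))
    return x, y

rng = random.Random(7)
pairs = [sample_xy(rng) for _ in range(20000)]
mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_y = sum(y for _, y in pairs) / len(pairs)
mean_xy = sum(x * y for x, y in pairs) / len(pairs)
```

For this density the exact values are E(x) = E(y) = 7/12 and E(xy) = 1/3, so the sample averages give a quick check that the conditional step has been implemented correctly.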
3 Bernoulli Distribution

3.1 Introduction

The Bernoulli distribution, named after the Swiss mathematician Jacques Bernoulli (1654-1705), describes a probabilistic experiment where a trial has two possible outcomes, a success or a failure.

The parameter p is the probability for a success in a single trial, the probability for a failure thus being 1 - p (often denoted by q). Both p and q are limited to the interval from zero to one. The distribution has the simple form

$$p(r; p) = \begin{cases} 1 - p = q & \text{if } r = 0 \text{ (failure)} \\ p & \text{if } r = 1 \text{ (success)} \end{cases}$$

and zero elsewhere. The work of J. Bernoulli, which constitutes a foundation of probability theory, was published posthumously in Ars Conjectandi (1713) [24].

The probability generating function is $G(z) = q + pz$ and the distribution function is given by P(0) = q and P(1) = 1. Random numbers are easily obtained by using a uniform random number variate ξ and putting r = 1 (success) if ξ ≤ p and r = 0 else (failure).

3.2 Relation to Other Distributions

From the Bernoulli distribution we may deduce several probability density functions described in this document, all of which are based on series of independent Bernoulli trials:

• Binomial distribution: expresses the probability for r successes in an experiment with n trials (0 ≤ r ≤ n).
• Geometric distribution: expresses the probability of having to wait exactly r trials before the first successful event (r ≥ 1).
• Negative Binomial distribution: expresses the probability of having to wait exactly r trials until k successes have occurred (r ≥ k). This form is sometimes referred to as the Pascal distribution. Sometimes this distribution is expressed as the number of failures n occurring while waiting for k successes (n ≥ 0).
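The Bernoulli trial and two of the distributions built from it can be sketched in a few lines of Python. The function names are hypothetical and the direct counting of trials is only the most literal reading of the definitions above, not an efficient generator.

```python
import random

def bernoulli(p, rng):
    """One Bernoulli trial: r = 1 (success) if xi <= p, else r = 0."""
    return 1 if rng.random() <= p else 0

def binomial_count(n, p, rng):
    """Number of successes in n independent Bernoulli trials."""
    return sum(bernoulli(p, rng) for _ in range(n))

def geometric_wait(p, rng):
    """Number of trials up to and including the first success (r >= 1)."""
    trials = 1
    while bernoulli(p, rng) == 0:
        trials += 1
    return trials

rng = random.Random(99)
successes = [binomial_count(10, 0.3, rng) for _ in range(5000)]
waits = [geometric_wait(0.3, rng) for _ in range(5000)]
mean_successes = sum(successes) / len(successes)   # should be close to n*p = 3
mean_wait = sum(waits) / len(waits)                # should be close to 1/p
```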
4 Beta distribution

4.1 Introduction

The Beta distribution is given by

$$f(x; p, q) = \frac{1}{B(p, q)}\,x^{p-1}(1 - x)^{q-1}$$

where the parameters p and q are positive real quantities and the variable x satisfies 0 ≤ x ≤ 1. The quantity B(p, q) is the Beta function defined in terms of the more common Gamma function as

$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)}$$

For p = q = 1 the Beta distribution simply becomes a uniform distribution between zero and one. For p = 1 and q = 2 or vice versa we get triangular shaped distributions, f(x) = 2 - 2x and f(x) = 2x. For p = q = 2 we obtain a distribution of parabolic shape, f(x) = 6x(1 - x). More generally, if p and q both are greater than one the distribution has a unique mode at x = (p - 1)/(p + q - 2) and is zero at the end-points. If p and/or q is less than one, f(0) → ∞ and/or f(1) → ∞ and the distribution is said to be J-shaped. In figure 1 below we show the Beta distribution for two cases: p = q = 2 and p = 6, q = 3.

Figure 1: Examples of Beta distributions

4.2 Derivation of the Beta Distribution

If $y_m$ and $y_n$ are two independent variables distributed according to the chi-squared distribution with m and n degrees of freedom, respectively, then the ratio $y_m/(y_m + y_n)$ follows a Beta distribution with parameters p = m/2 and q = n/2.
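To show this we make a change of variables to $x = y_m/(y_m + y_n)$ and $y = y_m + y_n$, which implies that $y_m = xy$ and $y_n = y(1 - x)$. We obtain

$$\begin{aligned}
f(x, y) &= \begin{vmatrix} \frac{\partial y_m}{\partial x} & \frac{\partial y_m}{\partial y} \\ \frac{\partial y_n}{\partial x} & \frac{\partial y_n}{\partial y} \end{vmatrix}\, f(y_m, y_n) \\
&= \begin{vmatrix} y & x \\ -y & 1 - x \end{vmatrix}\, \left\{\frac{\left(\frac{y_m}{2}\right)^{\frac{m}{2}-1} e^{-\frac{y_m}{2}}}{2\Gamma\left(\frac{m}{2}\right)}\right\} \left\{\frac{\left(\frac{y_n}{2}\right)^{\frac{n}{2}-1} e^{-\frac{y_n}{2}}}{2\Gamma\left(\frac{n}{2}\right)}\right\} \\
&= \left\{\frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}\, x^{\frac{m}{2}-1}(1 - x)^{\frac{n}{2}-1}\right\} \left\{\frac{\left(\frac{y}{2}\right)^{\frac{m+n}{2}-1} e^{-\frac{y}{2}}}{2\Gamma\left(\frac{m+n}{2}\right)}\right\}
\end{aligned}$$

which we recognize as a product of a Beta distribution in the variable x and a chi-squared distribution with m + n degrees of freedom in the variable y (as expected for the sum of two independent chi-square variables).

4.3 Characteristic Function

The characteristic function of the Beta distribution may be expressed in terms of the confluent hypergeometric function (see section 43.3) as

$$\phi(t) = M(p, p + q; \imath t)$$

4.4 Moments

The expectation value, variance, third and fourth central moment are given by

$$E(x) = \frac{p}{p + q}$$
$$V(x) = \frac{pq}{(p + q)^2(p + q + 1)}$$
$$\mu_3 = \frac{2pq(q - p)}{(p + q)^3(p + q + 1)(p + q + 2)}$$
$$\mu_4 = \frac{3pq\left(2(p + q)^2 + pq(p + q - 6)\right)}{(p + q)^4(p + q + 1)(p + q + 2)(p + q + 3)}$$

More generally algebraic moments are given in terms of the Beta function by

$$\mu'_k = \frac{B(p + k, q)}{B(p, q)}$$

4.5 Probability Content

In order to find the probability content for a Beta distribution we form the cumulative distribution

$$F(x) = \frac{1}{B(p, q)}\int_0^x t^{p-1}(1 - t)^{q-1}\,dt = \frac{B_x(p, q)}{B(p, q)} = I_x(p, q)$$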
where both $B_x$ and $I_x$ seem to be called the incomplete Beta function in the literature.

The incomplete Beta function $I_x$ is connected to the binomial distribution for integer values of a by

$$1 - I_x(a, b) = I_{1-x}(b, a) = (1 - x)^{a+b-1}\sum_{i=0}^{a-1}\binom{a + b - 1}{i}\left(\frac{x}{1 - x}\right)^i$$

or expressed in the opposite direction

$$\sum_{s=a}^{n}\binom{n}{s}p^s(1 - p)^{n-s} = I_p(a, n - a + 1)$$

Also to the negative binomial distribution there is a connection by the relation

$$\sum_{s=a}^{\infty}\binom{n + s - 1}{s}p^n q^s = I_q(a, n)$$

The incomplete Beta function is also connected to the probability content of Student's t-distribution and the F-distribution. See further section 42.7 for more information on $I_x$.

4.6 Random Number Generation

In order to obtain random numbers from a Beta distribution we first single out a few special cases.

For p = 1 and/or q = 1 we may easily solve the equation F(x) = ξ where F(x) is the cumulative function and ξ a uniform random number between zero and one. In these cases

$$p = 1 \Rightarrow x = 1 - \xi^{1/q}$$
$$q = 1 \Rightarrow x = \xi^{1/p}$$

For p and q half-integers we may use the relation to the chi-square distribution by forming the ratio

$$\frac{y_m}{y_m + y_n}$$

with $y_m$ and $y_n$ two independent random numbers from chi-square distributions with m = 2p and n = 2q degrees of freedom, respectively.

Yet another way of obtaining random numbers from a Beta distribution, valid when p and q are both integers, is to take the l:th out of k (1 ≤ l ≤ k) independent uniform random numbers between zero and one (sorted in ascending order). Doing this we obtain a Beta distribution with parameters p = l and q = k + 1 - l. Conversely, if we want to generate random numbers from a Beta distribution with integer parameters p and q we could use this technique with l = p and k = p + q - 1. This last technique implies that for low integer values of p and q simple code may be used, e.g. for p = 2 and q = 1 we may simply take $\max(\xi_1, \xi_2)$, i.e. the maximum of two uniform random numbers.
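The order-statistics technique for integer p and q described above can be sketched as follows. The function name `beta_integer` is hypothetical, and sorting all k uniform numbers is the simplest rather than the fastest way of picking the l:th smallest one.

```python
import random

def beta_integer(p, q, rng):
    """Beta(p, q) random number for positive integers p and q: the l:th
    smallest of k uniform numbers with l = p and k = p + q - 1."""
    if p < 1 or q < 1:
        raise ValueError("p and q must be positive integers")
    k = p + q - 1
    ordered = sorted(rng.random() for _ in range(k))
    return ordered[p - 1]          # the l:th value in ascending order, l = p

rng = random.Random(31)
beta_2_3 = [beta_integer(2, 3, rng) for _ in range(20000)]
beta_2_1 = [beta_integer(2, 1, rng) for _ in range(20000)]  # max of two uniforms
mean_2_3 = sum(beta_2_3) / len(beta_2_3)   # E(x) = p/(p+q) = 0.4
mean_2_1 = sum(beta_2_1) / len(beta_2_1)   # E(x) = p/(p+q) = 2/3
```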
5 Binomial Distribution

5.1 Introduction

The Binomial distribution is given by

$$p(r; N, p) = \binom{N}{r}p^r(1 - p)^{N-r}$$

where the variable r with 0 ≤ r ≤ N and the parameter N (N > 0) are integers and the parameter p (0 ≤ p ≤ 1) is a real quantity.

The distribution describes the probability of exactly r successes in N trials if the probability of a success in a single trial is p (we sometimes also use q = 1 - p, the probability for a failure, for convenience). It was first presented by Jacques Bernoulli in a work which was posthumously published [24].

5.2 Moments

The expectation value, variance, third and fourth central moment are given by

$$E(r) = Np$$
$$V(r) = Np(1 - p) = Npq$$
$$\mu_3 = Np(1 - p)(1 - 2p) = Npq(q - p)$$
$$\mu_4 = Np(1 - p)\left[1 + 3p(1 - p)(N - 2)\right] = Npq\left[1 + 3pq(N - 2)\right]$$

Central moments of higher orders may be obtained by the recursive formula

$$\mu_{r+1} = pq\left\{Nr\mu_{r-1} + \frac{\partial\mu_r}{\partial p}\right\}$$

starting with $\mu_0 = 1$ and $\mu_1 = 0$.

The coefficients of skewness and kurtosis are given by

$$\gamma_1 = \frac{q - p}{\sqrt{Npq}} \quad\text{and}\quad \gamma_2 = \frac{1 - 6pq}{Npq}$$

5.3 Probability Generating Function

The probability generating function is given by

$$G(z) = E(z^r) = \sum_{r=0}^{N} z^r\binom{N}{r}p^r(1 - p)^{N-r} = (pz + q)^N$$

and the characteristic function thus by

$$\phi(t) = G(e^{\imath t}) = \left(q + pe^{\imath t}\right)^N$$
5.4 Cumulative Function

For fixed N and p one may easily construct the cumulative function P(r) by a recursive formula, see section on random numbers below.

However, an interesting and useful relation exists between P(r) and the incomplete Beta function $I_x$, namely

$$P(k) = \sum_{r=0}^{k} p(r; N, p) = I_{1-p}(N - k, k + 1)$$

For further information on $I_x$ see section 42.7.

5.5 Random Number Generation

In order to achieve random numbers from a binomial distribution we may either

• Generate N uniform random numbers and accumulate the number of such that are less or equal to p, or
• Use the cumulative technique, i.e. construct the cumulative, distribution, function and by use of this and one uniform random number obtain the required random number, or
• for larger values of N, say N > 100, use an approximation to the normal distribution with mean Np and variance Npq.

Except for very small values of N and very high values of p the cumulative technique is the fastest for numerical calculations (see the footnote below). This is especially true if we proceed by constructing the cumulative vector once for all (as opposed to making this at each call) using the recursive formula

$$p(i) = p(i - 1)\,\frac{p}{q}\,\frac{N + 1 - i}{i}$$

for i = 1, 2, ..., N starting with $p(0) = q^N$.

However, using the relation given in the previous section with a well optimized code for the incomplete Beta function (see [10] or section 42.7) turns out to be a numerically more stable way of creating the cumulative distribution than a simple loop adding up the individual probabilities.

5.6 Estimation of Parameters

Experimentally the quantity r/N, the relative number of successes in N trials, often is of more interest than r itself. This variable has expectation E(r/N) = p and variance V(r/N) = pq/N. The estimated value for p in an experiment giving r successes in N trials is $\hat{p} = r/N$.

If p is unknown an unbiased estimate of the variance of a binomial distribution is given by

$$V(r) = \frac{N}{N - 1}\,N\left(\frac{r}{N}\right)\left(1 - \frac{r}{N}\right) = \frac{N}{N - 1}\,N\hat{p}(1 - \hat{p})$$

Footnote: This is possible only if we require random numbers from one and the same binomial distribution with fixed values of N and p.
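The cumulative technique for the binomial distribution described above, with the cumulative vector built once from the recursive formula, might be sketched in Python as follows. The function names are hypothetical, the sketch assumes 0 < p < 1 so that the ratio p/q is defined, and it uses the simple summation loop rather than the numerically more stable incomplete Beta function route.

```python
import bisect
import random

def binomial_cumulative(n, p):
    """Cumulative vector P(0), ..., P(n) from the recursion
    p(i) = p(i-1) * (p/q) * (n+1-i)/i, starting with p(0) = q^n."""
    if not 0.0 < p < 1.0:
        raise ValueError("this sketch requires 0 < p < 1")
    q = 1.0 - p
    prob = q ** n
    cumulative = [prob]
    for i in range(1, n + 1):
        prob *= (p / q) * (n + 1 - i) / i
        cumulative.append(cumulative[-1] + prob)
    return cumulative

def binomial_sample(cumulative, rng):
    """Return the smallest r with P(r) >= xi for one uniform xi."""
    r = bisect.bisect_left(cumulative, rng.random())
    return min(r, len(cumulative) - 1)   # guard against rounding in P(n)

rng = random.Random(5)
table = binomial_cumulative(20, 0.25)           # built once, reused for every call
draws = [binomial_sample(table, rng) for _ in range(20000)]
mean_draws = sum(draws) / len(draws)            # should be close to N*p = 5
```

Building the table once and reusing it is what makes the technique fast; as the footnote notes, this only pays off when many random numbers are needed for the same fixed N and p.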
To find lower and upper confidence levels for $p$ we proceed as follows.

- For lower limits find a $p_{low}$ such that
$$\sum_{r=k}^{N} \binom{N}{r} p_{low}^r (1 - p_{low})^{N-r} = 1 - \alpha$$
or, expressed in terms of the incomplete Beta function, $1 - I_{1-p}(N - k + 1, k) = 1 - \alpha$.
- For upper limits find a $p_{up}$ such that
$$\sum_{r=0}^{k} \binom{N}{r} p_{up}^r (1 - p_{up})^{N-r} = 1 - \alpha$$
which is equivalent to $I_{1-p}(N - k, k + 1) = 1 - \alpha$, i.e. $I_p(k + 1, N - k) = \alpha$.

As an example we take an experiment with a fixed $N$ where a certain number of successes $0 \le k \le N$ has been observed. The confidence levels corresponding to 90%, 95%, 99% as well as the levels corresponding to one, two and three standard deviations for a normal distribution (84.13%, 97.72% and 99.87% probability content) are given below.

[Table: lower and upper confidence levels for each k. Columns: k | lower: 3σ, 99%, 2σ, 95%, 90%, 1σ | $\hat{p}$ | upper: 1σ, 90%, 95%, 2σ, 99%, 3σ]

5.7 Probability Content

It is sometimes of interest to judge the significance level of a certain outcome given the hypothesis that $p = 1/2$. If $N$ trials are made and we find $k$ successes (let us say $k < N/2$, else use $N - k$ instead of $k$) we want to estimate the probability to have $k$ or fewer successes plus the probability for $N - k$ or more successes. Since the assumption is that $p = 1/2$ we want the two-tailed probability content.

To calculate this either sum the individual probabilities or use the relation to the incomplete Beta function. The former may seem more straightforward but the latter may be computationally easier given a routine for the incomplete Beta function. If $k = N/2$ we must take care not to add the central term twice (in this case the requested probability is 100% anyway).

In the table below we show such confidence levels in % for a range of values of $N$. E.g. the probability to observe 3 or fewer successes, or 12 or more successes, for $N = 15$ is 3.5%.
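The confidence limits of section 5.6 can also be found without an incomplete Beta routine by solving the two tail-sum conditions numerically. The sketch below uses plain bisection and assumes that α denotes the confidence level, so that each tail sum is set equal to 1 − α; the function names and the example values k = 3, N = 10 and α = 0.95 are hypothetical.

```python
from math import comb

def upper_tail(k, N, p):
    """Probability of k or more successes in N trials."""
    return sum(comb(N, r) * p**r * (1.0 - p)**(N - r) for r in range(k, N + 1))

def lower_tail(k, N, p):
    """Probability of k or fewer successes in N trials."""
    return sum(comb(N, r) * p**r * (1.0 - p)**(N - r) for r in range(0, k + 1))

def solve_on_unit_interval(f, target, increasing, tol=1e-12):
    """Bisection for f(p) = target on [0, 1], with f monotonic in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) < target) == increasing:
            lo = mid              # the root lies to the right of mid
        else:
            hi = mid              # the root lies to the left of mid
    return 0.5 * (lo + hi)

def confidence_limits(k, N, alpha):
    """Lower and upper limits for p at confidence level alpha (assumed convention)."""
    if k == 0:
        p_low = 0.0               # the upper tail is 1 for every p
    else:
        p_low = solve_on_unit_interval(lambda p: upper_tail(k, N, p),
                                       1.0 - alpha, increasing=True)
    if k == N:
        p_up = 1.0                # the lower tail is 1 for every p
    else:
        p_up = solve_on_unit_interval(lambda p: lower_tail(k, N, p),
                                      1.0 - alpha, increasing=False)
    return p_low, p_up

p_low, p_up = confidence_limits(3, 10, 0.95)
print(p_low, p_up)                # the estimate 3/10 lies between the two limits
```

For k = 0 the upper limit has the closed form 1 − (1 − α)^(1/N), which gives a quick check of the bisection.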
[Table: two-tailed probability content in % as a function of k and N]
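The two-tailed probability content for the hypothesis p = 1/2 can be obtained by summing the individual probabilities, as described in the text. The following Python sketch does this directly; the function name is an assumption made for the example.

```python
from math import comb

def two_tailed_probability(k, N):
    """P(k or fewer successes) + P(N - k or more successes) for p = 1/2."""
    k = min(k, N - k)             # use the smaller of k and N - k
    if 2 * k == N:
        return 1.0                # avoid adding the central term twice
    one_tail = sum(comb(N, r) for r in range(k + 1)) / 2**N
    return 2.0 * one_tail         # both tails are equal when p = 1/2

print(round(100 * two_tailed_probability(3, 15), 2))   # 3.52
```

For N = 15 and k = 3 the one-tail sum is (1 + 15 + 105 + 455) / 32768, which reproduces the 3.5% quoted in the text.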
6 Binormal Distribution

6.1 Introduction

As a generalization of the normal or Gauss distribution to two dimensions we define the binormal distribution as

$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 - 2\rho\, \frac{x_1-\mu_1}{\sigma_1} \cdot \frac{x_2-\mu_2}{\sigma_2} \right] \right\}$$

where $\mu_1$ and $\mu_2$ are the expectation values of $x_1$ and $x_2$, $\sigma_1$ and $\sigma_2$ their standard deviations and $\rho$ the correlation coefficient between them. Putting $\rho = 0$ we see that the distribution becomes the product of two one-dimensional Gauss distributions.

[Figure: Binormal distribution (axes $x_1$ and $x_2$)]

In the figure we show contours for a standardized Binormal distribution, i.e. putting $\mu_1 = \mu_2 = 0$ and $\sigma_1 = \sigma_2 = 1$ (these parameters are anyway shift and scale parameters only). In the example shown $\rho = 0.5$. Using standardized variables the contours range from a perfect circle for $\rho = 0$ to gradually thinner ellipses in the $\pm 45°$ direction as $\rho \to \pm 1$. The contours shown correspond to the one, two, and three standard deviation levels. See the section on probability content below for details.
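The density of section 6.1 translates directly into code. The sketch below is a minimal Python version, with hypothetical function and parameter names; it checks numerically that the density factorizes into two one-dimensional Gauss distributions for ρ = 0 and that it integrates to one for the standardized case with ρ = 0.5.

```python
from math import exp, pi, sqrt

def binormal_pdf(x1, x2, mu1=0.0, mu2=0.0, s1=1.0, s2=1.0, rho=0.0):
    """Binormal density; requires s1 > 0, s2 > 0 and -1 < rho < 1."""
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    norm = 2.0 * pi * s1 * s2 * sqrt(1.0 - rho**2)
    expo = -(z1**2 + z2**2 - 2.0 * rho * z1 * z2) / (2.0 * (1.0 - rho**2))
    return exp(expo) / norm

def normal_pdf(x, mu=0.0, s=1.0):
    """One-dimensional Gauss density."""
    return exp(-0.5 * ((x - mu) / s)**2) / (sqrt(2.0 * pi) * s)

# For rho = 0 the density is the product of the two marginal densities
joint = binormal_pdf(0.7, -1.2, mu1=0.5, mu2=-1.0, s1=2.0, s2=0.5, rho=0.0)
product = normal_pdf(0.7, 0.5, 2.0) * normal_pdf(-1.2, -1.0, 0.5)
print(joint, product)

# Crude grid integration of the standardized density with rho = 0.5
step = 0.05
grid = [-6.0 + step * i for i in range(241)]
total = sum(binormal_pdf(a, b, rho=0.5) for a in grid for b in grid) * step**2
print(total)                      # should be close to 1
```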
6.2 Conditional Probability Density

The conditional density of the binormal distribution is given by

$$f(x|y) = \frac{f(x, y)}{f(y)} = \frac{1}{\sqrt{2\pi}\,\sigma_x\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2\sigma_x^2(1-\rho^2)} \left[ x - \left( \mu_x + \frac{\rho\sigma_x}{\sigma_y}(y - \mu_y) \right) \right]^2 \right\} = N\left( \mu_x + \rho\frac{\sigma_x}{\sigma_y}(y - \mu_y),\ \sigma_x^2(1-\rho^2) \right)$$

which is seen to be a normal distribution which for $\rho = 0$ is, as expected, given by $N(\mu_x, \sigma_x^2)$ but generally has a mean shifted from $\mu_x$ and a variance which is smaller than $\sigma_x^2$.

6.3 Characteristic Function

The characteristic function of the binormal distribution is given by

$$\phi(t_1, t_2) = E\left(e^{\imath t_1 x_1 + \imath t_2 x_2}\right) = \int\!\!\int e^{\imath t_1 x_1 + \imath t_2 x_2} f(x_1, x_2)\, dx_1\, dx_2 = \exp\left\{ \imath t_1\mu_1 + \imath t_2\mu_2 + \frac{1}{2}\left[ (\imath t_1)^2\sigma_1^2 + (\imath t_2)^2\sigma_2^2 + 2(\imath t_1)(\imath t_2)\rho\sigma_1\sigma_2 \right] \right\}$$

which shows that if the correlation coefficient $\rho$ is zero then the characteristic function factorizes, i.e. the variables are independent. This is a unique property of the normal distribution since in general $\rho = 0$ does not imply independence.

6.4 Moments

To find bivariate moments of the binormal distribution the simplest, but still quite tedious, way is to use the characteristic function given above (see section 2.7.3).

Algebraic bivariate moments for the binormal distribution become somewhat complicated but normally they are of less interest than the central ones. Algebraic moments of the type $\mu'_{0k}$ and $\mu'_{k0}$ are, of course, equal to moments of the marginal one-dimensional normal distribution, e.g. $\mu'_{10} = \mu_1$, $\mu'_{20} = \mu_1^2 + \sigma_1^2$, and $\mu'_{30} = \mu_1(3\sigma_1^2 + \mu_1^2)$ (for $\mu'_{0k}$ simply exchange the subscripts on $\mu$ and $\sigma$). Some other lower order algebraic bivariate moments are given by

$$\mu'_{11} = \mu_1\mu_2 + \rho\sigma_1\sigma_2$$
$$\mu'_{12} = 2\rho\sigma_1\sigma_2\mu_2 + \sigma_2^2\mu_1 + \mu_2^2\mu_1$$
$$\mu'_{22} = \sigma_1^2\sigma_2^2 + \sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \mu_1^2\mu_2^2 + 2\rho^2\sigma_1^2\sigma_2^2 + 4\rho\sigma_1\sigma_2\mu_1\mu_2$$

Beware of the somewhat confusing notation where $\mu$ with two subscripts denotes bivariate moments while $\mu$ with one subscript denotes expectation values.

Lower order central bivariate moments $\mu_{kl}$, arranged in matrix form, are given by