Lecture 5: Statistics 4 Let X, X 2, X 3, X 4, X 5 be iid random variables with a distribution F with a range of a, b). We can relabel these X s such that their labels correspond to arranging them in increasing order so that X ) X 2) X 3) X 4) X 5) Colin Rundel X) X2) X3) X4) X5) March 4, 22 a X5 X X4 X2 X3 b In the case where the distribution F is continuous we can make the stronger statement that X ) < X 2) < X 3) < X 4) < X 5) Since PX i X j ) for all i j for continuous random variables. Statistics 4 Colin Rundel) Lecture 5 March 4, 22 / 24, cont. For X, X 2,..., X n iid random variables X k is the kth smallest X, usually called the kth order statistic. X ) is therefore the smallest X and X ) minx,..., X n ) Notation Detour For a continuous random variable we can see that f )ɛ P X + ɛ) PX [, + ɛ]) lim f )ɛ lim PX [, + ɛ]) ɛ ɛ f ) lim ɛ PX [, + ɛ])/ɛ Similarly, X n) is the largest X and X n) max,..., X n ) P apple X apple + ) P X 2 [, + ]) f) f + ) + Statistics 4 Colin Rundel) Lecture 5 March 4, 22 2 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 3 / 24
Density of the maimum Density of the minimum For X, X 2,..., X n iid continuous random variables with pdf f and cdf F the density of the maimum is For X, X 2,..., X n iid continuous random variables with pdf f and cdf F the density of the minimum is PX n) [, + ɛ]) Pone of the X s [, + ɛ] and all others < ) n PX i [, + ɛ] and all others < ) i npx [, + ɛ] and all others < ) npx [, + ɛ])pall others < ) npx [, + ɛ])px 2 < ) PX n < ) nf )ɛf ) n PX ) [, + ɛ]) Pone of the X s [, + ɛ] and all others > ) n PX i [, + ɛ] and all others > ) i npx [, + ɛ] and all others > ) npx [, + ɛ])pall others > ) npx [, + ɛ])px 2 > ) PX n > ) nf )ɛ F )) n f n) ) nf )F ) n f ) ) nf ) F )) n Statistics 4 Colin Rundel) Lecture 5 March 4, 22 4 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 5 / 24 Density of the kth Order Statistic Cumulative Distribution of the min and ma For X, X 2,..., X n iid continuous random variables with pdf f and cdf F the density of the kth order statistic is PX k) [, + ɛ]) Pone of the X s [, + ɛ] and eactly k of the others < ) n PX i [, + ɛ] and eactly k of the others < ) i npx [, + ɛ] and eactly k of the others < ) npx [, + ɛ])peactly k of the others < ) ) n npx [, + ɛ]) )PX < ) k PX > ) n k k ) n f k) ) nf ) F ) k F )) n k k Statistics 4 Colin Rundel) Lecture 5 March 4, 22 6 / 24 For X, X 2,..., X n iid continuous random variables with pdf f and cdf F the density of the kth order statistic is F ) ) PX ) < ) PX ) > ) PX >,..., X n > ) PX > ) PX n > ) F )) n F n) ) PX n) < ) PX n) > ) ) PX <,..., X n < ) PX < ) PX n < ) n nf )ɛ F ) k F ) F n )) n k k f ) ) d d F ))n n F )) n df ) nf ) F )) n d f n) ) d d F )n nf ) n df ) nf )F ) n d Statistics 4 Colin Rundel) Lecture 5 March 4, 22 7 / 24
Order Statistic of Standard Uniforms Beta Distribution Let X, X 2,..., X n iid Unif, ) then the density of Xn) is given by ) n f k) ) nf ) F ) k F )) n k k n n k ) k ) n k if < < otherwise This is an eample of the Beta distribution where r k and s n k +. X k) Betak, n k + ) The Beta distribution is a continuous distribution defined on the range, ) where the density is given by f ) Br, s) r ) s where Br, s) is called the Beta function and it is a normalizing constant which ensures the density integrates to. Br, s) f )d Br, s) Br, s) r ) s d r ) s d r ) s d Statistics 4 Colin Rundel) Lecture 5 March 4, 22 8 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 9 / 24 Beta Function Beta Function - Epectation The connection between the Beta distribution and the kth order statistic of n standard Uniform random variables allows us to simplify the Beta function. Br, s) r ) s d Bk, n k + ) n ) n k k )!n k + )! nn )! r )!n k)! n! r )!s )! Γr)Γs) r + s )! Γr + s) Let X Betar, s) then EX ) Br, s) r ) s d, r+) ) s d Br, s) Br +, s) Br, s) r!s )! r + s )! r + s)! r )!s )! r! r + s )! r )! r + s)! r r + s Statistics 4 Colin Rundel) Lecture 5 March 4, 22 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 / 24
Beta Function - Variance Beta Distribution - Summary Let X Betar, s) then If X Betar, s) then EX 2 ) 2 Br, s) r ) s d Br + 2, s) Br, s) r + )r r + s + )r + s) VarX ) EX 2 ) EX ) 2 r + )!s )! r + s + )! r + )r r + s + )r + s) r 2 r + s) 2 r + )rr + s) r 2 r + s + ) r + s + )r + s) 2 rs r + s + )r + s) 2 r + s )! r )!s )! f ) Br, s) r ) s F ) Br, s) B r, s) Br, s) r ) s d B r, s) Br, s) r ) s d r ) s d EX ) r r + s rs VarX ) r + s) 2 r + s + ) r )!s )! r + s )! Γr)Γs) Γr + s) Statistics 4 Colin Rundel) Lecture 5 March 4, 22 2 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 3 / 24 Minimum of Eponentials Maimum of Eponentials Let X, X 2,..., X n iid Epλ), we previously derived a more general result where the X s were not identically distributed and showed that minx,..., X n ) Epλ + + λ n ) Epnλ) in this more restricted case. Lets confirm that result using our new more general methods f ) ) nf ) F )) n n λe λ) [ e λ ] ) n iid Let X, X 2,..., X n Epλ) then the density of Xn) is given by f n) ) nf )F ) n n λe λ) e λ) n Which we can t do much with, instead we can try the cdf of the maimum. nλe λ e λ ] ) n nλ e λ ] nλe nλ ) n Which is the density for Epnλ). Statistics 4 Colin Rundel) Lecture 5 March 4, 22 4 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 5 / 24
Maimum of Eponentials, cont. Limit Distributions of Maima and Minima Let X, X 2,..., X n iid Epλ) then the cdf of Xn) is given by Previous we have shown that F n) ) F ) n e λ) n ne λ n F n) ) ep ne λ ) ) n lim F n)) lim n n ep ne λ ) This result is not unique to the eponential distribution... When n tends to infinity we get F ) ) PX ) < ) F )) n F n) ) PX n) < ) F ) n lim F )) lim F n n ))n lim F n)) lim F n n )n if F ) if F ) < if F ) if F ) > Statistics 4 Colin Rundel) Lecture 5 March 4, 22 6 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 7 / 24 Limit Distributions of Maima and Minima, cont. Maimum of Eponentials, cont. These results show that the limit distributions are degenerate as they only take values of or. To avoid the degeneracy we would like to use a simple transform that such that the limit distributions are not degenerate. Let s consider simple linear transformations lim F n)a n + b n ) lim F a n + b n ) n F ) n n lim F )c n + d n ) lim F c n + d n )) n F ) n n Xn) a n F n) a n + b n ) PX n) < a n + b n ) P b n ) < ) X) c n F ) c n + d n ) PX ) < c n + d n ) P < d n Let X, X 2,..., X n iid Epλ) and an logn)/λ, b n /λ then the cdf of X n) is given by F n) a n + b n ) F logn) + )/λ) n e λlogn)+)/λ) n e logn) e ) n e /n ) n lim F n)a n + b n ) ep e ) n This is known as the standard Gumbel distribution. Statistics 4 Colin Rundel) Lecture 5 March 4, 22 8 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 9 / 24
Gumbel Distribution Let X Gumbel, ) then Maimum of Eponentials, cont. Let X, X 2,..., X n iid Epλ) and an logn)/λ, b n /λ then if n is large we can use the Standard Gumbel to calculate properties of X n). Standard Gumbel Distribution - pdf Standard Gumbel Distribution - cdf f)...2.3 F)..2.4.6.8. MedianX n) ) m n) PX n) < m n) ) /2 ) Xn) a n P < m G /2 b n PX n) < a n + b n m G ) /2-5 5 F ) ep e ) f ) e ep e ) -5 5 EX ) π/ 6 MedianX ) loglog2)) m n) a n + b n m G λ log n log log 2 λ Statistics 4 Colin Rundel) Lecture 5 March 4, 22 2 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 2 / 24 Maimum of Eponentials, Eample Maimum of Uniforms In 29 Usain Bolt broke the world record in the meters with a time of 9.58 seconds in Berlin, Germany. If we imagine that the running speed in m/s of competitive sprinters is given by an Eponential distribution with λ. How many sprinters would need to run to have a 5/5 chance of beating Usian Bolt s record having a faster running speed)? Let X Ep) then we need to find n such that MedianX n) ) /9.58 Let X, X 2,..., X n iid Unif, ) and an, b n /n then the cdf of X n) is given by F n) ) F ) n n F n) a n + b n ) F a n + b n ) n lim F n)a n + b n ) e n + /n) n This is eample of the Reverse Weibul distribution. Statistics 4 Colin Rundel) Lecture 5 March 4, 22 22 / 24 Statistics 4 Colin Rundel) Lecture 5 March 4, 22 23 / 24
Maimum of Paretos This is a distribution we have not seen yet, but is useful for describing many physical processes. It s key feature is that it has long tails meaning it goes to slower than a distribution like the normal. Let X, X 2,..., X iid n Paretoα, k) and a n, b n kn /α then the cdf of X is ) k α if k, F X ) otherwise The cdf of X n) is then given by F n) ) F ) n F n) a n + b n ) F a n + b n ) n lim F n)a n + b n ) e α n ) α ) n k k kn /α ) α ) n α n ) n This is eample of the Fréchet distribution. Statistics 4 Colin Rundel) Lecture 5 March 4, 22 24 / 24