MATH20812: PRACTICAL STATISTICS I
SEMESTER 2
NOTES ON EXPONENTIAL DISTRIBUTION

The exponential distribution (a continuous distribution) is widely used to describe inter-arrival times, lifetimes and times-to-failure.

Definition

Probability Density Function

If a random variable $X$ has the pdf

$$f(x) = \lambda \exp(-\lambda x), \quad x \ge 0, \ \lambda > 0 \tag{1}$$

then it is said to have the exponential distribution with parameter $\lambda$. This parameter $\lambda$ represents the mean number of events per unit time (e.g. the rate of arrivals or the rate of failure).

Plots of the PDF

Use Minitab to plot (1) as a function of $x$ for various values of $\lambda$. What do you see? Try other values for $\lambda$ to see how the plots change.

Properties

The exponential random variable $X$ has the following properties:

- it is closely related to the Poisson distribution: if $X$ describes, say, the time between two successive failures, then the number of failures per unit time has the Poisson distribution with parameter $\lambda$;

- it is the only continuous distribution having the memoryless property:
$$\Pr(X > x + t \mid X > x) = \Pr(X > t), \tag{2}$$
i.e. the future lifetime of an individual has the same distribution no matter how old it is at present;

- it is the only continuous distribution with the property
$$\Pr(X > nx) = \{\Pr(X > x)\}^n; \tag{3}$$

- it is the only continuous distribution with the property
$$\int_z^\infty \Pr(X > x)\,dx = c \Pr(X > z), \tag{4}$$
where $c$ is a constant (here $c = 1/\lambda$);

- it is the only continuous distribution with the property
$$\frac{d}{dx}\Pr(X > x) = c \Pr(X > x), \tag{5}$$
where $c$ is a constant (here $c = -\lambda$);
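- it is the only continuous distribution with the property
$$E[\exp(-X) \mid X \ge y] = \exp(-y)\,E[\exp(-X)]; \tag{6}$$

- it is the only continuous distribution with the property
$$E\left[\exp\left\{-(X - y)^2/2\right\} \mid X \ge y\right] = E\left[\exp\left(-X^2/2\right)\right]; \tag{7}$$

- it is the only continuous distribution with the property
$$Var(X \mid X > y) = c; \tag{8}$$

- it is the only continuous distribution with the property
$$E\left[(X - y)^2 \mid X > y\right] = c; \tag{9}$$

- the cdf is
$$F(x) = 1 - \exp(-\lambda x); \tag{10}$$

- the $100(1 - \alpha)\%$ percentile is
$$x_\alpha = -\frac{1}{\lambda}\log(\alpha). \tag{11}$$
The median is $(\log 2)/\lambda$;

- the expected value is:
$$E(X) = \frac{1}{\lambda}; \tag{12}$$

- the variance is:
$$Var(X) = \frac{1}{\lambda^2}; \tag{13}$$

- the third central moment is:
$$E\left[\{X - E(X)\}^3\right] = \frac{2}{\lambda^3}; \tag{14}$$

- the fourth central moment is:
$$E\left[\{X - E(X)\}^4\right] = \frac{9}{\lambda^4}; \tag{15}$$

- the mean deviation is:
$$E\{|X - E(X)|\} = \frac{2}{\lambda e}; \tag{16}$$

- the coefficient of variation is:
$$CV(X) = 1; \tag{17}$$

- the entropy is:
$$E[-\log f(X)] = 1 - \log \lambda. \tag{18}$$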
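Several of the properties listed above can be checked numerically. The following is a minimal sketch in Python rather than Minitab, using only the standard library; the rate `lam = 2.0`, the points `x` and `t`, the seed and the helper names are illustrative assumptions, not part of the notes. It compares the memoryless property, the power property, the median, and simulated estimates of the mean, variance and mean deviation against the closed forms.

```python
import math
import random


def exp_survival(x, lam):
    """Pr(X > x) = exp(-lam * x) for x >= 0 (one minus the cdf)."""
    return math.exp(-lam * x) if x >= 0 else 1.0


def exp_percentile(alpha, lam):
    """Point x_alpha with Pr(X > x_alpha) = alpha, i.e. -log(alpha) / lam."""
    return -math.log(alpha) / lam


lam = 2.0          # illustrative rate, chosen for this sketch only
x, t = 0.7, 1.3    # illustrative time points

# Memoryless property: Pr(X > x + t | X > x) should equal Pr(X > t).
conditional = exp_survival(x + t, lam) / exp_survival(x, lam)
print("memoryless:", conditional, exp_survival(t, lam))

# Power property: Pr(X > n x) should equal Pr(X > x) ** n.
n = 4
print("power:", exp_survival(n * x, lam), exp_survival(x, lam) ** n)

# Median: the 50% point should equal log(2) / lam.
median = exp_percentile(0.5, lam)
print("median:", median, math.log(2) / lam)

# Monte Carlo check of the mean, variance and mean deviation.
rng = random.Random(20812)  # fixed seed so the run is repeatable
sample = [rng.expovariate(lam) for _ in range(200_000)]
mean = sum(sample) / len(sample)
var = sum((v - mean) ** 2 for v in sample) / len(sample)
mean_dev = sum(abs(v - 1 / lam) for v in sample) / len(sample)
print("mean:", mean, "theory:", 1 / lam)
print("variance:", var, "theory:", 1 / lam ** 2)
print("mean deviation:", mean_dev, "theory:", 2 / (lam * math.e))
```

The simulated values agree with the closed forms only up to sampling error, so small discrepancies in the third decimal place are to be expected.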
Estimation

Suppose you have a dataset $x_1, x_2, \ldots, x_n$ from an exponential distribution with parameter $\lambda$ unknown. The estimate for $\lambda$ is:

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}. \tag{19}$$

A 95% confidence interval for the true value of $\lambda$ is:

$$\left( \frac{\chi^2_{2n,0.025}}{2n\bar{x}}, \ \frac{\chi^2_{2n,0.975}}{2n\bar{x}} \right), \tag{20}$$

where $\bar{x}$ is the sample mean and the $\chi^2$ values can be read off from the attached table for the $\chi^2$ distribution.

Use Minitab to compute $\hat{\lambda}$ and the confidence interval for the dataset: (0.2324184, 1.2206298, 0.1363881, 0.8816330, 1.0186222).

Model Checking

Q-Q Plot

The quantile-quantile plot is an informal graphical tool for determining whether a given dataset comes from a certain distribution. For the exponential distribution the plot is produced as follows:

(i) arrange the data in ascending order so that
$$x_{(1)} \le x_{(2)} \le x_{(3)} \le \cdots \le x_{(n)},$$
where $x_{(1)}$ is the smallest observation, $x_{(2)}$ is the second smallest observation, $x_{(3)}$ is the third smallest observation, and so on up to $x_{(n)}$, the largest observation; these $x_{(i)}$ are known as the order statistics or the observed quantiles;

(ii) compute $y_i$ by
$$y_i = -\frac{1}{\hat{\lambda}} \log\left(\frac{n + 1 - i}{n + 1}\right), \quad i = 1, 2, \ldots, n; \tag{21}$$
these $y_i$ are known as the expected quantiles;

(iii) plot $y_i$ versus $x_{(i)}$ for $i = 1, 2, \ldots, n$.

Use Minitab to draw the Q-Q plot for the exponential distribution using the same dataset as above. If all the points lie close to the 45 degree straight line, it is an indication that the data come from the exponential distribution.

P-P Plot

The probability-probability plot is another informal graphical tool for determining whether a given dataset comes from a certain distribution. For the exponential distribution the plot is produced as follows:
(i) arrange the data in ascending order so that
$$x_{(1)} \le x_{(2)} \le x_{(3)} \le \cdots \le x_{(n)},$$
where $x_{(1)}$ is the smallest observation, $x_{(2)}$ is the second smallest observation, $x_{(3)}$ is the third smallest observation, and so on up to $x_{(n)}$, the largest observation; then compute the probabilities
$$p_{(i)} = 1 - \exp\left(-\hat{\lambda} x_{(i)}\right); \tag{22}$$
these are known as the observed probabilities;

(ii) compute $p_i$ by
$$p_i = \frac{i}{n + 1}, \quad i = 1, 2, \ldots, n; \tag{23}$$
these $p_i$ are known as the expected probabilities;

(iii) plot $p_i$ versus $p_{(i)}$ for $i = 1, 2, \ldots, n$.

Use Minitab to draw the P-P plot for the exponential distribution using the same dataset as above. If all the points lie close to the 45 degree straight line, it is an indication that the data come from the exponential distribution.

Repeat the Q-Q and P-P plots for the larger dataset: 0.531, 1.696, 0.189, 0.119, 0.802, 0.689, 0.399, 1.116, 0.807, 0.609.

Simulation

If you want to generate data that have a certain distribution, there are tools for that too.

Generalized Exponential Distributions

Two-parameter exponential distribution:
$$f(x) = \frac{1}{\sigma} \exp\left(-\frac{x - \theta}{\sigma}\right), \tag{24}$$
where $x > \theta > 0$.

Double exponential distribution:
$$f(x) = \frac{1}{2\sigma} \exp\left(-\frac{|x - \theta|}{\sigma}\right), \tag{25}$$
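where $-\infty < x < \infty$.

Truncated exponential distribution:
$$f(x) = \frac{1}{\sigma} \exp\left(-\frac{x - \theta}{\sigma}\right) \Big/ \left[1 - \exp\left(-\frac{x_0 - \theta}{\sigma}\right)\right], \tag{26}$$
where $\theta < x < x_0$.

Mixtures of exponential distributions:
$$f(x) = \frac{w}{\sigma_1} \exp\left(-\frac{x}{\sigma_1}\right) + \frac{1 - w}{\sigma_2} \exp\left(-\frac{x}{\sigma_2}\right), \tag{27}$$
where $x > 0$. Also known as the Schuhl distribution.

General Erlang distribution:
$$f(x) = \sum_{j=1}^{n} \lambda_j^{n-2} \exp\left(-\frac{x}{\lambda_j}\right) \prod_{k \ne j} \frac{1}{\lambda_j - \lambda_k}, \tag{28}$$
where $x > 0$.

Ryu's generalized exponential distribution:
$$f(x) = \left\{\lambda_1 + \lambda_{12}\left(1 - \exp(-sx)\right)\right\} \exp\left\{-\lambda_1 x - \lambda_{12} x + \frac{\lambda_{12}}{s}\left(1 - \exp(-sx)\right)\right\}, \tag{29}$$
where $x > 0$.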
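The Simulation remark and two of the generalized forms above can be illustrated together. The sketch below is in Python rather than Minitab and uses only the standard library. It generates exponential data by inverting the cdf $F(x) = 1 - \exp(-\lambda x)$, then reuses that generator for the two-parameter form (read here as a shift by $\theta$ of an exponential with mean $\sigma$) and for the two-component mixture (read here as mean $\sigma_1$ with probability $w$, otherwise mean $\sigma_2$). The function names, parameter values and seed are assumptions made for illustration only.

```python
import math
import random


def sample_exponential(lam, rng):
    """Inverse transform: solve u = 1 - exp(-lam * x) for x."""
    u = rng.random()                 # uniform on [0, 1)
    return -math.log(1.0 - u) / lam  # 1 - u lies in (0, 1], so the log is finite


def sample_two_parameter(theta, sigma, rng):
    """Location theta plus an exponential variable with mean sigma."""
    return theta + sample_exponential(1.0 / sigma, rng)


def sample_mixture(w, sigma1, sigma2, rng):
    """Exponential with mean sigma1 with probability w, else mean sigma2."""
    sigma = sigma1 if rng.random() < w else sigma2
    return sample_exponential(1.0 / sigma, rng)


def average(values):
    return sum(values) / len(values)


rng = random.Random(1)  # fixed seed so the run is repeatable
size = 100_000

plain = [sample_exponential(3.0, rng) for _ in range(size)]
shifted = [sample_two_parameter(2.0, 0.5, rng) for _ in range(size)]
mixed = [sample_mixture(0.3, 1.0, 4.0, rng) for _ in range(size)]

# Compare simulated means with the theoretical means.
print("exponential:", average(plain), "theory:", 1 / 3.0)
print("two-parameter:", average(shifted), "theory:", 2.0 + 0.5)
print("mixture:", average(mixed), "theory:", 0.3 * 1.0 + 0.7 * 4.0)
```

A histogram of `plain` should resemble the density for the chosen rate. The other forms in the list would need their own inversion or a different sampling method.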
Figures

[Figure: Exponential PDF, four panels for lambda = 1, 3, 5 and 10; vertical axis labelled PDF.]

[Figure: Q-Q Plot; axes labelled Expected and Observed.]

[Figure: P-P Plot; axes labelled Expected and Observed.]

[Figure: Simulated Data; histogram with vertical axis labelled Frequency.]
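The estimation and model-checking steps in these notes can also be reproduced outside Minitab. The sketch below, in Python with only the standard library, computes the estimate of $\lambda$, a 95% confidence interval, and the coordinates for the Q-Q and P-P plots for the five-point dataset. Two assumptions are made here: $\chi^2_{2n,p}$ is read as the point with lower-tail probability $p$, and the interval endpoints are taken as $\chi^2_{2n,p} / (2 \sum x_i)$. Because the degrees of freedom $2n$ are even, the $\chi^2$ cdf has a closed form, and the helper functions `chi2_cdf_even` and `chi2_quantile_even` (hypothetical names) use it in place of the printed table.

```python
import math

# Five-point dataset quoted in the Estimation section of the notes.
data = [0.2324184, 1.2206298, 0.1363881, 0.8816330, 1.0186222]


def chi2_cdf_even(x, df):
    """Chi-square cdf for even df = 2n: 1 - exp(-x/2) * sum_{k<n} (x/2)^k / k!."""
    n = df // 2
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even(p, df):
    """Point with lower-tail probability p, found by bisection on the cdf."""
    low, high = 0.0, 1.0
    while chi2_cdf_even(high, df) < p:
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if chi2_cdf_even(mid, df) < p:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


n = len(data)
total = sum(data)

# Estimate of lambda: n divided by the sum of the observations.
lam_hat = n / total

# 95% interval, assuming lower-tail chi-square points (see lead-in).
ci_lower = chi2_quantile_even(0.025, 2 * n) / (2.0 * total)
ci_upper = chi2_quantile_even(0.975, 2 * n) / (2.0 * total)

# Q-Q plot: observed quantiles are the ordered data, expected are y_i.
ordered = sorted(data)
ranks = range(1, n + 1)
qq_expected = [-math.log((n + 1 - i) / (n + 1)) / lam_hat for i in ranks]

# P-P plot: observed probabilities p_(i) and expected probabilities p_i.
pp_observed = [1.0 - math.exp(-lam_hat * x) for x in ordered]
pp_expected = [i / (n + 1) for i in ranks]

print("lambda hat:", lam_hat)
print("95% interval:", (ci_lower, ci_upper))
for x_i, y_i in zip(ordered, qq_expected):
    print("Q-Q point (observed, expected):", x_i, y_i)
for p_obs, p_exp in zip(pp_observed, pp_expected):
    print("P-P point (observed, expected):", p_obs, p_exp)
```

The printed pairs can be passed to any plotting tool and compared with the 45 degree line. Replacing `data` with the ten-point dataset repeats the exercise for the larger sample.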