A modified Kolmogorov-Smirnov test for normality

MPRA: Munich Personal RePEc Archive

A modified Kolmogorov-Smirnov test for normality

Zvi Drezner and Ofir Turel and Dawit Zerom
California State University-Fullerton
22. October 2008

Online at MPRA Paper No , posted 1. April :38 UTC

A Modified Kolmogorov-Smirnov Test for Normality

Zvi Drezner, Ofir Turel and Dawit Zerom
Steven G. Mihaylo College of Business and Economics
California State University-Fullerton
Fullerton, CA

Abstract

In this paper we propose an improvement of the Kolmogorov-Smirnov test for normality. In the current implementation of the Kolmogorov-Smirnov test, a sample is compared with a normal distribution where the sample mean and the sample variance are used as parameters of the distribution. We propose to select the mean and variance of the normal distribution that provide the closest fit to the data. This is like shifting and stretching the reference normal distribution so that it fits the data in the best possible way. If this shifting and stretching does not lead to an acceptable fit, the data is probably not normal. We also introduce a fast, easily implementable algorithm for the proposed test. A study of the power of the proposed test indicates that the test is able to discriminate between the normal distribution and distributions such as uniform, bi-modal, beta, exponential and log-normal that are different in shape, but has a relatively lower power against the Student t-distribution that is similar in shape to the normal distribution. In model settings, the former distinction is typically more important to make than the latter distinction. We demonstrate the practical significance of the proposed test with several simulated examples.

Keywords: Closest fit; Kolmogorov-Smirnov; Normal distribution.

1 Introduction

Many data analysis methods depend on the assumption that data were sampled from a normal distribution or at least from a distribution which is sufficiently close to a normal distribution. For example, one often tests normality of residuals after fitting a linear model to the data in order to ensure the normality assumption of the model is satisfied.
Such an assumption is of great importance because, in many cases, it determines the method that ought to be used to estimate the unknown parameters in the model and also dictates the test procedures which the analyst may apply.

Corresponding author: Mihaylo College of Business and Economics, California State University, Fullerton, CA, , (714) , dzerom@fullerton.edu.

There are several tests available to determine if a sample comes from a normally distributed population. Those theory-driven tests include the Kolmogorov-Smirnov test, Anderson-Darling test, Cramer-von Mises test, Shapiro-Wilk test and Shapiro-Francia test. The first three tests are based on the empirical cumulative distribution. The Shapiro-Francia test (Shapiro and Francia, 1972 and Royston, 1983) is specifically designed for testing normality and is a modification of the more general Shapiro-Wilk test (Shapiro and Wilk 1965). There are also tests that exploit the shape of the distribution of the data. For example, the widely available Jarque-Bera test (Jarque and Bera, 1980) is based on skewness and kurtosis of the data. To complement the results of formal tests, graphical methods (such as box-plots and Q-Q plots) have also been used, and increasingly so in recent years.

In this paper we focus on the Kolmogorov-Smirnov (KS) test. The KS test is arguably the most well-known test for normality. It is also available in most widely used statistical software packages. In its original form, the KS test is used to decide if a sample comes from a population with a completely specified continuous distribution. In practice, however, we often need to estimate one or more of the parameters of the hypothesized distribution (say, the normal distribution) from the sample, in which case the critical values of the KS test may no longer be valid. For the case of normality testing, Massey (1951) suggests using sample mean and sample variance, and this is the norm in the current use of the KS test. Lilliefors (1967) and Dallal and Wilkinson (1986) provide a table of approximate critical values for use with the KS statistic when using sample mean and sample variance.

While the use of sample mean and sample variance seems a natural choice, using these fixed values is not necessarily the best available option. When one concludes (after using the KS test) that a sample is not normal, this only means that the data is not normal at the specified sample mean and sample variance. But it could well be that the data is normal or sufficiently close to normal at other values of the mean and variance of the normal distribution. Although the scope of this paper is limited to the KS test, this drawback is also shared by other tests such as the Anderson-Darling and Cramer-von Mises tests. Interestingly, Stephens (1974) writes after comparing several tests (such as KS, Anderson-Darling and so on): "It appears that since one is trying, in effect, to fit a density of a certain shape to the data, the precise location and scale is relatively unimportant, and being tied down to fixed values, even correct ones, is more of a hindrance than a help."

In this paper, we suggest an approach that circumvents the need to use pre-determined values of mean and variance. Instead, we look for mean and variance values such that the resulting normal distribution fits the given sample data. When such values do not exist, we conclude that the sample data is probably not normally distributed. Avoiding the use of fixed parameters, we propose a modified KS test in which we choose data-driven mean and variance values of the normal distribution by minimizing the KS statistic. In the traditional KS test, the data is compared against a normal distribution with fixed parameter values. On the other hand, our approach looks for a normal distribution that fits the data in the best possible way, and hence favors the sample data when passing judgment about its closeness to a normal distribution.

Suppose that the sample consists of $n$ independent observations. These observations are sorted $x_1 \le x_2 \le \dots \le x_n$. The cumulative distribution of the data is a step function (see Figures 1 and 2). At each $x_k$ the step is between $\frac{k-1}{n}$ and $\frac{k}{n}$. For a given mean $\mu$ and variance $\sigma^2$, the cumulative normal distribution at $x_k$ is $\Phi\left(\frac{x_k - \mu}{\sigma}\right)$. The KS statistic is given by

\[ KS(\mu, \sigma) = \max_{1 \le k \le n} \left\{ \frac{k}{n} - \Phi\left(\frac{x_k - \mu}{\sigma}\right),\; \Phi\left(\frac{x_k - \mu}{\sigma}\right) - \frac{k-1}{n} \right\}. \tag{1} \]

The traditional KS statistic is simply $KS(\bar{x}, s)$ where $\mu = \bar{x}$ and $\sigma = s$. We propose a modified KS statistic denoted by $KS(\hat{\mu}, \hat{\sigma})$ where the vector $(\hat{\mu}, \hat{\sigma})$ is a solution to the following minimization problem

\[ \min_{\mu, \sigma} \left\{ KS(\mu, \sigma) \right\} \tag{2} \]

where $KS(\mu, \sigma)$ is as defined in (1). In section 2, we analyze this optimization problem and provide a tractable algorithm for its solution. In section 3, we provide critical values for the modified KS test using 100 million replications. The proposed algorithm is quite efficient and we are able to complete the critical values table (Table 1) in less than 4 days (6000 calculations per second). To facilitate implementation of our test, we also provide approximation formulas (that work for any $n \ge 20$) for finding critical values at typical significance levels.

To the best of our knowledge, there has not been any study that extends the KS test by allowing the use of optimized distribution parameters. Closely related to our work is that of Weber et al (2006), where they consider the problem of parameter estimation of continuous distributions (not just the normal distribution) via minimizing the KS statistic. They use the heuristic optimization algorithm of Sobieszczanski-Sobieski et al (1998) to estimate the parameters of a number of widely used distributions and also provide a user-friendly software tool. The practical advantage of this software is that it suggests a best fitted distribution to given data by looking at the minimized KS statistic values among a set of continuous distributions. In this sense, our algorithm of minimizing the KS statistic may also serve the same purpose as that of Weber et al (2006), although our paper is wider in scope.

To motivate our modified KS test, we give two Monte Carlo based examples that highlight the weaknesses of the existing KS test and offer interesting practical implications for proper use of the KS test.

Example 1: We generate 999 standard normal random samples of size $n = 30$. The choice of 999 samples (instead of, say, 1000) is only to facilitate the calculation of the median sample, as we will see below. For each sample, we calculate the two KS statistic values, $KS(\bar{x}, s)$ and $KS(\hat{\mu}, \hat{\sigma})$, where the algorithm in section 2 is used to compute $\hat{\mu}$ and $\hat{\sigma}$. We also compute $\Delta = KS(\bar{x}, s) - KS(\hat{\mu}, \hat{\sigma})$, which is simply the difference between the two KS statistic values. It should be noted that $KS(\hat{\mu}, \hat{\sigma}) \le KS(\bar{x}, s)$ and hence $\Delta \ge 0$. We do the above steps for all 999 samples. Let $\Delta_j$ denote the value obtained for sample $j$, where $j = 1, \dots, 999$. We select a typical sample, say the $k$-th sample, to be the one where $\Delta_k = \mathrm{Median}\{\Delta_j\}_{j=1}^{999}$. Similarly, we select an extreme sample, say the $l$-th sample, to be the one where $\Delta_l = \mathrm{Max}\{\Delta_j\}_{j=1}^{999}$.

Based on the typical sample (sample $k$), Figure 1 gives the empirical cumulative distribution (the step function), the cumulative normal distribution (the dotted line) based on the sample mean ($\bar{x}_k =$ ) and sample variance ($s_k = 1.022$), and the cumulative normal distribution (the solid line) based on $\hat{\mu}_k =$ and $\hat{\sigma}_k =$ . The subscript $k$ is attached to estimates to indicate that they correspond to the typical sample $k$. For this typical sample, $KS(\bar{x}_k, s_k) =$ and $KS(\hat{\mu}_k, \hat{\sigma}_k) =$ , which indicate a 26% improvement by the latter. Note from the empirical cumulative distribution plots that the solid line is closer overall to the sample cumulative distribution. Using the critical values in Table 1 (for $n = 30$), both KS statistic values lead to the non-rejection of the null of normality with p-value $p > 0.2$. This conclusion is correct, as we know the sample is generated from a normal distribution.

Based on the extreme sample (sample $l$), Figure 2 gives the empirical cumulative distribution (the step function), the cumulative normal distribution (the dotted line) based on the sample mean ($\bar{x}_l =$ ) and sample variance ($s_l =$ ), and the cumulative normal distribution (the solid line) based on $\hat{\mu}_l =$ and $\hat{\sigma}_l =$ . For this sample, $KS(\bar{x}_l, s_l) =$ and $KS(\hat{\mu}_l, \hat{\sigma}_l) =$ , which indicate a 50% improvement by the latter. From the empirical cumulative distribution plots, the solid line is much closer to the sample cumulative distribution for data values roughly below -0.5, and these values constitute approximately 80% of the data observations. Using the critical values table for $n = 30$ (Table 1), the traditional KS test implies that the sample data deviates from normality (at p-value $p < 0.01$). On the other hand, the modified KS test concludes that we cannot reject the null of normality at a convincing p-value $p > 0.2$. The conclusion from our test proposal is correct, as the sample is generated from a normal distribution. This example illustrates that the sample mean and sample variance do not necessarily provide the closest fit to the empirical distribution of the sample. Our approach shifts and stretches the normal distribution (by looking for data-driven mean and variance values) so that it fits the sample data in the best possible way.

Example 2: We consider $n = 20, 40, \dots, 400$ (in an interval of 20). For each $n$, we generate 10,000 standard normal random samples of size $n - 1$ and add one outlier. We define an outlier as outlier $= C$, where the constant $C$ takes values $4, 5, \dots, 10$. We will only report results for $C = 4, 6, 8, 10$, as the implications from the other outliers are qualitatively similar. The purpose of this example is to evaluate the two tests, the traditional KS test (which is based on $KS(\bar{x}, s)$) and the modified KS test (which is based on $KS(\hat{\mu}, \hat{\sigma})$), in terms of their size using the level of significance $\alpha = 0.05$. When implementing both tests, we use the approximation formula in Table 2 for locating the critical values. Using 10,000 replications, we plot the size of the two tests for each $n$ in Figure 3. Size is defined as the percentage of times (out of the total 10,000 samples) a test rejects the null hypothesis of normality. If a test is correctly sized, this percentage should be close to 0.05. The dotted line in the figure corresponds to the size of the modified KS test, while the solid lines correspond to the traditional KS test. Interestingly, the size of the modified KS test is always close to 0.05 regardless of the magnitude of the outlier for all $n$ (the average size from all $n$ is with standard deviation of ). However, the traditional KS test is very sensitive to outliers, leading to clearly wrong conclusions about the distribution of the data. While increasing the sample size seems to help minimize the effect of an outlier on the test, we still need unrealistically large sample sizes to get rid of the effect.

This example is only meant to illustrate the danger of using fixed parameter values that do not respond to the structure of sample data. The modified KS test adapts to the data by attempting (via the choice of $\hat{\mu}$ and $\hat{\sigma}$) to fit the normal distribution to the majority of the data by weighting down the outlier. In practice, researchers often deal with small data sets with potentially a few outliers. Even if much of the data may be well approximated by a normal distribution, a blind use of the traditional KS test will lead to rejection of normality, suggesting the use of transformations or complex models. In contrast, the modified KS test is robust to these few outliers and can lead to more nuanced judgments regarding the normality of the data.

2 Algorithm

In this section, we analyze the optimization problem given in equation (2) and provide a tractable algorithm for its solution. By (1), for every $k = 1, \dots, n$,

\[ KS(\mu, \sigma) \ge \frac{k}{n} - \Phi\left(\frac{x_k - \mu}{\sigma}\right), \qquad KS(\mu, \sigma) \ge \Phi\left(\frac{x_k - \mu}{\sigma}\right) - \frac{k-1}{n}. \]
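As a concrete reading of equation (1) and of the two inequalities above, the following minimal sketch evaluates $KS(\mu, \sigma)$ for a given sample. It is an illustration, not the authors' code: the function names `ks_statistic` and `traditional_ks` are hypothetical, and it assumes that $s$ is the usual sample standard deviation with the $n - 1$ denominator, which the text does not state explicitly.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, mu, sigma):
    """Evaluate KS(mu, sigma) as in equation (1) against a normal reference."""
    xs = sorted(sample)                  # x_1 <= x_2 <= ... <= x_n
    n = len(xs)
    ref = NormalDist(mu, sigma)
    worst = 0.0
    for k, x in enumerate(xs, start=1):
        p = ref.cdf(x)                   # Phi((x_k - mu) / sigma)
        # Gap to the top (k/n) and to the bottom ((k-1)/n) of the k-th step.
        worst = max(worst, k / n - p, p - (k - 1) / n)
    return worst


def traditional_ks(sample):
    """KS(x_bar, s): assumption here is that s uses the n - 1 denominator."""
    return ks_statistic(sample, mean(sample), stdev(sample))
```

For example, `ks_statistic([-1.0, 0.0, 1.0], 0.0, 1.0)` compares a three-point sample with the standard normal distribution, and `traditional_ks` on the same sample gives the same value because its mean is 0 and its standard deviation is 1.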

Let $L$ be the minimum possible value of $KS(\mu, \sigma)$. The solution to the following optimization problem is the minimum possible $KS(\mu, \sigma)$ and thus is equivalent to (2):

\[ \min \{ L \} \tag{3} \]

subject to:

\[ \frac{k}{n} - \Phi\left(\frac{x_k - \mu}{\sigma}\right) \le L \quad \text{for } k > nL \tag{4} \]

\[ \Phi\left(\frac{x_k - \mu}{\sigma}\right) - \frac{k-1}{n} \le L \quad \text{for } k < n(1 - L) + 1. \tag{5} \]

Note that if $\frac{k}{n} - L \le 0$, constraint (4) is always true, and if $L + \frac{k-1}{n} \ge 1$, constraint (5) is always true. We can solve (3)-(5) by designing an algorithm that finds whether there is a feasible solution to (4)-(5) for a given $L$. For a given $L$, the constraints are equivalent to:

\[ \mu \le x_k - \Phi^{-1}\left(\frac{k}{n} - L\right)\sigma \quad \text{for } k > nL \tag{6} \]

\[ \mu \ge x_k - \Phi^{-1}\left(L + \frac{k-1}{n}\right)\sigma \quad \text{for } k < n(1 - L) + 1. \tag{7} \]

Constraints (6) and (7) can be combined into one constraint each:

\[ \mu \le \min_{k > nL} \left\{ x_k - \Phi^{-1}\left(\frac{k}{n} - L\right)\sigma \right\} \tag{8} \]

\[ \mu \ge \max_{k < n(1-L)+1} \left\{ x_k - \Phi^{-1}\left(L + \frac{k-1}{n}\right)\sigma \right\} \tag{9} \]

For a given $\sigma$ there is a solution for $\mu$ satisfying the system (8)-(9) if and only if

\[ \min_{k > nL} \left\{ x_k - \Phi^{-1}\left(\frac{k}{n} - L\right)\sigma \right\} \ge \max_{k < n(1-L)+1} \left\{ x_k - \Phi^{-1}\left(L + \frac{k-1}{n}\right)\sigma \right\} \tag{10} \]

or

\[ F(\sigma, L) = \min_{k > nL} \left\{ x_k - \Phi^{-1}\left(\frac{k}{n} - L\right)\sigma \right\} - \max_{k < n(1-L)+1} \left\{ x_k - \Phi^{-1}\left(L + \frac{k-1}{n}\right)\sigma \right\} \ge 0. \tag{11} \]

For a given $L$, the function $F(\sigma, L)$ is a piece-wise linear concave function in $\sigma$ (see Figure 4). We prove that $F(\sigma, L)$ is a concave function in $\sigma$ for a given $L$.

Theorem 1: The function $F(\sigma, L)$ for a given $L$ is concave in $\sigma$.

Proof: All the functions in the braces of (11) are linear in $\sigma$ and all the other values are constants for a given $L$. Furthermore, the minimum of linear functions is concave and the maximum of linear functions is convex. Therefore, the difference $F(\sigma, L)$ is a concave function in $\sigma$.

By Theorem 1, for a given $L$, $F(\sigma, L)$ has only one local maximum, which is the global one. The maximum value of $F(\sigma, L)$ for a given $L$ can be easily found by a search on $\sigma$. For any value of $\sigma$, $F(\sigma, L)$ can be calculated; if the slope is positive we know that the optimal $\sigma$ is to the right, and if it is negative we know that it is to the left. The solution is always at the intersection point between two lines, one with a positive slope and one with a negative slope (see Figure 4). Megiddo (1983) suggested a very efficient method for solving such a problem. Note that if $F(\sigma, L) \ge 0$, any $\mu$ in the range

\[ \left[ \max_{k < n(1-L)+1} \left\{ x_k - \Phi^{-1}\left(L + \frac{k-1}{n}\right)\sigma \right\},\; \min_{k > nL} \left\{ x_k - \Phi^{-1}\left(\frac{k}{n} - L\right)\sigma \right\} \right] \]

(or specifically the midpoint of the range), with the $\sigma$ used in calculating $F(\sigma, L)$, yields a KS statistic which does not exceed $L$.

Let $G(L) = \max_{\sigma} \{F(\sigma, L)\}$, found by either the method in Megiddo (1983) or any other search method. If $G(L) \ge 0$, there is a solution $(\mu, \sigma)$ for this value of $L$, and if $G(L) < 0$ no such solution exists. To find the minimum value of $L$ we propose a binary search. The details of the binary search are now described. The optimal $L$ must satisfy $L \le KS(\bar{x}, s)$. Also, any KS statistic must be at least $\frac{1}{2n}$, because each step of the empirical distribution has height $\frac{1}{n}$. Therefore, $\frac{1}{2n} \le L \le KS(\bar{x}, s)$. A binary search on any segment $[a, b]$ is performed as follows. $G(L)$ for $L = \frac{a+b}{2}$ is evaluated. If $G(L) \ge 0$, there is a solution $(\mu, \sigma)$ for this value of $L$ and the search segment is reduced to $[a, \frac{a+b}{2}]$. If $G(L) < 0$, no such solution exists and the search segment is reduced to $[\frac{a+b}{2}, b]$. In either case the search segment is cut in half. Following a relatively small number of iterations, the search segment is reduced to a small enough range (such as $10^{-5}$); the upper limit of the range yields a solution $(\mu, \sigma)$, and its value of $L$ is within a given tolerance (the size of the final segment) of the optimal value of $L$.

3 Monte Carlo estimation of test statistics distribution

In this section we provide critical values for the modified KS statistic using Monte Carlo simulation. To derive the distribution of this statistic, we draw a random sample of size $n$ from a standard normal distribution. We estimate $\hat{\mu}$ and $\hat{\sigma}$ and compute $KS(\hat{\mu}, \hat{\sigma})$, and for every sample size $n$ we repeat this procedure 100 million times. The critical values are given in Table 1. We also recalculate the critical values for the traditional KS test in the same way; these are also available in Table 1. Because we use 100 million samples, the critical values we report for the traditional KS test are more accurate than those of Lilliefors (1967) and Dallal and Wilkinson (1986).

The critical values for both KS tests can be approximated for $n \ge 20$ by the formula a+ b ( ) 1 c where $a$, $b$ and $c$ are functions of $\alpha$. These three parameters are given in Table 2. The approximation is very accurate with an error (when compared to Table 1) of not more than . So, the approximation formula can replace the tables for $n \ge 20$. We obtain the approximation formula via multiple regression, where for each $\alpha$ the critical values in Table 1 are used as the dependent variable, and 1 and 1 are the independent variables. We select these two independent variables through experimentation. We begin with a single variable regression involving only 1. We then add variables, one at a time, which are functions of $n$. A regression involving 1 and 1 provides an excellent fit.

4 Power comparisons

In this section we compare the approximate powers of the modified KS test with the traditional KS test for a set of selected distributions. These distributions convey a wide array of shapes, where some resemble the normal distribution while others are substantially different. Some of these distributions are also used in Lilliefors (1967) and Stephens (1974), among others. We consider a uniform (0,1) distribution; a bi-modal distribution which is a composite of two normal distributions, one centered at +2 and one at -2 with variance of 1; a beta(1,2) distribution whose density function is a straight line connecting (0, 0) and (1, 1); an exponential distribution with mean and variance of 1; a log-normal distribution with mean $e^{1/2}$ and variance $e(e - 1)$; and three t-distributions with degrees of freedom 1, 2 and 6. We also include the normal distribution, where we expect power to be close to $\alpha$. To save space, we only report results for $\alpha = 0.05$ (the behavior is very similar for other values of $\alpha$).

For a given alternative hypothesis (say, a uniform distribution), computation of the power of the modified KS test is done as follows. We draw a random sample of size $n$ from the distribution specified in the alternative hypothesis. Based on this sample, we estimate the parameters $\hat{\mu}$ and $\hat{\sigma}$ using the algorithm outlined in section 2 and compute $KS(\hat{\mu}, \hat{\sigma})$. Then, we apply the critical values in Table 2 to test if such a sample comes from a normal distribution. Repeating this procedure 10,000 times and counting the number of correct decisions gives the approximate power. The same approach is followed to compute power for the traditional KS test. The complete power results are given in Table 3.

From Table 3 we can see that the power of the modified KS test is consistently better than that of the traditional KS test for uniform, beta and bi-modal distributions. The improvement is quite large, especially for uniform and beta distributions. These power results indicate that the proposed KS test is able to better discriminate between the normal distribution and those distributions that are very different in shape from normal, i.e. those that substantially deviate from normality. For exponential and log-normal distributions, the powers of the two KS tests are quite similar, where both achieve reasonably good powers for $n \ge 40$.

For the t-distributions, the modified KS test has a much lower power than the traditional KS test. What is common to the t-distributions is that they resemble the normal distribution except for their heavier tails. In theory, with increasing degrees of freedom, the tails of the t-distribution get lighter, eventually behaving like the normal distribution. The modified KS test has difficulty detecting non-normality when the observed distribution is similar to normal, and increasingly so with larger degrees of freedom, i.e. as it gets closer to normal. On the surface, the low power for the t-distribution may seem like a weakness of the modified KS test. However, would one expect, with a small $n$, that data generated by a $t_6$ distribution be distinguishable from a normal distribution, and thus be identified as non-normal? We argue that the reason the traditional KS test has a higher power is that it rejects data which can be fitted quite well to a normal distribution by a proper selection of $\mu$ and $\sigma$. It is indeed strange that the power of the traditional KS test is higher for a $t_2$ distribution than it is for the uniform and beta distributions, while the latter are substantially different from normality.
By construction, the modified KS test tries to look for those mean and variance values that lead to the closest fit to the data. In a way, we are trying to approximate the reference distribution (the t-distribution) with a normal distribution. If such a normal approximation exists, the data may be considered sufficiently normal. For example, for $t_6$, the powers at $n \le 100$ are close to $\alpha = 0.05$, implying the sample data is hardly distinguishable from the normal distribution (see how close the powers of $t_6$ are to those of the normal distribution). When the degrees of freedom is made smaller, the power of the modified KS test improves because the deviation from normality gets larger. When a normal approximation cannot be achieved, the sample data is flagged as non-normal. For $t_2$, the modified KS test is able to detect a difference from normality at $n = 200$, while $t_6$ requires a very large $n$ to be detected by the modified KS test. For $t_1$, the power of the proposed KS test gets a lot better, reaching decent power at $n = 100$. The reason is that $t_1$ has a much heavier tail than the normal distribution, making a normal approximation via data-driven mean and variance values very difficult.

To see why the modified KS test treats several small data sets from the t-distribution as normally distributed, we use the $t_2$ distribution as an example. To do so, we repeat the experiments described in Example 1 (see section 1) but draw 999 samples (of $n = 30$) from a $t_2$ distribution. The odd number of simulation replications has the same purpose as in Example 1. We select a typical sample in terms of the difference between the traditional KS statistic and our proposed KS statistic. Similar to Figures 1 and 2, three cumulative distributions are depicted in Figure 5 (the range of $x$ was truncated for better exposition). For this typical sample, $\bar{x} =$ , $s = 2.506$, $\hat{\mu} =$ and $\hat{\sigma} =$ . The traditional KS is $KS(\bar{x}, s) =$ while the modified KS is $KS(\hat{\mu}, \hat{\sigma}) =$ . Using the critical value tables in section 3, the traditional KS test rejects normality with a p-value of $p =$ . On the contrary, the modified KS test does not reject normality with p-value $p >$
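The procedure of section 2 (a binary search on $L$, with a feasibility check that looks for a $\sigma$ with $F(\sigma, L) \ge 0$) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: a plain ternary search on the concave function $F(\cdot, L)$ stands in for the method of Megiddo (1983); the upper bound of ten times the sample range for $\sigma$ is a heuristic chosen for this sketch; $s$ is taken with the $n - 1$ denominator; and the names `modified_ks`, `feasible` and `ks_stat` are hypothetical.

```python
from statistics import NormalDist, mean, stdev

STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def ks_stat(xs, mu, sigma):
    """Equation (1); xs must already be sorted."""
    n = len(xs)
    ref = NormalDist(mu, sigma)
    return max(max(k / n - ref.cdf(x), ref.cdf(x) - (k - 1) / n)
               for k, x in enumerate(xs, start=1))


def feasible(xs, L, sigma_max, iters=100):
    """Return (mu, sigma) with F(sigma, L) >= 0, or None if none is found."""
    n = len(xs)
    eps = 1e-12
    # Lines x_k - Phi^{-1}(.) * sigma from constraints (6) and (7).
    upper = [(x, STD.inv_cdf(k / n - L))
             for k, x in enumerate(xs, start=1) if k / n - L > eps]
    lower = [(x, STD.inv_cdf(L + (k - 1) / n))
             for k, x in enumerate(xs, start=1) if L + (k - 1) / n < 1 - eps]

    def mu_range(sigma):
        mu_hi = min(x - q * sigma for x, q in upper)   # bound (8)
        mu_lo = max(x - q * sigma for x, q in lower)   # bound (9)
        return mu_lo, mu_hi

    # Ternary search for the maximum of the concave F = mu_hi - mu_lo.
    a, b = 1e-9 * sigma_max, sigma_max
    for _ in range(iters):
        s1, s2 = a + (b - a) / 3, b - (b - a) / 3
        lo1, hi1 = mu_range(s1)
        lo2, hi2 = mu_range(s2)
        if hi1 >= lo1:                    # F(s1, L) >= 0: feasible
            return (lo1 + hi1) / 2, s1    # midpoint of the mu range
        if hi2 >= lo2:
            return (lo2 + hi2) / 2, s2
        if hi1 - lo1 < hi2 - lo2:
            a = s1
        else:
            b = s2
    return None


def modified_ks(sample, tol=1e-5):
    """Approximate the minimum of KS(mu, sigma); returns (ks, mu, sigma)."""
    xs = sorted(sample)
    n = len(xs)
    if n < 2 or xs[-1] == xs[0]:
        raise ValueError("need at least two distinct observations")
    sigma_max = 10 * (xs[-1] - xs[0])     # heuristic bound, an assumption
    best = (mean(xs), stdev(xs))          # start from the traditional fit
    lo, hi = 1 / (2 * n), ks_stat(xs, *best)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        found = feasible(xs, mid, sigma_max)
        if found is None:
            lo = mid                      # no (mu, sigma) reaches this L
        else:
            best, hi = found, mid         # keep the feasible pair
    mu, sigma = best
    return ks_stat(xs, mu, sigma), mu, sigma
```

Because the search starts from $(\bar{x}, s)$ and only replaces it with pairs that satisfy a smaller $L$, the returned statistic should not exceed the traditional one. If the ternary search misses a feasible $\sigma$ (for instance because the heuristic bound is too small), the result is conservative rather than wrong.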

5 Conclusion

Many data analysis methods (t-test, ANOVA, regression) depend on the assumption that data were sampled from a normal distribution. One of the most frequently used tests to evaluate how far data are from normality is the Kolmogorov-Smirnov (KS) test. In implementing the KS test, most statistical software packages use the sample mean and sample variance as the parameters of the normal distribution. However, the sample mean and sample variance do not necessarily provide the closest fit to the empirical distribution of the data. Therefore, we propose a modified KS test in which we optimally choose the mean and variance of the normal distribution by minimizing the KS statistic. To facilitate easy implementation we also provide an algorithm to solve for the optimal parameters.

References

1. Dallal G. E. and L. Wilkinson (1986) An analytic approximation to the distribution of Lilliefors's test statistic for normality, The American Statistician, 40.
2. Jarque, C.M. and A.K. Bera (1980) Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals, Economics Letters, 6(3).
3. Lilliefors H. W. (1967) On the Kolmogorov-Smirnov test for normality with mean and variance unknown, Journal of the American Statistical Association, 62.
4. Massey F. J. (1951) The Kolmogorov-Smirnov test for goodness of fit, Journal of the American Statistical Association, 46.
5. Megiddo N. (1983) Linear-time algorithms for linear programming in $R^3$ and related problems, SIAM Journal on Computing, 12.
6. Royston, J. P. (1983) A Simple Method for Evaluating the Shapiro-Francia W Test of Non-Normality, Statistician, 32(3) (September).
7. Shapiro, S. S. and R. S. Francia (1972) An Approximate Analysis of Variance Test for Normality, Journal of the American Statistical Association, 67.
8. Shapiro, S. S. and M. B. Wilk (1965) An Analysis of Variance Test for Normality (Complete Samples), Biometrika, 52(3/4) (December).
9. Sobieszczanski-Sobieski, J., Laba, K. and R. Kincaid (1998) Bell-curve evolutionary optimization algorithm, Proceedings of the 7th AIAA Symposium on Multidisciplinary Analysis and Optimization, St. Louis, MO, 2-4 September, AIAA paper.
10. Stephens, M.A. (1974) EDF statistics for goodness of fit and some comparisons, Journal of the American Statistical Association, 69.
11. Weber, M., Leemis, L. and R. Kincaid (2006) Minimum Kolmogorov-Smirnov test statistic parameter estimates, Journal of Statistical Computation and Simulation, 76, 3.

Table 1: Critical Values for the Traditional and Modified KS Test (columns: upper tail probabilities for the traditional KS statistic and for the modified KS statistic)

Table 2: Coefficients for the approximate formulas (columns: α, then a, b, c for the traditional KS test and a, b, c for the modified KS test)

Table 3: Powers (%) of the Traditional and Modified KS tests (α = 0.05) (columns: Uniform, Bi-modal, Beta, Exponential, Log-normal, t_1, t_2, t_6, Normal; one block for the traditional KS test and one for the modified KS test)

Figure 1: The Typical Sample Cumulative Distributions (curves: data, sample parameters, optimal)

Figure 2: The Extreme Sample Cumulative Distributions (curves: data, sample parameters, optimal)

Figure 3: Sizes of the Traditional (solid line) and Modified (dotted line) KS tests (α = 0.05) (vertical axis: size; curves for Outlier = 10, 8, 6, 4)

Figure 4: The Function F(σ, L) (horizontal axis: σ; vertical axis: F(σ, L))

Figure 5: Typical t_2 Sample Cumulative Distributions (curves: data, sample parameters, optimal)


More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

3. Greatest Common Divisor - Least Common Multiple

3. Greatest Common Divisor - Least Common Multiple 3 Greatest Commo Divisor - Least Commo Multiple Defiitio 31: The greatest commo divisor of two atural umbers a ad b is the largest atural umber c which divides both a ad b We deote the greatest commo gcd

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

THE TWO-VARIABLE LINEAR REGRESSION MODEL

THE TWO-VARIABLE LINEAR REGRESSION MODEL THE TWO-VARIABLE LINEAR REGRESSION MODEL Herma J. Bieres Pesylvaia State Uiversity April 30, 202. Itroductio Suppose you are a ecoomics or busiess maor i a college close to the beach i the souther part

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number. GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea - add up all

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test) No-Parametric ivariate Statistics: Wilcoxo-Ma-Whitey 2 Sample Test 1 Ma-Whitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo-) Ma-Whitey (WMW) test is the o-parametric equivalet of a pooled

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote

More information

, a Wishart distribution with n -1 degrees of freedom and scale matrix.

, a Wishart distribution with n -1 degrees of freedom and scale matrix. UMEÅ UNIVERSITET Matematisk-statistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 004-0-9 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that

More information

Practice Problems for Test 3

Practice Problems for Test 3 Practice Problems for Test 3 Note: these problems oly cover CIs ad hypothesis testig You are also resposible for kowig the samplig distributio of the sample meas, ad the Cetral Limit Theorem Review all

More information

Exploratory Data Analysis

Exploratory Data Analysis 1 Exploratory Data Aalysis Exploratory data aalysis is ofte the rst step i a statistical aalysis, for it helps uderstadig the mai features of the particular sample that a aalyst is usig. Itelliget descriptios

More information

Finding the circle that best fits a set of points

Finding the circle that best fits a set of points Fidig the circle that best fits a set of poits L. MAISONOBE October 5 th 007 Cotets 1 Itroductio Solvig the problem.1 Priciples............................... Iitializatio.............................

More information

Systems Design Project: Indoor Location of Wireless Devices

Systems Design Project: Indoor Location of Wireless Devices Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 698-5295 Email: bcm1@cec.wustl.edu Supervised

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book) MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows: Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network

More information

Data Analysis and Statistical Behaviors of Stock Market Fluctuations

Data Analysis and Statistical Behaviors of Stock Market Fluctuations 44 JOURNAL OF COMPUTERS, VOL. 3, NO. 0, OCTOBER 2008 Data Aalysis ad Statistical Behaviors of Stock Market Fluctuatios Ju Wag Departmet of Mathematics, Beijig Jiaotog Uiversity, Beijig 00044, Chia Email:

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Swaps: Constant maturity swaps (CMS) and constant maturity. Treasury (CMT) swaps

Swaps: Constant maturity swaps (CMS) and constant maturity. Treasury (CMT) swaps Swaps: Costat maturity swaps (CMS) ad costat maturity reasury (CM) swaps A Costat Maturity Swap (CMS) swap is a swap where oe of the legs pays (respectively receives) a swap rate of a fixed maturity, while

More information

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS Uit 8: Iferece for Proortios Chaters 8 & 9 i IPS Lecture Outlie Iferece for a Proortio (oe samle) Iferece for Two Proortios (two samles) Cotigecy Tables ad the χ test Iferece for Proortios IPS, Chater

More information

Confidence intervals and hypothesis tests

Confidence intervals and hypothesis tests Chapter 2 Cofidece itervals ad hypothesis tests This chapter focuses o how to draw coclusios about populatios from sample data. We ll start by lookig at biary data (e.g., pollig), ad lear how to estimate

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

NEW HIGH PERFORMANCE COMPUTATIONAL METHODS FOR MORTGAGES AND ANNUITIES. Yuri Shestopaloff,

NEW HIGH PERFORMANCE COMPUTATIONAL METHODS FOR MORTGAGES AND ANNUITIES. Yuri Shestopaloff, NEW HIGH PERFORMNCE COMPUTTIONL METHODS FOR MORTGGES ND NNUITIES Yuri Shestopaloff, Geerally, mortgage ad auity equatios do ot have aalytical solutios for ukow iterest rate, which has to be foud usig umerical

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem () Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length

A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length Joural o Satisfiability, Boolea Modelig ad Computatio 1 2005) 49-60 A Faster Clause-Shorteig Algorithm for SAT with No Restrictio o Clause Legth Evgey Datsi Alexader Wolpert Departmet of Computer Sciece

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

Quadrat Sampling in Population Ecology

Quadrat Sampling in Population Ecology Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal)

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal) 6 Parametric (theoretical) probability distributios. (Wilks, Ch. 4) Note: parametric: assume a theoretical distributio (e.g., Gauss) No-parametric: o assumptio made about the distributio Advatages of assumig

More information

A Review and Comparison of Methods for Detecting Outliers in Univariate Data Sets

A Review and Comparison of Methods for Detecting Outliers in Univariate Data Sets A Review ad Compariso of Methods for Detectig Outliers i Uivariate Data Sets by Sogwo Seo BS, Kyughee Uiversity, Submitted to the Graduate Faculty of Graduate School of Public Health i partial fulfillmet

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis Ruig Time ( 3.) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Research Method (I) --Knowledge on Sampling (Simple Random Sampling)

Research Method (I) --Knowledge on Sampling (Simple Random Sampling) Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact

More information

An Efficient Polynomial Approximation of the Normal Distribution Function & Its Inverse Function

An Efficient Polynomial Approximation of the Normal Distribution Function & Its Inverse Function A Efficiet Polyomial Approximatio of the Normal Distributio Fuctio & Its Iverse Fuctio Wisto A. Richards, 1 Robi Atoie, * 1 Asho Sahai, ad 3 M. Raghuadh Acharya 1 Departmet of Mathematics & Computer Sciece;

More information

Lecture 2: Karger s Min Cut Algorithm

Lecture 2: Karger s Min Cut Algorithm priceto uiv. F 3 cos 5: Advaced Algorithm Desig Lecture : Karger s Mi Cut Algorithm Lecturer: Sajeev Arora Scribe:Sajeev Today s topic is simple but gorgeous: Karger s mi cut algorithm ad its extesio.

More information

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2 Itroductio DAME - Microsoft Excel add-i for solvig multicriteria decisio problems with scearios Radomir Perzia, Jaroslav Ramik 2 Abstract. The mai goal of every ecoomic aget is to make a good decisio,

More information

Research Article Sign Data Derivative Recovery

Research Article Sign Data Derivative Recovery Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

CONTROL CHART BASED ON A MULTIPLICATIVE-BINOMIAL DISTRIBUTION

CONTROL CHART BASED ON A MULTIPLICATIVE-BINOMIAL DISTRIBUTION www.arpapress.com/volumes/vol8issue2/ijrras_8_2_04.pdf CONTROL CHART BASED ON A MULTIPLICATIVE-BINOMIAL DISTRIBUTION Elsayed A. E. Habib Departmet of Statistics ad Mathematics, Faculty of Commerce, Beha

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation HP 1C Statistics - average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics

More information

Hypergeometric Distributions

Hypergeometric Distributions 7.4 Hypergeometric Distributios Whe choosig the startig lie-up for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you

More information

Iran. J. Chem. Chem. Eng. Vol. 26, No.1, 2007. Sensitivity Analysis of Water Flooding Optimization by Dynamic Optimization

Iran. J. Chem. Chem. Eng. Vol. 26, No.1, 2007. Sensitivity Analysis of Water Flooding Optimization by Dynamic Optimization Ira. J. Chem. Chem. Eg. Vol. 6, No., 007 Sesitivity Aalysis of Water Floodig Optimizatio by Dyamic Optimizatio Gharesheiklou, Ali Asghar* + ; Mousavi-Dehghai, Sayed Ali Research Istitute of Petroleum Idustry

More information

Institute of Actuaries of India Subject CT1 Financial Mathematics

Institute of Actuaries of India Subject CT1 Financial Mathematics Istitute of Actuaries of Idia Subject CT1 Fiacial Mathematics For 2014 Examiatios Subject CT1 Fiacial Mathematics Core Techical Aim The aim of the Fiacial Mathematics subject is to provide a groudig i

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8 CME 30: NUMERICAL LINEAR ALGEBRA FALL 005/06 LECTURE 8 GENE H GOLUB 1 Positive Defiite Matrices A matrix A is positive defiite if x Ax > 0 for all ozero x A positive defiite matrix has real ad positive

More information

7. Concepts in Probability, Statistics and Stochastic Modelling

7. Concepts in Probability, Statistics and Stochastic Modelling 7. Cocepts i Probability, Statistics ad Stochastic Modellig 1. Itroductio 169. Probability Cocepts ad Methods 170.1. Radom Variables ad Distributios 170.. Expectatio 173.3. Quatiles, Momets ad Their Estimators

More information

1. MATHEMATICAL INDUCTION

1. MATHEMATICAL INDUCTION 1. MATHEMATICAL INDUCTION EXAMPLE 1: Prove that for ay iteger 1. Proof: 1 + 2 + 3 +... + ( + 1 2 (1.1 STEP 1: For 1 (1.1 is true, sice 1 1(1 + 1. 2 STEP 2: Suppose (1.1 is true for some k 1, that is 1

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

W. Sandmann, O. Bober University of Bamberg, Germany

W. Sandmann, O. Bober University of Bamberg, Germany STOCHASTIC MODELS FOR INTERMITTENT DEMANDS FORECASTING AND STOCK CONTROL W. Sadma, O. Bober Uiversity of Bamberg, Germay Correspodig author: W. Sadma Uiversity of Bamberg, Dep. Iformatio Systems ad Applied

More information