A problem with the likelihood ratio test for a change-point hazard rate model

Biometrika (1990), 77, 4, pp. 835-843. Printed in Great Britain

BY ROBIN HENDERSON
Department of Mathematics and Statistics, University of Newcastle-upon-Tyne, Newcastle upon Tyne NE1 7RU, U.K.

SUMMARY

A likelihood ratio test for constant hazard against a step change alternative has received considerable attention in recent years. Additional information in the maximum likelihood estimate of the change-point can seriously affect interpretation of test results. Some simple modifications are considered for which exact percentage points can be derived. Monte Carlo power and mean squared error comparisons are encouraging.

Some key words: Change-point; Consistent estimator; Hazard rate; Sufficient partition; Weighted likelihood ratio.

1. INTRODUCTION

Testing a sequence of variables for a change in distribution at an unknown time point has been considered by many authors over many years. See, for example, James, James & Siegmund (1987) for an illustration and references. One aspect which has not received adequate attention, however, concerns the sufficiency of statistics designed to test the null hypothesis of no change in distribution, and consequently the possibility that further information in the data may influence the interpretation of test results. The present paper considers this point in detail for one particular form of change-point problem, that of testing survival data for constant hazard against the alternative of a step change at an unknown time point. To begin with, however, some more general issues are discussed.

The most common form of change-point problem assumes a sequence x_1, ..., x_n of independent random variables. Under the null hypothesis H_0 the variables are identically distributed, but under the alternative hypothesis H_1 there is a change in distribution at some unknown point k in the sequence (1 ≤ k < n).
That is, the first k observations are drawn from one distribution and the remaining (n - k) are drawn from a different distribution. Thus H_1 contains a family of alternatives indexed by a parameter k which disappears under the null hypothesis. In parametric problems a standard method of testing H_0 against H_1 is to construct a likelihood ratio test, which is achieved by taking each potential change-point k in turn and determining for each the usual log likelihood ratio statistic L_k for testing H_0 against H_1. The overall statistic is then

    L = max_k (L_k),    (1)

which implicitly assumes that all values of k are equally likely under the alternative hypothesis. From a Bayesian viewpoint, therefore, a usually unstated component of H_1 is that the prior distribution of k is uniform over the integers 1 to n - 1 inclusive. If any other distribution is presumed then the statistics L_k should be weighted accordingly before the maximization if L is to be considered as a genuine likelihood ratio.

Consider now the sufficiency of L, or more generally of any statistic S designed to test H_0. In accordance with the sufficiency principle (Cox & Hinkley, 1974, p. 37), S is sufficient for H_0 if equally sized samples giving the same value of S always lead to the same conclusions about the validity of H_0. This implies that there is no additional information about H_0 in any other statistic U obtained from the data, which in turn implies that the conditional distribution of U given S = s must be the same under both H_0 and H_1 for all values s. To see this suppose that there is some value u for which

    pr(U = u | S = s; H_0) ≠ pr(U = u | S = s; H_1),

from which we deduce that the posterior odds on H_0 given U = u and S = s are not the same as those given S = s only, thus contradicting the sufficiency principle. Of course if the distribution of U given S = s is unknown under either H_0 or H_1 then lack of formal sufficiency has no practical consequence with regard to conclusions, since the additional information in U could not be interpreted. Similarly, lack of sufficiency may also be acceptable if additional information can only be obtained at the expense of considerable extra effort. Nonetheless it does not seem sensible to ignore any readily available additional information which may influence conclusions.

Suppose therefore that the procedure used to derive S also yields a consistent estimator k̂ of either the change-point k or the proportion k/n, for example at (1) the value of k corresponding to the maximum of the L_k. Would two samples with the same value of S but different values of k̂ always lead to the same posterior degree of belief in H_0? Sufficient conditions for this are that (i) k̂ is independent of S, and (ii) k̂ has a uniform marginal distribution under H_0.
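The construction at (1) can be sketched in code. The following minimal illustration is not from the paper: it treats the simplest case of independent normal observations with known unit variance and a possible mean change after position k, for which twice the log likelihood ratio at k reduces to k(n - k)(x̄_1 - x̄_2)²/n, where x̄_1 and x̄_2 are the means of the first k and last n - k observations.

```python
import random

def max_lr_normal(x):
    """Maximize over k the twice-log likelihood ratio for a mean change
    after position k, known variance 1.  Returns (L, k_hat) as at (1)."""
    n = len(x)
    total = sum(x)
    best, k_hat, s = -1.0, None, 0.0
    for k in range(1, n):
        s += x[k - 1]                          # sum of first k observations
        m1, m2 = s / k, (total - s) / (n - k)  # means before and after k
        L_k = k * (n - k) / n * (m1 - m2) ** 2
        if L_k > best:
            best, k_hat = L_k, k
    return best, k_hat

random.seed(1)
# illustrative data: change after k = 30, mean 0 then mean 1.5
x = [random.gauss(0, 1) for _ in range(30)] + [random.gauss(1.5, 1) for _ in range(30)]
L, k_hat = max_lr_normal(x)
```

Note that the argmax k̂ = k_hat falls out of the maximization for free, which is exactly the additional statistic whose behaviour under H_0 is at issue in the text.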
The reason is that under the composite H_1 all change-points k are equally likely. Since k̂ is consistent for k, all values of k̂ are therefore at least approximately equally likely under H_1, the approximation becoming exact as sample size increases. So if conditions (i) and (ii) hold then the argument of the preceding paragraph shows that k̂ contains no information about H_0. Of course these conditions are stronger than strictly required, condition (i) in particular, and weaker necessary conditions would involve the conditional distribution of k̂ given S. Such conditions would be difficult to verify in general, however, and so it seems reasonable to investigate the at least approximate validity of (i) and (ii) in practical situations.

With these comments in mind consider the change-point hazard rate problem (Matthews & Farewell, 1982). The hazard function λ(t) of a failure time variable T is modelled as

    λ(t) = λ     (t ≤ τ),
    λ(t) = λρ    (t > τ),    (2)

and a test of H_0: ρ = 1 against H_1: ρ ≠ 1 is required with τ unknown. Note that this is not a standard change-point problem in the sense of a change in distribution somewhere in a sequence of random variables as described above, but the treatment is the same, as will be indicated later.

A number of intriguing aspects have been highlighted for this deceptively simple problem. Matthews & Farewell (1982) suggested that the null hypothesis could be tested via a standard likelihood ratio test, but Nguyen, Rogers & Walker (1984) pointed out that the likelihood is unbounded under the alternative hypothesis, since a singularity appears if ρ → ∞ and τ is taken immediately before the largest observation. Matthews & Farewell (1985) removed the singularity by considering the data as being in effect discrete and re-formulating the likelihood as a product of probabilities rather than densities. Yao (1986) also overcame the problem, but this time by simply constraining the estimate of τ not to fall in the interval between the largest two observations. Worsley (1988) made the attractive and sensible observation that the singularity could be removed without other material effect if the largest observation is artificially considered to be censored.

Another problem concerns the distribution of the likelihood ratio test statistic. There are three unknown parameters under the alternative hypothesis and one under the null, so standard asymptotic theory suggests that a χ² distribution with two degrees of freedom should be appropriate for the usual statistic of twice the log likelihood ratio, for large samples at least. Matthews & Farewell (1982) noted that the problem is not regular, so strictly the asymptotics do not apply, but nonetheless they found simulated percentage points to be close to the χ²_2 equivalents. Worsley (1988) showed how exact percentage points could be obtained for the likelihood ratio test statistic with the final observation censored, and tabulated 90%, 95% and 99% values for various sample sizes. These were considerably higher than the corresponding χ²_2 values and did not seem to converge to a finite limit as the sample size was increased. Worsley's method is outlined in more detail in Section 2 below. Section 2 also considers the marginal distribution under H_0 of the rank in the data of the estimated change-point.
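The non-uniform null behaviour of that rank can also be seen by direct simulation. The sketch below is illustrative rather than the paper's exact Noe-algorithm computation: it simulates unit-exponential samples, treats the largest observation as censored, profiles the piecewise-exponential twice-log likelihood ratio over change-points placed just after each observed failure time (the just-before branch is omitted for brevity), and records the rank of the maximizing change-point.

```python
import random
import math

def lr_profile(t):
    """Twice the log likelihood ratio for constant hazard against a step
    change just after t_k, k = 1, ..., n-1, with the largest observation
    treated as censored (after Worsley, 1988).  Sketch: only the
    change-just-after-t_k branch is used."""
    t = sorted(t)
    n = len(t)
    d = n - 1                                   # observed failures
    S_n = sum(t)                                # total time on test
    out = []
    for k in range(1, n):
        S_k = sum(t[:k]) + (n - k) * t[k - 1]   # time on test before t_k
        U = S_k / S_n
        f = lambda a, u: 0.0 if a == 0 else a * math.log(a / (d * u))
        out.append(2.0 * (f(k, U) + f(d - k, 1.0 - U)))
    return out

random.seed(2)
n, reps = 31, 2000
ranks = []
for _ in range(reps):                           # simulate under H_0
    L = lr_profile([random.expovariate(1.0) for _ in range(n)])
    ranks.append(L.index(max(L)) + 1)           # rank of estimated change-point
p_extreme = sum(r in (1, n - 1) for r in ranks) / reps
p_centre = sum(r in (15, 16) for r in ranks) / reps
```

Under the null hypothesis the estimated rank piles up near 1 and n - 1, the U-shape reported in Section 2.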
This indicates that the likelihood ratio test is not sufficient, and therefore some alternative procedures are examined, the object being to find a test statistic for which the corresponding estimated change-point contains no additional information. Some Monte Carlo power and mean squared error comparisons are given in Section 3 and some general remarks in Section 4 complete the paper.

2. LIKELIHOOD RATIO AND ALTERNATIVES

Consider the following situation. A sample of n independent failure times is available with t_1, ..., t_n denoting the ordered values. The largest observation is considered to be censored as suggested by Worsley (1988), but otherwise all failure times are observed. The effect of random censorship is discussed by Matthews & Farewell (1982) and Worsley (1988) and will not be pursued here. Yao (1986) shows that the likelihood under model (2) is maximized over τ by τ̂ either just before or just after one of the observed failure times, that is for τ̂ ∈ {t_k^+, t_k^-; k = 1, ..., n - 1} in an obvious notation. Writing L_k^+ and L_k^- for twice the log likelihood ratio evaluated at τ = t_k^+ and τ = t_k^- respectively, Worsley (1988) shows that, for k = 1, ..., n - 1,

    L_k^+ = 2[k log{k/((n-1)U_k)} + (n-1-k) log{(n-1-k)/((n-1)(1-U_k))}],
    L_k^- = 2[(k-1) log{(k-1)/((n-1)U_k)} + (n-k) log{(n-k)/((n-1)(1-U_k))}],

where 0 log 0 is taken to be zero and

    U_k = S_k/S_n,    S_k = t_1 + ... + t_k + (n-k)t_k,

so that S_k is the total time on test at t_k. Now let L'_k be the greater of L_k^+ and L_k^- for each k. Then

    L_1 = max {L'_k : k = 1, ..., n - 1}

is the likelihood ratio statistic suggested by Matthews & Farewell (1982) for testing H_0 against H_1, which shows the similarity of treatment with the more common problem discussed at (1). Let k̂_1 be the value of k corresponding to the maximum of the {L'_k}, so that k̂_1 denotes the rank in the sample of the estimate of the change-point τ. Note that L_1 can be considered to be a likelihood ratio test of H_0 with all values of k given equal prior probability, which is the assumption made henceforth. This can be shown to be equivalent to the prior assumption that τ has an exponential distribution with rate λ. We concentrate for simplicity in the main on k̂_1 rather than the maximum likelihood estimate τ̂_1, say, of τ, but shall return to τ̂_1 when considering mean squared error in Section 3.

The formulation of L_k^+ and L_k^- in terms of U_k and constants is useful in determining the exact distribution of the statistic L_1. Since both L_k^+ and L_k^- are convex in U_k, the event {L'_k ≤ x} is the same as the event {a_k ≤ U_k ≤ b_k} for some constants a_k and b_k. Hence L_1 ≤ x if and only if a_k ≤ U_k ≤ b_k for each of k = 1, ..., n - 1. Worsley (1988) shows that under H_0 the distribution of U_1, ..., U_{n-1} is the same as that of n - 1 ordered uniform random variables, and so an algorithm given by Noe (1972) can be applied to find pr(a_k ≤ U_k ≤ b_k, k = 1, ..., n - 1). Thus exact percentage points of L_1 can be determined and are given by Worsley for various sample sizes. As mentioned in Section 1 these are not close to the corresponding χ²_2 values and give no indication of converging to a finite limit as the sample size increases.

By considering terms such as pr(x < L'_k ≤ x + δx, L'_j ≤ x, j ≠ k), Noe's algorithm can be adapted to give a numerical procedure for the calculation of the marginal distribution under H_0 of k̂_1. This is given in Fig. 1(a) for sample size n = 31, which gives 30 uncensored observations. The distribution is drawn for clarity as a continuous curve although of course it is discrete with support 1, ..., 30.

Fig. 1. Distribution of rank of change-point estimate under null hypothesis, n = 31: (a) unadjusted, k̂_1; (b) standardized, k̂_2; (c) weighted, k̂_3; (d) combined, k̂_4. Solid line, unconditional upon test; dashed line, conditional upon test exceeding 95% point.

Figure 1(a) also gives equivalent probabilities conditional upon the likelihood ratio statistic L_1 exceeding the 5% critical value, in other words given a Type I error. In both cases a clear U-shaped distribution is seen, with values of k̂_1 near the extremes of the sample many times more likely than near the centre. For the unconditional distribution, for example, the probability of k̂_1 being equal to the median observation is far smaller than the probability of it being equal to the smallest observation, and the same holds for the conditional distribution. Similarly shaped distributions occur for other sample sizes also.

Knowledge of the shape of the distribution of k̂_1 under the null hypothesis clearly affects one's interpretation of a test based on L_1. Consider for illustration a situation in which two equally sized samples from different populations produce identical values of L_1 with a fairly small p-value, say between 1% and 5%. Suppose further that the first sample yields the estimate k̂_1 = 1, whereas the second sample yields k̂_1 = n/2. One is much more inclined to reject the null hypothesis for the second sample than the first, because given a Type I error on L_1 one expects k̂_1 to be near the extremes, whereas under H_1 all values are presumed to be equally likely. Clearly L_1 is not sufficient and there is additional information in k̂_1.

An improved procedure may involve the joint distribution of L_1 and k̂_1. It is not obvious how to exploit the joint distribution, however, and such a technique in any case is likely to be cumbersome in practice. An alternative possibility is to modify the test procedure in such a way that the corresponding estimator k̂ has a uniform distribution under H_0 irrespective of the value of the test statistic. Knowledge of the value of k̂ would then play no role in the interpretation of the test result, as indicated in Section 1 for more general problems. This is in some way achieved by a suggestion made for other reasons by Worsley (1988) that the estimator of τ be constrained to lie between the p-quantile and (1 - p)-quantile of the sample. Inspection of Fig. 1(a) shows that this would give a much flatter distribution of k̂_1 under H_0, which although not uniform would involve a sufficiently small range of values for the difference to be of no practical concern. Worsley's suggestion, also made in a different context by James et al. (1987), has two important drawbacks. First, the choice of p is crucial but arbitrary, and second, even the strongest evidence of a change near the extremes of the sample would be discounted. Instead we consider three methods of modifying the likelihood ratio procedure. Each involves the calculation of a statistic L_i designed to test H_0 and produces as part of the test procedure the rank k̂_i of an estimated change-point τ̂_i (i = 2, 3, 4).

(i) Standardizing the L_k^+, L_k^- terms. Part of the reason for the uneven distribution of k̂_1 is that the terms L_k^+ and L_k^- have moments which depend upon k. Consequently one expects values with larger means, and possibly also larger variances, to produce the maximum.
The mean and variance of either L_k^+ or L_k^- can be obtained using the observation that the marginal distribution of U_k under H_0 is beta with parameters k and (n - k), and that

    E(log U_k) = E{log(1 - U_{n-k})} = -Σ_{i=k}^{n-1} 1/i,
    var(log U_k) = var{log(1 - U_{n-k})} = Σ_{i=k}^{n-1} 1/i²,
    cov{log U_k, log(1 - U_k)} = -Σ_{i=n}^{∞} 1/i².

The first and second of these moment results can be obtained from standard integrals (Gradsteyn & Ryzhik, 1980). The covariance result can be obtained recursively starting at k = 1. Together the results can be used to determine simple expressions for the means and standard deviations of L_k^+ and L_k^-, say μ_k^+, μ_k^-, σ_k^+ and σ_k^-. In turn both means and standard deviations can be shown to be relatively large, as expected, for k near either 1 or n - 1, and relatively small for k near n/2. A sensible adjustment to the likelihood ratio procedure seems to be, therefore, to standardize the L_k^+ and L_k^- terms before maximization over k. Hence an alternative to L_1 as test statistic is the standardized likelihood ratio statistic

    L_2 = max {(L_k^+ - μ_k^+)/σ_k^+, (L_k^- - μ_k^-)/σ_k^- : k = 1, ..., n - 1}.

Calculation of L_2 involves almost negligible extra effort over calculation of L_1, and Noe's algorithm can again be employed to determine the exact distribution of both L_2 and the corresponding rank k̂_2 of the estimate τ̂_2, defined in the obvious way. The marginal distribution of k̂_2 with n = 31 appears in Fig. 1(b) and, whilst giving improved results, a U-shaped pattern can again be seen. Probabilities range from 0.017 at the centre to 0.113 at the extremes when the unconditional case is considered, and upwards from 0.022 conditional upon L_2 exceeding the 95% point. Note that in the conditional case the maximum values occur at observations 2 and n - 2 rather than 1 and n - 1.

(ii) Weighting the likelihood ratio. An alternative possibility is to weight the likelihood ratio so as to give greater credence to the more central values of L'_k.
The ad hoc scheme considered here gives weight {k(n - k)}^{1/2}/(n/2) to the likelihood ratio evaluated at t_k, so that the appropriate statistic is

    L_3 = max [L'_k + log{4k(n - k)/n²} : k = 1, ..., n - 1],

which is again easily obtained. Once more the exact distribution of both L_3 and the rank k̂_3 of τ̂_3 can be determined using Noe's algorithm. The marginal distribution of k̂_3 under H_0 appears in Fig. 1(c) for sample size n = 31, and again there is an improvement over the standard procedure, but still a U-shaped pattern is apparent. The conditional probabilities range upwards from 0.024 for this sample size.

(iii) Combined approach: weighting and standardizing. Both of the preceding procedures give some improvement over the usual methods but neither is entirely satisfactory. The final possibility considered here is to combine the two approaches, using as test statistic the weighted and standardized likelihood ratio value. Weighting takes place after standardization, where the factor of 2 disappears, and so the fourth test statistic is

    L_4 = max [L_k* + (1/2) log{4k(n - k)/n²} : k = 1, ..., n - 1],

where L_k* is the larger of (L_k^+ - μ_k^+)/σ_k^+ and (L_k^- - μ_k^-)/σ_k^-.
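The moment expressions above are straightforward to evaluate and check numerically. The following sketch, illustrative rather than from the paper, computes the mean and variance of log U_k from the sums quoted in the text, verifies the mean against draws from the Beta(k, n - k) distribution, and evaluates the weight term log{4k(n - k)/n²} used in L_3.

```python
import random
import math

def log_u_moments(k, n):
    """Mean and variance of log U_k when U_k ~ Beta(k, n - k),
    from the harmonic-type sums quoted in the text."""
    mean = -sum(1.0 / i for i in range(k, n))
    var = sum(1.0 / i ** 2 for i in range(k, n))
    return mean, var

def weight(k, n):
    """The ad hoc weight term log{4k(n - k)/n^2} added to L'_k in L_3;
    it is zero-ish near k = n/2 and strongly negative at the extremes."""
    return math.log(4.0 * k * (n - k) / n ** 2)

# Monte Carlo check of the mean for k = 5, n = 31
random.seed(3)
k, n = 5, 31
mu, v = log_u_moments(k, n)
draws = [math.log(random.betavariate(k, n - k)) for _ in range(20000)]
mc_mean = sum(draws) / len(draws)
```

The simulated mean agrees with the harmonic-sum expression, and the weight function confirms the down-weighting of extreme change-points relative to central ones.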

Again Noe's algorithm can be applied to determine the marginal distributions of both L_4 and k̂_4, the rank of τ̂_4. The latter distribution is shown in Fig. 1(d) for n = 31, where a much flatter pattern is apparent for both the unconditional and conditional cases, with probabilities ranging from 0.025 to 0.054 for the former and up to 0.044 for the latter. The distribution conditional upon a significant test statistic is of most interest, and although not uniform the pattern shown is sufficiently flat, with all values in a narrow range, for the difference to be of little or no practical concern. The distribution is similarly roughly uniform at other sample sizes also. The additional information in the value of the estimator therefore does not seriously influence the interpretation of the test result, and hence the test seems to be adequate in this sense. Critical values of L_4 for various sample sizes are given in Table 1.

Table 1. Percentage points of L_4: 90%, 95% and 99% points for various sample sizes, where sample size is the number of uncensored observations.

The method used by Yao (1986) to show consistency of the maximum likelihood estimator τ̂_1 can be adapted to show that τ̂_4 is consistent for τ under H_1, as indeed are the less useful estimators τ̂_2 and τ̂_3. Some additional algebra is necessary to achieve this but the basic procedure follows Yao and so details are omitted.

3. POWER AND MEAN SQUARED ERROR

Some Monte Carlo power comparisons between L_1 and L_4 are given in Table 2, together with some mean squared error comparisons between τ̂_1 and τ̂_4. All results are based on batches of simulations, assuming 5% tests, and since the problem is invariant to multiplicative transformations the value of λ is taken to be one throughout. Two sample sizes are employed, n = 31 and n = 101, giving 30 and 100 uncensored cases respectively. Power is obviously higher for the larger sample size and ρ relatively far from one.
The L_4 statistic generally has higher power than L_1, the only exceptions in the table occurring at τ = 2.25 and ρ < 1, where L_1 has a slight advantage. The estimator τ̂_4 also consistently gives better mean squared error than the estimator τ̂_1, particularly when the hazard is reduced by the change and the change-point occurs early. Note that when ρ < 1 the mean squared error of both τ̂_1 and τ̂_4 tends to be higher than for ρ > 1, since the survival time variance is greater. Ten thousand simulations were also carried out with n = 101 at each of ρ = 0.5 and ρ = 2, with the change-point τ drawn randomly from a unit exponential distribution.

Table 2. Monte Carlo power and mean squared error estimates: power of L_1 and L_4, and mean squared errors of τ̂_1 and τ̂_4, by ρ and τ. Results based on batches of simulations at each combination tabulated; upper values for n = 31, lower for n = 101.

When ρ = 0.5 the L_1 and L_4 statistics had power 0.54 and 0.58 respectively, and the mean squared error of τ̂_1 was 3.49. At ρ = 2 the powers were 0.51 and 0.55, and the mean squared errors were 0.97 and 0.85 respectively. From these and other simulation results, for brevity not detailed here, we conclude that τ̂_4 provides a more reliable estimate of the change-point than τ̂_1, for small sample sizes in particular, and that overall the L_4 statistic is more powerful than L_1. The L_1 statistic has a slight power advantage for both very early and very late changes, which is as expected since the L_4 statistic gives less weight to the extreme sample values, but for most values of the change-point the advantage is with L_4.

4. DISCUSSION

None of the three different procedures given above leads to a uniform marginal distribution under H_0 of the rank of the estimated change-point, as would be desired given the arguments in Section 1. All three however give distributions which are closer to uniformity than obtained by standard maximum likelihood, and for the combined weighting and standardizing procedure at least, the degree of nonuniformity is sufficiently small to be of little practical consequence. The L_4 test statistic also has better power than the likelihood ratio test L_1, and τ̂_4 has smaller mean squared error than the maximum likelihood estimator τ̂_1. No significant extra computational burden is involved in either the calculation of L_4 and τ̂_4 or in the determination of exact distributions, and hence the alternative approach seems to be preferable to the standard in all respects for this problem.
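A crude Monte Carlo of the kind reported in Section 3 can be sketched as follows. This is illustrative only and much smaller than the paper's study: it uses the change-just-after-t_k branch alone, compares the unweighted maximum (in the spirit of L_1) with the weighted version (in the spirit of L_3) rather than the combined L_4, estimates 5% critical values from a simulated null batch, and then estimates power under a step change in hazard.

```python
import random
import math

def stats_pair(t):
    """Unweighted maximum (cf. L_1) and weighted maximum (cf. L_3) of the
    twice-log likelihood ratio profile, change just after each t_k,
    largest observation censored.  Sketch only: the t_k- branch and the
    standardization used in L_2 and L_4 are omitted."""
    t = sorted(t)
    n = len(t)
    d = n - 1
    S_n = sum(t)
    L1 = L3 = -math.inf
    for k in range(1, n):
        S_k = sum(t[:k]) + (n - k) * t[k - 1]
        U = S_k / S_n
        f = lambda a, u: 0.0 if a == 0 else a * math.log(a / (d * u))
        Lk = 2.0 * (f(k, U) + f(d - k, 1.0 - U))
        L1 = max(L1, Lk)
        L3 = max(L3, Lk + math.log(4.0 * k * (n - k) / n ** 2))
    return L1, L3

def sample(n, rho, tau):
    """Failure times with hazard 1 before tau and rho after (lambda = 1),
    by inverting the cumulative hazard of a unit exponential draw."""
    out = []
    for _ in range(n):
        x = random.expovariate(1.0)
        out.append(x if x <= tau else tau + (x - tau) / rho)
    return out

random.seed(4)
n, reps = 31, 400
null = [stats_pair([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]
c1 = sorted(v[0] for v in null)[int(0.95 * reps)]   # crude 5% critical values
c3 = sorted(v[1] for v in null)[int(0.95 * reps)]
alt = [stats_pair(sample(n, 4.0, 0.7)) for _ in range(reps)]
pow1 = sum(v[0] > c1 for v in alt) / reps
pow3 = sum(v[1] > c3 for v in alt) / reps
```

With far larger batches, the two statistics standardized and combined as in Section 2, and a grid of (ρ, τ) values, this is the shape of the comparison summarized in Table 2.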
Cobb (1978) noted that, for more general change-point problems, the maximum likelihood estimator of the location of the change is not a sufficient statistic, and additional information about the location can be obtained by conditioning on appropriate ancillary statistics. However, to my knowledge the relationship between the estimated change-point and the value of any proposed test statistic has not previously been considered for other change-point problems. Therefore two statistics designed to detect the presence of a mean change in a sequence of independent normal random variables with common known variance, both of which also yield natural estimates of the change-point as part of the test procedure, have been studied by simulation. One is the likelihood ratio statistic as defined at (1) for this problem and studied in detail by James et al. (1987). The other is a score statistic suggested by Pettitt (1980), which can be considered to be a weighted form of likelihood ratio, with more weight on the more central potential change-points. The results suggest that a U-shaped marginal distribution is again obtained for the maximum unweighted likelihood estimate of the change-point, with a much flatter distribution occurring for the maximum weighted likelihood value. So, interpretation of the standard likelihood ratio test may depend upon the value of the estimated change-point, just as for the change-point hazard rate problem considered in this paper.

Such difficulties may also occur in other more general hypothesis testing problems when a nuisance parameter is present only under the alternative, as considered by Davies (1977). If the marginal distribution under the null hypothesis of the estimated nuisance parameter is markedly different from the prior distribution of that parameter under the alternative, then problems of sufficiency could arise. The possibility of exploiting the joint distribution of test statistic and parameter estimator seems to be worth investigation, or perhaps tests should be made conditional upon the estimator value.

REFERENCES

COBB, G. W. (1978). The problem of the Nile: Conditional solution to the changepoint problem. Biometrika 65.
COX, D. R. & HINKLEY, D. V. (1974). Theoretical Statistics. London: Chapman and Hall.
DAVIES, R. B. (1977). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64.
GRADSTEYN, I. S. & RYZHIK, I. M. (1980). Table of Integrals, Series and Products, corrected and enlarged ed. London: Academic Press.
JAMES, B., JAMES, K. L. & SIEGMUND, D. (1987). Tests for a change-point. Biometrika 74.
MATTHEWS, D. E. & FAREWELL, V. T. (1982). On testing for a constant hazard against a change-point alternative. Biometrics 38.
MATTHEWS, D. E. & FAREWELL, V. T. (1985). On a singularity in the likelihood for a change-point hazard rate model. Biometrika 72.
NGUYEN, H. T., ROGERS, G. S. & WALKER, E. A. (1984). Estimation in change-point hazard rate models. Biometrika 71.
NOE, M. (1972). The calculation of distributions of two-sided Kolmogorov-Smirnov type statistics. Ann. Math. Statist. 43.
PETTITT, A. N. (1980). A simple cumulative sum type statistic for the change point problem with zero-one observations. Biometrika 67.
WORSLEY, K. J. (1988). Exact percentage points of the likelihood-ratio test for a change-point hazard-rate model. Biometrics 44.
YAO, Y. C. (1986). Maximum likelihood estimation in hazard rate models with a change-point. Comm. Statist. A 15.

[Received November. Revised July 1990]


The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

3. Mathematical Induction

3. Mathematical Induction 3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

More information

A Robustness Simulation Method of Project Schedule based on the Monte Carlo Method

A Robustness Simulation Method of Project Schedule based on the Monte Carlo Method Send Orders for Reprints to reprints@benthamscience.ae 254 The Open Cybernetics & Systemics Journal, 2014, 8, 254-258 Open Access A Robustness Simulation Method of Project Schedule based on the Monte Carlo

More information

Stat 5102 Notes: Nonparametric Tests and. confidence interval

Stat 5102 Notes: Nonparametric Tests and. confidence interval Stat 510 Notes: Nonparametric Tests and Confidence Intervals Charles J. Geyer April 13, 003 This handout gives a brief introduction to nonparametrics, which is what you do when you don t believe the assumptions

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Survival Analysis of Left Truncated Income Protection Insurance Data. [March 29, 2012]

Survival Analysis of Left Truncated Income Protection Insurance Data. [March 29, 2012] Survival Analysis of Left Truncated Income Protection Insurance Data [March 29, 2012] 1 Qing Liu 2 David Pitt 3 Yan Wang 4 Xueyuan Wu Abstract One of the main characteristics of Income Protection Insurance

More information

The CUSUM algorithm a small review. Pierre Granjon

The CUSUM algorithm a small review. Pierre Granjon The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................

More information

Normal distribution. ) 2 /2σ. 2π σ

Normal distribution. ) 2 /2σ. 2π σ Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Life Table Analysis using Weighted Survey Data

Life Table Analysis using Weighted Survey Data Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using

More information

Statistical estimation using confidence intervals

Statistical estimation using confidence intervals 0894PP_ch06 15/3/02 11:02 am Page 135 6 Statistical estimation using confidence intervals In Chapter 2, the concept of the central nature and variability of data and the methods by which these two phenomena

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

1 Error in Euler s Method

1 Error in Euler s Method 1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of

More information

LOGNORMAL MODEL FOR STOCK PRICES

LOGNORMAL MODEL FOR STOCK PRICES LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as

More information

8 Divisibility and prime numbers

8 Divisibility and prime numbers 8 Divisibility and prime numbers 8.1 Divisibility In this short section we extend the concept of a multiple from the natural numbers to the integers. We also summarize several other terms that express

More information

1 Prior Probability and Posterior Probability

1 Prior Probability and Posterior Probability Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which

More information

Lecture 13 - Basic Number Theory.

Lecture 13 - Basic Number Theory. Lecture 13 - Basic Number Theory. Boaz Barak March 22, 2010 Divisibility and primes Unless mentioned otherwise throughout this lecture all numbers are non-negative integers. We say that A divides B, denoted

More information

Tests for Two Survival Curves Using Cox s Proportional Hazards Model

Tests for Two Survival Curves Using Cox s Proportional Hazards Model Chapter 730 Tests for Two Survival Curves Using Cox s Proportional Hazards Model Introduction A clinical trial is often employed to test the equality of survival distributions of two treatment groups.

More information

We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: 9081 + 900b + 90c = 9001 + 100c + 10b

We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: 9081 + 900b + 90c = 9001 + 100c + 10b In this session, we ll learn how to solve problems related to place value. This is one of the fundamental concepts in arithmetic, something every elementary and middle school mathematics teacher should

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

1 The Brownian bridge construction

1 The Brownian bridge construction The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

More information

1 if 1 x 0 1 if 0 x 1

1 if 1 x 0 1 if 0 x 1 Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

More information

Language Modeling. Chapter 1. 1.1 Introduction

Language Modeling. Chapter 1. 1.1 Introduction Chapter 1 Language Modeling (Course notes for NLP by Michael Collins, Columbia University) 1.1 Introduction In this chapter we will consider the the problem of constructing a language model from a set

More information

The Sample Overlap Problem for Systematic Sampling

The Sample Overlap Problem for Systematic Sampling The Sample Overlap Problem for Systematic Sampling Robert E. Fay 1 1 Westat, Inc., 1600 Research Blvd., Rockville, MD 20850 Abstract Within the context of probability-based sampling from a finite population,

More information

NPV Versus IRR. W.L. Silber -1000 0 0 +300 +600 +900. We know that if the cost of capital is 18 percent we reject the project because the NPV

NPV Versus IRR. W.L. Silber -1000 0 0 +300 +600 +900. We know that if the cost of capital is 18 percent we reject the project because the NPV NPV Versus IRR W.L. Silber I. Our favorite project A has the following cash flows: -1 + +6 +9 1 2 We know that if the cost of capital is 18 percent we reject the project because the net present value is

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

START Selected Topics in Assurance

START Selected Topics in Assurance START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

The Kelly criterion for spread bets

The Kelly criterion for spread bets IMA Journal of Applied Mathematics 2007 72,43 51 doi:10.1093/imamat/hxl027 Advance Access publication on December 5, 2006 The Kelly criterion for spread bets S. J. CHAPMAN Oxford Centre for Industrial

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

CS 103X: Discrete Structures Homework Assignment 3 Solutions

CS 103X: Discrete Structures Homework Assignment 3 Solutions CS 103X: Discrete Structures Homework Assignment 3 s Exercise 1 (20 points). On well-ordering and induction: (a) Prove the induction principle from the well-ordering principle. (b) Prove the well-ordering

More information

Sample Induction Proofs

Sample Induction Proofs Math 3 Worksheet: Induction Proofs III, Sample Proofs A.J. Hildebrand Sample Induction Proofs Below are model solutions to some of the practice problems on the induction worksheets. The solutions given

More information

171:290 Model Selection Lecture II: The Akaike Information Criterion

171:290 Model Selection Lecture II: The Akaike Information Criterion 171:290 Model Selection Lecture II: The Akaike Information Criterion Department of Biostatistics Department of Statistics and Actuarial Science August 28, 2012 Introduction AIC, the Akaike Information

More information

Non Parametric Inference

Non Parametric Inference Maura Department of Economics and Finance Università Tor Vergata Outline 1 2 3 Inverse distribution function Theorem: Let U be a uniform random variable on (0, 1). Let X be a continuous random variable

More information

Stochastic Inventory Control

Stochastic Inventory Control Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the

More information

Likelihood: Frequentist vs Bayesian Reasoning

Likelihood: Frequentist vs Bayesian Reasoning "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and

More information

Metric Spaces. Chapter 7. 7.1. Metrics

Metric Spaces. Chapter 7. 7.1. Metrics Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some

More information

Monte Carlo testing with Big Data

Monte Carlo testing with Big Data Monte Carlo testing with Big Data Patrick Rubin-Delanchy University of Bristol & Heilbronn Institute for Mathematical Research Joint work with: Axel Gandy (Imperial College London) with contributions from:

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation

6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation 6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation Daron Acemoglu and Asu Ozdaglar MIT November 2, 2009 1 Introduction Outline The problem of cooperation Finitely-repeated prisoner s dilemma

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

6 PROBABILITY GENERATING FUNCTIONS

6 PROBABILITY GENERATING FUNCTIONS 6 PROBABILITY GENERATING FUNCTIONS Certain derivations presented in this course have been somewhat heavy on algebra. For example, determining the expectation of the Binomial distribution (page 5.1 turned

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Estimating Weighing Uncertainty From Balance Data Sheet Specifications

Estimating Weighing Uncertainty From Balance Data Sheet Specifications Estimating Weighing Uncertainty From Balance Data Sheet Specifications Sources Of Measurement Deviations And Uncertainties Determination Of The Combined Measurement Bias Estimation Of The Combined Measurement

More information

AN ANALYSIS OF A WAR-LIKE CARD GAME. Introduction

AN ANALYSIS OF A WAR-LIKE CARD GAME. Introduction AN ANALYSIS OF A WAR-LIKE CARD GAME BORIS ALEXEEV AND JACOB TSIMERMAN Abstract. In his book Mathematical Mind-Benders, Peter Winkler poses the following open problem, originally due to the first author:

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

Research Article Batch Scheduling on Two-Machine Flowshop with Machine-Dependent Setup Times

Research Article Batch Scheduling on Two-Machine Flowshop with Machine-Dependent Setup Times Hindawi Publishing Corporation Advances in Operations Research Volume 2009, Article ID 153910, 10 pages doi:10.1155/2009/153910 Research Article Batch Scheduling on Two-Machine Flowshop with Machine-Dependent

More information

3 Some Integer Functions

3 Some Integer Functions 3 Some Integer Functions A Pair of Fundamental Integer Functions The integer function that is the heart of this section is the modulo function. However, before getting to it, let us look at some very simple

More information

8 Primes and Modular Arithmetic

8 Primes and Modular Arithmetic 8 Primes and Modular Arithmetic 8.1 Primes and Factors Over two millennia ago already, people all over the world were considering the properties of numbers. One of the simplest concepts is prime numbers.

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Using simulation to calculate the NPV of a project

Using simulation to calculate the NPV of a project Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Unit 26 Estimation with Confidence Intervals

Unit 26 Estimation with Confidence Intervals Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference

More information

Modelling the Scores of Premier League Football Matches

Modelling the Scores of Premier League Football Matches Modelling the Scores of Premier League Football Matches by: Daan van Gemert The aim of this thesis is to develop a model for estimating the probabilities of premier league football outcomes, with the potential

More information

Monte Carlo Methods in Finance

Monte Carlo Methods in Finance Author: Yiyang Yang Advisor: Pr. Xiaolin Li, Pr. Zari Rachev Department of Applied Mathematics and Statistics State University of New York at Stony Brook October 2, 2012 Outline Introduction 1 Introduction

More information

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

Tests for Two Proportions

Tests for Two Proportions Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics

More information

An example of a computable

An example of a computable An example of a computable absolutely normal number Verónica Becher Santiago Figueira Abstract The first example of an absolutely normal number was given by Sierpinski in 96, twenty years before the concept

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information