# 1 The Pareto Distribution


Estimating the Parameters of a Pareto Distribution: Introducing a Quantile Regression Method

Joseph Lee Petersen

## Introduction

A broad approach to using correlation coefficients for parameter estimation, and not merely as descriptive statistics, has been developed [1,2,3,4]. It was a goal of this project to extend these ideas specifically to estimating the parameters of the Pareto distribution. In this paper we recall the definition of the Pareto distribution, some basic properties, and some previously developed methods of estimating the parameters of the Pareto distribution from which a random sample comes. We introduce a new parameter estimation scheme based on correlation coefficients. Finally, we study and compare the performance of each of the parameter estimation schemes.

## 1 The Pareto Distribution

The Pareto distribution was first proposed as a model for the distribution of incomes. It is also used as a model for the distribution of city populations within a given area. The Pareto distribution is defined by the following functions:

$$\text{CDF: } F(x \mid \alpha, k) = 1 - \left(\frac{k}{x}\right)^{\alpha}; \quad k \le x < \infty; \quad \alpha, k > 0$$

$$\text{PDF: } f(x \mid \alpha, k) = \frac{\alpha k^{\alpha}}{x^{\alpha+1}}; \quad k \le x < \infty; \quad \alpha, k > 0$$

The parameter $k$ marks a lower bound on the possible values that a Pareto distributed random variable can take on. To illustrate, figure 1 shows a plot of the density of a Pareto(1, 1) random variable. A few well known properties follow:

$$E(X) = \frac{\alpha k}{\alpha - 1}, \quad \alpha > 1$$

$$Var(X) = \frac{\alpha k^{2}}{(\alpha - 1)^{2}(\alpha - 2)}, \quad \alpha > 2$$
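As a concrete companion to the definitions above, the following is a minimal Python sketch of the CDF, PDF, mean, and variance. The function names are illustrative and not from the paper, and returning `inf` where the defining integral does not converge is a convention assumed here.

```python
import math


def pareto_cdf(x, k, alpha):
    """F(x | alpha, k) = 1 - (k / x)**alpha for x >= k, and 0 below k."""
    if x < k:
        return 0.0
    return 1.0 - (k / x) ** alpha


def pareto_pdf(x, k, alpha):
    """f(x | alpha, k) = alpha * k**alpha / x**(alpha + 1) for x >= k."""
    if x < k:
        return 0.0
    return alpha * k ** alpha / x ** (alpha + 1)


def pareto_mean(k, alpha):
    """E(X) = alpha * k / (alpha - 1); reported as inf when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * k / (alpha - 1)


def pareto_var(k, alpha):
    """Var(X) = alpha * k**2 / ((alpha - 1)**2 * (alpha - 2)); inf when alpha <= 2."""
    if alpha <= 2:
        return math.inf
    return alpha * k ** 2 / ((alpha - 1) ** 2 * (alpha - 2))


# Half of the probability mass of a Pareto(1, 1) variable lies below x = 2.
print(pareto_cdf(2.0, 1.0, 1.0))
```

Arguments are ordered as (k, α) to match the Pareto(k, α) labels used for the simulations.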

## 2 Parameter Estimation

We are interested in estimating the parameters of the Pareto distribution from which a random sample comes. We will outline a few parameter estimation schemes.

### 2.1 Method of Moments

We modify the usual method of moments scheme according to a method laid out in Johnson and Kotz [5]. If we set the sample mean equal to the distribution's theoretical expected value given above, and if we set the sample minimum, $x_1$, equal to the theoretical expected value of the minimum of a size $n$ sample of Pareto($k$, $\alpha$) random variables, we obtain two equations in two unknowns:

$$\bar{x} = \frac{\alpha k}{\alpha - 1}, \qquad x_1 = \frac{n \alpha k}{n \alpha - 1}$$

Solving these equations yields the following estimators:

$$\hat{\alpha} = \frac{n \bar{x} - x_1}{n(\bar{x} - x_1)}, \qquad \hat{k} = \frac{(n \hat{\alpha} - 1)\, x_1}{n \hat{\alpha}}$$

### 2.2 Median Estimator

As far as the author knows, this is a new estimator. The idea is that in the method of moments we set the sample mean equal to the theoretical mean, so here we set the sample median equal to the theoretical median. Many of the estimation schemes discussed in this paper were first studied in the case where $k$ was known to be equal to one, so that only $\alpha$ needed to be estimated. In this case we can see from the CDF above that if $x = m_s$, the median of the sample, we can estimate $\alpha$ as follows:

$$1 - m_s^{-\alpha} = 0.5 \implies \hat{\alpha} = \frac{\ln 2}{\ln m_s}$$

It is likely, though, that we will often be interested in cases in which $k$ is not equal to one. If we already have an estimate for $k$, call it $k_{est}$, we make the following adjustment to this estimate for $\alpha$:

$$\hat{\alpha} = \frac{\ln 2}{\ln\left(m_s / k_{est}\right)}$$

For the purposes of analyzing the performance of this estimator, we will use the minimum sample value as the estimate for $k$.

### 2.3 Maximum Likelihood

The likelihood function, $L$, for the Pareto distribution has the following form:

$$L(k, \alpha \mid x) = \prod_{i=1}^{n} \frac{\alpha k^{\alpha}}{x_i^{\alpha+1}}; \quad 0 < k \le \min\{x_i\}, \quad \alpha > 0$$

Recall that the likelihood function tells us, as a function of the distribution parameters, how likely it is to have observed the data that we did in fact observe. The maximum likelihood estimates for $k$ and $\alpha$ are the values of $k$ and $\alpha$ that make $L$ as large as possible given the data we have. The most familiar method of maximizing functions involves calculus. However, we need no calculus to see that $L$ increases as $k$ increases. It is key then to recall that $k$ can be no larger than the smallest value of $x$ in our data, so the best we can do in maximizing $L$ by adjusting $k$ is as follows:

$$\hat{k} = \min\{x_i\}$$

In order to find the maximum likelihood estimate for $\alpha$, calculus is appropriate. Since $L$ is positive, we can take its logarithm. We do this because it is easier to differentiate $\log L$ than $L$ itself. The logarithm is a strictly increasing function, so the value of $\alpha$ that maximizes $L$ also maximizes $\log L$. The process in brief looks like this:

$$\log L(k, \alpha \mid x) = \sum_{i=1}^{n} \log\left(\frac{\alpha k^{\alpha}}{x_i^{\alpha+1}}\right) = n \log(\alpha) + \alpha n \log(k) - (\alpha + 1) \sum_{i=1}^{n} \log(x_i)$$

$$\frac{d}{d\alpha} \log L = \frac{n}{\alpha} + n \log(k) - \sum_{i=1}^{n} \log(x_i)$$

Setting the derivative equal to zero, a little algebra, and an omitted second derivative check to confirm we are maximizing $L$ rather than minimizing it yield:

$$\hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \log\left(x_i / \hat{k}\right)}$$

### 2.4 Correlation Coefficients

Gideon [4] has shown, using a correlation based interpretation of linear regression, that the mean and standard deviation for a normally distributed data set of size $n$ can be estimated by regression of the sorted data on the 1st through $n$th $(n+1)$-tiles of the standard normal. In such a regression, the intercept of the fitted linear model serves as an unbiased estimate of the mean of the distribution from which the data came, and the slope of the fitted linear model serves as an unbiased estimate of the standard deviation. This quantile regression estimation scheme is not only appropriate for normally distributed data; it works for distributions from any scale regular family.¹

¹ A scale regular family is a family of distributions such that for any member in the family there is a scalar that can multiply the member to yield another member in the same family with unit variance. A common example of the use of the scale regular property of a scale regular family is standardizing a Normally distributed random variable.

The connection between regression and correlation is laid out explicitly in Gideon [4]. We will describe the connection here in brief. The value of $s$ that satisfies the following equation (see Gideon [1,2,3,4]) estimates the standard deviation:

$$r(q, u^{o} - s q) = 0$$

where $q$ is a vector of the 1st through $n$th $(n+1)$-tiles of the standard distribution, $n$ is the sample size of the data, $u^{o}$ is a vector of the ordered data, and $r$ is any correlation coefficient. When $r$ is Pearson's correlation, the solution is exactly the least squares estimate of the slope of a linear model. As for an estimate of the model intercept, if $r$ is Pearson's, the estimate is $\bar{u}^{o} - s \bar{q}$. If $r$ is Kendall's r or the Greatest Deviation Correlation Coefficient (see Gideon [2]), the estimate of the intercept is $\text{median}(u^{o}) - s \cdot \text{median}(q)$. Again, if $r$ is Pearson's correlation coefficient, the solution is exactly the least squares line. However, it is important to note that $r$ need not be Pearson's $r$; other correlation statistics will yield estimates of the slope and intercept of the line, perhaps with more desirable properties.

We will now describe how these ideas can be applied to the Pareto distribution. Let $X$ be a Pareto($k$, $\alpha$) distributed random variable. Then $U = \ln X$ is a two-parameter Exponentially distributed random variable with parameters $\lambda$ and $\theta$. That is, $U$ has probability density function

$$f(u) = \frac{1}{\lambda} e^{-(u - \theta)/\lambda}, \quad u \ge \theta,$$

where $\lambda = 1/\alpha$ and $\theta = \ln k$. Its expected value is $\lambda + \theta$ or, in terms of the original Pareto random variable, $1/\alpha + \ln k$, and its variance is $\lambda^{2}$ or $1/\alpha^{2}$. (This means the standard deviation is $1/\alpha$.) It should be noted now that the Exponential distributions are a scale regular family. The random variable $Z = \alpha U$ has unit scale: its probability density function is $f(z) = e^{-(z - \alpha\theta)}$ for $z \ge \alpha\theta$, and so its cumulative distribution function is $F(z) = 1 - e^{-(z - \alpha\theta)}$. Apart from the location shift $\alpha\theta$, $Z$ is a standard Exponential.
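The claim that U = ln X is Exponential with mean 1/α + ln k and standard deviation 1/α can be checked numerically. The sketch below is only an illustration: the helper `sample_pareto`, the seed, and the parameter values k = 2, α = 4 are assumptions chosen for the check, not values from the paper.

```python
import math
import random
import statistics


def sample_pareto(n, k, alpha, rng):
    """Invert the Pareto CDF: x = k * (1 - u)**(-1/alpha), u ~ Uniform[0, 1)."""
    return [k * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


rng = random.Random(0)          # fixed seed so the check is repeatable
k, alpha = 2.0, 4.0             # illustrative parameter values
u = [math.log(x) for x in sample_pareto(20000, k, alpha, rng)]

# Sample moments of the log-transformed data next to the theoretical values.
print("mean of ln X:", statistics.mean(u), "theory:", 1.0 / alpha + math.log(k))
print("sd of ln X:  ", statistics.stdev(u), "theory:", 1.0 / alpha)
print("min of ln X: ", min(u), "lower bound ln k:", math.log(k))
```

With a sample this large, the sample mean and standard deviation should land close to the theoretical values, and no log-transformed value falls below ln k.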
These facts suggest that if we log transform data which we hypothesize to be Pareto distributed, we can solve $r(q, u^{o} - s q) = 0$, $q$ being the quantiles of the standard Exponential, to get an estimate of the scale parameter of the Exponential distribution to which the log transformed Pareto data corresponds. The appropriate center measure, $i$, of the uncentered residuals then gives an estimate of the location parameter of that Exponential distribution. To express these estimates in terms of the Pareto parameters, we estimate $k$ by $e^{i}$ and $\alpha$ by $1/s$. In our studies we used Pearson's $r$ and the Greatest Deviation Correlation Coefficient (see Gideon [1,2]) to evaluate the performance of this estimation scheme.
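The scheme just described can be sketched in Python for the Pearson case. Several choices here are assumptions rather than the paper's implementation: the function names are hypothetical, the $(n+1)$-tiles of the standard Exponential are taken to be $-\ln(1 - i/(n+1))$, the root of $r(q, u^{o} - sq) = 0$ is found by plain bisection, and the Greatest Deviation Correlation Coefficient is not implemented.

```python
import math
import random


def exp_quantiles(n):
    """1st through nth (n+1)-tiles of the standard Exponential: -ln(1 - i/(n+1))."""
    return [-math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]


def pearson_r(a, b):
    """Pearson's correlation; returns 0.0 if either input has zero variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def solve_scale(q, u, r=pearson_r):
    """Bisection for the s with r(q, u - s*q) = 0 (assumes r > 0 at s = 0)."""
    def corr_at(s):
        return r(q, [ui - s * qi for ui, qi in zip(u, q)])
    lo, hi = 0.0, 1.0
    while corr_at(hi) > 0:          # widen the bracket until the sign changes
        lo, hi = hi, 2.0 * hi
    for _ in range(100):            # 100 halvings exhaust double precision
        mid = 0.5 * (lo + hi)
        if corr_at(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def quantile_regression_fit(data, r=pearson_r):
    """Return (k_hat, alpha_hat) from the sorted, log-transformed data."""
    u = sorted(math.log(x) for x in data)
    q = exp_quantiles(len(u))
    s = solve_scale(q, u, r)
    # Pearson-style intercept: mean(u) - s * mean(q)
    i = sum(u) / len(u) - s * sum(q) / len(q)
    return math.exp(i), 1.0 / s


rng = random.Random(1)
# Illustrative Pareto(k=1, alpha=2) sample generated by inverting the CDF.
sample = [(1.0 - rng.random()) ** (-1.0 / 2.0) for _ in range(2000)]
k_hat, alpha_hat = quantile_regression_fit(sample)
print(k_hat, alpha_hat)
```

Passing a different correlation function as `r` changes the slope estimate, but the intercept in this sketch always uses the mean-based form; a median-based intercept would be needed to follow the text for non-Pearson coefficients.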

## 3 An Example

We used the probability integral transformation to write a function in SPlus that would generate random samples of Pareto distributed data. For this example we generated a size 100 sample from a Pareto(1, 0.5) distribution. We can see a histogram of this data in figure 2. We've truncated the histogram in order to compare it to the density plot in figure 1, and because there are some extreme outliers that would have made the histogram not useful had they been included. (The max was 5715 and the next highest was 1581.)

The next plot shows the sorted logarithm transformed data plotted as a function of the first through 100th 101-tiles of an Exponential(1) random variable. The dotted line is a least squares regression line. The solid line is a Greatest Deviation based regression line.

The intercept and slope of the least squares line are … and …, which imply estimates for k and α of … and …. The intercept and slope of the GD line are … and …, which imply estimates for k and α of … and …. Method of moments estimates k = … and α = ….

It is important to note how badly the method of moments does in estimating α. The reason for this is that the integral defining variance for a Pareto distribution does not converge if α is less than or equal to two, and similarly the integral defining the distribution's mean is infinite if α is less than or equal to one. This means that the distribution is prone to extreme outliers. The limit of the method of moments estimate of α as the sample mean goes to infinity is one, so one extreme outlier can yield an estimate of one for α.

The maximum likelihood estimates are k = … and α = …. The median estimate for α, assuming a k estimate of the sample minimum (1.035), is ….

Finally, to illustrate the equation $r(q, u^{o} - s q) = 0$, we can see a plot of $r(q, u^{o} - s q)$ as a function of $s$. Remember that $u^{o}$ is actually the logarithm of the sorted presumed Pareto data and hence is distributed Exponentially. The value of $s$ where the graph crosses the $s$ axis is the correlation estimate of the standard deviation of the Exponential distribution.

Our estimate of α is …, which was obtained using a midpoint algorithm, more precisely than can be done by looking at a graph. Recall that solving the correlation equation is embedded in the least squares regression. This is why the correlation estimate is the same as that from the LS regression.
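The closed-form estimators used in this example can be sketched as follows. This is a Python illustration, not the author's SPlus code: the function names and the seed are assumptions, and the sample it draws will not reproduce the paper's figures.

```python
import math
import random
import statistics


def sample_pareto(n, k, alpha, rng):
    """Probability integral transform: x = k * (1 - u)**(-1/alpha), u ~ Uniform[0, 1)."""
    return [k * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def mom_estimates(x):
    """Modified method of moments based on the sample mean and sample minimum."""
    n, x_bar, x_min = len(x), sum(x) / len(x), min(x)
    alpha = (n * x_bar - x_min) / (n * (x_bar - x_min))
    k = (n * alpha - 1.0) * x_min / (n * alpha)
    return k, alpha


def ml_estimates(x):
    """Maximum likelihood: k_hat = min(x), alpha_hat = n / sum(log(x_i / k_hat))."""
    k = min(x)
    alpha = len(x) / sum(math.log(xi / k) for xi in x)
    return k, alpha


def median_alpha(x, k_est):
    """Median estimator: alpha_hat = ln 2 / ln(sample median / k_est)."""
    return math.log(2.0) / math.log(statistics.median(x) / k_est)


rng = random.Random(2)
data = sample_pareto(100, 1.0, 0.5, rng)   # size 100 from Pareto(k=1, alpha=0.5)
print("method of moments (k, alpha):", mom_estimates(data))
print("maximum likelihood (k, alpha):", ml_estimates(data))
print("median estimator alpha:", median_alpha(data, min(data)))
```

Because the method of moments estimate of α is pulled toward one whenever the sample mean is dominated by an outlier, runs with α = 0.5 will typically show it far from the true value while the other two estimators stay closer.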

## 4 Estimator Performance

It was interesting to see one example, but to truly evaluate the performance of estimators, we need to do some simulations. Our simulations were done by setting a value for k and α. SPlus generated a size 100 sample of Pareto(k, α) data and estimated k using one of the parameter estimation schemes we discussed earlier. It repeated this 1000 times. The mean, median, and standard error (SE) of those 1000 estimates were computed. The estimated bias was calculated as the mean minus the true value of the parameter. The mean squared error (MSE) was calculated as the bias squared plus the SE squared. This process was repeated for α using the same estimation scheme and the same values of k and α to generate the random samples. When all the estimation schemes were tested, the entire process was repeated using different values of k and α to generate the data. We tested for values of k at 1, 1,000, and 1,000,000, and for each value of k, we tested for values of α at 0.1, 1, 10, 100, and 1,000.

The estimation schemes were GDCC based quantile regression for k (GDqk), GDCC based quantile regression for α (GDqα), least squares based quantile regression for k (LSqk), least squares based quantile regression for α (LSqα), method of moments for k (momk), method of moments for α (momα), maximum likelihood for k (MLk), maximum likelihood for α (MLα), median estimator for α assuming k equal to the sample minimum (medα), and a Pearson's correlation based estimate for α. For the purposes of reporting in this paper, SPlus rounded all results to 3 digits as per its significant digits function. The results of one simulation can be found here. The output for the other fourteen simulations can be found at the end of the paper.

[Table: Simulation Statistics for Pareto(1, 0.1). Columns: Mean, Median, SE, Bias, MSE. Rows: GDqk, GDqα, LSqk, LSqα, momk, momα, MLk, MLα, medα. The numeric entries are missing from this transcription.]
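The simulation procedure described above can be sketched in Python for a single estimator, here the maximum likelihood estimate of α. The original study used SPlus; the function names, the seed, and the use of the sample standard deviation of the 1000 estimates as the SE are assumptions of this sketch.

```python
import math
import random
import statistics


def sample_pareto(n, k, alpha, rng):
    """Invert the Pareto CDF: x = k * (1 - u)**(-1/alpha), u ~ Uniform[0, 1)."""
    return [k * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def ml_alpha(x):
    """Maximum likelihood estimate of alpha with k estimated by min(x)."""
    k_hat = min(x)
    return len(x) / sum(math.log(xi / k_hat) for xi in x)


def simulate(estimator, true_value, k, alpha, n=100, reps=1000, seed=0):
    """Apply one estimator to `reps` samples of size `n` and summarize the estimates."""
    rng = random.Random(seed)
    estimates = [estimator(sample_pareto(n, k, alpha, rng)) for _ in range(reps)]
    mean = statistics.mean(estimates)
    se = statistics.stdev(estimates)      # SE taken as the sd of the estimates
    bias = mean - true_value              # mean minus the true parameter value
    return {
        "mean": mean,
        "median": statistics.median(estimates),
        "se": se,
        "bias": bias,
        "mse": bias ** 2 + se ** 2,       # bias squared plus SE squared
    }


result = simulate(ml_alpha, true_value=1.0, k=1.0, alpha=1.0)
for name, value in result.items():
    print(name, round(value, 4))
```

Running the same `simulate` call with a different estimator function and true value would produce the remaining rows of a table like the ones reported for each (k, α) pair.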

## 5 Results

Something very fascinating is revealed in looking at all the output. The method of moments approach, despite being totally inappropriate for estimating α, performed consistently best as an estimator for k. There were only two instances in which the GD based quantile regression method was less biased than method of moments (k = 1,000 and α = 100; k = 1,000,000 and α = 1) and one instance (k = 1,000,000 and α = 0.1) in which maximum likelihood yielded a smaller MSE. As for estimating α, the trend seemed to be that the GD based quantile regression estimator was in all cases the least biased estimator, and maximum likelihood yielded the least MSE for low values of α while method of moments yielded the least MSE for high values of α.

## 6 Conclusion

Among the estimators studied, method of moments clearly performed the best for estimating k. While this is interesting theoretically, it could be less interesting practically. On page 242, Johnson and Kotz [5] say that the Pareto distribution serves well as a model for incomes at the extremities, but not as well over the entire range of income levels. We are guessing then that if one were to try to fit a Pareto model to a set of data, one would first determine through other means at what value k the Pareto model takes over, and then only be interested in the power parameter α that models the data greater than k. However, if one does need to estimate k, we recommend the method of moments estimator.

As for estimating α, the Greatest Deviation based quantile regression proved to be the least biased estimator, but it generally had a 20% greater standard error than maximum likelihood. Maximum likelihood yielded smaller mean squared error than the Greatest Deviation based quantile regression over the parameter values we studied. If one needs an unbiased estimator, we recommend the Greatest Deviation based quantile regression estimator. If small MSE is desired, we recommend maximum likelihood estimation.
Part of the motivation for this project was to develop a parameter estimation scheme for the Pareto distribution that fits into the broader subject of using correlation statistics for parameter estimation and not merely as descriptive statistics. So far the quantile regression approach has only been applied to symmetric distributions such as the Normal, Gosset's t, and Cauchy. We see that correlations, specifically in the case of the Greatest Deviation Correlation Coefficient, can produce good estimates for skewed distributions. The GD quantile regression approach produced estimates for α that were relatively unbiased.

## 7 Complete Simulation Output

Each of the following tables reports the columns Mean, Median, SE, Bias, and MSE for the rows GDqk, GDqα, LSqk, LSqα, momk, momα, MLk, MLα, and medα. The numeric entries are missing or truncated in this transcription.

- Simulation Statistics for Pareto(1, 1)
- Simulation Statistics for Pareto(1, 10)
- Simulation Statistics for Pareto(1, 100)
- Simulation Statistics for Pareto(1, 1000)
- Simulation Statistics for Pareto(1000, 0.1)
- Simulation Statistics for Pareto(1000, 1)
- Simulation Statistics for Pareto(1000, 10)
- Simulation Statistics for Pareto(1000, 100)
- Simulation Statistics for Pareto(1000, 1000)
- Simulation Statistics for Pareto(1mil, 0.1)
- Simulation Statistics for Pareto(1mil, 1)
- Simulation Statistics for Pareto(1mil, 10)
- Simulation Statistics for Pareto(1mil, 100)
- Simulation Statistics for Pareto(1mil, 1000)

## References

[1] Rudy A. Gideon, Generalized Interpretation of Pearson's r, unpublished paper (URL: …).

[2] Rudy A. Gideon, Random Variables, Regression, and the Greatest Deviation Correlation Coefficient, unpublished paper (URL: …).

[3] Rudy A. Gideon, Correlation in Simple Linear Regression, unpublished paper (URL: … SPACE-REG.pdf).

[4] Rudy A. Gideon, Location and Scale Estimation with Correlation Coefficients, unpublished paper (URL: …).

[5] Norman L. Johnson and Samuel Kotz, Distributions in Statistics: Continuous Univariate Distributions - 1, Houghton Mifflin Company, Boston, …

### Maximum Likelihood Estimation

Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 2 Simple Linear Regression

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 2 Simple Linear Regression Hi, this is my second lecture in module one and on simple

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Statistical Foundations: Measures of Location and Central Tendency and Summation and Expectation

Statistical Foundations: and Central Tendency and and Lecture 4 September 5, 2006 Psychology 790 Lecture #4-9/05/2006 Slide 1 of 26 Today s Lecture Today s Lecture Where this Fits central tendency/location

### Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics

STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems

Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed

### Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page

Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8

### NPTEL STRUCTURAL RELIABILITY

NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### Properties of Future Lifetime Distributions and Estimation

Properties of Future Lifetime Distributions and Estimation Harmanpreet Singh Kapoor and Kanchan Jain Abstract Distributional properties of continuous future lifetime of an individual aged x have been studied.

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### Gamma Distribution Fitting

Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

### Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

### Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

### The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

### AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

### Lecture 8: More Continuous Random Variables

Lecture 8: More Continuous Random Variables 26 September 2005 Last time: the eponential. Going from saying the density e λ, to f() λe λ, to the CDF F () e λ. Pictures of the pdf and CDF. Today: the Gaussian

### OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance

Lecture 5: Hypothesis Testing What we know now: OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance (if the Gauss-Markov

### Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. b 0, b 1 Q = n (Y i (b 0 + b 1 X i )) 2 i=1 Minimize this by maximizing

### Linear and Piecewise Linear Regressions

Tarigan Statistical Consulting & Coaching statistical-coaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Hands-on Data Analysis

### Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2015 CS 551, Fall 2015

### Review of Fundamental Mathematics

Review of Fundamental Mathematics As explained in the Preface and in Chapter 1 of your textbook, managerial economics applies microeconomic theory to business decision making. The decision-making tools

### 4. Introduction to Statistics

Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

### Some Notes on Taylor Polynomials and Taylor Series

Some Notes on Taylor Polynomials and Taylor Series Mark MacLean October 3, 27 UBC s courses MATH /8 and MATH introduce students to the ideas of Taylor polynomials and Taylor series in a fairly limited

### Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

### Sampling Distribution of a Normal Variable

Ismor Fischer, 5/9/01 5.-1 5. Formal Statement and Examples Comments: Sampling Distribution of a Normal Variable Given a random variable. Suppose that the population distribution of is known to be normal,

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Senior Secondary Australian Curriculum

Senior Secondary Australian Curriculum Mathematical Methods Glossary Unit 1 Functions and graphs Asymptote A line is an asymptote to a curve if the distance between the line and the curve approaches zero

### Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:
