# Constrained Bayes and Empirical Bayes Estimator Applications in Insurance Pricing


*Communications for Statistical Applications and Methods*, 2013, Vol. 20, No. 4

Myung Joon Kim (Department of Business Statistics, Hannam University) and Yeong-Hwa Kim (Department of Applied Statistics, Chung-Ang University). This paper was supported by the 2013 Hannam University Research Fund. Corresponding author: Yeong-Hwa Kim, Professor, Department of Applied Statistics, Chung-Ang University, 221 Heuksuk-dong, Dongjak-gu, Seoul, Korea.

## Abstract

Bayesian and empirical Bayesian methods have become quite popular in the theory and practice of statistics. Often, however, the objective is not only to produce point estimates but also to produce an ensemble of parameter estimates whose histogram is close to that of the parameters. For example, in insurance pricing, an accurate point estimate of the risk of each group is necessary, and the dispersion across groups must also be estimated properly. The well-known Bayes estimates (the posterior means under quadratic loss) are underdispersed as an estimate of the histogram of the parameters. The adjustment of Bayes estimates that corrects this problem by matching the first two empirical moments is known as the constrained Bayes estimator. In this paper, we propose a way to apply constrained Bayes estimators to insurance pricing, where both location and dispersion must be estimated accurately. The benefit of the constrained Bayes estimates is also illustrated by analyzing real insurance accident data.

**Keywords:** Bayes estimate, constrained Bayes estimate, insurance pricing

## 1. Introduction

Bayesian techniques are widely used for the simultaneous estimation of several parameters in compound decision problems. Under quadratic loss, the Bayes estimates are the posterior means of the parameters of interest; they minimize the squared error loss and thus focus on the precision of estimation. Often, however, the goal is to produce an ensemble of parameter estimates whose histogram is reasonably close to the histogram of the population parameters. These twin objectives usually conflict, in the sense that one is achieved at the expense of the other. The posterior means are optimal under quadratic loss, but it can be shown that their histogram is underdispersed as an estimate of the histogram of the parameters, and is therefore inappropriate for estimating the parameter histogram. There is thus a need for a set of suboptimal estimates that compromise between the two criteria. Louis (1984) proposed a constrained Bayes method for normal means that minimizes the posterior expected squared distance between parameters and estimates subject to matching the first two empirical moments. Ghosh (1992) then generalized Louis's result to arbitrary, not necessarily normal, distributions.

Consider a situation where the data are denoted by $x$ and $\theta = (\theta_1, \ldots, \theta_m)^T$ is the parameter of interest. The basic idea is to find an estimate of $\theta$ that minimizes

$$\sum_{i=1}^{m} E\left[(\theta_i - t_i)^2 \mid x\right] \tag{1.1}$$

within the class of all estimates $t(x) = t = (t_1, \ldots, t_m)^T$ of $\theta$ that satisfy the two conditions

$$E\left(\bar{\theta} \mid x\right) = m^{-1}\sum_{i=1}^{m} t_i(x) = \bar{t}(x), \tag{1.2}$$

$$\sum_{i=1}^{m} E\left[(\theta_i - \bar{\theta})^2 \mid x\right] = \sum_{i=1}^{m} \left\{t_i(x) - \bar{t}(x)\right\}^2. \tag{1.3}$$

Let $\theta^B$ denote the Bayes estimate of $\theta$ under quadratic loss. It is easy to see that the Bayes estimate satisfies (1.2); for (1.3), however, one can show that

$$\sum_{i=1}^{m} \left\{\theta_i^B(X) - \bar{\theta}^B(X)\right\}^2 < \sum_{i=1}^{m} E\left[(\theta_i - \bar{\theta})^2 \mid X\right]. \tag{1.4}$$

This clearly shows the limitation of the usual Bayes estimates in capturing the true variation of the parameter $\theta$, and why constrained Bayes estimates should be considered when both objectives matter. In Section 2, the constrained Bayes and constrained empirical Bayes estimates are described in detail. The insurance pricing application of these estimates is discussed in Section 3, where their benefit is demonstrated with numerical results from real insurance data; Section 4 discusses directions for further study.

## 2. Constrained and Constrained Empirical Bayes Estimates

### 2.1. Constrained Bayes estimate

The constrained Bayes (CB) estimate of $\theta$ minimizes (1.1) subject to (1.2) and (1.3); the main result is proved in Ghosh (1992). Stating it requires some notation: let $I_m$ denote the identity matrix of order $m$, $1_m$ the $m$-component column vector with every element equal to 1, and $J_m = 1_m 1_m^T$. The posterior expected second moment can then be written as

$$
\begin{aligned}
\sum_{i=1}^{m} E\left[(\theta_i - \bar{\theta})^2 \mid X\right]
&= \operatorname{tr}\left\{\left(I_m - m^{-1} J_m\right) E\left(\theta\theta^T \mid X\right)\right\} \\
&= \operatorname{tr}\left\{\left(I_m - m^{-1} J_m\right)\left[V(\theta \mid X) + E(\theta \mid X)\,E(\theta \mid X)^T\right]\right\} \\
&= \operatorname{tr}\left\{\left(I_m - m^{-1} J_m\right) V(\theta \mid X)\right\} + \sum_{i=1}^{m}\left\{\theta_i^B(X) - \bar{\theta}^B(X)\right\}^2 \\
&= \operatorname{tr}\left\{V\left(\theta - \bar{\theta} 1_m \mid X\right)\right\} + \sum_{i=1}^{m}\left\{\theta_i^B(X) - \bar{\theta}^B(X)\right\}^2.
\end{aligned} \tag{2.1}
$$
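The underdispersion in (1.4) is easy to verify numerically. The following minimal sketch (our own illustration, not part of the original paper) simulates the normal model used later in Section 2.2, where $X_i \mid \theta_i \sim N(\theta_i, 1)$, $\theta_i \sim N(\mu, A)$, and the Bayes estimates are $(1-B)X_i + B\mu$ with $B = 1/(1+A)$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, mu, A = 5000, 0.0, 2.0                  # many parameters, prior N(mu, A)
theta = rng.normal(mu, np.sqrt(A), m)      # true parameters theta_i
x = rng.normal(theta, 1.0)                 # observations X_i | theta_i ~ N(theta_i, 1)

B = 1.0 / (1.0 + A)                        # shrinkage factor for this model
theta_bayes = (1.0 - B) * x + B * mu       # posterior means (Bayes estimates)

# The Bayes estimates are underdispersed: their sample variance is close to
# A^2/(1+A) = 4/3, well below the parameter variance A = 2.
print(theta.var(), theta_bayes.var())
```

The histogram of `theta_bayes` is visibly narrower than that of `theta`, which is exactly the underdispersion that the constrained Bayes adjustment is designed to correct.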

The second term of (2.1) is the squared distance of the Bayes estimates from their average, and comparison with (1.4) makes the underdispersion of the Bayes estimate explicit. Let the first term of (2.1) be $H_1(x)$ and the second term be $H_2(x)$. The CB estimate $\theta^{CB}(x) = (\theta_1^{CB}(x), \ldots, \theta_m^{CB}(x))^T$ can then be stated as

$$\theta_i^{CB}(x) = a\,\theta_i^B(x) + (1 - a)\,\bar{\theta}^B(x), \quad i = 1, \ldots, m, \qquad a \equiv a(x) = \left[1 + \frac{H_1(x)}{H_2(x)}\right]^{\frac{1}{2}}. \tag{2.2}$$

The Bayes estimate for the exponential family can be derived as follows. Suppose that $X_1, \ldots, X_m$ are independent random variables, where $X_i$ has the pdf

$$f_{\phi_i}(x_i) = \exp\left\{n\phi_i x_i - n\psi(\phi_i)\right\}, \quad i = 1, \ldots, m,$$

so that each $X_i$ has a pdf belonging to a one-parameter exponential family. Assume the independent conjugate priors $g(\phi_i) \propto \exp\left\{\nu\phi_i\mu - \nu\psi(\phi_i)\right\}$ for the $\phi_i$. Then, under quadratic loss, the Bayes estimates of the $\theta_i = \psi'(\phi_i)$ are given by

$$\theta_i^B(x) = E(\theta_i \mid x) = E\left[\psi'(\phi_i) \mid x\right] = (1 - B)x_i + B\mu, \tag{2.3}$$

where $B = \nu/(n + \nu)$. Equation (2.2) looks like a convex combination of the Bayes estimates and their average, but the appearance is deceptive because the function $a(x)$ exceeds 1. Ghosh and Kim (2002) extended the result to vector-valued parameters, where it can be stated similarly.

### 2.2. Constrained empirical Bayes estimate

When the Bayesian model is true, the Bayes estimators have the smallest Bayes risk, and the CB estimators cannot claim any risk improvement over them. The same phenomenon is reflected in the comparison of the empirical Bayes (EB) and constrained empirical Bayes (CEB) estimators: the CEB estimators are not designed to improve on the EB estimators by producing smaller MSEs, but to meet the twin objectives mentioned earlier more satisfactorily. Louis (1984) proposed the CEB estimators in the original James-Stein framework; Ghosh (1992) developed the CB estimators in a more general framework in which the distribution is not necessarily normal. In this section, for clarity, we work under the normality assumption. Consider the usual normal model in which the $X_i \mid \theta_i$ are independent $N(\theta_i, 1)$, $i = 1, \ldots, m$, while the $\theta_i$ are iid $N(\mu, A)$, so that the shrinkage factor is $B = 1/(1 + A)$. By Equation (2.2), the CB estimator of $\theta$ is

$$
\hat{\theta}^{CB} = a_B\left[(1 - B)X + B\mu 1_m\right] + (1 - a_B)\left[(1 - B)\bar{X} + B\mu\right]1_m
= (1 - B)\left[a_B X + (1 - a_B)\bar{X} 1_m\right] + B\mu 1_m, \tag{2.4}
$$

where $1_m$ is the $m$-component column vector of ones, $\bar{X} = m^{-1}\sum_{i=1}^{m} X_i$, $a_B^2 = 1 + 1/\{(1 - B)S\}$, and $S = \sum_{i=1}^{m}(X_i - \bar{X})^2/(m - 1)$. In an EB scenario, both $\mu$ and $A$ are typically unknown and are estimated from the marginal distribution of $X = (X_1, \ldots, X_m)^T$. Marginally, $X \sim N(\mu 1_m, B^{-1} I_m)$, and hence $S \sim B^{-1}\chi^2_{m-1}/(m - 1)$.
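As a concrete sketch (our own, with hypothetical helper names), Equation (2.4) and its empirical version, in which $\mu$ and $B$ are replaced by $\bar X$ and $1/S$, can be computed directly:

```python
import numpy as np

def cb_normal(x, mu, B):
    """Constrained Bayes estimate (2.4): X_i | theta_i ~ N(theta_i, 1),
    theta_i ~ N(mu, A), with shrinkage factor B = 1/(1 + A) known."""
    x = np.asarray(x, dtype=float)
    m, xbar = x.size, x.mean()
    S = ((x - xbar) ** 2).sum() / (m - 1)
    a_B = np.sqrt(1.0 + 1.0 / ((1.0 - B) * S))      # a_B^2 = 1 + 1/{(1-B)S}
    return (1.0 - B) * (a_B * x + (1.0 - a_B) * xbar) + B * mu

def ceb_normal(x):
    """Constrained empirical Bayes estimate: mu -> xbar, B -> 1/S (needs S > 1)."""
    x = np.asarray(x, dtype=float)
    S = ((x - x.mean()) ** 2).sum() / (x.size - 1)
    return cb_normal(x, mu=x.mean(), B=1.0 / S)

x = np.array([1.8, 0.3, -0.9, 2.4, 0.1])            # toy data, our own
bayes = 0.6 * x                                     # Bayes estimates for B = 0.4, mu = 0
cb = cb_normal(x, mu=0.0, B=0.4)
print(bayes, cb)
```

Note that `cb` has the same mean as `bayes` but a larger spread; as the text observes, the factor $a_B$ exceeds 1, so (2.4) is not a genuine convex combination.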

We estimate $\mu$ by $\bar{X}$ and $B$ by $\hat{B} = 1/S$. The CEB estimator of $\theta$ is then given by

$$\hat{\theta}^{CEB} = \left(1 - \hat{B}\right)\left[a_{EB} X + (1 - a_{EB})\bar{X} 1_m\right] + \hat{B}\bar{X} 1_m, \tag{2.5}$$

where $a_{EB}$ replaces $B$ by $\hat{B}$ in $a_B$.

### 2.3. Estimation with the ANOVA model

Insurance pricing focuses on estimating the risk of each group, and the key idea behind segmenting the groups is similar to the ANOVA model: by minimizing the within-group variance while maximizing the between-group variance, the risk of each group can be evaluated properly. For better segmentation, characteristics with higher risk are assigned to one group and characteristics with lower risk to another. We therefore present the CB and CEB estimators in the ANOVA model; Ghosh et al. (2008) extended these results to other loss functions. In the next section this model is applied to auto insurance accident data to estimate the risk of each group.

Consider the ANOVA model $Y_{ij} = \theta_i + e_{ij}$ with $\theta_i = \mu + \alpha_i$ ($i = 1, \ldots, m$; $j = 1, \ldots, k$), where the $\alpha_i$ and the $e_{ij}$ are mutually independent with $\alpha_i \overset{\text{iid}}{\sim} N(0, \tau^2)$ and $e_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)$. Equivalently, in a Bayesian framework, $Y_{ij} \mid \theta_i \overset{\text{ind}}{\sim} N(\theta_i, \sigma^2)$ and $\theta_i \overset{\text{iid}}{\sim} N(\mu, \tau^2)$. Minimal sufficiency allows us to restrict attention to $(X_1, \ldots, X_m, \mathrm{SSW})$, where $X_i = k^{-1}\sum_{j=1}^{k} Y_{ij} = \bar{Y}_i$ and $\mathrm{SSW} = \sum_{i=1}^{m}\sum_{j=1}^{k}(Y_{ij} - \bar{Y}_i)^2$. Marginally, $X_1, \ldots, X_m$ and SSW are mutually independent, with $X_i \overset{\text{iid}}{\sim} N(\mu, \tau^2 + \sigma^2/k)$, i.e. $N(\mu, \sigma^2/(kB))$ with $B = (\sigma^2/k)/(\sigma^2/k + \tau^2) = \sigma^2/(\sigma^2 + k\tau^2)$, and $\mathrm{SSW} \sim \sigma^2\chi^2_{m(k-1)}$. From the results of the previous section, the CB estimator of $\theta = (\theta_1, \ldots, \theta_m)^T$ is given by

$$\hat{\theta}^{CB} = a(X)(1 - B)\left(X - \bar{X} 1_m\right) + \left\{(1 - B)\bar{X} + B\mu\right\}1_m, \tag{2.6}$$

where $X = (X_1, \ldots, X_m)^T$ and $\bar{X} = m^{-1}\sum_{i=1}^{m} X_i$. Also $a^2(X) = 1 + H_1(X)/H_2(X)$, where $H_1(X) = (m - 1)(1 - B)\sigma^2/k$ and $H_2(X) = (1 - B)^2\sum_{i=1}^{m}(X_i - \bar{X})^2 = (1 - B)^2(\mathrm{SSB}/k)$, say. On simplification,

$$a^2(X) = 1 + \frac{\sigma^2}{(1 - B)\,\mathrm{MSB}}, \tag{2.7}$$

where $\mathrm{MSB} = \mathrm{SSB}/(m - 1)$ and $\mathrm{MSW} = \mathrm{SSW}/\{m(k - 1)\}$. Substituting $\bar{X}$ for $\mu$, $\hat{B}$ for $B$, and MSW for $\sigma^2$ in (2.7), with $\hat{B} = \{(m - 3)\mathrm{MSW}\}/\{(m - 1)\mathrm{MSB}\}$, the CEB estimator of $\theta$ can be derived as

$$\hat{\theta}^{CEB} = a_{EB}(X)\left(1 - \hat{B}\right)\left(X - \bar{X} 1_m\right) + \bar{X} 1_m, \tag{2.8}$$

where $a_{EB}^2(X) = 1 + \mathrm{MSW}/\{(1 - \hat{B})\mathrm{MSB}\}$. Based on these results, a numerical analysis is performed in the next section to show the advantage of the CB estimators in a real business setting.
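Under the same assumptions, the CEB estimator (2.8) can be sketched as follows (our own illustration; the function name and the simulated data are hypothetical):

```python
import numpy as np

def ceb_anova(Y):
    """CEB estimator (2.8) for the balanced one-way ANOVA model
    Y[i, j] = theta_i + e_ij, i = 1..m groups, j = 1..k observations."""
    Y = np.asarray(Y, dtype=float)
    m, k = Y.shape
    X = Y.mean(axis=1)                               # group means X_i
    Xbar = X.mean()
    MSW = ((Y - X[:, None]) ** 2).sum() / (m * (k - 1))
    MSB = k * ((X - Xbar) ** 2).sum() / (m - 1)      # SSB = k * sum (X_i - Xbar)^2
    B_hat = (m - 3) * MSW / ((m - 1) * MSB)          # EB estimate of B
    a_EB = np.sqrt(1.0 + MSW / ((1.0 - B_hat) * MSB))
    return a_EB * (1.0 - B_hat) * (X - Xbar) + Xbar  # eq. (2.8)

# Four groups of k = 1000 (log-scale) losses, mimicking the data layout of Section 3.
rng = np.random.default_rng(1)
theta = rng.normal(13.0, 0.3, 4)
Y = rng.normal(theta[:, None], 1.0, (4, 1000))
print(ceb_anova(Y))
```

The estimator re-centres the group means around $\bar X$ and rescales their spread by $a_{EB}(X)(1-\hat B)$, so the grand mean is preserved while the between-group dispersion stays closer to that of the parameters.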

**Table 1:** Data structure ($n$ denotes the sample size of each group; $n = 1{,}000$)

| $N$ | Small size car | Medium size car | Large size car | SUV & Van type |
|---|---|---|---|---|
| 1 | $y_{11}$ | $y_{21}$ | $y_{31}$ | $y_{41}$ |
| 2 | $y_{12}$ | $y_{22}$ | $y_{32}$ | $y_{42}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| $n$ | $y_{1n}$ | $y_{2n}$ | $y_{3n}$ | $y_{4n}$ |

## 3. Data Analysis and Results

Insurance companies estimate premiums from the loss amounts paid for accidents, using various kinds of customer information; pricing is thus a historical exercise in converging to the true risk. Proper and accurate estimation of the true risk involves three main issues. The first is which customer information should be used as explanatory factors for the risk (the rating factors). The second is how to segment each rating factor into well-chosen groups so that customers in the same group share similar risk characteristics. The last is how to estimate the risk of each group precisely for the corresponding rating factors. The first two issues are not the topic of this paper; our focus is the last one, the accurate estimation of the group risks.

### 3.1. Data description

Auto insurance companies use approximately thirty to forty rating factors to differentiate each customer's premium and apply a different segmentation to each rating factor. In this section we consider one main and basic factor in auto insurance, car type: small size car, medium size car, large size car, and SUV & van type. The data used for the analysis are real accident results for property damage liability coverage from one property and casualty company in Korea. For each group, one thousand samples were randomly selected; Table 1 illustrates the data structure. In Table 1, $y_{ij}$ denotes the loss amount from accidents for each policy during the policy period, and a log transformation is applied for the normality assumption. With the parameters specified empirically from this data structure, the numerical analysis under the ANOVA model of the previous section is carried out. Our suggested estimators, the constrained Bayes and constrained empirical Bayes estimators, and their second moments are derived, and the well-known and widely used Bayes estimator and its second moment are also calculated for comparison.

### 3.2. Analysis results and comparison

To evaluate the advantage of the estimators, we define an index that reflects the purpose of the research: the twin objectives of matching the first and second moments so as to produce histogram estimates close to the histogram of the population parameters. The index derived from these objectives is

$$\mathrm{Index} = w\left|E\left(\bar{\theta} \mid x\right) - m^{-1}\sum_{i=1}^{m} t_i(x)\right| + (1 - w)\left|\sum_{i=1}^{m} E\left[(\theta_i - \bar{\theta})^2 \mid x\right] - \sum_{i=1}^{m}\left\{t_i(x) - \bar{t}(x)\right\}^2\right|, \tag{3.1}$$

where $w = 1/2$ assigns equal weight to the two moments and $m = 4$ is the number of groups.
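A small sketch of how such an index can be computed (our own illustration; the `index` helper and the toy numbers are hypothetical). Here `target_mean` and `target_second` stand for the posterior quantities $E(\bar\theta \mid x)$ and $\sum_i E[(\theta_i - \bar\theta)^2 \mid x]$; for the normal model of Section 2.2 with known $B$, the latter equals $H_1 + H_2 = (m-1)(1-B) + \sum_i(\theta_i^B - \bar\theta^B)^2$:

```python
import numpy as np

def index(t, target_mean, target_second, w=0.5):
    """Index (3.1): weighted absolute distance between the posterior
    moments and the corresponding empirical moments of the estimate t."""
    t = np.asarray(t, dtype=float)
    first = abs(target_mean - t.mean())
    second = abs(target_second - ((t - t.mean()) ** 2).sum())
    return w * first + (1.0 - w) * second

# Normal model of Section 2.2 with mu = 0 and B = 0.4 known.
x = np.array([1.8, 0.3, -0.9, 2.4, 0.1])
m, B = x.size, 0.4
bayes = (1.0 - B) * x                               # Bayes estimates
H1 = (m - 1) * (1.0 - B)                            # tr V(theta - thetabar 1_m | x)
H2 = ((bayes - bayes.mean()) ** 2).sum()
a = np.sqrt(1.0 + H1 / H2)
cb = a * bayes + (1.0 - a) * bayes.mean()           # CB estimate, eq. (2.2)

target_mean, target_second = bayes.mean(), H1 + H2
print(index(bayes, target_mean, target_second))     # second moment mismatched
print(index(cb, target_mean, target_second))        # essentially zero
```

By construction the CB estimate matches both target moments, so its index is zero up to rounding, while the Bayes estimate pays $\tfrac{1}{2}H_1$ on the second term.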

**Table 2:** Numerical analysis results: loss-amount estimates for each group and the value of the index (3.1) for each estimator (the individual numeric entries are not legible in this copy)

| Estimates | Small size car | Medium size car | Large size car | SUV & Van type | Index |
|---|---|---|---|---|---|
| $\theta$ | … | … | … | … | |
| $\hat{\theta}^{B}$ | … | … | … | … | … |
| $\hat{\theta}^{EB}$ | … | … | … | … | … |
| $\hat{\theta}^{CB}$ | … | … | … | … | … |
| $\hat{\theta}^{CEB}$ | … | … | … | … | … |

The index is the weighted sum of the distances between the first moments and between the second moments; the second term uses the absolute rather than the squared distance so that its magnitude is comparable with that of the first term. Table 2 summarizes the data analysis. The results show that the index of the CB estimator is smaller than that of the Bayes estimator, i.e., the CB estimates are closer to the parameters with respect to the two moments; likewise, the CEB estimator improves on the EB estimator. This demonstrates the benefit of the constrained Bayes estimators: rather than only minimizing the MSE, they simultaneously try to match the second moments. The popular Bayes estimates minimize the estimation error, but they do not provide proper simultaneous estimation of location and dispersion. In the table, the advantage of the CB estimators comes entirely from the second term of the index, the distance between the second moments. In terms of loss amounts, the Bayes estimators are shrunk about 28,400 won further toward the mean than the constrained Bayes estimators, and the empirical Bayes estimators about 30,700 won further than the constrained empirical Bayes estimators; these amounts correspond to roughly 5.0% and 5.4% of the total loss mean. The insured period of auto insurance in Korea is one year, so a consistent pricing exercise is required every year, and both an accurate point estimate of each group's risk and its dispersion should be considered. For these reasons and business needs, applying the constrained Bayes estimates to auto insurance pricing is a meaningful recommendation, and this research suggests an alternative tool for the auto insurance business field.

## 4. Further Study

In this paper we assumed a quadratic loss function, which is widely used in estimation. Quadratic losses, however, are geared solely towards the precision of estimation; sometimes a framework is needed that considers the trade-off between goodness of fit and precision of estimation. The balanced loss function suggested by Zellner (1988, 1992) meets this need, and analysis under such loss functions should be studied further for better estimation.

## References

- Louis, T. A. (1984). Estimating a population of parameter values using Bayes and empirical Bayes methods, *Journal of the American Statistical Association*, 79.
- Ghosh, M. (1992). Constrained Bayes estimation with applications, *Journal of the American Statistical Association*, 87.
- Ghosh, M. and Kim, D. (2002). Multivariate constrained Bayes estimation, *Pakistan Journal of Statistics*, 18.
- Ghosh, M., Kim, M. and Kim, D. (2008). Constrained Bayes and empirical Bayes estimation under random effects normal ANOVA model with balanced loss function, *Journal of Statistical Planning and Inference*, 138.

- Zellner, A. (1988). Bayesian analysis in econometrics, *Journal of Econometrics*, 37.
- Zellner, A. (1992). Bayesian and non-Bayesian estimation using balanced loss functions, in *Statistical Decision Theory and Related Topics V*, Springer-Verlag, New York.

Received April 29, 2013; revised May 28, 2013; accepted June 21, 2013.


Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8

### Christfried Webers. Canberra February June 2015

c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic

### Comparison of resampling method applied to censored data

International Journal of Advanced Statistics and Probability, 2 (2) (2014) 48-55 c Science Publishing Corporation www.sciencepubco.com/index.php/ijasp doi: 10.14419/ijasp.v2i2.2291 Research Paper Comparison

### Part 2: One-parameter models

Part 2: One-parameter models Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes

### Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

### 1 Portfolio mean and variance

Copyright c 2005 by Karl Sigman Portfolio mean and variance Here we study the performance of a one-period investment X 0 > 0 (dollars) shared among several different assets. Our criterion for measuring

### Forecast covariances in the linear multiregression dynamic model.

Forecast covariances in the linear multiregression dynamic model. Catriona M Queen, Ben J Wright and Casper J Albers The Open University, Milton Keynes, MK7 6AA, UK February 28, 2007 Abstract The linear

### Lecture notes: single-agent dynamics 1

Lecture notes: single-agent dynamics 1 Single-agent dynamic optimization models In these lecture notes we consider specification and estimation of dynamic optimization models. Focus on single-agent models.

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### Simple treatment structure

Chapter 3 Simple treatment structure 3.1 Replication of control treatments Suppose that treatment 1 is a control treatment and that treatments 2,..., t are new treatments which we want to compare with

### A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data Faming Liang University of Florida August 9, 2015 Abstract MCMC methods have proven to be a very powerful tool for analyzing

### Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

### Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

### 0.1 Phase Estimation Technique

Phase Estimation In this lecture we will describe Kitaev s phase estimation algorithm, and use it to obtain an alternate derivation of a quantum factoring algorithm We will also use this technique to design

### Exact Confidence Intervals

Math 541: Statistical Theory II Instructor: Songfeng Zheng Exact Confidence Intervals Confidence intervals provide an alternative to using an estimator ˆθ when we wish to estimate an unknown parameter

### SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS

SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS Copyright 005 by the Society of Actuaries and the Casualty Actuarial Society

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### NPTEL STRUCTURAL RELIABILITY

NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

### Linear algebra and the geometry of quadratic equations. Similarity transformations and orthogonal matrices

MATH 30 Differential Equations Spring 006 Linear algebra and the geometry of quadratic equations Similarity transformations and orthogonal matrices First, some things to recall from linear algebra Two

### 6. Jointly Distributed Random Variables

6. Jointly Distributed Random Variables We are often interested in the relationship between two or more random variables. Example: A randomly chosen person may be a smoker and/or may get cancer. Definition.

### Bayesian Methods. 1 The Joint Posterior Distribution

Bayesian Methods Every variable in a linear model is a random variable derived from a distribution function. A fixed factor becomes a random variable with possibly a uniform distribution going from a lower

### STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

### What you CANNOT ignore about Probs and Stats

What you CANNOT ignore about Probs and Stats by Your Teacher Version 1.0.3 November 5, 2009 Introduction The Finance master is conceived as a postgraduate course and contains a sizable quantitative section.

### STAT 830 Convergence in Distribution

STAT 830 Convergence in Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2011 Richard Lockhart (Simon Fraser University) STAT 830 Convergence in Distribution STAT 830 Fall 2011 1 / 31

### Problem Set 4: Covariance functions and Gaussian random fields

Problem Set : Covariance functions and Gaussian random fields GEOS 7: Inverse Problems and Parameter Estimation, Carl Tape Assigned: February 9, 15 Due: February 1, 15 Last compiled: February 5, 15 Overview

### Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )

Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples

### Stochastic Inventory Control

Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the

### Fuzzy regression model with fuzzy input and output data for manpower forecasting

Fuzzy Sets and Systems 9 (200) 205 23 www.elsevier.com/locate/fss Fuzzy regression model with fuzzy input and output data for manpower forecasting Hong Tau Lee, Sheu Hua Chen Department of Industrial Engineering

### Pattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University

Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision

### Fitting Subject-specific Curves to Grouped Longitudinal Data

Fitting Subject-specific Curves to Grouped Longitudinal Data Djeundje, Viani Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK E-mail: vad5@hw.ac.uk Currie,

### ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems

Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed

### Elasticity Theory Basics

G22.3033-002: Topics in Computer Graphics: Lecture #7 Geometric Modeling New York University Elasticity Theory Basics Lecture #7: 20 October 2003 Lecturer: Denis Zorin Scribe: Adrian Secord, Yotam Gingold

### 4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

### Applications to Data Smoothing and Image Processing I

Applications to Data Smoothing and Image Processing I MA 348 Kurt Bryan Signals and Images Let t denote time and consider a signal a(t) on some time interval, say t. We ll assume that the signal a(t) is

### Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

### Chapter 3 Joint Distributions

Chapter 3 Joint Distributions 3.6 Functions of Jointly Distributed Random Variables Discrete Random Variables: Let f(x, y) denote the joint pdf of random variables X and Y with A denoting the two-dimensional

APPLIED MATHEMATICS ADVANCED LEVEL INTRODUCTION This syllabus serves to examine candidates knowledge and skills in introductory mathematical and statistical methods, and their applications. For applications

### 38. STATISTICS. 38. Statistics 1. 38.1. Fundamental concepts

38. Statistics 1 38. STATISTICS Revised September 2013 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in high-energy physics. In statistics, we are interested in using a