# Claim Size Distributions in German Private Health Insurance

Save this PDF as:

Size: px
Start display at page:

Download "Claim Size Distributions in German Private Health Insurance"

## Transcription

2 Recognition of an atypical number of high claims in a cohort which may have a large impact on the mean and hence impair the premium calculation. In practice, this is dealt with in an ad hoc manner. For the calculation of risk adjusted premiums more information about the claim size distribution besides the expected value is needed. In general, it is assumed that the mean of a cohort is normally distributed. This assumption must be justified in particular, if there is only a moderate number of observations and the claim size distribution is highly skewed, as it is the case for medical insurance covering specifically hospital expenses. In recent years German health insurance companies are evaluated by analysts based on a stochastic simulation, called market consistent embedded value. So far only the income from capital investts is modeled stochastically, while the claims are modeled deterministically by their expected value. An understanding of claim size distributions enables an improved modeling of the liabilities. Academic interest in distributions which occur in practice. In the first section we will discuss the data and describe the portion of the claim free insured. In the second section we examine the conditional distribution of the non zero claim sizes. We will show that after normalization these distributions are similar over the groups. Then we describe these empirical distributions by named theoretical distributions. Finally, we study their tail. In the third and final section we warn against using the central limit theorem without an error estimation by giving unexpected examples. The author is indebted to Deutsche Krankenversicherung AG (DKV) for providing the data. Further, the author thanks Rasmus Schlömer for several discussions. 1 The data set and general considerations The examination is based on the claims in the year of a modern full cover insurance as well as a classical inpatient, outpatient, and insurance which are usually sold together. The data are grouped by the cohorts, i.e., by sex and groups spanning five years each. In order to ensure that each group contains at least about one thousand members (more precisely at least 9), we consider in the case of the full cover insurance only the s from to for both sexes, in the case of inpatient insurance the s from to 6 for and from 3 to 9 for wo, in the remaining cases the s from to 8 for both sexes. In full cover insurance the strongest group is 3 39 with more than and wo. In the classical inpatient, outpatient, and insurance the strongest groups are with more than 9, 7 resp. 3 and 3, 6 resp. 8 wo. The fewer members in inpatient insurance are due to the fact that the insurance company offers several popular inpatient insurances, and we restrict our examination only to the most popular one. In practice the members of a cohort are considered as a homogeneous risk group. As tioned in the introduction, it is required by the German law that

4 . portion of claim free outpatient inpatient full cover wo Naturally, the highest percent of claim free insured are in inpatient insurance. For the number of claim free decrease with, for wo this appears to be only true in inpatient insurance the other types of insurances are roughly constant. The obvious model for the number of claim free is a Binomial distribution B(n,λ) whose parameter λ can be read off the above graph. To validate this model we compare the data of to the data of in the next two graphs. The first shows the relative change of λ in the year compared to the year. For the second we fit a logistic regression model to each cohort with the observation year as the only independent variable and plot the p value of the hypothesis test whether the observation year is relevant. relativ change of λ outpatient inpatient full cover wo p value of the year parameter outpatient inpatient full cover wo The small probabilities for the moderate relative changes are due to the high number of observations, which should have produced a very reliable estimate for the parameter λ. In general, the influence of the observation year does not seem to be significant with the notable exception in outpatient insurance in the younger s. This may be due to an epidemic of a minor illness. Here, the Binomial model should be used with caution.

5 The conditional distributions Having dealt with the portion of claim free insureds in the previous section we turn to the conditional distribution of non zero claim sizes. Unfortunately, because of the large proportion of claim free in inpatient insurance the data basis for our examination is greatly reduced. In the group of we have only about 1 observations for wo and for, in the other groups we have between and 1. For the other insurance types we have at least 9 observations with the exception of insurance 8 8, wo 9, 7 8 and full cover 9,, wo 9,, which still have at least observations..1 Comparison over the groups We start by plotting the means of the cohorts relative to the cohorts of. This is called the profile: profile outpatient inpatient full cover wo We note that none of the graphs has a big jump and the developt with increasing has a clear trend. With the exception of insurance the profiles increase with. For insurance the graphs are slightly concave with maximum at the We can observe this smooth behavior of the profiles despite the fact that we included the extremely high claims, which are usually ignored for premium calculation. Next we plot the variance coefficient for the cohorts, i.e., the quotient of the

6 standard deviation by the mean: variation coefficient outpatient inpatient full cover wo The graphs are jumpy, but the overall tendency is that the variation coefficient is mainly independent of the. The huge jump in the cohort of year old in inpatient insurance is due to two extremely high claims of 83 and 9 times the mean of this cohort, while for the other s the maximum is nearly always below. The more prominent volatility of the graphs for wo results from the fact that there are fewer observations for wo and hence we have a larger error in the estimation of the standard deviation. For all the following examinations we normalize the conditional distributions in the cohorts by their means. Thus we will have a mean of 1 in each cohort and our observation about the variation coefficients means that the standard deviation depends on the insurance type and the sex, but is roughly independent of the group. Below we plot the pdfs of the cohorts using the R defaults. A reduction of the default bandwidth in the kernel estimator leads to small bumps in the graphs, thus the default value of R seems optimal. outpatient: pdf of normalized claim size wo 7 6 inpatient: pdf of normalized claim size wo density. density claim size claim size 6

7 : pdf of normalized claim size wo full cover: pdf of normalized claim size wo density. density claim size claim size To our surprise we have very similar pdfs over the groups with the exception of inpatient insurance. Looking back at the huge change in the portion of not claim free polices over the s in inpatient insurance compared to the other insurance types a different behavior of inpatient insurance is unsurprising. To give further evidence to the fact that the pdfs of the normalized claims are roughly independent of the even at the right tails we plot some empirical quantiles of the distributions. outpatient: quantils of normalized claim size inpatient: quantils of normalized claim size wo wo quantils quantils : quantils of normalized claim size full cover: quantils of normalized claim size wo wo quantils quantils Again we note the quantiles are nearly constant with the exception of the very high quantiles above 99%. The latter is caused by the fact that in 7

8 that range are at most, sometimes less than 1 observations. With these few observations volatility is to be excepted. To compute these quantiles one should use the advanced methods described in Subsection.3.. Description through a probability distribution function Classically claim size distributions are modeled by a lognormal or gamma distribution in combination with a Pareto distribution for large claims. Let us fit a lognormal and gamma distribution with ML estimation to the claim size distribution of year old in full cover insurance. The resulting qq and pp plots are: full cover lognorm: qq plot full cover lognorm: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles full cover gamma: qq plot full cover gamma: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles Obviously, the gamma distribution fits very badly. This is also the case for other cohorts and insurance types, and we want to discour its use in health insurance. The fit of the lognormal distribution is not good either, there are cohorts where the fit is worse. We see that the tail of the lognormal distribution is not thick enough in the range from. to 1.. In practice, one models additional high claims with a Pareto distribution to compensate for this gap. 8

9 Instead of trying to fix the bad fit of the lognormal distribution by adding a Pareto distribution, we attempt to fit the following distributions to the claim size distribution in hope of getting a better fit: scaled Burr Type XII, generalized extreme value, gamma, generalized beta, generalized Birnbaum Saunders with Laplace, logistic, normal, and Student t kernel (see Appendix B), inverse Gauss type with Laplace, logistic, normal, and Student t kernel (see Appendix A), lognormal, Pareto, transformed beta and transformed gamma. As a measure for the quality of the fitting we looked at the histograms, qq plots, pp plots, and plot of the empirical cdfs together with the fitted cdfs. However, we can present here only the results of the Kolmogorov Smirnov test. To reduce the amount of data we split the data by insurance type and sex and give only the mean and quantiles of the statistics and p values over the groups. The best fitting distributions as well as the lognormal and gamma distribution can be found in the following table: 9

11 The first glimpse at the table confirms that the gamma distribution is unsuitable for approximating any of the claim size distributions. The lognormal distribution can only be useful in insurance. However, a closer look reveals that it overestimates the tail in this case. The Burr, IGT St, and transformed beta distributions are the most suitable distributions for the modeling of outpatient and full cover insurance whose claims are dominated by the outpatient claims. (Ignore the adj. lines in the table for the time being.) The transformed beta distribution is the only four parameter distribution among them. It does not fit the claim size distribution notably better than the two other three parameter distributions. Under such circumstances the model with the least parameters should be chosen, following general principles. In addition, sometimes numerical difficulties arise during the ML estimation of the four parameters. Thus we will consider only the Burr and IGT St distribution further. In inpatient insurance we are unable to get a fit of similar quality as in outpatient and full cover insurance. However, the same distributions still give a good fit and even the comparison of their fits is as above. The table shows that the EVT, GBS St, and transformed gamma distributions provide a fit of the same quality. However, the qq and pp plots indicate an inferior fit: The GBS St distribution fits the tail very badly and the the graph of the pp plot of the transformed gamma distribution shows an S shape. The EVT distribution provides a good fit in the younger s, but has problems with the tail for the older s. While from the quality of the fit it might be worthwhile to examine the EVT distribution further for the inpatient claim size distribution, we will neglect it because it fits the other insurance type badly. We estimate the parameters for the Burr and IGT St distribution by the maximum likelihood method and plot the resulting estimated means. Due to our scaling they should be 1. Burr: estimated mean outpatient inpatient full cover wo IGT St: estimated mean outpatient inpatient full cover wo The IGT St distribution obviously fits the cohorts of 9 year old wo in inpatient insurance badly. This cohort is different from the other because it contains many claims due to pregnancy. In addition, it is the smallest one, thus we have the largest estimation error here. However, even apart from this the 11

12 estimated mean deviates often by more than % for both distributions. As this effect varies slowly with, we have to assume that it is not random, but that there is a systematic difference between the empirical claim size distributions and the theoretical ones. A systematic deviation of the expected mean from the true one is not acceptable for many applications, for example premium calculation. Therefore, we force the mean to be one and estimate only the remaining parameters by ML estimation. For the IGT St distribution we set µ = 1; for the Burr distribution we estimate the parameters α, β by maximum likelihood while simultaneously choosing the scale parameter such that the mean is one. The resulting KS test results are the adjusted lines in above table. To get an impression of the fit we draw the qq and pp plot of the cohorts of the year old. We cut off the qq plots at because we have a detailed discussion about the tails in the next subsection. outpatient IGT St adj.: pp plot.6. emp. percentiles 1.. emp. quantils outpatient IGT St adj.: qq plot theo. quantils theo. percentiles inpatient IGT St adj.: qq plot inpatient IGT St adj.: pp plot emp. percentiles 1.. emp. quantils theo. quantils...6 theo. percentiles 1

13 full cover IGT St adj.: pp plot.6. emp. percentiles 1.. emp. quantils full cover IGT St adj.: qq plot theo. quantils theo. percentiles outpatient Burr adj.: qq plot outpatient Burr adj.: pp plot emp. percentiles 1.. emp. quantils theo. quantils theo. percentiles inpatient Burr adj.: qq plot inpatient Burr adj.: pp plot.6. emp. percentiles 1.. emp. quantils theo. quantils...6 theo. percentiles 13

14 full cover Burr adj.: qq plot full cover Burr adj.: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles The pp plots confirm the impression we got from the table about the KS test: both distributions fit the claim size distributions of outpatient and full cover insurance very well, but the inpatient insurance less well. The qq plot shows that beyond six the empirical distributions deviate somewhat from the theoretical ones. However, this is already an area where there are only few observations and random effects will be noticeable. In the next subsection we will show that we have a good fit in the tail overall. We plot the estimated parameters after the adjustt to mean one: Burr adj.: estimated a outpatient inpatient full cover Burr adj.: estimated b outpatient inpatient full cover wo 6 wo

15 Burr adj.: estimated scale outpatient inpatient full cover wo IGT St adj.: estimated λ outpatient inpatient full cover wo IGT St adj.: estimated ν outpatient inpatient full cover wo In general, the graphs are well behaved, not too volatile, and mainly constant with the exception of impatient insurance, which shows a clear trend. Let us discuss the obvious exceptions first. We already noted above that the 9 year old wo in inpatient insurance are exceptional, and we see that the IGT St distribution has also exceptional parameters for the of this group. Since no obvious outliers were visible in the data, we can only speculate about the reasons: One possibility is that due to few observations in those cohorts an estimation of a distribution function is too unreliable, another is that this group contains many newly insureds, which have thus recently passed a risk assesst and therefore distort the distribution. The other prominent exception are the parameters for the Burr distribution for the 3 3 year old wo in full cover insurance. Clearly, claims due to pregnancy will play an important role here. However, here we see also a disadvant of the Burr distribution, namely that the interpretation of its parameters is difficult. The parameters of the exceptional cohort are (a,b,scale) = (.93,1.1,.) and the ones of a neighboring cohort are (.3, 1.1, 1.8). While the values are very different, the following plot shows 1

16 that the resulting pdfs and cdfs are similar: pdfs and cdfs of two Burr distribtuions Apart from these exceptions the graphs are well behaved. To get a better intuitive understanding and to be able to compare both distributions we plot the mode, i.e., the point of maximum of the pdf: Burr adj.: mode outpatient inpatient full cover wo IGT St adj.: mode outpatient inpatient full cover wo Most prominent are the decrease in inpatient insurance as well as the increase in outpatient insurance for. Looking back at the plots of the empirical pdfs in Subsection.1 we see that these cases are in fact the ones where the profiles and the pdfs vary at most with. We also note that the modes for the Burr and the IGT St distribution fits differ somewhat, especially in inpatient insurance where the fit of both distributions is not as good as in inpatient and full cover insurance. The claim size distribution in insurance behaves very differently from the remainder. This is due to fact that very high claims are nearly impossible, andthusthe distributionhasatmostaslightlyheavytail, seealsothediscussion in the next subsection. The best fit is achieved by the generalized Birnbaum Saunders distribution with normal, logistic, and Student t kernel. As the fits with normal and logistic kernel are very similar, we will focus on the classical Birnbaum Saunders distribution with normal kernel. The GBS distribution with Student t kernel usually has a slightly better fit in the tail, but it comes at 16

17 the disadvant of an additional parameter. The parameters by ML estimation and the resulting expected mean are as follows: BS: estimated α BS: estimated β wo wo BS: estimated mean wo GBS St: estimated α GBS St: estimated β wo 1.7 wo

18 GBS St: estimated ν wo GBS St: estimated mean wo The graphs are in general well behaved and in addition the resulting expected mean is very close to 1, thus we do not make any correction of the parameters this time. The only problem is that the ML estimation the ν parameter for the GBS St distribution is not robust for ν in the above range, see Appendix B. In the cohort of 7 79 year old wo we have ν > 1. However, this is acceptable because the changes of ν in this range have only a minor impact on the shape of the pdf. To get a better impression of the quality of the fit, we show again the qq and pp plots for the year old : BS: qq plot BS: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles 18

19 GBS St: qq plot GBS St: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles We compare the fits of the distributions by their modes: BS: mode wo GBS St: mode wo The graphs are similar and both reveal a slight shift of the mode to the left with increasing. In fact, we can see this trend in the plot of empirical distribution functions in Subsection.1. If only the traditional distributions are available for modeling the claim size distributions one should choose the lognormal distribution. All other distributions except the gamma distribution fit the tail very badly as qq plots show and the above table shows that the lognormal distribution provides a much better fit than the gamma distribution. We estimate the parameters by applying the maximum likelihood method and plot the resulting means: 19

20 lognorm: estimated mean wo As the mean is often off by more than % we adjust the parameters such that the mean is 1. This is done by a ML estimation of σ while simultaneously choosing µ such that the mean is one. For comparison we give again the qq and pp plots: lognorm adj.: qq plot lognorm adj.: pp plot emp. quantils 1 1 emp. percentiles theo. quantils theo. percentiles

### A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses Michael R. Powers[ ] Temple University and Tsinghua University Thomas Y. Powers Yale University [June 2009] Abstract We propose a

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

### Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

### 6. Distribution and Quantile Functions

Virtual Laboratories > 2. Distributions > 1 2 3 4 5 6 7 8 6. Distribution and Quantile Functions As usual, our starting point is a random experiment with probability measure P on an underlying sample spac

### Geostatistics Exploratory Analysis

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

### Quantitative Methods for Finance

Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

### Applying Generalized Pareto Distribution to the Risk Management of Commerce Fire Insurance

Applying Generalized Pareto Distribution to the Risk Management of Commerce Fire Insurance Wo-Chiang Lee Associate Professor, Department of Banking and Finance Tamkang University 5,Yin-Chuan Road, Tamsui

### Modeling Individual Claims for Motor Third Party Liability of Insurance Companies in Albania

Modeling Individual Claims for Motor Third Party Liability of Insurance Companies in Albania Oriana Zacaj Department of Mathematics, Polytechnic University, Faculty of Mathematics and Physics Engineering

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### 4. Introduction to Statistics

Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### Journal of Financial and Economic Practice

Analyzing Investment Data Using Conditional Probabilities: The Implications for Investment Forecasts, Stock Option Pricing, Risk Premia, and CAPM Beta Calculations By Richard R. Joss, Ph.D. Resource Actuary,

### WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

### Supplement to Call Centers with Delay Information: Models and Insights

Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290

### A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

### Quantitative Impact Study 1 (QIS1) Summary Report for Belgium. 21 March 2006

Quantitative Impact Study 1 (QIS1) Summary Report for Belgium 21 March 2006 1 Quantitative Impact Study 1 (QIS1) Summary Report for Belgium INTRODUCTORY REMARKS...4 1. GENERAL OBSERVATIONS...4 1.1. Market

### Continuous Distributions

MAT 2379 3X (Summer 2012) Continuous Distributions Up to now we have been working with discrete random variables whose R X is finite or countable. However we will have to allow for variables that can take

### Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### A frequency distribution is a table used to describe a data set. A frequency table lists intervals or ranges of data values called data classes

A frequency distribution is a table used to describe a data set. A frequency table lists intervals or ranges of data values called data classes together with the number of data values from the set that

### CHINHOYI UNIVERSITY OF TECHNOLOGY

CHINHOYI UNIVERSITY OF TECHNOLOGY SCHOOL OF NATURAL SCIENCES AND MATHEMATICS DEPARTMENT OF MATHEMATICS MEASURES OF CENTRAL TENDENCY AND DISPERSION INTRODUCTION From the previous unit, the Graphical displays

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

Binomial distribution From Wikipedia, the free encyclopedia See also: Negative binomial distribution In probability theory and statistics, the binomial distribution is the discrete probability distribution

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### Chapter 6: The Information Function 129. CHAPTER 7 Test Calibration

Chapter 6: The Information Function 129 CHAPTER 7 Test Calibration 130 Chapter 7: Test Calibration CHAPTER 7 Test Calibration For didactic purposes, all of the preceding chapters have assumed that the

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

### **BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1.

**BEGINNING OF EXAMINATION** 1. You are given: (i) The annual number of claims for an insured has probability function: 3 p x q q x x ( ) = ( 1 ) 3 x, x = 0,1,, 3 (ii) The prior density is π ( q) = q,

### ACTUARIAL MODELING FOR INSURANCE CLAIM SEVERITY IN MOTOR COMPREHENSIVE POLICY USING INDUSTRIAL STATISTICAL DISTRIBUTIONS

i ACTUARIAL MODELING FOR INSURANCE CLAIM SEVERITY IN MOTOR COMPREHENSIVE POLICY USING INDUSTRIAL STATISTICAL DISTRIBUTIONS OYUGI MARGARET ACHIENG BOSOM INSURANCE BROKERS LTD P.O.BOX 74547-00200 NAIROBI,

### Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010

Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte

### Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

### Central Limit Theorem and Sample Size. Zachary R. Smith. and. Craig S. Wells. University of Massachusetts Amherst

CLT and Sample Size 1 Running Head: CLT AND SAMPLE SIZE Central Limit Theorem and Sample Size Zachary R. Smith and Craig S. Wells University of Massachusetts Amherst Paper presented at the annual meeting

### MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

### Tail-Dependence an Essential Factor for Correctly Measuring the Benefits of Diversification

Tail-Dependence an Essential Factor for Correctly Measuring the Benefits of Diversification Presented by Work done with Roland Bürgi and Roger Iles New Views on Extreme Events: Coupled Networks, Dragon

### An Alternative Route to Performance Hypothesis Testing

EDHEC-Risk Institute 393-400 promenade des Anglais 06202 Nice Cedex 3 Tel.: +33 (0)4 93 18 32 53 E-mail: research@edhec-risk.com Web: www.edhec-risk.com An Alternative Route to Performance Hypothesis Testing

### Stochastic Analysis of Long-Term Multiple-Decrement Contracts

Stochastic Analysis of Long-Term Multiple-Decrement Contracts Matthew Clark, FSA, MAAA, and Chad Runchey, FSA, MAAA Ernst & Young LLP Published in the July 2008 issue of the Actuarial Practice Forum Copyright

### Assignment I Pareto Distribution

Name and Surname: Thi Thanh Tam Vu PhD student in Economics and Management Course: Statistics and regressions (2014-2015) Assignment I Pareto Distribution 1.! Definition The Pareto distribution (hereinafter

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### Continuous Random Variables. and Probability Distributions. Continuous Random Variables and Probability Distributions ( ) ( ) Chapter 4 4.

UCLA STAT 11 A Applied Probability & Statistics for Engineers Instructor: Ivo Dinov, Asst. Prof. In Statistics and Neurology Teaching Assistant: Neda Farzinnia, UCLA Statistics University of California,

### LOGNORMAL MODEL FOR STOCK PRICES

LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as

### Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Ha and Ha Textbook - Chapters 1 8 Appendix D & E (online) Plous - Chapters 10, 11, 12 and 14 Chapter 10: The Representativeness Heuristic Chapter 11: The Availability Heuristic Chapter 12: Probability

### Continuous Random Variables and Probability Distributions. Stat 4570/5570 Material from Devore s book (Ed 8) Chapter 4 - and Cengage

4 Continuous Random Variables and Probability Distributions Stat 4570/5570 Material from Devore s book (Ed 8) Chapter 4 - and Cengage Continuous r.v. A random variable X is continuous if possible values

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### Review of Random Variables

Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random

### REINSURANCE PROFIT SHARE

REINSURANCE PROFIT SHARE Prepared by Damian Thornley Presented to the Institute of Actuaries of Australia Biennial Convention 23-26 September 2007 Christchurch, New Zealand This paper has been prepared

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Non Parametric Inference

Maura Department of Economics and Finance Università Tor Vergata Outline 1 2 3 Inverse distribution function Theorem: Let U be a uniform random variable on (0, 1). Let X be a continuous random variable

### Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

### Featured article: Evaluating the Cost of Longevity in Variable Annuity Living Benefits

Featured article: Evaluating the Cost of Longevity in Variable Annuity Living Benefits By Stuart Silverman and Dan Theodore This is a follow-up to a previous article Considering the Cost of Longevity Volatility

### Important Probability Distributions OPRE 6301

Important Probability Distributions OPRE 6301 Important Distributions... Certain probability distributions occur with such regularity in real-life applications that they have been given their own names.

### The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

### The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

### 6.4 Normal Distribution

Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### A Simple Formula for Operational Risk Capital: A Proposal Based on the Similarity of Loss Severity Distributions Observed among 18 Japanese Banks

A Simple Formula for Operational Risk Capital: A Proposal Based on the Similarity of Loss Severity Distributions Observed among 18 Japanese Banks May 2011 Tsuyoshi Nagafuji Takayuki

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Content DESCRIPTIVE STATISTICS. Data & Statistic. Statistics. Example: DATA VS. STATISTIC VS. STATISTICS

Content DESCRIPTIVE STATISTICS Dr Najib Majdi bin Yaacob MD, MPH, DrPH (Epidemiology) USM Unit of Biostatistics & Research Methodology School of Medical Sciences Universiti Sains Malaysia. Introduction

### Asymmetry and the Cost of Capital

Asymmetry and the Cost of Capital Javier García Sánchez, IAE Business School Lorenzo Preve, IAE Business School Virginia Sarria Allende, IAE Business School Abstract The expected cost of capital is a crucial

### PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

### A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails

12th International Congress on Insurance: Mathematics and Economics July 16-18, 2008 A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails XUEMIAO HAO (Based on a joint

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk

LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk Workshop on Financial Risk and Banking Regulation Office of the Comptroller of the Currency, Washington DC, 5 Feb 2009 Michael Kalkbrener

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Limit processes are the basis of calculus. For example, the derivative. f f (x + h) f (x)

SEC. 4.1 TAYLOR SERIES AND CALCULATION OF FUNCTIONS 187 Taylor Series 4.1 Taylor Series and Calculation of Functions Limit processes are the basis of calculus. For example, the derivative f f (x + h) f

### Treatment and analysis of data Applied statistics Lecture 3: Sampling and descriptive statistics

Treatment and analysis of data Applied statistics Lecture 3: Sampling and descriptive statistics Topics covered: Parameters and statistics Sample mean and sample standard deviation Order statistics and

### Some Notes on Taylor Polynomials and Taylor Series

Some Notes on Taylor Polynomials and Taylor Series Mark MacLean October 3, 27 UBC s courses MATH /8 and MATH introduce students to the ideas of Taylor polynomials and Taylor series in a fairly limited

### ( ) = 1 x. ! 2x = 2. The region where that joint density is positive is indicated with dotted lines in the graph below. y = x

Errata for the ASM Study Manual for Exam P, Eleventh Edition By Dr. Krzysztof M. Ostaszewski, FSA, CERA, FSAS, CFA, MAAA Web site: http://www.krzysio.net E-mail: krzysio@krzysio.net Posted September 21,

### Senior Secondary Australian Curriculum

Senior Secondary Australian Curriculum Mathematical Methods Glossary Unit 1 Functions and graphs Asymptote A line is an asymptote to a curve if the distance between the line and the curve approaches zero

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### Approximation of Aggregate Losses Using Simulation

Journal of Mathematics and Statistics 6 (3): 233-239, 2010 ISSN 1549-3644 2010 Science Publications Approimation of Aggregate Losses Using Simulation Mohamed Amraja Mohamed, Ahmad Mahir Razali and Noriszura

### Calculating Interval Forecasts

Calculating Chapter 7 (Chatfield) Monika Turyna & Thomas Hrdina Department of Economics, University of Vienna Summer Term 2009 Terminology An interval forecast consists of an upper and a lower limit between

### Notes on Continuous Random Variables

Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

### F. Farrokhyar, MPhil, PhD, PDoc

Learning objectives Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc To recognize different types of variables To learn how to appropriately explore your data How to display data using graphs How

### An Introduction to Modeling Stock Price Returns With a View Towards Option Pricing

An Introduction to Modeling Stock Price Returns With a View Towards Option Pricing Kyle Chauvin August 21, 2006 This work is the product of a summer research project at the University of Kansas, conducted

### START Selected Topics in Assurance

START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

### Normal distribution Approximating binomial distribution by normal 2.10 Central Limit Theorem

1.1.2 Normal distribution 1.1.3 Approimating binomial distribution by normal 2.1 Central Limit Theorem Prof. Tesler Math 283 October 22, 214 Prof. Tesler 1.1.2-3, 2.1 Normal distribution Math 283 / October

### Report on application of Probability in Risk Analysis in Oil and Gas Industry

Report on application of Probability in Risk Analysis in Oil and Gas Industry Abstract Risk Analysis in Oil and Gas Industry Global demand for energy is rising around the world. Meanwhile, managing oil

### 2.0 Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table

2.0 Lesson Plan Answer Questions 1 Summary Statistics Histograms The Normal Distribution Using the Standard Normal Table 2. Summary Statistics Given a collection of data, one needs to find representations

### Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Physics 182 - Fall 2014 - Experiment #1 1 Experiment #1, Analyze Data using Excel, Calculator and Graphs. 1 Purpose (5 Points, Including Title. Points apply to your lab report.) Before we start measuring

### Normal distribution. ) 2 /2σ. 2π σ

Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### Financial Time Series Analysis (FTSA) Lecture 1: Introduction

Financial Time Series Analysis (FTSA) Lecture 1: Introduction Brief History of Time Series Analysis Statistical analysis of time series data (Yule, 1927) v/s forecasting (even longer). Forecasting is often

### Treatment of Incomplete Data in the Field of Operational Risk: The Effects on Parameter Estimates, EL and UL Figures

Chernobai.qxd 2/1/ 1: PM Page 1 Treatment of Incomplete Data in the Field of Operational Risk: The Effects on Parameter Estimates, EL and UL Figures Anna Chernobai; Christian Menn*; Svetlozar T. Rachev;

### EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA

EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA Andreas Christmann Department of Mathematics homepages.vub.ac.be/ achristm Talk: ULB, Sciences Actuarielles, 17/NOV/2006 Contents 1. Project: Motor vehicle

### Pricing Alternative forms of Commercial Insurance cover

Pricing Alternative forms of Commercial Insurance cover Prepared by Andrew Harford Presented to the Institute of Actuaries of Australia Biennial Convention 23-26 September 2007 Christchurch, New Zealand

### Time series Forecasting using Holt-Winters Exponential Smoothing

Time series Forecasting using Holt-Winters Exponential Smoothing Prajakta S. Kalekar(04329008) Kanwal Rekhi School of Information Technology Under the guidance of Prof. Bernard December 6, 2004 Abstract

### Report of for Chapter 2 pretest

Report of for Chapter 2 pretest Exam: Chapter 2 pretest Category: Organizing and Graphing Data 1. "For our study of driving habits, we recorded the speed of every fifth vehicle on Drury Lane. Nearly every

### Financial Assets Behaving Badly The Case of High Yield Bonds. Chris Kantos Newport Seminar June 2013

Financial Assets Behaving Badly The Case of High Yield Bonds Chris Kantos Newport Seminar June 2013 Main Concepts for Today The most common metric of financial asset risk is the volatility or standard

### Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric