Combining GLM and datamining techniques for modelling accident compensation data. Peter Mulquiney

Introduction
Accident compensation data exhibit features which complicate loss reserving and premium rate setting:
- Speeding up or slowing down of payment patterns
- Abrupt changes in trends due to legislative changes
- Changes in the profile of claims
- Other changes which emerge as superimposed inflation
Complicated structure can be modelled with GLMs, but:
- structure is chosen in an ad hoc manner
- process can be laborious and can be fallible

Introduction
Alternative: data mining techniques
- Artificial Neural Networks
- CART, MARS, etc.
Advantages:
- flexible architecture can fit almost any data structure
- model fitting is largely automated

Overview
- Examine general form of model of claims data
- Examine the specific case of a GLM to represent the data
- Consider how the GLM structure is chosen
- Introduce and discuss Artificial Neural Networks (ANNs)
- Consider how these may assist in formulating a GLM

Model of claims data
General form of claims data model:
  $Y_i = f(X_i; \beta) + \varepsilon_i$
- $Y_i$ = some observation on claims experience
- $\beta$ = vector of parameters that apply to all observations
- $X_i$ = vector of attributes (covariates) of i-th observation
- $\varepsilon_i$ = vector of centred stochastic error terms
Examples:
- $Y_i = Y_{ad}$ = paid losses in (a,d) cell
  - a = accident period
  - d = development period
- $Y_i$ = cost of i-th completed claim

Examples (cont)
- $Y_{ad}$ = paid losses in (a,d) cell
    $E[Y_{ad}] = \beta_d \sum_{r=1}^{d-1} Y_{ar}$ (chain ladder)
- $Y_i$ = cost of i-th completed claim
    $Y_i \sim \text{Gamma}$, $E[Y_i] = \exp[\alpha + \beta t_i]$
  where
  - $a_i$ = accident period to which i-th claim belongs
  - $t_i$ = operational time at completion of i-th claim = proportion of claims from the accident period $a_i$ completed before i-th claim
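The two example models can be sketched in a few lines of Python. This is an illustration only: the function names and all numbers are hypothetical, and the operational-time helper assumes the total number of claims in the accident period is known and that claims are listed in completion order.

```python
import math


def chain_ladder_mean(beta_d, paid_to_date):
    """E[Y_ad] = beta_d * sum_{r=1}^{d-1} Y_ar.

    paid_to_date holds [Y_a1, ..., Y_a,d-1] for one accident period a.
    """
    return beta_d * sum(paid_to_date)


def operational_times(n_claims):
    """Operational time of each claim in one accident period.

    Claims are assumed to be in completion order, so t_i is the
    proportion of the period's claims completed before the i-th claim.
    """
    return [i / n_claims for i in range(n_claims)]


def gamma_glm_mean(alpha, beta, t):
    """Mean of the log-link model: E[Y_i] = exp(alpha + beta * t_i)."""
    return math.exp(alpha + beta * t)


# Hypothetical inputs, for illustration only.
print(chain_ladder_mean(0.5, [100.0, 60.0]))  # 80.0
print(operational_times(4))                   # [0.0, 0.25, 0.5, 0.75]
print(gamma_glm_mean(9.0, 2.0, 0.5))
```

Only the mean structure is shown here; fitting $\alpha$ and $\beta$ under a Gamma error would need a GLM routine, which is outside this sketch.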

Examples of individual claim models
More generally,
  $E[Y_i]$ = exp { function of operational time
    + function of accident period (legislative change)
    + function of completion period (superimposed inflation)
    + joint function (interaction) of operational time & accident period (change in payment pattern attributable to legislative change) }

Examples of individual claim models
Models of this type may be very detailed. May include:
- Operational time effect (payment pattern)
- Seasonality
- Creeping change in payment pattern
- Abrupt change in payment pattern
- Accident period effect (legislative change)
- Completion quarter effect (superimposed inflation)
- Variations of superimposed inflation with operational time
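A minimal sketch of how such terms add up inside the exponent is given below. Everything specific here is an assumption made for illustration: the linear and step functional forms, the coefficient values, and the quarter of the legislative change (`change_q`) are hypothetical and not taken from the fitted model.

```python
import math


def linear_predictor(t, accident_q, completion_q, change_q=20,
                     b0=9.0, b_t=2.0, b_change=-0.3, b_si=0.01, b_inter=0.5):
    """Additive linear predictor with one interaction term.

    t            : operational time in [0, 1]
    accident_q   : accident quarter index
    completion_q : completion quarter index
    change_q     : hypothetical quarter in which legislation changed
    """
    post = 1.0 if accident_q >= change_q else 0.0
    eta = b0
    eta += b_t * t              # function of operational time
    eta += b_change * post      # function of accident period (legislative change)
    eta += b_si * completion_q  # function of completion period (superimposed inflation)
    eta += b_inter * post * t   # interaction: payment pattern shifts after the change
    return eta


def expected_cost(t, accident_q, completion_q):
    """E[Y_i] = exp(linear predictor)."""
    return math.exp(linear_predictor(t, accident_q, completion_q))


# Same operational time, before and after the hypothetical change.
print(linear_predictor(0.5, accident_q=10, completion_q=0))
print(linear_predictor(0.5, accident_q=25, completion_q=0))
```

In practice each "function of" term would usually be a spline or a set of piecewise-linear segments rather than a single slope.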

Choosing GLM structure
Typically largely ad hoc, using:
- Trial and error regressions
- Diagnostics, e.g. residual plots
Example: modelling 60,000 Auto Bodily Injury claims
- Model of the cost of completed claims

Choosing GLM structure
First fit just an operational time effect.
[Figure: linear predictor plotted against operational time (optime)]

Choosing GLM structure
But there appear to be unmodelled trends by:
- Accident quarter
- Completion (finalisation) quarter
[Figures: studentized standardized deviance residuals by accident quarter and by finalisation quarter, for quarters from 1994 to Sep-03]

Choosing GLM structure
Final model has terms for:
- Age of claim
- Seasonality
- Accident quarter
- Change in Scheme rules
- Change in age of claim effect with change in Scheme rules
- Superimposed inflation, varying with age of claim

Choosing GLM structure
Structure identified in ad hoc manner:
- Trial and error regressions
- Diagnostics, e.g. residual plots
More rigorous approach desirable. Can we use ANN to do it better?

Introduction to ANN
The ANN regression function
- Start with vector of P inputs $X = \{x_p\}$
- Create hidden layer with M hidden units
- Make M linear combinations of inputs:
    $h_m = \sum_p w_{mp} x_p$
- Linear combinations then passed through layer of activation functions $g(h_m)$:
    $Z_m = g(h_m) = g\left(\sum_p w_{mp} x_p\right)$

Introduction to ANN
Activation function
- Usually a sigmoidal curve:
    $g(h) = \frac{1}{1 + e^{-h}}$
- Function:
  - introduces non-linearity to model
  - keeps response bounded
[Figure: plot of g(h) against h, for h from -5 to 5]

Introduction to ANN
Y is then given by a linear combination of the outputs from the hidden layer:
  $Y = \sum_m W_m Z_m = \sum_m W_m \, g\left(\sum_p w_{mp} x_p\right)$
- This function can approximate any continuous function, given enough hidden units
- With 2 hidden layers, an ANN can describe any function
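The forward pass of this single-hidden-layer network can be sketched in plain Python. The sketch follows the slide formulas literally, so it has no bias (intercept) terms; the function names and the example weights are hypothetical.

```python
import math


def sigmoid(h):
    """Activation function g(h) = 1 / (1 + exp(-h))."""
    return 1.0 / (1.0 + math.exp(-h))


def ann_predict(x, w, W):
    """Single-hidden-layer ANN output.

    x : list of P inputs x_p
    w : M rows of P hidden-layer weights w_mp
    W : list of M output weights W_m
    Returns Y = sum_m W_m * g(sum_p w_mp * x_p).
    """
    z = [sigmoid(sum(w_mp * x_p for w_mp, x_p in zip(w_m, x))) for w_m in w]
    return sum(W_m * z_m for W_m, z_m in zip(W, z))


# With all hidden weights zero, every hidden unit outputs g(0) = 0.5,
# so Y = 0.5 * (1.0 + 3.0) = 2.0.
print(ann_predict([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [1.0, 3.0]))  # 2.0
```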

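Introduction to ANN
[Diagram: network architecture. Inputs $X_i$ are combined using weights $w$ into linear combinations $h_m$, which pass through the activation function $g$ to give hidden outputs $Z_m$; these are combined using weights $W_m$ to give $Y$.]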

Training an ANN
- Weights are usually determined by minimising the least-squares error:
    $Err = \frac{1}{2} \sum_{i=1}^{N} \left(y_i - f(X_i)\right)^2$
- Weight decay penalty function stops overfitting:
    $Err + \lambda \left( \sum_m W_m^2 + \sum_m \sum_p w_{mp}^2 \right)$
- Larger $\lambda$ gives smaller weights
- Smaller weights give a smoother fit
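The penalised objective can be written out directly. This is a sketch of the objective only, not of the optimiser used to minimise it; the function names and example values are hypothetical.

```python
def squared_error(y, fitted):
    """Err = 1/2 * sum_i (y_i - f(X_i))^2."""
    return 0.5 * sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def weight_decay_objective(y, fitted, W, w, lam):
    """Err + lambda * (sum_m W_m^2 + sum_m sum_p w_mp^2)."""
    penalty = sum(W_m ** 2 for W_m in W)
    penalty += sum(w_mp ** 2 for w_m in w for w_mp in w_m)
    return squared_error(y, fitted) + lam * penalty


y = [1.0, 2.0]
fitted = [1.0, 4.0]            # Err = 0.5 * (0 + 4) = 2.0
W = [1.0, 2.0]                 # sum of squares = 5
w = [[1.0, 0.0], [0.0, 2.0]]   # sum of squares = 5
print(weight_decay_objective(y, fitted, W, w, lam=0.05))
```

For a fixed fit, raising `lam` raises the cost of large weights, which is why the minimiser is pushed towards smaller weights and a smoother fitted surface.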

Training of an ANN - Example
- Training data set: 70% of available data
- Test data set: 30% of available data
- Network structure:
  - Single hidden layer
  - 20 units
  - Weight decay $\lambda = 0.05$
- These tuning parameters determined by cross-validation
- Prediction error assessed in test data set
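A 70/30 split and the average absolute error on the held-out set might look like the sketch below. The random shuffle, the seed, and the function names are assumptions; the source does not say how the split was drawn.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def average_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over the test set."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


train, test = train_test_split(range(10))
print(len(train), len(test))                                    # 7 3
print(average_absolute_error([10.0, 20.0], [12.0, 17.0]))       # 2.5
```

The same error measure would be computed for both the GLM and the ANN on the identical test set, so that the comparison is like for like.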

Comparison of GLM and ANN
Model   Average absolute error
GLM     $33,777
ANN     $33,559

ANN forecasts
- 1D graphical plots to visualise data features, e.g. historical and future superimposed inflation
- ANN has searched out general form of past superimposed inflation (SI)
- Future SI determined by simple extrapolation
[Figure: historical and forecast SI from the ANN. Development quarter 10: red; development quarter 20: green; development quarter 30: yellow; development quarter 40: blue]

ANN forecasts
- Note that forecast negative SI may be undesirable
- Need to consider the expected claims environment in the future to determine an appropriate SI forecast
- Because of the problem of extrapolation with ANN, it is usually necessary to supplement the ANN forecast with a separate forecast of future SI

Combining ANN and GLM
- Often preferable to use a GLM over an ANN due to model simplicity and transparency
  - ANN: 181 parameters
  - GLM: 13 parameters
- May get the best out of ANN and GLM if they are used in combination
- Use ANN as an automated tool for seeking out trends in data:
  - Apply ANN to data set
  - Study trends in fitted model against a range of predictors or pairs of predictors using graphical means
  - Use this knowledge to choose the functional forms to include in the GLM
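The ANN parameter count can be checked with a short calculation. The sketch assumes a single hidden layer with a bias term on every hidden unit and on the output. Under that assumption, 20 hidden units and 7 inputs give 181 parameters. The figure of 7 inputs is an inference from the count, not something the source states.

```python
def ann_parameter_count(n_inputs, n_hidden, bias=True):
    """Number of weights in a single-hidden-layer network with one output.

    Hidden layer: n_hidden * (n_inputs + bias)
    Output layer: n_hidden + bias
    """
    extra = 1 if bias else 0
    return n_hidden * (n_inputs + extra) + (n_hidden + extra)


# Assumed configuration: 20 hidden units, 7 inputs, bias terms included.
print(ann_parameter_count(n_inputs=7, n_hidden=20))  # 181
```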

Combining ANN and GLM
- Ultimate test of the GLM is to apply ANN to its residuals, seeking structure
- There should be none
- The example indicates that the chosen GLM structure may:
  - Over-estimate the more recent experience at the mid-ages of claim
  - Under-estimate it at the older ages
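The idea of searching residuals for leftover structure can be illustrated with a much simpler stand-in for the ANN: average the residuals within groups of a predictor and look for group means far from zero. This binned-means check is a substitute chosen for brevity, not the method used in the source, and the group labels and residual values below are hypothetical.

```python
def mean_residual_by_group(residuals, groups):
    """Average residual within each group of a predictor.

    A well-specified model should give group means close to zero;
    a systematic departure suggests structure the model has missed.
    """
    totals = {}
    for r, g in zip(residuals, groups):
        s, n = totals.get(g, (0.0, 0))
        totals[g] = (s + r, n + 1)
    return {g: s / n for g, (s, n) in totals.items()}


# Hypothetical residuals (actual minus fitted) by age-of-claim band.
residuals = [1.0, -1.0, 2.0, 4.0]
bands = ["mid", "mid", "old", "old"]
print(mean_residual_by_group(residuals, bands))  # {'mid': 0.0, 'old': 3.0}
```

Here the positive mean in the "old" band would point to under-estimation at older ages. Unlike this one-predictor check, an ANN fitted to the residuals can also pick up interactions between predictors.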

Conclusions
- GLMs provide a powerful and flexible family of models for claims data
- Complex GLM structures may be required for adequate representation of the data
  - The identification of these may be difficult
  - The identification procedures are likely to be ad hoc
- ANNs provide an alternative form of non-linear regression
  - These are likely to involve their own shortcomings if left to stand on their own (e.g. reduced transparency)
  - They may, however, provide considerable assistance if used in parallel with GLMs to identify GLM structure