Combining GLM and datamining techniques for modelling accident compensation data. Peter Mulquiney

Size: px
Start display at page:

Download "Combining GLM and datamining techniques for modelling accident compensation data. Peter Mulquiney"

Transcription

1 Combining GLM and datamining techniques for modelling accident compensation data Peter Mulquiney

2 Introduction Accident compensation data exhibit features which complicate loss reserving and premium rate setting Speeding up or slowing down of payment patterns Abrupt changes in trends due to legislative changes Changes in the profile of claims Other changes which emerge as superimposed inflation Complicated structure can be modelled with GLMs structure chosen in an ad hoc manner process can be laborious and can be fallible

3 Alternative: Data mining techniques Artificial Neural Networks CART, MARS etc Advantages: Introduction flexible architecture can fit almost any data structure model fitting is largely automated

4 Overview Examine general form of model of claims data Examine the specific case of a GLM to represent the data Consider how the GLM structure is chosen Introduce and discuss Artificial Neural Networks (ANNs) Consider how these may assist in formulating a GLM

5 Model of claims data General form of claims data model Y i = f(x i ; β) + ε i Y i = some observation on claims experience β = vector of parameters that apply to all observations X i = vector of attributes (covariates) of i-th observation ε i = vector of centred stochastic error terms

6 General form of claims data model Y i = f(x i ; β) + ε i Y i = some observation on claims experience β = vector of parameters that apply to all observations X i = vector of attributes (covariates) of i-th observation ε i = vector of centred stochastic error terms Examples Model of claims data Y i = Y ad = paid losses in (a,d) cell» a = accident period» d = development period Y i = cost of i-th completed claim

7 Examples (cont) Y ad = paid losses in (a,d) cell E[Y ad ] = β d Σ r=1 d-1 Y ar (chain ladder)

8 Y ad = paid losses in (a,d) cell E[Y ad ] = β d Σ r=1 d-1 Y ar (chain ladder) Y i = cost of i-th completed claim Y i ~ Gamma E[Y i ] = exp [α+β t i ] where Examples (cont)» a i = accident period to which i-th claim belongs» t i = operational time at completion of i-th claim = proportion of claims from the accident period a i completed before i-th claim

9 Examples of individual claim models More generally E[Y i ] = exp {function of operational time}

10 Examples of individual claim models More generally E[Y i ] = exp {function of operational time} + function of accident period (legislative change)}

11 More generally E[Y i ] = Examples of individual claim models exp {function of operational time} + function of accident period (legislative change)} + function of completion period (superimposed inflation)}

12 More generally E[Y i ] = Examples of individual claim models exp {function of operational time} + function of accident period (legislative change)} + function of completion period (superimposed inflation)} + joint function (interaction) of operational time & accident period (change in payment pattern attributable to legislative change)}

13 Examples of individual claim models Models of this type may be very detailed May include Operational time effect (payment pattern) Seasonality Creeping change in payment pattern Abrupt change in payment pattern Accident period effect (legislative change) Completion quarter effect (superimposed inflation) Variations of superimposed inflation with operational time

14 Choosing GLM structure Typically largely ad hoc, using Trial and error regressions Diagnostics, e.g. residual plots Example: Modelling 60,000 Auto Bodily Injury claims Model of the cost of completed claims

15 Choosing GLM structure First fit just an operational time effect 12.5 Linear Predictor optime optime

16 Choosing GLM structure But there appear to be unmodelled trends by Accident quarter Completion (finalisation) quarter Studentized Standardized Deviance Residuals by accident quarter Studentized Standardized Deviance Residuals by finalisation quarter Sep-94 Dec-94 Mar-95 Jun-95 Sep-95 Dec-95 Mar-96 Jun-96 Sep-96 Dec-96 Mar-97 Jun-97 Sep-97 Dec-97 Mar-98 Jun-98 Sep-98 Dec-98 Mar-99 Jun-99 Sep-99 Dec-99 Mar-00 Jun-00 Sep-00 Dec-00 Mar-01 Jun-01 Sep-01 Dec-01 Mar-02 Jun-02 Sep-02 Dec-02 Mar-03 Jun-03 Sep-03 De c-94 Mar -95 Ju n- 95 Sep-95 De c- 95 Mar -96 Ju n- 96 Sep-96 De c- 96 Mar-97 Ju n- 97 Se p-97 De c- 97 Mar-98 Ju n-98 Se p-98 De c- 98 Mar-99 Ju n-99 Sep- 99 De c- 99 Mar-0 0 Ju n- 00 Sep- 00 Dec-00 Mar -0 1 Ju n- 01 Sep- 01 De c-01 Mar -0 2 Ju n- 02 Sep-02 De c-02 Mar -03 Ju n- 03 Sep-03

17 Choosing GLM structure Final model has terms for: Age of claim Seasonality Accident quarter Change in Scheme rules Change in age of claim effect with change in Scheme rules Superimposed inflation Varying with age of claim

18 Choosing GLM structure Structure identified in ad hoc manner Trial and error regressions Diagnostics, e.g. residual plots More rigorous approach desirable Can we use ANN to do it better?

19 Introduction to ANN The ANN Regression Function Start with vector of P inputs X = {x p } Create hidden layer with M hidden units Make M linear combinations of inputs h m = p w Linear combinations then passed through layer of activation functions g(h m ) Z m = mp x g( h ) = ( m g w p p mp x p )

20 Activation function Introduction to ANN Usually a sigmoidal curve g(h) g( h) 1 = 1 + e h h Function introduces non-linearity to model keeps response bounded

21 Introduction to ANN Y is then given by a linear combination of the outputs from the hidden layer Y = W = ( mz m Wmg m m p w mp x p ) This function can describe any continuous function 2 hidden layers ANN can describe any function

22 Introduction to ANN Y W m Z m g h m w m X i

23 Training an ANN Weights are usually determined by minimising the least-squares error Err = 1 2 Weight decay penalty function stops overfitting N i= 1 ( y f ( i X i 2 )) Err + λ( 2 + m W m w m p 2 mp ) Larger λ smaller weights Smaller weights smoother fit

24 Training of an ANN - Example Training data set: 70% of available data Test data set: 30% of available data Network structure: Single hidden layer 20 units Weight decay λ=0.05 These tuning parameters determined by crossvalidation Prediction error in test data set

25 GLM Comparison of GLM and ANN ANN Average absolute error = $33,777 Average absolute error = $33,559

26 1D graphical plots to visualise data features e.g. historical and future superimposed inflation Development quarter 10: red Development quarter 20: green Development quarter 30: yellow Development quarter 40: blue ANN forecasts ANN has searched out general form of past superimposed inflation (SI) Future SI determined by simple extrapolation

27 Note forecast negative SI may be undesirable Need to consider expected claims environment in the future to determine appropriate SI forecast Because of problem of extrapolation with ANN usually necessary to supplement ANN forecast with a separate forecast of future SI. ANN forecasts

28 Combining ANN and GLM Often preferable to use a GLM over ANN due to model simplicity and transparency ANN 181 parameters GLM 13 pars May get best out of ANN and GLM if use in combination Use ANN as an automated tool to seeking out trends in data Apply ANN to data set Study trends in fitted model against a range of predictors or pairs of predictors using graphical means Use this knowledge to choose the functional forms to include in the GLM model

29 Combining ANN and GLM Ultimate test of the GLM is to apply ANN to its residuals, seeking structure There should be none The example indicates that there may the chosen GLM structure may: Over-estimate the more recent experience at the midages of claim Under-estimate it at the older ages

30 Conclusions GLMs provide a powerful and flexible family of models for claims data Complex GLM structures may be required for adequate representation of the data The identification of these may be difficult The identification procedures are likely to be ad hoc ANNs provide an alternative form of non-linear regression These are likely to involve their own shortcomings if left to stand on their own (e.g. reduced transparency) They may, however, provide considerable assistance if used in parallel with GLMs to identify GLM structure

Neural Networks & Boosting

Neural Networks & Boosting Neural Networks & Boosting Bob Stine Dept of Statistics, School University of Pennsylvania Questions How is logistic regression different from OLS? Logistic mean function for probabilities Larger weight

More information

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling 1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information

More information

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims

More information

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network , pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and

More information

Joseph Twagilimana, University of Louisville, Louisville, KY

Joseph Twagilimana, University of Louisville, Louisville, KY ST14 Comparing Time series, Generalized Linear Models and Artificial Neural Network Models for Transactional Data analysis Joseph Twagilimana, University of Louisville, Louisville, KY ABSTRACT The aim

More information

Time series forecasting

Time series forecasting Time series forecasting 1 The latest version of this document and related examples are found in http://myy.haaga-helia.fi/~taaak/q Time series forecasting The objective of time series methods is to discover

More information

BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING

BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING Xavier Conort xavier.conort@gear-analytics.com Session Number: TBR14 Insurance has always been a data business The industry has successfully

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi

Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi Neural Networks CAP5610 Machine Learning Instructor: Guo-Jun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector

More information

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com Louise.francis@data-mines.cm

More information

Lecture 6. Artificial Neural Networks

Lecture 6. Artificial Neural Networks Lecture 6 Artificial Neural Networks 1 1 Artificial Neural Networks In this note we provide an overview of the key concepts that have led to the emergence of Artificial Neural Networks as a major paradigm

More information

How to Construct a Seasonal Index

How to Construct a Seasonal Index How to Construct a Seasonal Index Methods of Constructing a Seasonal Index There are several ways to construct a seasonal index. The simplest is to produce a graph with the factor being studied (i.e.,

More information

Machine Learning and Data Mining. Regression Problem. (adapted from) Prof. Alexander Ihler

Machine Learning and Data Mining. Regression Problem. (adapted from) Prof. Alexander Ihler Machine Learning and Data Mining Regression Problem (adapted from) Prof. Alexander Ihler Overview Regression Problem Definition and define parameters ϴ. Prediction using ϴ as parameters Measure the error

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Not Your Dad s Magic Eight Ball

Not Your Dad s Magic Eight Ball Not Your Dad s Magic Eight Ball Prepared for the NCSL Fiscal Analysts Seminar, October 21, 2014 Jim Landers, Office of Fiscal and Management Analysis, Indiana Legislative Services Agency Actual Forecast

More information

Development Period 1 2 3 4 5 6 7 8 9 Observed Payments

Development Period 1 2 3 4 5 6 7 8 9 Observed Payments Pricing and reserving in the general insurance industry Solutions developed in The SAS System John Hansen & Christian Larsen, Larsen & Partners Ltd 1. Introduction The two business solutions presented

More information

COMBINED NEURAL NETWORKS FOR TIME SERIES ANALYSIS

COMBINED NEURAL NETWORKS FOR TIME SERIES ANALYSIS COMBINED NEURAL NETWORKS FOR TIME SERIES ANALYSIS Iris Ginzburg and David Horn School of Physics and Astronomy Raymond and Beverly Sackler Faculty of Exact Science Tel-Aviv University Tel-A viv 96678,

More information

Prediction Model for Crude Oil Price Using Artificial Neural Networks

Prediction Model for Crude Oil Price Using Artificial Neural Networks Applied Mathematical Sciences, Vol. 8, 2014, no. 80, 3953-3965 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.43193 Prediction Model for Crude Oil Price Using Artificial Neural Networks

More information

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

More information

APPLYING DATA MINING TECHNIQUES TO FORECAST NUMBER OF AIRLINE PASSENGERS

APPLYING DATA MINING TECHNIQUES TO FORECAST NUMBER OF AIRLINE PASSENGERS APPLYING DATA MINING TECHNIQUES TO FORECAST NUMBER OF AIRLINE PASSENGERS IN SAUDI ARABIA (DOMESTIC AND INTERNATIONAL TRAVELS) Abdullah Omer BaFail King Abdul Aziz University Jeddah, Saudi Arabia ABSTRACT

More information

Industry Environment and Concepts for Forecasting 1

Industry Environment and Concepts for Forecasting 1 Table of Contents Industry Environment and Concepts for Forecasting 1 Forecasting Methods Overview...2 Multilevel Forecasting...3 Demand Forecasting...4 Integrating Information...5 Simplifying the Forecast...6

More information

Time Series Analysis. 1) smoothing/trend assessment

Time Series Analysis. 1) smoothing/trend assessment Time Series Analysis This (not surprisingly) concerns the analysis of data collected over time... weekly values, monthly values, quarterly values, yearly values, etc. Usually the intent is to discern whether

More information

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

More information

IFT3395/6390. Machine Learning from linear regression to Neural Networks. Machine Learning. Training Set. t (3.5, -2,..., 127, 0,...

IFT3395/6390. Machine Learning from linear regression to Neural Networks. Machine Learning. Training Set. t (3.5, -2,..., 127, 0,... IFT3395/6390 Historical perspective: back to 1957 (Prof. Pascal Vincent) (Rosenblatt, Perceptron ) Machine Learning from linear regression to Neural Networks Computer Science Artificial Intelligence Symbolic

More information

Predictive modelling around the world 28.11.13

Predictive modelling around the world 28.11.13 Predictive modelling around the world 28.11.13 Agenda Why this presentation is really interesting Introduction to predictive modelling Case studies Conclusions Why this presentation is really interesting

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

Simple Methods and Procedures Used in Forecasting

Simple Methods and Procedures Used in Forecasting Simple Methods and Procedures Used in Forecasting The project prepared by : Sven Gingelmaier Michael Richter Under direction of the Maria Jadamus-Hacura What Is Forecasting? Prediction of future events

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

5. Multiple regression

5. Multiple regression 5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

More information

Neural Network Stock Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com

Neural Network Stock Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com Neural Network Stock Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com There are at least as many ways to trade stocks and other financial instruments as there are traders. Remarkably, most people

More information

Module 6: Introduction to Time Series Forecasting

Module 6: Introduction to Time Series Forecasting Using Statistical Data to Make Decisions Module 6: Introduction to Time Series Forecasting Titus Awokuse and Tom Ilvento, University of Delaware, College of Agriculture and Natural Resources, Food and

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

ON THE DEGREES OF FREEDOM IN RICHLY PARAMETERISED MODELS

ON THE DEGREES OF FREEDOM IN RICHLY PARAMETERISED MODELS COMPSTAT 2004 Symposium c Physica-Verlag/Springer 2004 ON THE DEGREES OF FREEDOM IN RICHLY PARAMETERISED MODELS Salvatore Ingrassia and Isabella Morlini Key words: Richly parameterised models, small data

More information

Combining Linear and Non-Linear Modeling Techniques: EMB America. Getting the Best of Two Worlds

Combining Linear and Non-Linear Modeling Techniques: EMB America. Getting the Best of Two Worlds Combining Linear and Non-Linear Modeling Techniques: Getting the Best of Two Worlds Outline Who is EMB? Insurance industry predictive modeling applications EMBLEM our GLM tool How we have used CART with

More information

8. Time Series and Prediction

8. Time Series and Prediction 8. Time Series and Prediction Definition: A time series is given by a sequence of the values of a variable observed at sequential points in time. e.g. daily maximum temperature, end of day share prices,

More information

Lecture 8 February 4

Lecture 8 February 4 ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt

More information

Nonlinear Regression:

Nonlinear Regression: Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference

More information

Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016

Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016 Event driven trading new studies on innovative way of trading in Forex market Michał Osmoła INIME live 23 February 2016 Forex market From Wikipedia: The foreign exchange market (Forex, FX, or currency

More information

Data Mining. Nonlinear Classification

Data Mining. Nonlinear Classification Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

More information

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

More information

The Do s & Don ts of Building A Predictive Model in Insurance. University of Minnesota November 9 th, 2012 Nathan Hubbell, FCAS Katy Micek, Ph.D.

The Do s & Don ts of Building A Predictive Model in Insurance. University of Minnesota November 9 th, 2012 Nathan Hubbell, FCAS Katy Micek, Ph.D. The Do s & Don ts of Building A Predictive Model in Insurance University of Minnesota November 9 th, 2012 Nathan Hubbell, FCAS Katy Micek, Ph.D. Agenda Travelers Broad Overview Actuarial & Analytics Career

More information

Neural Networks. Introduction to Artificial Intelligence CSE 150 May 29, 2007

Neural Networks. Introduction to Artificial Intelligence CSE 150 May 29, 2007 Neural Networks Introduction to Artificial Intelligence CSE 150 May 29, 2007 Administration Last programming assignment has been posted! Final Exam: Tuesday, June 12, 11:30-2:30 Last Lecture Naïve Bayes

More information

Supply Chain Forecasting Model Using Computational Intelligence Techniques

Supply Chain Forecasting Model Using Computational Intelligence Techniques CMU.J.Nat.Sci Special Issue on Manufacturing Technology (2011) Vol.10(1) 19 Supply Chain Forecasting Model Using Computational Intelligence Techniques Wimalin S. Laosiritaworn Department of Industrial

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Artificial Neural Network and Non-Linear Regression: A Comparative Study

Artificial Neural Network and Non-Linear Regression: A Comparative Study International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012 1 Artificial Neural Network and Non-Linear Regression: A Comparative Study Shraddha Srivastava 1, *, K.C.

More information

Automated Biosurveillance Data from England and Wales, 1991 2011

Automated Biosurveillance Data from England and Wales, 1991 2011 Article DOI: http://dx.doi.org/10.3201/eid1901.120493 Automated Biosurveillance Data from England and Wales, 1991 2011 Technical Appendix This online appendix provides technical details of statistical

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 4: Transformations Regression III: Advanced Methods William G. Jacoby Michigan State University Goals of the lecture The Ladder of Roots and Powers Changing the shape of distributions Transforming

More information

Logistic Regression for Spam Filtering

Logistic Regression for Spam Filtering Logistic Regression for Spam Filtering Nikhila Arkalgud February 14, 28 Abstract The goal of the spam filtering problem is to identify an email as a spam or not spam. One of the classic techniques used

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Data Mining Prediction

Data Mining Prediction Data Mining Prediction Jingpeng Li 1 of 23 What is Prediction? Predicting the identity of one thing based purely on the description of another related thing Not necessarily future events, just unknowns

More information

Chapter 4: Artificial Neural Networks

Chapter 4: Artificial Neural Networks Chapter 4: Artificial Neural Networks CS 536: Machine Learning Littman (Wu, TA) Administration icml-03: instructional Conference on Machine Learning http://www.cs.rutgers.edu/~mlittman/courses/ml03/icml03/

More information

INTRODUCTION TO NEURAL NETWORKS

INTRODUCTION TO NEURAL NETWORKS INTRODUCTION TO NEURAL NETWORKS Pictures are taken from http://www.cs.cmu.edu/~tom/mlbook-chapter-slides.html http://research.microsoft.com/~cmbishop/prml/index.htm By Nobel Khandaker Neural Networks An

More information

Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

More information

Drugs store sales forecast using Machine Learning

Drugs store sales forecast using Machine Learning Drugs store sales forecast using Machine Learning Hongyu Xiong (hxiong2), Xi Wu (wuxi), Jingying Yue (jingying) 1 Introduction Nowadays medical-related sales prediction is of great interest; with reliable

More information

Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu

Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu Submission for ARCH, October 31, 2006 Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu Jed L. Linfield, FSA, MAAA, Health Actuary, Kaiser Permanente,

More information

Joint models for classification and comparison of mortality in different countries.

Joint models for classification and comparison of mortality in different countries. Joint models for classification and comparison of mortality in different countries. Viani D. Biatat 1 and Iain D. Currie 1 1 Department of Actuarial Mathematics and Statistics, and the Maxwell Institute

More information

Logistic Regression (a type of Generalized Linear Model)

Logistic Regression (a type of Generalized Linear Model) Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

A Short Tour of the Predictive Modeling Process

A Short Tour of the Predictive Modeling Process Chapter 2 A Short Tour of the Predictive Modeling Process Before diving in to the formal components of model building, we present a simple example that illustrates the broad concepts of model building.

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

More information

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive

More information

Optimization of technical trading strategies and the profitability in security markets

Optimization of technical trading strategies and the profitability in security markets Economics Letters 59 (1998) 249 254 Optimization of technical trading strategies and the profitability in security markets Ramazan Gençay 1, * University of Windsor, Department of Economics, 401 Sunset,

More information

Data Mining Lab 5: Introduction to Neural Networks

Data Mining Lab 5: Introduction to Neural Networks Data Mining Lab 5: Introduction to Neural Networks 1 Introduction In this lab we are going to have a look at some very basic neural networks on a new data set which relates various covariates about cheese

More information

Neural Network and Genetic Algorithm Based Trading Systems. Donn S. Fishbein, MD, PhD Neuroquant.com

Neural Network and Genetic Algorithm Based Trading Systems. Donn S. Fishbein, MD, PhD Neuroquant.com Neural Network and Genetic Algorithm Based Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com Consider the challenge of constructing a financial market trading system using commonly available technical

More information

Forecasting Hospital Bed Availability Using Simulation and Neural Networks

Forecasting Hospital Bed Availability Using Simulation and Neural Networks Forecasting Hospital Bed Availability Using Simulation and Neural Networks Matthew J. Daniels Michael E. Kuhl Industrial & Systems Engineering Department Rochester Institute of Technology Rochester, NY

More information

Risks Involved with Systematic Investment Strategies

Risks Involved with Systematic Investment Strategies Risks Involved with Systematic Investment Strategies SIMC Lectures in Entrepreneurial Leadership, ETH Zurich, D-MTEC, February 5, 2008 Dr. Gabriel Dondi (dondi@swissquant.ch) This work is subject to copyright.

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.7 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Linear Regression Other Regression Models References Introduction Introduction Numerical prediction is

More information

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S. AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

More information

Model Validation Techniques

Model Validation Techniques Model Validation Techniques Kevin Mahoney, FCAS kmahoney@ travelers.com CAS RPM Seminar March 17, 2010 Uses of Statistical Models in P/C Insurance Examples of Applications Determine expected loss cost

More information

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling 1 Forecasting Women s Apparel Sales Using Mathematical Modeling Celia Frank* 1, Balaji Vemulapalli 1, Les M. Sztandera 2, Amar Raheja 3 1 School of Textiles and Materials Technology 2 Computer Information

More information

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central

Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central Article Machine learning algorithms for predicting roadside fine particulate matter concentration level in Hong Kong Central Yin Zhao, Yahya Abu Hasan School of Mathematical Sciences, Universiti Sains

More information

6. Feed-forward mapping networks

6. Feed-forward mapping networks 6. Feed-forward mapping networks Fundamentals of Computational Neuroscience, T. P. Trappenberg, 2002. Lecture Notes on Brain and Computation Byoung-Tak Zhang Biointelligence Laboratory School of Computer

More information

Role of Neural network in data mining

Role of Neural network in data mining Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

Neural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation

Neural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation Neural Networks for Machine Learning Lecture 13a The ups and downs of backpropagation Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed A brief history of backpropagation

More information

INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr.

INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. Meisenbach M. Hable G. Winkler P. Meier Technology, Laboratory

More information

HOSPIRA (HSP US) HISTORICAL COMMON STOCK PRICE INFORMATION

HOSPIRA (HSP US) HISTORICAL COMMON STOCK PRICE INFORMATION 30-Apr-2004 28.35 29.00 28.20 28.46 28.55 03-May-2004 28.50 28.70 26.80 27.04 27.21 04-May-2004 26.90 26.99 26.00 26.00 26.38 05-May-2004 26.05 26.69 26.00 26.35 26.34 06-May-2004 26.31 26.35 26.05 26.26

More information

RISK MARGINS FOR HEALTH INSURANCE. Adam Searle and Dimity Wall

RISK MARGINS FOR HEALTH INSURANCE. Adam Searle and Dimity Wall RISK MARGINS FOR HEALTH INSURANCE Adam Searle and Dimity Wall Structure of presentation Why this paper? The nature of health insurance Our approach Benchmarks LAT Limitations 2 AASB1023 Why this paper?

More information

Expert Systems with Applications

Expert Systems with Applications Expert Systems with Applications 39 (2012) 3659 3667 Contents lists available at SciVerse ScienceDirect Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa Gradient boosting

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

More information

Insurance Fraud Detection: MARS versus Neural Networks?

Insurance Fraud Detection: MARS versus Neural Networks? Insurance Fraud Detection: MARS versus Neural Networks? Louise A Francis FCAS, MAAA Louise_francis@msn.com 1 Objectives Introduce a relatively new data mining method which can be used as an alternative

More information

Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

More information

Power Prediction Analysis using Artificial Neural Network in MS Excel

Power Prediction Analysis using Artificial Neural Network in MS Excel Power Prediction Analysis using Artificial Neural Network in MS Excel NURHASHINMAH MAHAMAD, MUHAMAD KAMAL B. MOHAMMED AMIN Electronic System Engineering Department Malaysia Japan International Institute

More information

Local classification and local likelihoods

Local classification and local likelihoods Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor

More information

Data Mining Techniques for Optimizing Québec s Automobile Risk-Sharing Pool

Data Mining Techniques for Optimizing Québec s Automobile Risk-Sharing Pool Data Mining Techniques for Optimizing Québec s Automobile Risk-Sharing Pool T E C H N O L O G I C A L W H I T E P A P E R Charles Dugas, Ph.D., A.S.A. Director, insurance solutions ApSTAT Technologies

More information

Estimating Claim Settlement Values Using GLM. Roosevelt C. Mosley Jr., FCAS, MAAA

Estimating Claim Settlement Values Using GLM. Roosevelt C. Mosley Jr., FCAS, MAAA Estimating Claim Settlement Values Using GLM Roosevelt C. Mosley Jr., FCAS, MAAA 291 Estimating Claim Settlement Values Using GLM by Roosevelt C. Mosley, Jr., FCAS, MAAA Abstract: The goal of this paper

More information

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

More information

Interaction between quantitative predictors

Interaction between quantitative predictors Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

More information

Financial Simulation Models in General Insurance

Financial Simulation Models in General Insurance Financial Simulation Models in General Insurance By - Peter D. England Abstract Increases in computer power and advances in statistical modelling have conspired to change the way financial modelling is

More information

A Color Placement Support System for Visualization Designs Based on Subjective Color Balance

A Color Placement Support System for Visualization Designs Based on Subjective Color Balance A Color Placement Support System for Visualization Designs Based on Subjective Color Balance Eric Cooper and Katsuari Kamei College of Information Science and Engineering Ritsumeikan University Abstract:

More information

240ST014 - Data Analysis of Transport and Logistics

240ST014 - Data Analysis of Transport and Logistics Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 240 - ETSEIB - Barcelona School of Industrial Engineering 715 - EIO - Department of Statistics and Operations Research MASTER'S

More information

Energy Load Mining Using Univariate Time Series Analysis

Energy Load Mining Using Univariate Time Series Analysis Energy Load Mining Using Univariate Time Series Analysis By: Taghreed Alghamdi & Ali Almadan 03/02/2015 Caruth Hall 0184 Energy Forecasting Energy Saving Energy consumption Introduction: Energy consumption.

More information

STOCHASTIC CLAIMS RESERVING IN GENERAL INSURANCE. By P. D. England and R. J. Verrall. abstract. keywords. contact address. ".

STOCHASTIC CLAIMS RESERVING IN GENERAL INSURANCE. By P. D. England and R. J. Verrall. abstract. keywords. contact address. . B.A.J. 8, III, 443-544 (2002) STOCHASTIC CLAIMS RESERVING IN GENERAL INSURANCE By P. D. England and R. J. Verrall [Presented to the Institute of Actuaries, 28 January 2002] abstract This paper considers

More information

Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

More information

exspline That: Explaining Geographic Variation in Insurance Pricing

exspline That: Explaining Geographic Variation in Insurance Pricing Paper 8441-2016 exspline That: Explaining Geographic Variation in Insurance Pricing Carol Frigo and Kelsey Osterloo, State Farm Insurance ABSTRACT Generalized linear models (GLMs) are commonly used to

More information

A Deeper Look Inside Generalized Linear Models

A Deeper Look Inside Generalized Linear Models A Deeper Look Inside Generalized Linear Models University of Minnesota February 3 rd, 2012 Nathan Hubbell, FCAS Agenda Property & Casualty (P&C Insurance) in one slide The Actuarial Profession Travelers

More information