EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA
|
|
|
- Brian Oswald Rich
- 10 years ago
- Views:
Transcription
1 EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA Andreas Christmann Department of Mathematics homepages.vub.ac.be/ achristm Talk: ULB, Sciences Actuarielles, 17/NOV/2006 Contents 1. Project: Motor vehicle insurance 2. Classical Methods 3. Convex risk minimization 4. Robustness 5. Cont.: Motor vehicle insurance 6. Summary 1
2 1. PROJECT: MOTOR VEHICLE INSURANCE Verband öffentlicher Versicherer, Düsseldorf, Germany non-aggregated data from 15 insurance companies 3GBperyear,> 4.5 million customers, > 70 explanatory variables primary response: pure premium [year] insurance tariffs secondary response: probability of having at least 1 claim per year complex dependencies, empty cells, missing values, some extreme high costs 2
3 claim amount for each customer x R p explanatory variables Statistical Objectives Actual premium. Charged to the customer pure premium + safety loading + administrative costs + desired profit Primary response: Pure premium. E(Y X = x) observed pure premium = sum of individual claim sizes number of days under risk / 360 Secondary response: Prob. of claim. P(Y >0 X = x) Precision criterion: MSE. E((Y Ŷ )2 X = x) =min! Fair criterion: Bias. E(Y Ŷ X = x) 0! in whole population and in subpopulations for each customer 3
4 Characteristic Properties Most of the policy holders have no claim within a year or a certain period. The claim sizes are extremely skewed to the right, but there is atom in zero. There is a complex, high dimensional dependency structure between variables. There are only imprecise values available for some explanatory variables. Some claim sizes are only estimates. The data sets to be analyzed are huge. Extreme high claim amounts are rare events, but contribute enormously to the total sum of all claims. 4
5 Exploratory Data Analysis Pure premium per policy holder per year total mean 360 EUR Pure premium % obs. % of total sum Mean Median total (0,2000] (2000,10000] (10000,50000] > Max. individual claim size: some million EUR 5
6 EUR Pure premium Relative frequency Age of main user Age of main user Probability
7 smoothed prob. for claim SFKL Age of main user 7
8 8
9 2. METHODS Classical approach (in Germany): Marginal Sum Model Poisson-Regression Generalized Linear Models (GLIM): Y i has distribution from exponential family E(Y i )=µ i =g 1 (x i β)andvar(y i)=φ V (g 1 (x i β))/w i g = link function, V = variance function Special cases: Poisson: E(Y i )=Var(Y i ) Gamma: V (µ i )=µ 2 i Inverse Gaussian: V (µ i )=µ 3 i Negative Binomial: V (µ i )=µ i + kµ 2 i Tweedie s compound Poisson Model (Smyth & Jørgensen, 02): 9
10 Check of GLIM assumption QQ plot for GPD Variance 0 e+00 3 e+10 6 e+10 Gamma:Var = ke 2 Poisson : Var = ke Average log(empirical quantiles) log(empirical quantiles) GPD( 0.8, ) log(theoretical quantiles) GPD( 0.63, ) log(empirical quantiles) log(empirical quantiles) GPD( 0.89, ) log(theoretical quantiles) GPD( 0.85, ) overdispersion k 1 log(theoretical quantiles) log(theoretical quantiles) 10
11 PROPOSAL Denote: Y pure premium, x vector of explanatory variables 1: Define additional variable C to classify Y,e.g. C =0 ify = 0 no cost 94.9 % =1 ify (0, 2000] low 2.2 % =2 ify (2000, 10000] medium 2.4 % =3 ify (10000, 50000] high 0.4 % =4 ify>50, 000 extreme 0.07 % 2: No claims: E(Y X = x, C =0) 0 for all x. E(Y X = x) =P(C>0 X = x) E(Y C >0,X = x) =P(C>0 X = x) k+1 c=1 P(C = c C >0,X = x) E(Y C = c, X = x) 11
12 E(Y X = x) =P(C>0 X = x) k+1 c=1 P(C = c C >0,X = x) E(Y C = c, X = x) + Companies have interest in P(C = c) ore(y X = x, C = c) + Circumvents the problem: most y i = 0, but P(Y =0)=0 + Reduction of computation time possible: regression only for 5% of obs.! + Reduction of interactions: different x for different C = c + Variable selection: different vectors x for different C groups are possible + Different estimation techniques can be used for estimating P(C = c X = x) ande(y X = x, C = c), e.g. Multinomial logistic regression + Gamma regression Robust logistic regr. + semi-parametr. regr. Classification trees + semiparametric regression Kernel logistic regression + ε support vector regression Combination of above pairs + extreme value theory (e.g. GPD) 12
13 3. EMPIRICAL RISK MINIMIZATION Vapnik 98 Data set (x i,y i ) R p { 1, +1}. Assume (X i,y i ) iid P, P unknown, predictor ˆf(x), classifier sign( ˆf(x)+ˆb), loss function L(y, f(x)+b) goal: arg min f,b E P L(Y,f(X)+b) reg. emp. ( ˆf n,λ, ˆb n,λ )=argmin f H, b R 1 n ni=1 L(y i,f(x i )+b)+λ f 2 H, risk: where L convex, λ > 0 reg. constant, H reproducing kernel Hilbert space (RKHS), defined by reproducing kernel k [k : R p R p R, k(x, ) H, f(x) = f,k(x, ) ] reg. theor. risk: (f P,λ,b P,λ )=argmin f H, b R E P L(Y,f(X)+b)+λ f 2 H Vapnik 98, Zhang 01, Chr & Steinwart 05: L-risk consistency 13
14 Advantages: Kernel trick restricts class of functions to broad subset (RKHS) allows very flexible fits RBF kernel: k(x i,x j ) =exp( γ x i x j 2 ),γ>0 Convex loss function avoids computational intractable NP-hard problems Regularization improves generalization property, avoids overfitting Software: many computational tricks Overview: 14
15 Kernel Logistic Regression (KLR) L(x, y, f) =ln(1+exp( y[f(x)+b])) smoothly decreasing risk scoring is possible: P(Y =1 X = x) = 1 1+e [f(x)+b] ε Support Vector Regression (SVR) (Vapnik 98) L ε (x, y, f) =max{0, y f(x) ε} Loss ε= y ξ i ξj 2ε y [f(x)+b] x 15
16 4. Robustness model assumptions valid? some extreme claim sizes are estimates imprecise data: driving distance during a year (1 ε)p +εp ~ P neighborhood Goal: T (P), but T ((1 ε)p + ε P) T (P)? set of all distributions T (P) =(f P,λ,b P,λ )=argmin f H, b R E P L(Y,f(X)+b)+λ f 2 H 16
17 Robustness Approaches Influence function (F. Hampel): Gâteaux-derivative towards Dirac-distr. z ( ) T (1 ε)p + ε z T (P) IF (z; T,P) = lim ε 0 ε Sensitivity curve (J.W. Tukey): measures impact of one point z SC n (z; T n )=n ( T n (z 1,...,z n 1, z) T n 1 (z 1,...,z n 1 ) ) SC n (z; T n )= T ( (1 ε n )P n 1 + ε n z ) T (Pn 1 ) ε n, ε n = 1 n Maxbias (P.J. Huber): measures worst case behavior in neighborhood maxbias(ε; T,P) = sup T (Q) T (P) Q N ε (P) N ε (P) ={Q =(1 ε)p + ε P; P is distr. on X Y, 0 ε< 1 2 } 17
18 Robustness Results Assume for classification or regression problem: loss function L is convex, continuous, L bounded H RKHS of a bounded, continuous kernel k. Then these kernel methods have good robustness properties: influence function sensitivity curve bias maxbias are bounded, sometimes even uniformly bounded, and they are able to learn the unknown distribution P. Details and Proofs: Chr & Steinwart ( 04, 05, 06) 18
19 Y : pure premium [year] 5. Cont.: motor vehicle insurance X: 8 explanatory variables: age of main user, gender of main user, car kept in garage?, driving distance, geographical information, population density, no. of years without claims, strength of engine Estimation with (KLR, ε SVR) with RBF kernel using E(Y X = x) =P(C>0 X = x) k+1 c=1 P(C = c C >0,X = x) E(Y C = c, X = x) 50% training (n=2.2e6), 25% validation (n=1.1e6), 25% test (n=1.1e6) 19
20 Estimated pure premium E(Y X=x) EUR Age of main user 20
21 Estimated pure premium E(Y X=x) EUR Males Females Age of main user 21
22 E(Y X=x) Age of main user P(C=2 C>0,X=x) Age of main user Probability EUR Probability Probability P(C>0 X=x) Age of main user P(C=3 C>0,X=x) Age of main user Probability Probability P(C=1 C>0,X=x) Age of main user P(C=4 C>0,X=x) Age of main user 22
23 6. Summary + Empirical Risk Minimization: + statistics + mathematics + computer sciences + companies (insurance, banking, IT, telecommunication) tariffs, risk scoring, data mining, CRM, CHURN + software companies (SAS/Enterprise Miner) + good predictions, very flexible + applicable also in very high dimensions + good robustness properties 23
24 References Christmann, A., Steinwart, I. (2006). Consistency of kernel based quantile regression. Submitted. Christmann, A., Steinwart, I., Hubert, M. (2006). Robust Learning from Bites for Data Mining. To appear in CSDA. Christmann, A. (2005). On a strategy to develop robust and simple tariffs from motor vehicle insurance data. Acta Mathematicae Applicatae Sinica (English Series, Springer), 21, Christmann, A., Steinwart, I. (2005) Consistency and robustness of kernel based regression. Submitted. Christmann, A. (2004). On Properties of Support Vector Machines for Pattern Recognition in Finite Samples. In: Theory and Applications of Recent Robust Methods. Eds.: M. Hubert, G. Pison, A. Struyf and S. Van Aelst. Series: Statistics for Industry and Technology, Birkhäuser, Basel. Christmann, A., Steinwart, I. (2004). On robust properties of convex risk minimization methods for pattern recognition. Journal of Machine Learning Research, 5, Schölkopf, Smola (2002) Learning with Kernels. Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press. Smyth, Jørgensen (2002). Fitting Tweedie s Compound Poisson Model to Insurance Claims Data: Dispersion Modelling. ASTIN Bulletin, 32, Vapnik (1998). Statistical Learning Theory. Wiley. homepages.vub.ac.be/ achristm 24
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
The zero-adjusted Inverse Gaussian distribution as a model for insurance claims
The zero-adjusted Inverse Gaussian distribution as a model for insurance claims Gillian Heller 1, Mikis Stasinopoulos 2 and Bob Rigby 2 1 Dept of Statistics, Macquarie University, Sydney, Australia. email:
Stochastic programming approaches to pricing in non-life insurance
Stochastic programming approaches to pricing in non-life insurance Martin Branda Charles University in Prague Department of Probability and Mathematical Statistics 11th International Conference on COMPUTATIONAL
BayesX - Software for Bayesian Inference in Structured Additive Regression
BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
A Deeper Look Inside Generalized Linear Models
A Deeper Look Inside Generalized Linear Models University of Minnesota February 3 rd, 2012 Nathan Hubbell, FCAS Agenda Property & Casualty (P&C Insurance) in one slide The Actuarial Profession Travelers
Underwriting risk control in non-life insurance via generalized linear models and stochastic programming
Underwriting risk control in non-life insurance via generalized linear models and stochastic programming 1 Introduction Martin Branda 1 Abstract. We focus on rating of non-life insurance contracts. We
Early defect identification of semiconductor processes using machine learning
STANFORD UNIVERISTY MACHINE LEARNING CS229 Early defect identification of semiconductor processes using machine learning Friday, December 16, 2011 Authors: Saul ROSA Anton VLADIMIROV Professor: Dr. Andrew
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
Introduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
Lecture 8: Gamma regression
Lecture 8: Gamma regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Models with constant coefficient of variation Gamma regression: estimation and testing
Local classification and local likelihoods
Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor
GENERALIZED LINEAR MODELS IN VEHICLE INSURANCE
ACTA UNIVERSITATIS AGRICULTURAE ET SILVICULTURAE MENDELIANAE BRUNENSIS Volume 62 41 Number 2, 2014 http://dx.doi.org/10.11118/actaun201462020383 GENERALIZED LINEAR MODELS IN VEHICLE INSURANCE Silvie Kafková
Support Vector Machines for Classification and Regression
UNIVERSITY OF SOUTHAMPTON Support Vector Machines for Classification and Regression by Steve R. Gunn Technical Report Faculty of Engineering, Science and Mathematics School of Electronics and Computer
Support Vector Machine. Tutorial. (and Statistical Learning Theory)
Support Vector Machine (and Statistical Learning Theory) Tutorial Jason Weston NEC Labs America 4 Independence Way, Princeton, USA. [email protected] 1 Support Vector Machines: history SVMs introduced
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
Introduction to Predictive Modeling Using GLMs
Introduction to Predictive Modeling Using GLMs Dan Tevet, FCAS, MAAA, Liberty Mutual Insurance Group Anand Khare, FCAS, MAAA, CPCU, Milliman 1 Antitrust Notice The Casualty Actuarial Society is committed
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
Optimization approaches to multiplicative tariff of rates estimation in non-life insurance
Optimization approaches to multiplicative tariff of rates estimation in non-life insurance Kooperativa pojišt ovna, a.s., Vienna Insurance Group & Charles University in Prague ASTIN Colloquium in The Hague
Handling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
Semi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Consistent Binary Classification with Generalized Performance Metrics
Consistent Binary Classification with Generalized Performance Metrics Nagarajan Natarajan Joint work with Oluwasanmi Koyejo, Pradeep Ravikumar and Inderjit Dhillon UT Austin Nov 4, 2014 Problem and Motivation
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
Basics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu [email protected] Modern machine learning is rooted in statistics. You will find many familiar
Introduction to nonparametric regression: Least squares vs. Nearest neighbors
Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,
Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
Introduction: Overview of Kernel Methods
Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University
WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
An Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia [email protected] Tata Institute, Pune,
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
Penalized Logistic Regression and Classification of Microarray Data
Penalized Logistic Regression and Classification of Microarray Data Milan, May 2003 Anestis Antoniadis Laboratoire IMAG-LMC University Joseph Fourier Grenoble, France Penalized Logistic Regression andclassification
Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
Review of Random Variables
Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random
Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:
A Model of Optimum Tariff in Vehicle Fleet Insurance
A Model of Optimum Tariff in Vehicle Fleet Insurance. Bouhetala and F.Belhia and R.Salmi Statistics and Probability Department Bp, 3, El-Alia, USTHB, Bab-Ezzouar, Alger Algeria. Summary: An approach about
MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
Offset Techniques for Predictive Modeling for Insurance
Offset Techniques for Predictive Modeling for Insurance Matthew Flynn, Ph.D, ISO Innovative Analytics, W. Hartford CT Jun Yan, Ph.D, Deloitte & Touche LLP, Hartford CT ABSTRACT This paper presents the
Quantile Regression under misspecification, with an application to the U.S. wage structure
Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and Fernandez-Val Reading Group Econometrics November 2, 2010 Intro: initial problem The
Automated Biosurveillance Data from England and Wales, 1991 2011
Article DOI: http://dx.doi.org/10.3201/eid1901.120493 Automated Biosurveillance Data from England and Wales, 1991 2011 Technical Appendix This online appendix provides technical details of statistical
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,
A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, and Colonic Polyps
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
Pattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University
Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision
Statistical Learning for Short-Term Photovoltaic Power Predictions
Statistical Learning for Short-Term Photovoltaic Power Predictions Björn Wolff 1, Elke Lorenz 2, Oliver Kramer 1 1 Department of Computing Science 2 Institute of Physics, Energy and Semiconductor Research
Lecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
Lecture 6: Poisson regression
Lecture 6: Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction EDA for Poisson regression Estimation and testing in Poisson regression
Studying Auto Insurance Data
Studying Auto Insurance Data Ashutosh Nandeshwar February 23, 2010 1 Introduction To study auto insurance data using traditional and non-traditional tools, I downloaded a well-studied data from http://www.statsci.org/data/general/motorins.
Heritage Provider Network Health Prize Round 3 Milestone: Team crescendo s Solution
Heritage Provider Network Health Prize Round 3 Milestone: Team crescendo s Solution Rie Johnson Tong Zhang 1 Introduction This document describes our entry nominated for the second prize of the Heritage
A Study on the Comparison of Electricity Forecasting Models: Korea and China
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
E-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce
Logistic Regression (a type of Generalized Linear Model)
Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge
For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )
Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll
Predictive Modeling Techniques in Insurance
Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics
GLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
Travelers Analytics: U of M Stats 8053 Insurance Modeling Problem
Travelers Analytics: U of M Stats 8053 Insurance Modeling Problem October 30 th, 2013 Nathan Hubbell, FCAS Shengde Liang, Ph.D. Agenda Travelers: Who Are We & How Do We Use Data? Insurance 101 Basic business
Nominal and ordinal logistic regression
Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
Simple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,
Visualizing class probability estimators
Visualizing class probability estimators Eibe Frank and Mark Hall Department of Computer Science University of Waikato Hamilton, New Zealand {eibe, mhall}@cs.waikato.ac.nz Abstract. Inducing classifiers
A.II. Kernel Estimation of Densities
A.II. Kernel Estimation of Densities Olivier Scaillet University of Geneva and Swiss Finance Institute Outline 1 Introduction 2 Issues with Empirical Averages 3 Kernel Estimator 4 Optimal Bandwidth 5 Bivariate
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
CSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
How To Build A Predictive Model In Insurance
The Do s & Don ts of Building A Predictive Model in Insurance University of Minnesota November 9 th, 2012 Nathan Hubbell, FCAS Katy Micek, Ph.D. Agenda Travelers Broad Overview Actuarial & Analytics Career
E3: PROBABILITY AND STATISTICS lecture notes
E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................
Support Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
Probabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur
Probabilistic Linear Classification: Logistic Regression Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 18, 2016 Probabilistic Machine Learning (CS772A) Probabilistic Linear Classification:
Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference
Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference Nicolas Chapados, Yoshua Bengio, Pascal Vincent, Joumana Ghosn, Charles Dugas, Ichiro Takeuchi, Linyan Meng University of
D-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
Interactive Machine Learning. Maria-Florina Balcan
Interactive Machine Learning Maria-Florina Balcan Machine Learning Image Classification Document Categorization Speech Recognition Protein Classification Branch Prediction Fraud Detection Spam Detection
Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.
Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x
Linear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
Dongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
Tree based ensemble models regularization by convex optimization
Tree based ensemble models regularization by convex optimization Bertrand Cornélusse, Pierre Geurts and Louis Wehenkel Department of Electrical Engineering and Computer Science University of Liège B-4000
2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)
2DI36 Statistics 2DI36 Part II (Chapter 7 of MR) What Have we Done so Far? Last time we introduced the concept of a dataset and seen how we can represent it in various ways But, how did this dataset came
Support Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France [email protected] Massimiliano
A three dimensional stochastic Model for Claim Reserving
A three dimensional stochastic Model for Claim Reserving Magda Schiegl Haydnstr. 6, D - 84088 Neufahrn, [email protected] and Cologne University of Applied Sciences Claudiusstr. 1, D-50678 Köln
Christfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
Sections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction
CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous
Classification with Hybrid Generative/Discriminative Models
Classification with Hybrid Generative/Discriminative Models Rajat Raina, Yirong Shen, Andrew Y. Ng Computer Science Department Stanford University Stanford, CA 94305 Andrew McCallum Department of Computer
Contents. List of Figures. List of Tables. List of Examples. Preface to Volume IV
Contents List of Figures List of Tables List of Examples Foreword Preface to Volume IV xiii xvi xxi xxv xxix IV.1 Value at Risk and Other Risk Metrics 1 IV.1.1 Introduction 1 IV.1.2 An Overview of Market
Principle of Data Reduction
Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then
