WARRANTY CLAIMS MODELLING




V. KULKARNI AND SIDNEY I. RESNICK

Abstract. A company wishes to estimate or predict its financial exposure in a reporting period of length $T$ (typically one quarter) due to warranty claims. We propose a fairly general random measure model which allows computation of the Laplace transform of the total claim made against the company in the reporting interval due to warranty claims. When specialized to a Poisson process of both sales and warranty claims, statistical estimation of relevant quantities is possible. The methodology is illustrated by analyzing automobile sales and warranty claims data from a large car manufacturer for a single car model and model year.

Much of this research took place during spring 2006 while Sid Resnick was a wandering academic on sabbatical. Grateful acknowledgement for support and hospitality go to the Department of Statistics and Operations Research, University of North Carolina, Chapel Hill and to SAMSI, Research Triangle Park, NC. Sidney Resnick's research was also partially supported by NSA grant MSPF-05G-0492 during summer 2006.

1. Introduction

A retail company needs to budget for warranty claims as part of its risk management policies. Excessive warranty claims indicate problems in manufacturing or procurement, so distributional properties of the claims are needed. Financial requirements for predicting and then reporting quarterly results make it desirable for a company to be able to predict total quarterly warranty claims.

We analyze this problem using a random measure and clustered point process approach. A sale of an item generates a random measure $D(\cdot)$ in which $D((0,t])$ represents the total warranty claim experienced by the company for the sold item in the first $t$ time units subsequent to the sale. Times of sales are treated as a point process with only simple points. Specialization of both the sale times and times of warranty claims relative to sales to non-homogeneous Poisson processes allows estimation of model parameters and, hence, estimation of the Laplace transform of the distribution of total claim exposure to the company in, say, a quarter. To guard against model risk arising from the Poisson assumptions, other tractable statistical models should be developed. See Ja et al. (2002) for earlier work on warranty costs arising out of non-stationary sales processes.

Similar models arise in other disciplines. For instance, in Internet modeling the times of user initiated connections are followed by a cluster of machine initiated connections, giving rise to cluster point processes. Fitting realistic models to data in the face of a variety of statistical structures for the clusters is an important issue. In particular, cluster characteristics may or may not be dependent on cluster size, and gaps between points in a cluster may or may not be independent.

Clearly there is a connection between the warranty claims modeling developed here and insurance claims modeling. Instead of giving references to this vast literature, we refer the readers to two excellent books on this topic: Asmussen (2000), Rolski et al. (1999). Insurance models typically do not account for non-stationary policy-issuing processes, which is what makes our models different.

Our paper is organized as follows: Section 2 outlines the general sales and claims model. Section 3 offers a theoretical formula for the total warranty claim against the company in a time period $[0,T]$. This formula is specialized to the case where both times of sales and times of warranty claims follow non-homogeneous Poisson processes. This specialization yields a sufficiently explicit formula for the Laplace transform of the distribution of total warranty claims in a reporting interval; the Laplace transform can be numerically inverted to obtain, for instance, quantiles of the distribution of total claims in a reporting interval. Subsequent sections analyze two data sets of sales dates and times and costs of warranty claims from a major car manufacturer for a single car model and model year. We show how model parameters may be estimated and use these estimates to compute the quantiles of the total warranty costs in each quarter for this manufacturer. The last section contains the summary and conclusions.

The security protocols of the manufacturing company that supplied us with data required that the sales and warranty claim data be subjected to masking, scaling and some random deletions. Thus we view the methodology developed in this paper as a demonstration of feasibility. More definitive conclusions would require working closely with a manufacturer or retailer and obtaining more complete data records.

2. The sales and claims model

This section outlines the model.

2.1. Sales. Suppose there is a non-homogeneous Poisson process of sales. The Poisson counting process
\[ \sum_j \epsilon_{S_j}(\cdot) \tag{2.1} \]
lives on $\mathbb{R}$ and has sales times $\{S_j\}$. The standard notation $\sum_j \epsilon_{S_j}$ for the counting function of the points $\{S_j\}$ is defined by
\[ \epsilon_S(I) = \begin{cases} 1, & \text{if } S \in I, \\ 0, & \text{if } S \notin I, \end{cases} \]
for an interval $I$. The mean number of sales in an interval $I$ is
\[ \mu(I) = E\Bigl(\sum_j \epsilon_{S_j}(I)\Bigr), \]
where $\mu$ is a measure on $\mathbb{R}$ which is finite on finite intervals. The Poisson assumption means that the Laplace functional is of the form (Kallenberg (1983), Neveu (1977), Resnick (2006, 1987))
\[ E\Bigl(\exp\Bigl\{-\sum_j f(S_j)\Bigr\}\Bigr) = \exp\Bigl\{-\int_{\mathbb{R}} \bigl(1 - e^{-f}\bigr)\,d\mu\Bigr\} \tag{2.2} \]
for any non-negative function $f$ defined on $\mathbb{R}$.

2.2. Warranty claims. For the $j$th sold item, we assume a general process of warranty claims: suppose there exist times $\{S_j + T_i^{(j)}, i \ge 1\}$ at which warranty claims are made. The time points $\{T_i^{(j)}\}$ are non-negative and non-decreasing in $i$. Assume there are finitely many points $\{S_j + T_i^{(j)}, i \ge 1, -\infty < j < \infty\}$ in any finite interval. There is an iid sequence of claim sizes $\{D_i^{(j)}, i \ge 1\}$, where $D_i^{(j)}$ is the $i$th warranty claim for the $j$th sold item.
For the $j$th item, there is a random measure $D_j(\cdot)$ on $[0,\infty)$ giving the total claims:
\[ D_j(\cdot) = \sum_i D_i^{(j)} \epsilon_{T_i^{(j)}}(\cdot), \tag{2.3} \]
so that $D_j(I)$ is the total warranty claim amount during time interval $I$ for the $j$th item sold. Later we will assume a standard warranty is of length $W$ and that each of the iid random measures $D_j(\cdot)$ concentrates on $[0,W]$. The random measure giving total exposure to warranty claims in a time period $I$ is
\[ C(I) = \sum_j \sum_i D_i^{(j)} \epsilon_{T_i^{(j)}+S_j}(I), \tag{2.4} \]
so that the total exposure in a reporting interval $[0,T]$ (e.g., 1 quarter or 1 year) is
\[ C([0,T]) = \sum_j \sum_i D_i^{(j)} \epsilon_{T_i^{(j)}+S_j}([0,T]) = \sum_j D_j\bigl([(-S_j)^+,\, T - S_j]\bigr), \]
the aggregate across items sold of claims incurred in $[0,T]$.
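To make the construction concrete, the following is a minimal simulation sketch of $C([0,T])$; it is not the authors' code. Purely for illustration it assumes homogeneous Poisson sales (rate `theta`), homogeneous Poisson claim times over a warranty of length `W` (rate `lam`), and exponential claim sizes with mean `mean_claim`. All function and parameter names are hypothetical. Under these assumptions the mean of $C([0,T])$ equals `theta * lam * T * W * mean_claim`, which provides a simple sanity check on the simulation.

```python
import random


def poisson_points(rate, start, end, rng):
    """Return the points of a homogeneous Poisson process on [start, end]."""
    points = []
    t = start + rng.expovariate(rate)
    while t <= end:
        points.append(t)
        t += rng.expovariate(rate)
    return points


def simulate_total_claims(theta, lam, W, T, mean_claim, rng):
    """Draw one realization of C([0, T]) under the illustrative assumptions."""
    total = 0.0
    # Sales before time -W cannot generate claims in [0, T], so start there.
    for s in poisson_points(theta, -W, T, rng):
        # Claim times relative to the sale live on [0, W].
        for t in poisson_points(lam, 0.0, W, rng):
            if 0.0 <= s + t <= T:
                # Assumed claim size law: exponential with the given mean.
                total += rng.expovariate(1.0 / mean_claim)
    return total


def mean_total_claims(theta, lam, W, T, mean_claim, n_rep=2000, seed=0):
    """Monte Carlo estimate of E C([0, T]) from n_rep independent replications."""
    rng = random.Random(seed)
    draws = [
        simulate_total_claims(theta, lam, W, T, mean_claim, rng)
        for _ in range(n_rep)
    ]
    return sum(draws) / n_rep
```

For example, `mean_total_claims(5.0, 0.5, 3.0, 0.25, 2.0)` should land near `5.0 * 0.5 * 0.25 * 3.0 * 2.0 = 3.75`, up to Monte Carlo error.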

3. The distribution of total claims

3.1. General warranty claims process. Suppose $f : [0,\infty) \to [0,\infty)$ is a test function; later it will be the indicator of an interval. The random measures $\{D_j\}$ given in (2.3) are iid. Define the Laplace functional of the random measure $D_1(\cdot)$ as
\[ \psi_D(f) = E e^{-D_1(f)}. \tag{3.1} \]
As $f$ varies, this determines the distribution of the random measure $D_1(\cdot)$. The Laplace functional of the total warranty claims random measure is
\begin{align*}
E e^{-C(f)} &= E e^{-\int_0^\infty f(s)\,C(ds)} = E e^{-\sum_j \int_{s \ge S_j} f(s)\,D_j(ds - S_j)} \\
&= E e^{-\sum_j \int_0^\infty f(y+S_j)\,D_j(dy)}.
\end{align*}
Now we condition on $\{S_j\}$. Let the conditional expectation with respect to the $\sigma$-field generated by $\{S_j\}$ be $E^{\{S_j\}}$. We get the above equal to
\[ = E\Bigl(E^{\{S_j\}}\Bigl(e^{-\sum_j \int_0^\infty f(y+S_j)\,D_j(dy)}\Bigr)\Bigr) = E\Bigl(\prod_j E^{\{S_j\}}\Bigl(e^{-\int_0^\infty f(y+S_j)\,D_j(dy)}\Bigr)\Bigr). \]
Writing $f_s(\cdot) = f(\cdot + s)$ and using (3.1), we recognize the foregoing as
\[ = E\Bigl(\prod_j \psi_D(f_{S_j})\Bigr) = E e^{-\sum_j \bigl(-\log \psi_D(f_{S_j})\bigr)}. \]
This being a function of Poisson points allows us to use the form in (2.2) and to write the above equal to
\begin{align*}
&= \exp\Bigl\{-\int_{\mathbb{R}} \Bigl(1 - e^{-(-\log \psi_D(f_s))}\Bigr)\mu(ds)\Bigr\} = \exp\Bigl\{-\int_{\mathbb{R}} \bigl(1 - \psi_D(f_s)\bigr)\mu(ds)\Bigr\} \\
&= \exp\Bigl\{-\int_{\mathbb{R}} \bigl(1 - \psi_D(f(\cdot + s))\bigr)\mu(ds)\Bigr\}.
\end{align*}
We summarize the above discussion in the following proposition.

Proposition 1. Suppose $\{S_j\}$ are Poisson points with mean measure $\mu$ given by (2.1) and Laplace functional (2.2). Suppose $\{D_j(\cdot)\}$ are iid random measures representing cumulative warranty claim amounts as given by (2.3) with distribution determined by the Laplace functional (3.1). Then the total warranty claim random measure $C(\cdot)$ defined in (2.4) has Laplace functional
\[ \psi_C(f) := E e^{-C(f)} = \exp\Bigl\{-\int_{\mathbb{R}} \bigl(1 - \psi_D(f(\cdot + s))\bigr)\mu(ds)\Bigr\}, \tag{3.2} \]
for $f \ge 0$.

3.2. Specialization. Setting $f = \zeta 1_{[0,T]}$ for $\zeta > 0$ transforms $\psi_C(f) = E(e^{-\zeta C[0,T]})$, $\zeta > 0$, into the Laplace transform of the random variable $C[0,T]$. We now see how explicit formula (3.2) can be made in certain special cases. We assume the warranty claim sizes are iid random variables $\{D_i^{(j)}, i \ge 1, j \ge 1\}$ and suppose
\[ P[D_1^{(1)} \le x], \quad x \ge 0, \tag{3.3} \]
is the distribution with Laplace transform $\hat D(\zeta) = E(e^{-\zeta D_1^{(1)}})$, $\zeta > 0$.
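Then the Laplace functional of the random measure $D_1(\cdot)$ given in (3.1) can be expressed as
\[ \psi_D(f) = E\bigl(e^{-D_1(f)}\bigr) = E\Bigl(e^{-\sum_i D_i^{(1)} f(T_i^{(1)})}\Bigr) \]
and conditioning on $\{T_i^{(1)}\}$ this is
\[ = E\Bigl(E^{\{T_i^{(1)}\}}\Bigl(e^{-\sum_i D_i^{(1)} f(T_i^{(1)})}\Bigr)\Bigr) = E\Bigl(\prod_i \hat D\bigl(f(T_i^{(1)})\bigr)\Bigr) = E\Bigl(e^{-\sum_i \bigl(-\log \hat D(f(T_i^{(1)}))\bigr)}\Bigr), \]
that is,
\[ \psi_D(f) = \psi_T(-\log \hat D \circ f), \tag{3.4} \]
where
\[ \psi_T(g) = E\Bigl(e^{-\sum_i g(T_i^{(1)})}\Bigr), \quad g \ge 0, \]
is the Laplace functional of the point process with points $\{T_i^{(1)}\}$. Thus the Laplace functional of the random measure $D_1(\cdot)$ can be represented in terms of the Laplace functional of $\{T_i^{(1)}\}$ and the Laplace transform of the distribution $D(x)$.

Next we suppose $\{T_i^{(1)}\}$ are the points of a non-homogeneous Poisson process on $[0,\infty)$ and, slightly more generally, we suppose these are the points of a Poisson random measure with mean measure $\nu$, a measure giving finite mass to bounded sets. We assume the warranty duration is a fixed number $W > 0$, longer than the length of the financial reporting interval $T$, and so we assume $\nu$ concentrates on $[0,W]$, which amounts to considering the point process of claims to be $\sum_i \epsilon_{T_i^{(1)}}(\cdot \cap [0,W])$. (Variants where $W$ is random with small variance about the mean could also be considered but we leave this for elsewhere.) Then for $g \ge 0$, the Poisson assumption means the Laplace functional of $\sum_i \epsilon_{T_i^{(1)}}(\cdot \cap [0,W])$ is
\[ \psi_T(g) = e^{-\int_0^W (1 - e^{-g})\,d\nu}. \tag{3.5} \]
See Kallenberg (1983), Neveu (1977), Resnick (2006, 1987). With this Poisson assumption we have from (3.4),
\begin{align*}
\psi_D(f) &= \psi_T(-\log \hat D \circ f) = \exp\Bigl\{-\int_0^W \Bigl(1 - e^{-(-\log \hat D(f(s)))}\Bigr)\nu(ds)\Bigr\} \\
&= e^{-\int_0^W \bigl(1 - \hat D(f(s))\bigr)\nu(ds)}. \tag{3.6}
\end{align*}
So assuming $\{T_i^{(1)}\}$ are Poisson points as well as $\{S_j\}$ being Poisson, and combining (3.2), (3.4) and (3.6), we get
\begin{align*}
\psi_C(f) &= \exp\Bigl\{-\int_{\mathbb{R}} \bigl(1 - \psi_D(f(s + \cdot))\bigr)\mu(ds)\Bigr\} \\
&= \exp\Bigl\{-\int_{\mathbb{R}} \Bigl(1 - e^{-\int_0^W \bigl(1 - \hat D(f(s+u))\bigr)\nu(du)}\Bigr)\mu(ds)\Bigr\}. \tag{3.7}
\end{align*}
To get the Laplace transform of $C[0,T]$, we set $f = \zeta 1_{[0,T]}$ for $\zeta > 0$ and get
\begin{align*}
E e^{-\zeta C[0,T]} &= \exp\Bigl\{-\int_{\mathbb{R}} \Bigl(1 - e^{-\int_0^W \bigl(1 - \hat D(\zeta 1_{[0,T]}(s+u))\bigr)\nu(du)}\Bigr)\mu(ds)\Bigr\} \\
&= \exp\Bigl\{-\int_{\mathbb{R}} \Bigl(1 - e^{-(1 - \hat D(\zeta)) \int_0^W 1_{[0,T]}(s+u)\,\nu(du)}\Bigr)\mu(ds)\Bigr\}.
\end{align*}
To analyze this further, suppose $W > T$, that is, the warranty period is larger than the reporting period. Sales made after time $T$ cannot produce claims in $[0,T]$, so the integrand vanishes for $s > T$ and we decompose the integral over $\mathbb{R}$ as
\[ \int_{\mathbb{R}} = \int_{s=0}^{T} + \int_{s<0} = A + B. \]
We find the term $A$ is a standard convolution
\[ A = \int_{s=0}^{T} \Bigl(1 - e^{-(1 - \hat D(\zeta))\,\nu([0,\,T-s])}\Bigr)\mu(ds). \tag{3.8} \]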
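The expression for $E e^{-\zeta C[0,T]}$ obtained by setting $f = \zeta 1_{[0,T]}$ can also be evaluated directly by quadrature, which is a useful check on any closed form derived from it. The sketch below is an illustration, not the authors' implementation. It assumes homogeneous intensities $\nu(du) = \lambda\,du$ on $[0,W]$ and $\mu(ds) = \theta\,ds$, and exponential claim sizes with mean `m`, for which $\hat D(\zeta) = 1/(1 + m\zeta)$. The function names and the midpoint rule are choices made here.

```python
import math


def overlap_length(s, W, T):
    """Lebesgue measure of {u in [0, W] : 0 <= s + u <= T}."""
    return max(0.0, min(W, T - s) - max(0.0, -s))


def laplace_total_claims(zeta, theta, lam, W, T, d_hat, n=20000):
    """Midpoint-rule evaluation of E exp(-zeta * C[0, T]).

    Only sales in [-W, T] can contribute, so the integral over the real
    line reduces to that interval.
    """
    c = (1.0 - d_hat(zeta)) * lam
    h = (T + W) / n
    integral = 0.0
    for k in range(n):
        s = -W + (k + 0.5) * h
        # With nu(du) = lam du, the inner integral is lam * overlap_length.
        integral += 1.0 - math.exp(-c * overlap_length(s, W, T))
    return math.exp(-theta * integral * h)


def exponential_claim_transform(m):
    """Laplace transform of an exponential claim size with mean m (assumed law)."""
    return lambda zeta: 1.0 / (1.0 + m * zeta)
```

A call such as `laplace_total_claims(0.3, 5.0, 0.5, 3.0, 0.25, exponential_claim_transform(2.0))` returns a value in $(0,1)$, and the function returns 1 at `zeta = 0`, as a Laplace transform must.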

For $B$, we get that $s < 0$ and $|s| < u \le T + |s|$ and also $u \le W$, and therefore
\begin{align*}
B &= \int_{s<0} \Bigl(1 - e^{-(1 - \hat D(\zeta)) \int_0^W 1_{[0,T]}(s+u)\,\nu(du)}\Bigr)\mu(ds) \\
&= \int_{s<0} \Bigl(1 - e^{-(1 - \hat D(\zeta))\,\nu(|s|,\,(T+|s|)\wedge W]}\Bigr)\mu(ds) \\
&= \int_{\substack{s<0 \\ 0<|s|<W-T}} \Bigl(1 - e^{-(1 - \hat D(\zeta))\,\nu(|s|,\,T+|s|]}\Bigr)\mu(ds) + \int_{\substack{s<0 \\ W-T<|s|<W}} \Bigl(1 - e^{-(1 - \hat D(\zeta))\,\nu(|s|,\,W]}\Bigr)\mu(ds). \tag{3.9}
\end{align*}
To summarize, we find that when both sales and claims follow non-homogeneous Poisson processes, we get an explicit formula for the Laplace transform of total warranty claims in a reporting period:
\[ E e^{-\zeta C[0,T]} = e^{-(A+B)}, \tag{3.10} \]
where $A$ is given by (3.8) and $B$ is given by (3.9).

If we specialize further and let $\sum_i \epsilon_{T_i^{(1)}}$ be homogeneous Poisson with $\nu(dt) = \lambda\,dt$ on $[0,W]$ and $\sum_j \epsilon_{S_j}$ be homogeneous Poisson with $\mu(ds) = \theta\,ds$, so that the rates are $\lambda > 0$ and $\theta > 0$, we get
\begin{align*}
A &= \int_{s=0}^{T} \Bigl(1 - e^{-(1 - \hat D(\zeta))\lambda(T-s)}\Bigr)\theta\,ds \\
&= \theta T - \frac{\theta\bigl(1 - e^{-(1 - \hat D(\zeta))\lambda T}\bigr)}{(1 - \hat D(\zeta))\lambda}.
\end{align*}
For $B$ we get, after transforming $s \mapsto -s$,
\begin{align*}
B &= \int_{s=0}^{W-T} \Bigl(1 - e^{-(1 - \hat D(\zeta))\lambda T}\Bigr)\theta\,ds + \int_{s=W-T}^{W} \Bigl(1 - e^{-(1 - \hat D(\zeta))\lambda(W-s)}\Bigr)\theta\,ds \\
&= \theta(W-T)\bigl(1 - e^{-(1 - \hat D(\zeta))\lambda T}\bigr) + \int_0^T \Bigl(1 - e^{-(1 - \hat D(\zeta))\lambda s}\Bigr)\theta\,ds \\
&= \theta(W-T)\bigl(1 - e^{-(1 - \hat D(\zeta))\lambda T}\bigr) + \theta T - \frac{\theta\bigl(1 - e^{-(1 - \hat D(\zeta))\lambda T}\bigr)}{\lambda(1 - \hat D(\zeta))}.
\end{align*}

4. Analysis of the claim size data

In an attempt to analyze the distribution of claim sizes $\{D_i^{(1)}\}$ and also to understand the process of sales $\{S_j\}$, we obtained and analyzed a year's worth of claims and sales data for a specific car model from a major car manufacturer. The claims data for model year 2000 comprises a spreadsheet called cost2000 of length 73,167. The analysis of the fourth column, cost, representing the cost to the company of each warranty claim, allows us to construct an estimate of the claim cost distribution $P[D_1^{(1)} \le x]$ defined in (3.3). The usual methods from the statistics of extremes and quantitative risk management construct a distribution model using the empirical distribution for the center of the distribution and a generalized Pareto distribution for the part of the distribution above a threshold. See Coles (2001), Embrechts et al. (1997), McNeil et al. (2005).

4.1. Initial analysis of warranty claim costs.
The time series plot for cost sizes shows a spiky structure without apparent trends. The autocorrelation plot taken out to 30 lags shows little dependence, though there may be dependence at very small lags of 1 or 2 which may result from multiple claims for the same vehicle on the same day. Initial analysis of the cost data, summarized in Table 1, shows that the data has a wide range. Twenty-four values in the data set are equal to 0. The units are unknown, resulting from masking and scaling of the data by the manufacturer. Presumably there is a rational choice of upper bound based on the total value of a car. Despite the upper bound, it is still reasonable to model the cost data with a heavy tailed distribution based on the frequency of extreme values and the fact that there is a mix of vehicles with differing total replacement value. An alternative which could be explored would be to fit a truncated Pareto distribution to threshold exceedances. This would produce similar statistical conclusions.

Figure 1. Time series plot of claims data (left) and autocorrelation plot (right) showing little dependence.

Min.     1st Qu.   Median    Mean      3rd Qu.   Max.
0.000    6.573     11.650    35.610    31.830    2844.000

Table 1. Summary statistics for the data.

4.2. Heavy tail analysis. As a first step we made a QQ plot of the log transformed data against exponential quantiles. The upper 10,000 largest values follow a line whose slope indicates a value of $\alpha = 1.51$ for the model
\[ P[\text{Cost} > x] \sim x^{-\alpha}, \quad \text{for large } x. \]
However, examination of the plot indicates that the 55 largest upper order statistics follow a line with slope which estimates $\alpha = 16.376$. Considering that 55 is a small percentage of 73,167 or 10,000, it is reasonable to continue with the heavy tail analysis. Alternatively, one could consider a mixture model of two Pareto distributions but this may result in overfitting.

Figure 2. QQ plots for the log transformed data. The left plot (panel title "LS fit, k=55; alpha=17.24") considers only the 55 upper order statistics while the right plot (panel title "LS fit, k=10,000; alpha=1.51") uses 10,000 upper order statistics.

As an alternative exploratory procedure we made a Hill plot (Csörgő et al. (1985), Hill (1975), Mason and Turova (1994), Mason (1982), Resnick (2006, 1997)) of the cost data, which plots the Hill estimator $H_{k,n}^{-1}$ of $\alpha$ as a function of $k$, the number of upper order statistics used in the estimation; the parameter $n$ is the sample size. The plots in Figure 3 are quite stable. The altHill plot (Resnick and Stărică (1997), Resnick (2006)) replaces $k$ by $[n^\theta]$ for $0.4 \le \theta \le 1$.

Figure 3. Hill plots for cost2000. The top plot plots estimates of $\alpha$ based on $k$, the number of upper order statistics used in the estimation, while the bottom plot is in alt scale and replaces $k$ by $[n^\theta]$; $0.4 \le \theta \le 1$ and $n$ is the sample size.

Next we fit a generalized Pareto distribution to the tail (see Coles (2001), Embrechts et al. (1997), McNeil et al. (2005)) following the standard peaks over threshold philosophy described as follows. For a threshold $u$ and $x > u$ write
\[ P[\text{Cost} > x] = P[\text{Cost} > x \mid \text{Cost} > u]\,P[\text{Cost} > u] = \bar F(u)\,\bar F^{[u]}(x), \]
where $\bar F(x) = P[\text{Cost} > x]$ is the tail of the claim size distribution and $\bar F^{[u]}(x)$ is the conditional tail of claim cost given the claim is bigger than $u$; this is also the exceedance distribution. The exceedance distribution is approximated by the generalized Pareto distribution, effectively allowing extrapolation beyond the range of the data, yielding
\[ \bar F(u)\Bigl(1 + \frac{1}{\alpha\beta}(x - u)\Bigr)^{-\alpha}, \tag{4.1} \]
for a scale parameter $\beta > 0$ and the Pareto shape parameter $\alpha > 0$. This means our estimate of the claim size distribution is
\[ \hat P[\text{Cost} > x] = \begin{cases} \bar F(x), & \text{if } x \le u, \\ \bar F(u)\Bigl(1 + \frac{1}{\hat\alpha\hat\beta}(x - u)\Bigr)^{-\hat\alpha}, & \text{if } x > u. \end{cases} \]
For many purposes, an adequate estimate, $\bar F(x)$, for $x \le u$ is provided by the empirical distribution (# observations $> x$)/(sample size); however, relying on the empirical distribution may present storage problems when doing numerical Laplace transform inversion.

For the claim size data, we used the 10,000 upper order statistics, which corresponds to a threshold of $u = 60.262$, with 86% of the observations less than this threshold. We obtained by the method of maximum likelihood, using the R-module EVIR, that $\hat\alpha = 1.54$ (as opposed to the QQ estimate of 1.51) and $\hat\beta = 41.4537$. The fitted excess distribution overlaid with the empirical distribution of excesses of the threshold is shown in Figure 4.

Figure 4. GPD fitted distribution of excesses compared with empirical distribution of excesses relative to the threshold 60.262.

We computed some standard risk measures including quantiles of the fitted distribution and the expected shortfall (sfall) for each quantile; that is, the expected cost given that the cost is greater than the quantile level. For example, from Table 2 we see that $P[\text{Cost} > 343.952] = 0.01$ and the expected excess over 343.952 is $980.482 - 343.952 = 636.530$.

For $\bar F(x)$, $x < u$, in some circumstances it is preferable to fit a parametric family rather than use the empirical distribution. We call the subset of the claims data corresponding to claim sizes less than the threshold $u = 62.262$ claims2000small. We fit a Gamma density
\[ f_{a,s}(x) = \frac{1}{\Gamma(a)\,s^a}\,x^{a-1} e^{-x/s}, \quad x > 0, \]
with shape parameter $a$ and scale parameter $s$. Using maximum likelihood via the R-function fitdistr we obtained the estimates
\[ \widehat{\text{shape}} = \hat a = 1.25000, \quad \widehat{\text{scale}} = \hat s = 11.846. \]
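Returning to the fitted tail (4.1): because the generalized Pareto form can be inverted in closed form, tail quantiles follow directly from the estimates. The sketch below is a hedged illustration and not the EVIR routine the authors used. The function names are hypothetical, and the tail weight $\bar F(u)$ is taken to be 10,000/73,167, which is an assumption based on the counts quoted in the text.

```python
def gpd_tail_survival(x, u, tail_prob_u, alpha, beta):
    """Estimate P[Cost > x] for x > u from the generalized Pareto form (4.1)."""
    if x <= u:
        raise ValueError("the tail formula applies only above the threshold u")
    return tail_prob_u * (1.0 + (x - u) / (alpha * beta)) ** (-alpha)


def gpd_tail_quantile(p, u, tail_prob_u, alpha, beta):
    """Invert (4.1): return x > u such that P[Cost > x] = 1 - p."""
    if not 0.0 < 1.0 - p < tail_prob_u:
        raise ValueError("p must correspond to a level above the threshold u")
    return u + alpha * beta * ((tail_prob_u / (1.0 - p)) ** (1.0 / alpha) - 1.0)


# Estimates quoted in the text; tail_prob_u = 10000 / 73167 is an assumption.
U, ALPHA_HAT, BETA_HAT = 60.262, 1.54, 41.4537
TAIL_PROB_U = 10000 / 73167

q99 = gpd_tail_quantile(0.99, U, TAIL_PROB_U, ALPHA_HAT, BETA_HAT)
```

With these rounded inputs, `q99` comes out within roughly one percent of the 0.99 quantile reported in Table 2; the small gap is plausibly due to rounding of the published estimates.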

p        quantile    sfall
0.800      46.276     138.001
0.900      74.619     218.218
0.950     118.993     343.804
0.990     343.952     980.482
0.999    1537.819    4359.364

Table 2. Quantiles of the fitted excess distribution and expected shortfalls.

This density assigns probability 0.9909 to values in $[0,u]$. Thus
\[ P[\text{Cost} \le v \mid \text{Cost} \le u] = \frac{\hat F(v)}{\hat F(u)} = \int_0^v f_{\hat a,\hat s}(y)\,dy, \quad 0 \le v \le u, \]
and therefore, on $[0,u]$ we estimate $F(v)$ as
\[ \hat F(u) \int_0^v f_{\hat a,\hat s}(y)\,dy, \]
where recall $\hat F(u) = 0.862$, the percentage of observations below the threshold $u$. The histogram of the pre-threshold data claims2000small and the fitted Gamma density are displayed in Figure 5.

Figure 5. Histogram of the claim sizes data below threshold with the Gamma fitted density superimposed.

5. Analysis of sales data

Having some understanding of the nature of warranty claims, we now turn to studying the sales process.

5.1. Initial analysis. The sales data for model year 2000 of a single car model consists of 34,807 records stretching over 1116 days or 3.05 years, with an earliest date of 08/20/1999 and a latest date of 09/9/2002. Sales counts per day are given in Figure 6.

Figure 6. Sales per day for the 2000 model year stretching over 1116 days.

5.2. The Bass sales rate model. Bass (1969) proposed the following parametric model for sales of consumer goods after their market introduction. Let $\mu([0,t])$ be the cumulative sales of a particular model in $t$ time units since the model's introduction to the market. The sales rate model is then described by three parameters $a$, $b$ and $M$ as follows:
\[ \frac{d\mu([0,t])}{dt} = a\bigl(\mu([0,t]) + b\bigr)\bigl(M - \mu([0,t])\bigr), \]
with initial condition $\mu(\{0\}) = 0$. The solution is:
\[ \mu([0,t]) = M\,\frac{b\bigl(1 - \exp\{-a(M+b)t\}\bigr)}{b + M\exp\{-a(M+b)t\}}. \tag{5.1} \]
This model is designed so that the sales rate initially increases and then decreases, and the total sales $\mu([0,t])$ approaches $M$ as $t$ approaches infinity. We can estimate $a$, $b$, $M$ from sales data. For our purposes, a more convenient parameterization results from setting
\[ a = \frac{A}{M}, \quad ab = B, \quad A + B = C, \]
and $M$, as before, is the total sales of the item. This gives
\[ \mu([0,t]) = M\,\frac{1 - \exp\{-Ct\}}{1 + \bigl(\frac{C}{B} - 1\bigr)\exp\{-Ct\}}. \tag{5.2} \]
Examples of different Bass sales rate curves are given in Figure 7. Our intent is to use $\mu([0,t])$ as the mean measure of the Poisson process of sales.

Comparing Figures 6 and 7, we see that there is excess variability in Figure 6 that is not captured by the smooth Bass sales rate plots, so smoothing the data might be considered to get a better fit. Without smoothing the data, we perform maximum likelihood estimation of $(B, C)$ using Poisson counts per day as the data. We use the R-function nlm and find estimated values of $(\hat B, \hat C) = (0.00041, 0.0163)$. The plot giving observed daily counts vs fitted expected counts is in Figure 8. The fitted model does not capture all the variability, nor does it fit the big observed counts particularly well.
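The mean function (5.2) is simple to evaluate, and its increments give the expected sales per day that the Poisson model compares with observed counts. The following sketch is illustrative only. The function names are hypothetical, and fixing $M$ at the 34,807 recorded sales is an assumption made here, since the text reports estimates of $(B, C)$ only.

```python
import math


def bass_cumulative(t, M, B, C):
    """Expected cumulative sales mu([0, t]) in the (B, C, M) form (5.2)."""
    e = math.exp(-C * t)
    return M * (1.0 - e) / (1.0 + (C / B - 1.0) * e)


def bass_expected_counts(n_periods, period_length, M, B, C):
    """Expected sales in consecutive periods, i.e. increments of mu([0, t])."""
    cum = [bass_cumulative(k * period_length, M, B, C) for k in range(n_periods + 1)]
    return [cum[k + 1] - cum[k] for k in range(n_periods)]


# Daily expected counts over the 1116-day record, with M assumed to be 34,807.
daily = bass_expected_counts(1116, 1.0, 34807.0, 0.00041, 0.0163)
peak_day = max(range(len(daily)), key=daily.__getitem__)
```

With the estimates quoted above, the expected daily count rises to a single peak somewhere between days 220 and 230 and then decays, matching the rise-then-fall shape the model is designed to produce.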

Figure 7. Different sales rate curves. The thin curve achieving the biggest maximum value corresponds to $(B, C) = (0.0004, 0.02)$; the middle, heavy curve is for $(B, C) = (0.0004, 0.01)$; the lowest curve is for $(B, C) = (0.0004, 0.008)$.

Figure 8. The spiky graph is the observed daily sales counts while the smooth curve gives the fitted Bass model.

A better visual fit to the Bass model comes from smoothing the data by aggregation into 12 day bins and considering sales per 12 day periods as the data. This yields very similar estimates of the parameters
\[ (\hat B, \hat C) = (0.00039, 0.0165) \tag{5.3} \]
and the observed 12 day sales counts vs expected plot looks better, as shown in Figure 9. This increases our confidence that the parameter estimates of $(B, C)$ are very reasonable.
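One way to set up the binned fit is to treat the count in each 12 day bin as Poisson with mean equal to the increment of the Bass mean function over that bin, and to minimize the negative log-likelihood over $(B, C)$. The sketch below only defines that objective. It is an assumption about how such a fit could be organized, not a transcription of the authors' R code: the names are hypothetical and $M$ is held fixed by the caller.

```python
import math


def bass_cumulative(t, M, B, C):
    """Expected cumulative sales mu([0, t]) in the (B, C, M) parameterization."""
    e = math.exp(-C * t)
    return M * (1.0 - e) / (1.0 + (C / B - 1.0) * e)


def bass_bin_means(n_bins, bin_length, M, B, C):
    """Expected sales in each of n_bins consecutive bins of equal length."""
    cum = [bass_cumulative(k * bin_length, M, B, C) for k in range(n_bins + 1)]
    return [cum[k + 1] - cum[k] for k in range(n_bins)]


def poisson_neg_loglik(params, counts, bin_length, M):
    """Negative Poisson log-likelihood of binned counts.

    The log(n!) terms are dropped because they do not depend on (B, C).
    """
    B, C = params
    if B <= 0.0 or C <= 0.0:
        return math.inf  # outside the parameter region considered here
    means = bass_bin_means(len(counts), bin_length, M, B, C)
    return sum(m - n * math.log(m) for m, n in zip(means, counts))
```

Passing `poisson_neg_loglik` to any general-purpose numerical minimizer, with the observed 12 day counts and a starting value for $(B, C)$, would then give the maximum likelihood estimates under these assumptions.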

Figure 9. The continuous graph is the observed 12 day sales counts while the circles connected by lines give the fitted Bass model.

6. Analysis of warranty claim data

This section analyzes warranty claim dates. We took the warranty claim times of each vehicle and subtracted the time of sale to create times relative to zero for each car. The unit of measurement is one day. Each of these records was treated as a realization of a non-homogeneous Poisson process. Interestingly, this resulted in times which were negative (warranty claims presumably made by the dealer prior to sale) and times which exceeded the 3 year window. Owing to masking of the data by the manufacturer, not all sales dates could be matched with claims, and subsets of sales and claims were selected corresponding to the same vehicle identification number.

6.1. Initial analysis. The amalgamation of all the warranty claim times relative to the vehicle purchase time produced a data set of length 65,351. The summary statistics for this data set follow in Table 3. Note the minimum is negative and the maximum exceeds 3 years or 1095 days.

Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
-858.0    127.0     405.0     440.9     726.0     1459.0

Table 3. Summary statistics for the claim time data.

6.2. Warranty claim counts. The warranty claim times yield counts per day. The histogram of count frequencies for all the data is in the left hand graph of Figure 10, which exhibits daily counts in the range (-900, 1500). The right graph is a simplified histogram of count frequencies in which negative values are lumped to 0 and values exceeding 1095 days are lumped with 1096.

6.3. Model fit. The linear appearance of the warranty claims counts in the region (0, 1095) exhibited in the left histogram of Figure 10 suggests estimating the measure $\nu$ appearing in (3.5) by
\[ \nu(\{0\}) = w_1, \quad \nu(\{1096\}) = w_2, \quad \nu((i-1, i]) = m\,i + b, \quad 1 \le i \le 1095, \]
where $\nu(\{1096\})$ results from all counts in the region (1095, 1500].
Figure 11 gives the claim counts for days 1 through 1095 with a least squares fitted line yielding estimates of the intercept and slope (75.69, -0.042).

Figure 10. Histogram (left) of daily count frequencies in the range (-900, 1500) and (right) histogram of daily counts in which negative values are counted as 0 and values exceeding 3 years are counted as 1095.

Keeping in mind that the data has been amalgamated over the 34,807 sales of cars in the data record provided, we divide these numbers by 34,807 and get estimates of $\nu(\cdot)$ to be
\[ \hat\nu(\{0\}) = 0.1663, \quad \hat\nu(\{1096\}) = 0.0612, \quad \hat\nu((i-1, i]) = (-1.218\mathrm{e}{-06})\,i + 0.002174, \quad 1 \le i \le 1095. \tag{6.1} \]

Figure 11. Claim counts for days 1 through 1095 and the least squares line.
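A quick consistency check on the estimate (6.1) is that its total mass should be close to the average number of claims per car, 65,351/34,807, since the estimate was obtained by dividing claim counts by the number of sales. The sketch below assembles the estimated measure and computes that total. The names are hypothetical, and the slope is taken to be negative, which is the sign under which the totals agree.

```python
NU_HAT_ATOM_0 = 0.1663     # claims at or before the sale date, lumped at day 0
NU_HAT_ATOM_1096 = 0.0612  # claims after day 1095, lumped at day 1096
SLOPE_PER_CAR = -1.218e-06
INTERCEPT_PER_CAR = 0.002174


def nu_hat_daily(i):
    """Estimated mean number of claims per car on day i, for 1 <= i <= 1095."""
    if not 1 <= i <= 1095:
        raise ValueError("the linear part is defined for days 1 through 1095")
    return SLOPE_PER_CAR * i + INTERCEPT_PER_CAR


def nu_hat_total():
    """Total mass of the estimated measure: expected claims per car overall."""
    linear_part = sum(nu_hat_daily(i) for i in range(1, 1096))
    return NU_HAT_ATOM_0 + NU_HAT_ATOM_1096 + linear_part


claims_per_car = 65351 / 34807
```

Here `nu_hat_total()` and `claims_per_car` both come out close to 1.88, so the three pieces of the estimate account for essentially all recorded claims.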

7. Total Warranty Costs

We assume that the sales start at time 0 and continue according to a non-homogeneous Poisson process with rate function given in Equation (5.2), with parameters given in Equation (5.3). The warranty period is taken to be $W = 3$ years, and we compute the distribution of the total warranty costs for 12 quarters, the $i$-th quarter being the interval $[(i-1)T, iT)$, with $T = 1/4$ year. The warranty claims from a single sale are assumed to arise according to a non-homogeneous Poisson process with mean measure given by the estimate in Equation (6.1), appropriately converted to years. Finally, the distribution of individual claim size is assumed to be a combination of Gamma and generalized Pareto as described in Section 4.2. Using these values in the analysis as presented in Section 3.2, we compute the Laplace transform of the total warranty costs over different quarters. Then we use the numerical Laplace transform inversion program invlap (Hollenbeck (1998)) written in Matlab to compute the distribution of the total cost.

Table 4 gives the mean, median and various quantiles of the costs for the 12 quarters. It shows that the warranty costs increase in the beginning and then decrease, as expected. There is a slight increase in the final quarter. This is due to the increase in the warranty claims at the end of the warranty period.

Quarter    Mean    Median    75%      90%      95%      99%
1         23081    23036    24237    25355    26040    27358
2         63126    63081    65056    66868    67970    70069
3         83200    83155    85418    87491    88749    91139
4         46902    46856    48560    50130    51086    52912
5         14857    14811    15778    16685    17244    18325
6          4069     4024     4537     5036     5350     5940
7          1384     1338     1642     1953     2155     2568
8           743      697      922     1162     1322     1655
9           570      523      720      937     1082     1387
10          501      454      638      845      985     1277
11          454      408      583      783      919     1203
12         1236     1320     1621     1928     2128     2537

Table 4. Total claim quantiles for the 12 quarters.

8. Summary and Conclusions

In this paper we have developed stochastic and statistical models to compute the warranty costs for a specified period, a quarter year in our case, arising out of a non-stationary sales process of the items under warranty. We have used the Bass model to estimate the sales rate as a function of time and then used this as the rate function of a non-homogeneous Poisson process representation of the sales process. We then developed a non-homogeneous Poisson process to model the claims process arising out of a single item during the warranty period. We also modeled the claim sizes as i.i.d. random variables with a combination of Gamma and generalized Pareto distributions to account for the heavy tails manifest in the data. Finally we combined all these submodels to compute the quantiles of the warranty costs for the 12 quarters. We believe that this methodology will help manufacturers with their financial planning, in that they can more accurately account for and predict warranty liabilities.

References

S. Asmussen. Ruin Probabilities. World Scientific, Singapore, 2000.
F.M. Bass. A new product growth for model consumer durables. Management Science, 15:215-227, 1969.
S.G. Coles. An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. London: Springer. xv, 210 p., 2001.

S. Csörgő, P. Deheuvels, and D. Mason. Kernel estimates for the tail index of a distribution. Ann. Statist., 13:1050-1077, 1985.
P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extreme Events for Insurance and Finance. Springer-Verlag, Berlin, 1997.
B.M. Hill. A simple general approach to inference about the tail of a distribution. Ann. Statist., 3:1163-1174, 1975.
K.J. Hollenbeck. INVLAP.M: A matlab function for numerical inversion of Laplace transforms by the de Hoog algorithm, http://www.isva.dtu.dk/staff/karl/invlap.htm, 1998.
S.-S. Ja, V.G. Kulkarni, A. Mitra, and J. Patankar. Warranty Reserves for Non-stationary Sales Processes. NRLQ, 49, No. 5, 499-513, 2002.
O. Kallenberg. Random Measures. Akademie-Verlag, Berlin, third edition, 1983. ISBN 0-12-394960-2.
D. Mason and T. Turova. Weak convergence of the Hill estimator process. In J. Galambos, J. Lechner, and E. Simiu, editors, Extreme Value Theory and Applications, pages 419-432. Kluwer Academic Publishers, Dordrecht, Holland, 1994.
D. Mason. Laws of large numbers for sums of extreme values. Ann. Probab., 10:754-764, 1982.
A.J. McNeil, R. Frey, and P. Embrechts. Quantitative Risk Management. Princeton Series in Finance. Princeton University Press, Princeton, NJ, 2005. ISBN 0-691-12255-5. Concepts, techniques and tools.
J. Neveu. Processus ponctuels. In École d'Été de Probabilités de Saint-Flour, VI-1976, pages 249-445. Lecture Notes in Math., Vol. 598, Berlin, 1977. Springer-Verlag.
S.I. Resnick and C. Stărică. Smoothing the Hill estimator. Adv. Applied Probab., 29:271-293, 1997.
S.I. Resnick. Heavy Tail Phenomena: Probabilistic and Statistical Modeling. Springer Series in Operations Research and Financial Engineering. Springer-Verlag, New York, 2006. ISBN: 0-387-24272-4.
S.I. Resnick. Extreme Values, Regular Variation and Point Processes. Springer-Verlag, New York, 1987.
S.I. Resnick. Heavy tail modeling and teletraffic data. Ann. Statist., 25:1805-1869, 1997.
T. Rolski, H. Schmidli, V. Schmidt, and J. Teugels. Stochastic Processes for Insurance and Finance. Wiley, New York, 1999.

V.G. Kulkarni, 209 Smith Building, Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599 USA
E-mail address: vkulkarn@email.unc.edu

Sidney Resnick, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, NY 14853 USA
E-mail address: sir1@cornell.edu