GLAM Array Methods in Statistics


Iain Currie, Heriot-Watt University
Simon Fraser University, May 2009

A Generalized Linear Array Model (GLAM) is a low-storage, high-speed method for multidimensional smoothing. It applies when the data form an array and the model has a row and column structure that allows the model matrix to be written as a Kronecker product.

Swedish male mortality data (HMD)
Deaths: D. Exposures: E. D and E are matrices with rows indexed by age and columns by year.
[Figure: raw mortality surface]


Generalized linear models

Structure
Data: vectors $y$ of deaths and $e$ of exposures.
Model: a model matrix $B$ of B-splines, a parameter vector $\theta$, and a link function, with $\mu = E(y)$ and $\log \mu = \log e + B\theta$.
Error distribution: Poisson.

Algorithm
The scoring algorithm solves
$B' \tilde W_\delta B \hat\theta = B' \tilde W_\delta \tilde z$,
where $\tilde z = B\tilde\theta + \tilde W_\delta^{-1}(y - \tilde\mu)$ is the working vector, $\tilde W_\delta$ is a diagonal matrix of weights, and a tilde marks a value computed from the current estimate of $\theta$.

B-spline basis
[Figure: a single cubic B-spline]
[Figure: a B-spline basis]
A B-spline regression basis uses local basis functions. The basis is $\{B_1(x), B_2(x), \ldots, B_c(x)\}$, where $B_1(x), B_2(x), \ldots, B_c(x)$ are B-splines. The model matrix is $B = [B_1(x), B_2(x), \ldots, B_c(x)]$, $n \times c$.
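The scoring algorithm for the Poisson model with a log link and an exposure offset can be sketched in R. This is a minimal illustration under stated assumptions, not the author's code: the data are simulated from a Gompertz-like law, `bspline_basis` is a hypothetical helper that builds an equally spaced cubic B-spline basis with `splines::splineDesign`, and the weights are those of the Poisson/log model, $W_\delta = \mathrm{diag}(\mu)$.

```r
library(splines)

# Hypothetical helper: equally spaced B-spline basis on a slightly padded range
bspline_basis <- function(x, ndx = 8, deg = 3) {
  pad <- 0.01 * (max(x) - min(x))
  xl <- min(x) - pad
  xr <- max(x) + pad
  dx <- (xr - xl) / ndx
  knots <- xl + dx * (-deg:(ndx + deg))
  splineDesign(knots, x, ord = deg + 1)        # n x (ndx + deg)
}

# Simulated data (assumption): Gompertz-like death rates, constant exposure
set.seed(1)
age <- 40:90
e <- rep(10000, length(age))
y <- rpois(length(age), e * exp(-9 + 0.09 * age))

B <- bspline_basis(age)
theta <- qr.solve(B, log((y + 0.5) / e))       # least-squares starting value
converged <- FALSE
for (it in 1:50) {
  eta <- as.vector(B %*% theta)                # linear predictor without offset
  mu <- e * exp(eta)                           # log mu = log e + B theta
  z <- eta + (y - mu) / mu                     # working vector
  theta_new <- solve(t(B) %*% (mu * B), t(B) %*% (mu * z))
  converged <- max(abs(theta_new - theta)) < 1e-8
  theta <- as.vector(theta_new)
  if (converged) break
}
mu_hat <- e * exp(as.vector(B %*% theta))
score <- as.vector(crossprod(B, y - mu_hat))   # B'(y - mu), zero at the estimate
```

At convergence the score vector $B'(y - \hat\mu)$ is numerically zero, which is a convenient check that the iteration has reached the maximum likelihood estimate.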


Penalties

Eilers & Marx (1996) imposed penalties on differences between adjacent coefficients:
$(\theta_1 - 2\theta_2 + \theta_3)^2 + \cdots + (\theta_{c-2} - 2\theta_{c-1} + \theta_c)^2 = \theta' D_2' D_2 \theta$,
where $D_2$ is a second order difference matrix.

Estimation is via penalized likelihood,
$PL(\theta) = L(\theta) - \tfrac{1}{2} \lambda \theta' D_2' D_2 \theta$,
where $\lambda$ is the smoothing parameter, which balances fit and smoothness. The two limits are B-spline regression ($\lambda = 0$) and linear (classical Gompertz) regression ($\lambda = \infty$).

Algorithm
The penalized scoring algorithm solves
$(B' \tilde W_\delta B + P)\hat\theta = B' \tilde W_\delta \tilde z$, with $P = \lambda D_2' D_2$ a roughness penalty.
This is Eilers and Marx's method of P-splines.

[Figure: log mortality for Swedish males age 70. Observed mortality with B-spline regression and B-spline coefficients]
[Figure: log mortality for Swedish males age 70. Observed mortality with B-spline and P-spline regressions and their coefficients]
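The penalty and its two limits can be checked numerically. The sketch below is an illustration only: it builds $D_2$ with base R's `diff`, confirms that the sum of squared second differences equals $\theta' D_2' D_2 \theta$, and shows that a linear sequence carries no penalty. To keep the example short, the limits $\lambda = 0$ and $\lambda \to \infty$ are shown for penalized least squares with an identity basis (a swapped-in Gaussian stand-in, not the Poisson P-spline model of the slides); the data are simulated.

```r
c_n <- 6
D2 <- diff(diag(c_n), differences = 2)          # second order difference matrix, (c-2) x c

theta <- (1:c_n)^2                               # second differences are all 2
pen_sum <- sum(diff(theta, differences = 2)^2)   # sum of squared second differences
pen_quad <- drop(t(theta) %*% crossprod(D2) %*% theta)   # theta' D2' D2 theta

lin <- 3 + 2 * (1:c_n)                           # a linear sequence
pen_lin <- sum((D2 %*% lin)^2)                   # lies in the null space of D2

# Limits of the smoothing parameter (assumption: Gaussian errors, identity basis)
set.seed(2)
x <- 1:20
yv <- 1 + 0.5 * x + rnorm(20)
D <- diff(diag(20), differences = 2)
smooth <- function(lambda) drop(solve(diag(20) + lambda * crossprod(D), yv))
fit_zero <- smooth(0)                            # lambda = 0: no smoothing
fit_large <- smooth(1e8)                         # large lambda: close to a straight line
line_fit <- unname(fitted(lm(yv ~ x)))
```

Because only linear sequences are unpenalized, a very large $\lambda$ forces the coefficients onto a straight line, which is why the limit is linear regression.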


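2-dimensional smoothing

Let $B_a$, $n_a \times c_a$, be a 1-d B-spline model matrix defined along age.
Let $B_y$, $n_y \times c_y$, be a 1-d B-spline model matrix defined along year.
The 2-d model matrix is given by the Kronecker product $B = B_y \otimes B_a$, $n_a n_y \times c_a c_y$.

[Figure: 2-d B-spline basis]

Amazing formula
Structure: $[B_y \otimes B_a]\theta$, $n_a n_y \times 1$, has the same elements as $B_a \Theta B_y'$, $n_a \times n_y$, where $\Theta$ is the $c_a \times c_y$ matrix of coefficients. The model is
$\log E[D] = \log E + B_a \Theta B_y'$.

Penalties in 2-d
Each regression coefficient is associated with the summit of one of the hills. Smoothness is ensured by penalizing the coefficients in rows and columns:
$P = \lambda_a I_{c_y} \otimes D_a' D_a + \lambda_y D_y' D_y \otimes I_{c_a}$.

Generalized linear array models or GLAM
Computational procedure with $B = B_y \otimes B_a$:
$B\theta$ is computed as $B_a \Theta B_y'$;
$B' W_\delta B$ is computed as $G(B_a)' W G(B_y)$.
Definition: the row tensor of $X$, $n \times c$, is $G(X) = [X \otimes 1_c'] \odot [1_c' \otimes X]$, $n \times c^2$, where $\odot$ is the element-by-element product.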
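Both 2-d identities can be verified on small random matrices. The sketch below is illustrative: `rowtensor` is a hypothetical name for the row tensor $G(X)$, the matrices are random stand-ins for B-spline bases, and the final reshaping step (splitting each squared dimension and permuting) is what turns the $c_a^2 \times c_y^2$ array form back into the $c_a c_y \times c_a c_y$ matrix.

```r
# Hypothetical helper: row tensor G(X) = [X (x) 1'] * [1' (x) X], n x c^2
rowtensor <- function(X) {
  one <- matrix(1, 1, ncol(X))
  kronecker(X, one) * kronecker(one, X)
}

set.seed(3)
na <- 5; ny <- 4; ca <- 3; cy <- 2
Ba <- matrix(rnorm(na * ca), na, ca)      # stand-in for the age basis
By <- matrix(rnorm(ny * cy), ny, cy)      # stand-in for the year basis
Theta <- matrix(rnorm(ca * cy), ca, cy)   # coefficient matrix
W <- matrix(runif(na * ny), na, ny)       # weights in array form

B <- kronecker(By, Ba)                    # full model matrix, na*ny x ca*cy

# Linear function: B theta versus Ba Theta By'
lin_kron <- as.vector(B %*% as.vector(Theta))
lin_array <- Ba %*% Theta %*% t(By)       # na x ny

# Inner product: B' W_delta B versus G(Ba)' W G(By)
ip_kron <- t(B) %*% (as.vector(W) * B)
ip_array <- t(rowtensor(Ba)) %*% W %*% rowtensor(By)   # ca^2 x cy^2
dim(ip_array) <- c(ca, ca, cy, cy)
ip_shuffled <- matrix(aperm(ip_array, c(1, 3, 2, 4)), nrow = ca * cy)

err_lin <- max(abs(lin_kron - as.vector(lin_array)))
err_ip <- max(abs(ip_kron - ip_shuffled))
```

The array forms never build the $n_a n_y \times c_a c_y$ matrix `B`; it appears here only to check the results.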


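Computational details: the magic shuffle

In each pair below, the Kronecker-product expression (first) has the same elements as the array expression (second).

Linear functions: $B\theta$, $n_a n_y \times 1$; $B_a \Theta B_y'$, $n_a \times n_y$.
Inner products: $B' W_\delta B$, $c_a c_y \times c_a c_y$; $G(B_a)' W G(B_y)$, $c_a^2 \times c_y^2$.
Diagonal function: $\mathrm{diag}(B S_m B')$, $n_a n_y \times 1$; $G(B_a) S G(B_y)'$, $n_a \times n_y$. Here $S_m = (B' W_\delta B)^{-1}$, and the diagonal gives the SEs of fitted values.

Generalization to d-dimensions
$(X_2 \otimes X_1)\theta$; $(X_2 (X_1 \Theta)')'$.
$(X_3 \otimes X_2 \otimes X_1)\theta$; $\rho(X_3, \rho(X_2, \rho(X_1, \Theta)))$.

Definition: let $X$ be an $n_1 \times c_1$ matrix and $A$ a $c_1 \times c_2 \times c_3$ array. To form $\rho(X, A)$, flatten $A$ to a $c_1 \times c_2 c_3$ matrix, premultiply by $X$ to obtain an $n_1 \times c_2 c_3$ matrix, restore it to an $n_1 \times c_2 \times c_3$ array, and rotate the dimensions to give a $c_2 \times c_3 \times n_1$ array. $\rho(X, A)$ is called the rotated H-transform.

Computation of $X\theta$ in d-dimensions
$X_i$, $n_i \times c_i$, $i = 1, 2, 3$.
$X = X_3 \otimes X_2 \otimes X_1$, $n_1 n_2 n_3 \times c_1 c_2 c_3$.
$\theta$, $c_1 c_2 c_3 \times 1$; $\Theta$ is the corresponding array, $c_1 \times c_2 \times c_3$.
$X\theta$, $n_1 n_2 n_3 \times 1$; $\rho(X_3, \rho(X_2, \rho(X_1, \Theta)))$, $n_1 \times n_2 \times n_3$.

Computation of $X' W_\delta X$ in d-dimensions
$X_i$, $n_i \times c_i$, $i = 1, 2, 3$.
$X = X_3 \otimes X_2 \otimes X_1$, $n_1 n_2 n_3 \times c_1 c_2 c_3$.
$W_\delta$ is diagonal, $n_1 n_2 n_3 \times n_1 n_2 n_3$; $W$ is the corresponding array, $n_1 \times n_2 \times n_3$.
$X' W_\delta X$, $c_1 c_2 c_3 \times c_1 c_2 c_3$; $\rho(G(X_3)', \rho(G(X_2)', \rho(G(X_1)', W)))$, $c_1^2 \times c_2^2 \times c_3^2$.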
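The rotated H-transform can be written in a few lines of R. The sketch below is one possible implementation under the definition given here, not necessarily the author's function: `RH` flattens the array along its first dimension, premultiplies by `X`, restores the array, and rotates the first dimension to the end. It is then checked against the Kronecker-product form of $X\theta$ in three dimensions, using random matrices as stand-ins for the marginal model matrices.

```r
# Rotated H-transform rho(X, A): multiply along the first dimension, then rotate
RH <- function(X, A) {
  d <- dim(A)
  XA <- X %*% matrix(A, nrow = d[1])        # n1 x (c2 * c3 * ...)
  XA <- array(XA, c(nrow(X), d[-1]))        # n1 x c2 x c3 x ...
  aperm(XA, c(2:length(d), 1))              # c2 x c3 x ... x n1
}

set.seed(4)
n <- c(4, 3, 2)                             # n1, n2, n3
cc <- c(3, 2, 2)                            # c1, c2, c3
X1 <- matrix(rnorm(n[1] * cc[1]), n[1], cc[1])
X2 <- matrix(rnorm(n[2] * cc[2]), n[2], cc[2])
X3 <- matrix(rnorm(n[3] * cc[3]), n[3], cc[3])
Theta <- array(rnorm(prod(cc)), cc)         # coefficient array, c1 x c2 x c3

# Array form: three nested transforms return an n1 x n2 x n3 array
fit_array <- RH(X3, RH(X2, RH(X1, Theta)))

# Kronecker form, built only to check the result
X <- kronecker(X3, kronecker(X2, X1))       # n1*n2*n3 x c1*c2*c3
fit_kron <- as.vector(X %*% as.vector(Theta))

err <- max(abs(as.vector(fit_array) - fit_kron))
```

After $d$ nested calls the dimensions have rotated through a full cycle, so the result comes back in the original order $n_1 \times n_2 \times n_3$. For a matrix `A` the function reduces to `t(X %*% A)`, which gives the 2-d form $(X_2 (X_1 \Theta)')'$.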


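Standard errors of $X\hat\theta$

We need $\mathrm{diag}\{X (X' W_\delta X)^{-1} X'\} = \mathrm{diag}(X S_m X')$, where $S_m$ is $c_1 c_2 c_3 \times c_1 c_2 c_3$. Let $S$, $c_1^2 \times c_2^2 \times c_3^2$, be the array form of $S_m$. Then $\mathrm{diag}(X S_m X')$, $n_1 n_2 n_3 \times 1$, has the same elements as $\rho(G(X_3), \rho(G(X_2), \rho(G(X_1), S)))$, $n_1 \times n_2 \times n_3$.

Inner product shuffles in R

$X' W_\delta X$, $c_1 c_2 c_3 \times c_1 c_2 c_3$, has the same elements as $\rho(G(X_3)', \rho(G(X_2)', \rho(G(X_1)', W)))$, $c_1^2 \times c_2^2 \times c_3^2$, but they must be rearranged to recover the matrix. In R, with RH the rotated H-transform $\rho$ and RT1, RT2, RT3 the row tensors $G(X_1)$, $G(X_2)$, $G(X_3)$:

    XWX = RH(t(RT3), RH(t(RT2), RH(t(RT1), W)))   # c1^2 x c2^2 x c3^2 array
    dim(XWX) = c(c1, c1, c2, c2, c3, c3)          # split each squared dimension
    PermDims = aperm(XWX, c(1, 3, 5, 2, 4, 6))    # row indices first, then column indices
    XWX = matrix(PermDims, nrow = c1 * c2 * c3)   # c1*c2*c3 x c1*c2*c3 matrix

GLAM is conceptually attractive, has a low footprint, is very fast, and generalizes to d-dimensions.

Examples of GLAMs
Mortality shocks: Swedish data and the Spanish flu.
Joint modelling of mortality surfaces: insurance data by lives v amounts.
Density estimation: Old Faithful data.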
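The inner product shuffle can be run end to end once `RH` and the row tensors are defined. The sketch below is a self-contained check under stated assumptions: `RH` and `Rten` are hypothetical implementations of $\rho$ and $G$ (the slides use `RH`, `RT1`, `RT2`, `RT3` without showing their definitions), the model matrices and weights are random, and the step that converts $S_m$ to its array form $S$ is simply the inverse of the shuffle shown for $X' W_\delta X$.

```r
RH <- function(X, A) {                      # rotated H-transform (hypothetical implementation)
  d <- dim(A)
  XA <- array(X %*% matrix(A, nrow = d[1]), c(nrow(X), d[-1]))
  aperm(XA, c(2:length(d), 1))
}
Rten <- function(X) {                       # row tensor G(X), n x c^2
  one <- matrix(1, 1, ncol(X))
  kronecker(X, one) * kronecker(one, X)
}

set.seed(5)
n <- c(4, 3, 2); cc <- c(3, 2, 2)
c1 <- cc[1]; c2 <- cc[2]; c3 <- cc[3]
X1 <- matrix(rnorm(n[1] * c1), n[1], c1)
X2 <- matrix(rnorm(n[2] * c2), n[2], c2)
X3 <- matrix(rnorm(n[3] * c3), n[3], c3)
W <- array(runif(prod(n)), n)               # weights in array form
RT1 <- Rten(X1); RT2 <- Rten(X2); RT3 <- Rten(X3)

# Inner product X' W_delta X by the shuffle
XWX <- RH(t(RT3), RH(t(RT2), RH(t(RT1), W)))
dim(XWX) <- c(c1, c1, c2, c2, c3, c3)
XWX <- matrix(aperm(XWX, c(1, 3, 5, 2, 4, 6)), nrow = c1 * c2 * c3)

# Diagonal of X S_m X' by the reverse shuffle
Sm <- solve(XWX)
S <- Sm
dim(S) <- c(c1, c2, c3, c1, c2, c3)
S <- aperm(S, c(1, 4, 2, 5, 3, 6))
dim(S) <- c(c1^2, c2^2, c3^2)               # array form of S_m
v_array <- RH(RT3, RH(RT2, RH(RT1, S)))     # n1 x n2 x n3

# Direct Kronecker computation, for checking only
X <- kronecker(X3, kronecker(X2, X1))
XWX_direct <- t(X) %*% (as.vector(W) * X)
v_direct <- diag(X %*% solve(XWX_direct) %*% t(X))

err_xwx <- max(abs(XWX - XWX_direct))
err_diag <- max(abs(as.vector(v_array) - v_direct))
```

The row tensor is symmetric in its pair of column indices, which is why the order of the two `c1` (and `c2`, `c3`) dimensions does not matter when the squared dimensions are split.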


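Modelling shocks

[Figure: raw mortality surface]

Additive model: smooth surface + smooth period shocks,
$[B_y \otimes B_a]\theta + [I_{n_y} \otimes \breve B_a]\breve\theta$, with model matrix $B = [B_y \otimes B_a : I_{n_y} \otimes \breve B_a]$.

Additive GLAM: $B_a \Theta B_y' + \breve B_a \breve\Theta$.

Penalty matrix: $\begin{bmatrix} P & 0 \\ 0 & \breve P \end{bmatrix}$, where $P$ penalizes roughness in rows and columns and $\breve P$ is a ridge penalty.

[Figure: fitted surfaces, smooth + shocks and smooth]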
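The additive model and its block-diagonal penalty can be sketched on small matrices. This is an illustration only: `Bs` and `ThetaS` are hypothetical names for the shock basis and its coefficient matrix, all matrices are random stand-ins for B-spline bases, and the smoothing parameters `lam_a`, `lam_y` and the ridge parameter `kappa` are arbitrary values chosen for the example.

```r
set.seed(6)
na <- 6; ny <- 5; ca <- 5; cy <- 4; cs <- 2
Ba <- matrix(rnorm(na * ca), na, ca)        # age basis for the smooth surface
By <- matrix(rnorm(ny * cy), ny, cy)        # year basis for the smooth surface
Bs <- matrix(rnorm(na * cs), na, cs)        # age basis for the shocks (hypothetical name)
Theta <- matrix(rnorm(ca * cy), ca, cy)     # surface coefficients
ThetaS <- matrix(rnorm(cs * ny), cs, ny)    # one column of shock coefficients per year

# Kronecker form of the linear predictor
eta_kron <- kronecker(By, Ba) %*% as.vector(Theta) +
  kronecker(diag(ny), Bs) %*% as.vector(ThetaS)

# Additive GLAM form: no Kronecker products are built
eta_array <- Ba %*% Theta %*% t(By) + Bs %*% ThetaS

# Penalty: roughness in rows and columns for the surface, ridge for the shocks
lam_a <- 10; lam_y <- 10; kappa <- 100      # arbitrary illustrative values
Da <- diff(diag(ca), differences = 2)
Dy <- diff(diag(cy), differences = 2)
P <- lam_a * kronecker(diag(cy), crossprod(Da)) +
  lam_y * kronecker(crossprod(Dy), diag(ca))
Ps <- kappa * diag(cs * ny)
Pfull <- rbind(cbind(P, matrix(0, nrow(P), ncol(Ps))),
               cbind(matrix(0, nrow(Ps), ncol(P)), Ps))

err <- max(abs(as.vector(eta_kron) - as.vector(eta_array)))
```

Because the second Kronecker factor of the shock term is the identity, the shock component in array form needs only the single product `Bs %*% ThetaS`.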


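Shocks

[Figure: panels of mortality shocks by year (the labels 1918 and 1923 survive), with further panels labelled Alpha = 0, Alpha = 1 and Alpha = 3.5]

Joint modelling of insurance data

Insurance data by lives and amounts. Additive model: smooth 2-d surface + smooth age-dependent gaps.
Lives: $[B_y \otimes B_a]\theta$.
Amounts: $[B_y \otimes B_a]\theta + [1_{n_y} \otimes \breve B_a]\breve\theta$.

Additive GLAM
Lives: $B_a \Theta B_y'$.
Amounts: $B_a \Theta B_y' + \breve B_a \breve\Theta 1_{n_y}'$.

Inner products in additive GLAMs
Let $X = [B_y \otimes B_a : 1_{n_y} \otimes \breve B_a]$. The blocks of $X' W_\delta X$ and the array expressions that hold the same elements are as follows, with dimensions given for $\breve B_a$ of size $n_a \times c_a$.
Surface block, $c_a c_y \times c_a c_y$: $G(B_a)' W G(B_y)$, $c_a^2 \times c_y^2$.
Cross blocks, $c_a c_y \times c_a$ and its transpose: $G(B_a, \breve B_a)' W B_y$, $c_a^2 \times c_y$, where $G(X_1, X_2) = [X_1 \otimes 1'] \odot [1' \otimes X_2]$ is the row tensor of two matrices.
Gap block, $c_a \times c_a$: $G(\breve B_a)' W 1_{n_y}$, $c_a^2 \times 1$.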


[Figure: observed, smoothed and forecast log mortality by lives and amounts]

2-d density estimation

Form a fine 2-d grid of counts, then apply 2-d P-spline smoothing with Poisson errors and log link. The model matrix is $B_2(x_2) \otimes B_1(x_1)$, with third order penalties.

Example: Old Faithful geyser data. There are 272 data points. The grid holds 238 counts of 1, 17 of 2, and 12765 (98%!) counts of 0, which is 13020 cells in all. Bin widths are 1 sec for duration and 1 min for waiting time.

[Figure: normalized density over duration (minutes) and waiting time (minutes)]

[Figure: histogram of waiting times (bin width 1 min) with the 2-d marginal density and the 1-d density]
[Figure: histogram of duration times (bin width 1 sec) with the 2-d marginal density and the 1-d density]

References

P-splines: Eilers & Marx (1996) Statistical Science, 11, 758-783.
GLAM: Currie, Durban & Eilers (2006) Journal of the Royal Statistical Society, Series B, 68. Eilers, Currie & Durban (2006) Computational Statistics & Data Analysis, 50.
Mortality shocks: Kirkby & Currie (2009) Statistical Modelling, to appear.
Mortality data: Human Mortality Database.
GLAM web page: iain/research/glam.html
