GLAM Array Methods in Statistics




Iain Currie, Heriot-Watt University
Simon Fraser University, May 2009

A Generalized Linear Array Model (GLAM) is a low-storage, high-speed method for multidimensional smoothing when the data form an array; the model has a row and column structure which allows it to be written as a Kronecker product.

Data: Swedish male mortality data (Human Mortality Database). Deaths D and exposures E, each an 81 × 101 array (age × year).
[Figure: raw mortality surface.]

Generalized linear models
Data: vectors y of deaths and e of exposures.
Model: a model matrix B of B-splines, a parameter vector θ, and a log link with offset:
    µ = E(y),   log µ = log e + Bθ.
Error distribution: Poisson.
Algorithm: the scoring algorithm solves
    B′W_δB θ̂ = B′W_δ z,
where z = Bθ + W_δ⁻¹(y − µ) is the working vector and W_δ is a diagonal matrix of weights.

B-spline basis
A B-spline regression basis uses local basis functions: {B_1(x), B_2(x), ..., B_c(x)}, where B_1(x), B_2(x), ..., B_c(x) are B-splines. Model matrix: B = [B_1(x), B_2(x), ..., B_c(x)], n × c.
[Figures: a single cubic B-spline; the full B-spline basis.]

Penalties
Eilers & Marx (1996) imposed penalties on differences between adjacent coefficients:
    (θ_1 − 2θ_2 + θ_3)² + ... + (θ_{c−2} − 2θ_{c−1} + θ_c)² = θ′D_2′D_2 θ,
where D_2 is a second-order difference matrix. Estimation is via the penalized likelihood
    PL(θ) = L(θ) − ½ λ θ′D_2′D_2 θ,
where λ is the smoothing parameter which balances fit and smoothness: λ = 0 gives ordinary B-spline regression, λ = ∞ gives linear (classical Gompertz) regression.
[Figure: log mortality for Swedish males age 70, with observed mortality, the B-spline regression fit and the B-spline coefficients.]

Algorithm
The penalized scoring algorithm solves
    (B′W_δB + P) θ̂ = B′W_δ z,   P = λD_2′D_2,
where P is a roughness penalty. This is Eilers and Marx's method of P-splines.
[Figure: log mortality for Swedish males age 70, with observed mortality, B-spline and P-spline regression fits and their coefficients.]
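The penalized scoring update can be sketched in a few lines of NumPy. This is not the deck's code (the deck's software is R); a plain polynomial basis stands in for B, and the simulated counts and smoothing parameter are my own choices, but the system solved inside the loop is exactly (B′W_δB + P)θ̂ = B′W_δz:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting: a polynomial basis stands in for the B-spline matrix B
x = np.linspace(0.0, 1.0, 100)
B = np.vander(x, 8, increasing=True)            # n x c model matrix
D2 = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # second-order difference matrix
lam = 0.1                                       # smoothing parameter (arbitrary here)
P = lam * D2.T @ D2                             # roughness penalty

# Simulated Poisson counts with a smooth log-mean
y = rng.poisson(np.exp(2.0 + np.sin(2.0 * np.pi * x)))

# Penalized scoring: solve (B'WB + P) theta = B'Wz until eta converges
mu = y + 0.5                                    # crude starting values
eta = np.log(mu)
for _ in range(50):
    W = mu                                      # Poisson weights, diag of W_delta
    z = eta + (y - mu) / mu                     # working vector
    theta = np.linalg.solve(B.T @ (W[:, None] * B) + P, B.T @ (W * z))
    eta_new = B @ theta
    if np.max(np.abs(eta_new - eta)) < 1e-8:
        eta = eta_new
        break
    eta = eta_new
    mu = np.exp(eta)

mu_hat = np.exp(B @ theta)                      # fitted means
```

Varying lam traces the path the slide describes: small values reproduce unpenalized regression, large values force the fit towards the penalty's null space.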

2-d B-spline basis
2-dimensional smoothing. Let B_a, n_a × c_a, be a 1-d B-spline model matrix defined along age, and let B_y, n_y × c_y, be a 1-d B-spline model matrix defined along year. The 2-d model matrix is given by the Kronecker product
    B = B_y ⊗ B_a,   (n_a n_y) × (c_a c_y).
[Figure: the 2-d B-spline basis; each regression coefficient is associated with the summit of one of the hills.]

Amazing formula
Generalized linear array models, or GLAMs:
    Bθ = [B_y ⊗ B_a]θ,  n_a n_y × 1   ↔   B_a Θ B_y′,  n_a × n_y,
where Θ is the c_a × c_y array form of θ. The model is
    log E[D] = log E + B_a Θ B_y′.

Penalties in 2-d
Smoothness is ensured by penalizing the coefficients in rows and columns:
    P = λ_a I_{c_y} ⊗ D_a′D_a + λ_y D_y′D_y ⊗ I_{c_a}.

Computational procedure
With B = B_y ⊗ B_a:
    Bθ      ↔   B_a Θ B_y′
    B′W_δB  ↔   G(B_a)′ W G(B_y)
Definition: the row tensor of X, n × c, is G(X) = (X ⊗ 1_c′) * (1_c′ ⊗ X), n × c², where * is the elementwise product and 1_c is a vector of ones.
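The "amazing formula" is easy to check numerically. A minimal NumPy verification (random matrices stand in for the bases; a column-major vec is used so that vec(B_a Θ B_y′) = (B_y ⊗ B_a) vec(Θ)):

```python
import numpy as np

rng = np.random.default_rng(0)
na, ny, ca, cy = 5, 4, 3, 2
Ba = rng.normal(size=(na, ca))        # 1-d basis along age
By = rng.normal(size=(ny, cy))        # 1-d basis along year
Theta = rng.normal(size=(ca, cy))     # coefficient array

# Vectorised form: B = By kron Ba applied to theta = vec(Theta)
B = np.kron(By, Ba)
theta = Theta.reshape(-1, order='F')  # column-major vec, matching the kron
lhs = B @ theta                       # na*ny vector

# Array form: Ba Theta By', an na x ny array -- no Kronecker product needed
rhs = Ba @ Theta @ By.T

print(np.allclose(lhs, rhs.reshape(-1, order='F')))   # True
```

The array form never builds the n_a n_y × c_a c_y matrix B, which is where the low-storage, high-speed claims come from.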

Computational details: the magic shuffle

Linear functions:
    Bθ,  n_a n_y × 1   ↔   B_a Θ B_y′,  n_a × n_y.
Generalization to d dimensions:
    (X_2 ⊗ X_1)θ   ↔   (X_2 (X_1 Θ)′)′
    (X_3 ⊗ X_2 ⊗ X_1)θ   ↔   ρ(X_3, ρ(X_2, ρ(X_1, Θ))).

Definition: let X be an n_1 × c_1 matrix and A a c_1 × c_2 × c_3 array. Multiply along the first dimension of A to form XA, an n_1 × c_2 × c_3 array, and then rotate the dimensions to give a c_2 × c_3 × n_1 array. The result, ρ(X, A), is called the rotated H-transform.

Inner products:
    B′W_δB,  (c_a c_y) × (c_a c_y)   ↔   G(B_a)′ W G(B_y),  c_a² × c_y².

SEs of fitted values:
    diag(B S_m B′),  n_a n_y × 1   ↔   G(B_a) S G(B_y)′,  n_a × n_y,   where S_m = (B′W_δB)⁻¹.

Computation of Xθ in d dimensions
X_i, n_i × c_i, i = 1, 2, 3; X = X_3 ⊗ X_2 ⊗ X_1, (n_1 n_2 n_3) × (c_1 c_2 c_3); θ, c_1 c_2 c_3 × 1, with Θ the corresponding c_1 × c_2 × c_3 array. Then
    Xθ,  n_1 n_2 n_3 × 1   ↔   ρ(X_3, ρ(X_2, ρ(X_1, Θ))),  n_1 × n_2 × n_3.

Computation of X′W_δX in d dimensions
W_δ is diagonal, (n_1 n_2 n_3) × (n_1 n_2 n_3), with W the corresponding n_1 × n_2 × n_3 array. Then
    X′W_δX,  (c_1 c_2 c_3) × (c_1 c_2 c_3)   ↔   ρ(G(X_3), ρ(G(X_2), ρ(G(X_1), W))),  c_1² × c_2² × c_3².
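A sketch of the rotated H-transform and the three-fold computation of Xθ; the function name rho and the toy dimensions are my own:

```python
import numpy as np

def rho(X, A):
    """Rotated H-transform: multiply along the first dimension of the
    array A, then rotate that dimension to the back."""
    XA = np.tensordot(X, A, axes=([1], [0]))   # n1 x c2 x c3
    return np.moveaxis(XA, 0, -1)              # c2 x c3 x n1

rng = np.random.default_rng(0)
n, c = [4, 3, 2], [3, 2, 2]
X1, X2, X3 = (rng.normal(size=(n[i], c[i])) for i in range(3))
Theta = rng.normal(size=tuple(c))              # c1 x c2 x c3 coefficient array
theta = Theta.reshape(-1, order='F')           # column-major vec

# Vectorised: (X3 kron X2 kron X1) theta, an n1*n2*n3 vector
lhs = np.kron(X3, np.kron(X2, X1)) @ theta

# Array form: three rotated H-transforms, an n1 x n2 x n3 array
rhs = rho(X3, rho(X2, rho(X1, Theta)))

print(np.allclose(lhs, rhs.reshape(-1, order='F')))   # True
```

Each application of rho rotates one dimension to the back, so after d applications every dimension has been multiplied by its matrix exactly once and the array is back in its original orientation.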

Standard errors of Xθ̂
We need
    diag[X(X′W_δX)⁻¹X′] = diag[X S_m X′],
where S_m is (c_1 c_2 c_3) × (c_1 c_2 c_3). Let S, c_1² × c_2² × c_3², be the array form of S_m. Then
    diag(X S_m X′),  n_1 n_2 n_3 × 1   ↔   ρ(G(X_3), ρ(G(X_2), ρ(G(X_1), S))),  n_1 × n_2 × n_3.

Inner product shuffles in R
The array result ρ(G(X_3), ρ(G(X_2), ρ(G(X_1), W))), c_1² × c_2² × c_3², is shuffled back into the matrix X′W_δX:
    XWX = RH(t(RT3), RH(t(RT2), RH(t(RT1), W)))
    dim(XWX) = c(c1, c1, c2, c2, c3, c3)
    PermDims = aperm(XWX, c(1, 3, 5, 2, 4, 6))
    XWX = matrix(PermDims, nrow = c1 * c2 * c3)

GLAM is:
- conceptually attractive
- low footprint
- very fast
- generalizes to d dimensions

Examples of GLAMs
- Mortality shocks: Swedish data and the Spanish flu
- Joint modelling of mortality surfaces: insurance data by lives v amounts
- Density estimation: Old Faithful data
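The aperm shuffle can be mirrored in NumPy for the 2-d case; note the permutation indices differ from the R code because NumPy is row-major while R is column-major. The check below confirms that G(B_a)′WG(B_y), computed without ever forming B = B_y ⊗ B_a, rearranges to B′W_δB:

```python
import numpy as np

def rowtensor(X):
    """Row tensor G(X): row i is kron(X[i], X[i]); shape n x c^2."""
    n, c = X.shape
    return np.kron(X, np.ones((1, c))) * np.kron(np.ones((1, c)), X)

rng = np.random.default_rng(0)
na, ny, ca, cy = 6, 5, 3, 2
Ba = rng.normal(size=(na, ca))
By = rng.normal(size=(ny, cy))
w = rng.uniform(0.5, 2.0, size=na * ny)   # diagonal of W_delta

# Direct computation with the full Kronecker model matrix
B = np.kron(By, Ba)
M = B.T @ (w[:, None] * B)                # (ca*cy) x (ca*cy)

# GLAM computation: weights as an na x ny array, no Kronecker product
W = w.reshape(ny, na).T
N = rowtensor(Ba).T @ W @ rowtensor(By)   # ca^2 x cy^2

# The shuffle: rearrange N so its entries line up with B'WB
N4 = N.reshape(ca, ca, cy, cy)
M2 = N4.transpose(2, 0, 3, 1).reshape(ca * cy, ca * cy)

print(np.allclose(M, M2))                 # True
```

The shuffle itself is free in practice: it is a reshape, an axis permutation, and another reshape, with no arithmetic.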

Modelling shocks
Additive model: smooth surface + smooth period shocks:
    [B_y ⊗ B_a]θ + [I_{n_y} ⊗ B̆_a]θ̆,   B = [B_y ⊗ B_a : I_{n_y} ⊗ B̆_a],  8181 × 1346.
Additive GLAM:
    B_a Θ B_y′ + B̆_a Θ̆.
Penalty matrix:
    [ P   0 ]
    [ 0   P̆ ]
where P penalizes roughness in rows and columns and P̆ is a ridge penalty.
[Figures: raw mortality surface; fitted smooth + shocks surface; smooth component.]
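The additive decomposition can also be checked numerically. Small random matrices stand in for the bases (Bb plays the role of the shock basis B̆_a, with one coefficient set per year):

```python
import numpy as np

rng = np.random.default_rng(0)
na, ny, ca, cy, cb = 5, 4, 3, 2, 2
Ba = rng.normal(size=(na, ca))            # smooth basis along age
By = rng.normal(size=(ny, cy))            # smooth basis along year
Bb = rng.normal(size=(na, cb))            # stand-in for the shock basis
Theta = rng.normal(size=(ca, cy))         # smooth-surface coefficients
Theta_s = rng.normal(size=(cb, ny))       # one shock coefficient set per year

# Vectorised additive model matrix: [By kron Ba : I_ny kron Bb]
X = np.hstack([np.kron(By, Ba), np.kron(np.eye(ny), Bb)])
coef = np.concatenate([Theta.reshape(-1, order='F'),
                       Theta_s.reshape(-1, order='F')])
lhs = X @ coef

# Additive GLAM: smooth surface plus per-year shocks, both na x ny
rhs = Ba @ Theta @ By.T + Bb @ Theta_s

print(np.allclose(lhs, rhs.reshape(-1, order='F')))   # True
```

Both components share the GLAM structure, so each can be evaluated in array form and the two surfaces simply added.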

Shocks
[Figures: estimated mortality shocks for 1918, 1919, 1923 and 1944, shown for α = 0, 1 and 3.5.]

Joint modelling of insurance data
Insurance data by lives and amounts. Additive model: smooth 2-d surface + smooth age-dependent gap:
    Lives:   [B_y ⊗ B_a]θ
    Amounts: [B_y ⊗ B_a]θ + [1_{n_y} ⊗ B_a]θ̆.
Additive GLAM with dimensions:
    Lives:   B_a Θ B_y′
    Amounts: B_a Θ B_y′ + B_a Θ̆ 1_{n_y}′.
Inner products in additive GLAMs: let X = [B_y ⊗ B_a : 1_{n_y} ⊗ B_a]. The blocks of X′W_δX, of sizes c_a c_y × c_a c_y, c_a c_y × c_a and c_a × c_a, are computed in array form as
    G(B_a)′ W G(B_y),   c_a² × c_y²,
    G(B_a)′ W B_y,      c_a² × c_y,
    G(B_a)′ W 1_{n_y},  c_a² × 1.

[Figures: observed, smoothed and forecast log mortality by lives and amounts.]

2-d density estimation
- Form a fine 2-d grid of counts.
- Apply 2-d P-spline smoothing with Poisson errors and log link.
- Model matrix B_2(x_2) ⊗ B_1(x_1), third-order penalties.

Example: Old Faithful geyser data
272 data points binned on a 217 × 60 grid (duration bin width 1 sec, waiting-time bin width 1 min): 238 counts of 1, 17 counts of 2, and 12765 (98%!) counts of 0.
[Figures: normalized density of duration (minutes) against waiting time (minutes), raw and smoothed.]
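Step 1 of the recipe, the fine grid of counts, can be sketched as follows. The data here are a synthetic stand-in for the Old Faithful sample (the deck bins the real 272 observations; my cluster locations and grid limits are invented), but the point survives: almost all cells of the fine grid are empty:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 272 (duration, waiting-time) observations
n = 272
short = rng.random(n) < 0.35
duration = np.where(short, rng.normal(2.0, 0.25, n), rng.normal(4.3, 0.4, n))
waiting = np.where(short, rng.normal(54.0, 6.0, n), rng.normal(80.0, 6.0, n))
duration = np.clip(duration, 1.5, 5.5)    # keep every point on the grid
waiting = np.clip(waiting, 40.0, 100.0)

# Step 1: a fine 2-d grid of counts (fine duration bins, 1-min waiting bins)
d_edges = np.linspace(1.5, 5.5, 241)      # 240 duration bins
w_edges = np.arange(40.0, 101.0)          # 60 waiting-time bins
counts, _, _ = np.histogram2d(duration, waiting, bins=[d_edges, w_edges])

print(counts.shape, counts.sum())          # (240, 60) 272.0
print((counts == 0).mean())                # almost all cells are empty

# Step 2 would smooth these counts with the Poisson GLAM
# log E[counts] = [B2(x2) kron B1(x1)] theta, plus difference penalties.
```

The sparsity is exactly why the GLAM form matters here: the model matrix for the full grid would be enormous, but the array computations only ever touch the two small 1-d bases.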

[Figures: histograms of waiting times (bin width 1 min) and duration times (bin width 1 sec), each with the 2-d marginal density and a 1-d density estimate overlaid; normalized density surface over duration and waiting time.]

References
P-splines: Eilers & Marx (1996). Statistical Science, 11, 89-121.
GLAM: Currie, Durban & Eilers (2006). Journal of the Royal Statistical Society, Series B, 68, 259-280.
Eilers, Currie & Durban (2006). Computational Statistics & Data Analysis, 50, 61-76.
Mortality shocks: Kirkby & Currie (2009). Statistical Modelling, to appear.
Mortality data: Human Mortality Database, www.mortality.org
GLAM web page: www.ma.hw.ac.uk/~iain/research/glam.html