Extreme Value Modeling for Detection and Attribution of Climate Extremes

Jun Yan, Yujing Jiang
Joint work with Zhuo Wang, Xuebin Zhang
Department of Statistics, University of Connecticut
February 2, 2016 @ IDAG, Boulder, CO

Outline
1. Introduction
2. Combined Score Equations (CSE)
3. Illustrations
4. Outlook

Introduction: Method of Zwiers, Zhang, and Feng (2011)
- Data: extremes over multiple years at a collection of sites, from climate model simulations (multiple models, multiple ensemble members) plus observed extremes.
- Signal estimation from simulation data: piecewise constant location parameter $\hat\mu_{ts}$ in a GEV fit at each site.
- Detection analysis for the observed data:
  - GEV fit with location $\mu_{ts} = \alpha_s + \hat\mu_{ts}\beta$ and site-specific $\sigma_s$, $\xi_s$.
  - Profile independence likelihood estimation of $\beta$.
  - Uncertainty assessment via nested block bootstrap (32x32) to account for uncertainty in $\hat\mu_{ts}$.
- Goodness-of-fit test: KS test at each site with a field significance check.

Introduction: Departure from Zwiers et al. (2011)
- Spatial dependence is discarded: can efficiency in estimating $\beta$ be improved by incorporating spatial dependence?
  - Max-stable processes for spatial extremes with composite likelihood estimation (e.g., Davison et al., 2012).
  - Misspecification of spatial dependence may ruin inferences on marginal parameters: bias can be serious under strong dependence (Wang et al., 2014), and goodness-of-fit testing is difficult (Kojadinovic et al., 2015).
  - In some applications such as D&A, the primary interest is inference about marginal parameters; the spatial dependence is a nuisance.
  - Combining marginal GEV score equations: no dependence assumptions beyond marginal GEV.
- Profiling is computationally intensive and its accuracy depends on grid resolution: can we compute more efficiently? (Needed for multiple forcings.)
- Goal: a closer analog to standard optimal fingerprinting (e.g., Allen and Stott, 2003).

Combined Score Equations (CSE): Setup
- Idea: combine the score equations of the marginal GEV distribution at each monitoring site in some optimal way, improving efficiency by accounting for the spatial correlation among them.
- $Y_{ts}$: extreme observation of interest at site $s$ in year $t$ with density $f(\cdot; \theta_{ts})$, $s = 1, \dots, m$, $t = 1, \dots, n$, and scalar parameter $\theta_{ts}$ (other parameters assumed known for the moment).
- $X_{ts}$: $p \times 1$ covariate vector (signal) for $\theta_{ts}$.
- $g(\theta_{ts}) = \eta_{ts} = X_{ts}^\top \beta$, where $g$ is a known link function.
- Assume data are independent from year to year, while spatial dependence exists within the same year.
- Only assume that the marginal distribution $f$ is a correctly specified GEV distribution.

Combined Score Equations (CSE): Combining the Score Equations
- Score function: $S_{ts} = d \log f(Y_{ts}; \theta_{ts}) / d\theta_{ts}$.
- Score equation for $\beta$ at site $s$:
  $$\sum_{t=1}^{n} X_{ts} \frac{d\theta_{ts}}{d\eta_{ts}} S_{ts} = 0.$$
- Combined score equation:
  $$\sum_{t=1}^{n} X_t^\top A_t W_t^{-1} S_t = 0,$$
  where $X_t = (X_{t1}, \dots, X_{tm})^\top$, $A_t = \mathrm{diag}(d\theta_{t1}/d\eta_{t1}, \dots, d\theta_{tm}/d\eta_{tm})$, $W_t^{-1}$ is the weight matrix, and $S_t = (S_{t1}, \dots, S_{tm})^\top$.
- When $W_t$ is the identity matrix, it reduces to the derivative of the independence log-likelihood (Zwiers et al., 2011).
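As a rough illustration of the combined score equation above, the following sketch evaluates it for the GEV location parameter with an identity link, so that $A_t$ is the identity. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy data, the restriction to $\xi \neq 0$, and the use of one weight matrix for all years are choices made here for illustration.

```python
import numpy as np

def gev_logpdf(y, mu, sigma, xi):
    """GEV log-density for shape xi != 0 (requires 1 + xi*(y - mu)/sigma > 0)."""
    z = 1.0 + xi * (y - mu) / sigma
    return -np.log(sigma) - (1.0 + 1.0 / xi) * np.log(z) - z ** (-1.0 / xi)

def gev_score_mu(y, mu, sigma, xi):
    """Score S = d log f / d mu for the GEV location parameter (xi != 0)."""
    z = 1.0 + xi * (y - mu) / sigma
    return ((1.0 + xi) - z ** (-1.0 / xi)) / (sigma * z)

def combined_score(beta, Y, X, alpha, sigma, xi, W_inv):
    """Evaluate sum_t X_t^T W^{-1} S_t for mu_ts = alpha_s + X_ts^T beta.

    Y: (n, m) extremes; X: (n, m, p) signals; alpha, sigma, xi: (m,);
    W_inv: (m, m) inverse weight matrix, assumed common to all years.
    """
    mu = alpha + X @ beta                      # (n, m) location parameters
    S = gev_score_mu(Y, mu, sigma, xi)         # (n, m) marginal scores
    return np.einsum("tsp,sk,tk->p", X, W_inv, S)

# Hypothetical toy data: independent sites, one signal (p = 1).
rng = np.random.default_rng(0)
n, m, p = 50, 4, 1
X = rng.normal(size=(n, m, p))
alpha = np.array([0.0, 1.0, 2.0, 3.0])
sigma = np.array([1.0, 1.5, 2.0, 1.2])
xi = np.full(m, 0.1)
beta_true = np.array([1.0])
U = rng.uniform(size=(n, m))
# Inverse-CDF sampling from the GEV distribution.
Y = alpha + X @ beta_true + sigma * ((-np.log(U)) ** (-xi) - 1.0) / xi

# With the identity weight, this is the gradient of the independence log-likelihood.
score_indep = combined_score(beta_true, Y, X, alpha, sigma, xi, np.eye(m))
```

With `W_inv` set to the identity the function returns the gradient in $\beta$ of the summed marginal log-densities, which is the independence likelihood case; a non-identity `W_inv` changes only the weighting of the same marginal scores.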

Combined Score Equations (CSE): Optimal Weight
- Optimal $W_t$ (Nikoloulopoulos et al., 2011): $W_t = \Omega_t \Delta_t^{-1}$, where $\Omega_t = \mathrm{cov}(S_t)$ and
  $$\Delta_t = \mathrm{diag}\left\{ E\left(-\frac{d^2 \log f_{t1}(y_{t1}, \theta_{t1})}{d\theta_{t1}^2}\right), \dots, E\left(-\frac{d^2 \log f_{tm}(y_{tm}, \theta_{tm})}{d\theta_{tm}^2}\right) \right\}.$$
- $\Omega_t$ plays the role of the variance matrix representing internal variability in standard optimal fingerprinting.
- Approximate the covariance matrix $\Omega_t$ of the score functions $S_t$:
  - Apply the idea of generalized estimating equations (GEE): use a simple form of working spatial correlation structure.
  - Assume all clusters (years) share the same correlation matrix, $R$, of the score function: $\Omega_t = \Delta_t^{1/2} R \Delta_t^{1/2}$.
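Writing $\Delta_t$ for the diagonal matrix of expected information, the working approximation $\Omega_t = \Delta_t^{1/2} R \Delta_t^{1/2}$ gives $W_t^{-1} = \Delta_t \Omega_t^{-1} = \Delta_t^{1/2} R^{-1} \Delta_t^{-1/2}$. The sketch below computes this inverse weight; the function name and the numeric values are hypothetical, chosen only to show the algebra.

```python
import numpy as np

def weight_inverse(delta, R):
    """Return W^{-1} = Delta @ inv(Omega), with Omega = Delta^{1/2} R Delta^{1/2}.

    delta: (m,) positive diagonal of Delta (expected information per site).
    R:     (m, m) working correlation matrix of the scores.
    """
    d_half = np.sqrt(delta)
    omega = d_half[:, None] * R * d_half[None, :]   # working covariance of S_t
    return np.diag(delta) @ np.linalg.inv(omega)

# Hypothetical values for three sites.
delta = np.array([1.0, 4.0, 9.0])
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.5],
              [0.2, 0.5, 1.0]])
W_inv = weight_inverse(delta, R)
```

When `R` is the identity matrix, `W_inv` is the identity as well, which recovers the independence likelihood weighting.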

Combined Score Equations (CSE): Approximation of Optimal Weight
[Figure: two panels, for $\mu$ and $\sigma$, plotting correlation against Euclidean distance, with fitted Exp, Sph, and Gau curves.]
Figure: The empirical correlation of the standardized score function of $\mu$ (points), and the corresponding nonlinear least squares fitted correlation curves from the exponential (red), spherical (blue), and Gaussian (green) correlation functions. Data generated from an isotropic Smith model with $m = 20$, $n = 1000$, and moderate dependence level in region $[-10, 10]$.

Combined Score Equations (CSE): Approximation of Optimal Weight
- Ideally the pairwise correlation $\rho_{jk}$ between site $j$ and site $k$ would be known; in practice an approximation suffices.
- Exponential correlation: $\rho_{jk} = \exp(-d_{jk}/r)$, where $d_{jk}$ is the pairwise distance and $r$ is a parameter estimated from the empirical correlation of the standardized score function.
- Spherical correlation: $\rho_{jk} = \left[1 - 1.5\,(d_{jk}/r) + 0.5\,(d_{jk}/r)^3\right] I(d_{jk} < r)$, which leads to a sparse correlation matrix that can be exploited computationally when the number of sites is large.
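The two working correlation functions can be sketched as follows. This is an illustrative sketch only: the site coordinates are made up, the spherical form is the standard one with $d/r$ inside the bracket, and the grid search in `fit_range` is a simple stand-in for the nonlinear least squares fit mentioned in the figure caption.

```python
import math

def exponential_corr(d, r):
    """Exponential correlation exp(-d / r) for distance d >= 0 and range r > 0."""
    return math.exp(-d / r)

def spherical_corr(d, r):
    """Spherical correlation: 1 - 1.5 (d/r) + 0.5 (d/r)^3 for d < r, else 0."""
    if d >= r:
        return 0.0
    u = d / r
    return 1.0 - 1.5 * u + 0.5 * u ** 3

def corr_matrix(coords, r, corr_fn):
    """Working correlation matrix R from a list of (x, y) site coordinates."""
    m = len(coords)
    return [[corr_fn(math.dist(coords[j], coords[k]), r) for k in range(m)]
            for j in range(m)]

def fit_range(dists, emp_corr, corr_fn, grid):
    """Pick r from a grid by least squares against empirical correlations."""
    def sse(r):
        return sum((corr_fn(d, r) - c) ** 2 for d, c in zip(dists, emp_corr))
    return min(grid, key=sse)

# Hypothetical site locations; pairwise distances are 5, 10, and sqrt(65).
coords = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
R_sph = corr_matrix(coords, r=6.0, corr_fn=spherical_corr)
```

With range `r = 6.0`, only the pair of sites at distance 5 has a nonzero off-diagonal entry, which shows how the spherical form produces exact zeros beyond the range.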

Combined Score Equations (CSE): Coordinate Descent Approach
- GEV for observed extremes in detection analysis:
  - location $\mu_{ts} = \alpha_s + X_{ts}^\top \beta$, where the signals $X$ can incorporate $p$ forcings;
  - site-specific scale $\sigma_s$ and shape $\xi_s$;
  - a total of $3m + p$ unknown parameters.
- Coordinate descent approach: a two-step iterative process.
  1. Given the current estimate $\hat\beta$ of $\beta$, obtain the likelihood estimate $\hat\zeta_s$ of $\zeta_s = (\alpha_s, \sigma_s, \xi_s)$ separately at each grid box $s \in \{1, \dots, m\}$.
  2. Given the current estimate $\hat\zeta_s$, obtain the CSE estimate $\hat\beta$ of $\beta$ by solving the estimating equation with an appropriately chosen working correlation structure.
- The two steps iterate until $\hat\beta$ converges.

Illustrations: Simulation Study in Fingerprinting Setting
- Mimic the daily maximum temperature setting in Australia ($n = 140$, $m = 29$).
- Recall the detection model: $\mu_{ts} = \alpha_s + X_{ts}\beta$, $\sigma_{ts} = \sigma_s$, $\xi_{ts} = \xi_s$.
- Estimated signals $\mu_{d(t),s}$ were used as input $X_{ts}$ to generate data.
- Parameters $\alpha$, $\sigma$, and $\xi$ are the estimates based on the Australia data.
- $\beta \in \{0, 0.5, 1\}$.
- Dependence model: a mixture of a GG model (proportion $p$) and a GA model (proportion $1 - p$).
- CSE method with an exponential correlation structure.

Illustrations

                   Estimate                RMSE                    RE
p     Dep  True    IL     PL     CSE       IL     PL     CSE       PL    CSE
0     M    0       0.001  0.001  0.001     0.120  0.114  0.103     1.10  1.37
           0.5     0.503  0.503  0.502     0.118  0.111  0.097     1.12  1.49
           1       1.005  1.005  1.005     0.119  0.112  0.098     1.13  1.48
      S    0       0.001  0.002  0.004     0.153  0.140  0.104     1.19  2.14
           0.5     0.503  0.502  0.501     0.147  0.133  0.102     1.21  2.05
           1       1.007  1.007  1.002     0.146  0.134  0.103     1.19  1.99
0.5   M    0       0.004  0.003  0.001     0.116  0.112  0.094     1.08  1.51
           0.5     0.507  0.507  0.502     0.115  0.111  0.096     1.07  1.42
           1       0.997  0.997  1.000     0.115  0.112  0.097     1.06  1.40
      S    0       0.004  0.004  0.000     0.138  0.131  0.098     1.12  2.00
           0.5     0.500  0.500  0.499     0.144  0.136  0.100     1.13  2.08
           1       1.005  1.005  1.006     0.138  0.131  0.097     1.12  2.02
1     M    0       0.001  0.001  0.002     0.110  0.108  0.091     1.04  1.46
           0.5     0.504  0.504  0.502     0.110  0.108  0.092     1.04  1.44
           1       0.997  0.997  1.000     0.112  0.110  0.093     1.04  1.44
      S    0       0.001  0.000  0.002     0.132  0.128  0.098     1.07  1.83
           0.5     0.505  0.505  0.502     0.133  0.129  0.099     1.07  1.80
           1       0.996  0.996  1.000     0.135  0.131  0.100     1.07  1.82

The relative efficiency (RE) was based on the MSE, with the IL estimate as reference.
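The RE columns appear consistent with the ratio of the IL mean squared error to that of the competing method, so values above 1 favor the competing method. The short sketch below applies that assumed definition to the first row of the table (p = 0, Dep M, true β = 0). Because the tabulated RMSEs are rounded to three decimals, the result matches the tabulated RE only approximately.

```python
def relative_efficiency(rmse_ref, rmse):
    """RE based on MSE: MSE of the reference method divided by MSE of the other."""
    return (rmse_ref / rmse) ** 2

# RMSE values copied from the first table row: IL 0.120, PL 0.114, CSE 0.103.
re_pl = relative_efficiency(0.120, 0.114)
re_cse = relative_efficiency(0.120, 0.103)
```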

Illustrations: Applications on Extreme Temperatures
- Extreme temperatures in Northern Europe (NEU):
  - Annual maximum of daily maximum (TXx): warmest day
  - Annual maximum of daily minimum (TNx): warmest night
  - Annual minimum of daily maximum (TXn): coldest day
  - Annual minimum of daily minimum (TNn): coldest night
- Data period 1951-2010 ($n = 60$, $m = 67$).
- CSE method with an exponential correlation structure.

Illustrations
Results for the annual maximum of daily minimum temperature (TNx), for illustration.

Forcing   Method  Par      est    90% CI          len
ALL       IL      β        1.10   (0.73, 1.48)    0.75
          CSE     β        0.69   (0.46, 0.95)    0.49
ANT       IL      β        1.19   (0.77, 1.62)    0.85
          CSE     β        0.52   (0.31, 0.74)    0.43
ANT&NAT   IL      β_A      1.12   (0.75, 1.50)    0.76
                  β_N      0.91   (-0.28, 2.07)   2.35
          CSE     β_A      0.70   (0.47, 0.95)    0.48
                  β_N      0.59   (0.17, 1.01)    0.84

Outlook: Summary
- CSE improves estimation efficiency without specifying spatial dependence.
- The coordinate descent algorithm is reasonably fast and reliable.
- Application to climate extremes increases the power of detection and attribution of changes, possibly with multiple forcings.

Outlook (thesis of Yujing Jiang)
- Measurement error may cause bias, especially when it is high relative to the signal.
- A joint modeling approach similar to Hannart et al. (2014) for extremes: both simulated and observed data depend on a latent signal.
- Different climate models may have different sensitivity to the latent signal, but the average of the scaling factors is restricted to be 1.

Outlook: Departure from Z. Wang's Thesis
- PhD thesis in Statistics: Yujing Jiang (joint with Zhuo Wang, Jun Yan, and Xuebin Zhang).
- The work reported earlier is a 2-step approach:
  1. Estimate the signal from the climate simulation data.
  2. Estimate the scaling factor of the signal with observed data.
- Possible drawback: uncertainty in estimated signals has an effect like error-in-covariates, which is known to attenuate covariate effects in regression models with measurement error.
- Goal: remove bias from measurement error but retain efficiency from CSE.
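The attenuation effect can be seen in the simplest possible setting. The sketch below uses an ordinary linear regression, not the GEV model of the talk, with made-up variances: when the covariate is observed with noise of the same variance as the covariate itself, the least squares slope shrinks toward roughly half of its true value.

```python
import random

def ols_slope(x, y):
    """Least squares slope of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(1)
n = 20000
beta = 1.0                                        # true scaling factor
x = [random.gauss(0.0, 1.0) for _ in range(n)]    # true signal
y = [beta * xi + random.gauss(0.0, 0.5) for xi in x]
# Signal observed with error; error variance equals signal variance here.
x_noisy = [xi + random.gauss(0.0, 1.0) for xi in x]

slope_true_x = ols_slope(x, y)          # close to beta
slope_noisy_x = ols_slope(x_noisy, y)   # attenuated toward beta * 1 / (1 + 1)
```

In this linear analog the attenuation factor is var(signal) / (var(signal) + var(error)), so a noisier signal estimate pulls the estimated scaling factor further toward zero.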

Outlook: Joint D&A Model for Observed and Simulated Extremes
- The signal (characterized by a few parameters) is shared by the location parameters of the GEV models for both observed and simulated extremes.
- Illustration with one forcing, writing $\tilde\mu_{ts}$ for the shared latent signal, as distinct from the location parameter $\mu_{ts}$:
  - GEV model for observed extremes: $\mu_{ts} = \alpha_s + \beta_{obs}\,\tilde\mu_{ts}$, $\sigma_{ts} = \sigma_s$, $\xi_{ts} = \xi_s$.
  - GEV model for simulated extremes from climate model $c$, $c = 1, \dots, K$: $\mu_{cts} = \alpha_{cs} + \beta_c\,\tilde\mu_{ts}$, $\sigma_{cts} = \sigma_{cs}$, $\xi_{cts} = \xi_{cs}$.
- The signal appears in the model as $\tilde\mu_{ts}$.
- $\beta_c$ allows model-specific sensitivity.
- The average of $\beta_c$ over $c$ is restricted to be 1.
- $\sigma_{cs}$ and $\xi_{cs}$ could be restricted to be the same as $\sigma_s$ and $\xi_s$, respectively, if desired.

Outlook: Parameter Estimation
- Assume independence between observed data and simulated data.
- Block coordinate descent (the observed data treated as if from the $(K+1)$th climate model), cycling through the blocks:
  1. latent signal $\{\tilde\mu_{ts}, t = 1, \dots, 10D\}$, $s = 1, \dots, m$
  2. $\{\sigma_s, \xi_s\}$, $s = 1, \dots, m$
  3. $\{\alpha_{cs}\}$, $c = 1, \dots, K + 1$
  4. $\{\beta_c\}$, $c = 1, \dots, K + 1$
- The average-to-1 restriction is enforced at each iteration for identifiability.
- When updating each $\beta$, CSE can be used for efficiency.
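One possible way to enforce the average-to-1 restriction after each sweep is to rescale the coefficients and the latent signal in opposite directions, which leaves every product of a scaling factor and the signal unchanged. The sketch below illustrates this idea only; the function name, the choice to average over the climate-model coefficients, and the numbers are assumptions made here, not details given in the talk.

```python
def enforce_average_to_one(beta_models, beta_obs, signal):
    """Rescale so the climate-model scaling factors average to 1.

    beta_models: scaling factors for the K climate models.
    beta_obs:    scaling factor for the observed data.
    signal:      latent signal values.
    All products beta * signal are unchanged by the rescaling.
    """
    b_bar = sum(beta_models) / len(beta_models)
    beta_scaled = [b / b_bar for b in beta_models]
    signal_scaled = [b_bar * v for v in signal]
    return beta_scaled, beta_obs / b_bar, signal_scaled

# Hypothetical values for K = 3 climate models.
beta_scaled, beta_obs_scaled, signal_scaled = enforce_average_to_one(
    [0.8, 1.0, 1.5], 1.2, [0.1, 0.2, 0.3]
)
```

Because only the products enter the GEV location parameters, the fitted model is the same before and after the rescaling; the restriction simply picks one representative among equivalent parameterizations.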

Outlook: A Simulation Study
- Regional D&A study for extreme temperature: 29 grid boxes in Australia.
- A single climate model under one forcing.
- Same generating model for observed and simulated data.
- Dependence structure was a geometric Gaussian process with a Gaussian correlation function with $\phi = 12$ and $18$.
- True marginal parameter values were set to be the estimates from 10 runs under ALL forcing from HadCM3.
- Signal: 0.1 degree/10 years. $\beta_{obs} = 1$.
- Number of years: 100.
- Number of runs from the climate model: 2, 5, 10.
- Four methods: 2-step (2S) versus joint modeling (JM); independence likelihood (IL) versus CSE.

Outlook
Table: Mean, standard deviation (SD), and root mean squared error (RMSE) of estimates of $\beta_{obs}$ from 1000 replicates.

               Run = 2              Run = 5              Run = 10
Dep  Method    Mean  SD    RMSE     Mean  SD    RMSE     Mean  SD    RMSE
M    2S.IL     0.94  0.12  0.13     0.98  0.11  0.11     0.99  0.11  0.11
     2S.CSE    0.72  0.12  0.30     0.88  0.11  0.16     0.94  0.11  0.13
     JM.IL     1.00  0.11  0.11     1.01  0.10  0.10     1.00  0.11  0.11
     JM.CSE    1.00  0.09  0.09     1.00  0.09  0.09     1.01  0.10  0.10
S    2S.IL     0.95  0.15  0.16     0.98  0.14  0.15     0.99  0.14  0.14
     2S.CSE    0.65  0.13  0.38     0.83  0.13  0.22     0.91  0.13  0.16
     JM.IL     1.01  0.14  0.14     1.00  0.14  0.14     1.00  0.13  0.13
     JM.CSE    1.00  0.10  0.10     1.00  0.11  0.11     1.00  0.11  0.11

References
- Allen, M. R. and P. A. Stott (2003). Estimating signal amplitudes in optimal fingerprinting, part I: theory. Climate Dynamics 21, 477-491.
- Davison, A. C., S. A. Padoan, and M. Ribatet (2012). Statistical modeling of spatial extremes. Statistical Science 27(2), 161-186.
- Hannart, A., A. Ribes, and P. Naveau (2014). Optimal fingerprinting under multiple sources of uncertainty. Geophysical Research Letters 41(4), 1261-1268.
- Kojadinovic, I., H. Shang, and J. Yan (2015). A class of goodness-of-fit tests for spatial extremes models based on max-stable processes. Statistics and Its Interface 8(1), 45-62.
- Nikoloulopoulos, A. K., H. Joe, and N. R. Chaganty (2011). Weighted scores method for regression models with dependent data. Biostatistics 12, 653-665.
- Wang, Z., J. Yan, and X. Zhang (2014). Incorporating spatial dependence in regional frequency analysis. Water Resources Research 50(12), 9570-9585.
- Zwiers, F. W., X. Zhang, and Y. Feng (2011). Anthropogenic influence on long return period daily temperature extremes at regional scales. Journal of Climate 24(3), 881-892.