Journal of Statistical Software


JSS Journal of Statistical Software, January 2015, Volume 63, Issue 20.

Spatial Data Analysis with R-INLA with Some Extensions

Roger S. Bivand (NHH Norwegian School of Economics), Virgilio Gómez-Rubio (Universidad de Castilla-La Mancha), Håvard Rue (Norwegian University for Science and Technology)

Abstract

The integrated nested Laplace approximation (INLA) provides an interesting way of approximating the posterior marginals of a wide range of Bayesian hierarchical models. This approximation is based on conducting a Laplace approximation of certain functions, and numerical integration is extensively used to integrate some of the model parameters out. The R-INLA package offers an interface to INLA, providing a suitable framework for data analysis. Although the INLA methodology can deal with a large number of models, only the most relevant have been implemented within R-INLA. However, many other important models are not available for R-INLA yet. In this paper we show how to fit a number of spatial models with R-INLA, including its interaction with other R packages for data analysis. Secondly, we describe a novel method to extend the number of latent models available for the model parameters. Our approach is based on conditioning on one or several model parameters and fitting these conditioned models with R-INLA. Then these models are combined using Bayesian model averaging to provide the final approximations to the posterior marginals of the model. Finally, we show some examples of the application of this technique in spatial statistics. It is worth noting that our approach can be extended to a number of other fields, and not only spatial statistics.

Keywords: INLA, spatial statistics, R.

1. Introduction

Bayesian inference has become very popular in spatial statistics in recent years. Part of this success is due to the availability of computational methods to tackle fitting of spatial models. Besag, York, and Mollié (1991) proposed in their seminal paper an appropriate way of fitting
a spatial model using Markov chain Monte Carlo methods. This model has been extensively used and extended to consider different types of fixed and random effects for spatial and spatio-temporal analysis. In general, fitting these models has been possible because of the availability of different computational techniques, the most notable being Markov chain Monte Carlo (MCMC). For large models or big data sets, MCMC can be tedious and reaching the required number of samples can take a long time. Moreover, autocorrelation may arise in the chains, so that an increased number of iterations may be required.

Alternatively, the posterior distributions of the parameters may be approximated in some way. However, most models are highly multivariate and approximating the full posterior distribution may not be possible in practice. The integrated nested Laplace approximation (INLA, Rue, Martino, and Chopin 2009) focuses on the posterior marginals for latent Gaussian models. Although these models may seem rather restricted, they appear in a fair number of fields. This also means that INLA will be particularly useful when only marginal inference on the model parameters is needed.

The R-INLA package (Rue, Martino, Lindgren, Simpson, Riebler, and Krainski 2014; Lindgren and Rue 2015) for the R programming language (R Core Team 2014) provides an interface to INLA (a free-standing programme) so that models can be fitted using standard R commands. Results are readily available for plotting or further analysis.

First of all, we describe how R-INLA can be used together with other R packages for spatial data analysis. It is often the case that spatial data are available in different formats that need to be loaded into R, and some preprocessing is required. Also, once the results are available, it is helpful to explain how to display them on a map.
Although INLA is a general method to approximate the posterior marginals, R-INLA implements a number of popular latent models and prior distributions for the model parameters. It is, however, difficult to fit new models with INLA if these are based on other distributions not available in R-INLA. This may be an inconvenience when trying to develop new models, as there is no easy way of extending R-INLA to fit other models without writing them into INLA itself.

This is why we also describe a way of extending the number of models that R-INLA can fit with little extra effort. First of all, we consider one (or more) parameters in our model so that, if they are fixed, the resulting model can be fitted with R-INLA. What we are doing here is, in fact, fitting a model conditioned on the values assigned to the parameters. Then, we can assign different values to these parameters and combine the resulting models in some way to obtain a fit of the original model. We have used Bayesian model averaging and numerical integration techniques to combine these models (Bivand, Gómez-Rubio, and Rue 2014b).

This paper is organized as follows. Section 2 describes the integrated nested Laplace approximation. In Section 3 the different latent models for spatial statistics are described. We describe how to extend R-INLA to fit new models in Section 4. Some examples are provided in Section 5. Finally, we discuss why our approach is relevant in Section 6.

2. Integrated nested Laplace approximation

Bayesian inference is based on computing the posterior distribution of a vector of model parameters $x$ conditioned on the vector of observed data $y$. Bayes' rule states that this
posterior distribution can be written down as

$$\pi(x \mid y) \propto \pi(y \mid x)\,\pi(x) \tag{1}$$

Here, $\pi(y \mid x)$ is the likelihood of the model and $\pi(x)$ represents the prior distribution on the model parameters. Usually, $\pi(x \mid y)$ is a highly multivariate distribution and difficult to obtain. In particular, it is seldom possible to derive it in closed form. For this reason, several computational approaches have been proposed to get approximations to it. MCMC is probably the most widely used family of computational approaches to estimate the posterior distribution.

The marginal distribution of parameter $x_i$ can be denoted by $\pi(x_i \mid y)$ and it can be easily derived from the full posterior by integrating out over the remaining set of parameters $x_{-i}$.

Let us assume that we have a set of $n$ observations $y = \{y_i\}_{i=1}^{n}$, whose distribution is of the exponential family. The mean of observation $i$ is $\mu_i$ and it can depend on a linear predictor $\eta_i$ via a link function. In turn, the linear predictor $\eta_i$ can be modelled as follows:

$$\eta_i = \alpha + \sum_{j=1}^{n_f} f^{(j)}(u_{ji}) + \sum_{k=1}^{n_\beta} \beta_k z_{ki} + \varepsilon_i \tag{2}$$

$\alpha$ is the intercept, $f^{(j)}$ are functions on a set of $n_f$ random effects on a vector of covariates $u$, $\beta_k$ are coefficients on some covariates $z$ and $\varepsilon_i$ are error terms. Hence, the vector of latent effects is $x = \{\{\eta_i\}, \alpha, \{\beta_k\}, \ldots\}$. Note that, given our particular interest in spatial models, terms $f^{(j)}(u_{ji})$ can be defined so as to model spatial or spatio-temporal dependence.

$x$ is modelled using a Gaussian distribution with zero mean and precision matrix $Q(\theta_1)$, where $\theta_1$ is a vector of hyperparameters. Furthermore, $x$ is assumed to be a Gaussian Markov random field (GMRF, Rue and Held 2005). This means that $Q(\theta_1)$ will fulfil a number of Markov properties.

The distribution of observations $y_i$ will depend on the latent effects $x$ and, possibly, a number of hyperparameters $\theta_2$. Taking the vector of hyperparameters $\theta = (\theta_1, \theta_2)$, observations $y_i$ will be independent of each other given $x_i$ and $\theta$ because $x$ is a GMRF.
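The Markov property of a GMRF can be illustrated numerically. The following is a minimal sketch in base R, not taken from the paper: it assumes a toy first-order autoregressive field on five nodes, and the names `n`, `phi`, `Q` and `Sigma` are purely illustrative. It shows that zeros in the precision matrix (conditional independence between non-neighbouring nodes) coexist with a fully dense covariance matrix.

```r
# Toy GMRF: stationary AR(1) process on n = 5 nodes (illustrative values).
n <- 5
phi <- 0.6

# Tridiagonal precision matrix of the AR(1) process with unit innovation variance.
Q <- diag(c(1, rep(1 + phi^2, n - 2), 1))
for (i in seq_len(n - 1)) {
  Q[i, i + 1] <- -phi
  Q[i + 1, i] <- -phi
}

# The covariance matrix is the inverse of Q and has no zero entries.
Sigma <- solve(Q)

zeros_in_Q <- sum(Q == 0)               # entries outside the tridiagonal band
dense_Sigma <- all(abs(Sigma) > 1e-8)   # every pair of nodes is marginally correlated
```

An entry $Q_{ij} = 0$ means that nodes $i$ and $j$ are conditionally independent given the rest, which is the sparsity that INLA exploits computationally.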
Following Rue et al. (2009), the posterior distribution of the model latent effects $x$ and hyperparameters $\theta$ can be written as

$$\pi(x, \theta \mid y) \propto \pi(\theta)\,\pi(x \mid \theta) \prod_{i \in I} \pi(y_i \mid x_i, \theta) \propto \pi(\theta)\,|Q(\theta)|^{1/2} \exp\Big\{ -\tfrac{1}{2} x^\top Q(\theta)\,x + \sum_{i \in I} \log \pi(y_i \mid x_i, \theta) \Big\} \tag{3}$$

$I$ is the index set of the observed data (from 1 to $n$), $Q(\theta)$ is a precision matrix that depends on some hyperparameters $\theta$, and $\log \pi(y_i \mid x_i, \theta)$ is the log-likelihood of observation $i$.

INLA allows different forms for the likelihood of the observations. This includes not only distributions from the exponential family but also mixtures of distributions. Also, INLA can handle observations with different likelihoods in the same model. Regarding the latent effects, different models can be used. We will describe some of these in more detail in Section 3.

The specification of the prior distributions $\pi(\theta)$ is also very flexible. These will often depend on the latent effect but, in principle, the most common distributions are available and the
user can define their own prior distribution in the R-INLA package (but we will return to this later).

Hence, we can write the marginals of the elements in $x$ and $\theta$ (i.e., latent effects and hyperparameters) as

$$\pi(x_i \mid y) = \int \pi(x_i \mid \theta, y)\,\pi(\theta \mid y)\,d\theta \tag{4}$$

and

$$\pi(\theta_j \mid y) = \int \pi(\theta \mid y)\,d\theta_{-j} \tag{5}$$

In order to estimate the previous marginals, we need $\pi(\theta \mid y)$ or, alternatively, a convenient approximation that we will denote by $\tilde\pi(\theta \mid y)$. Initially, this approximation can be taken as

$$\tilde\pi(\theta \mid y) \propto \left. \frac{\pi(x, \theta, y)}{\tilde\pi_G(x \mid \theta, y)} \right|_{x = x^*(\theta)} \tag{6}$$

Here $\tilde\pi_G(x \mid \theta, y)$ is a Gaussian approximation to the full conditional of $x$ and $x^*(\theta)$ is the mode of the full conditional for a given value of $\theta$.

Rue et al. (2009) take this approximation and use it to compute the marginal distribution of $x_i$ using numerical integration:

$$\tilde\pi(x_i \mid y) = \sum_k \tilde\pi(x_i \mid \theta_k, y)\,\tilde\pi(\theta_k \mid y)\,\Delta_k \tag{7}$$

Here $\Delta_k$ are the weights associated with the ensemble of values $\theta_k$, defined on a multidimensional grid over the space of hyperparameters.

Note that in the previous equation it is important to have good approximations of $\pi(x_i \mid \theta_k, y)$. A Gaussian approximation $\tilde\pi_G(x_i \mid \theta_k, y)$, with mean $\mu_i(\theta)$ and variance $\sigma_i^2(\theta)$, may be a good starting point but a better approximation may be required in other cases. Rue et al. (2009) developed better approximations based on alternative approximation methods, such as the Laplace approximation. For example, they have used the Laplace approximation to obtain:

$$\tilde\pi_{LA}(x_i \mid \theta, y) \propto \left. \frac{\pi(x, \theta, y)}{\tilde\pi_{GG}(x_{-i} \mid x_i, \theta, y)} \right|_{x_{-i} = x^*_{-i}(x_i, \theta)} \tag{8}$$

$\tilde\pi_{GG}(x_{-i} \mid x_i, \theta, y)$ is a Gaussian approximation to $x_{-i} \mid x_i, \theta, y$ around its mode $x^*_{-i}(x_i, \theta)$.

Rue et al. (2009) develop a simplified Laplace approximation to improve $\tilde\pi_{LA}(x_i \mid \theta, y)$ using a series expansion of the Laplace approximation around $x_i$. This approximation is computationally less expensive, and it also corrects for location and skewness.

2.1. The R-INLA package

An interface to INLA has been provided as an R package called R-INLA, which can be downloaded from http://www.r-inla.org/ together with the free-standing external INLA programme.
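Returning briefly to Equations 6 and 7, the grid-based integration over the hyperparameters can be sketched in base R. This is a hedged toy illustration and not the INLA implementation: it assumes a conjugate model $y_i \mid x, \theta \sim N(x, 1/\theta)$, $x \sim N(0, 10^2)$, $\theta \sim \text{Gamma}(1, 1)$, for which the Gaussian "approximation" to $x \mid \theta, y$ is exact. All names (`log_post_theta`, `theta_k`, `w`, `marg_x`) are illustrative.

```r
set.seed(1)
y <- rnorm(20, mean = 2, sd = 1.5)   # simulated data (toy example)
n <- length(y)

# Equation 6: pi(theta | y) up to a constant, evaluated at the mode of x | theta, y.
log_post_theta <- function(theta) {
  prec <- n * theta + 1 / 100        # precision of x | theta, y
  mu <- theta * sum(y) / prec        # mode (and mean) of x | theta, y
  dgamma(theta, shape = 1, rate = 1, log = TRUE) +
    sum(dnorm(y, mu, sqrt(1 / theta), log = TRUE)) +
    dnorm(mu, 0, 10, log = TRUE) -
    dnorm(mu, mu, sqrt(1 / prec), log = TRUE)
}

# Regular grid over the hyperparameter with equal weights Delta_k.
theta_k <- seq(0.05, 2, length.out = 200)
delta_k <- rep(diff(theta_k)[1], length(theta_k))
lp <- sapply(theta_k, log_post_theta)
w <- exp(lp - max(lp)) * delta_k
w <- w / sum(w)                      # normalised pi(theta_k | y) * Delta_k

# Equation 7: the marginal of x is a weighted mixture of the conditionals.
prec_k <- n * theta_k + 1 / 100
mu_k <- theta_k * sum(y) / prec_k
x_grid <- seq(0, 4, length.out = 401)
marg_x <- sapply(x_grid, function(x) sum(dnorm(x, mu_k, sqrt(1 / prec_k)) * w))
```

In a realistic latent Gaussian model the conditional densities are not available exactly, which is where the Gaussian and Laplace approximations described above come in.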
R-INLA provides a user model interface similar to the one used to fit generalized additive models (GAM) with function gam() in the mgcv package (Wood 2006). It can handle fixed effects, non-linear terms and random effects in a formula argument. The interface is flexible enough to allow for the specification of different priors and model fitting options. Non-linear terms and random effects are included in the formula as calls to the f() function.
The model is fitted with a call to function inla(), which will return the fitted model as an inla object. Note that, by default, only some results will be returned. These include the marginal distributions of the latent effects and hyperparameters, as well as summary statistics.

In addition to the posterior marginals, R-INLA can provide a number of additional quantities on the fitted model. For example, it can provide the marginal likelihood $\pi(y)$ (reported on the log scale), which can be used for model selection. Other model selection criteria such as the DIC (Spiegelhalter, Best, Carlin, and Van der Linde 2002) and CPO (Held, Schrödle, and Rue 2010) have also been implemented.

Furthermore, R-INLA includes a number of options to define the prior distributions for the parameters in the model. Well-known prior distributions are available and the user can define their own prior distributions as well.

In the next section we describe different examples of the use of R-INLA for spatial statistics, in which we have included a detailed description of how inla() should be called.

3. Spatial models with INLA

As discussed in Section 2, spatial dependence can be included as part of the vector of latent effects $x$. In principle, any number of random effects can be included in the model. In this section, we will describe the different options available, depending on the type of problem. A full description of the models described here can be found on the R-INLA website at http://www.r-inla.org/, but we have included a summary. Blangiardo, Cameletti, Baio, and Rue (2013) and Gómez-Rubio, Bivand, and Rue (2014b) also discuss the different spatial models included in R-INLA.

First we will briefly introduce other papers describing the use of INLA and R-INLA for spatial statistics. Schrödle and Held (2010) describe the use of spatial and spatio-temporal models for disease mapping, including ecological regression.
Schrödle and Held (2011) expand the number of spatio-temporal models that can be used with R-INLA, and show the use of linear constraints to make complex spatio-temporal effects identifiable. Schrödle, Held, Riebler, and Danuser (2011) show how to use spatio-temporal models for disease surveillance. Eidsvik, Finley, Banerjee, and Rue (2012) focus on the use of R-INLA for the analysis of large spatial datasets. Finally, Ruiz-Cardenas, Krainski, and Rue (2012) develop spatio-temporal dynamic models with R-INLA.

3.1. Analysis of lattice data

First of all, we will discuss the analysis of lattice data because this will establish the basis for other types of analyses. In the analysis of lattice data, observations are grouped according to a set of areas, which usually represent some sort of administrative region (neighborhoods, municipalities, provinces, countries, etc.).

R-INLA includes a latent model for uncorrelated random effects. In this case, the random effects $u_i$ are modelled as

$$u_i \sim N(0, \tau_u) \tag{9}$$

where $\tau_u$ refers to the precision of the Gaussian distribution. It should be noted that R-INLA assigns a prior to $\log(\tau_u)$ which, by default, is a log-gamma distribution. Although this model
is not spatial, it can be combined with other spatial models. Using $\log(\tau_u)$ instead of simply $\tau_u$ provides some advantages, as $\log(\tau_u)$ is not constrained to be positive. This is particularly useful when optimising to find the mode of $\log(\tau_u)$, for example.

In order to model spatial correlation, neighborhoods must be defined among the study areas. It is often considered that two areas are neighbors if they share a common boundary. Spatial autocorrelation is modelled using a Gaussian distribution with zero mean and a precision matrix that will model correlation between neighbors.

Given that latent effects are a GMRF, we can define the variance-covariance matrix of the random effects as

$$\Sigma = \frac{1}{\tau} Q^{-1} \tag{10}$$

where $\tau$ is a precision parameter and matrix $Q$ encodes the spatial structure. Given that we are assuming a latent GMRF, this also means that matrix $Q$ will be defined such that element $Q_{ij}$ is zero if areas $i$ and $j$ are not neighbors. This means that $Q$ is often a very sparse matrix. See, for example, Rue and Held (2005) for details.

Available specifications for spatial dependence include the intrinsic conditional autoregressive (CAR) specification (Besag et al. 1991). This will produce a $Q$ matrix in which element $Q_{ii}$ is $n_i$ (the number of neighbors of area $i$) and element $Q_{ij}$ (with $i \neq j$) is $-1$ if areas $i$ and $j$ are neighbors and 0 otherwise. This means that the spatial random effects $v_i$ are distributed as

$$v_i \mid v_j, \tau_v \sim N\Big( \frac{1}{n_i} \sum_{i \sim j} v_j,\ \frac{1}{\tau_v n_i} \Big), \quad i \neq j \tag{11}$$

$\tau_v$ is the conditional precision of the random effects. As in the previous model, R-INLA uses a log-gamma prior on $\log(\tau_v)$.

In addition, a proper version of this model is available as well, for which the spatial random effects are distributed as

$$v_i \mid v_j, \tau_v \sim N\Big( \frac{1}{n_i + d} \sum_{i \sim j} v_j,\ \frac{1}{\tau_v (n_i + d)} \Big), \quad i \neq j \tag{12}$$

$d$ is a positive quantity to make the distribution proper. By default, a log-gamma distribution is assigned to $\log(d)$.
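The structure of these two precision matrices can be checked on a small example. The sketch below uses base R only and a hypothetical map of four areas arranged in a line (1-2-3-4); the names `W`, `Q_icar`, `Q_proper` and the value of `d` are illustrative assumptions rather than anything taken from the paper.

```r
# Binary adjacency matrix of a toy map: four areas in a row, 1-2-3-4.
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)

n_i <- rowSums(W)                    # number of neighbors of each area

# Intrinsic CAR: Q_ii = n_i, Q_ij = -1 for neighbors, 0 otherwise.
Q_icar <- diag(n_i) - W

# Proper CAR: adding d > 0 to the diagonal makes Q positive definite.
d <- 0.5
Q_proper <- diag(n_i + d) - W

eig_icar <- eigen(Q_icar, symmetric = TRUE)$values
eig_proper <- eigen(Q_proper, symmetric = TRUE)$values

# Conditional mean of v_2 given its neighbors (Equation 11): average of v_1 and v_3.
v <- c(1, NA, 3, 5)
cond_mean_2 <- -sum(Q_icar[2, -2] * v[-2]) / Q_icar[2, 2]
```

The rows of `Q_icar` sum to zero, so its smallest eigenvalue is zero and the distribution is improper, whereas all eigenvalues of `Q_proper` are at least `d`.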
A more general approach is obtained with the following precision matrix:

$$Q = \Big( I - \frac{\rho}{\lambda_{\max}} C \Big) \tag{13}$$

Here $I$ is the identity matrix, $\rho$ a spatial autocorrelation parameter, $C$ an adjacency matrix and $\lambda_{\max}$ the maximum eigenvalue of $C$. R-INLA assigns a Gaussian prior on $\log(\rho/(1 - \rho))$. This specification ensures that $\rho$ takes values between 0 and 1.

In the following example we use the Boston housing data, which is described in Harrison and Rubinfeld (1978), to develop an example on several spatial models. This data set records the median price for houses that were occupied by their owners plus some other relevant covariates (see Harrison and Rubinfeld 1978; Pace and Gilley 1997, for details). Data have been recorded at the tract level and the neighborhood structure of the tracts is also available; both are included in the boston data set from the R package spdep (Bivand 2014). In addition, this data set is also available in a shapefile, which is the one we will use in this example. This
will provide a more general example on how to load external data into R to fit models with R-INLA.

readShapePoly(), from package maptools (Bivand and Lewin-Koh 2014), can be used to load vector data from a shapefile. Alternatively, readOGR(), from package rgdal (Bivand, Keitt, and Rowlingson 2014a), provides a more general data loading framework for vector data since it supports a wider range of formats. This is the one we have used to load the Boston data set:

R> library("rgdal")
R> boston <- readOGR(system.file("etc/shapes", package = "spdep")[1],
+    "boston_tracts")

Here, readOGR() takes the directory where the layer (shapefile) is located and the layer name, which in this case is the name of the shapefile, as arguments and returns an object of type SpatialPolygonsDataFrame. This data object is used to store the tract boundaries plus the associated data (tract name and other variables).

Before fitting a spatial model, the neighborhood structure needs to be defined. A common criterion is to consider that two areas are neighbors if they share a common boundary. Function poly2nb() will take the tract boundaries and perform this operation to provide us with the adjacency structure of the Boston tracts as an nb object:

R> library("spdep")
R> bostonadj <- poly2nb(boston, queen = FALSE)

Here, we have also set queen = FALSE so that queen adjacency is not used, i.e., in order to consider two areas as neighbors more than one shared point is required. We have converted this into a binary matrix to be used with R-INLA using function nb2mat(). Furthermore, the adjacency matrix is converted into a sparse matrix of class dgTMatrix to reduce memory usage. This will be passed to function f() when defining the spatial model.

R> adj <- nb2mat(bostonadj, style = "B")
R> adj <- as(adj, "dgTMatrix")

A summary of some latent models implemented in R-INLA, and that can be used within the f() function, is available in Table 1.
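To make the role of a binary adjacency matrix more concrete, the following base R sketch builds one by hand for a hypothetical 3 x 3 grid with rook adjacency and plugs it into the precision matrix of Equation 13. It is an illustration under stated assumptions, not part of the original analysis: the grid, the value of `rho` and all object names are made up for the example.

```r
# Cells of a toy 3 x 3 grid; two cells are rook neighbors if they are
# exactly one step apart horizontally or vertically.
cells <- expand.grid(row = 1:3, col = 1:3)
D <- as.matrix(dist(cells, method = "manhattan"))
C <- (D == 1) * 1                    # binary adjacency matrix

# Largest eigenvalue of C, used to scale the autocorrelation parameter.
lambda_max <- max(eigen(C, symmetric = TRUE)$values)

# Equation 13 with an illustrative value of rho in (0, 1).
rho <- 0.9
Q <- diag(nrow(C)) - (rho / lambda_max) * C

eig_Q <- eigen(Q, symmetric = TRUE)$values
```

Because the eigenvalues of `C` are bounded by `lambda_max` in absolute value, every eigenvalue of `Q` lies between `1 - rho` and `1 + rho`, so `Q` is a valid (positive definite) precision matrix whenever `rho` is between 0 and 1.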
Note that this is not an exhaustive list and that a complete list of the available latent models can be obtained from the R-INLA documentation. We have also included a column showing whether these models are restricted to a regular grid. Also, detailed examples are available from the R-INLA website at http://www.r-inla.org/.

Fixed effects (including the intercept) in R-INLA have a Gaussian prior with fixed mean and precision; by default, the mean is 0 and the precision is 0.01 (0 for the intercept). These values can be changed using option control.fixed in the inla() call. control.fixed must take a named list of arguments, which are used to control how to handle the fixed effects in the model. In this named list, mean.intercept and prec.intercept can be used to set the parameters of the Gaussian prior of the intercept, whilst mean and prec are the analogous parameters to define the priors for the other fixed effects. These can be a numeric value or another named list, using the names of the fixed effects, to set different priors for different effects. Note that
precisions in the fixed effects priors cannot be estimated, as was the case with the different random effects presented before.

Name in f()    Model                                               Regular grid
besag          Intrinsic CAR                                       No
besagproper    Proper CAR                                          No
bym            Convolution model                                   No
generic0       $\Sigma = \frac{1}{\tau} Q^{-1}$                    No
generic1       $\Sigma = \frac{1}{\tau} (I_n - \frac{\rho}{\lambda_{\max}} C)^{-1}$   No
rw2d           2-D random walk                                     Yes
matern2d       Matérn correlation                                  Yes

Table 1: Summary of some latent models implemented in R-INLA for spatial statistics.

The model that we are fitting is:

$$y_i = \alpha + \beta X_i + v_i + \varepsilon_i \tag{14}$$

where $\alpha$ is the model intercept, $\beta$ a vector of coefficients of the covariates $X_i$, $v_i$ a random effect with an intrinsic CAR specification and $\varepsilon_i$ is a random Gaussian error term. As f() needs an area index which must have different values for different areas, this is first defined in variable id.

R> library("INLA")
R> boston$id <- 1:nrow(boston)
R> form <- log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) + I(RM^2) +
+    AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT) +
+    f(id, model = "besag", graph = adj)
R> btdf <- as.data.frame(boston)
R> m1 <- inla(form, data = btdf, control.predictor = list(compute = TRUE))

Note how the call to inla() is similar to fitting other regression models in R with glm() or gam(). Furthermore, it is very easy to include spatial random effects with function f() in the formula passed to inla(). Finally, control.predictor = list(compute = TRUE) is used to compute summary statistics on the fitted values. A summary of the model can be obtained as follows:

R> summary(m1)

Call:
"inla(formula = form, data = btdf, control.predictor = list(compute = TRUE))"

Time used:
 Preprocessing  Running inla  Postprocessing  Total

Fixed effects:
            mean  sd  0.025quant  0.5quant  0.975quant  mode  kld
(Intercept)  ...
CRIM         ...
ZN           ...
INDUS        ...
CHAS         ...
I(NOX^2)     ...
I(RM^2)      ...
AGE          ...
log(DIS)     ...
log(RAD)     ...
TAX          ...
PTRATIO      ...
B            ...
log(LSTAT)   ...

Random effects:
Name  Model
id    Besags ICAR model

Model hyperparameters:
                                         mean        sd
Precision for the Gaussian observations  1.626e...   ...e+04
Precision for id                         1.222e...   ...
                                         0.025quant  0.5quant
Precision for the Gaussian observations  7.582e...   ...e+04
Precision for id                         1.074e...   ...
                                         0.975quant  mode
Precision for the Gaussian observations  6.180e...   ...e+03
Precision for id                         1.381e...   ...e+01

Expected number of effective parameters(std dev): ... (5.348)
Number of equivalent replicates : ...

Marginal Likelihood: ...
Posterior marginals for linear predictor and fitted values computed

The output includes summary statistics of the posterior marginals of the coefficients of the fixed effects plus the precisions of the error term and intrinsic CAR random effect. In addition, kld reports the Kullback-Leibler divergence between the Gaussian and the (simplified) Laplace approximation to the marginal posterior densities. This provides information about the accuracy of the Gaussian approximation.

The marginal likelihood of the model is also reported and it is computed by integrating all the model parameters out. Hence, it is not the predictive marginal likelihood and it can be used to perform model selection, for example. The effective number of parameters, as defined in Spiegelhalter et al. (2002), and the associated number of equivalent replicates are also shown. See Martino and Rue (2010) for more details on the R-INLA output.
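The summary statistics in this kind of output are functionals of each posterior marginal, which R-INLA stores as a two-column matrix of abscissae and density values. As a rough illustration of how such summaries can be obtained, the base R sketch below works on a stand-in marginal (a Gaussian density on a fine grid, chosen only so that the answers are known); it is an assumption-laden toy and not the code used by R-INLA, which provides its own helpers such as inla.emarginal() and inla.qmarginal().

```r
# Stand-in for a posterior marginal: columns x (abscissae) and y (density).
x <- seq(-6, 6, length.out = 2001)
marg <- cbind(x = x, y = dnorm(x, mean = 0.5, sd = 1.2))
h <- diff(x)[1]                      # grid spacing

# Posterior mean and standard deviation by simple numerical integration.
post_mean <- sum(marg[, "x"] * marg[, "y"]) * h
post_sd <- sqrt(sum((marg[, "x"] - post_mean)^2 * marg[, "y"]) * h)

# Quantiles by inverting the numerically integrated distribution function.
cdf <- cumsum(marg[, "y"]) * h
post_q <- sapply(c(0.025, 0.5, 0.975),
                 function(p) approx(cdf, marg[, "x"], xout = p)$y)

# Posterior mode: abscissa with the highest density value.
post_mode <- marg[which.max(marg[, "y"]), "x"]
```

With a marginal extracted from a fitted model in place of `marg`, the same steps would give the mean, sd, quantile and mode columns, up to the accuracy of the grid.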
[Figure 1 panels: (Intercept), CRIM, ZN, INDUS, CHAS1, I(NOX^2), I(RM^2), AGE, log(DIS), log(RAD), TAX, PTRATIO, B, log(LSTAT), Precision error, Precision spatial effects.]

Figure 1: Marginals of the fixed effects, and the precisions of the error term and spatial random effects, Boston housing data.

Figure 1 shows the estimated marginals of the coefficients of the fixed effects and the precisions of the random effects in the model. These distributions can be used to compute summary statistics for the model parameters. In the previous R-INLA output these marginals have been used to compute the posterior mean, standard deviation, mode and some quantiles (0.025, 0.5 and 0.975).

Fitted values can be easily displayed in a map. First, we need to add all the required values to the SpatialPolygonsDataFrame:

R> boston$LOGCMEDV <- log(boston$CMEDV)
R> boston$FTDLOGCMEDV <- m1$summary.fitted[, "mean"]

Note that we will represent values in the log-scale. Next, we can use spplot() to display
both the observed and the predicted values of house prices. In the following example, which can be seen in Figure 2, we have also used package RColorBrewer (Neuwirth 2014) to define a suitable color palette:

R> library("RColorBrewer")
R> spplot(boston, c("LOGCMEDV", "FTDLOGCMEDV"),
+    col.regions = brewer.pal(9, "Blues"), cuts = 8,
+    names.attr = c("Observed log-CMEDV", "Predicted log-CMEDV"))

[Figure 2 panels: Observed CMEDV, Predicted CMEDV.]

Figure 2: Observed and predicted median values, Boston housing data.

To provide an alternative visualisation of the results, we have included a short example using function qmap() from the ggmap package (Kahle and Wickham 2013). First of all, we will reproject our data to WGS84. With fortify() the boston data set is converted into a suitable format to be used when plotting, and then the log median values are added to the new data.

R> bostonf <- spTransform(boston, CRS("+proj=longlat +datum=WGS84"))
R> library("ggmap")
R> bostonf <- fortify(bostonf, region = "TRACT")
R> id <- match(bostonf$id, as.character(boston$TRACT))
R> bostonf$LOGCMEDV <- boston$LOGCMEDV[id]

qmap() is based on the grammar of graphics implemented in the ggplot2 package (Wickham 2009). In the next example, qmap() is used to get satellite data from the Boston area, whilst geom_polygon() adds the boundaries:

R> qmap("boston", zoom = 10, maptype = "satellite") + geom_polygon(
+    data = bostonf, aes(x = long, y = lat, group = group, fill = LOGCMEDV),
+    colour = "white", alpha = 0.8, size = 0.3)

The resulting map can be seen in Figure 3.
Figure 3: Display of the Boston housing data set using ggmap and Google Maps.

3.2. Point patterns

Point patterns are analyzed with INLA as the result of a counting process, i.e., points are not modelled directly but they are aggregated over a grid of small squares. For this reason, the analysis of point patterns is conducted similarly to that of lattice data: counts are available for each square and these are assigned neighbors according to the adjacent squares. Then, counts can be smoothed using an appropriate non-linear term, such as spatial random effects. Hossain and Lawson (2009) compare different approximations to the analysis of point patterns, including methods that are based on discretisation of the study region.

In the following example we use the Japanese black pine data set from R package spatstat (Baddeley and Turner 2005). This data set records the location of Japanese black pine saplings in a square sampling region in a natural forest. This example is reproduced from Gómez-Rubio et al. (2014b). We first split the study area into smaller squares to create a grid of squares.

R> library("spatstat")
R> data("japanesepines")
R> japd <- as.data.frame(japanesepines)
R> Nrow <- 10
R> Ncol <- 10
R> n <- Nrow * Ncol
R> grd <- GridTopology(cellcentre.offset = c(0.05, 0.05),
+    cellsize = c(1/Nrow, 1/Ncol), cells.dim = c(Nrow, Ncol))
After the creation of the grid, we have used function over() on the set of points and the newly defined squares to find how many points can be found in each square.

R> polgrdjap <- as(grd, "SpatialPolygons")
R> idpp <- over(SpatialPoints(japd), polgrdjap)
R> japgrd <- SpatialGridDataFrame(grd, data.frame(Ntrees = rep(0, n)))
R> tidpp <- table(idpp)
R> japgrd$Ntrees[as.numeric(names(tidpp))] <- tidpp

Next, an index variable is built to create the spatial neighborhood structure to be passed to the f() function. Note that care must be taken, as R and R-INLA may have a different ordering of the areas when defining the adjacency matrix.

R> japgrd$spidx <- 1:n
R> japnb <- poly2nb(polgrdjap, queen = FALSE, row.names = 1:100)
R> adjpine <- nb2mat(japnb, style = "B")
R> adjpine <- as(adjpine, "dgTMatrix")

Here we have avoided using queen adjacency, as this would consider as neighbors two areas which only share a corner. Finally, we define the call to inla() using a formula which includes spatial random effects based on the grid of squares. In addition, we have set other options to compute the DIC, with control.compute = list(dic = TRUE), and the marginals of the linear predictors, using control.predictor = list(compute = TRUE). We have included the specification of the prior distributions of the log-precisions of unstructured and spatial random effects as well.

R> fpp <- Ntrees ~ 1 + f(japgrd$spidx, model = "bym", graph = adjpine,
+    hyper = list(prec.unstruct = list(prior = "loggamma",
+      param = c(0.001, 0.001)),
+    prec.spatial = list(prior = "loggamma", param = c(0.1, 0.1))))
R> japinlala <- inla(fpp, family = "poisson", data = as.data.frame(japgrd),
+    control.compute = list(dic = TRUE),
+    control.inla = list(tolerance = 1e-20, h = 1e-08),
+    control.predictor = list(compute = TRUE))
R> japgrd$inlala <- japinlala$summary.fitted.values[, "mean"]

The former model is the one that we have employed with the Boston data set on an irregular lattice.
Given that we are now considering a regular lattice, it is also possible to use a two-dimensional random walk for spatial smoothing:

R> fpprw2d <- Ntrees ~ 1 + f(japgrd$spidx, model = "rw2d", nrow = 10,
+    ncol = 10, hyper = list(prec = list(prior = "loggamma",
+      param = c(0.001, 0.001))))
R> japinlalarw2d <- inla(fpprw2d, family = "poisson",
+    data = as.data.frame(japgrd), control.compute = list(dic = TRUE),
+    control.inla = list(tolerance = 1e-20, h = 1e-08),
+    control.predictor = list(compute = TRUE))
R> japgrd$inlalarw2d <- japinlalarw2d$summary.fitted.values[, "mean"]
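The aggregation step that turns a point pattern into counts on a grid can also be written without any spatial packages. The sketch below is a base R illustration on simulated points in the unit square; the points, the row-wise cell numbering and all object names are assumptions made for the example, and the numbering does not necessarily match the ordering used by sp or R-INLA.

```r
set.seed(42)
# Simulated point pattern in the unit square (stand-in for real locations).
pts <- cbind(x = runif(65), y = runif(65))

Nrow <- 10
Ncol <- 10

# Column and row index of the square that contains each point.
col_id <- pmin(floor(pts[, "x"] * Ncol) + 1, Ncol)
row_id <- pmin(floor(pts[, "y"] * Nrow) + 1, Nrow)

# Illustrative cell numbering: row by row, starting at the bottom-left square.
cell <- (row_id - 1) * Ncol + col_id

# Number of points per square, including squares with no points.
counts <- tabulate(cell, nbins = Nrow * Ncol)
```

The resulting vector of counts plays the role of the Ntrees variable, with zeros for empty squares, and is what a Poisson likelihood would be fitted to.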
[Figure 4 panels: DATA, INLA BYM, INLA RW2D.]

Figure 4: Estimation of the intensity of a point pattern with R-INLA, Japanese black pine data set.

Figure 4 shows the original counts and the smoothed counts. Note that this is similar to estimating the intensity of an inhomogeneous point pattern using a smoothing method.

3.3. Geostatistics

R-INLA deals with geostatistical data on a regular grid. This means that observations need to be matched to the points in the grid and that those points with no observations attached are considered as missing values. Hence, this is somewhat similar to the analysis of lattice data and point patterns. However, R-INLA provides a number of options to build model-based geostatistical models (Diggle and Ribeiro 2007). First of all, different likelihoods can be used. Secondly, there are different options to define the spatial random effects.

Although it is still possible to model spatial dependence in the grid of points using a CAR specification, R-INLA provides a two-dimensional Matérn covariance function. This correlation allows, for example, the use of exponentially decaying functions such as

$$\Sigma_{ij} = \sigma^2 \exp(-d_{ij}/\phi) \tag{15}$$

where $d_{ij}$ is the distance between points $i$ and $j$, and $\phi$ is a parameter that controls the scale of the spatial dependence.

More recently, Lindgren, Rue, and Lindström (2011) follow a different approach based on a triangulation of the sampling points and the use of stochastic partial differential equations. Now, the spatial effects are defined as

$$u(s) = \sum_{k=1}^{n} \psi_k(s)\,w_k, \quad s \in \mathbb{R}^2 \tag{16}$$

Here, $\{\psi_k(s)\}$ are a set of basis functions and $w_k$ are associated weights. Weights are assumed to be Gaussian. The advantages of this approach for spatial statistics are fully described in Cameletti, Lindgren, Simpson, and Rue (2013).

In order to show how to fit geostatistical models with R-INLA we reproduce here an example from Gómez-Rubio et al.
(2014b) based on the Rongelap data set (Diggle and Ribeiro 2007), which records radionuclide concentration at 157 different locations in Rongelap island. We have restricted the analysis to one of the clusters in the north-east part of the island because
observations need to be matched to a regular grid of points. For this analysis we have used R packages geoR (Ribeiro and Diggle 2001) and geoRglm (Christensen and Ribeiro 2002).

First of all, data are loaded and the data from the desired cluster are extracted from the original data set by checking that their coordinates are in the window $(-700, -500) \times (-1900, -1700)$.

R> library("geoR")
R> library("geoRglm")
R> data("rongelap")
R> rgldata <- as.data.frame(rongelap)
R> xy <- rongelap[[1]]
R> id1 <- (xy[, 1] < -500 & xy[, 1] > -700 & xy[, 2] > -1900 &
+    xy[, 2] < -1700)
R> rgldata <- rgldata[id1, ]

The next step is to define the grid topology for the grid that these points will be matched to. The grid is defined to be of dimension 5 × 5.

R> Nrow <- 5
R> Ncol <- 5
R> n <- Nrow * Ncol
R> grdoffset <- c(min(rgldata$X1), min(rgldata$X2))
R> csize1 <- diff(range(rgldata$X1))/(Nrow - 1)
R> csize2 <- diff(range(rgldata$X2))/(Ncol - 1)
R> grd <- GridTopology(cellcentre.offset = grdoffset,
+    cellsize = c(csize1, csize2), cells.dim = c(Nrow, Ncol))

Data will be placed in a SpatialGridDataFrame (using the previously defined grid topology) and reorganized according to what R-INLA expects for this model (i.e., grid data stored by column). An index variable IDX is added to be used in f() when defining the model. However, R-INLA will rely on how the rows are ordered in the data passed to inla() when defining distances and adjacencies (i.e., the index variable ordering will not be considered).

R> inla2sp <- inla.lattice2node.mapping(Nrow, Ncol)[, Ncol:1]
R> inla2sp <- as.vector(inla2sp)
R> spgrd <- SpatialGridDataFrame(grd, as.data.frame(rgldata[inla2sp, ]))
R> spgrd$IDX <- 1:n

Next, we create a SpatialPolygons object with the boundaries of the squares in the grid. This way, it is easy to match the data to the newly created grid using function over().
R> polgrd <- as(grd, "SpatialPolygons")
R> dataid <- over(SpatialPoints(as.matrix(rgldata[, 1:2])), polgrd)

It should be noted that radionuclide concentration is measured at each square by the average of the observations in the square, and this needs to be computed beforehand.

R> ag <- by(rgldata$data, dataid, sum)
R> umag <- by(rgldata$units.m, dataid, sum)
R> ratioag <- ag/umag
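The matching and aggregation steps above can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the sp-based code of the paper: the data frame `obs`, the grid sizes `nx` and `ny`, and the cell numbering (row by row from the lower-left corner) are all hypothetical, and the numbering need not agree with the ordering used by sp or R-INLA. `tapply()` is used here in place of `by()` because it returns a plain named vector.

```r
# Hypothetical observations: coordinates, counts and exposure per location
obs <- data.frame(
  x       = c(-700, -690, -600, -510, -500),
  y       = c(-1900, -1895, -1800, -1705, -1700),
  data    = c(10, 20, 30, 5, 15),
  units.m = c(2, 3, 6, 2, 3)
)

# Grid with nx * ny cells whose centres run from the minimum to the
# maximum coordinate in each direction (as in the grid topology above)
nx <- 5
ny <- 5
offset <- c(min(obs$x), min(obs$y))
csize <- c(diff(range(obs$x)) / (nx - 1), diff(range(obs$y)) / (ny - 1))

# Assign each point to the cell with the nearest centre
ix <- round((obs$x - offset[1]) / csize[1]) + 1
iy <- round((obs$y - offset[2]) / csize[2]) + 1
cell <- (iy - 1) * nx + ix  # hypothetical numbering convention

# Sum counts and exposures per cell, then take their ratio
count_sum <- tapply(obs$data, cell, sum)
units_sum <- tapply(obs$units.m, cell, sum)
ratio <- count_sum / units_sum

# Cells without observations stay NA
cell_ratio <- rep(NA_real_, nx * ny)
cell_ratio[as.numeric(names(ratio))] <- ratio
print(cell_ratio)
```

Taking the ratio of the two sums gives an exposure-weighted average per cell, which is not the same as averaging the individual ratios.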
[Figure 5: Observed and estimated radionuclide concentration in Rongelap island. Panels: DATA, INLA MATERN2D, INLA RW2D.]

Then, a new column is added to the SpatialGridDataFrame with these averages. NA will be used for the squares with no data so that these values will be imputed from the model.

R> spgrd$ratioag <- NA
R> spgrd$ratioag[as.numeric(names(ratioag))] <- ratioag

Here we define a model with an intercept term and a random effect of the Matérn class. Note how we have fixed, for convenience, the value of the range and precision.

R> formula1 <- ratioag ~ 1 + f(spgrd$idx, model = "matern2d", nrow = Nrow,
+    ncol = Ncol, hyper = list(range = list(initial = log(sqrt(8)/0.5),
+    fixed = TRUE), prec = list(initial = log(1), fixed = TRUE)))
R> rglinlala <- inla(formula1, family = "poisson",
+    control.predictor = list(compute = TRUE),
+    control.compute = list(dic = TRUE),
+    data = as.data.frame(spgrd))
R> spgrd$inlala <- rglinlala$summary.fitted.values[, "mean"]

As in the point patterns example, we have also used a two-dimensional random walk for spatial smoothing.

R> formularw2d <- ratioag ~ 1 + f(spgrd$idx, model = "rw2d", nrow = Nrow,
+    ncol = Ncol, hyper = list(prec = list(prior = "loggamma",
+    param = c(1, 1))))
R> rglinlalarw2d <- inla(formularw2d, family = "poisson",
+    control.predictor = list(compute = TRUE),
+    control.compute = list(dic = TRUE),
+    data = as.data.frame(spgrd))
R> spgrd$inlalarw2d <- rglinlalarw2d$summary.fitted.values[, "mean"]

Figure 5 shows the observed and estimated radionuclide concentration in Rongelap island. It can be seen how our model has spatially smoothed the observed values.
[Figure 6: Estimation of the intensity of a point pattern with R-INLA and BayesX, Japanese black pine dataset. Panels: DATA, INLA BYM, INLA RW2D, BAYESX.]

R-INLA and other packages for Bayesian spatial modelling

R-INLA is not the only package for Bayesian spatial modelling. Bivand, Pebesma, and Gómez-Rubio (2013, Chapter 10) compare different packages for Bayesian modelling in the context of disease mapping. We will focus here on R2BayesX (Umlauf, Kneib, Lang, and Zeileis 2013; Umlauf, Adler, Kneib, Lang, and Zeileis 2015) because it provides a way of defining spatial models similar to that of R-INLA. For example, in order to reproduce the example on the Japanese black pine data with R2BayesX we can do the following:

R> library("R2BayesX")
R> bayesxadj <- nb2gra(japnb)
R> japb <- bayesx(ntrees ~ 1 + s(spidx, bs = "re") +
+    s(spidx, bs = "spatial", map = bayesxadj), family = "poisson",
+    data = as.data.frame(japgrd))

Function nb2gra() is used to convert our adjacency matrix into an object of class gra, which is used in R2BayesX to store adjacencies. bayesx() takes similar arguments to inla() and the model can be expressed using a formula, with s() used to define the random effects. s(spidx, bs = "re") defines independent Gaussian random effects, and the spatial random effects are defined in s(spidx, bs = "spatial", map = bayesxadj) using the adjacency matrix stored in bayesxadj. Retrieving the predicted data requires some care as they are reordered, but it is as simple as:

R> japgrd$bayesx <- japb$fitted.values[
+    order(japb$bayesx.setup$order), "mu"]

Finally, we compare the fitted values obtained with R-INLA and R2BayesX in Figure 6. Note that differences appear not only because of the different models used but also because of the choice of prior distributions.
4. Extending R-INLA to fit new models

Although the current implementation of INLA in the R-INLA package provides a reasonable number of models for spatial dependence, it may be the case that we need to include some other models. As it is now, this is not possible without adding to the code of the external INLA programme. Bivand et al. (2014b) describe a simple way of extending INLA to use other latent models. In particular they focus on some latent models used in spatial econometrics that are not available as part of the R-INLA package at the moment. A new latent class has been added recently and it is described in Gómez-Rubio, Bivand, and Rue (2014a).

This approach is based on considering a model where one or several parameters have been fixed in a way that makes the conditioned model fittable with R-INLA. If we denote by $\rho$ the vector of parameters to fix and by $\hat{\rho}$ a specific set of fixed parameter values, the full posterior, conditional on $\hat{\rho}$, can be written as

$$\pi(x, \theta \mid y, \hat{\rho}) \tag{17}$$

Taking this into account, it is clear that when conditioning on $\rho = \hat{\rho}$ R-INLA will give us an approximation to $\pi(x_i \mid y, \hat{\rho})$ and $\pi(\theta_i \mid y, \hat{\rho})$. Note that the full posterior distribution can be obtained by integrating $\rho$ out, i.e.,

$$\pi(x, \theta \mid y) = \int \pi(x, \theta \mid y, \rho)\, \pi(\rho \mid y)\, d\rho \tag{18}$$

where $\pi(\rho \mid y)$ is the posterior distribution of $\rho$. Also, note that this can be written as

$$\pi(\rho \mid y) \propto \pi(y \mid \rho)\, \pi(\rho) \tag{19}$$

Here $\pi(\rho)$ is a prior distribution on $\rho$ and $\pi(y \mid \rho)$ is the marginal likelihood of the model, which is reported by R-INLA. Hence, $\pi(\rho \mid y)$ can be estimated by rescaling the expression in Equation 19.

The posterior distribution of $\rho$ can be estimated by defining a fine grid of values $S = \{\rho_i\}_{i=1}^{r}$ so that $\pi(\rho_i \mid y)$, $i = 1, \ldots, r$, are computed. Then $\pi(\rho \mid y)$ can be obtained by fitting and rescaling a spline (or other non-linear function) to the previous values.
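The grid-based estimate of the posterior of ρ described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the grid `rho`, the log marginal likelihoods `log_mlik` (which in practice would come from the conditioned models fitted with R-INLA) and the uniform prior are all invented for the example, and a natural cubic spline is just one possible choice of interpolating function.

```r
# Hypothetical grid of values for rho
rho <- seq(0.1, 0.9, by = 0.1)

# Made-up log marginal likelihoods log pi(y | rho_i); in a real analysis
# these would be taken from the conditioned model fits
log_mlik <- -50 - 40 * (rho - 0.6)^2

# Assumed prior on rho: uniform on (0, 1)
log_prior <- dunif(rho, min = 0, max = 1, log = TRUE)

# Unnormalised log posterior, following Equation 19
log_post <- log_mlik + log_prior

# Subtract the maximum before exponentiating to avoid numerical underflow
post_unnorm <- exp(log_post - max(log_post))

# Fit a spline through the grid values and rescale it to integrate to one
post_fun <- splinefun(rho, post_unnorm, method = "natural")
norm_const <- integrate(post_fun, lower = min(rho), upper = max(rho))$value
post_dens <- function(r) post_fun(r) / norm_const

# Posterior density of rho evaluated at the grid points
print(round(post_dens(rho), 3))
```

Working on the log scale and subtracting the maximum matters in practice because marginal likelihoods are often far too small to exponentiate directly.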
Using simple numerical integration techniques we can obtain an approximation to $\pi(x, \theta \mid y)$ as follows:

$$\pi(x, \theta \mid y) = \int \pi(x, \theta \mid y, \rho)\, \pi(\rho \mid y)\, d\rho \approx \sum_{\rho_i \in S} \pi(x, \theta \mid y, \rho_i)\, \pi(\rho_i \mid y)\, \Delta_i \tag{20}$$

where $\Delta_i$ is the amplitude of the interval used in the discretisation of $\rho$. Note that the previous expression can be regarded as a weighted average of the different models fitted after conditioning on different values of $\rho$.

From Equation 20 it is clear that we can obtain the following approximations to the posterior marginals of the individual latent parameters and hyperparameters:

$$\hat{\pi}(x_i \mid y) = \sum_j \pi(x_i \mid y, \rho_j)\, w_j \tag{21}$$

$$\hat{\pi}(\theta_i \mid y) = \sum_j \pi(\theta_i \mid y, \rho_j)\, w_j \tag{22}$$
Here $w_j$ is a weight associated with $\rho_j$, defined as

$$w_j = \pi(\rho_j \mid y)\, \Delta_j \tag{23}$$

This is like carrying out Bayesian model averaging (Hoeting, Madigan, Raftery, and Volinsky 1999) on the different conditioned models fitted with R-INLA. Altogether, this provides a way of combining simpler models to obtain our desired model. In Section 5 we show how to apply these ideas to different models in spatial statistics. Note that this approach can be easily extended to the case of $\rho$ being a discrete random variable.

4.1. Implementation

We have implemented this approach in an R package called INLABMA, available from CRAN. The package includes some general functions to conduct Bayesian model averaging of models fitted with INLA. In addition, we have included some wrapper functions to fit the models described in Section 5.

5. Examples

5.1. Leroux model

Leroux, Lei, and Breslow (1999) propose a model for the analysis of spatial data in a lattice which is similar to the one by Besag et al. (1991), in the sense that they split variation according to spatial and non-spatial patterns. Rather than including the spatial and non-spatial random effects as a sum in the linear term, they consider a single random effect as follows:

$$u \sim \mathrm{MVN}(0, \Sigma); \quad \Sigma = \sigma^2 \left((1 - \lambda) I_n + \lambda M\right)^{-1} \tag{24}$$

Here $M$ is the precision matrix of a process with spatial structure and we will take that of an intrinsic CAR specification. Hence, the precision matrix is, in a sense, a mixture of the precisions of a non-spatial and a spatial one. $\lambda$ controls how strong the spatial structure is. For $\lambda = 1$ the effect is entirely spatial whilst for $\lambda = 0$ there is no spatial dependence.

In principle, this is not a model that R-INLA can fit. However, if $\lambda$ is fixed, then the random effects are Gaussian with a known structure for the variance-covariance matrix, which can be fitted using a generic0 latent model.

Boston housing data

Here we revisit the Boston housing data to fit the Leroux et al. model.
First of all, it is worth mentioning that the model needs a wrapper function to be fitted for a given value of the spatial parameter $\lambda$. This wrapper function is included in the R package INLABMA and it is based on the generic0 latent model available in R-INLA. Once $\lambda$ is fixed the model can be easily fitted with R-INLA, as the latent effect is a multivariate Gaussian random effect with zero mean and precision matrix as in Equation 24. We repeat this procedure for different values of $\lambda$ to obtain a list of fitted models to be combined later. Hence, we have written a simple wrapper function which is included in package INLABMA (Gómez-Rubio and Bivand 2014):
R> library("INLABMA")
R> leroux.inla
function (formula, d, W, lambda, improve = TRUE, fhyper = NULL, ...)
{
    W2 <- diag(apply(W, 1, sum)) - W
    Q <- (1 - lambda) * diag(nrow(W)) + lambda * W2
    assign("Q", Q, environment(formula))
    if (is.null(fhyper)) {
        formula <- update(formula, . ~ . + f(idx, model = "generic0",
            Cmatrix = Q))
    }
    else {
        formula <- update(formula, . ~ . + f(idx, model = "generic0",
            Cmatrix = Q, hyper = fhyper))
    }
    res <- INLA::inla(formula, data = d, ...)
    if (improve)
        res <- INLA::inla.rerun(res)
    res$logdet <- as.numeric(Matrix::determinant(Q)$modulus)
    res$mlik <- res$mlik + res$logdet/2
    return(res)
}
<environment: namespace:INLABMA>

In the previous code, the precision matrix Q is created using the adjacency matrix W and the value of $\lambda$. Then the generic0 model is added to the formula with the fixed effects. Finally we correct the marginal log-likelihood $\pi(y \mid \lambda)$ (conditioned on the value of $\lambda$) by adding half the log-determinant of $(1 - \lambda) I_n + \lambda M$. Note that, in principle, this is not needed to fit a single model and obtain the approximations to the posterior marginals, as it is a constant. However, we are fitting and combining several models, so we need to correct for this because this scaling factor will change with the value of $\lambda$.

Argument ... is used to pass any other options to inla(). This can be used to tune and set a number of other options. Also, the adjacency matrix is taken from the data provided in the boston data set. Note that we will be using a binary adjacency matrix as the random effects have an intrinsic CAR specification:

R> boston.matb <- listw2mat(nb2listw(bostonadj, style = "B"))
R> bmspb <- as(boston.matb, "CsparseMatrix")

Function leroux.inla is used in the example below to compute the fitted models for the Leroux et al. model. In this case, we take $\lambda$ to be in the interval (0.8, 0.99) after previous assessment of where $\pi(\lambda \mid y)$ has its mode. Also, we define a prior for the precision of the random effects in variable fhyper.
The prior for the precision of the error term is defined in errorhyper. In addition, we have used mclapply to parallelize the computations on operating systems supporting forking (not Windows). Note that this is an advantage of fitting these conditioned models compared with standard MCMC methods.
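To illustrate why the log-determinant correction matters when models with different values of λ are combined, the following base R sketch builds the precision structure of Equation 24 for a toy lattice and evaluates half its log-determinant for several values of λ. Everything here is an assumption made for illustration: the 4-node path graph `W`, the helper `leroux_precision()` and the chosen values of λ are hypothetical, and base `determinant()` is used instead of `Matrix::determinant()` to keep the example free of dependencies.

```r
# Binary adjacency matrix of a hypothetical path graph 1 - 2 - 3 - 4
W <- matrix(0, nrow = 4, ncol = 4)
W[cbind(1:3, 2:4)] <- 1
W <- W + t(W)

# Hypothetical helper: precision structure (1 - lambda) * I + lambda * M,
# where M = D - W is the intrinsic CAR structure matrix and D holds the
# number of neighbours of each node on its diagonal
leroux_precision <- function(W, lambda) {
  M <- diag(rowSums(W)) - W
  (1 - lambda) * diag(nrow(W)) + lambda * M
}

# Log-determinant of the precision structure for several values of lambda
lambdas <- c(0, 0.5, 0.8, 0.99)
logdets <- sapply(lambdas, function(l) {
  Q <- leroux_precision(W, l)
  as.numeric(determinant(Q, logarithm = TRUE)$modulus)
})

# Half the log-determinant is the term added to the marginal log-likelihood
print(data.frame(lambda = lambdas, half_logdet = logdets / 2))
```

For λ = 0 the structure reduces to the identity matrix and the correction is zero. For other values the correction differs from model to model, so it cannot be ignored when the conditional marginal likelihoods are compared.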