Customising spatial data classes and methods
|
|
|
- Myron Harrington
- 10 years ago
- Views:
Transcription
1 Customising spatial data classes and methods Edzer Pebesma Feb 2008 Contents 1 Programming with classes and methods S3-style classes and methods S4-style classes and methods Animal track data in package trip Generic and constructor functions Methods for trip objects Multi-point data: SpatialMultiPoints 8 4 Spatio-temporal grids 13 5 Analysing spatial Monte Carlo simulations 17 6 Processing massive grids 19 Although the classes defined in the sp package cover many needs, they do not go far beyond the most typical GIS data models. In applied research, it often happens that customised classes would suit the actual data coming from the instruments better. Since S4 classes have mechanisms for inheritance, it may be attractive to build on the sp classes, so as to utilise their methods where appropriate. Here, we will demonstrate a range of different settings in which sp classes can be extended. Naturally, this is only useful for researchers with specific and clear needs, so our goal is to show how (relatively) easy it may be to prototype classes extending sp classes for specific purposes. This vignette formed pp of the first edition of Bivand, R. S., Pebesma, E. and Gómez-Rubio V. (2008) Applied Spatial Data Analysis with R, Springer-Verlag, New York. It was retired from the second edition (2013) to accommodate material on other topics, and is made available in this form with the understanding of the publishers. It has been updated to the 2013 state of the software, e.g. using over. Institute for Geoinformatics, University of Muenster, Weseler Strasse 253, Münster, Germany. [email protected] 1
2 1 Programming with classes and methods This section will explain the elementary basics of programming with classes and methods in R. The S language (implemented in R and S-PLUS ) contains two mechanisms for creating classes and methods: the traditional S3 system and the more recent S4 system (see Section 2.2 in Bivand et al. (2008), in which classes were described for the user here they are described for the developer). This chapter is not a full introduction to R programming (see Braun and Murdoch, 2007, for more details), but it will try to give some feel of how the Spatial classes in package sp can be extended to be used for wider classes of problems. For full details, the interested reader is referred to e.g. Venables and Ripley (2000) and Chambers (1998), the latter being a reference for new-style S4 classes and methods. Example code is for example found in the source code for package sp, available from CRAN. Suppose we define myfun as > myfun <- function(x) { + x + 2 then, calling it with the numbers 1, 2 and 3 results in > myfun(1:3) [1] or alternatively using a named argument: > myfun(x = 1:3) [1] The return value of the function is the last expression evaluated. want to wrap existing functions, such as a plot function: > plotxplus2yminus3 <- function(x, y,...) { + plot(x = x + 2, y = y - 3,...) Often, we In this case, the... is used to pass information to the plot function without explicitly anticipating what it will be: named arguments x and y, or the first two arguments if they are unnamed are processed, remaining arguments are passed on. The plot function is a generic method, with an instance that depends on the class of its first (S3) or first n arguments (S4). The available instances of plot are shown for S3-type methods by > methods("plot") [1] plot.holtwinters* plot.tukeyhsd* plot.acf* plot.data.frame* [5] plot.decomposed.ts* plot.default plot.dendrogram* plot.density* [9] plot.ecdf plot.factor* plot.formula* plot.function [13] plot.hclust* plot.histogram* plot.isoreg* plot.lm* [17] plot.medpolish* plot.mlm* plot.ppr* plot.prcomp* [21] plot.princomp* plot.profile.nls* plot.raster* plot.spec* [25] plot.stepfun plot.stl* plot.table* plot.ts [29] plot.tskernel* see '?methods' for accessing help and source code and for S4-type methods by > library(sp) > showmethods("plot") 2
3 Function: plot (package graphics) x="any", y="any" x="spatial", y="missing" x="spatialgrid", y="missing" x="spatiallines", y="missing" x="spatialmultipoints", y="missing" x="spatialpixels", y="missing" x="spatialpoints", y="missing" x="spatialpolygons", y="missing" where we first loaded sp to make sure there are some S4 plot methods to show. 1.1 S3-style classes and methods Building S3-style classes is simple. Suppose we want to build an object of class foo: > x <- rnorm(10) > class(x) <- "foo" > x [1] [8] attr(,"class") [1] "foo" If we plot this object, e.g., by plot(x) we get the same plot as when we would not have set the class to foo. If we know, however, that objects of class foo need to be plotted without symbols but with connected lines, we can write a plot method for this class: > plot.foo <- function(x, y,...) { + plot.default(x, type = "l",...) after which plot(x) will call this particular method, rather than a default plot method. Class inheritance is obtained in S3 when an object is given multiple classes, as in > class(x) <- c("foo", "bar") > plot(x) For this plot, first function plot.foo will be looked for, and if not found the second option plot.bar will be looked for. If none of them is found, the default plot.default will be used. The S3 class mechanism is simple and powerful. Much of R works with it, including key functions such as lm. > data(meuse) > class(meuse) [1] "data.frame" > class(lm(log(zinc) ~ sqrt(dist), meuse)) [1] "lm" 3
4 There is, however, no checking that a class with a particular name does indeed contain the elements that a certain method for it expects. It also has design flaws, as method specification by dot separation is ambiguous in case of names such as as.data.frame, where one cannot tell whether it means that the method as.data acts on objects of class frame, or the method as acts on objects of class data.frame, or none of them (the answer is: none). For such reasons, S4-style classes and methods were designed. 1.2 S4-style classes and methods S4-style classes are formally defined, using setclass. As an example, somewhat simplified versions of classes CRS and Spatial in sp are > setclass("crs", representation(projargs = "character")) > setclass("spatial", representation(bbox = "matrix", + proj4string = "CRS"), validity <- function(object) { + bb <- bbox(object) + if (!is.matrix(bb)) + return("bbox should be a matrix") + n <- dimensions(object) + if (n < 2) + return("spatial.dimension should be 2 or more") + if (any(is.na(bb))) + return("bbox should never contain NA values") + if (any(!is.finite(bb))) + return("bbox should never contain infinite values") + if (any(bb[, "max"] < bb[, "min"])) + return("invalid bbox: max < min") + TRUE ) The command setclass defines a class name as a formal class, gives the names of the class elements (called slots), and their type type checking will happen upon construction of an instance of the class. Further checking, e.g., on valid dimensions and data ranges can be done in the validity function. Here, the validity function retrieves the bounding box using the generic bbox method. Generics, if not defined in the base R system, e.g., > isgeneric("show") [1] TRUE can be defined with setgeneric. Defining a specific instance of a generic is done by setmethod: > setgeneric("bbox", function(obj) standardgeneric("bbox")) > setmethod("bbox", signature = "Spatial", function(obj) obj@bbox) where the signature tells the class of the first (or first n) arguments. Here, operator is used to access the bbox slot in an S4 object, not to be confused with the $ operator to access list elements. We will now illustrate this mechanism by providing a few examples of classes, building on those available in package sp. 4
5 2 Animal track data in package trip CRAN Package trip, written by Michael Sumner (Kirkwood et al., 2006; Page et al., 2006), provides a class for animal tracking data. Animal tracking data consist of sets of (x, y, t) stamps, grouped by an identifier pointing to an individual animal, sensor or perhaps isolated period of monitoring. A strategy for this (slightly simplified from that of trip) is to extend the SpatialPointsDataFrame class by a length 2 character vector carrying the names of the time column and the trip identifier column in the SpatialPointsDataFrame attribute table. Package trip does a lot of work to read and analyse tracking data from data formats typical for tracking data (Argos DAT), removing duplicate observations and validating the objects, e.g., checking that time stamps increase and movement speeds are realistic. We ignore this and stick to the bare bones. We now define a class called trip that extends SpatialPointsDataFrame: > library(sp) > setclass("trip", representation("spatialpointsdataframe", + TOR.columns = "character"), validity <- function(object) { + if (length([email protected])!= 2) + stop("time/id column names must be of length 2") + if (!all([email protected] %in% names(object@data))) + stop("time/id columns must be present in attribute table") + TRUE ) > showclass("trip") Class "trip" [in ".GlobalEnv"] Slots: Name: TOR.columns data coords.nrs coords bbox Class: character data.frame numeric matrix matrix Name: Class: proj4string CRS Extends: Class "SpatialPointsDataFrame", directly Class "SpatialPoints", by class "SpatialPointsDataFrame", distance 2 Class "Spatial", by class "SpatialPointsDataFrame", distance 3 that checks, upon creation of objects, that indeed two variable names are passed and that these names refer to variables present in the attribute table. 2.1 Generic and constructor functions It would be nice to have a constructor function, just like data.frame or Spatial- Points, so we now create it and set it as the generic function to be called in case the first argument is of class SpatialPointsDataFrame. > trip.default <- function(obj, TORnames) { + if (!is(obj, "SpatialPointsDataFrame")) + stop("trip only supports SpatialPointsDataFrame") + if (is.numeric(tornames)) + TORnames <- names(obj)[tornames] + new("trip", obj, TOR.columns = TORnames) 5
6 > if (!isgeneric("trip")) setgeneric("trip", function(obj, + TORnames) standardgeneric("trip")) [1] "trip" > setmethod("trip", signature(obj = "SpatialPointsDataFrame", + TORnames = "ANY"), trip.default) [1] "trip" We can now try it out, with turtle data: > turtle <- read.csv(system.file("external/seamap105_mod.csv", + package = "sp")) > timestamp <- as.posixlt(strptime(as.character(turtle$obs_date), + "%m/%d/%y %H:%M:%S"), "GMT") > turtle <- data.frame(turtle, timestamp = timestamp) > turtle$lon <- ifelse(turtle$lon < 0, turtle$lon + 360, + turtle$lon) > turtle <- turtle[order(turtle$timestamp), ] > coordinates(turtle) <- c("lon", "lat") > proj4string(turtle) <- CRS("+proj=longlat +ellps=wgs84") > turtle$id <- c(rep(1, 200), rep(2, nrow(coordinates(turtle)) )) > turtle_trip <- trip(turtle, c("timestamp", "id")) > summary(turtle_trip) Object of class trip Coordinates: min max lon lat Is projected: FALSE proj4string : [+proj=longlat +ellps=wgs84] Number of points: 394 Data attributes: id obs_date Min. : /02/ :16:53: 1 1st Qu.: /02/ :56:25: 1 Median : /04/ :41:54: 1 Mean : /05/ :20:07: 1 3rd Qu.: /06/ :31:13: 1 Max. : /06/ :12:56: 1 (Other) :388 timestamp Min. : :15:00 1st Qu.: :10:04 Median : :31:01 Mean : :24:56 3rd Qu.: :26:20 Max. : :19: Methods for trip objects The summary method here is not defined for trip, but is the default summary inherited from class Spatial. As can be seen, nothing special about the trip features is mentioned, such as what the time points are and what the identifiers. We could alter this by writing a class-specific summary method 6
7 > summary.trip <- function(object,...) { + cat("object of class \"trip\"\ntime column: ") + print([email protected][1]) + cat("identifier column: ") + print([email protected][2]) + print(summary(as(object, "Spatial"))) + print(summary(object@data)) > setmethod("summary", "trip", summary.trip) [1] "summary" > summary(turtle_trip) Object of class "trip" Time column: [1] "timestamp" Identifier column: [1] "id" Object of class Spatial Coordinates: min max lon lat Is projected: FALSE proj4string : [+proj=longlat +ellps=wgs84] id obs_date Min. : /02/ :16:53: 1 1st Qu.: /02/ :56:25: 1 Median : /04/ :41:54: 1 Mean : /05/ :20:07: 1 3rd Qu.: /06/ :31:13: 1 Max. : /06/ :12:56: 1 (Other) :388 timestamp Min. : :15:00 1st Qu.: :10:04 Median : :31:01 Mean : :24:56 3rd Qu.: :26:20 Max. : :19:46 As trip extends SpatialPointsDataFrame, subsetting using "[" and column selection or replacement using "[[" or "$" all work, as these are inherited. Creating invalid trip objects can be prohibited by adding checks to the validity function in the class definition, e.g., will not work because the time and/or id column are not present any more. A custom plot method for trip could be written, for example using colour to denote a change in identifier: > setgeneric("lines", function(x,...) standardgeneric("lines")) [1] "lines" > setmethod("lines", signature(x = "trip"), function(x, +..., col = NULL) { + tor <- [email protected] + if (is.null(col)) { + l <- length(unique(x[[tor[2]]])) + col <- hsv(seq(0, 0.5, length = l)) + coords <- coordinates(x) 7
8 + lx <- split(1:nrow(coords), x[[tor[2]]]) + for (i in 1:length(lx)) lines(coords[lx[[i]], ], + col = col[i],...) ) [1] "lines" Here, the col argument is added to the function header so that a reasonable default can be overridden, e.g., for black/white plotting. 3 Multi-point data: SpatialMultiPoints One of the feature types of the OpenGeospatial Consortium (OGC) simple feature specification that has not been implemented in sp is the MultiPoint object. In a MultiPoint object, each feature refers to a set of points. The sp classes SpatialPointsDataFrame only provide reference to a single point. Instead of building a new class up from scratch, we ll try to re-use code and build a class SpatialMultiPoint from the SpatialLines class. After all, lines are just sets of ordered points. In fact, the SpatialLines class implements the MultiLineString simple feature, where each feature can refer to multiple lines. A special case is formed if each feature only has a single line: > setclass("spatialmultipoints", representation("spatiallines"), + validity <- function(object) { + if (any(unlist(lapply(object@lines, + function(x) length(x@lines)))!= + 1)) + stop("only Lines objects with one Line element") + TRUE ) > SpatialMultiPoints <- function(object) new("spatialmultipoints", + object) As an example, we can create an instance of this class for two MultiPoint features each having three locations: > n <- 5 > set.seed(1) > x1 <- cbind(rnorm(n), rnorm(n, 0, 0.25)) > x2 <- cbind(rnorm(n), rnorm(n, 0, 0.25)) > x3 <- cbind(rnorm(n), rnorm(n, 0, 0.25)) > L1 <- Lines(list(Line(x1)), ID = "mp1") > L2 <- Lines(list(Line(x2)), ID = "mp2") > L3 <- Lines(list(Line(x3)), ID = "mp3") > s <- SpatialLines(list(L1, L2, L3)) > smp <- SpatialMultiPoints(s) If we now plot object smp, we get the same plot as when we plot s, showing the two lines. The plot method for a SpatialLines object is not suitable, so we write a new one: > plot.spatialmultipoints <- function(x,..., pch = 1:length(x@lines), + col = 1, cex = 1) { + n <- length(x@lines) + if (length(pch) < n) + pch <- rep(pch, length.out = n) 8
9 + if (length(col) < n) + col <- rep(col, length.out = n) + if (length(cex) < n) + cex <- rep(cex, length.out = n) + plot(as(x, "Spatial"),...) + for (i in 1:n) points(x@lines[[i]]@lines[[1]]@coords, + pch = pch[i], col = col[i], cex = cex[i]) > setmethod("plot", signature(x = "SpatialMultiPoints", + y = "missing"), function(x, y,...) plot.spatialmultipoints(x, +...)) [1] "plot" Here we chose to pass any named... arguments to the plot method for a Spatial object. This function sets up the axes and controls the margins, aspect ratio, etc. All arguments that need to be passed to points (pch for symbol type, cex for symbol size and col for symbol colour) need explicit naming and sensible defaults, as they are passed explicitly to the consecutive calls to points. According to the documentation of points, in addition to pch, cex and col, the arguments bg and lwd (symbol fill colour and symbol line width) would need a similar treatment to make this plot method completely transparent with the base plot method something an end user would hope for. Having pch, cex and col arrays the length of the number of MultiPoints sets rather than the number of points to be plotted is useful for two reasons. First, the whole point of MultiPoints object is to distinguish sets of points. Second, when we extend this class to SpatialMultiPointsDataFrame, e.g., by > cname <- "SpatialMultiPointsDataFrame" > setclass(cname, representation("spatiallinesdataframe"), + validity <- function(object) { + lst <- lapply(object@lines, function(x) length(x@lines)) + if (any(unlist(lst)!= 1)) + stop("only Lines objects with single Line") + TRUE ) > SpatialMultiPointsDataFrame <- function(object) { + new("spatialmultipointsdataframe", object) then we can pass symbol characteristics by (functions of) columns in the attribute table: > df <- data.frame(x1 = 1:3, x2 = c(1, 4, 2), row.names = c("mp1", + "mp2", "mp3")) > smp_df <- SpatialMultiPointsDataFrame(SpatialLinesDataFrame(smp, + df)) > setmethod("plot", signature(x = "SpatialMultiPointsDataFrame", + y = "missing"), function(x, y,...) plot.spatialmultipoints(x, +...)) [1] "plot" > grys <- c("grey10", "grey40", "grey80") > plot(smp_df, col = grys[smp_df[["x1"]]], pch = smp_df[["x2"]], + cex = 2, axes = TRUE) for which the plot is shown in Figure 1. 9
10 Figure 1: Plot of the SpatialMultiPointsDataFrame object. Hexagonal grids are like square grids, where grid points are centres of matching hexagons, rather than squares. Package sp has no classes for hexagonal grids, but does have some useful functions for generating and plotting them. This could be used to build a class. Much of this code in sp is based on postings to the R-sig-geo mailing list by Tim Keitt, used with permission. The spatial sampling method spsample has a method for sampling points on a hexagonal grid: > data(meuse.grid) > gridded(meuse.grid) = ~x + y > xx <- spsample(meuse.grid, type = "hexagonal", cellsize = 200) > class(xx) [1] "SpatialPoints" attr(,"package") [1] "sp" gives the points shown in the left side of Figure 2. Note that an alternative hexagonal representation is obtained by rotating this grid 90 degrees; we will not further consider that here. > HexPts <- spsample(meuse.grid, type = "hexagonal", cellsize = 200) > spplot(meuse.grid["dist"], sp.layout = list("sp.points", + HexPts, col = 1)) > HexPols <- HexPoints2SpatialPolygons(HexPts) > df <- over(hexpols, meuse.grid) > HexPolsDf <- SpatialPolygonsDataFrame(HexPols, df, match.id = FALSE) > spplot(hexpolsdf["dist"]) for which the plots are shown in Figure 2. We can now generate and plot hexagonal grids, but need to deal with two representations: as points and as polygons, and both representations do not tell by themselves that they represent a hexagonal grid. For designing a hexagonal grid class we will extend SpatialPoints, assuming that computation of the polygons can be done when needed without a prohibitive overhead. 10
11 Figure 2: Hexagonal points (left) and polygons (right). > setclass("spatialhexgrid", representation("spatialpoints", + dx = "numeric"), validity <- function(object) { + if (object@dx <= 0) + stop("dx should be positive") + TRUE ) > setclass("spatialhexgriddataframe", + representation("spatialpointsdataframe", + dx = "numeric"), validity <- function(object) { + if (object@dx <= 0) + stop("dx should be positive") + TRUE ) Note that these class definitions do not check that instances actually do form valid hexagonal grids; a more robust implementation could provide a test that distances between points with equal y coordinate are separated by a multiple of dx, that the y-separations are correct and so on. It might make sense to adapt the generic spsample method in package sp to return SpatialHexGrid objects; we can also add plot and spsample methods for them. Method over should work with a SpatialHexGrid as its first argument, by inheriting from SpatialPoints. Let us first see how to create the new classes. Without a constructor function we can use > HexPts <- spsample(meuse.grid, type = "hexagonal", cellsize = 200) > Hex <- new("spatialhexgrid", HexPts, dx = 200) > df <- over(hex, meuse.grid) > spdf <- SpatialPointsDataFrame(HexPts, df) 11
12 > HexDf <- new("spatialhexgriddataframe", spdf, dx = 200) Because of the route taken to define both HexGrid classes, it is not obvious that the second extends the first. We can tell the S4 system this by setis: > is(hexdf, "SpatialHexGrid") [1] FALSE > setis("spatialhexgriddataframe", "SpatialHexGrid") > is(hexdf, "SpatialHexGrid") [1] TRUE to make sure that methods for SpatialHexGrid objects work as well for objects of class SpatialHexGridDataFrame. When adding methods, several of them will need conversion to the polygon representation, so it makes sense to add the conversion function such that e.g. as(x, "SpatialPolygons") will work: > setas("spatialhexgrid", "SpatialPolygons", + function(from) HexPoints2SpatialPolygons(from, + from@dx)) > setas("spatialhexgriddataframe", "SpatialPolygonsDataFrame", + function(from) SpatialPolygonsDataFrame(as(obj, + "SpatialPolygons"), obj@data, + match.id = FALSE)) We can now add plot, spplot, spsample and over methods for these classes: > setmethod("plot", signature(x = "SpatialHexGrid", y = "missing"), + function(x, y,...) plot(as(x, "SpatialPolygons"), +...)) [1] "plot" > setmethod("spplot", signature(obj = "SpatialHexGridDataFrame"), + function(obj,...) spplot(spatialpolygonsdataframe(as(obj, + "SpatialPolygons"), obj@data, match.id = FALSE), +...)) [1] "spplot" > setmethod("spsample", "SpatialHexGrid", function(x, n, + type,...) spsample(as(x, "SpatialPolygons"), n = n, + type = type,...)) [1] "spsample" > setmethod("over", c("spatialhexgrid", "SpatialPoints"), + function(x, y,...) over(as(x, "SpatialPolygons"), + y)) [1] "over" After this, the following will work: > spplot(meuse.grid["dist"], sp.layout = list("sp.points", + Hex, col = 1)) > spplot(hexdf["dist"]) Coercion to a data frame is done by > as(hexdf, "data.frame") Another detail not mentioned is that the bounding box of the hexgrid objects only match the grid centre points, not the hexgrid cells: 12
13 > bbox(hex) min max x y > bbox(as(hex, "SpatialPolygons")) min max x y One solution for this is to correct for this in a constructor function, and check for it in the validity test. Explicit coercion functions to the points representation would have to set the bounding box back to the points ranges. Another solution is to write a bbox method for the hexgrid classes, taking the risk that someone still looks at the incorrect bbox slot. 4 Spatio-temporal grids Spatio-temporal data can be represented in different ways. One simple option is when observations (or model-results, or predictions) are given on a regular space-time grid. Objects of class or extending SpatialPoints, SpatialPixels and SpatialGrid do not have the constraint that they represent a two-dimensional space; they may have arbitrary dimension; an example for a three-dimensional grid is > n <- 10 > x <- data.frame(expand.grid(x1 = 1:n, x2 = 1:n, x3 = 1:n), + z = rnorm(n^3)) > coordinates(x) <- ~x1 + x2 + x3 > gridded(x) <- TRUE > fullgrid(x) <- TRUE > summary(x) Object of class SpatialGridDataFrame Coordinates: min max x x x Is projected: NA proj4string : [NA] Grid attributes: cellcentre.offset cellsize cells.dim x x x Data attributes: z Min. : st Qu.: Median : Mean : rd Qu.: Max. :
14 We might assume here that the third dimension, x3, represents time. If we are happy with time somehow represented by a real number (in double precision), then we are done. A simple representation is that of decimal year, with e.g meaning the 183rd day of 1980, or e.g. relative time in seconds after the start of some event. When we want to use the POSIXct or POSIXlt representations, we need to do some more work to see the readable version. We will now devise a simple three-dimensional space-time grid with the POSIXct representation. > setclass("spatialtimegrid", "SpatialGrid", + validity <- function(object) { + stopifnot(dimensions(object) == + 3) + TRUE ) Along the same line, we can extend the SpatialGridDataFrame for space-time: > setclass("spatialtimegriddataframe", "SpatialGridDataFrame", + validity <- function(object) { + stopifnot(dimensions(object) == 3) + TRUE ) > setis("spatialtimegriddataframe", "SpatialTimeGrid") > x <- new("spatialtimegriddataframe", x) A crude summary for this class could be written along these lines: > summary.spatialtimegriddataframe <- function(object, +...) { + cat("object of class SpatialTimeGridDataFrame\n") + x <- gridparameters(object) + t0 <- ISOdate(1970, 1, 1, 0, 0, 0) + t1 <- t0 + x[3, 1] + cat(paste("first time step:", t1, "\n")) + t2 <- t0 + x[3, 1] + (x[3, 3] - 1) * x[3, 2] + cat(paste("last time step: ", t2, "\n")) + cat(paste("time step: ", x[3, 2], "\n")) + summary(as(object, "SpatialGridDataFrame")) > setmethod("summary", "SpatialTimeGridDataFrame", + summary.spatialtimegriddataframe) [1] "summary" > summary(x) Object of class SpatialTimeGridDataFrame first time step: :00:01 last time step: :00:10 time step: 1 Object of class SpatialGridDataFrame Coordinates: min max x x x Is projected: NA proj4string : [NA] 14
15 Grid attributes: cellcentre.offset cellsize cells.dim x x x Data attributes: z Min. : st Qu.: Median : Mean : rd Qu.: Max. : Next, suppose we need a subsetting method that selects on the time. When the first subset argument is allowed to be a time range, this is done by > subs.spatialtimegriddataframe <- function(x, i, j,..., + drop = FALSE) { + t <- coordinates(x)[, 3] + ISOdate(1970, 1, 1, 0, + 0, 0) + if (missing(j)) + j <- TRUE + sel <- t %in% i + if (!any(sel)) + stop("selection results in empty set") + fullgrid(x) <- FALSE + if (length(i) > 1) { + x <- x[i = sel, j = j,...] + fullgrid(x) <- TRUE + as(x, "SpatialTimeGridDataFrame") + else { + gridded(x) <- FALSE + x <- x[i = sel, j = j,...] + cc <- coordinates(x)[, 1:2] + p4s <- CRS(proj4string(x)) + SpatialPixelsDataFrame(cc, x@data, proj4string = p4s) > setmethod("[", c("spatialtimegriddataframe", "POSIXct", + "ANY"), subs.spatialtimegriddataframe) [1] "[" > t1 <- as.posixct(" :00:03", tz = "GMT") > t2 <- as.posixct(" :00:05", tz = "GMT") > summary(x[c(t1, t2)]) Object of class SpatialTimeGridDataFrame first time step: :00:01 last time step: :00:10 time step: 1 Object of class SpatialGridDataFrame Coordinates: min max x x x Is projected: NA 15
16 proj4string : [NA] Grid attributes: cellcentre.offset cellsize cells.dim x x x Data attributes: z Min. : st Qu.: Median : Mean : rd Qu.: Max. : NA's :800 > summary(x[t1]) Object of class SpatialPixelsDataFrame Coordinates: min max x x Is projected: NA proj4string : [NA] Number of points: 100 Grid attributes: cellcentre.offset cellsize cells.dim x x Data attributes: z Min. : st Qu.: Median : Mean : rd Qu.: Max. : The reason to only convert back to SpatialTimeGridDataFrame when multiple time steps are present is that the time step ( cell size in time direction) cannot be found when there is only a single step. In that case, the current selection method returns an object of class SpatialPixelsDataFrame for that time slice. Plotting a set of slices could be done using levelplot, or writing another spplot method: > spplot.stgdf <- function(obj, zcol = 1,..., format = NULL) { + if (length(zcol)!= 1) + stop("can only plot a single attribute") + if (is.null(format)) + format <- "%Y-%m-%d %H:%M:%S" + cc <- coordinates(obj) + df <- unstack(data.frame(obj[[zcol]], cc[, 3])) + ns <- as.character(coordinatevalues(getgridtopology(obj))[[3]] + + ISOdate(1970, 1, 1, 0, 0, 0), format = format) + cc2d <- cc[cc[, 3] == min(cc[, 3]), 1:2] + obj <- SpatialPixelsDataFrame(cc2d, df) + spplot(obj, names.attr = ns,...) 16
17 00:00:01 00:00:02 00:00:03 00:00: :00:05 00:00:06 00:00:07 00:00: :00:09 00:00: Figure 3: spplot for an object of class SpatialTimeGridDataFrame, filled with random numbers. > setmethod("spplot", "SpatialTimeGridDataFrame", spplot.stgdf) [1] "spplot" Now, the result of > library(lattice) > trellis.par.set(canonical.theme(color = FALSE)) > spplot(x, format = "%H:%M:%S", as.table = TRUE, cuts = 6, + col.regions = grey.colors(7, 0.55, 0.95, 2.2)) is shown in Figure 3. The format argument passed controls the way time is printed; one can refer to the help of > `?`(as.character.posixt) for more details about the format argument. 5 Analysing spatial Monte Carlo simulations Quite often, spatial statistical analysis results in a large number of spatial realisations or a random field, using some Monte Carlo simulation approach. Regardless whether individual values refer to points, lines, polygons or grid cells, we would like to write some methods or functions that aggregate over these simulations, to get summary statistics such as the mean value, quantiles, or cumulative distributions values. Such aggregation can take place in two ways. 17
18 Either we aggregate over the probability space, and compute summary statistics for each geographical feature over the set of realisations (i.e., the rows of the attribute table), or for each realisation we aggregate over the complete geographical layer or a subset of it (i.e., aggregate over the columns of the attribute table). Let us first generate, as an example, a set of 100 conditional Gaussian simulations for the zinc variable in the meuse data set: > library(gstat) > data(meuse) > coordinates(meuse) <- ~x + y > v <- vgm(0.5, "Sph", 800, 0.05) > sim <- krige(log(zinc) ~ 1, meuse, meuse.grid, v, nsim = 100, + nmax = 30) drawing 100 GLS realisations of beta... [using conditional Gaussian simulation] > sim@data <- exp(sim@data) where the last statement back-transforms the simulations from the log scale to the observation scale. A quantile method for Spatial object attributes can be written as > quantile.spatial <- function(x,..., bylayer = FALSE) { + stopifnot("data" %in% slotnames(x)) + apply(x@data, ifelse(bylayer, 2, 1), quantile,...) after which we can find the sample lower and upper 95% confidence limits by > sim$lower <- quantile.spatial(sim[1:100], probs = 0.025) > sim$upper <- quantile.spatial(sim[1:100], probs = 0.975) To get the sample distribution of the areal median, we can aggregate over layers: > medians <- quantile.spatial(sim[1:100], probs = 0.5, + bylayer = TRUE) > hist(medians) It should be noted that in these particular cases, the quantities computed by simulations could have been obtained faster and exact by working analytically with ordinary (block) kriging and the normal distribution (Section in Bivand et al. (2008)). A statistic that cannot be obtained analytically is the sample distribution of the area fraction that exceeds a threshold. Suppose that 500 is a crucial threshold, and we want to summarise the sampling distribution of the area fraction where 500 is exceeded: > fractionbelow <- function(x, q, bylayer = FALSE) { + stopifnot(is(x, "Spatial")!("data" %in% + slotnames(x))) + apply(x@data < q, ifelse(bylayer, + 2, 1), function(r) sum(r)/length(r)) > over500 <- 1 - fractionbelow(sim[1:100], 200, bylayer = TRUE) > summary(over500) Min. 1st Qu. Median Mean 3rd Qu. Max
19 > quantile(over500, c(0.025, 0.975)) 2.5% 97.5% For space-time data, we could write methods that aggregate over space, over time or over space and time. 6 Processing massive grids Up to now we have made the assumption that gridded data can be completely read, and are kept by R in memory. In some cases, however, we need to process grids that exceed the memory capacity of the computers available. A method for analysing grids without fully loading them into memory then seems useful. Note that package rgdal allows for partial reading of grids, e.g., > fn <- system.file("pictures/erdas_spnad83.tif", package = "rgdal")[1] > x <- readgdal(fn, output.dim = c(120, 132)) > x$band1[x$band1 <= 0] <- NA > spplot(x, col.regions = bpy.colors()) reads a downsized grid, where 1% of the grid cells remained. Another option is reading certain rectangular sections of a grid, starting at some offset. Yet another approach is to use the low-level opening routines and then subset: > library(rgdal) > x <- GDAL.open(fn) > class(x) [1] "GDALReadOnlyDataset" attr(,"package") [1] "rgdal" > x.subs <- x[1:100, 1:100, 1] > class(x.subs) [1] "SpatialGridDataFrame" attr(,"package") [1] "sp" > gridparameters(x.subs) cellcentre.offset cellsize cells.dim x y An object of class GDALReadOnlyDataset only contains a file handle. The subset method "[" for it does not, as it quite often does, return an object of the same class but actually reads the data requested, with arguments interpreted as rows, columns and raster bands, and returns a SpatialGridDataFrame. We will now extend this approach to allow partial writing through "[" as well. As the actual code is rather lengthy and involves a lot of administration, it will not all be shown and details can be found in the rgdal source code. We will define two classes, 19
20 > setclass("spatialgdal", representation("spatial", + grid = "GridTopology", grod = "GDALReadOnlyDataset", + name = "character")) > setclass("spatialgdalwrite", "SpatialGDAL") that derive from Spatial, contain a GridTopology and a file handle in the grod slot. Next, we can define a function open.spatialgdal to open a raster file, returning a SpatialGDAL object and a function copy.spatialgdal that returns a writable copy of the opened raster. Note that some GDAL drivers only allow copying, some only writing and some both. > x <- open.spatialgdal(fn) > nrows <- GDALinfo(fn)["rows"] > ncols <- GDALinfo(fn)["columns"] > xout <- copy.spatialgdal(x, "erdas_spnad83_out.tif") > bls <- 20 > for (i in 1:(nrows/bls - 1)) { + r <- 1 + (i - 1) * bls + for (j in 1:(ncols/bls - 1)) { + c <- 1 + (j - 1) * bls + x.in <- x[r:(r + bls), c:(c + bls)] + xout[r:(r + bls), c:(c + bls)] <- x.in$band cat(paste("row-block", i, "\n")) > close(x) > close(xout) This requires the functions "[" and "[<-" to be present. They are set by > setmethod("[", "SpatialGDAL", function(x, i, j,..., + drop = FALSE) x@grod[i = i, j = j,...]) > setreplacemethod("[", "SpatialGDALWrite", function(x, + i, j,..., value) { +... ) where, for the latter, the implementation details are here omitted. It should be noted that single rows or columns cannot be read this way, as they cannot be converted sensibly to a grid. It should be noted that flat binary representations such as the Arc/Info Binary Grid allow much faster random access than ASCII representations or compressed formats such as jpeg varieties. Also, certain drivers in the GDAL library suggest an optimal block size for partial access (e.g., typically a single row), which is not used here 1. This chapter has sketched developments beyond the base sp classes and methods used otherwise in this book. Although we think that the base classes cater for many standard kinds of spatial data analysis, it is clear that specific research problems will call for specific solutions, and that the R environment provides the high-level abstractions needed to help busy researchers get their work done. 1 An attempt to use this block size is, at time of writing, found in the blockapply code, found in the THK branch of the rgdal project on R-forge. 20
21 References Roger S. Bivand, Edzer J. Pebesma and Virgilio Gomez-Rubio (2008). Applied spatial data analysis with R Springer, NY Braun, W. J. and Murdoch, D. J. (2007). A first course in statistical programming with R. Cambridge University Press, Cambridge. Chambers, J.M. (1998). Programming with Data. Springer, New York. Kirkwood, R., Lynch, M., Gales, N., Dann, P., and Sumner, M. (2006). At-sea movements and habitat use of adult male Australian fur seals (Arctocephalus pusillus doriferus). Canadian Journal of Zoology, 84: Page, B., McKenzie, J., Sumner, M., Coyne, M., and Goldsworthy, S. (2006). Spatial separation of foraging habitats among New Zealand fur seals. Marine Ecology Progress Series, 323: Venables, W. N. and Ripley, B. D. (2000). S Programming. Springer, New York. 21
Classes and Methods for Spatial Data: the sp Package
Classes and Methods for Spatial Data: the sp Package Edzer Pebesma Roger S. Bivand Feb 2005 Contents 1 Introduction 2 2 Spatial data classes 2 3 Manipulating spatial objects 3 3.1 Standard methods..........................
The sp Package. July 27, 2006
The sp Package July 27, 2006 Version 0.8-18 Date 2006-07-10 Title classes and methods for spatial data Author Edzer J. Pebesma , Roger Bivand and others Maintainer
Map overlay and spatial aggregation in sp
Map overlay and spatial aggregation in sp Edzer Pebesma June 5, 2015 Abstract Numerical map overlay combines spatial features from one map layer with the attribute (numerical) properties of another. This
Acknowledgements. S4 Classes and Methods. Overview. Introduction. S4 has been designed and written by. John Chambers. These slides contain material by
Acknowledgements S4 has been designed and written by John Chambers S4 Classes and Methods Friedrich Leisch These slides contain material by Robert Gentleman R Development Core Group Paul Murrell user!
Package neuralnet. February 20, 2015
Type Package Title Training of neural networks Version 1.32 Date 2012-09-19 Package neuralnet February 20, 2015 Author Stefan Fritsch, Frauke Guenther , following earlier work
APPLICATION OF OPEN SOURCE/FREE SOFTWARE (R + GOOGLE EARTH) IN DESIGNING 2D GEODETIC CONTROL NETWORK
INTERNATIONAL SCIENTIFIC CONFERENCE AND XXIV MEETING OF SERBIAN SURVEYORS PROFESSIONAL PRACTICE AND EDUCATION IN GEODESY AND RELATED FIELDS 24-26, June 2011, Kladovo -,,Djerdap upon Danube, Serbia. APPLICATION
Each function call carries out a single task associated with drawing the graph.
Chapter 3 Graphics with R 3.1 Low-Level Graphics R has extensive facilities for producing graphs. There are both low- and high-level graphics facilities. The low-level graphics facilities provide basic
Introduction to R+OSGeo (software installation and first steps)
Introduction to R+OSGeo (software installation and first steps) Tomislav Hengl ISRIC World Soil Information, Wageningen University About myself Senior researcher at ISRIC World Soil Information; PhD in
Object systems available in R. Why use classes? Information hiding. Statistics 771. R Object Systems Managing R Projects Creating R Packages
Object systems available in R Statistics 771 R Object Systems Managing R Projects Creating R Packages Douglas Bates R has two object systems available, known informally as the S3 and the S4 systems. S3
S4 Classes in 15 pages, more or less
S4 Classes in 15 pages, more or less February 12, 2003 Overview The preferred mechanism for object oriented programming in R is described in Chambers (1998). The actual implementation is slightly different
Introduction to the data.table package in R
Introduction to the data.table package in R Revised: September 18, 2015 (A later revision may be available on the homepage) Introduction This vignette is aimed at those who are already familiar with creating
R Language Fundamentals
R Language Fundamentals Data Types and Basic Maniuplation Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Where did R come from? Overview Atomic Vectors Subsetting
GEOSTAT13 Spatial Data Analysis Tutorial: Raster Operations in R Tuesday AM
GEOSTAT13 Spatial Data Analysis Tutorial: Raster Operations in R Tuesday AM In this tutorial we shall experiment with the R package 'raster'. We will learn how to read and plot raster objects, learn how
Package TimeWarp. R topics documented: April 1, 2015
Type Package Title Date Calculations and Manipulation Version 1.0.12 Date 2015-03-30 Author Tony Plate, Jeffrey Horner, Lars Hansen Maintainer Tony Plate
Package MDM. February 19, 2015
Type Package Title Multinomial Diversity Model Version 1.3 Date 2013-06-28 Package MDM February 19, 2015 Author Glenn De'ath ; Code for mdm was adapted from multinom in the nnet package
Psychology 205: Research Methods in Psychology
Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready
2 intervals-package. Index 33. Tools for working with points and intervals
Package intervals March 7, 2013 Version 0.14.0 Date 2013-03-06 Type Package Title Tools for working with points and intervals Author Richard Bourgon Maintainer Richard Bourgon
Package DSsim. September 25, 2015
Package DSsim September 25, 2015 Depends graphics, splancs, mrds, mgcv, shapefiles, methods Suggests testthat, parallel Type Package Title Distance Sampling Simulations Version 1.0.4 Date 2015-09-25 LazyLoad
Package RCassandra. R topics documented: February 19, 2015. Version 0.1-3 Title R/Cassandra interface
Version 0.1-3 Title R/Cassandra interface Package RCassandra February 19, 2015 Author Simon Urbanek Maintainer Simon Urbanek This packages provides
Package move. July 7, 2015
Type Package Package move July 7, 2015 Title Visualizing and Analyzing Animal Track Data Version 1.5.514 Date 2015-07-06 Author Bart Kranstauber ,
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Reading & Writing Spatial Data in R John Lewis. Some material used in these slides are taken from presentations by Roger Bivand and David Rossiter
Reading & Writing Spatial Data in R John Lewis Some material used in these slides are taken from presentations by Roger Bivand and David Rossiter Introduction Having described how spatial data may be represented
Beas Inventory location management. Version 23.10.2007
Beas Inventory location management Version 23.10.2007 1. INVENTORY LOCATION MANAGEMENT... 3 2. INTEGRATION... 4 2.1. INTEGRATION INTO SBO... 4 2.2. INTEGRATION INTO BE.AS... 4 3. ACTIVATION... 4 3.1. AUTOMATIC
Package bigrf. February 19, 2015
Version 0.1-11 Date 2014-05-16 Package bigrf February 19, 2015 Title Big Random Forests: Classification and Regression Forests for Large Data Sets Maintainer Aloysius Lim OS_type
PGR Computing Programming Skills
PGR Computing Programming Skills Dr. I. Hawke 2008 1 Introduction The purpose of computing is to do something faster, more efficiently and more reliably than you could as a human do it. One obvious point
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
Cluster Analysis using R
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so that the objects in the same cluster are more similar (in some sense or another) to each other
Package HHG. July 14, 2015
Type Package Package HHG July 14, 2015 Title Heller-Heller-Gorfine Tests of Independence and Equality of Distributions Version 1.5.1 Date 2015-07-13 Author Barak Brill & Shachar Kaufman, based in part
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
Package biganalytics
Package biganalytics February 19, 2015 Version 1.1.1 Date 2012-09-20 Title A library of utilities for big.matrix objects of package bigmemory. Author John W. Emerson and Michael
Package rgeos. September 28, 2015
Package rgeos September 28, 2015 Title Interface to Geometry Engine - Open Source (GEOS) Version 0.3-13 Date 2015-09-26 Depends R (>= 2.14.0) Imports methods, sp (>= 1.1-0), utils, stats, graphics LinkingTo
Physical Design. Meeting the needs of the users is the gold standard against which we measure our success in creating a database.
Physical Design Physical Database Design (Defined): Process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and
Using Excel for descriptive statistics
FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These
Viewing Ecological data using R graphics
Biostatistics Illustrations in Viewing Ecological data using R graphics A.B. Dufour & N. Pettorelli April 9, 2009 Presentation of the principal graphics dealing with discrete or continuous variables. Course
Installing R and the psych package
Installing R and the psych package William Revelle Department of Psychology Northwestern University August 17, 2014 Contents 1 Overview of this and related documents 2 2 Install R and relevant packages
Package maptools. September 29, 2015
Version 0.8-37 Date 2015-09-28 Package maptools September 29, 2015 Title Tools for Reading and Handling Spatial Objects Encoding UTF-8 Depends R (>= 2.10), sp (>= 1.0-11) Imports foreign (>= 0.8), methods,
Getting started with qplot
Chapter 2 Getting started with qplot 2.1 Introduction In this chapter, you will learn to make a wide variety of plots with your first ggplot2 function, qplot(), short for quick plot. qplot makes it easy
Package SHELF. February 5, 2016
Type Package Package SHELF February 5, 2016 Title Tools to Support the Sheffield Elicitation Framework (SHELF) Version 1.1.0 Date 2016-01-29 Author Jeremy Oakley Maintainer Jeremy Oakley
Graphics in R. Biostatistics 615/815
Graphics in R Biostatistics 615/815 Last Lecture Introduction to R Programming Controlling Loops Defining your own functions Today Introduction to Graphics in R Examples of commonly used graphics functions
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES
INTRODUCTION TO SURVEY DATA ANALYSIS THROUGH STATISTICAL PACKAGES Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION A sample survey is a process for collecting
Lab 13: Logistic Regression
Lab 13: Logistic Regression Spam Emails Today we will be working with a corpus of emails received by a single gmail account over the first three months of 2012. Just like any other email address this account
Evaluating Trading Systems By John Ehlers and Ric Way
Evaluating Trading Systems By John Ehlers and Ric Way INTRODUCTION What is the best way to evaluate the performance of a trading system? Conventional wisdom holds that the best way is to examine the system
Package survpresmooth
Package survpresmooth February 20, 2015 Type Package Title Presmoothed Estimation in Survival Analysis Version 1.1-8 Date 2013-08-30 Author Ignacio Lopez de Ullibarri and Maria Amalia Jacome Maintainer
Package HadoopStreaming
Package HadoopStreaming February 19, 2015 Type Package Title Utilities for using R scripts in Hadoop streaming Version 0.2 Date 2009-09-28 Author David S. Rosenberg Maintainer
Package tagcloud. R topics documented: July 3, 2015
Package tagcloud July 3, 2015 Type Package Title Tag Clouds Version 0.6 Date 2015-07-02 Author January Weiner Maintainer January Weiner Description Generating Tag and Word Clouds.
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Once the schema has been designed, it can be implemented in the RDBMS.
2. Creating a database Designing the database schema... 1 Representing Classes, Attributes and Objects... 2 Data types... 5 Additional constraints... 6 Choosing the right fields... 7 Implementing a table
The Saves Package. an approximate benchmark of performance issues while loading datasets. Gergely Daróczi [email protected].
The Saves Package an approximate benchmark of performance issues while loading datasets Gergely Daróczi [email protected] December 27, 2013 1 Introduction The purpose of this package is to be able
Lesson 15 - Fill Cells Plugin
15.1 Lesson 15 - Fill Cells Plugin This lesson presents the functionalities of the Fill Cells plugin. Fill Cells plugin allows the calculation of attribute values of tables associated with cell type layers.
CATIA V5 Tutorials. Mechanism Design & Animation. Release 18. Nader G. Zamani. University of Windsor. Jonathan M. Weaver. University of Detroit Mercy
CATIA V5 Tutorials Mechanism Design & Animation Release 18 Nader G. Zamani University of Windsor Jonathan M. Weaver University of Detroit Mercy SDC PUBLICATIONS Schroff Development Corporation www.schroff.com
Introduction to the R Language
Introduction to the R Language Functions Biostatistics 140.776 Functions Functions are created using the function() directive and are stored as R objects just like anything else. In particular, they are
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
Introduction to the North Carolina SIDS data set
Introduction to the North Carolina SIDS data set Roger Bivand October 9, 2007 1 Introduction This data set was presented first in Symons et al. (1983), analysed with reference to the spatial nature of
Package CoImp. February 19, 2015
Title Copula based imputation method Date 2014-03-01 Version 0.2-3 Package CoImp February 19, 2015 Author Francesca Marta Lilja Di Lascio, Simone Giannerini Depends R (>= 2.15.2), methods, copula Imports
Chapter 6: The Information Function 129. CHAPTER 7 Test Calibration
Chapter 6: The Information Function 129 CHAPTER 7 Test Calibration 130 Chapter 7: Test Calibration CHAPTER 7 Test Calibration For didactic purposes, all of the preceding chapters have assumed that the
Representing Geography
3 Representing Geography OVERVIEW This chapter introduces the concept of representation, or the construction of a digital model of some aspect of the Earth s surface. The geographic world is extremely
Package MBA. February 19, 2015. Index 7. Canopy LIDAR data
Version 0.0-8 Date 2014-4-28 Title Multilevel B-spline Approximation Package MBA February 19, 2015 Author Andrew O. Finley , Sudipto Banerjee Maintainer Andrew
Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test
Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important
Classification and Regression by randomforest
Vol. 2/3, December 02 18 Classification and Regression by randomforest Andy Liaw and Matthew Wiener Introduction Recently there has been a lot of interest in ensemble learning methods that generate many
A HYBRID APPROACH FOR AUTOMATED AREA AGGREGATION
A HYBRID APPROACH FOR AUTOMATED AREA AGGREGATION Zeshen Wang ESRI 380 NewYork Street Redlands CA 92373 [email protected] ABSTRACT Automated area aggregation, which is widely needed for mapping both natural
Grade 6 Mathematics Common Core State Standards
Grade 6 Mathematics Common Core State Standards Standards for Mathematical Practice HOW make sense of problems, persevere in solving them, and check the reasonableness of answers. reason with and flexibly
Package sjdbc. R topics documented: February 20, 2015
Package sjdbc February 20, 2015 Version 1.5.0-71 Title JDBC Driver Interface Author TIBCO Software Inc. Maintainer Stephen Kaluzny Provides a database-independent JDBC interface. License
STATGRAPHICS Online. Statistical Analysis and Data Visualization System. Revised 6/21/2012. Copyright 2012 by StatPoint Technologies, Inc.
STATGRAPHICS Online Statistical Analysis and Data Visualization System Revised 6/21/2012 Copyright 2012 by StatPoint Technologies, Inc. All rights reserved. Table of Contents Introduction... 1 Chapter
Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
Introduction to GIS. Dr F. Escobar, Assoc Prof G. Hunter, Assoc Prof I. Bishop, Dr A. Zerger Department of Geomatics, The University of Melbourne
Introduction to GIS 1 Introduction to GIS http://www.sli.unimelb.edu.au/gisweb/ Dr F. Escobar, Assoc Prof G. Hunter, Assoc Prof I. Bishop, Dr A. Zerger Department of Geomatics, The University of Melbourne
Access Queries (Office 2003)
Access Queries (Office 2003) Technical Support Services Office of Information Technology, West Virginia University OIT Help Desk 293-4444 x 1 oit.wvu.edu/support/training/classmat/db/ Instructor: Kathy
Create a folder on your network drive called DEM. This is where data for the first part of this lesson will be stored.
In this lesson you will create a Digital Elevation Model (DEM). A DEM is a gridded array of elevations. In its raw form it is an ASCII, or text, file. First, you will interpolate elevations on a topographic
Journal of Statistical Software
JSS Journal of Statistical Software October 2006, Volume 16, Code Snippet 3. http://www.jstatsoft.org/ LDheatmap: An R Function for Graphical Display of Pairwise Linkage Disequilibria between Single Nucleotide
DATA QUALITY IN GIS TERMINOLGY GIS11
DATA QUALITY IN GIS When using a GIS to analyse spatial data, there is sometimes a tendency to assume that all data, both locational and attribute, are completely accurate. This of course is never the
The Trip Scheduling Problem
The Trip Scheduling Problem Claudia Archetti Department of Quantitative Methods, University of Brescia Contrada Santa Chiara 50, 25122 Brescia, Italy Martin Savelsbergh School of Industrial and Systems
Tutorial 3 - Map Symbology in ArcGIS
Tutorial 3 - Map Symbology in ArcGIS Introduction ArcGIS provides many ways to display and analyze map features. Although not specifically a map-making or cartographic program, ArcGIS does feature a wide
Exploratory Data Analysis and Plotting
Exploratory Data Analysis and Plotting The purpose of this handout is to introduce you to working with and manipulating data in R, as well as how you can begin to create figures from the ground up. 1 Importing
Package EstCRM. July 13, 2015
Version 1.4 Date 2015-7-11 Package EstCRM July 13, 2015 Title Calibrating Parameters for the Samejima's Continuous IRT Model Author Cengiz Zopluoglu Maintainer Cengiz Zopluoglu
Digital Terrain Model Grid Width 10 m DGM10
Digital Terrain Model Grid Width 10 m Status of documentation: 23.02.2015 Seite 1 Contents page 1 Overview of dataset 3 2 Description of the dataset contents 4 3 Data volume 4 4 Description of the data
In this chapter, you will learn improvement curve concepts and their application to cost and price analysis.
7.0 - Chapter Introduction In this chapter, you will learn improvement curve concepts and their application to cost and price analysis. Basic Improvement Curve Concept. You may have learned about improvement
Toothpick Squares: An Introduction to Formulas
Unit IX Activity 1 Toothpick Squares: An Introduction to Formulas O V E R V I E W Rows of squares are formed with toothpicks. The relationship between the number of squares in a row and the number of toothpicks
Performance Level Descriptors Grade 6 Mathematics
Performance Level Descriptors Grade 6 Mathematics Multiplying and Dividing with Fractions 6.NS.1-2 Grade 6 Math : Sub-Claim A The student solves problems involving the Major Content for grade/course with
WhatsApp monitoring for Fun and Stalking. by Flavio Giobergia
WhatsApp monitoring for Fun and Stalking by Flavio Giobergia The idea behind the project If you have used WhatsApp at least once in your life, you know that there is a really lovely/shitty (deping on the
An Introduction to Point Pattern Analysis using CrimeStat
Introduction An Introduction to Point Pattern Analysis using CrimeStat Luc Anselin Spatial Analysis Laboratory Department of Agricultural and Consumer Economics University of Illinois, Urbana-Champaign
Vector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
Deal with big data in R using bigmemory package
Deal with big data in R using bigmemory package Xiaojuan Hao Department of Statistics University of Nebraska-Lincoln April 28, 2015 Background What Is Big Data Size (We focus on) Complexity Rate of growth
InfiniteInsight 6.5 sp4
End User Documentation Document Version: 1.0 2013-11-19 CUSTOMER InfiniteInsight 6.5 sp4 Toolkit User Guide Table of Contents Table of Contents About this Document 3 Common Steps 4 Selecting a Data Set...
Package retrosheet. April 13, 2015
Type Package Package retrosheet April 13, 2015 Title Import Professional Baseball Data from 'Retrosheet' Version 1.0.2 Date 2015-03-17 Maintainer Richard Scriven A collection of tools
Performance Tuning for the Teradata Database
Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document
Exploratory Data Analysis for Ecological Modelling and Decision Support
Exploratory Data Analysis for Ecological Modelling and Decision Support Gennady Andrienko & Natalia Andrienko Fraunhofer Institute AIS Sankt Augustin Germany http://www.ais.fraunhofer.de/and 5th ECEM conference,
Importing ASCII Grid Data into GIS/Image processing software
Importing ASCII Grid Data into GIS/Image processing software Table of Contents Importing data into: ArcGIS 9.x ENVI ArcView 3.x GRASS Importing the ASCII Grid data into ArcGIS 9.x Go to Table of Contents
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
University of Arkansas Libraries ArcGIS Desktop Tutorial. Section 2: Manipulating Display Parameters in ArcMap. Symbolizing Features and Rasters:
: Manipulating Display Parameters in ArcMap Symbolizing Features and Rasters: Data sets that are added to ArcMap a default symbology. The user can change the default symbology for their features (point,
Analysing equity portfolios in R
Analysing equity portfolios in R Using the portfolio package by David Kane and Jeff Enos Introduction 1 R is used by major financial institutions around the world to manage billions of dollars in equity
R: A self-learn tutorial
R: A self-learn tutorial 1 Introduction R is a software language for carrying out complicated (and simple) statistical analyses. It includes routines for data summary and exploration, graphical presentation
Getting Started with R and RStudio 1
Getting Started with R and RStudio 1 1 What is R? R is a system for statistical computation and graphics. It is the statistical system that is used in Mathematics 241, Engineering Statistics, for the following
