Lecture 3: Models of Spatial Information

Introduction

In the last lecture we discussed issues of cartography, particularly the abstraction of real-world objects into points, lines, and areas for use in maps. Today we take this discussion a bit further and begin talking about the ways in which spatial information is modeled. The discussion moves from conceptual models to data models to logical models, focusing on the two most common approaches to representing spatial data.

Conceptual Models of Spatial Information

Conceptually, there are two standard ways in which spatial information is modeled: object-based and field-based models.

Object-based models treat the information space as populated by discrete, georeferenced entities. These entities are objects that have coordinates defining their location in the world. Because this conceptual model is focused on objects, its implementation yields data models and structures that are focused on objects.

Field-based models treat information as collections of spatial distributions, where each distribution is formalized as a mathematical function from a spatial framework. Two terms in this definition need explanation. The term mathematical function indicates that the model assigns a value, usually a number, to every location; the term spatial framework indicates that the model divides an area into a finite tessellation of spatial units. In short, the model divides an area into an array of smaller units and puts a number into each one. Unlike object-based models, field-based models are focused on location, and yield data models and structures that are focused on location.

Data Models

Drawn from the conceptual models, spatial data models fall into two categories: vector and raster.

Because vector data models originate in object-based models, it is not surprising that their basic data units are objects. In these models, objects are points, lines, and polygons.
Because the geometry of each object in a vector data model makes clear what it is (point, line, or polygon), the object type is treated as implicit. Location, however, must be encoded explicitly for each object. These locations are generally expressed as Cartesian coordinates on a Euclidean plane (Figure 1).
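As a rough illustration of explicit location encoding, the sketch below stores a point, a line, and a polygon as lists of Cartesian coordinates and derives simple measurements from them. The coordinates and the function names `line_length` and `polygon_area` are made up for this example; they are not taken from any particular GIS package.

```python
from math import hypot

# Hypothetical vector objects: each one carries its location explicitly as (x, y) pairs.
point = (3.0, 4.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # ring, implicitly closed


def line_length(vertices):
    """Sum the Euclidean lengths of the segments between consecutive vertices."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])
    )


def polygon_area(ring):
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(line_length(line))      # length of the polyline
print(polygon_area(polygon))  # area of the rectangle
```

Nothing in the data says "this is a polygon" beyond the shape of the coordinate list itself, while every location is written out in full, which mirrors the implicit/explicit split described above.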
Figure 1: Basics of vector and raster data structures (Bolstad 2002).

In raster data models the basic data unit is spatial, represented most often as a cell in a tessellated array of rows and columns. Because the cells are of known size and the array has a known origin, the x and y coordinates of each cell are implicit. Information about the objects in the model, however, must be explicitly encoded. That is, whether a cell in the array is part of a point, line, or area, or is blank space, must be designated explicitly by a numeric code (Figure 1).

Logical Models

Moving from data models to logical models brings us to the point at which we can begin talking about specific data structures, and about how objects are stored and manipulated within specific hardware and software environments. The two primary data structures are vector and raster. Each has specific strengths and weaknesses.

Vector Data Structure

Vector structures, following their heritage in vector data models, are designed around point, line, and polygon objects and their related attribute data. Typically, there are tables with an ID number for each object in the data set. These ID numbers allow objects to be connected to a variety of tabular attribute data (Figure 2). This ability to connect geometric data with tabular data is commonly known as the georelational model. The way in which the coordinates of objects are stored is specific to particular file structures. Generally, coordinates are stored either as BLOBs in standard attribute tables or in hidden files connected to the attribute tables by the object ID numbers.
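The georelational idea described above can be sketched with two small tables that share an object ID. This is a minimal illustration under assumed data: the parcel IDs, field names, and the helpers `join_features` and `select` are hypothetical and do not reflect how any specific file format stores its tables.

```python
# Geometry table: object ID -> coordinates (hypothetical parcels).
geometry = {
    101: [(0, 0), (10, 0), (10, 10), (0, 10)],
    102: [(10, 0), (20, 0), (20, 10), (10, 10)],
}

# Attribute table: one row per object, keyed by the same ID.
attributes = [
    {"id": 101, "owner": "Smith", "land_use": "residential"},
    {"id": 102, "owner": "Jones", "land_use": "commercial"},
]


def join_features(geometry, attributes):
    """Relate each attribute row to its geometry through the shared object ID."""
    features = []
    for row in attributes:
        coords = geometry.get(row["id"])
        if coords is not None:  # skip rows that have no matching geometry
            features.append({**row, "coords": coords})
    return features


def select(features, field, value):
    """Return the IDs of features whose attribute `field` equals `value`."""
    return [f["id"] for f in features if f[field] == value]


features = join_features(geometry, attributes)
commercial_ids = select(features, "land_use", "commercial")
print(commercial_ids)
```

Keeping geometry and attributes in separate tables means additional attribute tables can be related to the same objects later, as long as they carry the same ID.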
Figure 2: The georelational model (from ESRI promotional materials).

In a vector structure, analysis takes place by overlay of georeferenced data themes. The overlay process creates new data themes of combined geometry that carry with them all the attributes of the original themes (Figure 3).

Figure 3: Vector structure and overlay analysis (Bolstad 2002).
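The overlay process can be sketched in a deliberately simplified form. The example below assumes every feature is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax), which keeps the geometry trivial; real overlay tools handle arbitrary polygons. The themes `soils` and `zoning`, their attributes, and the function names are hypothetical, and the attribute names of the two themes are assumed not to collide.

```python
def intersect(a, b):
    """Intersection of two axis-aligned rectangles (xmin, ymin, xmax, ymax), or None."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin < xmax and ymin < ymax:
        return (xmin, ymin, xmax, ymax)
    return None  # the rectangles do not overlap (or only touch along an edge)


def overlay(theme_a, theme_b):
    """Intersect every feature pair; each output piece carries both attribute sets."""
    result = []
    for fa in theme_a:
        for fb in theme_b:
            box = intersect(fa["box"], fb["box"])
            if box is not None:
                # Assumes the two themes use distinct attribute names.
                attrs = {**fa["attrs"], **fb["attrs"]}
                result.append({"box": box, "attrs": attrs})
    return result


soils = [
    {"box": (0, 0, 10, 10), "attrs": {"soil": "clay"}},
    {"box": (10, 0, 20, 10), "attrs": {"soil": "sand"}},
]
zoning = [
    {"box": (5, 0, 15, 10), "attrs": {"zone": "farm"}},
]

combined = overlay(soils, zoning)
for feature in combined:
    print(feature)
```

Each output feature has new geometry (the overlapping area) together with the attributes of both input themes, which is the behavior the overlay description above refers to.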
Raster Data Structure

In a raster data structure, data are stored in a grid of columns and rows. The intersection of each row and column is known as a cell. Each cell corresponds to x and y coordinates in the real world and contains a z value that can represent anything from elevation to census data. These numbers can be used to assign colors or shades when displaying the raster (Figure 4).

Figure 4: Raster data structure (Bolstad 2002).

Analysis in a raster environment involves mathematical manipulation of the values in the cells of the array. This analysis can be based on individual cells or on neighborhoods of cells. Because the data are simply numbers in an array, any analysis based on an equation or statistical model can be applied to raster data (Figure 5).
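The following sketch illustrates the three ideas in this section: cell coordinates derived from the grid origin and cell size, a cell-by-cell calculation, and a neighborhood calculation. The elevation values, origin, cell size, and function names are assumptions chosen for the example, and the grid is a plain list of rows instead of any specific raster file format.

```python
# Hypothetical elevation raster in meters: a list of rows, row 0 at the top (north).
elevation = [
    [10, 12, 14],
    [11, 13, 15],
    [12, 14, 16],
]
ORIGIN_X, ORIGIN_Y = 500000.0, 4200090.0  # assumed upper-left corner of the array
CELL_SIZE = 30.0                          # assumed cell width and height


def cell_center(row, col):
    """Derive a cell's x, y from the origin and cell size; nothing is stored per cell."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE  # y decreases as the row index increases
    return x, y


def local_op(grid, func):
    """Cell-by-cell analysis: apply `func` to each value independently."""
    return [[func(v) for v in row] for row in grid]


def focal_mean(grid):
    """Neighborhood analysis: mean of each cell's 3x3 window, clipped at the edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


feet = local_op(elevation, lambda m: m * 3.28084)  # convert meters to feet
smoothed = focal_mean(elevation)
print(cell_center(0, 0))
print(smoothed[1][1])
```

Because neighboring locations are neighboring array positions, the neighborhood calculation needs only index arithmetic and no coordinate lookups.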
Figure 5: Analysis in a raster structure is based on mathematical functions (Bolstad 2002).

Strength in Structures

Because they derive from the field-based and object-based conceptual models respectively, raster and vector data structures each have their own strengths and weaknesses. Table 1 provides a quick comparison of the strengths of each structure. Although most commercial GIS applications support both structures, knowing the strengths and weaknesses of each will allow you to select the best data structure for each spatial data theme in your geodatabase.

Table 1: Raster and Vector Strengths

Raster Strengths
- Geographic position is implicit, so there is no need to store x,y coordinates for objects.
- Neighboring locations are represented by neighboring cells, making neighborhood analyses simple.
- Accommodates both discrete and continuous data.
- Because the data are all numbers, analytical algorithms are easy to create and apply.
- Compatible with remotely sensed data.

Vector Strengths
- The point-line-polygon format is familiar, which makes vector maps easier to understand.
- Vector systems have small storage requirements because only the individual objects are stored.
- As objects, features can be retrieved individually for processing.
- A variety of descriptive data in tabular format can be associated with a single feature.
- Superior cartographic products.

Summary

This discussion has explored the most common ways in which spatial information is represented. It followed the development of conceptual models, leading to data models, and eventually to logical models. (Data structures specific to ESRI software will be covered in labs.)

Bonus Section

Early Raster Map: This image shows a section of the Byzantine-period floor of the church in Madaba, in present-day Jordan. The floor is a mosaic map of the known (Byzantine) world, covering everything from Asia to Spain, and from Northern Africa to Northern Europe. In this image, we can see the city of Jerusalem, including the main streets, important buildings, and gates. Relevant to today's discussion is that this map represents an early raster data structure for spatial information. The mosaic floor is made up of thousands of tesserae, small blocks of colored stone.
References Cited

Bolstad, P. (2002). GIS Fundamentals: A First Text on Geographic Information Systems. White Bear Lake, MN: Eider Press.