Constructing an EA-level Database. for the Census

Similar documents
Introduction to GIS (Basics, Data, Analysis) & Case Studies. 13 th May Content. What is GIS?

A GIS helps you answer questions and solve problems by looking at your data in a way that is quickly understood and easily shared.

GIS Spatial Data Standards

Basics on Geodatabases

APLS GIS Data: Classification, Potential Misuse, and Practical Limitations

Implementation Planning

Spatial data models (types) Not taught yet

ArcGIS Data Models Practical Templates for Implementing GIS Projects

Guidelines on Information Deliverables for Research Projects in Grand Canyon National Park

WHAT IS GIS - AN INRODUCTION

GEOGRAPHIC INFORMATION SYSTEMS Lecture 20: Adding and Creating Data

SESSION 8: GEOGRAPHIC INFORMATION SYSTEMS AND MAP PROJECTIONS

GEOGRAPHIC INFORMATION SYSTEMS

Institute of Natural Resources Departament of General Geology and Land use planning Work with a MAPS

GIS Data Conversion. GIS maps are digital not analog. Getting the Map into the Computer

ESRI Experience in the Use of GIS for Census Mapping Applications

Introduction to GIS. Dr F. Escobar, Assoc Prof G. Hunter, Assoc Prof I. Bishop, Dr A. Zerger Department of Geomatics, The University of Melbourne

GEOGRAPHIC INFORMATION SYSTEMS CERTIFICATION

INDIVIDUAL COURSE DETAILS

Geographic Information Systems

Data Validation Online References

10. Creating and Maintaining Geographic Databases. Learning objectives. Keywords and concepts. Overview. Definitions

GIS Databases With focused on ArcSDE

INTRODUCTION TO ARCGIS SOFTWARE

Agenda. What is GIS? GIS and SAP Real Examples

Hydrographic Data Management using GIS Technologies

Digital Cadastral Maps in Land Information Systems

History of Database Systems

Chapter 2. Data Model. Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel

Technology Trends In Geoinformation

GIS Architecture and Data Management Practices Boone County GIS Created and Maintained by the Boone County Planning Commission GIS Services Division

Database System Architecture & System Catalog Instructor: Mourad Benchikh Text Books: Elmasri & Navathe Chap. 17 Silberschatz & Korth Chap.

The UCC-21 cognitive skills that are listed above will be met via the following objectives.

Attribute Data and Relational Database. Lecture 5 9/21/2006

Geodatabase Programming with SQL

Geographic Information System Technician

Chapter Contents Page No

Geographic Information Systems In Emergency Management

Cookbook 23 September 2013 GIS Analysis Part 1 - A GIS is NOT a Map!

DATABASE MANAGEMENT FILES GIS06

Best Practices for Developing Geographic Information Models

Creating Maps in QGIS: A Quick Guide

GIS. Digital Humanities Boot Camp Series

Chapter 10 Practical Database Design Methodology and Use of UML Diagrams

Vector analysis - introduction Spatial data management operations - Assembling datasets for analysis. Data management operations

White Paper. Freeance Mobile for Cityworks

Chapter 2 Database System Concepts and Architecture

Gatekeeper Systems. NaviGate USA (Underground Service Alert) Software Product Description. Product Summary. Product Description.

Geographical Information Systems An Overview

{ { { Meeting Date 08/03/10. City of Largo Agenda Item 24. Leland Dicus, P.E., City Engineer

The Use of Geographic Information Systems in Risk Assessment

Topology. Shapefile versus Coverage Views

Databases and DBMS. What is a Database?

Overview RDBMS-ORDBMS- OODBMS

Using CAD Data in ArcGIS

Building & Developing the Environmental

E c o n o m i c. S o c i a l A f f a i r s HANDBOOK ON GEOSPATIAL INFRASTRUCTURE IN SUPPORT OF CENSUS ACTIVITIES. United Nations

Foundations of Business Intelligence: Databases and Information Management

Oracle Big Data Spatial and Graph

GIS Data in ArcGIS. Pay Attention to Data!!!

UPDATED GIS DATABASE DESIGN: Geodatabase Model

GEOGRAPHIC INFORMATION SYSTEMS

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Concepts of Database Management Eighth Edition. Chapter 1 Introduction to Database Management

Can GIS Help You Manage Water Resources? Erika Boghici Texas Natural Resources Information Systems

Visualization Method of Trajectory Data Based on GML, KML

Kingdom Of Bahrain Ministry of Works. Enterprise Asset Management System A Geocentric Approach. Presented By Hisham Y.

CHAPTER-IV DATABASE MANAGEMENT AND ITS ENVIRONMENT

Assessment Tasks Pass theory exams at > 70%. Meet, or exceed, outcome criteria for projects and assignments.

Oracle10g & Beyond. Justin Lokitz. Senior Member - Technical Staff GIS/Web Development Specialist

Aerial imagery and geographic information systems used in the asbestos removal process in Poland

Using Geographic Information Systems to provide better e-services

STATE OF NEVADA Department of Administration Division of Human Resource Management CLASS SPECIFICATION

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

Qatar National Geospatial Infrastructure

REQUEST FOR PROPOSAL FOR THE DEVELOPMENT OF A GIS PARCEL MAPPING FOR VERNON COUNTY, MISSOURI

COURSE NAME: Database Management. TOPIC: Database Design LECTURE 3. The Database System Life Cycle (DBLC) The database life cycle contains six phases;

Web and Mobile GIS Applications Development

SPATIAL ANALYSIS IN GEOGRAPHICAL INFORMATION SYSTEMS. A DATA MODEL ORffiNTED APPROACH

City of Tigard. GIS Data Standards

Syllabus AGET 782. GIS for Agricultural and Natural Resources Management

Reading Questions. Lo and Yeung, 2007: Schuurman, 2004: Chapter What distinguishes data from information? How are data represented?

Government 1008: Introduction to Geographic Information Systems. LAB EXERCISE 4: Got Database?

Answers to Review Questions

Advanced Image Management using the Mosaic Dataset

ArcGIS. Server. A Complete and Integrated Server GIS

Development of a Geographic Information Systems Road Network Database for Emergency Response; A Case Study Of Oyo-Town, Oyo State, Nigeria

The GIS Primer provides an overview of issues and requirements for implementing and applying geographic information systems technology.

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

Creating a File Geodatabase


Database Production and Map Series Management. Using. The Production Line Tool Set

Chapter 3. Database Environment - Objectives. Multi-user DBMS Architectures. Teleprocessing. File-Server

Module 3: File and database organization

DATA QUALITY IN GIS TERMINOLGY GIS11

are aimed for the investigation, planning, implementation, and decision making divisions.

REGIONAL SEDIMENT MANAGEMENT: A GIS APPROACH TO SPATIAL DATA ANALYSIS. Lynn Copeland Hardegree, Jennifer M. Wozencraft 1, Rose Dopsovic 2 INTRODUCTION

GIS and Census Mapping in St. Lucia

Compiled from ESRI s Web site: 1. What Is a GIS?

SAMPLE: DO NOT COMPLETE

Transcription:

Constructing an EA-level Database for the Census Amor Laaribi UN-GGIM Secretariat UN Statistics Division New York 1

Overview Stages in the Geographic Database Development Sources of geographic information Data conversion Data integration Implementation of the Database Data Modelling Relational Data Model Example Conclusion

Stages in the geographic database development Geographic data sources for EA delineation Inventory of existing data sources Additional geographic data collection Geographic data conversion Digitizing/Scanning + raster-to-vector conversion Editing Geographic features Constructing and maintaining topology for geographic features Data integration Geo-referencing/Coding Combining and integrating/additional delineation of EA boundaries Parallel activity Develop geographic attribute database Metadata development

Sources of geographic information Identify existing data sources Additional geographic data collection Paper maps, existing printed air photos and satellite imagery Field mapping products such as sketch maps Digital air photos and satellite images GPS coordinate collection Existing digital maps

Inventory of existing sources National mapping agency (often the lead agency in the country); Military mapping services; Province, district and municipal governments. (transportation, social services, utility services and planning relevant information); Various government/private organizations dealing with spatial data; Geological or hydrological survey, Environmental protection authority, Utility and communication sector companies; Donor activities

Why Data Inventory? Geographic data: Labor intensive, tedious and error-prone Up to 70% of GIS projects Identify existing data sources UNSD-CELADE Regional Workshop on Census

Geographic data conversion Data conversion: The process of converting features that are visible on a hardcopy map into digital point, line, polygon and attribute information is called data automation or data conversion. The best strategy for data conversion depends on many factors including data availability, time and resource constraints Cost Trade-off between the cost of a project, the amount of time required to complete data conversion, and the quality of the final product. Speed Quality

Data Conversion Paper maps, existing printed air photos and satellite imagery Field mapping products such as sketch maps Digital air photos and satellite images Digitizing Scanning Raster-to-vector conversion

Geographic data conversion 2 main approaches for converting information on hardcopy maps to digital data: Scanning Digitizing

Scanning Scanning has arguably bypassed digitizing as the main method of spatial data input, mainly because of the potential to automate some tedious data-input steps using large-format feed scanners and interactive vectorization software. The result of the scanning process is a raster image of the original map which can be stored in a standard image format such as GIF or TIFF. After geo-referencing it can be displayed in GIS packages as a backdrop to existing vector data.

Advantages and Disadvantages of Scanning Advantages Scanned maps can be used as image backdrops for vector information; Clear base maps or original color separations can be vectorized relatively easily using raster-tovector conversion software; and Small-format scanners are relatively inexpensive and provide quick data capture. Disadvantages Converting large maps with small format scanners requires tedious re-assembly of the individual parts; Scanning large volumes of hardcopy maps will present challenges for file storage on many desktop computer systems; Despite recent advances in vectorization software, considerable manual editing and attribute labeling may still be required.

Raster to Vector Conversion Raster to Vector Conversion Since the end result of the conversion process is a digital geographic database of points and lines, the scanned information contained on the raster images needs to be converted into coordinate information. Digital air photos and satellite images Scanning Raster-to-vector conversion

Digitizing Manual Digitizing Digitizing is often tedious and tiring to the operators Heads up Digitizing In the heads-up digitizing, a scanned map image is used digitally to trace the outlines into a GIS layer

Heads-Up Digitizing II Operator uses a Raster-scanned image on the computer screen (a scanned map, air photo or satellite image) as a backdrop. Operator follows lines on-screen in vector mode

Advantages and Disadvantages of Digitizing Advantages Digitizing is easy to learn and thus does not require expensive skilled labor; Attribute information can be added during the digitizing process; High accuracy can be achieved through manual digitizing; i.e., there is usually no loss of accuracy compared to the source map. Disadvantages Digitizing is tedious possibly leading to operator fatigue and resulting quality problems which may require considerable postprocessing; Manual digitizing is quite slow; In contrast to primary data collection using GPS or aerial photography, the accuracy of digitized maps is limited by the quality of the source material.

Editing and Building topology Paper maps, existing printed air photos and satellite imagery Field mapping products such as sketch maps Digital air photos and satellite images GPS coordinate collection Existing digital maps Digitizing Scanning Raster-to-vector conversion Editing geographic features Generate lines and polygones Construct Topology for Geographic features

Editing Manual digitizing is error prone Objective is to produce an accurate representation of the original map data This means that all lines that connect on the map must also connect in the digital database There should be no missing features and no duplicate lines The most common types of errors Reconnect disconnected line segments, etc.

Some common digitizing errors spike undershoot missing line overshoot line digitized twice

Fixing Errors Some of the common digitizing errors shown in the figure can be avoided by using the digitizing software s snap tolerances that are defined by the user. For example, the user might specify that all endpoints of a line that are closer than 1 mm from another line will automatically be connected (snapped) to that line. Small sliver polygons that are created when a line is digitized twice can also be automatically removed.

Topology Data structure in which each point, line and polygon : knows where it is knows what is around it understands its environment knows how to get around Helps answer the question what is where?

Example of Spaghetti data structure 6 5 4 3 2 A B C Poly coordinates A (1,4), (1,6), (6,6), (6,4), (4,4), (1,4) B (1,4), (4,4), (4,1), (1,1), (1,4) C (4,4), (6,4), (6,1), (4,1), (4,4) 1 1 2 3 4 5 6

Example of Topological data structure 6 5 4 3 1 2 1 A I II III 4 5 2 B 6 C 3 IV 1 2 3 4 5 6 O = outside polygon Node X Y Lines Poly Lines I 1 4 1,2,4 A 1,4,5 II 4 4 4,5,6 B 2,4,6 III 6 4 1,3,5 C 3,5,6 IV 4 1 2,3,6 From To Left Right Line Node Node Poly Poly 1 I III O A 2 I IV B O 3 III IV O C 4 I II A B 5 II III A C 6 II IV C B

Spaghetti data structure 6 5 4 3 2 B A C Poly A B C Coordinates (1,4), (1,6), (6,6), (6,4), (4,4), (1,4) (1,4), (4,4), (4,1), (1,1), (1,4) (4,4), (6,4), (6,1), (4,1), (4,4) 1 1 2 3 4 5 6 Topological data structure 6 5 4 3 2 1 B 1 A 4 5 C 2 3 1 2 3 4 5 6 O = outside polygon 6 Node I II III IV Line 1 2 3 4 5 6 X 1 4 6 4 Y 4 4 4 1 From Node I I III I II II Lines 1,2,4 4,5,6 1,3,5 2,3,6 To Node III IV IV II Left Poly O B O A Poly A B C Lines 1,4,5 2,4,6 3,5,6 III IV A C Right Poly A O C B C B

Constructing and maintaining topology (cont.) Storing the topological information facilitates analysis, since many GIS operations do not actually require coordinate information, but are based only on topology The user typically does not have to worry about how the GIS stores topological information. How this is actually done is software-specific. Building topology thus also acts as a test of database integrity

Digital data integration Construct Topology for Geographic features Existing digital maps Geo-referencing (coordinate transformation and projection change) Coding (labeling) of digital geographic features Combine and integrate attribute data

Integrating data Geo-referencing Converting map coordinates to the real world coordinates corresponding to the source map s cartographic projection (or at digitizing stage). Attaching codes to the digitized features Integrating attribute data Spreadsheets links to external database

Integrating attribute data After the completed digital database has been verified to be error-free, the final step is to add additional attributes These can be linked to the database permanently, or the additional information about each database feature can be stored in separate files which are linked to the geographic database as needed

Implementation of an EA database Geographic databases (hereafter referred to as geodatabases) are more than spreadsheets Entity types can be defined as having specific properties that govern behavior in the real world. The EA as a geographic unit is a kind of object whose function is to delineate territory for the census canvassing operation. Morphologically, the EA is contiguous, it nests within administrative units, and it is composed of population-based units. UNSD-CELADE Regional Workshop on

Implementation of an EA database (cont.) All large operational GISs are built on geodatabases; Arguably the most important part of the GIS Geodatabases form the basis for all queries, analysis, and decision-making. A DBMS, or database management system, is where databases are stored.

Levels of Data Abstraction Reality Conceptual Model (user's perception of the real world) Logical Model (a formal description of the data model) Increasing Abstraction Humanoriented Computeroriented Physical Model (physical storage of the data e.g., format, order, path)

Levels of Data Abstraction Real World Physical Model - both hardware and software specific - how files will be structured for access from the disk - Database Schema Conceptual Model - software and hardware independent - describes and defines included entities - identifies how entities will be represented in the database: i.e. selection of spatial objects - points, lines, polygons, raster cells - requires decisions about how real-world dimensionality and relationships will be represented: based on the processing that will be done on these objects e.g. should a building be represented as an area or a point? Logical Model software specific but hardware independent sets out the logical structure of the database elements, determined by the database management system used by the software

GIS Data Models Vector data model Raster data model courtesy: Mary Ruvane, http://ils.unc.edu/

Several types of data organization Database management systems (DBMSs) can be divided into various types, including: Relational Object Object-relational Relational (RDBMS) RDMS is the most popular type of DBMS Over 95% of data in DBMS is in RDBMS DB2; SQL Server, Access; Oracle; Informix

Example: the Relational Database Model The relational database model is used to store, retrieve and manipulate tables of data that refer to the geographic features in the coordinate database. It is based on the entity-relationship model In a geographic context, an entity can be administrative or census units, or any other spatial feature for which characteristics will be compiled.

Example of an urban EA map Main components are: Street network, Buildings EA boundaries layer Annotation, Symbols, Labels Building numbers Neatlines Legend 27 43 28 42 31 42 37 358 38 43 51 45 44 Street Robinson Street Tissot Street Mercator Avenue 40 41 1 2 3 45 61 57 22 35 31 65 62 19 42 41 350 32 44 43 349 63 20 21 34 33 58 64 60 59 6 61 55 60 1 6 4 5 7 62 11 5 2 59 56 66 10 11 9 8 10 65 64 63 3 58 4 57 Snyder Street Bonne Street Tobler Street Cassini Drive Imhof Drive Goode Street Krassowskij Street Bessel Street Eckert Drive Lambert Avenue Ptolemy Street Gall Street 12 13 19 18 73 72 14 20 23 71 69 17 15 22 21 16 70 Miller Cassini Drive Grinten Street Ortelius Street 74 75 67 68 Drive Clarke Street 12 13 362 18 19 20 21 81 82 76 80 29 19 35 21 24 36 26 25 20 79 78 77 28 22 27 361 23 37 Mollweide Street 30 28 27 43 29 29 38 88 28 31 83 87 1 2 32 42 33 31 32 39 14 33 85 84 41 30 41 86 40 13 7 34 12 374 24 43 42 15 50 51 52 23 11 10 48 49 9 8 44 54 53 58 59 50 47 46 45 51 22 16 25378 26 27 1 2 3 4 32 52 377 54 34 33 9 10 21 Enumeration Area Map Province: District: Locality: EA-Code: Cartania Chartes Maptown N Approximate scale 0 50 100 14 032 0221 00361 200m 17 District Locality EA Building number Census 2000 Symbols 358 EA-Code Hospital Church School National Statistical Office - July 1998

Definition of EA database content A spatial data model: description of geographic features, such as administrative units, streets, houses, buildings, hydrological features/rivers, etc., and the relationships between those entities. A data model is independent from any specific technology/software package

Entity-Relationship Example: EA entity can be linked to the entity Crew Leader Area. The table for this entity could have attributes such as the name of the crew leader, the regional office responsible, contact information, and the crew leader code (CL code) as primary code, which is also present in the EA entity. EA R Crew Leader Area EA-code Area 1-1 1-N CL-code Name Pop. RO responsible

Implementation of an EA database Example of an entity table enumeration area Entity: Enumeration areas Type (attributes) EA-Code Area Pop CL-Code Instances 723101 723102 723103 723201 723202 723203 723204 32.1 28.4 19.1 34.6 25.7 28.3 12.4 763 593 838 832 632 839 388 88 88 88 88 89 89 89 Primary key

An example of land parcels

The E/R diagram for land parcels STREET -name A SEGMENT B -number 2-N 0-1 1-2 3-N PARCEL -number A: Streets have edges (segments) B: parcels have boundaries (segments) C: line have two endpoints D: parcels have owners, and people own land. 2-2 C 2-N POINT -number -x,y 1-N D 1-N LANDOWNER -name -date-of-birth

Data Tables

Object-Oriented and Object-Relational GIS DBMS Object-oriented (OODBMS) Based on OO concept to store state and behavior of GIS objects in databases Provide OO query tools Commercially not successful Object-Relational (ORDBMS) Extend RDMS to handle GIS objects Current Geographic Databases are ORDBMS

Data Dictionary Definition: A data catalog that describes the contents of a database. Information is listed about each field in the attribute table and about the format, definitions and structures of the attribute tables. A data dictionary is an essential component of metadata information.

Spatial Analysis: Query Select features by their attributes: find all districts with literacy rates < 60% Select features by geographic relationships find all family planning clinics within this district Combined attributes/geographic queries find all villages within 10km of a health facility that have high child mortality Query operations are based on the SQL (Structured Query Language) concept

Spatial Analysis (cont.) Buffer: find all settlements that are more than 10km from a health clinic Point-in-polygon operations: identify for all villages into which vegetation zone they fall Polygon overlay: combine administrative records with health district data Network operations: find the shortest route from village to hospital Advanced spatial analysis Multi-criteria Analysis

Summary Data conversion Digitizing/Scanning Editing Building Topology Data integration Geo-referencing; Projection change Coding Integration of attribute data Building EA Database Data Modelling: A spatial data model Database Implementation

Thank You!

Illustration Sources of geographic information Identify existing data sources Additional geographic data collection Paper maps, existing printed air photos and satellite images Field mapping products such as sketch maps Digital air photos and satellite images GPS coordinate collection Existing digital maps Data conversion Digitizing Scanning Generate lines and polygons Raster to vector conversion (automated or semi-automated) Editing geographic features Construct topology for geographic features Digital map data integration Georeferencing (coordinate transformation and projection change) Coding (labelling) of digital geographic features Combine and integrate digital map sheets Additional delineation of EA boundaries Parallel activity Develop geographic attributes database