Database Design Concepts and Data Integration URISA February 9 th, 2005 Bob Earle 3/10/2005 1
Agenda A few Design tips and guidelines, some of which I learned the hard way Many GIS staff get database design and maintenance thrust upon them The r evolution of GIS at Sacramento County; how we became a data integrator, but not necessarily by plan
Design of a Spatial system is like any other Enterprise application
Earlier is Better
Translation of User Requirements Business Rules Model the User s View
Entity Relationship Diagram Good design tool, good communication tool
Access ERD (Relationships) You may already have an ERD tool and not know it
UML May look complicated, but at its heart is the ERD
Sacramento County ERD
Normalization Remove Repeating Groups Every non-key column must be dependent on all parts of the primary key No non-key column may be functionally dependent on another non-key column Introduced by E.F. Codd
Every attribute will refer to the Primary key, the whole key and nothing but the key, so help me Codd A table should have a field that uniquely identifies each of its records, and each field in the table should describe the subject that the table represents Database Design for Mere Mortals by Michael Hernandez
Database Data Models
Metadata
Database Design Fallacies Design your system to replicate existing practices You can throw your data model away when you have implemented your database I will just put it in a note Metadata is too much trouble My spreadsheet is good enough GIS shouldn t be putting its nose into everybody s business
More Database Design Fallacies Nobody will ever need my data Just import your shapefiles into SDE and voila you have a Geodatabase Normalized data is fast performing data Spatial data is just data
Database Design Truths Spatial data is just data Design on paper, then a computer Earlier is better Document deviances from the conceptual design
More Database Design Truths Use meaningful attribute and table names Flag for deletion, rather than delete Take the user s key away Use database constraints, rather than coded business rules
What is so Special about Spatial Data? How it is the same than regular data Data relationships Integrity Key Asset How it is different Spatial Key (Location is a type of key) Topology Heavy load Long transaction Advanced data type Database
Early GIS
Community GIS
Evolution of the spatial data model Ian McCarg Design With Nature ArcInfo Coverages ArcStorm SDE Shape File Oracle Spatial GeoDatabase True True Spatially Spatially Enabled Enabled Database Database
GIS Layers Points Lines Polygons Surfaces
3 D
Scanned Maps
Scanned Documents
Attribute Data
Orthophotography
Levels of Data Integration
If it were a perfect (GIS) World Internet Spatial Transactions Universal Regional Database Departmental Documents
I can t say it was easy, but I can say it was worth it!!
Sacramento County Property Systems (Without data sharing) GHB/GAX Applications (Mainframe) SAP/ COMPASS (Oracle) PDB Application (Mainframe) CPS Unsecured Tax Application (Oracle) APS Permits system (Oracle) Utility Billing/ CUBS (SAP?,Oracle?) Planning Public Works Direct Levy Maintenance Application GIS (ESRI)
Sacramento County Property Systems GHB/GAX Applications (Mainframe) SAP/ COMPASS (Oracle) PDB Application (Mainframe) CPS Unsecured Tax Application (Oracle) County-wide Shared Property Data Warehouse APS Permits system (Oracle) Utility Billing/ CUBS (SAP?,Oracle?) Planning Public Works Direct Levy Maintenance Application GIS (ESRI)
Examples of Shared Data Benefits Tax Rate Areas Vineyards Utility Billing Minimize data redundancy and costs in the Regional Cooperative
Data Ownership, Dissemination, and Accessibility Policies Ownership - Departments own their data, GIS is the middleman Dissemination - Open Access, Cooperative Mapping based on Standards Access - Concerns over safety
Database Architecture Transactional Transactional Warehouse Warehouse Shared Shared Firewall
Summary - Benefits of Good Design Data integrity Flexibility in data retrieval and analysis Follows business rules and therefore supports organizational requirements Easier to share data Different users access same data Accommodates different views of the data Minimal Data Redundancy Breaks out of Spreadsheet mentality
Questions, Comments? earleb@saccounty.net exliner@saccounty.net perrym@saccounty.net