Algorithmic Geometry in Practice




IBM Software Group, DB2 Information Management Software
Algorithmic Geometry in Practice: Integrating Spatial Data and Operations into Relational Database Systems
Knut Stolze <stolze@de.ibm.com>

Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

Agenda
- Introduction
- SQL/MM Spatial standard
- DB2 Spatial Extender & DB2 z/OS Spatial Support Feature
  - History
  - Functionality
  - Implementation details
  - Spatial indexing
- Current & future developments
  - 3D support
  - Graph, Network & Topology
  - Integration Aspects
- Summary

Introduction

What is Spatial Data
- Information about anything that can be located on Earth's surface
  - Natural: rivers, lakes, mountains, ...
  - Man-made: buildings, utility facilities, ...
  - Cadastral: property boundaries, voting districts, ...
  - Artificial: CAD/CAM
- Represented by a geometry: point, linestring, polygon
- Location and geometry defined by
  - Coordinates: latitude/longitude; x/y
  - Addresses: Piazza Leonardo da Vinci 32, 20133 Milano, Italy (geocode to get coordinates)
  - Name: White House (use a gazetteer to get coordinates)

What is Spatial Data (cont.)
Spatial data is modeled as raster or vector, and organized as collections of thematic layers

Spatial Applications
- Traditional GIS (G is central)
  - Natural resources, government, remote sensing
  - Sophisticated processing, dynamic geographic features
  - Emphasis on geographic data
- Business, commercial GIS (IS is central)
  - Enterprise databases
  - Simpler, static geography
  - Emphasis on other attributes
- Virtually every database has locations!
  - But: only sparse integration into classic database applications

Products
- IBM DB2 Spatial and Geodetic Extender & IBM DB2 z/OS Spatial Support Feature
- IDS Spatial DataBlade, SpatialWare & Geodetic DataBlade
- Oracle Spatial & Oracle Locator
- MySQL Spatial
- PostGIS for PostgreSQL

Use of Location Information...
- Demographic data: average income, family size, average education level, population density, gender/race distribution
- Crime data
- Geophysical: flood plains, rivers, fault lines, mineral resources
- Meteorological: weather conditions
- Government: regulatory compliance

... which is Available At Little or No Cost
- FBI: crime data such as
  - Robbery crimes per 100,000 people
  - Property crimes per 100,000 people
  - Car thefts per 100,000 people
- FEMA: disaster data such as
  - 100-year and 500-year flood plain areas
  - Coastal Barrier Resources Act areas
- USGS: geophysical data such as
  - Fault lines
  - Known mineral deposits
  - Active mines
- Census: demographic data such as
  - % of population 65 or older
  - % of population never married
  - Average household size
  - % of population with a college education

It's just SQL...
Business applications often require the combination of different types of data

    CREATE TABLE HospitalRooms (
      Name      VARCHAR(128) NOT NULL,
      Equipment Document     NOT NULL,
      Where     ST_Point     NOT NULL
    );

    -- "Show me all rooms available for the
    -- next five days within 45 miles of the
    -- patient's location equipped with a
    -- defibrillator?"
    SELECT h.name
    FROM hospitalrooms AS h, patients AS p
    WHERE ST_Within(h.where, ST_Buffer(p.location, 45, 'MILES')) = 1
      AND p.name = 'Fred Flintstone'
      AND Contains(h.equipment, 'defibrillator') = 1

Reference: http://www.ibm.com/software/data/solutions/pdf2/sf.pdf

Coordinate Systems
- Geographic (3D)
  - On the surface of the Earth (often latitude/longitude)
  - Spheroid (datum, semi-major and semi-minor axis)
  - Prime meridian (e.g. Greenwich)
  - Unit (e.g. degrees, radians)
- Projected (2D)
  - Geographic plus projection to 2D space
  - Many different projection algorithms
- Abstract
  - Not related to Earth's surface
  - Location in an image, on a screen, or in a retail store

Geographic Coordinates
[Figure: latitude and longitude angles on a sphere of radius R; axis labels +90 (90 N), 0, -90 (90 W)]

Flattening the Earth
- Plane geometry on lat-long
  - Singularities and scale distortion at the poles
  - Wrap-around at 180º longitude
  - Poor location of lines
- Local/regional projections
  - Limited valid range
  - Map edge-matching problems
  - Non-uniform scale
- Indexing: it gets worse!
  - Multiple bounding boxes or complete loss of selectivity
[Figure: lat-long plane with latitude from 90 to -90 and longitude from -180 to 180; neighboring zones UTM 10 and UTM 11]

Planar Coordinates
[Figure: projected grid with axes Northing (5,000,000) and Easting (500,000)]

Spatial Data in DBMS
"Tell me the number and average income of customers who make more than $100K and live within 100 miles of our California stores."

    SELECT sid, COUNT(*) AS count, AVG(income) AS avgincome
    FROM stores AS s, customers AS c
    WHERE ST_Distance(s.loc, c.loc, 'MILE') < 100
      AND s.state = 'CA'
      AND income > 100000
    GROUP BY sid;

[Figure: a DBMS holding text, image, and spatial data with a spatial index; table stores (sid, loc) and view customers (cid, loc, income)]

Spatial/OpenGIS Spatial SQL Functions

ST_Distance(g1, g2)

    SELECT a.cust_number, a.address
    FROM customers a, stores b
    WHERE ST_Distance(a.location, b.location) < 2000
      AND b.last_order_year < 2001

ST_Intersects(g1, g2)

    SELECT a.policy_number, a.address
    FROM customers a, floodzones b
    WHERE ST_Intersects(a.location, b.location) = 1
      AND b.last_flood_year > 1950

SQL/MM Spatial standard

Environment
- SQL99/SQL2003 defines many object-relational extensions
  - structured types, UDFs, procedures, LOBs
- SQL/MM Part 3: Spatial defines SQL extensions for spatial data
  - types, functions/methods, information schema (catalog)
- DB2 Spatial Extender implements most of this standard

Overview of SQL/MM Spatial
- Part 3 of SQL/MM
  - Part 1: Framework
  - Part 2: Full Text
  - Part 3: Spatial
  - Part 5: Still Image
  - Part 6: Data Mining
- Derived from the OGC Simple Feature Specification for SQL
  - OpenGIS Consortium (OGC)
- OGC Simple Feature Specification for SQL
  - not really a standard, just a specification
  - was the baseline for SQL/MM

Content
- Current status
  - 2nd version has been an International Standard since 2003
  - 3rd version is in Working Draft state (slow progress)
- Based on SQL99
  - Structured types, methods, ...
- Standard does not define
  - Indexing mechanisms
  - Implementation issues: number, names, and types of attributes of the types; specific algorithms
  - Follows the SQL99 approach

Spatial Type Hierarchy (1)
  ST_Geometry
    ST_Point
    ST_Curve
      ST_LineString
    ST_Surface
      ST_Polygon
    ST_GeomCollection
      ST_MultiPoint
      ST_MultiCurve
        ST_MultiLineString
      ST_MultiSurface
        ST_MultiPolygon

Spatial Type Hierarchy (2)
Additional optional types: ST_CircularString, ST_CompoundCurve, ST_CurvePolygon

Spatial Type Hierarchy: Discussion
- Attempt to implement the Composite design pattern
  - originated from the OGC class hierarchy
  - the pattern cannot be implemented in SQL
  - ST_Point is unrelated to ST_MultiPoint, ST_LineString is unrelated to ST_MultiLineString, ST_Polygon is unrelated to ST_MultiPolygon
  - further processing of results often leads to much more complex SQL statements
- Optional types are not properly mirrored between the single-part and multi-part subtrees of the hierarchy
- Inconsistent handling of empty geometries

Spatial Type Hierarchy: Possible Solution
[Figure: proposed hierarchy containing the types ST_Geometry, ST_Empty, ST_MultiPoint, ST_MultiCurve, ST_MultiSurface, ST_MultiLineString, ST_MultiCircString, ST_MultiPolygon, ST_Point, ST_LineString, ST_CircularString, ST_Polygon]

Spatial Methods
Types: ST_Geometry, ST_Point, ST_LineString, ST_CircularString, ST_CompoundCurve, ST_CurvePolygon, ST_Polygon, ST_GeomCollection, ST_MultiPoint, ST_MultiLineString, ST_MultiPolygon
Methods:
  ST_AsText ST_AsBinary ST_AsGML ST_Transform
  ST_Dimension ST_CoordDim ST_GeometryType ST_SRID ST_IsEmpty ST_IsSimple ST_IsValid
  ST_Boundary ST_Envelope ST_ConvexHull ST_Buffer ST_Intersection ST_Union ST_Difference ST_SymDifference
  ST_Length ST_StartPoint ST_EndPoint ST_IsClosed ST_IsRing ST_NumPoints ST_PointN ST_NumCurves ST_CurveN
  ST_Area ST_Perimeter ST_Centroid ST_PointOnSurface ST_ExteriorRing ST_InteriorRings ST_NumInteriorRing ST_InteriorRingN
  ST_Distance ST_Equals ST_Relate ST_Disjoint ST_Intersects ST_Touches ST_Crosses ST_Within ST_Contains ST_Overlaps
  ST_X ST_Y

Spatial Functionality (1)
- Constructors
  - well-known text representation (WKT)
  - well-known binary representation (WKB)
  - ESRI shape representation
  - GML representation
  - Coordinates (for ST_Point values)
- Comparison functions (to be used in predicates)
  - ST_Overlaps, ST_Within, ST_Disjoint, ST_EqualSRS, ST_Relate, ...

Spatial Functionality (2)
- Return information about properties
  - ST_Length, ST_X, ST_Is3D, ST_MinY, ST_NumPoints, ...
- Derive/compute new geometries
  - ST_Union, ST_Intersection, ST_MidPoint, ST_Buffer, ST_ConvexHull, ...
- Functions involving 2 geometries are usually performed in 2D space
  - Z and/or M coordinates are ignored

Constructors
- Well-known Text Representation
    point (10 10)
    multipolygon (((1 1, 2 2, 1 2, 1 1)), ((10 10, 10 20, 20 20, 20 10, 10 10)))
- Well-known Binary Representation
    x'010100000000000000000024400000000000002440'
- Geography Markup Language
    <gml:Point><gml:coordinates>10,10</gml:coordinates></gml:Point>
- (Additional formats in products, e.g. Shape format)
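The binary literal on this slide can be unpacked by hand to see that it encodes the same point as the text form `point (10 10)`. The following is a minimal Python sketch, assuming the usual WKB layout for a 2D point (one byte-order byte, a 4-byte geometry type, two 8-byte doubles); the function name `decode_wkb_point` is made up for this illustration and is not part of any product API.

```python
import struct

# WKB literal from the slide (hex digits only, without the x'...' wrapper).
WKB_HEX = "010100000000000000000024400000000000002440"

def decode_wkb_point(hex_text):
    """Decode a 2D WKB point and return its (x, y) coordinates."""
    data = bytes.fromhex(hex_text)
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    order = "<" if data[0] == 1 else ">"
    # Then a 4-byte unsigned geometry type (1 = point) and two 8-byte doubles.
    geom_type, x, y = struct.unpack(order + "Idd", data[1:21])
    if geom_type != 1:
        raise ValueError("not a WKB point")
    return x, y

print(decode_wkb_point(WKB_HEX))
```

The same 21 bytes carry no spatial reference system, which is why the constructors in the products take the SRS identifier as a separate argument.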

Spatial Methods: Discussion
- Duplicated functionality
  - ST_Overlaps, ST_Intersects, ST_Crosses
  - ST_GeometryType & SPECIFIC_TYPE (SQL99)
- Additional functionality already supported by products
  - ST_ShortestPath (in current WD)
  - Z/M coordinate support
  - Other external data representations
  - ST_Generalize to simplify geometries

Spatial Information Schema (Catalog)
[Figure: catalog views ST_GEOMETRY_COLUMNS (a subset of COLUMNS from SQL2003), the associated ST_SPATIAL_REFERENCE_SYSTEMS, ST_UNITS_OF_MEASURE, and ST_SIZINGS]

Spatial Information Schema: Discussion
- Merge the ST_SIZINGS view with SIZINGS (SQL2003)
  - additional facilities in SQL2003 are needed first
- ST_SPATIAL_REFERENCE_SYSTEMS is rather primitive; EPSG uses:
  - Coordinate Axis Name, Coordinate Axis, Coordinate System, Coordinate Reference System, Coordinate Operation, Ellipsoid, Datum, Prime Meridian, Coord_Op Method, Coord_Op Parameter, Coord_Op Parameter Usage, Coord_Op Parameter Value, Coord_Op Path

DB2 LUW Spatial Extender & DB2 z/OS Spatial Support Feature

Introduction
[Figure: DB2 server components (connectivity, backup, restore) with pluggable extenders: Spatial, Geodetic, Grid, "Your idea goes here"]

What is an Extender?
[Figure: a new Extender/DataBlade bundles types, functions, casts, aggregates, indexes, tables, and client code]

DB2 z/OS Spatial Support
- No extender technology used
- Deeply integrated into the DB2 engine
  - Better performance
  - Better customer acceptance
- Functionality very close to DB2 Spatial Extender

History

Spatial Technology: A Joint IBM & ESRI Effort
- IBM teamed up with Environmental Systems Research Institute (ESRI)
  - over 30 years of experience in spatial technology
  - broad portfolio of GIS products and tools
- Collaboration combined the best of both partners
  - ESRI provided the code that does all the spatial calculations
  - IBM contributed database knowledge and extended DB2's OR features to accommodate ESRI
  - index extensions were primarily developed for the spatial grid index

Evolution
[Figure: product timeline with the labels ESRI SDE 3.0.2, DataJoiner with DB2 Spatial Extender, ESRI SDE for DB2 UDB, DB2 UDB 7.1 with DB2 Spatial Extender, ArcInfo V8, ArcSDE 8, ArcSDE 9, DB2 UDB 8.1 with the DB2 Spatial Extender feature; each stage serves ESRI spatial tools and other SQL applications]

Loosely Coupled Architecture (pre-extender)
[Figure: GIS application -> proprietary GIS APIs -> GIS gateway -> SQL (ODBC, JDBC, etc.) -> RDBMS]
- GIS gateway: (1) spatial types, (2) spatial indexes, (3) spatial functions, (4) spatial predicates, (5) spatial query composer, (6) spatial query rewriter
- RDBMS: (1) spatial meta tables, (2) "hidden" tables for spatial data, (3) "hidden" table for spatial index
- Example: table customers; constraint within(loc, :circle); other constraint income > 50000

    select *
    from customers c, index i, feature f
    where i.xmin > xmin(:circle) and i.ymin > ymin(:circle)
      and i.xmax < xmax(:circle) and i.ymax < ymax(:circle)
      and i.oid = f.oid and i.oid = c.oid
      and c.income > 50000

Integrated Architecture
[Figure: GIS application -> proprietary GIS APIs -> GIS gateway -> SQL (ODBC, JDBC, etc.) -> RDBMS]
- GIS gateway: (1) spatial query composer
- RDBMS: (1) spatial meta tables, (2) spatial types, (3) spatial functions, (4) spatial indexes, (5) spatial predicates
- Example: table customers; constraint within(loc, :circle); other constraint income > 50000

    select *
    from customers c
    where within(c.location, :circle)
      and c.income > 50000

Functionality

Architecture
[Figure: applications issue SQL and call client functions through the DB2 client; inside DB2, the spatial extensions consist of stored procedures, structured types, and UDFs/methods; stored data comprises business data, spatial catalog data, data and maps, and geocoder data]

Spatial Extender Features at a Glance (1)
- Integrated into DB2 Control Center
  - Graphical user interface
- Includes spatial management functions
  - Enable/disable spatial DB
  - Create/drop coordinate systems, spatial reference systems, and geocoders
  - Register/unregister spatial columns
  - Manage geocoding instances
  - Export/import

Spatial Extender Features at a Glance (2)
- Extender is based on DB2's OR features
  - User-defined structured types
- Spatial index extensions
  - Grid index for all 2-dimensional data
  - Z-order index for point data in 2D space
- Spatial catalog views
  - Coordinate systems & spatial reference systems
  - Spatial columns
  - Geocoders
- Geocoder framework
  - Support to plug in custom geocoders
  - Just define a UDF that does the geocoding
- Import/export for special spatial file formats
  - ESRI shapefiles
  - SDE coverage files

Spatial Extender Functions and Methods
  ST_Contains ST_Touches ST_Within ST_Overlaps ST_Intersects ST_Crosses ST_Disjoint ST_Relate ST_Equals ST_MBRIntersects
  ST_Area ST_Distance ST_Length ST_Perimeter
  ST_AppendPoint ST_ChangePoint ST_RemovePoint ST_PerpPoints
  ST_Edge_GC_USA ST_EqualCoordsys ST_EqualSRS ST_GetIndexParms ST_SrsName ST_SrsId
  ST_X ST_Y ST_Z ST_M ST_MinX ST_MinY ST_MinZ ST_MinM ST_MaxX ST_MaxY ST_MaxZ ST_MaxM
  ST_Intersection ST_Difference ST_Union ST_SymDifference ST_Buffer ST_PointOnSurface ST_Boundary ST_Envelope ST_Centroid ST_ConvexHull ST_MBR ST_Generalize
  ST_FindMeasure ST_MeasureBetween
  MBR aggregate, Union aggregate
  ST_Geometry ST_Point ST_LineString ST_Polygon ST_GeomCollection ST_MultiPoint ST_MultiLineString ST_MultiPolygon
  ST_AsText ST_AsBinary ST_AsShape ST_AsGML
  ST_ToGeomColl ST_ToPoint ST_ToLineString ST_ToPolygon ST_ToMultiPoint ST_ToMultiLine ST_ToMultiPolygon ST_Transform
  ST_Is3D ST_IsMeasured ST_IsClosed ST_IsEmpty ST_IsRing ST_IsSimple ST_IsValid
  ST_StartPoint ST_MidPoint ST_EndPoint
  ST_CoordDim ST_Dimension ST_GeometryType
  ST_NumPoints ST_PointN ST_ExteriorRing ST_NumInteriorRing ST_InteriorRingN ST_NumGeometries ST_GeometryN ST_NumLineStrings ST_LineStringN ST_NumPolygons ST_PolygonN

Administrative Stored Procedures
  ST_ALTER_COORDSYS
  ST_ALTER_SRS
  ST_CREATE_COORDSYS
  ST_CREATE_INDEX
  ST_CREATE_SRS
  ST_CREATE_SRS_2
  ST_DROP_COORDSYS
  ST_DROP_INDEX
  ST_DROP_SRS
  ST_IMPORT_SHAPE
  ST_REGISTER_SPATIAL_COLUMN

Some Examples (1)

    CREATE TABLE customers (
      name     VARCHAR(100),
      address  VARCHAR(200),
      location ST_Point )

    CREATE TABLE streets (
      name     VARCHAR(100),
      location ST_LineString )

    CREATE TABLE stores (
      name     VARCHAR(100),
      address  VARCHAR(200),
      manager  VARCHAR(100),
      location ST_Point )

Some Examples (2)

    INSERT INTO customers
    VALUES ( 'customer 1', 'address 1', ST_Point(100, 200, 1) )

    INSERT INTO streets
    VALUES ( '1st street',
             ST_LineString('linestring ( 50 50, 100 100, 100 250, 200 250 )', 1) )

    INSERT INTO stores
    VALUES ( 'store 1', 'address 2', 'john',
             ST_Point('point (150 250)', 1) )

Some Examples (3)
Distance between all stores and all customers, measured in meters

    SELECT s.location..ST_Distance(c.location, 'METER')
    FROM stores AS s, customers AS c

Find all street intersections and return the intersecting point as GML (will exploit the spatial index if possible)

    SELECT s1.name, s2.name,
           s1.location..ST_Intersection(s2.location)..ST_AsGML()
    FROM streets AS s1, streets AS s2
    WHERE ST_Intersects(s1.location, s2.location) = 1

Some Examples (4)
Find all customers that live in an area around any store

    SELECT s.name, c.name, c.address
    FROM stores AS s, customers AS c
    WHERE s.location..ST_Buffer(5, 'KILOMETER')..ST_Contains(c.location) = 1

Find the closest store for all customers

    WITH cust_store_dist(cust, store, dist) AS (
      SELECT c.name, s.name,
             ST_Distance(c.location, s.location, 'METER')
      FROM stores AS s, customers AS c )
    SELECT o.cust, o.store
    FROM cust_store_dist AS o
    WHERE o.dist <= ALL ( SELECT i.dist
                          FROM cust_store_dist AS i
                          WHERE o.cust = i.cust )

Geocoding Address Data
Customers

    Cid   Address                City       State  Zip    Location
    1     4335 Queen Anne Drive  San Jose   CA     95129  --
    2     555 Bailey Avenue      San Jose   CA     95120  --
    ...
    3000  1256 Prince Drive      Cupertino  CA     95129  --

The geocoder uses street segment data to populate the spatial column with the location (point) information

The Geocoding Process
[Figure: the address "555 Bailey Avenue, San Jose, CA 95141" is passed to the DB2 geocoder in DB2 Spatial (DB2 UDB), which uses US streets map data to return real Earth x,y coordinates (latitude, longitude)]

Implementation Details

Integer Coordinates (1)
- All coordinates are stored as positive 32-bit integers
  - Integer calculation is faster than floating point
  - Less storage required
  - Compression of coordinates
- Floating-point values are converted to integers on the fly
  - Offset and scale factor for each dimension control the conversion

Integer Coordinates (2)
- Conversion floating point -> integer
  1. Subtract offset
  2. Multiply by scale factor
  3. Add 0.5 for proper rounding
     int = (fp - offset) * scale + 0.5
- Conversion integer -> floating point
  - Reverse steps
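The conversion rule is short enough to try out directly. The following Python sketch assumes truncation to an integer after adding 0.5, and it treats the 0.5 purely as a rounding aid, so the reverse direction only divides by the scale factor and adds the offset back. The offset and scale values are hypothetical (a longitude range starting at -180 with six decimal places) and do not come from the product.

```python
def to_int(fp, offset, scale):
    """Floating point -> integer: subtract offset, scale, add 0.5, truncate."""
    return int((fp - offset) * scale + 0.5)

def to_float(value, offset, scale):
    """Integer -> floating point: divide by the scale factor, add the offset."""
    return value / scale + offset

# Hypothetical parameters for one dimension.
OFFSET = -180.0
SCALE = 1_000_000.0

stored = to_int(9.1895, OFFSET, SCALE)
restored = to_float(stored, OFFSET, SCALE)
print(stored, restored)
```

The round trip loses at most half of `1 / SCALE`, which is the coordinate precision that the scale factor buys.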

Determining Offsets and Scale Factors
- Get the maximum spatial extent of the data
  - Min/max X/Y/Z/M coordinates
  - Enlarge the area by 20% to accommodate growth
- Offset: set to min X/Y/Z/M
- Scale factor: set to MAX_INT / ( max X/Y/Z/M - offset )
  - MAX_INT is 2,147,483,648
- The tradeoff is between coordinate precision (determined by the scale factor) and extent (allowable range of coordinate values)
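A small Python sketch of this procedure for a single dimension follows. Several details are assumptions made for the illustration: the 20% growth is split evenly between both ends of the range, the scale factor divides by the width of the enlarged extent, and `MAX_INT` is taken as the largest positive signed 32-bit value rather than the figure quoted on the slide. The function name is hypothetical.

```python
MAX_INT = 2**31 - 1  # assumption: largest positive signed 32-bit integer

def derive_offset_and_scale(values, growth=0.20):
    """Return (offset, scale) for one dimension of the data."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * growth / 2.0   # assumption: grow equally on both sides
    lo, hi = lo - pad, hi + pad
    if hi == lo:
        raise ValueError("extent has zero width")
    offset = lo                      # offset is the minimum of the extent
    scale = MAX_INT / (hi - lo)      # the whole extent maps onto 0..MAX_INT
    return offset, scale

offset, scale = derive_offset_and_scale([0.0, 25.0, 100.0])
print(offset, scale, 1.0 / scale)    # 1/scale is the coordinate resolution
```

Doubling the extent halves the scale factor and therefore halves the precision, which is the tradeoff named above.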

Coordinate Compression
- Internally, coordinates are stored relative to the previous point in the geometry
  - Small integer values can be compressed better
- Compression rate is data dependent
  - Geometries that are small (relative to the scale factor) compress better
- Example (stored delta in brackets):

    Point    Stored
    (1, 2)   (1, 2)
    (3, 6)   [2, 4]
    (6, 7)   [3, 1]
    (6, 3)   [0, -4]
    (4, 4)   [-2, 1]
    (3, 1)   [-1, -3]
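The relative storage described here is ordinary delta encoding. The Python sketch below reproduces the numbers of the slide's example; it only illustrates the idea and says nothing about the actual on-disk format, and the function names are made up for this illustration.

```python
def delta_encode(points):
    """Keep the first point, then store each point relative to its predecessor."""
    if not points:
        return []
    encoded = [points[0]]
    for prev, cur in zip(points, points[1:]):
        encoded.append((cur[0] - prev[0], cur[1] - prev[1]))
    return encoded

def delta_decode(encoded):
    """Rebuild absolute coordinates by accumulating the deltas."""
    if not encoded:
        return []
    points = [encoded[0]]
    for dx, dy in encoded[1:]:
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points

# Point sequence from the slide's example.
POINTS = [(1, 2), (3, 6), (6, 7), (6, 3), (4, 4), (3, 1)]
print(delta_encode(POINTS))
```

Because neighboring points of a geometry are usually close together, the deltas are small numbers that fit into fewer bytes than the absolute integers.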

Added Method Support
- Version 7
  - Only ST_function(geometry) or ST_function(geometry1, geometry2)
- Version 8
  - Now also geometry..ST_function() or geometry1..ST_function(geometry2)
- Reasons
  - Standard compliance
  - Ease of use
- Problem: migration was necessary

Catalog Changes
- Version 7
  - Catalog tables highly inadequate (but actually standardized that way)
  - Irrelevant information was shown
  - Relevant information was missing
  - 1:1 dependency between base tables and views was mandatory
  - Specific naming conventions and limits (name lengths)
  - Example: geometry_columns(layer_catalog, layer_schema, layer_table, layer_column, geometry_type, srid)
- Version 8
  - Implemented new catalog
  - Changed SQL/MM standard to comply with the product
- Implementation issues
  - Backward compatibility
  - Everything in the code was affected
  - Migration

Import/Export: Rewrite of Queries
- Version 7

    INSERT INTO table VALUES (..., ST_GeomFromShape(blob),...)

  - Initialization of buffers inside DB2
  - Initialization inside the function
- Version 8

    INSERT INTO table
    SELECT ..., ST_GeomFromShape(data),...
    FROM TABLE ( VALUES (..., blob,...), VALUES (..., blob,...),... ) AS (..., data,...)

  - Initialization done only once
  - In 1 simple test: import time of a shapefile dropped from 3.5 min to 30 sec

Exploitation of UDF Features
- Use of SCRATCHPAD
  - Carry internal buffers and already initialized objects from call to call
  - All functions: construct coordinate system objects only once
  - Geocoder: caching of data results in much less file I/O
  - Several thousand memory operations to geocode a single address
- Allow parallelization of function execution
- Performance enhancements
  - Improve parsers for WKT, WKB, GML, ...
  - Complete rewrite of some functions

Error Handling & Tracing
- Version 7
  - At least 5 different ways to handle error conditions
  - NLS issues due to hard-coded error messages
  - No tracing available
  - No first-fault data capture (FFDC)
- Version 8
  - Common architecture throughout the (IBM-controlled part of the) product
  - Maps ESRI errors

Spatial Indexing

The Need for a Spatial Index
- Spatial queries are typically in 2-D space
- Native B-tree is insufficient
  - B-tree does not support structured types
  - 0-D objects (points, e.g. location of buildings) can be indexed in a B-tree based on X and Y coordinates, but not with the best possible performance
  - Neither 1-D nor 2-D objects (lines, e.g. road segments, or polygons, e.g. lakes, counties) can be indexed this way

Functions that can Exploit a Spatial Grid Index
- ST_Equals
- ST_Disjoint
- ST_Intersects, ST_Overlaps, ST_Crosses
- ST_Touches
- ST_Within, ST_Contains
- ST_MBRIntersects
- ST_Distance (only w/o units parameter)

Functions that can Exploit a Spatial Z-Order Index
- ST_ZEquals
- ST_ZIntersects, ST_ZOverlaps, ST_ZCrosses
- ST_ZWithin, ST_ZContains
- ST_ZMBRIntersects
- ST_ZTouches
- ST_ZDistance

The Spatial Index Grids
- A uniformly spaced square indexing grid
  - each feature exists in one or more grid cells
  - multiple geometries can exist in a single grid cell, especially overlapping geometries
  - the MBR of each geometry is used for indexing to speed up searches
- Up to 3 levels of grids for the spatial index
  - the spatial index is like a two-dimensional column index
  - it records which grid cells each feature resides in
- Geometries are indexed by the grid level, the overlapped grid cells, and the MBR of the geometry

How the Spatial Index Works
1. Determine all overlapping grid cells (on all levels) of the search argument
   - Eliminate all geometries that do not overlap any of those grid cells
   - Index scan returns a set of candidates, possibly including false positives
2. Compare the MBR of the search argument with the candidates
   - Eliminate all geometries where the MBR is already disjoint
   - Set of candidates is reduced but might still contain false positives
3. Do an exact comparison
   - Find the correct result in the set of candidates
- Spatial index search avoids access to the data
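The three filtering steps can be imitated with a toy grid index. The Python sketch below makes several simplifying assumptions that are not taken from the product: a single grid level, a hypothetical cell size of 10, and circles `(x, y, radius)` as geometries so that the exact comparison stays a one-liner. Each index entry carries the MBR, so the first two stages never touch the geometry itself.

```python
from collections import defaultdict
from math import floor, hypot

GRID = 10.0  # hypothetical grid cell size

def mbr(circle):
    x, y, r = circle
    return (x - r, y - r, x + r, y + r)

def cells(box, grid=GRID):
    """All grid cells overlapped by a bounding box."""
    xmin, ymin, xmax, ymax = box
    return {(i, j)
            for i in range(floor(xmin / grid), floor(xmax / grid) + 1)
            for j in range(floor(ymin / grid), floor(ymax / grid) + 1)}

def build_index(geoms):
    """Map each grid cell to the (name, MBR) entries that overlap it."""
    index = defaultdict(set)
    for name, circle in geoms.items():
        box = mbr(circle)
        for cell in cells(box):
            index[cell].add((name, box))
    return index

def boxes_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def circles_intersect(a, b):
    return hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]

def search(index, geoms, query):
    qbox = mbr(query)
    # Step 1: index scan over the grid cells of the search argument.
    entries = set()
    for cell in cells(qbox):
        entries |= index.get(cell, set())
    step1 = {name for name, _ in entries}
    # Step 2: MBR filter, still using only data stored in the index.
    step2 = {name for name, box in entries if boxes_overlap(box, qbox)}
    # Step 3: exact comparison on the remaining candidates.
    step3 = {name for name in step2 if circles_intersect(geoms[name], query)}
    return step1, step2, step3

GEOMS = {
    "a": (12.0, 12.0, 1.0),   # really intersects the query
    "b": (19.5, 10.5, 0.8),   # MBRs overlap, shapes do not
    "c": (10.4, 10.4, 0.3),   # same grid cell, MBR disjoint
    "d": (50.0, 50.0, 2.0),   # different grid cells
}
print(search(build_index(GEOMS), GEOMS, (15.0, 15.0, 4.0)))
```

Each stage is cheaper than the next one, so the expensive exact comparison runs only on the few candidates that survive the first two filters.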

Let's Walk Through an Example
[Figure: parcels and a flood zone]
- Find all parcels that intersect the flood zone
- Remember: the goal is to avoid comparing all geometries with the flood zone
- Use a 3-level filtering process

Spatial Index Example: Index Scan
1. Eliminate geometries that do not overlap grid cells of the search argument
   - Eliminate all parcels that do not intersect with the 4 grid cells in which the flood zone lies.

Spatial Index Example: Filter Based on MBR
1. Eliminate geometries that do not overlap grid cells of the search argument
2. Eliminate geometries by filtering with the MBR
   - Eliminate all parcels that do not intersect the MBR of the flood zone.

Spatial Index Example: Coordinate Comparison
1. Eliminate geometries that do not overlap grid cells of the search argument
2. Eliminate geometries by filtering with the MBR
3. Eliminate geometries by exact coordinate comparison
   - Eliminate all parcels that do not actually intersect with the flood zone.

Spatial Index Grid Levels
- Up to 3 grid levels possible in an index
- Goal
  - Reduce the number of index entries
  - Make grid cells as small as possible: higher resolution on index scans gives better filtering in this step
- Use multiple grid levels when geometries have skewed sizes
  - Avoids lots of index entries for large geometries
  - A geometry is promoted to the next grid level if it covers more than 4 grid cells
- Note
  - Multiple grid levels result in multiple index scans
  - Index is logically partitioned by grid level
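The promotion rule can be sketched in a few lines of Python. The sketch counts the grid cells covered by a geometry's MBR at each level and moves on to the next level when more than four cells are covered. What happens to a geometry that is still too large at the coarsest level is not stated on the slide; the sketch simply keeps it there, which is an assumption. The grid sizes and function names are hypothetical.

```python
from math import floor

def cell_count(box, grid):
    """Number of grid cells of size `grid` overlapped by an MBR."""
    xmin, ymin, xmax, ymax = box
    nx = floor(xmax / grid) - floor(xmin / grid) + 1
    ny = floor(ymax / grid) - floor(ymin / grid) + 1
    return nx * ny

def choose_level(box, grid_sizes, max_cells=4):
    """Return the 1-based grid level at which the MBR is indexed."""
    for level, grid in enumerate(grid_sizes, start=1):
        if cell_count(box, grid) <= max_cells:
            return level
    # Assumption: the coarsest level takes whatever did not fit earlier.
    return len(grid_sizes)

GRID_SIZES = (10.0, 100.0, 1000.0)  # hypothetical sizes, finest level first

print(choose_level((1, 1, 5, 5), GRID_SIZES))        # small parcel
print(choose_level((5, 5, 45, 45), GRID_SIZES))      # medium geometry
print(choose_level((50, 50, 950, 950), GRID_SIZES))  # large geometry
```

Running `cell_count` over the MBRs of a whole table for a few candidate grid sizes gives a rough estimate of the number of index entries per level.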

Choosing the Best Grid Size
- Grid size is the size of the grid cells (1 dimension)
- Use the Spatial Index Advisor (gseidx) to get
  - Information on the spatial data (size etc.)
  - Suggestions for grid sizes
  - Index statistics for user-provided grid sizes
- Rules
  - Make the grid size as small as possible, while
  - making sure that a geometry covers as few grid cells as possible

Spatial Index Statistics
- Use the key generator function of the grid index extension
  - Feed geometries into it
  - Returns a table of all data stored in the index
  - Use SQL to evaluate the data in that table
- Simpler version: operate on the MBR of the geometries
  - Use the ST_MinX, ST_MinY, ST_MaxX, ST_MaxY methods

Rules for Index Exploitation
- Spatial predicate must be used in the WHERE clause
- Spatial function must be on the left-hand side of the comparison operator
- Equality comparison must use the integer constant 1
- At least one of the function's parameters must be an indexed spatial column

Queries that follow the rules:

    SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 1
    SELECT * FROM customers c WHERE ST_Distance(c.location, :SanJose) < 10

Queries that break at least one rule:

    SELECT * FROM customers c WHERE ST_Length(c.location) > 10
    SELECT * FROM customers c WHERE 1 = ST_Within(c.location, :BayArea)
    SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) = 2
    SELECT * FROM customers c WHERE ST_Within(c.location, :BayArea) > 0
    SELECT * FROM customers c WHERE ST_Within(:SanJose, :BayArea) = 1

Demo

Problem Statement
A real estate website needs capabilities that satisfy the following objectives:
- Allow users to search property listings and display the search results on a map
- Allow users to obtain detailed information about a particular property
- Allow users to drill down and find the spatial relationship between the selected property and other points of interest
- Simple to use

Implementation Overview
[Figure: web browser (AJAX / JavaScript) -> Internet -> HTTP server with PHP application, using the ESRI ArcWeb map server and the Google map server -> DB2 z/OS with Spatial Support, holding real estate listing data, school data, and crime data]

Search all the detached houses in the city of Santa Clara and show the results on a map

Show the details of each result by clicking on a green pushpin on the map. In the query panel below, query the schools within 2 miles of the selected home with API > 700.

Click on the rightmost yellow pushpin. Notice that the address is in the city of San Jose, not Santa Clara. Using the query panel at the bottom, find the crime data within 2 miles of the selected school.

Crime data are displayed as red pushpins on the map. Click on a pushpin for additional details.

Current Developments: Prototypes

3D

Modelling 3D Objects
- Extend the spatial type hierarchy
  - Solids, Polyhedrons, and Collections
  - Approximate curved surfaces with polygons
- New methods for 3D types
  - ST_Volume, ST_BoundingArea, ST_ExteriorShell, ST_InteriorShellN, and ST_NumFacets
- Adjust existing methods to
  - Operate on 3D geometries
  - Perform calculations in 3D space
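
To make the idea of a volume method on a polygon-bounded solid concrete, here is a minimal sketch. It is not the prototype's ST_Volume implementation; `polyhedron_volume` is a hypothetical helper that assumes a closed polyhedron with planar, convex faces given as closed rings with consistent orientation. It sums the signed volumes of the tetrahedra spanned by the origin and a fan triangulation of each face.

```python
def polyhedron_volume(faces):
    """Volume of a closed polyhedron.

    faces: list of closed rings of (x, y, z) points (first point repeated at
    the end), all oriented consistently. Faces are assumed planar and convex.
    """
    total = 0.0
    for ring in faces:
        pts = ring[:-1]                      # drop the repeated closing point
        a = pts[0]
        for b, c in zip(pts[1:], pts[2:]):   # fan triangulation of the face
            # scalar triple product a . (b x c) = 6 * signed tetrahedron volume
            total += (a[0] * (b[1] * c[2] - b[2] * c[1])
                      - a[1] * (b[0] * c[2] - b[2] * c[0])
                      + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return abs(total) / 6.0

# Tetrahedron with corners (0 0 0), (1 0 0), (0 0 1), (0 1 0)
tetrahedron = [
    [(0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 0, 0)],
    [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 0)],
    [(0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 0, 1)],
    [(0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 0)],
]
volume = polyhedron_volume(tetrahedron)
print(volume)
```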

Extend External Data Formats
- Well-known text (WKT), well-known binary (WKB), and Geography Markup Language (GML)
- Boundary-based representation
  POLYHEDRON Z ((((0 0 0, 1 0 0, 0 0 1, 0 0 0)), ((1 0 0, 0 1 0, 0 0 1, 1 0 0)), ((0 0 1, 0 1 0, 0 0 0, 0 0 1)), ((0 0 0, 0 1 0, 1 0 0, 0 0 0))))
- Indexed face set representation
  POLYHEDRON Z INDEX ((0 0 0, 1 0 0, 0 0 1, 0 1 0), (((1, 2, 3, 1)), ((2, 4, 3, 2)), ((3, 4, 1, 3)), ((1, 4, 2, 1))))
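
The relationship between the two representations can be sketched as a small conversion. This is an illustration only, not the prototype's parser: `to_indexed_face_set` is a hypothetical helper that assigns 1-based indexes to vertices in the order they first appear, which happens to reproduce the vertex list and index rings of the example above.

```python
def to_indexed_face_set(faces):
    """Convert boundary rings of (x, y, z) points into (vertices, indexed rings)."""
    vertices = []    # unique vertices in first-seen order
    index_of = {}    # vertex -> 1-based index into vertices
    indexed_faces = []
    for ring in faces:
        indexed_ring = []
        for point in ring:
            if point not in index_of:
                vertices.append(point)
                index_of[point] = len(vertices)
            indexed_ring.append(index_of[point])
        indexed_faces.append(indexed_ring)
    return vertices, indexed_faces

# The four faces of the boundary-based POLYHEDRON Z example
boundary = [
    [(0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 0, 0)],
    [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 0)],
    [(0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 0, 1)],
    [(0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 0)],
]
vertices, indexed_faces = to_indexed_face_set(boundary)
print(vertices)
print(indexed_faces)
```

Each shared corner is stored once in the indexed form, so the coordinate list shrinks from 16 points to 4 for this tetrahedron.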

CGAL as 3D Computation Engine
- Scientific library implementing spatial algorithms with optimal theoretical runtime
- CGAL Polyhedron for true 3D objects
- More general Nef polyhedrons
  - All kinds of spatial objects can be described in R^3
  - Points, linestrings, polygons, and polyhedrons

3D Performance Tests
- Approximated spheres with growing complexity
- Tested functionality
  - Constructors and conversion to external data formats
  - Storage and construction time
  - Spatial comparison of two geometries
  - Generation of new geometries
  - Intersection computation
- Performance of the traditional Spatial Extender is not impacted, apart from a single additional comparison of the SRS identifier

Some Performance Results
[Charts: Intersection; Construction]

Summary of 3D with CGAL
- Performance is not acceptable despite optimal theoretical runtimes
  - No native binary interface for I/O available
  - Extremely long construction times
- Some simple optimizations in CGAL show potential for significant improvements
- CGAL is not suitable for practical purposes

Graph

Integrating Graphs into DBMS
- All geometries have inherent topological properties
  - Can be mapped to graph structures
  - Allows application of graph operations directly on geometries
- Evaluate graph operations inside the DBMS
  - Applications do not need any additional graph functionality

Deriving Graphs from LineStrings
- Points become vertices
  - An intersection of linestrings without a common point does not result in a graph vertex
- Line segments connecting points become edges
- Reverse mapping requires an identifier (primary key value or RID)
- Retain point coordinates in vertices
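
The mapping rules above can be sketched as follows. This is a minimal illustration, not the prototype's code: `build_graph`, the street identifiers, and the coordinates are hypothetical. Points shared by several linestrings collapse into one vertex, each segment becomes an edge tagged with the identifier of its linestring for the reverse mapping, and crossings without a common point create no vertex.

```python
def build_graph(linestrings):
    """linestrings: dict mapping an identifier to a list of (x, y) points.

    Returns (vertices, edges): vertices maps each coordinate to a vertex
    number, edges is a list of (from_vertex, to_vertex, linestring_id).
    """
    vertices = {}   # coordinate -> vertex number (coordinates are retained as keys)
    edges = []
    for line_id, points in linestrings.items():
        for start, end in zip(points, points[1:]):
            u = vertices.setdefault(start, len(vertices))
            v = vertices.setdefault(end, len(vertices))
            edges.append((u, v, line_id))
    return vertices, edges

streets = {
    "A": [(0, 0), (2, 2), (4, 4)],
    "B": [(2, 2), (2, 5)],          # shares the point (2, 2) with street A
    "C": [(0, 2), (2, 0)],          # crosses A at (1, 1), which is not a stored point
}
vertices, edges = build_graph(streets)
total_points = sum(len(points) for points in streets.values())
print(total_points, len(vertices), len(edges))
```

Here 7 stored points reduce to 6 vertices because (2, 2) is shared, and the crossing of A and C at (1, 1) adds nothing to the graph.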

Test Data for Prototype
                # streets   # points  # vertices  savings
California      1,589,712  4,867,271   1,200,783    75.3%
Connecticut       148,232    340,099     112,520    66.8%
Delaware           39,429     91,673      29,266    68.1%
D.C.               14,762     14,059       9,084    35.4%
Michigan          595,662  1,361,034     428,435    68.5%
New Mexico        451,326  2,248,623     330,786    85.3%
Pennsylvania      842,099  2,469,506     620,257    74.9%
Texas           2,259,314  7,214,599   1,689,279    76.6%
Wyoming           270,334  1,926,298     201,571    89.5%
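
The slide does not define the savings column. It appears to be the share of stored points that no longer need their own graph vertex, that is 1 - vertices / points. The sketch below checks this assumed formula against the California, Texas, and Wyoming rows; `savings_percent` is a hypothetical helper.

```python
def savings_percent(points, vertices):
    """Assumed definition: percentage of points not needing their own vertex."""
    return round((1 - vertices / points) * 100, 1)

rows = {
    "California": (4_867_271, 1_200_783),
    "Texas": (7_214_599, 1_689_279),
    "Wyoming": (1_926_298, 201_571),
}
computed = {name: savings_percent(p, v) for name, (p, v) in rows.items()}
print(computed)
```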

Graphs as Index Structures
- Usage in SQL statements
  - Finds linestrings participating in the shortest path
  - Extracts relevant segments of linestrings
    SELECT sp.line, t.street_name
    FROM t, TABLE ( ST_ShortestPath( ST_Point(3.25, 3.25), ST_Point(5.5, 6.5), 'SAMPLE_GRAPH') ) AS sp(seq, line, id)
    WHERE sp.id = t.id
    ORDER BY sp.seq
- ST_ShortestPath automatically accesses the graph structure
  - Treats it as a smart access path
  - If not available, a temporary graph is constructed on the fly
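
The slides do not say which algorithm ST_ShortestPath uses. As one plausible stand-in, the sketch below runs Dijkstra's algorithm over an edge list and returns (seq, id) pairs comparable to the seq and id columns of the table function. The function name, vertex labels, street names, and edge lengths are hypothetical.

```python
import heapq

def shortest_path(edges, source, target):
    """edges: list of (u, v, length, line_id), treated as undirected.

    Returns [(seq, line_id), ...] along a shortest path from source to
    target, or [] if the target is unreachable.
    """
    adjacency = {}
    for u, v, length, line_id in edges:
        adjacency.setdefault(u, []).append((v, length, line_id))
        adjacency.setdefault(v, []).append((u, length, line_id))
    best = {source: 0.0}     # best known distance per vertex
    previous = {}            # vertex -> (predecessor, line_id used to reach it)
    queue = [(0.0, source)]
    while queue:
        dist, u = heapq.heappop(queue)
        if u == target:
            break
        if dist > best.get(u, float("inf")):
            continue         # stale queue entry
        for v, length, line_id in adjacency.get(u, []):
            candidate = dist + length
            if candidate < best.get(v, float("inf")):
                best[v] = candidate
                previous[v] = (u, line_id)
                heapq.heappush(queue, (candidate, v))
    if target not in best:
        return []
    lines = []
    node = target
    while node != source:    # walk back from target to source
        node, line_id = previous[node]
        lines.append(line_id)
    lines.reverse()
    return list(enumerate(lines, start=1))

edges = [
    ("a", "b", 1.0, "Main St"),
    ("b", "c", 1.0, "Oak Ave"),
    ("a", "c", 5.0, "Highway 1"),
]
path = shortest_path(edges, "a", "c")
print(path)
```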

Integration Aspects

Integration
- Utilities
  - Export and Load/Import do not support spatial types and spatial file formats
- Language Bindings
  - No spatial types available in JDBC
  - Spatial data must be transferred as LOBs and then converted
- Distributed Database Systems
  - Federation and Replication are unaware of Spatial
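
The conversion step on the client side might look like the following sketch, which assumes the geometry arrives as a BLOB in well-known binary (WKB) and handles only 2D points. The helper names are hypothetical. The layout used is the standard WKB point encoding: one byte-order byte, a 4-byte geometry type (1 = Point), and two 8-byte doubles.

```python
import struct

def encode_wkb_point(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # byte order 1 = little-endian, geometry type 1 = Point, then x and y
    return struct.pack("<BIdd", 1, 1, x, y)

def decode_wkb_point(blob):
    """Decode a WKB-encoded 2D point into an (x, y) tuple."""
    byte_order = "<" if blob[0] == 1 else ">"
    geometry_type, x, y = struct.unpack(byte_order + "Idd", blob[1:21])
    if geometry_type != 1:
        raise ValueError("not a WKB Point")
    return x, y

blob = encode_wkb_point(-121.89, 37.33)   # hypothetical coordinates
point = decode_wkb_point(blob)
print(len(blob), point)
```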

Summary

Summary
- (Only) basic spatial functionality available
- Standardized in SQL/MM
- Implementations available from all major database vendors
- More or less adherence to the standard
- Index support is mandatory
- Spatial Support for z/OS is much better integrated with the DB2 engine
- Federated access to spatial data and replication
- Different products = different data types and functions
- Coordinate systems
- Raster support

Questions?