Semantic Method of Conflation
International Semantic Web Conference, Terra Cognita Workshop, Oct 26, 2009
Jim Ressler, Veleria Boaten, Eric Freese
Northrop Grumman Information Systems
Intelligence Systems Division
St. Louis, Missouri
Geospatial Features
A geospatial feature is a representation of a real-world geographic or man-made entity.
Features have:
- Location
- Shape (point, line, polygon)
- Type of feature (a classification)
- Attribution to describe the feature (name, address, type-specific attributes)
- Metadata about the feature's origin, lifecycle, classification, etc.
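The feature properties listed on this slide can be sketched as a small data structure. This is only an illustration: the class name `GeoFeature`, its field names, the validity check, and the sample berth values are assumptions made for the example, not part of the NAS or of the authors' implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class Shape(Enum):
    """The three geometry kinds named on the slide."""
    POINT = "point"
    LINE = "line"
    POLYGON = "polygon"


@dataclass
class GeoFeature:
    """Hypothetical container for one geospatial feature."""
    feature_type: str                        # classification, e.g. "Berth"
    shape: Shape                             # point, line, or polygon
    coordinates: List[Tuple[float, float]]   # location as (lon, lat) pairs
    attributes: Dict[str, str] = field(default_factory=dict)  # name, address, ...
    metadata: Dict[str, str] = field(default_factory=dict)    # origin, lifecycle, ...

    def is_valid(self) -> bool:
        """Check that the coordinate count is plausible for the shape."""
        n = len(self.coordinates)
        if self.shape is Shape.POINT:
            return n == 1
        if self.shape is Shape.LINE:
            return n >= 2
        return n >= 3  # a polygon needs at least three vertices


# Illustrative values only; not taken from any real chart.
berth = GeoFeature(
    feature_type="Berth",
    shape=Shape.POINT,
    coordinates=[(-71.05, 42.36)],
    attributes={"name": "Example Berth"},
    metadata={"source": "hypothetical dataset"},
)
```

Keeping attribution and metadata as open dictionaries reflects the slide's point that attribution is type-specific; a stricter model would define typed attributes per feature type.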
Geospatial Feature Conflation Problem
- Multiple representations of the same geographic feature need to be combined into one accurate, up-to-date record (conflation).
- A precursor to automated feature conflation is aligning the input feature databases so that like feature types and attributes correspond.
- Research by Volz (2005) and Fonseca (2002) aligns geospatial data with semantics but falls short of conflation; it emphasizes methods or means of cross-referencing data.
- Our aim is to use the inference power of semantic technology to conflate a data-independent representation of features.
Basic Process of Conflation
[Diagram: Semantic Feature Match/Conflation Process. Labeled elements: Vocabularies (Ontology, Schema); Semantic Alignment; Conflation Engine; Feature Geometry representations; Conflated Features.]
Defining Semantic Geospatial Feature Content
A taxonomy (i.e., an ontology) of geospatial features supports applications that integrate information about features at geographic locations, such as feature conflation.
- The NSG Entity Catalog (NEC) and NSG Feature Data Dictionary (NFDD): Taken together, these determine semantic content by specifying a domain data model and its supporting data element dictionary. They do so by drawing upon recognized content standards, specifications, and profiles from the military (e.g., DGIWG, NATO/MGID, MIDB, JMCDM) and civilian sectors (e.g., IHO, ICAO/Eurocontrol, WMO). The NEC and NFDD taken together answer the question "What do we mean by <name of feature>?"
- NSG Application Schema (NAS): This specifies the Platform Independent Model that determines the syntactic structure used to represent the semantics specified by the NEC.
NSG Semantics and Semantic Models
NSG Application Schema (NAS) model:
- NAS-defined features, attributes, and attribute values in UML
- Concepts harmonized
- Detailed identity
Semantic Web technology (RDF/OWL):
- Models data for distributed data
- Models data for integration
- Allows simple model integration
- Allows inference of new information
[Diagram labels: UML NAS Model (Rose UML 1.8); NAS RDF/OWL Model]
Feature Type Vocabulary
Constructed a feature type vocabulary in OWL from NAS, WordNet, and FACC.
- WordNet: created a SKOS hierarchy from the W3C WordNet files (168 words related to NAS)
  - Broader/narrower relations based on hyponyms
  - Labels based on sense names
- FACC: created a SKOS concept for each FACC code entry from a spreadsheet (106 connections with NAS)
- NAS concepts are skos:related to WordNet and FACC concepts as appropriate (70 of 550 feature types complete)
[Figure labels: NAS concept narrower than Pier; most general NAS concept]
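The way skos:related and skos:broader links tie the three vocabularies together can be sketched with plain Python tuples standing in for RDF triples. Everything concrete here is an assumption for illustration: the concept identifiers (`nas:Berth`, `wn:berth`, `wn:mooring`, `wn:place`, `facc:berth-entry`), the specific links between them, and the helper function names. The real vocabulary was built in OWL/SKOS, not with this code.

```python
BROADER = "skos:broader"
RELATED = "skos:related"

# Hypothetical triples (subject, predicate, object); not the actual vocabulary.
triples = {
    ("nas:Berth", RELATED, "wn:berth"),
    ("nas:Berth", RELATED, "facc:berth-entry"),
    ("wn:berth", BROADER, "wn:mooring"),
    ("wn:mooring", BROADER, "wn:place"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def related(concept):
    """skos:related is symmetric, so look in both directions."""
    forward = objects(concept, RELATED)
    backward = {s for s, p, o in triples if p == RELATED and o == concept}
    return forward | backward


def broader_transitive(concept):
    """Follow skos:broader links upward until no new concepts appear."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in objects(stack.pop(), BROADER):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(related("nas:Berth")))
print(sorted(broader_transitive("wn:berth")))
```

In an RDF toolkit the same questions would normally be answered by a reasoner or a SPARQL property path; the explicit traversal here only makes the logic visible.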
Multiple Ontologies Used to Create Feature Vocabulary
[Figure: berth-related concepts in NAS]
Multiple Ontologies Used to Create Feature Vocabulary
[Figure: berth with WordNet and FACC related concepts]
Feature Type Matching
[Diagram: the three principal steps in feature type matching]
The output is a set of business rules that align the input data sources for conflation.
Semantic Concept Mapping
- Ingests features and determines candidate types
- The user can review the candidates and override which ones to consider
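A minimal sketch of the candidate step, assuming a simple keyword-to-concept lookup: each ingested feature label is tokenized, every token is looked up in a dictionary, and the user may then reject or add candidates. The dictionary contents, the function names `candidate_concepts` and `review`, and the sample label are all hypothetical; the actual service interface is not described on the slide.

```python
import re

# Hypothetical keyword dictionary: keyword -> candidate concept names.
KEYWORD_TO_CONCEPTS = {
    "berth": {"Berth"},
    "pier": {"Pier"},
    "wharf": {"Pier", "Wharf"},
    "light": {"Light", "Lighthouse"},
}


def candidate_concepts(label):
    """Return every concept whose keyword appears in the feature label."""
    tokens = re.findall(r"[a-z]+", label.lower())
    found = set()
    for token in tokens:
        found |= KEYWORD_TO_CONCEPTS.get(token, set())
    return found


def review(candidates, rejected=frozenset(), added=frozenset()):
    """Apply the user's overrides to the automatically derived candidates."""
    return (set(candidates) - set(rejected)) | set(added)


candidates = candidate_concepts("Long Wharf Berth 3")
final = review(candidates, rejected={"Pier"})
print(sorted(candidates))
print(sorted(final))
```

Returning a set of candidates rather than a single best guess keeps the human review step meaningful, since the reviewer sees every concept the dictionary suggested.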
Semantic Concept Mapping
- Concepts are inferred from dictionary values
- Grouped concepts are inferred by a SPARQL query on a common derived concept
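The grouping idea can be illustrated without a triple store: feature types from different datasets that resolve to the same derived concept form one similarity group. The sketch below stands in for the SPARQL query and is not that query. The dataset labels, the source type names (`enc_berth`, `dnc_berth`, and so on), and the function `similarity_groups` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical inference results: (dataset, source feature type, derived concept).
inferred = [
    ("ENC", "enc_berth", "Berth"),
    ("DNC", "dnc_berth", "Berth"),
    ("ENC", "enc_pier", "Pier"),
    ("DNC", "dnc_pier", "Pier"),
    ("ENC", "enc_crane", "Crane"),  # no counterpart in the other dataset
]


def similarity_groups(rows):
    """Group source feature types by derived concept.

    Only concepts seen in at least two different datasets are kept,
    since a group with a single source offers nothing to conflate.
    """
    by_concept = defaultdict(list)
    for dataset, feature_type, concept in rows:
        by_concept[concept].append((dataset, feature_type))
    return {
        concept: sorted(members)
        for concept, members in by_concept.items()
        if len({dataset for dataset, _ in members}) >= 2
    }


groups = similarity_groups(inferred)
for concept, members in sorted(groups.items()):
    print(concept, members)
```

Each resulting group is the kind of input from which an alignment rule between the sources could be written.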
Semantic Concept Mapping: Creating Feature Similarity Groups
[Diagram: workflow from Feature Datasets 1..N to conflation rules. Labeled elements: Flex File Browser; queue of files (File 1..N); source info (XML); Concept-Keyword Dictionary Service; triples of feature concepts (N3); triples of concept candidates; Flex form; SPARQL Motion extraction (assert concepts, link to SKOS, infer types, query for feature similarity); selected feature-concept groups; XQuery to FeatSim schema (XML); post to WPS Conflation Rule Service; transform output rules into ACS.]
Feature Similarity Rules Schema Instance (featgrp)
Semantic Feature Matching Demo
- Two datasets with harbor information used for chart update:
  - Electronic Nautical Chart (ENC), inner Boston harbor
  - Digital Nautical Chart (DNC), Boston harbor
- Perform conflation on the two sources and publish the merged data
- Maritime Ports and Harbors ontology used to align the two sources for business rules
- Automated Conflation Service uses the business rules for conflation processing
[Diagram label: Conflation Engine]
Standards-based Conflation Flow
[Diagram: labeled elements: Geospatial Client; Rules Intelligence; Conflation Service; Conflation Engine; Rule / Catalog Service; Catalog Service; feature databases (WFS); conflated feature database (WFS). Labeled flows: discovery; conflate request; feature responses; rules; publish results; FTP; WMS / WFS-T.]
Summary
- Large parts of geospatial conflation can be automated.
- Human input is currently required to set up conflation rules.
- Geometric conflation is a black-box process once the rules are established.
- Demonstrated the ability to use semantics to recognize which feature types are in common.
- Further research: broaden the feature types, accept more sources, and define a hierarchy of conflation rules.