Geospatial Data and the Semantic Web. The GeoKnow Project. Sebastian Hellmann AKSW/KILT research group, Leipzig University & DBpedia Association

Similar documents
Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud: Active Hiring a Use Case Study

GetLOD - Linked Open Data and Spatial Data Infrastructures

IAAA Grupo de Sistemas de Información Avanzados

City Data Pipeline. A System for Making Open Data Useful for Cities. stefan.bischof@tuwien.ac.at

Integrating Open Sources and Relational Data with SPARQL

Publishing Linked Data Requires More than Just Using a Tool

Mining the Web of Linked Data with RapidMiner

Geospatial Platforms For Enabling Workflows

LinkZoo: A linked data platform for collaborative management of heterogeneous resources

LDIF - Linked Data Integration Framework

BIG DATA AGGREGATOR STASINOS KONSTANTOPOULOS NCSR DEMOKRITOS, GREECE. Big Data Europe

Exposing Open Street Map in the Linked Data cloud

Smart Cities require Geospatial Data Providing services to citizens, enterprises, visitors...

Geospatial Platforms For Enabling Workflows

Open Data Integration Using SPARQL and SPIN

DBpedia German: Extensions and Applications

Cloud-based Linked Data Geoprocessing: Implementing Kriging as WPS on the Cloud

Visual Analysis of Statistical Data on Maps using Linked Open Data

Integrating Linked Open Data into Open Source Web Mapping

Semantic Interoperability

THE EUROPEAN DATA PORTAL

Enabling embedded maps

A Web services solution for Work Management Operations. Venu Kanaparthy Dr. Charles O Hara, Ph. D. Abstract

Databases for 3D Data Management: From Point Cloud to City Model

Standards for Big Data in the Cloud

Linked Open Data Infrastructure for Public Sector Information: Example from Serbia

How To Make Sense Of Data With Altilia


Geospatial Technology Innovations and Convergence

ON DEMAND ACCESS TO BIG DATA. Peter Haase fluid Operations AG

GeoKettle: A powerful open source spatial ETL tool

Where is... How do I get to...

Introduction to Ontologies

D5.3.2b Automatic Rigorous Testing Components

Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies

Comparison of Triple Stores

Scalable End-User Access to Big Data HELLENIC REPUBLIC National and Kapodistrian University of Athens

SAP HANA Core Data Services (CDS) Reference

We have big data, but we need big knowledge

Portal Version 1 - User Manual

Web Mapping in Archaeology

Short Paper: Enabling Lightweight Semantic Sensor Networks on Android Devices

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer

CRITEO INTERNSHIP PROGRAM 2015/2016

Investigating Hadoop for Large Spatiotemporal Processing Tasks

DATA MANAGEMENT PLAN DELIVERABLE NUMBER RESPONSIBLE AUTHOR. Co- funded by the Horizon 2020 Framework Programme of the European Union

Industry 4.0 and Big Data

GIS Databases With focused on ArcSDE

How To Use Hadoop For Gis

Serendipity a platform to discover and visualize Open OER Data from OpenCourseWare repositories Abstract Keywords Introduction

Graph Database Performance: An Oracle Perspective

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Deliverable D6.2 Transformation System Configuration Techniques

María Elena Alvarado gnoss.com* Susana López-Sola gnoss.com*

Mining Big Data with RDF Graph Technology:

Creating Knowledge From Interlinked Data From Big Data to Smart Data

TOWARDS TREATING GIS AS VIRTUAL RDF GRAPHS

DataOps: Seamless End-to-end Anything-to-RDF Data Integration

Databricks. A Primer

Semantic Web Tool Landscape

Publishing Relational Databases as Linked Data

Querying DBpedia Using HIVE-QL

Deploy. Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

The MELODIES project: Exploiting open data using cloud computing and Linked Data

RDF Dataset Management Framework for Data.go.th

White Paper November Technical Comparison of Perspectium Replicator vs Traditional Enterprise Service Buses

ArcGIS. Server. A Complete and Integrated Server GIS

There are various ways to find data using the Hennepin County GIS Open Data site:

CityGML goes to Broadway

An HTML tool for exploiting geospatial web services

Databricks. A Primer

Emerging Geospatial Trends The Convergence of Technologies. Jim Steiner Vice President, Product Management

E6895 Advanced Big Data Analytics Lecture 4:! Data Store

Delivering secure, real-time business insights for the Industrial world

Rotorcraft Health Management System (RHMS)

Middleware support for the Internet of Things

Transcription:

spatial Data and the Semantic Web The Geo Project Sebastian Hellmann AKSW/KILT research group, Leipzig University & DBpedia Association # About the project Collaborative Project 2012-2015 Information and Communication Technologies Project No. 318159 Start Date 01/12/2012 Geo Sebastian Hellmann The Geo Project 2016-02-10 1 / 24

Geo The Spatial Data Web Status Semantic technologies for exposing, sharing, and connecting pieces of data on web using RDF Large Spatial Databases (OpenStreetMaps, Google Maps etc.) and GIS systems required for many applications Goal Enhance the Web as a global, distributed platform for data, information and knowledge integration by combining GIS and semantic technologies Sebastian Hellmann The Geo Project 2016-02-10 2 / 24

Geo The Spatial Data Web The story so far 1 Extension of the Web with a data commons (> 50 Billion facts) including 20% geospatial data sets 2 Emerging spatial data web research community 3 Standardisation activities (e.g. GeoSPARQL, W3C & OGC Standardisation Group started Jan 2015) 4 More Spatial Triple Stores (e.g. Strabon) 5 Linked Data allows combining geo services and formats (e.g. INSPIRE, OGC) with other structured information Sebastian Hellmann The Geo Project 2016-02-10 3 / 24

Geo The Spatial Data Web Challenges 1 2 3 4 5 Big Data: Large volumes of frequently updated data Data Integration: Relatively few, expensively maintained links Quality: partly low-quality data and varying coverage Data Consumption: large-scale processing, schema mapping and data fusion Standard Support/Compliance: standards not fully implemented yet Image Source: intergraph.com Sebastian Hellmann The Geo Project 2016-02-10 4 / 24

Consortium of the Geo Project Sebastian Hellmann The Geo Project 2016-02-10 5 / 24

Project Overview WP7: Dissemination & Exploitation WP8: Management Sebastian Hellmann The Geo Project 2016-02-10 6 / 24

Achievements - Linked Data Stack Originated in LOD2 Project Linked Data Stack (http://stack.linkeddata.org) provides a consolidated repository of such tools Lightweight integration between tools via common vocabularies and standards, e.g. (Geo)SPARQL Each tool is a Debian package 38 Geo packages 87 packages in Linked Data Stack (including other projects) RPM packages Nightly builds Planned stack use in Big Data Europe, HOBBIT, GEISER etc. research projects Sebastian Hellmann The Geo Project 2016-02-10 7 / 24

Achievements - Geo Generator UI Common interface and integration of geospatial Linked Data Stack tools Guided demo tour on the website Co-evolution support Supply chain data extraction Commercial adoption at State Secretariat for Economic Affairs (SECO) Switzerland within LINDAS project Sebastian Hellmann The Geo Project 2016-02-10 8 / 24

WP4: Spatial-semantic Browsing, Visualization, Auth Public-private spatial data co-evolution Goal: Identify types of enterprise RDF data synchronization workflows, define and implement tools that support them. This will include ETL processes, query federation, transformation/patching of data and change set propagation. Achieved by: Named Graph Sets with changeset tracking, versioning and conflict resolution Sebastian Hellmann The Geo Project 2016-02-10 9 / 24

Geo Achievements - Benchmarking Intensive work on benchmarking geospatial systems Benchmark project started: https://github.com/geo/geobenchlab Validation of component scalability across different lifecycle phases (and detection of potential regression) Current benchmarks: Slippy map benchmark Datacube query benchmark Interlinking benchmark Fusion benchmark Enrichment benchmark Facet Browsing Library benchmark RDB2RDF benchmark Sebastian Hellmann The Geo Project 2016-02-10 10 / 24

Virtuoso by OpenLink Software Triplestore development in Geo Initial analysis for Triple Stores performed (Virtuoso, useekm, Parliament, AllegroGraph, OWLIM-SE, Strabon + Oracle Spatial 11g, PostGIS as reference) using fragments of OSM and Ordnance Survey data Improved GeoSPARQL compliance (after tests with various stores: useekm, Parliament, AllegroGraph, OWLIM-SE, Strabon) Continuous benchmarking of Virtuoso throughout the project (detailed in D1.3 and WP2) Characteristic set support in Virtuoso - relational materialisation of related structures to improve query performance (started in Year 2, finished in Year 3) Graph analytics inside Virtuoso e.g. for route planning Sebastian Hellmann The Geo Project 2016-02-10 11 / 24

Achievements - Sparqlify http://aksw.org/projects/sparqlify Rewrites SPARQL SELECT queries to a single SQL query allowing the underlying database to perform optimizations Support for using geospatial RDBMS (PostGIS) functions Web interface with syntax highlighting and live data generation for easy mapping creation Docker image now available Sebastian Hellmann The Geo Project 2016-02-10 12 / 24

Achievements - LinkedGeoData LinkedGeoData conversion simplified to a set of SQL files and Sparqlify Mapping Definitions New full release (January 2016) with more than 1.2 billion triples based on the OpenStreetMap planet file from 2015-11-02 is now online! Updated/new interlinks with Wikidata, DBpedia, Wikimapia, Geonames, Natural Earth in progress Sebastian Hellmann The Geo Project 2016-02-10 13 / 24

Achievements - TripleGeo Converts shapefiles, spatial DBMS output to RDF Input: ESRI shapefiles, Oracle Spatial, PostGIS, MySQL, IBM DB2, INSPIRE-aligned GML (Geography Markup Language) data for seven Data Themes (in Annex I) Output: GeoSPARQL WKT, WGS84 + several RDF serialisations supported Support for EU INSPIRE directive added, in particular INSPIRE-aligned GML (Geography Markup Language) data for seven Data Themes (Annex I): Addresses, Administrative Units, Cadastral Parcels, GeographicalNames, Hydrography, Protected Sites, and Transport Networks (Roads) Sebastian Hellmann The Geo Project 2016-02-10 14 / 24

Achievements - LIMES Orthodromic distance only supported by LIMES and SILK Linking algorithm ORCHID for geospatial data in LIMES Reduces number of comparisons of resources in datasets which should be integrated Linking papers at ESWC 13 (linking in the cloud, best paper award) and ISWC 13 (ORCHID algorithm) and ISWC 14 ORCHID Between 1 and 2 orders of magnitude scalability improvement on geospatial data Large-scale evaluation of ORCHID including industrial settings Improved caching and load balancing cosine(title), 0.55 jaccard(director), 0.9 identity, 0.54 Sebastian Hellmann The Geo Project 2016-02-10 15 / 24

Achievements - FAGI Geospatial transformations and fusion Merges geospatial data using several configurable fusing actions Map interactive geospatial transformations Higher efficiency 2 orders of magnitude performance improvements Extended set of fusion actions Sebastian Hellmann The Geo Project 2016-02-10 16 / 24

Achievements - DEER Goal: enrich RDF datasets with (spatial) information Modular pipeline approach for dereferencing, linking and NLP extraction Progress: Machine learning techniques for learning DEER specifications using desired input-output pairs Learning of more complex ML specs work in progress d1 Linking d2 d3 Split d4 NLP Filter d5 Merge d6 d7 Sebastian Hellmann The Geo Project 2016-02-10 17 / 24

Achievements - Facete 1 Generic faceted browser for HTTP SPARQL endpoints Vocabulary abstraction layer for support of arbitrary geo vocabularies (e.g. GeoSPARQL, WGS) Automatically finds map data for entities Nested facets and client side pagination without pre-processing Client side querying with support for 100k spatial objects Goal is to work with DBpedia/LGD Sebastian Hellmann The Geo Project 2016-02-10 18 / 24

Achievements - Facete 2 RDF authoring support Implicit support for tracking changes Support for explicitly registering Changesets using the Co-Evolution API Sebastian Hellmann The Geo Project 2016-02-10 19 / 24

Achievements - GEM Mobile spatial exploration - extended with better routing, authoring, recommendations Sebastian Hellmann The Geo Project 2016-02-10 20 / 24

Supply Chain Management Use Case Mobile dashboard developed: Integration of prototype with Xybermotive commercial Web-EDI system On-the-fly calculation of metrics, for example weather influences and news aggregation (see demo) Distributed metrics calculation via Apache Spark Sebastian Hellmann The Geo Project 2016-02-10 21 / 24

E-Commerce Use Case Motive based search prototype Internal evaluation and crowdsourcing approach on BlueKiwi web application Sebastian Hellmann The Geo Project 2016-02-10 22 / 24

Achievements - Development Activity main repository: https://github.com/geo Software components: DataCubeValidation - quality assessment for statistical data DEER annotates existing RDF datasets with geospatial information ESTA-LD - spatiotemporal analysis tool FAGI a tool for fusion and aggregation of geospatial information GEM - mobile geospatial exploration and filtering Jassa a JavaScript library for SPARQL-based applications LDViewer - Linked Data presentation and browsing Lodtenant - RDF/SPARQL workflow toolkit Mappify allows to create simple geospatial web applications TripleGeo converts shapefiles and other geospatial structures to triples Sparqlify SPARQL-2-SQL rewriter LinkedGeoData RDF version of OpenStreetMap Facete (2) a generic, facet-based RDF browser LIMES framework for RDF data integration Virtuoso triple store and middleware Sebastian Hellmann The Geo Project 2016-02-10 23 / 24

The End Scientific Project Leader Jens Lehmann http://jens-lehmann.org lehmann@informatik.uni-leipzig.de Institute for Applied Informatics (InfAI) Project Manager Nadine Jänicke http://aksw.org/nadinejaenicke.html jaenicke@uni-leipzig.de Institute for Applied Informatics (InfAI) Geo http://geoknow.eu Thanks for your attention! Sebastian Hellmann The Geo Project 2016-02-10 24 / 24