Lift your data: hands-on session




Duration: 40 min

Foreword

Publishing data as linked data involves several steps: converting the initial data into RDF, polishing the URIs, possibly finding a commonly used vocabulary that you can reuse in your dataset, and publishing the data, i.e. making sure that each URI used to identify an object in your dataset is dereferenceable (an HTTP URL that leads to an actual description of the object). Beyond that, you will want to provide SPARQL access to the data (the equivalent of a WFS service in the Linked Data world) and to interconnect your instances with other instances published on the Web.

For some of these steps you can find code on the Web; for others you would have to write something yourself. To make your URIs dereferenceable, you can tune the configuration file of an HTTPD Web server. Setting up SPARQL access is a bit more complex. This is why this tutorial uses an existing open source software package, designed and implemented in the context of the Datalift research project (funded by the French national research agency, ANR, http://datalift.org). This software gathers modules that perform these steps. It has been designed by the industrial partner of the consortium, Atos, as an open platform, so that several modules can be available for each step and new modules can be added. Some of them are operational, whereas others are still under development. It will be released as open source at the end of the project (autumn 2013).

We expect two categories of participants in this workshop. If you are quite familiar with implementation work, we suggest you install the platform on your computer (see the dedicated directory in the tutorial material), follow the steps to lift the sample datasets, and then adapt the process to your own data. The technical description of the software is in the suggested readings directory of the tutorial material. If you do not feel so familiar with this, we suggest you use the existing server installed by Atos. Adapting the lifting process to your own data might take a bit longer, but we will offer assistance. Please show us your data so that we can adapt the configuration file used by the conversion software and load it on the server.
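As an illustration of the HTTPD tuning mentioned above, here is a minimal Apache configuration sketch for making URIs dereferenceable through content negotiation. The host name, URI patterns and file locations are assumptions for the example, not part of the Datalift platform.

    # Minimal sketch, assuming resource URIs of the form http://example.org/id/{name}
    # and pre-generated HTML and RDF/XML descriptions on disk (requires mod_rewrite).
    # All names below are illustrative; adapt them to your own URI scheme.
    <VirtualHost *:80>
        ServerName example.org
        RewriteEngine On

        # Clients asking for RDF are redirected (303 See Other) to the RDF document
        RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
        RewriteRule ^/id/(.+)$ /data/$1.rdf [R=303,L]

        # Other clients are redirected to the HTML description page
        RewriteRule ^/id/(.+)$ /page/$1.html [R=303,L]
    </VirtualHost>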

Part A: lifting data

Principles

The workspace interface allows you to design projects. You can specify data sources that will be loaded by the server, then select and apply operations (modules) on these sources. This yields new data on which you can apply further operations. To lift data, you need to create a project or use an existing one. A project is an environment in which you can add sources, i.e. specify the location of the initial data that the server will fetch and work on, and in which you can select and trigger operations on these sources. The resulting data are in turn stored as sources of the project.

Step 0: create the project

To create a project, use the "workspace" interface.

Remote server: http://datalift.si.fr.atosorigin.com/datalift/project (login: datalift, password: test)
Your installation: http://localhost:9091/datalift/project

On the workspace, click on "New project" in the left column (fig. 1 and fig. 2). The name of the project is used in the working URIs, but these can be modified before publishing, in step 3. If you use the remote server, make sure to use a distinctive name for your project so that each of you works on their own project for the rest of the tutorial.

fig. 1

fig. 2

Step 1: specify sources

In the workspace view, select the newly created project, then click on the Sources tab and on the "+" sign at the bottom left of the Sources tab (fig. 3) to add a new source to the list of sources used in your project.

fig. 3

In the next window, select the format of the input data source: CSV, XML, GML, RDF, SHP, any DBMS for which a JDBC driver is available, or any SPARQL endpoint. Once you have selected a format, an interface helps you specify the location of your data so that the server can fetch it. In this tutorial, first use the SHP source loader to load DEPARTEMENT.shp (fig. 4), then the CSV source loader to load ADRESSES.csv (fig. 5).

fig. 4

fig. 5

Some sources can then be visualised by clicking on the source name (others not yet): CSV, RDF, XML and GML files can be visualised by clicking on their name in the Sources tab. You may also delete sources using the bin icon or modify them using the pen icon. Make sure you select the source by clicking on its rectangle, not on its name, before selecting the bin or pen icon.

Step 2: convert into RDF

To proceed to the conversion, go back to the Description tab. Different conversion modules are available depending on the kind of source.

Step 2.a: conversion of the SHP source to GML

In the case of a shapefile, you will apply two modules: first SHP to GML mapping, then GML to RDF mapping. Note that the SHP to GML conversion also changes the projection to WGS84, which is widely used in Linked Data.

fig. 6

Here we only have an SHP source, so the only module associated with it is the SHP to GML mapping (fig. 6). Click on the module; there is only one possible SHP file to select; click on Submit. The newly created GML file now appears in the list of sources, and if you click on it you can visualise it:

<?xml version="1.0" encoding="utf-8"?>
<ogr:FeatureCollection
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://ogr.maptools.org/ DEPARTEMENT_wgs84.xsd"
    xmlns:ogr="http://ogr.maptools.org/"
    xmlns:gml="http://www.opengis.net/gml">
  <gml:boundedBy>
    <gml:Box>
      <gml:coord><gml:X>-5.139017285433222</gml:X><gml:Y>41.36275742645721</gml:Y></gml:coord>
      <gml:coord><gml:X>9.559823318166366</gml:X><gml:Y>51.08939669830915</gml:Y></gml:coord>
    </gml:Box>
  </gml:boundedBy>
  <gml:featureMember>
    <ogr:DEPARTEMENT_wgs84 fid="DEPARTEMENT_wgs84.0">

      <ogr:geometryProperty><gml:Polygon><gml:outerBoundaryIs><gml:LinearRing><gml:coordinates>5.831226413621034,45.938459578293212 5.822166170149367,45.93026097668983 5.829400764539029,45.913987107834956 5.826316643217439,45.903692922700827 5.815154073062683 etc.

The next step is to apply a GML to RDF mapping to this GML file. To do so, go to the Description tab, where the GML to RDF mapping module is now available. Click on it; there is only one possible GML file to use for the module; click on Submit. The newly created RDF file appears in the Sources tab. You can click on it to display the generated RDF data (fig. 7).

fig. 7

WARNING: the GML to RDF mapping uses a configuration file that is specific to the GML schema used in the GML file. It is stored in {$DATALIFT_HOME}/storage/public/project/{PROJECT_NAME}/ and its name must be the same as the name of the GML file, with a .conf suffix. In the tutorial material you can see what it looks like: DEPARTEMENT_wgs84.conf. We also include an article from the author of the original source code (which we have reused in the platform) that explains the parameters of the configuration file; it is the file USGSReport.pdf in the Suggestedreadings directory.

Step 2.b: conversion of CSV to RDF

In the Description tab, select the module Direct mapping CSV to RDF. A Data type mapping section allows you to specify the datatype of each field (fig. 8).

fig. 8

An RDF source is created that has the same name as the input source, suffixed with (RDF #1). This name is generated automatically; feel free to change it.
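To give an idea of what a direct CSV to RDF mapping typically produces, here is a minimal sketch in Turtle for one hypothetical row of ADRESSES.csv. The namespace, column names, values and datatypes are assumptions for the example, not the actual output of the module.

    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    # Hypothetical base namespace and column names; a direct mapping usually
    # creates one resource per row and one property per column.
    @prefix ex:  <http://localhost:9091/project/tutorial/source/adresses#> .

    ex:row1 ex:adresse     "12 rue de la Paix" ;
            ex:code_postal "75002"^^xsd:string ;
            ex:latitude    "48.869"^^xsd:double ;
            ex:longitude   "2.331"^^xsd:double .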

Step 3: change the URIs

You may need to change the URIs in the generated RDF data, for example to replace the working namespace derived from the project name with the namespace under which the data will be published. To do so, go to the Description tab and select the RDF URI translation module (fig. 9).

fig. 9

A new source is created that has the same name as the previous one, with (RDF #i) at the end. This name is generated automatically; feel free to change it.

Step 4: transform the schema

To change the schema, you can use the RDF to RDF transformation (CONSTRUCT) module, as shown in fig. 10; a sketch of this kind of query is given after Step 6 below.

fig. 10

Step 5: publish the data

Once you have RDF sources, two new modules are available in the Description tab: Data publishing to public RDF store and RDF data export. The first module copies RDF data from the internal store to the public store. When publishing RDF data, the named graph URI is not visible as part of the data but acts as a container for the RDF triples, allowing you to manipulate them as a set (e.g. to delete or replace them). The URIs of your objects remain unchanged: if the server receives a request for one of these URIs, it will be able to serve the description of the object identified by that URI. The second module allows you to download the data locally, for example to upload them into another, remote RDF store.

Step 6: interconnection

Now that your data are published, it is worthwhile to match your instances with other instances and to publish the links.
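As an illustration of the kind of transformation Step 4 refers to, here is a minimal SPARQL CONSTRUCT sketch that maps column-based properties to a shared vocabulary. The source namespace reuses the hypothetical one from the CSV example above; the target vocabulary choice is an assumption, not the configuration actually used by the module.

    PREFIX ex:  <http://localhost:9091/project/tutorial/source/adresses#>   # hypothetical working namespace
    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>

    # Re-express the latitude/longitude properties with the W3C WGS84 vocabulary.
    CONSTRUCT {
      ?s geo:lat  ?lat ;
         geo:long ?long .
    }
    WHERE {
      ?s ex:latitude  ?lat ;
         ex:longitude ?long .
    }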

Part B: querying the lifted data

The data can be queried by sending:

1) HTTP requests to dereference the URIs (URLs in this case); the server returns a representation of the object after content negotiation based on the request headers: HTML, RDF/XML, Turtle or N3.

2) SPARQL queries to the SPARQL endpoint web service (see fig. 11).

Some pointers on the SPARQL language: http://www.w3.org/TR/rdf-sparql-query/

You can also experiment with the HTML files in the tutorial material to display Linked Data on cartographic portals.

fig. 11
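As a starting point for experimenting with the SPARQL endpoint, here is a minimal query sketch that lists the classes used in the published data, a quick way to check that the lifted triples are visible in the public store. The endpoint location is an assumption; it depends on your installation.

    # Send this to the platform's SPARQL endpoint (the exact URL depends on your setup).
    # List the classes used in the published data.
    SELECT DISTINCT ?class
    WHERE {
      ?s a ?class .
    }
    LIMIT 20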