SPARQL UniProt.RDF




Jerven Bolleman, Developer, Swiss-Prot Group, Swiss Institute of Bioinformatics

This tutorial assumes that everyone has had some introduction to, or has some knowledge of, RDF.

Get these slides!
https://sites.google.com/a/jerven.eu/jerven/home/Programming/swat4lsaveiro
Or search for #SWAT4LS on Twitter and look for my last tweet.

Tutorial plan
1. Set up TopBraid Composer (skipped in the talk)
2. Gather data from the UniProt website
3. Learn SPARQL

You do not need TopBraid Composer to use UniProt RDF data or to run SPARQL queries. You should have used TopBraid Composer in this morning's tutorial. If not, have a look at the next few slides.

Before starting
Have a look at http://purl.uniprot.org/core/, where you can find the documentation on the schema of UniProt RDF.

Download and install TopBraid Composer
Requirements: Sun/Oracle JVM
1. Go to http://www.topquadrant.com/products/TB_download.html
2. Register.
3. Select any edition; the free one is fine for today.

Start TopBraid
If you have never used TopBraid before, you will have an empty workspace.

Setting up a workspace for this tutorial
For today, please create a new, empty workspace so that nothing you do here affects your previous work.

New project
File > New Project > General
Give your project a recognizable name.

Gather data from uniprot.org website
The .project file contains the project details; do not delete it! In the navigator, select the new project you just made.

1. Right-click on your new project and select Import in the drop-down menu.
2. Choose "Import RDF or OWL file from the web". In this case we are going to use a UniProt entry for our examples.
3. Fill in the source and target URL, then click Finish. You can see an HTML view of this entry at http://www.uniprot.org/uniprot/p05067
4. Do the same for http://www.uniprot.org/owl/core.rdf and name it core.owl. core.owl contains the "schema" data for UniProt RDF.
5. Open the P05067 file by double-clicking it.

You get a very helpful dialog. Hit Yes. This automatically imports the ontologies used by UniProt that are not inside the core.owl file, and their imports as well.

Using the same process, download core.owl. You can see an HTML view of this schema ontology at http://www.uniprot.org/core/

Where are all the UniProt classes?
Have a look at the Classes tab. The number between the brackets is the number of instances of that class in your file.

Function_Annotation in P05067
Some datatype documentation is shown with the class. If the instances view is empty, double-click on Function_Annotation in the Classes view. Double-click on the top resource to see it in more detail.

Unstructured text comment
This is the top Function_Annotation instance from the previous slide.

Unstructured text comment
Use the Source Code tab to see the triples in raw formats. The Turtle view is helpful when you start to write SPARQL queries.

Let's learn SPARQL
In this example session I will only show SELECT and CONSTRUCT queries over RDF data.

Four basic query types:
SELECT: returns a table of results
CONSTRUCT: makes new triples
DESCRIBE: returns all triples mentioning a resource
ASK: returns true if anything matches

Let's learn SPARQL
The query editor has one area where you type your query and another where you see your results.

Each line in the WHERE clause is a triple pattern; names that start with ? are variables. The keyword a is shorthand for rdf:type.

Here we select the five instances that we saw earlier in the Classes -> Instances tab:

    SELECT *
    WHERE {
      ?protein rdf:type core:Protein .
      ?protein core:annotation ?functionann .
      ?functionann a core:Function_Annotation .
    }
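To make the idea of triple patterns and variables concrete, here is a minimal Python sketch of how such a WHERE clause could be matched against data. It is a toy illustration, not how TopBraid Composer or any real SPARQL engine works, and every triple in it (the ex: identifiers in particular) is invented for the example.

```python
# Toy illustration of triple-pattern matching (not a real SPARQL engine).
# All triples are made-up stand-ins for the data in the P05067 file.
TRIPLES = [
    ("ex:P05067", "rdf:type", "core:Protein"),
    ("ex:P05067", "core:annotation", "ex:ann1"),
    ("ex:P05067", "core:annotation", "ex:ann2"),
    ("ex:ann1", "rdf:type", "core:Function_Annotation"),
    ("ex:ann2", "rdf:type", "core:Disease_Annotation"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


QUERY = [
    ("?protein", "rdf:type", "core:Protein"),
    ("?protein", "core:annotation", "?functionann"),
    ("?functionann", "rdf:type", "core:Function_Annotation"),
]

rows = select(QUERY, TRIPLES)
print(rows)
```

Each pattern narrows the list of candidate bindings, which is why ex:ann2 drops out: it is an annotation of the protein, but the third pattern requires it to be a Function_Annotation.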

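Constructing an owl:sameAs between two URIs
str() changes an IRI into a string.
concat() and substr() do string manipulation.
IRI() changes the string back into an IRI.

NOT EXISTS (negation)

    SELECT *
    WHERE {
      ?link a core:Resource .
      FILTER NOT EXISTS { ?link core:database ?database . }
    }

In SPARQL 1.1 as finalized, NOT EXISTS is written inside a FILTER.

count

    SELECT count(*)
    WHERE { ?subject ?predicate ?object }

This short form is accepted by TopBraid Composer; the standard SPARQL 1.1 form is SELECT (count(*) AS ?count).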
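The str(), concat, substring and IRI() steps listed above amount to plain string surgery on an identifier. The Python sketch below mirrors those steps outside SPARQL. Both namespaces are assumptions made for illustration: the UniProt PDB prefix is a guess at the form of the cross-reference IRIs, and the target prefix is purely hypothetical, not a real PDBj address.

```python
# Sketch of the str / substring / concat / IRI steps as ordinary string handling.
UNIPROT_PDB_PREFIX = "http://purl.uniprot.org/pdb/"  # assumed source namespace
TARGET_PREFIX = "http://pdbj.example.org/rdf/"       # hypothetical target namespace


def same_as_triple(uniprot_iri):
    """Build an owl:sameAs triple linking a UniProt PDB IRI to the target IRI."""
    if not uniprot_iri.startswith(UNIPROT_PDB_PREFIX):
        raise ValueError("not a UniProt PDB IRI: " + uniprot_iri)
    # Like substr(str(?iri), ...): keep only the local identifier.
    pdb_id = uniprot_iri[len(UNIPROT_PDB_PREFIX):]
    # Like IRI(concat(prefix, id)): glue the identifier onto the new namespace.
    target_iri = TARGET_PREFIX + pdb_id
    return (uniprot_iri, "owl:sameAs", target_iri)


print(same_as_triple("http://purl.uniprot.org/pdb/1AAP"))
```

In a CONSTRUCT query the same three steps would run once per matched resource, producing one new triple for each.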

Extra material: path queries
At the time of this talk SPARQL 1.1 was still a draft; path queries were expected to differ slightly in output, but not in syntax, in the final version.

    ?s core:range/core:begin ?o                    range property, then begin property
    ?s core:begin|core:end ?o                      begin or end property
    ?s core:range* ?o                              zero or more steps
    ?s core:range+ ?o                              one or more steps
    ?s core:range{2,3} ?o                          two or three steps
    ?s core:annotation/core:range/core:begin ?p    any annotation's begin position

Note that the {2,3} counting form was dropped from the final SPARQL 1.1 recommendation.

Filter
FILTER can be used to remove potential matches from the pattern.

Filter comparisons

    ?a > ?b     a is greater than b
    ?a < ?b     a is smaller than b
    ?a = ?b     a has the same value as b
    ?a != ?b    a has a different value than b

Filters
The available options depend on the values, e.g. < and > only work on values that can be ordered, such as numbers.

Filtering on string values

    ?a = ?b     a has the same value as b
    ?a != ?b    a has a different value than b
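As a rough mental model, a FILTER behaves like a test applied to each candidate row of results, keeping only the rows that pass. The Python sketch below shows that idea with the comparison operators listed above. The rows and their begin and end positions are invented for the example and do not come from the P05067 file.

```python
# Made-up candidate solutions, as a query might produce before filtering.
solutions = [
    {"?ann": "ex:ann1", "?begin": 18, "?end": 770},
    {"?ann": "ex:ann2", "?begin": 672, "?end": 713},
    {"?ann": "ex:ann3", "?begin": 100, "?end": 100},
]


def apply_filter(rows, condition):
    """Keep only the rows for which condition(row) is true, like FILTER."""
    return [row for row in rows if condition(row)]


# Comparable to FILTER (?end > ?begin)
longer_than_one = apply_filter(solutions, lambda r: r["?end"] > r["?begin"])

# Comparable to FILTER (?begin != 18 && ?begin > 50)
starts_late = apply_filter(
    solutions, lambda r: r["?begin"] != 18 and r["?begin"] > 50
)

print([r["?ann"] for r in longer_than_one])
print([r["?ann"] for r in starts_late])
```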

Regular expressions
Most Perl-style regex options work, except for capturing groups.

Why don't these queries work on the web?
PREFIX: TopBraid Composer uses the prefixes defined in the file's Overview tab. On the web you often have to add these yourself.

    PREFIX : <http://purl.uniprot.org/core/>
    SELECT ?x
    FROM <http://purl.uniprot.org/taxonomy/>
    WHERE { ?x a :Taxon }

More UniProt RDF
http://www.uniprot.org/downloads (see the bottom of the page for RDF)
http://www.uniprot.org/faq/28
Queries on the website can be downloaded as RDF, e.g. only human entries:
http://www.uniprot.org/uniprot/?query=organism%3a9606&sort=score&format=rdf
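Since prefixes have to be spelled out when a query leaves TopBraid Composer, one way to avoid forgetting them is to prepend them programmatically. The sketch below is a hypothetical helper, not part of any UniProt or TopBraid tooling. The rdf and rdfs namespaces are the standard W3C ones, the core namespace is the one given in these slides, and the check for which prefixes are used is a deliberately naive substring test.

```python
# Namespaces a query from this tutorial is likely to need.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "core": "http://purl.uniprot.org/core/",
}


def with_prefixes(query, prefixes=PREFIXES):
    """Prepend a PREFIX line for every known prefix that the query mentions.

    The check is a plain substring test, so it is only a rough heuristic.
    """
    used = [name for name in prefixes if f"{name}:" in query]
    header = "".join(f"PREFIX {name}: <{prefixes[name]}>\n" for name in used)
    return header + query


query = "SELECT * WHERE { ?protein rdf:type core:Protein . }"
print(with_prefixes(query))
```

Only the prefixes that appear in the query are added, so the result stays short enough to paste into a web form.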

Let's infer
1. Go back to the top level of the file by double-clicking again on the file name in the Navigator tab. You should get a view that you saw earlier in this tutorial.
2. Change to the Profile tab, which also lists some ontologies used by uniprot.org.
3. Tick the OWL2RL and RDFS Plus boxes and save. This enables the reasoner.

Run the reasoner
In the menu, choose Inference > Run Inferences.

name is inferred to be an rdfs:label
Inferred! Inferencing can make queries easier, or it can truly infer new knowledge.

Side note: annotations (such as the name above) are annotations in the OWL sense, not in the sense of curated biological annotation.

Quick navigation
Using the red box you can quickly jump to an instance.

Inferencing changes the results of queries

    SELECT *
    WHERE { ?subject rdfs:label "FASEB J." . }

Try this query before and after resetting the inferences (in the menu bar, under Inference).

count

    SELECT count(*)
    WHERE { ?subject ?predicate ?object }

With inferences this returns 26876 triples instead of 13176: a bit more than double!

SELECT count(*) WHERE { ?s ?p ?o } is the same as

    SELECT (count(*) AS ?count)
    WHERE { ?s ?p ?o }

The second form is more widely accepted by stores.
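To see why the triple count grows and why the rdfs:label query starts returning results, it may help to look at a single inference rule in isolation. The Python sketch below applies only the RDFS sub-property rule to two invented triples. It assumes, for illustration, that core:name is declared a sub-property of rdfs:label. The real OWL2RL and RDFS Plus profiles apply many more rules than this.

```python
def infer_subproperty(triples):
    """Apply one RDFS rule until nothing new appears:
    (p rdfs:subPropertyOf q) and (s p o) together give (s q o).
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub_props = [(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"]
        for prop, super_prop in sub_props:
            new = {
                (s, super_prop, o) for s, p, o in inferred if p == prop
            } - inferred
            if new:
                inferred |= new
                changed = True
    return inferred


# Made-up data: one journal name plus the assumed schema statement.
asserted = {
    ("ex:journal1", "core:name", "FASEB J."),
    ("core:name", "rdfs:subPropertyOf", "rdfs:label"),
}

closed = infer_subproperty(asserted)
print(len(asserted), len(closed))
```

Before inference a query for rdfs:label "FASEB J." finds nothing in this toy data; afterwards the inferred triple makes it match.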

Adding your own rules to the inferencer
Remember the linking between UniProt and PDBj identifiers? Using SPIN rules, one can do this automatically.

First import the SPIN schema: open the Imports tab and use the local import function to import the SPIN schema.

Select spin.rdf and hit OK. After pressing OK, save.

Structure_Resource
Find the Structure_Resource class, using either the Classes tab or the quick navigator. Add an empty row to spin:constructor. The small downward-pointing triangle next to spin:constructor is the key UI element here.

You get a SPARQL CONSTRUCT query; finish it as earlier. Now add the query as shown on the slide. The difference is the use of the IRI function instead of the URI function used earlier. URI is an official synonym for the IRI function, but due to a small bug you can't use it here.

Run the reasoner
In the menu, choose Inference > Run Inferences. See the new owl:sameAs links. You just mapped UniProt PURL identifiers to PDBj identifiers and made them logically point to the same resource.

Running SPIN on lots of data without TopBraid Composer
Open source: have a look at www.spinrdf.org
Closed source: have a look at the AllegroGraph triple store

Thank you for your time!