ARC: appmosphere RDF Classes for PHP Developers



Similar documents
PHP Integration Kit. Version User Guide

Towards the Integration of a Research Group Website into the Web of Data

ABSTRACT 1. INTRODUCTION. Kamil Bajda-Pawlikowski

WIRIS quizzes web services Getting started with PHP and Java

12 The Semantic Web and RDF

XML Processing and Web Services. Chapter 17

Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce

Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo

Lift your data hands on session

Epimorphics Linked Data Publishing Platform

Optimizing Drupal Performance. Benchmark Results

Drupal CMS for marketing sites

Semantic Stored Procedures Programming Environment and performance analysis

LDIF - Linked Data Integration Framework

Powl A Web Based Platform for Collaborative Semantic Web Development

Using RDF Metadata To Enable Access Control on the Social Semantic Web

Data Store Interface Design and Implementation

OSLC Primer Learning the concepts of OSLC

A Java proxy for MS SQL Server Reporting Services

Managing Content in System Center 2012 R2 Configuration Manager

Preface. Motivation for this Book

Smooks Dev Tools Reference Guide. Version: GA

Dynamic Web Programming BUILDING WEB APPLICATIONS USING ASP.NET, AJAX AND JAVASCRIPT

How to Design and Create Your Own Custom Ext Rep

A DATA QUERYING AND MAPPING FRAMEWORK FOR LINKED DATA APPLICATIONS

Accessing Data with ADOBE FLEX 4.6

Web Services for Management Perl Library VMware ESX Server 3.5, VMware ESX Server 3i version 3.5, and VMware VirtualCenter 2.5

LabVIEW Internet Toolkit User Guide

PHP Tutorial From beginner to master

City Data Pipeline. A System for Making Open Data Useful for Cities. stefan.bischof@tuwien.ac.at

Programming IoT Gateways With macchina.io

FileMaker Server 9. Custom Web Publishing with PHP

Deploying Microsoft Operations Manager with the BIG-IP system and icontrol

Search and Information Retrieval

Linked Data Publishing with Drupal

Drupal.

Developing a Web Server Platform with SAPI Support for AJAX RPC using JSON

How To Build A Connector On A Website (For A Nonprogrammer)

Lightweight Data Integration using the WebComposition Data Grid Service

RDF Resource Description Framework

Easy configuration of NETCONF devices

Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms

IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>>

Big Data Analytics. Rasoul Karimi

Firewall Builder Architecture Overview

The HTTP Plug-in. Table of contents

Benchmarking the Performance of Storage Systems that expose SPARQL Endpoints

Fairsail REST API: Guide for Developers

WEB DEVELOPMENT COURSE (PHP/ MYSQL)

Exam Name: IBM InfoSphere MDM Server v9.0

powl Features and Usage Overview

T320 E-business technologies: foundations and practice

Grids, Logs, and the Resource Description Framework

Webair CDN Secure URLs

Author: Gennaro Frazzingaro Universidad Rey Juan Carlos campus de Mostòles (Madrid) GIA Grupo de Inteligencia Artificial

Standards, Tools and Web 2.0

JAVA r VOLUME II-ADVANCED FEATURES. e^i v it;

INSTALLING SQL SERVER 2012 EXPRESS WITH ADVANCED SERVICES FOR REDHORSE CRM

SiteCelerate white paper

Migrating IIS 6.0 Web Application to New Version Guidance

MAGENTO HOSTING Progressive Server Performance Improvements

WWW. World Wide Web Aka The Internet. dr. C. P. J. Koymans. Informatics Institute Universiteit van Amsterdam. November 30, 2007

Administering Jive for Outlook

PHP on IBM i: What s New with Zend Server 5 for IBM i

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

10CS73:Web Programming

Setting Up a CLucene and PostgreSQL Federation

MarkLogic Server. Node.js Application Developer s Guide. MarkLogic 8 February, Copyright 2016 MarkLogic Corporation. All rights reserved.

Intro to Web Programming. using PHP, HTTP, CSS, and Javascript Layton Smith CSE 4000

Facebook Twitter YouTube Google Plus Website

Flow Publisher v1.0 Getting Started Guide. Get started with WhatsUp Flow Publisher.

Transition your MCPD Web Developer Skills to MCPD ASP.NET Developer 3.5 (VB)

ELIS Multimedia Lab. Linked Open Data. Sam Coppens MMLab IBBT - UGent

Efficiency of Web Based SAX XML Distributed Processing

The full setup includes the server itself, the server control panel, Firebird Database Server, and three sample applications with source code.

EUR-Lex 2012 Data Extraction using Web Services

WebSocket Server. To understand the Wakanda Server side WebSocket support, it is important to identify the different parts and how they interact:

Scalable and Reactive Programming for Semantic Web Developers

Address Phone & Fax Internet

DISCOVERING RESUME INFORMATION USING LINKED DATA

Working With Virtual Hosts on Pramati Server

7 Why Use Perl for CGI?

Introducing the Microsoft IIS deployment guide

Publishing Relational Databases as Linked Data

UFTP AUTHENTICATION SERVICE

Extending the Linked Data API with RDFa

Transcription:

ARC: appmosphere RDF Classes for PHP Developers Benjamin Nowack appmosphere web applications, Kruppstr. 100, 45145 Essen, Germany bnowack@appmosphere.com Abstract. ARC is an open source collection of lightweight PHP scripts optimized for RDF development on hosted Web servers. It currently consists of a non-validating RDF/XML parser, an N-Triples Serializer, and a "Simple Model" class providing common methods for working with resource descriptions. The three main classes are stand-alone, single-file scripts, thus facilitating the bundling with existing PHP-based applications. By partly using arrays instead of objects, ARC offers speed improvements compared to toolkits that follow approaches completely based on PHP objects. 1 Motivation Despite the availability of the feature-rich and mature RAP - RDF API for PHP [1], there was ongoing demand for RDF libraries (especially for an RDF/XML parser) which could be bundled with existing PHP frameworks more easily. RAP is deployed as an integrated API, so that version changes usually require to completely replace a prior installation. ARC 1 uses more lightweight, stand-alone classes in order to allow independent use and to facilitate upgrading tasks. 2 RDF Classes ARC currently consists of three basic class files for working with Semantic Web data: An RDF/XML parser, an N-Triples serializer, and a "Simple Model" class which provides common methods for working with resource descriptions. 2.1 ARC RDF/XML Parser The RDF/XML Parser class can be used to parse a passed string, to load and parse data from the local machine, or to load and parse data from the Web. The built-in Web reader does not follow HTTP redirects yet, but it accepts several parameters to e.g. work with proxies, send custom HTTP headers, or limit the number of lines to 1 ARC stands for appmosphere RDF classes. The classes and documentation are available online at http://www.appmosphere.com/en-arc

2 Benjamin Nowack parse. Additionally, the class can be configured to retrieve content-negotiated RDF/XML (via an HTTP Accept header), or to keep the raw RDF/XML data and/or response headers of the remote server in memory. The code snippet below shows a usage example. /* class inclusion */ include_once("arc_rdfxml_parser.php"); /* class configuration (optional) */ $args=array( "encoding"=>"auto",/* auto-detect UTF-8 etc */ "proxy_host"=>"192.168.27.1", "proxy_port"=>8080, "user_agent"=>"myparser v1.0" ); /* instantiation */ $parser=new ARC_rdfxml_parser($args); /* parsing a file from the Web */ $url="http://www.example.com/data.rdf"; $result=$parser->parse_web_file($url); if(is_array($result)){ echo count($result)." triples found"; else{ echo "couldn't parse ".$url.": ".$result; The parser does not check the parsed code for complete RDF-validity, only XML errors and some very basic RDF/XML syntax [2] errors are detected. The result of the parsing process is either an error message or an array of triples. 2.2 ARC N-Triples Serializer The serializer class generates an N-Triples [3] string from an array of triples. The structure of the passed triple array has to conform to the following ARC triple array structure: Each entry is an associative array with keys s, p, and o: - s (a subject node array) - p (a predicate URI string) - o (an object node array)

ARC: appmosphere RDF Classes for PHP Developers 3 A subject node array may have the following keys: - type ("uri" or "bnode") - uri (if type is "uri") - bnode_id (if type is "bnode") An object node array may contain additional key-value pairs: - type ("uri", or "bnode", or "literal") - uri (if type is "uri") - bnode_id (if type is "bnode") - val (if type is "literal") - dt (datatype URI if available and type is "literal") - lang (language code if available and type is "literal") The serializer provides only a small number of configuration options. It is possible to set the linebreak and the space characters as illustrated below. /* class inclusion */ include_once("arc_ntriples_serializer.php"); /* class configuration (optional) */ $args=array("linebreak"=>"\n", "spacer"=>"\t"); /* instantiation */ $ser=new ARC_ntriples_serializer($args); /* displaying the N-Triples code */ echo $ser->get_ntriples($triples); 2.3 ARC Simple Model ARC's Simple Model class indexes a triple array and offers methods to e.g. return a list of resources, find resources based on a provided identifier, parse RDF lists, or extract property values. A detailed usage example is given in section 3. 3 Use cases By combining the RDF/XML Parser (or an adjusted triple array queried from a basic RDF triple store) with the Simple Model class, it is easily possible to generate HTML views from RDF. Figure 1 shows a resource description summary which could be created by passing two Simple Model instances (one with vocabulary information, and one with data about individuals) to an HTML rendering script. The latter is not included in ARC but the PHP code snippets below Fig. 1 exemplify how data can be

4 Benjamin Nowack pre-processed by ARC in order to retrieve grouped property values and label information. Fig. 1. HTML view generated by combining information of two ARC Simple Models PHP code to instantiate the two ARC Simple Model classes: /* abbreviations */ $rdfs="http://www.w3.org/2000/01/rdf-schema#"; $foaf="http://xmlns.com/foaf/0.1/"; /* vocabulary model */ $v_triples=$parser->parse_web_file($foaf); $model_args=array( "triples"=>$v_triples, "ns_abbrs"=>array($foaf=>"foaf", $rdfs=>"rdfs") ); $v_model=new ARC_simple_model($model_args); /* data model */ $d_url="http://www.example.com/my_foaf.rdf"; $d_triples=$parser->parse_web_file($d_url); $model_args["triples"]=$d_triples; $d_model=new ARC_simple_model($model_args);

ARC: appmosphere RDF Classes for PHP Developers 5 PHP code to find a foaf:person resource in a foaf:personalprofiledocument: /* find PPD in data model */ $ppd_qname="foaf:personalprofiledocument"; if(!$ppds=$d_model->get_resources($ppd_qname)){ return false; $ppd=$ppds[0]; /* get person identifier via primarytopic */ if(!$id=$d_model->rpv 2 ($ppd, "foaf:primarytopic")){ return false; $person=$d_model->get_resource($id); PHP code to display labeled property values: /* name value */ $name_val=$d_model->rpv($person, "foaf:name"); if(!$name_val){ /* try alternative naming properties */ /* mbox values */ $cur_vals=array(); if($cur_props=$person["props"]["foaf:mbox"]){ foreach($cur_props as $cur_prop){ $cur_vals[]=$cur_prop["val"]; /* mbox label (use vocabulary model) */ if($terms=$v_model->get_resources("foaf:mbox")){ $term=$terms[0];/* an rdf:property instance */ $cur_label=$v_model->rpv($term, "rdfs:label"); /* display */ echo '<h2>'.$name_val.'</h2>'; echo ($cur_label)? $cur_label : "mbox"; echo "- ".join("<br />- ", $cur_vals); 2 "rpv" is an alias for the method "get_resource_prop_val" which retrieves a single property value for a given resource.

6 Benjamin Nowack Further use cases include reading, importing, and displaying of RSS 1.0 feeds, or any other RDF/XML-compatible data. At the moment, ARC is best suited for rapid RDF development or as a small addition to existing codebases, but the classes have already been successfully tested in larger projects (e.g. the new SemanticWeb.org portal) as well. 4 Performance RAP is currently the only other toolkit written completely in PHP (which is a prerequisite to be deployable to cheap hosted Web server environments) that conforms to the revised RDF specifications [4]. The performance tests done for this paper are limited to comparing the RDF/XML parsers. A third option (w.r.t. to parsing RDF) is Morten Frederiksen's SimpleRdfParser [5], a wrapper class for RAP that uses arrays instead of RAP's internal objects; it was tested as well. The Redland Application Framework [6] also provides bindings for PHP [7]. However, the PHP bindings can neither be installed on average hosted Web servers nor on the Windows machine that was used for the performance tests 3. Redland was not included in the benchmarks. It would have been an interesting reference, though. The tests were run on a desktop PC (Pentium IV 1.8 GHz, Windows 2000), using PHP 4.3.6, and the xdebug extension [8] for profiling. The long execution times in the benchmark results are caused by having to disable Zend optimizer while using the profiling extension. Switching back from xdebug to the Zend optimizer accelerates the parsers by a factor of 4. Parsing RDF/XML Files ARC (v0.2.0), SimpleRdfParser, and RAP (v0.9.1) were tested with 3 different files: 1. An RSS 1.0 news feed (338 triples) 2. The W3C RDF test cases manifest file 4 (2160 triples) 3. A large FOAF file (8601 triples) RDF files beyond this size are usually stored in a database, these tests cover common in-memory parsing tasks only. As the SimpleRdfParser can't load files directly, each test passes the RDF/XML code as a single string variable to the parser. RAP SimpleRdfParser ARC Processing (total) 2.921 s 1.984 s 0.834 s Triples/s (total) ~116 ~170 ~405 Triples/s (excl. compiling) ~149 ~265 ~431 Table 1. Parsing an RSS 1.0 news feed containing 338 triples. 3 Installing the bindings was not possible with the existing PHP engine. It would have been neccessary to re-compile the PHP sources. 4 http://www.w3.org/2000/10/rdf-tests/rdfcore/manifest.rdf

ARC: appmosphere RDF Classes for PHP Developers 7 RAP SimpleRdfParser ARC Processing (total) 16.257 s 4.372 s 2.037 s Triples/s (total) ~133 ~494 ~1060 Triples/s (excl. compiling) ~134 ~553 ~1131 Table 2. Parsing the W3C test cases manifest file containing 2160 triples. RAP SimpleRdfParser ARC Processing (total) 61 s 12 s 6 s Triples/s (total) ~141 ~717 ~1434 Table 3. Parsing a large FOAF file containing 8601 triples, xdebug turned off 5. The benchmark results 6 show that using the SimpleRdfParser is about two to four times faster than building a complete in-memory model via RAP. ARC is about twice as fast as the SimpleRdfParser. The difference between using PHP objects vs. arrays becomes more obvious with a growing number of triples in a model. While the SimpleRdfParser is 1.5 times faster than RAP when parsing the smallest RDF file, it is 3.7 times faster at parsing the 2160 triples, and about 5 times faster when processing the large file. These tests don't consider that RAP is not only parsing but also building indexes and validating the parsed code. But it could still make sense to examine the xdebug results in more detail in order to see if RAP may be partly made faster. ARC's speed could for example be improved by working around PHP's automatic type conversion mechanism (e.g. by using "===" instead of "==") and by applying simple string functions such as strpos instead of regular expressions. 5 Conclusion and Next Steps ARC is a lightweight alternative (or addition) to RAP's feature-rich API when only limited functionality such as parsing or displaying is required. Being open source and written entirely in PHP, the RDF classes can be bundled with PHP frameworks and used in shared Web hosting environments. The advantage of using PHP arrays instead of object structures increases with the size of parsed RDF/XML documents. The main idea of ARC is to provide lightweight RDF classes. However, several options have already been added to ARC which might not be needed for certain use cases. It is planned to build a simple parser generation service which will allow developers to tailor the ARC scripts to their needs. It will be possible to e.g. exclude the method for using an HTTP socket instead of PHP's one-line fopen function, or to automatically remove comments. Together with other configuration options the parser's size could be reduced to 20KB or less. 5 xdebug had to be disabled for this test as it needed to much main memory to profile RAP 6 The complete benchmark details can be downloaded from http://www.appmosphere.com/2005/php_parser_benchmarks_2005_04_15.pdf

8 Benjamin Nowack Other considerations include extending the parser with more validation features, making the Web reader follow HTTP redirects, or adding an RDF store interface to the ARC collection. A basic SPARQL [9] implementation is already available [10]. References 1. Bizer, C., Oldakowski, R.: RAP: RDF API for PHP. Berlin (2004). http://www.wiwiss.fu-berlin.de/suhl/radek/pub/rap-oldakowski.pdf 2. Beckett, D.: RDF/XML Syntax Specification (Revised). W3C (2004). http://www.w3.org/tr/rdf-syntax-grammar/ 3. Grant, J., Beckett, D.: RDF Test Cases. W3C (2004). http://www.w3.org/tr/rdf-testcases/ 4. Resource Description Framework (RDF). W3C (2004). http://www.w3.org/rdf/ 5. Frederiksen, M.: Easy RDF-parsing with PHP. http://www.wasab.dk/morten/blog/archives/2004/05/31/easy-rdf-parsing-with-php 6. Beckett, D.: Redland RDF Application Framework. http://librdf.org/ 7. Beckett, D.: Redland RDF Language Bindings - PHP Interface. http://librdf.org/docs/php.html 8. xdebug. http://xdebug.org/ 9. Prud'hommeaux, E., Seaborne, A.: SPARQL Query Language for RDF. W3C (2005). http://www.w3.org/tr/rdf-sparql-query/ 10. Nowack, B.: ARC SPARQL Parser. http://www.appmosphere.com/pages/en-arc_sparql_parser