An Efficient and Scalable Management of Ontology



Similar documents
DLDB: Extending Relational Databases to Support Semantic Web Queries

SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA

Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce

Development of Ontology for Smart Hospital and Implementation using UML and RDF

Logical and categorical methods in data transformation (TransLoCaTe)

A generic approach for data integration using RDF, OWL and XML

JOURNAL OF COMPUTER SCIENCE AND ENGINEERING

Transformation of OWL Ontology Sources into Data Warehouse

Completing Description Logic Knowledge Bases using Formal Concept Analysis

A Secure Mediator for Integrating Multiple Level Access Control Policies

OWL based XML Data Integration

Benchmarking the Performance of Storage Systems that expose SPARQL Endpoints

High-Performance, Massively Scalable Distributed Systems using the MapReduce Software Framework: The SHARD Triple-Store

Supporting Change-Aware Semantic Web Services

Data Validation with OWL Integrity Constraints

Performance Analysis, Data Sharing, Tools Integration: New Approach based on Ontology

A Semantic Web Knowledge Base System that Supports Large Scale Data Integration

Semantic Web Technologies and Data Management

LUBM: A Benchmark for OWL Knowledge Base Systems

Semantic Web Standard in Cloud Computing

Evaluating SPARQL-to-SQL translation in ontop

Semantic Concept Based Retrieval of Software Bug Report with Feedback

Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis

I. INTRODUCTION NOESIS ONTOLOGIES SEMANTICS AND ANNOTATION

Ontologies, Semantic Web and Virtual Enterprises. Marek Obitko Rockwell Automation Research Center, Prague, Czech Republic 1

Use of OWL and SWRL for Semantic Relational Database Translation

Incorporating Semantic Discovery into a Ubiquitous Computing Infrastructure

KEYWORD SEARCH IN RELATIONAL DATABASES

Developing a Web-Based Application using OWL and SWRL

Graph Database Performance: An Oracle Perspective

Logic and Reasoning in the Semantic Web (part I RDF/RDFS)

Semantic Stored Procedures Programming Environment and performance analysis

HadoopSPARQL : A Hadoop-based Engine for Multiple SPARQL Query Answering

Techniques to Produce Good Web Service Compositions in The Semantic Grid

Towards the Integration of a Research Group Website into the Web of Data

Semantic Knowledge Management System. Paripati Lohith Kumar. School of Information Technology

excellent graph matching capabilities with global graph analytic operations, via an interface that researchers can use to plug in their own

Ontology-Based Discovery of Workflow Activity Patterns

Context Capture in Software Development

Ontologies for elearning

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management

Efficient SPARQL-to-SQL Translation using R2RML to manage

DISTRIBUTED RDF QUERY PROCESSING AND REASONING FOR BIG DATA / LINKED DATA. A THESIS IN Computer Science

RDF y SPARQL: Dos componentes básicos para la Web de datos

City Data Pipeline. A System for Making Open Data Useful for Cities. stefan.bischof@tuwien.ac.at

Detection and Elimination of Duplicate Data from Semantic Web Queries

Big Data, Fast Data, Complex Data. Jans Aasman Franz Inc

The Semantic Web for Application Developers. Oracle New England Development Center Zhe Wu, Ph.D. 1

Semantic Web Services for e-learning: Engineering and Technology Domain

SPARQL: Un Lenguaje de Consulta para la Web

Dependency Free Distributed Database Caching for Web Applications and Web Services

Lesson 8: Introduction to Databases E-R Data Modeling

Exploring Incremental Reasoning Approaches Based on Module Extraction

ONLINE EXPERT SYSTEM IN AGRICULTURE

Fuzzy-DL Reasoning over Unknown Fuzzy Degrees

Secure Semantic Web Service Using SAML

OBDA: Query Rewriting or Materialization? In Practice, Both!

Efficient Mapping XML DTD to Relational Database

A Framework for Ontology-based Context Base Management System

Defining Equity and Debt using REA Claim Semantics

Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications

Characterizing Knowledge on the Semantic Web with Watson

ONTOLOGY-BASED APPROACH TO DEVELOPMENT OF ADJUSTABLE KNOWLEDGE INTERNET PORTAL FOR SUPPORT OF RESEARCH ACTIVITIY

Efficiently Identifying Inclusion Dependencies in RDBMS

Creating an RDF Graph from a Relational Database Using SPARQL

No More Keyword Search or FAQ: Innovative Ontology and Agent Based Dynamic User Interface

Semantic Enhancement for Enterprise Data Management

Application of OASIS Integrated Collaboration Object Model (ICOM) with Oracle Database 11g Semantic Technologies

SmartLink: a Web-based editor and search environment for Linked Services

Powl A Web Based Platform for Collaborative Semantic Web Development

ON ANALYZING THE DATABASE PERFORMANCE FOR DIFFERENT CLASSES OF XML DOCUMENTS BASED ON THE USED STORAGE APPROACH

G-SPARQL: A Hybrid Engine for Querying Large Attributed Graphs

Semantic Interoperability

Performance Evaluation of Java Object Relational Mapping Tools

Transcription:

An Efficient and Scalable Management of Ontology Myung-Jae Park 1, Jihyun Lee 1, Chun-Hee Lee 1, Jiexi Lin 1, Olivier Serres 2, and Chin-Wan Chung 1 1 Korea Advanced Institute of Science and Technology, Korea {jpark,hyunlee,leechun,jesse,chungcw}@islab.kaist.ac.kr 2 University of Technology of Belfort-Montbéliard, France olivier.serres@utbm.fr Abstract. OWL is a recommended language for publishing and sharing ontologies on the Semantic Web. To manage the ontologies, several OWL data management systems have been proposed. However, the existing systems have limitations of the scalability and the reasoning. In this paper, we propose an OWL data management system, ONTOMS, which stores OWL data into class based relations, performs complete inverseof, symmetric, and transitive reasoning for instances, and efficiently evaluates OWL-QL queries against ontologies in a relational database. 1 Introduction Web Ontology Language(OWL) [1] is a semantic markup language for publishing and sharing ontologies on the Web. OWL is developed as a vocabulary extension of RDF [2] and RDFS [3] to increase the expressive power of ontology data, which leads OWL as a recommended ontology language for the Semantic Web. OWL data can be represented by a graph like RDF as shown in Fig. 1. To support the expressive power of OWL data, several OWL reasoners [4 6] have been proposed. However, those reasoners confronted the scalability issue, due to the use of memory. To overcome this problem, RDBMS based OWL data management systems [7 9] have been proposed. Since RDBMSs do not support the reasoning ability, those systems cannot obtain complete class and property hierarchies and perform the instance reasoning. As a result, those systems incorporated OWL reasoners to obtain such hierarchies completely. However, due to the scalability drawback of OWL reasoners, instance reasoning is not supported. To retrieve instances of classes or properties, OWL Query Language(OWL- QL) [1] has been proposed. OWL-QL is based on query patterns, in the form of (property, subject, object). To evaluate query patterns over OWL data stored in RDBMS, a proper relational schema is required. However, existing systems do not incorporate efficiency consideration in designing their relational schemas. Thus, in this paper, we propose ONTOMS, an efficient and scalable ONTOlogy Management System, to efficiently manage large sized OWL data. ONTOMS stores OWL data into a class based relational schema to increase query processing performance. On the average, the query performance of ONTOMS is about

2 Myung-Jae Park et al. Fig. 1. An example of OWL data (a simple university ontology) 9 times better than DLDB [7]. To provide the complete results, ONTOMS supports instance reasoning for inverseof, symmetric, and transitive properties. To our best knowledge, ONTOMS is the first RDBMS based OWL data management system which supports the complete instance reasoning. 2 Related Work There are several OWL reasoners to manage OWL data. FaCT [5] performs class and property related reasoning only. RACER [4] and Pellet [6] support class and property hierarchy reasoning, as well as instance reasoning. SnoBase [9] stores class and property definitions (e.g., subclassof and sub- PropertyOf) into Fact relation. SnoBase also stores every triple (i.e., (subject, property, object)) into Fact relation. To provide reasoning, SnoBase utilizes SQL triggers. However, the runtime depth level of trigger cascading supported in RDBMSs is limited. Also, SnoBase does not support instance reasoning. Instance Store [8] uses Descriptions relation to store class definitions, Primitives relation to store individuals, and other four relations (Type, Equivalents, Parents and Children) to maintain class hierarchy information. Instance Store uses FaCT or RACER to only obtain class hierarchies. In addition, Instance Store only supports classes without any consideration on properties. DLDB [7] maintains one class relation for each class and one property relation for each property. For class and property hierarchies, DLDB uses FaCT. However, DLDB does not support any instance reasoning. Thus, DLDB cannot provide complete query results for properties which require instance reasoning. 3 OWL Data Storage The class hierarchy and the property hierarchy are generated from Pellet. To maintain containment relationship information among nodes of each hierarchy, a pair of (start, end) values is assigned to each node according to the node s position, which was originally proposed for XML data [11]. Fig. 2 shows class and property hierarchies for the OWL data in Fig. 1.

An Efficient and Scalable Management of Ontology 3 [1,14] Thing [2,9] Person Course University [1,11] [12,13] Professor Student [5,8] [3,4] GraduateStudent [6,7] (a) The class hierarchy takescourse teachescourse degreefrom hasalumnus [1,2] [3,4] [5,8] [9,1] doctoraldegreefrom [6,7] (b) The property hierarchy Fig. 2. Class and Property Hierarchies ONTOMS generates a class based relational schema, where one relation is created for each class. Each class relation contains associated properties as attributes. Associated properties are the properties that have the class as domains. To efficiently utilize the property hierarchy, we only retain properties which do not have any super properties (called highest properties) among associated properties. Class relations also have start and end attributes for each highest property. If the highest property has no subproperties, those are omitted. GS1 Student takescourse degreefrom degreefrom_s degreefrom_e Person degreefrom degreefrom_s degreefrom_e GraduateStudent degreefrom degreefrom_s degreefrom_e Course University hasalumnus Univ1 GraduateStudent _takescourse Value GS1 C1 GS1 GS1 C2 C3 Professor _teachescourse Value Prof1 C1 Prof1 Prof1 Professor degreefrom degreefrom_s degreefrom_e Prof1 Univ1 6 7 C2 C3 Fig. 3. Relational Tables in ONTOMS A number of redundant tuples would be generated for a multi-valued property. The multi-valued property indicates where one instance of property s domain has more than one values. For example, in Fig. 1, Prof1 has three different values (i.e., C1, C2, and C3) for teachescourse property. As a result, ONTOMS separates multi-valued properties from class relations. A new relation is assigned to each multi-valued property. Fig. 3 shows the relations 3, generated for the OWL data presented in Fig. 1. Note that ONTOMS stores new tuples generated after performing instance reasoning. The (Univ1, null) tuple in University relation should be updated as (Univ1, Prof1) for the inverseof property, hasalumnus. Note that the translation process of OWL-QL queries to SQL queries for the class based relations can be found in [12]. 4 Instance Reasoning OWL defines five types of properties: inverseof, symmetric, transitive, functional and inversefunctional properties. Only the first three properties may generate a 3 Internally, ONTOMS assigns a unique label identifier () to each instance.

4 Myung-Jae Park et al. large number of new facts 4. Therefore, we focus on reasoning for inverseof, symmetric and transitive properties (which we will refer to as IST properties). Note that the definitions of the IST properties are given in the OWL Reference [1]. Definition 1. Relation R of a property P is the set of (x,y) in P. Definition 2. Let property P be inverseof property P, R be the relation of P, and R be the relation of P. The inverseof reasoning for P or R is the process of adding (y,x) to R for all (x,y) in R, if (y,x) is not in R. Definition 3. Let P be a symmetric property and R be the relation of P. The symmetric reasoning for P or R is the process of adding (y,x) to R for all (x,y) in R, if (y,x) is not in R. Definition 4. Let P be a transitive property and R be the relation of P. The transitive reasoning for P or R is the process of computing the transitive closure of R. A relation R after inverseof reasoning is written as R I. 5 Similarly, a relation R after symmetric and transitive reasoning is written as R S and R T, respectively. In this paper, we propose an IST reasoning algorithm which does not require recursive or iterative processing. The algorithm guarantees that the complete set of new facts can be obtained by performing reasoning only once for each type of property in a certain sequence, i.e., first for inverseof property, then for symmetric property, and last for transitive property. Lemma 1. Suppose relation R is inverseof relation R. After symmetric reasoning for R and R, R S is inverseof R S. Lemma 2. Suppose relation R is inverseof relation R. If R is symmetric, then R is symmetric. If R is transitive, then R is transitive. Theorem 1. Suppose property P is inverseof property P, P is symmetric and transitive. Let R be the relation of P and R the relation of P. By following the sequence <inverseof reasoning, symmetric reasoning, transitive reasoning> for R and R, the resulting relations of R and R are inverseof of each other, symmetric and transitive. The proof for Theorem 1 and the algorithm for IST reasoning, which can be found in [12], are not included due to the page limitation. 5 Experiments We implemented ONTOMS using IBM DB2 UDB 8.2. We interfaced DLDB with IBM DB2 since DLDB uses MS-Access. Experiments were performed on 3GHz 4 Referred to as newly generated instances in Sect. 3. 5 Here,we introduce R I to indicate the change of R as a result of inverseof reasoning.

An Efficient and Scalable Management of Ontology 5 Pentium 4 with 124MB of main memory. We used Lehigh University Benchmark Data(LUBM) 6. We generated various sizes of OWL data: 1MB, 5MB, 1MB, 5MB, 1MB, and 5MB. Since LUBM queries (Q1 to Q14) are not sufficient, we added three queries (Q15 to Q17). Detailed information on the query set, which can be found in [12], is not included due to the page limitation. We compared the total query processing time of ONTOMS and that of DLDB using 17 queries. We show the total query processing time for only 5MB data in Fig. 4 since the shapes of graphs for different sizes are similar. 1.6 1.2.8.4 ONTOMS DLDB Q1 Q3 Q5 Q6 Q1 Q11 Q12 Q13 Q14 Q15 (a) Query processing time(<2 sec) 976 24 16 8 Q2 Q4 Q7 Q8 Q9 Q16 Q17 (b) Query processing time(>=2 sec) Fig. 4. Query Processing Time for 5MB OWL Data The number of joins in ONTOMS is less than or equal to that of DLDB. Therefore, in Fig. 4, ONTOMS is better than DLDB for most of 17 queries. However, for Q7 and Q13, ONTOMS is worse than DLDB. Since Q7 and Q13 have values as their subjects, there are just a few bindings satisfying those queries. 4 3 2 1 ONTOMS DLDB 12 8 4 1MB 5MB 1MB 5MB 1MB 5MB (a) Single-valued property (Q16) 1MB 5MB 1MB 5MB 1MB 5MB (b) Multi-valued property (Q17) Fig. 5. Query Processing Time over Differently Sized OWL Data In Fig. 5, all properties of Q16 are single-valued properties while those of Q17 contain multi-valued properties. Thus, the performance gap between ONTOMS and DLDB is much larger in Q16. Consequently, ONTOMS outperforms DLDB for most queries in spite of its support of instance reasoning. ONTOMS is 9 times faster than DLDB on the average, calculated by averaging performance differences for queries over 1MB, 5MB, 1MB, 5MB, 1MB, and 5MB OWL data. 6 Available in http://swat.cse.lehigh.edu/projects/lubm/index.htm

6 Myung-Jae Park et al. 6 Conclusion In this paper, we proposed ONTOMS, an OWL data management system using an RDBMS. ONTOMS generates the class based relational schema in which a relation is created for each class and contains associated properties as its attributes. To avoid data redundancy, ONTOMS creates class-property relations for multivalued properties. Thus, this schema is of great advantage to queries having less multi-valued properties. In addition, ONTOMS supports the reasoning on the IST properties and the class and property hierarchies. The experimental results show that ONTOMS outperforms DLDB in the query response time. Acknowledgments. This research was supported by the Ministry of Information and Communication, Korea, under the College Information Technology Research Center Support Program, grant number IITA-26-C19-63-31. References 1. Bechhofer, S., van Harmelen, F., Hendler, J., Horrocks, I., McGuinness, D.L., Patel- Schneider, P.F., Stein, L.A.: OWL Web Ontology Language Reference. W3C Recommendation, http://www.w3.org/tr/owl-ref (Feb. 24) 2. Manola, F., Miller, E., McBride, B.: RDF Primer. W3C Recommendation, http://www.w3.org/tr/rdf-primer (Feb. 24) 3. Brickley, D., Guha, R.V., McBride, B.: RDF Vocabulary Description Language 1.: RDF Schema. http://www.w3.org/tr/rdf-schema (Feb. 24) 4. Haarsley, V., Moller, R.: RACER System Description. In: Proc. of 1st International Joint Conference on Automated Reasoning. (June 21) 71 76 5. Horrocks, I., Sattler, U.: A Tableaux Decision Procedure fore SHOIQ. In: Proc. of 19th International Joint Conference on Artificial Intelligence. (Aug. 25) 448 453 6. Parsia, B., Sirin, E.: Pellet: An OWL DL Reasoner. In: Proc. of 3rd International Semantic Web Conference. (Nov. 24) 7. Guo, Y., Pan, Z., Heflin, J.: An Evaluation of Knowledge Base Systems for Large OWL Datasets. In: Proc. of 3rd International Semantic Web Conference. (Nov. 24) 274 288 8. Horrocks, I., Li, L., Turi, D., Bechhofer, S.: The Instance Store: Description Logic Reasoning with Large Numbers of Individuals. In: Proc. of 24 International Workshop on Description Logic. (June 24) 31 4 9. Lee, J., Goodwin, R.: Ontology Management for Large-Scale Enterprise Systems. IBM Technical Report, RC2373 (Sep. 25) 1. Fikes, R., Hayes, P., Horrocks, I.: OWL-QL-A Language for Deductive Query Answering on the Semantic Web. Journal of Web Semantics 2(1) (Dec. 24) 19 29 11. Zhang, C., Naughton, J., DeWitt, D., Luo, Q., Lohman, G.: On Supporting Containment Queries in Relational Database Management Systems. In: Proc. of the 21 ACM SIGMOD Conference. (May 21) 425 436 12. Park, M.J., Lee, J.H., Lee, C.H., Lin, J., Serres, O., Chung, C.W.: ONTOMS: An Efficient and Scalable Ontology Management System. In: KAIST CS/TR-25-246. (Dec. 25)