Classifying ELH Ontologies in SQL Databases
Vincent Delaitre¹ and Yevgeny Kazakov²
¹ École Normale Supérieure de Lyon, Lyon, France
² Oxford University Computing Laboratory, Oxford, England

Abstract. The current implementations of ontology classification procedures use the main memory of the computer for loading and processing ontologies, which can soon become one of the main limiting factors for very large ontologies. We describe a secondary-memory implementation of a classification procedure for ELH ontologies using an SQL relational database management system. Although secondary memory has much slower access characteristics, our preliminary experiments demonstrate that, using a number of caching techniques, one can obtain performance comparable to that of existing in-memory reasoners.

1 Introduction

The ontology languages OWL and OWL 2, based on description logics (DL), are becoming increasingly popular among ontology developers, largely thanks to the availability of ontology reasoners which provide automated support for many ontology development tasks. One of the key ontology development tasks is ontology classification, the goal of which is to compute a hierarchical representation of the subsumption relation between the classes of the ontology, called the class taxonomy. The popularity of OWL and the efficiency of ontology reasoners have resulted in the availability of large ontologies such as Snomed CT, containing hundreds of thousands of classes. Modern ontology reasoners such as CEL and FaCT++ can classify Snomed CT in a matter of minutes; however, scaling these reasoners to ontologies containing millions of classes is problematic due to their high memory consumption. For example, the classification of Snomed CT in CEL and FaCT++ requires almost 1GB of main memory. In this paper we describe a secondary-memory implementation of the classification procedure for the DL ELH, the fragment of OWL used by Snomed CT, in SQL databases. Our reasoner can classify Snomed CT in less than 20 minutes using less than 32MB of RAM.

To the best of our knowledge, secondary-memory (database) algorithms and optimizations for ontology reasoning have been studied only very recently. Lutz et al. have proposed a method for conjunctive query answering over EL ontologies using databases [1], but the main focus of their work is on optimizing query response time once the ontology is compiled into a database. The IBM SHER system has a database-backed module, which presumably can perform secondary-memory reasoning in EL+, but currently not much information is available about this module.
2 Preliminaries

In this section we introduce the lightweight description logic ELH and the classification procedure for ELH ontologies [2].

2.1 The Syntax and Semantics of ELH

A description logic vocabulary consists of countably infinite sets N_C of atomic concepts and N_R of atomic roles. The syntax and semantics of ELH are summarized in Table 1.

Table 1. The syntax and semantics of ELH

  Name                      Syntax   Semantics
  Concepts:
    atomic concept          A        A^I (given)
    top concept             ⊤        Δ^I
    conjunction             C ⊓ D    C^I ∩ D^I
    existential restriction ∃r.C     { x ∈ Δ^I | ∃y ∈ Δ^I : ⟨x, y⟩ ∈ r^I and y ∈ C^I }
  Axioms:
    concept inclusion       C ⊑ D    C^I ⊆ D^I
    role inclusion          r ⊑ s    r^I ⊆ s^I

The set of ELH concepts is recursively defined using the constructors in the upper part of Table 1, where A ∈ N_C, r ∈ N_R, and C, D are concepts. A terminology or ontology is a set O of axioms of the forms in the lower part of Table 1, where C, D are concepts and r, s ∈ N_R. We use concept equivalence C ≡ D as an abbreviation for the two concept inclusion axioms C ⊑ D and D ⊑ C.

The semantics of ELH is defined using interpretations. An interpretation is a pair I = (Δ^I, ·^I), where Δ^I is a non-empty set called the domain of the interpretation and ·^I is the interpretation function, which assigns to every A ∈ N_C a set A^I ⊆ Δ^I, and to every r ∈ N_R a relation r^I ⊆ Δ^I × Δ^I. The interpretation is extended to concepts according to the right column of Table 1. An interpretation I satisfies an axiom α (written I ⊨ α) if the respective condition to the right of the axiom in Table 1 holds; I is a model of an ontology O (written I ⊨ O) if I satisfies every axiom in O. We say that α is a (logical) consequence of O, or is entailed by O (written O ⊨ α), if every model of O satisfies α. Classification is a key reasoning problem for description logics and ontologies, which requires computing the set of direct subsumptions A ⊑ B between atomic concepts that are entailed by O.

2.2 Normalization of ELH Ontologies

Normalization is a preprocessing stage that eliminates nested occurrences of complex concepts from ELH ontologies using auxiliary atomic concepts and axioms. The resulting ontology contains only axioms of the forms:

  A ⊑ B,   A ⊓ B ⊑ C,   A ⊑ ∃r.B,   ∃s.B ⊑ C,   r ⊑ s,   (1)
where A, B, C ∈ N_C and r, s ∈ N_R. Given an ELH concept C, let sub(C) be the set of sub-concepts of C, recursively defined as follows: sub(A) = {A} for A ∈ N_C, sub(⊤) = {⊤}, sub(C ⊓ D) = {C ⊓ D} ∪ sub(C) ∪ sub(D), and sub(∃r.C) = {∃r.C} ∪ sub(C). Given an ELH ontology O, define the sets of negative / positive / all concepts of O by, respectively, sub⁻(O) = ∪{ sub(C) | C ⊑ D ∈ O }, sub⁺(O) = ∪{ sub(D) | C ⊑ D ∈ O }, and sub(O) = sub⁻(O) ∪ sub⁺(O). For every C ∈ sub(O) we define a function nf(C) as follows: nf(A) = A if A ∈ N_C, nf(⊤) = ⊤, nf(C ⊓ D) = A_C ⊓ A_D, and nf(∃r.C) = ∃r.A_C, where A_C and A_D are fresh atomic concepts introduced for C and D. The result of applying normalization to O is the ontology O′ consisting of the following axioms: (i) A_C ⊑ A_D for C ⊑ D ∈ O, (ii) r ⊑ s for r ⊑ s ∈ O, (iii) nf(C) ⊑ A_C for C ∈ sub⁻(O), and (iv) A_D ⊑ nf(D) for D ∈ sub⁺(O), where axioms of the form A ⊑ B ⊓ C are replaced with the pair of axioms A ⊑ B and A ⊑ C. It has been shown [2] that this transformation preserves all original subsumptions between atomic concepts of O.

2.3 Completion Rules

In order to classify a normalized ELH ontology O, the procedure applies the completion rules in Fig. 1.

Fig. 1. The completion rules for ELH

  IR1:  derive A ⊑ A
  IR2:  derive A ⊑ ⊤
  CR1:  from A ⊑ B derive A ⊑ C               if B ⊑ C ∈ O
  CR2:  from A ⊑ B and A ⊑ C derive A ⊑ D     if B ⊓ C ⊑ D ∈ O
  CR3:  from A ⊑ B derive A ⊑ ∃r.C            if B ⊑ ∃r.C ∈ O
  CR4:  from A ⊑ ∃r.B derive A ⊑ ∃s.B         if r ⊑ s ∈ O
  CR5:  from A ⊑ ∃s.B and B ⊑ C derive A ⊑ D  if ∃s.C ⊑ D ∈ O

The rules derive new axioms of the forms A ⊑ B and C ⊑ ∃r.D which are logical consequences of O, where A, B, C, and D are atomic concepts or ⊤, and r, s are atomic roles. Rules IR1 and IR2 derive trivial axioms for every A ∈ N_C ∩ sub(O). The remaining rules are applied to already derived axioms and use the normalized axioms in O as side conditions. The completion rules are applied until no new axiom can be derived, i.e., until the resulting set of axioms is closed under all inference rules. It has been shown [2] that the rules IR1–CR5 are sound and complete, that is, a concept subsumption A ⊑ B is entailed by O if and only if it is derivable by these rules.
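For example, consider the small ontology O = { A ⊑ ∃r.(B ⊓ C), ∃r.B ⊑ D } (a purely illustrative example). Here sub⁻(O) = {A, ∃r.B, B} and sub⁺(O) = {∃r.(B ⊓ C), B ⊓ C, B, C, D}. Normalization introduces fresh atomic concepts X₁ = A_{∃r.(B⊓C)}, X₂ = A_{B⊓C}, and X₃ = A_{∃r.B}, and yields O′ = { A ⊑ X₁, X₃ ⊑ D, X₁ ⊑ ∃r.X₂, X₂ ⊑ B, X₂ ⊑ C, ∃r.B ⊑ X₃ }. The completion rules then derive A ⊑ A (IR1), A ⊑ X₁ (CR1), A ⊑ ∃r.X₂ (CR3), X₂ ⊑ B (IR1 and CR1), A ⊑ X₃ (CR5 with the side condition ∃r.B ⊑ X₃ ∈ O′), and finally A ⊑ D (CR1), witnessing the entailment O ⊨ A ⊑ D.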
3 Implementing ELH Classification in a Database

In this section we describe the basic idea of our database implementation of the ELH classification procedure, the performance problems we face, and our solutions to these problems. Although our presentation is specific to the MySQL syntax, the procedure can be implemented in any other SQL database system.

Fig. 2. Storing ELH concepts and normalized axioms in a database. Left: the concept and role tables at_conc(name, id), at_role(name, id), conj(A, B, pos, neg, id), and exis(r, B, pos, neg, id); for the running example, at_conc contains the rows (MuscularOrgan, 1), (Organ, 2), (MuscularSystem, 4), (Heart, 6), and (CirculatorySystem, 10), and at_role contains rows for ispartof (id 3) and belongsto. Right: the axiom tables ax_c_incl(A, B), ax_r_incl(r, s), ax_conj(A, B, C), ax_exis_pos(A, r, B), and ax_exis_neg(s, B, C). In every table, the first part is introduced for axiom (2) and the second part for axiom (3).

3.1 The Outline of the Approach

In this section we give a high-level description of the ELH classification procedure using databases. We use the ELH ontology O consisting of axioms (2)–(4) as a (rather artificial but simple) running example.

  MuscularOrgan ≡ Organ ⊓ ∃ispartof.MuscularSystem                   (2)
  Heart ⊑ Organ ⊓ ∃belongsto.(MuscularSystem ⊓ CirculatorySystem)    (3)
  belongsto ⊑ ispartof                                               (4)

The first step of the procedure is to assign integer identifiers (ids) to every concept and role occurring in the ontology, in order to use them for computing the normalized axioms (1). To represent all information about concepts, it is sufficient to store, for every concept, its topmost constructor together with the ids of its direct sub-concepts. Thus, all information about concepts and roles can be represented in a database using four tables corresponding to the possible types of constructors, shown in Fig. 2, left: at_conc for atomic concepts, at_role for atomic roles, conj for conjunctions, and exis for existential restrictions. The assignment of ids is optimized so that the same ids are assigned to concepts that are structurally equivalent or declared equivalent using axioms such as (2). Apart from the ids, for every complex concept we store flags indicating whether the concept occurs positively and/or negatively in the ontology. The polarities of concepts are subsequently used to identify the normal forms of axioms, which are stored in five tables corresponding to the five types of normal forms (1) (see Fig. 2, right). Table ax_c_incl stores inclusions between concepts: for every row (A, B, pos, neg, id) in table conj with pos = true, table ax_c_incl contains inclusions between id and A as well as between id and B; for every concept inclusion C ⊑ D in O the table also contains the inclusion between the id for C and the id for D. Similarly, table ax_r_incl contains inclusions between the ids of role inclusions r ⊑ s in O. Table ax_conj is obtained from the rows of table conj with the negative flag. Likewise, tables ax_exis_pos and ax_exis_neg are obtained from the rows of exis with, respectively, the positive and negative flags.
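For concreteness, the tables of Fig. 2 could be declared along the following lines. This is only a sketch: the column types and key constraints are our assumptions (the uniqueness constraints anticipate the INSERT IGNORE and ON DUPLICATE KEY queries of Section 3.2), not the exact schema of the implementation.

  CREATE TABLE at_conc (name VARCHAR(255) NOT NULL UNIQUE, id INT NOT NULL PRIMARY KEY);
  CREATE TABLE at_role (name VARCHAR(255) NOT NULL UNIQUE, id INT NOT NULL PRIMARY KEY);
  -- complex concepts: one row per constructor application, with polarity flags
  CREATE TABLE conj (A INT NOT NULL, B INT NOT NULL, pos BOOL NOT NULL, neg BOOL NOT NULL,
                     id INT NOT NULL, PRIMARY KEY (A, B));
  CREATE TABLE exis (r INT NOT NULL, B INT NOT NULL, pos BOOL NOT NULL, neg BOOL NOT NULL,
                     id INT NOT NULL, PRIMARY KEY (r, B));
  -- normalized axioms of the five forms (1)
  CREATE TABLE ax_c_incl   (A INT NOT NULL, B INT NOT NULL, PRIMARY KEY (A, B));  -- A ⊑ B
  CREATE TABLE ax_r_incl   (r INT NOT NULL, s INT NOT NULL);                      -- r ⊑ s
  CREATE TABLE ax_conj     (A INT NOT NULL, B INT NOT NULL, C INT NOT NULL);      -- A ⊓ B ⊑ C
  CREATE TABLE ax_exis_pos (A INT NOT NULL, r INT NOT NULL, B INT NOT NULL);      -- A ⊑ ∃r.B
  CREATE TABLE ax_exis_neg (s INT NOT NULL, B INT NOT NULL, C INT NOT NULL);      -- ∃s.B ⊑ C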
Once the tables for the normal forms of the axioms are computed, it is possible to apply the inference rules in Fig. 1 using SQL joins to derive new subsumptions. In the next sections we describe both steps of the classification procedure in detail.

3.2 Normalization

The first problem appears when trying to assign the same id to the same concept occurring in the ontology. Since our goal is to design a scalable procedure which can deal with very large ontologies, we cannot load the whole ontology into main memory before assigning ids. Hence all tables in Fig. 2 should be constructed on the fly while reading the ontology. We have divided every table in Fig. 2 into two parts: the first is constructed after reading axiom (2), the second after reading axiom (3), except for table ax_r_incl, which is constructed after reading axiom (4). Suppose we have just processed axiom (2) and are now reading axiom (3). The concept Heart did not occur before, so we need to assign it a fresh id = 6 and add a row to the table at_conc. The concept Organ, however, has been added before, and we need to reuse the old id = 2. Thus table at_conc acts as a lookup table for atomic concepts: if a newly read atomic concept occurs in this table then we reuse its id; otherwise, we introduce a fresh id and add a row to this table.

This strategy works quite well for in-memory lookup. However, it is hopelessly slow for external-memory lookup due to the considerably slower disc access time. Moreover, since executing a query in a database management system such as MySQL involves an overhead for opening a connection, performing a transaction, and parsing the query, executing one query for every atomic concept is hardly practical. In order to solve this problem, we use temporary in-memory tables to buffer the newly added concepts. We assign fresh ids regardless of whether the concept has been read before or not, and later restore uniqueness of the ids using a series of SQL queries. The tables are similar to those used for storing the original ids, except that the tables for conjunctions and existential restrictions have an additional column depth representing the nesting depth of the concepts. In addition, there is a new table map which is used for resolving ids in case of duplicates. The sizes of the temporary tables can be tuned for the best speed/memory trade-off. Fig. 3 presents the contents of the temporary tables after reading axiom (3), assuming that axiom (2) has already been processed.

Fig. 3. Temporary tables after reading axiom (3): tmp_at_conc(name, id) with rows (Heart, 6), (Organ, 7), (MuscularSystem, 9), (CirculatorySystem, 10); tmp_at_role(name, id) with the row (belongsto, 8); tmp_conj(A, B, pos, neg, id, depth) and tmp_exis(r, B, pos, neg, id, depth) with the buffered complex sub-concepts of (3); and map(tmp_id, orig_id). Entries of the form id1/id2 represent ids before/after mapping temporary ids to original ids using table map.

In order to resolve the ids of all buffered atomic concepts, we perform the following SQL queries:

1: INSERT IGNORE INTO at_conc SELECT * FROM tmp_at_conc;
2: INSERT INTO map SELECT tmp_at_conc.id, at_conc.id
     FROM tmp_at_conc JOIN at_conc USING (name);

The first query inserts all buffered atomic concepts into the main table; duplicate insertions are ignored thanks to a uniqueness constraint on the field name in table at_conc. The second query creates a mapping between temporary ids and original ids, to be used for replacing the ids in complex concepts containing atomic ones. The query can be optimized to avoid trivial mappings.
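For instance, the trivial pairs, where the temporary id already coincides with the original one, can be filtered out directly in the second query; a possible refinement (our sketch, not necessarily the implemented one):

  INSERT INTO map
    SELECT tmp_at_conc.id, at_conc.id
    FROM tmp_at_conc JOIN at_conc USING (name)
    WHERE tmp_at_conc.id <> at_conc.id;  -- record only non-trivial mappings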
Replacement of ids and resolution of duplicate ids in complex concepts is done recursively over the depth d of the concepts, starting from d = 1. Processing conjunctions of depth d is done using the following SQL queries:

1: UPDATE tmp_conj, map SET tmp_conj.A = map.orig_id WHERE tmp_conj.A = map.tmp_id;
2: UPDATE tmp_conj, map SET tmp_conj.B = map.orig_id WHERE tmp_conj.B = map.tmp_id;
3: INSERT INTO conj SELECT A, B, pos, neg, id FROM tmp_conj WHERE tmp_conj.depth = d
     ON DUPLICATE KEY UPDATE pos = (conj.pos OR tmp_conj.pos),
                             neg = (conj.neg OR tmp_conj.neg);
4: INSERT INTO map SELECT tmp_conj.id, conj.id FROM tmp_conj JOIN conj USING (A, B)
     WHERE tmp_conj.depth = d;

The first two queries restore the original ids in columns A and B. The resulting rows with depth d are then inserted into the main conjunction table, updating the polarities in case of duplicates (line 3). Finally, new mappings between the ids of conjunctions of depth d are added (line 4). The ids for existential restrictions are resolved analogously.
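The temporary buffer tables of this subsection can be kept in main memory by declaring them with MySQL's MEMORY storage engine; the following declarations are an illustrative sketch (types and engine choice are our assumptions):

  CREATE TABLE tmp_at_conc (name VARCHAR(255) NOT NULL, id INT NOT NULL) ENGINE = MEMORY;
  CREATE TABLE tmp_at_role (name VARCHAR(255) NOT NULL, id INT NOT NULL) ENGINE = MEMORY;
  CREATE TABLE tmp_conj (A INT NOT NULL, B INT NOT NULL, pos BOOL NOT NULL,
                         neg BOOL NOT NULL, id INT NOT NULL,
                         depth INT NOT NULL) ENGINE = MEMORY;
  CREATE TABLE tmp_exis (r INT NOT NULL, B INT NOT NULL, pos BOOL NOT NULL,
                         neg BOOL NOT NULL, id INT NOT NULL,
                         depth INT NOT NULL) ENGINE = MEMORY;
  CREATE TABLE map (tmp_id INT NOT NULL PRIMARY KEY, orig_id INT NOT NULL) ENGINE = MEMORY;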
3.3 Completion under Rules CR1 and CR2

We apply the completion rules from Fig. 1 in the order of priority in which they are listed. That is, we exhaustively apply rule CR1 before applying rule CR2, and perform closure under the rules CR1 and CR2 before applying the remaining rules CR3–CR5. This decision stems from the observation that the closure under rules CR1 and CR2 can be computed efficiently using a temporary in-memory table. Indeed, the closure under rules CR1 and CR2 of axioms of the form A ⊑ X for a fixed A can be computed independently of other subsumptions, since the premises and the conclusions of these rules share the same concept A. The main idea is to use a temporary in-memory table to compute the closure for a bounded number of concepts A, and then output the union of the results into the main on-disc table.

Let tmp_subs be a temporary in-memory table with columns A and B, representing subsumptions between A and B, and a column step representing the step at which each subsumption has been added. We use a global variable step to indicate that rule CR1 has already been applied to all subsumptions in tmp_subs with smaller values of tmp_subs.step. Then the closure of tmp_subs under CR1 can be computed using the function close_CR1() of Procedure 1.

Procedure 1 Computing closure under rules CR1 and CR2
close_CR1()
1: REPEAT
2:   SET old = (SELECT COUNT(*) FROM tmp_subs);
3:   INSERT IGNORE INTO tmp_subs
       SELECT tmp_subs.A, ax_c_incl.B, (step + 1)
       FROM tmp_subs JOIN ax_c_incl ON tmp_subs.B = ax_c_incl.A
       WHERE tmp_subs.step = step;
4:   SET step = step + 1;
5: UNTIL old = (SELECT COUNT(*) FROM tmp_subs)  -- nothing has changed
6: END REPEAT;

close_CR1_CR2()
7: REPEAT
8:   SET step′ = step;
9:   CALL close_CR1();
10:  SET old = (SELECT COUNT(*) FROM tmp_subs);
11:  INSERT IGNORE INTO tmp_subs
       SELECT t1.A, ax_conj.C, (step + 1)
       FROM tmp_subs AS t1 JOIN tmp_subs AS t2 ON t1.A = t2.A
            JOIN ax_conj ON (t1.B = ax_conj.A AND t2.B = ax_conj.B)
       WHERE (t1.step >= step′ OR t2.step >= step′);
12:  SET step = step + 1;
13: UNTIL old = (SELECT COUNT(*) FROM tmp_subs)  -- nothing has changed
14: END REPEAT;

The function close_CR1() repeatedly joins the temporary table tmp_subs with the table of concept inclusions ax_c_incl and inserts the result back into tmp_subs with an increased step. We assume that tmp_subs has a uniqueness constraint on the pair (A, B), so that duplicate subsumptions are ignored. The same idea can be used to perform closure under rule CR2, with the only difference that CR2 has two premises and therefore requires a more complex join. Since CR2 is thus more expensive than CR1, we execute it with a lower priority. The closure under both rules CR1 and CR2 is computed by the function close_CR1_CR2() of Procedure 1. Here we use a temporary variable step′ to remember the last step at which we applied rule CR2. After performing the closure under CR1 (line 9), we join two copies of tmp_subs with ax_conj, restricted to rows that are new since step′ (line 11).
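In the notation of the procedures above, the subsumers of a single concept can be computed by seeding tmp_subs and calling close_CR1_CR2(); the concrete id 42 and the final SELECT below are purely illustrative, and the result is only the CR1/CR2-closure, without rules CR3–CR5:

  TRUNCATE tmp_subs; SET step = 0;
  INSERT INTO tmp_subs VALUES (42, 42, 0);  -- the IR1 axiom for the concept with id 42
  CALL close_CR1_CR2();
  SELECT B FROM tmp_subs WHERE A = 42;      -- all CR1/CR2-derivable subsumers of 42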
3.4 Completion under Rules CR3–CR5

Direct execution of rules CR3–CR5 requires performing many complicated joins. We therefore combine these rules into one inference rule with multiple side conditions:

  from A ⊑ B derive ∃r.A ⊑ ∃s.B   if r ⊑ s ∈ O, ∃r.A ∈ sub⁺(O), and ∃s.B ∈ sub⁻(O).   (5)

Rule (5) derives new inclusions between (the ids of) existentially restricted concepts occurring in the ontology, using previously derived subsumptions. The main idea is to repeatedly derive these new inclusions, add them to table ax_c_incl, and perform an incremental closure of the latter under rules CR1 and CR2. Inference (5) is implemented by the function add_exis_incl() of Procedure 2.

Procedure 2 Saturation under rules IR1–CR5
add_exis_incl()
1: TRUNCATE exis_incl;  -- the table used to store new subsumptions
2: INSERT INTO exis_incl
     SELECT ax_exis_pos.A, ax_exis_neg.C
     FROM ax_exis_pos JOIN tmp_subs ON ax_exis_pos.B = tmp_subs.A
          JOIN ax_exis_neg ON ax_exis_neg.B = tmp_subs.B
          JOIN ax_r_incl ON (ax_exis_pos.r = ax_r_incl.r AND ax_exis_neg.s = ax_r_incl.s);
3: DELETE FROM exis_incl WHERE exis_incl.A = exis_incl.B;
4: DELETE FROM exis_incl USING exis_incl JOIN ax_c_incl USING (A, B);
5: INSERT INTO ax_c_incl SELECT exis_incl.A, exis_incl.B FROM exis_incl;

saturate(N)  -- N is a parameter bounding the temporary subsumption table
6: INSERT INTO queue SELECT at_conc.id, true FROM at_conc;
7: INSERT INTO queue SELECT ax_exis_pos.B, true FROM ax_exis_pos;
8: INSERT INTO subs SELECT queue.id, queue.id FROM queue;  -- rule IR1
9: INSERT INTO subs SELECT queue.id, top_id FROM queue;    -- rule IR2
10: REPEAT
11:   TRUNCATE tmp_queue;  -- to store the first N unprocessed elements
12:   INSERT INTO tmp_queue SELECT queue.id FROM queue WHERE queue.flag = true LIMIT N;
13:   UPDATE queue, tmp_queue SET queue.flag = false WHERE queue.id = tmp_queue.id;
14:   TRUNCATE tmp_subs; SET step = 0;
15:   INSERT INTO tmp_subs SELECT subs.A, subs.B, step  -- fetch the subsumptions
        FROM subs JOIN tmp_queue ON subs.A = tmp_queue.id;  -- to process
16:   CALL close_CR1_CR2();  -- close under rules CR1 and CR2
17:   INSERT INTO subs SELECT tmp_subs.A, tmp_subs.B FROM tmp_subs
        WHERE tmp_subs.step > 0;  -- insert back only new subsumptions
18:   CALL add_exis_incl();  -- find and add new existential inclusions
19:   UPDATE queue, subs, exis_incl SET queue.flag = true
        WHERE (queue.id = subs.A AND subs.B = exis_incl.A);
20: UNTIL (SELECT COUNT(*) FROM tmp_queue) = 0  -- nothing left to process
21: END REPEAT;

The conclusion of inference (5) is computed using a complex join query (line 2) and stored in a new table exis_incl. The result is subsequently filtered by removing trivial inclusions (line 3) and inclusions that already occur in table ax_c_incl (line 4). The new subsumptions are added into ax_c_incl (line 5).

Function saturate(N) implements our overall strategy of rule applications. Table queue stores all concepts A for which subsumptions A ⊑ X are to be computed, as indicated in column flag. It is sufficient to consider only atomic concepts (line 6) and concepts occurring under positive existential restrictions (line 7). The computed subsumptions are saved in table subs, which is initialized according to rules IR1 and IR2 (lines 8–9). The concepts in queue are processed in portions of N elements (N is the given parameter) using a temporary table tmp_queue. At the beginning of every iteration, table tmp_queue is filled with the first N unprocessed elements from queue (line 12), which are then labeled as processed (line 13). The table tmp_subs is filled with the corresponding subsumptions from subs (line 15), closed under rules CR1 and CR2 (line 16), and the new subsumptions are inserted back into subs (line 17). After that, new inclusions between existential restrictions are derived using (5) (line 18), and the elements A of queue for which the new inclusions can yield new subsumptions are labeled as unprocessed (line 19).
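The saturation tables could, for instance, be declared as follows; the types and keys are our assumptions (with the primary key on queue, the initialization inserts in lines 6–7 would need INSERT IGNORE to skip duplicate ids), and N = 5000 is the value used in the experiments of Section 4:

  CREATE TABLE queue (id INT NOT NULL PRIMARY KEY, flag BOOL NOT NULL DEFAULT true);
  CREATE TABLE subs (A INT NOT NULL, B INT NOT NULL, PRIMARY KEY (A, B));
  CREATE TABLE exis_incl (A INT NOT NULL, B INT NOT NULL);
  CREATE TABLE tmp_queue (id INT NOT NULL PRIMARY KEY) ENGINE = MEMORY;
  CREATE TABLE tmp_subs (A INT NOT NULL, B INT NOT NULL, step INT NOT NULL,
                         PRIMARY KEY (A, B)) ENGINE = MEMORY;  -- uniqueness on (A, B)
  CALL saturate(5000);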
3.5 Transitive Reduction

The output of the classification algorithm is not the set of all subsumptions between the sub-concepts of an ontology, but a taxonomy which contains only the direct subsumptions between atomic concepts. In order to compute the taxonomy, the set of all subsumptions between atomic concepts must be transitively reduced. Transitive reduction of a transitive subsumption relation can be done by applying one step of transitive closure and marking the resulting relations as non-direct, thus finding the remaining set of direct subsumptions. Although this can easily be implemented in a database with just one join, we found that this approach performs poorly because it requires a large number of on-disc updates, namely marking subsumptions as non-direct; in contrast, the number of direct subsumptions is usually much smaller. A solution to this problem again involves a temporary in-memory table. We repeatedly fetch into the table all subsumption relations A ⊑ X for a bounded number of atomic concepts A, perform transitive reduction according to the method described above, and output the direct subsumptions into the on-disc taxonomy. This algorithm can also be extended to handle equivalence classes of atomic concepts.
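As an illustration, the marking step on a fetched batch could look as follows in MySQL; the table tmp_taxo, its column direct, and the output table taxonomy are assumed names (a sketch, not the exact implementation). A fetched subsumption A ⊑ B is non-direct if there is some C with A ⊑ C and C ⊑ B, where the second hop can be read from the transitively closed table subs:

  CREATE TABLE tmp_taxo (A INT NOT NULL, B INT NOT NULL,
                         direct BOOL NOT NULL DEFAULT true,
                         PRIMARY KEY (A, B)) ENGINE = MEMORY;
  -- one step of transitive closure: mark everything reachable via an intermediate C
  UPDATE tmp_taxo AS t, tmp_taxo AS s1, subs AS s2
    SET t.direct = false
    WHERE s1.A = t.A AND s2.A = s1.B AND s2.B = t.B
      AND t.A <> s1.B AND s1.B <> t.B;
  -- output only the remaining (direct, non-reflexive) subsumptions
  INSERT INTO taxonomy SELECT t.A, t.B FROM tmp_taxo AS t WHERE t.direct AND t.A <> t.B;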
4 Experimental Evaluation

We have implemented a prototype reasoner DB in MySQL according to the procedure described in Section 3 (with some small additional optimizations); the reasoner is available open source from db-reasoner.googlecode.com. In order to evaluate its performance and compare it with existing ontology reasoners, we have performed a series of experiments with four large bio-medical ontologies of various sizes and complexity. The Gene Ontology (GO) and the National Cancer Institute (NCI) ontology are quite big ontologies containing respectively 20,465 and 27,652 classes. Galen, containing 23,136 classes, is obtained from a version of the OpenGalen ontology by removing role functionalities, transitivities, and inverses. The largest tested ontology, Snomed, based on Snomed CT, contains 315,489 classes. We ran the experiments on a PC with a 2GHz Intel Core 2 Duo processor, a 5400 RPM 2.5" hard drive, and 4GB RAM, operated by Ubuntu Linux v.9.04 with MySQL.

Table 2. Time profiles for classification. Time is in seconds. (a) The stages of the classification procedure (Loading/Preprocessing, Completion, Transitive reduction, Formatting output, Total) for NCI, GO, Galen, and Snomed. (b) Completion times for different values of N on NCI, GO, Galen, and Snomed.

In Table 2 we present detailed timings for the classification of the tested ontologies using DB. Table 2(a) lists the timings of the various stages of the classification procedure described in Section 3. As we can see, for NCI and GO a considerable portion of the time is spent on preprocessing and transitive reduction, whereas for Galen and Snomed most of the time is spent on completion. The reason could be that for NCI and GO inference (5) does not result in many new inclusions. In Table 2(b) we analyze the impact of different values of the parameter N on the times for the completion stage (see Procedure 2). We see that the timings do not differ significantly. In all other experiments we fix N = 5000, for which the main-memory consumption of the MySQL server does not exceed 32MB.

Table 3. Comparison with other reasoners (DB, CB, CEL, FaCT++, HermiT) on NCI, GO, Galen, and Snomed. Time is in seconds.

In Table 3, we compare the performance of our reasoner with the existing in-memory ontology reasoners CB (cb-reasoner.googlecode.com), CEL (lat.inf.tu-dresden.de/systems/cel), FaCT++ (owl.man.ac.uk/factplusplus), and HermiT (hermit-reasoner.com). We can see that the performance of DB is on par with the in-memory reasoners and, in particular, that DB outperforms CEL on Galen and Snomed, FaCT++ on Galen, and HermiT on all ontologies.

References

1. Lutz, C., Toman, D., Wolter, F.: Conjunctive query answering in the description logic EL using a relational database system. In: IJCAI, AAAI Press (2009)
2. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope. In: IJCAI, Professional Book Center (2005)