Database & Information Systems Group Prof. Marc H. Scholl. XML & Databases. Tutorial. 11. SQL Compilation, XPath Symmetries




XML & Databases
Tutorial 11. SQL Compilation, XPath Symmetries
Christian Grün, Database & Information Systems Group, University of, Winter 2005/06

SQL Compilation
Relational Encoding:
- The table representation of pre/post-encoded XML documents allows easy storage in a relational, SQL-driven database system.
- We have already transformed XPath location steps, e.g.
      result ← context/descendant::node()
  into simple range matches and predicate tests:
      result ← pre(table) > pre(context) ∧ post(table) < post(context)
- Range matches can now be further transformed into SQL queries:

      SELECT DISTINCT t.*
      FROM table t, context c
      WHERE t.pre > c.pre AND t.post < c.post
      ORDER BY t.pre
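The range query above can be tried out with a small sketch. It is only an illustration under stated assumptions: Python's `sqlite3` stands in for the relational system, the table is called `doc` because `table` is a reserved word in SQL, and the helper names `encode` and `descendants` as well as the sample document are made up for this example. `ElementTree` only yields element nodes, so text nodes are not encoded here.

```python
import sqlite3
import xml.etree.ElementTree as ET

def encode(xml_text):
    """Return (pre, post, tag) rows for every element, in document order."""
    rows, next_post = [], [0]

    def visit(node):
        row = [len(rows), None, node.tag]  # pre rank: order of arrival
        rows.append(row)
        for child in node:
            visit(child)
        row[1] = next_post[0]              # post rank: order of departure
        next_post[0] += 1

    visit(ET.fromstring(xml_text))
    return [tuple(row) for row in rows]

def descendants(xml_text, context_tag):
    """Evaluate context/descendant::node() with the range query."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE doc (pre INTEGER, post INTEGER, tag TEXT)")
    db.executemany("INSERT INTO doc VALUES (?, ?, ?)", encode(xml_text))
    db.execute("CREATE TABLE context (pre INTEGER, post INTEGER)")
    db.execute("INSERT INTO context SELECT pre, post FROM doc WHERE tag = ?",
               (context_tag,))
    return db.execute(
        "SELECT DISTINCT t.* FROM doc t, context c "
        "WHERE t.pre > c.pre AND t.post < c.post ORDER BY t.pre").fetchall()

# Descendants of <b>: the elements <c> and <d>, but not <e>.
print(descendants("<a><b><c/><d/></b><e/></a>", "b"))
```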

SQL Compilation
Relational Encoding:
- To guarantee that every location step can be compiled into a single SQL command, we store the attributes pre, post, par, kind and tag in our pre/post table.
- A query like
      result ← context/child::toothpick
  is thus evaluated as:

      SELECT DISTINCT t.*
      FROM table t, context c
      WHERE t.pre > c.pre AND t.post < c.post
        AND t.par = c.pre
        AND t.kind = 'elem' AND t.tag = 'toothpick'
      ORDER BY t.pre

SQL Compilation
Multiple Location Steps:
- As SQL commands can be nested arbitrarily, we can combine several location steps into one single SQL query:
      result ← context/descendant::node()/child::toothpick

      SELECT DISTINCT t2.*
      FROM ( SELECT DISTINCT t1.*
             FROM context c1, table t1
             WHERE t1.pre > c1.pre AND t1.post < c1.post ) c2,
           table t2
      WHERE t2.pre > c2.pre AND t2.post < c2.post
        AND t2.par = c2.pre
        AND t2.kind = 'elem' AND t2.tag = 'toothpick'
      ORDER BY t2.pre
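As a rough check of the nested query, the following sketch runs it next to the single child step on a hand-encoded document. The document `<box><toothpick/><match/><case><toothpick/></case></box>`, the table name `doc` (instead of the reserved word `table`) and the helper names are assumptions made for this illustration; `sqlite3` is used only because it ships with Python.

```python
import sqlite3

# (pre, post, par, kind, tag) for the assumed sample document
# <box><toothpick/><match/><case><toothpick/></case></box>
ROWS = [
    (0, 4, None, "elem", "box"),
    (1, 0, 0, "elem", "toothpick"),
    (2, 1, 0, "elem", "match"),
    (3, 3, 0, "elem", "case"),
    (4, 2, 3, "elem", "toothpick"),
]

CHILD_STEP = """
    SELECT DISTINCT t.* FROM doc t, context c
    WHERE t.pre > c.pre AND t.post < c.post AND t.par = c.pre
      AND t.kind = 'elem' AND t.tag = 'toothpick'
    ORDER BY t.pre"""

NESTED_STEPS = """
    SELECT DISTINCT t2.*
    FROM ( SELECT DISTINCT t1.* FROM context c1, doc t1
           WHERE t1.pre > c1.pre AND t1.post < c1.post ) c2, doc t2
    WHERE t2.pre > c2.pre AND t2.post < c2.post AND t2.par = c2.pre
      AND t2.kind = 'elem' AND t2.tag = 'toothpick'
    ORDER BY t2.pre"""

def run(sql, context_pre):
    """Load the table, use the node with the given pre rank as context."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE doc (pre, post, par, kind, tag)")
    db.executemany("INSERT INTO doc VALUES (?, ?, ?, ?, ?)", ROWS)
    db.execute("CREATE TABLE context (pre, post, par, kind, tag)")
    db.execute("INSERT INTO context SELECT * FROM doc WHERE pre = ?",
               (context_pre,))
    return [row[0] for row in db.execute(sql)]  # pre ranks of the result

print(run(CHILD_STEP, 0))    # box/child::toothpick
print(run(NESTED_STEPS, 0))  # box/descendant::node()/child::toothpick
```

With `box` as context, the single step finds the toothpick directly below `box`, while the nested query finds the one inside `case`.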

SQL Compilation
Window Queries:
- The relational encoding of an XML document can also be captured in so-called window queries (step axis::t with tag t, context node c):

      axis                pre         post        par     kind  tag
      child               (c.pre,*)   (*,c.post)  c.pre   elem  t
      descendant          (c.pre,*)   (*,c.post)          elem  t
      descendant-or-self  [c.pre,*)   (*,c.post]          elem  t
      following           (c.pre,*)   (c.post,*)          elem  t
      following-sibling   (c.pre,*)   (c.post,*)  c.par   elem  t
      parent              c.par       (c.post,*)          elem  t
      ancestor            (*,c.pre)   (c.post,*)          elem  t
      ancestor-or-self    (*,c.pre]   [c.post,*)          elem  t
      preceding           (*,c.pre)   (*,c.post)          elem  t
      preceding-sibling   (*,c.pre)   (*,c.post)  c.par   elem  t
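One way to read the window table is as a lookup from axis to SQL conditions. The sketch below transcribes each row into comparison predicates and assembles a query from them. The dictionary `WINDOWS`, the function `step`, the table name `doc` and the sample document are assumptions for this illustration, not part of the tutorial.

```python
import sqlite3

# (pre, post, par, kind, tag) for the assumed sample document
# <box><toothpick/><match/><case><toothpick/></case></box>
ROWS = [
    (0, 4, None, "elem", "box"),
    (1, 0, 0, "elem", "toothpick"),
    (2, 1, 0, "elem", "match"),
    (3, 3, 0, "elem", "case"),
    (4, 2, 3, "elem", "toothpick"),
]

# One entry per table row: open bounds become < and >, closed bounds <= and >=.
WINDOWS = {
    "child":              ["t.pre > c.pre", "t.post < c.post", "t.par = c.pre"],
    "descendant":         ["t.pre > c.pre", "t.post < c.post"],
    "descendant-or-self": ["t.pre >= c.pre", "t.post <= c.post"],
    "following":          ["t.pre > c.pre", "t.post > c.post"],
    "following-sibling":  ["t.pre > c.pre", "t.post > c.post", "t.par = c.par"],
    "parent":             ["t.pre = c.par", "t.post > c.post"],
    "ancestor":           ["t.pre < c.pre", "t.post > c.post"],
    "ancestor-or-self":   ["t.pre <= c.pre", "t.post >= c.post"],
    "preceding":          ["t.pre < c.pre", "t.post < c.post"],
    "preceding-sibling":  ["t.pre < c.pre", "t.post < c.post", "t.par = c.par"],
}

def step(axis, tag, context_pre):
    """Evaluate axis::tag for the context node with the given pre rank."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE doc (pre, post, par, kind, tag)")
    db.executemany("INSERT INTO doc VALUES (?, ?, ?, ?, ?)", ROWS)
    db.execute("CREATE TABLE context (pre, post, par, kind, tag)")
    db.execute("INSERT INTO context SELECT * FROM doc WHERE pre = ?",
               (context_pre,))
    conditions = WINDOWS[axis] + ["t.kind = 'elem'", "t.tag = ?"]
    sql = ("SELECT DISTINCT t.pre FROM doc t, context c WHERE "
           + " AND ".join(conditions) + " ORDER BY t.pre")
    return [row[0] for row in db.execute(sql, (tag,))]

print(step("following-sibling", "case", 1))  # siblings after the first toothpick
print(step("ancestor", "box", 4))            # ancestors of the inner toothpick
```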

SQL Compilation
Window Queries:
- Window queries are somewhat easier to read and can be implemented by customized SQL functions.
- The last query
      result ← context/descendant::node()/child::toothpick
  can then be formulated as follows:

      SELECT DISTINCT t1.*
      FROM table t1, table t2, context c
      WHERE t1 INSIDE window(child::toothpick, t2)
        AND t2 INSIDE window(descendant::node(), c)
      ORDER BY t1.pre
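`INSIDE window(...)` is not standard SQL, so a customized function has to supply it. The following is a minimal sketch of that idea using SQLite's user-defined functions. The function `inside` with its seven arguments is a hypothetical stand-in for the slide's notation, only the two axes needed here are implemented, and the name test is written as an ordinary condition.

```python
import sqlite3

# (pre, post, par, kind, tag) for the assumed sample document
# <box><toothpick/><match/><case><toothpick/></case></box>
ROWS = [
    (0, 4, None, "elem", "box"),
    (1, 0, 0, "elem", "toothpick"),
    (2, 1, 0, "elem", "match"),
    (3, 3, 0, "elem", "case"),
    (4, 2, 3, "elem", "toothpick"),
]

def inside(axis, t_pre, t_post, t_par, c_pre, c_post, c_par):
    """Return 1 if node t lies in the window of `axis` around node c."""
    if axis == "child":
        return int(c_pre < t_pre and t_post < c_post and t_par == c_pre)
    if axis == "descendant":
        return int(c_pre < t_pre and t_post < c_post)
    raise ValueError("axis not covered by this sketch: " + axis)

QUERY = """
    SELECT DISTINCT t1.* FROM doc t1, doc t2, context c
    WHERE inside('child', t1.pre, t1.post, t1.par, t2.pre, t2.post, t2.par)
      AND t1.kind = 'elem' AND t1.tag = 'toothpick'
      AND inside('descendant', t2.pre, t2.post, t2.par, c.pre, c.post, c.par)
    ORDER BY t1.pre"""

def evaluate(context_pre):
    db = sqlite3.connect(":memory:")
    db.create_function("inside", 7, inside)  # register the custom SQL function
    db.execute("CREATE TABLE doc (pre, post, par, kind, tag)")
    db.executemany("INSERT INTO doc VALUES (?, ?, ?, ?, ?)", ROWS)
    db.execute("CREATE TABLE context (pre, post, par, kind, tag)")
    db.execute("INSERT INTO context SELECT * FROM doc WHERE pre = ?",
               (context_pre,))
    return db.execute(QUERY).fetchall()

print(evaluate(0))  # box/descendant::node()/child::toothpick
```

A function call per row pair is easy to read but hides the range conditions from the query optimizer, so this form trades speed for clarity.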

XPath Symmetries
Idea:
- When a query is evaluated, efficiency may improve if the execution order of the location steps is changed.
- XPath location steps can be divided into forward and reverse axes.
- As a SAX parser works sequentially, it helps that reverse axes can be transformed into forward axes.
- When the content of an XML document is indexed, it makes sense to evaluate predicates first.
- Redundant location steps can be merged into single steps.
- Depending on the query implementation, some steps might be executed faster than others.
Paper: Olteanu et al., Symmetry in XPath. 2002

XPath Symmetries
Full Symmetry:
      v' ∈ v/parent  ⇔  v ∈ v'/child
Partial Symmetries:
      v' ∈ v/descendant  ⇔  v ∈ v'/ancestor
      v' ∈ v/following  ⇔  v ∈ v'/preceding
Full root Symmetry:
      v' ∈ r/descendant-or-self  ⇔  r ∈ v'/ancestor-or-self
Simplification:
      v' ∈ v//city  (= v/descendant-or-self::node()/child::city)  ⇔  v' ∈ v/descendant::city
[Figure: example document tree with the nodes mondial, province, city, city]
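The symmetries can be checked mechanically on a small tree. The sketch below assumes the document `<mondial><province><city/><city/></province></mondial>`, encodes it by hand as (pre, post, par, tag) tuples, and tests each pair of axes over all pairs of nodes. The names `AXES`, `step` and `symmetric` are made up for this example.

```python
from itertools import product

# (pre, post, par, tag) for <mondial><province><city/><city/></province></mondial>
NODES = [(0, 3, None, "mondial"), (1, 2, 0, "province"),
         (2, 0, 1, "city"), (3, 1, 1, "city")]

# Each axis as a test "is node t reached from context node c?"
AXES = {
    "child":              lambda t, c: t[2] == c[0],
    "parent":             lambda t, c: t[0] == c[2],
    "descendant":         lambda t, c: t[0] > c[0] and t[1] < c[1],
    "descendant-or-self": lambda t, c: t[0] >= c[0] and t[1] <= c[1],
    "ancestor":           lambda t, c: t[0] < c[0] and t[1] > c[1],
    "following":          lambda t, c: t[0] > c[0] and t[1] > c[1],
    "preceding":          lambda t, c: t[0] < c[0] and t[1] < c[1],
}

def step(context, axis, tag=None):
    """Apply one location step to a set of context nodes."""
    test = AXES[axis]
    return {t for t in NODES for c in context
            if test(t, c) and (tag is None or t[3] == tag)}

def symmetric(axis, inverse):
    """Check  w in v/axis  <=>  v in w/inverse  for all node pairs."""
    return all((w in step({v}, axis)) == (v in step({w}, inverse))
               for v, w in product(NODES, repeat=2))

root = NODES[0]
print(symmetric("parent", "child"))
print(symmetric("descendant", "ancestor"))
print(symmetric("following", "preceding"))
# Simplification: descendant-or-self::node()/child::city = descendant::city
print(step(step({root}, "descendant-or-self"), "child", "city")
      == step({root}, "descendant", "city"))
```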

XPath Symmetries
Multiple Location Step:
      v' ∈ v/child/child  ⇔  v ∈ v'/parent/parent
Simplified Step:
      v' ∈ v/child/parent  ⇔  v ∈ v'/self
- Warning: a location step can lead to an empty result set. For a node v without children,
      v/child = ()  ⇒  v/child/parent = ()
  whereas
      v/self = v
Using predicates:
- By introducing predicates, we can guarantee the equivalence:
      v' ∈ v/child/parent  ⇔  v' ∈ v/self[child]
[Figure: example document tree with the nodes mondial, province, city, city]
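The warning can be made concrete with a few lines of code. This sketch assumes the same four-node tree as in the figure, hand-encoded as (pre, post, par, tag) tuples, and compares `child/parent` with `self` and `self[child]`. The function names are chosen for this example only.

```python
# (pre, post, par, tag) for <mondial><province><city/><city/></province></mondial>
NODES = [(0, 3, None, "mondial"), (1, 2, 0, "province"),
         (2, 0, 1, "city"), (3, 1, 1, "city")]

def child(context):
    """All nodes whose parent is one of the context nodes."""
    return {t for t in NODES for c in context if t[2] == c[0]}

def parent(context):
    """All nodes that are the parent of one of the context nodes."""
    return {t for t in NODES for c in context if t[0] == c[2]}

def self_with_child(context):
    """self[child]: keep only context nodes that have at least one child."""
    return {c for c in context if child({c})}

province, city = NODES[1], NODES[2]

# An inner node: child/parent, self and self[child] all agree.
print(parent(child({province})) == {province} == self_with_child({province}))
# A leaf: child/parent is empty, but a plain self step would return the leaf.
print(parent(child({city})), {city})
# With the predicate, the rewrite is safe for every node of the tree.
print(all(parent(child({v})) == self_with_child({v}) for v in NODES))
```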

XPath Symmetries
Predicates:
- Predicates are also helpful because they only filter our context set and don't replace it with a new one. Compare:
      province/child = city
  with:
      province[child] = province
Other Symmetries:
      child::city/parent::province  ≡  self::province[child::city]
      descendant::city/parent::province  ≡  descendant-or-self::province[child::city]
      /descendant::/preceding::province  ≡  /descendant::province[following::]
[Figure: example document tree with the nodes mondial, province, city, city]
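The first two rewrites can be tested on a hand-encoded tree. The sketch assumes a slightly larger document than the figure, with a second, empty province, so that the predicate actually filters something: `<mondial><province><city/><city/></province><province/></mondial>`. The helpers `step` and `where` are illustrative names; `where` models a predicate by keeping those context nodes for which the inner step is non-empty.

```python
# (pre, post, par, tag) for the assumed document
# <mondial><province><city/><city/></province><province/></mondial>
NODES = [(0, 4, None, "mondial"), (1, 2, 0, "province"),
         (2, 0, 1, "city"), (3, 1, 1, "city"), (4, 3, 0, "province")]

AXES = {
    "self":               lambda t, c: t[0] == c[0],
    "child":              lambda t, c: t[2] == c[0],
    "parent":             lambda t, c: t[0] == c[2],
    "descendant":         lambda t, c: t[0] > c[0] and t[1] < c[1],
    "descendant-or-self": lambda t, c: t[0] >= c[0] and t[1] <= c[1],
}

def step(context, axis, tag):
    """A location step: replaces the context set with the nodes reached."""
    return {t for t in NODES for c in context
            if AXES[axis](t, c) and t[3] == tag}

def where(context, axis, tag):
    """A predicate [axis::tag]: only filters the context set."""
    return {c for c in context if step({c}, axis, tag)}

def rewrite_1(v):
    left = step(step({v}, "child", "city"), "parent", "province")
    right = where(step({v}, "self", "province"), "child", "city")
    return left == right

def rewrite_2(v):
    left = step(step({v}, "descendant", "city"), "parent", "province")
    right = where(step({v}, "descendant-or-self", "province"), "child", "city")
    return left == right

province = NODES[1]
print(step({province}, "child", "city"))   # a step yields the two cities
print(where({province}, "child", "city"))  # a predicate keeps the province
print(all(rewrite_1(v) and rewrite_2(v) for v in NODES))
```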

XPath Symmetries
Bottom-Up Approach:
- Predicates are often used in XPath to match text or attribute nodes. A conventional top-down approach creates many context nodes that are dropped at the end:
      /mondial///city[/text() = "Rome"]
- If we have stored all content nodes in an index, we can go the other way round and evaluate the leaf nodes first:
      //text()[. = "Rome"]/parent::/parent::city[ancestor::/parent::mondial]
- Note that the parent axis is very efficient when the parent is stored as an attribute!
[Figure: example document tree with the nodes mondial, province, city, city and the text leaves "Parma" and "Rome"]
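The difference between the two evaluation orders can be sketched as follows. Some element names are missing from the queries as transcribed, so this example assumes the hypothetical names `country` and `name` purely for illustration, giving `/mondial/country//city[name/text() = "Rome"]`. The node list, the `TEXT_INDEX` dictionary and both functions are assumptions as well, with each node storing its parent id as the note on the parent axis suggests.

```python
from collections import defaultdict

# (tag, text, parent id); ids are list positions, text nodes have tag None.
# The element names "country" and "name" are assumptions for this sketch.
NODES = [
    ("mondial", None, None),  # 0
    ("country", None, 0),     # 1
    ("province", None, 1),    # 2
    ("city", None, 2),        # 3
    ("name", None, 3),        # 4
    (None, "Parma", 4),       # 5
    ("city", None, 2),        # 6
    ("name", None, 6),        # 7
    (None, "Rome", 7),        # 8
]

CHILDREN = defaultdict(list)
TEXT_INDEX = defaultdict(list)  # text value -> ids of matching text nodes
for i, (_, text, par) in enumerate(NODES):
    if par is not None:
        CHILDREN[par].append(i)
    if text is not None:
        TEXT_INDEX[text].append(i)

def tag(i):
    return NODES[i][0]

def parent(i):
    return NODES[i][2]

def ancestors(i):
    while parent(i) is not None:
        i = parent(i)
        yield i

def descendants(i):
    for c in CHILDREN[i]:
        yield c
        yield from descendants(c)

def top_down(value):
    """Start at the root (id 0), visit every city, test the predicate last."""
    hits = []
    for country in [c for c in CHILDREN[0] if tag(c) == "country"]:
        for d in descendants(country):
            if tag(d) == "city" and any(
                    tag(n) == "name"
                    and any(NODES[t][1] == value for t in CHILDREN[n])
                    for n in CHILDREN[d]):
                hits.append(d)
    return hits

def bottom_up(value):
    """Start at the indexed text nodes and walk up via stored parents."""
    hits = []
    for t in TEXT_INDEX.get(value, []):
        name = parent(t)
        city = parent(name) if name is not None else None
        if city is None or tag(name) != "name" or tag(city) != "city":
            continue
        if any(tag(a) == "country" and parent(a) is not None
               and tag(parent(a)) == "mondial" for a in ancestors(city)):
            hits.append(city)
    return hits

print(top_down("Rome"), bottom_up("Rome"))
```

Both functions return the same city, but the bottom-up variant touches only the matching text node and its ancestors instead of every city in the document.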