XML Databases 13. Systems




XML Databases 13. Systems Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

13. Systems
- 13.1 Introduction
- 13.2 Oracle
- 13.3 DB2
- 13.4 SQL Server
- 13.5 Tamino
- 13.6 Summary
- 13.X Overview and References

XML Databases, Silke Eckstein, Institut für Informationssysteme, TU Braunschweig

13.1 Introduction
After discussing various aspects of XML and XML databases, we now take a closer look at some of the database systems.

13.1 Introduction
- RDBMS with XML support
- Native XML DBMS


13.2 Oracle 11g
Architecture
[Figure: architecture diagram, taken from Oracle XML Developer's Kit Programmer's Guide, 11g Release 1 (11.1), April 2008]

13.2 Oracle 11g
Architecture (2)
[Figure: architecture diagram, taken from Oracle XML DB Developer's Guide, 11g Release 1 (11.1), October 2007]

13.2 Oracle 11g
Mapping variants from XML to databases
- XML column approach: the column is based on the XML type
- XML table approach: the table is based on the XML type
Using the object-relational extensions of Oracle
- XMLTYPE as a predefined object type with SQL/XML functions as methods
- Intermedia-Text-Package with full text functions
- DBMS_XMLDOM package with DOM methods
- DBMS_XMLSCHEMA package with administration and generation methods
- DBMS_XMLGEN package with methods to generate XML from SQL

13.2 Oracle 11g
Storage options
- Text-based (unstructured, as CLOB)
- Binary (compact storage in the XML binary format)
- Schema-based (object-relational storage, requires an XML Schema)
- Hybrid (semistructured)
[Figure taken from Oracle XML DB Developer's Guide, 11g Release 1 (11.1), October 2007]

13.2 Oracle 11g
[Figure taken from Oracle XML DB Developer's Guide, 11g Release 1 (11.1), October 2007]

13.2 Oracle 11g
XML column vs. XML table approach

Table with an XML column:

    CREATE TABLE <table name> (
      <column name> XMLTYPE)
      [XMLTYPE [COLUMN] <column name>
        [STORE AS { OBJECT RELATIONAL               -- schema-based
                  | CLOB (<LOB parameter>)          -- text-based
                  | BINARY XML (<LOB parameter>) }] -- binary
        [XMLSCHEMA <url> ELEMENT [<url> #] <element>]]

XML table:

    CREATE TABLE <table name> OF XMLTYPE
      [XMLTYPE
        [STORE AS { OBJECT RELATIONAL
                  | CLOB (<LOB parameter>)
                  | BINARY XML (<LOB parameter>) }]
        [XMLSCHEMA <url> ELEMENT [<url> #] <element>]]

Inserting documents, in both cases:

    INSERT INTO table VALUES (XMLTYPE(getDocument('input1.xml')));

13.2 Oracle 11g
User-defined function getdocument(file) to read XML documents

    CREATE DIRECTORY xmldir AS 'c:\xmldir';
    GRANT READ ON DIRECTORY xmldir TO PUBLIC WITH GRANT OPTION;

    CREATE FUNCTION getdocument(filename VARCHAR2) RETURN CLOB
    AUTHID CURRENT_USER IS
      xbfile BFILE;
      xclob  CLOB;
    BEGIN
      -- The directory name is stored in upper case and BFILENAME is case-sensitive.
      xbfile := BFILENAME('XMLDIR', filename);
      DBMS_LOB.open(xbfile);
      -- Temporary CLOB that lives for the duration of the session.
      DBMS_LOB.createTemporary(xclob, TRUE, DBMS_LOB.session);
      DBMS_LOB.loadFromFile(xclob, xbfile, DBMS_LOB.getLength(xbfile));
      DBMS_LOB.close(xbfile);
      RETURN xclob;
    END;
    /

13.2 Oracle 11g
The package DBMS_XMLSCHEMA offers methods to register, compile, generate and delete XML Schemas

    DBMS_XMLSCHEMA.registerSchema('schema-url', 'schema-name');
    DBMS_XMLSCHEMA.registerSchema('text.xsd', getdocument('test.xsd'));
    DBMS_XMLSCHEMA.compileSchema('schema-url');
    DBMS_XMLSCHEMA.generateSchema('schema-url', 'type-name');
    DBMS_XMLSCHEMA.deleteSchema('schema-url', DeleteOption);

DeleteOption is one of:
- DELETE_RESTRICT
- DELETE_INVALIDATE
- DELETE_CASCADE
- DELETE_CASCADE_FORCE

13.2 Oracle 11g
Some methods of XMLTYPE
- XMLTYPE(<value-expr>): the constructor; the expression can be a string or a user-defined type
- getClobVal() / getStringVal(): returns the XML value as a CLOB or a string
- getNumberVal(): only applicable to text nodes containing a numeric string
- isFragment(): returns 1 if the instance has more than one root element
- existsNode(<xpath-expr>): returns 1 if the expression returns a node
- extract(<xpath-expr>): extracts a part of the XML value
- transform(<xml-value-expr>): transforms the value according to a stylesheet
- toObject(): converts the value to an object
- isSchemaBased(): returns 1 if the XML value is based on a schema
- getSchemaURL(): returns the URL of the schema
- getRootElement(): returns the root element, or NULL for fragments

13.2 Oracle 11g
Queries
Support of SQL/XML functions:
- XMLQUERY
- XMLTABLE
- XMLAGG
- XMLELEMENT
- XMLATTRIBUTES
- XMLFOREST
Additional functions:
- EXTRACT
- EXISTSNODE
- ...
Full text search with the Intermedia-Text-Package

13.2 Oracle 11g
EXTRACT
Extracts the excerpt of the XML value that is described by an XPath query.

    EXTRACT(<XML-value-expression>, <XPath-expression> [, <Namespace>])

Example:

    SELECT EXTRACT(VALUE(b), '//@ISBN') AS ISBNumber,
           EXTRACT(VALUE(b), '//Title/text()') AS Title_content,
           EXTRACT(VALUE(b), '//Title') AS Title_element
    FROM Book b;

Result:

    ISBNumber      Title_content        Title_element
    3-89864-148-1  XML & Datenbanken    <Title>XML & Datenbanken</Title>
    3-89864-219-4  SQL-1999 & SQL:2003  <Title>SQL-1999 & SQL:2003</Title>

13.2 Oracle 11g
EXISTSNODE
Returns 0 if the query returns the empty sequence, and 1 otherwise.

    EXISTSNODE(<XML-value-expression>, <XPath-expression> [, <Namespace>])

Example:

    SELECT EXTRACT(VALUE(b), '//@ISBN') AS ISBNumber,
           EXTRACT(VALUE(b), '//Title/text()') AS Title_content,
           EXTRACT(VALUE(b), '//Title') AS Title_element
    FROM Book b
    WHERE EXISTSNODE(VALUE(b), '//Book[@ISBN="3-89864-219-4"]') = 1;

Result:

    ISBNumber      Title_content        Title_element
    3-89864-219-4  SQL-1999 & SQL:2003  <Title>SQL-1999 & SQL:2003</Title>
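The interplay of EXISTSNODE (as a filter) and EXTRACT (as a projection) can be tried out outside the database. The following Python sketch only emulates the semantics with the standard library; the sample documents and the helper names exists_node, extract_text and select_titles are assumptions made for illustration and are not part of Oracle.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, modelled on the result rows shown on the slide.
BOOKS = [
    '<Book ISBN="3-89864-148-1"><Title>XML &amp; Datenbanken</Title></Book>',
    '<Book ISBN="3-89864-219-4"><Title>SQL-1999 &amp; SQL:2003</Title></Book>',
]


def exists_node(doc, path):
    """Return 1 if the path selects at least one node, else 0 (like EXISTSNODE)."""
    return 1 if doc.find(path) is not None else 0


def extract_text(doc, path):
    """Return the text of the first element selected by the path, or None."""
    node = doc.find(path)
    return None if node is None else node.text


def select_titles(isbn):
    """Emulate: SELECT isbn, title FROM Book WHERE EXISTSNODE(...) = 1."""
    rows = []
    for source in BOOKS:
        # ElementTree paths are relative to the context node, so the document
        # is wrapped to make the Book element itself selectable.
        wrapper = ET.Element("wrapper")
        wrapper.append(ET.fromstring(source))
        if exists_node(wrapper, f".//Book[@ISBN='{isbn}']") == 1:
            rows.append((wrapper.find("Book").get("ISBN"),
                         extract_text(wrapper, ".//Title")))
    return rows
```

Calling select_titles("3-89864-219-4") keeps only the second document, in the same way as the WHERE clause above.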

13.2 Oracle 11g
Indexing

Full text index:

    CREATE INDEX xmlfulltextidx ON Book b (VALUE(b))
      INDEXTYPE IS CTXSYS.CONTEXT;

Path index:

    CREATE INDEX xmlpathidx ON Book b (VALUE(b))
      INDEXTYPE IS CTXSYS.CTXXPATH;

Functional index (value index):

    CREATE INDEX xmlfunctionalidx ON Book b
      (EXTRACTVALUE(VALUE(b), '//@year'));

XML index:

    CREATE INDEX xmlidx ON Book b (VALUE(b))
      INDEXTYPE IS XDB.XMLIndex;

The XML index creates a set of secondary indexes:
- Path index with all XML tags and fragments
- Order index with the order of the document (node positions)
- Value index to index the values of the nodes
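To make the idea of a combined path, order and value index more concrete, here is a minimal Python sketch. It is a toy model under simplifying assumptions (one entry per element or attribute, a running counter as node position) and not a description of how Oracle implements XMLIndex; the function name build_path_index is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Map each root-to-node path to a list of (position, value) entries.

    The path is the key (path index), the position is a running counter in
    document order (order information), and the value is the node's own text
    or attribute value (value index).
    """
    index = defaultdict(list)
    counter = 0

    def visit(elem, parent_path):
        nonlocal counter
        counter += 1
        path = f"{parent_path}/{elem.tag}"
        index[path].append((counter, (elem.text or "").strip()))
        for name, value in elem.attrib.items():
            counter += 1
            index[f"{path}/@{name}"].append((counter, value))
        for child in elem:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")
    return dict(index)
```

A lookup such as build_path_index(doc)["/Book/Title"] then answers a path query without walking the document again.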

13.2 Oracle 11g
Using indexes

Query using the path index:

    SELECT EXTRACTVALUE(VALUE(b), '//Title') AS Title
    FROM Book b
    WHERE EXISTSNODE(VALUE(b), '/Book/Publisher[text()="dpunkt"]') = 1;

Query using the full text index:

    SELECT SCORE(0), EXTRACT(VALUE(b), '//@ISBN') AS ISBN
    FROM Book b
    WHERE CONTAINS(VALUE(b), 'Java', 0) > 0
    ORDER BY SCORE(0) DESC;

Query using the functional index (the expression has to match the indexed expression, here '//@year'):

    SELECT EXTRACTVALUE(VALUE(b), '//Title') AS Title
    FROM Book b
    WHERE EXTRACTVALUE(VALUE(b), '//@year') = 2009;

13.2 Oracle 11g
Manipulation: UPDATEXML
Changes a part (defined by an XPath query) of the XML value.

    UPDATEXML(<XML-value-expr>, <replacement-list> [, <namespace>])
    <replacement-list> := <XPath-expr>, <value-expr>

Example to change the value of an attribute:

    UPDATE Book b
    SET VALUE(b) = UPDATEXML(VALUE(b),
        '//Publisher[text()="dpunkt"]/@City', 'Zürich');

Manipulation: DELETEXML
Deletes a sequence of nodes (selected by an XPath query) from the XML value.

    DELETEXML(<XML-value-expr>, <XPath-expr> [, <namespace>])

Example to delete a specific Author node:

    UPDATE Book b
    SET VALUE(b) = DELETEXML(VALUE(b),
        '//Book[@ISBN="3-89864-148-1"]/Author[text()="Holger Meyer"]');
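The effect of the two statements can be reproduced on an in-memory document. The sketch below uses Python's standard library instead of Oracle; the helper names and the sample document (including the second author and the original city) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def update_attribute(doc, tag, text, attr, value):
    """Set attr on every <tag> element with the given text.

    Like UPDATEXML, this only changes nodes that already exist: elements
    without the attribute are left untouched.
    """
    changed = 0
    for node in doc.iter(tag):
        if (node.text or "").strip() == text and attr in node.attrib:
            node.set(attr, value)
            changed += 1
    return changed


def delete_children(doc, parent_tag, child_tag, text):
    """Remove every <child_tag> with the given text below a <parent_tag>."""
    removed = 0
    # Materialise the parents first so that the tree is not modified
    # while it is being traversed.
    for parent in list(doc.iter(parent_tag)):
        for child in list(parent.findall(child_tag)):
            if (child.text or "").strip() == text:
                parent.remove(child)
                removed += 1
    return removed


# Hypothetical sample document.
SAMPLE = ('<Book ISBN="3-89864-148-1"><Author>Meike Klettke</Author>'
          '<Author>Holger Meyer</Author>'
          '<Publisher City="Heidelberg">dpunkt</Publisher></Book>')
```

Both helpers return the number of affected nodes, which makes it easy to see when an XPath-like selection matched nothing.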

13.2 Oracle 11g
XML views
- Allow XML-based views on SQL and XML values
- Are based on the principle of object views
- The object type is XMLTYPE in this case

Example:

    CREATE VIEW DpunktBooks OF XMLTYPE
      WITH OBJECT ID DEFAULT
    AS SELECT VALUE(b)
       FROM Book b
       WHERE EXISTSNODE(VALUE(b), '//Publisher[text()="dpunkt"]') = 1;

13.2 Oracle 11g
Export of database contents with XML syntax
Standard mapping SQL → XML with DBMS_XMLGEN.getXML('query'):
- Top-level elements result from columns
- Simple types (with scalar values) are mapped to elements with PCDATA
- Structured types and their attributes are mapped to elements with subelements for the attributes
- Complex attributes are mapped to hierarchically nested elements
- Collection types are mapped to lists of elements
- Object references and referential integrity are mapped to ID/IDREF within the document
- Table content is mapped to ROWSET elements:

    <ROWSET>
      <ROW num="1"> ... </ROW>
      ...
      <ROW num="n"> ... </ROW>
    </ROWSET>

A user-defined transformation from SQL to XML is possible with XSLT.
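The ROWSET/ROW shape of the standard mapping is easy to reproduce for flat tables. The following Python sketch builds it from column names and rows; it covers scalar columns only, and both the function name rows_to_rowset and the choice to omit NULL columns are assumptions of this sketch rather than a statement about DBMS_XMLGEN.

```python
import xml.etree.ElementTree as ET


def rows_to_rowset(columns, rows):
    """Serialise a flat query result as <ROWSET><ROW num="...">...</ROW></ROWSET>."""
    rowset = ET.Element("ROWSET")
    for num, row in enumerate(rows, start=1):
        row_el = ET.SubElement(rowset, "ROW", {"num": str(num)})
        for name, value in zip(columns, row):
            if value is None:
                continue  # assumption of this sketch: NULL columns are omitted
            # One element per column, carrying the scalar value as text.
            ET.SubElement(row_el, name.upper()).text = str(value)
    return ET.tostring(rowset, encoding="unicode")
```

Structured types, collections and references would need nested elements and ID/IDREF handling, which this sketch leaves out.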

13.2 Oracle 11g
Summary: Oracle XML support

    XML storage model     Extensible, object-relational
    Schema definition     Validation possible
    Storage type          Text-based or schema-based
    Mapping DB ↔ XML      By SQL/XML functions, schema generators, XML views
    XML data type         Available
    Value/function index  Available
    Full text index       Available
    Path index            Available
    Queries               SQL/XML with XQuery support
    Full text search      With the Intermedia-Text-Package
    Manipulation          SQL methods with XPath


13.3 DB2 V9
IBM DB2
[Figure: architecture with application, XML documents, file system and database]

13.3 DB2 V9
Mapping XML data to relational databases
Variants:
- XML column approach: based on an XML data type
- XML collection approach: based on the decomposition of XML documents into database tables and attributes
Table with an XML column, with various XML data types:
- XML: model-based / hierarchical storage (pureXML)
- XMLCLOB: XML documents stored as CLOBs (XML Extender)
- XMLVARCHAR: XML documents stored as VARCHAR (XML Extender)
- XMLFILE: XML documents stored in the file system (XML Extender)
- XML schema validation for the data type XML only
In addition: materialized views
- Extract selected XML content from documents
- Materialize this content into so-called side tables
- Side tables are defined in the Document Access Definition (DAD)

13.3 DB2 V9
[Figure: "pureXML and relational hybrid database" [IBM06a]]

13.3 DB2 V9
[Figure: ways to put XML data into the database (pureXML) [IBM06b]]

13.3 DB2 V9
[Figure: ways to get XML data out of the database (pureXML) [IBM06b]]

13.3 DB2 V9
pureXML: queries and indexes
Application of SQL in XQuery:

    XQUERY db2-fn:xmlcolumn('T1.XML1')

Delivers the values of column XML1 of table T1 as a node sequence (the column must be of type XML; the argument is case-sensitive).

    XQUERY db2-fn:sqlquery('SELECT xml1 FROM t1')

Delivers the XML values of the single-column result of the SQL query as a node sequence (the column must be of type XML).

Definition of a path index:

    CREATE INDEX Idx_Author_Path ON Book (Content)
      GENERATE KEY USING XMLPATTERN '//Author' AS SQL VARCHAR(50)

13.3 DB2 V9
XML Extender
[Figure: mapping between XML and SQL]

13.3 DB2 V9
XML Extender: tables with XML types
XML extension setup with the XML Extender Admin Wizard or in the command window:

    > dxxadm enable_db XMLDB

Definition of tables accepting XML documents:
- Variant 1: create with the XML Extender Admin Wizard
- Variant 2: SQL

    CREATE TABLE Buch (Inhalt DB2XML.XMLVARCHAR)

Insertion of an XML document:

    INSERT INTO Buch (Inhalt)
    VALUES (DB2XML.XMLVARCHARFromFile('C:\XMLDIR\buch01.xml'))

13.3 DB2 V9
XML Extender: queries
The SQL-XML Extender offers functions for queries and updates.
Extract functions:

    DB2XML.EXTRACT<datatype>(<XML value expression>, <XPath expression>)

Example:

    SELECT a.returnedvarchar
    FROM Buchlob, TABLE(DB2XML.EXTRACTVARCHARS(Inhalt, '//Autor')) a

Limited support of the SQL/XML standard:
- XMLAGG
- XMLELEMENT
- XMLATTRIBUTES

13.3 DB2 V9
ExtractXXX(<XML value expression>, <XPath expression>)
[Table taken from "IBM DB2 Universal Database XML Extender Administration and Programming, Version 8", 2002]

13.3 DB2 V9
XML Extender: updates
Updates are possible with special XML Extender methods.
Syntax:

    DB2XML.UPDATE(<XML value expression>, <XPath expression>, <new value>)

Restriction: predicates on elements are not supported.
Example of a predicate that is not supported:

    UPDATE Buchlob
    SET Inhalt = DB2XML.UPDATE(Inhalt,
        '//Verlag[text()="dpunkt"]/@Ort', 'Zürich')

Example of a supported predicate:

    UPDATE Buchlob
    SET Inhalt = DB2XML.UPDATE(Inhalt,
        '//Buch[@ISBN="3-89864-148-1"]/Verlag/@Ort', 'Köln')

- With the XML column approach, updates are propagated to the side tables automatically
- In pureXML, an XML value can only be replaced as a whole

13.3 DB2 V9
XML Extender: indexing
Index support
- Value index (B-tree, bitmap, etc.) on side tables (XML Extender)
- Full text index (with Text Extender) on XML types
- Extension of the full text index for information retrieval on XML
  - Path information is included in the index
  - Support for path expressions

Example (retrieval model):

    SELECT Inhalt
    FROM Buchlob
    WHERE contains(dscrhandel,
        'MODEL order SECTION(//Buch/Beschreibung) "Datenbank"') = 1

13.3 DB2 V9
Summary: IBM DB2 XML support

    XML storage model     Extensible, object-relational
    Schema definition     Validation possible
    Storage type          Model-based (pureXML), text-based or user-defined schema-based (XML Extender)
    Mapping DB ↔ XML      DAD (XML Extender)
    XML data type         Available (pureXML)
    Value/function index  Standard DBS indexes on side tables
    Full text index       With Text Extender
    Path index            Available
    Queries               SQL/XML with XQuery support
    Full text search      With Text Extender
    Manipulation          SQL functions with XPath


13.4 SQL Server
Microsoft SQL Server architecture
[Figure: architecture with application, XML documents and database]

13.4 SQL Server
Mapping XML data to relational databases
Four storage variants:
- Native (binary) storage
- Text-based storage as CLOB
- Model-based storage according to the EDGE approach
- Schema-based storage via STORED queries
Data type XML with methods based on XQuery (the method names are case-sensitive and written in lower case):
- query() evaluates an XQuery and returns a value of type XML
- value() evaluates an XQuery and returns a scalar SQL value
- exist() returns true if the XQuery result is not empty
- modify() updates a value of type XML
- nodes() returns the selected subtrees of an XML value as rows
Integrated usage of SQL and XQuery:
- Access to SQL data in XQuery via sql:column() and sql:variable()
- Evaluation of XQuery expressions in SQL via the XML methods listed above

13.4 SQL Server
Native storage: table definition

Schema registration:

    CREATE XML SCHEMA COLLECTION BuchXSD AS
      '<?xml version="1.0"?> ...'

Table definition:

    CREATE TABLE Buch (
      Id INT PRIMARY KEY,
      Inhalt XML (BuchXSD) )

Insertion of an XML document from a file:

    INSERT INTO Buch
    SELECT 1, xCol
    FROM (SELECT * FROM OPENROWSET
            (BULK 'C:\XMLDIR\buch1.xml', SINGLE_BLOB) AS xCol) AS R(xCol)

13.4 SQL Server
Native storage: SQL/XML queries and updates

Find all author elements of books whose first author is "Gunter Saake":

    SELECT Inhalt.query('//Autor') AS Autoren
    FROM Buch
    WHERE Inhalt.exist('/Buch[Autor[1] = "Gunter Saake"]') = 1

Result:

    Autoren
    <Autor>Gunter Saake</Autor><Autor>Ingo Schmitt</Autor><Autor>Can Türker</Autor>
    <Autor>Gunter Saake</Autor><Autor>Kai-Uwe Sattler</Autor>

Update the value of the attribute "Ort" (city) to "Zürich" for the first publisher element in each document whose publisher is "dpunkt":

    UPDATE Buch
    SET Inhalt.modify('replace value of
        (//Verlag[. = "dpunkt"]/@Ort)[1] with "Zürich"')

13.4 SQL Server
Native storage: indexing

Definition of a primary XML index:

    CREATE PRIMARY XML INDEX Idx_Inhalt ON Buch (Inhalt)

- Creates a clustered index with entries of the form (ID, ORDPATH, TAG, NODETYPE, VALUE, PATH_ID, ...)
- Is necessary in order to create secondary indexes

Secondary XML index types:
- Path index (path, value)
- Property index (primary key, path, value)
- Value index (value, path)

Definition of a secondary XML index, where <index type> is PATH, PROPERTY or VALUE:

    CREATE XML INDEX Idx_Inhalt_Path ON Buch (Inhalt)
      USING XML INDEX Idx_Inhalt FOR <index type>

A full text index is also supported (a full text index takes no name of its own; b is the name of a unique key index on the table):

    CREATE FULLTEXT INDEX ON Buch (Inhalt) KEY INDEX b

13.4 SQL Server
Model-based storage with EDGE
Invoking OPENXML without a WITH clause creates an EDGE table:

    Column        Datatype  Purpose
    id            bigint    unique node id
    parentid      bigint    parent node id
    nodetype      int       distinguishes elements, attributes, comments
    localname     nvarchar  tag
    prefix        nvarchar  XML namespace prefix
    namespaceuri  nvarchar  XML namespace URI
    datatype      nvarchar  datatype (derived from DTD or XML schema)
    prev          bigint    id of previous node (in document order)
    text          ntext     node content

13.4 SQL Server
Model-based storage with EDGE

    EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
    INSERT INTO EDGE
      SELECT * FROM OpenXML (@hdoc, '', 0)
    EXEC sp_xml_removedocument @hdoc

EDGE table:

    id   parentid  nodetype  localname  prefix  namespaceuri  datatype  prev  text
    0    NULL      1         book       NULL    NULL          NULL      NULL  NULL
    ...
    17   6         3         #text      NULL    NULL          NULL      NULL  'Vossen'
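The EDGE approach stores one row per node, linked to its parent and to its previous sibling. The Python sketch below shreds a document into such rows using only the standard library. It is a simplification, not a reimplementation of OPENXML: the ids are a plain counter, namespaces and datatypes are left out, mixed content after child elements is ignored, and the function name shred_to_edge is hypothetical.

```python
import xml.etree.ElementTree as ET

ELEMENT, ATTRIBUTE, TEXT = 1, 2, 3  # node type codes used in the rows


def shred_to_edge(xml_text):
    """Return rows (id, parentid, nodetype, localname, prev, text)."""
    rows = []

    def add(parent_id, nodetype, localname, prev, text):
        node_id = len(rows)  # ids are assigned in document order
        rows.append((node_id, parent_id, nodetype, localname, prev, text))
        return node_id

    def visit(elem, parent_id, prev):
        elem_id = add(parent_id, ELEMENT, elem.tag, prev, None)
        last = None  # id of the previous sibling below this element
        for name, value in elem.attrib.items():
            attr_id = add(elem_id, ATTRIBUTE, name, last, None)
            # The attribute value becomes a text node below the attribute.
            add(attr_id, TEXT, "#text", None, value)
            last = attr_id
        if elem.text and elem.text.strip():
            last = add(elem_id, TEXT, "#text", last, elem.text.strip())
        for child in elem:
            last = visit(child, elem_id, last)
        return elem_id

    visit(ET.fromstring(xml_text), None, None)
    return rows
```

For '<book><author>Vossen</author></book>' this yields three rows: the book element, the author element and a #text node carrying 'Vossen', analogous to the last row of the table above.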

13.4 SQL Server
Schema-based storage with STORED queries
- SQL extension with OPENXML
- OPENXML transforms XML contents into database tables (shredding)
- OPENXML therefore offers the possibility to implement STORED queries

Example of the realization of a STORED query:

    EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
    INSERT INTO book
      SELECT * FROM OpenXML (@hdoc, '//book', 0)
      WITH (
        title     NVARCHAR(3000) './title',
        publisher NVARCHAR(200)  './publisher',
        isbn      NVARCHAR(15)   './isbn' )
    EXEC sp_xml_removedocument @hdoc
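The same shredding step can be sketched in Python with the standard library, using an in-memory SQLite database in place of SQL Server. The table layout follows the WITH clause above; the function name shred_books and the sample input are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred_books(xml_text, conn):
    """Insert one relational row per <book> element and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book (title TEXT, publisher TEXT, isbn TEXT)")
    root = ET.fromstring(xml_text)
    # Each column is filled from a path relative to the book element,
    # mirroring './title', './publisher' and './isbn'.
    rows = [(b.findtext("title"), b.findtext("publisher"), b.findtext("isbn"))
            for b in root.iter("book")]
    conn.executemany("INSERT INTO book VALUES (?, ?, ?)", rows)
    return len(rows)
```

Usage: open a connection with sqlite3.connect(":memory:"), call shred_books(text, conn), and query the book table with ordinary SQL. Missing subelements end up as NULL in the corresponding column.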

13.4 SQL Server
Mapping of databases to XML
Variant 1: standard transformation with SQL SELECT and the FOR XML clause
- FOR XML RAW: transformation into ROW XML elements and XML attributes
- FOR XML AUTO: semantically rich XML element names; foreign key relationships are transformed into hierarchies
- FOR XML EXPLICIT: the user controls the XML assembly through metadata (EDGE)
Variant 2: user-defined XML view
- Use of an (available) XML schema
- Annotation of the schema with information about tables and columns
- Access from the application to the XML view via:
  - IIS functionality
  - ADO (ActiveX Data Objects) middleware for DB access

13.4 SQL Server
Updates
- SQL Server does not offer functions to update XML documents stored as CLOBs
  - This results in heavy restrictions for the text-based approach
- Updates for the schema-based approach are possible via so-called updategrams
  - Build on annotated XML schemas
  - Updates are specified as an XML document
  - New namespace: xmlns:updg="urn:schemas-microsoft-com:xml-updategram"
  - Element before: definition of the previous state (to be modified)
  - Element after: definition of the new state
- Different update operations through varying element contents
  - Insert: the before element remains empty
  - Delete: the after element remains empty
  - Update: both elements have non-empty contents
- The necessary database operations are executed automatically

13.4 SQL Server
Updates: updategram example
Update of publisher information

    <ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
      <updg:sync>
        <updg:before>
          <Buch>
            <Titel> Objektdatenbanken </Titel>
            <ISBN>3-8266-00258-7 </ISBN>
            <Verlag> Thomson </Verlag>
          </Buch>
        </updg:before>
        <updg:after>
          <Buch>
            <Titel> Objektdatenbanken </Titel>
            <ISBN>3-8266-00258-7 </ISBN>
            <Verlag> International Thomson Publishing </Verlag>
          </Buch>
        </updg:after>
      </updg:sync>
    </ROOT>
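The rule "empty before means insert, empty after means delete, both filled means update" can be checked mechanically. The Python sketch below only classifies the sync blocks of an updategram; it does not execute any database operation, and the function name classify_updategram is hypothetical.

```python
import xml.etree.ElementTree as ET

UPDG = "urn:schemas-microsoft-com:xml-updategram"


def classify_updategram(updategram_text):
    """Return 'insert', 'delete', 'update' or 'noop' for each sync block."""
    root = ET.fromstring(updategram_text)
    operations = []
    for sync in root.iter(f"{{{UPDG}}}sync"):
        before = sync.find(f"{{{UPDG}}}before")
        after = sync.find(f"{{{UPDG}}}after")
        # A block counts as non-empty if it has at least one child element.
        has_before = before is not None and len(before) > 0
        has_after = after is not None and len(after) > 0
        if has_before and has_after:
            operations.append("update")
        elif has_after:
            operations.append("insert")
        elif has_before:
            operations.append("delete")
        else:
            operations.append("noop")
    return operations
```

Applied to the example above, the single sync block has content in both before and after and is therefore classified as an update.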

13.4 SQL Server
Summary: SQL Server XML support

    XML storage model  Relational
    Schema definition  Inline DTD or XML schema
    Storage type       Native: XML column
                       Text-based: CLOB column
                       Model-based: with OPENXML
                       User-defined schema-based: with OPENXML STORED queries
    Mapping DB ↔ XML   Automatically: FOR XML clause
                       User-defined: XSD annotations
    XML data type      Available
    Value index        Available
    Full text index    No XML-specific functions
    Path index         Available
    Queries            SQL extensions (query and value are not compatible with SQL/XML), XQuery
    Manipulation       XML method modify, updategrams


13.5 Tamino
Architecture
[Figure: architecture diagram]

13.5 Tamino
Architecture (2)
[Figure: architecture diagram with the labels "Query (URL)", "XML Output", "XML Objects, DTDs", "Data from external sources and/or internal data storage" and "Data to external sources and/or internal data storage"]

13.5 Tamino
Storage structures: mapping of XML
- Tamino uses "native" storage structures for XML data
- Native storage is supplemented with various classical index types
  - B-tree index
  - Full text index
  - Path index
Storage alternatives:
- Storage of well-formed XML documents without a schema
- Storage of valid XML documents
- Annotation of the schema definition with storage alternatives
Storage hierarchy:
- Tier 1: Tamino
- Tier 2: Collection
- Tier 3: Document type (defined by a set of XML schema definitions)
- Tier 4: Document instance

13.5 Tamino
Storage: example schema with annotations for a text index

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
      <xs:annotation>
        <xs:appinfo>
          <tsd:schemaInfo name="book">
            <tsd:collection name="books"></tsd:collection>
            <tsd:doctype name="book">
              <tsd:logical>
                <tsd:content>open</tsd:content>
              </tsd:logical>
            </tsd:doctype>
          </tsd:schemaInfo>
        </xs:appinfo>
      </xs:annotation>
      <xs:element name="book">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="title" type="xs:string"></xs:element>
            <xs:element name="summary" type="xs:string">
              <xs:annotation>
                <xs:appinfo>
                  <tsd:elementInfo>
                    <tsd:physical>
                      <tsd:native>
                        <tsd:index>
                          <tsd:text></tsd:text>
                        </tsd:index>
                      </tsd:native>
                    </tsd:physical>
                  </tsd:elementInfo>
                </xs:appinfo>
              </xs:annotation>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

13.5 Tamino
Queries
Access options
- Program-controlled, e.g. via DCOM components
- Ad-hoc queries with the X-Plorer query tool
- "Interactive Interface"
Supported query languages
- XPath 1.0 dialect with extensions for text search (also possible without an index)
  - Containment (~=): /Buch[Titel ~= "Datenmodelle"]/Beschreibung
  - Wildcard character (*): /*[. ~= "*XML*"]
  - Consideration of context (NEAR): /*[/Autor ~= "Gunter" NEAR "Saake"]
- XQuery dialect

13.5 Tamino
Updates
Operations

Delete:

    UPDATE DELETE $buch//verlag[@ort="zürich"]/@ort

Insert:

    UPDATE INSERT <Preis Waehrung="EUR">35</Preis>
    INTO $buch[@isbn="3-8266-0258-7"]

Replace:

    UPDATE REPLACE $buch//verlag[@ort="zürich"]/@ort
    WITH ATTRIBUTE Ort {"Wiesbaden"}

13.5 Tamino
Indexing
- Classical indexes for data: numbers and strings
- Text indexes for document-centric parts: with wildcards
- Structure index: full or condensed
- Combined index: multiple elements and attributes, even on different levels
- Multi path index: different paths indexed together
- Reference index: hierarchy-aware index

13.5 Tamino
Summary: Tamino

    XML storage model  Native
    Schema definition  Validation possible
    Storage type       Model-based
    Mapping DB ↔ XML   Native
    XML data type      Available
    Value index        Available
    Full text index    Available
    Path index         Available
    Queries            Tamino X-Query (with extensions and small differences compared to W3C XQuery)
    Full text search   Supported
    Manipulation       Supported


13.6 Summary

13.X Overview
1. Introduction
2. XML Basics
3. Schema definition
4. XML query languages I
5. Mapping relational data to XML
6. SQL/XML
7. XML processing
8. XML query languages II: XQuery Data Model
9. XML query languages III: XQuery
10. XML storage I: Overview
11. XML storage II
12. Updates
13. Systems

13.X References
- Can Türker: "XML und Datenbanken". Lecture, University of Zurich, 2008
- [KM03] M. Klettke, H. Meyer: "XML und Datenbanken". dpunkt.verlag, 2003
- [IBM06a] IBM: "DB2 9 pureXML Guide". December 2006
- [IBM06b] "DB2 Version 9. XML Guide"

Questions, Ideas, Comments
Now, or ...
- Room: IZ 232
- Office hour: Tuesday, 12:30 to 13:30, or by appointment
- Email: eckstein@ifis.cs.tu-bs.de