XML Databases
Silke Eckstein, Andreas Kupfer
Institut für Informationssysteme, Technische Universität
http://www.ifis.cs.tu-bs.de

Creating XML documents from a database
  Introduced in the last chapter
  On a more or less conceptual level
Not handled so far
  Creating XML documents inside a database
  Retrieving data from XML documents
  Changing XML document content
Solution: integration in the database with SQL/XML

SQL/XML
  Storage of XML is available in all big commercial DBMS
  Proprietary solutions for embedding in SQL
  SQL/XML = Part 14 of the SQL standard: XML functionality
  Incorporates the corresponding standards for XML (XML Schema, XQuery)
  Basic idea
    Mapping of SQL concepts to XML (see last chapter)
    Own data type to store XML

SQL/XML
  Storing XML documents inside the database as values of type XML
  Data type XML with associated functions
  Mapping between SQL and XML
  Embedding XQuery in SQL
  Generating XML documents using SQL/XML functions
  Figure: an SQL database with an XML data type, accessed through SQL and XQuery, with a mapping between SQL and XML

Mapping SQL database to XML
  SQL charset to Unicode (depends on implementation)
  SQL identifiers to XML names
  SQL data types to XML Schema data types
  SQL values to XML values
  SQL tables to XML and XML Schema documents
  SQL schemas to XML and XML Schema documents
  SQL catalogues to XML and XML Schema documents
Mapping SQL tables

    CREATE TABLE Account (
      Name    CHAR(20),
      Balance NUMERIC(12,2)
    );

    Name  Balance
    Joe   2000
    Jim   3500

  Mapping SQL table columns to XML elements
  Mapping table rows to XML <row> elements

    <ACCOUNT>
      <row> <NAME>Joe</NAME> <BALANCE>2000</BALANCE> </row>
      <row> <NAME>Jim</NAME> <BALANCE>3500</BALANCE> </row>
    </ACCOUNT>

    <xsd:complexType name="ROW.ACCOUNT">
      <xsd:sequence>
        <xsd:element name="NAME" type="CHAR_20"/>
        <xsd:element name="BALANCE" type="NUMERIC_12_2"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="TABLE.ACCOUNT">
      <xsd:annotation><xsd:appinfo>
        <sqlxml:sqlname type="BASE TABLE" localName="ACCOUNT"/>
      </xsd:appinfo></xsd:annotation>
      <xsd:sequence>
        <xsd:element name="row" type="ROW.ACCOUNT"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:element name="ACCOUNT" type="TABLE.ACCOUNT"/>

Relational table: Cities

    City      Zip    State
              38100  Niedersachsen
              38106  Niedersachsen
    Hannover  30159  Niedersachsen

  Many possible XML documents...

    <Name></Name> <State>Niedersachsen</State> ...
    ...
    <State name="Niedersachsen"> <City name=""/> </State> ...

XMLELEMENT
  Creates an XML element
  Example: creating name and content

    XMLELEMENT( NAME "City", 'Bad Oeynhausen' )

  Creates

    <City>Bad Oeynhausen</City>

  Can contain attributes, comments and other elements and options
  XMLATTRIBUTES, if present, comes directly after the element name

    XMLELEMENT( NAME "City",
      XMLATTRIBUTES( 'Bayern' AS "State", '80469' AS "Zip" ),
      XMLCOMMENT( 'Example 2' ),
      'München' )

  Creates

    <City State="Bayern" Zip="80469"><!--Example 2-->München</City>

XMLELEMENT referencing the database
  Can be used directly from an SQL statement

    SELECT XMLELEMENT( NAME "City",
             XMLATTRIBUTES( "State", "Zip" AS "PLZ" ),
             XMLCOMMENT( 'Example 3' ),
             "City" )
    FROM Cities
    WHERE ... ;

    <City STATE="Niedersachsen" PLZ="38100"><!--Example 3--> ...

XMLELEMENT nesting
  Example

    SELECT XMLELEMENT( NAME "City",
             XMLELEMENT( NAME "Name", "City" ),
             XMLELEMENT( NAME "State", "State" ),
             XMLELEMENT( NAME "Zip", "Zip" ) )
    FROM Cities
    WHERE ... ;

    <Name></Name> <State>Niedersachsen</State> ...
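To make the nesting behaviour of XMLELEMENT concrete, here is a minimal sketch that imitates it outside the database with Python's standard library. It is not SQL/XML itself: the helper `xml_element` and the dictionary `row` are hypothetical names introduced for illustration, and only the Hannover row of the Cities table is used because the other city names are not available.

```python
import xml.etree.ElementTree as ET


def xml_element(name, *content, attributes=None):
    """Rough stand-in for XMLELEMENT(NAME name, XMLATTRIBUTES(...), content...)."""
    elem = ET.Element(name, dict(attributes or {}))
    for item in content:
        if item is None:
            continue  # simplification: a NULL value contributes nothing
        if isinstance(item, ET.Element):
            elem.append(item)  # nested element, like a nested XMLELEMENT call
        elif len(elem) == 0:
            elem.text = (elem.text or "") + str(item)  # text before any child
        else:
            elem[-1].tail = (elem[-1].tail or "") + str(item)  # text after a child
    return elem


# Hypothetical in-memory copy of one row of the Cities table.
row = {"City": "Hannover", "Zip": "30159", "State": "Niedersachsen"}

# Counterpart of the nested XMLELEMENT query.
nested = xml_element(
    "City",
    xml_element("Name", row["City"]),
    xml_element("State", row["State"]),
    xml_element("Zip", row["Zip"]),
)
nested_xml = ET.tostring(nested, encoding="unicode")

# Counterpart of XMLATTRIBUTES("State", "Zip" AS "PLZ") with the city as content.
with_attrs = xml_element(
    "City", row["City"], attributes={"State": row["State"], "PLZ": row["Zip"]}
)

print(nested_xml)
print(ET.tostring(with_attrs, encoding="unicode"))
```

In the database the same structure is produced per result row by the SELECT statement; the sketch only shows how element names, attributes and content fit together.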
XMLELEMENT syntax diagram
  Figure: syntax diagram of XMLELEMENT [IBM]

XMLFOREST
  Constructs a forest of elements without attributes

    SELECT XMLFOREST( "City", "State" ) FROM Cities;

    <City></City><State>Niedersachsen</State>
    <City></City><State>Niedersachsen</State>
    <City>Hannover</City><State>Niedersachsen</State>

XMLFOREST syntax diagram
  Figure: syntax diagram of XMLFOREST [IBM]

XMLCONCAT [Pow07]
  Concatenates multiple XML fragments into a single XML value
  Compare outputs: the first query returns three XML columns, the second a single XML column

    SELECT XMLELEMENT("city", City) AS "CITY",
           XMLELEMENT("zip", Zip) AS "ZIP",
           XMLELEMENT("state", State) AS "STATE"
    FROM Cities;

    SELECT XMLCONCAT(
             XMLELEMENT("city", CITY),
             XMLELEMENT("zip", ZIP),
             XMLELEMENT("state", STATE) )
    FROM Cities;

XMLAGG
  Aggregates separate rows of output into a single XML value

    SELECT City,
           XMLAGG( XMLELEMENT(NAME "Zip", Zip) ) AS "Zipcodes"
    FROM Cities
    GROUP BY City;

    City      Zipcodes
    Hannover  <Zip>30159</Zip>

XMLAGG [Pow07]
  Allows sorting

    SELECT XMLAGG( XMLELEMENT("address", Zip || ' ' || City)
                   ORDER BY Zip DESC )
    FROM Cities;

    <address>38106 </address>
    <address>38100 </address>
    <address>30159 Hannover</address>

  Disadvantage: can only aggregate a single element, so several fields have to be concatenated into it
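The effect of XMLAGG with GROUP BY and ORDER BY can be sketched outside the database as follows. This is an illustration in Python, not the SQL/XML implementation; the functions `zip_element`, `xml_agg` and `zipcodes_by_city` and the sample rows are assumptions made for the example (the rows are invented and are not the contents of the Cities table).

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical (City, Zip) rows, chosen so that one group has two members.
rows = [
    ("Hannover", "30159"),
    ("Hannover", "30161"),
    ("Goettingen", "37073"),
]


def zip_element(zip_code):
    """Counterpart of XMLELEMENT(NAME "Zip", Zip)."""
    elem = ET.Element("Zip")
    elem.text = zip_code
    return elem


def xml_agg(elements):
    """Concatenate the serialized elements of one group, like XMLAGG."""
    return "".join(ET.tostring(e, encoding="unicode") for e in elements)


def zipcodes_by_city(rows):
    """GROUP BY City, aggregating the Zip elements in descending Zip order."""
    result = {}
    ordered = sorted(rows, key=lambda r: r[0])  # groupby needs sorted input
    for city, group in groupby(ordered, key=lambda r: r[0]):
        zips = sorted((z for _, z in group), reverse=True)  # ORDER BY Zip DESC
        result[city] = xml_agg(zip_element(z) for z in zips)
    return result


zipcodes = zipcodes_by_city(rows)
for city, agg in zipcodes.items():
    print(city, agg)
```

Each group collapses into one value that holds all Zip elements of that city, which is what distinguishes XMLAGG from XMLCONCAT: XMLCONCAT combines values within one row, XMLAGG combines values across rows.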
Storing XML in relational databases is possible as
  Character data (VARCHAR, Character Large OBject)
  New data type XML
    A value of the data type XML can contain
      a whole XML document
      an XML element
      a set of XML elements
    All XML publishing operators from chapter 6.2 create values of the data type XML, not a string

XML type hierarchy
  Figure: hierarchy of the XML types, with the restriction that leads to each subtype
    XML(SEQUENCE)
    XML(CONTENT(ANY)): NULL or document node
    XML(DOCUMENT(ANY)): 1 element child
    XML(CONTENT(UNTYPED)): untyped elements & attributes, elements not NULL
    XML(DOCUMENT(UNTYPED)): 1 element child
    XML(CONTENT(XMLSCHEMA)): validated against schema
    XML(DOCUMENT(XMLSCHEMA)): validated against schema, 1 element child

Specification of XML type

    XML [ ( { DOCUMENT | CONTENT | SEQUENCE }
            [ ( { ANY | UNTYPED | XMLSCHEMA <schema name> } ) ] ) ]

  Modifiers are optional
  Primary type modifier
    DOCUMENT (XML document)
    CONTENT (XML element)
    SEQUENCE (sequence of XML elements)
  Secondary type modifier
    UNTYPED
    XMLSCHEMA (typed)
    ANY (may be typed)

[Pow07]
  Create a table that is an XML data type in itself

    CREATE TABLE XMLDOCUMENT OF XMLTYPE;

  Create a table containing an XMLType data type column

    CREATE TABLE XML (
      ID  NUMBER NOT NULL,
      XML XMLTYPE,
      CONSTRAINT XPK PRIMARY KEY (ID)
    );

Example: definition of an XML type column

    CREATE TABLE Groups (
      ID   INTEGER,
      Name XML
    );

    ID   Name
    123  <Groups>Annabelle</Groups>
    234  <Groups>Magdalena, Marius</Groups>
    345  <?xml version="1.0"?>
         <Groups> <Person>Patrick</Person> <Person>Robert</Person> </Groups>
    654  <Groups>Rebecca</Groups> <Groups>Torben</Groups>
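The difference between the DOCUMENT and CONTENT modifiers can be illustrated with the values of the Groups table above. The following is a simplified sketch in Python under the assumption that "document" means exactly one root element and "content" means anything that is well-formed once it is wrapped in a root element; the function names `is_document` and `is_content` are hypothetical, and the real SQL/XML rules are defined on the XQuery data model and are more detailed.

```python
import xml.etree.ElementTree as ET


def is_document(text):
    """True if text parses as a complete XML document (a single root element)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_content(text):
    """True if text is a document, or well-formed inside a synthetic root."""
    if is_document(text):
        return True
    try:
        # The wrapper admits several top-level elements and bare text.
        ET.fromstring("<wrapper>" + text + "</wrapper>")
        return True
    except ET.ParseError:
        return False


values = {
    123: "<Groups>Annabelle</Groups>",
    345: '<?xml version="1.0"?><Groups><Person>Patrick</Person>'
         "<Person>Robert</Person></Groups>",
    654: "<Groups>Rebecca</Groups><Groups>Torben</Groups>",
}

for key, text in values.items():
    print(key, is_document(text), is_content(text))
```

Row 654 holds two sibling elements, so it qualifies as content but not as a document, while rows 123 and 345 qualify as both.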
Characteristics
  Allowed values
    XML documents (including prolog)
    XML content according to XML 1.0 (includes pure text, comments, PI?)
    NULL
  No comparison possible (compare CLOB in SQL)
    User can define an order if comparison is necessary
  No corresponding type available in programming languages for embedding in SQL
    Standard defines operators to convert to other SQL data types

Parsing & Serialization
  XMLParse
    Parses a string value using an XML parser
    Produces a value whose specific type is XML(DOCUMENT(ANY)), or CONTENT, or ...
  XMLSerialize
    Transforms an XML value into a string value (CHAR, VARCHAR, CLOB, or BLOB)

Motivation
  How can SQL applications locate and retrieve information in XML documents stored in an SQL database cell?
  Invoking an XML query language within SQL statements
    Retrieve information in the SELECT list
    Locate information in the WHERE clause
  Details on the XML query language XQuery later

XMLQuery
  A new SQL expression, invoked as a pseudo-function, whose data type can be an XML type such as XML(CONTENT(ANY)) or an ordinary SQL type
XMLExists
  A new SQL predicate, invoked as a pseudo-function, returning true when the contained XQuery expression returns anything other than the empty sequence (false) or the SQL null value (unknown)

XMLQuery syntax

    XMLQUERY( <XQuery expression>
              [PASSING <argument list>]
              {NULL | EMPTY} ON EMPTY )

    argument list := <SQL value> AS <XQuery variable>

  Example

    SELECT XMLQUERY(
             '<State name="{$name}">{$city}</State>'
             PASSING State AS "name", City AS "city"
             NULL ON EMPTY ) AS CityList
    FROM Cities;

    CityList
    <State name="Niedersachsen"></State>
    <State name="Niedersachsen">Hannover</State>
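The difference between XMLQuery and XMLExists, and the meaning of NULL ON EMPTY versus EMPTY ON EMPTY, can be sketched as follows. This is an approximation in Python, not XQuery: it uses the limited XPath subset of `xml.etree.ElementTree`, the functions `xml_query` and `xml_exists` are hypothetical stand-ins, the sample document is invented, and the unknown result of XMLExists for an SQL null value is not modelled.

```python
import xml.etree.ElementTree as ET


def xml_query(document, path, null_on_empty=True):
    """Return the serialized matches of path; None or '' if nothing matches."""
    matches = ET.fromstring(document).findall(path)
    if not matches:
        # NULL ON EMPTY yields SQL NULL (None here), EMPTY ON EMPTY an empty value.
        return None if null_on_empty else ""
    return "".join(ET.tostring(m, encoding="unicode") for m in matches)


def xml_exists(document, path):
    """True if path selects at least one node, false for the empty sequence."""
    return len(ET.fromstring(document).findall(path)) > 0


# Hypothetical stored XML value.
doc = '<State name="Niedersachsen"><City>Hannover</City></State>'

print(xml_query(doc, ".//City"))
print(xml_query(doc, ".//Zip"))
print(repr(xml_query(doc, ".//Zip", null_on_empty=False)))
print(xml_exists(doc, ".//City"), xml_exists(doc, ".//Zip"))
```

XMLQuery belongs in the SELECT list because it returns a value, whereas XMLExists belongs in the WHERE clause because it returns a truth value.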
XMLQuery: Example

    CREATE TABLE Papers (ID INTEGER, Paper XML);

    ID   Paper
    123  <Paper> <author>Alice</author> <title>Perpetual Motion</title> <year>1999</year> </Paper>
    345  <Paper> <year>2005</year> <author>Bob</author> <author>Charlie</author> <title>Beer</title> </Paper>

    SELECT ID,
           XMLQUERY(
             'for $a in $p//author
              return <Authors>{$a/text()}</Authors>'
             PASSING Paper AS "p" ) AS AuthorNames
    FROM Papers;

    ID   AuthorNames
    123  <Authors>Alice</Authors>
    345  <Authors>Bob</Authors> <Authors>Charlie</Authors>

XMLTABLE
  Provides an SQL view of XML data
    Output is not of the XML type
  Evaluates an XQuery row pattern with optional arguments (as with XMLQuery)
  Element/attribute values are mapped to columns using XQuery column patterns
  Names & types of columns required; default values optional
  Syntax

    XMLTABLE( <XQuery expression>
              PASSING <argument list>
              COLUMNS <column list> )

    column := <name> <type> PATH <path expression>

XMLTable: Example

    SELECT ID, t.*
    FROM Papers p,
         XMLTABLE(
           'for $root in $papers
            where $root//author/text() = "Bob"
            return $root/Paper'
           PASSING p.Paper AS "papers"
           COLUMNS About   VARCHAR(30) PATH '/Paper/title',
                   Created INTEGER     PATH '/Paper/year'
         ) AS t;

    ID   About  Created
    345  Beer   2005

Validation of XML
  Is like integrity constraints in DBs
  Requires an XML Schema
  XML Schemas may be registered with the SQL server
    Implementation-defined mechanism
    Known by SQL name & by target namespace URI
    Schema does need a unique name
    Used by XMLValidate(), IS VALID, and to restrict values of XML(DOCUMENT-or-CONTENT(XMLSCHEMA))

Schema registration

    REGISTER XMLSCHEMA 'http://www.alfred-moos.de/grussschema.xsd'
      FROM 'file://c:/xml_schemata/grussschema.xsd'
      AS GrussSchema COMPLETE;

    CREATE TABLE Dokument_XML (
      Dokument_XML_Nr CHAR(4) NOT NULL PRIMARY KEY,
      Dokument XML,
      CONSTRAINT validieren CHECK (
        Dokument IS VALIDATED ACCORDING TO XMLSCHEMA ID GrussSchema )
    );
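Returning to the XMLTable example on the Papers table: the way a row pattern and column patterns turn XML into relational rows can be sketched in Python. This is a sketch of the idea rather than of XQuery evaluation; `xml_table`, `papers_by_author` and the list `papers` are names assumed for illustration, and the author filter is written in plain Python instead of an XQuery where clause.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory copy of the Papers table (ID, Paper).
papers = [
    (123, "<Paper><author>Alice</author><title>Perpetual Motion</title>"
          "<year>1999</year></Paper>"),
    (345, "<Paper><year>2005</year><author>Bob</author><author>Charlie</author>"
          "<title>Beer</title></Paper>"),
]


def xml_table(root, row_path, columns):
    """One tuple per node selected by row_path; columns are (path, converter) pairs."""
    rows = []
    for node in root.findall(row_path):
        values = []
        for col_path, convert in columns:
            text = node.findtext(col_path)  # None if the column path matches nothing
            values.append(None if text is None else convert(text))
        rows.append(tuple(values))
    return rows


def papers_by_author(papers, author):
    """Counterpart of the example: ID, About, Created for papers by the given author."""
    result = []
    for paper_id, document in papers:
        root = ET.fromstring(document)
        if any(a.text == author for a in root.iter("author")):
            # "." selects the Paper element itself as the single row item.
            for about, created in xml_table(root, ".", [("title", str), ("year", int)]):
                result.append((paper_id, about, created))
    return result


print(papers_by_author(papers, "Bob"))
```

The converters `str` and `int` play the role of the column types VARCHAR(30) and INTEGER: the result consists of ordinary typed values, no longer of XML values.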
Schema definition
  Syntax

    XML(CONTENT(XMLSCHEMA <schema> [<element>]))

    <schema>  := URI <namespace> [LOCATION <loc>]
               | NO NAMESPACE [LOCATION <loc>]
               | ID <registered schema name>
    <element> := [NAMESPACE <namespace>] ELEMENT <element name>

New functions and predicates
  XMLValidate
    Validates an XML value against an XML Schema (or target namespace), returning a new XML value with type annotations
  IS VALID
    Tests an XML value to determine whether or not it is valid according to an XML Schema (or target namespace); returns true/false without altering the XML value itself
  IS DOCUMENT
    Determines whether an XML value satisfies the (SQL/XML) criteria for an XML document
  IS CONTENT
    Determines whether an XML value satisfies the (SQL/XML) criteria for XML content

Benefits of schema registration
  Security issues
    Schemas cannot disappear without the SQL server knowing about it
    Schemas cannot be hijacked (altered in inappropriate ways) without the SQL server knowing about it
    Documents cannot be marked valid against schemas unless the SQL server knows about them

Predefined schemas (built-in namespaces)
    xs:     http://www.w3.org/2001/XMLSchema
    xsi:    http://www.w3.org/2001/XMLSchema-instance
    sqlxml: http://standards.iso.org/iso/9075/2003/sqlxml
  More depending on the DB implementation
  Completely supported per XML+Namespaces: XMLElement, XMLForest, XMLTable
  Default namespace, explicit namespace (prefix)
  Declare namespace within scopes of WITH clause, column definitions, constraint definitions, insert/delete/update statements, compound statements

SQL/XML standard published as
  ISO/IEC 9075-14:2003
    Mappings and publishing functions
  ISO/IEC 9075-14:2006
    Adds XQuery, including data model, validation
  ISO/IEC 9075-14:2008
    Updates
    Something else?
SQL/XML:2003 plus
  Additional publishing functions
  XQuery data model
  More precise XML type (modifiers)
  XMLQuery, XMLTable
  XMLValidate, IS VALID
  XMLExists, IS DOCUMENT, IS CONTENT
  Casting between XML type and SQL types

Overview of some operators for the XML type
  XMLELEMENT    creates an XML element node
  XMLFOREST     creates a sequence of XML element nodes from a table
  XMLCOMMENT    creates an XML comment node
  XMLTEXT       creates a text node
  XMLPI         creates a processing instruction
  XMLAGG        aggregates XML values of a group
  XMLCONCAT     concatenates XML type values
  XMLTRANSFORM  applies an XSL to a document
  XMLPARSE      parses a well-formed SQL text into an XML value
  XMLSERIALIZE  converts an XML value to an SQL text
  XMLDOCUMENT   creates an XML document node from an XML value
  XMLVALIDATE   validates an XML value with a schema
  XMLQUERY      evaluates an XQuery expression
  XMLTABLE      transforms an XQuery result to an SQL table
  XMLITERATE    transforms an XQuery sequence to an SQL table

Review of SQL/XML
  Two components
    A data type XML to store XML data
    Functions to map relational structures to XML
  Only construction operators
    No extraction of values or search
    But construction operators are based on XQuery
  Mapping of tables, schemas, catalogues ignores some information from the relational schema
    UNIQUE
    REFERENCES
    CHECK
  Further extensions are expected

Lecture overview
  1. Introduction
  2. XML Basics
  3. Schema definition
  4. XML query languages I
  5. Mapping relational data to XML
  7. XML processing
  8. XML query languages II
  9. XML storage I
  10. XML storage - index
  11. XML storage - native
  12. Updates / Transactions
  13. Systems
  14. XML Benchmarks

References
  "XML und Datenbanken", Can Türker, Lecture, University of Zurich, 2008
  [Pow07] Beginning XML Databases, Gavin Powell, Wiley & Sons, 2007, ISBN 0471791202
  "XML-Datenbanken", Thomas Kudraß, Lecture, HTWK Leipzig, WS2007/2008
  "SQL/XML", Jim Melton, Oracle Corp., 2005
  [Moo08] XQuery und SQL/XML in DB2-Datenbanken: Verwaltung und Erzeugung von XML-Dokumenten in DB2, Alfred Moos, Vieweg+Teubner, 2008
  ISO/IEC 9075-14:2003, Information Technology - Database Languages - SQL - Part 14: XML-Related Specifications (SQL/XML)
  [IBM] DB2 SQL-Reference, IBM, March 2008

Questions, Ideas, Comments
  Now, or...
  Room: IZ 232
  Office hour: Tuesday, 12:30 to 13:30, or by appointment
  Email: eckstein@ifis.cs.tu-bs.de