Comparison of XML Support in IBM DB2 9, Microsoft SQL Server 2005, Oracle 10g



O. Beza¹, M. Patsala², E. Keramopoulos³

¹ Dpt. of Information Technology, Alexander Technology Educational Institute (ATEI), Thessaloniki, Greece, E-mail: olmpel@it.teithe.gr
² Dpt. of Information Technology, Alexander Technology Educational Institute (ATEI), Thessaloniki, Greece, E-mail: Patsala_Maria@yahoo.gr
³ Dpt. of Information Technology, Alexander Technology Educational Institute (ATEI), Thessaloniki, Greece, E-mail: euclid@it.teithe.gr

Abstract

In this paper we present the relation between XML (Extensible Markup Language) documents and three Relational Database Management Systems (RDBMSs): IBM DB2 9, Microsoft SQL Server 2005 and Oracle 10g. The research aims to describe the ways in which we can manipulate this type of document using these three XML-enabled databases and to compare their XML support. The paper discusses the basic concepts of XML and presents the structure of XML documents, the related technologies (DTDs, XML Schemata, etc.) and two of the most important XML query languages, XPath and XQuery. Moreover, we outline the basic concepts of database systems and how they can benefit from using XML. The emphasis of the paper is on the comparison, which is based on a list of basic XML features that an RDBMS should support. We introduce these XML features and illustrate the comparison with examples of using XML with IBM DB2 9, Microsoft SQL Server 2005 and Oracle 10g. Finally, we summarize our conclusions in a comparison table that contains all the XML operations supported by the three RDBMSs.

Keywords: XML, XML-enabled Databases, DTD, XML Schema

1. Introduction

XML (Extensible Markup Language) [1, 2] is a markup language developed by the World Wide Web Consortium (W3C) [3] to deliver structured content over the web.
XML was originally developed as an application profile of SGML [4], but it soon became successful in a variety of other application domains. That is because XML provides many advantages over other data formats, including:
1. Built-in support for internationalization, because it uses Unicode.
2. Platform independence.
3. A human-readable format, which makes it easier for developers to locate and fix errors than with other data storage formats.
4. Extensibility, which allows developers to add extra information to a format without breaking applications that are based on older versions of that format.
5. A large number of off-the-shelf tools for processing existing XML documents.

XML databases have become widely accepted for all applications where the storage of XML data is necessary. There are three different types of XML databases [5], namely:

XML-Enabled Database: A database that holds data in some format other than XML. An interface is provided that presents XML to the application even though the data is stored in another format. An XML-enabled database might be a relational database, an object-relational database, or an object-oriented database.

Native XML Database: This type of database allows XML data to be stored directly. It also defines a (logical) model for an XML document and stores and retrieves documents according to that model. Native XML databases are likely to perform better than XML-enabled databases since there is little need for converting the data. The data conversion in an enabled database is almost always going to be more significant and time consuming than in a native database.

Hybrid XML Database: A database that has characteristics of both Native XML Databases and XML-Enabled Databases.

IBM DB2 9, Microsoft SQL Server 2005 and Oracle 10g are XML-enabled relational database management systems (RDBMSs). DB2 offers two ways of processing XML documents: XML Extender [6, 7] and PureXML [8, 9].

In this paper, we present the XML characteristics that the three RDBMSs should support. In particular, in Section 2 the three DBMSs are examined against some XML technologies, such as DTDs, XML Schema, XPath, XQuery and XSL. The method that the three DBMSs use in order to store an XML document is given in Section 3. In Section 4, we focus on the mapping between XML documents and DBMSs and on XML indexes.
In Section 5, we examine how the three DBMSs decompose XML documents into relational table columns and compose XML documents from them. In Section 6, we study how the three DBMSs use SQL/XML, the extension of SQL3 [10] for working with XML. Finally, in Section 7 we conclude with a comparison table that contains all the described XML features that an RDBMS should support.

2. XML Technologies

2.1 DTD (Document Type Definition) Documents

DTDs [11] are documents that define markup rules as a vocabulary. DTDs have a different syntax from XML and are generally used to specify the order and occurrence of elements in an XML document. DTDs have become less popular since XML Schemata were introduced. However, some programmers still prefer DTDs, mainly because they are easier to write and validate than XML Schemata. SQL Server does not support the use of DTD files. DB2 PureXML does not support DTD validation, but it permits the insertion of documents that contain a DOCTYPE declaration referring to a DTD. On the other hand, DB2 XML Extender and Oracle fully support DTD validation.

2.2 XML Schemata

An XML Schema [12] is a mechanism introduced by the W3C that can be used in place of a DTD to define the specifications for the content of XML documents. All three DBMSs can register an XML schema in the database. Oracle and DB2 provide a repository that holds all the registered artifacts used for validation and stores them in their hierarchical structure; it is named XML DB Repository in Oracle and XML Schema Repository in DB2. SQL Server does not provide such a tool, but it provides a statement for modifying an XML Schema, ALTER XML SCHEMA COLLECTION. Similarly, DB2 provides the command ADD XMLSCHEMA DOCUMENT TO. In Oracle, once we register a schema in the database we cannot modify it. Moreover, Oracle provides two methods: isSchemaBased, which checks whether the inserted XML document conforms to a schema, and isSchemaValidated, which checks whether the document inserted in a column is valid. Finally, in Oracle and SQL Server we can create XML Schemata in the database.

One of the most important reasons that DBMSs use XML Schemata is to validate XML documents before they are inserted in the columns of a table. In SQL Server we have to define, when the table is created, whether the XML column will contain XML documents that conform to an XML Schema. Thus, if we create a typed column, the XML document must contain all the tags, with the same names, that the schema defines. In Oracle, on the contrary, we only have to name the root element and the rest of the document may differ. In DB2 we have to define in the insert command whether the document will be validated against a schema or not, as we can see in the following example.
    insert into PurchaseOrder(poid, info)
    values (2002, XMLVALIDATE(
        XMLPARSE(DOCUMENT '<purchaseOrder poid="2002" orderdate="1999-10-20" status=""> </purchaseOrder>')
        ACCORDING TO XMLSCHEMA ID migrate.po));

2.3 XPath-XQuery-XSL

XPath [13] is a language for addressing parts of an XML document; it operates on a hierarchical (tree) representation of the document. All three DBMSs use it to navigate through the elements and attributes of an XML document that is stored in a column of XML type. XQuery [14] is a W3C Recommendation and conforms to the same data model as XPath. XQuery is used for finding and extracting elements and attributes from XML documents. According to our research, DB2's support of XQuery is superior to that of the other two, since it treats XQuery as a first-class language. Only DB2 XML Extender does not support XQuery.
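To make the idea of path navigation concrete, the following is a minimal, vendor-neutral sketch in Python. It relies on the standard library module xml.etree.ElementTree, which implements only a subset of XPath 1.0, so it illustrates navigation over elements and attributes and says nothing about the XQuery engines of the three DBMSs. The sample document and the helper names `ship_to_names` and `item_ids` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order, loosely modelled on the PurchaseOrder examples.
PO_XML = """
<purchaseOrder poid="2002" status="shipped">
  <shipTo><name>Helen Zoe</name></shipTo>
  <items>
    <item pid="100-100-01"><quantity>2</quantity></item>
    <item pid="100-103-01"><quantity>1</quantity></item>
  </items>
</purchaseOrder>
"""


def ship_to_names(root):
    """Return the text of every shipTo/name element below the root."""
    return [node.text for node in root.findall("./shipTo/name")]


def item_ids(root):
    """Return the pid attribute of every item element that carries one."""
    return [node.get("pid") for node in root.findall("./items/item[@pid]")]


root = ET.fromstring(PO_XML)
names = ship_to_names(root)
pids = item_ids(root)
print(names, pids)
```

The first helper follows the child axis step by step, and the second adds an attribute predicate, which are the two navigation patterns that also appear in the per-DBMS query examples.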

The following examples show XQuery in each system.

DB2 PureXML, XQuery embedded in SQL:

    select xmlquery('$cinfo/purchaseorder/shipto/name' passing info as "cinfo") from purchaseorder;

DB2 PureXML, stand-alone XQuery:

    xquery for $y in db2-fn:xmlcolumn('purchaseorder.info')/purchaseOrder/items return $y

SQL Server's XQuery:

    Select poid, info.query('for $y in /purchaseorder/items return <topic>{$y/item[@pid]}</topic>') from purchaseorder

Oracle's XQuery:

    Select XMLQuery('$cinfo/product/description/name[ora:contains(., "Roll") > 0]' passing info as "cinfo" returning content) from product;

XSL (Extensible Stylesheet Language) [15] is a language for expressing style sheets; in other words, it defines how an XML document should be presented. All three DBMSs fully support the use of XSL.

3. Storage Methods

In this section we examine the method that each of the three DBMSs uses to store an XML document in a database. SQL Server [16, 17] stores XML documents in table columns of XML type, like BLOBs (Binary Large Objects). If the XML document is stored in an untyped¹ column, it is stored as Unicode (UTF-16), whereas if it is stored in a typed² column, its values are stored with the types that the XML schema defines. For example,

    Create table Product (pid varchar(10) not null primary key, name varchar(128), category varchar(32), price decimal(30,2), info xml);

    Create table purchaseorder (POid bigint not null primary key, Status varchar(10) not null default 'New', Info XML(content PO));

Oracle [18, 19] stores XML documents intact in table columns of type xmltype, like CLOBs (Character Large Objects) or BLOBs (Binary Large Objects), or in a distinct xmltype table. For example,

    Create table purchaseorder (POid bigint not null primary key, status varchar (10) not null, info xmltype) xmltype info xmlschema "http://www.w3.org/2001/xmlschema" element "purchaseorder";

    Create table purchaseorder of xmltype XMLSCHEMA "http://www.w3.org/2001/xmlschema" element "purchaseorder";

DB2 XML Extender stores the XML document in a single column as character data and extracts values into "side tables". For example,

    Create table PurchaseOrder (POid bigint not null primary key, Status varchar (10) not null with default 'New', Info db2xml.xmlclob not logged not null);

In DB2 PureXML the XML document is stored in a column of XML type. It is worth mentioning that PureXML does not store documents as plain text and does not map XML to relational or object-relational tables. Instead, it stores XML in its inherent hierarchical format, which matches the XML data model. Any XML document is a well-defined tree of elements and attributes, and XML queries are expressed in terms of tree traversal. An example of storing an XML document in DB2 PureXML is the following:

    Create table PurchaseOrder (POid bigint not null primary key, status varchar (10) not null with default 'shipped', Info xml not null);

¹ Untyped is the term used in SQL Server for columns of XML type that are not associated with an XML Schema.
² Typed is the term used in SQL Server for columns of XML type that comply with an XML Schema.

4. XML Mapping and Indexing

The concept of mapping [20] is of the greatest importance for XML-enabled databases, because the data transfer between the XML document and the database is based on the mapping between them. With DB2's XML Extender, the mapping between the tables of the database and the structure of the XML document is defined by a document called DAD (Data Access Definition). This document maps the elements of the XML document to the columns of the table. In contrast, DB2 PureXML uses annotated XML Schemata [12] instead of DAD files. In general, annotated Schemata, which are also referred to as mapping schemata, are used by all three RDBMSs for mapping. Annotations can be defined on tables (sql:relation annotation), on fields (sql:field annotation) and on referential integrity relationships (sql:relationship annotation). If an XML schema is not registered in the database, each of the DBMSs uses a default mapping.
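The role of a mapping document can be illustrated with a small, hedged sketch in Python. It is not a DAD file or an annotated schema, and it uses none of the three DBMSs; it only shows the underlying idea of a declarative table that ties each relational column to a location in the XML document. The mapping `PO_MAPPING`, the function `map_to_row` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: relational column -> (path to a node, attribute name).
# An attribute name of None means "take the text content of the node".
PO_MAPPING = {
    "poid": (".", "poid"),
    "status": (".", "status"),
    "ship_name": ("./shipTo/name", None),
}


def map_to_row(xml_text, mapping):
    """Apply the mapping to one document and return a column -> value dict."""
    root = ET.fromstring(xml_text)
    row = {}
    for column, (path, attr) in mapping.items():
        # "." denotes the root element itself; other paths are searched below it.
        node = root if path == "." else root.find(path)
        if node is None:
            row[column] = None          # location absent in this document
        elif attr is None:
            row[column] = node.text     # element content
        else:
            row[column] = node.get(attr)  # attribute value
    return row


row = map_to_row(
    '<purchaseOrder poid="2002" status="shipped">'
    "<shipTo><name>Helen Zoe</name></shipTo></purchaseOrder>",
    PO_MAPPING,
)
print(row)
```

Keeping the mapping as data, separate from the code that applies it, mirrors the design choice behind DAD files and annotated schemata: the same engine can load different document types when only the mapping changes.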
SQL Server also uses FOR XML clauses [21], which define how the result of a select statement is mapped to an XML document.

A common characteristic of all the XML-enabled databases is the support of XML indexes [17, 22], which are built on elements and attributes of XML documents. Just like relational indexes, XML indexes are used to improve the performance of queries. Users should create indexes over frequently accessed data, which results in much better performance for select statements executed over the indexed data.
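Why an index over elements or attributes helps can be seen in a minimal sketch, written in Python and independent of the index structures that the three DBMSs actually use. It builds a lookup table from attribute values to the rows whose documents contain them and compares it with a scan that parses every document for each query. The stored documents and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stored documents keyed by row id (a stand-in for an XML column).
DOCS = {
    1: '<purchaseOrder status="shipped"><item pid="100-100-01"/></purchaseOrder>',
    2: '<purchaseOrder status="new"><item pid="100-103-01"/></purchaseOrder>',
    3: '<purchaseOrder status="shipped"><item pid="100-100-01"/>'
       '<item pid="100-201-01"/></purchaseOrder>',
}


def build_attribute_index(docs, path, attr):
    """Map each value of `attr` found at `path` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, text in docs.items():
        for node in ET.fromstring(text).findall(path):
            value = node.get(attr)
            if value is not None:
                index[value].add(row_id)
    return index


def scan(docs, path, attr, wanted):
    """Unindexed lookup: parse and search every document on every query."""
    return {
        row_id
        for row_id, text in docs.items()
        if any(n.get(attr) == wanted for n in ET.fromstring(text).findall(path))
    }


pid_index = build_attribute_index(DOCS, "./item", "pid")  # built once
indexed_hit = pid_index["100-100-01"]                     # one dictionary lookup
scanned_hit = scan(DOCS, "./item", "pid", "100-100-01")   # parses all documents
print(indexed_hit, scanned_hit)
```

Both lookups return the same rows; the difference is that the index is paid for once when it is built, while the scan repeats the parsing work for every query, which is the trade-off behind the advice to index frequently accessed data.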

5. Managing XML Documents and Relational Data

When working with XML documents and databases we can either store the documents intact in columns of XML type or decompose the XML data into relational tables. Another operation we can perform is to compose XML documents from existing relational data.

For decomposition, DB2 XML Extender provides the XML Collection method [23]. An XML Collection is defined by a DAD document that determines how the elements and the attributes are to be mapped to one or more relational tables. After we enable the database for XML Collection (dxxadm enable_collection database_name name path), we insert the XML data into the tables using the DAD and the XML documents. DB2 PureXML, on the other hand, uses an annotated XML Schema that we have registered in the database and enabled for decomposition; with the command decompose xml document the desired XML data are inserted into the tables.

SQL Server uses the stored procedure sp_xml_preparedocument and OPENXML clauses [16] for decomposing XML data into relational tables. This procedure does not require an XML Schema; we just have to supply the XML document. This approach is not automated like DB2's and it does not support insertion into more than one table. Oracle performs a similar procedure using dbms_xmlgen. By defining an XML document, the names of the tables we want to create and the columns they contain, we can insert XML data into them. This approach is far more complicated than DB2's, especially when we want to update many tables or insert more than one XML document.

Apart from these functions, we can create XML documents and Schemata from existing relational data. Oracle and DB2 use SQL/XML functions to produce XML documents, whereas SQL Server uses FOR XML clauses. Finally, DB2 does not support XML Schema creation.

6. SQL/XML

SQL/XML [24] is an extension of SQL that is part of ANSI/ISO SQL 2003. SQL/XML was developed by INCITS H2.3 [25], with participation from Oracle, IBM, Microsoft (which does not plan to implement SQL/XML), Sybase [26], and DataDirect Technologies [27]. Its extensions include the following:
- Mapping SQL tables, schemas, and catalogs to XML documents.
- Generation of an XML schema corresponding to an XML document generated from SQL data.
- An XML data type to allow columns of SQL tables to contain XML data.
- Publishing functions that allow SQL queries to create XML structures, including XMLELEMENT, XMLATTRIBUTES, XMLFOREST, XMLCONCAT, XMLAGG, and XMLGEN.
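What the publishing functions produce can be imitated outside any of the three DBMSs. The sketch below is written in Python with the standard sqlite3 and xml.etree.ElementTree modules, so it is an analogy and not SQL/XML itself: the comments indicate which publishing function each step roughly corresponds to. The table contents and the function name `publish_products` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical relational data held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (pid TEXT PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("100-100-01", "Snow Shovel", 9.99), ("100-103-01", "Ice Scraper", 3.99)],
)


def publish_products(connection):
    """Compose one XML document from all rows of the product table."""
    root = ET.Element("products")  # wrapper that collects rows, as XMLAGG would
    query = "SELECT pid, name, price FROM product ORDER BY pid"
    for pid, name, price in connection.execute(query):
        # One element per row with the key as an attribute:
        # roughly XMLELEMENT combined with XMLATTRIBUTES.
        product = ET.SubElement(root, "product", {"pid": pid})
        # One child element per remaining column: roughly XMLFOREST.
        ET.SubElement(product, "name").text = name
        ET.SubElement(product, "price").text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


xml_text = publish_products(conn)
print(xml_text)
```

In SQL/XML the same nesting is expressed inside a single SELECT statement, which lets the database build the document without a round trip through application code.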

7. Conclusions

To sum up, we report some observations we made during our research. Oracle was rather slow on Windows XP and less user-friendly than the other two DBMSs. What we found quite convenient when working with SQL Server and DB2 was the existence of hyperlinks in the columns that contain XML documents and in query results. One limitation of DB2 9 is that XML Extender requires a number of steps to enable the database, and PureXML requires a database with codeset UTF-8. Below we give a comparison table that contains all the functions and tools that the DBMSs use for XML support.

FEATURES (compared across DB2 PureXML, DB2 XML Extender, SQL Server and Oracle)
XML Technologies: DTD, XML Schemas, XPath, XQuery, XSL
Storage Methods: BLOB, CLOB, VARCHAR, Native
XML Data Type: XMLType, XML, XMLVarchar, XMLClob, XMLFile
Columns of XML Type
Tables of XML Type
XML Validation: DTD, XML Schema
XML Shredding
Composition of XML Documents
Composition of XML Schema
XML Mapping
XML Indexing
SQL/XML
XML Repository

Figure 1: Comparative Table

As we can see from the above table, DB2 XML Extender does not support the use of XML Schemata and XQuery, whereas SQL Server falls short in the use of DTDs.

One big advantage of DB2 PureXML is the native storage of XML data. This approach contributes to faster query performance and data access. One of the most interesting functions that SQL Server and Oracle offer is the creation of XML Schemata, something that DB2 does not support. Finally, DB2 PureXML and Oracle provide a very helpful tool for managing XML Schemata and other validation artifacts, the XML Repository.

REFERENCES

[1] Extensible Markup Language (XML) 1.0 (Fourth Edition), W3C Recommendation, 29 September 2006. Available at: http://www.w3.org/tr/xml/
[2] Extensible Markup Language (XML) Tutorial. Available at: http://www.w3schools.com/xml/
[3] World Wide Web Consortium (W3C). Available at: http://www.w3.org/
[4] SGML. Available at: http://www.w3.org/tr/html4/intro/sgmltut.html
[5] What is an XML database? Available at: http://xmldb-org.sourceforge.net/faqs.html
[6] IBM Redbooks, XML Guide, db2xge90
[7] IBM Redbooks, XML Extender Administration and Programming, Version 8.2, db2sxe81
[8] IBM Redbooks, DB2 9: pureXML Overview and Fast Start, sg247298
[9] IBM Redbooks, DB2 9 pureXML Guide, sg247315
[10] ISO/IEC 9075-1:2003, Information technology - Database languages - SQL - Part 1: Framework (SQL/Framework).
[11] Kevin Williams, Professional XML Databases, Wrox Press Ltd, 2000
[12] Introduction to Annotated XSD Schemas (SQLXML 4.0). Available at: http://technet.microsoft.com/en-us/library/ms171870.aspx
[13] XML Path Language (XPath) Version 1.0, W3C Recommendation, 16 November 1999. Available at: http://www.w3.org/tr/xpath
[14] XQuery 1.0: An XML Query Language, W3C Recommendation. Available at: http://www.w3.org/tr/xquery
[15] Extensible Stylesheet Language (XSL) Version 1.1. Available at: http://www.w3.org/tr/xsl/
[16] Scott Klein, Professional SQL Server 2005 XML, Wiley Publishing, 2006
[17] Mitch Ruebush, Comparing SQL Server 2005 and Oracle 10g as a Database Platform for Microsoft .NET Developers, April 2005
[18] Shelley Higgins, Oracle Application Developer's Guide - XML, 10g (9.0.4), Part No. B12099-01, Oracle Corporation, 2003
[19] Geoff Lee, Mastering XML DB Queries in Oracle 10g, Release 2, Oracle Corporation, March 2006
[20] Igor Dayen, Storing XML in Relational Databases, June 20, 2001
[21] Srinivas Sampath, Beginning SQL Server 2005 XML Programming, 21 February 2006
[22] IBM Redbooks, DB2 9: Indexing XML documents with DB2 9 pureXML
[23] IBM Redbooks, XML for DB2 Information Integration, SG24-6994
[24] SQL/XML. Available at: http://www.stylusstudio.com/sqlxml_tutorial.html
[25] INCITS H2.3. Available at: http://www.incits.org/
[26] Sybase. Available at: www.sybase.com/
[27] DataDirect Technologies. Available at: www.datadirect.com/