XML and Data Management

XML standards:
  XML
  DTD, XML Schema
  DOM, SAX, XPath
  XSL
  XQuery, ...

Databases and Information Systems 1 - WS 2005 / 06 - Prof. Dr. Stefan Böttcher

Overview of internet technologies for document management and archiving

[Figure: overview diagram with the labels server technologies, database coupling, XML+XSL, pure HTML, document languages, client technologies]

Data-centric XML - XML data storage

  <doc>                                                          (opening tag)
    <Auftrag> <Kunde> Arm </Kunde> <PC> pc400 </PC> </Auftrag>
    <Auftrag> <Kunde> Meier </Kunde> <PC> pc500 </PC> </Auftrag>
    <Auftrag> <Kunde> Reich </Kunde> <PC> pc500 </PC> </Auftrag>
  </doc>                                                         (closing tag)

doc
  %        Kunde  PC
  auftrag( Arm    pc400 ).
  auftrag( Meier  pc500 ).
  auftrag( Reich  pc600 ).

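The mapping from a data-centric XML document to tuples can be sketched in a few lines. This is a minimal illustration using Python's standard `xml.etree.ElementTree`; the function name `orders` and the constant `DOC` are assumptions made for this sketch, and the XML is the document shown on the slide.

```python
import xml.etree.ElementTree as ET

# The XML document from the slide (element names kept in German).
DOC = """<doc>
  <Auftrag> <Kunde> Arm </Kunde> <PC> pc400 </PC> </Auftrag>
  <Auftrag> <Kunde> Meier </Kunde> <PC> pc500 </PC> </Auftrag>
  <Auftrag> <Kunde> Reich </Kunde> <PC> pc500 </PC> </Auftrag>
</doc>"""


def orders(xml_text):
    """Return one (Kunde, PC) tuple per <Auftrag> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        (a.findtext("Kunde").strip(), a.findtext("PC").strip())
        for a in root.findall("Auftrag")
    ]


if __name__ == "__main__":
    # Print each order in the fact notation used on the slide.
    for kunde, pc in orders(DOC):
        print(f"auftrag( {kunde} {pc} ).")
```

Each `Auftrag` element plays the role of one row, and its child elements play the role of columns.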
Extensible Markup Language (XML)

XML is a family of standards:
  XML (Extensible Markup Language)
    - data format exchangeable across different operating systems, applications, and enterprises
    - often used for content
  XPath
    - path expressions used for navigation in XML trees
    - used within other XML standards (e.g. XSL)
  XSL (Extensible Stylesheet Language)
    - used to describe the layout of content and to convert data
  Many more standards: XQuery (queries), DTD (type definition), XML Schema (integrity constraints)

Unique Standard for Content

DTD or XML Schema:
  - defines the structure of all XML trees exchanged
    => one common data format for all participants
  - data formats exchangeable across company borders
New data exchange formats and languages based on XML:
  - example: ebXML (E-Business XML) as a basis for OTA (OpenTravel Alliance),
    data exchange between travel agency, airline, etc.
Consequence of these standards: (economic) pressure to use the standard

Separation of content and layout

[Figure: content files (product1.xml, product2.xml) and layout files (customer1.xsl, technican2.xsl) are combined into an HTML file]

The HTML file combines the requested data with the requested layout.

Separation of content and layout (2)

Consequences:
  - one (content) data source serves different layouts (technician, seller, customer, re-seller, ...)
  - the layout may change without changing the content (different logo, different seller or customer, different employee or job, new view of the data)
  - one layout can be reused for different content (frame with company logo, ...)
  - the content may change without changing the layout (new prices, ...)

XML on Java servers

XML + XSL separate layout and content:
  - layout (.xsl file)
  - content data (.xml file)
  - the web server combines them

[Figure: the browser (client) calls a servlet (server); the servlet takes the XML file and the XSL file as input, transforms XML + XSL into HTML, and returns the generated HTML page]

XML document as a data storage

  <doc>                                                          (opening tag)
    <Auftrag> <Kunde> Arm </Kunde> <PC> pc400 </PC> </Auftrag>
    <Auftrag> <Kunde> Meier </Kunde> <PC> pc500 </PC> </Auftrag>
    <Auftrag> <Kunde> Reich </Kunde> <PC> pc500 </PC> </Auftrag>
  </doc>                                                         (closing tag)

doc
  %        Kunde  PC
  auftrag( Arm    pc400 ).
  auftrag( Meier  pc500 ).
  auftrag( Reich  pc600 ).

XML syntax

XML prolog:
  <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
      version, character set, standalone="yes": without DTD
  <?xml-stylesheet type="text/xsl" href="xmlbsp1.xsl"?>
      stylesheet to be used (only inside IE5)

XML main part:
  <Auftrag>                       element: start tag
    <Kunde> meier </Kunde>        " meier " is a text node
    <PC> pc500 </PC>
  </Auftrag>                      end tag

XML syntax (2)

In the XML main part:
  <Angebote>
    <Liefert wer="vobis" teil="pc500"> </Liefert>     element without a text node
    <Liefert wer="IBM" teil="pc600" />                "/>" ends the tag (no text)
  </Angebote>

  wer, teil: attributes
  "vobis", "pc500", "IBM", "pc600": attribute values
  An element may optionally have no text node.

XML syntax (3)

  - all tags must be closed ( <tag> ... </tag> or <singletag /> )
  - incorrectly nested tags are not allowed ( <tag1> <tag2> ... </tag1> </tag2> )
  - names are case-sensitive ( <tag> is different from <Tag> )
  - attribute values must be quoted ( e.g. <p align="center"> )
  - text must be enclosed in elements

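The well-formedness rules above can be tried out with any XML parser. The following sketch uses Python's standard `xml.etree.ElementTree`, which rejects input that is not well-formed by raising `ParseError`; the helper name `is_well_formed` and the sample strings are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; the label says what each sample demonstrates.
SAMPLES = {
    "closed tag pair": "<tag>text</tag>",
    "empty-element tag": "<singletag />",
    "unclosed tag": "<tag>text",
    "incorrect nesting": "<tag1><tag2>text</tag1></tag2>",
    "case mismatch": "<tag>text</Tag>",
    "unquoted attribute value": "<p align=center>text</p>",
    "text outside the root element": "<p>text</p> trailing text",
}

if __name__ == "__main__":
    for label, sample in SAMPLES.items():
        print(f"{label}: {is_well_formed(sample)}")
```

Only the first two samples are accepted; every other sample violates one of the rules listed on the slide.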
XML document as a tree

  <doc>
    <Kunde name="meier">
      <Auftrag> ... </Auftrag>
      <Adresse> </Adresse>
    </Kunde>
    <Kunde>
      <Auftrag/>
      <Adresse/>
    </Kunde>
  </doc>

Tree:
  doc
    Kunde (name = meier)
      Auftrag
      Adresse
    Kunde
      Auftrag
      Adresse

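The tree view of a document can be reproduced by walking the parsed structure recursively. This is a small sketch with Python's standard `xml.etree.ElementTree`; the function name `outline` and the `@name=value` notation for attributes are assumptions of this example, and attribute values are quoted as XML requires.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <Kunde name="meier">
    <Auftrag> ... </Auftrag>
    <Adresse> </Adresse>
  </Kunde>
  <Kunde>
    <Auftrag/>
    <Adresse/>
  </Kunde>
</doc>"""


def outline(elem, depth=0):
    """Return the element tree as a list of lines, indented by nesting depth."""
    attrs = "".join(f" @{key}={value}" for key, value in elem.attrib.items())
    lines = ["  " * depth + elem.tag + attrs]
    for child in elem:  # iterating over an element yields its child elements
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
```

Note that ElementTree stores text as a property of elements rather than as separate text nodes, so only element and attribute information appears in the outline.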
XML node types

7 kinds of nodes:
  root                     - has no parent node
  element
  text                     - leaf node (has no child node)
  attribute                - leaf node (has no child node)
  comment                  - leaf node (has no child node)
  namespace                - leaf node (has no child node)
  processing-instruction   - leaf node (has no child node)

DTD and XML Schema

DTD (the older standard):
  + defines the structure (nesting of tags) of the documents, e.g. <kunde> <auftrag> <teil>
  + defines structural dependencies, e.g. every auftrag contains at least one teil element

XML Schema (the newer standard) additionally:
  + binds XML elements to types defined in the XML Schema
  + defines domains
  + defines integrity constraints

Document Type Definition (DTD)

  <!-- DTD xmlbsp2d.dtd for example xmlbsp2d.xml -->
  <!ELEMENT Auftraege (Auftrag)* >       root element; * = arbitrarily many
  <!ELEMENT Auftrag ( Kunde, PC ) >      sequence; both sub-elements required
  <!ELEMENT Kunde (#PCDATA) >            parsed character data
  <!ELEMENT PC (#PCDATA) >

  <?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
  <!DOCTYPE Auftraege SYSTEM "xmlbsp2d.dtd">
  <?xml-stylesheet type="text/xsl" href="xmlbsp2.xsl"?>
  <Auftraege>
    <Auftrag>
      <Kunde>Meier</Kunde>
      <PC>pc500</PC>
    </Auftrag>
    <Auftrag> ... </Auftrag>
  </Auftraege>

Element declarations in DTDs

  <!ELEMENT PC (#PCDATA) >               text (no elements)
  <!ELEMENT Liefert EMPTY >              empty
  <!ELEMENT Angebot (Liefert) >          1 sub-element
  <!ELEMENT Angebote (Liefert)* >        arbitrarily many sub-elements
  <!ELEMENT Auftrag (Kunde,PC) >         sequence
  <!ELEMENT Zahlung (Bar | Karte) >      choice
  <!ELEMENT E ((A | B)*,C,(D)?)+ >       parentheses

  ?   0 or 1
  *   arbitrarily many
  +   at least 1

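DTD content models behave like regular expressions over the names of child elements: `,` is a sequence, `|` a choice, and `?`, `*`, `+` are the usual repetition operators. The sketch below makes this analogy concrete by translating a content model into a Python regular expression. It is an illustration only, not how a validating parser is implemented; the function names `model_to_regex` and `matches` are assumptions, and `#PCDATA`, `EMPTY`, `ANY` and mixed content are deliberately not handled.

```python
import re

# A token is an element name or one of the operators ( ) , | ? * +
TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[(),|?*+]")


def model_to_regex(model):
    """Translate a DTD content model of element names into a compiled regex.

    Each child element is encoded as its name followed by one space, so a
    list of children becomes a string such as "Kunde PC ".
    """
    parts = []
    for tok in TOKEN.findall(model):
        if tok == ",":
            continue  # a sequence is plain concatenation in a regex
        if tok[0].isalpha() or tok[0] == "_":
            parts.append(f"(?:{re.escape(tok)} )")  # one child element
        elif tok == "(":
            parts.append("(?:")  # non-capturing group
        else:
            parts.append(tok)  # ')', '|', '?', '*', '+' carry over unchanged
    return re.compile("".join(parts))


def matches(model, children):
    """Return True if the list of child element names satisfies the model."""
    encoded = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None


if __name__ == "__main__":
    print(matches("(Kunde,PC)", ["Kunde", "PC"]))            # sequence in order
    print(matches("(Bar | Karte)", ["Bar", "Karte"]))        # choice allows only one
    print(matches("((A | B)*,C,(D)?)+", ["A", "B", "C", "D", "C"]))
```

Terminating every name with a space keeps the encoding unambiguous, because element names themselves cannot contain spaces.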
Attribute declarations in DTDs

  <!-- DTD xmlbsp2d.dtd for the example xmlbsp2d.xml -->
  <!ELEMENT Angebote (Liefert)* >        root element; arbitrarily many
  <!ELEMENT Liefert EMPTY >              empty
  <!ATTLIST Liefert
            wer  CDATA #REQUIRED         attribute type (character data); must occur
            teil CDATA #REQUIRED >

  <Angebote>
    <Liefert wer="vobis" teil="pc500"></Liefert>
    <Liefert wer="IBM" teil="pc600" />
  </Angebote>

Axes in XML document trees

XML document tree:
  doc
    Kunde (name = meier)
      Auftrag
      Adresse
    Kunde
      Auftrag
      Adresse

Axes:
  child axis:       /child::doc/child::Kunde/child::Auftrag      short form: /doc/Kunde/Auftrag
  attribute axis:   /child::doc/child::Kunde/attribute::name     short form: /doc/Kunde/@name

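The two paths can be tried out with Python's standard `xml.etree.ElementTree`, with two caveats that are specific to that library and not to XPath itself: it supports only a subset of XPath, and paths are evaluated relative to an element rather than to the document root. In this sketch the root element `doc` is therefore the starting point, the child-axis path is written as `./Kunde/Auftrag`, and the attribute step is expressed with `get("name")`. The variable names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <Kunde name="meier"><Auftrag/><Adresse/></Kunde>
  <Kunde><Auftrag/><Adresse/></Kunde>
</doc>"""

doc = ET.fromstring(DOC)  # the root element <doc>

# child axis: /doc/Kunde/Auftrag, evaluated relative to <doc>
auftraege = doc.findall("./Kunde/Auftrag")

# attribute axis: /doc/Kunde/@name
# [@name] keeps only Kunde elements that have the attribute; get() reads its value.
names = [kunde.get("name") for kunde in doc.findall("./Kunde[@name]")]

if __name__ == "__main__":
    print(len(auftraege), "Auftrag elements")
    print("name attributes:", names)
```

A full XPath 1.0 engine would accept the long and short forms from the slide directly.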
Axes in XML document trees (2)

[Figure: document tree with the nodes doc, Kunde, Auftrag, Adresse, PC, Handbuch and the attribute @nr, annotated with the axes ancestor, ancestor-or-self, parent, self, child, descendant, descendant-or-self, following, following-sibling and attribute]

Axes in XML document trees (3)

  <doc>
    <Kunde> </Kunde>
    <Kunde>
      <name> </name>
      <Auftrag> self::... </Auftrag>
      <Adresse> </Adresse>
    </Kunde>
    <Kunde> </Kunde>
  </doc>

[Figure: the same document as a tree (doc with three Kunde children; the second Kunde has the children name, Auftrag, Adresse), with the regions ancestor::, preceding::, descendant:: and following:: marked relative to the context node Auftrag]

Axes in XML document trees (4)

The following axes select, for a given context node:
  child::               its child nodes
  descendant::          its descendants (= children and their descendants)
  parent::              the parent node (only the root does not have a parent)
  ancestor::            the nodes on the path to the root (= parent and its ancestors)
  following-sibling::   siblings (nodes with the same parent) that follow in document order
                        (empty for attribute and namespace nodes)
  preceding-sibling::   the inverse of following-sibling
                        (empty for attribute and namespace nodes)
  following::           all nodes that follow the context node in document order
                        (excluding descendant, attribute and namespace nodes)
  preceding::           all nodes that precede the context node in document order
                        (excluding ancestor, attribute and namespace nodes)
  attribute::           its attributes (empty for each non-element node)
  namespace::           its namespace nodes (empty for each non-element node)

Axes in XML document trees (5)

The following axes select, for a given context node:
  self::                 the context node itself
  descendant-or-self::   the context node and its descendants
  ancestor-or-self::     the context node and its ancestors

When attribute nodes and namespace nodes are ignored, the following holds for every context node of a document:
the axes ancestor::, descendant::, following::, preceding:: and self:: partition the document completely, i.e., the selected node sets do not overlap, and their union contains all nodes of the document.

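The partition property can be checked on a small document. The sketch below computes the five axes by hand on top of Python's standard `xml.etree.ElementTree`, under two simplifying assumptions: only element nodes are considered (ElementTree does not model text nodes or the XPath root node as separate nodes), and the function name `axes` and the sample document are made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <Kunde/>
  <Kunde><name/><Auftrag><PC/></Auftrag><Adresse/></Kunde>
  <Kunde/>
</doc>"""


def axes(root, context):
    """Return the element nodes selected by five axes for the context node."""
    order = list(root.iter())  # all elements in document order
    parent = {child: p for p in order for child in p}  # child -> parent

    ancestors = []
    node = context
    while node in parent:  # climb until the root element is reached
        node = parent[node]
        ancestors.append(node)

    descendants = list(context.iter())[1:]  # iter() yields the context first
    pos = order.index(context)
    return {
        "self": [context],
        "ancestor": ancestors,
        "descendant": descendants,
        # everything earlier in document order that is not an ancestor
        "preceding": [n for n in order[:pos] if n not in ancestors],
        # everything later in document order that is not a descendant
        "following": [n for n in order[pos + 1:] if n not in descendants],
    }


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    context = root.find("./Kunde/Auftrag")
    for axis, nodes in axes(root, context).items():
        print(f"{axis}::", [n.tag for n in nodes])
```

For the context node `Auftrag`, every element of the document appears in exactly one of the five lists, which is the partition stated on the slide.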
XML Schema example (1)

  <xsd:element name="address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="fullname" maxOccurs="1">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="firstname"/>
              <xsd:element name="lastname"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:choice>
          <xsd:element name="street"/>
          <xsd:element name="pob"/>
        </xsd:choice>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

XML Schema example (2)

  <xsd:element name="shipto" type="coaddress"/>

  <xsd:complexType name="address">
    <xsd:sequence>
      <xsd:element name="fullname"/>
      <xsd:element name="street"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="coaddress">
    <xsd:complexContent>
      <xsd:extension base="address">
        <xsd:sequence>
          <xsd:element name="countrycode"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

XML summary

  XML:          tree structure for content
  DTD:          structure definition
  XML Schema:   additionally type checking and logical consistency checking

Well-documented standards: http://www.w3c.org