04 XML Schemas. Software Technology 2. MSc in Communication Sciences 2009-10 Program in Technologies for Human Communication Davide Eynard





2 XML: recap and evaluation
During the last lesson we saw the basics of XML:
- Tree structure
- Elements and attributes
- Content vs presentation
And the basics of XML evaluation:
- Well-formedness (just syntax)
- Validity with respect to a schema

3 Why do we need a schema?
XML can be used to describe many different kinds of data, and it is totally unaware of what the data is about.
You can check that the syntax is right, but you cannot constrain its usage in any way! Example:
    <person>
      <firstname>John</firstname>
      <lastname>Doe</lastname>
      <SSN>123-45-6789</SSN>
      <SSN>987-65-4321</SSN>
    </person>
SSN should be unique, but a simple syntax check would not find the error in this code.

4 What does a schema do?
A schema allows you to define all the elements and attributes that can be used inside an XML document. Moreover, you can add constraints specifying:
- Which elements are the children of a particular element
- In which order they appear
- How many children an element can have
- Whether the element is empty or contains text
- Datatypes for elements and attributes
- Default values for elements and attributes
Given this information, an XML document can be validated against a schema:
- The document is valid if it is well-formed and it follows the structure given in the schema
- Validation can be done automatically by any tool that understands the schema language
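As a rough illustration of such constraints, the person example from the previous slide could be paired with a small DTD (one of the schema languages introduced in the following slides). This is only a sketch: the content model is an assumption about what the data is meant to look like, namely exactly one SSN per person.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE person [
  <!-- person must contain exactly one firstname, lastname and SSN, in this order -->
  <!ELEMENT person (firstname, lastname, SSN)>
  <!ELEMENT firstname (#PCDATA)>
  <!ELEMENT lastname (#PCDATA)>
  <!ELEMENT SSN (#PCDATA)>
]>
<person>
  <firstname>John</firstname>
  <lastname>Doe</lastname>
  <SSN>123-45-6789</SSN>
</person>
```

A validating parser should accept this document and reject a variant with a second SSN element, even though both variants are well-formed.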

5 DTD vs XML Schema
There are many ways of defining the structure of an XML document, e.g. DTD, XML Schema, RELAX NG, Schematron, ...
DTD and XML Schema are the most widely used, but XML Schema (a W3C Recommendation since 2001) is having more success because:
- it is written in XML syntax
- it supports datatypes
- it supports namespaces
- it supports inheritance and datatype extension

6 Using a DTD
Writing a DTD is just as easy as writing any other text file, but how can we use a DTD?
- How can we say that a file should follow a schema?
- How can we use this information to validate the file?
To associate a document with a DTD, we add the following to the XML prolog:
    <!DOCTYPE rootelement SYSTEM "dtdlocation">
To test it, we can use online validators or validating editors.
Example:
    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE messages SYSTEM "./messages.dtd">
    <messages>
      <message msgid="1">
        <from>...

7 DTD Elements 1
An element can be declared in one of the following ways:
    <!ELEMENT element-name category>
    <!ELEMENT element-name (element-content)>
- Empty elements (category = EMPTY):
    <!ELEMENT br EMPTY>
    <br/>
- Elements containing only a sequence of characters:
    <!ELEMENT element-name (#PCDATA)>
- Elements containing any mixture of text and other elements:
    <!ELEMENT element-name ANY>
- Elements containing one or more child elements:
    <!ELEMENT element-name (child1, child2, ...)>
  The children must follow the specified order!

8 DTD Elements 2
Example:
    <text-message>
      <from>+393357654321</from>
      <to>+393471234567</to>
      <text>Hi there!</text>
    </text-message>

    <!ELEMENT text-message (from, to, text)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT text (#PCDATA)>

9 DTD Elements 3
Disjunction:
    <!ELEMENT email_header (cc | bcc)>
    <!ELEMENT cc (#PCDATA)>
    <!ELEMENT bcc (#PCDATA)>
We can use disjunction to allow subelements in any order:
    <!ELEMENT email_header ((from,to) | (to,from))>
... but what if we have 10 subelements? Every permutation would have to be listed.
Cardinality:
    <!ELEMENT email_header (from,to+,cc*,subject)>
- ? zero times or once
- * zero or more times
- + one or more times
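To see the cardinality operators at work, the following sketch embeds the content model (from,to+,cc*,subject) as an internal DTD subset and pairs it with a document that should validate against it. The e-mail addresses and the subject text are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE email_header [
  <!-- one from, one or more to, zero or more cc, one subject, in this order -->
  <!ELEMENT email_header (from, to+, cc*, subject)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT cc (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
]>
<email_header>
  <from>alice@example.org</from>
  <to>bob@example.org</to>
  <to>carol@example.org</to>
  <subject>Meeting</subject>
</email_header>
```

Here cc is omitted (allowed by *) and to appears twice (allowed by +). Removing every to element, or moving subject before from, should make the document invalid.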

10 DTD Attributes 1
Attributes are defined in a DTD with an attribute list:
    <!ATTLIST element-name attr-name attr-type value-type>
For each attribute you have to define:
- The name of the element it is related to
- Its name
- Its type
- Its value type

11 DTD Attributes 2
Example:
    <text-message from="+393357654321">
      <to>+393471234567</to>
      <text>Hi there!</text>
    </text-message>

    <!ELEMENT text-message (to, text)>
    <!ATTLIST text-message from CDATA #REQUIRED>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT text (#PCDATA)>

12 DTD Attributes 3
Attribute types:
- CDATA, a string
- ID, a name that is unique across the XML document
- IDREF, a reference to another element with an ID attribute
- IDREFS, a sequence of IDREFs
- (v1 | ... | vn), an enumeration of all possible values, e.g. weekday (monday | tuesday | ... | sunday)
Limitations:
- No dates
- No numbers
- No booleans
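A minimal sketch of ID and IDREF working together is shown below. The replyto attribute and the message texts are hypothetical and only serve the illustration. Note that ID values must be XML names, so they cannot start with a digit (hence m1 rather than 1).

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE messages [
  <!ELEMENT messages (message+)>
  <!ELEMENT message (#PCDATA)>
  <!-- msgid must be unique in the document; replyto must match an existing ID -->
  <!ATTLIST message msgid   ID    #REQUIRED
                    replyto IDREF #IMPLIED>
]>
<messages>
  <message msgid="m1">Hi there!</message>
  <message msgid="m2" replyto="m1">Hello!</message>
</messages>
```

A validating parser should reject the document if two messages share the same msgid, or if replyto points to an ID that does not exist.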

13 DTD Attributes 4
Attribute value types:
- #REQUIRED: the attribute must appear in every occurrence of the element type in the XML document
- #IMPLIED: the appearance of the attribute is optional
- #FIXED "value": the attribute always has this value; if it is written in the document, it must have exactly this value
    <!ATTLIST html xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml'>
- "value": specifies the default value for the attribute
    <!ATTLIST car color (red | white | blue) "red">
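The four value types can be combined in a single attribute list. In this sketch the attributes plate, owner, and wheels are hypothetical additions to the car example; only color comes from the slide.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE car [
  <!ELEMENT car EMPTY>
  <!-- plate is mandatory, owner is optional,
       wheels is always "4", color defaults to "red" -->
  <!ATTLIST car
    plate  CDATA                #REQUIRED
    owner  CDATA                #IMPLIED
    wheels CDATA                #FIXED "4"
    color  (red | white | blue) "red">
]>
<car plate="AB123CD"/>
```

The document should be valid although only plate is written explicitly: a validating parser supplies wheels="4" and color="red", and owner is simply absent.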

14 From DTD to XML Schema
Main differences between DTD and XML Schema:
- XML Schema's syntax is based on XML itself (you can use the same tools for XML documents and schemas!)
- It allows the reuse of existing schemas (inheritance) and their refinement (extension)
- It supports more specific datatypes
- It supports namespaces
Note: XML Schema is also called XML Schema Definition (XSD)

15 Namespaces
Element names in XML files are chosen by the developers. What if two developers use the same name for different kinds of elements?
Example:
    <table>
      <tr>
        <td>Apples</td>
        <td>Bananas</td>
      </tr>
    </table>

    <table>
      <name>African Coffee Table</name>
      <width>80</width>
      <length>120</length>
    </table>

16 Namespaces definition
We need a way to specify that element names come from two different contexts:
- we put a prefix before element names
- we specify which namespace that prefix represents
    <h:table xmlns:h = "http://www.w3.org/TR/html4/">
      <tr>
        <td>Apples</td>
        <td>Bananas</td>
      </tr>
    </h:table>

    <f:table xmlns:f = "http://my.name.space/furniture/">
      <name>African Coffee Table</name>
      <width>80</width>
      <length>120</length>
    </f:table>
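One detail worth spelling out: a prefix applies only to the element it is written on. In the example above only the table elements carry a prefix, so children such as name or width are in no namespace at all unless a default namespace is in scope. A sketch of the furniture table with every element placed in the furniture namespace could look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<f:table xmlns:f="http://my.name.space/furniture/">
  <!-- each child repeats the prefix, so it belongs to the same namespace -->
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
```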

17 Root and default namespaces
You can also declare all the namespaces you are going to use in the root element of your XML document:
    <root xmlns:h = "http://www.w3.org/TR/html4/"
          xmlns:f = "http://my.name.space/furniture/">
      <h:table>...</h:table>
      <f:table>...</f:table>
    </root>
If the xmlns attribute is not followed by a prefix, the specified namespace is considered the default one:
    <html xmlns="http://www.w3.org/1999/xhtml">
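A default namespace and prefixed namespaces can be mixed in the same document. The sketch below is only meant to show how names are resolved; it is well-formed, but it is not claimed to be valid XHTML.

```xml
<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:f="http://my.name.space/furniture/">
  <head><title>Tables</title></head>
  <body>
    <!-- no prefix: table, tr and td are in the default (XHTML) namespace -->
    <table>
      <tr><td>Apples</td><td>Bananas</td></tr>
    </table>
    <!-- f: prefix: these elements are in the furniture namespace -->
    <f:table>
      <f:name>African Coffee Table</f:name>
      <f:width>80</f:width>
      <f:length>120</f:length>
    </f:table>
  </body>
</html>
```

Unlike prefixes, the default namespace is inherited by all unprefixed descendant elements. It does not apply to attributes.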

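18 Documents using XML Schema
What does the beginning of an XML document that uses an XML Schema look like?
    <?xml version="1.0" encoding="utf-8"?>
    <messages xmlns = "http://my.name.space"
              xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation = "http://my.name.space ./messages.xsd">
      <message msgid="1">
        <from>...
      ...
    </messages>
The value of xsi:schemaLocation is a pair separated by whitespace: the namespace, followed by the location of the schema that describes it.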

19 XML Schema opening tag
An XML schema is an XML document whose root element is called schema and is defined as follows:
    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema"
               targetNamespace = "http://my.name.space"
               xmlns = "http://my.name.space"
               elementFormDefault = "qualified">
The three namespace attributes declare, in order, the source namespace (xmlns:xs, the XML Schema vocabulary itself), the target namespace (targetNamespace), and the default namespace (xmlns).
Notes:
- the xs:schema element is the root of every XML schema
- qualified = associated with a namespace, either by the use of a declared prefix or via a default namespace declaration
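Putting the opening tag to use, a complete (if tiny) schema might look like the sketch below. The greeting element is a hypothetical example and is not part of the messages.xsd mentioned in the slides.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://my.name.space"
           xmlns="http://my.name.space"
           elementFormDefault="qualified">
  <!-- a single global element whose content is a plain string -->
  <xs:element name="greeting" type="xs:string"/>
</xs:schema>
```

An instance document such as <greeting xmlns="http://my.name.space">Hi there!</greeting> should validate against it, because the element is in the schema's target namespace.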

20 The four constructs of XML Schema
XML Schema is built on four constructs:
- A simple type definition defines a family of text strings (Unicode)
- A complex type definition defines a collection of requirements for attributes, sub-elements, and character data
- An element declaration associates an element name with either a simple type or a complex type
- An attribute declaration associates an attribute name with a simple type (attributes always contain unstructured text)

21 XML Schema Elements and Types
To declare an element (the equivalent of <!ELEMENT> in a DTD) you use the element tag:
    <element name="..." />
The most important (optional) attribute is type, as it defines the element's content type:
    <element name="..." type="..." />

22 Cardinality and default values
To change cardinality, you can use the (optional) attributes minOccurs and maxOccurs:
    <element name="from" minOccurs="1" maxOccurs="1" />
    <element name="to" minOccurs="1" maxOccurs="unbounded" />
    <element name="cc" minOccurs="0" maxOccurs="unbounded" />
Note:
- minOccurs="x", where x is an integer >= 0
- maxOccurs="x", where x is an integer >= 0 (and not smaller than minOccurs) or unbounded
- The default is 1 in both cases
Default or fixed values can also be specified:
    <element name="color" type="xs:string" default="red"/>
    <element name="color" type="xs:string" fixed="green"/>
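For comparison with the DTD content model (from,to+,cc*,subject) seen earlier, the same cardinalities could be expressed in XML Schema roughly as follows. The sketch relies on the complexType and sequence constructs, which are covered a few slides later, and assumes plain string content for all four elements.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="email_header">
    <xs:complexType>
      <xs:sequence>
        <!-- minOccurs and maxOccurs default to 1 when omitted -->
        <xs:element name="from" type="xs:string"/>
        <xs:element name="to" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="cc" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="subject" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```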

23 XML Schema Attributes and Types
To declare an attribute, use the attribute tag (very similar to the element one):
    <attribute name="..." />
Like elements, attributes can have types, default values, and fixed values:
    <attribute name="..." type="..." />
    <attribute name="color" type="xs:string" default="red"/>
Attributes are optional by default. You can use the use attribute to make them required:
    <attribute name="color" type="xs:string" use="required"/>
Note: attributes can be attached to an element only within a complex type (see later)
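As a sketch of where an attribute declaration actually lives, here is a possible XML Schema counterpart of the text-message DTD from the DTD Attributes 2 slide. Typing everything as xs:string is an assumption; a stricter schema could restrict the phone numbers further.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="text-message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="text" type="xs:string"/>
      </xs:sequence>
      <!-- attribute declarations follow the content model;
           use="required" plays the role of #REQUIRED in the DTD -->
      <xs:attribute name="from" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```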

24 XML Schema built-in data types

25 Simple derived types
Derived datatypes (such as integer) are built from the original ones using:
- Restrictions
- Lists
- Unions
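The three derivation mechanisms can be sketched as follows. All type names (SSNType, IntegerListType, OccursType) are hypothetical, and the SSN pattern simply mirrors the layout of the values in the person example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- restriction: strings shaped like 123-45-6789 -->
  <xs:simpleType name="SSNType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- list: whitespace-separated integers, e.g. "80 120 45" -->
  <xs:simpleType name="IntegerListType">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>

  <!-- union: either a non-negative integer or the word "unbounded" -->
  <xs:simpleType name="OccursType">
    <xs:union>
      <xs:simpleType>
        <xs:restriction base="xs:nonNegativeInteger"/>
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="unbounded"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>
</xs:schema>
```

Restrictions also cover the cases that DTD attribute types could not express, such as numeric ranges, through facets like minInclusive and maxInclusive.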

26 Complex types
Complex types are used to define elements that contain attributes, text, other elements, or any combination of these.
They are built using the following operators:
- Element references, such as <element ref="name"/>
- Concatenation, using the sequence element
- Union, using the choice element
- The all element (like sequence, but unordered)
- The any construct
- The group element (to allow references to item groups)
- The minOccurs and maxOccurs attributes, to define cardinalities
- The mixed (boolean) attribute, to allow mixed content
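A few of these operators are sketched below. The type names are hypothetical, and the element names are borrowed from the earlier e-mail examples. Note how all addresses the "what if we have 10 subelements?" problem of DTD disjunctions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- choice: exactly one of cc or bcc -->
  <xs:complexType name="CopyType">
    <xs:choice>
      <xs:element name="cc" type="xs:string"/>
      <xs:element name="bcc" type="xs:string"/>
    </xs:choice>
  </xs:complexType>

  <!-- all: from and to must both appear, in any order -->
  <xs:complexType name="EndpointsType">
    <xs:all>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="to" type="xs:string"/>
    </xs:all>
  </xs:complexType>

  <!-- mixed: character data may be interleaved with em children -->
  <xs:complexType name="ParagraphType" mixed="true">
    <xs:sequence>
      <xs:element name="em" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```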

27 An example
    <xsd:complexType name="TeacherType">
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="lastname" type="xsd:string"/>
      </xsd:sequence>
      <xsd:attribute name="title" type="xsd:string" use="optional"/>
    </xsd:complexType>
    <xsd:element name="teacher" type="TeacherType"/>

    <teacher title="Ph.D.">
      <firstname>Davide</firstname>
      <lastname>Eynard</lastname>
    </teacher>

28 Schema extension
- Modularization is allowed by the following three constructs:
    <include schemaLocation="URI"/>
    <import namespace="NS" schemaLocation="URI"/>
    <redefine schemaLocation="URI"> ... </redefine>
- Inheritance and extensions
- Restrictions
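Inheritance by extension can be sketched by deriving a new type from the TeacherType of the previous example. LecturerType, lecturer, and course are hypothetical names introduced only for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- base type, as in the earlier example -->
  <xsd:complexType name="TeacherType">
    <xsd:sequence>
      <xsd:element name="firstname" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="lastname" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="title" type="xsd:string" use="optional"/>
  </xsd:complexType>

  <!-- extension: everything in TeacherType, followed by a course element -->
  <xsd:complexType name="LecturerType">
    <xsd:complexContent>
      <xsd:extension base="TeacherType">
        <xsd:sequence>
          <xsd:element name="course" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:element name="lecturer" type="LecturerType"/>
</xsd:schema>
```

A lecturer element would then contain the inherited firstname and lastname children first, followed by course, and could still carry the optional title attribute.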

29 Limitations of XML Schema
XML Schema is much more powerful and expressive than DTDs; however, it still has some limitations:
- Too difficult for non-experts (problem: non-experts need to read the schema to write valid XML documents!)
- Element and attribute declarations are context insensitive
- Although XML Schema is itself written in XML, there is still no XML Schema that completely describes the XML Schema language
- Technical limitations:
  - When describing mixed content, the character data cannot be constrained in any way
  - A schema cannot enforce a particular root element
  - Element defaults cannot contain markup, only character data
- ... and many others

30 References
Some Web references:
- G. Antoniou and F. van Harmelen. A Semantic Web Primer, The MIT Press, 2004. Chapter 2 slides: http://www.ics.forth.gr/isl/swprimer
- M.C. Daconta, L.J. Obrst, and K.T. Smith. The Semantic Web, Wiley, 2003. Chapter 3 online: http://www.wiley.com/legacy/compbooks/daconta/sw/sample.html
- W3 School website, in particular http://www.w3schools.com/dtd and http://www.w3schools.com/schema
- Anders Møller and Michael I. Schwartzbach. An Introduction to XML and Web Technologies, Addison-Wesley, 2006. Chapter 4 (Schema Languages) online: http://www.brics.dk/ixwt
- Examples from Elizabeth Castro, XML for the World Wide Web: Visual Quickstart Guide, Peachpit Press, 2000. http://www.cookwood.com/xml
Tools:
- XML Validation Services: http://www.stg.brown.edu/service/xmlvalid, http://validator.w3.org
- XML Copy Editor, a free (as in freedom), multiplatform editor which supports validation: http://xml-copy-editor.sourceforge.net
- Validator, a free (as in freedom), multiplatform, drag and drop XML validator which works on Windows, Linux, and Mac OS X: http://homepage.mac.com/rcrews/software/validator/