Structured Data and Visualization

Structured Data

A language to describe the structure of documents:

    <element name="course" type="courseT"/>
    <complexType name="courseT">
      <sequence>
        <element name="CD"/>
      </sequence>
      <attribute name="name" type="string"/>
    </complexType>

A language to describe data that has this structure:

    <course name="SDV">
      <CD> </CD>
      <S> </S>
      <T> </T>
    </course>

Programming Language Support

Schemas become types.
XML documents become values.
Parsers and validators.

To write programs that deal with course documents: represent course documents in a programming language!

    interface Course {
        public String getName();
        public void setName(String value);
        public CDT getCD();
        public void setCD(CDT value);
        ...
    }

Could be generated from the schema for course documents!

Programs that deal with course documents:

    A course composer
    Generate XML from strings input by the user
    An e-learning platform
    A course document for a given course
    A web site generator
    Read an XML document
    A result generator
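The pairing of a schema with a document that should conform to it can be tried out with the JDK's own javax.xml.validation package, before any binding tool is involved. The sketch below is only an illustration: the class name CourseSchemaCheck, the xs: prefix, the string type given to CD, and the reduced content model are assumptions made to get a complete schema, not the schema used in the course.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class CourseSchemaCheck {
    // Assumption: a simplified, self-contained stand-in for the course schema.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='course' type='courseT'/>"
      + "<xs:complexType name='courseT'>"
      + "<xs:sequence><xs:element name='CD' type='xs:string'/></xs:sequence>"
      + "<xs:attribute name='name' type='xs:string'/>"
      + "</xs:complexType>"
      + "</xs:schema>";

    // Returns true if the document is valid with respect to SCHEMA.
    // Any SAXException (malformed or invalid input) is reported as false.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A document that follows the schema.
        System.out.println(isValid("<course name='SDV'><CD>text</CD></course>"));
        // A document with an element the schema does not declare.
        System.out.println(isValid("<course name='SDV'><unknown/></course>"));
    }
}
```

The validator only answers yes or no here; it does not give the program a typed value to work with, which is the gap the generated interfaces are meant to fill.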
Programming Language Support

Generate types (interfaces and classes in Java) from a schema.
Support for generating XML from a value (an object in Java): the XML document will be valid with respect to the schema used!
Support for generating a value from a valid XML document: parsing, including validation!

In Java: JAXB

A library for binding XML documents to Java objects.
A number of packages:

    javax.xml.bind
    javax.xml.parsers
    javax.xml.bind.util

A compiler to generate interfaces and classes from a schema: xjc.sh (on Unix), xjc.bat (on Windows).

    prompt> YOURJAXBPATH/xjc.sh coursedoc.xsd

Creates the appropriate directories and generates Java code: interfaces and classes corresponding to the elements and types in the schema.

XML namespaces become Java packages:

    org.coursedoc

plus a package for implementations:

    org.coursedoc.impl

Declared complex types

    <complexType name="courseT">
      <sequence> </sequence>
      <attribute name="name" type="string"/>
    </complexType>

become interfaces in the proper package:

    package org.coursedoc;
    public interface CourseT {
        String getName();
        void setName(String value);
        ...
    }
And classes implementing these:

    package org.coursedoc.impl;
    public class CourseTImpl implements org.coursedoc.CourseT { ... }

Experiment yourself to see what happens when you use an anonymous type!
What happens when you declare a simple type?

Declared elements

    <element name="course" type="courseT"/>

become interfaces in the proper package:

    package org.coursedoc;
    public interface Course extends CourseT { ... }

and implementing classes:

    package org.coursedoc.impl;
    public class CourseImpl implements org.coursedoc.Course { ... }

One thing to be aware of is the treatment of minOccurs and maxOccurs when they are not the default 1.

    <complexType name="courseT">
      <sequence>
        <element name="teacher" type="teacherT" minOccurs="1" maxOccurs="unbounded"/>
      </sequence>
    </complexType>

    public interface CourseT {
        java.util.List getTeacher();
        ...
    }

There is no setTeacher! Use add on the list!

Also produced

A class implementing the abstract class javax.xml.bind.JAXBContext (see under org/coursedoc/impl/runtime!).
It has control over a grammar for the language you defined with the XML schema.
It has a method returning a parser for this grammar that generates internal representations from a valid XML document.
It has a method returning a serializer that produces valid XML from an internal representation.
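The remark about minOccurs and maxOccurs (a list getter and no setter) can be tried without running xjc. The following is a hand-written stand-in, not generated code: the nested types, the getName method on TeacherT, and the teacher names are hypothetical, and it uses a generic List<TeacherT> where the generated interface above exposes a raw java.util.List.

```java
import java.util.ArrayList;
import java.util.List;

class TeacherListDemo {
    // Hypothetical stand-in for a generated teacher type.
    interface TeacherT {
        String getName();
    }

    // Mirrors the shape described above: a getter for the list, no setter.
    interface CourseT {
        List<TeacherT> getTeacher();
    }

    static class CourseTImpl implements CourseT {
        private List<TeacherT> teacher;

        // Returns the live list; callers modify the course through it.
        public List<TeacherT> getTeacher() {
            if (teacher == null) {
                teacher = new ArrayList<>();
            }
            return teacher;
        }
    }

    // Adds one teacher per name and returns the resulting list size.
    static int addTeachers(CourseT course, String... names) {
        for (String name : names) {
            course.getTeacher().add(() -> name);
        }
        return course.getTeacher().size();
    }

    public static void main(String[] args) {
        CourseT course = new CourseTImpl();
        System.out.println(addTeachers(course, "Ann", "Bo")); // prints 2
    }
}
```

Because the getter hands out the list itself rather than a copy, adding to it is enough to change the course, which is why no setter is needed.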
javax.xml.bind: Unmarshal

Taking an XML document into a Java program:

    javax.xml.bind.JAXBContext jc =
        javax.xml.bind.JAXBContext.newInstance("org.coursedoc");
    javax.xml.bind.Unmarshaller u = jc.createUnmarshaller();
    u.setValidating(true);
    Course c = (Course) u.unmarshal(xmlFile);

Parse the file according to the rules in the schema!
Generate an instance of CourseImpl (with all subcomponents)!

javax.xml.bind: Marshal

Producing XML from a Java program:

    javax.xml.bind.JAXBContext jc =
        javax.xml.bind.JAXBContext.newInstance("org.coursedoc");
    javax.xml.bind.Marshaller m = jc.createMarshaller();
    Course sdv = new org.coursedoc.impl.CourseImpl();
    java.io.OutputStream os = new java.io.FileOutputStream(filename + ".xml");
    m.marshal(sdv, os);

javax.xml.bind: Things you should explore

What if the content of an element in an XML file has to match a regular expression? Is this checked during marshalling?
What if the content has to be of a union type?
How can the programmer deal with validation issues? Exceptions are thrown, but events are also generated! You just register a ValidationEventHandler with the unmarshaller! See how to use this!

More?

Some programming languages may offer simpler support for programming with XML documents:

    file.xml

There is no schema that formalizes its structure!
An internal representation of file.xml can still be useful to have!
You only get the XML structure!
XML parsers

    <recipe name="lemon pie">
      <ingredient name="sugar" amount="3 spoons"/>
      <instructions> Start by turning on the oven </instructions>
    </recipe>

    Element: recipe
        Attribute: name = lemon pie
        Element: ingredient
            Attribute: name = sugar
            Attribute: amount = 3 spoons
        Element: instructions
            Start by ...

XML parsers in Java: DOM

The package javax.xml.parsers comes with the standard distribution of Java.

    class DocumentBuilder {
        Document parse(InputStream in)
        Document newDocument()
        ...
    }

    interface Document extends Node {
        Node getFirstChild()
        Element createElement(String tag)
        NodeList getElementsByTagName(String name)
        ...
    }

DocumentBuilderFactory.newInstance().newDocumentBuilder() returns a DocumentBuilder!
When you use it to parse, it builds an internal tree representation of the whole document!

XML parsers in Java: SAX

In the same package, javax.xml.parsers, we find

    class SAXParser {
        void parse(InputStream in, DefaultHandler dh)
        ...
    }

    class DefaultHandler implements ContentHandler {
        void startDocument()
        void endDocument()
        ...
    }

All DefaultHandler methods are implemented to do nothing; you should override the interesting ones!
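The DOM route described above, where the parser builds a tree for the whole document and the program then walks that tree, can be sketched with the recipe example. This is a minimal illustration using only classes from javax.xml.parsers and org.w3c.dom; the class name RecipeDomDemo and the describe helper are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class RecipeDomDemo {
    static final String RECIPE =
        "<recipe name='lemon pie'>"
      + "<ingredient name='sugar' amount='3 spoons'/>"
      + "<instructions>Start by turning on the oven</instructions>"
      + "</recipe>";

    // Parses the whole document into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Walks the tree: root element, its name attribute, then each ingredient.
    static String describe(Document doc) {
        Element recipe = doc.getDocumentElement();
        StringBuilder out = new StringBuilder();
        out.append(recipe.getTagName()).append(": ").append(recipe.getAttribute("name"));
        NodeList ingredients = recipe.getElementsByTagName("ingredient");
        for (int i = 0; i < ingredients.getLength(); i++) {
            Element ingredient = (Element) ingredients.item(i);
            out.append("; ").append(ingredient.getAttribute("name"))
               .append(" = ").append(ingredient.getAttribute("amount"));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: recipe: lemon pie; sugar = 3 spoons
        System.out.println(describe(parse(RECIPE)));
    }
}
```

No schema is involved: the program only sees elements, attributes, and text, and has to know by itself that an ingredient carries a name and an amount.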
XML parsers in Java: SAX

You get a SAXParser instance using

    SAXParserFactory.newInstance().newSAXParser()

When you use parse to read a document, it doesn't build an internal representation; instead it generates events on finding elements, attributes, content, and more.
You can use it to build an internal representation yourself, using the event handler!
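Building an internal representation from SAX events can be sketched as follows: a handler that overrides only startElement and collects the ingredients of the recipe example into a list. The class names RecipeSaxDemo and IngredientCollector, and the choice of a list of strings as the internal representation, are assumptions made for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class RecipeSaxDemo {
    static final String RECIPE =
        "<recipe name='lemon pie'>"
      + "<ingredient name='sugar' amount='3 spoons'/>"
      + "<instructions>Start by turning on the oven</instructions>"
      + "</recipe>";

    // Overrides only the event of interest; every other callback keeps
    // the do-nothing behaviour inherited from DefaultHandler.
    static class IngredientCollector extends DefaultHandler {
        final List<String> ingredients = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            if ("ingredient".equals(qName)) {
                ingredients.add(attributes.getValue("name")
                    + ": " + attributes.getValue("amount"));
            }
        }
    }

    // Runs the parser over the document and returns what the handler collected.
    static List<String> ingredients(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            IngredientCollector handler = new IngredientCollector();
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.ingredients;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ingredients(RECIPE)); // prints [sugar: 3 spoons]
    }
}
```

The parser never holds the whole document in memory; whatever the program wants to keep, it has to store in the handler while the events go by.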