Master s Project Proposal An Application to create Problem- Specific DOMs for XML

Similar documents
Concrete uses of XML in software development and data analysis.

Design Patterns in Parsing

Chain of Responsibility

by LindaMay Patterson PartnerWorld for Developers, AS/400 January 2000

CST6445: Web Services Development with Java and XML Lesson 1 Introduction To Web Services Skilltop Technology Limited. All rights reserved.

Excerpts from Chapter 4, Architectural Modeling -- UML for Mere Mortals by Eric J. Naiburg and Robert A. Maksimchuk

Structured Data and Visualization. Structured Data. Programming Language Support. Programming Language Support. Programming Language Support

Last Week. XML (extensible Markup Language) HTML Deficiencies. XML Advantages. Syntax of XML DHTML. Applets. Modifying DOM Event bubbling

JAXB: Binding between XML Schema and Java Classes

XML & Databases. Tutorial. 2. Parsing XML. Universität Konstanz. Database & Information Systems Group Prof. Marc H. Scholl

Introduction to XML Applications

REDUCING THE COST OF GROUND SYSTEM DEVELOPMENT AND MISSION OPERATIONS USING AUTOMATED XML TECHNOLOGIES. Jesse Wright Jet Propulsion Laboratory,

Managing large sound databases using Mpeg7

XML- New meta language in e-business

The Google Web Toolkit (GWT): The Model-View-Presenter (MVP) Architecture Official MVP Framework

Systems Engineering and Integration for the NSG (SEIN) SharePoint Developer

Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks

No no-argument constructor. No default constructor found

JDOM Overview. Application development with XML and Java. Application Development with XML and Java. JDOM Philosophy. JDOM and Sun

Managing XML Documents Versions and Upgrades with XSLT

Welcome to the AITP NCC 2000 Java Competition in Tampa, Florida.

Aspects of using Hibernate with CaptainCasa Enterprise Client

Integrating XACML into JAX-WS and WSIT

ABSTRACT FACTORY AND SINGLETON DESIGN PATTERNS TO CREATE DECORATOR PATTERN OBJECTS IN WEB APPLICATION

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

An Approach to Eliminate Semantic Heterogenity Using Ontologies in Enterprise Data Integeration

XML Processing and Web Services. Chapter 17

qwertyuiopasdfghjklzxcvbnmqwerty uiopasdfghjklzxcvbnmqwertyuiopasd fghjklzxcvbnmqwertyuiopasdfghjklzx cvbnmqwertyuiopasdfghjklzxcvbnmq

Keywords: XML, Web-based Editor

SSC - Web development Model-View-Controller for Java web application development

Patterns in. Lecture 2 GoF Design Patterns Creational. Sharif University of Technology. Department of Computer Engineering

Lecture 5: Java Fundamentals III

Firewall Builder Architecture Overview

Session Topic. Session Objectives. Extreme Java G XML Data Processing for Java MOM and POP Applications

Mobility Information Series

Java Web Services Training

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

Fast Infoset & Fast Web Services. Paul Sandoz Staff Engineer Sun Microsystems

The end. Carl Nettelblad

Chapter 12 Programming Concepts and Languages

Data XML and XQuery A language that can combine and transform data

N CYCLES software solutions. XML White Paper. Where XML Fits in Enterprise Applications. May 2001

03 - Lexical Analysis

MiniDraw Introducing a framework... and a few patterns

XML DATA INTEGRATION SYSTEM

MDA Overview OMG. Enterprise Architect UML 2 Case Tool by Sparx Systems by Sparx Systems

Responders: Language Support for Interactive Applications

Design Patterns. Design patterns are known solutions for common problems. Design patterns give us a system of names and ideas for common problems.

ISM/ISC Middleware Module

An XML Based Data Exchange Model for Power System Studies

Developing XML Solutions with JavaServer Pages Technology

Monitoring Infrastructure (MIS) Software Architecture Document. Version 1.1

Reduces development time by 90%

Chapter 13 Computer Programs and Programming Languages. Discovering Computers Your Interactive Guide to the Digital World

Design and Development of Website Validator using XHTML 1.0 Strict Standard

Technologies for a CERIF XML based CRIS

Design Grammars for High-performance Speech Recognition

A Thread Monitoring System for Multithreaded Java Programs

TATJA: A Test Automation Tool for Java Applets

DataDirect XQuery Technical Overview

XBRL Processor Interstage XWand and Its Application Programs

An Eclipse Plug-In for Visualizing Java Code Dependencies on Relational Databases

Classes and Objects. Agenda. Quiz 7/1/2008. The Background of the Object-Oriented Approach. Class. Object. Package and import

Chapter 1 Fundamentals of Java Programming

Open XML Court Interface (OXCI) Architecture Proposal

REST Client Pattern. [Draft] Bhim P. Upadhyaya ABSTRACT

Java 2 Platform, Enterprise Edition (J2EE): Enabling Technologies for EAI

XFlash A Web Application Design Framework with Model-Driven Methodology

Xtreme RUP. Ne t BJECTIVES. Lightening Up the Rational Unified Process. 2/9/2001 Copyright 2001 Net Objectives 1. Agenda

Extensible Markup Language (XML): Essentials for Climatologists

and ensure validation; documents are saved in standard METS format.

Dynamic Adaptability of Services in Enterprise JavaBeans Architecture

High Performance XML Data Retrieval

Production time profiling On-Demand with Java Flight Recorder

Using Object And Object-Oriented Technologies for XML-native Database Systems

A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS

Chapter 3 Chapter 3 Service-Oriented Computing and SOA Lecture Note

JavaFX Session Agenda

Adding Panoramas to Google Maps Using Ajax

JVA-561. Developing SOAP Web Services in Java

Introduction to Web Services

Why Threads Are A Bad Idea (for most purposes)

Web Services Development In a Java Environment

Lecture 9. Semantic Analysis Scoping and Symbol Table

An Exception Monitoring System for Java

Dinopolis Java Coding Convention

GenericServ, a Generic Server for Web Application Development

Cache Database: Introduction to a New Generation Database

Extracting data from XML. Wednesday DTL

Name of pattern types 1 Process control patterns 2 Logic architectural patterns 3 Organizational patterns 4 Analytic patterns 5 Design patterns 6

Object-Oriented Programming in Java: More Capabilities

Databases and Information Systems 2

maximizing IT productivity

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

4.2 Understand Microsoft ASP.NET Web Application Development

Chapter. Introduction. The Problem. You know what a legacy application is? It s one that works. Bill Cafiero

Motivation Definitions EAI Architectures Elements Integration Technologies. Part I. EAI: Foundations, Concepts, and Architectures

The Study on Mobile Phone-oriented Application Integration Technology of Web Services 1

Software: Systems and Application Software

EPiServer and XForms - The Next Generation of Web Forms

Transcription:

Master s Project Proposal An Application to create Problem- Specific DOMs for XML Liangxiao Zhu lxz0062@cs.rit.edu October 9, 2002 1. SUMMARY... 2 2. OVERVIEW... 2 3. FUNCTIONAL SPECIFICATION... 3 3.1 CLASS DEFINITION FILE... 3 3.2 COMPILING CLASS DEFINITION FILE TO CLASSES AND OBSERVERS... 4 3.3 TURNING XML DOCUMENTS INTO JAVA OBJECT TREES... 5 3.4 APPLICATION USAGE... 5 4. ARCHITECTURAL OVERVIEW... 6 4.1 CLASS DEFINITION FILE AND CDF COMPILER... 6 4.2 BINDING FRAMEWORK AND OBSERVERS... 6 4.3 WRITING CDF COMPILER... 7 5. DELIVERABLES... 8 6. DRAFT TABLE OF CONTENTS OF FINAL REPORT... 8 7. SCHEDULE... 8 REFERENCES... 9-1 -

1. Summary The purpose of this project is to develop an application that turns XML sources into Problem-Specific Data Object Models (PSDOM). The W3C DOM (Document Object Model) [5] is the most popular API that parses XML documents into tree structures of data. But DOM provides a generic data model rather than a problem-specific data object model, which makes it complicated and inefficient. And in many cases, programmers don t use the generic data model. They usually need an object model that is specific to a particular problem. This is where this project comes in. The goal of this project is to provide an application that offers a convenient way to convert XML documents to objects. This application contains a Class Definition (CDF) compiler and a binding framework. The CDF compiler takes a Class Definition, which specifies the constraints of XML data and classes for XML elements, and compiles it into class files representing XML elements. It also generates observers (this observer idea is from Dr. Schreiner s oops [1]) that will be used to unmarshall XML documents based on the class definition file. The binding framework uses the observer to parse an XML document and turn it into a problem-specific object tree representing the XML document. The PSDOM makes it easy to create applications to process XML data. It maps XML into objects and allows user to manipulate the document in the same way the user manipulates objects. Figure 1 shows the data flow of this application. Class Definition CDF Compiler s Classes XML Document Handler Binding Framework Figure 1. Data Flow Object Tree 2. Overview XML has certainly taken the programming world by storm. It is a form of self-describing data, which makes it the best candidate as a data exchange medium between different systems. Therefore, how to parse XML and manipulate the XML data becomes a very essential task of handling XML documents. - 2 -

Basically, there are three major methods of accessing XML data [8]. The first one is Callbacks. Simple API for XML (SAX) is the W3C standard that uses this Callbacks method. It is an event-driven, serial-access mechanism for accessing XML documents. SAX is fast and less memory-intensive because it does not store any part of the document in memory. On the other hand, SAX API is a bit harder to visualize because of its eventdriven model; therefore, it is hard to code with. Another method is Trees. The most popular one for this method is the W3C Document Object Model (DOM) [5]. DOM is a platform- and language-neutral interface. It parses an XML document and constructs a generic object model in memory that represents the contents of the XML document. DOM is easier to use than SAX. Its tree structure makes it easier to visualize. And the API offers rich functionality. But what DOM provides are generic data models, which make it more complicated and inefficient to access than a problem-specific data model. It also carries a lot of ballast when handling a simple XML. The third one, also the newest method, is data binding. Sun s JAXB [2] is using this method. It maps the data in XML documents to small objects. These objects are related to the elements and attributes of the XML document. It becomes much easier for developers to access and handle these data in applications and convert the XML document from one format to another. Of course, developers want to work with the XML document in an easy and directly perceived way. They want to put more time on the development of business logic instead of spending too much time boning up on XML and XML parser specification. They want to write applications caring more about XML as data than as documents. In this project, we will provide such a tool that will be as fast as SAX parsers, as directly perceived as W3C DOM and as easy to create application processing XML data as data binding. Some of the ideas in this project are inspired by Dr. Schreiner s Oops [1] and the observer pattern in Oops. Oops is an object-oriented parser generator targeted to. It takes a grammar as input and automatically generates a parse tree. The parse tree, combined with the scanner, composes the parser for this grammar. The parser will generate object trees for input files based on this grammar. How to execute this tree is accomplished by the idea of adding observers to the parser. Different users can write different observers for the same grammar, hence, execute different tasks on the grammar. This project borrows the observer idea to implement the binding framework. 3. Functional Specification 3.1 Class Definition Class Definition itself is an XML document. It is written based on DTD file. It contains -like code as element contents, which gives the specification of generating the observers and the element classes. Following is an example of a DTD and a Class Definition that is based on the DTD. A simple DTD: <!ELEMENT person (firstname, lastname)> <!ELEMENT firstname #PCDATA> <!ELEMENT lastname #PCDATA> - 3 -

The Class Definition based on this DTD file: <element name= person > String firstname; String lastname; <element name= firstname > firstname = new String(PCDATA); </element> <element name= lastname > lastname = new String(PCDATA); </element> return new Person(firstName, lastname); </element> The example shows that the person element contains one firstname sub element and one lastname sub element. The Person element class has two String type fields; one is firstname, and the other is lastname, which corresponding to its subelements. The types of firstname and lastname elements are both String. They take their element content as value. Here, the PCDATA represents the element content. 3.2 Compiling Class Definition to classes and observers The CDF compiler takes the Class Definition as input, and maps it into observer source files and element classes. Here are the results generated from the Class Definition shown in 3.2: A. Class for person element: public class Person { private String firstname; private String lastname; public Person(String firstname, String lastname) { this.firstname = firstname; this.lastname = lastname; public String getfirstname() { return firstname; public void setfirstname(string data) { firstname = data; public String getlastname() { return lastname; public void setlastname(string data) { lastname = data; B. s used to unmarshall XML document to objects protected class PersonAdapter extends Adapter { protected Object value; public Object value () { return value; public init (int rule) { switch (rule) { // <!ELEMENT person (firstname, lastname)> case 0: return new PersonAdapter ( ) { { String firstname; String lastname; - 4 -

public void shift ( sender) { if (sender.getname().equals("firstname") { firstname = sender.value(); else if (sender.getname().equals("lastname") { lastname = sender.value(); else { throw Exception("Shifted wrong subelement!"); public void reduce() { value = new Person(firstName, lastname); ; // <!ELEMENT firstname #PCDATA> case 1: return new PersonAdapter ( ) { public void shift (String data) { value = data; ; // <!ELEMENT lastname #PCDATA> case 2: return new PersonAdapter ( ) { public void shift (String data) { value = data; ; default: return null; 3.3 Turning XML documents into object trees The binding framework is in charge of turning XML documents into object trees. It takes an XML document and parses the file using an XML parser (in our case, we use SAX parser). The XML parser calls the observer Handler, which in turn invokes observers defined for each rule in the DTD file like the one above. The observers build object trees by instantiated classes generated by the CDF compiler. Figure 2 shows this workflow. 3.4 Application usage Using this application is quite simple. You just need to follow these steps: Get the DTD file of XML documents; write a Class Definition based on DTD file. The CDF contains -like code, which specifies how the classes of elements and observers should be generated. Use CDF compiler and the given Class Definition to generate class files and observer source file. Compile those class files and observer source file using compiler. Use binding framework to parse the XML document and turn it into a object tree. Manipulate the generated object tree. - 5 -

4. Architectural Overview XML Document Binding Framework SAX Parser Content Handler Handler Calls Object Tree DTD s Use Classes Class Definition CDF Compiler Source s Compiler Figure 2. PSDOM Overall Architecture Figure 2 shows the overall architecture of this project. 4.1 Class Definition and CDF Compiler The application developer needs to write Class Definition according to the DTD file. The CDF is an XML document that is written in a specific format, which will be defined in this project. The CDF Compiler takes the Class Definition and compiles it into Source s and element classes. The Source s can then be compiled with Compiler into s that will be later hooked into the Framework. The generated Classes are in the bean-like style with gets and sets methods. They represent the object-oriented view of DTD elements definitions and attribute list definitions. At this time, the application developer has already been able to instantiate generated Classes to build the XML content tree and then manipulate the XML data with Objects. 4.2 Binding Framework and s If an XML file exists, of course the XML should have been validated against the DTD. The application developer can use the Binding Framework to convert an XML document into Objects and form the content tree. The Binding Framework is composed with SAX parser (or other XML parser) and Handler. The SAX parser consumes an XML file, generates SAX2 Events[6]. Handler processes SAX2 Events, then in turn, instantiates and shifts XML data to by calling observer s methods. The observer actually does the real work to instantiate Classes, create Object and build the XML content tree. - 6 -

4.3 Writing CDF Compiler One interesting thought is that since the Class Definition is written in XML, theoretically, we can use the same architecture of Binding Framework to build the CDF Compiler as shown in figures 3 and 4. Class Definition CDF Compiler SAX Parser Content Handler Handler Source s Calls to generate Source s Figure 3. CDF Compiler Part 1 Class Definition CDF Compiler SAX Parser Content Handler Handler Classes Calls to generate Classes Figure 4. CDF Compiler Part 2 Then only thing left that we need to do is to write observes used to generate Source s and Classes. - 7 -

5. Deliverables The following are the principal deliverables of this project. All of these will be contained in a CD: Source code and executable files of CDF compiler and binding framework. A technical report for the design architecture and specifics of this project. Examples of using this tool, which include DTD, Class Definition, XML documents and classes. 6. Draft Table of Contents of Final Report Description of XML data binding Architecture of this tool Design specification of Class Definition Design of binding compiler Analysis of observers pattern Conclusion and future work Bibliography and References 7. Schedule September 2002 Work on the feasibility for this project. Write observers and element classes by hand for simple DTDs. Use SAX parser to parse XML documents, to see if can get object tree as expected. October 2002 Finish detailed design November and December 2002 Implement the CDF compiler and binding framework. January 2003 Code test and writing examples. February 2003 Complete technical report and prepare the presentation. - 8 -

References [1] Axel-Tobias Schreiner. RIT Oops homepage http://www.cs.rit.edu/~ats/projects/oops/edu/doc/ Introduce of the RIT Oops and Oops API. [2] Mark Reinhold, Core Platform Group. The TM Architecture for XML Binding (JAXB). Working-Draft Specification. Version 0.21. 30 May 2001 JAXB specification. [3] Sun Microsystems, Inc. The TM Architecture for XML Binding User s Guide. May 2001 JAXB user s guide. [4] World Wide Web Consortium. Extensible Markup Language (XML) 1.0 (Second Edition) W3C Recommendation 6 October 2000 http://www.w3.org/tr/rec-xml W3C XML specification. [5] World Wide Web Consortium. W3C Document Object Model Level 1 Specification W3C Recommendation 1 October 1998 http://www.w3.org/tr/rec-dom-level-1/ W3C DOM specification. [6] David Megginson. SAX http://www.saxproject.org/ Homepage of SAX project. [7] Brett McLaughlin. Data binding from XML to applications. Enhydra Strategist, Lutris Technologies. August 2000 Tutorial of XML data binding with java applications. [8] Brett McLaughlin. & XML Data Binding. O Relly May 2002 In-depth technical look at XML Data Binding. [9] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, Design Patterns, Elements of Reusable Object-Oriented Software. Addison-Wesley 1995 A book of design patterns that describes simple and elegant solutions to specific problems in OOD. - 9 -