Experiences with JSON and XML Transformations IBM Submission to W3C Workshop on Data and Services Integration October 20-21 2011, Bedford, MA, USA



Similar documents
Data Normalization Reconsidered

OData Extension for XML Data A Directional White Paper

REST vs. SOAP: Making the Right Architectural Decision

Common definitions and specifications for OMA REST interfaces

Enabling REST Services with SAP PI. Michael Le Peter Ha

02267: Software Development of Web Services

Data XML and XQuery A language that can combine and transform data

Big Data Analytics. Rasoul Karimi

Presentation / Interface 1.3

Schematron Validation and Guidance

DataDirect XQuery Technical Overview

Semantic Interoperability

02267: Software Development of Web Services

NoSQL and Agility. Why Document and Graph Stores Rock November 12th, 2015 COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED

Introduction to Web Services

Contents. 2 Alfresco API Version 1.0

Extending the Linked Data API with RDFa

XML Processing and Web Services. Chapter 17

Wiley. Automated Data Collection with R. Text Mining. A Practical Guide to Web Scraping and

LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model

Exchanger XML Editor - Canonicalization and XML Digital Signatures

6. SQL/XML. 6.1 Introduction. 6.1 Introduction. 6.1 Introduction. 6.1 Introduction. XML Databases 6. SQL/XML. Creating XML documents from a database

Unified XML/relational storage March The IBM approach to unified XML/relational databases

XML Databases 6. SQL/XML

High Performance XML Data Retrieval

Data-Gov Wiki: Towards Linked Government Data

2009 Martin v. Löwis. Data-centric XML. Other Schema Languages

IBM DB2 XML support. How to Configure the IBM DB2 Support in oxygen

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX

Semantic Web Languages: RDF vs. SOAP Serialisation

Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Standard Recommended Practice extensible Markup Language (XML) for the Interchange of Document Images and Related Metadata

LabVIEW Internet Toolkit User Guide

Semistructured data and XML. Institutt for Informatikk INF Ahmet Soylu

Agents and Web Services

ANDROID APPS DEVELOPMENT FOR MOBILE GAME

Integrating VoltDB with Hadoop

Last Week. XML (extensible Markup Language) HTML Deficiencies. XML Advantages. Syntax of XML DHTML. Applets. Modifying DOM Event bubbling

How To Use X Query For Data Collection

Lightweight Data Integration using the WebComposition Data Grid Service

An XML Based Data Exchange Model for Power System Studies

12 The Semantic Web and RDF

OSLC Primer Learning the concepts of OSLC

What's new in 3.0 (XSLT/XPath/XQuery) (plus XML Schema 1.1) Michael Kay Saxonica

EUR-Lex 2012 Data Extraction using Web Services

Mobility Information Series

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Release Bulletin Sybase ETL Small Business Edition 4.2

Fraunhofer FOKUS. Fraunhofer Institute for Open Communication Systems Kaiserin-Augusta-Allee Berlin, Germany.

Semantic Stored Procedures Programming Environment and performance analysis

E6895 Advanced Big Data Analytics Lecture 4:! Data Store

An Oracle White Paper June RESTful Web Services for the Oracle Database Cloud - Multitenant Edition

How semantic technology can help you do more with production data. Doing more with production data

1 File Processing Systems

Cloud Storage Standards Overview and Research Ideas Brainstorm

Portal Factory CMIS Connector Module documentation

BASI DI DATI II 2 modulo Parte II: XML e namespaces. Prof. Riccardo Torlone Università Roma Tre

CST6445: Web Services Development with Java and XML Lesson 1 Introduction To Web Services Skilltop Technology Limited. All rights reserved.

restful-api-design Documentation

What is Data Virtualization? Rick F. van der Lans, R20/Consultancy

Introducing Apache Pivot. Greg Brown, Todd Volkert 6/10/2010

We have big data, but we need big knowledge

Programming Against Hybrid Databases with Java Handling SQL and NoSQL. Brian Hughes IBM

Using Object And Object-Oriented Technologies for XML-native Database Systems

Integration of Hotel Property Management Systems (HPMS) with Global Internet Reservation Systems

Business Object Document (BOD) Message Architecture for OAGIS Release 9.+

database abstraction layer database abstraction layers in PHP Lukas Smith BackendMedia

WWW. World Wide Web Aka The Internet. dr. C. P. J. Koymans. Informatics Institute Universiteit van Amsterdam. November 30, 2007

Lesson 4 Web Service Interface Definition (Part I)

Extensible Markup Language (XML): Essentials for Climatologists

BACKGROUND. Namespace Declaration and Qualification

SQL Databases Course. by Applied Technology Research Center. This course provides training for MySQL, Oracle, SQL Server and PostgreSQL databases.

IoT-Ticket.com. Your Ticket to the Internet of Things and beyond. IoT API

Introducing DocumentDB

Introduction to XML Applications

Getting started with API testing

Addressing Self-Management in Cloud Platforms: a Semantic Sensor Web Approach

MySQL for Beginners Ed 3

10CS73:Web Programming

T Network Application Frameworks and XML Web Services and WSDL Tancred Lindholm

FileMaker Server 9. Custom Web Publishing with PHP

Oracle Database: SQL and PL/SQL Fundamentals NEW

RDF Resource Description Framework

Computer Science E-259

Deferred node-copying scheme for XQuery processors

From Atom's to OWL ' s: The new ecology of the WWW

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

A Semantic web approach for e-learning platforms

Beyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC

OData in a Nutshell. August 2011 INTERNAL

Standards, Tools and Web 2.0

Achille Felicetti" VAST-LAB, PIN S.c.R.L., Università degli Studi di Firenze!

A Big Picture for Big Data

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web

Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Using CDMI to Manage Swift, S3, and Ceph Object Repositories David Slik NetApp, Inc.

Big Data, Fast Data, Complex Data. Jans Aasman Franz Inc

MarkLogic Server. SQL Data Modeling Guide. MarkLogic 8 February, Copyright 2015 MarkLogic Corporation. All rights reserved.

Programming the Semantic Web

Transcription:

Experiences with JSON and XML Transformations IBM Submission to W3C Workshop on Data and Services Integration October 20-21 2011, Bedford, MA, USA 21 October 2011 John Boyer, Sandy Gao, Susan Malaika, Michael Maximilien, Rich Salz, Jerome Simeon Feedback to: malaika@us.ibm.com Copyright IBM 1

Agenda Why Do We Want to Transform Friendliness vs Grotesqueness (Unfriendliness) Typical JSON and XML Mapping Issues Round-Trippability Friendliness A Use Case Mapping Approaches Overview of Approaches 1-4 Recommendations Ideas Some Published Mappings and References The Grand Finale Copyright IBM 2

Why Do We Want To Transform? Trend: Javascript/Web 2.0 dominates for: Fast parsing of message content Programmer convenience Trend: Multiple message formats (Atom, XML, JSON, etc) are becoming common XML APIs and REST APIs are being extended to support JSON Trend: Enterprises desire validation, declarative constraints and stringency for data content, but also want the benefits of Web 2.0 Customer demand for: Constraint and query features provided by XML Programming ease provided by JSON Copyright IBM 3

Some Data and Metadata Standards Relational XML JSON Linked Data Metadata Data Definition Language (ISO) XML Schema XSD (W3C), Namespaces (W3C) JSON Schema - IETF RDFS (RDF Schema), Ontology (W3C and elsewhere) Constraints Integrity Constraints in table definitions (ISO) Schematron (ISO), Relax- NG (OASIS) - Triggers Relational triggers - - RIF (W3C) Data Exchange Serializations SQL standard (ISO) defines an XML serialization but it is not widely used There is no agreed JSON serialization. There are RDF serializations. XML is a syntax widely used for data exchange (W3C) JSON is a serialized format, there are XML representations of JSON too XML, Turtle Annotations Not part of the relational model Many kinds of annotations are defined for XML schemas and for XML data - RDFa (W3C) can be used to annotate XML Query & CRUD Languages Data Manipulation Language (DML), SQL, SQL/XML (ISO) XPath, XQuery (W3C) and others for CRUD JAQL, JSONiq SPARQL (W3C) Query & CRUD APIs Many for various programming languages, e.g., JDBC, ODBC Various, includes some of the relational APIs and specific APIs, e.g., XQJ - SPARQL Graph Store HTTP Protocol for CRUD (W3C) Collection Table, View, Database (ISO) XML Collection Function (W3C) - RDF Graphs (W3C) Transformation & Other Languages SQL (Tables to Tables) XSLT (XML to text, includes XML); XForms JavaScript - Some XML communities who work on query and transformation languages are reviewing the idea of XML languages supporting JSON SQL XMLTABLE (XML to relational) Currently, the role of JSON is mainly for data exchange between JavaScript clients and servers Copyright IBM 4

Friendly XML Friendly XML has multiple unique paths (multiple element and attribute names) in the XML, rather than a name value pair design Friendliness provides Easy XML consumption by programmers, authors and software Straightforward transforms, queries, and indexing capabilities Friendly XML Example <location> <city>armonk</city> <state>ny</state> </location> UnFriendly XML Example <root type="object > <item name= city value= Armonk /> <item name = state value = NY /> </root> Copyright IBM 5

Friendly JSON Friendly JSON does not include repeating variable names Data types are directly associated with each variable Friendliness provides easy data structure consumption by JavaScript programmers, authors Friendly JSON Example UnFriendly JSON Example { "person": { "age": "40", "name": "John Smith" { "book" : { "children : [{ "title" : { "attributes" : { "isbn": "15115115", "children" : [ "This book is ", { "emph" : "bold" ] ] Copyright IBM 6

Typical JSON and XML Mapping Issues Round-trip transformations are challenging XML to JSON to XML JSON to XML to JSON General issues Different character mapping for names and values in JSON and XML Mismatch in data types between JSON and XML Semantic infidelity XML to JSON issues Losing XML namespaces Incorrectly mapping XML repeating elements Mishandling of relative URLs JSON to XML issues Incorrectly mapping JSON arrays Associating variables with data types in JSON Copyright IBM 7

Round-trip Why is round-tripping important? Data and data type information has to be preserved Fidelity, to prevent data loss Preservation of metadata, order Maintain usability regardless of format Some details are unnecessary, depending on use cases May not help Friendliness Copyright IBM 8

Mapping Approach Overview Approach 1 JSON to XML Focus on Friendliness Arbitrary JSON [1.1] [1.2] [1.4] Mapper A [1.3] Friendly XML Only round trip when needed with degraded friendliness Approach 2 JSON to XML Focus on Round-Tripping Arbitrary JSON [2.1] [2.4] Mapper B [2.2] [2.3] XML Example: JSONx at IETF Approach 3 XML to JSON Focus on friendliness Arbitrary XML [3.1] [3.2] [3.3] Mapper C Friendly JSON Mapper C [3.4] Possibly different XML Approach 4 XML to JSON Focus on Round-Tripping Arbitrary XML [4.1] [4.4] Mapper D [4.2] [4.3] JSON Copyright IBM 9

Friendly and Round-Trippable Summary A JSON-to-XML mapping is friendly when the generated XML has meaningful element and attribute names, rather than a name value pair design. An XML-to-JSON mapping is friendly when the generated JSON has flat structure, making it easy for JavaScript programmers to consume it. A JSON-to-XML mapping is round-trippable when the generated XML contains all the information that was in the original JSON, without any loss. An XML-to-JSON mapping is round-trippable when the generated JSON contains all the information that was in the original XML, without any loss. Copyright IBM 10

A JSON XML Use Case JavaScript Clients HTTP-REST/JSON Friendly JSON Gateway JSONx SOAP/XML Friendly XML SOAP Applications Copyright IBM 11

Approach #1: JSON to Friendly XML Requirement 1: Friendly XML processing Requirement 2: Round-trippable Rules: JSON names become XML element names JSON object members and arrays become XML elements, and Simple values become XML text content Synthesized XML 'root' element Special handling to preserve data type, kinds of emptiness, and special characters Copyright IBM 12

Approach #1: Example In this example, { "city": "Armonk", "state": "NY" becomes <root type="object"> <city>armonk</city> <state>ny</state> </root> Copyright IBM 13

Approach #1: Special characters in names and empty names A JSON name may not match an XML NCName Each illegal character is escaped with two leading underscores, a hexadecimal encoding of the character's Unicode codepoint and A trailing underscore { "var$":"value", "":"generic" <root type="object"> <var 24_>value</var 24_> < >generic</ > </root> Copyright IBM 14

Approach #1: Simple Values JSON name/value pair with a simple value Generate XML element from name Store string equivalent of the simple value as content of the XML element Typed as boolean, number, string, or null { "age":43, "married":true, "address":null <root type="object"> <age type="number">43</age> <married type="boolean">true</married> <address nil="true"></address> </root> Copyright IBM 15

Approach #1: Object Values JSON name/value pair with a non-null object value, Generate XML element based on the name Generate child elements for each of the JSON name/value pairs { "address": { "street" : "123 Main St." <root type="object"> <address type="object"> <street>123 Main St.</street> </address> </root> Copyright IBM 16

Approach #1: Arrays JSON name/value pair with an array value Generate a container XML element for the array Generate one child XML element for each value in the array { "locations": ["Amsterdam", "London"], "secondary_locations": [] <root type="object"> <locations type="array"> < >Amsterdam</ > < >London</ > </locations> <secondary_locations type="array"></secondary_locations> </root> Copyright IBM 17

Approach #1: Special Characters In Values Characters are copied from JSON except for certain conversions: < < > > & \udddd \ \\ \ \/ / & &#xdddd \n Newline (0x0A) \r &#xd \t tab(0x09) Non-XML chars(\b, \f) Omitted, or encoded as PI Copyright IBM 18

Approach #2: JSON to XML and Back Requirement: Round-trippable Usually a name/value pair Rules: XML elements to preserve data type XML elements to preserve objects XML elements to preserve arrays XML attributes/text to preserve property names Copyright IBM 19

Approach #2: Example In this example,... "batters": { "batter": [ { "id": 2001, "type": "Regular", { "id": 2003, "type": "Blueberry" ],... becomes <json:object name="batters"> <json:array name="batter"> <json:object> <json:number name="id">2001</json:number> <json:string name="type">regular</json:string> </json:object> <json:object> <json:number name="id">2003</json:number> <json:string name="type">blueberry</json:string> </json:object> </json:array> </json:object> Copyright IBM 20

Approach #3: XML to Friendly JSON Requirement: Consumable/Friendly JSON Requirement: Preserve XML structure Rules: XML element or attribute names become JSON object names, XML children elements become JSON objects fields or arrays, and XML text nodes become JSON simple values Copyright IBM 21

Approach #3: Multiple Child Elements With Same Name When multiple child elements have the same name, use JSON arrays to represent those elements. <people> <person> <name>john Smith</name> <age>40</age> </person> <person> <name>jane Foster</name> <age>43</age> </person> </people> { "people": { "person": [ { "age": "40", "name": "John Smith", { "age": "43", "name": "Jane Foster" ] Copyright IBM 22

Approach #3: Attributes and Element Nodes Preserve XML notion of attributes <title isbn="15115115">this book is on XML and JSON</title> { "title": { "@isbn": "15115115", "text()": "This book is on XML and JSON" Copyright IBM 23

Approach #3: Text Nodes and Mixed Content An XML element could contain both child elements and text <name>john<br/>smith</name> { "name": { "br": {, "text()": "JohnSmith" Copyright IBM 24

Approach #3: Handling Document Order Preserve document order with JSON array <name>john<br/>smith</name> { "name": [ "br": {, "text()": "JohnSmith" ] Copyright IBM 25

Approach #3: XML Namespaces JSON does not support a name-scoping mechanism Confuses Round-tripping, Stops being friendly JSON <n:root xmlns:n=\"foo\" xmlns:l=\"bar\" l:att=\"1\"> <n:node/> </n:root> { "foo": { "root": { "bar": { "@att": "1", "foo": { "node": { Copyright IBM 26

Approach #3: Simple Values and Datatypes May be necessary to keep track of the original XML type annotation <year xsi:type="xs:positiveinteger">1989</year> { "xsi:type" : "xs:positiveinteger", value : 1989 Copyright IBM 27

Approach #4: XML to JSON and Back Requirement: Round-trippable Rules: Elements are mapped as JSON objects with its qualified name as a field Attributes of XML elements are included as a JSON attributes object Element children are included as an array to preserve the original order of the element's children Copyright IBM 28

Approach #4: Example In this example,... <book> <title isbn=\"15115115\">this book is <emph>bold</emph></title> </book>... becomes { "book" : { "children : [ { "title" : { "attributes" : { "isbn": "15115115", "children" : [ "This book is ", { "emph" : "bold" ] ] Copyright IBM 29

Mapping Approach Recommendations Use Approach 1 when your highest priority is to make the resulting XML easy to consume and then optionally round-trippable Use Approach 2 when your priority is roundtrippable conversions Use Approach 3 when your highest priority to make the resulting JSON easily consumable and then round-trippable Use Approach 4 when your priority is not JSON consumption, but still capable of round-trip conversions Copyright IBM 30

Ideas and Discussion Provide structured guidance to the W3C XSLT and other groups that are looking at integrating JSON and XML? Provide guidance for architects and developers who create APIs and formats which apply to both JSON and XML? Review the OData JSON proposals? (Contains JSON representation for ATOM ) Copyright IBM 31

Another Idea - Consider creating a W3C REST Community? Topics could include: (1) Identify gaps e.g., WADL or similar? URL Query? Expressing REST API results (e.g., RDF; JSON; XML; ATOM) and paging? JSON Schema http://tools.ietf.org/html/draft-zyp-json-schema-03? (2) Create Best Practices e.g., REST CRUD (use of POST as alternative) ; POST CREATE only or TUNNEL anything ; Partial update via POST instead of PATCH Retrieve subset a resource from POST Resource Links / ATOM - HREF and REL ; <atom:link rel='...' href='...'/> ; <machine rel='...' href='...'/> URI opaqueness MIME Types Generic MIME -> Generic Tooling ; Specific MIME class -> Specific Tooling Important for JSON - no guarantee there will be "Root Element (outer element) ; if MIME TYPE JSON/XML don't know anything Copyright IBM 32

Examples from CWMG (Cloud Management Work Group at DMTF) Modeling collections : Update collections Allowable operations on a class of resources : <operation url='xxx' rel='xxx'/> <operation url="/foo" rel="stop'/> Async processing What does client do after 202? CWMG Reference: http://dmtf.org/sites/default/files/standards/documents/dsp0263_ 1.0.0a.pdf CWMG Primer: http://dmtf.org/sites/default/files/standards/documents/dsp2027_ 1.0.0a.pdf Copyright IBM 33

Examples from URL Queries Queries in URLs have become common It is usual to offer the query responses in one or more of XML, ATOM or JSON. Here are some examples: Twitter Search API : http://search.twitter.com/api/ Example - http://search.twitter.com/search.atom?lang=en&q=&metopera&rpp=15 - Return all English language tweets that contain references to user metopera in chunks of 15 tweets per page - as atom FaceBook Graph API (part of the Facebook Graph Protocol) : http://developers.facebook.com/docs/reference/api/ Example - https://graph.facebook.com/me/friends?limit=3 - Return all my friends in chunks of 3 - as JSON OData : http://www.odata.org/developers/protocols/uriconventions#querystringoptions Example - http://services.odata.org/odata/odata.svc/products?$skip=2&$top=2&$orderby=rating - Return the 3rd and 4th products when sorted by rating - as atom The OData interface has been adopted by Microsoft and by SAP Copyright IBM 34

Some Published Mappings JSONx: http://datatracker.ietf.org/doc/draft-rsalz-jsonx/ JAQL: http://code.google.com/p/jaql/wiki/builtin_functions#xmltojson() XForms: http://www.w3.org/markup/forms/wiki/json Jettison: http://jettison.codehaus.org/user%27s+guide#user%27sguide- Conventions] Badgerfish: http://www.sklar.com/badgerfish JSON4J: http://publib.boulder.ibm.com/infocenter/wasinfo/fep/index.jsp?topic= /com.ibm.json.help/docs/gettingstarted.html Zorba: http://www.zorba-xquery.com/doc/zorba- 1.4.0/zorba/xqdoc/xhtml/www.zorba-xquery.com_modules_json.html Copyright IBM 35

References XML: http://www.w3.org/xml/ JSON : http://www.json.org/ Convert Atom documents to JSON : http://www.ibm.com/developerworks/library/x-atom2json/index.html OData Protocol JSON Format: http://www.odata.org/developers/protocols/json-format Copyright IBM 36

The Grande Finale Copyright IBM 37