exist A NOSQL DOCUMENT DATABASE AND APPLICATION PLATFORM Erik Siegel & Adam Retter



Similar documents
O Reilly Ebooks Your bookshelf on your devices!

Content Management Systems: Drupal Vs Jahia

THE WINDOWS AZURE PROGRAMMING MODEL

How To Retire A Legacy System From Healthcare With A Flatirons Eas Application Retirement Solution

NoSQL replacement for SQLite (for Beatstream) Antti-Jussi Kovalainen Seminar OHJ-1860: NoSQL databases

MONGODB - THE NOSQL DATABASE

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

Search and Real-Time Analytics on Big Data

Data processing goes big

Sisense. Product Highlights.

DataDirect XQuery Technical Overview

Draft Response for delivering DITA.xml.org DITAweb. Written by Mark Poston, Senior Technical Consultant, Mekon Ltd.

NoSQL and Hadoop Technologies On Oracle Cloud

Five Steps to Integrate SalesForce.com with 3 rd -Party Systems and Avoid Most Common Mistakes

Getting started with API testing

Load and Performance Load Testing. RadView Software October

How To Write A Monitoring System For Free

Open is as Open Does: Lessons from Running a Professional Open Source Company

Open Source Technologies on Microsoft Azure

Seamless Web Data Entry for SAS Applications D.J. Penix, Pinnacle Solutions, Indianapolis, IN

Mobile Application Platform

WINDOWS AZURE EXECUTION MODELS

Oracle Identity Analytics Architecture. An Oracle White Paper July 2010

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Technical. Overview. ~ a ~ irods version 4.x

Skynax. Mobility Management System. System Manual

Document management and exchange system supporting education process

WHAT'S NEW IN SHAREPOINT 2013 WEB CONTENT MANAGEMENT

PHP on IBM i: What s New with Zend Server 5 for IBM i

Structured Content: the Key to Agile. Web Experience Management. Introduction

EFFECTIVE STORAGE OF XBRL DOCUMENTS

Smartphone Enterprise Application Integration

tibbr Now, the Information Finds You.

REDUCING THE COST OF GROUND SYSTEM DEVELOPMENT AND MISSION OPERATIONS USING AUTOMATED XML TECHNOLOGIES. Jesse Wright Jet Propulsion Laboratory,

Data Integration Checklist

An Oracle White Paper May Oracle Database Cloud Service

A Tool for Evaluation and Optimization of Web Application Performance

CatDV Pro Workgroup Serve r

Firewall Builder Architecture Overview

SAP HANA SPS 09 - What s New? HANA IM Services: SDI and SDQ

Choosing the Best Mobile Backend

Deploy. Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture

WINDOWS AZURE DATA MANAGEMENT

Kony Mobile Application Management (MAM)

Liferay Portal User Guide. Joseph Shum Alexander Chow

MarkLogic Server. Reference Application Architecture Guide. MarkLogic 8 February, Copyright 2015 MarkLogic Corporation. All rights reserved.

White Paper. Anywhere, Any Device File Access with IT in Control. Enterprise File Serving 2.0

Building Applications Using Micro Focus COBOL

NoSQL Databases. Polyglot Persistence

KonyOne Server Prerequisites _ MS SQL Server

Oracle Service Bus. Situation. Oracle Service Bus Primer. Product History and Evolution. Positioning. Usage Scenario

Data Driven Success. Comparing Log Analytics Tools: Flowerfire s Sawmill vs. Google Analytics (GA)

Creating Value through Innovation MAGENTO 1.X TO MAGENTO 2.0 MIGRATION

Making Data Available on the Web

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

Database FAQs - SQL Server

How to Prepare for the Upgrade to Microsoft Dynamics CRM 2013 (On-premises)

Overview Document Framework Version 1.0 December 12, 2005

MIGRATING SHAREPOINT TO THE CLOUD

Jenkins: The Definitive Guide

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island

owncloud Architecture Overview

Visualisation in the Google Cloud

MALAYSIAN PUBLIC SECTOR OPEN SOURCE SOFTWARE (OSS) PROGRAMME BENCHMARK/COMPARISON REPORT DOCUMENT MANAGEMENT SYSTEMS (NUXEO AND ALFRESCO)

What s New in Version Cue CS2

Base One's Rich Client Architecture

IBM Cognos Mobile Overview

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Communiqué 4. Standardized Global Content Management. Designed for World s Leading Enterprises. Industry Leading Products & Platform

Data XML and XQuery A language that can combine and transform data

Distributed File Systems

MySQL. Leveraging. Features for Availability & Scalability ABSTRACT: By Srinivasa Krishna Mamillapalli

TRANSLATIONS FOR A WORKING WORLD. 2. Translate files in their source format. 1. Localize thoroughly

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

owncloud Architecture Overview

Business Process Management

Lotus Domino Security

MarkLogic Enterprise Data Layer

iway Roadmap: 2011 and Beyond Dave Watson SVP, iway Software

Your guide to building great apps. Upgrade your skills and update your tools to create the next great app

An Oracle White Paper February Oracle Data Integrator 12c Architecture Overview

The Definitive Guide. Active Directory Troubleshooting, Auditing, and Best Practices Edition Don Jones

JAVASCRIPT CHARTING. Scaling for the Enterprise with Metric Insights Copyright Metric insights, Inc.

Deciding When to Deploy Microsoft Windows SharePoint Services and Microsoft Office SharePoint Portal Server White Paper

Efficient database auditing

TYPO3 6.x Enterprise Web CMS

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions

Sentimental Analysis using Hadoop Phase 2: Week 2

Zabbix : Interview 2012 of Alexei Vladishev

A SHORT INTRODUCTION TO CLOUD PLATFORMS

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS

Transcription:

exist A NOSQL DOCUMENT DATABASE AND APPLICATION PLATFORM Erik Siegel & Adam Retter

exist Get a head start with exist, the open source NoSQL database and application development platform built entirely around XML technologies. With this hands-on guide, you ll learn exist from the ground up, from using this featurerich database to work with millions of documents to building complex web applications that take advantage of exist s many extensions. If you re familiar with XML as a student, professor, publisher, or developer you ll find that exist is ideal for all kinds of documents. This book shows you how to store, query, and search documents with XQuery and other XML technologies, and how to construct applications on top of the database with tools such as exide and exist s built-in development environment. Manage both data-oriented and text-oriented markup documents securely Build a sample application that analyzes and searches Shakespeare s plays Go inside the architecture and learn how exist processes documents Learn how to work with exist s internal development environment Choose among various indexes, including a full-text index based on Apache Lucene Dive into exist s APIs for integrating or interacting with the database Extend exist by building your own Triggers, Scheduled Tasks, and XQuery extension modules Erik Siegel is a content engineer and XML specialist who runs Xatapult consultancy in The Netherlands. He specializes in content design and conversion, XML schemas and transformations, and exist and XProc applications. Adam Retter, Director of Evolved Binary in the UK and a cofounder of exist Solutions in Germany, has been a member of the exist Core Development Team since 2005. He is also a member of the XML Guild and an Invited Expert to the W3C XML Query Working Group. This book tells you everything you need to know to implement exist, all the way from writing your first queries to building sophisticated web applications. Priscilla Walmsley XML consultant and author of XQuery Erik Siegel and Adam Retter have written a technical tour de force about the database in their eponymous book exist, one that's both eminently readable while still digging deep into the inner workings of the database and how to use it. Kurt Cagle Principal Evangelist at Avalon Consulting XML/DATABASES US $44.99 CAN $47.99 ISBN: 978-1-449-33710-0 Twitter: @oreillymedia facebook.com/oreilly

O Reilly Ebooks Your bookshelf on your devices! When you buy an ebook through oreilly.com you get lifetime access to the book, and whenever possible we provide it to you in five, DRM-free file formats PDF,.epub, Kindle-compatible.mobi, Android.apk, and DAISY that you can use on the devices of your choice. Our ebook files are fully searchable, and you can cut-and-paste and print them. We also alert you when we ve updated the files with corrections and additions. Learn more at ebooks.oreilly.com You can also purchase O Reilly ebooks through the ibookstore, the Android Marketplace, and Amazon.com. Spreading the knowledge of innovators oreilly.com

exist by Erik Siegel and Adam Retter Copyright 2015 Erik Siegel and Adam Retter. All rights reserved. Printed in the United States of America. Published by O Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/ institutional sales department: 800-998-9938 or corporate@oreilly.com. Editors: Simon St. Laurent and Allyson MacDonald Production Editor: Matthew Hacker Copyeditor: Rachel Monaghan Proofreader: Rachel Head Indexer: Lucie Haskins Interior Designer: David Futato Cover Designer: Ellie Volckhausen Illustrator: Rebecca Demarest December 2014: First Edition Revision History for the First Edition 2014-12-04: First Release See http://oreilly.com/catalog/errata.csp?isbn=9781449337100 for release details. The O Reilly logo is a registered trademark of O Reilly Media, Inc. exist, the cover image of a lettered araçari, and related trade dress are trademarks of O Reilly Media, Inc. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights. 978-1-449-33710-0 [LSI]

Table of Contents Preface....................................................................... xi 1. Introduction................................................................ 1 What Is exist? 1 exist Compared to Other Database Systems 3 History 5 Competitors 7 Open Source Competitors 8 Closed Source, Commercial Competitors 8 Who Is Using exist, and for What? 9 Contributing to the Community 13 Individuals Using exist 14 Organizations Using exist 14 Authors Using exist 15 Developers Using exist 16 Additional Resources 16 2. Getting Started............................................................ 19 Downloading and Installing exist 19 Preconditions 19 Downloading exist 20 Things to Decide Before Installing 20 Installing exist 22 Post-Installation Checks 22 Starting and Stopping exist with a GUI 23 Starting and Stopping exist from the Command Line 24 A First Tour Around Town 24 The Dashboard 24 Playing Around 27 What s in Your Database 27 What s on Your Disk 28 iii

The Java Admin Client 29 Getting Files into and out of the Database 30 Hello exist! 31 Hello Data 31 Hello XQuery 32 Hello XSLT 33 Hello XInclude 35 Hello XForms 35 3. Using exist 101............................................................. 39 Preparations and Basic Application Setup 39 exist Terminology 40 Exporting Documents from exist 40 Designing an Application s Collection Structure and Importing Data 42 Viewing the Data 43 Listing the Plays (XML) 45 Listing with the collection Function 45 Listing with the xmldb Extension Module 48 Listing the Plays (HTML) 48 Analyzing the Plays 51 Linking the Analysis to the Play Overview 55 Searching the Plays 56 Searching Using Straight XQuery 56 Searching Using an Index 58 Creating a Log 60 What s Next? 64 4. Architecture............................................................... 65 Deployment Architectures 67 Embedded Architecture 68 Client/Server Database Architecture 69 Web Application Platform Architecture 70 Storage Architecture 72 XML Document Storage and Indexing 72 Binary Document Storage 76 Efficient XML Processing Architecture 76 Collections 77 Documents 79 Dynamic Level Numbering of Nodes 80 Dynamic Level Numbering and Updates 83 Paging and Caching 85 iv Table of Contents

5. Working with the Database.................................................. 87 The Database s Content 87 Help: Where Is My XML? 87 Terminology 88 Properties of Collections and Resources 88 System Collections 90 Addressing Collections, Resources, and Files 91 The XPath Collection and Doc Functions in exist 93 The collection Function 93 The doc Function 94 Querying the Database Using REST 94 Security 95 GET Requests 95 PUT Requests 97 DELETE Requests 97 POST Requests 98 Ad Hoc Querying 100 Updating Documents 101 exist s XQuery Update Extension 102 XUpdate 105 Controlling the Database from Code 107 Specifying Collections and Resources for the xmldb Extension Module 107 Getting Information 108 Creating Resources and Collections 109 Setting Permissions 110 Moving, Removing, and Renaming 110 6. XQuery for exist........................................................... 111 exist s XQuery Implementation 111 XQuery 1.0 Support 111 XQuery 3.0 Support 112 Other XQuery Extras 116 XQuery Execution 118 Serialization 118 Controlling Serialization 119 Serialization Options 119 Controlling XQuery Execution 123 exist XQuery Pragmas 123 Limiting Execution Time and Output Size 124 Other Options 124 XQuery Documentation with xqdoc 125 Table of Contents v

7. Extension Modules........................................................ 127 Types of Extension Modules 127 Extension Modules Written in Java 127 Extension Modules Written in XQuery 128 Enabling Extension Modules 128 Enabling Java Extension Modules 129 Enabling XQuery Extension Modules 130 8. Security.................................................................. 133 Security Basics 134 Users 134 Groups 135 Permissions 135 Default Permissions 138 Managing Users and Groups 140 Group Managers 140 Tools for User and Group Management 141 User and Group Management with the Java Admin Client 145 Scenario 145 Creating a Group 145 Creating Users 147 Setting Group Managers 149 Managing Permissions 151 Tools for Permission Management 151 Permission Management with the Java Admin Client 154 Access Control Lists 156 Access Control Entries 157 ACLs by Example 158 Managing ACLs 164 Realms 166 LDAP Realm Module 166 Other Realm Modules 174 Hardening 174 Reducing Collateral Damage 175 Reducing the Attack Surface 177 User Authentication in XQuery 187 xmldb:authenticate 188 xmldb:login 188 Backups 189 9. Building Applications...................................................... 191 Overview 191 vi Table of Contents

Which Technology to Use? 192 Application Aspects 192 Getting Started, Quickly? 193 Where to Store Your Application? 194 URL Mapping Using URL Rewriting 194 Anatomy of a URL Rewriting-Based Application 195 How exist Finds the Controller 198 The URL Rewriting Controller s Environment 199 The Controller s Output XML Format 200 Advanced URL Control 203 Changing the URL for URL Rewriting 205 Changing Jetty Settings: Port Number and URL Prefix 205 The controller-config.xml Configuration File 206 Proxying exist Behind a Web Server 207 Requests, Sessions, and Responses 209 The request Extension Module 209 The session Extension Module 211 The response Extension Module 211 Application Security 212 Running with Extra Permissions 214 Global Error Pages 215 Building Applications with RESTXQ 215 Configuring RESTXQ 216 RESTXQ Annotations 217 RESTXQ XQuery Extension Functions 227 Packaging 227 Examples 229 The Packaging Format 229 The Prepare and Finish Scripts 233 Creating Packages 234 Additional Remarks About Packages 234 10. Other XML Technologies.................................................... 237 XSLT 238 Embedding Stylesheets or Not 238 Invoking XSLT with the Transform Extension Module 240 Passing XSLT Parameters 241 Invoking XSLT by Processing Instruction 242 Stylesheet Details 243 XInclude 243 Including Documents 244 Including Query Results 245 Table of Contents vii

Error Handling and Fallback 245 Validation 246 Implicit Validation 246 Explicit Validation 248 Collations 251 Supported Collations 251 Specifying Collations 251 XSL-FO 252 XForms 254 XForms Instances 255 XForms Submissions 258 betterform 263 XSLTForms 265 11. Basic Indexing............................................................ 271 Indexing Example 272 Index Types 274 Structural Index 274 Range Indexes 274 NGram Indexes 275 Full-Text Indexes 275 Configuring Indexes 275 Configuring Range Indexes 276 Configuring NGram Indexes 278 Maintaining Indexes 278 Using Indexes 279 Using the Structural Index 279 Using the Range Indexes 279 Using the NGram Indexes 280 General Optimization Tips 281 Debugging Indexes 281 Checking Index Definitions 282 Checking Index Usage 282 Tracing the Optimizer 283 12. Text Indexing and Lookup.................................................. 285 Full-Text Index and KWIC Example 285 Configuring Full-Text Indexes 286 Configuring the Search Context 287 Handling Mixed Content 290 Maintaining the Full-Text Index 291 Searching with the Full-Text Index 292 viii Table of Contents

Basic Search Operations 292 Scoring Searches 296 Locating Matches 296 Using Keywords in Context 297 Defining and Configuring the Lucene Analyzer 298 Manual Full-Text Indexing 301 13. Integration............................................................... 303 Choosing an API 303 Remote APIs 305 WebDAV 305 REST Server API 319 XML-RPC API 342 XML:DB Remote API 349 RESTXQ 353 XQJ 359 Deprecated Remote APIs 361 Remote API Libraries for Other Languages 363 Local APIs 364 XML:DB Local API 366 Fluent API 369 14. Tools.................................................................... 373 Java Admin Client 373 exide 374 oxygen 375 Connecting with oxygen Using WebDAV 376 Natively Connecting with oxygen 377 Ant and exist 379 Trying the Ant Examples 379 Preparing an exist Ant Build Script 380 Using Ant with exist 381 15. System Administration..................................................... 385 Logging 385 JMX 387 Memory and Cache Tuning 389 Understanding Memory Use 390 Cache Tuning 394 Backup and Restore 396 Client-Side Data Export Backup 397 Server-Side Data Export Backup 400 Table of Contents ix

Restoring a Clean Database 403 Emergency Export Tool 404 Installing exist as a Service 405 Solaris 406 Windows Linux and Other Unix 407 Hosting and the Cloud 408 Entic 408 Amazon EC2 408 Other Cloud Providers 412 Getting Support 413 Community Support 415 Commercial Support 415 16. Advanced Topics.......................................................... 417 XQuery Testing 417 Versioning 427 Historical Archiving 427 Document Revisions 428 Scheduled Jobs 435 Scheduling Jobs 436 XQuery Jobs 438 Java Jobs 441 Startup Triggers 446 Configured Modules Example Startup Trigger 448 Database Triggers 449 XQuery Triggers 453 Java Triggers 457 Internal XQuery Library Modules 467 Using the Hello World Module 473 Types and Cardinality 474 Function Parameters and Return Types 477 Variable Declarations 482 Module Configuration 483 Developing exist 483 Building exist from Source 485 Debugging exist 488 A. XQuery Extension Modules.................................................. 493 B. REST Server Processes...................................................... 525 Index....................................................................... 543 x Table of Contents

CHAPTER 1 Introduction What Is exist? As it turns out, this is quite a difficult question to answer. The problem lies in the wide audience that exist serves. exist is many things to many people, and thus there is no single succinct answer. exist is an open source piece of software written in Java that is freely available in both source code and binary form. exist has always been made available under the Lesser GNU Public License (LGPL), version 2.1. While exist makes use of many other open source libraries itself, all of these are compatible with the LGPL, and exist eschews the GPL license in favor of freedom of choice for its users. exist was conceived as a native XML database. As a database, its unit of atomicity is the document, so we could very easily brand it a NoSQL document database. However, to do so would be to do an injustice to the software, and worse, to all of those who have contributed to making exist much more than just a NoSQL database over the years. Unlike most NoSQL databases, which each have their own proprietary database query language, exist makes use of a standardized query language developed by the W3C: XML Query Language (XQuery). With a standard query language, you have the ability to write code that can be used not just on exist, but on any platform or processor that supports XQuery. Some of the benefits of XQuery are that it is: Synergistic XQuery was carefully designed and evolved over a six-year period by an open working group with many contributors, meaning that many real industry use cases were considered during its construction. XQuery has been influenced by 1

several previous languages and concepts, such as Perl, Lisp, Haskell, SQL, and many more. Easy to use XQuery was designed to be simple to use and debug, meaning that nonprogrammers (given an understanding of their documents) should be able to easily construct queries. Many exist users work in the humanities and have no formal computer science background, but are quite comfortable writing complex XQueries to query their documents. Easy to optimize The XQuery specification does not detail how an implementation should perform query processing, and its developers have given great thought to ensuring that any implementation can optimize processing of query operations. Likewise, as a moderate user of XQuery, you can often easily understand why a particular query is slow and what you may do to improve it. Easy to index Join operations in XQuery (e.g., predicates and where clauses) lend themselves well to index optimization, which exist exploits to speed up XQuery execution. exist provides multiple indexing schemes that the user may configure. Turing complete XQuery is more than just a query language: it is in a class of languages known as Turing complete, which means that it is a complete programming language and any program can be expressed in it. XQuery is also a functional programming language, as opposed to a procedural one, meaning that it is generally easier to construct programs that you can easily understand and that ultimately contain fewer bugs. In exist you can build entire applications in just XQuery! Query first While XQuery is a programming language, it is designed primarily as a query language. Therefore, it is much easier to extract just a few elements from large data collections with XQuery than, say, with XSLT. More than you realize! While XQuery is easy to get started with, its functional nature can make it tricky to work with if you have only procedural programming experience. XQuery can be incredibly elegant, and we are frequently surprised at how very complex problems may be solved quite simply in XQuery when considered in a different light. We should also perhaps mention that exist is not just for XML documents. You can, in fact, store any file into the database, and exist can do some very clever things with content extraction and metadata with non-xml documents to help you query and manage those binary formats too. 2 Chapter 1: Introduction

We could stop there and focus the rest of the book on the database, but you would really miss out on the good stuff. exist is also a web server: you can make web requests directly to the database to store, retrieve, or update XML documents. exist achieves this by providing an HTTP REST API that describes the database. It also provides a WebDAV interface so that your users can easily drag and drop documents from their desktops into the database, or open a document for editing. But wait, there s more! As exist evolved over the years it became clear that being able to store, retrieve, and edit documents via the Web was neat, but also being able to store XQuery into the database and execute it via a web request from your web browser meant you could easily construct very powerful web applications directly on top of the database. exist of course continued to evolve here, providing new features for forms, web application packaging, improved security, SQL queries, SSL, and support for producing and consuming JSON and HTML 5, among other offerings. So, in summary, what is exist? A NoSQL document database for XML and binary (including text) A web server for consuming and serving documents A document search engine A web application platform A document creation and capture platform (XForms) A data mashup and integration platform An embeddable set of libraries for use in your own applications And much, much more exist Compared to Other Database Systems Let s take a moment to discuss some of the main differences between exist and other SQL and NoSQL database systems. exist is: Document oriented Unlike traditional RDBMSs (Relational Database Management Systems) such as Oracle, MySQL, and SQL Server, which are table oriented, exist is a NoSQL document-oriented database. Many other NoSQL document databases (including MongoDB and Apache Cassandra) store JSON documents, whereas exist stores XML documents. One of the key advantages of XML over JSON is the ability to handle complex document structures using mixed content. JSON easily handles data-oriented documents, while XML easily handles both data-oriented markup and text-oriented exist Compared to Other Database Systems 3

documents. Another key advantage of XML over JSON is that you can adopt namespaces to cleanly model different business domains. Schemaless RDBMSs and even several NoSQL databases require you to define your data schema before you can start storing your data. exist is entirely flexible, and allows you to store your documents without specifying any schema whatsoever. It is ideal for business problems that have high-variability data and also helps developers rapidly prototype and evolve applications. However, schemaless should not be confused with providing validation of documents. Should you wish, you can also define a schema in exist and have exist enforce that only documents meeting your schema requirements are stored or updated. Portable queries RDBMSs typically use a standardized SQL query language; however, in practice, apart from the most basic queries it can be hard to run the same SQL queries across different RDBMS database products. Likewise, most NoSQL systems have their own proprietary query languages, which are entirely product-specific. exist takes a very different approach and provides XQuery and XSLT, which are W3C standardized query and transformation languages, meaning that with very little effort you can execute your exist queries on any other product that provides an XQuery and/or XSLT processor. Structured search Like many database systems, exist allows you to define different indexes for your searches. However, combined with the ability to search based on the document structure, this makes exist search results more precise than those of almost any other database when dealing with structured documents such as TEI, DocBook, and DITA. exist will consistently have better search metrics (precision and recall) than search systems that ignore document structure. If findability is high on your list of desired attributes, then exist is a great choice. Forms Oracle provides Oracle forms for use with its RDBMS. We are not aware of any NoSQL databases that provide form support for constructing end-user interfaces that can feed directly into the database. exist supports XForms (another W3C standard), which allows you to easily capture user input into XML documents in the database. Some organizations find that exist is ideal not just for managing data collection with forms, but also for entire backend workflows around the content publishing process. Application development Like some RDBMSs and NoSQL databases, exist is embeddable into your own applications. However, when you are using exist as a server, it really becomes an application platform and offers more than almost any other database system. 4 Chapter 1: Introduction

When running exist as a server, you can develop entire applications in exist s high-level query languages (XQuery and XSLT) without necessarily having to be a computer scientist or programmer. Transaction management Most RDBMSs and many NoSQL databases allow you to control your transactions from within your database queries or API calls. Unfortunately, exist does not currently support database-level transaction control. exist does have transactions internally and uses a database journal to ensure the durability and consistency of your data, but these are not exposed to the user. Transaction control is high on the list of desirable features for exist, and some options have already been explored for the future. Horizontal scalability and replication A feature of many RDBMSs and NoSQL database systems is the ability to cluster database nodes to increase database scalability and capacity. exist is currently mostly deployed on single servers. As of this writing, it has no support for automatic sharding of data. exist has recently gained support for replication in version 2.1 through JMS (Java Message Service) message queueing. The replication feature allows you to have a master/slave database, which is highly available for reads that may occur from any node, but it is still currently recommended to send writes to the master node. For further details of the emerging replication support in exist, see https://github.com/exist-db/messaging-replication/wiki. History Once upon a time, around the turn of the 21st century, there was a researcher named Wolfgang Meier working at the Technical University of Darmstadt. He was in need of a system to analyze and query XML data, and since there was nothing around that satisfied his needs, he decided to write something himself: exist. Starting out in C++, Meier quickly turned to Java, and by the beginning of 2001, a first version was available. It was based on a relational database backend and, compared to where we are now, very primitive. The functionality was basic and it was slow on indexing, but yes, it already had some XPath on board. Immediately, some dictionary research projects started using it. The next stage was replacing the relational backend with native XML storage. While this was happening, more and more people started using exist, and around 2004 the first commercial projects arrived. The development of exist has since then mostly been financed by its users, who needed new functionality and were willing to pay for it. Implementing XQuery met some resistance. At that stage, exist was still mostly an XML database only. Why would you need something like XQuery if you already have History 5

XPath? Luckily (for us), a professor of literature really needed XQuery support and paid for its implementation. It was embedded in the product by 2005. During 2005 exist was going so well that Meier was able to quit his university job and concentrate on exist projects only. By that time some other programmers had come on board, and they constitute what we now know as the original core programmer team. In alphabetical order, they were Pierrick Brihaye, Leif-Jöran Olsson, Adam Retter, and Dannes Wessels. Up to 2006 the version number was kept to v0.xx, but in 2006 a real v1.0 was released! By this time, having previously only communicated via the Internet, the core programmer team met live for the first time in 2007 in Versailles. One of the first things they did was to check exist against the official XQuery test suite, which subsequently resulted in the current 99%+ compliance score. The product kept evolving. A major improvement was replacing the existing scheme for node identifiers with a much better one. As a result of that, limitations on XML size and structure disappeared. Stability and transaction management were improved and the Lucene full-text search engine added. From the original research/retrieval tool, exist evolved into something we really could call a native XML database. The XRX (XForms, REST, XQuery) paradigm popped up as a way to create fully XML-driven applications. exist was among the first engines that made this possible. It turned, slowly but surely, into a full-blown application platform. With version 1.4 of exist released in 2009, suddenly many more organizations were using exist in their production systems day to day. More development effort went into stabilizing, fixing bugs, and improving reliability. With this, the first settled application problems arrived: it became harder and harder to change anything without breaking backward compatibility. Consider, for instance, exist s XQuery update support: an implementation of a draft version of the standard for writing XQuery statements to update XML. Switching to the final standardized version is virtually impossible because it would break backward compatibility and existing applications would stop working. However, the development team did not stop working, and gradually the 2.0 version as we know it came to life. Release candidates were made available throughout 2012, containing a large number of major changes and additions to the previous versions: Behind the scenes, the XQuery engine and optimizer were improved. Support for (large parts of) XQuery 3.0 was added. The way the indexes work was redesigned to reduce lock contention, offer modularity, and improve performance. 6 Chapter 1: Introduction

Security was reorganized and now works not only a lot faster, but also in a way most developers are comfortable with (i.e., similar to Unix-like systems). The repository manager was added, opening the way to a more modular exist. RESTXQ, a standard for coupling function invocations to URLs, was added. And, of course, numerous other small improvements were made. The final version 2.0, released in February 2013, was a massive leap forward from 1.4, representing the culmination of more than three years worth of sustained development effort. As such, it was not completely without backward compatibility problems. For instance, existing XQuery applications will have to do something about their security settings before they can run on the new version. However, that s not too hard and is well worthwhile. Version 2.1 was released shortly after, in July 2013, and consisted mostly of bug fixes and a new version of exide. In February 2014, a release candidate of exist 2.2 was made available, which along with the usual bug fixes included a completely new range index based on Lucene that offered much improved query times. It is expected that exist will keep evolving. The plans are to move more and more toward a core product with separate modules, enabling adding à la carte functionality as needed. Competitors Now, obviously we are passionate about exist; otherwise, you would not be reading a book we have written on the subject. More importantly, though, we are passionate about open source, and even more so we are concerned with quality software and using the right tool for the job. Like any other product, exist has both strengths and weaknesses, and it would be somewhat misleading if we were not to share the whole story with you. Pointing out the weaknesses of a software product for which you have bought a book may not help us sell more books, but we do hope it will help you make informed decisions. As exist has such a wide scope, it is impossible to compare it directly to other products; so, we compare it instead against other native XML databases that also couple web server and application platform capabilities. exist s competitors can be split into two categories: those that are open source and freely available, and the closed source, commercial offerings. By no means is what follows a complete list, but it contains the offerings that we believe are popular and frequently encounter when talking to others. A further independent comparison is available in the XML database article on Wikipedia. Competitors 7

Open Source Competitors Let s begin with the open source exist competitors. BaseX BaseX is a lightweight native XML database with some application server facilities written in Java. The project was started in 2005 by Christian Grün at the University of Konstanz, and BaseX was released as open source in 2007. BaseX promotes ease of use and provides an easy-to-use GUI frontend also written in Java. Compared to exist, BaseX adheres more closely to the W3C XQuery specifications, achieving 99.9% compliance with the W3C XQuery 1.0 specification (exist has 99.4%) and implementing the specifications for XQuery Update 1.0 and XQuery Full- Text. exist has an older draft implementation of XQuery Update and its own proprietary full-text search. exist, however, has been available for significantly longer, and thus benefits from many more features, such as XSLT. BaseX is released under the more liberal BSD license. Commercial support is available for BaseX from BaseX GmbH, which was founded to support commercial applications of BaseX. Sedna Sedna is a lightweight native XML database without application server capability, written in C and C++. The origins of Sedna are not well documented, but it appears to have started around 2003 as a project of the Institute for System Programming at the Russian Academy of Sciences. Sedna seems to focus on providing core database services and little more. While it has no REST Server of its own, it can be configured to work as a module within the Apache HTTP Server. Compared to exist, Sedna supports more APIs for different programming languages directly; exist mostly assumes that developers will use its REST or RPC APIs, and leaves language APIs to third-party providers. Sedna reports 98.8% compliance with the W3C XQuery 1.0 specification; as mentioned previously, exist has 99.4% compliance. Sedna, like exist, implements its own proprietary full-text search, and a draft version of XQuery Update. Sedna is released under the Apache 2.0 license. There does not appear to be a commercial support offering for Sedna. Closed Source, Commercial Competitors Now we ll take a look at exist competitors that are closed source and commercially available. 8 Chapter 1: Introduction

28.io 28.io is a PaaS (Platform as a Service) for the Zorba open source XQuery processor. 28.io integrates Zorba with MongoDB, supporting the storage and indexing of XML into MongoDB as its main datastore. The query optimizer leverages the full capabilities of MQL (Mongo Query Language), enabling developers to leverage the expressiveness and productivity of XQuery atop a highly scalable store. In comparison with exist, 28.io focuses on the cloud and manages its database entirely using XQuery, whereas exist provides a number of Admin GUI tools and additional APIs. 28.io, much like exist, provides an application server platform enabling you to build entire apps using XQuery. 28.io (through Zorba) has similar compliance to the W3C XQuery 1.0 specification as exist, but also supports XQuery Update, XQuery Full-Text, and XQuery Scripting. 28.io s main advantage over exist is its cloud scaling; exist s main advantage is XSLT, XRX, and XForms support. 28.io is developed by 28msec. 28msec is based in Zurich, Switzerland, and has strong research links with ETH Zürich. MarkLogic Server MarkLogic Server is a standalone native XML database server, written in C++, that provides XQuery and XSLT query and transform capabilities. MarkLogic Server also has the capability to cluster nodes to scale horizontally, with the additional capability to pass large batch processing jobs off to Hadoop. MarkLogic distances itself from the technical marketing of XML and XQuery and instead identifies itself as a NoSQL database solution for the enterprise. Compared to exist, MarkLogic markets itself as being able to handle petabytes of XML data. exist can currently scale to hundreds of gigabytes, but this is very much dependent on the dataset and queries made. MarkLogic lacks a documentrepresentative REST API, but does provide a REST API for application development. MarkLogic s main advantage over exist is scaling to huge datasets, while exist s advantage is its fast innovation and rich feature set. Both support XSLT, but Mark Logic does not support XQuery Update; rather, it provides its own proprietary functions. MarkLogic Server is developed by MarkLogic Corporation, based in San Carlos, California. Who Is Using exist, and for What? The problem with giving something away for free with no questions asked is that you can never quite be sure: How many people are using it Who Is Using exist, and for What? 9

Who the people using it are What it is being used for From the support channels available to exist users, and as a member of the community, you can see that exist is used by many people for many different purposes, but their end goals and projects are not always disclosed or clear. Here we have pulled together a few descriptions of various projects using exist from the developers of those projects themselves: The Tibetan Buddhist Resource Center (TBRC) holds the world s single largest collection of Tibetan texts nearly 10 million scanned pages and over 11,000 Unicode Tibetan texts. TBRC.org provides online access to over 4,000 users via an Ajax client written in Google Web Toolkit as a front-end to the exist-db. TBRC has used exist-db since 2004 to store the catalog for the texts in the library as well as a knowledge-base of persons and places that provide a cultural context for Tibetan literature. The integration of exist-db with the Lucene full-text indexing has created a powerful framework with which TBRC.org is able to provide searchable access to the library via comprehensive tables of contents in Tibetan and a large collection of texts that have been input in Unicode in centers around the world. Our production system currently runs exist-db 2.1. Chris Tomlinson, Senior Technical Staff Member, Tibetan Buddhist Resource Center, Cambridge, Massachusetts ScoutDragon initially started as a baseball research project by a group of baseball enthusiasts including writers, agents, scouts, fans, fantasy owners, and even former players. This group realized a need for original English content, data, and research on baseball players in Asia. All data for multiple sports covering multiple sporting leagues is stored in XML documents within exist in a schema derived from IPTC s SportsML, most extensions having to do with providing multi-lingual support of players so that information may be displayed in English, Japanese, Korean, and/or Chinese. XQuery has proven to be a fantastic language for not just transforming the vast quantities of data to web pages, but also for data analysis and the generation of sabermetrics-based statistics. Michael Westbay, Lead Programmer/System Administrator, ScoutDragon.com, Japan Semanta s core business is metadata in business intelligence. Part of our concern is parsing metadata from reporting platforms. Many of these reporting platforms supply their metadata in large XML chunks, which we then need to further process efficiently. A typical example is our IBM Cognos connector, where we use exist heavily to extract details of report structures and data sources. Originally we thought we would only use exist for prototyping, but ultimately, we have used embedded exist in the production 10 Chapter 1: Introduction

system; re-writing the connector without exist s XQuery turned out to be just too complicated! David Voňka, Programmer, Semanta, Czech Republic The Centre for Document Studies and Scholarly Editing of the Royal Academy of Dutch Language and Literature (Ghent, Belgium) develops rich scholarly collections of textual data, and publishes them as digital text editions and language corpora. From the start, we have fully embraced open standards and publication technologies. At first, we started out with the Cocoon XML publication framework, which back then nicely integrated with exist (or the other way round) for efficient querying of XML content. Since the introduction of exist s MVC framework, we have extended our use of exist as a full application server, not only for querying the indexed data, but also driving the entire application and presentation logic. The texts we re querying (or rather, processing) with exist are mostly documentcentered XML documents that are conformant to the schemas developed by the Text Encoding Initiative (TEI). Depending on the specific edition project, they are enriched with metadata such as named entities, editorial annotations, and sometimes highly specific textual documentation (such as critical apparatuses documenting variation among text versions). Though our texts are mostly in Dutch, we try to connect and contribute to methodological good practice emerging in the interesting field that is Digital Humanities. Some of our exemplar projects include a collection of letters in relation to the Belgian literary journal Van Nu en Straks; a digital edition comparing 20 versions of De trein der traagheid, a novel by the Belgian novelist Johan Daisne; and a digital edition of the first Dutch dialect survey in the Flemish region by Pieter Willems (developed between 1885 1890). Ron Van den Branden, Centre for Scholarly Editing and Document Studies of the Royal Academy of Dutch Language and Literature, Ghent, Belgium At the Cluster of Excellence Asia and Europe in a Global Context, we use exist-db to store our collections of MODS (bibliographical) and VRA (image metadata) records. We have developed two open source applications for this, Tamboti and Ziziphus, where our records can be searched and edited. Both applications are built entirely in XML technologies (XQuery and XForms) using exist-db and make use of LDAP integration and detailed user rights management. Heidelberg Research Architecture, Cluster of Excellence Asia and Europe in a Global Context, The University of Heidelberg, Heidelberg, Germany Haptix Games is a video game and interactive application development and publishing studio, and we have been a Microsoft shop for as long as I can remember. We have Who Is Using exist, and for What? 11

leveraged C#, MVC, and IIS for user experience; WCF, OData, and BizTalk for message exchanges; MSSQL and Entity Framework for storage. That is a lot of acronyms and even more complexity under the hood. Prototyping a concept usually involved all above-mentioned technologies, while the final solution release was either expensive, inflexible, or did not meet client expectations. With the adoption of exist-db our development and release workflows have become highly agile and more competitive. Utilizing exist-db as a dark-data solution platform and not just another XML database allowed us to eliminate 80% of our Microsoft code base just by taking advantage of the built-in web server, low-level data manipulation using XQuery 3.0, restful data exchange, and native storage capabilities. Chris Misztur, CTO, Haptix Games, Chicago, Illinois easydita is an end-to-end solution for collaboratively authoring, managing, and publishing content using the DITA XML standard. Companies utilize easydita to reduce the cost and time to market to deliver content in a variety of formats and languages. By leveraging exist, easydita is able to deliver customers exceptional ability to search, manage, localize, and publish content. exist s schemaless design and flexible indexing system makes it easy to support customizations like reporting, analytics, and new content models without sacrificing performance or doing major redesigns. Casey Jordan, cofounder, easydita, Jorsek LLC, Rochester, New York We [at XML Team Solutions] help media and entertainment companies integrate sports news and data feeds. These feeds are predominantly XML. We use exist for two things: 1. Regulating and preparing vendor web service XML for transmission to clients. Scheduled jobs access remote web services, preprocess, and pass on XML via HTTP Client to our main feed processor. 2. API to drive graphics for live television broadcast. API built from RESTXQ provides live updates of results to broadcaster clients. Currently uses JMS to sync from one write DB to two load-balanced readers. Also has an XForms beat the feed live score updater which mimics the incoming feed in case feed vendor is delayed. We recently delivered a project for BBC Sports to deliver live broadcast information. exist met all the requirements for speed, cost, and reliability for an API to deliver upto-date scores and statistics to BBC television. Paul Kelly, Director of Software Development, XML Team Solutions Corp, Canada 12 Chapter 1: Introduction

exist-db is at the core of [the Office of the Historian s] open government and digital history initiatives. It powers our public website, allowing visitors to search and browse instantly through nearly a hundred thousand archival government documents. On the fly, it transforms our XML documents and query results into web pages, PDFs, ebooks, and APIs and data feeds. Its support of the high-level XQuery programming language and its elegant suite of development tools empower me and my fellow historians to analyze data and answer research questions. The open source nature of exist-db has delivered far more value to us than its simply being free ; its active, welcoming, expert collaborative user community has helped us learn, discover exist-db s plethora of capabilities, and find the best solutions to our research and publishing challenges. exist-db belongs in the toolkit of all digital humanities, open government, and publishing projects. Contributing to the Community Joe Wicentowski, Historian, Office of the Historian, U.S. Department of State There is a vibrant and supportive community around the exist software, whose goal it is to make using exist easy for beginners and as painless as possible for advanced developers. The exist community prides itself on the agility and quality of its responses to support requests. There are many ways to contribute to exist and the community. You need not be a crack software engineer; even beginners asking questions on the mailing list can help others learn from their issues and encourage the developers to simplify or consider new approaches. To get in touch with the exist community, you have several channels available to you: Email: the exist-open mailing list This is the official preferred mechanism, and your best bet for getting a quick answer. There are also the exist-development and exist-commits mailing lists; the former is used for technical discussion of features and fixes that go into exist, and the latter is a feed of any changes made to the source code of exist. For further details, see http://sourceforge.net/p/exist/mailman/, http://existopen.markmail.org, and http://www.exist-db.org/exist/apps/doc/getting-help.xml. Stack Overflow: the exist-db tag While the mailing lists should currently be considered the primary support mechanism, Stack Overflow is also becoming popular for asking exist questions. You can find exist questions and answers under the exist-db tag. Contributing to the Community 13

Twitter: @existdb The Twitter channel is monitored by the core developers of exist. It s mostly used for announcements about exist, as providing tech support in 140 characters is tough! IRC: #existdb on irc.freenode.net The exist IRC chat room is for the community, by the community. It can be a mix of users and developers and is often worth a visit, but getting a response can really depend on who is awake and logged in. Individuals Using exist As a user of exist, be sure to do the following: Ask questions There are no stupid questions. Importantly, all questions and answers on the mailing list are archived so that others may learn from them also. Sensibly, you should search the archive first, to see if your question has already been answered; if not, or if the answer s not clear, then ask away! Report bugs All software has bugs! exist is no exception. If you think you have found a bug, it is probably best to discuss it on the exist-open mailing list first, and then, if it s confirmed, log it in the exist issue tracker on GitHub. If you want to report a bug, it s very important that the developers of exist understand how to reproduce what you are seeing; if you can t describe the problem and its cause, it s very hard for the community to help you! Ideally you should provide a reproducible test case of the absolute minimum steps required to cause the issue. See Getting Support on page 413 for more detail. Answer questions As you begin to use exist, you will start to learn more and more things that other users may not know. Why not get some good karma back by answering some questions on the mailing list? After all, it s a community! Evangelize If you re having a great time using exist, or you are enthralled by some neat feature, tell your friends, and let us and everyone else know by writing a blog entry or article. Organizations Using exist Open source developers are often working on a project for the love of it, and many of the developers of exist contribute much of their time to the project completely unpaid. Sadly, love for developing open source code with your friends does not nec 14 Chapter 1: Introduction

essarily equate to food or shelter. If you re part of an organization making free use of exist, there are a number of ways that you can contribute back to the community: Sponsor features or bug fixes Perhaps there is some feature that you wish that exist had that would really help your project, or there is a bug that sometimes upsets your system. Your organization could financially sponsor a developer from the exist community to add this feature or resolve that issue. Sponsoring exist developers for small or large projects helps support them in their work on exist and could provide new or improved functionality to the community. If you want to give something back financially but have no specific features or bug fixes in mind, just get in touch via exist-open and the community will helpfully propose a project to meet your budget. Friday afternoon exist If you have developers in your organization, empowering them to spend a small amount of their paid work time contributing to the development of exist can also be a great way to give back to the community. For example, Google allows its developers to work on open source projects on Friday afternoons and has realized various benefits from this. Contracts and jobs Are you looking for someone who is an expert in exist and XQuery and/or XML technologies? The exist-open mailing list can be a great place to advertise. You will more than likely end up sponsoring one of the contributors to exist, as they tend to be the people who really know it inside and out. Support and maintenance contracts If you re serious about using exist in your projects and running production systems on it, you will more than likely want the support and operational security afforded by purchasing a support contract for it. exist Solutions provides a variety of support contracts and consultancy services for exist. It was founded by core developers of exist and contributes almost all of its resources back into developing the software. By working with exist Solutions, you are closely supporting and funding exist s development. Authors Using exist If you re an author using exist, here s how you can contribute: Documentation exist has a large set of documentation that accompanies it, but it is by no means complete or exhaustive. You do not have to be a developer to write documentation for exist, and all improvements to the documentation are warmly accepted. Contributing to the Community 15

Developers Using exist Developers using exist can give back in the following ways: Bugs, patches, and new features Found a bug? Want to submit a patch or new feature? Why not roll up your sleeves and get your hands dirty? In the beginning the exist code base may seem intimidating in its size, but it s fairly modular and easy to get around. And if you have the skills, there is often no quicker way to get something fixed than to do it yourself, while hopefully learning a few new and interesting things along the way. Bug reports should be posted to the exist-development mailing list first, and then logged in the issue tracker on GitHub. Patches can be submitted by means of a pull request to the exist GitHub repository. For further information about developing exist, see Developing exist on page 483. Additional Resources This section contains additional informational resources. It s compiled from our personal preferences and bookshelves, meaning there are many other good sources of information around. However, this list is a good place to start: General W3C (World Wide Web Consortium) The W3C is the body that manages, among other things, the XML standards. The website is surprisingly easy to use, yet informative. W3 Schools For a quick high-level overview of any W3C standard with practical examples, try the W3 Schools. XQuery XQuery, by Priscilla Walmsley (O Reilly, 2007) This is probably the best XQuery book available in our opinion. XQuery wikibook, edited by Dan McCreary et al. The XQuery wikibook is an excellent resource for XQuery and exist,with the majority of the examples developed for exist. XRX wikibook, edited by Dan McCreary et al. The XRX wikibook, like the XQuery wikibook, is an excellent resource when you re building applications atop exist using REST and XForms. 16 Chapter 1: Introduction

XPath and XQuery Functions and Operators 3.0 (W3C, 2014) The F+O specification is a great resource for quickly looking up the available functions and their specification for XQuery, XPath, and even XSLT. XQuery: The XML Query Language, by Michael Brundage (Addison-Wesley, 2004) This was a great book at the time it was published; while still relevant, it was released before the final XQuery 1.0 specification. XQuery from the Experts: A Guide to the W3C XML Query Language, by Don Chamberlin et al. (Addison-Wesley, 2004) Again, this book was released before the final XQuery 1.0 specification, but it is useful for those who want to know the nitty-gritty details like the formal underpinnings of the language. The XQuery Talk mailing list The xquery-talk mailing list is a great place to ask XQuery questions that are not specific to exist-db. XSLT XSLT 2.0 and XPath 2.0: Programmer s Reference, 4th Edition, by Michael Kay (Wiley, 2008) This is the book to have beside your keyboard if you ever want to do any serious XSLT programming. XSL-List The XSL-List is the best place to ask XSL questions and receive help. Note that it has an excellent archive; we suggest that you search that first for an answer before asking! XSLT Questions and Answers FAQ, curated by Dave Pawson The XSLT FAQ is an incredible resource that has many answers from those who were involved in specifying XSLT and those recognized as subject experts. XSLT, 2nd Edition, by Doug Tidwell (O Reilly, 2008) XSLT Cookbook, 2nd Edition, by Sal Mangano (O Reilly, 2006). Also a very handy book to have when you only sporadically program XSLT; it contains many useful recipes. XForms XForms Tutorial and Cookbook wikibook, edited by Dan McCreary Additional Resources 17

The XForms wikibook is an excellent resource for XForms examples, especially as many of the articles are developed against exist. XForms Essentials, by Micah Dubinko (O Reilly, 2003) An excellent reference guide to have when you re working with XForms, with some good explanations of the W3C XForms specification. XML Schema Definitive XML Schema, by Priscilla Walmsley (Prentice Hall, 2013) RELAX NG, by Eric van der Vlist (O Reilly, 2003) XSL-FO XSL, XSL-FO FAQ, curated by Dave Pawson The XSL-FO FAQ is in a similar vein to the XSLT FAQ and likewise is an invaluable resource, with many questions answered by subject experts. XSL Formatting Objects: Developer s Handbook, by Doug Lovell (Sams, 2003) 18 Chapter 1: Introduction

Want to read more? You can buy this book at oreilly.com in print and ebook format. Buy 2 books, get the 3rd FREE! Use discount code: OPC10 All orders over $29.95 qualify for free shipping within the US. It s also available at your favorite book retailer, including the ibookstore, the Android Marketplace, and Amazon.com. Spreading the knowledge of innovators oreilly.com