The InkML language and other W3C technologies applied to a multichannel document processing system
José Antonio Magaña Mesa (jomag@hp.com), R+D Software Engineer, member of the InkML WG
HP Barcelona Division, November 2004
© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
"In the next 100 years, half of the 6000 languages spoken in the world will disappear."
American Language Society (Chicago)
What is InkML?
http://www.w3.org/2002/mmi/ink

InkML is an XML data format for representing digital ink data that is input with an electronic pen or stylus as part of a multimodal system. In the context of the W3C Multimodal Interaction Framework, the markup provides a format for:
1. transferring digital ink data between devices and software components
2. storing hand-input traces for:
   - handwriting recognition (including text, mathematics, chemistry)
   - signature verification
   - gesture interpretation

The specification is based on the following documents:
1. Ink Requirements Note, published by the Working Group on 22 January 2003
2. InkXML (W3C Members only), a specification contributed by IBM, Intel, the International Unipen Foundation and Motorola

November 21, 2004
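To make the idea of a stored trace concrete, here is a minimal sketch of reading pen traces from an InkML fragment. It assumes the simplest form of the format: each `<trace>` holds comma-separated sample points, each point is a whitespace-separated X and Y value, and the namespace URI is the one shown. The sample data and the `parse_traces` helper are illustrative only and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Assumed InkML namespace URI (not stated in the slides).
INKML_NS = "http://www.w3.org/2003/InkML"

# Illustrative ink: two pen-down strokes with explicit X Y values.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <trace>10 0, 9 14, 8 28, 7 42</trace>
  <trace>130 155, 144 159, 158 160</trace>
</ink>"""


def parse_traces(inkml_text):
    """Return a list of traces, each a list of (x, y) float tuples."""
    root = ET.fromstring(inkml_text)
    traces = []
    for trace in root.iter(f"{{{INKML_NS}}}trace"):
        points = []
        for sample in (trace.text or "").split(","):
            values = sample.split()
            if not values:
                continue  # tolerate a trailing comma or empty trace
            x, y = (float(v) for v in values[:2])
            points.append((x, y))
        traces.append(points)
    return traces


traces = parse_traces(SAMPLE)
print(len(traces), traces[0][0])
```

A recognizer or signature verifier would consume the resulting point lists. This sketch does not handle the format's more compact encodings of sample values.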
More on InkML
The aim of the standard is to remain application-agnostic and to provide a basis on which any XML-based language that manipulates digital ink can be built, or to serve as a transport layer for ink.
[Diagram labels: dppml; Note taking; inkml; HPLI- Annotated data]
Companies participating in the working group: IBM, Fraunhofer Gesellschaft, Apple, Corel, (Motorola), HP
Last draft (3rd draft): 28 Sept 2004
Companies that have already built on it: HP, Mi-Co
Statement of work of InkML

Overview

As more electronic devices with pen interfaces have and continue to become available for entering and manipulating information, applications need to be more effective at leveraging this method of input. Handwriting is an input modality that is very familiar for most users since everyone learns to write in school. Hence, users will tend to use this as a mode of input and control when available.

A pen-based interface is enabled by a transducer device and a pen that allow movements of the pen to be captured as digital ink. Digital ink can be passed on to recognition software that will convert the pen input into appropriate computer actions. Alternatively, the handwritten input can be organized into ink documents, notes or messages that can be stored for later retrieval or exchanged through telecommunications means. Such ink documents are appealing because they capture information as the user composed it, including text in any mix of languages and drawings such as equations and graphs.

Hardware and software vendors have typically stored and represented digital ink using proprietary or restrictive formats. The lack of a public and comprehensive digital ink format has severely limited the capture, transmission, processing, and presentation of digital ink across heterogeneous devices developed by multiple vendors. In response to this need, the Ink Markup Language (InkML) provides a simple and platform-neutral data format to promote the interchange of digital ink between software applications.

InkML supports a complete and accurate representation of digital ink. For instance, in addition to the pen position over time, InkML allows recording of information about transducer device characteristics and detailed dynamic behavior to support applications such as handwriting recognition and authentication. For example, there is support for recording additional channels such as pen tilt, or pen tip force (often referred to as pressure in manufacturers' documentation).

InkML provides means for extension. By virtue of being an XML-based language, users may easily add application-specific information to ink files to suit the needs of the application at hand.

1.1 Uses of InkML

With the establishment of a non-proprietary ink standard, a number of applications, old and new, are expanded where the pen can be used as a very convenient and natural form of input. Here are a few examples.

Ink Messaging
Two-way transmission of digital ink, possibly wireless, offers mobile-device users a compelling new way to communicate. Users can draw or write with a pen on the device's screen to compose a note in their own handwriting. Such an ink note can then be addressed and delivered to other mobile users, desktop users, or fax machines. The recipient views the message as the sender composed it, including text in any mix of languages and drawings.

Ink and SMIL
A photo taken with a digital camera can be annotated with a pen; the digital ink can be coordinated with a spoken commentary. The ink annotation could be used for indexing the photo (for example, one could assign different handwritten glyphs to different categories of pictures).

Ink Archiving and Retrieval
A software application may allow users to archive handwritten notes and retrieve them using either the time of creation of the handwritten notes or the tags associated with keywords. The tags are typically text strings created using a handwriting recognition system.

Electronic Form-Filling
In support of natural and robust data entry for electronic forms on a wide spectrum of keyboardless devices, a handwriting recognition engine developer may define an API that takes InkML as input.

Pen Input and Multimodal Systems
Robust and flexible user interfaces can be created that integrate the pen with other input modalities such as speech. Higher robustness is achievable because cross-modal redundancy can be used to compensate for imperfect recognition on each individual mode. Higher flexibility is possible because users can choose the most appropriate from among various modes for achieving a task or issuing commands. This choice might be based on user preferences, suitability for the task, or external conditions. For instance, when noise in the environment or privacy is a concern, the pen modality is preferred over voice.
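The overview mentions additional channels such as pen tip force alongside pen position. As a rough sketch of what that means for a consumer of ink data, the snippet below decodes one trace whose samples carry three values per point. The channel order X, Y, F (force) is an assumption supplied by the caller here; in a real InkML file it would come from the trace format declaration. The trace values and the `decode_trace` helper are hypothetical.

```python
from typing import Dict, List


def decode_trace(trace_text: str, channels: List[str]) -> List[Dict[str, float]]:
    """Split a trace into samples and label each value with its channel name."""
    samples = []
    for chunk in trace_text.split(","):
        values = chunk.split()
        if not values:
            continue
        if len(values) != len(channels):
            raise ValueError(f"expected {len(channels)} values, got {len(values)}")
        samples.append({name: float(v) for name, v in zip(channels, values)})
    return samples


# Assumed channel order for this illustration: position plus pen tip force.
CHANNELS = ["X", "Y", "F"]
TRACE = "10 0 0.2, 9 14 0.6, 8 28 0.9, 7 42 0.4"

samples = decode_trace(TRACE, CHANNELS)
peak_force = max(s["F"] for s in samples)
print(len(samples), peak_force)
```

Dynamic features of this kind (for example the force profile along a stroke) are what signature verification typically relies on in addition to shape.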
Application: HP Forms Automation System Workflow Connect 250
HP Forms Automation System
Fill out the printed form using the HP Digital Pen 200.
HP Forms Automation System
Processed information as it appears in an Excel spreadsheet. Download captured information.
HP Forms Automation System: Form Processing Workflow
[Architecture diagram]
Components: Acrobat Reader program with DLD (parser) plug-in; digital form (PDF); Digital Paper Printer Driver; Service Controller; Digital Paper Runtime (SDK); Form Processing Application; customer's IT infrastructure; Digital Pen Download Software; HP LaserJet printer; HP Digital Pen; printed digital form.
Flows: print request; print form command; request/response for unique dot pattern; request/response for form processing service; general communication; pen data and related information; form processing results; output of the digital paper form with unique dot pattern; user writes on the digital form and the pen captures stroke data; request to upload pen data via USB cradle.
Technology labels shown: PDF/XFDF, XForms, CSS, WSDL, SOAP, SVG, dppml, XSLT, HTTP/HTTPS, XHTML.
HP Forms Automation System: Form Processing Workflow
1. User opens the digital form using Acrobat Reader.
2. User issues the print command from Acrobat Reader, using the Digital Paper Printer driver.
3. Digital Paper Printer driver sends a request, along with the form id, to the Service Controller for a unique dot pattern. The Service Controller responds with dot pattern information.
4. Digital Paper Printer driver sends the print command to the printer.
5. HP LaserJet prints the paper form with the unique dot pattern.
6. User writes on the digital paper form and the pen simultaneously captures the written pen stroke data. User marks the Send box when done filling in the form.
7. User docks the digital pen in the cradle to begin data upload to the appropriate form processing application server.
8. Digital Pen software contacts the Service Controller with an upload request. The Service Controller checks for the appropriate application server and responds with contact information. (Secured comm.)
9. Digital Pen software contacts the application server to begin the upload. (Optional secured comm.)
10. Forms processing application performs business logic to process the form data; e.g. interfaces with IT applications, etc.
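The role of the Service Controller in steps 3 and 8 can be sketched as a small lookup service: it hands out a unique dot pattern per printed form and later uses that pattern to tell the pen software where to upload. The following is an in-memory illustration only. The class, method names, URL and the use of integer pattern ids are all hypothetical, and the real system communicates over the network rather than through direct calls.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator


@dataclass
class ServiceController:
    """Hypothetical in-memory stand-in for the Service Controller."""

    # form id -> URL of the form processing application server (illustrative)
    form_servers: Dict[str, str]
    _pattern_to_form: Dict[int, str] = field(default_factory=dict, init=False)
    _next_pattern: Iterator[int] = field(
        default_factory=lambda: itertools.count(1), init=False
    )

    def request_dot_pattern(self, form_id: str) -> int:
        """Step 3: allocate a unique pattern id for one printed copy of a form."""
        if form_id not in self.form_servers:
            raise KeyError(f"unknown form id: {form_id}")
        pattern_id = next(self._next_pattern)
        self._pattern_to_form[pattern_id] = form_id
        return pattern_id

    def resolve_upload_target(self, pattern_id: int) -> str:
        """Step 8: map the pattern the pen wrote on to an application server."""
        form_id = self._pattern_to_form[pattern_id]
        return self.form_servers[form_id]


controller = ServiceController({"claim-form": "https://apps.example.com/claims"})
first_copy = controller.request_dot_pattern("claim-form")
second_copy = controller.request_dot_pattern("claim-form")
print(first_copy != second_copy, controller.resolve_upload_target(first_copy))
```

Because every printed copy gets its own pattern, the pen data alone is enough to identify both the form and the individual sheet that was filled in.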
dppml: XML language for HP FAS
dppml is the key point in XML inter-application connectivity:
- Internal XML language
- Basis for XML transformations
- Storage format
dppml description:
- XML Schema driven (xsd)
- inkml compliant
- Different levels: form definition, logical, ink
- Interoperability and XML Connector
- Output and input format
- Multi-session support
[Diagram: dppml blocks: Form Info, Ink Info (inkml), Logical Info, Print Info]
FAS - dppml structure flexibility
Contains information about:
- Document structure (forms)
- Information filled in by the user (digital ink)
- Process results at different stages: ICR, data correction, data validation
- System information
- User information
This information is separated into different blocks so that only the required layers are transmitted.
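The point about separate blocks can be illustrated with a short sketch that keeps only the layers a given consumer needs before a document is sent on. The dppml schema is not shown in these slides, so the element names below (`formInfo`, `inkInfo`, `logicalInfo`, `printInfo`) are invented stand-ins for the blocks named above, and `keep_layers` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative document; element and attribute names are assumptions,
# not the real dppml vocabulary.
SAMPLE_DPPML = """<dppml>
  <formInfo formId="claim-form"/>
  <inkInfo><trace>10 0, 9 14</trace></inkInfo>
  <logicalInfo><field name="surname">Smith</field></logicalInfo>
  <printInfo printer="laserjet-1"/>
</dppml>"""


def keep_layers(dppml_text, wanted):
    """Parse the document and drop every top-level block not in `wanted`."""
    root = ET.fromstring(dppml_text)
    for child in list(root):  # copy the child list: we mutate root while looping
        if child.tag not in wanted:
            root.remove(child)
    return root


# A back-office consumer that only needs recognized field values.
slim = keep_layers(SAMPLE_DPPML, {"formInfo", "logicalInfo"})
print([child.tag for child in slim])
```

A recognition service would ask for the ink block instead, while a print service would need only the form and print blocks.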
FAS - dppml samples: Device definition
FAS - dppml samples: Strokes assignment to fields
FAS - dppml samples: Ink representation
FAS - dppml samples: Text fields
FAS - dppml samples: Boolean fields
FAS - Multi-channel environment
Device independence: CC/PP
[Diagram: the Forms Automation System exchanges dppml with form documents, external form processing apps, and a form data repository; other labels shown: RDF, WSDL, XML]
Advantages of using W3C standards
- Benefit from the knowledge and experience of other people in fast-evolving technologies
- Higher interoperability possibilities
- Reduce development costs by using existing implementations of these standards:
  Open source: Xerces (XML family), Axis (SOAP), ...
  Java technologies: J2SE (DOM), J2EE (HTTP), JAI (PNG), ...
- Reduce time to market
Backup slides
Forms Automation System workflow
Current forms processes (lowest level of automation), steps shown: mail form, fax form, deliver form, receive form, fill form, scan form, interpret form, validate form, data output.
Forms Automation System workflow, steps shown: fill form, interpret form, validate form, data output.
Customer benefits:
- Reduced administration
- Fewer resources needed
- Forms compliance
- Digital image quality
HP Forms Automation System: Efficient Workflow Process
The link between paper and the digital world
1. Print on-demand digital form. Includes a unique background dot pattern to make the paper form digital, and is pre-filled with data to customize each printed form.
2. The digital capture. The digital pen uses a built-in camera to electronically capture all written information on the digital form. The background pattern enables the pen to locate the positions of written information on the form.
3. The digital form processing. The digital pen transfers the captured information to the IT infrastructure for processing and delivery to collaborating applications.
[Diagram labels: HP printer, HP digital pen, enterprise information system]
HP Forms Automation System: Key Benefits
EFFICIENT
- Saves time: significantly speeds up the workflow process by enabling concurrent data capture.
- Saves money: significantly reduces processing costs.
SIMPLE
- Traditional pen and paper experience for the end user.
- Meets the end user halfway, for both the techno-savvy and the technology-fearful.
- Intuitive: little training required for the pen.
SEAMLESS
- Seamless transfer of captured handwritten data to the server application.
- Can merge into existing document form processes (or be the focus of a new process!).
- Customized forms: the right form, a unique digital pattern, some customer data already merged (pre-printed) onto a form!
- Prints digital paper forms on demand, using a variety of media.
- Mobile: meet your customer where he/she needs you.
Standardizing form design. Relevant standards. I/O workflow
[Comparison chart]
Channels: Web, PC (Tablet), Paper, Digital Paper (static), Digital Paper (on demand)
Solutions: XForms (W3C), XDP/PDF (Adobe), InfoPath XML, Mi-Co, Cardiff (Liquid Office), HP Digital Pen and Paper
Data output: XML; XDP, PDF, TDS; XSN, XSF; MDF, MTF, PDF; XFM, HTML, InfoPath, PDF, XML; PDF
Data input: XML; XDP, PDF, XFT, DOC, XSN, RTF, XML, XSD, WSDL, OLEDB; MSOffice, XSD, WSDL (www), XML (DB); MDF, WSDL (www), XML (DB); XFM, FXF; HP-DPP proprietary format; PDF
XML Connectivity
[Diagram labels: Netweaver, Biz-talk, WebSphere?; non-XML data; XSL; dppml (for FAS); XML data; data transformation; viewer; XSLT; XSD compliant; transformation definition; optional XSD; Adobe Form Design, Mi-Co, InfoPath]
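The data transformation step in this diagram turns dppml into the XML a target system expects. The slide names XSL/XSLT as the mechanism; the sketch below substitutes plain Python with ElementTree, purely to show the shape of such a mapping. The dppml element names and the flat `record` target format are hypothetical, since neither schema appears in the slides.

```python
import xml.etree.ElementTree as ET

# Illustrative input; element and attribute names are assumptions,
# not the real dppml vocabulary.
SAMPLE_DPPML = """<dppml>
  <formInfo formId="claim-form"/>
  <logicalInfo>
    <field name="surname" type="text">Smith</field>
    <field name="insured" type="boolean">true</field>
  </logicalInfo>
</dppml>"""


def to_record(dppml_text):
    """Map recognized field values onto a flat, hypothetical target record."""
    source = ET.fromstring(dppml_text)
    form = source.find("formInfo")
    record = ET.Element("record", {"form": form.get("formId", "")})
    for field in source.iterfind("logicalInfo/field"):
        # One target element per form field, named after the field.
        target = ET.SubElement(record, field.get("name"))
        target.text = (field.text or "").strip()
    return record


record = to_record(SAMPLE_DPPML)
print(ET.tostring(record, encoding="unicode"))
```

In an XSLT-based connector the same mapping would live in a stylesheet (the "transformation definition" in the diagram), so that a new target format needs a new stylesheet rather than new code.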
SDK - Vertical markets & XML formats
Healthcare: HL7, IHE, CDA, HIPAA; XML/non-XML messaging
Insurance: ACORD
Finance: OFX
What do Adobe, SAP or InfoPath mean when they say they support these standards?