Internationalization Tag Set 1.0 A New Standard for Internationalization and Localization of XML



Similar documents
XML: extensible Markup Language. Anabel Fraga

Standard Recommended Practice extensible Markup Language (XML) for the Interchange of Document Images and Related Metadata

Extensible Markup Language (XML): Essentials for Climatologists

Structured vs. unstructured data. Motivation for self describing data. Enter semistructured data. Databases are highly structured

XSLT Mapping in SAP PI 7.1

Data Integration through XML/XSLT. Presenter: Xin Gu

Managing XML Documents Versions and Upgrades with XSLT

An Approach to Eliminate Semantic Heterogenity Using Ontologies in Enterprise Data Integeration

Semistructured data and XML. Institutt for Informatikk INF Ahmet Soylu

DITA CMS Release 4.0 (Dynamic Release Management Module): Detailed Release Notes

XLIFF 2.0. David Filip Secretary & Editor OASIS XLIFF TC

The Web Web page Links 16-3

Page: 1. Merging XML files: a new approach providing intelligent merge of XML data sets

Managing large sound databases using Mpeg7

XML Schema Definition Language (XSDL)

Introduction to XML Applications

Last Week. XML (extensible Markup Language) HTML Deficiencies. XML Advantages. Syntax of XML DHTML. Applets. Modifying DOM Event bubbling

Translation and Localization Services

ART 379 Web Design. HTML, XHTML & CSS: Introduction, 1-2

Chapter 3: XML Namespaces

Developing XML Solutions with JavaServer Pages Technology

Presentation / Interface 1.3

Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?

e-business Frameworks based on MDA

Development and Validation of an XML Schema for Automated Environmental Reporting on XML Basis

<Namespaces> Core XML Technologies. Why Namespaces? Namespaces - based on unique prefixes. Namespaces. </Person>

DTD Tutorial. About the tutorial. Tutorial

My IC Customizer: Descriptors of Skins and Webapps for third party User Guide

XML WEB TECHNOLOGIES

Cleo Communications. CUEScript Training

04 XML Schemas. Software Technology 2. MSc in Communication Sciences Program in Technologies for Human Communication Davide Eynard

Draft Response for delivering DITA.xml.org DITAweb. Written by Mark Poston, Senior Technical Consultant, Mekon Ltd.

An Approach to Translate XSLT into XQuery

Studio. Rapid Single-Source Content Development. Author XYLEME STUDIO DATA SHEET

A Workbench for Prototyping XML Data Exchange (extended abstract)

Exchanger XML Editor - Canonicalization and XML Digital Signatures

EUROPEAN COMPUTER DRIVING LICENCE / INTERNATIONAL COMPUTER DRIVING LICENCE WEB EDITING

Kuali Financial System Interface Specification for Electronic Invoice Feed

XML for Manufacturing Systems Integration

Web Development CSE2WD Final Examination June (a) Which organisation is primarily responsible for HTML, CSS and DOM standards?

An XML Based Data Exchange Model for Power System Studies

Composite.Community.Newsletter - User Guide

David RR Webber Chair OASIS CAM TC (Content Assembly Mechanism)

Overview of DatadiagramML

LabVIEW Internet Toolkit User Guide

Arbortext 6.1. Curriculum Guide

Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks

Database System Concepts

Fuzzy Systems and Neural Networks XML Schemas for Soft Computing

Introduction to Web Services

T Network Application Frameworks and XML Web Services and WSDL Tancred Lindholm

Encoding Library of Congress Subject Headings in SKOS: Authority Control for the Semantic Web

metaengine DataConnect For SharePoint 2007 Configuration Guide

The Google Web Toolkit (GWT): Declarative Layout with UiBinder Basics

XML Processing and Web Services. Chapter 17

CSET 3100 Advanced Website Design (3 semester credit hours) IT Required

Developer Guide to Authentication and Authorisation Web Services Secure and Public

HTML Web Page That Shows Its Own Source Code

1Lesson 1: Overview of Web Design Concepts Objectives

Structured vs. unstructured data. Semistructured data, XML, DTDs. Motivation for self-describing data

Translating between XML and Relational Databases using XML Schema and Automed

IBM Operational Decision Manager Version 8 Release 5. Getting Started with Business Rules

Service Road Map for ANDS Core Infrastructure and Applications Programs

Relational Databases for Querying XML Documents: Limitations and Opportunities. Outline. Motivation and Problem Definition Querying XML using a RDBMS

GetLibraryUserOrderList

Secure XML API Integration Guide - Periodic and Triggered add in

Functional Requirements for Digital Asset Management Project version /30/2006

Chapter 2 HTML Basics Key Concepts. Copyright 2013 Terry Ann Morris, Ed.D

Agents and Web Services

Xtreeme Search Engine Studio Help Xtreeme

XML. CIS-3152, Spring 2013 Peter C. Chapin

XML and Data Management

1. Write the query of Exercise 6.19 using TRC and DRC: Find the names of all brokers who have made money in all accounts assigned to them.

CST6445: Web Services Development with Java and XML Lesson 1 Introduction To Web Services Skilltop Technology Limited. All rights reserved.

Adobe XML Architecture

Introduction to XML. Data Integration. Structure in Data Representation. Yanlei Diao UMass Amherst Nov 15, 2007

Rotorcraft Health Management System (RHMS)

Markup Languages and Semistructured Data - SS 02

Transcription:

A New Standard for Internationalization and Localization of XML Felix Sasaki World Wide Web Consortium 1 San Jose, This presentation describes a new W3C Recommendation, the Internationalization Tag Set (ITS) 1.0, which has being developed by the W3C i18n ITS Working Group. IUC 31, San Jose 1

Overview Background: ITS 1.0 purposes / audiences ITS 1.0 basic "local" usage ITS 1.0 extended usage Implementations and applications 2 San Jose, We will first give an overview of the purpose and possible audiences for ITS 1.0. Then we will describe its basic and extended usage scenarios. Finally some implementations and applications will be introduced. IUC 31, San Jose 2

i18n Background ITS Working GroupBasic Usage ITS Requirements Extended UsageSignificant Implementations Issues Background 3 San Jose, IUC 31, San Jose 3

i18n Background ITS Working GroupBasic Usage ITS Requirements Extended UsageSignificant Implementations Issues Background ITS 1.0: "Internationalization Tag Set (ITS) Version 1.0" http://www.w3.org/tr/its W3C Recommendation (April 2007) produced by the W3C i18n ITS Working Group http://www.w3.org/international/its 4 San Jose, ITS 1.0, the "Internationalization Tag Set (ITS) Version 1.0", is a W3C Recommendation published in April 2007. It has been developed by the W3C i18n ITS Working Group. IUC 31, San Jose 4

i18n Background ITS Working GroupBasic Usage ITS Requirements Extended UsageSignificant Implementations Issues Background Target: XML Documents and Schemas Purpose: To express information ("data categories") for internationalization and localization of XML documents and schemas 5 San Jose, ITS 1.0 targets XML documents and schemas (e.g. XML DTDs, W3C XML Schema or RELAX NG). The purpose of ITS 1.0 is to express information (in the terminology of ITS 1.0 so-called "data categories") for internationalization and localization of XML. Examples for data categories will be given later. IUC 31, San Jose 5

i18n Background ITS Working GroupBasic Usage ITS Requirements Extended UsageSignificant Implementations Issues Audience / Usage Scenarios Content authoring Terminology creation and translation Software development 6 San Jose, The audience for ITS 1.0 are - content authors who need to mark up internationalization-related or localization-related information in an XML document; - support of terminology creation and translation in the localization process, e.g. insertion of special markers for terms; and - software development, where the software-related material (code and / or documentation) is stored in an XML based format). IUC 31, San Jose 6

i18n Background ITS Working GroupBasic Usage ITS Requirements Extended UsageSignificant Implementations Issues Other topics of the Working Group Mostly covered by "Best Practices for XML Internationalization" document Draft currently under development http://www.w3.org/tr/xml-i18n-bp/ 7 San Jose, Besides the ITS1.0 specification covered in this talk, the ITS working group is dealing with a lot of other topics. Many of these are covered as "Best Practices for XML Internationalization" in a separate document which is currently under development. The following slide provides a list of the most important topics the working group is working on. IUC 31, San Jose 7

Scenario - Authoring Content Limited impact Scenario - Terminology Creation CDATA section and Translation Links to internal/external text Scenario - Software Resources Bidirectional text support Indicator of Constraints Indicator for metrics Handling entities Attribute and translatable text Cultural aspects of the content Naming scheme Purpose specification/mapping Localization Notes Span-like elements Handling of white-spaces Unique identifier Multilingual Documents Locale/language identification Annotation Markup Term identification Identifying Date and Time Indicator of translatability Complete list at htttp://www.w3.org/international/its/requirements/ 8 San Jose, IUC 31, San Jose 8

Basic Usage 9 San Jose, IUC 31, San Jose 9

Translate ITS 1.0 Data Categories Localization Notes Terminology Directionality Ruby Language Information Elements within Text 10 San Jose, The basic usage of ITS 1.0 is to add information, so-called "data categories" for internationalization and localization, to an XML document. The data categories on this slide are part of the ITS 1.0 specification. IUC 31, San Jose 10

Sample Data category: "Translate" Parts of a document should not be translated: <book>... <p>and he said: you need a new <quote its:translate="no">motherboard</quote> </p>... </book> 11 San Jose, As an example we will introduce the "Translate" data category. It is used to express information about translatability of (parts of) an XML document. The default is that that the textual content of all elements has to be translated. The values of attributes should not be translated. An exception to this statement can be made via an "its:translate" attribute. For example, the "its:translate" attribute with the value "no" at the <quote> element means that the content of this element should not be translated. IUC 31, San Jose 11

Separation of Why Data Categories? 1. prose description of ITS 1.0 categories ("data category") 2. implementation (schema language independent) 3. schema language specific declaration (XML DTDs, XML Schema, RELAX NG) 12 San Jose, Why does ITS 1.0 define data categories and not markup directly? The benefit of data categories is the separation of (1) the prose description of what the ITS 1.0 information is about (the "data category"), (2) the implementation on a schema language independent level, and (3) the declaration which is specific to a schema language. The ITS 1.0 specification provides declarations for three schema language: XML DTDs, XML Schema and RELAX NG. IUC 31, San Jose 12

Example: "Translate" 1. Data category: Prose description "Parts of a document should (not) be translated." 2. Schema language independent implementation <book its:translate="yes" >... </book> 3. Schema language specific declaration <!ELEMENT book > <!ATTLIST book its:translate (yes no) #IMPLIED> 13 San Jose, Again we give an example of the "Translate" data category. Its prose description is very simple: "Parts of a document should (not) be translated". On a schema language independent level, "Translate" is implemented via an attribute "its:translate" with the two values "yes" or "no". In the schema language XML DTDs, the attribute is declared as an optional attribute on the <book> and other, possibly all elements of a schema. IUC 31, San Jose 13

Example: "Localization Note" 1. Data category: Prose description "Used to provide information to localizers." 2. Schema language independent implementation <book its:locnote=" " locnotetype=" " >... </book> 3. Schema language specific declaration <!ELEMENT book > <!ATTLIST book its:locnote CDATA) > 14 San Jose, Now we give an example of the "Localization Note" data category. It is used to provide information to localizers. On a schema language independent level, "Localization Note" is implemented via the attributes "its:locnote" and "its:locnotetype". "its:locnotetype" provides the type of localization note, "description" or "alert". Declarations for these two attributes are provided In the schema language XML DTD. IUC 31, San Jose 14

Extended Usage 15 San Jose, IUC 31, San Jose 15

Extended Usage of ITS Global usage of ITS 1.0 data categories (basic usage is called "local") Association between data categories and existing markup Pointing to existing values 16 San Jose, In addition to the basic usage of ITS 1.0, there are three aspects of ITS extended usage: the global usage of ITS data categories, the association between data categories and existing markup, and pointing to existing values. IUC 31, San Jose 16

Global ITS Data Categories Using XPath to select parts of an XML document: <somedocument > <its:rules > <its:translaterule translate="no" selector="//p[@editor='john']"/></its:rules > <!-- This rule holds for p elements which are edited by John. --> </somedocument> Other rule elements: locnoterule, termrule, dirrule, 17 San Jose, ITS 1.0 data categories can appear in "global" positions, that means independent of a specific location in an XML document, a schema or a separate XML file. For the global usage of ITS 1.0, an <its:rules> element is defined. It contains data category specific child elements like <its:translaterule> for the "Translate" data category. The attribute "selector" at the <its:translaterule> Element contains an XPath expression which is used for selecting parts of an XML document. These parts are the targets to which information of the data category is applied to. In the example, the "selector attribute", in combination with the "translate='no'" attribute, expresses that the <p> element with the attribute "editor='john'" should not be translated. For other data categories, ITS 1.0 defines similar rules elements. They all contain a "selector" attribute and further, data category specific markup. IUC 31, San Jose 17

Combination of Global and Local Usage <book ><its:rules > <its:translaterule translate="no" selector="//quote"/></its:rules> <body>... <p>and he said: you need a new <quote its:translate="yes">motherboard</quote> </p>... </body> </book> 18 San Jose, It is possible to combine the local and global usage of ITS 1.0 data categories. In the example, the content of all <quote> elements should not be translated. This is expressed via <its:translaterule> and two attributes "translate" and "selector" at that element. As an exception, the content of the <quote> element should be translated. This is expressed via the "its:translate='yes'" attribute at the <quote> element. IUC 31, San Jose 18

Association with Existing Markup Association with DITA and other markup: <book ><its:rules > <its:translaterule translate="no" selector="//*[@dita:translate='no']"/> <its:termrule term="yes" selector="//quote"/></its:rules> <body>... <p>and he said: you need a new <quote dita:translate="no">motherboard</quote> </p>... </body> </book> 19 San Jose, The "global" usage of ITS is also a means to associate ITS data categories with existing markup. In the example, the attribute "translate='no'" at the <its:translaterule> element is associated with the "dita:translate='no'" attribute from DITA, via the "selector='//*[@dita:translate='no']'" attribute. The attribute "term='yes'" at the <its:termrule> element implements the "Terminology" data category. It is associated with all <quote> elements via the "selector='//quote'" attribute. This means that all <quote> elements are interpreted as terms. DITA ("Darwin Information Typing Architecture") is a more and more widely used XML vocabulary especially designed for technical documentation. DITA documents are a primary target for localization, and DITA provides relevant markup like the "dita:translate" attribute. IUC 31, San Jose 19

Pointing to Existing Values Pointing to existing localization notes: <mydoc ><its:rules > <its:locnoterule locnotetype="alert" selector="//*[@loc-n]" locnotepointer="@loc-n"/> </its:rules>... <span loc-n=" "> </span> </mydoc> 20 San Jose, Finally, it is possible to point to existing values in a document. An example is given for the "Localization Note" data category. The "locnotepointer" attribute at the <its:locnoterule> element is used to point to the value of the "loc-n" attribute. Association with markup and pointing to values make it possible to apply ITS without an impact on existing XML documents, and re-use markup (like the DITA "dita:translate" attribute) from existing XML vocabularies. IUC 31, San Jose 20

Implementations and Applications 21 San Jose, Now we will present some implementations and applications of ITS 1.0. The data on the next slide is taken from the ITS 1.0 test suite, see http://www.w3.org/international/its/tests/. IUC 31, San Jose 21

ITS Test Suite http://www.w3.org/international/its/tests/ Translate (input, output) Localization Note (input, output) Terminology (input, ouput) Language Information (input, output) Elements within Text (input, output) 22 San Jose, There is sample input for five data categories: "Translate", Localization Note, Terminology, Language Information and Elements within Text. The output format is specific to the ITS 1.0 test suite. It makes explicit what information is available after ITS 1.0 processing. IUC 31, San Jose 22

Implementations XSLT implementations for all data categories Spritser ITSImpl Rainbow XML Filter preparation for translation XLIFF extraction spell-checking etc. 23 San Jose, During the development of the ITS 1.0 specification, four implementations have been created. There are two XSLT-based implementations, "Spritser" and "ITSImpl", which cover all ITS 1.0 data categories. The "Rainbow XML Filter" covers the "Translate", Localization Note, Terminology and Elements within Text data categories. This filter has many application scenarios which are listed on this slide. IUC 31, San Jose 23

XTM Implementations Web-based translation tool system ITS2Xliff (under development!) Further information at http://www.w3.org/international/its/links also about: Your implementation of ITS 1.0! Please let us know 24 San Jose, XTM covers the "Translate" and "Elements within Text" data categories. ITS2Xliff is a new implementation under development. It's purpose is the generation of XLIFF ("XML Localization Interchange File Format") documents from XML files with local or global ITS markup. Further information about implementations and applications is available at the link on this slide. We hope that sooner or later we will also be able to add a link to your implementation of ITS 1.0! IUC 31, San Jose 24

Thank You! 25 San Jose, IUC 31, San Jose 25