XLIFF Localisation for Joomla! Translator-oriented localisation of CMS-based websites



Jesús Torres del Rey, Emilio Rodríguez Vázquez de Aldana
Faculty of Translation and Documentation
http://diarium.usal.es/codex

Agenda
- Introduction
  - Motivation
  - Multilingual management & interchange
- Our Research/Experiments
  - Analysis of other tools
  - Application
    - Workflow
    - XLIFF 1.2, XML+its1.0
    - Behaviour in CAT tools
- Translation-Oriented L10n
- Future Work

Motivation: chronology
- 2009: Request for translation of the Faculty's website (Joomla 1.5, made multilingual with Joomfish)
  - HTML download > use of CAT > paste into the Joomla HTML editor
- 2010-11: How to teach localisation of dynamic websites to our undergraduate (UG) students?
  - Full localisation of static websites was already taught:
    - Filetypes and technologies (html, js, css, graphics…)
    - Super-, Macro-, Hyper-, Micro-structures
    - Directory structures, relative links
    - Link/Web management (MS Expression, Adobe DW…)
    - Automatisation via Search/Replace, regular expressions
- 2012: Multilingual extensions for Joomla 2.5
  - Falang (also for Joomla 3), Josetta, Joomfish, Jolomea
- 2013: Research with other CMSs (Drupal, WordPress, TYPO3)

Motivation: T&R philosophy
- Translator/Localiser-oriented approach
  - Integration with CAT/Localisation tools
  - Empowerment through control of:
    - Processes, lifecycle: from request to publication, update, multilingualisation
    - Visual/Relational/Functional context, global meaning, negotiation of communication needs
  - Standardisation: XLIFF, ITS
- Acquisition of basic knowledge of the nature and mechanics of dynamic, CMS-based websites (on top of the nature and mechanics of static websites)
  - Filetypes, databases and technologies
  - Server-client infrastructure
  - Composition of dynamic active pages
  - Front-end, back-end, interface, content

Motivation: (dis)empowerment
- Static html-based websites: full localisation
  - Visual and functional context
  - Use of functionality, quality tools (CAT/L)
  - Capable of multilingual re-structuring
  - Publication-ready deliverables
- Dynamic CMS-based websites: patchy translation
  - CMS partial webpage/separate translation environment
  - Texts locked in DB -> export/import (for interchange, batch quality/analysis/term extraction processes)
    - Only if administrative rights for the CMS and a multilanguage module are installed
  - Only if write-access rights; partial, patchy publication

L10n, I18n, Multilanguage: evolution in CMSs
- Specific (often third-party) modules to make multilingual websites easy to set up and manage
  - Automatic duplication of structure/pages
  - Taking advantage of simplified CMS editing environments
- At the same time, translatable data export/import modules to csv, po, xml and, increasingly, XLIFF
  - Drupal XLIFF Tools, WordPress WPML, TYPO3 l10nmgr, Joomla JDiction (since early 2013)...
- Combination of multilingual management and XLIFF et al. export/import
  - WordPress WPML, Joomla JDiction (since early 2013)...

Our experiments: overview
- Application: Falang2XLIFF (beta)
  - http://diarium.usal.es/codex/desarrollo
  - Java client (compiled to 1.7)
  - Handy experimental tool, given our limited resources
  - Not embedded into the CMS as a module: access rights to DB?
  - Uses the Falang multilingual DB structure for Joomla
    - Potentially applicable to other DB structures, like Josetta, Joomfish
  - Main purpose: to experiment with the data to be extracted, XLIFF and the whole L10n process, and to use it in our UG L10n course for translator training
- Other tools: JDiction (XLIFF tool added since March)
- For other CMSs: XLIFF Tools (Drupal)

Experiments: L10n objects
CMS objects for L10n:
- Editor/Administration interface: php, asp, or externalised to ini, po
- Dependent, linked files (pdf, epub, graphics, video, audio…)
- Database elements
  - Article/page
  - Modules (e.g. calendar…)
  - Categories (e.g. for thematically grouping blog posts)
  - Smaller user interaction elements (weblinks, etc.)

Experiments: L10n objects
In database tables, L10n elements:
1. Structural/Interface text strings: menus, article titles, sections
2. Longer (x)html article contents
3. Parameters for the above elements: metakey, metadesc, menu params
All in text fields in DB*

Experiments: other extraction strategies (JDiction)
- Titles: <![CDATA[ TEXT ]]>
- HTML (tags & text): <![CDATA[ TEXT ]]>
- Parameters: state -> translated! (Drupal: final status)

Experiments: other extraction > CAT (JDiction > Virtaal)
- Tags are visually marked, probably via a regex: <[^>]+/?>
- However, the tags are unprotected
- CAT tools could integrate a WYSIWYG html editor if XLIFF 1.2 datatype="htmlbody"
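As an illustration of the regex-based tag marking suspected above, here is a minimal Python sketch. The pattern is the one quoted on the slide; the function name and the sample strings are hypothetical, and nothing here is taken from Virtaal's or JDiction's actual code. The last lines show why such marking is fragile: plain text that merely contains angle brackets is also reported as a tag.

```python
import re

# The pattern quoted on the slide: anything that looks like <...> counts as a tag.
TAG_PATTERN = re.compile(r"<[^>]+/?>")


def split_tags_and_text(fragment):
    """Split an (x)html fragment into ('tag', ...) and ('text', ...) pieces."""
    parts = []
    position = 0
    for match in TAG_PATTERN.finditer(fragment):
        if match.start() > position:
            parts.append(("text", fragment[position:match.start()]))
        parts.append(("tag", match.group()))
        position = match.end()
    if position < len(fragment):
        parts.append(("text", fragment[position:]))
    return parts


# Hypothetical sample content.
for kind, value in split_tags_and_text('<p>Read the <a href="guide.html">guide</a><br/></p>'):
    print(kind, repr(value))

# Fragility: ordinary text with angle brackets is mistaken for markup.
print(TAG_PATTERN.findall("a < b and c > d"))
```

Because the pattern only looks at characters, it cannot tell whether what it marks is real markup, which is one reason to prefer an XML processor for XHTML content.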

Experiments: other extraction > CAT (JDiction > memoQ 5)
- Filters not always versatile enough
- Segments should be shorter and regularly segmented for better matches and TM leverage

Experiments: Summarising JDiction
- Multilingual management + export/import
- Some multilingual management problems:
  - Translation editor: separate environment (not integrated in the target-language page)
  - Does not show the original in parallel
- Some export/import problems:
  - Indiscriminate bulk export, irrespective of newness or update/translated state
  - CDATA export of (x)html content
    - No different from csv export
    - Whole article/item, without structure
  - XHTML should be processed with XML processors, rather than with regular expressions
  - HTML text should be carried to the CAT tool not as plain text but as html tags and text (Drupal XLIFF Tools does this)

Experiments: Application Workflow (export)
Joomla! with Falang > Falang2Xliff
1. In Falang: element selection
2. DB connection
3. DB extraction (new & updated)
4. XML generation: simple XML (temporary) and XML+its1.0
5. XLIFF generation, using xml2xliff.xsl from the XliffRoundTrip Tool: XLIFF 1.2

Application: Workflow (1/6)
1. In Falang: Element Selection
   1.1 Falang: PM with admin rights
   1.2 selects elements one by one! and
   1.3 Copy Source!

Application: Workflow (2/6)
2. Database Connection
- Only standard TCP/IP connections to the SQL server
  - Only in network security zone or localhost
- Joomla DB prefix needed
- Read-access permission for export
  - On Falang tables, but also
  - On original content tables, to check newness & update status

Application: Workflow (3/6)
3. Database Extraction (new & updated)
- New: established as translatable by the PM by using "Copy Source"
- Updated: translatable text whose source content has been edited (original content tables checked via MD5 hash)
- Info from attributes title, text, introtext, name, fulltext, description & content in tables categories, content, menu, modules and weblinks
- Parameters not extracted, to prevent DB corruption
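The new/updated check above can be sketched as follows. This is a minimal Python illustration, not the Falang2XLIFF code (which is a Java client): it assumes that an MD5 digest of the source text was stored when the PM used "Copy Source", and the function names are hypothetical.

```python
import hashlib


def md5_of(text):
    """MD5 hex digest of a text field, encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def classify(source_text, stored_hash):
    """Classify a translatable field.

    stored_hash is the digest recorded when the source was copied
    (an assumption about how the state is kept), or None if no
    translation has been exported for this field yet.
    """
    if stored_hash is None:
        return "new"
    if md5_of(source_text) != stored_hash:
        return "updated"
    return "unchanged"


recorded = md5_of("Welcome to our Faculty")
print(classify("Welcome to our Faculty", None))       # never exported
print(classify("Welcome to our Faculty", recorded))   # source not edited
print(classify("Welcome to the Faculty", recorded))   # source edited since
```

Only fields classified as new or updated would then be extracted, which avoids the indiscriminate bulk export criticised earlier.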

Application: Workflow (3/6)
3. Database Extraction (new & updated):
- The Joomla! html editor typically rewrites HTML fragments as XHTML
- But are we certain that it is correct XHTML?
- We have rechecked (Jericho HTML Parser) and rewritten data if necessary
  - XML entities, closing attribute quotes, checking and correcting node hierarchy
    - Some current limitations: e.g. unpaired <tag> <tag/>
- XHTML elements should be stored in DB as XMLElements
  - ISO/IEC 9075-14:2011, Part 14: XML-Related Specifications (SQL/XML)
  - XML support low in MySQL; high in PostgreSQL
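The kind of recheck described above can be approximated with any XML parser. The sketch below uses Python's standard library instead of the Jericho HTML Parser named on the slide, and only detects problems (it does not repair them); the function name and the samples are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment):
    """Return True if an XHTML fragment parses as XML once wrapped in a root."""
    try:
        # A DB field may hold several sibling elements, so wrap them in one root.
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


samples = [
    "<p>Using <strong>Joomla!</strong><br/></p>",  # well-formed
    "<p>Terms & conditions</p>",                   # bare ampersand, should be &amp;
    '<p><img src="logo.png"></p>',                 # unpaired tag, should be <img ... />
    "<p class=intro>Hello</p>",                    # unquoted attribute value
]
for sample in samples:
    print(is_well_formed_fragment(sample), sample)
```

Fragments that fail this test are the ones that would need rewriting before they can be embedded in the XML and XLIFF files.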

Application: Workflow (4/6)
4. XML Generation
<value_falang>usando Joomla! & </value_falang>
<value_falang><p> <img /> </p></value_falang>

Application: Workflow (4/6)
4. XML Generation (temporary file to be converted to XLIFF)
- <registros_falang>: root
- <registro_falang>: attributes contain info for correct back import to DB
- <value_falang>: contains translatable content (can include html elements)
  - XHTML
  - Text
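A minimal Python sketch of how such a temporary file could be assembled is given below. The three element names come from the slide; the attribute names (reference_table, reference_field, reference_id, language_id) are assumptions made for illustration, since the slide only says that attributes carry the information needed for the back import.

```python
import xml.etree.ElementTree as ET


def add_record(root, attributes, xhtml_fragment):
    """Append one <registro_falang> holding a parsed XHTML (or plain text) value."""
    record = ET.SubElement(root, "registro_falang", attributes)
    # Parsing the fragment (instead of inserting it as a string or CDATA)
    # keeps the html elements as real XML nodes inside <value_falang>.
    value = ET.fromstring("<value_falang>" + xhtml_fragment + "</value_falang>")
    record.append(value)
    return record


root = ET.Element("registros_falang")
add_record(
    root,
    # Hypothetical attribute names and values.
    {"reference_table": "content", "reference_field": "introtext",
     "reference_id": "42", "language_id": "2"},
    "<p>Using <strong>Joomla!</strong></p>",
)
add_record(
    root,
    {"reference_table": "menu", "reference_field": "title",
     "reference_id": "7", "language_id": "2"},
    "Home",
)
print(ET.tostring(root, encoding="unicode"))
```

Keeping the XHTML as nodes rather than as escaped text is what later allows ITS rules and the XLIFF conversion to address individual elements.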

Application: Workflow (4/6)
4. Generation of XML+its1.0
- Global, embedded ITS rules. Features:
  - Translate
  - Elements Within Text
- ITS 1.0 supports XPath 1.0 (which does not support regex)
- W3C WG (2008): Best Practices for XML Internationalization. 5.1.4 Associating existing XHTML markup with ITS
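To make the two ITS features concrete, the following Python sketch embeds a small XML+ITS 1.0 document as a string and lists its global rules. The namespace URI and the its:rules, its:translateRule and its:withinTextRule elements belong to ITS 1.0; the selectors, the un-namespaced XHTML elements and the sample record are assumptions for illustration, not the rules actually generated by Falang2XLIFF.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS 1.0 namespace

# Hypothetical document: global rules embedded in the temporary XML file.
DOCUMENT = """<registros_falang xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="1.0">
    <its:translateRule selector="//img/@alt" translate="yes"/>
    <its:withinTextRule selector="//strong | //em | //a | //span" withinText="yes"/>
  </its:rules>
  <registro_falang>
    <value_falang><p>Using <strong>Joomla!</strong> <img alt="Logo"/></p></value_falang>
  </registro_falang>
</registros_falang>"""


def list_its_rules(xml_text):
    """Return (rule name, XPath selector) pairs for the embedded global rules."""
    root = ET.fromstring(xml_text)
    rules = root.find("{%s}rules" % ITS_NS)
    return [(rule.tag.split("}")[1], rule.get("selector")) for rule in rules]


for name, selector in list_its_rules(DOCUMENT):
    print(name, "->", selector)
```

The selectors are plain XPath 1.0 expressions, so elements can be chosen by name or position but not by pattern matching on their content.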

Application: Workflow (5/6)
5. Generation of XLIFF 1.2
- Schnabel's xml2xliff.xsl, adapted so that the source language is a variable

Application: Workflow
Generation of XML+its1.0 and XLIFF

Application: Workflow (import)
From XLIFF/XML+its back to DB (import)
Joomla! with Falang < Falang2Xliff
- Input: XLIFF 1.2 or XML + ITS 1.0
1. XML generation: xliff2xml.xsl (XliffRoundTrip) produces the temporary XML
2. SQL generation: SQL produced from the XML, with optional online update of the DB
Notes:
- XLIFF encoding: UTF-8 without BOM
- Translation states (e.g. needs-translation, etc.) not taken into account
- XML to SQL via XQuery processor (http://xmlbeans.apache.org/index.html)
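The second import step can be sketched as follows. This Python example replaces the XQuery processor named on the slide with the standard library, and uses an in-memory SQLite database as a stand-in for the Joomla! database; the table name, column names and record attributes are all hypothetical. The input deliberately starts with a UTF-8 BOM to show one way of tolerating files that do not respect the "UTF-8 without BOM" requirement.

```python
import sqlite3
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical translated temporary XML, as bytes with a leading UTF-8 BOM.
TRANSLATED_XML = b"\xef\xbb\xbf" + (
    "<registros_falang>"
    '<registro_falang reference_table="content" reference_field="introtext"'
    ' reference_id="42" language_id="2">'
    "<value_falang><p>Usando <strong>Joomla!</strong></p></value_falang>"
    "</registro_falang>"
    "</registros_falang>"
).encode("utf-8")

UPDATE_SQL = (
    "UPDATE falang_content SET value = ? "
    "WHERE reference_table = ? AND reference_field = ? "
    "AND reference_id = ? AND language_id = ?"
)


def inner_xml(element):
    """Serialise the content of an element (text and children) without its own tag."""
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def records_to_rows(xml_bytes):
    """Turn each <registro_falang> into a parameter tuple for UPDATE_SQL."""
    root = ET.fromstring(xml_bytes.decode("utf-8-sig"))  # utf-8-sig drops a BOM if present
    rows = []
    for record in root.iter("registro_falang"):
        rows.append((
            inner_xml(record.find("value_falang")),
            record.get("reference_table"),
            record.get("reference_field"),
            int(record.get("reference_id")),
            int(record.get("language_id")),
        ))
    return rows


# Stand-in database with one untranslated row.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE falang_content (reference_table TEXT, reference_field TEXT, "
    "reference_id INTEGER, language_id INTEGER, value TEXT)"
)
connection.execute("INSERT INTO falang_content VALUES ('content', 'introtext', 42, 2, '')")
connection.executemany(UPDATE_SQL, records_to_rows(TRANSLATED_XML))
connection.commit()
print(connection.execute("SELECT value FROM falang_content").fetchone()[0])
```

Parameterised statements are used so that quotes or other special characters in the translated text cannot break the generated SQL.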

Application: Workflow (import)
From XLIFF/XML+its back to DB (import)

Application: XLIFF generated
- XliffRoundTrip XSL
  - For regular XML structures
  - Limitation: translatable attributes must be post-processed
- Tags:
  - <group> (without text)
  - <trans-unit> (with text)
  - <g> </g>, <x/> (within text/inline)
- XHTML source and the XLIFF generated for it, by numbered segment:
  1. <p><img alt=" " /></p>
     -> <trans-unit><x/> </trans-unit>
  2. <p><span> <a title=" "> </a> </span></p>
     -> <group><trans-unit> <g id=""> </g> </trans-unit></group>
  3. <ul><li><span> <strong> </strong> </span></li>
     -> <group> <group><trans-unit> <g id=""> </g> </trans-unit></group>
  4. <li><span> <strong> </strong> <em> </em> </span></li></ul>
     -> <group><trans-unit> <g id=""> </g> <g id=""> </g> </trans-unit></group> </group>
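The mapping of inline markup to <g> and <x/> can be illustrated with a short Python sketch. It is not the XliffRoundTrip stylesheet (which is XSLT and also builds the <group> hierarchy); it only converts one paragraph into one trans-unit, and the "x-html-" ctype values and the function names are choices made for this example.

```python
import xml.etree.ElementTree as ET
from itertools import count


def to_inline(element, target, ids):
    """Copy the mixed content of `element` into `target`, using <g> and <x/> for markup."""
    target.text = element.text
    for child in element:
        attributes = {"id": str(next(ids)), "ctype": "x-html-" + child.tag}
        if child.text or len(child) > 0:
            # Paired inline element with content: <g>...</g>
            inline = ET.SubElement(target, "g", attributes)
            to_inline(child, inline, ids)
        else:
            # Empty element such as <br/> or <img/>: <x/>
            inline = ET.SubElement(target, "x", attributes)
        inline.tail = child.tail


def paragraph_to_trans_unit(xhtml_paragraph, unit_id):
    """Build an XLIFF 1.2 style <trans-unit> whose <source> holds one paragraph."""
    paragraph = ET.fromstring(xhtml_paragraph)
    unit = ET.Element("trans-unit", {"id": str(unit_id)})
    source = ET.SubElement(unit, "source")
    to_inline(paragraph, source, count(1))
    return unit


unit = paragraph_to_trans_unit('<p>Using <strong>Joomla!</strong> today<img alt="Logo"/></p>', 1)
print(ET.tostring(unit, encoding="unicode"))
```

Note that the alt text of the image does not reach the trans-unit in this sketch, which mirrors the limitation stated above: translatable attributes need separate post-processing.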

Experiments: XLIFF > CAT
- Translation Units segmented at paragraph level

Experiments: XML+its1.0 > CAT
Support for ITS in CAT?
- SDL Trados Studio: Global & Embedded rules for features: Translate, Elements Within Text
- Okapi Rainbow (for Global:) Translate, Elements Within Text, LocNote
- XTM (Linked File)

Experiments: html overtagging > XLIFF
- Original markup (segments 3 and 4):
  <ul>
  <li><span> <strong> </strong> </span></li>   (3)
  <li><span> <strong> </strong> <em> </em> </span></li>   (4)
  </ul>
- Many reformatting actions (on the html editor) produce html overtagging:
  <ul>
  <li><span> <strong> </strong> </span></li>   (3)
  <li><span style=""> </span><strong style=""> </strong><span style=""> </span> <em style=""> </em><span style=""> </span></li>
  </ul>
- Previous segment 4 becomes 4, 5, 6, 7, 8
- Therefore, one trans-unit for each <tag></tag> pair

Experiments: html overtagging > XLIFF > CAT
- Html overtagging by CMS html editors produces oversegmentation when converting to XLIFF (following the XSL's logical segmentation strategy)
- The CMS editors' Clean-html function seldom helps!

Experiments: html overtagging > ITS > XLIFF > CAT
- Okapi Rainbow-generated XLIFF from XML+its 1.0
- XML+its 1.0 converted to SDLXLIFF by CAT tool

Translator/Localiser needs
- CMS: Communication; Structure; Agent/Doc/Kn Interaction; Global Meaning & Function; Intratextual relations; Purpose
- Exchange: PM, Translator/Localiser
- CAT/L: Form, layout, expression; Quality, Consistency, Adherence to conventions, leverage, format, language/knowledge building

Translator/Localiser needs
- Meaningful, (dynamically) coherent whole that needs to attract, keep & direct attention
  - Translation as just a matter of words, just a language problem?!
  - Localisation/Translation as adaptation, communication, cultural/professional mediation
- Articles/Items are coherently, cohesively integrated in:
  - General/Particular communicative/performative purpose
  - Sometimes bigger articles
  - Regions in the webpage, & relative positions
  - Hyperlink/Interaction relationships
  - Structure/sitemap relationships (internal and external menus, etc.)
  - Potentially indexed search results
  - Type of article/element/module categories
  - Usability/Accessibility needs/alternatives

Translator/Localiser needs
CMS <xliff/its> CAT/L TOOLS
- Exported units must behave properly and efficiently in CAT/L tools
  - Segmentation
  - XHTML structure, function, meaning of tags
- Preview? Visual/functional contextualisation
  - Link to published webpage, highlighted translated elements
  - Zielinski & Beuster (memoQfest 2012): DB > html > CAT preview
- Control of new elements, updates, translation status, etc.
- Interchange (batch extraction, revision, etc.)
- Other
  - Possibility of placeable adaptation? E.g. specific/global localisable links (href attribute)

Translator/Localiser desiderata
CMS4L10n
- CMS managing content taking translators/localisers/PMs into account
- Separating content from layout & function, but showing interrelationships
  - XLIFF with linked XSL/CSS? (in XLIFF 2.0 L10n kit/portfolio?)
  - Preview, link to published page?
- Classifying elements in a standard way, semantics?
  - Types of articles/pages
  - Types of modules
  - Relations between constituents
- Possibility of PM preprocessing for translation
  - CMS user profiles: localisation PM, localiser
  - E.g. specific/global localisable links (href attribute)
- Including various articles, entities, elements (e.g. flash, graphics, etc.) of a page in an XLIFF file/group element, marking which are for translation and which are already translated or only for context
- Generating html skeleton?

Future work
- In-depth analysis of export/import tools in different CMSs and other Joomla! multilingual managers
  - Josetta, new Joomfish version
- Extraction of contextual, preview information
  - Links to published page containing translatable articles
- Analysis of object types & relationships in web CMSs + accessibility needs