Internationalization & Localization
|
|
|
- Jesse Edwards
- 9 years ago
- Views:
Transcription
1 Internationalization & Localization Of OpenOffice.org - The Indian Perspective Comprehensive Office Suite for Multilingual Indic Computing Bhupesh Koli, Shikha G Pillai <[email protected]> <[email protected]> CENTRE FOR DEVELOPMENT OF ADVANCED COMPUTING (Formerly NCST) 68, Electronic City, Hosur Road, Bangalore , India. Tel: Fax: OpenOffice.org Conference 1
2 Contents Abstract 3 Introduction 4 OpenOffice.org in India 4 Overview of Indian scripts 4 Vowels 5 Consonants 5 Graphical Symbols 6 Issues in Indic computing 6 Unicode for Indian scripts 6 OpenType 7 Indian Language Input Methods 7 Indian Language Support in OpenOffice.org BharateeyaOO.o 8 Internationalization 9 Complex Text Layout 9 Locale Data 12 Localization 12 Localization Tools 13 Glossary and Translation 13 Representing the Language in the build environment 13 Building the Installation set 14 Future 15 Conclusion 16 Reference 16 Acknowledgement 16 Copyright 17 OpenOffice.org Conference 2
3 Abstract India supports a culturally and linguistically diverse population, majority of whom are excluded from the productive usage of information technology due to the lack of standardized and economical Indian language enabled software. OpenOffice.org is the leading office productivity suite through open-source initiatives, available cross-platform with internationalization and localization support for major International languages. This paper examines the development aspects, usage and prospects of OpenOffice.org internationalized and localized to cater to the Indian market. Most Indian scripts originate from the Brahmi script and follow complex rules of layout involving consonants, vowels, special symbols, conjuncts and ligatures. Unicode encoding for Indian languages establishes a similar pattern among the scripts. We will examine these orthographic rules and how Complex Text Layout algorithms can be developed for Indian scripts in OpenOffice.org. Also explored are storage and rendering aspects of Indian text, along with font technologies suitable for Indian scripts. The Internationalization (i18n) and Localization (l10n) framework of OpenOffice.org sets guidelines for localization and internationalization work of the suite in other languages. The project BharateeyaOO.o ( commenced on the lines of these frameworks, to achieve Indian language support in OpenOffice.org. With initiatives for localizations in major languages of India, Complex Text Layout support, Indian locales, dictionary support and collation algorithms, the project aims at a completely Indianized Office suite packaged economically for the Indian user. This paper concludes with an insight into the development, implementation details and progress of this project. OpenOffice.org Conference 3
4 1. Introduction India is a country having eighteen constitutional languages, six thousand dialects and an estimated eight hundred fifty languages in daily use, and supporting 15% of the world s population. The growth and penetration of Information Technology in this developing nation is however, restricted to only 10% of its population, since the other 90% are yet to be conversant in the lingua franca of computer mediated communication English. This calls for development of software supporting locallanguages, allowing creation and manipulation of regionally relevant content, along with computing in local mediums. Such software need to be economically viable, and compliant to international standards to allow information dissemination transparently across economic, cultural and social barriers. Here, we analyze the prospects and developmental aspects of one such software - the internationally acclaimed product OpenOffice.org - in Indian languages. 2. OpenOffice.org in India OpenOffice.org is the open source project through which Sun Microsystems has released the technology for the popular StarOffice[tm] Productivity Suite. With a worldwide developer community, open access to a common source base for highproductivity office applications that runs on all major platforms, and extensible internationalization and localization frameworks for development in all international languages, OpenOffice.org is an unparalleled effort towards open technology access in this computing era. For India, OpenOffice.org provides not only economic viability, but also extensibility in terms of Indian language development and support achieved so far. Indian language support in OpenOffice.org aims to fulfill the basic requirement of having the commonly used collection of office applications on each Indian desktop. With support added for Indian localized interfaces and technical processing of Indian languages, the software will be in the forefront of Indian language development and information propagation across all digital barriers. 3. Overview of Indian Scripts Of the eighteen official Indian languages, fifteen are of common origin the ancient Brahmi script of India [1]. The other three namely Kashmiri, Sindhi and Urdu are of Perso-Arabic origin. Based on the Brahmi script classification and syllabic features, the characters of the fifteen major languages have been classified as Consonants (C) and Vowels (V). These two basic groups in various orders, along with graphical symbols (G) of the script, combine to form the orthographic syllable. OpenOffice.org Conference 4
5 The organization of consonants and vowels, as well as the possible combinations form the basics of each Indian language. This organization results in explicit rules for creation of characters and conjuncts. Indian scripts generally exhibit complex behavior in syllable composition, formation and positioning. There also exist many shaping rules for each character, depending on the syllable it forms with other characters. These shaping rules may vary for individual scripts, but on the whole, based on the organizational structure and the classification of characters within each script (explained below), all major Indian scripts display a similar behavioral pattern. 3.1 Vowels Independent and Dependent Vowels in Indian languages have both independent and dependent forms. The independent form stands by itself, and emphasizes the sound that it represents. The dependent vowel, also known as vowel signs/matras, is depicted in combination with a consonant or consonant cluster. The sound of this combination is the combined result of the sounds of the consonant/cluster and the vowel. Depending on the Indian script, these dependent vowels may attach itself to the any part of the consonant cluster: up, down, left or right. In certain scripts, combination of a consonant and dependent vowel may also change the inherent shape of both the vowel and consonant. The Tamil script has been used here to highlight independent and dependent vowel combinations with a consonant: 3.2 Consonants Full and Half Forms Consonants are generally treated as the base of a syllable, and can be found having an inherent vowel, generally with the sound {a}. When combined with a dependent vowel, the sound of the inherent vowel is overridden with that of the dependent vowel. A consonant may be visible in its full or half form, the latter depicting a consonant without the inherent vowel. This half form of a consonant is obtained by combining the consonant with a special character called a virama/halant. The halant kills the inherent vowel of the consonant and is used in forming consonant conjuncts. OpenOffice.org Conference 5
6 Depicted below is an example of different conjuncts formed from two consonants in Hindi, using the halant?and zero-width characters ZWJ and ZWNJ: 3.3 Graphical Symbols Modifier marks Each Indian script also has modifier marks and diacritics that add to the consonant to produce some modified phonetic behavior. There also exist special symbols all having different combination roles, and Indian numerals within the script. The following showing consonant, vowel and special character combination logic in Hindi; the boundaries of each cluster have been marked: 4. Issues In Indic Computing Enabling Indian languages in software requires focusing on character encoding, font, character input, rendering and shaping technology for each script. Described below are some of these issues: 4.1 Unicode for Indian scripts Unicode encoding (16 bit) for Indic scripts is based on the 1988 draft of ISCII (Indian script code for Information Interchange), brought out by the Indian Department of Electronics (DoE), Government of India. The ISCII code set was evolved to exploit the common phonetic structure of Indian scripts, as an 8-bit coded character set. The lower 128 characters are from ASCII, while the top 128 are for one Indian script. This arrangement sought to cater to all ten main Brahmi-based Indian scripts, with an INSCRIPT keyboard overlay providing a logical arrangement of the characters. OpenOffice.org Conference 6
7 Nine main Indian scripts viz. Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada and Malayalam have been encoded in Unicode in the South and Southeast Asian script range [2]. The encodings are used in conjunction with the 8-bit transformation format UTF-8 to effect storage and rendering of Unicode text effectively. UTF-8 is compatible to ASCII ranges and transforms each Unicode Indian character to three byte representations. After development of the Unicode standard, many issues have been brought up by Indic computing societies, about the incompleteness of the Unicode standard with regard to proper representation of all the Indian scripts. These are mainly issues regarding misrepresentation of some characters, missing characters and incorrect shape or annotation. The Government of India has proposed some changes to the existing encoding for Indian scripts, to bring out the best possible representation [3]. Apart from Unicode, there are also other encoding standards in India, like ISCII and some proprietary/font encodings for Indian languages. However, since Unicode is an international multilingual standard compatible across software platforms, it can be seen as a better option for most Indian languages. 4.2 OpenType The OpenType specification [4] uses Unicode for character encoding and allows development of fonts containing large glyph sets and glyph variants. It allows font designers to map single characters to multiple glyphs and vice-versa, a single glyph for multiple characters. OpenType tables like GSUB (glyph substitution) and GPOS (glyph positioning) allows complex positioning, definition of conjuncts and cluster formations, and encoding of specific script features. With such explicit script information, textprocessing applications need only provide the processing details than concentrate on linguistic specifications. For Indian languages, OpenType fonts provide the best solution, and has proven results with respect to support for complex features within each script. 4.3 Indian Language Input Methods Language Input in a software depend on the recognition of the language in the software as well as methods to input characters within the language. Input methods may be keyboard-based, through speech recognition, or dictionary/vocabulary dependent. Most common ways of Indian language input in software is through usage of logical keyboards based on mapping the standard keyboard. Since Indian language input in itself is complex in nature, as opposed to Latin scripts, keyboard layouts for Indian languages also require much deliberation before implementation and usage. The INSCRIPT keyboard, an initiative of the Department of Electronics, maps the standard QWERTY keyboard for Indic scripts. OpenOffice.org Conference 7
8 The keyboard overlay exploits the logical ordering of Indian scripts, bringing about a common structure among all the scripts, with the vowels on the left hand side, and consonants on the right. The Windows platform Indian locales uses INSCRIPT keyboard layout for Indian character entry. On Linux, there are keymaps developed, used in conjunction with xkb or xmodmap, that use the INSCRIPT layout. There are also a variety of phonetic, regional and romanized keyboards available, however, some are proprietary and based on various other standards. 5. Indian Language Support in OpenOffice.org The BharateeyaOO.o project The BharateeyaOO.o project ( was commenced at the National Centre for Software Technology (officially CDAC from April 2003) in 2001, to initiate development in OpenOffice.org through localizations and internationalization support within the suite for major Indian languages, on the prominent software platforms. Such an initiative was brought about to compensate for the lack of Indian language support in key software like office suites, which are almost every computer user s requirement. The project targets implementation of the following features in OpenOffice.org for Indian languages Complex Text Layout: Enabling CTL support in the suite, for all major Indian languages based on a generic framework, on Windows and Linux. This also involves development of Indian language solutions such as fonts, and input methods, according to requirement of the platform. Indian Locales: Enabling locale data in the suite, for effecting currency, calendar, dictionary and collation specifications for each Indian language. Localization: Translation and localization of the suite on Windows and Linux for the Indian languages Hindi, Tamil, Kannada, Telugu, Punjabi, Marathi, Gujarati, Bengali and Malayalam. The development of the project was based on the i18n (Internationalization) and l10n (Localization) frameworks of OpenOffice.org. The sections described below, highlight the developmental aspects of BharateeyaOO.o with insights into the solutions provided and observations made during implementation. OpenOffice.org Conference 8
9 6. Internationalization (i18n) The i18n framework of OpenOffice.org offers internationalization functionality within the suite. Encapsulated within two modules of the source i18n and i18npool, the framework designates internationalization requirements for western, CJK and complex script based languages. Addressing these requirements for Indian languages, addition of complex text layout support for major Indian languages, and locale data support was taken up. 6.1 Complex Text Layout The presentation of Indian Language scripts requires contextual processing for output and display, due to its complex nature. Main issues that need to be tackled are those of conjunct formation from more than one character, and processing of such clusters within a document. For the requirements of CTL, i.e. text rendering, processing logic for arrow keys, positioning of caret and cursor, backspace/delete processing etc. character clusters lend a new dimension and each CTL event must be able to accommodate clusters and behave accordingly. Some typical requirements of Indian script CTL processing are that the arrow and mouse events should result in the caret being positioned at cluster boundaries, and not within the cluster. Deletion should remove the entire cluster containing more than one code point, while a backspace operation should be allowed to remove characters one by one. Combining characters also need to be considered while selection, cut/copy/paste and text break issues. Finding a combination for Indian Languages A text-processing engine for Indian scripts should be able to know character behavior, as in whether it can be involved in a combination with its neighbors. This involves logic processing for finding a combination in a specific language. In OpenOffice.org, such logic can be developed within the i18n implementation of BreakIterator. The BreakIterator is used to find a character boundary, based on logic provided on the particular script. This character boundary logic, accessed through interfaces, can aid the text-processing engine in finding valid character combinations. All major Indian languages that originate from the Brahmi script follow similar rules of combination. This feature can be used to develop a generic algorithm for character combinations. However, there are some exceptions in certain scripts, which may require specific implementation. OpenOffice.org Conference 9
10 Character combination affects Text width Combination often results in a modification of a character s appearance or even a completely different character substituted. This leads to the cluster having a new width, rather than a width equal to the sum of individual width of characters involved in combination. If the text-processing engine places the caret according to the sum of the widths, this may lead to an error. Thus along with combination character processing, determination and calculation of text width is also necessary. Implementation of Indian script CTL Implementation of CTL for Indian scripts, based on Character combinations and their corresponding text width, is resolved to the following three major feature additions: Additional support required for Complex Script characters. OpenOffice.org Visual class library [VCL] module provides access to different GUI systems. This module also defines the APIs to render multilingual text. An array [WidthInfoArray] of type - structure [ImplWidthInfoData] has been defined to store the width of each unique character occurring in a document. If the character is used in the document once again, its width will be looked up in the array, instead of repetitive computation. struct ImplWidthInfoData{ USHORT mnchar; Long mnwidth; }; For multilingual Indian text, comprising of complex characters, this becomes a limitation, since, there is no provision to store the width of the character cluster, which in itself has a unique value dissimilar to its components. CTL Processing is constrained here, as the exact widths of combinations are not stored. To counter with this limitation, we propose a new layout of the structure for complex scripts. The member variable mnchar should be a string to store complex characters (consisting of more than one code-point). Following is a possible definition: OpenOffice.org Conference 10
11 struct ImplWidthInfoData{ USHORT mnchar; string machar; long mnwidth; }; Abstract interface to get a width of combination characters Complex text may contain single character, combining characters or even multilingual text. Width needs to be calculated correctly based on each feature within the text. If characters are single (not combining), width can be calculated individual. In case of combining characters, we need to have an API that abstracts system level functionality to get exact width of the string. The [VCL] module implementation contains a SalGraphics (System Abstraction Layer Graphics) class, where a function to get width of characters is defined. Here we propose to have a function, which can return width of strings (character cluster) passed as arguments. Class SalGraphics {... GetCharWidth (nchar, nchar, nlen);... // It also requires an interface to compute string width GetTextWidth (String, nlen); }; Modification in the text processing in source All width processing, spacing array and text break implementations have to be modified to accommodate the changes mentioned above. One such implementation can be as follows. The example is that of a function in the [VCL] layer definition of the OutputDevice class. long OutputDevice::GetTextArray( String&, long*, nindex, nlen ) const {... for ( i = 0; i < nlen; i++ ) { /* Check the character to see if it is from a complex script. Call the BreakIterator APIs for that language, to find out the character boundaries and their widths. According to whether the character has combined with another one or not, width can be stored in the modified array along with the single/combination character*/ } } OpenOffice.org Conference 11
12 All other CTL implementations like arrow processing, delete/backspace, etc. inherently uses the BreakIterator implementation, and hence will be resolved to work for the complex script text. 6.2 Locale Data Indian Locale Data has been implemented to provide calendar and currency formats in Hindi. This involves adding a locale file (XML) to the i18npool module, from which it is parsed and compiled into a shared library. At runtime, the locale can be selected from the Options dialog. 7. Localization (l10n) The Localization project of OpenOffice.org, residing at hosts a Localization (l10n) framework based on the multi-platform environment of the suite. The Framework [5] stipulates the method to build localized workspaces, with the introduction of a new language to the suite and clearly demarcates the necessary steps in terms of Getting Milestone workspaces Localizing and Building OpenOffice.org o Representing the language in the build environment o Extraction of strings from the source o Merging back translated strings o Rebuilding the localized code Switching to the newer workspaces Localization of OpenOffice.org was planned with the two major languages: Hindi and Tamil. While the former is the national language and widely spoken in Northern India, the latter is a language spoken in Southern India. After successful localization in these two languages, other language localizations were to be started. OpenOffice.org Conference 12
13 Translation within OpenOffice.org is required for strings composing the resources of the suite, definitions of the setup program and configuration files. These files have to be translated in the target language and rebuilt in a source which has the target language already added to the environment. 7.1 Localization Tools The suite also provides localization tools such as transex3, lngex, cfgex and localized that parse the resource files to produce tab-separated files for translation, and allow merging back of the translated strings within these files to the source. The tabs are crucial for the merging back process. If the translation results in incorrect tab spacings then the file will be corrupted, and may hamper correct merging back of the strings. While transex3, lngex and cfgex work for specific files, the localize tool extracts and merges strings from the entire source. These tools are located in the transex3 module and require the target language to be added to the transex3 source and rebuilt, for extraction and merging to be possible. 7.2 Glossary and Translation The OpenOffice.org glossary, composing about 7000 strings, contains most of the words within the 21,000 actual resource strings to be translated. Using the glossary as a reference for translation, consistency across translations can be assured, as well as decreasing time required for reference. Translation of the strings is to be done within the tab-separated files produced by the localization tools, in a simple editor, and saved in UTF-8 format. The font used for entry of translations need to be a Unicode font so that the UTF-8 values generated are correct. Only this can ensure that the application renders the correct strings on the user interface after localization. For Indian languages, OpenType fonts can produce best results. Resource translations need to be apt and context sensitive as far as possible. For this, the context information of the strings within the files, i.e. the module information and user-interface element that holds the string, can be used to guess out the context during translation. 7.3 Representing the language in the build environment Language support in OpenOffice.org is demarcated in six different modules of the source i.e. solenv, tools, svx, rsc, transex3 and the installation project comprising the modules scp and setup2. Any language within the source is represented by six different parameters: OpenOffice.org Conference 13
14 Locale Identifier: Reserved Hexadecimal LANGID defined in Windows. Can also be specified by a user, for languages that do not have an existing definition. Numerical Code: Decimal equivalent of the LANGID. Language Identifier: A unique two-digit identifier that represents the language. ISO-Code: A short string, which indicates the language. Symbolic Code: Name of the language. If the language is spoken in different regions, then a 2-character representation can be also added for that region along with the language. Environment Variable: A variable to be set to TRUE to indicate that the build should be performed for that language. The following table gives the values adopted for Indian localizations: Language Locale Numerical Language ISO Symbolic Environment Identifier Code Identifier Code Code Variable Hindi 0x hi hindi RES_HINDI Tamil 0x ta tamil RES_TAMIL The values assigned for each language across all the files should be uniform and consistent, for the user interface to be completely localized. The ISO-code may be predefined in the file isolang.cxx in the tools module, and if so, it would be a better approach to adopt the same value for all further modifications. The encoding format for the language also has to be specified within the tools module. For Indian languages, UTF8 encoding was used. (Specified as RTL_TEXTENCODING_UTF8 in the file l2txtenc.cxx). 7.4 Building the installation set A rebuild of the source with the localization modifications added will produce the new installation set. For the rebuild, the environment variable should be set to TRUE within the environment file (*.bat on Windows and *.set on Linux) and LANGEXT=## provided as argument to the dmake/build command. The installation sets after building are created in instsetoo/*.pro/##/normal. (*.pro maybe wntmsci7.pro or unxlngi3.pro on Windows and Linux respectively. ## is the Language identifier). Following are some screenshots for the localization work in Hindi on Windows and Linux. OpenOffice.org Conference 14
15 Localized strings on both platforms can be rendered using any OpenType font for that language. The preferences for font can be set within the source or at run-time. Localization on Windows Draw (font is Mangal) Localization on Linux Text Document (font is Raghu) 8. Future of BharateeyaOO.o Project The Internationalization framework of OpenOffice.org can be extended to provide dictionary support, and locale data support for Hindi and other Indian languages. Implementation of sorting/collation for Indian languages also can be developed. Collation is the linguistic and culturally sensitive sort order of strings in a language. It is dissimilar to encoding order, esp. in the case of Unicode, since collation is language dependent while encoding is script based. Hence, for an Indian script like Devanagari, which is the base for Hindi, Marathi, Sanskrit and Konkani, collation [6] OpenOffice.org Conference 15
16 orders have to be defined for each of the languages, all of which have different sorting requirements. For Indian languages, collation order should also be sensitive to multiple code points which form conjuncts, and weights given to consonant modifiers and special characters in the script. OpenOffice.org defines collator components that can be developed to produce collation algorithms for a particular language. 9. Conclusion The availability of Indian language support in OpenOffice.org can be viewed as a major boon to Indian language computing and development. The multi-faceted features of this leading suite coupled with its availability in the open source domain, makes it an appealing and highly sought-for technology solution for Indian users. With this paper, we aim to introduce the possibilities of this suite in Indian languages and deliver an insight into the subsequent impact of its availability, on the propagation of Indian languages in the world of Information Technology. 10. References [1] SP Mudur et al. Computers & Graphics 23 (1999) 7-24 An architecture for the shaping of Indic Texts [2] South and Southeast Asian script encoding in Unicode [3] Proposed changes in Indian scripts given by the Department of Information Technology, Ministry of Communications and Information Technology, India [4] OpenType Indic script support [5] L10N Framework [6] : Issues in Indic collations 11. Acknowledgement We would like to thank all the past members of this project as well as our other colleagues who have encouraged us in our work. We are indebted to members of our organization, NCST, for the pioneering research and development they have done in Indian Language Technology. We would also like to extend our sincere gratitude to Dr. S.P Mudur, for initiating this work and inspiring us to do our best. OpenOffice.org Conference 16
17 Finally, we thank all the OpenOffice.org team members for the wonderful support they gave us, and for providing assistance and answers to all our queries. 12. Copyright Copyright 2003 Centre for Development of Advanced Computing (Formerly NCST), 68 Electronics City, Bangalore , Karnataka, India. All rights reserved. OpenOffice.org Conference 17
Red Hat Enterprise Linux International Language Support Guide
Red Hat Enterprise Linux International Language Support Guide Red Hat Enterprise Linux International Language Support Guide Copyright This book is about international language support for Red Hat Enterprise
National Language (Tamil) Support in Oracle An Oracle White paper / November 2004
National Language (Tamil) Support in Oracle An Oracle White paper / November 2004 Vasundhara V* & Nagarajan M & * [email protected]; & [email protected]) Oracle
Rendering/Layout Engine for Complex script. Pema Geyleg [email protected]
Rendering/Layout Engine for Complex script Pema Geyleg [email protected] Overview What is the Layout Engine/ Rendering? What is complex text? Types of rendering engine? How does it work? How does it support
SETTING UP A MULTILINGUAL INFORMATION REPOSITORY : A CASE STUDY WITH EPRINTS.ORG SOFTWARE
595 SETTING UP A MULTILINGUAL INFORMATION REPOSITORY : A CASE STUDY WITH EPRINTS.ORG SOFTWARE Nagaraj N Vaidya Francis Jayakanth Abstract Today 80 % of the content on the Web is in English, which is spoken
Tibetan For Windows - Software Development and Future Speculations. Marvin Moser, Tibetan for Windows & Lucent Technologies, USA
Tibetan For Windows - Software Development and Future Speculations Marvin Moser, Tibetan for Windows & Lucent Technologies, USA Introduction This paper presents the basic functions of the Tibetan for Windows
Introduction to Unicode. By: Atif Gulzar Center for Research in Urdu Language Processing
Introduction to Unicode By: Atif Gulzar Center for Research in Urdu Language Processing Introduction to Unicode Unicode Why Unicode? What is Unicode? Unicode Architecture Why Unicode? Pre-Unicode Standards
The Indian National Bibliography: Today and tomorrow
Submitted on: June 22, 2013 The Indian National Bibliography: Today and tomorrow Shahina P. Ahas Central Reference Library, Kolkata, India E-mail : [email protected] Swapna Banerjee Department of
Easy Bangla Typing for MS-Word!
Easy Bangla Typing for MS-Word! W ELCOME to Ekushey 2.2c, the easiest and most powerful Bangla typing software yet produced! Prepare yourself for international standard UNICODE Bangla typing. Fully integrated
Encoding script-specific writing rules based on the Unicode character set
Encoding script-specific writing rules based on the Unicode character set Malek Boualem, Mark Leisher, Bill Ogden Computing Research Laboratory (CRL), New Mexico State University, Box 30001, Dept 3CRL,
LuitPad: A fully Unicode compatible Assamese writing software
LuitPad: A fully Unicode compatible Assamese writing software Navanath Saharia 1,3 Kishori M Konwar 2,3 (1) Tezpur University, Tezpur, Assam, India (2) University of British Columbia, Vancouver, Canada
Chapter 4: Computer Codes
Slide 1/30 Learning Objectives In this chapter you will learn about: Computer data Computer codes: representation of data in binary Most commonly used computer codes Collating sequence 36 Slide 2/30 Data
The Unicode Standard Version 8.0 Core Specification
The Unicode Standard Version 8.0 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers
Localization of Text Editor using Java Programming
Localization of Text Editor using Java Programming Varsha Tomar M.Tech Scholar Banasthali University Jaipur, India Manisha Bhatia Assistant Professor Banasthali University Jaipur, India ABSTRACT Software
Designing Global Applications: Requirements and Challenges
Designing Global Applications: Requirements and Challenges Sourav Mazumder Abstract This paper explores various business drivers for globalization and examines the nature of globalization requirements
EURESCOM - P923 (Babelweb) PIR.3.1
Multilingual text processing difficulties Malek Boualem, Jérôme Vinesse CNET, 1. Introduction Users of more and more applications now require multilingual text processing tools, including word processors,
Challenges of Multilingualism and Possible Approach for Standardization of e-governance Solutions in India
Challenges of Multilingualism and Possible Approach for Standardization of e-governance Solutions in India Swaran Lata 1 * and Somnath Chandra 1 ABSTRACT In this paper we have addressed the major challenges
Preservation Handbook
Preservation Handbook Plain text Author Version 2 Date 17.08.05 Change History Martin Wynne and Stuart Yeates Written by MW 2004. Revised by SY May 2005. Revised by MW August 2005. Page 1 of 7 File: presplaintext_d2.doc
Keyboards for inputting Japanese language -A study based on US patents
Keyboards for inputting Japanese language -A study based on US patents Umakant Mishra Bangalore, India [email protected] http://umakant.trizsite.tk (This paper was published in April 2005 issue of TRIZsite
Adobe Acrobat 9 Pro Accessibility Guide: PDF Accessibility Overview
Adobe Acrobat 9 Pro Accessibility Guide: PDF Accessibility Overview Adobe, the Adobe logo, Acrobat, Acrobat Connect, the Adobe PDF logo, Creative Suite, LiveCycle, and Reader are either registered trademarks
INTERNATIONALIZATION FEATURES IN THE MICROSOFT.NET DEVELOPMENT PLATFORM AND WINDOWS 2000/XP
INTERNATIONALIZATION FEATURES IN THE MICROSOFT.NET DEVELOPMENT PLATFORM AND WINDOWS 2000/XP Dr. William A. Newman, Texas A&M International University, [email protected] Mr. Syed S. Ghaznavi, Texas A&M
Module 9. User Interface Design. Version 2 CSE IIT, Kharagpur
Module 9 User Interface Design Lesson 21 Types of User Interfaces Specific Instructional Objectives Classify user interfaces into three main types. What are the different ways in which menu items can be
Internationalized Domain Names -
Internationalized Domain Names - Getting them to work Gihan Dias LK Domain Registry What is IDN? Originally DNS names were restricted to the characters a-z (letters), 0-9 (digits) and '-' (hyphen) (LDH)
Speaking your language...
1 About us: Cuttingedge Translation Services Pvt. Ltd. (Cuttingedge) has its corporate headquarters in Noida, India and an office in Glasgow, UK. Over the time we have serviced clients from various backgrounds
Product Internationalization of a Document Management System
Case Study Product Internationalization of a ì THE CUSTOMER A US-based provider of proprietary Legal s and Archiving solutions, with a customizable document management framework. The customer s DMS was
I PUC - Computer Science. Practical s Syllabus. Contents
I PUC - Computer Science Practical s Syllabus Contents Topics 1 Overview Of a Computer 1.1 Introduction 1.2 Functional Components of a computer (Working of each unit) 1.3 Evolution Of Computers 1.4 Generations
Server-Based PDF Creation: Basics
White Paper Server-Based PDF Creation: Basics Copyright 2002-2009 soft Xpansion GmbH & Co. KG White Paper Server-Based PDF Creation: Basics 1 Table of Contents PDF Format... 2 Description... 2 Advantages
MODULE 7: TECHNOLOGY OVERVIEW. Module Overview. Objectives
MODULE 7: TECHNOLOGY OVERVIEW Module Overview The Microsoft Dynamics NAV 2013 architecture is made up of three core components also known as a three-tier architecture - and offers many programming features
zen Platform technical white paper
zen Platform technical white paper The zen Platform as Strategic Business Platform The increasing use of application servers as standard paradigm for the development of business critical applications meant
TEXT TO SPEECH SYSTEM FOR KONKANI ( GOAN ) LANGUAGE
TEXT TO SPEECH SYSTEM FOR KONKANI ( GOAN ) LANGUAGE Sangam P. Borkar M.E. (Electronics)Dissertation Guided by Prof. S. P. Patil Head of Electronics Department Rajarambapu Institute of Technology Sakharale,
L2/14-009 Abstract Introduction
P P T 0 1 S P P P P P P S P P P P P 0 S 1 1 S 0 0 1 P 0 S 1 T P 0 S 1 T 1 T P 0 S 1 T P 0 T P P P 0 1 S S 1 0 T P S P 1 0 T S P 0 1 P 0 S 1 T TPPT Form for PT ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY
NAME. Internationalized Domain Names (IDNs) -.IN Domain Registry. Policy Framework. Implementation
.भ रत (.BHARAT) Country Code Top Level DOMAIN (cctld) NAME Internationalized Domain Names (IDNs) -.IN Domain Registry Policy Framework & Implementation Government of India Ministry of Communications &
Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla
Internationalizing the Domain Name System Šimon Hochla, Anisa Azis, Fara Nabilla Internationalize Internet Master in Innovation and Research in Informatics problematic of using non-ascii characters ease
Table Of Contents. iii
PASSOLO Handbook Table Of Contents General... 1 Content Overview... 1 Typographic Conventions... 2 First Steps... 3 First steps... 3 The Welcome dialog... 3 User login... 4 PASSOLO Projects... 5 Overview...
Workflow Requirements (Dec. 12, 2006)
1 Functional Requirements Workflow Requirements (Dec. 12, 2006) 1.1 Designing Workflow Templates The workflow design system should provide means for designing (modeling) workflow templates in graphical
Sterling Web. Localization Guide. Release 9.0. March 2010
Sterling Web Localization Guide Release 9.0 March 2010 Copyright 2010 Sterling Commerce, Inc. All rights reserved. Additional copyright information is located on the Sterling Web Documentation Library:
Internationalizing JavaScript Applications Norbert Lindenberg. Norbert Lindenberg 2013. All rights reserved.
Internationalizing JavaScript Applications Norbert Lindenberg Norbert Lindenberg 2013. All rights reserved. Agenda Unicode support Collation Number and date/time formatting Localizable resources Message
ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM
ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM IRANDOC CASE STUDY Ammar Jalalimanesh a,*, Elaheh Homayounvala a a Information engineering department, Iranian Research Institute for
Implementation of Internet Domain Names in Sinhala
Implementation of Internet Domain Names in Sinhala Harsha Wijayawardhana, Asanka Wasala, Ruvan Weerasinghe and Chamila Liyanage University of Colombo School of Computing 35, Reid Avenue, Colombo 00700
Excel 2007 Basic knowledge
Ribbon menu The Ribbon menu system with tabs for various Excel commands. This Ribbon system replaces the traditional menus used with Excel 2003. Above the Ribbon in the upper-left corner is the Microsoft
PRICE LIST. ALPHA TRANSLATION AGENCY www.biuro-tlumaczen.tv [email protected]
We encourage you to get to know the prices of the services provided by Alpha Translation Agency in the range of standard and certified written translations of common and rare languages, as well as interpretation
How To Write A Domain Name In Unix (Unicode) On A Pc Or Mac (Windows) On An Ipo (Windows 7) On Pc Or Ipo 8.5 (Windows 8) On Your Pc Or Pc (Windows
IDN TECHNICAL SPECIFICATION February 3rd, 2012 1 IDN technical specifications - Version 1.0 - February 3rd, 2012 IDN TECHNICAL SPECIFICATION February 3rd, 2012 2 Table of content 1. Foreword...3 1.1. Reference
PDF Accessibility Overview
Contents 1 Overview of Portable Document Format (PDF) 1 Determine the Accessibility Path for each PDF Document 2 Start with an Accessible Document 2 Characteristics of Accessible PDF files 4 Adobe Acrobat
DiskPulse DISK CHANGE MONITOR
DiskPulse DISK CHANGE MONITOR User Manual Version 7.9 Oct 2015 www.diskpulse.com [email protected] 1 1 DiskPulse Overview...3 2 DiskPulse Product Versions...5 3 Using Desktop Product Version...6 3.1 Product
ISO/IEC JTC1 SC2/WG2 N4399
ISO/IEC JTC1 SC2/WG2 N4399 Universal Multiple-Octet Coded Character Set International Organization for Standardization Organisation Internationale de rmalisation Международная организация по стандартизации
Tamil Indic Input 2 User Guide
Tamil Indic Input 2 User Guide Tamil Indic Input 2 User Guide 1 Contents WHAT IS TAMIL INDIC INPUT 2?... 2 SYSTEM REQUIREMENTS... 2 TO INSTALL TAMIL INDIC INPUT 2... 2 TO USE TAMIL INDIC INPUT 2... 3 SUPPORTED
Multi-lingual Label Printing with Unicode
Multi-lingual Label Printing with Unicode White Paper Version 20100716 2009 SATO CORPORATION. All rights reserved. http://www.satoworldwide.com [email protected] 2009 SATO Corporation. All rights
ELFRING FONTS UPC BAR CODES
ELFRING FONTS UPC BAR CODES This package includes five UPC-A and five UPC-E bar code fonts in both TrueType and PostScript formats, a Windows utility, BarUPC, which helps you make bar codes, and Visual
Keywords : complexity, dictionary, compression, frequency, retrieval, occurrence, coded file. GJCST-C Classification : E.3
Global Journal of Computer Science and Technology Software & Data Engineering Volume 13 Issue 4 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
Working with Tables: How to use tables in OpenOffice.org Writer
Working with Tables: How to use tables in OpenOffice.org Writer Title: Working with Tables: How to use tables in OpenOffice.org Writer Version: 1.0 First edition: January 2005 First English edition: January
PDF Primer PDF. White Paper
White Paper PDF Primer PDF What is PDF and what is it good for? How does PDF manage content? How is a PDF file structured? What are its capabilities? What are its limitations? Version: 1.0 Date: October
TZWorks Windows Event Log Viewer (evtx_view) Users Guide
TZWorks Windows Event Log Viewer (evtx_view) Users Guide Abstract evtx_view is a standalone, GUI tool used to extract and parse Event Logs and display their internals. The tool allows one to export all
Unicode Enabling Java Web Applications
Internationalization Report: Unicode Enabling Java Web Applications From Browser to DB Provided by: LingoPort, Inc. 1734 Sumac Avenue Boulder, Colorado 80304 Tel: +1.303.444.8020 Fax: +1.303.484.2447 http://www.lingoport.com
ASCII Code. Numerous codes were invented, including Émile Baudot's code (known as Baudot
ASCII Code Data coding Morse code was the first code used for long-distance communication. Samuel F.B. Morse invented it in 1844. This code is made up of dots and dashes (a sort of binary code). It was
3D Viewer. user's manual 10017352_2
EN 3D Viewer user's manual 10017352_2 TABLE OF CONTENTS 1 SYSTEM REQUIREMENTS...1 2 STARTING PLANMECA 3D VIEWER...2 3 PLANMECA 3D VIEWER INTRODUCTION...3 3.1 Menu Toolbar... 4 4 EXPLORER...6 4.1 3D Volume
Bibliographic Standards
Bibliographic Standards The INFLIBNET Centre maintains this page as part of its commitment to collaboration with University, College, R & D and National institute libraries in the development, promotion
SQL Server An Overview
SQL Server An Overview SQL Server Microsoft SQL Server is designed to work effectively in a number of environments: As a two-tier or multi-tier client/server database system As a desktop database system
Q&As: Microsoft Excel 2013: Chapter 2
Q&As: Microsoft Excel 2013: Chapter 2 In Step 5, why did the date that was entered change from 4/5/10 to 4/5/2010? When Excel recognizes that you entered a date in mm/dd/yy format, it automatically formats
The Virtual Tibetan Classroom
The Virtual Tibetan Classroom by William Magee, DDBC Thanks to a Generous Grant from the Taiwan National Science Council and the Hopkins MultimediaTibetan Research Archive Project http://haa.ddbc.edu.tw
When older typesetting methods gave
Typographic Terms When older typesetting methods gave way to electronic publishing, certain traditional terms got carried along. Today we use a mix of old and new terminology to describe typography. Alignment
The Unicode Standard Version 8.0 Core Specification
The Unicode Standard Version 8.0 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers
Search Interface to a Mayan Glyph Database based on Visual Characteristics *
Search Interface to a Mayan Glyph Database based on Visual Characteristics * Grigori Sidorov 1, Obdulia Pichardo-Lagunas 1, and Liliana Chanona-Hernandez 2 1 Natural Language and Text Processing Laboratory,
Gujarati Indic Input 3 - User Guide
Gujarati Indic Input 3 - User Guide Contents 1. WHAT IS GUJARATI INDIC INPUT 3?... 2 1.1. SYSTEM REQUIREMENTS... 2 1.2. APPLICATION REQUIREMENTS... 2 2. TO INSTALL GUJARATI INDIC INPUT 3... 2 3. TO USE
Setting Up OpenOffice.org: Choosing options to suit the way you work
Setting Up OpenOffice.org: Choosing options to suit the way you work Title: Setting Up OpenOffice.org: Choosing options to suit the way you work Version: 1.0 First edition: December 2004 First English
What s New in QuarkXPress 8
What s New in QuarkXPress 8 LEGAL NOTICES 2008 Quark Inc. as to the content and arrangement of this material. All rights reserved. 1986 2008 Quark Inc. and its licensors as to the technology. All rights
FileMaker 14. ODBC and JDBC Guide
FileMaker 14 ODBC and JDBC Guide 2004 2015 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker and FileMaker Go are trademarks of FileMaker,
Agile Business Suite: a 4GL environment for.net developers DEVELOPMENT, MAINTENANCE AND DEPLOYMENT OF LARGE, COMPLEX BACK-OFFICE APPLICATIONS
Agile Business Suite: a 4GL environment for.net developers DEVELOPMENT, MAINTENANCE AND DEPLOYMENT OF LARGE, COMPLEX BACK-OFFICE APPLICATIONS In order to ease the burden of application lifecycle management,
How to translate your website. An overview of the steps to take if you are about to embark on a website localization project.
How to translate your website An overview of the steps to take if you are about to embark on a website localization project. Getting Started Translating websites can be an expensive and complex process.
The use of binary codes to represent characters
The use of binary codes to represent characters Teacher s Notes Lesson Plan x Length 60 mins Specification Link 2.1.4/hi Character Learning objective (a) Explain the use of binary codes to represent characters
Technical Information Abstract
1/15 Technical Information Abstract Disclaimer: in no event shall Microarea be liable for any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data or profits,
Microsoft Word 2010 Prepared by Computing Services at the Eastman School of Music July 2010
Microsoft Word 2010 Prepared by Computing Services at the Eastman School of Music July 2010 Contents Microsoft Office Interface... 4 File Ribbon Tab... 5 Microsoft Office Quick Access Toolbar... 6 Appearance
Exhibit F. VA-130620-CAI - Staff Aug Job Titles and Descriptions Effective 2015
Applications... 3 1. Programmer Analyst... 3 2. Programmer... 5 3. Software Test Analyst... 6 4. Technical Writer... 9 5. Business Analyst... 10 6. System Analyst... 12 7. Software Solutions Architect...
Your single-source partner for corporate product communication. Transit NXT Evolution. from Service Pack 0 to Service Pack 8
Transit NXT Evolution from Service Pack 0 to Service Pack 8 April 2009: Transit NXT Service Pack 0 (Version 4.0.0.671) Additional versions of DTP programs supported: InDesign CS3 and FrameMaker 9 Additional
Kannada Indic Input 2 - User Guide
Kannada Indic 2 - User Guide Kannada Indic 2 - User Guide 2 Contents WHAT IS KANNADA INDIC INPUT 2?... 3 SYSTEM REQUIREMENTS... 3 TO INSTALL KANNADA INDIC INPUT 2... 3 TO USE KANNADA INDIC INPUT 2... 4
Oracle Database 11g Express Edition PL/SQL and Database Administration Concepts -II
Oracle Database 11g Express Edition PL/SQL and Database Administration Concepts -II Slide 1: Hello and welcome back to the second part of this online, self-paced course titled Oracle Database 11g Express
Gujarati Indic Input 2 - User Guide
Gujarati Indic Input 2 - User Guide Gujarati Indic Input 2 - User Guide 2 Contents WHAT IS GUJARATI INDIC INPUT 2?... 3 SYSTEM REQUIREMENTS... 3 TO INSTALL GUJARATI INDIC INPUT 2... 3 TO USE GUJARATI INDIC
Context-aware Library Management System using Augmented Reality
International Journal of Electronic and Electrical Engineering. ISSN 0974-2174 Volume 7, Number 9 (2014), pp. 923-929 International Research Publication House http://www.irphouse.com Context-aware Library
Translating QueueMetrics into a new language
Translating QueueMetrics into a new language Translator s manual AUTORE: LOWAY RESEARCH VERSIONE: 1.3 DATA: NOV 11, 2006 STATO: Loway Research di Lorenzo Emilitri Via Fermi 5 21100 Varese Tel 0332 320550
ScreenMatch: Providing Context to Software Translators by Displaying Screenshots
ScreenMatch: Providing Context to Software Translators by Displaying Screenshots Geza Kovacs MIT CSAIL 32 Vassar St, Cambridge MA 02139 USA [email protected] Abstract Translators often encounter ambiguous
FileMaker 13. ODBC and JDBC Guide
FileMaker 13 ODBC and JDBC Guide 2004 2013 FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker and Bento are trademarks of FileMaker, Inc.
Multilingual Interface for Grid Market Directory Services: An Experience with Supporting Tamil
Multilingual Interface for Grid Market Directory Services: An Experience with Supporting Tamil S.Thamarai Selvi *, Rajkumar Buyya **, M.R. Rajagopalan #, K.Vijayakumar *, G.N.Deepak * * Department of Information
Building Multilingual Search Index using open source framework
Building Multilingual Search Index using open source framework ABSTRACT Arjun Atreya V 1 Swapnil Chaudhari 1 Pushpak Bhattacharyya 1 Ganesh Ramakrishnan 1 (1) Deptartment of CSE, IIT Bombay {arjun, swapnil,
winhex Disk Editor, RAM Editor PRESENTED BY: OMAR ZYADAT and LOAI HATTAR
winhex Disk Editor, RAM Editor PRESENTED BY: OMAR ZYADAT and LOAI HATTAR Supervised by : Dr. Lo'ai Tawalbeh New York Institute of Technology (NYIT)-Jordan X-Ways Software Technology AG is a stock corporation
Perfect PDF 8 Premium
Perfect PDF 8 Premium Test results ( gut Good, sehr gut very good) refer to versions 7, 6 and 5 of Perfect PDF. Professionally create, convert, edit and view PDF, PDF/A and XPS files Perfect PDF 8 Premium
Embedded Software Development with MPS
Embedded Software Development with MPS Markus Voelter independent/itemis The Limitations of C and Modeling Tools Embedded software is usually implemented in C. The language is relatively close to the hardware,
Right-to-Left Language Support in EMu
EMu Documentation Right-to-Left Language Support in EMu Document Version 1.1 EMu Version 4.0 www.kesoftware.com 2010 KE Software. All rights reserved. Contents SECTION 1 Overview 1 SECTION 2 Switching
Kazuraki : Under The Hood
Kazuraki : Under The Hood Dr. Ken Lunde Senior Computer Scientist Adobe Systems Incorporated Why Develop Kazuraki? To build excitement and awareness about OpenType Japanese fonts Kazuraki is the first
Terms and Definitions for CMS Administrators, Architects, and Developers
Sitecore CMS 6 Glossary Rev. 081028 Sitecore CMS 6 Glossary Terms and Definitions for CMS Administrators, Architects, and Developers Table of Contents Chapter 1 Introduction... 3 1.1 Glossary... 4 Page
Rotorcraft Health Management System (RHMS)
AIAC-11 Eleventh Australian International Aerospace Congress Rotorcraft Health Management System (RHMS) Robab Safa-Bakhsh 1, Dmitry Cherkassky 2 1 The Boeing Company, Phantom Works Philadelphia Center
PCB Project (*.PrjPcb)
Project Essentials Summary The basis of every design captured in Altium Designer is the project. This application note outlines the different kinds of projects, techniques for working on projects and how
The Sacred Letters of Tibet
The Sacred Letters of Tibet TIBET IS another arena of confluence of two ancient cultures. Tibet adopted government organization and social standards from China. But its spiritual guidance came from Buddhism.
Microsoft Excel 2010 Part 3: Advanced Excel
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting
AFN-FixedAssets-062502
062502 2002 Blackbaud, Inc. This publication, or any part thereof, may not be reproduced or transmitted in any form or by any means, electronic, or mechanical, including photocopying, recording, storage
DRH specification framework
DRH specification framework 2007-03-15 EDM - NIED Takeshi KAWAMOTO, Hiroaki NEGISHI, Mitsuaki SASAKI 1 DRH Basic Development before Sep. 2007 Server architectures Search architectures Multilanguage Architectures
The corresponding control ladder program is shown at below: The content of element comment will be built is shown below
Introduction This tutorial explains how to build an application by using the Winproladder programming package to write a ladder control program. In this tutorial we will not tackle the advanced features
Firewall Builder Architecture Overview
Firewall Builder Architecture Overview Vadim Zaliva Vadim Kurland Abstract This document gives brief, high level overview of existing Firewall Builder architecture.
