Survey of Language Computing in Asia 2005


Sarmad Hussain, Nadir Durrani, Sana Gul
Center for Research in Urdu Language Processing
National University of Computer and Emerging Sciences
www.nu.edu.pk
www.idrc.ca

Published by
Center for Research in Urdu Language Processing
National University of Computer and Emerging Sciences
Lahore, Pakistan

Copyright International Development Research Centre, Canada
Printed by Walayatsons, Pakistan
ISBN: 969-8961-00-3

This work was carried out with the aid of a grant from the International Development Research Centre (IDRC), Ottawa, Canada, administered through the Centre for Research in Urdu Language Processing (CRULP), National University of Computer and Emerging Sciences (NUCES), Pakistan.

Telugu

Telugu is a Dravidian language spoken in the southern Indian states and is the official language of Andhra Pradesh. It is considered to be the second most widely spoken language in India after Hindi. There are 75 million first-language and about 5 million second-language speakers of Telugu [1]. Figure 1 shows the language tree for Telugu.

Figure 1: Language Family Tree of Telugu [1] (Dravidian > South-Central > Telugu)

Telugu script derives from Brahmi script and is similar to Kannada script. The earliest known inscriptions in the Telugu language date back to the 6th century AD. Telugu was long written in an older style; the writing system was modernized in the second half of the 20th century to agree more closely with the spoken language [2].

Character Set and Encoding

ISCII includes an encoding for Telugu characters, with Windows code page identifier 57005. Telugu script has also been incorporated in Unicode, in the range 0C00-0C7F [3]. Other ad hoc encodings are also in use, and converters between ISCII and these ad hoc Telugu encodings have been developed [17].

Fonts and Rendering

Many OpenType fonts are available for rendering Telugu script, e.g. Akshar Unicode, Code 2000, Pothana 2000 and Vemana 2000 [4]. Microsoft supports Telugu rendering and provides font support for the script, e.g. in the Arial Unicode MS font. Figure 2 shows some of these fonts rendered on the Microsoft platform [4]. Microsoft also provides a guide for developing OpenType fonts for Telugu [8]. Telugu OpenType fonts are not rendered on the Linux platform in regular distributions. However, rendering support has been developed in Pango for GNOME and Firefox, and in the Qt engine for KDE. Upgraded versions are available at [5].
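The Unicode range given under Character Set and Encoding (0C00-0C7F) can be used to test whether text is written in Telugu script. The following is a minimal sketch in Python, not part of the survey; the function names `is_telugu` and `telugu_ratio` are hypothetical and chosen only for illustration.

```python
# Minimal sketch: detect Telugu-script characters by Unicode block.
# The block boundaries (U+0C00..U+0C7F) come from the Unicode chart [3];
# the helper names below are illustrative assumptions.

TELUGU_BLOCK = range(0x0C00, 0x0C80)  # upper bound is exclusive, so 0x0C7F is included


def is_telugu(ch: str) -> bool:
    """Return True if the single character ch lies in the Telugu block."""
    return ord(ch) in TELUGU_BLOCK


def telugu_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that are Telugu."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_telugu(c) for c in chars) / len(chars)


if __name__ == "__main__":
    # The word "Telugu" written in Telugu script, given as escapes.
    sample = "\u0c24\u0c46\u0c32\u0c41\u0c17\u0c41"
    print(telugu_ratio(sample))
    print(telugu_ratio("Telugu"))
```

A check of this kind only identifies the script of Unicode text. Text stored in ISCII or in one of the ad hoc encodings would first have to be converted to Unicode.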

Figure 2: Telugu Unicode fonts [4]

Keyboard

Three keyboard layouts are used for Telugu: Inscript, RTS and WX, of which the first two are very popular. RTS and WX are phonetic layouts [5]. Inscript is standardized by the Government of India [6, 7] and is shown in Figure 3.

Figure 3: Inscript Telugu Keyboard Layout [7]

Microsoft provides a Telugu keyboard, enabling support in all Microsoft applications. Figure 4 shows the keyboard layout provided by Microsoft.

PAN Localization Survey of Language Computing in Asia 2005

Figure 4: Microsoft On-Screen Keyboard for Telugu: (a) Normal, and (b) Shift State

All keyboard layouts mentioned above (Inscript, RTS and WX) are available on the Linux platform. The support is not built in and has to be installed [5].

Collation

Telugu collation is defined but not standardized. Basic support is available on Microsoft and Linux platforms, e.g. see [10].

Locale

The Telugu locale (te_IN) is partially defined and is standardized through posting at IBM ICU and CLDR 1. Further information is available at [5]. Microsoft provides a complete Language Interface Pack (LIP) for Telugu. Figure 5 shows a snapshot of the locale settings for Telugu.
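As an illustration of how an application might use the Telugu locale for collation, the sketch below tries to sort words under a te_IN locale and falls back to plain code point order when that locale is not installed. It is a sketch under stated assumptions, not something described in the survey: the locale name `te_IN.UTF-8` is the usual naming on Linux systems but may differ elsewhere, and the function name `sort_words` is hypothetical.

```python
import locale


def sort_words(words, locale_name="te_IN.UTF-8"):
    """Sort words with the collation rules of locale_name if available.

    Returns (sorted_words, used_locale). used_locale is False when the
    locale is not installed and plain code point order was used instead.
    The default locale name is an assumption about the host system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # Locale not installed: fall back to code point order.
        return sorted(words), False
    try:
        return sorted(words, key=locale.strxfrm), True
    finally:
        # Restore the collation setting that was active before the call.
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    # Three Telugu letters given as escapes: LA, KA, A.
    words = ["\u0c32", "\u0c15", "\u0c05"]
    ordered, used_locale = sort_words(words)
    print(ordered, used_locale)
```

The fallback reflects the point made above that locale support is not necessarily present by default. Code point order is only an approximation of a linguistically correct Telugu ordering.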

Figure 5: Telugu Locale on Windows XP

Support for the Telugu locale is available in the GNU C Library (glibc). The Telugu locale for India has also been officially defined in Red Hat Linux [9]. It is also supported on other open source platforms [10].

Interface Terminology Translation

Interface terminology is not standardized, but is available on many platforms. Microsoft supports a Telugu LIP, which contains a Telugu interface for its applications [11]. A Native Language Project has been initiated for localizing OpenOffice.org version 2.0. The team has only just started, so no output from this group has been committed yet [12]. The India Linux project also includes a language team for localizing Linux in Telugu [14]; however, no output of the team has been published. GNOME translation has started for Telugu, and about 85% of the glossary has been translated [15]. This team is working independently and has not registered at www.gnome.org for translating GNOME 2.8 GUI messages to local languages [15]. A Telugu localization team has also registered with the Mozilla Localization Project (MLP). This team is working on localizing Mozilla version 1.2.1; no outputs of this project have been posted [13].

Status of Advanced Applications

Telugu support has been added to the Unicode text editor Yudit [10]. Work has also been done on a Telugu lexicon, the open source spell checker Aspell [10], and a prototype optical character recognition system. Significant work on Telugu text-to-speech, lexicon and Telugu-Hindi machine translation is also in progress, with working systems already released [16, 17].

References

[1] http://www.ethnologue.com/show_language.asp?code=tel
[2] http://www.omniglot.com/writing/telugu.htm
[3] http://www.unicode.org/charts/PDF/U0C00.pdf
[4] http://salrc.uchicago.edu/resources/fonts/telugufonts.html
[5] http://telugu.sarovar.org/wiki/
[6] http://tdil.mit.gov.in/keyoverlay.htm
[7] http://tdil.mit.gov.in/isciichart.pdf
[8] http://www.microsoft.com/typography/opentype%20dev/telugu/intro.mspx
[9] http://www.indlinux.org/downloads/locale/locales/te_in
[10] http://telugu.sarovar.org/wiki/index.php/status
[11] http://www.bhashaindia.com/downloadsv2/category.aspx?id=8
[12] http://te.openoffice.org
[13] http://www.mozilla.org/projects/l10n/mlp_status.html
[14] http://indlinux.org/lang/
[15] http://l10n-status.gnome.org/gnome-2.8/index.html
[16] http://ltrc.iiit.net/showfile.php?filename=research/
[17] http://ltrc.iiit.net/showfile.php?filename=downloads/