Cross-Language Instant Messaging with Automatic Translation



Che-Yu Yang
Department of Information Management
China University of Technology
Taipei, Taiwan
e-mail: cyyang@cute.edu.tw

Abstract

With the rapid development of Internet technologies, instant messaging has become an important medium through which a huge number of people communicate with friends, family, and even colleagues at work. People from different parts of the world speak different native languages, so even though instant messaging technology is now mature, a language barrier to communication remains. This study integrates "instant messaging" and "machine translation" technologies so that people can chat easily without being familiar with each other's native language, thereby overcoming the language barrier. Furthermore, with proper design, this mechanism can also facilitate conversation practice.

Keywords: instant messaging; machine translation; cross-language conversation; distance learning

I. INTRODUCTION

In recent years, along with the development of Internet communication technologies, various network-related applications have sprung up. In the Web 2.0 era, social networks and their related applications are the hottest topics. Among them, instant messaging (IM) has become an important communication medium because it is convenient and free to use. Through the Internet we can make friends with people around the world and chat with them using a computer. Instant messaging has shortened the geographical distance between people all over the world: a conversation is as easy as sitting in front of a computer and typing, so communication has become easy and efficient. However, the invisible distance, that is, the barrier that results from the different native languages people speak, has not yet been eliminated.
Imagine a Japanese Internet user sitting in front of his computer: how would we hold a conversation with him through instant messaging? Even entering Japanese words with the keyboard is a problem, let alone understanding and writing Japanese. This is one major reason why most people's instant messaging contact lists contain only a small number of friends from other countries. It is still hard for people who speak different native languages to communicate with each other.

This language barrier can be overcome with natural language processing (NLP) technology. NLP technologies such as information retrieval, speech recognition, machine translation, and automatic summarization have developed rapidly in recent years. Machine translation can perform the language translation task in place of human translators or language experts. If "instant messaging" and "machine translation" technologies are combined, instant messaging is no longer just a messenger: it also translates the content of messages in real time. This removes the communication barrier for people who speak different languages. Furthermore, with proper design, this mechanism can also support conversation practice during language learning.

Imagine a scenario: Xiaoming, a boy from Taiwan, has in his MSN contact list an American friend, Mary, and a Japanese friend, Miyoko. Xiaoming, a baseball fan, is very concerned about the recent situation of the baseball player Chien-Ming Wang in Major League Baseball. So he decides to send Mary a message in Chinese, asking:

王建民將要離開紐約洋基隊了嗎?

Mary receives this statement:

"Is Chien-Ming Wang going to leave the New York Yankees?"

She immediately types the following in English:

"He will sign the contract with the Washington Nationals."
And Xiaoming sees Mary's reply in Chinese:

他將與華盛頓國民隊簽約

In the above scenario, Xiaoming and Mary each use their own native language to communicate with the other: Xiaoming writes in Chinese, Mary in English. Ordinarily they would not be able to understand each other; here, however, they communicate without significant problems, as if each were talking to someone who speaks the same native language. Xiaoming talks to Miyoko in the same fashion; the only difference is that Miyoko speaks Japanese. This scenario is no longer science fiction: with instant messaging and machine translation technologies available, it is a matter of integration.

II. RELATED WORKS

A. Instant Messaging

Instant messaging is a network service that allows two or more people to chat with each other by text. It has developed rapidly in recent years and has been integrated with more and more functions, such as offline message delivery, voice chat, video chat, and file transfer. Instant messaging is now without doubt one of the most popular network services on the Internet.

In April 2009, comScore, Inc. released a study [1] of Internet users in France, which found that instant messaging accounted for the highest share of their total time online at 14.3 percent, followed by social networking at 5.7 percent. In October 2009, another comScore report [2] indicated that online communications, entertainment, and social networking occupied the highest share of Hong Kong Internet users' attention; instant messengers accounted for the highest share of minutes spent online at 16 percent. Yet another survey [3] indicated that MSN Messenger [4] has the strongest penetration worldwide, with 61 percent of worldwide IM users using the application. MSN Messenger is also dominant in Latin America, reaching more than 90 percent of IM users, and in Europe and Asia Pacific, reaching more than 70 percent of IM users in each region. North America is the most competitive IM market, with MSN Messenger, AOL/AIM [5], and Yahoo! Messenger [6] each garnering between 27 percent and 37 percent of IM users.

According to a survey in Taiwan by InsightXplorer Limited [7], the usage rate of instant messaging in Taiwan is 92.7%, and 50.1% of netizens have installed two or more instant messaging programs. In terms of user age, the younger netizens are, the more often they communicate with friends via instant messaging. Among users aged 15 to 19, 99% use instant messaging; even among users aged 35 to 39, the usage rate is as high as 81.3%.
As for the user population, MSN Messenger has more than 700 million users, while Yahoo! Messenger has about 470 million. Because instant messaging is so popular and its communication functions connect naturally with social networks, Internet service providers are now devoted to integrating various network services, such as e-mail, file sharing, network storage, online games, street map queries, news readers, RSS feeds, and many others, to create a one-stop information service.

B. Electronic Dictionary

The electronic dictionary is a major product of machine translation technology and is generally divided into hardware and software types. A hardware electronic dictionary is a handheld mobile device that looks like a PDA; its main function is looking up words in built-in dictionaries. Most such devices also offer additional functions such as language learning, a calendar, a calculator, and notes. A software electronic dictionary is usually available either as packaged software or as a network service accessed through a Web browser (an online service). The latter requires no software installation and is free to use, so it is currently the most popular way to provide a dictionary service; common examples are Yahoo! Dictionary [8], Google Translate [9], and Microsoft Bing translations [10]. These online dictionary services normally provide vocabulary queries, paragraph translation, webpage translation, text file translation, and other functions. Google Translate and Microsoft Bing translations provide not only a number of handy small tools and resources for end users, but also application programming interfaces (APIs) for software developers [11,12].
One benefit of automated translation technology is that many Internet content providers make copies of their original web content available in various language versions generated by machine translation, thereby expanding their audience to a global scale.

III. INSTANT MESSAGING WITH LANGUAGE TRANSLATION

With today's technology, the cross-language dialogue scenario described earlier is not out of reach. The key is to combine "instant messaging" and "electronic dictionary" technologies: the instant messaging service not only passes messages between the two sides of a dialogue, but also translates the content of the messages into the proper language. The electronic dictionary is used to carry out the translation task.

Instant messaging is a network service through which two or more users can talk to each other by real-time text chat. Among the many available services, MSN Messenger [4] is developing quickly and is one of the most popular globally. Electronic dictionaries are now often provided as a service through web pages (also referred to as Internet dictionaries or online dictionaries). In addition to looking up words, they also provide sentence translation. Among them, Google Translate [9] is one of the most developed online dictionaries, and it comes with an application programming interface (API) that can be called by computer programs rather than by humans [11,12].

This study uses MSN Messenger as the communication technology and Google Translate as the language translation technology to design a system that supports "bi-directional multi-language translation" instant messaging, as shown in Fig. 1.
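
As an illustration of what "bi-directional multi-language translation" means in program terms, the following minimal Java sketch models the translation service as one function over a (source language, target language) pair, so that the same call serves both directions of a conversation. The Translator interface, the PhraseTableTranslator class, the language codes, and the sample phrases are hypothetical names and toy data introduced only for this example; the phrase table merely stands in for a real machine translation service and is not the Google Translate API.

```java
import java.util.HashMap;
import java.util.Map;

/** Small demo: the same translator object handles both directions. */
final class TranslatorDemo {
    public static void main(String[] args) {
        PhraseTableTranslator t = new PhraseTableTranslator();
        t.addPair("en", "Hello", "fr", "Bonjour"); // toy data for illustration only
        System.out.println(t.translate("Hello", "en", "fr"));
        System.out.println(t.translate("Bonjour", "fr", "en"));
    }
}

/** Hypothetical abstraction: one call serves every language pair, in either direction. */
interface Translator {
    String translate(String text, String fromLang, String toLang);
}

/** Toy stand-in for a machine translation service, backed by a phrase table. */
final class PhraseTableTranslator implements Translator {
    private final Map<String, String> table = new HashMap<>();

    /** Registers a phrase pair so that it can be looked up in both directions. */
    void addPair(String langA, String textA, String langB, String textB) {
        table.put(key(langA, langB, textA), textB);
        table.put(key(langB, langA, textB), textA);
    }

    @Override
    public String translate(String text, String fromLang, String toLang) {
        if (fromLang.equals(toLang)) {
            return text; // same language on both ends: nothing to translate
        }
        String hit = table.get(key(fromLang, toLang, text));
        return hit != null ? hit : text; // unknown phrase: pass the text through unchanged
    }

    private static String key(String from, String to, String text) {
        return from + ">" + to + ":" + text;
    }
}
```

Keeping the language pair as explicit parameters, rather than fixing it per translator object, is one possible design choice; it lets a single component serve contacts with different native languages.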

Figure 1. "Bi-directional multi-language automatic translation" instant messaging.

In addition to communication, this real-time translation messaging mechanism can also serve as a tool for language learning. If both the original sentences and the translated sentences are presented to the user during a conversation, the user can read the dialogue bilingually, which helps in learning foreign-language conversation.

IV. SYSTEM PROCESS AND ARCHITECTURE

A. System Process

The system provides a set of management interfaces that allow users to maintain their MSN Messenger contact list. Each contact in the list can be associated with a native-language value, and this setting is stored in a database for future reference. The system is responsible for passing text messages between the two ends of a dialogue and, more importantly, for translating the message content according to the native-language settings of the two ends.

Fig. 2 shows the message sending process of the system.

Figure 2. Message sending process of the system.

Fig. 3 shows the message receiving process of the system.

Figure 3. Message receiving process of the system.

Fig. 2 and Fig. 3 show that, to achieve bilingual instant messaging, only one of the two users needs to chat through the system (this end is denoted "user" in Fig. 2 and Fig. 3); the other user can use an ordinary MSN Messenger client (this end is denoted "common client" in Fig. 2 and Fig. 3). Because of this design, the user of the common client does not even need to know that the system performing the translation exists, which makes the system more practical and feasible.

B. System Architecture

The system aims to integrate the online instant messaging service MSN Messenger with bi-directional multi-language translation capabilities, allowing users to chat easily with friends around the world in their own native language and to learn languages in the process. The tools and development environment required for building the automatic-translation instant messaging system are as follows:

Programming language: Java SE
Web pages: HTML + JSP
Integrated development environment: NetBeans IDE [13]
Database: MySQL Community Server [14]
Web server: Apache Tomcat [15]
Instant messaging API: Java MSN Messenger Library (JML) [16]
Language translation API: Google Translate API for Java [12]

The system architecture is shown in Fig. 4.
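
The sending and receiving processes of Section IV-A can be sketched in Java as follows. This is only a minimal illustration under stated assumptions: the MessageTranslator and MessageChannel interfaces and the TranslatingRelay class are hypothetical names invented for the example, the per-contact language settings are held in an in-memory map rather than in the database the system uses, and the translator in the demo merely tags the text instead of calling a real translation service or the JML library.

```java
import java.util.HashMap;
import java.util.Map;

/** Small demo with a placeholder translator that only tags the text. */
final class RelayDemo {
    public static void main(String[] args) {
        MessageTranslator tagger = (text, from, to) ->
                from.equals(to) ? text : "[" + from + "->" + to + "] " + text;
        TranslatingRelay relay = new TranslatingRelay("zh", tagger,
                (contact, text) -> System.out.println("to " + contact + ": " + text));
        relay.setNativeLanguage("Mary", "en");
        relay.setNativeLanguage("Miyoko", "ja");
        relay.send("Mary", "message typed by the user");
        System.out.println(relay.receive("Miyoko", "message typed by Miyoko"));
    }
}

/** Hypothetical stand-in for the language translation module. */
interface MessageTranslator {
    String translate(String text, String fromLang, String toLang);
}

/** Hypothetical stand-in for the IM connection to a contact on a common client. */
interface MessageChannel {
    void deliver(String contact, String text);
}

/** Sits between the system user and the contacts and translates in both directions. */
final class TranslatingRelay {
    private final String userLang;
    // Assumption: an in-memory map replaces the database table of language settings.
    private final Map<String, String> contactLang = new HashMap<>();
    private final MessageTranslator translator;
    private final MessageChannel channel;

    TranslatingRelay(String userLang, MessageTranslator translator, MessageChannel channel) {
        this.userLang = userLang;
        this.translator = translator;
        this.channel = channel;
    }

    void setNativeLanguage(String contact, String lang) {
        contactLang.put(contact, lang);
    }

    /** Sending: translate into the contact's language, then forward. Returns the text sent. */
    String send(String contact, String text) {
        String out = translator.translate(text, userLang, languageOf(contact));
        channel.deliver(contact, out); // the common client only ever sees ordinary text
        return out;
    }

    /** Receiving: translate the contact's message into the user's language for display. */
    String receive(String contact, String text) {
        return translator.translate(text, languageOf(contact), userLang);
    }

    private String languageOf(String contact) {
        String lang = contactLang.get(contact);
        return lang != null ? lang : userLang; // no setting: assume no translation is needed
    }
}
```

Because translation happens entirely inside the relay, the contact's side needs no special software, which mirrors the "common client" property described above.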

Figure 4. System architecture.

The system consists of the following components:

Instant Messaging Module: mainly responsible for MSN account login and authentication, access to contact lists, and message transmission and reception. The functions of this module are realized with the JML APIs.

Language Translation Module: responsible for bi-directional multi-language natural language translation. For example, for Chinese-English translation, a Chinese sentence is passed to the module, which translates the content into English and returns it to the caller. This module works with the Google Translate API for Java to realize its functions.

Management Module: responsible for recording and retrieving the dialogue history and for setting up the native-language configuration.

HTML User Interface: responsible for integrating and coordinating the instant messaging and language translation modules, and for providing the system user with a user interface in the form of web pages. In addition, the bilingual dialogue content is stored in the database for conversation study and reference.

The advantage of adopting a Web application architecture is cross-platform operation: users need only a browser to connect to the system and use all its features, without having to download and install additional software. Apache Tomcat is adopted as the web server platform, the web pages are written in JSP, and MySQL is used as the database.

V. SYSTEM IMPLEMENTATION

An enhanced Internet instant messaging system has been designed with the following capabilities:

Every contact in the messaging contact list can be associated with a language option that denotes the native language of the contact, as shown in Fig. 5.

Figure 5. Setting the native language for each contact.

The user can chat with contacts who speak a different native language without either side having to be familiar with the other's language. The system translates the message content into the proper language according to the language option associated with each contact, and then conveys the message to the contact. The bilingual dialogue content can also be stored for future conversation learning and reference, as shown in Fig. 6.

Figure 6. Messaging with contacts of different languages.

When messaging a contact, the user first types the message in his or her own native language in the "Input message" field of the UI, as shown in Fig. 7. In Fig. 7, a system user, Candy (Chinese), is messaging a contact, Yang (English). She then clicks the "translate" button to request the translation. The translated message is presented in the "Translation" field for preview and, if necessary, for further manual refinement by the user. Finally, when she clicks the "send" button, both the original and the translated text are presented to the message receiver, Yang.
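
The compose workflow described above (type, translate, preview, optionally refine, send both texts) can be sketched in Java as follows. The MessageDraft and BilingualMessage classes are hypothetical names introduced for this illustration, the dialogue history is an in-memory list rather than the system's database, and the translation step is passed in as a function so that no particular translation API is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Small demo with a placeholder "translation" that only upper-cases the text. */
final class DraftDemo {
    public static void main(String[] args) {
        List<BilingualMessage> history = new ArrayList<>();
        MessageDraft draft = new MessageDraft("hello");
        draft.translate(s -> s.toUpperCase());   // placeholder for machine translation
        draft.refine(draft.preview() + "!");     // optional manual correction
        System.out.println(draft.send(history).render());
    }
}

/** One message carrying both the original and the translated text. */
final class BilingualMessage {
    final String original;
    final String translated;

    BilingualMessage(String original, String translated) {
        this.original = original;
        this.translated = translated;
    }

    /** Both versions, one per line, as shown to the receiver. */
    String render() {
        return original + "\n" + translated;
    }
}

/** Hypothetical state of the compose box: type, translate, refine, send. */
final class MessageDraft {
    private final String original;
    private String translation; // null until translate() has been called

    MessageDraft(String original) {
        this.original = original;
    }

    /** "translate" button: store the machine translation for preview. */
    void translate(UnaryOperator<String> machineTranslate) {
        translation = machineTranslate.apply(original);
    }

    /** Current content of the "Translation" field. */
    String preview() {
        return translation;
    }

    /** Manual refinement of the previewed translation. */
    void refine(String corrected) {
        requireTranslated();
        translation = corrected;
    }

    /** "send" button: build the bilingual message and record it in the history. */
    BilingualMessage send(List<BilingualMessage> history) {
        requireTranslated();
        BilingualMessage message = new BilingualMessage(original, translation);
        history.add(message);
        return message;
    }

    private void requireTranslated() {
        if (translation == null) {
            throw new IllegalStateException("translate the message before refining or sending");
        }
    }
}
```

Separating the preview from the send step is what gives the user a chance to correct a poor machine translation before the contact sees it.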

Figure 7. The cross-language conversation.

Google Translate currently supports 58 languages, as listed in Table I.

TABLE I. LANGUAGES SUPPORTED BY GOOGLE TRANSLATE (AND THIS SYSTEM)

Starts with   Language
A             Afrikaans, Albanian, Arabic, Armenian, Azerbaijani
B             Basque, Belarusian, Bulgarian
C             Catalan, Chinese, Croatian, Czech
D             Danish, Dutch
E             English, Estonian
F             Filipino, Finnish, French
G             Galician, Georgian, German, Greek
H             Haitian Creole, Hebrew, Hindi, Hungarian
I             Icelandic, Indonesian, Irish, Italian
J             Japanese
K             Korean
L             Latin, Latvian, Lithuanian
M             Macedonian, Malay, Maltese
N             Norwegian
P             Persian, Polish, Portuguese
R             Romanian, Russian
S             Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish
T             Thai, Turkish
U             Ukrainian, Urdu
V             Vietnamese
W             Welsh
Y             Yiddish

VI. CONCLUSIONS

The advance of Internet technology has eliminated the barrier of geographic distance. However, from the perspective of the global village, the communication barrier that results from the different native languages people speak is still a problem to be solved. Through the design presented in this work, which combines instant messaging and machine translation technologies, having a simultaneous interpreter is no longer the privilege of national leaders; everyone can now enjoy the convenience of cross-language communication online. A wide range of situations can benefit from this, from making foreign friends to cross-language global customer service. The additional foreign-language conversation learning function is also a useful aid.

REFERENCES

[1] Instant messaging most popular online activity in France, comScore Inc. Press Release, April 6, 2009, http://www.comscore.com/press_events/press_releases/2009/4/instant_messaging_most_popular_online_activity_in_france
[2] Hong Kong Internet users spend twice as much time on instant messengers as counterparts in Asia-Pacific region, comScore Inc. Press Release, October 2009, http://ir.comscore.com/releasedetail.cfm?releaseid=415641
[3] Europe surpasses North America in instant messenger users, comScore Inc. Press Release, April 2006, http://www.comscore.com/press_events/press_releases/2006/04/europe_surpasses_north_america_in_instant_messenger_usage
[4] Windows Live Messenger, http://windowsliveintro.spaces.live.com/
[5] AOL Instant Messenger, http://www.aim.com/
[6] Yahoo! Messenger, http://messenger.yahoo.com/
[7] Instant messaging population up to 90 percent, InsightXplorer Inc. Press Release, March 2006, http://www.insightxplorer.com/news/news_03_24_06.html
[8] Yahoo! dictionary, http://tw.dictionary.yahoo.com/
[9] Google Translate, http://translate.google.com.tw
[10] Microsoft Bing translations, http://www.microsofttranslator.com/
[11] Google AJAX Language API, http://code.google.com/intl/zh-TW/apis/ajaxlanguage/
[12] Google Translate API for Java, http://code.google.com/p/google-api-translate-java/
[13] NetBeans IDE, http://netbeans.org/
[14] MySQL Community Server, http://www.mysql.com/
[15] Apache Tomcat, http://tomcat.apache.org/
[16] Java MSN Messenger Library (JML), http://java-jml.sourceforge.net/