AN INTERACTIVE ON-LINE MACHINE TRANSLATION SYSTEM (CHINESE INTO ENGLISH)




[From: Translating and the Computer, B.M. Snell (ed.), North-Holland Publishing Company, 1979]

Shiu-Chang LOH and Luan KONG
Hung On-To Research Centre for Machine Translation
The Chinese University of Hong Kong

ABSTRACT

The present on-line system is a direct conversion of the CULT (batch) machine translation system, which has been used on a regular basis since January 1975 to translate Chinese mathematics journals. The pre-editing procedures used in CULT are being implemented in the present on-line system by means of editing programs. The enormous problem of inputting Chinese texts is being solved by keying the text directly on a newly designed Chinese keyboard.

Introduction

This paper briefly describes an interactive on-line machine translation system, capable of translating Chinese into readable English, which has recently been implemented at the Chinese University of Hong Kong. The design of this on-line system is based mainly on the algorithm developed for the much-used CULT (Chinese University Language Translator) system, which was first developed on the IBM 1130 system in 1969 and then implemented on the ICL 1904A system in 1972 (Loh 1972). CULT has been used on a regular basis to translate the Chinese mathematics journal Acta Mathematica Sinica since 1975 (Loh and Kong 1977). CULT has also been extended to assist translation in both directions: it now supports translation from Chinese into readable English as well as from English into readable Chinese (Loh, Hung and Kong 1977).

All the versions of CULT mentioned above operate in batch mode. One of the major problems of this mode of translation is the input of Chinese characters. The Chinese text first had to be manually encoded into four-digit standard telegraph codes and punched onto cards before being input into the computer for translation.
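As a rough illustration of the batch-mode encoding step, the sketch below maps each character to a four-digit code through a lookup table. The names `TELEGRAPH_CODES`, `encode_telegraph` and `decode_telegraph` and the four-entry table are assumptions made for this example and are not part of CULT; the code values are those commonly cited for these characters in the standard Chinese telegraph code and should be checked against a code book before any real use.

```python
# Illustrative sketch only: a tiny excerpt of a character-to-code table.
# The four values below are commonly cited Chinese telegraph codes; the
# table and function names are hypothetical and not taken from CULT.
TELEGRAPH_CODES = {
    "中": "0022",
    "文": "2429",
    "信": "0207",
    "息": "1873",
}


def encode_telegraph(text: str) -> list:
    """Return one four-digit code per character, in text order.

    Raises KeyError for a character that is missing from the table,
    mirroring the fact that a manual coder must look up every character.
    """
    codes = []
    for ch in text:
        if ch not in TELEGRAPH_CODES:
            raise KeyError(f"no telegraph code known for {ch!r}")
        codes.append(TELEGRAPH_CODES[ch])
    return codes


def decode_telegraph(codes: list) -> str:
    """Invert encode_telegraph using the same table."""
    reverse = {code: ch for ch, code in TELEGRAPH_CODES.items()}
    return "".join(reverse[code] for code in codes)
```

Every character costs one table lookup and four keyed digits, and a single mistyped digit may decode to a different character without any visible warning, which is consistent with the error-proneness the authors report for this input method.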
This manual encoding process is laborious, tedious and liable to errors. To find a better solution, a research project was carried out at the Chinese University of Hong Kong, and as a result a Chinese keyboard was designed. The newly constructed Chinese keyboard makes the design of the present on-line system possible. Using this keyboard for direct input has eliminated many of the errors that arise in coding Chinese characters by means of either numerical digits or alphanumerics, thus facilitating the translation process.

The On-line Machine Translation System

The present on-line system is primarily based on the CULT (batch mode) system just mentioned, except that the pre-editing procedures can be replaced by interactive operations on the Chinese keyboard, and an editing facility is also provided. The flowchart of the general translation procedure, which is similar to that of the CULT (batch mode) system, is given in Figure 1. Figures 2 to 9 give the detailed flowcharts of (i) source text preparation, (ii) input, (iii) lexical analysis, (iv) syntactic and semantic analysis, (v) relative order analysis, (vi) target equivalence analysis, (vii) output and (viii) output refinement, respectively. Editing facilities for on-line updating of dictionary records (correction, insertion, deletion, etc.) are provided at appropriate points. This enables the operator to interact with the translation system at all times during translation. Programs for the system are written in Standard FORTRAN and run on the PDP 11/34 computer system.

Computer Configuration

A PDP 11/34 computer system has recently been installed at the Hung On-To Research Centre for Machine Translation, to be used exclusively for the Machine Translation Project. The configuration of the system is given in Figure 10. The Chinese keyboard is attached in addition to the standard alphanumeric keyboard, which is to be used in the implementation of the On-line Dual Language Translator in due course. When implemented, the system should be capable of translating Chinese into readable English as well as English into readable Chinese simultaneously.

The Chinese Keyboard

Many Chinese input systems are commercially available, based on the corner shapes of the characters, on Pin Yin (pronunciation), or on a Chinese typewriter keyboard with a few thousand keys. The widely publicized cylindrical Chinese typewriter keyboard, developed by the research workers of the Chinese Language Project, Cambridge University, is the latest to be introduced to the market. None of these input systems possesses the most desirable characteristic of being easy to learn and operate. Most, if not all, require extensive training of the operator.
For simplicity, the input of Chinese characters to the computer system should follow the natural order in which they are written by hand, the way children are taught to write Chinese characters at school. This kind of stroke approach, though theoretically sound, is impossible to implement owing to the richness of the language and the practically unlimited combinations of strokes that make up Chinese characters. However, a feasible way is to employ the "radical" approach, in other words, to represent on the keyboard the most frequently used radicals of the Chinese characters. For the past few years, the structures of the Chinese characters have been studied and a new set of modified radicals, totalling barely over 200, has been selected. A keyboard based on this design approach has been constructed and initial test results are satisfactory. The keys are generally arranged on the keyboard according to the normal position of their radicals in the character; that is, a radical normally found at the top of a character is placed on the upper portion of the keyboard, and so on. The main advantage of this approach is that it is simple and natural: any person with a minimal knowledge of written Chinese can operate the keyboard with ease.
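A minimal sketch of the radical approach follows, assuming a keyboard that emits radical keys in writing order and a table that maps each key sequence to a character. The three decompositions and the names `RADICAL_TABLE`, `compose` and `key_in` are illustrative assumptions; the paper does not list its set of some 200 modified radicals or describe its lookup method.

```python
from typing import Optional, Sequence

# Illustrative sketch: compose a character from a sequence of radical keys.
# The table entries are ordinary textbook decompositions, not the authors'
# modified radical set, which the paper does not enumerate.
RADICAL_TABLE = {
    ("日", "月"): "明",  # sun + moon -> bright
    ("女", "子"): "好",  # woman + child -> good
    ("木", "木"): "林",  # tree + tree -> grove
}


def compose(keys: Sequence[str]) -> Optional[str]:
    """Return the character for a radical key sequence, or None if unknown."""
    return RADICAL_TABLE.get(tuple(keys))


def key_in(sequences: Sequence[Sequence[str]]) -> str:
    """Compose a text from radical sequences, marking unknown ones with '?'."""
    return "".join(compose(seq) or "?" for seq in sequences)
```

Marking an unknown sequence with `?` rather than raising an error is a design choice of this sketch: in an interactive setting, it would let the operator see and correct the entry on the spot.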

Fig. 1 Translation Procedure

Fig. 2 Source Text Preparation

Fig. 3 Input

Fig. 4 Lexical Analysis

Fig. 5 Syntactic & Semantic Analysis

Fig. 6 Relative Order Analysis

Fig. 7 Target Equivalence Analysis

Fig. 8 Output

Fig. 9 Output Refinement

Fig. 10 System Configuration
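The eight stages named in Figures 2 to 9 can be read as a pipeline in which each stage consumes the output of the previous one. The sketch below shows only that control flow, assuming each stage is a function from one intermediate state to the next. The stage bodies are placeholders that append their name to a trace, all function and variable names are hypothetical, and nothing here reflects the actual FORTRAN implementation.

```python
from typing import Callable, List

# Stage names follow the paper's list (i) to (viii); everything else is assumed.
STAGE_NAMES = [
    "source text preparation",
    "input",
    "lexical analysis",
    "syntactic and semantic analysis",
    "relative order analysis",
    "target equivalence analysis",
    "output",
    "output refinement",
]


def make_placeholder_stage(name: str) -> Callable[[dict], dict]:
    """Build a stand-in stage that records its name and passes the data on."""
    def stage(state: dict) -> dict:
        return {**state, "trace": state["trace"] + [name]}
    return stage


def translate(text: str, stages: List[Callable[[dict], dict]]) -> dict:
    """Run the stages in order, threading one state dict through all of them."""
    state = {"text": text, "trace": []}
    for stage in stages:
        state = stage(state)
    return state


PIPELINE = [make_placeholder_stage(name) for name in STAGE_NAMES]
```

Because the pipeline is an ordinary list of functions, an interactive step such as a dictionary correction could in principle be slotted in between two stages by inserting another function into the list; where the real system places its editing points is shown only in the flowcharts.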

Remarks

The implementation of the Chinese keyboard has greatly enhanced throughput, and has thus improved the efficiency and performance of the translation system.

Acknowledgements

The authors are grateful to The Asia Foundation and the Rockefeller Brothers Fund for initial financial support of the Machine Translation Project, and to the Hung On-To Memorial Fund for its generous financial assistance in setting up the research centre.

References

Loh, S.C. (1972) Machine translation at the Chinese University of Hong Kong. Proceedings of the CETA (Chinese-English Translation Assistance) Workshop on Chinese Language and Chinese Research Materials, CETA-72-01, 1972.

Loh, S.C., Hung, H.S. and Kong, L. (1977) A dual language translator. The Fifth International Symposium on Computers in Literary and Linguistic Research, Birmingham, April 1977.

Loh, S.C. and Kong, L. (1977) Computer translation of Chinese scientific journals. Proceedings of the Third European Congress on Information Systems and Networks "Overcoming the Language Barrier", Luxemburg, 1977.