How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.


Machine translation is a special field of computer application in which almost everyone believes they are a specialist. First, everybody understands that the larger the dictionary, the better the translation will be, so the first problem is to create large dictionaries for the systems. Second, it is clear that the system should be able to translate sentences like HI, HOW ARE YOU DOING?, so another problem is to teach the system to recognize common collocations. Third, it is obvious that a sentence to be translated is written according to certain rules and should be translated according to certain rules, so there is one more problem: to store all these rules as a program. That is all.

These problems really are essential to the development of machine translation systems; however, the methods for solving them are not commonly known and are not as simple as they may seem. The machine translation systems of the PROMT family are good examples of effective solutions to these problems.

Dictionary

Methods for organizing large databases are well developed. For translation, however, correct retrieval of database elements may depend more on how the information assigned to each element is configured. For example, how many dictionary entries should correspond to the common Russian word "program"? Moreover, is a large dictionary one that contains many entries, or one that allows many words in a text to be recognized? A closer look reveals that, for example, Russian nouns change by case and number, so up to 12 different forms can exist for a noun, and, as a rule, an even greater number of forms (more than 30) can exist for verbs and adjectives.
Therefore, in order to translate sentences containing forms of declinable Russian words, such as "program", "about program", "programs", etc., the system needs a technique for matching the "program" entry in the computer dictionary with the corresponding word form in the text. So, in order to describe both the source and target languages, the system should use a formal method of morphology description, which is the basis for retrieving dictionary units.
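The relationship between dictionary entries and recognizable word forms can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the informal transliteration of the Russian noun, the paradigm name `noun_a`, and the data layout are hypothetical and do not reproduce the PROMT dictionary format. It simply expands one entry (a stem plus a type of word change) into all the forms that entry can recognize.

```python
# Illustrative data only: one transliterated Russian noun and one hypothetical
# paradigm (type of word change). This is not PROMT's actual dictionary format.
CASES = ["nom", "gen", "dat", "acc", "ins", "prep"]

# Endings of one declension type, listed per number in the order of CASES.
PARADIGMS = {
    "noun_a": {
        "sg": ["a", "y", "e", "u", "oj", "e"],
        "pl": ["y", "", "am", "y", "ami", "akh"],
    }
}

# One dictionary entry: headword -> (stem, paradigm name).
ENTRIES = {"programma": ("programm", "noun_a")}


def build_form_index(entries, paradigms):
    """Map every word form to the (headword, number, case) readings it can have."""
    index = {}
    for headword, (stem, paradigm) in entries.items():
        for number, endings in paradigms[paradigm].items():
            for case, ending in zip(CASES, endings):
                index.setdefault(stem + ending, []).append((headword, number, case))
    return index


FORM_INDEX = build_form_index(ENTRIES, PARADIGMS)

print(len(ENTRIES), "entry recognizes", len(FORM_INDEX), "distinct word forms")
print(FORM_INDEX["programme"])  # the form used after "about" (prepositional case)
```

In this toy paradigm the 12 case and number combinations collapse into fewer distinct spellings, because some forms coincide. A lookup therefore returns every grammatical reading of a form, and choosing among them is left to later stages of analysis.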

In fact, every system that claims to be a translation system solves the problem of representing morphological models in some way. But some systems can recognize 1,000,000 word forms on the basis of a dictionary containing 50,000 entries, while other systems with a dictionary of 100,000 entries can recognize only those 100,000 forms.

In the PROMT family of systems, a morphological description has been developed for every language the systems handle. This description is almost unique in its completeness. It contains 800 types of word change for Russian, more than 300 types for German and French, and even for English, which is not an inflectional language, over 250 types of word change are defined. The set of endings in each language is stored as tree structures, which provides not only efficient storage but also an efficient algorithm for morphological analysis.

Furthermore, this morphology model was used to develop an advisory system for users who create dictionaries themselves. The system automates stem extraction and determination of the word-change type when new dictionary entries are entered. No other existing machine translation system has such a feature, not even well-known systems such as Power Translator (Globalink, USA), Language Assistant (MicroTac, USA) and TRANSEND (Intergraph, USA), where users must conjugate and decline words manually in order to define a morphological model.

Nevertheless, a morphology description solves only one problem, namely determining the dictionary entry header, which is used to identify a text unit with a dictionary unit. But the correlation between a word in the text and a dictionary entry is determined not only for identification, as in spell checkers or electronic dictionaries, but also so that the software can carry out translation procedures.
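The text says that endings are stored as tree structures that also drive morphological analysis, but it does not describe the structure itself. One common way to realize that idea is a trie of reversed endings, sketched below in Python. The ending inventory, the paradigm name `noun_a`, the transliterated stem and the function names are all hypothetical; this is an assumed design, not the PROMT implementation.

```python
# Hypothetical ending inventory: ending -> list of (paradigm, number, case).
ENDINGS = {
    "a": [("noun_a", "sg", "nom")],
    "u": [("noun_a", "sg", "acc")],
    "am": [("noun_a", "pl", "dat")],
    "ami": [("noun_a", "pl", "ins")],
    "": [("noun_a", "pl", "gen")],  # zero ending
}
STEMS = {"programm": "noun_a"}  # stem -> paradigm (type of word change)


def build_trie(endings):
    """Store endings reversed, so that lookup walks a word from its last letter."""
    root = {"tags": [], "next": {}}
    for ending, tags in endings.items():
        node = root
        for ch in reversed(ending):
            node = node["next"].setdefault(ch, {"tags": [], "next": {}})
        node["tags"].extend(tags)
    return root


def analyse(word, trie, stems):
    """Return every (stem, number, case) reading that the stem dictionary licenses."""
    readings = []
    node = trie
    cut = len(word)  # word[:cut] is the candidate stem, word[cut:] the ending
    while True:
        stem = word[:cut]
        for paradigm, number, case in node["tags"]:
            if stems.get(stem) == paradigm:
                readings.append((stem, number, case))
        if cut == 0 or word[cut - 1] not in node["next"]:
            return readings
        node = node["next"][word[cut - 1]]
        cut -= 1


TRIE = build_trie(ENDINGS)
print(analyse("programmami", TRIE, STEMS))
print(analyse("programm", TRIE, STEMS))
```

A single right-to-left pass tries every possible stem and ending split, and shared ending suffixes (here "am" and "ami" both end a path through "a" and "m") are stored once. That is one plausible reading of the storage and analysis benefits mentioned above.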
So what information should a dictionary entry contain, and how should translation rules be described, in order to make the software translate?

Dictionary

A historical digression is needed here, because machine translation, as a part of applied linguistics, has a very dramatic history. The idea of machine translation appeared in the 1950s, along with the development of the first computers; the term "machine translation" itself dates from that time. The task seemed quite easy to perform. This caused a kind of linguistic euphoria, and several global projects to create translation systems for different languages were launched.

None of these projects produced an operable system, and the committee specially established by the US National Academy of Sciences (ALPAC) stated in its 1966 report that machine translation projects had no future and should not be financed. Only at the beginning of the 1980s did linguists recover sufficiently from the consequences of this verdict to resume research and development in this field. Certainly, in many respects this revival was connected with the overall development of the computer industry and, more particularly, with growing interest in "artificial intelligence" as a field of computer application.

Nevertheless, in the 1980s history almost repeated itself. This time, however, in addition to global projects such as EUROTRA (European Economic Community), ARIANE (France), METAL (USA and Germany), KANT (USA) and SUSY (Germany), many local projects with less ambitious goals were launched.

The global projects were still aimed at solving the translation problem in general. Within these projects, describing lexical units for the dictionary and developing translation algorithms were treated as separate tasks. A variety of linguistic works appeared proposing structures for describing the properties of a living word in a computer dictionary entry. At the same time, a number of independent studies were published on topics such as "The Structure of Noun Phrase" or "Representation of Direct Objects of Verbs of Saying". However, real commercial systems implementing the results of these studies did not reach the market. Each system developed carried the modest label "experimental" or "prototype", and in practice none of them was ever finished or could be considered a consumer product. The reason was that the methods used to describe translation, once transferred to a real environment (i.e. applied to arbitrary texts), proved inconsistent with the methods proposed for creating dictionary entries.

The exception, perhaps, is the METAL project. Although this project did not ultimately result in a real commercial product, during its development it was redirected toward creating a system capable of translating from German into English and from English into German and of handling specialized dictionaries for specific subject areas.

At the same time, local projects were oriented toward narrow-scope solutions. The developers' goal was to obtain any valuable result. In these projects, dictionary description and algorithm description were treated as integral parts of one problem, but the solution, as a rule, was found by limiting the analyzed environment, either grammatically or semantically. For example, on the basis of the "Belonging to a part of speech" attribute, a grammar of the following type was described:

a noun phrase is a noun
a noun phrase is an adjective + a noun phrase
a verbal phrase is a verb + a noun phrase
a sentence is a noun phrase + a verbal phrase
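The four rules above are small enough to implement directly. The following Python sketch is a recursive-descent recognizer for exactly this grammar; the tiny English lexicon and the function names are hypothetical and serve only to make the example runnable.

```python
# Hypothetical lexicon assigning one part of speech to each known word.
LEXICON = {"new": "Adj", "long": "Adj", "program": "N", "texts": "N", "translates": "V"}


def tag(sentence):
    """Replace each word by its part of speech (None for unknown words)."""
    return [LEXICON.get(word) for word in sentence.split()]


def parse_np(tags, i):
    """NP -> Adj NP | N. Return the index just past the noun phrase, or None."""
    if i < len(tags) and tags[i] == "Adj":
        return parse_np(tags, i + 1)
    if i < len(tags) and tags[i] == "N":
        return i + 1
    return None


def parse_vp(tags, i):
    """VP -> V NP. Return the index just past the verbal phrase, or None."""
    if i < len(tags) and tags[i] == "V":
        return parse_np(tags, i + 1)
    return None


def is_sentence(tags):
    """S -> NP VP, and the whole input must be consumed."""
    end_of_np = parse_np(tags, 0)
    if end_of_np is None:
        return False
    return parse_vp(tags, end_of_np) == len(tags)


print(is_sentence(tag("new program translates long texts")))  # accepted
print(is_sentence(tag("program translates texts quickly")))   # rejected
```

The second sentence is rejected only because the grammar has no rule for adverbs, which shows how quickly a grammar restricted in this way runs out of coverage on ordinary text.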

It is clear that some sentences in natural language can be described using such a grammar, but their number is insignificant and not enough for correct analysis and translation of a real text. It is, however, possible to use effective methods to construct a converter on the basis of a specific grammar or, at worst, to write a program that builds dependency trees for a limited set of sentences by means of linear search. Such systems were likewise called "experimental".

Though neither approach resulted in commercial systems, the research conducted in this area helped to reveal the complexity of the task and, at least, to detect the bottlenecks in such developments. In any case, these local projects became a platform for the creation of the translation systems now offered to end users. Power Translator (Globalink), Language Assistant (MicroTac) and TRANSEND (Intergraph) are among these systems. The systems of the STYLUS and PROMT families are no exception, as many specialists of the PROMT Company were involved in similar development projects.

Nevertheless, a fundamentally new approach was applied in the development of the PROMT systems, and it led to impressive results. The translation systems of the PROMT family are designed on the basis of cybernetic rather than linguistic methods. It proved very effective to treat the translation system not as a translator whose task is to translate a text that is admissible from the point of view of the source grammar, but rather as a complex system whose task is to produce a result for arbitrary input data, including texts that are not correct from the point of view of the grammar the system uses.
Instead of the accepted linguistic approach, which assumes sequential processes of sentence analysis and synthesis, the architecture of the system represents translation procedures as an "object-oriented" process founded on a hierarchy of the sentence components to be processed. This made the PROMT systems stable and open. In addition, this approach allowed various formalisms to be used for describing translation on different levels. The systems employ network grammars, similar in type to augmented transition networks, as well as working algorithms for filling and transforming frame structures for the analysis of complex predicates.

The description of a lexical unit within a dictionary entry, which is effectively unlimited in volume and can contain a number of various attributes, is closely interconnected with the structure of the system algorithms and is organized not around the age-old "syntax versus semantics" antithesis, but around the levels of text components. Thus the systems can work with incompletely described dictionary entries, which is very important for opening the dictionaries to users who cannot be regarded as highly experienced specialists in linguistics.

The very first machine translation system, released by the PROMT Company in 1991, translated specialized texts relating to computer software from English into Russian. The system employed a small dictionary (about 17,000 words and expressions), was DOS-compatible and had no customization tools. But even this first system was correctly designed, and the technology the PROMT Company uses to develop machine translation algorithms has not required major modification since. Moreover, the approach found during that phase of development proved very effective for many different languages.

First, some definitions. As machine translation developed as a part of applied linguistics, classifications of systems also appeared, and a subdivision of translation systems into TRANSFER systems and INTERLINGUA systems was adopted. This subdivision is based on the architectural solutions chosen for the linguistic algorithms.

Translation algorithms for TRANSFER systems are built as a composition of three processes: analysis of the input sentence in terms of source-language structures, conversion of this structure into a similar target-language structure (TRANSFER), and, finally, synthesis of the output sentence according to the constructed structure.

INTERLINGUA systems assume a priori that a certain structural metalanguage (INTERLINGUA) is available, which, in principle, can be used to describe any structure of both the source and target languages. The translation algorithm in INTERLINGUA systems is therefore supposed to be simpler: analysis of the input sentence in terms of the metalanguage, followed by synthesis of the corresponding target-language sentence from the metastructure. In this case, the "only" difficulty is developing the metalanguage itself and describing the natural language in its terms.
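The difference between the two architectures is easiest to see as function composition. The Python sketch below is a deliberately tiny illustration: it handles only two-word English phrases of the form adjective + noun, translates them into French, and every name, structure and concept label in it is hypothetical. It is not a description of any real system.

```python
# --- TRANSFER: analysis -> transfer -> synthesis --------------------------
EN_FR = {"red": "rouge", "car": "voiture"}  # toy bilingual dictionary


def analyse_en(text):
    """Analysis: describe the phrase in terms of source-language structure."""
    adjective, noun = text.split()
    return {"order": ["adj", "noun"], "adj": adjective, "noun": noun}


def transfer_en_fr(source):
    """TRANSFER: convert the source structure into a similar target structure."""
    return {
        "order": ["noun", "adj"],  # this toy rule places the adjective after the noun
        "adj": EN_FR[source["adj"]],
        "noun": EN_FR[source["noun"]],
    }


def synthesise(structure):
    """Synthesis: produce the output phrase from the constructed structure."""
    return " ".join(structure[slot] for slot in structure["order"])


def translate_transfer(text):
    return synthesise(transfer_en_fr(analyse_en(text)))


# --- INTERLINGUA: analysis into a metalanguage -> synthesis ----------------
CONCEPTS_EN = {"red": "RED", "car": "CAR"}      # English word -> neutral concept
WORDS_FR = {"RED": "rouge", "CAR": "voiture"}   # neutral concept -> French word


def analyse_to_interlingua(text):
    """Analysis in terms of a language-independent structure."""
    adjective, noun = text.split()
    return {"entity": CONCEPTS_EN[noun], "property": CONCEPTS_EN[adjective]}


def synthesise_fr(meaning):
    """Synthesis of the French phrase directly from the metastructure."""
    return f'{WORDS_FR[meaning["entity"]]} {WORDS_FR[meaning["property"]]}'


def translate_interlingua(text):
    return synthesise_fr(analyse_to_interlingua(text))


print(translate_transfer("red car"))
print(translate_interlingua("red car"))
```

The TRANSFER pipeline needs a conversion step for each language pair, while the INTERLINGUA pipeline needs none but depends entirely on the concept inventory (`RED`, `CAR`) being adequate for every language involved.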
Although this classification exists, and although it is considered good form among machine translation developers to ask which type a system belongs to, no real system has yet been developed on the INTERLINGUA principle. Our system is no exception, and we answer this question as follows: our system performs translation of the TRANSFER type. But this answer is too simple and does not reflect what is specific to the PROMT system architecture. The special feature of the system is that the TRANSFER method is not applied according to the standard linguistic approach.

In fact, a translation system generally operates on incomplete data, because language is a living, fast-evolving system: new words, new functions of old words and, along with new entities, new meanings are constantly emerging. In this situation, the main structural requirement for translation algorithms is that the system be stable with respect to arbitrary input data. The PROMT translation algorithms are based not on sequential TRANSFER procedures but on a hierarchical approach, which divides the translation process into interconnected TRANSFER procedures for different units of analysis.

The following levels are distinguished in the system: the lexical unit level, the group level, the simple sentence level and the compound sentence level. The processes on these levels are interconnected and interact hierarchically according to the hierarchy of text units, and they also exchange synthesized and inherited attributes. This arrangement of the algorithms allows different formal methods to be used for describing algorithms on different levels.

Consider first the lexical unit level. A lexical unit is a word or collocation, the unit of the lowest level. Each word is described as the composition of a stem and an ending, in both the source and target languages. On the one hand, this makes it possible to recognize the source word and analyze its morphology; on the other hand, it allows convenient synthesis of the target word from the relevant morphological data (the stem, the type of change and the address of the ending in the array of endings of this type). Thus, if rules for converting source morphological data into target morphological data are available, TRANSFER procedures can be carried out on the morphological level.

The group level corresponds to more complex structures: groups of nouns, adjectives and adverbs, and complex verbal forms. This level is based on formal network grammars, and during analysis it allows groups to be combined into syntactic units. Each unit is characterized by synthesized structural data and by the main unit of the group. Corresponding to the source structure, which is composed in terms of immediate constituents, and together with the synthesized attributes, the target group is created as a set of lexical units whose morphological attribute values can be inherited in accordance with the results of group analysis. In this way, TRANSFER procedures are implemented on the group level.
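The interplay of synthesized and inherited attributes on the group level can be sketched for a two-word English noun group translated into Russian. The Python example below is an assumption-laden illustration: the data layout, the informal transliteration, the restriction to the nominative case and all function names are hypothetical, and the real system uses network grammars rather than a fixed two-word pattern.

```python
# Target-language data for two English nouns: stem, gender, nominative endings.
NOUNS = {
    "program": {"stem": "programm", "gender": "fem", "endings": {"sg": "a", "pl": "y"}},
    "text": {"stem": "tekst", "gender": "masc", "endings": {"sg": "", "pl": "y"}},
}
ADJECTIVES = {"new": {"stem": "nov"}}

# Nominative adjective endings, selected by (gender, number).
ADJ_ENDINGS = {
    ("fem", "sg"): "aja",
    ("masc", "sg"): "yj",
    ("fem", "pl"): "ye",
    ("masc", "pl"): "ye",
}


def analyse_group(text):
    """Analysis: find the head noun and synthesize the attributes of the group."""
    adjective, noun = text.split()
    number = "sg"
    if noun not in NOUNS and noun.endswith("s") and noun[:-1] in NOUNS:
        noun, number = noun[:-1], "pl"
    # 'number' is read off the noun and becomes an attribute of the whole group.
    return {"head": noun, "modifier": adjective, "number": number}


def transfer_group(group):
    """Create the target group; the adjective inherits its attributes."""
    head = NOUNS[group["head"]]
    number = group["number"]
    noun_form = head["stem"] + head["endings"][number]
    # Gender comes from the target head noun, number from the group analysis.
    adjective = ADJECTIVES[group["modifier"]]
    adj_form = adjective["stem"] + ADJ_ENDINGS[(head["gender"], number)]
    return f"{adj_form} {noun_form}"


for phrase in ["new program", "new programs", "new text", "new texts"]:
    print(phrase, "->", transfer_group(analyse_group(phrase)))
```

The English adjective carries no gender or number of its own, so its Russian form cannot be produced on the lexical level alone. It becomes available only once the group has been analyzed and the head noun's attributes are passed down.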
Simple sentences are treated as structures consisting of syntactic units, and their analysis is based on frame predicate structures, which provide effective conversions. In a simple sentence, the verb is considered the main element, and its valences determine how the corresponding frame is filled. For every type of frame there is a conversion rule for creating the target frame and forming the actants. In this way, TRANSFER procedures are implemented on the sentence level.

The analysis of compound sentences is required when it is necessary to establish the sequence of tenses and to provide correct translation of conjunctions.

Conclusion

We hope that this information will help potential users of translation systems to understand that creating a machine translation system is not a simple task but a knowledge-intensive one. The number of real, ready-to-operate translation systems that can appear in a given period of time is therefore inherently limited.

Svetlana Sokolova, President and CEO of PROMT, PhD.


More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

UNIVERSITÀ DEGLI STUDI DELL AQUILA CENTRO LINGUISTICO DI ATENEO

UNIVERSITÀ DEGLI STUDI DELL AQUILA CENTRO LINGUISTICO DI ATENEO TESTING DI LINGUA INGLESE: PROGRAMMA DI TUTTI I LIVELLI - a.a. 2010/2011 Collaboratori e Esperti Linguistici di Lingua Inglese: Dott.ssa Fatima Bassi e-mail: fatimacarla.bassi@fastwebnet.it Dott.ssa Liliana

More information

ENHANCEMENT OF UDC DATA FOR USE AND SHARING IN A NETWORKED ENVIRONMENT. Aida Slavic Maria Ines Cordeiro Gerhard Riesthuis

ENHANCEMENT OF UDC DATA FOR USE AND SHARING IN A NETWORKED ENVIRONMENT. Aida Slavic Maria Ines Cordeiro Gerhard Riesthuis ENHANCEMENT OF UDC DATA FOR USE AND SHARING IN A NETWORKED ENVIRONMENT Aida Slavic Maria Ines Cordeiro Gerhard Riesthuis MAIN POINTS UDC facts update Logic behind the synthetic structure UDC number building

More information

WHITE PAPER. Machine Translation of Language for Safety Information Sharing Systems

WHITE PAPER. Machine Translation of Language for Safety Information Sharing Systems WHITE PAPER Machine Translation of Language for Safety Information Sharing Systems September 2004 Disclaimers; Non-Endorsement All data and information in this document are provided as is, without any

More information

Year 1 reading expectations (New Curriculum) Year 1 writing expectations (New Curriculum)

Year 1 reading expectations (New Curriculum) Year 1 writing expectations (New Curriculum) Year 1 reading expectations Year 1 writing expectations Responds speedily with the correct sound to graphemes (letters or groups of letters) for all 40+ phonemes, including, where applicable, alternative

More information

4.1 Multilingual versus bilingual systems

4.1 Multilingual versus bilingual systems This chapter is devoted to some fundamental questions on the basic strategies of MT systems. These concern decisions which designers of MT systems have to address before any construction can start, and

More information

LESSON THIRTEEN STRUCTURAL AMBIGUITY. Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity.

LESSON THIRTEEN STRUCTURAL AMBIGUITY. Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity. LESSON THIRTEEN STRUCTURAL AMBIGUITY Structural ambiguity is also referred to as syntactic ambiguity or grammatical ambiguity. Structural or syntactic ambiguity, occurs when a phrase, clause or sentence

More information

TECH. Requirements. Why are requirements important? The Requirements Process REQUIREMENTS ELICITATION AND ANALYSIS. Requirements vs.

TECH. Requirements. Why are requirements important? The Requirements Process REQUIREMENTS ELICITATION AND ANALYSIS. Requirements vs. CH04 Capturing the Requirements Understanding what the customers and users expect the system to do * The Requirements Process * Types of Requirements * Characteristics of Requirements * How to Express

More information

Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications

Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications Berlin Berlin Buzzwords 2011, Dr. Christoph Goller, IntraFind AG Outline IntraFind AG Indexing Morphological

More information

Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words

Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words , pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan

More information

A Chart Parsing implementation in Answer Set Programming

A Chart Parsing implementation in Answer Set Programming A Chart Parsing implementation in Answer Set Programming Ismael Sandoval Cervantes Ingenieria en Sistemas Computacionales ITESM, Campus Guadalajara elcoraz@gmail.com Rogelio Dávila Pérez Departamento de

More information

COCOVILA Compiler-Compiler for Visual Languages

COCOVILA Compiler-Compiler for Visual Languages LDTA 2005 Preliminary Version COCOVILA Compiler-Compiler for Visual Languages Pavel Grigorenko, Ando Saabas and Enn Tyugu 1 Institute of Cybernetics, Tallinn University of Technology Akadeemia tee 21 12618

More information

PS I TAM-TAM Aspect [20/11/09] 1

PS I TAM-TAM Aspect [20/11/09] 1 PS I TAM-TAM Aspect [20/11/09] 1 Binnick, Robert I. (2006): "Aspect and Aspectuality". In: Bas Aarts & April McMahon (eds). The Handbook of English Linguistics. Malden, MA et al.: Blackwell Publishing,

More information

Electronic offprint from. baltic linguistics. Vol. 3, 2012

Electronic offprint from. baltic linguistics. Vol. 3, 2012 Electronic offprint from baltic linguistics Vol. 3, 2012 ISSN 2081-7533 Nɪᴄᴏʟᴇ Nᴀᴜ, A Short Grammar of Latgalian. (Languages of the World/Materials, 482.) München: ʟɪɴᴄᴏᴍ Europa, 2011, 119 pp. ɪѕʙɴ 978-3-86288-055-3.

More information

REALIZATION SORTING ALGORITHM USING PARALLEL TECHNOLOGIES bachelor, Mikhelev Vladimir candidate of Science, prof., Sinyuk Vasily

REALIZATION SORTING ALGORITHM USING PARALLEL TECHNOLOGIES bachelor, Mikhelev Vladimir candidate of Science, prof., Sinyuk Vasily 2. В. Гергель: Современные языки и технологии параллельного программирования. М., Изд-во МГУ, 2012. 3. Синюк, В. Г. Алгоритмы и структуры данных. Белгород: Изд-во БГТУ им. В. Г. Шухова, 2013. REALIZATION

More information

ifinder ENTERPRISE SEARCH

ifinder ENTERPRISE SEARCH DATA SHEET ifinder ENTERPRISE SEARCH ifinder - the Enterprise Search solution for company-wide information search, information logistics and text mining. CUSTOMER QUOTE IntraFind stands for high quality

More information

A Survey of Online Tools Used in English-Thai and Thai-English Translation by Thai Students

A Survey of Online Tools Used in English-Thai and Thai-English Translation by Thai Students 69 A Survey of Online Tools Used in English-Thai and Thai-English Translation by Thai Students Sarathorn Munpru, Srinakharinwirot University, Thailand Pornpol Wuttikrikunlaya, Srinakharinwirot University,

More information

TOOL OF THE INTELLIGENCE ECONOMIC: RECOGNITION FUNCTION OF REVIEWS CRITICS. Extraction and linguistic analysis of sentiments

TOOL OF THE INTELLIGENCE ECONOMIC: RECOGNITION FUNCTION OF REVIEWS CRITICS. Extraction and linguistic analysis of sentiments TOOL OF THE INTELLIGENCE ECONOMIC: RECOGNITION FUNCTION OF REVIEWS CRITICS. Extraction and linguistic analysis of sentiments Grzegorz Dziczkowski, Katarzyna Wegrzyn-Wolska Ecole Superieur d Ingenieurs

More information

Robustness of a Spoken Dialogue Interface for a Personal Assistant

Robustness of a Spoken Dialogue Interface for a Personal Assistant Robustness of a Spoken Dialogue Interface for a Personal Assistant Anna Wong, Anh Nguyen and Wayne Wobcke School of Computer Science and Engineering University of New South Wales Sydney NSW 22, Australia

More information

Paraphrasing controlled English texts

Paraphrasing controlled English texts Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining

More information

stress, intonation and pauses and pronounce English sounds correctly. (b) To speak accurately to the listener(s) about one s thoughts and feelings,

stress, intonation and pauses and pronounce English sounds correctly. (b) To speak accurately to the listener(s) about one s thoughts and feelings, Section 9 Foreign Languages I. OVERALL OBJECTIVE To develop students basic communication abilities such as listening, speaking, reading and writing, deepening their understanding of language and culture

More information

Why Data Flow Diagrams?

Why Data Flow Diagrams? Flow Diagrams A structured analysis technique that employs a set of visual representations of the data that moves through the organization, the paths through which the data moves, and the processes that

More information

Grammar Presentation: The Sentence

Grammar Presentation: The Sentence Grammar Presentation: The Sentence GradWRITE! Initiative Writing Support Centre Student Development Services The rules of English grammar are best understood if you understand the underlying structure

More information

National Quali cations SPECIMEN ONLY

National Quali cations SPECIMEN ONLY N5 SQ40/N5/02 FOR OFFICIAL USE National Quali cations SPECIMEN ONLY Mark Urdu Writing Date Not applicable Duration 1 hour and 30 minutes *SQ40N502* Fill in these boxes and read what is printed below. Full

More information

Sentence Structure/Sentence Types HANDOUT

Sentence Structure/Sentence Types HANDOUT Sentence Structure/Sentence Types HANDOUT This handout is designed to give you a very brief (and, of necessity, incomplete) overview of the different types of sentence structure and how the elements of

More information

Language Meaning and Use

Language Meaning and Use Language Meaning and Use Raymond Hickey, English Linguistics Website: www.uni-due.de/ele Types of meaning There are four recognisable types of meaning: lexical meaning, grammatical meaning, sentence meaning

More information

Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql

Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql Domain Knowledge Extracting in a Chinese Natural Language Interface to Databases: NChiql Xiaofeng Meng 1,2, Yong Zhou 1, and Shan Wang 1 1 College of Information, Renmin University of China, Beijing 100872

More information

ICAME Journal No. 24. Reviews

ICAME Journal No. 24. Reviews ICAME Journal No. 24 Reviews Collins COBUILD Grammar Patterns 2: Nouns and Adjectives, edited by Gill Francis, Susan Hunston, andelizabeth Manning, withjohn Sinclair as the founding editor-in-chief of

More information

A Knowledge-based System for Translating FOL Formulas into NL Sentences

A Knowledge-based System for Translating FOL Formulas into NL Sentences A Knowledge-based System for Translating FOL Formulas into NL Sentences Aikaterini Mpagouli, Ioannis Hatzilygeroudis University of Patras, School of Engineering Department of Computer Engineering & Informatics,

More information

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages ICOM 4036 Programming Languages Preliminaries Dr. Amirhossein Chinaei Dept. of Electrical & Computer Engineering UPRM Spring 2010 Language Evaluation Criteria Readability: the ease with which programs

More information

Universal. Event. Product. Computer. 1 warehouse.

Universal. Event. Product. Computer. 1 warehouse. Dynamic multi-dimensional models for text warehouses Maria Zamr Bleyberg, Karthik Ganesh Computing and Information Sciences Department Kansas State University, Manhattan, KS, 66506 Abstract In this paper,

More information

A Survey of ASL Tenses

A Survey of ASL Tenses A Survey of ASL Tenses Karen Alkoby DePaul University School of Computer Science Chicago, IL kalkoby@shrike.depaul.edu Abstract This paper examines tenses in American Sign Language (ASL), which will be

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information
