Extracting translation relations for human-readable dictionaries from bilingual text




Overview
1. Company
2. Translate pro 12.1 and AutoLearn<word>
3. Translation workflow
4. Extraction method
5. Extended AutoLearn with selection restrictions
6. Improving accuracy, coverage and availability
7. On-the-fly extraction
06.11.2013 TeKom 2013 - Lingenio 2013

Company
- Founded in 1999
- Spin-off of the IBM research center Germany
- Located in Heidelberg
- Develops and markets language technology software and services
- Core competences: machine translation, electronic dictionaries, text analysis (morphology, syntax, semantics)
- Several research projects

Translate pro 12.1
- Single-user versions for professional translators and private use
- Including dictionaries with context-sensitive search functions

Translate pro 12.1: corporate solutions
- Client/server networks for workgroups
- Lingenio Translation Server: web-based solutions for company-wide intranets

Translate pro 12.1: integration via plug-ins
- Publishing tools (e.g. Wordpress)
- CAT tools (e.g. Trados, OmegaT)

Translate pro 12.1
- Translation Center
- MS Office plug-ins
- Browser plug-ins (IE, Firefox)
- PDF translation

Translation Center
- User dictionaries: editing, settings
- Translation memories: selection, settings
- Automatic extraction of dictionary entries
- Post-editing: alternative translations
- Assistant: unknown words, statistics, settings, ...

AutoLearn<word> extracts suggestions for dictionary entries from
- post-edited MT
- translation memories

AutoLearn<word> creates suggestions from post-edited text

AutoLearn<word> creates suggestions from translation memory sentence pairs

AutoLearn<word>: suggestions extracted from translation memory sentence pairs

AutoLearn<word> suggestions
- relate (potentially) to all parts of speech (nouns, verbs, adjectives, ...)
- include multiword expressions
- can be selected for integration into the active user dictionary

AutoLearn<word> suggestions can be added to the dictionary, either as single relations or all at once

Dictionary entries assigned to suggestions
- make use of the morpho-syntactic and semantic information and the defaults of the MT system
- can be edited

AutoLearn entries adapt the translation to the extracted references

AutoLearn<word> extracts suggestions for dictionary entries from
- post-edited MT
- translation memories: from single sentence pairs or from complete TMs
- bilingual text, via the Lingenio sentence aligner

Workflow:

AutoLearn<word>: bilingual texts (example: European insurance regulation)

AutoLearn<word>: align and import into translation memory

AutoLearn<word>: extract translation suggestions from single sentence pairs

AutoLearn<word>: or from complete translation memories

AutoLearn<word> - Extraction method
1. Translation relations from system dictionaries
2. Structures assigned to source and target sentence by the analysis components of the MT system

Example
German: Die Lithofazien-Analyse des oberen Teils der Pliozän-Schicht im Valdelsa-Becken (Mittelitalien) hat eine gewisse Anzahl von Umweltablagerungen ergeben, von der Schwemm- zur Küsten- und zur Meeresebene.
English: Lithofacies analysis of the upper part of the Pliocene succession of the Valdelsa basin (central Italy) unravelled a number of depositional environments, ranging from alluvial plain to coastal, to marine

Example (simplified)
German: Die Lithofazien-Analyse der Pliozän-Schicht hat eine gewisse Anzahl von Umweltablagerungen ergeben.
English: Lithofacies analysis of the Pliocene succession unravelled a number of depositional environments.

Dependency grammar structures + transfer knowledge (+ statistics)
Derive new relations: AutoLearn<word>
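Note: the idea of deriving new relations from aligned structures can be illustrated with a small sketch. This is not Lingenio's implementation. It assumes hand-written dependency structures for the simplified example sentence pair, a hypothetical `KNOWN` dictionary standing in for the system dictionary, and made-up role labels (`subj`, `obj`, `mod`). Words already related by the dictionary act as anchors; two unaligned words are proposed as a new relation when they stand in the same structural position relative to the anchors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    lemma: str
    head: Optional[str]  # lemma of the governing word, None for the root
    role: str            # grammatical role relative to the head


def signature(node, tree, translate):
    """Describe a node by its links to anchored neighbours (head and dependents)."""
    sig = set()
    if node.head is not None:
        t = translate(node.head)
        if t is not None:
            sig.add(("up", node.role, t))
    for child in tree:
        if child.head == node.lemma:
            t = translate(child.lemma)
            if t is not None:
                sig.add(("down", child.role, t))
    return frozenset(sig)


def derive_relations(src, tgt, known):
    """Propose (source lemma, target lemma) pairs not yet in the dictionary."""
    tgt_lemmas = {n.lemma for n in tgt}
    # Anchors: dictionary relations whose both sides occur in the sentence pair.
    anchors = {n.lemma: known[n.lemma] for n in src
               if known.get(n.lemma) in tgt_lemmas}
    anchored_tgt = set(anchors.values())

    tgt_sigs = {}
    for n in tgt:
        if n.lemma not in anchored_tgt:
            sig = signature(n, tgt, lambda l: l if l in anchored_tgt else None)
            if sig:
                tgt_sigs.setdefault(sig, []).append(n.lemma)

    proposals = []
    for n in src:
        if n.lemma in anchors:
            continue
        sig = signature(n, src, anchors.get)
        matches = tgt_sigs.get(sig, [])
        if sig and len(matches) == 1:  # keep only unambiguous matches
            proposals.append((n.lemma, matches[0]))
    return proposals


# Hand-written structures for the simplified example (an assumption, not MT output).
SRC = [
    Node("ergeben", None, "root"),
    Node("Lithofazien-Analyse", "ergeben", "subj"),
    Node("Pliozän-Schicht", "Lithofazien-Analyse", "mod"),
    Node("Anzahl", "ergeben", "obj"),
    Node("Umweltablagerung", "Anzahl", "mod"),
]
TGT = [
    Node("unravel", None, "root"),
    Node("lithofacies analysis", "unravel", "subj"),
    Node("Pliocene succession", "lithofacies analysis", "mod"),
    Node("number", "unravel", "obj"),
    Node("depositional environment", "number", "mod"),
]
# Hypothetical system-dictionary relations used as anchors.
KNOWN = {"Lithofazien-Analyse": "lithofacies analysis", "Anzahl": "number"}

print(derive_relations(SRC, TGT, KNOWN))
```

With these inputs the sketch proposes ergeben ~ unravel, Pliozän-Schicht ~ Pliocene succession and Umweltablagerung ~ depositional environment. A real system would additionally need transfer knowledge for structural divergences and statistics to rank competing candidates.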

Do more! Use analysis constraints!
- syntactic constraints
- semantic constraints
- morphological constraints

Extended AutoLearn with selection restrictions
- genitive object constraint
- direct object constraints

Extended AutoLearn with selection restrictions
1. Extract restrictions: Lithofazien-Analyse ergibt Anzahl Umweltablagerungen ~ unravel
2. Weaken conditions: Analyse ergibt Ablagerung (analysis yields deposit); Vorgang ergibt Ergebnis (process yields result)
3. Select conditions by evaluating occurrences in corpora: Analyse/Vorgang ergibt Rückstand/Ergebnis (analysis/process yields residue/result) ~ unravel, yield?
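Note: the three steps (extract, weaken, select) can be sketched as follows. This is an illustration under stated assumptions, not the product's method: the hypernym table `HYPERNYMS`, the tiny `CORPUS` of observed (subject, object, translation of "ergeben") triples and the 0.8 threshold are all invented for the example, and subject and object are generalised in lockstep for simplicity.

```python
# Hypothetical concept hierarchy: each lemma maps to its next more general concept.
HYPERNYMS = {
    "Lithofazien-Analyse": "Analyse", "Analyse": "Vorgang",
    "Umweltablagerung": "Ablagerung", "Ablagerung": "Ergebnis",
    "Rechnung": "Vorgang", "Summe": "Ergebnis",
}

# Invented corpus evidence: (subject, direct object, translation of "ergeben").
CORPUS = [
    ("Lithofazien-Analyse", "Umweltablagerung", "unravel"),
    ("Analyse", "Ablagerung", "unravel"),
    ("Rechnung", "Summe", "yield"),
    ("Analyse", "Umweltablagerung", "unravel"),
]


def ancestors(lemma):
    """Return the lemma followed by its increasingly general concepts."""
    chain = [lemma]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain


def precision(subj_concept, obj_concept, translation):
    """Share of corpus occurrences satisfying the restriction that use the translation."""
    hits = [t for s, o, t in CORPUS
            if subj_concept in ancestors(s) and obj_concept in ancestors(o)]
    return hits.count(translation) / len(hits) if hits else 0.0


def select_restriction(subj, obj, translation, threshold=0.8):
    """Weaken the extracted restriction step by step while the corpus still supports it."""
    best = None
    # zip stops at the shorter chain, so both sides are weakened together.
    for level in zip(ancestors(subj), ancestors(obj)):
        if precision(*level, translation) < threshold:
            break
        best = level
    return best


print(select_restriction("Lithofazien-Analyse", "Umweltablagerung", "unravel"))
```

On this toy data the restriction is weakened to Analyse/Ablagerung but not to Vorgang/Ergebnis, because at that level "yield" also occurs and the share of "unravel" drops to 0.75.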

Extended AutoLearn with selection restrictions
- soon: version 12.5
- supporting research: improve accuracy and coverage

Improving accuracy and coverage
- EU Marie Curie project (Hybrid high quality machine translation)
- BMWi project FlexNeuroTrans (Flexible MT for medium-sized businesses using neural nets)
- combination of rule-based and statistical methods
- extract information from the internet, ...

AutoLearn<word> information
- Example: European insurance regulation
- Search bilingual text (on the fly) that suits the information requirement, for example via Wikipedia, ...

AutoLearn<word> information: store and examine extracted texts

Availability (AutoLearn<word>)
- for translation service (Lingenio Translation Server, LTS)
- for CAT tools
- for publishing tools (Wordpress, ...)
- for intranet solutions

Improving availability for multilingual platforms

On-the-fly extraction: text to be processed

Summary: products & research
1. Version 12.1: AutoLearn<word> (for several parts of speech & multiwords)
2. Version 12.5 (soon: with selection restrictions)
3. Learning, user dictionaries & memories available for Lingenio Translation Server (for CAT tools, publishing and intranet)
4. Supporting research for improving accuracy, coverage and on-the-fly extraction of translation information

Thank you for your attention! Questions? (please visit us at stand 420)