# Statistical Machine Translation

 To view this video please enable JavaScript, and consider upgrading to a web browser that supports HTML5 video
Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language Technology School of Computing Dublin City University Week Ten/Eleven 17th April 2014

2 Lecture Plan Part One: Recap of decoding Part Two: More on decoding Part Three: Encoding linguistic knowledge

3 Part One Recap of decoding

4 Questions 1. What is decoding? 2. What methods are used to make the decoding problem tractable? 3. What is the difference between histogram pruning and threshold pruning?

5 What is decoding? Decoding is the process of searching for the most likely translation within the space of candidate translations.

6 Making decoding tractable 1. Hypothesis recombination 2. Pruning

7 Histogram Pruning Keep a maximum of m hypotheses

8 Threshold Pruning Keep only those hypotheses that are within a threshold α of the best hypothesis.

9 Histogram Pruning How many hypotheses will be pruned if m = 1?

10 Threshold Pruning How many hypotheses will be pruned if we prune all hypotheses that are at least 0.5 times worse than the best hypothesis? 0.09

11 Threshold Pruning 0.01 Threshold pruning is more flexible than histogram pruning since it takes into account the difference between the scores of the best and worst hypothesis

12 Part Two More on decoding

13 Stack Decoding Algorithm Partial translations or hypotheses are grouped into stacks based on how many words in the input sentence they translate.

14 Stack Decoding Algorithm Partial translations or hypotheses are grouped into stacks based on how many words in the input sentence they translate. Working left to right through the stacks, the decoder takes a hypothesis from a stack and a phrase from the input sentence and tries to construct a new hypothesis.

15 Stack Decoding Algorithm Partial translations or hypotheses are grouped into stacks based on how many words in the input sentence they translate. Working left to right through the stacks, the decoder takes a hypothesis from a stack and a phrase from the input sentence and tries to construct a new hypothesis. The output is the highest scoring hypothesis that translates the entire input sentence.

16 Stack Decoding Algorithm Partial translations or hypotheses are grouped into stacks based on how many words in the input sentence they translate. Working left to right through the stacks, the decoder takes a hypothesis from a stack and a phrase from the input sentence and tries to construct a new hypothesis. The output is the highest scoring hypothesis that translates the entire input sentence. Stacks are kept to a manageable size using pruning.

17 Stack Decoding Pseudocode 1:place empty hypothesis into stack 0 2: for all stacks 0...n 1 do 3: for all hypotheses in stack do 4: for all translation options do 5: if applicable then 6: create new hypothesis 7: place in stack 8: recombine with existing hypothesis if possible 9: prune stack if too big 10: end if 11: end for 12: end for 13: end for

18 Stack Decoding Example Input sentence: maison bleu Goal: Translate it into English using the stack decoding algorithm with histogram pruning (maximum number of hypotheses per stack = 3)

19 Stack Decoding Example Phrase pairs in training data: maison house maison home bleu blue bleu turquoise maison bleu blue house

20 Stack Decoding Example All possible translations: house blue, house turquoise, home blue, home turquoise, blue house, blue home, turquoise home, turquoise house

21 Stack Decoding Example Insert the empty hypothesis in Stack

22 Stack Decoding Example Construct new hypotheses compatible with empty hypothesis turquoise blue home house blue house 0 1 2

23 Stack Decoding Example Prune Stack 1 (assume turquoise has lowest probability) turquoise blue home house blue house 0 1 2

24 Stack Decoding Example Construct new hypotheses compatible with hypotheses in Stack 1 home turqoise house turquoise home blue blue home house house blue blue home blue house blue house 0 1 2

25 Stack Decoding Example Perform hypothesis recombination (assume in this case that the shorter path hypothesis is more likely) home turqoise house turquoise home blue blue home house house blue blue home blue house 0 1 2

26 Stack Decoding Example Prune stack 2 (assume home blue, home turquoise and house turquoise have the lowest scores) blue home house home turqoise housee turquoise home blue house blue blue home blue house 0 1 2

27 Stack Decoding Example Output most likely translation in Stack 2 blue house blue home blue home house blue house 0 1 2

28 Limitation of Stack Decoding Algorithm Relatively low-scoring hypotheses that are part of a good translation may be pruned.

29 Limitation of Stack Decoding Algorithm the tourism initiative addresses this for the first time Die Tm:-0.19, lm:-0.4, d:0,all:-0.59 touristische Tm:-1.16, lm:-2.93, d:0,all:-4.09 initiative Tm:-1.21, lm:-4.67, d:0,all:-5.88 das erste mal Tm:-0.56, lm:-2.81, d:-0.74,all:-4.11

30 Limitation of Stack Decoding Algorithm the tourism initiative addresses this for the first time Die Tm:-0.19, lm:-0.4, d:0,all:-0.59 touristische Tm:-1.16, lm:-2.93, d:0,all:-4.09 initiative Tm:-1.21, lm:-4.67, d:0,all:-5.88 das erste mal Tm:-0.56, lm:-2.81, d:-0.74,all:-4.11 Both hypotheses translate 3 words (therefore, in the same stack so will be compared to each other when deciding what to prune)

31 Limitation of Stack Decoding Algorithm the tourism initiative addresses this for the first time Die Tm:-0.19, lm:-0.4, d:0,all:-0.59 touristische Tm:-1.16, lm:-2.93, d:0,all:-4.09 initiative Tm:-1.21, lm:-4.67, d:0,all:-5.88 das erste mal Tm:-0.56, lm:-2.81, d:-0.74,all:-4.11 Both hypotheses translate 3 words (therefore, in the same stack so will be compared to each other when deciding what to prune) The worse hypothesis (i.e. the one starting with das erste mal) has a better score at this stage because it contains common words that are easier to translate.

32 Limitation of Stack Decoding Algorithm How to overcome this limitation?

33 Limitation of Stack Decoding Algorithm How to overcome this limitation? Estimate the future cost associated with a hypothesis by looking at the rest of the input string.

34 Future Cost A hypothesis score is some combination of its probability and its future cost. Why is the exact future cost not calculated?

35 Future Cost A hypothesis score is some combination of its probability and its future cost. Why is the exact future cost not calculated? Because calculating the exact future cost of all hypotheses would amount to evaluating the translation cost of all hypotheses the very thing we use pruning to avoid.

36 Depth-first versus breadth-first search A distinction is made between decoding algorithms that perform a breadth-first search and those that perform a depth-first search.

37 Depth-first versus breadth-first search A distinction is made between decoding algorithms that perform a breadth-first search and those that perform a depth-first search. A breadth-first search explores many hypotheses in parallel, e.g. stack decoding. A depth-first search fully explores a hypothesis before moving on to the next one, e.g. A* decoding.

38 Search and Model Errors 1. A search error occurs when the decoder misses a translation with a higher probability than the translation it returns.

39 Search and Model Errors 1. A search error occurs when the decoder misses a translation with a higher probability than the translation it returns. 2. A model error occurs when the model assigns a higher probability to an incorrect translation than to a correct one.

40 Part Three Making Use of Linguistic Knowledge

41 The story so far The SMT approaches we have studied so far are language-independent.

42 The story so far The SMT approaches we have studied so far are language-independent. Input: 1. A monolingual corpus for training a language model 2. A sentence-aligned parallel corpus for training a translation model

43 The story so far The SMT approaches we have studied so far are language-independent. Input: 1. A monolingual corpus for training a language model 2. A sentence-aligned parallel corpus for training a translation model Output: A translation (of variable quality)

44 The old way of doing things Traditional rule-based MT systems tried to translate using hand-coded rules which encode linguistic knowledge or generalisations.

45 Examples of linguistic knowledge 1. In English, the noun comes after the adjective. In French, the noun usually precedes the adjective.

46 Examples of linguistic knowledge 1. In English, the noun comes after the adjective. In French, the noun usually precedes the adjective. 2. In German, the main verb often occurs at the end of a clause.

47 Examples of linguistic knowledge 1. In English, the noun comes after the adjective. In French, the noun usually precedes the adjective. 2. In German, the main verb often occurs at the end of a clause. 3. In Irish, prepositions are inflected for number and person.

48 Examples of linguistic knowledge 1. In English, the noun comes after the adjective. In French, the noun usually precedes the adjective. 2. In German, the main verb often occurs at the end of a clause. 3. In Irish, prepositions are inflected for number and person. 4. In many languages, the subject and verb agree in person and number.

49 Why bother with linguistic knowledge? Why would we want to encode linguistic knowledge in an SMT system?

50 Why bother with linguistic knowledge? Why would we want to encode linguistic knowledge in an SMT system? 1. Data-driven SMT works well but there is room for improvement (especially for some language pairs).

51 Why bother with linguistic knowledge? Why would we want to encode linguistic knowledge in an SMT system? 1. Data-driven SMT works well but there is room for improvement (especially for some language pairs). 2. Data-driven SMT relies on the availability of large corpora. For minority languages, this can be problematic.

52 Using linguistic knowledge Where linguistic knowledge might be useful:

53 Using linguistic knowledge Where linguistic knowledge might be useful: 1. More global view of fluency than an n-gram language model: the house is the man is small

54 Using linguistic knowledge Where linguistic knowledge might be useful: 1. More global view of fluency than an n-gram language model: the house is the man is small 2. Long-distance dependencies: make a tremendous amount of sense

55 Using linguistic knowledge Where linguistic knowledge might be useful: 1. More global view of fluency than an n-gram language model: the house is the man is small 2. Long-distance dependencies: make a tremendous amount of sense 3. Word compounding: Knowing that Aktionsplan is made up of the words Aktion and plan

56 Using linguistic knowledge Where linguistic knowledge might be useful: 1. More global view of fluency than an n-gram language model: the house is the man is small 2. Long-distance dependencies: make a tremendous amount of sense 3. Word compounding: Knowing that Aktionsplan is made up of the words Aktion and plan 4. Better handling of function words: role of de in entreprises de transports (haulage companies)

57 How to encode linguistic knowledge How can linguistic knowledge be encoded in an SMT system?

58 How to encode linguistic knowledge How can linguistic knowledge be encoded in an SMT system? 1. Before translation (pre-processing)

59 How to encode linguistic knowledge How can linguistic knowledge be encoded in an SMT system? 1. Before translation (pre-processing) 2. During translation

60 How to encode linguistic knowledge How can linguistic knowledge be encoded in an SMT system? 1. Before translation (pre-processing) 2. During translation 3. After translation (post-processing)

61 Where do we get this linguistic knowledge? How do we obtain linguistic knowledge automatically?

62 Where do we get this linguistic knowledge? How do we obtain linguistic knowledge automatically? 1. Automatic morphological analysis 2. Part-of-speech tagging 3. Syntactic parsing

63 Where do we get this linguistic knowledge? How do we obtain linguistic knowledge automatically? 1. Automatic morphological analysis 2. Part-of-speech tagging 3. Syntactic parsing Can you spot a potential problem here (hint: automatic)?

64 Syntactic Parsing S NP VP dobj NNP VBP NNP John loves Mary NNP VBP NNP John loves Mary nsubj

65 Pre-processing Encoding linguistic knowledge before translation 1. Word segmentation 2. Word lemmatisation 3. Re-ordering

66 Word Segmentation The notion of what a word is varies from language to language.

67 Word Segmentation The notion of what a word is varies from language to language. In some languages, e.g. Chinese, words are not separated by spaces.

68 Word Segmentation The notion of what a word is varies from language to language. In some languages, e.g. Chinese, words are not separated by spaces. Why is this a problem for machine translation?

69 Word Segmentation The notion of what a word is varies from language to language. In some languages, e.g. Chinese, words are not separated by spaces. Why is this a problem for machine translation? Potential solution: try to perform automatic word segmentation before translation.

70 Word Segmentation The notion of what a word is varies from language to language. In some languages, e.g. Chinese, words are not separated by spaces. Why is this a problem for machine translation? Potential solution: try to perform automatic word segmentation before translation. A related challenge is word compounding in German.

71 Word Lemmatisation Languages with a rich inflectional system can have several different variants of the same root form or lemma.

72 Word Lemmatisation Languages with a rich inflectional system can have several different variants of the same root form or lemma. Example: Latin: bellum, belli, bello, bella, bellis, bellorum English: war, wars

73 Word Lemmatisation Languages with a rich inflectional system can have several different variants of the same root form or lemma. Example: Latin: bellum, belli, bello, bella, bellis, bellorum English: war, wars Why is this a problem for translation from an MRL (morphologically rich language) into English?

74 Word Lemmatisation Languages with a rich inflectional system can have several different variants of the same root form or lemma. Example: Latin: bellum, belli, bello, bella, bellis, bellorum English: war, wars Why is this a problem for translation from an MRL (morphologically rich language) into English? Potential solution: before translation, replace full form with lemma

75 Word Lemmatisation Languages with a rich inflectional system can have several different variants of the same root form or lemma. Example: Latin: bellum, belli, bello, bella, bellis, bellorum English: war, wars Why is this a problem for translation from an MRL (morphologically rich language) into English? Potential solution: before translation, replace full form with lemma

76 Syntactic Re-ordering It is harder to translate language pairs with very different word order.

77 Syntactic Re-ordering It is harder to translate language pairs with very different word order. Language model and phrase-based translation model can only do so much.

78 Syntactic Re-ordering It is harder to translate language pairs with very different word order. Language model and phrase-based translation model can only do so much. Potential solution: transform the source sentence so that its word order more closely resembles the word order of the target language.

79 Syntactic Re-ordering It is harder to translate language pairs with very different word order. Language model and phrase-based translation model can only do so much. Potential solution: transform the source sentence so that its word order more closely resembles the word order of the target language. Consider translating from English to German: I walked to the shop yesterday. I yesterday to the shop walked.

80 During Translation Encoding linguistic knowledge during translation

81 During Translation Encoding linguistic knowledge during translation 1. Log-linear models

82 During Translation Encoding linguistic knowledge during translation 1. Log-linear models 2. Tree-based models

83 During Translation Encoding linguistic knowledge during translation 1. Log-linear models 2. Tree-based models 3. Syntactic language models

84 Log-linear models The log-linear model of translation allows several sources of information to be employed simultaneously during the translation process: p x = exp n i=1 λ i h i (x)

85 Log-linear models The log-linear model of translation allows several sources of information to be employed simultaneously during the translation process: p x = exp n i=1 λ i h i (x) Some of this information can be linguistic knowledge, e.g. instead of just calculating p(ocras hungry) we can also calculate p(noun adj).

86 Tree-based models A more radical approach is to replace the phrasebased translation model with a different type of translation model entirely.

87 Tree-based models A more radical approach is to replace the phrasebased translation model with a different type of translation model entirely. One such model is a tree-based model which is based on the notion of a syntax tree and which allows one, in theory, to better capture structural similarities and divergences between languages.

88 Tree-based models A more radical approach is to replace the phrasebased translation model with a different type of translation model entirely. One such model is a tree-based model which is based on the notion of a syntax tree and which allows one, in theory, to better capture structural similarities and divergences between languages. During decoding, syntax trees for the source and language pairs are built up simultaneously (or one side only)

89 Syntactic language models N-gram language models are not the only language models.

90 Syntactic language models N-gram language models are not the only language models. People have experimented with other kinds of language models, including those based on the notion of a syntax tree.

91 Syntactic language models N-gram language models are not the only language models. People have experimented with other kinds of language models, including those based on the notion of a syntax tree. Instead of computing the probability of a sentence by multiplying n-gram probabilities, the probability is computed by multiplying the probabilities of the rules that are used to build the syntax tree(s).

92 Post-processing SMT systems return a ranked list of candidate translations. Linguistic information (of all kinds) can be employed in re-ranking this list using machine learning techniques.

### Chapter 6. Decoding. Statistical Machine Translation

Chapter 6 Decoding Statistical Machine Translation Decoding We have a mathematical model for translation p(e f) Task of decoding: find the translation e best with highest probability Two types of error

### Decoding. Philipp Koehn. 11 May 2015

Decoding Philipp Koehn 11 May 2015 Decoding 1 We have a mathematical model for translation p(e f) Task of decoding: find the translation e best with highest probability Two types of error e best = argmax

### Statistical Machine Translation Lecture 4. Beyond IBM Model 1 to Phrase-Based Models

p. Statistical Machine Translation Lecture 4 Beyond IBM Model 1 to Phrase-Based Models Stephen Clark based on slides by Philipp Koehn p. Model 2 p Introduces more realistic assumption for the alignment

### Phrase-Based MT. Machine Translation Lecture 7. Instructor: Chris Callison-Burch TAs: Mitchell Stern, Justin Chiu. Website: mt-class.

Phrase-Based MT Machine Translation Lecture 7 Instructor: Chris Callison-Burch TAs: Mitchell Stern, Justin Chiu Website: mt-class.org/penn Translational Equivalence Er hat die Prüfung bestanden, jedoch

### Hybrid Machine Translation Guided by a Rule Based System

Hybrid Machine Translation Guided by a Rule Based System Cristina España-Bonet, Gorka Labaka, Arantza Díaz de Ilarraza, Lluís Màrquez Kepa Sarasola Universitat Politècnica de Catalunya University of the

### Convergence of Translation Memory and Statistical Machine Translation

Convergence of Translation Memory and Statistical Machine Translation Philipp Koehn and Jean Senellart 4 November 2010 Progress in Translation Automation 1 Translation Memory (TM) translators store past

### Phrases. Topics for Today. Phrases. POS Tagging. ! Text transformation. ! Text processing issues

Topics for Today! Text transformation Word occurrence statistics Tokenizing Stopping and stemming Phrases Document structure Link analysis Information extraction Internationalization Phrases! Many queries

### The KIT Translation system for IWSLT 2010

The KIT Translation system for IWSLT 2010 Jan Niehues 1, Mohammed Mediani 1, Teresa Herrmann 1, Michael Heck 2, Christian Herff 2, Alex Waibel 1 Institute of Anthropomatics KIT - Karlsruhe Institute of

### The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems

The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems Dr. Ananthi Sheshasaayee 1, Angela Deepa. V.R 2 1 Research Supervisior, Department of Computer Science & Application,

### The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006

The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 Yidong Chen, Xiaodong Shi Institute of Artificial Intelligence Xiamen University P. R. China November 28, 2006 - Kyoto 13:46 1

### Factored Translation Models

Factored Translation s Philipp Koehn and Hieu Hoang pkoehn@inf.ed.ac.uk, H.Hoang@sms.ed.ac.uk School of Informatics University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW Scotland, United Kingdom

### LIUM s Statistical Machine Translation System for IWSLT 2010

LIUM s Statistical Machine Translation System for IWSLT 2010 Anthony Rousseau, Loïc Barrault, Paul Deléglise, Yannick Estève Laboratoire Informatique de l Université du Maine (LIUM) University of Le Mans,

### The KIT Translation Systems for IWSLT 2013

The KIT Translation Systems for IWSLT 2013 Thanh-Le Ha, Teresa Herrmann, Jan Niehues, Mohammed Mediani, Eunah Cho, Yuqi Zhang, Isabel Slawik and Alex Waibel Institute for Anthropomatics KIT - Karlsruhe

### Customizing an English-Korean Machine Translation System for Patent Translation *

Customizing an English-Korean Machine Translation System for Patent Translation * Sung-Kwon Choi, Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research Institute,

### A Joint Sequence Translation Model with Integrated Reordering

A Joint Sequence Translation Model with Integrated Reordering Nadir Durrani, Helmut Schmid and Alexander Fraser Institute for Natural Language Processing University of Stuttgart Introduction Generation

### TriS: A Statistical Sentence Simplifier with Log-linear Models and Marginbased Discriminative Training

TriS: A Statistical Sentence Simplifier with Log-linear Models and Marginbased Discriminative Training Nguyen Bach, Qin Gao, Stephan Vogel, and Alex Waibel Language Technologies Institute Carnegie Mellon

### Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990

### Learning Translation Rules from Bilingual English Filipino Corpus

Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,

### Chapter 5. Phrase-based models. Statistical Machine Translation

Chapter 5 Phrase-based models Statistical Machine Translation Motivation Word-Based Models translate words as atomic units Phrase-Based Models translate phrases as atomic units Advantages: many-to-many

### Why Evaluation? Machine Translation. Evaluation. Evaluation Metrics. Ten Translations of a Chinese Sentence. How good is a given system?

Why Evaluation? How good is a given system? Machine Translation Evaluation Which one is the best system for our purpose? How much did we improve our system? How can we tune our system to become better?

### Machine Translation. Why Evaluation? Evaluation. Ten Translations of a Chinese Sentence. Evaluation Metrics. But MT evaluation is a di cult problem!

Why Evaluation? How good is a given system? Which one is the best system for our purpose? How much did we improve our system? How can we tune our system to become better? But MT evaluation is a di cult

### Croatian Statistical Machine Translation for Industry Domain

Croatian Statistical Machine Translation for Industry Domain Summary Multilingual communication has become a top priority in today s globalized world. As human translation is a time-consuming, expensive

### THUTR: A Translation Retrieval System

THUTR: A Translation Retrieval System Chunyang Liu, Qi Liu, Yang Liu, and Maosong Sun Department of Computer Science and Technology State Key Lab on Intelligent Technology and Systems National Lab for

### Translation Solution for

Translation Solution for Case Study Contents PROMT Translation Solution for PayPal Case Study 1 Contents 1 Summary 1 Background for Using MT at PayPal 1 PayPal s Initial Requirements for MT Vendor 2 Business

### Extending Memory-Based Machine Translation to Phrases

Extending Memory-Based Machine Translation to Phrases February 2010 What is Phrase-based Memory-based Machine Translation? 1 Machine translation Data-driven Form of Example Based MT 2 Memory-based Lazy-learning

### Comma checking in Danish Daniel Hardt Copenhagen Business School & Villanova University

Comma checking in Danish Daniel Hardt Copenhagen Business School & Villanova University 1. Introduction This paper describes research in using the Brill tagger (Brill 94,95) to learn to identify incorrect

### An Approach to Handle Idioms and Phrasal Verbs in English-Tamil Machine Translation System

An Approach to Handle Idioms and Phrasal Verbs in English-Tamil Machine Translation System Thiruumeni P G, Anand Kumar M Computational Engineering & Networking, Amrita Vishwa Vidyapeetham, Coimbatore,

### Enriching Morphologically Poor Languages for Statistical Machine Translation

Enriching Morphologically Poor Languages for Statistical Machine Translation Eleftherios Avramidis e.avramidis@sms.ed.ac.uk Philipp Koehn pkoehn@inf.ed.ac.uk School of Informatics University of Edinburgh

### CHAPTER I INTRODUCTION

1 CHAPTER I INTRODUCTION A. Background of the Study Language is used to communicate with other people. People need to study how to use language especially foreign language. Language can be study in linguistic

### UNSUPERVISED MORPHOLOGICAL SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION

UNSUPERVISED MORPHOLOGICAL SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION by Ann Clifton B.A., Reed College, 2001 a thesis submitted in partial fulfillment of the requirements for the degree of Master

### Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 25 Nov, 9, 2016 CPSC 422, Lecture 26 Slide 1 Morphology Syntax Semantics Pragmatics Discourse and Dialogue Knowledge-Formalisms Map (including

### SYNTAX. Syntax the study of the system of rules and categories that underlies sentence formation.

SYNTAX Syntax the study of the system of rules and categories that underlies sentence formation. 1) Syntactic categories lexical: - words that have meaning (semantic content) - words that can be inflected

### Word Completion and Prediction in Hebrew

Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology

### tance alignment and time information to create confusion networks 1 from the output of different ASR systems for the same

1222 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 7, SEPTEMBER 2008 System Combination for Machine Translation of Spoken and Written Language Evgeny Matusov, Student Member,

### Modeling Computation

Modeling Computation 11.1.1 Language and Grammars Natural Languages: English, French, German, Programming Languages: Pascal, C, Java, Grammar is used to generate sentences from basic words of the language

### The Prague Bulletin of Mathematical Linguistics NUMBER 96 OCTOBER 2011 49 58. Ncode: an Open Source Bilingual N-gram SMT Toolkit

The Prague Bulletin of Mathematical Linguistics NUMBER 96 OCTOBER 2011 49 58 Ncode: an Open Source Bilingual N-gram SMT Toolkit Josep M. Crego a, François Yvon ab, José B. Mariño c c a LIMSI-CNRS, BP 133,

### A New Input Method for Human Translators: Integrating Machine Translation Effectively and Imperceptibly

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015) A New Input Method for Human Translators: Integrating Machine Translation Effectively and Imperceptibly

### Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

### Machine Translation. Agenda

Agenda Introduction to Machine Translation Data-driven statistical machine translation Translation models Parallel corpora Document-, sentence-, word-alignment Phrase-based translation MT decoding algorithm

### Towards a RB-SMT Hybrid System for Translating Patent Claims Results and Perspectives

Towards a RB-SMT Hybrid System for Translating Patent Claims Results and Perspectives Ramona Enache and Adam Slaski Department of Computer Science and Engineering Chalmers University of Technology and

### Word Ordering Without Syntax

Word Ordering Without Syntax Allen Schmaltz Alexander M. Rush Stuart M. Shieber Harvard University EMNLP, 2016 Schmaltz et al. (Harvard University) Word Ordering Without Syntax EMNLP, 2016 1 / 15 Outline

### A Quirk Review of Translation Models

A Quirk Review of Translation Models Jianfeng Gao, Microsoft Research Prepared in connection with the 2010 ACL/SIGIR summer school July 22, 2011 The goal of machine translation (MT) is to use a computer

### The Impact of Morphological Errors in Phrase-based Statistical Machine Translation from English and German into Swedish

The Impact of Morphological Errors in Phrase-based Statistical Machine Translation from English and German into Swedish Oscar Täckström Swedish Institute of Computer Science SE-16429, Kista, Sweden oscar@sics.se

### ACCURAT Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation www.accurat-project.eu Project no.

ACCURAT Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation www.accurat-project.eu Project no. 248347 Deliverable D5.4 Report on requirements, implementation

### SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 统

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems Jin Yang, Satoshi Enoue Jean Senellart, Tristan Croiset SYSTRAN Software, Inc. SYSTRAN SA 9333 Genesee Ave. Suite PL1 La Grande

### UPM system for the translation task

UPM system for the translation task Verónica López-Ludeña Grupo de Tecnología del Habla Universidad Politécnica de Madrid veronicalopez@die.upm.es Rubén San-Segundo Grupo de Tecnología del Habla Universidad

This page intentionally left blank Statistical Machine Translation The field of machine translation has recently been energized by the emergence of statistical techniques, which have brought the dream

### Statistical Phrase-Based Translation

Statistical Phrase-Based Translation Philipp Koehn, Franz Josef Och, Daniel Marcu Information Sciences Institute Department of Computer Science University of Southern California koehn@isi.edu, och@isi.edu,

### Paraphrasing controlled English texts

Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining

### Factored bilingual n-gram language models for statistical machine translation

Mach Translat DOI 10.1007/s10590-010-9082-5 Factored bilingual n-gram language models for statistical machine translation Josep M. Crego François Yvon Received: 2 November 2009 / Accepted: 12 June 2010

### Introduction. Philipp Koehn. 28 January 2016

Introduction Philipp Koehn 28 January 2016 Administrativa 1 Class web site: http://www.mt-class.org/jhu/ Tuesdays and Thursdays, 1:30-2:45, Hodson 313 Instructor: Philipp Koehn (with help from Matt Post)

### Computational Linguistics: Syntax I

Computational Linguistics: Syntax I Raffaella Bernardi KRDB, Free University of Bozen-Bolzano P.zza Domenicani, Room: 2.28, e-mail: bernardi@inf.unibz.it Contents 1 Syntax...................................................

### DEVELOPMENT AND ANALYSIS OF HINDI-URDU PARALLEL CORPUS

DEVELOPMENT AND ANALYSIS OF HINDI-URDU PARALLEL CORPUS Mandeep Kaur GNE, Ludhiana Ludhiana, India Navdeep Kaur SLIET, Longowal Sangrur, India Abstract This paper deals with Development and Analysis of

### Computational Linguistics

Computational Linguistics Machine Translation Suhaila Saee & Bali Ranaivo-Malançon Faculty of Computer Science and Information Technology Universiti Malaysia Sarawak August 2014 (SOURCE:WORDWEB PRO 1.4)

### Machine Translation and the Translator

Machine Translation and the Translator Philipp Koehn 8 April 2015 About me 1 Professor at Johns Hopkins University (US), University of Edinburgh (Scotland) Author of textbook on statistical machine translation

### Building a Question Classifier for a TREC-Style Question Answering System

Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given

### Building a Web-based parallel corpus and filtering out machinetranslated

Building a Web-based parallel corpus and filtering out machinetranslated text Alexandra Antonova, Alexey Misyurev Yandex 16, Leo Tolstoy St., Moscow, Russia {antonova, misyurev}@yandex-team.ru Abstract

### Computer Aided Translation

Computer Aided Translation Philipp Koehn 30 April 2015 Why Machine Translation? 1 Assimilation reader initiates translation, wants to know content user is tolerant of inferior quality focus of majority

### SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 Jin Yang and Satoshi Enoue SYSTRAN Software, Inc. 4444 Eastgate Mall, Suite 310 San Diego, CA 92121, USA E-mail:

### Machine Translation of User Generated Content. Julia Epiphantseva Head of Business Development

Machine Translation of User Generated Content Julia Epiphantseva pp Head of Business Development PROMT Technologies PROMT Rule Based Machine Translation (RBMT) PROMT Statistical ti ti Machine Translation

### MT Development Experience of Vietnam

MT Development Experience of Vietnam VU Tat Thang, Ph.D. Institute of Information Technology Vietnamese Academy of Science and Technology vtthang@ioit.ac.vn Thang VU 2002 ~ 2005: IOIT, Vietnam Speech Processing

### Recent Innovations in NTT s Statistical Machine Translation

: Front-line of Speech, Language, and Hearing Research for Heartfelt Communications Recent Innovations in NTT s Statistical Machine Translation Masaaki Nagata, Katsuhito Sudoh, Jun Suzuki, Yasuhiro Akiba,

### The Basics of Syntax. Supplementary Readings. Introduction. Lexical Categories. Phrase Structure Rules. Introducing Noun Phrases. Some Further Details

The following readings have been posted to the Moodle course site: Language Files: Chapter 5 (pp. 194-198, 204-215) Language Instinct: Chapter 4 (pp. 74-99) The System Thus Far The Fundamental Question:

### The TCH Machine Translation System for IWSLT 2008

The TCH Machine Translation System for IWSLT 2008 Haifeng Wang, Hua Wu, Xiaoguang Hu, Zhanyi Liu, Jianfeng Li, Dengjun Ren, Zhengyu Niu Toshiba (China) Research and Development Center 5/F., Tower W2, Oriental

### Improving the Performance of English-Tamil Statistical Machine Translation System using Source-Side Pre-Processing

Proc. of Int. Conf. on Advances in Computer Science, AETACS Improving the Performance of English-Tamil Statistical Machine Translation System using Source-Side Pre-Processing Anand Kumar M 1, Dhanalakshmi

### Learning Translations of Named-Entity Phrases from Parallel Corpora

Learning Translations of Named-Entity Phrases from Parallel Corpora Robert C. Moore Microsoft Research Redmond, WA 98052, USA bobmoore@microsoft.com Abstract We develop a new approach to learning phrase

### Lecture 9: Formal grammars of English

(Fall 2012) http://cs.illinois.edu/class/cs498jh Lecture 9: Formal grammars of English Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office Hours: Wednesday, 12:15-1:15pm Previous key concepts!

### Effective Self-Training for Parsing

Effective Self-Training for Parsing David McClosky dmcc@cs.brown.edu Brown Laboratory for Linguistic Information Processing (BLLIP) Joint work with Eugene Charniak and Mark Johnson David McClosky - dmcc@cs.brown.edu

### SMOOTHING A PBSMT MODEL BY FACTORING OUT ADJUNCTS

SMOOTHING A PBSMT MODEL BY FACTORING OUT ADJUNCTS MSc Thesis (Afstudeerscriptie) written by Sophie I. Arnoult (born April 3rd, 1976 in Suresnes, France) under the supervision of Dr Khalil Sima an, and

### Improving Pronoun Translation for Statistical Machine Translation (SMT)

Improving Pronoun Translation for Statistical Machine Translation (SMT) Liane Guillou E H U N I V E R S I T Y T O H F R G E D I N B U Master of Science Artificial Intelligence School of Informatics University

### Error Analysis of Verb Inflections in Spanish Translation Output

Error Analysis of Verb Inflections in Translation Output Maja Popović and Hermann Ney Lehrstuhl für Informatik 6 Computer Science Department RWTH Aachen University, D-52056 Aachen, Germany {popovic,ney}@informatik.rwth-aachen.de

### A Data-Driven Approach to Deep Machine Translation. Michael Jellinghaus Saarland University

A Data-Driven Approach to Deep Machine Translation Michael Jellinghaus Saarland University micha@coli.uni-sb.de Motivation Overview Characterisation of statistical and transfer-based MT Hybrid system idea

### Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation

Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation Ibrahim Badr Rabih Zbib James Glass Computer Science and Artificial Intelligence Lab Massachusetts Institute of Technology

### An Efficient Framework for Extracting Parallel Sentences from Non-Parallel Corpora

Fundamenta Informaticae 130 (2014) 179 199 179 DOI 10.3233/FI-2014-987 IOS Press An Efficient Framework for Extracting Parallel Sentences from Non-Parallel Corpora Cuong Hoang, Anh-Cuong Le, Phuong-Thai

### Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium

Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium Grégoire Détrez, Víctor M. Sánchez-Cartagena, Aarne Ranta gregoire.detrez@gu.se, vmsanchez@prompsit.com,

### Translation of verbal idioms

Translation of verbal idioms Martine Smets, Joseph Pentheroudakis and Arul Menezes Microsoft Research One Microsoft Way Redmond, WA 98052, USA martines@microsoft.com josephp@microsoft.com arulm@microsoft.com

### Using a Dependency Parser to Improve SMT for Subject-Object-Verb Languages

Using a Dependency Parser to Improve SMT for Subject-Object-Verb Languages Peng Xu, Jaeho Kang, Michael Ringgaard and Franz Och Google Inc. 1600 Amphitheatre Parkway Mountain View, CA 94043, USA {xp,jhkang,ringgaard,och}@google.com

### Rapid assembly of a large-scale French-English MT system

Rapid assembly of a large-scale French-English MT system Jessie Pinkham, Monica Corston-Oliver, Martine Smets, and Martine Pettenaro Microsoft Research One Microsoft Way Redmond, WA 98052 {jessiep, martines,

### That ll do Fine!: A Coarse Lexical Resource for English-Hindi MT, using Polylingual Topic Models

That ll do Fine!: A Coarse Lexical Resource for English-Hindi MT, using Polylingual Topic Models Diptesh Kanojia 2, Aditya Joshi 1,2,3, Pushpak Bhattacharyya 2, Mark James Carman 3 1 IITB-Monash Research

### L130: Chapter 5d. Dr. Shannon Bischoff. Dr. Shannon Bischoff () L130: Chapter 5d 1 / 25

L130: Chapter 5d Dr. Shannon Bischoff Dr. Shannon Bischoff () L130: Chapter 5d 1 / 25 Outline 1 Syntax 2 Clauses 3 Constituents Dr. Shannon Bischoff () L130: Chapter 5d 2 / 25 Outline Last time... Verbs...

### Training and evaluation of POS taggers on the French MULTITAG corpus

Training and evaluation of POS taggers on the French MULTITAG corpus A. Allauzen, H. Bonneau-Maynard LIMSI/CNRS; Univ Paris-Sud, Orsay, F-91405 {allauzen,maynard}@limsi.fr Abstract The explicit introduction

### a hybrid MT system Peter Dirix Vincent Vandeghinste Ineke Schuurman Centre for Computational Linguistics Katholieke Universiteit Leuven

METIS-II: a hybrid MT system Peter Dirix Vincent Vandeghinste Ineke Schuurman Centre for Computational Linguistics Katholieke Universiteit Leuven TMI 2007, Skövde Overview Techniques and issues in MT The

### Context Free Grammars

Context Free Grammars So far we have looked at models of language that capture only local phenomena, namely what we can analyze when looking at only a small window of words in a sentence. To move towards

### PROMT Technologies for Translation and Big Data

PROMT Technologies for Translation and Big Data Overview and Use Cases Julia Epiphantseva PROMT About PROMT EXPIRIENCED Founded in 1991. One of the world leading machine translation provider DIVERSIFIED

### Leveraging ASEAN Economic Community through Language Translation Services

Leveraging ASEAN Economic Community through Language Translation Services Hammam Riza Center for Information and Communication Technology Agency for the Assessment and Application of Technology (BPPT)

### construct for language tests?

How valid is the CEFR as a construct for language tests? Cecilie Hamnes Carlsen 5 th ALTE conference, Paris, April 9 th 10 th 2014 [ ] the CEFR levels are neither based on empirical evidence taken from

### FBK's Machine Translation Systems for IWSLT 2012's TED Lectures

FBK's Machine Translation Systems for IWSLT 2012's TED Lectures Nick Ruiz, Arianna Bisazza Roldano Cattoni, Marcello Federico 1 2 Outline Common components Arabic-English Turkish-English Dutch-English

### College-level L2 English Writing Competence: Conjunctions and Errors

College-level L2 English Writing Competence: Mo Li Abstract Learners output of the target language has been taken into consideration a great deal over the past decade. One of the issues of L2 writing research

### Customizing an English-Korean Machine Translation System for Patent/Technical Documents Translation * *

Customizing an English-Korean Machine Translation System for Patent/Technical Documents Translation * * Oh-Woog Kwon, Sung-Kwon Choi, Ki-Young Lee, Yoon-Hyung Roh, and Young-Gil Kim Natural Language Processing

### 14 Automatic language correction

14 Automatic language correction IA161 Advanced Techniques of Natural Language Processing J. Švec NLP Centre, FI MU, Brno December 21, 2015 J. Švec IA161 Advanced NLP 14 Automatic language correction 1

### An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation

An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation Robert C. Moore Chris Quirk Microsoft Research Redmond, WA 98052, USA {bobmoore,chrisq}@microsoft.com

### Empirical Machine Translation and its Evaluation

Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical

### ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking

ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking Anne-Laure Ligozat LIMSI-CNRS/ENSIIE rue John von Neumann 91400 Orsay, France annlor@limsi.fr Cyril Grouin LIMSI-CNRS rue John von Neumann 91400

### T U R K A L A T O R 1

T U R K A L A T O R 1 A Suite of Tools for Augmenting English-to-Turkish Statistical Machine Translation by Gorkem Ozbek [gorkem@stanford.edu] Siddharth Jonathan [jonsid@stanford.edu] CS224N: Natural Language

### MORPHOLOGY BASED PROTOTYPE STATISTICAL MACHINE TRANSLATION SYSTEM FOR ENGLISH TO TAMIL LANGUAGE

MORPHOLOGY BASED PROTOTYPE STATISTICAL MACHINE TRANSLATION SYSTEM FOR ENGLISH TO TAMIL LANGUAGE A Thesis Submitted for the Degree of Doctor of Philosophy in the School of Engineering by ANAND KUMAR M CENTER

### Statistical Machine Translation

Statistical Machine Translation What works and what does not Andreas Maletti Universität Stuttgart maletti@ims.uni-stuttgart.de Stuttgart May 14, 2013 Statistical Machine Translation A. Maletti 1 Main

### Natural Language Database Interface for the Community Based Monitoring System *

Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University

### Statistical Significance Tests for Machine Translation Evaluation

Statistical Significance Tests for Machine Translation Evaluation Philipp Koehn Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology The Stata Center, 32 Vassar