Machine Translation. Why Evaluation? Evaluation. Ten Translations of a Chinese Sentence. Evaluation Metrics. But MT evaluation is a difficult problem!
- Eugenia Wheeler
Evaluation of Machine Translation
Based on Philipp Koehn's slides from Chapter 8

Why Evaluation?
- How good is a given system?
- Which one is the best system for our purpose?
- How much did we improve our system?
- How can we tune our system to become better?
But MT evaluation is a difficult problem!

Evaluation Metrics
- subjective judgments by human evaluators
- automatic evaluation metrics
- task-based evaluation, e.g.: how much post-editing effort? does information come across?

Ten Translations of a Chinese Sentence
- Israeli officials are responsible for airport security.
- Israel is in charge of the security at this airport.
- The security work for this airport is the responsibility of the Israel government.
- Israeli side was in charge of the security of this airport.
- Israel is responsible for the airport's security.
- Israel is responsible for safety work at this airport.
- Israel presides over the security of the airport.
- Israel took charge of the airport security.
- The safety of this airport is taken charge of by Israel.
- This airport's security is the responsibility of the Israeli security officials.
(a typical example from the NIST evaluation set)

Adequacy and Fluency
Human judgement
- given: machine translation output
- given: source and/or reference translation
- task: assess the quality of the machine translation output
Metrics
- Adequacy: Does the output convey the same meaning as the input sentence? Is part of the message lost, added, or distorted?
- Fluency: Is the output good fluent English? This involves both grammatical correctness and idiomatic word choices.

Human vs. Automatic Evaluation
Human evaluation is
- ultimately what we are interested in, but
- very time consuming
- not re-usable
Automatic evaluation is
- cheap and re-usable, but
- not necessarily reliable

Human Evaluation
Source: Estos tejidos están analizados, transformados y congelados antes de ser almacenados en Hema-Québec, que gestiona también el único banco público de sangre del cordón umbilical en Quebec.
Reference: These tissues are analyzed, processed and frozen before being stored at Héma-Québec, which manages also the only bank of placental blood in Quebec.
Translation: These weavings are analyzed, transformed and frozen before being stored in Hema-Quebec, that negotiates also the public only bank of blood of the umbilical cord in Quebec.
What is your judgement in terms of adequacy and fluency?

Fluency and Adequacy: Scales
Adequacy          Fluency
all meaning       flawless English
most meaning      good English
much meaning      non-native English
little meaning    disfluent English
none              incomprehensible

Measuring Agreement between Evaluators
Kappa coefficient

$$K = \frac{p(A) - p(E)}{1 - p(E)}$$

- p(A): proportion of times that the evaluators agree
- p(E): proportion of times that they would agree by chance (5-point scale → p(E) = 1/5)
Example: inter-evaluator agreement in the WMT 2007 evaluation campaign
[Table: P(A), P(E), and K for fluency and adequacy judgments]

Evaluators Disagree
[Figure: histogram of adequacy judgments by different human evaluators (from a WMT evaluation)]

Rank Sentences
Task: WMT09 Spanish-English News Corpus
Instructions: Rank each translation from Best to Worst relative to the other choices (ties are allowed). These are not interpreted as absolute scores. They are relative scores.
Source: Estos tejidos están analizados, transformados y congelados antes de ser almacenados en Hema-Québec, que gestiona también el único banco público de sangre del cordón umbilical en Quebec.
Reference: These tissues are analyzed, processed and frozen before being stored at Héma-Québec, which manages also the only bank of placental blood in Quebec.
Translations to rank:
- These weavings are analyzed, transformed and frozen before being stored in Hema-Quebec, that negotiates also the public only bank of blood of the umbilical cord in Quebec.
- These tissues analysed, processed and before frozen of stored in Hema-Québec, which also operates the only public bank umbilical cord blood in Quebec.
- These tissues are analyzed, processed and frozen before being stored in Hema-Québec, which also manages the only public bank umbilical cord blood in Quebec.
- These tissues are analyzed, processed and frozen before being stored in Hema-Quebec, which also operates the only public bank of umbilical cord blood in Quebec.
- These fabrics are analyzed, are transformed and are frozen before being stored in Hema-Québec, who manages also the only public bank of blood of the umbilical cord in Quebec.

Ranking Translations
- Task for evaluator: Is translation X better than translation Y? (choices: better, worse, equal)
- Evaluators are more consistent on this task than on fluency and adequacy judgments.
[Table: P(A), P(E), and K for fluency, adequacy, and sentence ranking]
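
The kappa coefficient used to compare evaluators can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names and both judgment lists are hypothetical, and chance agreement is assumed to be uniform over the available choices (1/3 for better/worse/equal, 1/5 for a 5-point scale).

```python
from typing import Sequence


def observed_agreement(judge_a: Sequence[str], judge_b: Sequence[str]) -> float:
    """p(A): proportion of items on which two evaluators gave the same label."""
    if len(judge_a) != len(judge_b) or not judge_a:
        raise ValueError("need two non-empty label lists of equal length")
    same = sum(1 for a, b in zip(judge_a, judge_b) if a == b)
    return same / len(judge_a)


def kappa(p_agree: float, p_chance: float) -> float:
    """K = (p(A) - p(E)) / (1 - p(E))."""
    return (p_agree - p_chance) / (1.0 - p_chance)


# Hypothetical judgments, not taken from any evaluation campaign.
judge_1 = ["better", "worse", "equal", "better", "worse", "better"]
judge_2 = ["better", "worse", "better", "better", "equal", "better"]

p_a = observed_agreement(judge_1, judge_2)  # 4 of the 6 labels coincide
k_ranking = kappa(p_a, 1 / 3)               # three choices, assumed p(E) = 1/3
k_five_point = kappa(0.5, 1 / 5)            # assumed p(A) = 0.5 on a 5-point scale
```

A kappa of 0 means agreement is no better than chance, and 1 means perfect agreement, which is why the same raw agreement rate yields different kappa values on scales with different numbers of choices.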

General Goals for Evaluation Metrics
- Correct: metric must rank better systems higher
- Meaningful: score should give intuitive interpretation of translation quality
- Low cost: reduce time and money spent on carrying out evaluation
- Useful for tuning: automatically optimize system parameters towards metric
- Consistent: repeated use of metric should give same results

Automatic Evaluation Metrics
- Goal: computer program that computes the quality of translations
- Advantages: low cost, fast, re-usable
- Basic strategy
  - given: machine translation output
  - given: human reference translation
  - task: compute similarity between them

Precision and Recall of Words
SYSTEM A:  Israeli officials responsibility of airport safety
REFERENCE: Israeli officials are responsible for airport security
Three output words (Israeli, officials, airport) also occur in the reference.

$$\text{precision} = \frac{\text{correct}}{\text{output-length}} = \frac{3}{6} = 50\%$$

$$\text{recall} = \frac{\text{correct}}{\text{reference-length}} = \frac{3}{7} = 43\%$$

$$\text{F-measure} = \frac{\text{precision} \times \text{recall}}{(\text{precision} + \text{recall})/2} = \frac{.5 \times .43}{(.5 + .43)/2} = 46\%$$
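
The word-level precision, recall, and F-measure for System A can be reproduced with a short sketch. The function name `word_overlap_scores` is made up for this illustration, tokenization is assumed to be plain whitespace splitting, and both sentences are assumed to be non-empty.

```python
from collections import Counter


def word_overlap_scores(output: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) over whitespace-separated words."""
    out_words = output.split()
    ref_words = reference.split()
    # Multiset intersection: each reference word can be matched at most once.
    correct = sum((Counter(out_words) & Counter(ref_words)).values())
    precision = correct / len(out_words)
    recall = correct / len(ref_words)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_measure = precision * recall / ((precision + recall) / 2)
    return precision, recall, f_measure


system_a = "Israeli officials responsibility of airport safety"
reference = "Israeli officials are responsible for airport security"

precision, recall, f_measure = word_overlap_scores(system_a, reference)
print(f"{precision:.0%} {recall:.0%} {f_measure:.0%}")  # prints "50% 43% 46%"
```

Because the words are compared as bags, shuffling the output leaves all three scores unchanged.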

Precision and Recall
SYSTEM A:  Israeli officials responsibility of airport safety
REFERENCE: Israeli officials are responsible for airport security
SYSTEM B:  airport security Israeli officials are responsible

Metric       System A   System B
precision    50%        100%
recall       43%        86%
f-measure    46%        92%

flaw: no penalty for reordering

Word Error Rate
Minimum number of editing steps to transform output to reference
- match: words match, no cost
- substitution: replace one word with another
- insertion: add word
- deletion: drop word
Levenshtein distance

$$\text{WER} = \frac{\text{substitutions} + \text{insertions} + \text{deletions}}{\text{reference-length}}$$

Example
[Figure: Levenshtein distance matrices aligning each system output with the reference]

Metric                  System A   System B
word error rate (WER)   57%        71%

BLEU
N-gram overlap between machine translation output and reference translation
Compute geometric mean of n-gram precisions (typically size 1 to 4):

$$P = \sqrt[n]{\text{precision}_1 \times \text{precision}_2 \times \dots \times \text{precision}_n} = \left(\prod_{i=1}^{n} \text{precision}_i\right)^{1/n}$$

Add brevity penalty for short translations:

$$\text{BP} = \begin{cases} 1 & \text{if output-length } c > \text{reference-length } r \\ \exp(1 - r/c) & \text{if output-length } c \le \text{reference-length } r \end{cases}$$
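
For comparison with the n-gram based score, the word error rate defined earlier can be sketched as a standard dynamic-programming Levenshtein distance over words. The function names are chosen for this illustration, and whitespace tokenization is an assumption.

```python
def edit_distance(output: list[str], reference: list[str]) -> int:
    """Minimum number of substitutions, insertions, and deletions (word level)."""
    # previous[j] = cost of turning the output words seen so far into reference[:j]
    previous = list(range(len(reference) + 1))
    for i, out_word in enumerate(output, start=1):
        current = [i]  # turning i output words into an empty reference: i deletions
        for j, ref_word in enumerate(reference, start=1):
            substitution = previous[j - 1] + (out_word != ref_word)  # 0 on a match
            deletion = previous[j] + 1       # drop the output word
            insertion = current[j - 1] + 1   # add the missing reference word
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def word_error_rate(output: str, reference: str) -> float:
    ref_words = reference.split()
    return edit_distance(output.split(), ref_words) / len(ref_words)


reference = "Israeli officials are responsible for airport security"
wer_a = word_error_rate("Israeli officials responsibility of airport safety", reference)
wer_b = word_error_rate("airport security Israeli officials are responsible", reference)
print(f"{wer_a:.0%} {wer_b:.0%}")  # prints "57% 71%"
```

System B contains only correct words, yet it needs five edits (two deletions at the front, three insertions at the end), so its word error rate is worse than System A's. This is the opposite failure mode to precision and recall, which ignore word order entirely.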

BLEU: Example
SYSTEM A:  Israeli officials responsibility of airport safety
REFERENCE: Israeli officials are responsible for airport security
SYSTEM B:  airport security Israeli officials are responsible

Metric                                               System A   System B
precision (1-gram)                                   3/6        6/6
precision (2-gram)                                   1/5        4/5
precision (3-gram)                                   0/4        2/4
precision (4-gram)                                   0/3        1/3
brevity penalty (output-length / reference-length)   6/7        6/7
BLEU                                                 0%         52%

BLEU
More efficient:

$$P = \left(\prod_{i=1}^{n} \text{precision}_i\right)^{1/n} = \exp\left(\frac{1}{n}\sum_{i=1}^{n} \log_e \text{precision}_i\right)$$

Putting everything together (for 1- to 4-grams, N = 4):

$$\text{BLEU} = \min\left(1, \exp\left(1 - \frac{\text{reference-length}}{\text{output-length}}\right)\right) \times \exp\left(\frac{1}{N}\sum_{n=1}^{N} \log_e \text{precision}_n\right)$$

- Typically computed over the entire test corpus, not single sentences
- Can you figure out why?

Multiple Reference Translations
- To account for variability, use multiple reference translations
- n-grams may match in any of the references
- closest reference length used
Example
SYSTEM: Israeli officials responsibility of airport safety
REFERENCES:
- Israeli officials are responsible for airport security
- Israel is in charge of the security at this airport
- The security work for this airport is the responsibility of the Israel government
- Israeli side was in charge of the security of this airport

Modified N-gram Precision
Avoid counting correct n-grams more often than they appear in any reference translation:

$$\text{count}_{\text{clip}} = \min(\text{count}_{\text{candidate}}, \text{max-count}_{\text{reference}})$$

Candidate:   the the the the the the the
Reference 1: The cat is on the mat.
Reference 2: There is a cat on the mat.
count_clip(the) = 2, precision = 2/7 (unigram precision)
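
The pieces above (clipped n-gram counts, geometric mean, brevity penalty, closest reference length) can be combined into a small sentence-level sketch. It is an illustration under stated assumptions rather than a reference implementation: the function names are invented here, tokenization is lowercased whitespace splitting, and a zero precision simply yields a score of 0 with no smoothing.

```python
import math
from collections import Counter


def ngrams(words: list[str], n: int) -> Counter:
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def modified_precision(candidate: list[str], references: list[list[str]], n: int) -> float:
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    ref_counts = [ngrams(ref, n) for ref in references]
    # Clip each candidate count by the largest count seen in any single reference.
    clipped = sum(min(count, max(rc[gram] for rc in ref_counts))
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate: list[str], references: list[list[str]], max_n: int = 4) -> float:
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # the geometric mean is zero as soon as one precision is zero
    c = len(candidate)
    # Closest reference length; ties go to the shorter reference (an assumption).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


def tokenize(text: str) -> list[str]:
    return text.lower().split()


reference = [tokenize("Israeli officials are responsible for airport security")]
bleu_a = bleu(tokenize("Israeli officials responsibility of airport safety"), reference)
bleu_b = bleu(tokenize("airport security Israeli officials are responsible"), reference)

the_refs = [tokenize("The cat is on the mat."), tokenize("There is a cat on the mat.")]
the_precision = modified_precision(tokenize("the the the the the the the"), the_refs, 1)
```

With the exponential brevity penalty exp(1 - r/c), System B scores about 0.51; using the plain length ratio 6/7 as the penalty instead would give about 0.52. System A scores 0 because it has no matching 3-grams, which illustrates why the score is usually computed over a whole test corpus rather than per sentence.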

Correlation with Human Judgement
[Figure: correlation with human judgement]

Typical BLEU Scores
BLEU scores for statistical machine translation systems (Koehn)
[Table: BLEU scores by source and target language (da, de, el, en, es, fr, fi, it, nl, pt, sv)]

Critique of Automatic Metrics
- Ignore relevance of words (names and core concepts more important than determiners and punctuation)
- Operate on local level (do not consider overall grammaticality of the sentence or sentence meaning)
- Scores are meaningless (scores very test-set specific, absolute value not informative)
- Human translators score low on BLEU (possibly because of higher variability, different word choices)

METEOR: Flexible Matching
- Partial credit for matching stems
  system:    Jim went home
  reference: Joe goes home
- Partial credit for matching (near) synonyms
  system:    Jim walks home
  reference: Joe goes home
- Use of paraphrases

Evidence of Shortcomings of Automatic Metrics
Post-edited output vs. statistical systems (NIST evaluation)
[Figure: BLEU score plotted against human adequacy score]

Evidence of Shortcomings of Automatic Metrics
Rule-based vs. statistical systems
[Figure: BLEU score plotted against human adequacy and fluency scores for SMT systems and a rule-based system (Systran)]

Automatic Metrics: Conclusions
- Automatic metrics are an essential tool for system development
- Not fully suited to rank systems of different types
- Evaluation metrics are still an open challenge

Metric Research
- Active development of new metrics
  - syntactic similarity
  - semantic equivalence or entailment
  - metrics targeted at reordering
  - trainable metrics
  - etc.
- Evaluation campaigns that rank metrics

Post-Editing Machine Translation
- Measuring time spent on producing translations
  - baseline: translation from scratch
  - post-editing machine translation
  But: time consuming, depends on skills of translator and post-editor
- Metrics inspired by this task
  - TER: based on number of editing steps; Levenshtein operations (insertion, deletion, substitution) plus movement
  - HTER: manually construct reference translation for output, apply TER (very time consuming, used in the DARPA GALE program)

Task-Oriented Evaluation
Does machine translation output help accomplish a task?
- browsing quality: Is the translation understandable in its context? (Are its main contents clear enough to find the information I need?)
- post-editing quality: How many edit operations are required to turn it into a good translation?
- publishing quality: How many human interventions are necessary to make the entire document ready for printing?

Other Evaluation Criteria
When deploying systems, considerations go beyond quality of translations
- Speed: we prefer faster machine translation systems
- Size: fits into memory of available machines (e.g., handheld devices)
- Integration: can be integrated into existing workflow
- Customization: can be adapted to user's needs

Content Understanding Tests
Given machine translation output, can a monolingual target-side speaker answer questions about it?
1. basic facts: who? where? when? names, numbers, and dates
2. actors and events: relationships, temporal and causal order
3. nuance and author intent: emphasis and subtext
Very hard to devise questions
Sentence editing task (WMT)
- person A edits the translation to make it fluent (with no access to source or reference)
- person B checks if edit is correct → did person A understand the translation correctly?

Summary
- MT evaluation is important
  - System development
  - Parameter tuning
  - Task-oriented performance
- MT evaluation is difficult
  - Human evaluators are expensive and disagree
  - Automatic metrics are not always reliable
→ Be careful when arguing about MT quality!
More informationBridging the Online Language Barriers with Machine Translation at the United Nations
Bridging the Online Language Barriers with Machine Translation at the United Nations Fernando Serván! Food and Agriculture Organization of the! United Nations (FAO),! Rome, Italy! About FAO - International