Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation


Sudip Kumar Naskar
Computer and System Sciences, Visva-Bharati University, India
sudip.naskar@visva-bharati.ac.in

Antonio Toral, Federico Gaspari, Declan Groves
School of Computing, Dublin City University, Ireland
{atoral, fgaspari, dgroves}@computing.dcu.ie

(Work done while at CNGL, School of Computing, Dublin City University.)

Abstract

Diagnostic evaluation of machine translation (MT) is an approach to evaluation that provides finer-grained information than state-of-the-art automatic metrics. This paper evaluates DELiC4MT, a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. We present the results obtained with this diagnostic metric when evaluating three MT systems that translate from English to French, comparing against both human judgements and a set of representative automatic evaluation metrics. In addition, as the diagnostic metric relies on word alignments, the paper compares the margin of error in diagnostic evaluation when using automatic word alignments as opposed to gold-standard manual alignments. We observed that this diagnostic metric is capable of accurately reflecting translation quality, can be used reliably with automatic word alignments and, in general, correlates well with automatic metrics and, more importantly, with human judgements.

1 Introduction

The study presented in this paper addresses the topic of diagnostic evaluation of machine translation (MT), which is receiving increasing attention due to its potentially crucial but still largely unexplored role in the development and subsequent deployment of MT systems. Diagnostic evaluation might be particularly useful to complement the overall system-level scores provided by automatic MT evaluation metrics.
On the one hand, these automatic metrics represent cost-effective, objective and easily replicable measures; on the other, they provide only global indications that are normally too coarse to explain the performance of an MT system. An associated issue is that diagnostic evaluation needs to be as fine-grained as possible to be really useful in targeting specific weaknesses detected in MT output, so that system developers can take corrective actions accordingly and users can assess the actual impact of the system's weaknesses.

This paper evaluates a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. Focusing on English-French translation as a case study, the use of alternative automatic word alignments is investigated and compared against gold-standard manual alignment to discuss how these different approaches impact the results of diagnostic MT evaluation. The paper also presents a comparative evaluation of three MT systems judged according to standard automatic MT evaluation metrics, the diagnostic evaluation metric over a range of linguistic checkpoints, and human judgements. Additionally, we investigate how these different types of MT evaluation correlate with each other.

The paper is structured as follows. Section 2 presents previous work in diagnostic evaluation of MT, discussing the methodologies and tools that exist in this area and focusing in particular on the features of the diagnostic metric used in this study. Section 3 describes the datasets that were used for the experiments and Section 4 details the experimental setup. The results of the investigation are presented and analysed in Section 5, and finally some conclusions are drawn and possible avenues for future work are outlined in Section 6.

Sima'an, K., Forcada, M.L., Grasmick, D., Depraetere, H., Way, A. (eds.) Proceedings of the XIV Machine Translation Summit (Nice, September 2-6, 2013). © 2013 The authors. This article is licensed under a Creative Commons 3.0 licence (CC-BY-ND: attribution, no derivative works).

2 Related Work

Recognising that the ability to automatically identify and evaluate specific MT errors with diagnostic relevance is of paramount importance, Popović et al. (2006) propose a framework for the automatic classification of MT errors based on morpho-syntactic features. They show that linguistically-sensitive measures provide useful feedback to alleviate the problems encountered by MT. In a similar vein, Popović and Burchardt (2011) present a method for automatic error classification and compare its use with results obtained from human evaluation. They show good correlation between their automatic measures and human judgements across various error classes for different MT output. Popović (2011) describes a tool for automatic classification of MT errors, which are grouped into five classes (morphological, lexical, reordering, omissions and unnecessary additions). The tool needs full-form reference translation(s) and hypotheses with their corresponding base forms. Additional information at the word level (such as PoS tags) can be used for a finer-grained analysis. The tool computes the number of errors for each class at the document and sentence levels.

Max et al. (2010) propose an approach to contrastive diagnostic MT evaluation based on comparing the ability of different systems (or implementations of the same system) to correctly translate source-language words. Their contrastive lexical evaluation method does not rely on the direct comparison of the systems' hypotheses with the reference translations; instead, for each source-language word it identifies which of the MT systems under consideration produce the correct output matching the reference.
Their study is devoted to English-French and they point out the crucial role played by the quality of the alignment, suggesting that inaccuracies in the automatic alignment are bound to impair the reliability of this approach for lexical diagnostic evaluation. Fishel et al. (2012) provide an overview of the field of diagnostic evaluation of MT, presenting a collection of freely available translation error-annotated corpora for various language pairs and comparing the performance of two state-of-the-art tools on automatic error analysis of MT.

Zhou et al. (2008) describe a tool for diagnostic MT evaluation called Woodpecker, which is based on linguistic checkpoints. These are particularly interesting (or problematic) linguistic phenomena for MT processing, identified by the user or developer who conducts the evaluation, e.g. ambiguous words, challenging collocations, PoS n-gram constructs, etc. One needs to define a linguistic taxonomy which describes the phenomena to be captured in the diagnostic evaluation, deciding which elements of the source language to investigate. This scheme is extremely flexible and can be formulated at different levels of specificity, whereby the granularity of the checkpoints included depends on the objectives of the diagnostic evaluation. While the notion of linguistic checkpoints is very useful within the context of diagnostic MT evaluation, Woodpecker has some limitations. First of all, language-dependent data for English-Chinese (the language pair covered in Zhou et al. (2008)) is hardcoded in the software, which therefore cannot be easily adapted to other language pairs. In addition, the licence with which Woodpecker is distributed (MSR-LA) is quite restrictive, in that e.g. researchers cannot publicly release their own adaptations of the tool. DELiC4MT (Toral et al., 2012) is a free open-source tool for diagnostic evaluation which offers similar functionality to Woodpecker.
We chose to carry out experiments with DELiC4MT due to its language-independent nature. This recall-based diagnostic evaluation metric essentially works like other n-gram-based automatic MT evaluation metrics (i.e. counting n-gram matches between the MT output and the reference translations), except that it focuses on specific segments of the reference, identified through linguistic constructs found in the source (i.e. linguistic checkpoints) and word alignment.
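This checkpoint-focused recall, including the length-based penalty defined in equations (1) and (2) below, can be sketched as follows. This is an illustrative sketch with hypothetical helper names, not the actual DELiC4MT implementation:

```python
def delic4mt_score(checkpoints, cand_len, ref_len):
    """Checkpoint-focused n-gram recall with a length-based penalty.

    `checkpoints` is a list of (matched, total) n-gram counts, one pair
    per checkpoint instance; `cand_len` and `ref_len` are the average
    candidate and reference translation lengths.  Hypothetical helper,
    not the actual DELiC4MT code.
    """
    matched = sum(m for m, _ in checkpoints)
    total = sum(t for _, t in checkpoints)
    # Penalise candidates longer than the reference; otherwise no penalty.
    penalty = ref_len / cand_len if cand_len > ref_len else 1.0
    return (matched / total) * penalty if total else 0.0

# Example: 7 of 10 checkpoint n-grams found in a slightly longer candidate.
score = delic4mt_score([(3, 4), (4, 6)], cand_len=22.0, ref_len=20.0)
print(round(score, 4))  # 0.6364
```

The penalty only fires when the candidate is longer than the reference, mirroring the asymmetric definition in equation (2).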

The final recall score produced by DELiC4MT is computed as in equation (1), where R is the set of references (r) of all the checkpoints (c) in C. A length-based penalty, defined in equation (2), is introduced to penalise longer candidate translations (otherwise longer translations would have a better chance of returning higher scores); length(c) is the average candidate translation length and length(r) is the average reference translation length.

  R(C) = [ Σ_{r∈R} Σ_{n-gram∈r} match(n-gram) / Σ_{r∈R} Σ_{n-gram∈r} count(n-gram) ] × penalty    (1)

  penalty = length(r) / length(c)  if length(c) > length(r);  1 otherwise    (2)

3 Datasets

The initial key decisions that had to be made to set up the experiment concerned the languages to be focused on, as well as the domain and specific dataset to be selected for the investigation. We decided to work on English-French, as human-aligned datasets are readily available for this language pair. We investigated a number of options in terms of manually annotated aligned English-French data to serve as gold standard and considered, for example, using Biblical texts made available as part of the Blinker Annotation Project (Melamed, 1998). However, the syntax and vocabulary of this dataset presented some specific features which were not in line with actual uses envisaged for diagnostic evaluation in research or industrial settings. The dataset that was chosen for our experiment was initially created for the shared task on word alignment held as part of the HLT/NAACL 2003 Workshop on Building and Using Parallel Texts (Mihalcea and Pedersen, 2003). It consists of 447 English-French word-aligned sentence pairs drawn from the Canadian Hansard corpus of parliamentary debates (Och and Ney, 2000), for a total of 7,020 tokens in English and 7,761 in French. It should be noted that we did not differentiate between sure and probable word alignments in this dataset and treated them as having the same weight.
Choosing a bilingual dataset from the domain of parliamentary speeches allowed us to conduct a fair and direct comparison with a closely related baseline English-French MT system built using the Europarl corpus (Koehn, 2005).

4 Experimental Setup

4.1 MT

We experimented with three MT systems: Google Translate, Systran and a baseline Moses system. Google Translate and Moses are statistical MT systems, while Systran is predominantly a rule-based system. The Moses system used for our experiments was trained on 3.6 million English-French sentence pairs taken from Europarl, the News Commentary corpus and a randomly selected section of the UN corpus. The system was tuned on a held-out development set consisting of 1,025 sentence pairs and used a 5-gram language model built using the SRILM toolkit (Stolcke, 2002).

4.2 Word Alignment

The diagnostic evaluation was carried out using both gold-standard human alignments and three sets of automatic alignments, i.e. four different sets of word alignments in total. The idea behind this study was primarily to show whether the different possible alignments had an impact on the effectiveness of the diagnostic MT evaluation metric, also in comparison with gold-standard manual alignment and human evaluation. We used GIZA++ (Och and Ney, 2003) to derive the automatic alignments between the source and target sides of the testset. We extracted three sets of alignments using the union, intersection and grow-diag-final heuristics, as implemented by the Moses training scripts. Since the testset is far too small to be accurately word-aligned using a statistical word aligner and would suffer from data sparseness, additional parallel training data from the Europarl corpus was used. The additional training data was first tokenised, filtered (using the source-target length ratio) and lower-cased. The testset was also subjected to tokenisation and lower-casing.
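The union and intersection symmetrisation heuristics combine the two directed alignments GIZA++ produces. A minimal sketch of these two heuristics (grow-diag-final additionally grows the intersection with neighbouring union links, which is omitted here):

```python
def symmetrise(src2tgt, tgt2src, method="intersection"):
    """Combine two directed word alignments into one symmetric set.

    `src2tgt` holds (src_index, tgt_index) links; `tgt2src` holds
    (tgt_index, src_index) links and is flipped first.  Sketch of the
    union/intersection heuristics only, not the Moses implementation.
    """
    flipped = {(i, j) for (j, i) in tgt2src}
    if method == "union":
        return src2tgt | flipped
    return src2tgt & flipped

forward = {(0, 0), (1, 2), (2, 1)}
backward = {(0, 0), (2, 1)}  # (tgt, src) pairs, i.e. links 0-0 and 1-2
print(sorted(symmetrise(forward, backward)))           # [(0, 0), (1, 2)]
print(sorted(symmetrise(forward, backward, "union")))  # [(0, 0), (1, 2), (2, 1)]
```

Intersection yields high-precision, sparser alignments; union yields high-recall, denser ones, which is why the two can affect a recall-based diagnostic metric differently.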
The testset was then appended with

the additional training data and word-aligned using GIZA++. Finally, from the word-alignment file, only the alignments for the sentences corresponding to the testset were extracted.

4.3 Linguistic Checkpoints

Regarding the linguistic phenomena, we considered a basic set of PoS-based checkpoints: adjectives (a), nouns (n), verbs (v), adverbs (r), determiners (dt), miscellaneous (misc) and pronouns (pro). The misc checkpoint contains a variety of other PoS tags (CC, IN, RP and TO) (Santorini, 1990). We used TreeTagger (Schmid, 1994) to PoS-tag both sides of the testset. It should be noted that the evaluation framework can potentially focus on more complex user-defined linguistic phenomena. In fact, it can be applied to a wide range of composite linguistic structures of interest to the MT developer or user for evaluation purposes. The metric can handle, e.g., combinations of literal words or lemmas with PoS tags. Evaluation on named entities and dependency structures is also supported by this diagnostic MT evaluation metric.

4.4 Human Judgements

In order to verify the results of the diagnostic evaluation, we carried out human evaluations on the output of the three MT systems. These were done by two evaluators, both native French speakers experienced in translation evaluation. They were asked to assign fluency and adequacy scores to the translations based on a discrete 5-point scale (LDC, 2005). In addition, they were asked to evaluate translation quality in terms of five PoS-based checkpoints (a, n, v, r and dt), again using a 5-point scale, with 1 representing instances where there were severe errors in the translation of all instances of the checkpoint and 5 indicating that all instances were translated perfectly. The evaluators were also asked to give a does-not-apply ("NA") score to sentences that did not contain the linguistic phenomenon under consideration.
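The PoS-based checkpoint extraction of Section 4.3 amounts to selecting source tokens whose tag falls in a checkpoint's tag set. The sketch below assumes a Penn Treebank-style prefix mapping; the misc grouping (CC, IN, RP, TO) is from the paper, while the other tag prefixes are plausible assumptions rather than the paper's exact configuration:

```python
# Coarse checkpoints mapped to Penn Treebank tag prefixes.  The misc
# checkpoint groups CC, IN, RP and TO as in Section 4.3; the remaining
# mappings are illustrative assumptions, not DELiC4MT's actual config.
CHECKPOINTS = {
    "a": ("JJ",), "n": ("NN",), "v": ("VB",), "r": ("RB",),
    "dt": ("DT",), "pro": ("PRP", "WP"), "misc": ("CC", "IN", "RP", "TO"),
}

def extract_instances(tagged, checkpoint):
    """Return (position, token) pairs whose PoS tag matches a checkpoint.

    `tagged` is a list of (token, tag) pairs, e.g. TreeTagger output.
    """
    prefixes = CHECKPOINTS[checkpoint]
    return [(i, tok) for i, (tok, tag) in enumerate(tagged)
            if tag.startswith(prefixes)]

sent = [("the", "DT"), ("cat", "NN"), ("sat", "VBD"), ("down", "RP")]
print(extract_instances(sent, "n"))     # [(1, 'cat')]
print(extract_instances(sent, "misc"))  # [(3, 'down')]
```

Prefix matching lets one coarse checkpoint cover all inflected tag variants (e.g. VB, VBD, VBZ all count as verbs).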
5 Results

5.1 Diagnostic Evaluation

Table 1 shows the diagnostic evaluation results obtained on the gold-standard word alignment. Tables 2, 3 and 4 present the results obtained on the grow-diag-final, union and intersection alignments, respectively. Each of these tables shows checkpoint-specific scores across systems. Table 1 additionally shows the number of checkpoint-specific instances (#Inst) extracted from the source side of the testset. Checkpoint-specific statistically significant improvements are reported in these tables as superscripts; we use a, b and c for Google, Moses and Systran, respectively. For example, the superscript b,c on Google's score for the adjective checkpoint in Table 1 means that the improvement provided by Google for this checkpoint is statistically significant over both Moses and Systran. In addition to the checkpoint-specific scores, each of these tables provides an arithmetic mean (avg) and a weighted mean (w-avg, weighted by the number of instances for each checkpoint). The weighted average is considered as the system-level score for diagnostic evaluation. Tables 2, 3 and 4 also show the ratios with respect to manual alignment (m-ratio): for each system, the m-ratio row gives the ratio between the weighted mean obtained on the automatic alignments and that obtained on the manual alignments.

Table 1: Diagnostic evaluation results on manual alignments

As the scores in Table 1 suggest, Google clearly outperforms the other systems on all of the phenomena, and most of these improvements are statistically significant. The Moses baseline system performs slightly better than Systran according to the weighted averages. While some of the phenomena (e.g., nouns, verbs) are better handled by

the Moses baseline system, the scores in Table 1 also show that Systran performs considerably better than this baseline for adverbs, determiners and pronouns. This trend can be observed across Tables 2, 3 and 4 as well.

Table 2: Diagnostic evaluation results on grow-diag-final alignments

Table 3: Diagnostic evaluation results on union alignments

5.2 Automatic Metrics

We also evaluated the performance of the MT systems using a set of state-of-the-art automatic evaluation metrics: BLEU (Papineni et al., 2002), NIST (Doddington, 2002), METEOR (Banerjee and Lavie, 2005) and TER (Snover et al., 2006). Table 5 presents the system-level evaluation results for the different types of metrics considered (automatic, diagnostic and human judgements). For diagnostic evaluation it reports the weighted averages (see w-avg in Tables 1, 2, 3 and 4). According to BLEU, NIST and METEOR, Google is the best system, followed by Moses and Systran, while TER ranks Systran over Moses. Diagnostic evaluation on gold-standard alignment also yields the same ranking as BLEU, NIST and METEOR. More importantly for this work, the use of any automatically derived word alignments (i.e., grow-diag-final, union or intersection) in diagnostic evaluation replicates the same ranking obtained with gold-standard alignments.

Table 4: Diagnostic evaluation results on intersection alignments

Table 5: System-level evaluation results

5.3 Human Judgements

Tables 6 and 7 present the results of human evaluation of the MT systems. The mean of adequacy (adq) and fluency (fln) is considered as the overall human judgement score.
According to both evaluators, at the system level, Google is the best system, followed by Systran and Moses. As far as fine-grained checkpoint-specific

human judgements are concerned, both evaluators agree that Google handles all of the linguistic phenomena better than the other two systems, as is also revealed by the diagnostic evaluation. According to evaluator 1, Moses translates nouns better than Systran does, while Systran does well on adjectives, verbs, adverbs and determiners. Diagnostic evaluation matches almost perfectly with the fine-grained checkpoint-specific human judgements obtained from evaluator 1, except for the translation of verbs by Moses and Systran. According to evaluator 2, however, Moses only translates determiners better than Systran, while Systran does better on the rest of the checkpoints.

Table 6: Human judgements of evaluator 1

Table 7: Human judgements of evaluator 2

Table 8 presents Pearson's correlations for checkpoint-specific evaluation across systems. It shows that checkpoint-specific diagnostic evaluation using either manual or automatic alignments correlates well with checkpoint-specific human judgements in general. However, the correlation is very poor in the case of verbs, as both human evaluators preferred Systran over Moses, while diagnostic evaluation (even with gold-standard alignments) ranked Moses over Systran for this checkpoint. We manually inspected the outputs of Moses and Systran for those sentences for which diagnostic evaluation contradicted human evaluation on the verb checkpoint. We found that in some cases the problem was due to the failure of DELiC4MT to consider synonyms. Most existing automatic evaluation metrics (except METEOR) also suffer from this problem. The availability of multiple reference translations can circumvent this problem, and DELiC4MT supports evaluation with multiple references.
It should also be noted that, since the scoring of DELiC4MT is n-gram-based, the metric might be slightly biased toward SMT systems (Callison-Burch et al., 2006). The checkpoint-specific inter-annotator agreements (Fleiss' Kappa) between the two annotators were 0.32 (adjectives), 0.13 (adverbs), 0.12 (determiners), 0.24 (nouns) and 0.29 (verbs). This somewhat low agreement may be due to the fact that, although the evaluators are experienced in translation evaluation in terms of adequacy and fluency, they had never performed diagnostic evaluation of this sort. It can be noticed from Tables 6 and 7 that evaluator 2 consistently gives higher scores for adequacy and fluency than evaluator 1 across systems; these scores nevertheless correlate perfectly (cf. Table 9). A limitation of the current study is the low number of human annotators, as having more of them would likely yield more stable results.

Finally, Table 9 presents the Pearson's correlation coefficients between the system-level scores across systems. As can be seen from this table, system-level diagnostic evaluation scores obtained on automatically derived word alignments correlate very highly with those obtained on the gold-standard alignment. In fact, diagnostic evaluation using grow-diag-final and union alignments (as opposed to using manual alignments) was found to correlate better with human judgements, while the use of intersection alignments produced better correlations with the majority of the automatic MT evaluation metrics. This indicates that using automatic word alignments is sufficient for carrying out diagnostic evaluation. Diagnostic evaluation correlates well with all automatic evaluation scores (including TER which, being an error metric, shows a strong negative or inverse association) as well as with human judgements, indicating that this type of evaluation is accurate at predicting true system quality.
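The Pearson correlations reported in Tables 8 and 9 can be computed with a short self-contained routine (an illustrative sketch; the paper does not describe its implementation):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson's r between two parallel score vectors, e.g. one metric's
    system-level scores against human judgements across systems."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy scores for three systems: identical rankings give r close to 1;
# an error metric such as TER moves opposite to quality, so r is negative.
quality = [0.30, 0.25, 0.20]
print(round(pearson(quality, [4.1, 3.8, 3.5]), 2))     # 1.0
print(round(pearson(quality, [0.55, 0.60, 0.65]), 2))  # -1.0
```

With only three systems per comparison, as here, Pearson's r is very coarse: any metric that reproduces the same ranking with roughly linear spacing will score close to ±1.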

Table 8: Pearson's correlation for checkpoint-specific evaluation across systems

Table 9: Pearson's correlation between the system-level scores

6 Conclusions and Future Work

This paper has evaluated a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. This has been done by means of a case study for the English-French language direction. As this metric is dependent on word alignments, one of the objectives was to find the margin of error in diagnostic evaluation using automatic word alignments as opposed to using gold-standard manual alignments. In order to determine that, we carried out diagnostic evaluation using manual alignments as well as a set of commonly used automatic alignments (grow-diag-final, union and intersection). In addition, we also calculated the correlation with several state-of-the-art automatic MT evaluation metrics as well as with human judgements. From the experimental results we found that automatically derived word alignments can be considered as effective as gold-standard alignments when carrying out diagnostic evaluation. We also observed that diagnostic evaluation can accurately capture translation quality and, in general, correlates well both with system-level automatic evaluation metrics and with human judgements. As an extension to this work, we would like to explore the impact of different automatic aligners on the results of diagnostic evaluation of MT.
Also, the low correlation with human judgements obtained for verbs requires a deeper analysis of this linguistic phenomenon and of how it is treated by the diagnostic metric, which we plan to explore in further detail.

Acknowledgments

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7) under grant agreements FP7-ICT (CoSyne) and PIAP-GA (Abu-MaTran), and through Science Foundation Ireland as part of the CNGL (grant 07/CE/I1142).

References

Banerjee, S. and Lavie, A. (2005). METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the ACL 2005 Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization.

Callison-Burch, C., Osborne, M. and Koehn, P. (2006). Re-evaluating the Role of Bleu in Machine Translation Research. In Proceedings of EACL 2006, 11th Conference of the European Chapter of the Association for Computational Linguistics.

Doddington, G. (2002). Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research (HLT '02).

Fishel, M., Bojar, O., and Popović, M. (2012). Terra: a collection of translation error-annotated corpora. In Proceedings of the 8th International Conference on Language Resources and Evaluation.

Koehn, P. (2005). Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the Tenth Machine Translation Summit.

LDC (2005). Linguistic data annotation specification: Assessment of fluency and adequacy in translations. Technical report. Revision 1.5.

Max, A., Crego, J. M., and Yvon, F. (2010). Contrastive lexical evaluation of machine translation. In Proceedings of the 7th International Conference on Language Resources and Evaluation.

Melamed, I. D. (1998). Manual annotation of translational equivalence: The Blinker project. CoRR, cmp-lg/.

Mihalcea, R. and Pedersen, T. (2003). An evaluation exercise for word alignment. In Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond.

Och, F. J. and Ney, H. (2000). A comparison of alignment models for statistical machine translation. In Proceedings of the 18th Conference on Computational Linguistics - Volume 2.

Och, F. J. and Ney, H. (2003). A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1).

Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.

Popović, M. (2011). Hjerson: An open source tool for automatic error classification of machine translation output. The Prague Bulletin of Mathematical Linguistics, 96.

Popović, M. and Burchardt, A. (2011). From human to automatic error classification for machine translation output. In Proceedings of the 15th Annual Conference of the European Association for Machine Translation.

Popović, M., Ney, H., de Gispert, A., Mariño, J. B., Gupta, D., Federico, M., Lambert, P., and Banchs, R. (2006). Morpho-syntactic information for automatic error analysis of statistical machine translation output. In Proceedings of the Workshop on Statistical Machine Translation, pages 1-6.

Santorini, B. (1990). Part-of-speech tagging guidelines for the Penn Treebank Project. Technical report, Department of Computer and Information Science, University of Pennsylvania. (3rd revision, 2nd printing).

Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing.

Snover, M., Dorr, B., Schwartz, R., Micciulla, L., and Makhoul, J. (2006). A study of translation edit rate with targeted human annotation. In Proceedings of the Association for Machine Translation in the Americas.

Stolcke, A. (2002). SRILM - an extensible language modeling toolkit. In Proceedings of the International Conference on Spoken Language Processing.

Toral, A., Naskar, S. K., Gaspari, F., and Groves, D. (2012). DELiC4MT: A Tool for Diagnostic MT Evaluation over User-defined Linguistic Phenomena. The Prague Bulletin of Mathematical Linguistics, 98.

Zhou, M., Wang, B., Liu, S., Li, M., Zhang, D., and Zhao, T. (2008). Diagnostic evaluation of machine translation systems using automatically constructed linguistic check-points. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1.


More information

Polish - English Statistical Machine Translation of Medical Texts.

Polish - English Statistical Machine Translation of Medical Texts. Polish - English Statistical Machine Translation of Medical Texts. Krzysztof Wołk, Krzysztof Marasek Department of Multimedia Polish Japanese Institute of Information Technology kwolk@pjwstk.edu.pl Abstract.

More information

On the practice of error analysis for machine translation evaluation

On the practice of error analysis for machine translation evaluation On the practice of error analysis for machine translation evaluation Sara Stymne, Lars Ahrenberg Linköping University Linköping, Sweden {sara.stymne,lars.ahrenberg}@liu.se Abstract Error analysis is a

More information

Enriching Morphologically Poor Languages for Statistical Machine Translation

Enriching Morphologically Poor Languages for Statistical Machine Translation Enriching Morphologically Poor Languages for Statistical Machine Translation Eleftherios Avramidis e.avramidis@sms.ed.ac.uk Philipp Koehn pkoehn@inf.ed.ac.uk School of Informatics University of Edinburgh

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language

More information

An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation

An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation Robert C. Moore Chris Quirk Microsoft Research Redmond, WA 98052, USA {bobmoore,chrisq}@microsoft.com

More information

Adaptation to Hungarian, Swedish, and Spanish

Adaptation to Hungarian, Swedish, and Spanish www.kconnect.eu Adaptation to Hungarian, Swedish, and Spanish Deliverable number D1.4 Dissemination level Public Delivery date 31 January 2016 Status Author(s) Final Jindřich Libovický, Aleš Tamchyna,

More information

The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation

The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation Vladimir Eidelman, Chris Dyer, and Philip Resnik UMIACS Laboratory for Computational Linguistics

More information

Collaborative Machine Translation Service for Scientific texts

Collaborative Machine Translation Service for Scientific texts Collaborative Machine Translation Service for Scientific texts Patrik Lambert patrik.lambert@lium.univ-lemans.fr Jean Senellart Systran SA senellart@systran.fr Laurent Romary Humboldt Universität Berlin

More information

Factored Translation Models

Factored Translation Models Factored Translation s Philipp Koehn and Hieu Hoang pkoehn@inf.ed.ac.uk, H.Hoang@sms.ed.ac.uk School of Informatics University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW Scotland, United Kingdom

More information

Parmesan: Meteor without Paraphrases with Paraphrased References

Parmesan: Meteor without Paraphrases with Paraphrased References Parmesan: Meteor without Paraphrases with Paraphrased References Petra Barančíková Institute of Formal and Applied Linguistics Charles University in Prague, Faculty of Mathematics and Physics Malostranské

More information

7-2 Speech-to-Speech Translation System Field Experiments in All Over Japan

7-2 Speech-to-Speech Translation System Field Experiments in All Over Japan 7-2 Speech-to-Speech Translation System Field Experiments in All Over Japan We explain field experiments conducted during the 2009 fiscal year in five areas of Japan. We also show the experiments of evaluation

More information

The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006

The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 Yidong Chen, Xiaodong Shi Institute of Artificial Intelligence Xiamen University P. R. China November 28, 2006 - Kyoto 13:46 1

More information

Factored bilingual n-gram language models for statistical machine translation

Factored bilingual n-gram language models for statistical machine translation Mach Translat DOI 10.1007/s10590-010-9082-5 Factored bilingual n-gram language models for statistical machine translation Josep M. Crego François Yvon Received: 2 November 2009 / Accepted: 12 June 2010

More information

Adapting General Models to Novel Project Ideas

Adapting General Models to Novel Project Ideas The KIT Translation Systems for IWSLT 2013 Thanh-Le Ha, Teresa Herrmann, Jan Niehues, Mohammed Mediani, Eunah Cho, Yuqi Zhang, Isabel Slawik and Alex Waibel Institute for Anthropomatics KIT - Karlsruhe

More information

Why Evaluation? Machine Translation. Evaluation. Evaluation Metrics. Ten Translations of a Chinese Sentence. How good is a given system?

Why Evaluation? Machine Translation. Evaluation. Evaluation Metrics. Ten Translations of a Chinese Sentence. How good is a given system? Why Evaluation? How good is a given system? Machine Translation Evaluation Which one is the best system for our purpose? How much did we improve our system? How can we tune our system to become better?

More information

Statistical Machine Translation: IBM Models 1 and 2

Statistical Machine Translation: IBM Models 1 and 2 Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation

More information

Machine Translation. Why Evaluation? Evaluation. Ten Translations of a Chinese Sentence. Evaluation Metrics. But MT evaluation is a di cult problem!

Machine Translation. Why Evaluation? Evaluation. Ten Translations of a Chinese Sentence. Evaluation Metrics. But MT evaluation is a di cult problem! Why Evaluation? How good is a given system? Which one is the best system for our purpose? How much did we improve our system? How can we tune our system to become better? But MT evaluation is a di cult

More information

Quantifying the Influence of MT Output in the Translators Performance: A Case Study in Technical Translation

Quantifying the Influence of MT Output in the Translators Performance: A Case Study in Technical Translation Quantifying the Influence of MT Output in the Translators Performance: A Case Study in Technical Translation Marcos Zampieri Saarland University Saarbrücken, Germany mzampier@uni-koeln.de Mihaela Vela

More information

BEER 1.1: ILLC UvA submission to metrics and tuning task

BEER 1.1: ILLC UvA submission to metrics and tuning task BEER 1.1: ILLC UvA submission to metrics and tuning task Miloš Stanojević ILLC University of Amsterdam m.stanojevic@uva.nl Khalil Sima an ILLC University of Amsterdam k.simaan@uva.nl Abstract We describe

More information

Choosing the best machine translation system to translate a sentence by using only source-language information*

Choosing the best machine translation system to translate a sentence by using only source-language information* Choosing the best machine translation system to translate a sentence by using only source-language information* Felipe Sánchez-Martínez Dep. de Llenguatges i Sistemes Informàtics Universitat d Alacant

More information

Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines

Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines , 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing

More information

Towards Using Web-Crawled Data for Domain Adaptation in Statistical Machine Translation

Towards Using Web-Crawled Data for Domain Adaptation in Statistical Machine Translation Towards Using Web-Crawled Data for Domain Adaptation in Statistical Machine Translation Pavel Pecina, Antonio Toral, Andy Way School of Computing Dublin City Universiy Dublin 9, Ireland {ppecina,atoral,away}@computing.dcu.ie

More information

Machine Translation. Agenda

Machine Translation. Agenda Agenda Introduction to Machine Translation Data-driven statistical machine translation Translation models Parallel corpora Document-, sentence-, word-alignment Phrase-based translation MT decoding algorithm

More information

PoS-tagging Italian texts with CORISTagger

PoS-tagging Italian texts with CORISTagger PoS-tagging Italian texts with CORISTagger Fabio Tamburini DSLO, University of Bologna, Italy fabio.tamburini@unibo.it Abstract. This paper presents an evolution of CORISTagger [1], an high-performance

More information

Adaptive Development Data Selection for Log-linear Model in Statistical Machine Translation

Adaptive Development Data Selection for Log-linear Model in Statistical Machine Translation Adaptive Development Data Selection for Log-linear Model in Statistical Machine Translation Mu Li Microsoft Research Asia muli@microsoft.com Dongdong Zhang Microsoft Research Asia dozhang@microsoft.com

More information

The CMU-Avenue French-English Translation System

The CMU-Avenue French-English Translation System The CMU-Avenue French-English Translation System Michael Denkowski Greg Hanneman Alon Lavie Language Technologies Institute Carnegie Mellon University Pittsburgh, PA, 15213, USA {mdenkows,ghannema,alavie}@cs.cmu.edu

More information

Experiments in morphosyntactic processing for translating to and from German

Experiments in morphosyntactic processing for translating to and from German Experiments in morphosyntactic processing for translating to and from German Alexander Fraser Institute for Natural Language Processing University of Stuttgart fraser@ims.uni-stuttgart.de Abstract We describe

More information

Building task-oriented machine translation systems

Building task-oriented machine translation systems Building task-oriented machine translation systems Germán Sanchis-Trilles Advisor: Francisco Casacuberta Pattern Recognition and Human Language Technologies Group Departamento de Sistemas Informáticos

More information

The noisier channel : translation from morphologically complex languages

The noisier channel : translation from morphologically complex languages The noisier channel : translation from morphologically complex languages Christopher J. Dyer Department of Linguistics University of Maryland College Park, MD 20742 redpony@umd.edu Abstract This paper

More information

a Chinese-to-Spanish rule-based machine translation

a Chinese-to-Spanish rule-based machine translation Chinese-to-Spanish rule-based machine translation system Jordi Centelles 1 and Marta R. Costa-jussà 2 1 Centre de Tecnologies i Aplicacions del llenguatge i la Parla (TALP), Universitat Politècnica de

More information

Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study

Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study Proceedings of the 16th EAMT Conference, 28-30 May 2012, Trento, Italy Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study Pavel Pecina 1, Antonio Toral 2, Vassilis

More information

CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation.

CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation. CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation. Miguel Ruiz, Anne Diekema, Páraic Sheridan MNIS-TextWise Labs Dey Centennial Plaza 401 South Salina Street Syracuse, NY 13202 Abstract:

More information

Towards a General and Extensible Phrase-Extraction Algorithm

Towards a General and Extensible Phrase-Extraction Algorithm Towards a General and Extensible Phrase-Extraction Algorithm Wang Ling, Tiago Luís, João Graça, Luísa Coheur and Isabel Trancoso L 2 F Spoken Systems Lab INESC-ID Lisboa {wang.ling,tiago.luis,joao.graca,luisa.coheur,imt}@l2f.inesc-id.pt

More information

Report on the embedding and evaluation of the second MT pilot

Report on the embedding and evaluation of the second MT pilot Report on the embedding and evaluation of the second MT pilot quality translation by deep language engineering approaches DELIVERABLE D3.10 VERSION 1.6 2015-11-02 P2 QTLeap Machine translation is a computational

More information

UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation

UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation Liang Tian 1, Derek F. Wong 1, Lidia S. Chao 1, Paulo Quaresma 2,3, Francisco Oliveira 1, Yi Lu 1, Shuo Li 1, Yiming

More information

Parallel FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems

Parallel FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems Parallel FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems Ergun Biçici Qun Liu Centre for Next Generation Localisation Centre for Next Generation Localisation School of Computing

More information

Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features

Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features , pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of

More information

D4.3: TRANSLATION PROJECT- LEVEL EVALUATION

D4.3: TRANSLATION PROJECT- LEVEL EVALUATION : TRANSLATION PROJECT- LEVEL EVALUATION Jinhua Du, Joss Moorkens, Ankit Srivastava, Mikołaj Lauer, Andy Way, Alfredo Maldonado, David Lewis Distribution: Public Federated Active Linguistic data CuratiON

More information

tance alignment and time information to create confusion networks 1 from the output of different ASR systems for the same

tance alignment and time information to create confusion networks 1 from the output of different ASR systems for the same 1222 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 7, SEPTEMBER 2008 System Combination for Machine Translation of Spoken and Written Language Evgeny Matusov, Student Member,

More information

A Joint Sequence Translation Model with Integrated Reordering

A Joint Sequence Translation Model with Integrated Reordering A Joint Sequence Translation Model with Integrated Reordering Nadir Durrani, Helmut Schmid and Alexander Fraser Institute for Natural Language Processing University of Stuttgart Introduction Generation

More information

Learning Translation Rules from Bilingual English Filipino Corpus

Learning Translation Rules from Bilingual English Filipino Corpus Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,

More information

Customizing an English-Korean Machine Translation System for Patent Translation *

Customizing an English-Korean Machine Translation System for Patent Translation * Customizing an English-Korean Machine Translation System for Patent Translation * Sung-Kwon Choi, Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research Institute,

More information

Training and evaluation of POS taggers on the French MULTITAG corpus

Training and evaluation of POS taggers on the French MULTITAG corpus Training and evaluation of POS taggers on the French MULTITAG corpus A. Allauzen, H. Bonneau-Maynard LIMSI/CNRS; Univ Paris-Sud, Orsay, F-91405 {allauzen,maynard}@limsi.fr Abstract The explicit introduction

More information

Chinese-Japanese Machine Translation Exploiting Chinese Characters

Chinese-Japanese Machine Translation Exploiting Chinese Characters Chinese-Japanese Machine Translation Exploiting Chinese Characters CHENHUI CHU, TOSHIAKI NAKAZAWA, DAISUKE KAWAHARA, and SADAO KUROHASHI, Kyoto University The Chinese and Japanese languages share Chinese

More information

Applying Statistical Post-Editing to. English-to-Korean Rule-based Machine Translation System

Applying Statistical Post-Editing to. English-to-Korean Rule-based Machine Translation System Applying Statistical Post-Editing to English-to-Korean Rule-based Machine Translation System Ki-Young Lee and Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research

More information

T U R K A L A T O R 1

T U R K A L A T O R 1 T U R K A L A T O R 1 A Suite of Tools for Augmenting English-to-Turkish Statistical Machine Translation by Gorkem Ozbek [gorkem@stanford.edu] Siddharth Jonathan [jonsid@stanford.edu] CS224N: Natural Language

More information

Two Phase Evaluation for Selecting Machine Translation Services

Two Phase Evaluation for Selecting Machine Translation Services Two Phase Evaluation for Selecting Machine Translation Services Chunqi Shi, Donghui Lin, Masahiko Shimada, Toru Ishida Department of Social Informatics, Kyoto University Yoshida-Honmachi,Sakyo-Ku, Kyoto,

More information

Factored Markov Translation with Robust Modeling

Factored Markov Translation with Robust Modeling Factored Markov Translation with Robust Modeling Yang Feng Trevor Cohn Xinkai Du Information Sciences Institue Computing and Information Systems Computer Science Department The University of Melbourne

More information

An Approach to Handle Idioms and Phrasal Verbs in English-Tamil Machine Translation System

An Approach to Handle Idioms and Phrasal Verbs in English-Tamil Machine Translation System An Approach to Handle Idioms and Phrasal Verbs in English-Tamil Machine Translation System Thiruumeni P G, Anand Kumar M Computational Engineering & Networking, Amrita Vishwa Vidyapeetham, Coimbatore,

More information

31 Case Studies: Java Natural Language Tools Available on the Web

31 Case Studies: Java Natural Language Tools Available on the Web 31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software

More information

The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems

The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems Dr. Ananthi Sheshasaayee 1, Angela Deepa. V.R 2 1 Research Supervisior, Department of Computer Science & Application,

More information

UEdin: Translating L1 Phrases in L2 Context using Context-Sensitive SMT

UEdin: Translating L1 Phrases in L2 Context using Context-Sensitive SMT UEdin: Translating L1 Phrases in L2 Context using Context-Sensitive SMT Eva Hasler ILCC, School of Informatics University of Edinburgh e.hasler@ed.ac.uk Abstract We describe our systems for the SemEval

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

An End-to-End Discriminative Approach to Machine Translation

An End-to-End Discriminative Approach to Machine Translation An End-to-End Discriminative Approach to Machine Translation Percy Liang Alexandre Bouchard-Côté Dan Klein Ben Taskar Computer Science Division, EECS Department University of California at Berkeley Berkeley,

More information

A Generate and Rank Approach to Sentence Paraphrasing

A Generate and Rank Approach to Sentence Paraphrasing A Generate and Rank Approach to Sentence Paraphrasing Prodromos Malakasiotis and Ion Androutsopoulos + Department of Informatics, Athens University of Economics and Business, Greece + Digital Curation

More information

LetsMT!: A Cloud-Based Platform for Do-It-Yourself Machine Translation

LetsMT!: A Cloud-Based Platform for Do-It-Yourself Machine Translation LetsMT!: A Cloud-Based Platform for Do-It-Yourself Machine Translation Andrejs Vasiļjevs Raivis Skadiņš Jörg Tiedemann TILDE TILDE Uppsala University Vienbas gatve 75a, Riga Vienbas gatve 75a, Riga Box

More information

Topics in Computational Linguistics. Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment

Topics in Computational Linguistics. Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment Topics in Computational Linguistics Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment Regina Barzilay and Lillian Lee Presented By: Mohammad Saif Department of Computer

More information

Selected Topics in Applied Machine Learning: An integrating view on data analysis and learning algorithms

Selected Topics in Applied Machine Learning: An integrating view on data analysis and learning algorithms Selected Topics in Applied Machine Learning: An integrating view on data analysis and learning algorithms ESSLLI 2015 Barcelona, Spain http://ufal.mff.cuni.cz/esslli2015 Barbora Hladká hladka@ufal.mff.cuni.cz

More information

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990

More information

ACCURAT Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation www.accurat-project.eu Project no.

ACCURAT Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation www.accurat-project.eu Project no. ACCURAT Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation www.accurat-project.eu Project no. 248347 Deliverable D5.4 Report on requirements, implementation

More information

A New Input Method for Human Translators: Integrating Machine Translation Effectively and Imperceptibly

A New Input Method for Human Translators: Integrating Machine Translation Effectively and Imperceptibly Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015) A New Input Method for Human Translators: Integrating Machine Translation Effectively and Imperceptibly

More information

Sentential Paraphrasing as Black-Box Machine Translation

Sentential Paraphrasing as Black-Box Machine Translation Sentential Paraphrasing as Black-Box Machine Translation Courtney Napoles 1, Chris Callison-Burch 2, and Matt Post 3 1 Center for Language and Speech Processing, Johns Hopkins University 2 Computer and

More information

PBML. logo. The Prague Bulletin of Mathematical Linguistics NUMBER??? JANUARY 2009 1 15. Grammar based statistical MT on Hadoop

PBML. logo. The Prague Bulletin of Mathematical Linguistics NUMBER??? JANUARY 2009 1 15. Grammar based statistical MT on Hadoop PBML logo The Prague Bulletin of Mathematical Linguistics NUMBER??? JANUARY 2009 1 15 Grammar based statistical MT on Hadoop An end-to-end toolkit for large scale PSCFG based MT Ashish Venugopal, Andreas

More information

Statistical Machine Translation for Automobile Marketing Texts

Statistical Machine Translation for Automobile Marketing Texts Statistical Machine Translation for Automobile Marketing Texts Samuel Läubli 1 Mark Fishel 1 1 Institute of Computational Linguistics University of Zurich Binzmühlestrasse 14 CH-8050 Zürich {laeubli,fishel,volk}

More information

Database Design For Corpus Storage: The ET10-63 Data Model

Database Design For Corpus Storage: The ET10-63 Data Model January 1993 Database Design For Corpus Storage: The ET10-63 Data Model Tony McEnery & Béatrice Daille I. General Presentation Within the ET10-63 project, a French-English bilingual corpus of about 2 million

More information

HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN

HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN Yu Chen, Andreas Eisele DFKI GmbH, Saarbrücken, Germany May 28, 2010 OUTLINE INTRODUCTION ARCHITECTURE EXPERIMENTS CONCLUSION SMT VS. RBMT [K.

More information

Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium

Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium Grégoire Détrez, Víctor M. Sánchez-Cartagena, Aarne Ranta gregoire.detrez@gu.se, vmsanchez@prompsit.com,

More information

Human in the Loop Machine Translation of Medical Terminology

Human in the Loop Machine Translation of Medical Terminology Human in the Loop Machine Translation of Medical Terminology by John J. Morgan ARL-MR-0743 April 2010 Approved for public release; distribution unlimited. NOTICES Disclaimers The findings in this report

More information

Visualizing Data Structures in Parsing-based Machine Translation. Jonathan Weese, Chris Callison-Burch

Visualizing Data Structures in Parsing-based Machine Translation. Jonathan Weese, Chris Callison-Burch The Prague Bulletin of Mathematical Linguistics NUMBER 93 JANUARY 2010 127 136 Visualizing Data Structures in Parsing-based Machine Translation Jonathan Weese, Chris Callison-Burch Abstract As machine

More information

ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking

ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking ANNLOR: A Naïve Notation-system for Lexical Outputs Ranking Anne-Laure Ligozat LIMSI-CNRS/ENSIIE rue John von Neumann 91400 Orsay, France annlor@limsi.fr Cyril Grouin LIMSI-CNRS rue John von Neumann 91400

More information

UNSUPERVISED MORPHOLOGICAL SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION

UNSUPERVISED MORPHOLOGICAL SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION UNSUPERVISED MORPHOLOGICAL SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION by Ann Clifton B.A., Reed College, 2001 a thesis submitted in partial fulfillment of the requirements for the degree of Master

More information

A Hybrid System for Patent Translation

A Hybrid System for Patent Translation Proceedings of the 16th EAMT Conference, 28-30 May 2012, Trento, Italy A Hybrid System for Patent Translation Ramona Enache Cristina España-Bonet Aarne Ranta Lluís Màrquez Dept. of Computer Science and

More information

An Empirical Study on Web Mining of Parallel Data

An Empirical Study on Web Mining of Parallel Data An Empirical Study on Web Mining of Parallel Data Gumwon Hong 1, Chi-Ho Li 2, Ming Zhou 2 and Hae-Chang Rim 1 1 Department of Computer Science & Engineering, Korea University {gwhong,rim}@nlp.korea.ac.kr

More information

Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation

Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation Nicola Bertoldi Mauro Cettolo Marcello Federico FBK - Fondazione Bruno Kessler via Sommarive 18 38123 Povo,

More information

Question Answering and Multilingual CLEF 2008
