TS3: an Improved Version of the Bilingual Concordancer TransSearch

Stéphane HUET, Julien BOURDAILLET and Philippe LANGLAIS
EAMT, Barcelona, June 14, 2009

Computer-assisted translation
- Preferred by professional translators
- Exploits a translation memory
- One of these tools: the bilingual concordancer
- Retrieves from a bitext the parts associated with a query
- Currently operates at the sentence level
- TransSearch: a web-based concordancer with 177,000 queries/month

Current version
[Screenshot of the TransSearch interface, annotated: highlighted query, sentence alignment]

Prototype version: TS3
[Screenshot of the TS3 interface, annotated: highlighted query, sentence alignment, translation of the query]

Prototype version: TS3
[Screenshot of the TS3 interface, annotated: several translations of the query, context of use]

Outline
- Spotting of the query translation
- Refinement of translation spotting
- Translation variants merging
- Corpora
- Experimental results

Translation spotting (or transpotting)
- Identification, in a sentence, of the translation of a query
- Example sentence pair:
    This is in keeping with that strategy.
    La présente mesure est conforme à cette stratégie.
- Query: in keeping with
- Transpot: conforme à

Word alignment
- Example sentence pair:
    This is in keeping with that strategy.
    La présente mesure est conforme à cette stratégie.
- Use of an IBM2 model
- Discontinuous transpots
- Not the best method to transpot

Transpotting algorithm
- Example sentence pair:
    This is in keeping with that strategy.
    La présente mesure est conforme à cette stratégie.
- Algorithm of [Simard 03]
- Contiguous transpots
- Best performance among several tested methods
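To make the contrast between discontinuous and contiguous transpots concrete, here is a minimal Python sketch. The alignment links are invented for illustration (they are not the output of an IBM2 model), and taking the covering span is only a naive way to force contiguity; it is not the algorithm of [Simard 03].

```python
def discontinuous_transpot(query_positions, links):
    """Target positions linked to any query word; the result may contain gaps."""
    return sorted({t for s, t in links if s in query_positions})


def covering_span(positions):
    """Smallest contiguous range of target positions that covers all of them."""
    if not positions:
        return []
    return list(range(min(positions), max(positions) + 1))


source = "This is in keeping with that strategy .".split()
target = "La présente mesure est conforme à cette stratégie .".split()

# Hypothetical (source position, target position) links, including one
# noisy link from "in" to "mesure".
links = [(0, 0), (1, 3), (2, 2), (3, 4), (4, 5), (5, 6), (6, 7)]

# Positions of the query "in keeping with" in the source sentence.
query_positions = {2, 3, 4}

gappy = discontinuous_transpot(query_positions, links)
span = covering_span(gappy)

print([target[i] for i in gappy])  # words linked to the query
print([target[i] for i in span])   # contiguous span covering them
```

With these made-up links the linked words are not adjacent, and the naive covering span also pulls in "est"; this is the kind of noise that motivates a dedicated contiguous transpotting method.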

The need for post-processing
- Query: in keeping with
- Proposed transpots: conforme à (45), conformément à (29), à (21), dans (20), conforme aux (18), de (14), conforme (13), conformément aux (13), conforme au (12), conformes à (11), d'actualité (1), gestes en (1), correspond à (1), respectent (1)

The need for post-processing
- Query: in keeping with
- Proposed transpots after filtering: conforme à (45), conformément à (29), à (21), dans (20), conforme aux (18), de (14), conforme (13), conformément aux (13), conforme au (12), conformes à (11), d'actualité (1), gestes en (1), correspond à (1), respectent (1)

The need for post-processing
- Query: in keeping with
- Proposed transpots after filtering and merging: conforme à (45), conformément à (29), conforme aux (18), conforme (13), conformément aux (13), conforme au (12), conformes à (11), correspond à (1), respectent (1)

Filtering bad transpots
- At the level of a pair of sentences
- Computation of 3 sets of features
  - Size of the transpot, size of the query
  - Statistical word alignment features: min and max likelihood, Viterbi scores...
  - Linguistic features: grammatical word ratio, article counts, preposition counts...
- Training of various classifiers
  - Voted-perceptron, SVM, decision tree, voting
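As an illustration of the size and linguistic features listed above, the following Python sketch computes a few of them for a (query, transpot) pair. The word lists are tiny and hypothetical, the feature names are invented for this example, and the threshold in `looks_bad` is an arbitrary assumption; the statistical word alignment features are left out because they depend on a trained alignment model.

```python
# Hypothetical, deliberately tiny French word lists (assumptions for this sketch).
ARTICLES = {"le", "la", "les", "l'", "un", "une"}
PREPOSITIONS = {"à", "de", "dans", "en", "pour", "sur", "avec"}
GRAMMATICAL = ARTICLES | PREPOSITIONS


def transpot_features(query, transpot):
    """Size and linguistic features for one (query, transpot) pair."""
    q_tokens = query.lower().split()
    t_tokens = transpot.lower().split()
    grammatical = sum(w in GRAMMATICAL for w in t_tokens)
    return {
        "query_size": len(q_tokens),
        "transpot_size": len(t_tokens),
        "grammatical_ratio": grammatical / len(t_tokens) if t_tokens else 0.0,
        "article_count": sum(w in ARTICLES for w in t_tokens),
        "preposition_count": sum(w in PREPOSITIONS for w in t_tokens),
    }


def looks_bad(features, threshold=0.75):
    """Single-feature baseline: flag transpots made mostly of grammatical words.

    The threshold value is an assumption, not a figure from the slides.
    """
    return features["grammatical_ratio"] > threshold


for candidate in ["conforme à", "à", "dans"]:
    feats = transpot_features("in keeping with", candidate)
    print(candidate, feats["grammatical_ratio"], looks_bad(feats))
```

In practice such a feature dictionary would be fed, together with the alignment scores, to one of the classifiers named on the slide rather than thresholded by hand.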

Merging translation variants
- At the level of the transpot list found for a query
- High complexity when building all possible clusters
- Neighbor-joining method of [Saitou and Nei 87]
  - Builds a distance matrix Q between all pairs
  - Is a greedy algorithm that at each step
    - Merges the two closest transpots
    - Updates Q
- Uses a word-based distance
  - Minimal cost between 2 inflected forms of a lemma
  - Edit costs smaller for grammatical words
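The merging step can be sketched in Python as follows. This is a toy illustration under stated assumptions: the grammatical-word list, the lemma table and every cost value are invented, and `nj_step` uses the textbook neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) with the usual distance update, which may differ in detail from what TS3 implements.

```python
from itertools import combinations

# Toy linguistic resources (assumptions for this sketch only).
GRAMMATICAL = {"à", "au", "aux", "de", "des", "le", "l'", "dans"}
LEMMA = {"aux": "au", "des": "de", "l'": "le"}


def indel_cost(word):
    """Inserting or deleting a grammatical word is cheaper."""
    return 0.5 if word in GRAMMATICAL else 1.0


def sub_cost(a, b):
    """Minimal cost between two inflected forms of the same lemma."""
    if a == b:
        return 0.0
    if LEMMA.get(a, a) == LEMMA.get(b, b):
        return 0.1
    if a in GRAMMATICAL and b in GRAMMATICAL:
        return 0.5
    return 1.0


def word_distance(x, y):
    """Word-level edit distance between two transpots (dynamic programming)."""
    a, b = x.split(), y.split()
    prev = [0.0]
    for wb in b:
        prev.append(prev[-1] + indel_cost(wb))
    for wa in a:
        cur = [prev[0] + indel_cost(wa)]
        for j, wb in enumerate(b, start=1):
            cur.append(min(prev[j] + indel_cost(wa),          # delete wa
                           cur[j - 1] + indel_cost(wb),       # insert wb
                           prev[j - 1] + sub_cost(wa, wb)))   # substitute
        prev = cur
    return prev[-1]


def nj_step(clusters, d):
    """One greedy step: merge the pair minimising Q, then update distances."""
    n = len(clusters)
    total = {c: sum(d[frozenset((c, o))] for o in clusters if o != c)
             for c in clusters}

    def q(pair):
        a, b = pair
        return (n - 2) * d[frozenset(pair)] - total[a] - total[b]

    a, b = min(combinations(clusters, 2), key=q)
    merged = a + b
    rest = [c for c in clusters if c not in (a, b)]
    for c in rest:
        d[frozenset((merged, c))] = 0.5 * (
            d[frozenset((a, c))] + d[frozenset((b, c))] - d[frozenset((a, b))])
    return (a, b), rest + [merged]


transpots = ["conforme au", "conforme aux", "correspondant au",
             "dans le sens de l'", "dans le sens des"]
clusters = [(t,) for t in transpots]  # each cluster is a tuple of transpots
d = {frozenset(p): word_distance(p[0][0], p[1][0])
     for p in combinations(clusters, 2)}

first_pair, clusters = nj_step(clusters, d)
print(first_pair)
```

Calling `nj_step` repeatedly on the returned clusters continues the greedy merging; where to stop, and how to read clusters of similar variants off the resulting tree, is a separate design decision not covered by this sketch.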

Example for the merging process
[Figure, built up over several slides: the transpots "conforme au", "conforme aux", "correspondant au", "dans le sens de l'" and "dans le sens des" are merged step by step. Final order: dans le sens des, dans le sens de l', correspondant au, conforme au, conforme aux]

Detection of similar variants
[Figure: the same transpots in the final order: dans le sens des, dans le sens de l', correspondant au, conforme au, conforme aux]

Corpus used in the experiments
[Diagram: 5,000 most frequent queries; Canadian Hansard (8.3 M pairs of sentences); retrieved pairs of sentences]

Reference corpus for filtering
- Annotation of 530 queries (23 translations per query)

Results for classification of transpots
- Trained on the annotated queries
- Tested by 10-fold cross-validation

System | Correct classification | F-measure for bad transpots
All good | 62 | 0
Grammatical ratio > [missing] | [missing] | [missing]
Best classifier | [missing] | [missing]

- Similar results for the 4 tested classifiers: voted-perceptron, SVM, decision stump, AdaBoost
- Most informative features: grammatical and statistical word alignment

Reference corpus for transpotting
[Diagram: retrieved pairs of sentences; bilingual lexicon; transpotted pairs of sentences]
- Reference = 1.4 M pairs of sentences

Metrics for transpotting
- Precision 2/4
    Je crois qu'il est tout à fait conforme à l'esprit du projet de loi.
    [Figure: suggested transpot and reference marked in the sentence]
- Recall 1/2
    Cela n'est pas conforme aux normes des Nations Unies.
    [Figure: suggested transpot and reference marked in the sentence]
- Averaged for each query, then averaged on the overall corpus
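A minimal Python sketch of these two metrics, computed over sets of word positions. The slides do not say exactly which words were marked as suggested transpot and reference, so the position sets below are assumptions chosen only to reproduce the 2/4 and 1/2 figures.

```python
def precision_recall(suggested, reference):
    """Word-level precision and recall of a suggested transpot.

    Both arguments are collections of token positions in the target sentence.
    """
    suggested, reference = set(suggested), set(reference)
    overlap = len(suggested & reference)
    precision = overlap / len(suggested) if suggested else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


def average_over_queries(scores_by_query):
    """Average (precision, recall) within each query, then across queries."""
    per_query = []
    for pairs in scores_by_query.values():
        p = sum(x for x, _ in pairs) / len(pairs)
        r = sum(y for _, y in pairs) / len(pairs)
        per_query.append((p, r))
    n = len(per_query)
    return (sum(p for p, _ in per_query) / n,
            sum(r for _, r in per_query) / n)


# "Je crois qu'il est tout à fait conforme à l'esprit du projet de loi ."
# Assumed: reference = positions of "conforme à", suggestion = 4 words around it.
ex1 = precision_recall(suggested={6, 7, 8, 9}, reference={7, 8})

# "Cela n'est pas conforme aux normes des Nations Unies ."
# Assumed: reference = "conforme aux", suggestion = "conforme" only.
ex2 = precision_recall(suggested={3}, reference={3, 4})

overall = average_over_queries({"in keeping with": [ex1, ex2]})
print(ex1, ex2, overall)
```

Averaging per query first keeps very frequent queries from dominating the corpus-level score.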

Results for transpotting and filtering

System | Precision | Recall | F-measure
Transpotting | [missing] | [missing] | [missing]
Transpotting + filtering | [missing] | [missing] | [missing]

- Filtering of 7.9% of pairs of sentences
- Improvement of F-measure, in particular of recall

Evaluation of variant merging
- Significant reduction of the number of translations proposed for a query
- Higher diversity among the top translations
- Example: query "as described"

Rank | Before | After
1 | décrits | décrits
2 | décrite | prévu
3 | décrit | comme l'a
4 | tel que décrit | tel que prescrit
5 | comme l'a | comme le propose

Evaluation of variant merging
- Task: retrieving the 5 best transpots of a query
- Example: {décrits, décrite, décrit, tel que décrit, comme l'a}
  - bag of unigrams: {décrits, décrite, décrit, tel, que, comme, l, a}
  - grammatical words removed: {décrits, décrite, décrit}
  - lemmatization: {décrit}
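The three normalization steps of this example can be sketched in Python as below. The grammatical-word list and the lemma table are toy assumptions that cover only this example; a real evaluation would rely on a proper lexicon or lemmatizer.

```python
# Toy resources covering only the example above (assumptions).
GRAMMATICAL = {"tel", "que", "comme", "l", "a"}
LEMMA = {"décrits": "décrit", "décrite": "décrit"}


def bag_of_unigrams(transpots):
    """Split every transpot into words, treating the apostrophe as a separator."""
    return {w for t in transpots for w in t.replace("'", " ").split()}


def normalize(transpots):
    """Bag of unigrams, then grammatical words removed, then lemmatization."""
    unigrams = bag_of_unigrams(transpots)
    content = {w for w in unigrams if w not in GRAMMATICAL}
    return {LEMMA.get(w, w) for w in content}


top5 = ["décrits", "décrite", "décrit", "tel que décrit", "comme l'a"]
print(sorted(bag_of_unigrams(top5)))
print(normalize(top5))
```

Precision and recall for the task can then be computed between the normalized set of retrieved transpots and the normalized reference set.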

Results for variant merging
- Task: retrieving the 5 best transpots of a query
- Experiments done on the manually annotated corpus

System | Precision | Recall | F-measure
Before merging | [missing] | [missing] | [missing]
After merging | [missing] | [missing] | [missing]

- Slight decrease of precision and significant improvement of recall => higher diversity

Conclusion
- Use of word alignment in a bilingual concordancer
- Quantitative evaluation of a transpotting algorithm
- Two new issues
  - Filtering erroneous transpots
  - Merging similar variants of translations

Future work
- Improvement of word alignment
  - Higher level IBM models
  - Phrase-based models
- Use of pseudo-relevance feedback to improve transpotting
- Evaluation with end users

Thank you for your attention

Statistical NLP Spring 2008. Machine Translation: Examples

Statistical NLP Spring 2008. Machine Translation: Examples Statistical NLP Spring 2008 Lecture 11: Word Alignment Dan Klein UC Berkeley Machine Translation: Examples 1 Machine Translation Madame la présidente, votre présidence de cette institution a été marquante.

More information

Convergence of Translation Memory and Statistical Machine Translation

Convergence of Translation Memory and Statistical Machine Translation Convergence of Translation Memory and Statistical Machine Translation Philipp Koehn and Jean Senellart 4 November 2010 Progress in Translation Automation 1 Translation Memory (TM) translators store past

More information

BUSINESS PROCESS OPTIMIZATION. OPTIMIZATION DES PROCESSUS D ENTERPRISE Comment d aborder la qualité en améliorant le processus

BUSINESS PROCESS OPTIMIZATION. OPTIMIZATION DES PROCESSUS D ENTERPRISE Comment d aborder la qualité en améliorant le processus BUSINESS PROCESS OPTIMIZATION How to Approach Quality by Improving the Process OPTIMIZATION DES PROCESSUS D ENTERPRISE Comment d aborder la qualité en améliorant le processus Business Diamond / Le losange

More information

The Clotho Project. Thibault Clérice ICS London seminars: Summer 2014

The Clotho Project. Thibault Clérice ICS London seminars: Summer 2014 The Clotho Project Thibault Clérice ICS London seminars: Summer 2014 Source of the problem : Understand and translate lascivus,a, um Dictionary definitions really «light» for sometimes highly licentious

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set Overview Evaluation Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes training set, validation set, test set holdout, stratification

More information

Identifying the Translations of Idiomatic Expressions using TransSearch

Identifying the Translations of Idiomatic Expressions using TransSearch Identifying the Translations of Idiomatic Expressions using TransSearch Stéphane Huet 1 and Philippe Langlais 2 1 LIA - Université d Avignon, Avignon, France stephane.huet@univ-avignon.fr 2 DIRO - Université

More information

RAPPORT FINANCIER ANNUEL PORTANT SUR LES COMPTES 2014

RAPPORT FINANCIER ANNUEL PORTANT SUR LES COMPTES 2014 RAPPORT FINANCIER ANNUEL PORTANT SUR LES COMPTES 2014 En application de la loi du Luxembourg du 11 janvier 2008 relative aux obligations de transparence sur les émetteurs de valeurs mobilières. CREDIT

More information

CENG 734 Advanced Topics in Bioinformatics

CENG 734 Advanced Topics in Bioinformatics CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the

More information

The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006

The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 Yidong Chen, Xiaodong Shi Institute of Artificial Intelligence Xiamen University P. R. China November 28, 2006 - Kyoto 13:46 1

More information

Using COTS Search Engines and Custom Query Strategies at CLEF

Using COTS Search Engines and Custom Query Strategies at CLEF Using COTS Search Engines and Custom Query Strategies at CLEF David Nadeau, Mario Jarmasz, Caroline Barrière, George Foster, and Claude St-Jacques Language Technologies Research Centre Interactive Language

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

Office of the Auditor General / Bureau du vérificateur général FOLLOW-UP TO THE 2010 AUDIT OF COMPRESSED WORK WEEK AGREEMENTS 2012 SUIVI DE LA

Office of the Auditor General / Bureau du vérificateur général FOLLOW-UP TO THE 2010 AUDIT OF COMPRESSED WORK WEEK AGREEMENTS 2012 SUIVI DE LA Office of the Auditor General / Bureau du vérificateur général FOLLOW-UP TO THE 2010 AUDIT OF COMPRESSED WORK WEEK AGREEMENTS 2012 SUIVI DE LA VÉRIFICATION DES ENTENTES DE SEMAINE DE TRAVAIL COMPRIMÉE

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

I will explain to you in English why everything from now on will be in French

I will explain to you in English why everything from now on will be in French I will explain to you in English why everything from now on will be in French Démarche et Outils REACHING OUT TO YOU I will explain to you in English why everything from now on will be in French All French

More information

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT

More information

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany Information Systems University of Koblenz Landau, Germany Exploiting Spatial Context in Images Using Fuzzy Constraint Reasoning Carsten Saathoff & Agenda Semantic Web: Our Context Knowledge Annotation

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language

More information

TREATIES AND OTHER INTERNATIONAL ACTS SERIES 12859. Agreement Between the UNITED STATES OF AMERICA and CONGO

TREATIES AND OTHER INTERNATIONAL ACTS SERIES 12859. Agreement Between the UNITED STATES OF AMERICA and CONGO 1 TREATIES AND OTHER INTERNATIONAL ACTS SERIES 12859 EMPLOYMENT Agreement Between the UNITED STATES OF AMERICA and CONGO Effected by Exchange of Notes Dated at Washington April 11 and May 23, 1997 2 NOTE

More information

Survey on Conference Services provided by the United Nations Office at Geneva

Survey on Conference Services provided by the United Nations Office at Geneva Survey on Conference Services provided by the United Nations Office at Geneva Trade and Development Board, fifty-eighth session Geneva, 12-23 September 2011 Contents Survey contents Evaluation criteria

More information

SVM Based Learning System For Information Extraction

SVM Based Learning System For Information Extraction SVM Based Learning System For Information Extraction Yaoyong Li, Kalina Bontcheva, and Hamish Cunningham Department of Computer Science, The University of Sheffield, Sheffield, S1 4DP, UK {yaoyong,kalina,hamish}@dcs.shef.ac.uk

More information

DIRECTIVE ON ACCOUNTABILITY IN CONTRACT MANAGEMENT FOR PUBLIC BODIES. An Act respecting contracting by public bodies (chapter C-65.1, a.

DIRECTIVE ON ACCOUNTABILITY IN CONTRACT MANAGEMENT FOR PUBLIC BODIES. An Act respecting contracting by public bodies (chapter C-65.1, a. DIRECTIVE ON ACCOUNTABILITY IN CONTRACT MANAGEMENT FOR PUBLIC BODIES An Act respecting contracting by public bodies (chapter C-65.1, a. 26) SUBJECT 1. The purpose of this directive is to establish the

More information

Post-Secondary Opportunities For Student-Athletes / Opportunités post-secondaire pour les étudiantathlètes

Post-Secondary Opportunities For Student-Athletes / Opportunités post-secondaire pour les étudiantathlètes Post-Secondary Opportunities For Student-Athletes / Opportunités post-secondaire pour les étudiantathlètes Jean-François Roy Athletics Canada / Athlétisme Canada Talent Development Coordinator / Coordonnateur

More information

Machine Translation. Agenda

Machine Translation. Agenda Agenda Introduction to Machine Translation Data-driven statistical machine translation Translation models Parallel corpora Document-, sentence-, word-alignment Phrase-based translation MT decoding algorithm

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

Why Evaluation? Machine Translation. Evaluation. Evaluation Metrics. Ten Translations of a Chinese Sentence. How good is a given system?

Why Evaluation? Machine Translation. Evaluation. Evaluation Metrics. Ten Translations of a Chinese Sentence. How good is a given system? Why Evaluation? How good is a given system? Machine Translation Evaluation Which one is the best system for our purpose? How much did we improve our system? How can we tune our system to become better?

More information

Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection

Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection Gareth J. F. Jones, Declan Groves, Anna Khasin, Adenike Lam-Adesina, Bart Mellebeek. Andy Way School of Computing,

More information

Chapter 5. Phrase-based models. Statistical Machine Translation

Chapter 5. Phrase-based models. Statistical Machine Translation Chapter 5 Phrase-based models Statistical Machine Translation Motivation Word-Based Models translate words as atomic units Phrase-Based Models translate phrases as atomic units Advantages: many-to-many

More information

Product / Produit Description Duration /Days Total / Total

Product / Produit Description Duration /Days Total / Total DELL Budget Proposal / Proposition Budgétaire Solutions Design Centre N o : 200903201602 Centre de Design de Solutions Date: 2009-03-23 Proposition valide pour 30 jours / Proposal valid for 30 days Customer

More information

Training and evaluation of POS taggers on the French MULTITAG corpus

Training and evaluation of POS taggers on the French MULTITAG corpus Training and evaluation of POS taggers on the French MULTITAG corpus A. Allauzen, H. Bonneau-Maynard LIMSI/CNRS; Univ Paris-Sud, Orsay, F-91405 {allauzen,maynard}@limsi.fr Abstract The explicit introduction

More information

Chapter 8. Final Results on Dutch Senseval-2 Test Data

Chapter 8. Final Results on Dutch Senseval-2 Test Data Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised

More information

Cliquez sur le résultat que vous avez obtenu au test de classement linguistique Click on the result you obtained following the language test

Cliquez sur le résultat que vous avez obtenu au test de classement linguistique Click on the result you obtained following the language test Cliquez sur le résultat que vous avez obtenu au test de classement linguistique Click on the result you obtained following the language test E1N (SVP contactez-nous si vous avez obtenu ce résultat / Please

More information

Certificat de fusion. Certificate of Amalgamation. Canada Business Corporations Act. Loi canadienne sur les sociétés par actions

Certificat de fusion. Certificate of Amalgamation. Canada Business Corporations Act. Loi canadienne sur les sociétés par actions Industry Canada Industrie Canada Certificate of Amalgamation Canada Business Corporations Act Certificat de fusion Loi canadienne sur les sociétés par actions Franco-Nevada Corporation 445771-4 Name of

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

Collecting Polish German Parallel Corpora in the Internet

Collecting Polish German Parallel Corpora in the Internet Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska

More information

Archived Content. Contenu archivé

Archived Content. Contenu archivé ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

BILL C-665 PROJET DE LOI C-665 C-665 C-665 HOUSE OF COMMONS OF CANADA CHAMBRE DES COMMUNES DU CANADA

BILL C-665 PROJET DE LOI C-665 C-665 C-665 HOUSE OF COMMONS OF CANADA CHAMBRE DES COMMUNES DU CANADA C-665 C-665 Second Session, Forty-first Parliament, Deuxième session, quarante et unième législature, HOUSE OF COMMONS OF CANADA CHAMBRE DES COMMUNES DU CANADA BILL C-665 PROJET DE LOI C-665 An Act to

More information

Measuring Policing Complexity: A Research Based Agenda

Measuring Policing Complexity: A Research Based Agenda ARCHIVED - Archiving Content ARCHIVÉE - Contenu archivé Archived Content Contenu archivé Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject

More information

A web-based multilingual help desk

A web-based multilingual help desk LTC-Communicator: A web-based multilingual help desk Nigel Goffe The Language Technology Centre Ltd Kingston upon Thames Abstract Software vendors operating in international markets face two problems:

More information

Level 2 French, 2014

Level 2 French, 2014 91121 911210 2SUPERVISOR S Level 2 French, 2014 91121 Demonstrate understanding of a variety of written and / or visual French text(s) on familiar matters 9.30 am Wednesday 26 November 2014 Credits: Five

More information

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Jerzy B laszczyński 1, Krzysztof Dembczyński 1, Wojciech Kot lowski 1, and Mariusz Paw lowski 2 1 Institute of Computing

More information

GSAC CONSIGNE DE NAVIGABILITE définie par la DIRECTION GENERALE DE L AVIATION CIVILE Les examens ou modifications décrits ci-dessous sont impératifs. La non application des exigences contenues dans cette

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

More information

UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee

UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee 1. Introduction There are two main approaches for companies to promote their products / services: through mass

More information

Altiris Patch Management Solution for Windows 7.6 from Symantec Third-Party Legal Notices

Altiris Patch Management Solution for Windows 7.6 from Symantec Third-Party Legal Notices Appendix A Altiris Patch Management Solution for Windows 7.6 from Symantec Third-Party Legal Notices This appendix includes the following topics: Third-Party Legal Attributions CabDotNet MICROSOFT PLATFORM

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

RRSS - Rating Reviews Support System purpose built for movies recommendation

RRSS - Rating Reviews Support System purpose built for movies recommendation RRSS - Rating Reviews Support System purpose built for movies recommendation Grzegorz Dziczkowski 1,2 and Katarzyna Wegrzyn-Wolska 1 1 Ecole Superieur d Ingenieurs en Informatique et Genie des Telecommunicatiom

More information

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告 SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 Jin Yang and Satoshi Enoue SYSTRAN Software, Inc. 4444 Eastgate Mall, Suite 310 San Diego, CA 92121, USA E-mail:

More information

Altiris Patch Management Solution for Windows 7.5 SP1 from Symantec Third-Party Legal Notices

Altiris Patch Management Solution for Windows 7.5 SP1 from Symantec Third-Party Legal Notices Appendix A Altiris Patch Management Solution for Windows 7.5 SP1 from Symantec Third-Party Legal Notices This appendix includes the following topics: Third-Party Legal Attributions CabDotNet XML-RPC.NET

More information

Hybrid Machine Translation Guided by a Rule Based System

Hybrid Machine Translation Guided by a Rule Based System Hybrid Machine Translation Guided by a Rule Based System Cristina España-Bonet, Gorka Labaka, Arantza Díaz de Ilarraza, Lluís Màrquez Kepa Sarasola Universitat Politècnica de Catalunya University of the

More information

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review

Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Identifying Thesis and Conclusion Statements in Student Essays to Scaffold Peer Review Mohammad H. Falakmasir, Kevin D. Ashley, Christian D. Schunn, Diane J. Litman Learning Research and Development Center,

More information

An Empirical Study on Web Mining of Parallel Data

An Empirical Study on Web Mining of Parallel Data An Empirical Study on Web Mining of Parallel Data Gumwon Hong 1, Chi-Ho Li 2, Ming Zhou 2 and Hae-Chang Rim 1 1 Department of Computer Science & Engineering, Korea University {gwhong,rim}@nlp.korea.ac.kr

More information

Computer Aided Translation

Computer Aided Translation Computer Aided Translation Philipp Koehn 30 April 2015 Why Machine Translation? 1 Assimilation reader initiates translation, wants to know content user is tolerant of inferior quality focus of majority

More information

Open issues regarding legal metadata: IP licensing and management of different cognitive levels

Open issues regarding legal metadata: IP licensing and management of different cognitive levels Open issues regarding legal metadata: IP licensing and management of different cognitive levels FLORENCE MAY 6th, 2011 Danièle Bourcier Meritxell Fernández-Barrera 1 Cersa CNRS-Université Paris 2, Paris

More information

Genetic Algorithm-based Multi-Word Automatic Language Translation

Genetic Algorithm-based Multi-Word Automatic Language Translation Recent Advances in Intelligent Information Systems ISBN 978-83-60434-59-8, pages 751 760 Genetic Algorithm-based Multi-Word Automatic Language Translation Ali Zogheib IT-Universitetet i Goteborg - Department

More information

Contracts over $10,000: 1 April 2013 to 30 September 2013 Contrats de plus de 10 000 $ : 1er avril 2013 au 30 septembre 2013

Contracts over $10,000: 1 April 2013 to 30 September 2013 Contrats de plus de 10 000 $ : 1er avril 2013 au 30 septembre 2013 Contracts over $10,000: 1 April 2013 to 30 September 2013 Contrats de plus de 10 000 $ : 1er avril 2013 au 30 septembre 2013 Vendor / Fournisseur No. Description NOEL PARENT ANITA PORTIER RENEE ST-ARNAUD

More information

FINAL DRAFT INTERNATIONAL STANDARD

FINAL DRAFT INTERNATIONAL STANDARD IEC 62047-15 Edition 1.0 2014-12 FINAL DRAFT INTERNATIONAL STANDARD colour inside Semiconductor devices Micro-electromechanical devices Part 15: Test method of bonding strength between PDMS and glass INTERNATIONAL

More information

Evaluation of speech technologies

Evaluation of speech technologies CLARA Training course on evaluation of Human Language Technologies Evaluations and Language resources Distribution Agency November 27, 2012 Evaluation of speaker identification Speech technologies Outline

More information

2-3 Automatic Construction Technology for Parallel Corpora

2-3 Automatic Construction Technology for Parallel Corpora 2-3 Automatic Construction Technology for Parallel Corpora We have aligned Japanese and English news articles and sentences, extracted from the Yomiuri and the Daily Yomiuri newspapers, to make a large

More information

Numéro de projet CISPR 16-1-4 Amd 2 Ed. 3.0. IEC/TC or SC: CISPR/A CEI/CE ou SC: Date of circulation Date de diffusion 2015-10-30

Numéro de projet CISPR 16-1-4 Amd 2 Ed. 3.0. IEC/TC or SC: CISPR/A CEI/CE ou SC: Date of circulation Date de diffusion 2015-10-30 PRIVATE CIRCULATION GEL/210/11_15_0275 For comment/vote - Action Due Date: 2016/01/08 Submitted for parallel voting in CENELEC Soumis au vote parallèle au CENELEC Also of interest to the following committees

More information
