Machine Translation and the Translator
Transcription
1 Machine Translation and the Translator Philipp Koehn 8 April 2015
2 About me 1
Professor at Johns Hopkins University (US), University of Edinburgh (Scotland)
Author of textbook on statistical machine translation
Leading development of open source Moses toolkit
  developed since 2006
  reference implementation of state-of-the-art methods
  used in academia as benchmark and testbed
  extensive commercial deployment (20% of MT market)
3 Recent Projects 2
Speech translation
Computer aided translation
Development of an open source toolkit tightly integrated with machine translation
Novel types of assistance for translators
Adaptation of machine translation to user needs
Open source infrastructure (MOSES CORE)
4 3 how good is machine translation?
5 Machine Translation: Chinese 4
7 Machine Translation: French 5
8 Quality 6
HTER / assessment
0%
  publishable
10%
  editable
20%
30%
  gistable
40%
  triagable
50%
(scale developed in preparation of DARPA GALE programme)
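To make the scale more concrete, here is a rough sketch of how an HTER-like score can be computed. It assumes HTER is read as the number of word edits a human needs to fix the machine output, divided by the length of the post-edited text. The real TER/HTER metric also counts block shifts, which this simplified version leaves out, and the function names are illustrative only.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1,              # delete h
                           cur[j - 1] + 1,           # insert r
                           prev[j - 1] + (h != r)))  # substitute or match
        prev = cur
    return prev[-1]


def approx_hter(mt_output, post_edited):
    """Edits per post-edited word; simplified, without TER's shift operation."""
    hyp, ref = mt_output.split(), post_edited.split()
    if not ref:
        raise ValueError("post-edited text must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    # One missing word in a six-word post-edit: roughly 17% on the scale.
    print(approx_hter("the cat sat on mat", "the cat sat on the mat"))
```

A score near 0 means the output needed almost no correction; higher scores move down the scale toward gisting and triage.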
9 Applications 7
HTER / assessment / application examples
0%
  publishable: Seamless bridging of language divide; Automatic publication of official announcements
10%
  editable: Increased productivity of human translators
20%
  Access to official publications; Multi-lingual communication (chat, social networks)
30%
  gistable: Information gathering; Trend spotting
40%
  triagable: Identifying relevant documents
50%
10 Current State of the Art 8
HTER / assessment / language pairs and domains
0%
  publishable: French-English restricted domain
10%
  editable: French-English news stories; German-English news stories
20%
  Chinese-English news stories; English-Czech open domain
30%
  gistable: English-Japanese open domain
40%
  triagable
50%
(informal rough estimates by presenter)
11 9 big picture
15 A Clear Plan 13
[Figure, built up over four slides: translation pyramid from Source to Target, with Lexical Transfer at the base, Syntactic Transfer and Semantic Transfer above it, and Interlingua at the top; Analysis on the source side, Generation on the target side]
16 Learning from Data 14
[Figure: foreign/English parallel text -> statistical analysis -> Translation Model; English text -> statistical analysis -> Language Model; both models feed the Decoding Algorithm]
17 Finding the Best Translation 15
$e_{\text{best}} = \operatorname{argmax}_{e} \, p(e \mid f)$
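The slide states only the argmax. One standard way to connect it to the translation model and language model of the previous slide is the noisy-channel decomposition via Bayes' rule. This is a textbook reading assumed here, not something the slide spells out, and deployed systems typically generalize it to a weighted combination of model scores.

```latex
\begin{align*}
e_{\text{best}} &= \operatorname*{argmax}_{e} \; p(e \mid f) \\
                &= \operatorname*{argmax}_{e} \; \frac{p(f \mid e)\, p(e)}{p(f)} \\
                &= \operatorname*{argmax}_{e} \; p(f \mid e)\, p(e)
\end{align*}
```

Here $p(f \mid e)$ plays the role of the translation model and $p(e)$ that of the language model; $p(f)$ can be dropped because it does not depend on $e$.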
18 16 why is that a good plan?
19 Word Translation Problems 17
Words are ambiguous
  He deposited money in a bank account with a high interest rate.
  Sitting on the bank of the Mississippi, a passing ship piqued his interest.
How do we find the right meaning, and thus translation?
Context should be helpful
20 Phrase Translation Problems 18
Idiomatic phrases are not compositional
  It's raining cats and dogs.
  Es schüttet aus Eimern. (it pours from buckets.)
How can we translate such larger units?
22 Syntactic Translation Problems 19
Languages have different sentence structure
  das     behaupten   sie     wenigstens
  this    claim       they    at least
  the                 she
Convert from object-verb-subject (OVS) to subject-verb-object (SVO)
Ambiguities can be resolved through syntactic analysis
  the meaning "the" of "das" not possible (not a noun phrase)
  the meaning "she" of "sie" not possible (subject-verb agreement)
24 Semantic Translation Problems 20
Pronominal anaphora
  I saw the movie and it is good.
How to translate "it" into German (or French)?
  "it" refers to "movie"
  "movie" translates to "Film"
  "Film" has masculine gender
  ergo: "it" must be translated into masculine pronoun "er"
We are not handling this very well [Le Nagard and Koehn, 2010]
26 Semantic Translation Problems 21
Coreference
  Whenever I visit my uncle and his daughters, I can't decide who is my favorite cousin.
How to translate "cousin" into German? Male or female?
Complex inference required
27 No Single Right Answer 22
  Israeli officials are responsible for airport security.
  Israel is in charge of the security at this airport.
  The security work for this airport is the responsibility of the Israel government.
  Israeli side was in charge of the security of this airport.
  Israel is responsible for the airport's security.
  Israel is responsible for safety work at this airport.
  Israel presides over the security of the airport.
  Israel took charge of the airport security.
  The safety of this airport is taken charge of by Israel.
  This airport's security is the responsibility of the Israeli security officials.
29 Learning from Data 24
What is the best translation?
Counts in European Parliament corpus
  Sicherheit -> security     14,516
  Sicherheit -> safety       10,015
  Sicherheit -> certainty       334
30 Learning from Data 25
What is the best translation?
Phrasal rules
  Sicherheit -> security                      14,516
  Sicherheit -> safety                        10,015
  Sicherheit -> certainty                        334
  Sicherheitspolitik -> security policy        1,580
  Sicherheitspolitik -> safety policy             13
  Sicherheitspolitik -> certainty policy           0
  Lebensmittelsicherheit -> food security         51
  Lebensmittelsicherheit -> food safety        1,084
  Lebensmittelsicherheit -> food certainty         0
  Rechtssicherheit -> legal security             156
  Rechtssicherheit -> legal safety                 5
  Rechtssicherheit -> legal certainty            723
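The counts above can be turned into translation probabilities by relative frequency: divide each count by the total for the same source word. The sketch below does this for the numbers on the slide; the function names are illustrative, and a real system would add smoothing and many more features.

```python
# Counts copied from the slide (European Parliament corpus).
COUNTS = {
    "Sicherheit": {"security": 14516, "safety": 10015, "certainty": 334},
    "Sicherheitspolitik": {"security policy": 1580, "safety policy": 13,
                           "certainty policy": 0},
    "Lebensmittelsicherheit": {"food security": 51, "food safety": 1084,
                               "food certainty": 0},
    "Rechtssicherheit": {"legal security": 156, "legal safety": 5,
                         "legal certainty": 723},
}


def translation_probs(source):
    """Relative-frequency estimate: count(source, target) / count(source)."""
    options = COUNTS[source]
    total = sum(options.values())
    return {target: count / total for target, count in options.items()}


def best_translation(source):
    """Most probable translation of a source word or phrase."""
    probs = translation_probs(source)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    for source in COUNTS:
        probs = translation_probs(source)
        best = best_translation(source)
        print(f"{source}: {best} ({probs[best]:.3f})")
```

The same word, Sicherheit, comes out as "security", "safety", or "certainty" depending on the compound it appears in, which is the point of moving from word counts to phrasal rules.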
31 26 better models
32 Phrase-Based Model 27
  natürlich hat John Spaß am Spiel
  of course John has fun with the game
Foreign input is segmented in phrases
Each phrase is translated into English
Phrases are reordered
Workhorse of today's statistical machine translation
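The three steps (segment, translate, reorder) can be sketched on the slide's example. The phrase table and the reordering below are hand-written assumptions chosen to reproduce the example. A real decoder searches over many segmentations, translations, and orderings and scores them with its models.

```python
# Hypothetical toy phrase table for the example sentence.
PHRASE_TABLE = {
    "natürlich": "of course",
    "hat": "has",
    "John": "John",
    "Spaß am": "fun with the",
    "Spiel": "game",
}


def segment(words, table, max_len=3):
    """Greedy longest-match segmentation into phrases known to the table."""
    phrases, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in table:
                phrases.append(candidate)
                i += n
                break
        else:
            phrases.append(words[i])  # unknown word is passed through
            i += 1
    return phrases


def translate(sentence, table, order):
    """Segment, translate each phrase, then emit phrases in the given order."""
    phrases = segment(sentence.split(), table)
    translated = [table.get(p, p) for p in phrases]
    return " ".join(translated[k] for k in order)


if __name__ == "__main__":
    # The order swaps "hat" and "John"; it is supplied by hand here.
    print(translate("natürlich hat John Spaß am Spiel", PHRASE_TABLE,
                    order=[0, 2, 1, 3, 4]))
```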
33 Synchronous Grammar Rules 28
Nonterminal rules
  NP → DET_1 NN_2 JJ_3 | DET_1 JJ_3 NN_2
Terminal rules
  N → maison | house
  NP → la maison bleue | the blue house
Mixed rules
  NP → la maison JJ_1 | the JJ_1 house
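As a small illustration of how a nonterminal rule reorders its children, the sketch below applies the DET NN JJ rule to "la maison bleue". The word-level lexicon for DET and JJ is an assumption added for the example, and the source words are assumed to be already tagged in rule order.

```python
# Source-side nonterminals and the target-side order as indices into them:
# NP -> DET_1 NN_2 JJ_3 | DET_1 JJ_3 NN_2
NP_RULE = (["DET", "NN", "JJ"], [0, 2, 1])

# Hypothetical terminal rules (only maison -> house appears on the slide).
LEXICON = {
    "DET": {"la": "the"},
    "NN": {"maison": "house"},
    "JJ": {"bleue": "blue"},
}


def apply_rule(words, rule=NP_RULE, lexicon=LEXICON):
    """Translate each child with its terminal rule, then reorder the children."""
    source_cats, target_order = rule
    if len(words) != len(source_cats):
        raise ValueError("rule does not match the input length")
    translated = [lexicon[cat][word] for cat, word in zip(source_cats, words)]
    return [translated[i] for i in target_order]


if __name__ == "__main__":
    print(" ".join(apply_rule(["la", "maison", "bleue"])))
```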
34 Learning Rules 29
[Figure: English parse tree over a word-aligned sentence pair]
  English: I shall be passing on to you some comments
  POS tags: PRP MD VB VBG RP TO PRP DT NNS
  German: Ich werde Ihnen die entsprechenden Anmerkungen aushändigen
Extracted rule: VP → X_1 X_2 aushändigen | passing on PP_1 NP_2
35 Syntax Decoding 30
[Figure: chart decoding of the German input "Sie will eine Tasse Kaffee trinken" (tags PPER VAFIN ART NN NN VVINF) into an English tree for "she wants to drink a cup of coffee"]
36 New State of the Art 31
Good results for German-English [WMT 2014]
  language pair      syntax preferred
  German-English     57%
  English-German     55%
Mixed for other language pairs
  language pair      syntax preferred
  Czech-English      44%
  Russian-English    44%
  Hindi-English      54%
Also very successful for Chinese-English
37 32 better machine learning
38 Sparse Data 33
Statistical estimation often suffers from sparse data
Zipf's law
  most words are extremely rare
  frequency × rank = constant
  [Plot: frequency against rank]
Statistics from Europarl
  "the" occurs 1,929,379 times
  large tail of words that occur once: 33,447 words, for instance cornflakes, mathematicians, or Tazhikhistan
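The rank-frequency relationship and the tail of singletons are easy to inspect on any tokenized corpus. The sketch below uses a made-up toy sentence rather than Europarl. On real data the product of rank and frequency is only roughly constant.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency, rank * frequency), most frequent first."""
    ranked = Counter(tokens).most_common()
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, 1)]


def singletons(tokens):
    """Words that occur exactly once (the long tail)."""
    counts = Counter(tokens)
    return sorted(word for word, count in counts.items() if count == 1)


if __name__ == "__main__":
    # Toy text, invented for illustration only.
    tokens = "the cat saw the dog and the bird saw the cornflakes".split()
    for row in rank_frequency(tokens)[:3]:
        print(row)
    print("occur once:", singletons(tokens))
```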
39 Brown Clusters 34
Main idea: share evidence with similar words
Cluster words to reduce sparsity
  presented       the      laconic         message
  pursued         these    pompous         lesson
  aired           that     melancholic     letter
  commissioned    this     bouncy          counterfactuals
  published                incompletable   stunner
For instance: use in language modeling
  $p(\text{cluster}(\text{message}) \mid \text{cluster}(\text{presented}), \text{cluster}(\text{the}), \text{cluster}(\text{laconic}))$
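A minimal sketch of why clustering helps, assuming the four columns above are four clusters (the labels C1 to C4 and the two training sentences are invented for illustration). A word sequence never seen in training gets zero evidence at the word level, but its cluster sequence has been seen.

```python
from collections import Counter

# Clusters taken from the slide's columns; the labels are hypothetical.
CLUSTERS = {
    "C1": ["presented", "pursued", "aired", "commissioned", "published"],
    "C2": ["the", "these", "that", "this"],
    "C3": ["laconic", "pompous", "melancholic", "bouncy", "incompletable"],
    "C4": ["message", "lesson", "letter", "counterfactuals", "stunner"],
}
CLUSTER_OF = {word: label for label, words in CLUSTERS.items() for word in words}

# Made-up training data.
TRAINING = ["presented the laconic message", "aired that melancholic letter"]


def ngram_counts(sentences, mapper):
    """Count 4-grams and their 3-gram contexts after mapping each word."""
    full, context = Counter(), Counter()
    for sentence in sentences:
        units = [mapper(word) for word in sentence.split()]
        for i in range(len(units) - 3):
            full[tuple(units[i:i + 4])] += 1
            context[tuple(units[i:i + 3])] += 1
    return full, context


def prob_last_given_context(words, mapper):
    """Relative-frequency estimate of the 4th unit given the first three."""
    full, context = ngram_counts(TRAINING, mapper)
    units = tuple(mapper(word) for word in words)
    if context[units[:3]] == 0:
        return 0.0
    return full[units] / context[units[:3]]


if __name__ == "__main__":
    query = ["pursued", "these", "pompous", "lesson"]
    print("word level:   ", prob_last_given_context(query, lambda w: w))
    print("cluster level:", prob_last_given_context(query, CLUSTER_OF.get))
```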
40 Word Embeddings 35
41 Word Embeddings 36
42 Deep Learning 37
Autoencoders
  first: learn embeddings unsupervised
  then: supervised learning of task
Neural network language models
  several implementations
  some integrated in Moses
Neural networks everywhere
  translation model
  reordering model
  operation sequence model
43 38 data
44 Big Data 39
For many language pairs, lots of text available.
  Text you read in your lifetime    300 million words
  Translated text available         billions of words
  English text available            trillions of words
45 Mining the Web 40
Largest source for text: the World Wide Web
Common Crawl
  publicly available crawl of the web
  hosted by Amazon Web Services, but can be downloaded
  regularly updated (semi-annual)
  2-4 billion web pages per crawl
Currently filling up our hard drives
46 Monolingual Data 41
Starting point: 35TB of text
Processing pipeline [Buck et al., 2014]
  language detection
  deduplication
  normalization of Unicode characters
  sentence splitting
Obtained corpora
  Language   Lines (B)   Tokens (B)   Bytes   BLEU (WMT)
  English                             TB      -
  German                              GB      +0.5
  Spanish                             GB      -
  French                              GB      +0.6
  Russian                             GB      +1.2
  Czech                               GB      +0.6
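The four pipeline steps can be sketched in a few lines. This is a toy illustration, not the pipeline of Buck et al.: the language check is a crude stand-in for a real language identifier, duplicate removal keeps everything in memory, NFC is an assumed choice of Unicode normalization form, and the sentence splitter is a naive regular expression.

```python
import re
import unicodedata


def looks_english(line):
    """Crude stand-in for language detection: mostly ASCII letters."""
    letters = sum(ch.isalpha() for ch in line)
    ascii_letters = sum(ch.isalpha() and ch.isascii() for ch in line)
    return letters > 0 and ascii_letters / letters > 0.9


def process(lines, keep=looks_english):
    """Language filter, removal of duplicate lines, normalization, splitting."""
    seen = set()
    sentences = []
    for line in lines:
        if not keep(line):
            continue                      # language detection
        if line in seen:
            continue                      # drop exact duplicate lines
        seen.add(line)
        normalized = unicodedata.normalize("NFC", line).strip()
        # Naive sentence splitting: break after ., ! or ? followed by space.
        sentences.extend(s for s in re.split(r"(?<=[.!?])\s+", normalized) if s)
    return sentences


if __name__ == "__main__":
    raw = ["Hello world. How are you?",
           "Hello world. How are you?",
           "Привет мир."]
    print(process(raw))
```

At the scale of tens of terabytes, duplicate removal would more plausibly rely on hashing and sharding than on a single in-memory set.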
47 Parallel Data 42
Basic processing pipeline [Smith et al., 2013]
  find parallel web pages (based on URL only)
  align document by HTML structure
  sentence splitting and tokenization
  sentence alignment
  filtering (remove boilerplate)
Obtained corpora
                   French   German   Spanish   Russian   Japanese   Chinese
  Segments         10.2M    7.50M    5.67M     3.58M     1.70M      1.42M
  Foreign Tokens   128M     79.9M    71.5M     34.7M     9.91M      8.14M
  English Tokens   118M     87.5M    67.6M     36.7M     19.1M      14.8M

                   Bengali  Farsi    Telugu    Somali    Kannada    Pashto
  Segments         59.9K    44.2K    50.6K     52.6K     34.5K      28.0K
  Foreign Tokens   573K     477K     336K      318K      305K       208K
  English Tokens   537K     459K     358K      325K      297K       218K
Much more work needed!
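The first pipeline step, finding candidate page pairs from URLs alone, can be sketched as follows. The idea assumed here is that two URLs which differ only in a language code probably point to translations of each other. The list of codes, the delimiter set, and the example URLs are all hypothetical, and the sketch replaces only the first language code it finds, so it will miss some cases (for example, a country-code domain followed by a language path segment).

```python
import re
from collections import defaultdict

LANG_CODES = {"en", "fr", "de", "es"}  # hypothetical, far from complete


def url_key(url):
    """Replace the first language code in the URL with '*'.

    Returns (key, language code), or (url, None) if no code is found.
    """
    parts = re.split(r"([/._=-])", url.lower())
    found = None
    out = []
    for part in parts:
        if found is None and part in LANG_CODES:
            found = part
            out.append("*")
        else:
            out.append(part)
    return "".join(out), found


def pair_pages(urls, src="en", tgt="fr"):
    """Pair URLs that share a key and carry the two requested language codes."""
    by_key = defaultdict(dict)
    for url in urls:
        key, lang = url_key(url)
        if lang is not None:
            by_key[key][lang] = url
    return sorted((d[src], d[tgt]) for d in by_key.values()
                  if src in d and tgt in d)


if __name__ == "__main__":
    urls = ["http://example.com/en/news.html",
            "http://example.com/fr/news.html",
            "http://example.com/en/about.html"]
    print(pair_pages(urls))
```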
48 43 computer aided translation
49 Post-Editing Machine Translation 44 (source: Autodesk)
50 Interactivity 45
Traditional professional translation approaches
  translation from scratch
  post-editing translation memory match
  post-editing machine translation output
More interactive collaboration between machine and professional?
51 Interactive Machine Translation 46 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator
52 Interactive Machine Translation 47 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He
53 Interactive Machine Translation 48 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He has
54 Interactive Machine Translation 49 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He has for months
55 Interactive Machine Translation 50 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He planned
56 Interactive Machine Translation 51 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He planned for months
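The interaction on these slides can be approximated as prefix-constrained prediction: whatever the translator has typed so far is taken as fixed, and the system proposes a continuation compatible with it. The sketch below works over a small scored list of full hypotheses. The hypotheses and scores are invented, and an actual system would more likely search the decoder's search graph than a short list.

```python
# Hypothetical scored hypotheses (higher score = preferred by the MT system).
NBEST = [
    ("he has for months planned to give a lecture in New York in April", -2.0),
    ("he planned for months to give a lecture in New York in April", -2.5),
]


def suggest_next_word(prefix, nbest=NBEST):
    """Next word of the best-scoring hypothesis that starts with the prefix."""
    prefix_words = prefix.lower().split()
    best_word, best_score = None, None
    for hypothesis, score in nbest:
        words = hypothesis.split()
        matches = [w.lower() for w in words[:len(prefix_words)]] == prefix_words
        if matches and len(words) > len(prefix_words):
            if best_score is None or score > best_score:
                best_word, best_score = words[len(prefix_words)], score
    return best_word


if __name__ == "__main__":
    print(suggest_next_word("He"))          # follows the top hypothesis
    print(suggest_next_word("He planned"))  # switches to the second hypothesis
```

When the translator departs from the system's first choice, as in the step from "He has for months" to "He planned", the suggestion follows the new prefix instead of insisting on the original output.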
57 Word Alignment Visualization 52 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He planned for months to give a lecture in New York in
59 Shading off Translated Material 54 Input Sentence Er hat seit Monaten geplant, im April einen Vortrag in New York zu halten. Professional Translator He planned for months to give a lecture in New York in
60 Choices 55
Trigger the passive vocabulary
Display multiple translations for words and phrases
  er hat seit Monaten geplant, im April einen Vortrag...
  he has for months the plan in April a lecture...
  it has for months now planned, in April a presentation...
  he was for several months planned to in the April a speech...
  he has made since months the pipeline in April of a statement...
  he did for many months scheduled the April a general...
Rank and color-highlight by probability of each translation
Prefer diversity
61 Instant Feedback Loop 56
[Figure: source text -> translate (MT engine) -> MT translation -> post-edit -> human translation -> re-train (MT engine)]
62 CASMACAT Home Edition 57
Available as open source software
Features
  installation on any desktop machine
  allows training of MT engines
  all new types of assistance
  incremental updating of models
Warning: still in development stage (help welcome!)
63 58 summary
64 Summary 59
Machine translation is not perfect, but useful
Better models (esp. syntax)
Better machine learning (esp. neural networks)
More data
Closer integration with target application (e.g., computer aided translation)
65 Thank You 60 questions?
