1 Automatic Translation between Hype and Reality (Automatische Übersetzung zwischen Hype und Realität). Hans Uszkoreit
2 Outline: Applications of MT; The MT Hype; MT Yesterday: Machine Translation in Verbmobil; MT Today: Between Google Translate and EuroMatrix; Deep and Hybrid NLP examples: DELPH-IN, EuroMatrix; Statistical Machine Translation; Light-Weight MT Applications; MT Tomorrow: Quality, Trust and Translingual Spaces; Forecasts of the META-NET Vision Groups
3 Outline: The MT Hype; MT Yesterday: Machine Translation in Verbmobil; MT Today: Between Google Translate and EuroMatrix; Deep and Hybrid NLP examples: DELPH-IN, EuroMatrix; Statistical Machine Translation; Light-Weight MT Applications; MT Tomorrow: Quality, Trust and Translingual Spaces; Forecasts of the META-NET Vision Groups
4 Types of Demand. Three major markets: informational inbound translation of digital texts; high-quality outbound translation of texts; mobile translation of spontaneous speech. And many others: patent translation, UI globalization translation, film dubbing, subtitle translation...
5 Informational Inbound Translation: a.k.a. indicative translation or gisting; translation of web pages, news, blogs, Wikipedia entries; lower requirements on fidelity (adequacy); low requirements on fluency; dominance of translations into or out of English; huge user demand, but the market is restricted by free services
6 High-Quality Outbound Translation: printed and online publications; documentation, consumer information, manuals; regulations, laws, announcements; huge existing market
7 Mobile Speech Translation: combination of speech recognition/production and MT; high demands on reliability; problems: spontaneous speech, background noise, variability in voices, dialects and registers; almost no opportunity for post-editing; large potential market as mobile software or service
8 An Essential Difference: For informational (indicative) inbound translation, a decent average quality across sentences is needed. Sentences that do not have a comprehensible translation really hurt. For outbound high-quality translation, the goal is to translate many sentences in sufficient quality for publication; another proportion of usable sentences may require minor post-editing; the quality of the remaining sentences does not matter.
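The contrast on this slide can be sketched as two different ways of scoring the same set of sentence translations. This is a minimal illustration, not an established metric: the per-sentence quality scores in [0, 1], the thresholds, and the function names are all hypothetical.

```python
from statistics import mean


def inbound_view(scores, comprehensible_at=0.3):
    """Inbound (gisting) view: average quality matters, and every
    incomprehensible sentence hurts. Threshold is a hypothetical value."""
    failures = sum(1 for s in scores if s < comprehensible_at)
    return {
        "mean_quality": mean(scores),
        "incomprehensible_share": failures / len(scores),
    }


def outbound_view(scores, publishable_at=0.9, postedit_at=0.7):
    """Outbound (publication) view: count sentences good enough to publish,
    those needing minor post-editing, and ignore how bad the rest are.
    Both thresholds are hypothetical values."""
    n = len(scores)
    publishable = sum(1 for s in scores if s >= publishable_at)
    postedit = sum(1 for s in scores if postedit_at <= s < publishable_at)
    return {
        "publishable_share": publishable / n,
        "minor_postedit_share": postedit / n,
        "retranslate_share": (n - publishable - postedit) / n,
    }


# Two made-up systems with nearly the same average quality.
even_system = [0.6] * 10                 # uniformly mediocre output
peaky_system = [0.95] * 6 + [0.1] * 4    # mostly excellent, sometimes useless

for name, scores in [("even", even_system), ("peaky", peaky_system)]:
    print(name, inbound_view(scores), outbound_view(scores))
```

Under these assumed numbers the two systems have almost the same mean quality, yet the even system suits inbound use (no incomprehensible sentences) while the peaky system suits outbound use (a large share of publishable sentences).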
9 The Situation in the US: The biggest players are Google (inbound informational translation of web pages); DARPA research, military, intelligence services (inbound translation of news, communication, and web pages); business intelligence companies (inbound translation of news, analyses, web pages)
10 The European Situation: Large demand for high-quality outbound translation (many export-oriented economies, large EU-internal needs). But also great demand for informational inbound translation.
11 A New MT Hype? Progress in statistical machine translation; DARPA MT evaluations; Google Translate offers 3306 translation directions; many new users in large corporations and public administrations (European Patent Office; Catalan Government; SAP, Symantec, IBM, VW, ...); calculations of the economic benefits of MT
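The figure of 3306 directions is consistent with counting ordered language pairs: n languages give n * (n - 1) directions. The sketch below inverts that formula. The resulting language count is an inference from the arithmetic, not a number stated on the slide, and the function names are illustrative.

```python
import math
from typing import Optional


def translation_directions(num_languages: int) -> int:
    """Number of ordered (source, target) pairs with source != target."""
    if num_languages < 0:
        raise ValueError("num_languages must be non-negative")
    return num_languages * (num_languages - 1)


def languages_for_directions(directions: int) -> Optional[int]:
    """Solve n * (n - 1) = directions for a whole number n, else None."""
    if directions < 0:
        return None
    # Positive root of n^2 - n - directions = 0, using integer arithmetic.
    n = (1 + math.isqrt(1 + 4 * directions)) // 2
    return n if translation_directions(n) == directions else None


implied_languages = languages_for_directions(3306)
print(implied_languages)
```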
12 MT in Verbmobil: Verbmobil was not an MT project; only a small part of Verbmobil actually worked on translation; nevertheless Verbmobil was ahead of its time in MT: a new push in deep translation; syntax-based statistical translation, learning from parallel treebanks; hybrid translation
13 Verbmobil and Google Translate: Some Germans at Google Translate: Franz Josef Och, Aachen (Verbmobil); Thorsten Brants, Saarbrücken; Richard Zens, Verbmobil; Klaus Macherey, Aachen (Verbmobil); Wolfgang Macherey, Aachen; Peter Dienes, Saarbrücken
14 Deep and Hybrid Translation: Deep translation: HPSG analysis, LFG transfer rules. Hybrid translation by parallel pipelines; choice mainly based on time
15 DELPH-IN: pressure from competition with statistical MT in Verbmobil (VM); cooperation between DFKI and Stanford; speed-up of HPSG parsing by a factor of 1500; later extended by U. Tokyo, U. Cambridge and others; growing stock of open source systems, grammars and tools
16 DELPH-IN Members: Bulgarian Academy of Sciences (Bulgaria); Cambridge University (UK); DFKI Saarbrücken GmbH (Germany); Kyung Hee University (Korea); LORIA Nancy (France); Melbourne University (Australia); NTT Communication Science Laboratory (Japan); Nanyang Technological University (Singapore); Norwegian University of Science and Technology (Norway); Saarland University (Germany); Stanford University (USA); Tokyo University (Japan); University of Lisbon (Portugal); Universitat de Barcelona (Spain); Universitat Pompeu Fabra (Spain); University of Oslo (Norway); University of Sussex (UK); University of Washington (USA)
17 Statistical MT: Verbmobil research prepared the ground for today's linguistic methods in statistical MT: factored MT and hierarchical phrase-based MT; learning translation models from parallel treebanks by Ney and others. Ney and his colleagues have influenced SMT worldwide. Alex Waibel's group continued speech-to-speech translation in TC-STAR and other projects. Mobile translation products: iPhone app Jibbigo
19 Light-Weight MT: Yocoy Technologies, a DFKI spin-off in Berlin: Feiyu Xu (CEO), Sven Schmeier (CTO), Xiwen Cheng, Nicolaas Bongaerts, Hemma Crain et al. Products: i-you, yochina (released today)
21 Evaluation Campaigns and MT Marathons: Translingual Europe Event; Project Meeting; Today's Meeting
22 EuroMatrixPlus
23 Objectives of EuroMatrix: Translation systems for all pairs of EU languages, with a special focus on the languages of new and near-term prospective member states. Efficient inclusion of linguistic knowledge into statistical machine translation. Results: statistical systems for 462 language pairs (all 23 except Irish); special efforts on Czech and Hungarian; factored translation models, tree-based translation. The development and testing of hybrid architectures for the integration of rule-based and statistical approaches. Result: several hybrid translation models have been developed and tested, some with very promising results.
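The count of 462 matches the number of directed pairs among 22 languages (22 * 21). A minimal sketch that enumerates them follows. The list of ISO 639-1 codes for the 23 official EU languages of the time is supplied here as an assumption; it does not appear on the slide, and the variable names are illustrative.

```python
from itertools import permutations

# Assumed: ISO 639-1 codes of the 23 official EU languages at the time.
EU_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hu",
    "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
]

# The slide excludes Irish ("ga").
covered = [code for code in EU_LANGUAGES if code != "ga"]

# Each (source, target) direction counts separately, so cs->hu and hu->cs differ.
directed_pairs = list(permutations(covered, 2))

print(len(covered), "languages,", len(directed_pairs), "directed pairs")
```

Counting unordered pairs instead would give half that number, so the slide's figure only fits if each translation direction is counted as its own system.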
24 Objectives of EuroMatrix (cont.): Organization, analysis and interpretation of a competitive annual international evaluation of machine translation with a strong focus on European economic and social needs. The provision of open source machine translation technology including research tools, software and data. A systematically compiled and constantly updated detailed survey of the state of MT technology for all EU language pairs, based on the developed systematic translation between all EU languages. Results: two evaluation campaigns have taken place with strong participation; a growing repository of open source systems and tools; the survey was conducted, and it keeps growing.
26 Lessons learned from the evaluations: Among the winners are RBMT, SMT, HMT, and combination systems. EuroMatrix participants performed among the best. For European languages, European MT systems do not perform worse than US systems. In several cases the best US system was outperformed by a EuroMatrix system.
27 More Lessons: Even when RBMT and SMT systems exhibit similar evaluation scores, their errors are quite different. There are rather positive signals for the performance of combination systems, but the results are not yet conclusive enough to prove their superiority. Quality still keeps improving.
28 A Problem and a Chance: A core component of the European Union is a common market with a single information space that works with two dozen national languages and many regional languages. This ambitious endeavour is an unprecedented social experiment. If it works, the multicultural union of nations will prosper and serve as a model for the peaceful and egalitarian cooperation of people in other parts of the world. If it fails, Europe will be forced to choose between sacrificing cultural identities and economic defeat.
29 Goals: Preserving the European cultural and linguistic diversity in the united information and knowledge society. Securing at affordable costs the free flow of information and thought across language boundaries in the resulting single information space. Providing each language community with the most advanced technologies for communication, information and knowledge management so that maintaining their mother tongue does not turn into a disadvantage. For 62 languages, incl. the 23 working languages of the EU.
30 Starting Point: Where are we today? Research has made considerable progress in recent years. Successful projects have been funded by the EU and national programs. But the pace of progress is not fast enough to meet the three challenges within the next years. Can progress be accelerated? Yes, but only if the relevant stakeholders such as research communities, LT industry, user industries, language communities, funding programs and policy makers team up for a dedicated major push. We need an alliance of stakeholders dedicated to a large concerted effort.
31 Multilingual Europe Technology Alliance
32 META-NET is... a Network of Excellence with three major objectives. Prepare the ground for a large-scale concerted effort by building a Strategic Alliance (META) of national and international research programmes, corporate users, commercial technology providers and language communities. Strengthen the European research community through networking of research and by creating new schemes and structures for sharing resources and efforts. Build bridges by approaching open problems in collaboration with other research fields such as machine learning, social computing, cognitive systems, knowledge technologies and multimedia content.
33 Starting Point: Initial funding through an FP7 Network of Excellence, "Technologies for the multilingual European Information Society". Total cost: 7.62 million euro. EU contribution: 5.99 million euro. Execution: From to. Duration: 36 months. Three more consortia have applied for projects that are designed as parts of META-NET. These projects are currently under negotiation. Other centers have also joined by invitation. We now have 42 members in 31 countries.
34 Three Lines of Action
35 Building the Alliance
36 The Process at Large
37 MT Tomorrow: Strive for high-quality MT through highly specialized systems. Exploit semantics through many resources (open linked data). New human-centred research paradigm in MT: truly hybrid processes of machines and humans; humans as providers of data, insights, quality judgements, critique, etc.; humans as test users and evaluators of early MT prototypes. New training of people for pre- and post-editing of MT texts (missing in current curricula for translators).
38 Translation Brokerage: The Translingual Cloud. [Diagram labels: Trusted Translation Broker; Specialized MT/LT Web Service Cloud; PR Brochures; Informal Language; Automatic Summarization; Patents; Int. Company Names; Times and Places; Human Post-Editing; Annual Reports]
39-42 Ambient Translation: Translingual Spaces
43 The discussion of the topics for the 8th Framework of the EU has started. You can participate on the META-NET web forum.
44 What do these people have in common? Head of Google Translate, Google, Mountain View; seven core members of his team; leader of the WMT workshops and main developer of MOSES, Edinburgh; Director of the Next Generation Localization Technology Center in Dublin; Coordinator of EuroMatrix and EuroMatrixPlus; Coordinator of META-NET; Organizer and Director of Localization World, working from Colorado
45 Yes... they are all Germans, and most of them were part of Verbmobil.
46 But what about LT Research in Germany? We urgently need to do something in Germany. Germany has fallen behind in volume of LT research. Today funding for LT research in Germany is at the level of the Netherlands. MT and other LTs for German still perform worse than for many other languages. Germany, as an export-oriented economy, immigration country and global player in science, arts, technology, ecology and business, needs crosslingual technologies.
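47 Opec Payment Office Trafalgar Square Pall Mall East / OPEC ZAHLUNGS BÜRO Trafalgar Quadrat Hülle-Mall Ost DANKE / OPEC-Zahlungsbüro Trafalgar Quadrat Sargtuch-Einkaufszentrum nach Osten / Opec Payment Office Trafalgar Square Pall Mall Osten (In the German renderings, "Quadrat" is a geometric square, "Hülle" a cover or shell, "Sargtuch" a coffin cloth, "Einkaufszentrum" a shopping centre, and "nach Osten" means "to the east".)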