Simultaneous Machine Interpretation Utopia? Alex Waibel and the InterACT Team Carnegie Mellon University Karlsruhe Institute of Technology Mobile Technologies, LLC alex@waibel.com
Dilemma: The Language Challenge Living in the Global Village Globalization, Global Markets Increased Exchange and Communication European/International Integration Cultural Diversity: Beauty, Identity, Language, Culture, Customs Pride and Individualism Language Ability Challenge: Providing Access to Global Markets and Opportunities Maintaining Cultural Diversity/Individuality
In Europe: Everyone Speaks English???
The Magnitude of the Problem Today Almost all Translation Done by Human Effort (>99%) ~500,000 translators worldwide, ~150,000 in Europe ~$31 Billion market European Union: 1.3 B Spent on Translation/Interpretation 506 Language Directions to Translate Current Effort Insufficient to Keep up with Needs of 27 Member States Worldwide 6000 Languages 36,000,000 language directions!!! Actual translation work is currently only about 10% of translatable text. Translation needs are growing 25%-35% per year. And that's just for Text...
Conferences Interpretation of Speech Estimated 300,000 Conferences per Year in Europe Compared to Needs Few Professional Interpreters 1% or Less are Interpreted Internet On YouTube, Every Minute 13 Hours of New Videos Television Satellite, Cable: Virtually Unlimited Channels Lectures Government, Universities, Corporations Meetings Telephone Conversations Travel Dialogs/Encounters
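The direction counts on this slide follow from counting ordered (source, target) language pairs: n languages give n × (n − 1) directions. A minimal sketch of that arithmetic follows. The function name is illustrative, and the figure of 23 official EU languages is an assumption added here (it is not on the slide, but it is the value that yields 506).

```python
def language_directions(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


# Assumption: 23 official EU languages (not stated on the slide).
eu_directions = language_directions(23)
# The slide's worldwide figure uses about 6000 languages.
world_directions = language_directions(6000)

print(eu_directions)     # 506
print(world_directions)  # 35994000, i.e. roughly 36 million
```

The quadratic growth is the point: every added language adds two new directions for each language already covered.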
Can Technology Provide a Solution??
Technology To Build a Speech Translator for a New Language 6 Component-Engines: Automatic Speech Recognition, Machine Translation, and Text-to-Speech Synthesis Each is in Principle Language Independent, but Requires Language Dependent Models Models are Automatically Trained but Require Large Corpora Certain Language Dependent Peculiarities Exist
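The component chain on this slide can be sketched as a simple pipeline: recognition, then translation, then synthesis. The sketch below is only an illustration under stated assumptions. The `Direction` class, the stub engines, and the tiny lexicons are all hypothetical stand-ins, not the architecture of any system named in this talk. It also assumes one reading of "6 component engines": three engines per direction, two directions per language pair.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Direction:
    """One translation direction built from three language-dependent engines."""
    recognize: Callable[[bytes], str]   # ASR: audio -> source-language text
    translate: Callable[[str], str]     # MT: source text -> target text
    synthesize: Callable[[str], bytes]  # TTS: target text -> audio

    def run(self, audio: bytes) -> bytes:
        # Chain the three engines: ASR, then MT, then TTS.
        return self.synthesize(self.translate(self.recognize(audio)))


def make_stub_direction(lexicon: dict) -> Direction:
    """Build a toy direction; real engines would be trained on large corpora."""
    def asr(audio: bytes) -> str:
        return audio.decode("utf-8")      # stand-in: the "audio" is just text bytes

    def mt(text: str) -> str:
        return " ".join(lexicon.get(word, word) for word in text.split())

    def tts(text: str) -> bytes:
        return text.encode("utf-8")       # stand-in: the "speech" is just text bytes

    return Direction(asr, mt, tts)


# Two directions for one language pair: 2 x 3 = 6 engines in total.
en_to_de = make_stub_direction({"hello": "hallo", "world": "welt"})
de_to_en = make_stub_direction({"hallo": "hello", "welt": "world"})

print(en_to_de.run(b"hello world"))  # b'hallo welt'
```

The engine code is shared across languages here and only the lexicon changes, which mirrors the slide's point that the engines are language independent in principle while the models are not.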
Statistical Translation Approach Translation and Speech Systems Learn Automatically Statistics Trained over Lots of Data Uses Parallel or Speech Data Data Volume is Growing on Internet
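As an illustration of learning translation statistics from parallel data, here is a minimal sketch of IBM Model 1 word-translation training with expectation-maximization. IBM Model 1 is a classic textbook technique chosen here for illustration only; the slide does not say which models the systems in this talk use. The three-sentence corpus and all function names are assumptions.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) by EM from (source, target) sentence pairs."""
    src_vocab = {s for src, _ in pairs for s in src.split()}
    tgt_vocab = {t for _, tgt in pairs for t in tgt.split()}
    # Start from a uniform translation table.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for tw in tgt_vocab for sw in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected alignment counts.
        for src, tgt in pairs:
            src_words, tgt_words = src.split(), tgt.split()
            for tw in tgt_words:
                norm = sum(t[(tw, sw)] for sw in src_words)
                for sw in src_words:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalize the counts into probabilities.
        for (tw, sw) in t:
            t[(tw, sw)] = count[(tw, sw)] / total[sw] if total[sw] else 0.0
    return t


def best_translation(t, source_word):
    """Most probable target word for a source word under table t."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Toy parallel corpus (German source, English target); purely illustrative.
corpus = [("das haus", "the house"), ("das buch", "the book"), ("ein buch", "a book")]
table = train_ibm_model1(corpus)
print(best_translation(table, "haus"))  # house
print(best_translation(table, "buch"))  # book
```

No dictionary is supplied anywhere: the word pairings emerge from co-occurrence across sentence pairs, which is why more parallel data generally helps.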
Speech Translation Progression of Technologies: Domain Limited, Clear Speaking Style (late '80s-'91) Janus (first European & US speech-to-speech system) ATT, NEC, ATR Domain Limited, Spontaneous ('91-'00) Janus II/III (work on 20 languages), Verbmobil, Nespole, Enthusiast, C-STAR, ATR, ETRI, NLPR, Mobile Consecutive Interpretation Transtac, Babylon, Phraselator, Jibbigo, U-STAR Domain Unlimited Simultaneous Interpretation Parliamentary Speeches (TC-STAR) Broadcast News (GALE) Lectures, Seminars (InterACT, STAR-DUST, TC-STAR)
Mobile Consecutive Interpretation Technologies for Cross-Lingual Dialog
How it is Done Now: Human Interpreters Charts, Dictionaries Limitations/Problems: Limited Supply!! Fidelity/Trust/Security Number of Languages Humanitarian Needs
Jibbigo: No Server Necessary! Real-Time Translation All on the Phone 15++ languages 40,000 Words
Jibbigo on Apple Commercials
Jibbigo Systems iTunes & Android App Stores: English, Spanish, French, German, Japanese, Chinese, Korean, Filipino, Iraqi, Thai, Pashto, Dari Other Languages Cost: Free Jibbigo Online Translator Off-Line: Freedom from Network Outside of App Store: Enterprise Versions for Special Applications Supported Devices
Humanitarian Deployment
Thailand Cobra Gold 11
Cambodia
San Jose, Honduras
Unlimited Domain Simultaneous Speech Translation Technologies
Domain Unlimited Translators for: TV/Radio Broadcast Translation Translation of Lectures and Speeches Parliamentary Speeches (UN, EU, ...) Telephone Conversations Meeting Translation 你们的评估准则是什么 (What are your evaluation criteria?)
Translation of Speeches
University Lectures
EU-BRIDGE Bridges across the Language Divide
End-to-End Speech Translation
Lectures of the Future
Meeting of the Future Chinese English Arabic Spanish
Seeing Personal Translations Technology: Heads-up Display Goggles Display Translation Text in Translation Glasses
Hearing Personal Translations Targeted Audio Array of Ultra-Sound Speakers Targeted Beam of Audio Can only be Heard in Narrow Area Multiple Arrays Could Provide Multiple Languages
Internet Delivery Students bring their own Devices Transcription/Translation Output is Delivered via Web Page Interpretation Done on Server User Can Select Languages Prof. Alex Waibel
Service Infrastructure [Diagram: Components (ASR, MT) connected through Services to Events (Lecture 1, Lecture 2, Lecture 3)] New Improved Technologies Adaptation, Learning Speech Services for Users and Developers
Lecture Interpretation Service Launch at KIT: Summer 2012, Support for 4 Courses EU-BRIDGE, Support
Tools for Students Translation of Power Point Slides Presentation by Sub-Titles
Search for Content Transcripts useful to Search for Content Slides, and Lectures in the Cloud Multi-Lingual Search and Retrieval in Lectures and Slides by Way of Search Terms
New Challenges Simultaneous Translation of Lectures Continuous Monologue Broadcast News, Speeches, Lectures Speaking-Style Fast, spontaneous, fragmentary, and no punctuation!! Noise, Coughing, Singing (!) Vocabulary Much larger, Special Vocabularies Speed, Realtime Service-Infrastructure Many parallel lectures; Automatic, robust assignment of compute power
The German Lecture Translator MT in German Lectures is particularly hard. Why? Peculiarities of German: Word Order: "Ich möchte mich zu der Konferenz über Maschinelle Übersetzung anmelden" (I want to register for the conference on Machine Translation; the verb "anmelden" comes last) Compounds: Worterkennungsfehlerrate (Word Recognition Error Rate) Inflections and Agreement: "Zu der nächsten wichtigen interessanten Vorlesung" (To the next important, interesting lecture)
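One common way to handle long compounds such as the one on this slide is to split them into known words before translation. The following is a minimal sketch of a dictionary-based splitter, not the method used in the lecture translator. The lexicon, the glosses, and the treatment of "s" as the only linking element are assumptions made for illustration.

```python
def split_compound(word, lexicon, linkers=("", "s")):
    """Return a list of lexicon entries covering `word`, or None if no split exists."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try the longest possible first part, then recurse on the remainder.
    for i in range(len(word) - 1, 0, -1):
        head = word[:i]
        if head not in lexicon:
            continue
        rest = word[i:]
        for linker in linkers:
            # An optional linking element (here only "s") may join two parts.
            if rest.startswith(linker):
                tail = split_compound(rest[len(linker):], lexicon, linkers)
                if tail is not None:
                    return [head] + tail
    return None


# Hypothetical mini-lexicon with English glosses.
GLOSSES = {"wort": "word", "erkennung": "recognition", "fehler": "error", "rate": "rate"}

parts = split_compound("Worterkennungsfehlerrate", GLOSSES)
print(parts)                                # ['wort', 'erkennung', 'fehler', 'rate']
print(" ".join(GLOSSES[p] for p in parts))  # word recognition error rate
```

Splitting lets a system translate a compound it has never seen as a whole, provided its parts are in the vocabulary.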
Words, Words, Words Technical Terms normally not in normal vocabularies: Cepstral-Koeffizienten (Cepstral Coefficients), Wälzlagerungen (Roller Bearings), Unterraum (Subspace) Technical Terms with special Meanings: Klausur = Final Exam (not Retreat), Vorzeichen = Sign (not Omen) Formulas: "Eff von Ix" = f(x)
Words, Words, Words. Foreign Words in German Language Computer Science, English Expressions Political Speeches, Latin Proverbs Accent Würfelkalkül (Asfour) Foreign Words in German Language Cloud, iPhone, iPad, Laser Inflections & Declensions of these Words Web-ge-casted, down-ge-loaded Formation of Compounds: Cloudbasierter Webcastzugriff
Solution Use Power Point Slides and Publications Search Internet for Similar Topics Incorporate User Corrections Adapt Vocabulary to Lecture
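The "adapt vocabulary to lecture" step can be sketched as follows: collect the words on the lecture slides and add those missing from the recognizer's base vocabulary. This is a simplified illustration, assuming plain-text slides and a vocabulary held as a set of lowercase words. The function name, the tokenization rule, and the sample data are hypothetical, and a real system would also need pronunciations and language-model updates for the new words.

```python
import re
from collections import Counter


def adapt_vocabulary(base_vocab, slide_text, min_count=1):
    """Return (adapted_vocab, new_terms) after adding unseen slide words."""
    # Tokens are runs of letters, optionally joined by hyphens (keeps "cepstral-koeffizienten").
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", slide_text.lower())
    counts = Counter(tokens)
    new_terms = {w for w, c in counts.items() if c >= min_count and w not in base_vocab}
    return set(base_vocab) | new_terms, new_terms


# Hypothetical base vocabulary and slide text.
base = {"die", "der", "und", "sind", "wichtig"}
slides = "Cepstral-Koeffizienten und der Unterraum sind wichtig"

adapted, added = adapt_vocabulary(base, slides)
print(sorted(added))  # ['cepstral-koeffizienten', 'unterraum']
```

The same routine could be run over publications or web pages on similar topics, the other sources listed on the slide, with `min_count` raised to filter out noise.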
Languages: The Long Tail of Language Only a Few Languages are Currently Addressed (<50) Development of Technology Takes Long & Is Expensive Ongoing Research to Lower Cost per Language and Speed Development
Discussion Is Interpretation by Machine Possible? Yes, Performance will Continue to Improve and be Made Available over the Internet Are we Replacing Human Interpreters? No! Machine Translation Quality Remains Worse; it Lacks Human Judgement and Intuition But: Human vs. Machine is Usually not the Choice we have! What about the Rest of us? The Common Reality is: Poor English or No Communication! A Social Challenge! Are we Hindering Human Language Learning? No! Technology Enables and Empowers Human Interaction and thus Motivates and Supports more Language Learning
Our Mission Multi-Lingual Understanding & Integration for All Maintain & Nurture Language Diversity and Heritage Europe must Provide for its own Effective Support (and not Outsource or Ignore the Problem) We Can only Achieve this if we Embrace and Integrate Both: Human and Machine Support Achieve Symbiotic, Scalable Solutions by Language Services that Complement and Magnify Human Effort with Machine Support