HLT in Hungary - 2009




Gábor Prószéky
MorphoLogic, http://
Pázmány Péter Catholic University, Faculty of Information Technology, http://

Basics on Hungarian
  15 million speakers worldwide, 10 million in Hungary
  Agglutinative language: Finno-Ugric roots (with uncertain points), with only a few small related languages
  In Central Europe since 896: Turkish, Slavic, Romance and German areal influences
  Complex formal descriptions have been needed, because simple CL methods (which work for English) do not work for Hungarian
  The first detailed and computationally usable morphosyntactic description of Hungarian was made in 1991

History of Hungarian HLT
  1960s: Russian-Hungarian MT Group, periodical Computational Linguistics (Prof. Kiefer)
  1970s: A tergo (reverse) dictionary, basic language statistics (Debrecen University - Prof. Papp)
  1980s-: Speech applications (Technical University - Prof. Gordos, Németh, Olaszy, Vicsi); AI applications (ALL - Gergely et al.)
  1991-: Marketable NLP products (MorphoLogic - Prószéky et al.)
  1990s: Historical dictionary, corpus linguistics (Linguistics Institute of HAS - Váradi et al.)
  2000s-: Learning methods in NLP (Szeged University - Prof. Csirik); services combined with speech applications (AITIA - Tatai et al.)
  2002-: Courses in HLT, PhDs in HLT (Pázmány University - Prof. Prószéky, Prof. Takács)
  2003-: Series of Annual National HLT Conferences (Szeged)
  2008-: HLT Platform: 4 academic institutions, 4 enterprises

Hungarian HLT Research
  MorphoLogic (Gábor Prószéky), staff: 15
    Proofing tools, intelligent dictionaries, machine translation, a wide range of linguistic resources for various languages (incl. Hungarian WordNet), text processing tools, lexicographical activities
  Linguistics Institute (Tamás Váradi), staff: 8
    Hungarian National Corpus, research activities in various CL projects (incl. Hungarian WordNet)
  Szeged University (János Csirik), staff: 6
    Machine learning tools for NLP, speech research, activities in various CL research projects (incl. Hungarian WordNet)

Hungarian HLT Research (cont'd)
  Technical University of Budapest (TMIT), staff: 28 (in 9 laboratories)
    Speech Technology Lab: speech information systems, e-mail/SMS reader, tools for blind people (Géza Németh)
    Speech Acoustics Lab: speech databases, medical applications, speech correction, acoustic-phonetic research (Klára Vicsi)
    Speech Recognition Lab: speaker recognition, speech recognition, statistical modeling, multimedia indexing (Tibor Fegyó, Péter Mihajlik)
  Technical University of Budapest (MOKK), staff: 5
    Corpus collection (mono- and bilingual), text alignment, audio/video archives, ontology modeling, POS tagging (Péter Halácsy)

Hungarian HLT Research (cont'd)
  Pázmány Péter Catholic University, Faculty of Information Technology (Gábor Prószéky, György Takács): 4 researchers, 7 PhD students
    Language: WSD, semantic representation, anaphora resolution, text mining
    Speech: mobile applications (incl. mobile for the deaf!)
  Pécs University (Gábor Alberti): 4 researchers, 2 PhD students
    Computational semantics, machine translation, Prolog
  Other universities (with 1-2 researchers)
    Debrecen (literary computing)
    Miskolc (face modeling)
    Szombathely (terminology)

Hungarian HLT Research (cont'd)
  Applied Logic Laboratory (Tamás Gergely): 4 researchers, 5 PhD students
    AI tools for medical and pharmacological applications, cognitive systems
  AITIA (Gábor Tatai): 48 co-workers (a few of them in HLT)
    Speech technology applications, text mining, chat-robots
  Kilgray (Balázs Kis): 3 full-time employees
    Translation memory development

International Cooperations in HLT
  Earlier, in the 1990s: MULTEXT-East, GLOSSER, GRAMLEX, ELSNET Goes East, SPECO, BABEL, TELRI, TRACTOR
  EuroTermBank (MorphoLogic): common EU terminology
  ImportNet (ALL): ontology generation
  EASAIER (ALL): multimedia search
  CACAO (Linguistics Institute): library applications with HLT
  EuroMatrix (MorphoLogic): statistical MT for Europe
  CLARIN (Linguistics Institute & others): resources
  FLaReNet (MorphoLogic): resources

Hungarian HLT Platform (2008-2010)
  Founders of the Platform:
    4 industrial partners:
      AITIA
      Applied Logic Laboratory
      Kilgray
      MorphoLogic
    4 academic partners:
      Linguistics Institute, HAS
      Technical University, Telecomm. & Media-informatics (TMIT)
      Technical University, Center for Media Res. & Educ. (MOKK)
      Szeged University, Res. Group of AI (RGAI)
  New member:
    Pázmány University, Faculty of Information Technology

Hungarian Education in NLP
  Courses in CL/HLT/NLP:
    Pázmány University: HLT (Prószéky + 5 PhD), speech (Takács + 2 PhD)
    Szeged University: machine learning (Csirik, Alexin + 3 PhD)
    Technical University: speech (Gordos, Németh, Olaszy, Vicsi + 3 PhD), artificial intelligence (Prószéky)
  Others:
    Debrecen University: general linguistics programme (Hunyadi)
    ELTE University: theoretical linguistics programme (Kálmán, Oravecz), Dept. of Translation Theory (Prószéky + 3 PhD)
    Pécs University: semantic representation (Alberti + 3 PhD)
    Pannon University, Szombathely: terminology (Fóris)

Annual National Conferences in Computational Linguistics
  2-day conferences, always in December:
    2003, 1st: 39 long and 20 short papers
    2004, 2nd: 46 papers (in 8 sections)
    2005, 3rd: 40 papers (in 7 sections), 13 posters & demos
    2006, 4th: 34 papers (in 7 sections), 16 posters & demos
    2007, 5th: 30 long papers (in 7 sections), 8 posters & demos
    2008, Kick-off Conference of the Platform: 6 plenary presentations, 9 posters & demos

Hungary's No. 1 HLT website: www.webforditas.hu
  Website for various HLT applications: text & web translation, dictionaries, spell-checking, search with linguistic support
  For the query "fordítás" (= "translation") it ranks 1st in Google (among nearly 20 million hits)
  60,000 visitors/day
  In 2008: 91 million pages translated (in 2007: 43 million pages)
    81 million text translations + 2 million web translations + 7 million dictionary lookups
    13.3 GB data traffic/year (at 1800 characters/page, this corresponds to 7.2 million translated A4 pages)
  ...and the human translation market felt nothing (= no translators complained about losing jobs)

Translation between Hungarian and 33 other languages
Technically, it is rather easy to combine two existing web translation services: HU-EN + EN-X and X-EN + EN-HU.
EN-X and X-EN language pairs for which commercial translation services are currently available:

Official EU languages to and from Hungarian:
  1. Bulgarian-Hungarian/Hungarian-Bulgarian (Magyar/български), MorphoLogic & SkyCode
  2. Czech-Hungarian/Hungarian-Czech (Magyar/Čeština)
  3. Danish-Hungarian/Hungarian-Danish (Magyar/Dansk), MorphoLogic & GrammarSoft
  4. Dutch-Hungarian/Hungarian-Dutch (Magyar/Nederlands)
  5. English-Hungarian/Hungarian-English (Magyar/English), MorphoLogic (Hu-En: with LI & SU)
  6. Finnish-Hungarian/Hungarian-Finnish (Magyar/Suomi)
  7. French-Hungarian/Hungarian-French (Magyar/Français), MorphoLogic & ProMT
  8. German-Hungarian/Hungarian-German (Magyar/Deutsch), MorphoLogic & ProMT
  9. Greek-Hungarian/Hungarian-Greek (Magyar/Ελληνικά)
  10. Italian-Hungarian/Hungarian-Italian (Magyar/Italiano), MorphoLogic & ProMT
  11. Latvian-Hungarian/Hungarian-Latvian (Magyar/Latviešu valoda), MorphoLogic & Trident
  12. Lithuanian-Hungarian/Hungarian-Lithuanian (Magyar/Lietuvių kalba)
  13. Polish-Hungarian/Hungarian-Polish (Magyar/Polski), MorphoLogic & pwn.pl
  14. Portuguese-Hungarian/Hungarian-Portuguese (Magyar/Português), MorphoLogic & ProMT
  15. Romanian-Hungarian/Hungarian-Romanian (Magyar/Română)
  16. Slovak-Hungarian/Hungarian-Slovak (Magyar/Slovenčina)
  17. Slovene-Hungarian/Hungarian-Slovene (Magyar/Slovenščina)
  18. Spanish-Hungarian/Hungarian-Spanish (Magyar/Español), MorphoLogic & ProMT
  19. Swedish-Hungarian/Hungarian-Swedish (Magyar/Svenska)

Other European languages to and from Hungarian:
  20-25. HU/Catalan, HU/Croatian, HU/Norwegian (GrammarSoft), HU/Russian (ProMT), HU/Serbian, HU/Ukrainian (Trident)

Important non-European languages to and from Hungarian:
  26-33. HU/Arabic, HU/Chinese, HU/Hebrew, HU/Hindi, HU/Indonesian, HU/Japanese, HU/Korean, HU/Vietnamese
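The pivot idea on this slide (HU-EN + EN-X, and X-EN + EN-HU) can be illustrated with a minimal Python sketch. It is not the webforditas.hu implementation: the names make_pivot and toy_service, and the tiny word tables, are hypothetical stand-ins for real translation services, used only to show how two existing services are chained through English.

```python
from typing import Callable, Dict

# A translator is modeled as any function from text to text.
Translator = Callable[[str], str]


def make_pivot(src_to_pivot: Translator, pivot_to_tgt: Translator) -> Translator:
    """Chain two services: source -> pivot (English) -> target."""
    def translate(text: str) -> str:
        return pivot_to_tgt(src_to_pivot(text))
    return translate


def toy_service(table: Dict[str, str]) -> Translator:
    """Hypothetical stand-in for a real MT service: word-by-word lookup.

    Words missing from the table are passed through unchanged.
    """
    def translate(text: str) -> str:
        return " ".join(table.get(word, word) for word in text.split())
    return translate


# Toy HU-EN and EN-DE services (assumed data, for illustration only).
hu_en = toy_service({"fordítás": "translation", "gép": "machine"})
en_de = toy_service({"translation": "Übersetzung", "machine": "Maschine"})

# HU-DE is obtained without any direct HU-DE system.
hu_de = make_pivot(hu_en, en_de)

if __name__ == "__main__":
    print(hu_de("fordítás"))  # prints "Übersetzung"
```

The reverse direction (X-EN + EN-HU) is built the same way by swapping in the corresponding two services. One consequence of this design is that errors made by the first service are carried into the second, so the quality of a pivot pair depends on both underlying services.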

Features of a general web translation service
  Text translation techniques for any language X if X-En and En-X services are available
  Translation of entire websites
  Combination with various dictionaries (Web2, AJAX)
  Virtual keyboard for all languages
  Spell-checking for all languages
  Integrated text-to-speech tools (and speech recognition, soon)
  Language guesser tools integrated
  Translation options combined with internet search

MT Service for All European Languages
Proposal for a new Pan-European cooperation
Remark: we have not lost interest in finding new ways in MT (e.g. we are partners in EuroMatrix), and we are still working on new scientific methods as well, BUT THIS PROPOSAL IS DIFFERENT:
  it guarantees a usable translation service for a wide range of end-users;
  it builds on the existing service www.webforditas.hu: the above pivot application is running and anybody can use it (currently 60,000 users/day);
  extending the existing application to other local languages requires only software development work;
  usable final results can be guaranteed;
  service providers for many languages are now available on the market;
  cooperation has already started, both from EU countries (HU, BG, DK, PL) and from non-EU countries (RU, UKR);
  the partners' R&D activity is basically the improvement of their own services to produce better translations;
  based on its experience, MorphoLogic is in a position to offer an initiative on how to combine the efforts of potential partners.

Köszönöm figyelmüket! Thanks for your attention!