Panel Discussion 2
Infrastructures, Platforms and Services for the Multilingual Digital Single Market
Participants:
Stelios Piperidis (Institute for Language and Speech Processing, Greece)
Khalid Choukri (ELRA/ELDA, France)
Steven Krauwer (CLARIN ERIC, The Netherlands)
Jochen Hummel (ESTeam, Sweden/Germany)
Andrejs Vasiljevs (Tilde, Latvia)
Asunción Gómez-Pérez (Universidad Politécnica de Madrid, Spain)
Moderation:
Background of the Panel
- services/platforms constitute one layer of the SRIA
- everybody agrees on the demand for such infrastructures
- the current draft of the SRIA leaves this issue very underspecified
- concrete visions and plans are badly needed
- the panel will be a first step for fleshing out this important part of the SRIA
Why Platforms?
- reduction of complexity (one-stop service point)
- shared basic services (maintenance, promotion, licensing, payment)
- supports evolution (competition and cross-fertilization)
- possibly minimum quality assurance
- as soon as end users are involved:
  - collecting data
  - tight test/research loop (hybrid research, ResDevOps)
  - testing products
Most often suggested types of platforms
1. specialized & generic language resources (corpora, tools, lexicons, etc.): future META-SHARE, ELRA, CLARIN
2. basic processing components (sas systems for n languages): BLARKs, European Language Cloud
3. one-stop translation cloud (MT and human translation): Translingual Cloud, European Translation Cloud
4. multilingual semantic resources & processing (NEE, terminology, WSD, KG): European Knowledge Graph
Questions to be answered for each proposed infrastructure
- Offering exactly what? Which data, software & services?
- Offered to whom? R&D, corporate users, public users, citizens?
- Run by whom? EC, R&D consortium, private enterprise(s), PPP?
- Paid by whom? Users, EC, private enterprises?
- Built and filled by whom? Private enterprises, research centers?
Towards a Consensus
                      | needed? | for whom? | run by? | paid by? | filled by?
resources, tools      |         |           |         |          |
basic LP services     |         |           |         |          |
translation services  |         |           |         |          |
semantic services     |         |           |         |          |
META-SHARE (1/3) Stelios Piperidis
- a pan-European infrastructure bringing together online providers and consumers of language data, tools and services
- a distributed network of 29 repositories set up and maintained by 37 organisations in 25 countries of the EU
- infrastructure building started in 2010, in the META-NET framework; public launch in January 2013
- caters for datasets, tools and services for all EU languages, for language and language technology research and development, for the academic and commercial world alike
- offers all tools and functionalities required to support data publishing, sharing and re-use: specialised all-inclusive repository software, metadata model, licensing kit, statistics and reporting services, etc.
META-SHARE (2/3) Khalid Choukri
[Figure: four charts of the META-SHARE inventory. Resource Type (corpus, lexicalConceptualResource, toolService, languageDescription); Linguality (monolingual, multilingual, bilingual, parallel, comparable); Media Type (text, audio, video, textNgram, image, textNumerical); Distribution per Language (English, Spanish, French, German, Finnish, Swedish, Italian, Portuguese, Russian).]
META-SHARE (3/3) Stelios Piperidis
- maintained by member organisations since Feb 2013, with the strong support of ATH/ILSP, CN/ILC, DFKI, ELRA, FBK
- ELRA, an active member, instrumental since the inception phase
- META-SHARE solutions (repository software, model, etc.) used as foundations in some CLARIN Centres; the two infrastructures collaborate on all fronts, and they share a sizeable intersection in their membership anyway
- the META-SHARE model has been adopted by the LIDER project and has been translated into RDF; the translated META-SHARE inventory has been used as one source for populating linghub
- in the QTLaunchPad project, the pilot on extending META-SHARE, as a data infrastructure, with a linguistic processing and annotation layer provided useful insights as to:
  - the adoption and use of standards (representation, I/O, etc.)
  - dependencies on and deployment of cloud infrastructures
  - pertinent legal questions
  - the operational framework of such data+processing service infrastructures
EU policy and SpeechDat Family Illustration
- High level of cooperation between key players
- Public Private-Private-Partnership
- Most of today's app,
- Which lessons did we learn?
- Do we have key players in the EU? YES: from Brno to Orsay to Trento to Leuven, Madrid, Lisbon
- LILA
The biggest risks & Common belief(s)
- Mimic what the EU did for the PC industry in the 90s. What was that? Where do we buy our laptops and chips today, even our drones!!
- We talk about infrastructures, platforms, strategies. Others talk about but also DO R&D and industrial resources & applications, and also about markets & lucrative markets.
- Can this respond to the needs & requirements of the EU in a competitive market? Boeing/Airbus, GM-Chrysler/Daimler, Google & Microsoft
- The big dilemma: R&D funded in Europe and business mastered by
- Maybe just a common belief. DARPA portfolio of languages, Google, Microsoft, Japanese players (How do we benefit from this? DARPA programmes led by.)
- Do we have instruments to implement an SRIA today!! Maybe through the PPP instruments, but who are the EU players with 100M
- An extraordinary use case / EU policy mid-90s for Speech
European Language Cloud
For companies that process text, the European Language Cloud is a web-based set of APIs that provides the basic functionality to build and market products for all languages of the DSM and Europe's main trading partners. Unlike previous incomplete attempts to solve multilingualism, ELC provides easy-to-use API calls in a reliable base quality under the same favorable terms.
[Diagram: European Language Cloud. Nodes: Member States, Non-profit Industry Assn, Institutions, Big Biz, SMEs, European Language Cloud, Language Resources, Global Market. Edge labels: coordinates, promotes, bootstraps, operates, use, serve, maintain, grow.]
Linked Data as an enabler for the DSM
- Linked Data interconnects resources
  - in many domains and languages
  - open and closed
  - license
  - links with other datasets
- Linguistic LD is a subset of LOD
  - enables the lexicalization of data, not necessarily in the LD format
  - enables a new generation of LD-aware NLP and MT services
- Based on
  - agreement on vocabularies
  - unified and standardized languages: RDF(S) and SPARQL
  - standardized non-proprietary APIs
A. Gómez-Pérez (asun@fi.upm.es)
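The interlinking of language-tagged resources across datasets can be illustrated with a toy example. The sketch below is not taken from the slides: it models RDF-style triples as plain Python tuples, and all `http://example.org/...` URIs as well as the helper names `same_as_closure` and `labels_for` are hypothetical. Only `rdfs:label` and `owl:sameAs` are real vocabulary terms.

```python
from collections import deque

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Toy triples (subject, predicate, object). Label objects are (text, lang) pairs.
# Every example.org URI below is invented for illustration.
TRIPLES = [
    ("http://example.org/geo/Lisboa", RDFS_LABEL, ("Lisboa", "pt")),
    ("http://example.org/geo/Lisboa", OWL_SAME_AS, "http://example.org/culture/Lisbon"),
    ("http://example.org/culture/Lisbon", RDFS_LABEL, ("Lisbon", "en")),
    ("http://example.org/culture/Lisbon", OWL_SAME_AS, "http://example.org/lex/Lissabon"),
    ("http://example.org/lex/Lissabon", RDFS_LABEL, ("Lissabon", "de")),
]


def same_as_closure(start, triples):
    """Return every resource reachable from `start` via owl:sameAs, in either direction."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            if s == node:
                neighbour = o
            elif o == node:
                neighbour = s
            else:
                continue
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def labels_for(resource, triples):
    """Collect language-tagged labels from all datasets linked to `resource`."""
    linked = same_as_closure(resource, triples)
    labels = {}
    for s, p, o in triples:
        if p == RDFS_LABEL and s in linked:
            text, lang = o
            labels[lang] = text
    return labels


labels = labels_for("http://example.org/geo/Lisboa", TRIPLES)
```

Starting from any one of the three linked resources yields the same set of labels, which is the sense in which links between datasets lexicalize the data in several languages.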
Linked Data as an enabler for the DSM: uses and users
1. Programmers build applications that make queries in SPARQL and get RDF
2. Citizens/users access LD through a user interface (they do not see RDF)
3. Machine-to-machine data exchange and semantic interoperability in RDF
Domains shown: Culture, Geographical, Smart Cities
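The programmer-facing use (an application sends a SPARQL query and processes what comes back) might look roughly like the sketch below. It is an illustration under stated assumptions, not part of the slides: the function names, the example resource URI and the sample response are hypothetical, and no network call is made. The response layout follows the W3C SPARQL 1.1 Query Results JSON Format. Note that a SELECT query returns variable bindings; a CONSTRUCT query would be needed to get an RDF graph back.

```python
import json


def build_label_query(resource_uri, lang):
    """Build a SPARQL SELECT query for the rdfs:label of a resource in one language.

    Inputs are inserted as-is (no escaping), which is acceptable only for a sketch.
    """
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <{resource_uri}> rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


def parse_bindings(response_text):
    """Turn a SPARQL JSON results document into a list of {variable: value} dicts."""
    doc = json.loads(response_text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given solution, so check before reading it.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


# Hand-written sample response standing in for what an endpoint might return.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["label"]},
  "results": {"bindings": [
    {"label": {"type": "literal", "xml:lang": "pt", "value": "Lisboa"}}
  ]}
}
"""

query = build_label_query("http://example.org/geo/Lisboa", "pt")
rows = parse_bindings(SAMPLE_RESPONSE)
```

A user interface for citizens would sit on top of such a layer and show only the extracted values, never the query or the raw result document.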
Computers understand each other and do business in the DSM
Linked Data as an enabler for the DSM: Questions to be answered
- Offering exactly what? Which data, software & services?
  Linked Data, Linguistic Linked Data, uniform and standardized APIs for accessing data
- Offered to whom? R&D, corporate users, public users, citizens?
  Human and machine consumption
- Run by whom? EC, R&D consortium, private enterprise(s), PPP?
  Anyone: EC, national/regional/local governments, companies, academia, etc.
- Paid by whom? Users, EC, private enterprises?
  Data: by data consumers if the data are not open
  Software & services: anyone (EC, national/regional/local governments, companies, academia, etc.)
- Built and filled by whom? Private enterprises, research centers?
  Licensed data: data owners
  Software & services: anyone
Panel Discussion 2
Infrastructures, Platforms and Services for the Multilingual Digital Single Market
Participants:
Stelios Piperidis (Institute for Language and Speech Processing, Greece)
Khalid Choukri (ELRA/ELDA, France)
Steven Krauwer (CLARIN ERIC, The Netherlands)
Jochen Hummel (ESTeam, Sweden/Germany)
Andrejs Vasiljevs (Tilde, Latvia)
Asunción Gómez-Pérez (Universidad Politécnica de Madrid, Spain)
Moderation:
THE AUDIENCE