How To Help The Netherlands Language And Speech Technology (Lst) Project




CLST Annual Report 2012

CLST is the Centre for Language and Speech Technology. CLST operates as a separate unit within the Faculty of Arts of the Radboud University Nijmegen. CLST was founded in January 2003. Its objective is to contribute to the development of language and speech technology. CLST is active in research, application development and consultancy. Ideal projects for CLST are research projects with a link to applications. Whenever possible, CLST publishes about these projects in research journals and conference proceedings. Projects are funded and carried out within national and international research programmes, while contracts are also accepted from commercial parties.

CLST focuses on research in and application of Language and Speech Technology (LST) in the following domains:
- Data mining & knowledge discovery
- Language learning and language teaching
- Communication in Health Care
- Forensics

Our cross-over activities are bundled in the domain:
- Resources and research infrastructure

Particular strengths of CLST are the following:
- Knowledge and expertise in speech technology, particularly in fundamental ASR technology and speech data
- Knowledge and expertise in language technology, particularly in the integration of rule-based and statistical processing of written language
- Combination of knowledge and expertise in the field of language and speech technology
- Combination of fundamental and applied research
- Strongly embedded in an eHumanities context
- Knowledge and expertise in the design, collection and validation of spoken and written language resources, and in the creation and maintenance of linguistic and lexical databases
- A firm position in a network of national and international contacts with additional expertise and knowledge

In 2013 CLST will celebrate its 10th anniversary. Preparations for this celebration are in full progress.

Board

Board members in 2012 were R. van Hout (chair), A. van den Bosch, D. van Leeuwen, P. Fikkert and N. Oostdijk. N. Schröder is advisor to the Board. The two vacancies for external board members were filled in 2012 with the accession of J. Oomen (Netherlands Institute for Sound and Vision) and R. Smeulders (BrilliantBrains). Final responsibility for CLST lies with the Dean of the Faculty. Executive director is H. van den Heuvel.

Projects

Below we present an overview of the research projects in which CLST was involved in 2012, sorted per research domain (see http://www.ru.nl/clst for detailed information on all projects).

Data mining & knowledge discovery
- BATS (Speaker tracking and topic detection in audio archives) (completed in 2012)
- PoliticalMashup (Text analytics for political data) (completed in 2012)
- TM4IP (Text Mining for Intellectual Property)
- SCALE (Speech Communication with Adaptive Learning) (completed in 2012)
- Speech as metadata II: Speaker diarization for the Netherlands Institute for Sound and Vision
- Automatic detection of threatening tweets

Language learning and language teaching
- DISCO (ASR-based CALL for training oral proficiency in Dutch)
- My Pronunciation Coach (Development of a pronunciation trainer for English as a second language)
- FASOP (Corrective feedback and acquisition of syntax in oral proficiency)
- GOBL (Games Online for Basic Language learning)
- HLT4LL (An interdisciplinary meeting on the contribution of HLT to the future of language learning)
- LIN (A validated reading level tool for Dutch)
- MPC (My Pronunciation Coach: development of a pronunciation trainer for English as a second language)
- ACMoLA (A Computational Model of Language Acquisition)

Communication in Health Care
- ComPoli (LST to improve patients' communication possibilities with an eHealth web portal)
- FeetBack (Combining body sensors and persuasive messages to incite patients to healthy behavior) (completed in 2012)
- INSPIRE (Speech processing in realistic environments, e.g. in hearing aids and cochlear implants)
- Optifox (Automatic optimization of the settings of a cochlear implant using automatic pronunciation assessment techniques) (completed in 2012)
- PEDDS (Automatic feedback on pronunciation of people with dysarthria) (completed in 2012)

Forensics
- BBfor2 (Bayesian Biometrics for forensics)

- COST Action IC1106: Integrating Biometrics and Forensics for the Digital Age
- Automatic detection of threatening tweets

Resources, tools and research infrastructure
- CLAM-Support (Supporting CLAM as interface for tools in a web application)
- DCS (Data Curation Service for CLARIN-NL)
- D-LUCEA (Database of the Longitudinal Utrecht Collection of English Accent)
- Elftal (Automatic Recognisers for Eleven African Languages)
- Limburgian Spelling (A website with author environment to practice the spelling of various dialects of the Dutch province Limburg) (completed in 2012)
- SoNaR (A 500 M words text corpus of written Dutch) (completed in 2012)
- TTNWW (Tools for Dutch LST as web services in a workflow)

In this overview we find projects from various funding sources:
- CLARIN-NL projects: CLAM-Support, DCS, D-LUCEA, TTNWW
- EU projects: BBfor2, GOBL, INSPIRE, Optifox, SCALE, COST IC1106
- NWO, STW & ZonMW projects: ACMoLA, BATS, ComPoli, FASOP, LIN, MPC, PEDDS
- STEVIN projects: DISCO, SoNaR
- Other: Dutch Government (incl. Dutch Language Union), Netherlands Institute for Sound and Vision

CLST is the coordinating partner of two Marie Curie ITN projects, viz. BBfor2 and INSPIRE, and of one LLP KA2 project, viz. GOBL. CLST participated in a third one, viz. SCALE (coordinated by the University of Saarbrücken).

Acquisition

In the Lifelong Learning Programme (Grundtvig), Helmer Strik and Catia Cucchiarini acquired a project named DigLIn (Digital Literacy Instructor), which aims to substantially advance literacy acquisition in Dutch, English, German and Finnish. This project intends to develop advanced learning materials for languages with different degrees of orthographic transparency and to study their impact on L2 learners' literacy acquisition. For this, the key technology is Automatic Speech Recognition (ASR). One of the partners, Friesland College, already has a suitable digital learning course; it will be localized to the other languages (English, German and Finnish), and ASR will be added to it. The project is coordinated by Dr Helmer Strik. Partners are: Friesland College (NL), University of Newcastle upon Tyne (UK), University Leipzig (Germany) and the University of Jyväskylä (Finland). The project started on 1 Jan. 2013 and lasts for three years. CLST is the coordinating partner.

H. Strik (together with M. Mulken) acquired partnership in the European 'Lifelong Learning Programme' (LLP) project IntlUni: 'The Challenges of the Multilingual and Multicultural Learning Space in the International University'.

David van Leeuwen acquired funds from the Dutch Language Union (in cooperation with the government of South Africa) to develop an automatic language recognizer for eleven African languages. The name of the project is Elftal (Eleven Likelihoods for the African Languages). In the same framework Helmer Strik acquired funding to organize an interdisciplinary meeting (HLT4LL) in Stellenbosch with the aim of making clear what the possibilities and challenges are of using human language technology (HLT) for language learning (LL).

In the fourth open call of CLARIN-NL two projects were granted: VALID, which aims at the curation of five databases with recordings and tests/analyses of persons with language impairments; and RemBench, which aims at interlinking various databases centering around the life and work of the painter Rembrandt van Rijn in a demonstrator application.

Participation in new initiatives and awareness

At the national level CLST keeps track of new trends and developments. To this end, CLST's director participates at Board level (Treasurer) in NOTaS (Nederlandse Organisatie voor Taal- en Spraaktechnologie), an organization in which national companies and knowledge centers cooperate to stimulate the development and application of language and speech technology in the Dutch language. In addition, in 2012, he became a member of the ELRA Board. ELRA's missions are to promote language resources for the Human Language Technology (HLT) sector, and to evaluate language engineering technologies (http://www.elra.info). Further, CLST is involved in CLARIN-NL (http://www.clarin.nl), which is a large-scale national collaborative effort to create, coordinate and make language resources and technology available and readily usable in the Humanities and Social Sciences. Due to the CLARIN-EU project similar initiatives have started in more European countries such as Germany and Norway. As of Feb. 29th, 2012, CLARIN is also an official ERIC.

CLST is represented in the International Speech Communication Association Special Interest Group (ISCA SIG) on Speech and Language Technology in Education (SLaTE, www.sigslate.org) by H. Strik (vice-president) and C. Cucchiarini.

CLST members are also involved in European initiatives pertaining to the language resources infrastructure. Involvement is considered of some importance as the infrastructure affects the possibilities for research in the area of language and speech technology:
- CLARIN-ERIC (http://www.clarin.eu/)
- FLaReNet, which aims at developing a common vision of the area of language resources and language technologies, and fostering a European strategy for consolidating the sector, thus enhancing competitiveness at EU level and worldwide (http://www.flarenet.eu)
- META-NET, which is a Network of Excellence dedicated to fostering the technological foundations of a multilingual European information society. META-NET is building the Multilingual Europe Technology Alliance (META), bringing together researchers, commercial technology providers, private and corporate language technology users, language professionals and other information society stakeholders (http://www.meta-net.eu/)
- CLARIAH, which aims to design, construct, and exploit a facility for eHumanities research (http://www.clariah.nl)

ICT infrastructure

Together with the Department of Linguistics and the Department of Business Communication, CLST set up a powerful computing and storage infrastructure at the Faculty of Science. The equipment consists of seven 16-core computing units and over 100 TB of disk storage. CLST values this infrastructure as an important expansion of its computing and storage capacities. We have employed an additional ICT specialist and programmer for system maintenance and user support.

Personnel

Five temporary staff members left CLST in 2012 and found employment elsewhere (Sanne Frazer, Heyun Huang, Maaike Jongenelen, Mitchell McLaren, Maaske Treurniet). Three new employees were added to the staff: two junior researchers (Cezara Pastrav, Juliane Schmidt) and one ICT specialist (Bouke Versteegh). At the end of 2012 the staff employed by CLST consisted of four tenure staff members, four temporary staff members, eight PhD students, and three members with a temporary part-time posting from the Department of Linguistics, totalling 14.4 fte (including ICT support). The institute's daily management is in the hands of the director with the support of a secretary (together 0.5 fte). The table below shows the distribution of the CLST staff (in fte) over various types of contracts and financing. Student assistants are excluded from this overview.

                                 NWO-like projects   Other external funding
Tenure staff                           1.7                  1.2
PhD students                           3.6                  3.7
Other temporary staff: Junior          0.6                  0.2
Other temporary staff: Postdoc         1.0                  1.5
Secondments Dept. Linguistics          0.15                 0.25
Sum                                    7.05                 6.85

Annual performance interviews were held with all staff members. These interviews serve to achieve a match between personal ambitions, job contents and career opportunities of individual employees. CLST reserves part of its budget for training and education of its personnel. However, staff members preferably follow courses offered for free by the Radboud University to its employees as part of the University's career policy.

Internships

This year we have started extra efforts in training and educating students in LST-related research and development. Apart from attending fairs where internships are offered to students, we have also opened a web portal where proposals for internships are displayed, both at Bachelor and at Master level (http://www.ru.nl/clst/interships-(-stages/clst-internships/).

Customer Satisfaction

For most of its projects CLST is obliged to write progress reports. These progress reports follow the templates provided by the funding bodies (NWO, EU, CLARIN-NL). Customer satisfaction in other projects (typically with companies) is monitored on a regular basis by personally contacting the customer by e-mail or telephone. Customer responses are in general very positive.

Public Relations

Every CLST research employee visited at least one international conference or workshop. Attendance is, as a rule, restricted to conferences and workshops to which the employee contributes an accepted paper. In all contributions the affiliation to CLST and the Radboud University is mentioned.

Further PR activities in 2012 were:
- Website demo updates now including:
  o Demo of the Oral History Annotation tool from the INTER-VIEWs project
  o Data search in video recordings of popular TV series: Boer zoekt vrouw (BATS project)
  o ML Translation system for Frisian <-> Dutch: Oersetter
  o Rembrandt Documents project: RemDoc
- Website video updates now including:
  o Videos of applications showing some of the application potentials of the DISCO, FASOP and PEDDS projects
- Folders and brochures (Dutch, English) for e.g. Health Valley 2012

- Publicity in newspapers and radio/tv broadcasts addressing:
  o Threatening tweets project
  o Political speaking time monitor (application running during the Dutch parliamentary elections period; http://beeldengeluid.nl/politieke-spreektijd-monitor)
  o The RemDoc project
- Periodicals for a general or professional audience such as DIXIT

PhD Defenses & Publications

One earlier CLST project resulted in an academic PhD defense in 2012:

De Vriend, F. (2012, Oct. 18). Tools for Computational Analyses of Dialect Geography Data. Radboud Universiteit Nijmegen. Prom.: Prof. L.W.J. Boves & Prof. R. van Hout. Project: D-square.

Publications by CLST employees are listed in the Appendix.

Henk van den Heuvel, Director CLST
H.vandenHeuvel@let.ru.nl
http://www.ru.nl/clst

Publications 2012

Journal papers:

Astudillo, R.F., Kolossa, D., Abad, A., Zeiler, S., Saeidi, R., Mowlaee, P., Silva Neto, J.P. da & Martin, R. (2012). Integration of Beamforming and Uncertainty-of-Observation Techniques for Robust ASR in Multi-Source Environments. Computer Speech and Language, 27(3), 837-850.
Bergmann, C., Paulus, M.A. & Fikkert, J.P.M. (2012). Preschoolers' comprehension of pronouns and reflexives: the impact of the task. Journal of Child Language, 39(4), 777-803.
Bosch, A.P.J. van den, Morante, R. & Canisius, S.V.M. (2012). Joint learning of dependency parsing and semantic role labeling. Computational Linguistics in the Netherlands Journal, 2, 97-117.
Bruijn, M.J. de, Bosch, L.F.M. ten, Kuik, J., Witte, I., Langendijk, A., Leemans, C. & Verdonck, M. (2012). Acoustic-phonetic and artificial neural network feature analysis to assess speech quality of stop consonants produced by patients treated for oral or oropharyngeal cancer. Speech Communication, 54(5), 632-640.
Camp, M.M. van de & Bosch, A.P.J. van den (2012). The socialist network. Decision Support Systems, 53(4), 761-769.
Cucchiarini, C., Nejjari, W. & Strik, H. (2012). My Pronunciation Coach: Improving English pronunciation with an automatic coach that listens. Language Learning in Higher Education, 1(2), 365-376.
D'hondt, E.K.L., Verberne, S., Weber, Niklas, Koster, C.H.A. & Boves, L.W.J. (2012). Using skipgrams and PoS-based feature selection for patent classification. Computational Linguistics in the Netherlands Journal, 2, 52-70.
Halteren, H. van & Oostdijk, N.H.J. (2012). Towards Identifying Normal Forms for Various Word Form Spelling on Twitter. Computational Linguistics in the Netherlands Journal, 2, 2-22.
Hanilci, C., Kinnunen, T., Ertas, F., Saeidi, R., Pohjalainen, J. & Alku, P. (2012). Regularized All-Pole Models for Speaker Verification Under Noisy Environments. IEEE Signal Processing Letters, 19(3), 163-166.
Kinnunen, T., Saeidi, R., Sedlak, F., Lee, K.A., Sandberg, J., Hansson-Sandsten, M. & Li, H. (2012). Low-Variance Multitaper MFCC Features: a Case Study in Robust Speaker Verification. IEEE Transactions on Audio, Speech and Language Processing, 20(7), 1990-2001.
McLaren, M.L. & Leeuwen, D.A. van (2012). Source-Normalized LDA for robust Speaker Recognition using i-vectors from multiple sources. IEEE Transactions on Audio, Speech and Language Processing, 20(3), 755-766.
Mowlaee, P., Saeidi, R., Tan, Z.H., Christensen, M.G., Kinnunen, T., Franti, P. & Jensen, S.H. (2012). A Joint Approach for Single-Channel Speaker Identification and Speech Separation. IEEE Transactions on Audio, Speech and Language Processing, 20(9), 2586-2601.
Réveil, B., Martens, J-P. & Heuvel, H. van den (2012). Improving Proper Name Recognition by means of automatically learned pronunciation variants. Speech Communication, 54(3), 321-340.
Ruiter, M.B., Beijer, L.J., Cucchiarini, C., Kramer, E., Rietveld, T., Strik, H. & Van hamme, H. (2012). Human Language Technology and communicative disabilities: Requirements and possibilities for the future. Language Resources and Evaluation, 46, 143-151.
Truong, K.P. & Leeuwen, D.A. van (2012). Speech-based recognition of self-reported and observed emotion in a dimensional space. Speech Communication, 54, 1049-1063.

Book chapters:

Mos, M.B.J., Bosch, A.P.J. van den & Berck, P.J. (2012). The predictive value of word-level perplexity in human sentence processing: A case study on fixed adjective-preposition constructions in Dutch. In S.Th. Gries & D. Divjak (Eds.), Frequency Effects in Language Learning and Processing (Trends in Linguistics. Studies and Monographs, 244.1) (pp. 207-240). Berlin: De Gruyter.

Proceedings papers:

Bahari, M.H., McLaren, M.L., Van hamme, H. & Leeuwen, D.A. van (2012). Age Estimation from Telephone Speech using i-vectors. In Proceedings of Interspeech 2012. Portland, USA.
Bergmann, C., Boves, L.W.J. & Bosch, L.F.M. ten (2012). A model of the Headturn Preference Procedure: Linking cognitive processes to overt behaviour. In Proceedings of the IEEE Conference on Development and Learning and Epigenetic Robotics (IEEE ICDL-EpiRob 2012) (pp. electr.). San Diego, CA.
Bosch, A.P.J. van den & Berck, P.J. (2012). Memory-based text correction for preposition and determiner errors. In Proceedings of the 7th Workshop on the Innovative Use of NLP for Building Educational Applications (pp. 289-294). Stroudsburg, PA, USA: Association for Computational Linguistics.
Bosch, L.F.M. ten & Scharenborg, O.E. (2012). Modeling cue trading in human word recognition. In Proceedings of Interspeech 2012. Portland, OR, USA: Interspeech 2012.
Cucchiarini, C., Doremalen, J.J.H.C. van & Strik, H. (2012). Practice and feedback in L2 speaking: an evaluation of the DISCO CALL system. In Proceedings of Interspeech 2012 (pp. DVD). Portland, USA.
Hanilci, C., Kinnunen, T., Saeidi, R., Pohjalainen, J., Alku, P., Ertas, F., Sandberg, J. & Hansson-Sandsten, M. (2012). Comparing Spectrum Estimators in Speaker Verification Under Additive Noise Degradation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012. Kyoto, Japan: IEEE.
Hanilci, C., Kinnunen, T., Saeidi, R., Pohjalainen, J., Alku, P. & Ertas, F. (2012). Regularization of all-pole models for speaker verification under additive noise. In Proceedings of The Speaker and Language Recognition Workshop Odyssey (pp. 236-242). Singapore.
Heuvel, H. van den, Sanders, E.P., Rutten, R., Scagliola, S. & Witkamp, P. (2012). An Oral History Annotation Tool for INTER-VIEWs. In Proceedings of LREC 2012 (pp. 215-218). Istanbul.
Huang, H., Bosch, L.F.M. ten, Cranen, B. & Boves, L.W.J. (2012). Smoothing Trajectories by Regularization. In Proceedings of the Statistical And Perceptual Audition (SAPA) Workshop 2012 (pp. DVD).
Huang, H., Bosch, L.F.M. ten, Cranen, B. & Boves, L.W.J. (2012). Exploring Discriminative Speech Trajectory Structures. In Proceedings of Interspeech 2012 (pp. DVD). Portland, Oregon, USA.
Huang, H., Liu, Y., Bosch, L.F.M. ten, Cranen, B. & Boves, L.W.J. (2012). Knowledge-based Quadratic Discriminant Analysis. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012 (pp. 4145-4148). Kyoto, Japan: IEEE.
Hurmalainen, A., Saeidi, R. & Virtanen, T. (2012). Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition. In Proceedings of Interspeech 2012 (pp. DVD). Portland, USA.
Kinnunen, T., Saeidi, R., Leppanen, J. & Saarinen, J.P. (2012). Audio context recognition in variable mobile environments from short segments using speaker and language recognizers. In Proceedings of The Speaker and Language Recognition Workshop Odyssey (pp. 301-311). Singapore.

Leeuwen, D.A. van & Bahari, M.H. (2012). Calibration of probabilistic age recognition. In Proceedings of Interspeech 2012 (pp. DVD). Portland, USA.
Mandasari, M.I., McLaren, M.L. & Leeuwen, D.A. van (2012). The Effect of Noise on Modern Automatic Speaker Recognition Systems. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4249-4252). Kyoto, Japan.
McLaren, M.L. & Leeuwen, D.A. van (2012). Gender-Independent Speaker Recognition using Source Normalisation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012. Kyoto, Japan: IEEE.
McLaren, M.L., Mandasari, M.I. & Leeuwen, D.A. van (2012). Source Normalization for Language-Independent Speaker Recognition using i-vectors. In Proceedings of The Speaker and Language Recognition Workshop Odyssey (pp. 55-61). Singapore.
Mowlaee, P., Saeidi, R. & Martin, R. (2012). Model-driven speech enhancement for multisource reverberant environment (Signal Separation Evaluation Campaign SiSEC 2011). In Proceedings of the 10th International Conference on Latent Variable Analysis and Source Separation LVA/ICA 2012 (pp. 454-461). Tel-Aviv, Israel.
Mowlaee, P., Saeidi, R., Christensen, M.G. & Martin, R. (2012). Subjective and Objective Quality Assessment of Single-channel Speech Separation Algorithms. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012. Kyoto, Japan: IEEE.
Niesler, T., Vuuren, S. van & Bosch, L.F.M. ten (2012). Automatic segmentation of TIMIT by dynamic programming. In Proceedings of Pattern Recognition Association of South Africa (PRASA) (pp. 40-46).
Oostdijk, N.H.J. & Heuvel, H. van den (2012). Introducing the CLARIN-NL Data Curation Service. In Proceedings of LREC 2012 (pp. 29-34). Istanbul, Turkey: European Language Resources Association (ELRA).
Reynaert, M., Schuurman, A.K., Hoste, V., Oostdijk, N.H.J. & Gompel, M. van (2012). Beyond SoNaR: towards the facilitation of large corpus building efforts. In Proceedings of LREC 2012 (pp. 2897-2904). Istanbul, Turkey: European Language Resources Association (ELRA).
Saeidi, R., Hurmalainen, A., Virtanen, T. & Leeuwen, D.A. van (2012). Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification. In Proceedings of The Speaker and Language Recognition Workshop Odyssey (pp. 248-255). Singapore.
Saeidi, R., Mowlaee, P. & Martin, R. (2012). Phase estimation for signal reconstruction in single-channel source separation. In Proceedings of Interspeech 2012 (pp. DVD). Portland, USA.
Sanders, E.P. (2012). Collecting and Analysing Chats and Tweets in SoNaR. In LREC 2012 (pp. 2253-2256). Istanbul, Turkey: European Language Resources Association (ELRA).
Sappelli, M., Verberne, S., Heijden, Maarten van der, Hinne, M. & Kraaij, W. (2012). Collection and Analysis of ground truth data for query intent. In DIR 2012 Proceedings of the Dutch-Belgium Information Retrieval workshop (pp. 7-10). S.l.: s.n.
Strik, H., Colpaert, J., Doremalen, J.J.H.C. van & Cucchiarini, C. (2012). The DISCO ASR-based CALL system: practicing L2 oral skills and beyond. In Proceedings of LREC 2012 (pp. DVD). Istanbul, Turkey.
Strik, H. (2012). ASR-based systems for language learning and therapy. In Proceedings of IS-ADEPT 2012 (pp. DVD). Stockholm, Sweden: KTH.

Sun, Y., Doss, M.M., Gemmeke, J.F., Cranen, B., Bosch, L.F.M. ten & Boves, L.W.J. (2012). Combination of Sparse Classification and Multilayer Perceptron for Noise-robust ASR. In Proceedings of Interspeech 2012 (pp. DVD). Portland, Oregon, USA.
Sun, Y., Cranen, B., Gemmeke, J.F., Bosch, L.F.M. ten, Boves, L.W.J. & Doss, M.M. (2012). Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR. In Proceedings of Interspeech 2012 (pp. DVD). Portland, Oregon, USA.
Treurniet, M. & Sanders, E.P. (2012). Chats, Tweets and SMS in the SoNaR Corpus: Social Media Collection. In D. Newman (Ed.), Proceedings 1st Annual International Conference Language, Literature & Linguistics (L3 Conference) (pp. 268-271). Singapore: GSTF.
Treurniet, M., De Clercq, O., Heuvel, H. van den & Oostdijk, N.H.J. (2012). Collection of a corpus of Dutch SMS. In Proceedings of LREC 2012 (pp. 2268-2273). Istanbul, Turkey: European Language Resources Association (ELRA).
Turco, G. & Gubian, M. (2012). L1 Prosodic transfer and priming effects: A quantitative study on semi-spontaneous dialogues. In Proceedings of the 6th International Conference on Speech Prosody (pp. online). Shanghai.
Verberne, S., Bosch, A.P.J. van den, Strik, H. & Boves, L.W.J. (2012). The effect of domain and text type on text prediction quality. In Proceedings of EACL 2012 (pp. 561-569). New Brunswick, Canada: Association for Computational Linguistics.

Popular specialist publications:

Bogers, T. & Bosch, A.P.J. van den (2012). Onderzoekers die dit artikel lazen, lazen ook... DIXIT, 8, 18-18.
Bosch, A.P.J. van den (2012). Contextgebaseerde spellingcorrectie met Valkuil.net. DIXIT, 8, 17-17.
Bosch, A.P.J. van den (2012). Valkuil.net: Spellingcorrectie met gevoel voor context. Tekst[blad], 18(2), 16-18.
Boves, L.W.J. (2012). Enterprise language processing. DIXIT, 9, 32-33.
Heuvel, Th. Van den, T Sas, J. & Verberne, S. (2012). Onderzoek taal- en spraaktechnologie en onderwijs. Publication by the Dutch Language Union.
Oostdijk, N.H.J. (2012). De digitale speurhond. DIXIT, 9, 11-13.
Sanders, E.P. & Treurniet, M. (2012). Social media in SoNaR. DIXIT, 9, 8-10.