Transcription bottleneck of speech corpus exploitation


1 Transcription bottleneck of speech corpus exploitation
Caren Brinckmann
Institut für Deutsche Sprache, Mannheim, Germany
Lesser Used Languages and Computer Linguistics (LULCL) II
Nov 13/14, 2008, Bozen

2 Overview
- Introduction
- Written corpora vs. speech corpora
- Speech corpus annotation
- Transcription bottleneck
- Crowdsourcing the orthographic transcription
- Automatic broad phonetic alignment
- Query-driven annotation
- Summary

3 Written vs. speech corpora
- Written corpora can be compiled/accessed more easily
  - web as corpus
  - large available corpora, e.g. DeReKo for German (3.4 billion words)
- Written corpora can be exploited without any annotation, e.g. extraction of higher-order collocations in CCDB
- Limited availability of speech corpora
- Speech corpora need at least a basic transcription

4 Speech corpus annotation
- "Basic" transcription: orthographic transcription
  - languages without standardized orthography?
- Text-to-audio alignment
- Phonetic transcription for phonetic and phonological research
- Prosody, information structure, coreferences, POS, ...

5 Transcription bottleneck
- Reliable orthographic transcription: only feasible for near-native speakers
  - problem: minority languages / dialectal speech
  - crowdsourcing the orthographic transcription
- Phonetic transcription: manual annotation is very time-consuming (1:200) and requires considerable skill
  - automatic broad phonetic alignment
  - query-driven annotation


7 Crowdsourcing: Introduction
- Term coined by Jeff Howe (Wired, June 2006)
- Outsourcing: subcontracting a process, such as product design or manufacturing, to a third-party company
- Crowdsourcing: outsourcing a task traditionally performed by an employee or contractor to an undefined, generally large group of people
- Classical crowdsourcing: self-service restaurants, supermarkets, IKEA, ATMs, ticket machines
- New: use the Internet to publicize and manage crowdsourcing projects
- "Wisdom of crowds": aggregation of information in groups results in decisions that are often better than could have been made by any single member of the group

8 Amazon Mechanical Turk (mturk.com)

9 Distributed Proofreaders (pgdp.net)

10 Recording Teenagers: (LMU Munich)

11 Key guidelines for successful crowdsourcing
1. Be focused: vaguely defined problems get vague answers
2. Get your filters right: use crowd and experts to extract the best answers
3. Tap the right crowds: find the best experts in the mass
4. Build community into social networks
(BusinessWeek, September 25, 2006)

12 Possible application: speech corpus "German Today"
- Recordings in 160+ towns throughout the German-speaking area of Europe (D, A, CH, LUX, I, B, FL)
- 4 high school students (aged 16-20) in every town and 2 older adults (aged 50-60) in 80 towns
- 800+ speakers
- 90 minutes per speaker
- 1200 hrs. of speech
- Material: read speech, interview, map task
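The corpus figures on this slide can be checked with a few lines of arithmetic. The sketch below uses the lower bound of 160 towns. It also assumes that the 1:200 ratio quoted earlier for manual phonetic transcription means about 200 hours of work per hour of speech; the slides do not spell that out.

```python
# Back-of-the-envelope check of the "German Today" figures.
# MANUAL_RATIO is an assumed reading of the "1:200" on the bottleneck slide.
MANUAL_RATIO = 200


def speaker_count(towns=160, students_per_town=4,
                  towns_with_adults=80, adults_per_town=2):
    """Total number of speakers across all recording locations."""
    return towns * students_per_town + towns_with_adults * adults_per_town


def speech_hours(speakers, minutes_per_speaker=90):
    """Total recorded speech in hours."""
    return speakers * minutes_per_speaker / 60


def manual_effort_hours(hours_of_speech, ratio=MANUAL_RATIO):
    """Estimated hours of manual phonetic transcription work."""
    return hours_of_speech * ratio


speakers = speaker_count()
hours = speech_hours(speakers)
print(speakers, "speakers,", hours, "hours of speech")
print(manual_effort_hours(hours), "hours of manual phonetic transcription")
```

Under these assumptions the manual effort comes to 240,000 working hours, which is why the talk looks for alternatives to fully manual annotation.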


14 Map Task
[Figure: two maps, Bruneck and Landeck, each with a marked start ("Start") and destination ("Ziel")]

15 Crowdsourcing the orthographic transcription
Dialectal spontaneous speech (map task data) can be transcribed reliably only by (near-)native speakers of the dialect.
Possible crowdsourcing implementation:
- central database of speech signals, metadata, transcripts, and information about the users/transcribers
- web-based transcription software, e.g. WebTranscribe (as used in ...)
- clearly defined task: transcribe each inter-pause-stretch with standard German orthography
- quality assurance: parallel transcription, evaluation + control tasks (as employed by CastingWords on mturk.com)
- recruit transcribers: contact the schools where the recordings took place and/or the speakers directly
- community: points / virtual titles, rewards (e.g. visit to IDS), games ...
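One way to picture quality assurance by parallel transcription is to compare two independent transcripts of the same inter-pause stretch and send low-agreement stretches to an expert. The sketch below is an illustration only. The similarity measure (Python's difflib), the 0.8 threshold and the example sentences are assumptions, not part of the proposal.

```python
from difflib import SequenceMatcher


def normalize(transcript):
    """Lower-case, split into words and drop surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in transcript.lower().split())
    return [w for w in words if w]


def agreement(first, second):
    """Word-level similarity between two transcripts, between 0 and 1."""
    return SequenceMatcher(None, normalize(first), normalize(second)).ratio()


def needs_review(first, second, threshold=0.8):
    """Flag a stretch for expert review if the transcribers disagree too much."""
    return agreement(first, second) < threshold


# Hypothetical parallel transcripts of one inter-pause stretch.
t1 = "dann gehst du links am Bahnhof vorbei"
t2 = "Dann gehst du links am Bahnhof vorbei."
t3 = "dann gehst du rechts an der Kirche vorbei"

print(agreement(t1, t2), needs_review(t1, t2))
print(round(agreement(t1, t3), 2), needs_review(t1, t3))
```

Normalizing case and punctuation first keeps purely orthographic differences from counting as disagreement; whether that is desirable depends on the transcription guidelines.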


17 Automatic broad phonetic alignment
Input:
- speech signal
- orthographic transcription
- canonic/phonemic transcription of all words in the corpus
  - pronunciation lexicon
  - grapheme-to-phoneme converter
- language-specific phoneme models (e.g. trained HMMs)
Output:
- time-aligned broad phonetic transcription
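The step from orthographic to canonical transcription can be sketched as a lexicon lookup with a grapheme-to-phoneme fallback. Everything in this example is a toy assumption: the two lexicon entries, the SAMPA-like symbols and the handful of letter rules stand in for a real pronunciation lexicon and converter. The alignment against the speech signal with trained HMMs is not shown.

```python
# Toy pronunciation lexicon with SAMPA-like symbols (illustrative entries only).
LEXICON = {
    "guten": ["g", "u:", "t", "@", "n"],
    "tag": ["t", "a:", "k"],
}

# Naive letter rules standing in for a real grapheme-to-phoneme converter.
# Longer graphemes come first so that "sch" wins over "ch".
G2P_RULES = [("sch", "S"), ("ch", "x"), ("ei", "aI")]


def g2p(word):
    """Convert a word to phone symbols with the toy rules above."""
    phones = []
    i = 0
    while i < len(word):
        for grapheme, phone in G2P_RULES:
            if word.startswith(grapheme, i):
                phones.append(phone)
                i += len(grapheme)
                break
        else:
            # No rule matched: fall back to the letter itself.
            phones.append(word[i])
            i += 1
    return phones


def canonical_transcription(utterance):
    """Look each word up in the lexicon and use G2P for unknown words."""
    return [(word, LEXICON.get(word) or g2p(word))
            for word in utterance.lower().split()]


print(canonical_transcription("Guten Tag"))
print(g2p("schein"))
```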

18 Example: orthographic transcription

19 Munich Automatic Segmentation System MAUS

20 Modelling post-lexical phonological processes

21 Obvious errors

22 Evaluation: comparison with manual transcription
Van Bael et al. (2006, 2007) compared 10 aligners for Dutch with a manually obtained reference transcription.
Results:
- Best performance: canonical transcription + modelling of postlexical phonological processes with a decision tree
- Number of remaining disagreements with the reference transcription (14.6% for spontaneous speech, 8.1% for read speech) only slightly higher than human inter-labeller disagreement scores reported in the literature
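A disagreement percentage of this kind can be computed, under one common convention, as the number of substitutions, deletions and insertions needed to turn the automatic phone sequence into the reference, divided by the length of the reference. The sketch below assumes that convention; the exact scoring procedure used in the cited study is not given on the slide, and the phone sequences are made up.

```python
def edit_distance(reference, hypothesis):
    """Minimum number of substitutions, deletions and insertions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_phone in enumerate(reference, 1):
        current = [i]
        for j, hyp_phone in enumerate(hypothesis, 1):
            cost = 0 if ref_phone == hyp_phone else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def disagreement_percent(reference, hypothesis):
    """Disagreements relative to the reference length, in percent."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the aligner drops the schwa of the reference.
reference = ["g", "u:", "t", "@", "n"]
automatic = ["g", "u:", "t", "n"]
print(disagreement_percent(reference, automatic))
```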

23 Task-based evaluation
- access specific portions of the speech signal for further manual annotation?
- duration-based analyses (only large, significant effects can be found)
- analyses in the frequency domain (e.g. formant slope)
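A duration-based analysis on aligner output can start from something as simple as the mean duration per phone label. The sketch below assumes segments given as (label, start, end) tuples in milliseconds, which is a hypothetical format; real aligners write their own file formats that would need to be parsed first.

```python
from collections import defaultdict


def mean_durations(segments):
    """Mean duration per label for (label, start_ms, end_ms) segments."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, start, end in segments:
        if end < start:
            raise ValueError("segment ends before it starts: " + label)
        totals[label] += end - start
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Hypothetical time-aligned segments.
segments = [("a:", 100, 250), ("t", 250, 300), ("a:", 500, 610)]
print(mean_durations(segments))
```

Because automatic segment boundaries are only approximately placed, small duration differences measured this way should be treated with caution, in line with the remark on the slide.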

24 Phonetic aligners for less-resourced languages?
- build your own using HTK
  - but: you need at least one hour of phonetically segmented and labelled speech data
- find an aligner for a language that is phonetically similar to your target language and use its pre-built HMMs, adding
  - a pronunciation lexicon and/or grapheme-to-phoneme rules
  - a mapping between the phonemes of your target language and the HMM-modelled language
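The phoneme mapping mentioned in the last point can be as simple as a lookup table that replaces each target-language phoneme with the closest phoneme for which a pre-built HMM exists. The inventory and the mapping below are invented for illustration and do not describe any particular language pair or aligner.

```python
# Phonemes covered by the borrowed HMM set (hypothetical inventory).
HMM_INVENTORY = {"a", "e", "i", "o", "u", "x", "k", "t", "n", "s"}

# Hypothetical mapping from target-language phonemes to the closest modelled ones.
PHONEME_MAP = {"y": "i", "2": "e", "C": "x"}


def map_pronunciation(phones, mapping=PHONEME_MAP, inventory=HMM_INVENTORY):
    """Replace unmodelled phonemes and fail loudly if any remain uncovered."""
    mapped = [mapping.get(phone, phone) for phone in phones]
    missing = sorted(set(mapped) - inventory)
    if missing:
        raise KeyError("no HMM available for: " + ", ".join(missing))
    return mapped


print(map_pronunciation(["k", "y", "C", "e"]))
```

Raising an error for uncovered phonemes, rather than silently passing them on, makes gaps in the mapping visible before alignment is attempted.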


26 Traditional corpus annotation process
Gut (2008)

27 Problems with sequential corpus creation
- too time-consuming: many years of annotation work before corpus can be exploited and any results can be published
- very error-prone: limited reliability of annotations due to coder drift
- restricted corpus queries: failed/impossible queries
  - re-annotation of corpus

28 Cyclic and iterative corpus annotation ("agile corpus creation")
Gut (2008)

29 Query-driven phonetic annotation of "German Today"


32 Advantages of agile corpus creation
- Query-driven approach tests suitability and consistency of annotation schema
  - very little data has to be re-annotated or discarded
  - design errors, annotation errors and conceptual inadequacies become immediately visible
- successive cycles improve annotation schema and limit it to the elements necessary for the queries
  - saves time
  - early publication of first results
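The query-driven idea can be illustrated with a single cycle: a research query selects tokens, and only the selected tokens that still lack the required annotation layer are annotated. This is a schematic reading of the approach rather than the procedure from the cited work; the token structure, the layer name and the placeholder annotator are assumptions.

```python
def run_cycle(corpus, query, annotate, layer="phonetic"):
    """Annotate only the tokens that the current query selects."""
    hits = [token for token in corpus if query(token)]
    newly_annotated = 0
    for token in hits:
        if layer not in token:
            token[layer] = annotate(token)
            newly_annotated += 1
    return hits, newly_annotated


def ends_in_g(token):
    """Hypothetical query: words ending in the letter g."""
    return token["word"].lower().endswith("g")


def placeholder_annotator(token):
    """Stand-in for the manual annotation step."""
    return "label needed for " + token["word"]


corpus = [{"word": "Tag"}, {"word": "Tage"}, {"word": "gut"}]
hits, added = run_cycle(corpus, ends_in_g, placeholder_annotator)
print(len(hits), "hit(s),", added, "newly annotated")
```

Running the same query again adds nothing new, so annotation effort grows only with the questions actually asked of the corpus.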

33 Combining automatic and query-driven annotation

34 Summary
- speech corpora need at least a basic (orthographic) transcription to be exploitable
  - difficult to produce for languages/dialects with only few native speakers
  - use crowdsourcing
- phonological research further requires phonemic/phonetic segmentation and labelling
  - very time-consuming
  - combine automatic broad phonetic alignment with query-driven annotation

35 References
Brinckmann, C., Kleiner, S., Knöbl, R., and Berend, N. (2008): German Today: an areally extensive corpus of spoken Standard German. Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco.
Draxler, C. (2005): WebTranscribe - an extensible web-based speech annotation framework. Proceedings of the 8th International Conference on Text, Speech and Dialogue (TSD 2005), Karlovy Vary, Czech Republic.
Keibel, H. and Belica, C. (2007): CCDB: a corpus-linguistic research and development workbench. Proceedings of Corpus Linguistics 2007, Birmingham, United Kingdom.
Raffelsiefen, R. and Brinckmann, C. (2007): Evaluating phonological status: significance of paradigm uniformity vs. prosodic grouping effects. Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS XVI), Saarbrücken, Germany.
Schiel, F. (2004): MAUS Goes Iterative. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
Van Bael, C., Boves, L., van den Heuvel, H. and Strik, H. (2006): Automatic phonetic transcription of large speech corpora. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy.
Van Bael, C., Boves, L., van den Heuvel, H. and Strik, H. (2007): Automatic phonetic transcription of large speech corpora. Computer Speech and Language 21 (4).
Voormann, H. and Gut, U. (2008): Agile corpus creation. Corpus Linguistics and Linguistic Theory 4 (2).

36 Thank you!


More information

Text-to-pinyin conversion based on contextual knowledge and D-tree for Mandarin

Text-to-pinyin conversion based on contextual knowledge and D-tree for Mandarin Text-to-pinyin conversion based on contextual knowledge and D-tree for Mandarin Sen Zhang, Yves Laprie To cite this version: Sen Zhang, Yves Laprie. Text-to-pinyin conversion based on contextual knowledge

More information

BASELINE WSJ ACOUSTIC MODELS FOR HTK AND SPHINX: TRAINING RECIPES AND RECOGNITION EXPERIMENTS. Keith Vertanen

BASELINE WSJ ACOUSTIC MODELS FOR HTK AND SPHINX: TRAINING RECIPES AND RECOGNITION EXPERIMENTS. Keith Vertanen BASELINE WSJ ACOUSTIC MODELS FOR HTK AND SPHINX: TRAINING RECIPES AND RECOGNITION EXPERIMENTS Keith Vertanen University of Cambridge, Cavendish Laboratory Madingley Road, Cambridge, CB3 0HE, UK kv7@cam.ac.uk

More information

D2.4: Two trained semantic decoders for the Appointment Scheduling task

D2.4: Two trained semantic decoders for the Appointment Scheduling task D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive

More information

Research Portfolio. Beáta B. Megyesi January 8, 2007

Research Portfolio. Beáta B. Megyesi January 8, 2007 Research Portfolio Beáta B. Megyesi January 8, 2007 Research Activities Research activities focus on mainly four areas: Natural language processing During the last ten years, since I started my academic

More information

Towards a Typology of English Accents

Towards a Typology of English Accents Towards a Typology of English Accents The Speech Accent Archive and STAT Steven H. Weinberger George Mason University Stephen Kunath Georgetown University http://accent.gmu.edu Outline Archive architecture

More information

Speaker Adaptation from Limited Training in the BBN BYBLOS Speech Recognition System

Speaker Adaptation from Limited Training in the BBN BYBLOS Speech Recognition System Speaker Adaptation from Limited Training in the BBN BYBLOS Speech Recognition System Francis Kubala Ming-Whel Feng, John Makhoul, Richard Schwartz BBN Systems and Technologies Corporation 10 Moulton St.,

More information

Things to remember when transcribing speech

Things to remember when transcribing speech Notes and discussion Things to remember when transcribing speech David Crystal University of Reading Until the day comes when this journal is available in an audio or video format, we shall have to rely

More information

http://liceu.uab.cat/~joaquim/publicacions/ Dybkjaer_et_al_01_annotation_multimodality.pdf

http://liceu.uab.cat/~joaquim/publicacions/ Dybkjaer_et_al_01_annotation_multimodality.pdf Dybkjaer, L., Berman, S., Bernsen, N. O., Carletta, J., Heid, U., & Llisterri, J. (2001). Requirements and specifications for a tool in support of annotation of natural interaction and multimodal data.

More information

Experiments with Signal-Driven Symbolic Prosody for Statistical Parametric Speech Synthesis

Experiments with Signal-Driven Symbolic Prosody for Statistical Parametric Speech Synthesis Experiments with Signal-Driven Symbolic Prosody for Statistical Parametric Speech Synthesis Fabio Tesser, Giacomo Sommavilla, Giulio Paci, Piero Cosi Institute of Cognitive Sciences and Technologies, National

More information

SYNTHESISED SPEECH WITH UNIT SELECTION

SYNTHESISED SPEECH WITH UNIT SELECTION Institute of Phonetic Sciences, University of Amsterdam, Proceedings 24 (2001), 57-63. SYNTHESISED SPEECH WITH UNIT SELECTION Creating a restricted domain speech corpus for Dutch Betina Simonsen, Esther

More information

PERCEPTION OF VOWEL QUANTITY BY ENGLISH LEARNERS

PERCEPTION OF VOWEL QUANTITY BY ENGLISH LEARNERS PERCEPTION OF VOWEL QUANTITY BY ENGLISH LEARNERS OF CZECH AND NATIVE LISTENERS Kateřina Chládková Václav Jonáš Podlipský Karel Janíček Eva Boudová 1 INTRODUCTION Vowels in both English and Czech are realized

More information

Speaker Recruitment for Speech Databases

Speaker Recruitment for Speech Databases Speaker Recruitment for Speech Databases Eric Sanders, Henk van den Heuvel SPEX P.O. Box 9103, 6500 HD Nijmegen, the Netherlands eric@spex.nl Abstract In this paper, the aspects of speaker recruitment,

More information

Eliminating Complexity to Ensure Fastest Time to Big Data Value

Eliminating Complexity to Ensure Fastest Time to Big Data Value Eliminating Complexity to Ensure Fastest Time to Big Data Value Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest

More information

Multi-Dialectical Languages Effect on Speech Recognition

Multi-Dialectical Languages Effect on Speech Recognition Multi-Dialectical Languages Effect on Speech Recognition Too Much Choice Can Hurt Mohamed G. Elfeky Pedro Moreno Google Inc. New York, NY, USA {mgelfeky, pedro}@google.com Victor Soto Columbia University

More information

Thirukkural - A Text-to-Speech Synthesis System

Thirukkural - A Text-to-Speech Synthesis System Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,

More information

Sign Language Recognition, Generation and Modeling with Application in Deaf Communication. Project Presentation

Sign Language Recognition, Generation and Modeling with Application in Deaf Communication. Project Presentation Sign Language Recognition, Generation and Modeling with Application in Deaf Communication UniS, Guildford UEA, Norwich UHH, Hamburg LIMSI, Orsay UPS, Toulouse WebSourd, Toulouse NTUA, Athens ILSP, Athens

More information

Hands-on tutorial: Using Praat for analysing a speech corpus. Mietta Lennes Palmse, Estonia

Hands-on tutorial: Using Praat for analysing a speech corpus. Mietta Lennes Palmse, Estonia Hands-on tutorial: Using Praat for analysing a speech corpus Mietta Lennes 12.-13.8.2005 Palmse, Estonia Department of Speech Sciences University of Helsinki Objectives Lecture: Understanding what speech

More information

Annotation in Language Documentation

Annotation in Language Documentation Annotation in Language Documentation Univ. Hamburg Workshop Annotation SEBASTIAN DRUDE 2015-10-29 Topics 1. Language Documentation 2. Data and Annotation (theory) 3. Types and interdependencies of Annotations

More information

209 THE STRUCTURE AND USE OF ENGLISH.

209 THE STRUCTURE AND USE OF ENGLISH. 209 THE STRUCTURE AND USE OF ENGLISH. (3) A general survey of the history, structure, and use of the English language. Topics investigated include: the history of the English language; elements of the

More information

IMPROVING TTS BY HIGHER AGREEMENT BETWEEN PREDICTED VERSUS OBSERVED PRONUNCIATIONS

IMPROVING TTS BY HIGHER AGREEMENT BETWEEN PREDICTED VERSUS OBSERVED PRONUNCIATIONS IMPROVING TTS BY HIGHER AGREEMENT BETWEEN PREDICTED VERSUS OBSERVED PRONUNCIATIONS Yeon-Jun Kim, Ann Syrdal AT&T Labs-Research, 180 Park Ave. Florham Park, NJ 07932 Matthias Jilka Institut für Linguistik,

More information

Corpus building and investigation for the Humanities:

Corpus building and investigation for the Humanities: Corpus building and investigation for the Humanities: An on-line information pack about corpus investigation techniques for the Humanities Unit 2: Compiling a corpus David Evans, University of Nottingham

More information

Language. Language. Communication. The use of an organized means of combining i words in order to communicate

Language. Language. Communication. The use of an organized means of combining i words in order to communicate LANGUAGE Language Language The use of an organized means of combining i words in order to communicate Makes it possible for us to communicate with those around us and to think about things and processes

More information

LING 520 Introduction to Phonetics I Fall Week 1. Introduction Anatomy of speech production Consonants and vowels Phonetic transcription

LING 520 Introduction to Phonetics I Fall Week 1. Introduction Anatomy of speech production Consonants and vowels Phonetic transcription LING 520 Introduction to Phonetics I Fall 2008 Week 1 Introduction Anatomy of speech production Consonants and vowels Phonetic transcription Sep. 8, 2008 What is phonetics? 2 Phonetics is the study of

More information

Sprinter: Language Technologies for Interactive and Multimedia Language Learning

Sprinter: Language Technologies for Interactive and Multimedia Language Learning Sprinter: Language Technologies for Interactive and Multimedia Language Learning Renlong Ai, Marcela Charfuelan, Walter Kasper, Tina Klüwer, Hans Uszkoreit, Feiyu Xu, Sandra Gasber, Philip Gienandt German

More information

The Power of Pentaho and Hadoop in Action. Demonstrating MapReduce Performance at Scale

The Power of Pentaho and Hadoop in Action. Demonstrating MapReduce Performance at Scale The Power of Pentaho and Hadoop in Action Demonstrating MapReduce Performance at Scale Introduction Over the last few years, Big Data has gone from a tech buzzword to a value generator for many organizations.

More information

Understanding Impaired Speech. Kobi Calev, Morris Alper January 2016 Voiceitt

Understanding Impaired Speech. Kobi Calev, Morris Alper January 2016 Voiceitt Understanding Impaired Speech Kobi Calev, Morris Alper January 2016 Voiceitt Our Problem Domain We deal with phonological disorders They may be either - resonance or phonation - physiological or neural

More information

Language Resources and Evaluation for the Support of

Language Resources and Evaluation for the Support of Language Resources and Evaluation for the Support of the Greek Language in the MARY TtS Pepi Stavropoulou 1,2, Dimitrios Tsonos 1, and Georgios Kouroupetroglou 1 1 National and Kapodistrian University

More information

Modular Text-to-Speech Synthesis Evaluation for Mandarin Chinese

Modular Text-to-Speech Synthesis Evaluation for Mandarin Chinese Modular Text-to-Speech Synthesis Evaluation for Mandarin Chinese Jilei Tian, Jani Nurminen, and Imre Kiss Multimedia Technologies Laboratory, Nokia Research Center P.O. Box 100, FIN-33721 Tampere, Finland

More information

Semi-automatically Alignment of Predicates between Speech and OntoNotes Data

Semi-automatically Alignment of Predicates between Speech and OntoNotes Data Semi-automatically Alignment of Predicates between Speech and OntoNotes Data Niraj Shrestha, Marie-Francine Moens Department of Computer Science, KU Leuven, Belgium {niraj.shrestha, Marie-Francine.Moens}@cs.kuleuven.be

More information

COURSE SYLLABUS ESU 561 ASPECTS OF THE ENGLISH LANGUAGE. Fall 2014

COURSE SYLLABUS ESU 561 ASPECTS OF THE ENGLISH LANGUAGE. Fall 2014 COURSE SYLLABUS ESU 561 ASPECTS OF THE ENGLISH LANGUAGE Fall 2014 EDU 561 (85515) Instructor: Bart Weyand Classroom: Online TEL: (207) 985-7140 E-Mail: weyand@maine.edu COURSE DESCRIPTION: This is a practical

More information

Marathi Speech Database

Marathi Speech Database Marathi Speech Database Samudravijaya K Tata Institute of Fundamental Research, 1, Homi Bhabha Road, Mumbai 400005 India chief@tifr.res.in Mandar R Gogate LBHSST College Bandra (E) Mumbai 400051 India

More information

INDEX. List of Figures...XII List of Tables...XV 1. INTRODUCTION TO RECOGNITION OF FOR TEXT TO SPEECH CONVERSION

INDEX. List of Figures...XII List of Tables...XV 1. INTRODUCTION TO RECOGNITION OF FOR TEXT TO SPEECH CONVERSION INDEX Page No. List of Figures...XII List of Tables...XV 1. INTRODUCTION TO RECOGNITION OF FOR TEXT TO SPEECH CONVERSION 1.1 Introduction...1 1.2 Statement of the problem...2 1.3 Objective of the study...2

More information

Data at the SFB "Mehrsprachigkeit"

Data at the SFB Mehrsprachigkeit 1 Workshop on multilingual data, 08 July 2003 MULTILINGUAL DATABASE: Obstacles and Opportunities Thomas Schmidt, Project Zb Data at the SFB "Mehrsprachigkeit" K1: Japanese and German expert discourse in

More information

Cue-based analysis of speech: Implications for prosodic transcription

Cue-based analysis of speech: Implications for prosodic transcription Cue-based analysis of speech: Implications for prosodic transcription Stefanie Shattuck-Hufnagel Speech Communication Group Research Laboratory of Electronics MIT A stark view: Some unanswered questions

More information

Speech Transcription

Speech Transcription TC-STAR Final Review Meeting Luxembourg, 29 May 2007 Speech Transcription Jean-Luc Gauvain LIMSI TC-STAR Final Review Luxembourg, 29-31 May 2007 1 What Is Speech Recognition? Def: Automatic conversion

More information