Tues/Fri, Nov min/person in my office Be prepared to give an update on progress on your (part of the) project
- Clementine Riley
- 7 years ago
Transcription
1 Project updates: Tues/Fri, Nov min/person in my office. Be prepared to give an update on progress on your (part of the) project: brief characterization of project (task, language, code, modules); what you've done so far; what's left to be done; evaluation; roadblocks? Questions?
2 LANGUAGE RECOGNITION (SPOKEN LANGUAGE IDENTIFICATION)
3
4 Problem What language is being spoken?
5 Problem What language is being spoken? 1. Tamil 2. Spanish 3. Mandarin 4. Korean 5. Japanese 6. Hindi
6 Applications: Skip the "Para español, oprime 2" ("For Spanish, press 2") step; call centers (e.g., 911); signals intelligence; first step in a multilingual voice UI or translator
7 Baseline: Always guess the most common language.
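The baseline on this slide can be sketched in a few lines. This is a minimal illustration, not the presenters' code; the function names and the label counts are hypothetical.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a classifier that always predicts the most frequent training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda _utterance: most_common_label

def accuracy(predict, examples):
    """Fraction of (utterance, label) pairs that predict() gets right."""
    correct = sum(1 for utterance, label in examples if predict(utterance) == label)
    return correct / len(examples)

# Hypothetical label counts, for illustration only
train = ["Spanish"] * 5 + ["Tamil"] * 3 + ["Korean"] * 2
test = [("utt1", "Spanish"), ("utt2", "Tamil"), ("utt3", "Spanish"), ("utt4", "Korean")]
predict = majority_baseline(train)
print(predict("any audio"), accuracy(predict, test))
```

Any real system should beat this number on the same test split; otherwise its features add nothing over the class prior.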
8 Two Main Solutions: (1) Acoustic analysis only: train classifiers based on spectral information. (2) Linguistic information: phonotactics (most successful), broad-class phonotactics, phone duration, silence, filled pauses, prosody (difficult and less effective)
9 Most Successful Approach
10 Most Successful System
11
12 BYU's Solution (diagram): Phone Call, Sphinx 4, Time slices, Praat, Feature Definitions, Feature Transformer, Maximum Entropy Classifier, Language
13 10,000-foot View: Phone Call -> Sphinx 4 -> Praat -> Feature Transformer -> Maximum Entropy Classifier -> Language
14 Sphinx-4: Phoneme Recognizer ah <s> b... </s> z
15 Sphinx-4: Phonetic Class Models. Phonetic classes are language-independent sets of related sounds, based on manner of articulation; e.g., VOC consists of vowels and FRIC consists of fricatives. Create a language model based on the classes to constrain the recognizer: <s> CLOS VOC CLOS </s> <s> CLOS FRIC CLOS CLOS FRIC PRVS CLOS FRIC VOC
16 Advantages: Simplicity (only 1 acoustic model; only 1 Maximum Entropy model per language); rich feature set; flexibility (no phonetically-labeled data is needed, though we use it where possible)
17 Speech Recognizer: Three components to our speech recognizer: acoustic model (1), phonotactic language model (N), pronunciation dictionary
18 Phonetic Class Language Models: Group similar phones together, based on manner of articulation; e.g., VOC consists of vowels and FRIC consists of fricatives. Probability that n-phone classes occur in order: <s> CLOS VOC CLOS </s> <s> CLOS FRIC CLOS CLOS FRIC PRVS CLOS FRIC VOC
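One way to picture a phonotactic language model over broad phone classes is a smoothed bigram model. The sketch below is an assumption-laden illustration: the slides do not say which n-gram order or smoothing was used, and the training sequences are toy examples built from the class labels shown above.

```python
import math
from collections import Counter

def train_bigram_model(sequences, vocab):
    """Estimate add-one-smoothed bigram probabilities over phone-class symbols."""
    bigram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + seq + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    size = len(vocab) + 1  # possible next symbols: every class plus </s>

    def prob(prev, cur):
        return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + size)

    return prob

def log_prob(prob, seq):
    """Log-probability of one class sequence, including start and end markers."""
    padded = ["<s>"] + seq + ["</s>"]
    return sum(math.log(prob(p, c)) for p, c in zip(padded, padded[1:]))

# Class labels from the slide; the training sequences are hypothetical
CLASSES = ["CLOS", "VOC", "FRIC", "PRVS"]
training = [["CLOS", "VOC", "CLOS"], ["CLOS", "FRIC", "CLOS"], ["CLOS", "FRIC", "VOC"]]
prob = train_bigram_model(training, CLASSES)
print(log_prob(prob, ["CLOS", "VOC", "CLOS"]), log_prob(prob, ["PRVS", "PRVS", "PRVS"]))
```

A sequence resembling the training data scores higher than an unfamiliar one, which is the property a per-language phonotactic model relies on.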
19 Speech Recognizer (diagram): Audio File; Acoustic Model with English LM -> English-like phonemes; Acoustic Model with Mandarin LM -> Mandarin-like phonemes; ...; Acoustic Model with Tamil LM -> Tamil-like phonemes
20 Praat
21 Recognizer and Praat Output
<seglolafile length="3" Xlanguage="sp">
  <filefeatures ingcount="6" f1average="2000"/>
  <timeslices>
    <slice starttime="0ms" duration="0.111ms">
      <seglola> CLOS </seglola>
      <avgf1> </avgf1>
      <avgf2> </avgf2>
      <avgf3> </avgf3>
      <avgf4> </avgf4>
      <avgf5> </avgf5>
      <f0beg> 2.00 </f0beg>
      <f0end> </f0end>
    </slice>
  </timeslices>
</seglolafile>
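A downstream component would need to read this XML back in. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the helper names `to_float` and `read_slices` are hypothetical, and the sample string is a shortened copy of the slide's example with blank fields left blank.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<seglolafile length="3" Xlanguage="sp">
  <filefeatures ingcount="6" f1average="2000"/>
  <timeslices>
    <slice starttime="0ms" duration="0.111ms">
      <seglola> CLOS </seglola>
      <avgf1> </avgf1>
      <f0beg> 2.00 </f0beg>
      <f0end> </f0end>
    </slice>
  </timeslices>
</seglolafile>"""

def to_float(text):
    """Convert element text to float, or None when the field is blank or missing."""
    text = (text or "").strip()
    return float(text) if text else None

def read_slices(xml_text):
    """Return the language tag and a list of per-slice dictionaries."""
    root = ET.fromstring(xml_text)
    language = root.get("Xlanguage")
    slices = []
    for node in root.iter("slice"):
        slices.append({
            "phone_class": node.findtext("seglola", default="").strip(),
            "f0_begin": to_float(node.findtext("f0beg")),
            "f0_end": to_float(node.findtext("f0end")),
        })
    return language, slices

language, slices = read_slices(SAMPLE)
print(language, slices)
```

Blank measurements are mapped to `None` rather than zero so that later feature code can tell "not measured" apart from a real value.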
22 Feature Definition File: Linguists identify relevant acoustic-phonetic features; no need to estimate relative impact. Examples: statistical phonotactics (n-grams), average phoneme duration, pitch contour, rising or falling tone
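The n-gram phonotactic features mentioned here can be illustrated with a small counting routine. This is only a sketch: the feature naming scheme (labels joined with underscores) and the function name are assumptions, not the format of the actual feature definition file.

```python
from collections import Counter

def ngram_features(classes, max_order=5):
    """Count every n-gram of phone-class labels for n = 1..max_order."""
    features = Counter()
    for n in range(1, max_order + 1):
        for start in range(len(classes) - n + 1):
            features["_".join(classes[start:start + n])] += 1
    return features

# Hypothetical class sequence, for illustration only
feats = ngram_features(["CLOS", "VOC", "CLOS", "FRIC"], max_order=2)
print(dict(feats))
```

Raising `max_order` adds higher-order features on top of the lower-order ones, so a "1/2/3-gram" feature set contains everything in the "1/2-gram" set.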
23 Maximum Entropy Classifier: Binary decision; makes no assumptions beyond what is observed in the data; features provide constraints (Berger et al. 1996)
24 Evaluation: Training set: OGI-TS corpus; hand-segmented LOLA-format phone class labels (no recognizer); 6 languages, 338 files, 4.6 secs average length. Features: unigram, bigram, trigram, 4-gram, and 5-gram features (broad phone class). 80/20 train/test split
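A binary maximum entropy model is equivalent to logistic regression, so the decision "is this the target language or not?" can be sketched as below. This is an illustrative stand-in trained by plain stochastic gradient ascent, not the toolkit or training procedure the presenters used; the feature names, data, and hyperparameters are hypothetical.

```python
import math

def predict_prob(weights, features):
    """P(target language | features) under a binary maximum entropy model."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def train_maxent(examples, epochs=200, rate=0.5):
    """Fit feature weights by stochastic gradient ascent on the log-likelihood.

    examples: list of (feature_dict, label), label 1 for the target language, else 0.
    """
    weights = {}
    for _ in range(epochs):
        for features, label in examples:
            p = predict_prob(weights, features)
            for name, value in features.items():
                # Gradient term: (observed label - expected label) * feature value
                weights[name] = weights.get(name, 0.0) + rate * (label - p) * value
    return weights

# Hypothetical n-gram count features, for illustration only
examples = [
    ({"CLOS_VOC": 1, "bias": 1}, 1),
    ({"CLOS_VOC": 2, "bias": 1}, 1),
    ({"FRIC_FRIC": 1, "bias": 1}, 0),
    ({"FRIC_FRIC": 2, "bias": 1}, 0),
]
weights = train_maxent(examples)
print(predict_prob(weights, {"CLOS_VOC": 1, "bias": 1}))
print(predict_prob(weights, {"FRIC_FRIC": 1, "bias": 1}))
```

With one such model per language, the system can compare the per-language probabilities for an utterance and report the highest-scoring language.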
25 Evaluation Metric: NIST 2005 LRE defined detection cost, a weighted average of false negatives and false positives; perfect system, cost = 0. $C_{\mathrm{Detection}}(i) = \frac{1}{2} P(\mathrm{Miss}(i) \mid \mathrm{Target}(i)) + \frac{1}{2(N-1)} \sum_{j \neq i} P(\mathrm{FalsePositive}(i) \mid \mathrm{NonTarget}(j))$
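The detection cost for one target language can be computed as follows. The sketch assumes the equal-weight form of the cost, that is, half the miss rate plus the false-positive rates summed over the N - 1 non-target languages and divided by 2(N - 1); the function name and the example rates are hypothetical.

```python
def detection_cost(p_miss, p_false_alarm_by_nontarget):
    """Detection cost for one target language (assumed equal-weight form).

    p_miss: P(Miss | Target) for the target language.
    p_false_alarm_by_nontarget: dict mapping each non-target language to
        P(FalsePositive | NonTarget); its size is N - 1.
    """
    n_minus_1 = len(p_false_alarm_by_nontarget)
    fa_term = sum(p_false_alarm_by_nontarget.values()) / (2 * n_minus_1)
    return 0.5 * p_miss + fa_term

# Hypothetical error rates, for illustration only
cost = detection_cost(0.2, {"English": 0.1, "Mandarin": 0.3})
print(cost)
```

A system with no misses and no false positives gets a cost of 0, and one that is always wrong gets 1, which matches the "perfect system, cost = 0" note on the slide.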
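26 Results (table; values not recovered): columns Hits, Misses, False Alarms, Cost; rows Spanish, English, Mandarin, Japanese, Tamil, Korean
27 Results: Average Cost (chart; values not recovered): y-axis NIST Cost Score; x-axis Feature Set (1-gram, 1/2-gram, 1/2/3-gram, 1/2/3/4-gram, 1/2/3/4/5-gram)
28 Results: Improvement with Increasing n-gram Order (chart; values not recovered): y-axis Count; x-axis Max. n-gram Order; series Average Misses, Average Hits, Average Falses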
29 Future Work: experiments from speech (rather than from true transcripts); optimize Sphinx's parameters using Powell's algorithm; train a new acoustic model; train better language models; define better (and more) linguistic features; rewrite the feature transformer using a fixed definition of the feature definition language; more experiments on much more data; participation in NIST evaluation