Investigating tongue movement during speech with ultrasound



Similar documents
Tamás Gábor CSAPÓ Curriculum Vitae

Mathematical modeling of speech acoustics D. Sc. Daniel Aalto

DEVELOPMENT OF A 3D TONGUE MOTION VISUALIZATION PLATFORM BASED ON ULTRASOUND IMAGE SEQUENCES

Graphic Design. Background: The part of an artwork that appears to be farthest from the viewer, or in the distance of the scene.

Mobile Multimedia Application for Deaf Users

62 Hearing Impaired MI-SG-FLD062-02

Mouse Control using a Web Camera based on Colour Detection

An Energy-Based Vehicle Tracking System using Principal Component Analysis and Unsupervised ART Network

HLT in Hungary

COMMUNICATION SCIENCES AND DISORDERS COURSE OFFERINGS Version:

László Hunyadi (Résumé)

Spatiotemporal Visualization of the Tongue Surface

Effects of Pronunciation Practice System Based on Personalized CG Animations of Mouth Movement Model

How To Write A Book On Honduran Studies

MICHIGAN TEST FOR TEACHER CERTIFICATION (MTTC) TEST OBJECTIVES FIELD 062: HEARING IMPAIRED

Text-To-Speech Technologies for Mobile Telephony Services

Program curriculum for graduate studies in Speech and Music Communication

RESEARCH ON SPOKEN LANGUAGE PROCESSING Progress Report No. 29 (2008) Indiana University

History of Aural Rehab

Face Locating and Tracking for Human{Computer Interaction. Carnegie Mellon University. Pittsburgh, PA 15213

Knowledge and Skills Acquisition (KASA) Summary Form* Harding University Communication Sciences and Disorders Program

Intelligent Monitoring Software

VisArtico. Articulatory Data Visualization Tool

Automated Pavement Distress Survey: A Review and A New Direction

Introducing your Intelligent Monitoring Software. Designed for security.

Talking machines?! Present and future of speech technology in Hungary

Introduction to Computer Graphics

DSP Laboratory: Analog to Digital and Digital to Analog Conversion

MR Urography Plug-in for ImageJ User s Guide

PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS

Virtual Mouse Using a Webcam

Introduction to Audiology and Speech- Language Pathology ASLP 133 Fall Nancy Blair, M.S. 161 TLRB Provo, UT

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA

Thirukkural - A Text-to-Speech Synthesis System

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers


Articulatory Phonetics. and the International Phonetic Alphabet. Readings and Other Materials. Review. IPA: The Vowels. Practice

M.A., Linguistics, 2007 Thesis: Measurement of vowel nasalization by multi-dimensional acoustic analysis University of Rochester, Rochester, NY, USA

Parametric Comparison of H.264 with Existing Video Standards

HANDS-FREE PC CONTROL CONTROLLING OF MOUSE CURSOR USING EYE MOVEMENT

Sony's "Beyond 4K" solutions bring museum visual exhibits into the next generation

VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS

High speed 3D capture for Configuration Management DOE SBIR Phase II Paul Banks

Ultrasound Tissue Characterization an innovative method to visualize and monitor Patellar Tendinopathy

Janet L. Hawley, MS, CCC-SLP

High-accuracy ultrasound target localization for hand-eye calibration between optical tracking systems and three-dimensional ultrasound

Philips CX50 CompactXtreme System. A New Class of Compact Echocardiography

Utilizing spatial information systems for non-spatial-data analysis

Lecture 1-10: Spectrograms

PHOTOGRAMMETRIC TECHNIQUES FOR MEASUREMENTS IN WOODWORKING INDUSTRY

Deception Detection Techniques for Rapid Screening

Interactive Projector Screen with Hand Detection Using LED Lights

Bachelors of Science Program in Communication Disorders and Sciences:

A Method for Controlling Mouse Movement using a Real- Time Camera

Curriculum Vitae of Kanae Nishi

A TOOL FOR TEACHING LINEAR PREDICTIVE CODING

VSSN 06 Algorithm Competition

Ashwini Joshi, Ph.D., CCC-SLP

Video Editing Tools. digital video equipment and editing software, more and more people are able to be

LABORATORY ACTIVITIES FOR LARGE AND ONLINE PHONETICS CLASSES

Advanced Presentation Features and Animation

Speech-Language Pathology Curriculum Foundation Course Linkages

Jeff M. Smith

Portions have been extracted from this report to protect the identity of the student. RIT/NTID AURAL REHABILITATION REPORT Academic Year

engin erzin the use of speech processing applications is expected to surge in multimedia-rich scenarios

Stand Alone Type. Digital Video Recorder USER S MANUAL. (Real time recording 8 & 16 CH DVR) Revision Date :

T-REDSPEED White paper

Applying to the Graduate Program

Master of Science in Speech and Hearing Science Distance Education Program. Frequently Asked Questions

Fast Z-stacking 3D Microscopy Extended Depth of Field Autofocus Z Depth Measurement 3D Surface Analysis

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System

Introduction. System requirements

Speech Signal Processing: An Overview

APPLYING VIRTUAL SOUND BARRIER AT A ROOM OPENING FOR TRANSFORMER NOISE CONTROL

THE UNIVERSITY OF EDINBURGH George Square Library

Chapter 3 Input Devices

Web:

A Segmentation Algorithm for Zebra Finch Song at the Note Level. Ping Du and Todd W. Troyer

Proceedings of Meetings on Acoustics

Building 3D anatomical scenes on the Web

Career info session Nov. 17th, / 24

Copyright 2006 TechSmith Corporation. All Rights Reserved.

Speech Production 2. Paper 9: Foundations of Speech Communication Lent Term: Week 4. Katharine Barden

Maria V. Dixon, M.A., CCC-SLP 402 Ridge Rd. #8 // Greenbelt, MD (301)

M.A. Hispanic Linguistics. University of Arizona: Tucson, AZ.

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction

COUNCIL OF ACADEMIC PROGRAMS IN COMMUNICATION SCIENCES AND DISORDERS NATIONAL SURVEY OF UNDERGRADUATE AND GRADUATE PROGRAMS TABLE OF CONTENTS

Pre-Requisite Information & Course Equivalency Forms Page 1 of 5

CURRICULUM VITAE. Toby Macrae, Ph.D., CCC-SLP

Experiments with a Camera-Based Human-Computer Interface System

Internet and Computing Core Certification Guide Module A Computing Fundamentals

Face Identification by Human and by Computer: Two Sides of the Same Coin, or Not? Tsuhan Chen

Address: 2104 Londondery Dr. Date of Birth: May 16, 1958 Manhattan, KS 66503

Analysis of free reed attack transients

QAV-PET: A Free Software for Quantitative Analysis and Visualization of PET Images

Speech-Language Pathology and Audiology. Group Orientation

Digital Image Processing with DragonflyPACS

STOP MOTION. Recommendations:

Low-resolution Character Recognition by Video-based Super-resolution

Transcription:

Investigating tongue movement during speech with ultrasound Tamás Gábor CSAPÓ csapot@tmit.bme.hu Meet the Hungarian Fulbrighters of AY 2013-2014 April 28, 2015

Introduction Methods & results Personal Summary 1 Introduction Indiana University Bloomington Speech research with ultrasound Goals of my scholarship 2 Research methods and results Ultrasound recordings Manual tongue contour tracing Automatic tongue contour tracking 3 Personal experiences Playgrounds and pictures 4 Summary and future plans 2 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Indiana University, Bloomington I Bloomington, IN 3 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Indiana University, Bloomington II 4 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Indiana University, Bloomington III 5 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Indiana University, Bloomington IV Indiana University Jacobs School of Music Kelley School of Business Dept. of Speech and Hearing Sciences you can even learn Hungarian! Speech Production Laboratory professor: Dr. Steven M. Lulich brand new equipment for speech research 6 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Speech research with ultrasound I Ultrasound (US) used in speech research since early 80s US transducer positioned below the chin during speech record video of tongue movement series of gray-scale images tongue surface has a greater brightness than the surrounding tissue and air [Stone et al., 1983, Stone, 2005] 7 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Speech research with ultrasound II 8 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Speech research with ultrasound III Vocal tract Ultrasound sample [Németh and Olaszy, 2010] ( click) 9 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals Speech research with ultrasound IV Phonetic research examples reconstruct tongue shape during sustained vowels investigate speech sounds of under-researched languages compare articulatory characteristics of vowels analyze tongue shapes for clinical purposes First step is always the tongue contour tracking! [Stone and Lundberg, 1996, Mielke et al., 2011, Benus and Gafos, 2007, Zharkova, 2013] 10 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary IU Ultrasound Goals My goals This scholarship compare manual tongue tracings of several individuals compare automatic tongue contour extraction programs use 2D ultrasound at high frame rate Long-term extend text-to-speech with tongue contour data based on ultrasound use real-time 3D ultrasound 11 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Methods Subjects two female and two male 3 speakers of American English 1 speaker of Hungarian Speech material I owe you a yo-yo. sentence two times 135 various English sentences 210 various Hungarian sentences 12 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Recordings 1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 0 100 200 300 400 500 600 700 800 900 1000 Location Speech Production Lab, IU Parallel recordings speech signal with a microphone video of the lips with a webcamera video of the tongue with an ultrasound device (Philips EpiQ-7G, xmatrix 6-1 MHz) 13 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Recording setup 14 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Manual tracings Ultrasound recordings JPG image sequence 800x600 pixels resolution Tracers 7 individuals (2 professors and 5 students) drag a computer mouse cursor from the root of the tongue (left) to the tip of the tongue (right) about 150 200 points per image about 5 10 seconds per image 15 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Manual tracing website 16 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Automatic tongue contour tracking algorithms 5 freely available programs, baseline settings AutoTrace (University of Arizona, USA) EdgeTrak (University of Maryland, USA) Palatoglossotron (North Carolina State University, USA) TongueTrack (Simon Fraser University, Canada) Ultra-CATS (University of Toronto, Canada) [Sung et al., 2013, Li et al., 2005, Baker et al., 2005, Tang et al., 2012, Bressmann et al., 2005] 17 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Comparison of two tongue contours 0 frame #0046 (40 fps) 100 200 manual automatic 300 400 500 600 0 100 200 300 400 500 600 700 800 18 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking Automatic trackings RMSE (Root Mean Squared Error) difference from mean of manual tracing Average for the best algorithm, AutoTrace: 9.66 pixel (1.93 mm) depending on the speaker, algorithm and image 100 (compare with: 7.11 pixel inter-tracer variability) US video samples speaker1 ( click) speaker4 ( click) 200 300 400 9.66px 30px 19 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Recordings Manual tracing Automatic tracking What can this be used for? investigate articulation during speech visual reconstruction of 3D tongue surface audiovisual speech synthesis language education: how to produce unfamiliar speech sounds? speech rehabilitation: learn to speak after a tongue surgery [Stone et al., 2005] 20 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures Playgrounds I 21 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures Biking in Bloomington I 22 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures 4th of July with the Lulich family I 23 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures Roundtrip I 24 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures Roundtrip II 25 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Playgrounds and pictures Roundtrip III 26 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Summary I This study ultrasound recordings with several speakers compared manual tongue tracings compared automatic tongue contour extraction programs Future plans extend Hungarian / English Text-To-Speech with tongue contour data use 2D / real-time 3D ultrasound 27 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Summary II Presentations and papers during the scholarship T. G. Csapó, S. M. Lulich, Comparison of tongue contour extraction methods from ultrasound images for use in TTS, Conf. of HCA, Bloomington, IN, USA, April 6, 2014. TGCs, SML, Comparison of tongue contour extraction methods, virtual presentation at the lab meeting of University of Arizona, May 13, 2014. TGCs, SML, Tongue contour tracings from 2D ultrasound image sequences: quantification of measurement error using manual and automatic tracing methods, in preparation, 2014. 28 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Summary III Presentations and papers after the scholarship R. Pedro, E. Mazzocco, TGCs, SML, Investigation of a tongue-internal coordinate system for two-dimensional ultrasound, 168th Meeting of the Acoustical Society of America, Indianapolis, IN, USA, Oct 27-31, 2014. D. Csopor, Mély neuronhálók alkalmazása ultrahangos nyelvkontúr követésre, supervised by TGCs, Scientific Students Association Annual Conference of BME VIK, Budapest, Nov 11, 2014. TGCs, D. Csopor, Ultrahangos nyelvkontúr követés automatikusan: a mély neuronhálókon alapuló AutoTrace eljárás vizsgálata, Beszédkutatás 2015, pp. 177-187, 2015. TGCs, SML, Error analysis of extracted tongue contours from 2D ultrasound images, submitted to Interspeech 2015. 29 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Summary IV Grants Bolyai post-doc grant, Modeling articulation using ultrasound, with special regard to text-to-speech synthesis (submitted). OTKA-NSF International collaboration grant (planned). 30 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary Acknowledgements Support from Fulbright Hungary Hungarian Academy of Engineering Thank you for your attention! http://csapobloomington.blogspot.hu/ 31 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary References I Baker, A., Mielke, J., and Archangeli, D. (2005). Tracing the tongue with GLoSsatron. In Ultrafest III, Tucson, AZ, USA. Benus, S. and Gafos, A. I. (2007). Articulatory characteristics of Hungarian transparent vowels. Journal of Phonetics, 35(3):271 300. Bressmann, T., Heng, C.-L., and Irish, J. C. (2005). Applications of 2D and 3D ultrasound imaging in speech-language pathology. Journal of Speech-Language Pathology and Audiology, 29(4):158 168. Li, M., Kambhamettu, C., and Stone, M. (2005). Automatic contour tracking in ultrasound images. Clinical Linguistics & Phonetics, 19(6-7):545 554. Mielke, J., Olson, K. S., Baker, A., and Archangeli, D. (2011). Articulation of the Kagayanen interdental approximant: An ultrasound study. Journal of Phonetics, 39(3):403 412. Németh, G. and Olaszy, G., editors (2010). A MAGYAR BESZÉD; Beszédkutatás, beszédtechnológia, beszédinformációs rendszerek. Akadémiai Kiadó, Budapest. 32 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary References II Stone, M. (2005). A guide to analysing tongue motion from ultrasound images. Clinical Linguistics & Phonetics, 19(6-7):455 501. Stone, M., Epstein, M. A., Li, M. I. N., and Kambhamettu, C. (2005). Predicting 3D Tongue Shapes from Midsagittal Contours. In Tabain, J. and M., editors, Speech Production: Models, Phonetic Processes, and Techniques, chapter 18, pages 315 331. Psychology Press. Stone, M. and Lundberg, A. (1996). Three-dimensional tongue surface shapes of English consonants and vowels. The Journal of the Acoustical Society of America, 99(6):3728 37. Stone, M., Sonies, B., Shawker, T., Weiss, G., and Nadel, L. (1983). Analysis of real-time ultrasound images of tongue configuration using a grid-digitizing system. Journal of Phonetics, 11:207 218. Sung, J.-H., Berry, J., Cooper, M., Hahn-Powell, G., and Archangeli, D. (2013). Testing AutoTrace: A Machine-learning Approach to Automated Tongue Contour Data Extraction. In Ultrafest VI, pages 9 10, Edinburgh, UK. Tang, L., Bressmann, T., and Hamarneh, G. (2012). Tongue contour tracking in dynamic ultrasound via higher-order MRFs and efficient fusion moves. Medical Image Analysis, 16(8):1503 1520. 33 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound

Introduction Methods & results Personal Summary References III Zharkova, N. (2013). A normative-speaker validation study of two indices developed to quantify tongue dorsum activity from midsagittal tongue shapes. Clinical Linguistics & Phonetics, 27(6-7):484 96. 34 / 40 Tamás Gábor CSAPÓ Investigating tongue movement with ultrasound