How our brains perceive categories of sounds

Task of the listener: Mapping sound to meaning (e.g., /kæt/). Requires converting the raw signal into the kinds of sound representations that connect to meaning. What are those sound representations like?

Task of the listener: Mapping sound to meaning. Stored sound representations must abstract away from a huge amount of physical variance in the signal: loudness, pitch, gender, speaker identity, accent and other variation in pronunciation, etc.

Categorization inside and outside language What makes a chair a chair? What makes a t a t? For the speaker? For the listener? Narrowing down the question: What makes a t a t, as opposed to a d?

Distinctive features

                        t               d
Major class features:   +consonantal    +consonantal
                        -sonorant       -sonorant
                        -syllabic       -syllabic
Manner features:        -nasal          -nasal
                        -continuant     -continuant
                        -lateral        -lateral
Laryngeal features:     -voiced         +voiced
Place of articulation:  -round          -round
                        +anterior       +anterior

Voiced: [+ voiced] sounds are produced with the vocal folds close together, in such a way that air passing through them causes them to vibrate. [Figure: human vocal tract] Touch your larynx while pronouncing [z] vs. [s] to feel that [z] is voiced and [s] is voiceless.

Voicing and stop consonants. Stop consonants: p, t, k, b, d, g. Produced by making a complete closure of the vocal tract and then releasing it; stops cannot be heard until the closure is released. All vowels are voiced. When the closure is released and the voicing of the following vowel starts quickly after the release, the stop is considered [+ voiced]. The difference between [d] and [t] is that the onset of voicing of the following vowel takes longer for [t] than for [d].

Acoustically, voiced and voiceless stops differ in their Voice Onset Time (VOT). VOT: the time between the release of the stop closure and the onset of voicing for the following vowel.
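
As a concrete illustration, here is a minimal sketch of the arithmetic in Python, assuming the release and voicing-onset times have already been measured (e.g., hand-labeled from a spectrogram); the function name and the timestamps are hypothetical.

def vot_ms(release_time_s, voicing_onset_s):
    """Voice Onset Time in milliseconds: the lag between the release of
    the stop closure and the onset of vocal-fold vibration."""
    return round((voicing_onset_s - release_time_s) * 1000.0, 1)

# A [da]-like token: voicing begins ~10 ms after the release.
print(vot_ms(release_time_s=0.100, voicing_onset_s=0.110))  # 10.0
# A [ta]-like token: voicing begins ~60 ms after the release.
print(vot_ms(release_time_s=0.100, voicing_onset_s=0.160))  # 60.0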

VOT continuum for ta-da. [Figure: the continuum with the English boundary marked; 60 msec]

Categorical perception of t-d. Eimas & Corbit (1973): perception of synthesized sounds with VOTs increasing in equal steps from 0 to 80 ms. Listeners judged whether they heard a [da] or a [ta].

[Figure: /d/ category and /t/ category along the VOT continuum] Within-category discrimination is poor. How do we learn to group sounds in this way? What is it good for?

A change within the /d/ category's VOT range, or within the /t/ category's range, is never associated with a change in meaning in English. So being able to discriminate between 10 and 20 ms VOTs, or between 60 and 70 ms VOTs, won't help you understand English.
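
To see why, consider a minimal sketch of the category filter. The 40 ms boundary is an assumed round number (the actual English /d/-/t/ boundary varies with place of articulation and speaking rate), and categorize/same_category are illustrative helpers, not a model from the slides.

BOUNDARY_MS = 40.0  # assumed English /d/-/t/ boundary; varies in reality

def categorize(vot_ms):
    """Map a continuous VOT value onto a discrete phoneme category."""
    return "/t/" if vot_ms >= BOUNDARY_MS else "/d/"

def same_category(vot1_ms, vot2_ms):
    """Two tokens are 'the same sound' if they receive the same label."""
    return categorize(vot1_ms) == categorize(vot2_ms)

print(same_category(10, 20))  # True: both /d/, so the step carries no contrast
print(same_category(60, 70))  # True: both /t/
print(same_category(30, 50))  # False: same 20 ms step, but it crosses the boundary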

Across the boundary, a short VOT vs. a long VOT gives different words: doll vs. tall. Two VOTs on the same side of the boundary could not be different words.

Phoneme A category of sounds. A mental abstraction over a group of sounds within which interchanging one token for another can never induce a change in meaning. Since going from /t/ to /d/ in English can induce a change in meaning, /t/ and /d/ are distinct phonemes in English.

/kæt/ Requires converting the raw signal into the kinds of sound representations that connect to meaning. What are those sound representations like? Sequences of phonemes!

What do we know about the brain bases of phonological categorization?

An auditory evoked neuromagnetic field: the M100 field pattern. Roberts, Ferrari, Stufflebeam & Poeppel (2000), J Clin Neurophysiol, 17(2).

M100 (N1 in ERP terms). Generated in auditory cortex bilaterally. Affected by: intensity (higher intensity: shorter latency, larger amplitude); frequency (higher frequency: shorter latency); and phonetic identity (e.g., shorter latencies for [a] than for [u], explainable in spectral terms). The function of M100 activity in speech processing is unclear.

Auditory Mismatch Field A change detector. Not part of the basic processing of sounds.

Auditory Mismatch Field. Elicited in auditory cortex at ~180 ms post-stimulus onset by deviant stimuli in an oddball paradigm: XXXYXXXXYXXXXYXXX (X = standard, Y = deviant).
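
A minimal sketch of generating such a sequence, assuming deviants are sampled independently with a fixed probability; real designs usually also enforce a minimum number of standards between deviants.

import random

def oddball_sequence(n_trials, p_deviant=0.15, standard="X", deviant="Y"):
    """Return a stimulus sequence of frequent standards and rare deviants."""
    return [deviant if random.random() < p_deviant else standard
            for _ in range(n_trials)]

random.seed(0)  # for a reproducible example
print("".join(oddball_sequence(30)))  # e.g. XXXXXYXXXXXYXXXX...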

Auditory Mismatch Field A tool for investigating what counts as a change for auditory cortex. Has been heavily used in the study of the neural bases of categorical perception.

Mismatch field as a function of frequency change (Sams et al., 1985).

The Mismatch Field and Phonemic Categorization. Would crossing a phonemic boundary elicit a mismatch field? If yes, this would tell us that auditory cortex has access to phonemic categories, and that category information is accessed by 180 ms.

Voice Onset Time (VOT). [Figure: VOT continuum; 60 msec]

Phillips et al. (2000, Journal of Cognitive Neuroscience). [Figure, acoustic representation: standards with VOTs varying between 0 and 24 ms on one side of the perceptual boundary; deviants /t/ at 45, 56, and 64 ms on the other side]

Phillips et al. (2000, Journal of Cognitive Neuroscience). [Figure, phonological representation: the same stimuli labeled by category, with all standards on the /d/ side of the boundary and all deviants on the /t/ side]

Phillips et al. (2000, Journal of Cognitive Neuroscience). [Figure: same stimuli as above] If the auditory cortex is sensitive to phonological categories, the deviant Ts should elicit a Mismatch Field.

What should happen if we keep everything else the same, but lift all the VOTs by 20 ms? [Figure: the original design, with deviants at 45, 56, and 64 ms and standards between 0 and 24 ms, separated by the perceptual boundary]

[Figure: the shifted design. Deviants at 65, 76, and 84 ms; standards with VOTs between 20 and 44 ms, now straddling the perceptual boundary, so that some standards are themselves /t/] The deviants should not elicit a Mismatch Field: they are not perceived as different from the standards.

Phillips et al. (2000, Journal of Cognitive Neuroscience). Experiment 1: the standards fall on one side of a perceptual boundary, the deviants on the other. Experiment 2: the physical distance between standards and deviants is the same as in Exp. 1, but the standards and deviants are no longer separated by a perceptual boundary. If the Mismatch Field elicited in Exp. 1 is driven by the category difference between standards and deviants, it should disappear in Exp. 2.
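
The logic of the two experiments can be captured in a short sketch. The VOT values are approximate readings from the slides; the 40 ms boundary and the categorize helper are the same illustrative assumptions used above.

BOUNDARY_MS = 40.0  # assumed boundary, as above

def categorize(vot_ms):
    return "/t/" if vot_ms >= BOUNDARY_MS else "/d/"

# Experiment 1: standards all below the boundary, deviants all above it.
exp1 = {"standards": [0, 5, 15, 16, 18, 24], "deviants": [45, 56, 64]}
# Experiment 2: everything lifted by 20 ms; the standards now straddle it.
exp2 = {"standards": [20, 25, 35, 36, 38, 44], "deviants": [65, 76, 84]}

for name, exp in [("Exp 1", exp1), ("Exp 2", exp2)]:
    std_labels = {categorize(v) for v in exp["standards"]}
    dev_labels = {categorize(v) for v in exp["deviants"]}
    # Many-to-one at the category level: all standards share one label,
    # and no deviant shares it.
    many_to_one = len(std_labels) == 1 and std_labels.isdisjoint(dev_labels)
    print(name, "category-level many-to-one:", many_to_one)
# Exp 1 -> True (Mismatch Field expected); Exp 2 -> False (none expected)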

Phillips et al., 2000, Journal of Cognitive Neuroscience

Phillips et al. (2000, Journal of Cognitive Neuroscience). A Mismatch Field is elicited only when the standards fall on one side of a perceptual boundary and the deviants on the other. Auditory cortex has access to phonological category information at ~180 ms (shortly after the M100). In this example, category membership depended on voicing. Different stop consonants have different perceptual boundaries for voicing. If the mismatch response is sensitive to voicing, we should be able to elicit it just for that feature, even across phonemic categories.

Multiple places of articulation pæ tæ tæ kæ dæ pæ kæ tæ pæ kæ bæ tæ pæ kæ gæ (Phillips, Pellathy & Marantz 2000)

Multiple places of articulation: pæ tæ tæ kæ dæ pæ kæ tæ pæ kæ bæ tæ pæ kæ gæ, where the deviants dæ, bæ, and gæ are [+voi]. The voiceless phonemes stand in a many-to-one ratio to the [+voice] phonemes; there are no other many-to-one ratios in this sequence. (Phillips, Pellathy & Marantz 2000)
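
A quick tally over the sequence makes the same point computationally; the feature values below are standard phonological assignments, and the counts are computed from the slide's sequence.

from collections import Counter

VOICE = {"p": "-voi", "t": "-voi", "k": "-voi",
         "b": "+voi", "d": "+voi", "g": "+voi"}
PLACE = {"p": "labial", "b": "labial",
         "t": "coronal", "d": "coronal",
         "k": "dorsal", "g": "dorsal"}

# Onsets of the sequence pæ tæ tæ kæ dæ pæ kæ tæ pæ kæ bæ tæ pæ kæ gæ
sequence = "p t t k d p k t p k b t p k g".split()

print(Counter(VOICE[c] for c in sequence))  # Counter({'-voi': 12, '+voi': 3})
print(Counter(PLACE[c] for c in sequence))  # labial 5, coronal 5, dorsal 5
# Voicing shows a many-to-one (12:3) ratio; place of articulation is balanced.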


Stimuli

Mismatch response elicited by voicing

Left laterality of the language-related Mismatch Field. Nonlinguistic sounds elicit a bilateral Mismatch Field, i.e., both auditory cortices are "surprised." Crossing a phoneme boundary in an oddball task elicits a Mismatch Field in the left hemisphere only. The left-hemisphere auditory cortex is specialized for phoneme perception.

Mismatch Summary. By 180 ms, auditory cortex has extracted from the input the relevant features needed for categorizing sounds into phonemes. Since the Mismatch Field is a surprise response, the computational steps needed for categorization must occur prior to 180 ms. How exactly this happens, we don't currently know.