Hearing. 1 Introduction. 1.2 Psychoacoustics. 1.1 Auditory system

Hearing Hearing 1 1 Introduction Hearing 2 Sources: Rossing. (1990). The science of sound. Chapters 5 7. Karjalainen. (1999). Kommunikaatioakustiikka. Pulkki, Karjalainen (2015). Communication acoustics Moore. (1997). An introduction to the psychology of hearing. Contents: 1. Introduction 2. Ear physiology 3. Loudness 4. Masking 5. Pitch 6. Spatial hearing! Auditory system can be divided in two parts Peripheral auditory system (outer, middle, and inner ear) Auditory nervous system (in the brain)! Ear physiology studies the peripheral system! Psychoacoustics studies the entire sensation: relationships between sound stimuli and the subjective sensation 1.1 Auditory system Hearing 3 1.2 Psychoacoustics Hearing 4! Dynamic range of hearing is wide ratio of a very loud to a barely audible sound pressure level is 1:10 6 (120 db)! Frequency range of hearing varies a lot between individuals only few can hear from 20 Hz to 20 khz sensitivity to low sounds (< 100Hz) is not very good sensitivity to high sounds (> 12 khz) decreases along with age! Selectivity of hearing listener can pick an instrument from among an orchestra listener can follow a speaker at a cocktail party One can sleep in background noise but still wake up to an abnormal sound! Perception involves information processing in the brain Information about the brain is limited! Psychoacoustics studies the relationships between sound stimuli and the resulting sensations Attempt to model the process of perception For example trying to predict the perceived loudness / pitch / timbre from the acoustic properties of the sound signal! In a psychoacoustic listening test Test subject listens to sounds Questions are made or the subject is asked to describe her sensasions

2 Ear physiology! The human ear consists of three main parts: (1) outer ear, (2) middle ear, (3) inner ear Hearing 5 Hearing 6 2.1 Outer ear! Outer ear consists of: pinna gathers sound; direction-dependent response auditory canal (ear canal) - conveys sound to middle ear, and acts as a resonator at 3-4 khz amplifying sound +10 db! Outer ear is passive and linear, and its behavior can be completely described by laws of acoustic wave propagation. Nerve signal to brain [Chittka05] 2.2 Middle ear Hearing 7! Middle ear contains Eardrum that transforms sound waves into mechanic vibration Tiny audtory bones (ossicles): 1 Malleus (resting against the eardrum, see figure), 2 Incus and 3 Staples! The ossicles transmit eardrum vibrations (in air) to the oval window of the inner ear (filled with fluid)! Acoustic reflex: when sound 1 2 pressure level exceeds 50-60 db, eardrum tension increases and staples is removed from oval window 3 Protects the inner ear from damage 2.3 Inner ear, cochlea Hearing 8! The inner ear contains the cochlea: a fluid-filled organ where vibrations are converted into nerve impulses to the brain.! Cochlea = Greek: snail shell.! Spiral tube: When stretched out, approximately 35 millimeters long.! Vibrations on the cochlea s oval window cause hydraulic pressure waves inside the cochlea! Inside the cochlea there is the basilar membrane,! On the basilar membrane there is the organ of Corti with nerve cells that are sensitive to vibration! Nerve cells transform movement information into neural impulses in the auditory nerve

2.4 Basilar membrane Hearing 9 Basilar membrane Hearing 10! Figure: cochlea stretched out for illustration purposes Basilar membrane divides the fluid of the cochlea into separate tunnels When hydraulic pressure waves travel along the cochlea, they move the basilar membrane from oval window to round window. apex! Different frequencies produce highest amplitude at different sites! Preliminary frequency analysis happens on the basilar membrane Travelling waves: base apex base Best freq (Hz) 2.5 Sensory hair cells! Distributed along the basilar membrane are sensory hair cells that transform membrane movement into neural impulses! When a hair cell bends, it generates neural impulses Impulse rate depends on vibration amplitude and frequency Hearing 11! Depending on the position on the basilar membrane, each nerve cell has a characteristic frequency to which it is most responsive to (right fig.: tuning curves of 6 different cells from different positions)! Left fig: Varying the amplitude of stimulus (db) broadens the range of frequencies that the hair cell reacts to 3 Loudness! Loudness describes the subjective level of sound Perception of loudness is relatively complex, but consistent phenomenon and one of the central parts of psychoacoustics Hearing 12! The loudness of a sound can be compared to a standardized reference tone, for example 1000 Hz sinusoidal tone Loudness level (phon) is defined to be the sound pressure level (db) of a 1000 Hz sinusoidal, that has the the same subjective loudness as the target sound For example if the heard sound is perceived as equally loud as 40 db 1kHz sinusoidal, is the loudness level 40 phons

3.1 Equal-loudness curves Sound pressure level (db) Loudness level (phons) Frequency (Hz) Hearing 13 3.2 Critical bands Hearing 14! Comparing to two band-limited noises with same center frequencies and increasing the other s bandwidth, the perceived loudness increases only after a critical bandwidth Figure: 1 khz @ 60 db, Critical bandwidth is 160 Hz at 1 khz! Ear analyzes sound at critical band resolution. Each critical band contributes to the overall loudness level! Each center frequency f c has a critical bandwidth Δf (Bark bandwidths)! Equivalent rectangular bandwidth (ERB) is a different type of procedure to measure a rectangular bandwidth of auditory filters.! Both bandwidths can be used to define a frequency Bark and ERB scales Stack bandwidths on top of each other 3.3 Perceptually-motivated frequency scales Hearing 15 3.4 Loudness of a complex sound Hearing 16 mm. on basilar membrane frequency / khz frequency / mel frequency / Bark! A broadband sound is perceived louder than a narrowband sound of equal SPL, as shown by the critical bandwidth experiment.! Loudness of a complex sound is calculated by using socalled specific loudness of a critical band as intermediate unit! Specific loudness is roughly proportional to the log-power of the signal at the band (weighted according to sensitivity of hearing and spread slightly by convolving over bands)! Overall loudness is obtained by summing up specific loudness values over critical band

3.5 Measuring sound level! SPL does not work well as a perceptual metric of sound loudness. Hearing 17! Different frequency weighting curves (A,B,C,D) can be applied to measure sound level, which is more related to frequency sensitivity of hearing! A-weighting is most common, referred as db(a), it roughly resembles the inverse of the hearing threshold curve. wikipedia 4 Masking! Masking describes the situation where a weaker but clearly audible signal (maskee, test tone) becomes inaudible in the presence of a louder signal (masker)! Masking depends on both the spectral structure of the sounds and their variation over time Hearing 18 4.1 Masking in frequency domain Hearing 19 Masking in frequency domain Hearing 20! Model of the frequency analysis in the auditory system subdivision of the frequency axis into critical bands frequency components within a same critical band mask each other easily Bark & ERB scales: frequency scales that are derived by mapping frequencies to critical band numbers! Narrowband noise masks a tone (sinusoidal) easier than a tone masks noise! Masked threshold refers to the raised threshold of audibility caused by the masker sounds with a level below the masked threshold are inaudible masked threshold in quiet = threshold of hearing in quiet! Figure: masked thresholds [Herre95] masker: narrowband noise around 250 Hz, 1 khz, 4 khz spreading function: the effect of masking extends to the spectral vicinity of the masker (spreads more towards high freqencies)! Additivity of masking: joint masked thresh is approximately (but slightly more than) sum of the components

4.2 Masking in time domain! Forward masking masking effect extends to times after the masker is switched off! Backwards masking masking extends to times before the masker is been switched on! Forward/backward masking does not extend far in time " simultaneous masking is more important phenomenon backward masking forward masking Hearing 21 5 Pitch Hearing 22! Pitch Subjective attribute of sounds that enables us to arrange them on a frequency-related scale ranging from low to high Sound has a certain pitch if human listaners can consistently match the frequency of a sinusoidal tone to the pitch of the sound! Fundamental frequency vs. pitch Fundamental frequency is a physical attribute Pitch is a perceptual attribute Both are measured in Hertz (Hz) In practise, perceived pitch fundamental frequency! "Perfect pitch" or "absolute pitch" - ability to recognize the pitch of a musical note without any reference Minority of the population can do that 5.1 Harmonic sound! For a sinusoidal tone Fundamental frequency = sinusoidal frequency Pitch sinusoidal frequency! Harmonic sound Hearing 23 Trumpet sound: * Fundamental frequency F = 262 Hz * Wavelength 1/F = 3.8 ms Hearing 24 5.2 Pitch perception! Pitch perception has been tried to explain using two competing theories Place theory: Peak activity along the basilar membrane determines pitch (fails to explain missing fundamental) Periodicity theory: Pitch depends on rate, not place, of response. Neurons fire in sync with signals! The real mechanism is a combination of the above Sound is subdivided into subbands (critical bands) Periodicity of the amplitude envelope (see lowest panel) is analyzed within bands Results are combined across bands

6 Spatial hearing Hearing 25 6.1 Monaural source localization Hearing 26! The most important auditory cues for localizing a sound sources in space are 1. Interaural time difference (ITD) Is relatively constant with frequency 2. Interaural intensity difference (ILD) Increases with frequency 3. Direction-dependent filtering of the sound spectrum by head and pinnae! Terms Monaural : with one ear Binaural : with two ears Interaural : between the ears (interaural time difference etc) Lateralization : localizing a source in horizontal plane! Diretional hearing works to some extent even with one ear! Head and pinna form a direction-dependent filter Direction-dependent changes in the spectrum of the sound arriving in the ear can be described with HRTFs HRTF = head-related transfer function! HRTFs are crucial for localizing sources in the median plane (vertical localization) Monaural source localization Hearing 27 6.2 Localizing a sinusoidal Hearing 28! HRTFs can be measured by recording Sound emitted by a source Sounds arriving to the auditory canal or eardrum (transfer function of the auditory canal does not vary along with direction)! In practice left: microphone in the ear of a test subject, OR right: head and torso simulator! Experimenting with sinusoidal tones helps to understand the localization of more complex sounds! Angle-of-arrival perception for sinusoids below 750 Hz is based mainly on interaural time difference

Localizing a sinusoidal Hearing 29 6.3 Localizing complex sounds Hearing 30! Interaural time difference is useful only up to 750 Hz Above that, the time difference is ambiguous, since there are several wavelengths within the time difference Moving the head (or source movement) helps: can be done up to 1500 Hz! At higher frequencies (> 750 Hz) the auditory system utilizes interaural intensity difference Head causes and acoustic shadow (sound level is lower behind the head) Works especially at high frequencies with shorter wavelenghts compared to head dimensions! Complex sounds refer to sounds that involve a number of different frequency components and vary over time! Localizing sound sources is typically a result of combining all the above-described mechanisms 1. Interaural time difference (most important) 2. Interaural intensity difference 3. HRTFs! Wideband noise: directional hearing works well 6.4 Lateralization in headphone listening Hearing 31 6.5 Precedence effect Hearing 32! When listening with headphones, the sounds are often localized inside the head, on the axis between the ears Sound does not seem to come from outside the head because the diffraction caused by pinnae and head is missing If the sounds are processed with HRTFs carefully, they move outside the head! The sound in indoors that reaches the listener contains the direct path, early reflections, and late reverberation.! The brain suppresses early reflected sounds to aid in source direction perception Early reflections are not heard as separate sounds. Instead, they amplify the loudness of direct sound! Precedence effect works when 1. Early reflections arrive less than ~ 35ms after direct sound. 2. The spectrum of a reflection is similar to that of the direct sound 3. Reflections are lower or at least not signigicantly louder than direct sound Even a sound with +10 db SPL than direct sound can not be heard