Spatialized Audio Rendering for Immersive Virtual Environments
Martin Naef, Markus Gross (Computer Graphics Laboratory, ETH Zurich)
Oliver Staadt (Computer Science Department, UC Davis)
Context: The blue-c
- Collaborative Immersive Virtual Reality Environment
- Provide remote collaboration features
  - Shared, synchronized virtual world
  - Render partners using 3D video streams
- Concurrent rendering and acquisition
- http://blue-c.ethz.ch
Audio for VR
- Increase the sense of presence
- Guide the interest of the user
- Provide cues for orientation
- Requires 3D sound rendering
- Requires some form of room simulation
Overview
- Available systems and technology
- System overview
- Audio rendering pipeline
  - Sound sources
  - Simulation of physical effects
  - 3D positioning and mixdown
- API integration
- Experiments
Rendering Options
- Using headphones
  - Model head/pinnae using HRTF
  - Head-tracking for each user required
  - Calibration for individual users required
- Using multiple speakers
  - Multi-channel hardware required
  - Speaker placement is critical
Available Systems: High-end systems
- Offline calculation of impulse responses
  - E.g. CATT-Acoustics
- Convolution processors
  - E.g. Lake
- Expensive!
Available Systems: Low-end systems
- PC sound cards
  - DirectSound, EAX, OpenAL
  - Speaker placement
  - No sample clock synchronization
Design Goals
- Good sound quality at moderate cost
  - Believable results
  - Not necessarily physically correct
- Support for networked sound
  - Needed for remote collaboration
- Flexible speaker placement
- Efficient implementation on standard hardware
  - At least 20 sound sources
System Overview
- Part of the blue-c software core
[Figure: system block diagram with the labels Application, Scene, sync, Graphics System, Graphics Pipeline, Sound System, Localization Pipeline (n), Mix Bus, Reverb]
Audio Rendering Pipeline
[Figure: pipeline stages Distance Delay, Air Absorption, Distance Gain, 3D Positioning, Reverb, Projection (with Head Tracking), Room-EQ, Speakers, and LPF feeding Sub]
Audio Sources
- Recorded audio
  - Mono samples or loops for effects
  - Multi-channel files for background music
- Live input
  - Microphones
  - External synthesizer or sampler
- Networked input
  - Remote microphones for collaboration
Audio Sources
- Keep all state information
  - 3D position
  - Reference distance
  - Gain
- Temporary rendering data
  - Filter coefficients and state
  - Delay lines
  - Mixdown matrix
Distance Delay
- Simulate propagation speed of sound (300 m/s)
- Store and delay samples in a memory buffer
- Keep independent write and read pointers
- Read pointer is moved according to distance
- Linear interpolation of time and samples
- Results in a frequency shift (Doppler effect)
[Figure: delay line with Write Data and Read Data pointers; positions labeled P_cur, P_cur + t_block, P_cur + t_d.last, P_cur + t_d + t_block]
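The delay-line idea on this slide can be sketched as follows. This is a minimal illustration, not the blue-c implementation: the names `DelayLine`, `render_block` and `delay_in_samples` are hypothetical, and the delay is assumed to be interpolated linearly from the previous block's value to the current one. The speed of sound is a parameter; the default uses the slide's 300 m/s.

```python
import math


class DelayLine:
    """Circular buffer with a fractional, linearly interpolated read position."""

    def __init__(self, max_delay_samples):
        # Two extra slots so that reading at the maximum delay stays valid.
        self.buf = [0.0] * (max_delay_samples + 2)
        self.write_pos = 0

    def write(self, sample):
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % len(self.buf)

    def read(self, delay_samples):
        """Return the value written `delay_samples` ago (may be fractional)."""
        pos = (self.write_pos - 1 - delay_samples) % len(self.buf)
        i = int(math.floor(pos))
        frac = pos - i
        a = self.buf[i]
        b = self.buf[(i + 1) % len(self.buf)]
        return a + frac * (b - a)  # linear interpolation between samples


def delay_in_samples(distance_m, sample_rate=44100.0, speed_of_sound=300.0):
    """Convert a source distance into a delay expressed in samples."""
    return distance_m / speed_of_sound * sample_rate


def render_block(line, block, delay_last, delay_cur):
    """Write one block and read it back while the delay glides linearly
    from delay_last to delay_cur (assumed update scheme)."""
    out = []
    n = len(block)
    for k, x in enumerate(block):
        line.write(x)
        d = delay_last + (delay_cur - delay_last) * (k + 1) / n
        out.append(line.read(d))
    return out
```

When the delay shrinks during a block (the source approaches), the read pointer advances faster than the write pointer and the signal is played back at a higher pitch, which is the Doppler shift mentioned above.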
Air Absorption
- Frequency-dependent power loss
  - Higher frequencies are attenuated more
  - Only perceivable for large distances (approx. -4 dB per 1000 m at 1 kHz)
- Simplified model
  - High-shelving filter (bi-quad)
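A high-shelving bi-quad of the kind named on this slide could look like the sketch below. The coefficient formulas follow the widely used Audio EQ Cookbook high-shelf design. How the slide's figure of about -4 dB per 1000 m is mapped onto the shelf gain and corner frequency is not stated in the source, so the linear mapping in `air_absorption_filter` and all names here are assumptions for illustration only.

```python
import math


def high_shelf_coeffs(gain_db, corner_hz, sample_rate=44100.0, slope=1.0):
    """Normalized bi-quad coefficients (b0, b1, b2), (a1, a2) for a high shelf."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * corner_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((A + 1.0 / A) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(A) * alpha
    b0 = A * ((A + 1) + (A - 1) * cos_w0 + k)
    b1 = -2 * A * ((A - 1) + (A + 1) * cos_w0)
    b2 = A * ((A + 1) + (A - 1) * cos_w0 - k)
    a0 = (A + 1) - (A - 1) * cos_w0 + k
    a1 = 2 * ((A - 1) - (A + 1) * cos_w0)
    a2 = (A + 1) - (A - 1) * cos_w0 - k
    return (b0 / a0, b1 / a0, b2 / a0), (a1 / a0, a2 / a0)


class Biquad:
    """Direct form I bi-quad; keeps its own filter state per source."""

    def __init__(self, b, a):
        self.b, self.a = b, a
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, x):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def air_absorption_filter(distance_m, db_per_km=-4.0, corner_hz=1000.0,
                          sample_rate=44100.0):
    """Assumed mapping: shelf gain grows linearly with distance."""
    gain_db = db_per_km * distance_m / 1000.0
    b, a = high_shelf_coeffs(gain_db, corner_hz, sample_rate)
    return Biquad(b, a)
```

At zero distance the shelf gain is 0 dB and the filter passes the signal unchanged; low frequencies are left untouched at any distance.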
Distance Gain
- Power loss according to distance
- Uses reference distance
$L_d = \frac{D_{ref}}{D_s}$
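As a small worked example, assume the distance gain is the ratio of the reference distance to the source distance, $D_{ref} / D_s$. The function names and the `min_distance` clamp (added to avoid a division by zero) are hypothetical.

```python
import math


def distance_gain(source_distance, ref_distance, min_distance=1e-3):
    """Amplitude gain D_ref / D_s; equals 1.0 at the reference distance."""
    d = max(source_distance, min_distance)  # clamp is an added safeguard
    return ref_distance / d


def to_db(gain):
    """Convert an amplitude gain to decibels."""
    return 20.0 * math.log10(gain)
```

With this model every doubling of the distance halves the amplitude, a drop of roughly 6 dB, which corresponds to the inverse-square loss of power.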
3D Positioning
- Simulate source direction using a discrete, small number of speakers
- Distribute mono-stream onto multiple speaker channels
3D Positioning
- Calculate channel gains with dot-product
- Open up "active angle" to avoid differences in perceived spread
- Normalize gain factors
$L_{chn} = \max\left(\frac{\vec{v}_{spk} \cdot \vec{v}_s + 0.1}{1.1},\ 0\right)$
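The dot-product panning can be sketched as below, assuming the per-channel gain is max((v_spk · v_s + 0.1) / 1.1, 0) with unit direction vectors. The slide does not say how the gains are normalized; this sketch assumes constant total power (sum of squared gains equal to 1). The names `channel_gains` and `spread` are hypothetical.

```python
import math


def _unit(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def channel_gains(source_dir, speaker_dirs, spread=0.1):
    """Per-speaker gains for one source direction.

    The offset `spread` (0.1 on the slide) keeps speakers slightly more than
    90 degrees away from the source active, widening the "active angle".
    """
    s = _unit(source_dir)
    raw = []
    for spk in speaker_dirs:
        u = _unit(spk)
        dot = sum(a * b for a, b in zip(u, s))
        raw.append(max((dot + spread) / (1.0 + spread), 0.0))
    # Assumed normalization: constant power across all channels.
    power = math.sqrt(sum(g * g for g in raw))
    if power == 0.0:
        return raw
    return [g / power for g in raw]
```

For four speakers on the horizontal axes and a source straight ahead of the first one, the opposite speaker stays silent while the two side speakers contribute a small share.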
Loudness Projection
- Correct the individual channels to move "sweet spot"
- Use head-tracking information
- Allows irregular distances to the listener for individual speakers
$L_{spk} = \frac{D_{spk}}{D_{spkref}}$
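A sketch of the per-speaker correction, assuming the gain is the tracked listener-to-speaker distance divided by a reference speaker distance (D_spk / D_spkref). The function name and the choice of a single reference distance for all speakers are assumptions.

```python
import math


def projection_gains(head_pos, speaker_positions, ref_distance):
    """Gain per speaker channel: nearer speakers are attenuated, farther
    ones boosted, so the perceived balance follows the tracked head."""
    return [math.dist(head_pos, p) / ref_distance for p in speaker_positions]
```

For a listener standing 1 m from one speaker and 2 m from another, with a 2 m reference, the nearer channel is scaled by 0.5 and the farther one is left at 1.0.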
Room Simulation
- Simulate room echo
  - Provide a sense of the size and material of the acoustic space
- Two fundamental approaches
  - Simulated impulse response (large FIR filters)
  - Parameterized reverberation algorithms
Room Simulation
- Separate send channel to studio effect processor (t.c. M-ONE XL)
  - Provides smooth, pleasing reverb
  - Intuitive parameterization
- Mix reverb output onto the mix bus
- Use effect send gain as additional distance cue
  - High direct-sound to room echo ratio for close sounds
$L_{rev} = 1 - \left(\frac{D_{ref}}{D_s + D_{ref}}\right)^2$
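The distance-dependent effect send can be illustrated as follows, assuming the send gain has the form 1 - (D_ref / (D_s + D_ref))^2. That form is a reading of the slide's formula, and the function name is hypothetical.

```python
def reverb_send_gain(source_distance, ref_distance):
    """Send level to the reverb unit: 0 for a source at the listener,
    approaching 1 as the source moves far away."""
    ratio = ref_distance / (source_distance + ref_distance)
    return 1.0 - ratio * ratio
```

A source at the reference distance gets a send gain of 0.75, while a very close source is almost dry, which keeps the direct-to-reverberant ratio high for nearby sounds.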
Room EQ and LF Management
- Parametric equalizer in the mixing bus adapts the output to the acoustic environment
  - Attenuate resonant frequencies
  - Account for non-linear speaker response
- Low-frequency management
  - Low-pass filter a sum signal to drive the subwoofer
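The low-frequency management step can be sketched with a simple one-pole low-pass applied to the sum of all speaker channels. The slides do not specify the filter type or cutoff; the one-pole design, the 80 Hz default and the name `SubwooferFeed` are assumptions chosen for brevity.

```python
import math


class SubwooferFeed:
    """Sums the speaker channels and low-pass filters the result."""

    def __init__(self, cutoff_hz=80.0, sample_rate=44100.0):
        # One-pole smoothing coefficient for the given cutoff.
        self.coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
        self.state = 0.0

    def process(self, frame):
        """frame: one sample per speaker channel; returns the sub sample."""
        total = sum(frame)
        self.state += self.coeff * (total - self.state)
        return self.state
```

A production system would more likely use a steeper crossover filter; the structure (sum first, then filter once) is the point of the example.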
Fused Pipeline
- Mix signal onto main bus using a single mixdown-matrix gain
  - Distance gain
  - Position and projection gain for each speaker channel
- Steps are reduced into a single vector-matrix multiplication
[Figure: fused pipeline with the stages Distance Delay, Air Absorption, Mix Matrix (fed by 3D Positioning), Rev., Room-EQ, Speakers, LPF, Sub]
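The fusion of the gain stages can be sketched as below: each source contributes one matrix row that already contains the product of its distance gain, its per-speaker position gains and the per-speaker projection gains, so mixing one frame is a single vector-matrix product. The function names and the list-of-lists matrix layout are hypothetical.

```python
def mix_row(distance_gain, position_gains, projection_gains):
    """One row of the mixdown matrix: the fused gain per speaker channel."""
    return [distance_gain * p * q
            for p, q in zip(position_gains, projection_gains)]


def mix_frame(samples, matrix):
    """samples[s]: current sample of source s.
    matrix[s][c]: fused gain of source s on speaker channel c.
    Returns one output sample per speaker channel."""
    n_channels = len(matrix[0])
    return [sum(samples[s] * matrix[s][c] for s in range(len(samples)))
            for c in range(n_channels)]
```

Fusing the gains this way means the per-sample cost no longer depends on how many gain stages feed the matrix, only on the number of sources and channels.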
Fused Pipeline
- Mixdown matrix
  - Calculated at audio block boundaries
  - Linear interpolation between last and current matrix (every 32 samples)
  - Provides smooth transition between different positions
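One way to read the interpolation scheme is sketched below: the matrix row of a source is known at the previous and the current block boundary, and within the block the effective gains are updated in steps of 32 samples along a straight line between the two. This reading, the function name `mix_block` and the in-place bus layout are assumptions.

```python
def mix_block(block, row_last, row_cur, bus, step=32):
    """Add one mono source block onto the multi-channel bus.

    block:    samples of the source for this audio block
    row_last: per-channel gains at the previous block boundary
    row_cur:  per-channel gains at the current block boundary
    bus:      bus[c][k] accumulates channel c, sample k (modified in place)
    """
    n = len(block)
    for start in range(0, n, step):
        end = min(start + step, n)
        t = end / n  # interpolation position reached by this sub-block
        gains = [a + t * (b - a) for a, b in zip(row_last, row_cur)]
        for k in range(start, end):
            for c, g in enumerate(gains):
                bus[c][k] += block[k] * g
```

Updating the gains in small steps avoids the audible clicks that a single jump at the block boundary would produce when a source moves quickly.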
API Integration
- Sound service in the blue-c API core
  - Control sound sources and system
- Audio nodes in the scene graph
  - Sound as object attribute
  - Support transformation nodes
  - Provide translation between virtual (scene) and real (physical setup) coordinate systems
  - Triggered as the user approaches an object
Benchmarks
- Single MIPS R12000 CPU, 400 MHz
- 44.1 kHz sampling rate, 20 ms latency, 8-channel ADAT input/output

Number of sources   Mono   Stereo   Localized
Preload             78     33       37
Live                54     25       30
Stream              65     31       33

- Delay line is expensive
- Latency has little influence
Applications
- Used for several applications
  - Landscape (ship seeking test)
  - Infoticles
  - "Fashion show" blue-c feature demo
  - Collaborative chess
Conclusions
- High quality sound system
- Based on standard components
  - Moderate cost
  - ~US$5000 for audio system
- Fully integrated into VR toolkit
Future Work
- Integration into area management
  - Culling of sound sources
  - Portal effects
  - Assign reverberation parameters to areas
- Linux port
- Run as standalone server
Questions?
http://blue-c.ethz.ch
Related Work - Acoustics
- [Begault:94] Overview
- [Gardner:92] Virtual Acoustics / Reverb
- [Krockstadt:68] Ray-tracing
- [Funkhouser:99] Beam-tracing
- [Gardner:94] HRTF
- [Pulkki:99] VBAP
- [Start:97] Wavefield Synthesis
Related Work - VR
- [Takala:92] Sound Rendering
- [Tsingos:97] Soundtracks for animation
- [Eckel:99] Cyberstage Sound Server
- [Jot:99] IRCAM Spatialisateur
- [Huopaniemi:99] DIVA
Implementation Notes
- Rendering runs in its own process
  - Sound sources can be added and modified at any time
  - Parameter updates only at block boundaries
- Runs on SGI Onyx 3200 (MIPS R12000, 400 MHz)
  - I/O through 8-channel ADAT
- Inexpensive studio hardware and speakers
  - ~US$5000 for audio system
[Figure: class diagram with the labels SoundService, Sound, PreloadData, Preload, LiveInput, Stream, 3DPositioning, ReverbControl]
Speaker Placement
- More speakers mean better localization
  - 6 speakers provide good results
  - 8 speakers: almost "equal power" distribution