Digital Audio Compression: Why, What, and How



Similar documents
AUDIO CODING: BASICS AND STATE OF THE ART

MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music

Audio Coding Introduction

Audio Coding, Psycho- Accoustic model and MP3

For Articulation Purpose Only

MPEG-1 / MPEG-2 BC Audio. Prof. Dr.-Ing. K. Brandenburg, bdg@idmt.fraunhofer.de Dr.-Ing. G. Schuller, shl@idmt.fraunhofer.de

BDTI Solution Certification TM : Benchmarking H.264 Video Decoder Hardware/Software Solutions

Technical Paper. Dolby Digital Plus Audio Coding

A Comparison of the ATRAC and MPEG-1 Layer 3 Audio Compression Algorithms Christopher Hoult, 18/11/2002 University of Southampton

DOLBY SR-D DIGITAL. by JOHN F ALLEN

Digital Audio and Video Data

4 Digital Video Signal According to ITU-BT.R.601 (CCIR 601) 43

MP3 AND AAC EXPLAINED

The Theory Behind Mp3

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP 60 Multi-Channel Sound Track Down-Mix and Up-Mix Draft Issue 1 April 2012 Page 1 of 6

An Optimised Software Solution for an ARM Powered TM MP3 Decoder. By Barney Wragg and Paul Carpenter

Quality Estimation for Scalable Video Codec. Presented by Ann Ukhanova (DTU Fotonik, Denmark) Kashaf Mazhar (KTH, Sweden)

Introduction to image coding

Introduzione alle Biblioteche Digitali Audio/Video

H.264/MPEG-4 AVC Video Compression Tutorial

Audio Coding Algorithm for One-Segment Broadcasting

DAB + The additional audio codec in DAB

Figure 1: Relation between codec, data containers and compression algorithms.

PRIMER ON PC AUDIO. Introduction to PC-Based Audio

A HIGH PERFORMANCE SOFTWARE IMPLEMENTATION OF MPEG AUDIO ENCODER. Figure 1. Basic structure of an encoder.

encoding compression encryption

5.1 audio. How to get on-air with. Broadcasting in stereo. the Dolby "5.1 Cookbook" for broadcasters. Tony Spath Dolby Laboratories, Inc.

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

Analog Representations of Sound

Digital terrestrial television broadcasting Audio coding

high-quality surround sound at stereo bit-rates

Microsoft Smooth Streaming

MPEG-4. The new standard for multimedia on the Internet, powered by QuickTime. What Is MPEG-4?

IsumaTV. Media Player Setup Manual COOP Cable System. Media Player

EE3414 Multimedia Communication Systems Part I

Image Compression through DCT and Huffman Coding Technique

!"#$"%&' What is Multimedia?

Multichannel stereophonic sound system with and without accompanying picture

UNIVERSITY OF CALICUT

Multimedia Communications

Networked AV Systems Pretest

Classes of multimedia Applications

White Paper: An Overview of the Coherent Acoustics Coding System

Module 8 VIDEO CODING STANDARDS. Version 2 ECE IIT, Kharagpur

COMMON SURROUND SOUND FORMATS

Hooking Up Home Theater

STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION

Chapter 6: Broadcast Systems. Mobile Communications. Unidirectional distribution systems DVB DAB. High-speed Internet. architecture Container

Creating Content for ipod + itunes

Overview ISDB-T for sound broadcasting Terrestrial Digital Radio in Japan. Shunji NAKAHARA. NHK (Japan Broadcasting Corporation)

Signal-400 HDMI COFDM (DVB-T) Encoder & Modulator. --- Home Use

Preservation Handbook

CNR Requirements for DVB-T2 Fixed Reception Based on Field Trial Results

Video Coding Basics. Yao Wang Polytechnic University, Brooklyn, NY11201

RF Measurements Using a Modular Digitizer

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics:

IPTV Primer. August Media Content Team IRT Workgroup

Bandwidth Adaptation for MPEG-4 Video Streaming over the Internet

COMPACT DISK STANDARDS & SPECIFICATIONS

DTS-HD Audio. Consumer White Paper for Blu-ray Disc and HD DVD Applications

Audio and Video Synchronization:

A Review of Algorithms for Perceptual Coding of Digital Audio Signals

Sample Project List. Software Reverse Engineering

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP- 59 Measurement and Management of Loudness in Soundtracks for Television Broadcasting

Video Coding Standards. Yao Wang Polytechnic University, Brooklyn, NY11201

MPEG Layer-3. An introduction to. 1. Introduction

White paper. H.264 video compression standard. New possibilities within video surveillance.

LINE IN, LINE OUT TO TV, VIDEO IN, VIDEO OUT

Frequently asked QUESTIONS. about DOLBY DIGITAL

EV-8000S. Features & Technical Specifications. EV-8000S Major Features & Specifications 1

Datasheet EdgeVision

ITU-T RECOMMENDATION J.122, SECOND-GENERATION TRANSMISSION SYSTEMS FOR INTERACTIVE CABLE TELEVISION SERVICES IP CABLE MODEMS

Home Theater. Diagram 1:

8 Commercial Streaming Systems An Overview

HE-AAC v2. MPEG-4 HE-AAC v2 (also known as aacplus v2 ) is the combination of three technologies:

Digital TV: Overview. Yao Wang Polytechnic University, Brooklyn, NY11201 http: //eeweb.poly.edu/~yao

Video Encoding Best Practices

Workshop Mediaformats for the Eminent mediaplayers

FPGAs for High-Performance DSP Applications

TECHNICAL OPERATING SPECIFICATIONS

Analog-to-Digital Voice Encoding

Tutorial about the VQR (Voice Quality Restoration) technology

FRAUNHOFER INSTITUTE FOR INTEGRATED CIRCUITS IIS AUDIO COMMUNICATION ENGINE RAISING THE BAR IN COMMUNICATION QUALITY

High-Fidelity Multichannel Audio Coding With Karhunen-Loève Transform

IP Video Rendering Basics

reach a younger audience and to attract the next-generation PEG broadcasters.

Multichannel audio: From studio to listener

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19

All About Audio Metadata. The three Ds: dialogue level, dynamic range control, and downmixing

J6200 LED TV SPEC SHEET PRODUCT HIGHLIGHTS. Smart TV. Wi-Fi Built In. Motion Rate 120 SIZE CLASS 60" 65" 55" UN65J6200 UN55J6200 UN60J " 48" 40"

The DART Two Step - First Record, then Restore.

JPEG Image Compression by Using DCT

2K Processor AJ-HDP2000

Convention Paper 5553

Curso de Telefonía IP para el MTC. Sesión 2 Requerimientos principales. Mg. Antonio Ocampo Zúñiga

SNC-VL10P Video Network Camera

Video Conferencing Glossary of Terms

ISO/IEC JTC 1/SC 29. ISO/IEC :/Amd.1:1999(E) Date: ISO/IEC JTC 1/SC 29/WG 11

GPU Compute accelerated HEVC decoder on ARM Mali TM -T600 GPUs

Transcription:

Digital Audio Compression: Why, What, and How An Absurdly Short Course Jeff Bier Berkeley Design Technology, Inc. 2000 BDTI 1 Outline Why Compress? What is Audio Compression? How Does it Work? Conclusions 2 ICSPAT 2000 Page 1

Why: Too Much Data! CD quality Rate: 2 audio channels 44,100 samples/sec 16 bits = 1.4 Mbit/second audio data Audio CD capacity: 0.8 gigabytes audio data Cinema quality ~6 audio channels 48,000 samples/sec 16 bits = 4.6 Mbit/second, ~1 Gbyte/hour Typical compression today: few hundred kbit/sec for even 6 channels 3 Crossing Thresholds Mbytes / chi 1000 100 10 1 0.1 0.01 Memory Capacity - Mbytes/chip Audio CD DVD Sound Track 0.5 0.125 0.032 0.008 2 8 512 128 32 MP3 Album MP3 Song 0.001 1975 1980 1985 1990 1995 2000 2005 2010 Year 4 ICSPAT 2000 Page 2

Commercial Applications Cinema (digital film sound) Consumer devices Mini-disc (Sony) Handheld players Games DVD Distribution Internet distribution Audio over cable (set-top box) Satellite/terrestrial audio broadcast Digital television 5 What: Compression Goals Reduced bandwidth and/or storage Make decoded signal as close as possible to original signal Lowest implementation complexity Reasonable arithmetic requirements Applicable to as many signal types as possible Robust Scalable Extensible 6 ICSPAT 2000 Page 3

Psychoacoustics What does it cover? Relationship between what arrives at the ear and what we hear Why is it important for compression? Don t transmit what the ear can t hear How to figure out what ear can t hear? Range of human hearing Masking 7 Range of Human Hearing Zwicker/Fastl /Fastl p. 17 8 ICSPAT 2000 Page 4

. Digital Audio Compression: Why, What, and How Auditory Masking One signal can make another inaudible Amp Masker Shifted hearing threshold Signal not masked Masked signals Freq. Hearing threshold 9 Sound Pressure Level Auditory Masking (cont d) Temporal Masking Premasking ( Backward ) 10-30 msec Shifted Threshold Masker Duration After Zwicker/Fastl p. 78, Buser/Imbert p. 47 Post-masking ( Forward ) 100-200 msec Time ICSPAT 2000 Page 5 10

Perceptual Compression Calculate Thresholds Window (Snapshot) Spectral Analysis Remove Inaudible Components Quantie Coding Scheme Pack Data into Frames 11 Spectral Analysis/Synthesis What is it? Break a signal into spectrum Recover the signal from its spectrum Forward (analysis) Time domain Reverse/inverse (synthesis) Frequency domain ICSPAT 2000 Page 6 12

Spectral Analysis (cont d) How is it done? 1. Filter bank 2. Transform...... + Time domain Frequency domain Time domain 13 Spectral Analysis (cont d) Does reconstructed output = input? Yes, if: Don t change transform data (no filtering) Design window correctly Window input often enough Overlap windows in analysis/resynthesis properly If yes: Identity System Solid basis for further changes 14 ICSPAT 2000 Page 7

How: Analysis Spectral analysis DCT Wavelet... Davidson et al., 1994 15 How: Analysis Calculate masking thresholds Perceptual model Davidson et al., 1994 ICSPAT 2000 Page 8 16

How: Noise Allocation Remove inaudible components Quantie remaining components Use minimum # bits Adds noise Keep below masking threshold Davidson et al., 1994 17 How: More Tricks Filter Bandlimit input signal LFE bandlimited to <120 H Differential coding of spectral values Coupling Across time Across channels 18 ICSPAT 2000 Page 9

How: Coding Scheme Direct coding Entropy (e.g., Huffman) coding MPEG: noiseless coding PAC: information-theoretic coding Quantiation table Run-length Vector quantiation 19 Artifacts: Pre-echo echo Original Resynthesied (5.3 msec) Quantiation noise spread Noise components 1-2 msec before impulsive signal not masked Fix: Shorter window; wavelets (epac) 20 ICSPAT 2000 Page 10

Comparing Specifications Bit rate ranges: < 8 kbps - 9.6 Mbps Bit widths: 16-24 bits Sample rates: 8-192 kh Number of channels: 1-many doen Spectral bins: 128-1024 Time resolution: 4-12 msec Compression ratios: 6-12:1 typical Audio quality transparent to annoying 21 Internet Cinema MPEG Which Algorithm? MPEG-1 1992 ATRAC (Sony) 1992 PAC (Lucent) 1992 MPEG-2 1994 TwinVQ (NTT) 1995 G2 (RealNetworks) 1998 AC-3 (Dolby) 1995 AAC 1997 Coherent Acoustics (DTS) 1996 WMA (Microsoft) 1999 MPEG-4 1999... MLP (Meridian) 1997 Qdesign 1999 ICSPAT 2000 Page 11 22

MPEG Family Moving Pictures Experts Group Moving pictures + associated audio MPEG-1, MPEG-2 (MP3), MPEG-4 Ongoing standardiation effort (MPEG-7) 23 MPEG-1 1 Audio 1992 Able to work well with CD, DAT One or two channels Single channel Two independent channels Stereo Stereo with joint coding 32, 44.1, 48 kh Specifies bit stream format, decoder structure, but not encoder (!) 24 ICSPAT 2000 Page 12

MPEG-1 1 Audio Layers Layer 1: simplest; Philips DCC Layer 2: more efficient coding; DAB, CD-I Layer 3: higher frequency and time resolution; ISDN, Internet All 3 layers use same header structure Decoder for one layer must also decode lower-numbered layers Higher-numbered layers have more complex decoder 25 MPEG-2 2 Audio 1994 MPEG-2 video for digital TV Higher bit rates than MPEG-1 Backward compatible with MPEG-1 Three layers, like MPEG-1 Add lower sample rates 16, 22.05, 24 kh 5.1 + up to 7 multilingual/commentary channels MP3 = MPEG-1/2 Layer 3 (not MPEG-3 ) 26 ICSPAT 2000 Page 13

MPEG-2 2 Advanced Audio Coding (AAC) 1997 Goals: Indistinguishable at 384 kbit/sec Higher quality, multi-channel Features: Non-backward-compatible ( NBC ) Up to 48 channels (stereo, 5.1 ) Tools (modules) combined into profiles 27 AAC Profiles LC (Low-Complexity) Most commonly used TNS (Temporal Noise Shaping) SSR (Scalable Sampling Rate) Features gain control tool Main LTP (Long-Term Predictor) Delivers the best audio quality of the three profiles 28 ICSPAT 2000 Page 14

Conclusions Entertainment is going digital Audio is a key component Many new market opportunities opening up Internet audio is hot; audio may be the Internet killer app Audio compression is a key technology Many algorithms, many applications Better algorithms better quality, more compression Computation requirements are going up 29 In the Future Photo credit: Dr. Richard O. Duda, SJSU 30 ICSPAT 2000 Page 15

For More Information http://www.bdti.com http://www.eg3.com/dsp Collection of BDTI's papers on DSP processors, tools, apps, benchmarking Links to other good DSP sites comp.dsp Microprocessor Report DSP Processor Fundamentals, BDTI Usenet group For info on newer DSPs Textbook on DSP processors Or, join BDTI...We're Hiring! (See www.bdti.com) 31 ICSPAT 2000 Page 16