Lecture 11 MP3 and MP4 Audio (Part 7)


CS 414 Multimedia Systems Design. Lecture 11: MP3 and MP4 Audio (Part 7). Klara Nahrstedt, Spring 2012

Administrative: MP1 deadline is February 18

Outline
- MP3 audio encoding
- MP4 audio
- Reading: Media Coding book, Sections 7.7.2 to 7.7.5
- Recommended paper on MP3: Davis Pan, "A Tutorial on MPEG/Audio Compression", IEEE Multimedia, pp. 60-74, 1995
- Recommended books on JPEG/MPEG audio/video fundamentals: Haskell, Puri, Netravali, "Digital Video: An Introduction to MPEG-2", Chapman and Hall, 1996

Why Compression is Needed
- Data rate = sampling rate * quantization bits * channels (+ control information)
- For example (digital audio): 44100 Hz, 16 bits, and 2 channels generate about 1.4 Mbit of data per second, 84 Mbit per minute, and 5 Gbit per hour
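The data-rate formula can be checked with a few lines of Python. This is a minimal sketch: the function name `pcm_bit_rate` is made up for illustration, and control information is ignored.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second (control information ignored)."""
    return sample_rate_hz * bits_per_sample * channels


rate = pcm_bit_rate(44_100, 16, 2)                # CD-quality stereo
print(f"{rate / 1e6:.2f} Mbit per second")        # 1.41 Mbit per second
print(f"{rate * 60 / 1e6:.1f} Mbit per minute")   # 84.7 Mbit per minute
print(f"{rate * 3600 / 1e9:.2f} Gbit per hour")   # 5.08 Gbit per hour
```

Dividing each figure by 8 gives the size in bytes, roughly 176 kB per second.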

MPEG-1 Audio
- Lossy compression of audio
- In the late 1980s, ISO's MPEG group started to standardize audio for TV broadcasting and for use on CD-ROM (later DVD)
- MPEG-1 Audio: 1992
- MPEG-2 Audio: 1994
- MPEG-1 Audio defines Layers I, II, and III

Criteria for a Good Standard
- Achieve desired outcome
- Be comprehensible
- Allow efficient implementation
- Support competition
- Give benchmark tests
- Be supported by industry
- Be good for end users
- Two models: implement first, then standardize; or standardize first, then implement

MPEG-1 Audio Layer II
- Called MP2
- Dominant standard for audio broadcasting: DAB digital radio and DVB digital television
- Came out of the MUSICAM codecs, with bit rates of 64-192 kbps
- MUSICAM audio coding is the basis for MPEG-1 and MPEG-2 audio
- Sampling rates: 32, 44.1, 48 kHz
- Bit rates: 32, 48, 56, 64, 80, 96, ... up to 384 kbps
- Formats: mono, stereo, dual channel
- MP2 is a sub-band audio encoder operating in the time domain

MPEG-1 Audio Layer III
- MPEG-1 Layer III is called the MP3 format
- Popular for PC and Internet applications
- The goal is to compress to 128 kbps, but audio can be encoded at higher or lower bit rates, with correspondingly higher or lower quality
- Uses psychoacoustics, the scientific study of sound perception
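To put the 128 kbps goal in perspective, the sketch below compares it with uncompressed CD-quality stereo (44100 Hz, 16 bits, 2 channels). The helper name `compression_ratio` is hypothetical, and the comparison assumes that 128 kbps refers to the whole stereo stream.

```python
def compression_ratio(source_bps, target_bps):
    """How many times smaller the encoded stream is than the source."""
    return source_bps / target_bps


cd_bps = 44_100 * 16 * 2                    # uncompressed stereo PCM, bits per second
ratio = compression_ratio(cd_bps, 128_000)  # MP3 at 128 kbps
print(f"about {ratio:.1f}:1")               # about 11.0:1
```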

MPEG Audio MP3
- The first psychoacoustic masking codec was proposed in 1979 at AT&T Bell Labs, Murray Hill
- MP3 is based on OCF (optimum coding in the frequency domain) and PXFM (perceptual transform coding)
- MPEG-1 Audio Layer III public release: 1993
- MPEG-2 Audio Layer III public release: 1995

MPEG Audio MP3
- 1997: mp3.com offers thousands of MP3s created by independent artists for free
- 1999: Napster, MP3 peer-to-peer file sharing
- Problem: copyright infringement
- Authorized services: Amazon.com, Rhapsody, Juno Records, ...

MPEG-1 Audio Encoding Characteristics
- Precision: 16 bits
- Sampling frequency: 32 kHz, 44.1 kHz, 48 kHz
- 3 compression layers: Layer 1, Layer 2, Layer 3 (MP3)
- Layer 3: 32-320 kbps, target 64 kbps (per channel)
- Layer 2: 32-384 kbps, target 128 kbps (per channel)
- Layer 1: 32-448 kbps, target 192 kbps (per channel)

MPEG Audio Encoding Steps

MPEG Audio Filter Bank
- The filter bank divides the input into multiple sub-bands (32 equal-width frequency sub-bands)
- The output of sub-band i is defined as

$$S_t[i] = \sum_{k=0}^{63} \sum_{j=0}^{7} \cos\!\left(\frac{(2i+1)(k-16)\pi}{64}\right) \bigl(C[k+64j] \cdot x[k+64j]\bigr), \qquad i \in [0, 31]$$

- Here S_t[i] is the filter output sample for sub-band i at time t, C[n] is one of 512 coefficients, and x[n] is an audio input sample from a 512-sample buffer
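The filter bank sum can be sketched in plain Python as follows. It assumes the form given in Pan's tutorial: a cosine term cos((2i+1)(k-16)π/64) applied to the windowed buffer C[n]·x[n], summed over k = 0..63 and j = 0..7. The function name `subband_samples` is hypothetical, and the window `C` used in the example is only a placeholder (a Hann shape), not the 512-entry coefficient table from the standard.

```python
import math


def subband_samples(x, C):
    """Compute the 32 sub-band output samples from a 512-sample buffer x
    and 512 window coefficients C."""
    if len(x) != 512 or len(C) != 512:
        raise ValueError("x and C must each hold 512 values")
    # Fold the windowed buffer: y[k] = sum over j of C[k + 64j] * x[k + 64j]
    y = [sum(C[k + 64 * j] * x[k + 64 * j] for j in range(8)) for k in range(64)]
    # Apply the cosine matrix: S[i] = sum over k of cos((2i+1)(k-16)pi/64) * y[k]
    return [
        sum(math.cos((2 * i + 1) * (k - 16) * math.pi / 64) * y[k] for k in range(64))
        for i in range(32)
    ]


# Placeholder inputs (assumptions for illustration only)
C = [0.5 - 0.5 * math.cos(2 * math.pi * n / 511) for n in range(512)]
x = [math.sin(0.1 * n) for n in range(512)]
S = subband_samples(x, C)
print(len(S))  # 32
```

Folding the buffer into 64 partial sums first means the cosine term is evaluated 32 × 64 times instead of 32 × 512 times, which is why the double sum is usually computed in two stages.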

MPEG Audio Psychoacoustic Model
- MPEG audio compresses by removing acoustically irrelevant parts of audio signals
- It takes advantage of the human auditory system's inability to hear quantization noise under auditory masking
- Auditory masking occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible

Loudness and Pitch (Review of Psychoacoustic Effects)
- Humans are more sensitive to loudness at mid frequencies than at other frequencies
- Intermediate frequencies: [500 Hz, 5000 Hz]
- Human hearing frequencies: [20 Hz, 20000 Hz]
- The perceived loudness of a sound changes with the frequency of that sound
- The basilar membrane reacts more to intermediate frequencies than to other frequencies

Fletcher-Munson Contours
- Each contour represents equal perceived loudness
- Perception sensitivity (loudness) is not linear across all frequencies and intensities

Masking Effects (Review of Psychoacoustic Effects)
- Frequency masking
- Temporal masking

MPEG/audio divides the audio signal into frequency sub-bands that approximate critical bands, then quantizes each sub-band according to the audibility of quantization noise within that band.

MPEG Audio Bit Allocation
- This process determines the number of code bits allocated to each sub-band, based on information from the psychoacoustic model
- Algorithm:
  1. Compute the mask-to-noise ratio, MNR = SNR - SMR (all in dB), where SNR is the signal-to-noise ratio and SMR is the signal-to-mask ratio. The standard provides tables that give estimates of the SNR resulting from quantizing to a given number of quantizer levels
  2. Get the MNR for each sub-band
  3. Search for the sub-band with the lowest MNR
  4. Allocate code bits to this sub-band. When a sub-band gets allocated more code bits, look up the new estimate of its SNR and repeat from step 1 until no more code bits can be allocated
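The bit allocation loop can be sketched as a greedy procedure. This is a simplified illustration, not the standard's procedure. The function name `allocate_bits` is hypothetical, the bit budget is counted in one-bit increments, and the SNR lookup is replaced by a rough 6.02 dB per bit estimate instead of the standard's tables.

```python
def allocate_bits(smr_db, budget, max_bits=15, snr_per_bit_db=6.02):
    """Greedy allocation: repeatedly give one more bit to the sub-band
    with the lowest mask-to-noise ratio (MNR = SNR - SMR)."""
    bits = [0] * len(smr_db)
    for _ in range(budget):
        # Sub-bands that can still receive bits
        open_bands = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not open_bands:
            break
        # Stand-in SNR estimate (assumption): about 6.02 dB per allocated bit
        mnr = {i: snr_per_bit_db * bits[i] - smr_db[i] for i in open_bands}
        worst = min(mnr, key=mnr.get)   # sub-band with the lowest MNR
        bits[worst] += 1                # its SNR, and so its MNR, changes next round
    return bits


# Three sub-bands with made-up signal-to-mask ratios (dB) and 4 bits to spend
print(allocate_bits([20.0, 5.0, -10.0], budget=4))  # [3, 1, 0]
```

The sub-band with SMR = -10 dB receives no bits: its signal already lies below the masking threshold, so quantization noise there is inaudible.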

Audio Quality
- Bit rate: with too low a bit rate, we get compression artifacts
  - Ringing
  - Pre-echo: an artifact of a sound is heard before the sound itself occurs. It is most noticeable in impulsive sounds from percussion instruments such as cymbals, and it occurs in transform-based audio compression algorithms
- Quality of the encoder and encoding parameters
  - Constant bit rate encoding
  - Variable bit rate encoding

MP3 Audio Format Source: http://wiki.hydrogenaudio.org/images/e/ee/mp3filestructure.jpg

MPEG Audio Comments
- A precision of 16 bits per sample is needed to get a good SNR
- The noise we get is quantization noise from the digitization process
- For each added bit, we get about 6 dB better SNR
- The masking effect means that we can raise the noise floor around a strong sound, because the noise will be masked
- Raising the noise floor is the same as using fewer bits, and using fewer bits is the same as compression
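The 6 dB per bit figure follows from the fact that each extra bit doubles the number of quantizer levels, and 20·log10(2) ≈ 6.02 dB. The sketch below uses that rule of thumb. The function name `quantization_snr_db` is hypothetical, and the estimate ignores the small constant offset (about 1.76 dB for a full-scale sine wave) that appears in the more precise formula.

```python
import math


def quantization_snr_db(bits):
    """Rule-of-thumb SNR of a uniform quantizer: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


gain_per_bit = quantization_snr_db(9) - quantization_snr_db(8)
print(f"{gain_per_bit:.2f} dB per added bit")           # 6.02 dB per added bit
print(f"{quantization_snr_db(16):.1f} dB at 16 bits")   # 96.3 dB at 16 bits
```

Read in reverse, allowing the noise floor in a masked band to rise by about 12 dB means that band can be coded with two fewer bits per sample.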

Successor of MP3
- Advanced Audio Coding (AAC), now part of MPEG-4 Audio
- Supports up to 48 full-bandwidth audio channels
- Default audio format for iPhone, iPad, Nintendo, PlayStation, Nokia, Android, BlackBerry
- Introduced in 1997 as MPEG-2 Part 7
- Updated in 1999 and included in MPEG-4

AAC's Improvements over MP3
- More sample frequencies (8-96 kHz)
- Arbitrary bit rates and variable frame length
- Higher efficiency and a simpler filter bank
- Uses a pure MDCT (modified discrete cosine transform), a transform also used in Windows Media Audio

MPEG-4 Audio
- Variety of applications:
  - General audio signals
  - Speech signals
  - Synthetic audio
  - Synthesized speech (structured audio)

MPEG-4 Part 3 (Audio)
- Includes a variety of audio coding technologies:
  - Lossy speech coding (e.g., CELP, code-excited linear prediction)
  - General audio coding (AAC)
  - Lossless audio coding
  - Text-to-speech interface
  - Structured audio (e.g., MIDI)

MPEG-4 Part 14
- Called MP4, with the extension .mp4
- Multimedia container format
- Stores digital video and audio streams and allows streaming over the Internet
- A container (or wrapper) format is a meta-file format whose specification describes how different data elements and metadata coexist in a computer file

MPEG-4 Audio
- Bit rate: 2-64 kbps
- Scalable for variable rates
- MPEG-4 defines a set of coders:
  - Parametric coding techniques: low bit rates of 2-6 kbps, 8 kHz sampling frequency
  - Code-excited linear prediction: medium bit rates of 6-24 kbps, 8 and 16 kHz sampling rates
  - Time-frequency techniques: high-quality audio at 16 kbps and higher bit rates, sampling rate > 7 kHz

Conclusion
- MPEG Audio is an integral part of the MPEG standard and should be considered together with video
- MPEG-4 Audio represents a major extension of MPEG-1 Audio in terms of capabilities