The Fraunhofer Gesellschaft - FhG



Similar documents
Improved MPEG Low-Delay Audio Coding on DaVinci and TI C64 series DSPs. Negjmedin Fazlija Fraunhofer IIS

FRAUNHOFER INSTITUTE FOR INTEGRATED CIRCUITS IIS AUDIO COMMUNICATION ENGINE RAISING THE BAR IN COMMUNICATION QUALITY

high-quality surround sound at stereo bit-rates

Fraunhofer Institute for Integrated Circuits IIS. Director Prof. Dr.-Ing. Albert Heuberger Am Wolfsmantel Erlangen

AUDIO CODING: BASICS AND STATE OF THE ART

Audio Coding Introduction

The AAC audio Coding Family For

DAB + The additional audio codec in DAB

MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music

Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification

Audio Coding Algorithm for One-Segment Broadcasting

APPLICATION BULLETIN AAC Transport Formats

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

FAST FACTS. Fraunhofer Institute for Integrated Circuits IIS

Digital Audio Compression: Why, What, and How

Digital terrestrial television broadcasting Audio coding

MPEG-1 / MPEG-2 BC Audio. Prof. Dr.-Ing. K. Brandenburg, bdg@idmt.fraunhofer.de Dr.-Ing. G. Schuller, shl@idmt.fraunhofer.de

For Articulation Purpose Only

MP3 AND AAC EXPLAINED

Technical Paper. Dolby Digital Plus Audio Coding

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

FRAUNHOFER UMSICHT. Kick-Off-Meeting ECLIPSE, May 10 th -11 th 2012, San Sebastian. Fraunhofer UMSICHT

Creating Content for ipod + itunes

HE-AAC v2. MPEG-4 HE-AAC v2 (also known as aacplus v2 ) is the combination of three technologies:

Network administrators must be aware that delay exists, and then design their network to bring end-to-end delay within acceptable limits.

MPEG Layer-3. An introduction to. 1. Introduction

Convention Paper 5553

Company and Market Overview

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

Sample Project List. Software Reverse Engineering

MPEG-4 Enhanced Low Delay AAC - a new standard for high quality communication

Software Engineering in Kaiserslautern,, Germany

Audio Coding, Psycho- Accoustic model and MP3

MPEG-H Audio System for Broadcasting

An Optimised Software Solution for an ARM Powered TM MP3 Decoder. By Barney Wragg and Paul Carpenter

Polycom Video Communications

HD VoIP Sounds Better. Brief Introduction. March 2009

Polycom Competitive Q1 08

Using Mobile Processors for Cost Effective Live Video Streaming to the Internet

Introduction to Packet Voice Technologies and VoIP

Adaptive Playout for VoIP based on the Enhanced Low Delay AAC Audio Codec

Tutorial about the VQR (Voice Quality Restoration) technology

FREE TV AUSTRALIA OPERATIONAL PRACTICE OP 60 Multi-Channel Sound Track Down-Mix and Up-Mix Draft Issue 1 April 2012 Page 1 of 6

Curso de Telefonía IP para el MTC. Sesión 2 Requerimientos principales. Mg. Antonio Ocampo Zúñiga

For version p (September 4, 2012)

ACCESS Rack & Portable. Small, compact and powerful IP audio codec...

FRAUNHOFER UMSICHT TECHNOLOGY THAT PAYS

Preservation Handbook

UNIVERSITY OF CALICUT

MPEG-4. The new standard for multimedia on the Internet, powered by QuickTime. What Is MPEG-4?

A Comparison of the ATRAC and MPEG-1 Layer 3 Audio Compression Algorithms Christopher Hoult, 18/11/2002 University of Southampton

ARIB STD-T64-C.S0042 v1.0 Circuit-Switched Video Conferencing Services

USER MANUAL DUET EXECUTIVE USB DESKTOP SPEAKERPHONE

ADVANTAGES OF AV OVER IP. EMCORE Corporation

Exploit Technologies Pte Ltd A*STAR Technology Transfer Office Tai Hou TNG (linkedin) Technology links

BDTI Solution Certification TM : Benchmarking H.264 Video Decoder Hardware/Software Solutions

Voice Over IP Per Call Bandwidth Consumption

STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION

Experiencing MultiChannel Sound in

Understanding Latency in IP Telephony

Conference interpreting with information and communication technologies experiences from the European Commission DG Interpretation

Networked AV Systems Pretest

How To Test Video Quality With Real Time Monitor

CSE 237A Final Project Final Report

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19

VIDEO AND THE WUSTL JANUARY 18, 2016

Finance at Fraunhofer- The Bridge between Academia and Industry

Mobile Audio from MP3 to AAC and further

Analog-to-Digital Voice Encoding

Assessing the quality of VoIP transmission affected by playout buffer scheme and encoding scheme

Application Note. Onsight Mobile Collaboration Video Endpoint Interoperability v5.0

real time in real space TM connectivity... SOLUTIONS GUIDE

Easy H.264 video streaming with Freescale's i.mx27 and Linux

CHANGE REQUEST. Work item code: MMS6-Codec Date: 15/03/2005

Improved user experiences are possible with enhanced FM radio data system (RDS) reception

How To Recognize Voice Over Ip On Pc Or Mac Or Ip On A Pc Or Ip (Ip) On A Microsoft Computer Or Ip Computer On A Mac Or Mac (Ip Or Ip) On An Ip Computer Or Mac Computer On An Mp3

Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids

Glossary of Terms and Acronyms for Videoconferencing

Introduction, Rate and Latency

GPU Compute accelerated HEVC decoder on ARM Mali TM -T600 GPUs

HISO Videoconferencing Interoperability Standard

The World`s First Unified Media Server

Overview ISDB-T for sound broadcasting Terrestrial Digital Radio in Japan. Shunji NAKAHARA. NHK (Japan Broadcasting Corporation)

BRINGING VOIP TO THE CONFERENCE ROOM: HOW IT MANAGERS CAN ENHANCE THE USER EXPERIENCE

Clearing the Way for VoIP

TEST RESULTS FOR CLEARONE COLLABORATE ROOM PRO 600. Model: Collaborate Room Pro 600

Voice and Fax/Modem transmission in VoIP networks

The Theory Behind Mp3

CHAPTER FIVE RESULT ANALYSIS

EV-8000S. Features & Technical Specifications. EV-8000S Major Features & Specifications 1

Starlink 9003T1 T1/E1 Dig i tal Trans mis sion Sys tem

QuickTime Streaming. End-to-end solutions for live broadcasting and on-demand streaming of digital media. Features

Handling Multimedia Under Desktop Virtualization for Knowledge Workers

Study and Implementation of Video Compression standards (H.264/AVC, Dirac)

Transcription:

Implementing Low-Delay Communication Audio Coding on ARM Processors Marc Gayer (marc.gayer@iis.fraunhofer.de) Fraunhofer Institute for Integrated Circuits IIS

Overview The Fraunhofer Institute for Integrated Circuits IIS Low-Delay Audio Coding Basics MPEG-4 AAC Low-Delay Enhanced AAC Low-Delay with SBR Implementations on ARM-powered platforms Conclusions 2

The Fraunhofer Gesellschaft - FhG Bremen Itzehoe Hannover Berlin Braunschweig Golm Oberhausen Magdeburg Dortmund Duisburg Schmallenberg Aachen St. Augustin LeipzigDresden Euskirchen Jena Ilmenau Chemnitz DarmstadtWürzburg Kaiserslautern Erlangen St. Ingbert Saarbrücken Pfinztal Karlsruhe Stuttgart Freising Largest private research organization in Europe Non-profit organization, founded 1949 58 Institutes in 40 locations in Germany Offices in Europe, USA (San Jose) and Asia Permanent staff 12 700, mainly scientists and engineers Freiburg 3

Fraunhofer IIS Institute for Integrated Circuits Home of Founded in 1985, staff ~ 450 in 4 locations Audio, Video, Multimedia at IIS and IDMT More than 100 engineers in Erlangen and Ilmenau doing research on audio and video technologies MPEG Audio Encoders and Decoders MPEG-4 Advanced Audio Coding (AAC-LC, HE-AAC v2, Low-Delay AAC) MPEG Layer-3 (MP3) MPEG Surround HD-AAC (MPEG-4 SLS, Scalable Lossless) MPEG-4 Video Codecs, AV Streaming Technologies Virtual Acoustics, Metadata, Music Recognition Digital Rights Management Spin-offs: Opticom, Coding Technologies, MusicTrace 4

Low-Delay Audio Coding Why? For two-way interactive communication Long delay feels un-natural Example: Audio Conferencing Audio must be synchronized with video Example: Video Conferencing Where speakers can hear the coded signal Speakers find speech difficult when they hear themselves with > 25 ms delay Example: Phone call without an echo canceller Where acoustic delay is important 1 ms ~= 1 foot of sound propagation Example: Wireless speakers and microphones 5

Why a Special Low-Delay Codec? Traditional speech codecs do not work well with music or background noise. General Audio Codecs or Music Codecs are not typically used in interactive situations, and delay is unimportant. usually long latency Codec MP3 AAC-LC HE-AAC HE-AAC v2 Typical Delay (HW or DSP) 140 ms 210 ms 360 ms Typical Application Music Player Music Player, Broadcasting (ipod, itunes, ISDB) Mobile Music, Satellite and Digital Radio, IPTV (XM Radio, 3GPP, Digital Radio Mondial, DAB+) All delay figures for implementations with 100% encoder workload and continuous bitstream transmission 6

Goals of Next-Gen Conferencing Systems We need the fidelity of a music codec... and the latency of a speech codec. The vision of new systems is to move from just delivering intelligible signals to high-fidelity, entertainment-grade performance Intelligibility Natural sound quality No annoying coding artifacts or noise Full audio bandwidth Robust with tandem coding (cascaded codecs, MCUs) Speaker Separation Be able to hear several people talking at once Multi-channel capability Ambience Sounds come from the room, not a speaker 7

Fraunhofer s Low Delay Codecs Realtime communication MPEG-4 AAC Low Delay, AAC-LD Bidirectional communication, VoIP Delay: 20 ms algorithmic, 31 ms implementation AAC-ELD: Enhanced Low-Delay AAC Based on AAC-LD and Spectral Band Replication Delay: 15-32 ms algorithmic Scheduled as international standard: 1/2008 Wireless Audio ULD: Ultra Low Delay Audio Codec Wireless audio: microphones, loudspeakers, hearing-aids, VoIP, music jam sessions Delay: 6 ms algorithmic, 10 ms implementation 8

The AAC Codec Family Part of ISO/IEC MPEG-2 and MPEG-4 Standards: MPEG-2 AAC: (Japanese ISDB) AAC-LC: Music Coding (ipod, itunes) HE-AAC: Low-bitrate music (XM Radio) HE-AAC v2: Lower-bitrate music (3GPP, digital radio and TV services) AAC-LD: Low-bitrate communication New AAC-ELD: Lower-bitrate communication 9

AAC-LD Basics Configuration or Side Information Input Time Signal MDCT TNS Intensity/ Coupling PNS Mid/ Side Scale Factors Quantization Huffman Coding Spectral Processing Perceptual Model Bitrate / Distortion Controller Quantization and Coding Bitstream Payload Formatter, Bit Reservoir Not Shown Output Bitstream AAC-LD is based on AAC Low-Complexity with several modifications to decrease end-to-end delay 10

AAC-LD Delay Optimizations Lower number of subbands Reduced filter bank delay, only 2 x 480 samples No block switching (compensated by TNS) No look ahead delay Reduced bit reservoir size For example < 100 bits/channel Result: 20 ms algorithmic delay (48 khz, frame length 480) Real-time implementation ~31 ms delay Audio quality comparable to MP3 at same bitrate Up to fs/2 audio bandwidth Tuned for best performance at 16 khz bandwidth for 64 kbps/channel Large range of usable bitrates 32-80 kbps/channel 11

Systems using AAC-LD and/or AAC-LC Tandberg MXP Tandberg MPS 200/800 Sony PCS-TL50P Vcon HD4000/HD5000 Lifesize Telos Zephyr Xstream Musicam Netstar Mayah CENTAURI Source Elements Codian MCU 4200 Comrex Access Cisco Telepresence Apple ichat Several others in pipeline Tandberg MXP 2006: Comrex Access 2006: Codian MCU 2007: Apple ichat 2006: Cisco Telepresence 12

Enhanced Low-Delay AAC (AAC ELD) ISO/MPEG Status (5/2007): FPDAM Scheduled International Standard 1/2008 Modifications to AAC-LD: Delay optimized AAC core Optional Low delay Spectral Bandwidth Replication Key Features: Bandwidth up to 16 khz (and more) Algorithmic delay: 15 ms w/o SBR 32 ms with SBR Typcial bitrate: 24 to 48 kbps/channel with SBR Frame length 20 ms with SBR 13

AAC-LD and AAC-ELD: Delay Analysis Codec AAC-LD AAC-ELD Delay Sources MDCT+IMDCT LD-Filterbank Delay at 48kHz 20 ms 15 ms Low-delay window reduces delay from 960 samples (MDCT) to 720 samples (for a frame size of 480 samples) Low delay filterbank does not increase computational complexity Perfect reconstruction and similar frequency response 14

Spectral Band Replication in AAC-ELD Spectral Band Replication (SBR) Energy Energy Transposition Frequency Envelope Adjustment Frequency Core coder (AAC-LC or AAC-LD) is band limited Transposition of low band to high band Shaping and envelope adjustment of high band with additional SBR side info Add noise or additional sinusoidal frequency components that were detected in the SBR encoder 15

AAC-ELD with LD-SBR: Delay Analysis Codec Delay Sources Delay at 48kHz AAC-LD AAC-ELD AAC-ELD + LD-SBR MDCT+IMDCT LD-Filterbank LD-Filterbank QMF -> CLDFB (new after 80th MPEG meeting) SUM 20 ms 15 ms 30 ms 12 ms 1.3 ms 31.3 ms 16

AAC-ELD: Audio Quality Listening test results: MPEG2007/M14723, July 2007 MPEG critical test items: speech, music, single instruments 10 expert listeners MUSHRA test with 12 test items Bitrate: 32 kbps Comparison codecs: AAC-ELD with SBR, 48 khz MPEG-4 AAC-LD, 24 khz ITU-T G.722.1-C, 32 khz AMR-WB (G.722.2, @ 24kbps), 16 khz 17

AAC-LD/ELD Listening Test Result (32 kbps) 18

Implementing AAC-LD and AAC-ELD on ARM-powered Processors 19

Fraunhofer Core Design Kit (CDK) Development Flow Work done by Memory and runtime optimization Fixed-point knowhow Fraunhofer Licensee or Fraunhofer Assembler optimization ARM Implementation #1 Floating-point Reference not bit-exact Fixed-point Template Code bit-exact ARM Implementation #2 Optimized transcendent functions Quality & stability tests Platform specific cache and memory managment ARM Implementation #n 20

ARM Specific Software Features #if defined( CC_ARM) && defined( TARGET_ARCH_5TE) asm { smulwb result, b, a } #if defined(_win32_wce) && defined(_arm_) #if _MSC_VER >= 0x4b1 #if (_M_ARM==5) #include <armintrin.h> #include <cmnintrin.h> Depending on ARM instruction set and compiler tools use: inline assembler: SMULL, SMULWB, SMLAL, SMLAWB, CLZ assembler intrinsics: _MulHigh, _SmulWLo_SW_SL, _SmulAddWLo_SW_SL, _CountLeadingZeros for a variety of 16/32-bit MPY/MAC and other instructions Support for: ARM RealView/ADS ARM gcc for embedded Linux MS Visual C++ or embedded Visual C++ 21

ARM-powered Test Platforms Logic/Freescale i.mx31 LITEKIT board Freescale i.mx31 Application Development System Both based on ARM 1136JS -powered Freescale i.mx31 at up to 532 MHz Intel IXP420 XScale based on ARMv5TE NSLU2 (Linksys) with embedded Linux for ARM 266 MHz CPU speed, 8 MB flash, 32 MB SDRAM Ethernet, 2 x USB, Serial port Various ARM/Xscale-powered PDAs/mobiles 22

Processing Power Requirements on ARM9E Pure C code with ARM inline assembler/intrinsics, 1-channel mono, 48 khz, ARM RVDS simulator Encoder Decoder ARM946E-S ARM920T ARM946E-S ARM920T MP3 (for comparison) 42 MHz 53 MHz 16 MHz 20 MHz MPEG-4 AAC-LC (TNS + PNS) 40 MHz 52 MHz 11 MHz 13 MHz MPEG-4 AAC Low Delay 55 MHz 72 MHz 17 MHz 21 MHz MPEG-4 HE-AAC (SBR+TNS) 70 MHz 81 MHz 24 MHz 32 MHz AAC-ELD Enhanced AAC Low-Delay 65 MHz 73 MHz 26 MHz 35 MHz 23

Processing Power Requirements ARM946E-S Why an increase in processing power consumption of >35% from AAC-LC to AAC-LD encoding when we don t have short blocks in AAC-LD? 70 MHz 60 50 37.5% higher workload 40 30 Decoder Encoder 20 10 0 Encoder Decoder mp3 AAC-LC AAC-LD HE-AAC AAC-ELD 24

AAC-LC vs. AAC-LD processing power complexity AAC-LC frame length usually 1024 samples AAC-LD frame length 480 or 512 samples AAC-LD routines called twice as often during one time unit: function call overhead Psychoacoustic calculations in encoder based on scalefactor bands (sfbs) AAC-LC @ 48 khz: 49 sfbs (frame length 1024) AAC-LD @ 48 khz: 36 sfbs (frame length 512) 47% more scalefactor band-wise calculations per time unit But: AAC-LD does not use processing power consuming short blocks 25

Conclusions MPEG-4 AAC Low-Delay used by a large number of high-quality audio/video conferencing systems AAC-ELD with optional SBR extension currently under standardization in ISO/MPEG End-to-end delay reduced to only 15 ms w/o SBR Even higher coding efficiency with still very low delay using SBR extension Optimized implementations for all ARMpowered processor platforms available at Fraunhofer IIS 26

Please ask now Come to our booth # 605 www.iis.fraunhofer.de/amm mailto: marc.gayer@iis.fraunhofer.de Fraunhofer office in San Jose 27