Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles



Similar documents
Analog-to-Digital Voice Encoding

VoIP Technologies Lecturer : Dr. Ala Khalifeh Lecture 4 : Voice codecs (Cont.)

Analog Representations of Sound

Simple Voice over IP (VoIP) Implementation

Audio Coding, Psycho- Accoustic model and MP3

HISO Videoconferencing Interoperability Standard

Computer Networks and Internets, 5e Chapter 6 Information Sources and Signals. Introduction

Polycom Video Communications

Voice over IP Protocols And Compression Algorithms

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

Basic principles of Voice over IP

VoIP to 20 khz: Codec Choices for High Definition Voice Telephony

How To Understand The Technical Specifications Of Videoconferencing

Video Conferencing Standards

FRAUNHOFER INSTITUTE FOR INTEGRATED CIRCUITS IIS AUDIO COMMUNICATION ENGINE RAISING THE BAR IN COMMUNICATION QUALITY

ARIB STD-T64-C.S0042 v1.0 Circuit-Switched Video Conferencing Services

Digital Transmission of Analog Data: PCM and Delta Modulation

Starlink 9003T1 T1/E1 Dig i tal Trans mis sion Sys tem

ETSI TS V1.1.1 ( )

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Radio over Internet Protocol (RoIP)

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics:

MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music

How To Understand How Bandwidth Is Used In A Network With A Realtime Connection

AUDIO CODING: BASICS AND STATE OF THE ART

VoIP Bandwidth Calculation

Tutorial about the VQR (Voice Quality Restoration) technology

Video Conferencing Glossary of Terms

How To Recognize Voice Over Ip On Pc Or Mac Or Ip On A Pc Or Ip (Ip) On A Microsoft Computer Or Ip Computer On A Mac Or Mac (Ip Or Ip) On An Ip Computer Or Mac Computer On An Mp3

IETF 84 RTCWEB. Mandatory To Implement Audio Codec Selection

Voice Encryption over GSM:

Receiving the IP packets Decoding of the packets Digital-to-analog conversion which reproduces the original voice stream

EE3414 Multimedia Communication Systems Part I

GSM speech coding. Wolfgang Leister Forelesning INF 5080 Vårsemester Norsk Regnesentral

Voice Over IP Per Call Bandwidth Consumption

Digital Speech Coding

VoIP in Mika Nupponen. S Postgraduate Course in Radio Communications 06/04/2004 1

PRIMER ON PC AUDIO. Introduction to PC-Based Audio

Choosing the Right Audio Codecs for VoIP over cdma2000 Networks:

Network administrators must be aware that delay exists, and then design their network to bring end-to-end delay within acceptable limits.

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

How To Understand The Differences Between A Fax And A Fax On A G3 Network

) SERIES V: DATA COMMUNICATION OVER THE TELEPHONE NETWORK Interfaces and voiceband modems. ITU-T Recommendation V.25

PERFORMANCE ANALYSIS OF VOIP TRAFFIC OVER INTEGRATING WIRELESS LAN AND WAN USING DIFFERENT CODECS

Voice Over IP. Priscilla Oppenheimer

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Introduction to Packet Voice Technologies and VoIP

Active Monitoring of Voice over IP Services with Malden

Speech Signal Processing: An Overview

Conference interpreting with information and communication technologies experiences from the European Commission DG Interpretation

Calculating Bandwidth Requirements

Multimedia Conferencing Standards

For Articulation Purpose Only

Network Traffic #5. Traffic Characterization

Audio Coding Introduction

Audio Coding Algorithm for One-Segment Broadcasting

MPEG-1 / MPEG-2 BC Audio. Prof. Dr.-Ing. K. Brandenburg, bdg@idmt.fraunhofer.de Dr.-Ing. G. Schuller, shl@idmt.fraunhofer.de

Curso de Telefonía IP para el MTC. Sesión 2 Requerimientos principales. Mg. Antonio Ocampo Zúñiga

Preservation Handbook

Video Conferencing Protocols

SIP Trunking and Voice over IP

White Paper. D-Link International Tel: (65) , Fax: (65) Web:

Challenges and Solutions in VoIP

UNIVERSITY OF CALICUT

Video over IP WHITE PAPER. Executive Summary

Multichannel Voice over Internet Protocol Applications on the CARMEL DSP

Performance of Various Codecs Related to Jitter Buffer Variation in VoIP Using SIP

EUROPEAN COMPUTER DRIVING LICENCE. Multimedia Audio Editing. Syllabus

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

Application Notes. Contents. Overview. Introduction. Echo in Voice over IP Systems VoIP Performance Management

Creating Content for ipod + itunes

Appendix C GSM System and Modulation Description

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Interference to Hearing Aids by Digital Mobile Telephones Operating in the 1800 MHz Band.

Nokia Networks. Voice over LTE (VoLTE) Optimization

Lecture 2 Outline. EE 179, Lecture 2, Handout #3. Information representation. Communication system block diagrams. Analog versus digital systems

Evaluating Data Networks for Voice Readiness

Differences between Traditional Telephony and VoIP

TCOM 370 NOTES 99-6 VOICE DIGITIZATION AND VOICE/DATA INTEGRATION

INTRODUCTION TO VOICE OVER IP

Evolution from Voiceband to Broadband Internet Access

Indepth Voice over IP and SIP Networking Course

Optimizing Converged Cisco Networks (ONT)

HIGH QUALITY AUDIO RECORDING IN NOKIA LUMIA SMARTPHONES. 1 Nokia 2013 High quality audio recording in Nokia Lumia smartphones

An Introduction to VoIP Protocols

Chapter 3 ATM and Multimedia Traffic

The AAC audio Coding Family For

System Requirements. Version 8.6

Simulation Platform for the Planning and Design of Networks Carrying VoIP Traffic

Benefits of Using T1 and Fractional T1 Carriers and Multiplexers Gilbert Held

Analysis of Quality of Service (QoS) for Video Conferencing in WiMAX Networks

Digital Audio and Video Data

Direct Digital Amplification (DDX ) The Evolution of Digital Amplification

VOICE OVER IP AND NETWORK CONVERGENCE

Goal We want to know. Introduction. What is VoIP? Carrier Grade VoIP. What is Meant by Carrier-Grade? What is Meant by VoIP? Why VoIP?

How To Compare Video Resolution To Video On A Computer Or Tablet Or Ipad Or Ipa Or Ipo Or Ipom Or Iporom Or A Tv Or Ipro Or Ipot Or A Computer (Or A Tv) Or A Webcam Or

Transcription:

Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles Sound is an energy wave with frequency and amplitude. Frequency maps the axis of time, and amplitude maps the axis of level. Generally, there are three types of sound: Audible sound: This sound is clearly audible to the unaided ear, and the frequency ranges from 20 Hz to 20 khz. Infrasound: The frequency is smaller than 20 Hz. Ultrasonic sound: The frequency is larger than 20 khz. Multimedia technologies only focus on audible sound. For audible sound, the frequency band of speech signals ranges from 80 Hz to 3400 Hz, and the frequency band of music signals ranges from 20 Hz to 20 khz. Multimedia technologies focus on processing of speech and music. Because analog voice is continuous in terms of time, the voice signals collected from microphones must be digitized and then sent to the computer for processing. Generally, the pulse-code modulation (PCM) technology is used to convert a continuously variable analog signal to a digital signal by means of sampling, quantization, and coding. 1. Sampling Sampling: reading amplitude of voice at a regular interval. The number of samples collected per second is represented by the sampling rate. Obviously, the larger the sampling rate, the closer the data points of the discrete amplitude values to the continuous analog signal curve. The total number of samples is larger. Based on the sampling principle, the sampling rate must be larger or equal to two times of the maximum frequency in the analog signal frequency band, ensuring that a digital audio can be accurately transformed to an analog audio. Common audio sampling rates: 8 khz, 11.025 khz, 22.05 khz, 16 khz, 37.8 khz, 44.1 khz, and 48 khz. For example, assume that the frequency of a speech signal ranges from 0.3 khz to 3.4 khz and a sampling rate of 8 khz is used, the signals that are used to replace the original continuous speech signals can be sampled. Generally, CD audio has a sampling rate of 44.1 khz. 2. Quantizing Quantizing: converting amplitudes of voice signal to digits to represent the signal strength. Quantizing precision: The maximum number of bits that each sample can be represented, also known as sampling resolution. Generally, the sampling resolution of voice signal can be 4 bits, 6 bits, 8 bits, 12 bits, or 16 bits. Based on the sampling rate and quantization precision, natural voice signals can be indefinitely closer but cannot be analogized completely by using an audio codec. In computer application, the highest fidelity of sound can be achieved by using the PCM, which is regarded as a lossless data compression. 3. Coding Coding: assigning a unique digital code to input signals. If the sampling rate is 44.1 khz, quantization with 16 bits is used, and two-way audio is output by using the PCM, the bit rate is 1411.2 kbit/s (44.1 khz x 16 x 2 = 1411.2 kbit/s). Therefore, the required storage space in one second is 176.4 KB and in one minute is 10.34 MB. Digital audio signals must be compressed so that the transmission and storage cost can be reduced.

Currently, the bit rate of an audio signal can be reduced to 32 256 kbit/s after compression, and the bit rate of a speech signal can be reduced to a value smaller than 8 kbit/s. Digital audio information is compressed so that its data amount can be minimized without impacting on the use of the information. Generally, the following six attributes are used to measure the data amount of digital audio information: Bit rate Signal bandwidth Subjective and objective voice quality Delay Algorithm complexity and storage requirement Sensitivity to channel bit error Audio coding is based on the standard algorithm. In this way, the coded audio information can be widely used. Traditional videoconferencing audio devices mainly adopt standards such as G.711, G.722, G.728, and AAC_LD. These standards are released by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). II. Common Audio Protocols 1. ITU-T G.728 G.728 is a telephone speech codec standard released by ITU-T in 1992. G.728 adopts the low-delay code excited linear prediction (LD-CELP) coding mode, uses a sampling rate of 8 khz, and transmits voice signals at a bit rate of 16 kbit/s with a 0.625 ms algorithm delay. 2. ITU-T G.711 G.711 was released for usage by ITU-T in 1972. Speech signals are coded by using the non-uniform quantization PCM methods. G.711 uses a sampling rate of 8 khz and uses a non-uniform quantization with 8 bits to represent each sample, resulting in a 64 kbit/s bit rate. This narrowband codec supports compression of 300 Hz to 3400 Hz audio. G.711 defines an excellent compression quality. However, a relatively lager bandwidth is occupied. G.711 is applicable to digital telephones in the PBX/ISDN system. 3. ITU-T G.722 ITU-T G.722 is the first standard wideband speech codec that uses a sampling rate of 16 khz. This codec was approved by the International Telegraph and Telephone Consultative Committee (CCITT) in 1984 and is still in use nowadays. G.722-based codec receives 50 Hz to 7 khz audio data by using a 16 khz sampling rate and a 16-bit quantization, and then compresses the audio data to reach a bit rate of 64 kbit/s, 56 kbit/s, or 48 kbit/s. The total delay is about 3 ms, and a high-quality audio is provided. G.722 has the following advantages: Low delay Low transmission bit error rate No patented technology Low cost Therefore, G.722 is widely used in Voice over IP (VoIP) services, personal communication services, and videoconferencing applications. 4. G.722.1 G.722.1 is a third-generation Siren 7 compression technology developed by Polycom. G.722.1 standard was approved by ITU-T in 1999. G.722.1 uses a sampling rate of 16 khz and a 16-bit quantization. G.722.1 supports sampling of 50 Hz to 7 khz audio data, and compresses the audio data to reach a bit rate of 32 kbit/s or 24 kbit/s. G.722.1 encapsulates voice signals into frames at 20 ms and provides a 40 ms algorithm delay.

With comparison to G.722, G.722.1 can implement lower bit rate and higher quality of audio data compression. G.722.1 aims to provide audio with the same quality defined in G.722 by operating at a half of the bit rate. To use this codec, you must obtain the authorization of the codec from the Polycom. 5. G.722.1 Annex C G.722.1 Annex C is a Siren 14 compression technology developed by Polycom. It uses a sampling rate of 32 khz, supports sampling of audio data with frequency ranging from 50 Hz to 14 khz, and compresses the audio data to reach a bit rate of 24 kbit/s, 32 kbit/s, or 48 kbit/s. G.722.1 Annex C encapsulates voice signals into frames at 20 ms and provides a 40 ms algorithm delay. In 2005, the Siren 14 technology of the Polycom was approved by ITU and was a new standard for 14 khz wideband speech codecs. The mono version of Siren 14 became ITU-T G.722.1 Annex C. G.722.1 Annex C has the following advantages: 6. AAC_LD Defines a low requirement for calculation capability and bandwidth. Be applicable to the processing of speech, music, and natural voice. Advanced Audio Coding (AAC) is an audio compression format developed by the Fraunhofer Institute (also known as the developer of the MP3 format), Dolby Laboratories, and AT&T Corporation. It is a part of the MPEG-2 standard and was approved to be an international standard for audio compression technologies in March 1997. With the wide use of MPEG-4 in 2000, MPEG2 AAC was brought into use as another core codec technology and added with some new codec features. MPEG2 AAC is also known as MPEG-4 AAC. The MPEG-4 AAC family contains nine codec specifications. The MPEG-4 Low Delay Audio Coder (AAC_LD) is used in the case of low bit rate. AAC_LD uses a sampling rate of 8 khz to 48 khz and provides CD-quality audio at a bit rate of 64 kbit/s. In addition, AAC_LD supports multi-channel output and provides a 20 ms algorithm delay. AAC adopts modular design and provides more powerful functions. 7. Parameter Comparison of Audio Protocols Table 1-1 lists parameter comparison of audio protocols. Table 1-1 Parameter comparison of audio protocols Sampling Rate Supported Audio Frequency Output Bit Rate Minimum Algorithm Delay G.711 8 khz 300 Hz to 3400 Hz 64 kbit/s < 1 ms G.722 16 khz 50 Hz to 7 khz 64 kbit/s 3 ms G.722.1 16 khz 50 Hz to 7 khz 24 kbit/s and 32 kbit/s 40 ms G.722.1 C 32 khz 50 Hz to 14 khz 24 kbit/s, 32 kbit/s, and 48 kbit/s 40 ms AAC_LD 48 khz 20 Hz to 20 khz 48 kbit/s to 64 kbit/s 20 ms

III. Advantage and Disadvantage Comparison of AAC_LD and G.722.1 Annex C Table 1-2 lists advantages and disadvantages of AAC_LD and G.722.1 Annex C. Table 1-2 Advantages and disadvantages of AAC_LD and G.722.1 Annex C Frequency of audio samples Bit rate Algorithm complexity Minimum delay G.722.1 Annex C Supports 50 Hz to 14 khz audio. Defines CD-quality audio but fails to sample audio data with high frequency. Supports a bit rate of 24 kbit/s, 32 kbit/s, or 48 kbit/s. Uses a smaller bandwidth than AAC_LD. Does not support audio output with high frequency. Uses an algorithm with low complexity. Occupies less CPU than AAC_LD. Encapsulates voice signals into frames at 20 ms and provides a 40 ms algorithm delay. AAC_LD Supports sampling of 20 Hz to 20 khz audio data. Defines better CD-quality audio than G.722.1 Annex C. Uses a bit rate of 48 kbit/s to 64 kbit/s and supports audio output at a bit rate larger than 64 kbit/s. Defines audio with higher quality. Adopts modular design and provides more powerful functions. Defines specialized chips such as TI. Defines a 20 ms algorithm delay. Multi-channel Supports dual audio channel. Supports a maximum of 48 tracks and 15 low-frequency tracks. Commonality Developed by Polycom. Used with authorization obtained from Polycom. Currently adopted only by Polycom and few videoconferencing vendors. Obtains support from Apple, Nokia, and Panasonic as a core MPEG-4 standard. Adopted by a variety of videoconferencing vendors such as Ted. Has a wide prospect of application.

Figure 1-1 shows comparison of AAC_LD, G.722.1-C, and MP3 using speech items. Figure 1-1 Comparison of AAC_LD, G.722.1-C, and MP3 using speech items As shown in Figure 1-1 (provided by Fraunhofer Institute), AAC_LD provides higher quality audio than that defined in G.722.1 Annex C and MP3 in the case of same sampling rate. AAC_LD provides the smallest delay among wideband speech codecs. In addition, AAC_LD ensures the CD quality of audio and provides a best combination of audio quality, bit rate, and delay. Therefore, it is the best choice for videoconferencing vendors.