VoIP Technologies Lecturer : Dr. Ala Khalifeh Lecture 4 : Voice codecs (Cont.)
Remember first the big picture
1. VoIP network architecture and some terminologies
2. Voice coders
3. Audio and voice quality measuring techniques
Voice Codec Types
Two types of coders:
- Waveform coders try to reproduce the signal waveform and give good performance at bitrates between 24 kbit/s and 32 kbit/s.
- Linear predictive coders (or vocoders) use a simple model of speech production (voiced or unvoiced types), modeled by a slowly varying filter (updated on a 20-30 ms frame basis) which shapes the spectrum of the decoded speech.
Examples of some common coders
The ITU (International Telecommunication Union) is the United Nations specialized agency for information and communication technologies (ICTs). http://www.itu.int
Fixed/Variable rate coders
Voice codecs use either fixed or variable rates:
- Fixed-rate coders can encode at just one specific rate, e.g. G.711, whose encoding rate is 64 kbps.
- Variable-rate coders allow the user to choose among different rates (qualities), e.g. Speex (www.speex.org) and AMR (www.voiceage.com).
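The 64 kbps figure for G.711 follows from its 8 kHz sampling rate and 8 bits per sample. The short sketch below works through that arithmetic and the resulting payload size per packet. The 20 ms packetization interval and the function names are assumptions chosen for illustration, not something stated on the slide.

```python
def bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit-rate of a fixed-rate coder that emits a fixed number of bits per sample."""
    return sample_rate_hz * bits_per_sample


def payload_bytes(rate_bps: int, frame_ms: int) -> float:
    """Bytes of coded speech carried in one frame of the given duration."""
    return rate_bps * frame_ms / 1000 / 8


# G.711: 8000 samples per second, 8 bits per sample.
g711_rate = bitrate_bps(8000, 8)
print(g711_rate)                      # 64000
# Assumed packetization interval of 20 ms.
print(payload_bytes(g711_rate, 20))   # 160.0
```

Because the rate is fixed, every frame has the same size regardless of what is being said.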
Fixed/Variable rate coders
- Variable-rate coders either let the user choose a target bit-rate, which is then fixed for all frames,
- or let the user enable the Variable Bit-Rate (VBR) mode.
- Variable bit-rate (VBR) allows a codec to change its bit-rate dynamically to adapt to the difficulty of the audio being encoded. In Speex, for example, sounds like vowels and high-energy transients require a higher bit-rate to achieve good quality, while fricatives (e.g. s and f sounds) can be coded adequately with fewer bits. For this reason, VBR can achieve a lower bit-rate for the same quality, or better quality for a given bit-rate.
Fixed/Variable rate coders - Variable bit-rate (VBR)
Despite its advantages, VBR has two main drawbacks:
- First, because only the quality is specified, there is no guarantee about the final average bit-rate.
- Second, for some real-time applications like voice over IP (VoIP), what counts is the maximum bit-rate, which must be low enough for the communication channel.
Solution: Average Bit-Rate (ABR).
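The two drawbacks above can be made concrete with a small calculation. The sketch below takes a made-up sequence of per-frame sizes from a VBR encoder and reports the average and peak bit-rates. The frame sizes, the 20 ms frame duration, and the 12 kbps channel are hypothetical values chosen only for illustration.

```python
def rate_stats(bits_per_frame, frame_ms=20):
    """Return (average, peak) bit-rate in bit/s for a list of per-frame sizes."""
    rates = [bits * 1000 / frame_ms for bits in bits_per_frame]
    return sum(rates) / len(rates), max(rates)


# Hypothetical VBR output: large frames for vowels, small ones for fricatives.
frames = [300, 120, 60, 300, 120]
average_bps, peak_bps = rate_stats(frames)

channel_bps = 12000  # hypothetical channel capacity
print(average_bps)              # 9000.0
print(peak_bps)                 # 15000.0
print(peak_bps <= channel_bps)  # False: the peak does not fit the channel
```

In this example the average would fit the channel, but the peak would not, which is the situation that matters for real-time VoIP.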
Fixed/Variable rate coders - Average Bit-Rate (ABR)
- Average bit-rate solves one of the problems of VBR, as it dynamically adjusts the VBR quality in order to meet a specific target bit-rate.
- Because the quality/bit-rate is adjusted in real time (open-loop), the overall quality will be slightly lower than that obtained by encoding in VBR with exactly the right quality setting to meet the target average bit-rate.
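The idea of steering the VBR quality toward a target bit-rate can be sketched with a toy control loop. This is not Speex's actual ABR algorithm: the codec model `frame_bits`, the gain, and the smoothing factor are all hypothetical, and the loop only illustrates the principle of raising the quality when recent frames came in under the target and lowering it when they came in over.

```python
import random


def frame_bits(quality, difficulty):
    """Hypothetical codec model: bits grow with quality and frame difficulty (0..1)."""
    return 40.0 * quality * (0.5 + difficulty)


def abr_encode(difficulties, target_bps, frame_ms=20, gain=0.5, smoothing=0.1):
    """Encode frames while nudging the quality toward the target average bit-rate.

    Returns the average bit-rate actually produced, in bit/s.
    """
    target_bits = target_bps * frame_ms / 1000
    quality = 5.0              # starting quality on an assumed 0..10 scale
    recent_bits = target_bits  # smoothed estimate of recent frame sizes
    total_bits = 0.0
    for difficulty in difficulties:
        bits = frame_bits(quality, difficulty)
        total_bits += bits
        # Exponential moving average of the frame size.
        recent_bits += smoothing * (bits - recent_bits)
        # Raise quality when under target, lower it when over target.
        error = (target_bits - recent_bits) / target_bits
        quality = min(10.0, max(0.0, quality + gain * error))
    return total_bits / len(difficulties) * 1000 / frame_ms


rng = random.Random(0)
difficulties = [rng.random() for _ in range(2000)]
print(abr_encode(difficulties, target_bps=8000))
```

The quality is corrected only after a frame has been produced, so individual frames can still exceed the target; only the long-run average is controlled.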
Other voice codec functionality
- In addition to basic speech encoding, VAD (voice activity detection), DTX (discontinuous transmission), and CNG (comfort noise generation) schemes can be added to the coder.
Voice Activity Detection (VAD)
- When enabled, voice activity detection detects whether the audio being encoded is speech or silence/background noise. VAD is always implicitly activated when encoding in VBR, so the option is only useful in non-VBR operation.
- In this case, Speex, for example, detects non-speech periods and encodes them with just enough bits to reproduce the background noise. This is called comfort noise generation (CNG).
It must be pointed out that the design of a good and efficient VAD algorithm is almost as complex as the design of a good speech coder.
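As a rough illustration of what a VAD decides, the sketch below classifies each frame as speech or non-speech by comparing its energy to a fixed threshold. This is a deliberately naive scheme, not the one used by Speex or any standard coder. The -40 dB threshold, the 160-sample frames, and the assumption that samples are floats in the range -1 to 1 are all illustrative choices.

```python
import math


def frame_energy_db(frame):
    """Mean power of a frame in dB, for samples assumed to lie in [-1, 1]."""
    power = sum(s * s for s in frame) / len(frame)
    return 10 * math.log10(power + 1e-12)  # small offset avoids log10(0)


def simple_vad(frames, threshold_db=-40.0):
    """Return True for frames whose energy exceeds the (assumed) threshold."""
    return [frame_energy_db(f) > threshold_db for f in frames]


# Two synthetic 20 ms frames at 8 kHz: a loud tone and very quiet background.
loud = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
quiet = [0.001 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
print(simple_vad([loud, quiet]))  # [True, False]
```

A fixed energy threshold fails as soon as the background noise level changes, which hints at why a good VAD is hard to design.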
Discontinuous Transmission (DTX)
- Discontinuous transmission is an addition to VAD/VBR operation that allows the coder to stop transmitting completely when the background noise is stationary.
Packet Concealment Algorithms
- Hide the lost frames using the other received frames, e.g. an insertion-based error concealment algorithm.
VoIP requirements as seen by the encoder
- Frame size and algorithmic delay must be small
- Encoding and decoding must work with limited resources
- Minimal distortion when packets are lost
- Support for narrowband and wideband
- Support for multiple bit-rates (quality)
- Achieve good compression
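The insertion-based concealment mentioned above can be sketched in a few lines. The variant shown here repeats the last correctly received frame in place of each lost one, and inserts silence if nothing has been received yet. Representing a lost frame as `None` and the function name `conceal` are assumptions made for this example.

```python
def conceal(frames, frame_len=160):
    """Replace lost frames (None) by repeating the last good frame.

    If a loss occurs before any frame has arrived, silence is inserted instead.
    """
    output = []
    last_good = [0] * frame_len  # silence until a real frame is received
    for frame in frames:
        if frame is None:
            output.append(list(last_good))  # insertion: repeat previous frame
        else:
            output.append(frame)
            last_good = frame
    return output


# Tiny frames of two samples each, with the 2nd, 4th and 5th frames lost.
received = [[1, 2], None, [3, 4], None, None]
print(conceal(received, frame_len=2))
# [[1, 2], [1, 2], [3, 4], [3, 4], [3, 4]]
```

Repetition is cheap and needs no help from the sender, but long bursts of loss become audible because the same frame is replayed unchanged.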
Speex: A Free Codec For Free Speech
Audio codec specifically designed for speech and VoIP
- Open-source/Free software (BSD-licensed)
- Designed to avoid patents
Specs
- Bit-rates: narrowband 2.15 to 24.6 kbps; wideband 4 to 42.2 kbps
- Latency: narrowband 30 ms (20 ms frames, 10 ms delay); wideband 34 ms (20 ms frames, 14 ms delay)
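The latency figures above are simply the frame duration plus the extra delay, and the bit-rate range translates directly into a number of bits per 20 ms frame. The sketch below checks that arithmetic. Reading the "delay" term as look-ahead is an interpretation, and the function names are hypothetical.

```python
def algorithmic_delay_ms(frame_ms: int, lookahead_ms: int) -> int:
    """Total algorithmic delay: one full frame plus the extra delay (assumed look-ahead)."""
    return frame_ms + lookahead_ms


def bits_per_frame(rate_bps: int, frame_ms: int) -> int:
    """Number of bits produced for one frame at the given bit-rate."""
    return rate_bps * frame_ms // 1000


print(algorithmic_delay_ms(20, 10))  # 30 (narrowband)
print(algorithmic_delay_ms(20, 14))  # 34 (wideband)

# Narrowband range, 2.15 to 24.6 kbps, with 20 ms frames.
print(bits_per_frame(2150, 20))      # 43
print(bits_per_frame(24600, 20))     # 492
```

At the lowest narrowband rate each 20 ms frame is described by only 43 bits, compared with 1280 bits for the same duration of 64 kbps G.711.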
Features
- Embedded wideband bit-stream
- Variable bitrate (VBR): good for files, bad for VoIP
- Average bitrate (ABR): VBR with bitrate management
- Voice activity detection (VAD) and discontinuous transmission (DTX)