Digital Speech Coding

Similar documents
Analog-to-Digital Voice Encoding

A Comparison of Speech Coding Algorithms ADPCM vs CELP. Shannon Wichman

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Simple Voice over IP (VoIP) Implementation

VoIP Technologies Lecturer : Dr. Ala Khalifeh Lecture 4 : Voice codecs (Cont.)

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics:

Introduction to Packet Voice Technologies and VoIP

VoIP Bandwidth Calculation

GSM speech coding. Wolfgang Leister Forelesning INF 5080 Vårsemester Norsk Regnesentral

Linear Predictive Coding

Speech Compression. 2.1 Introduction

Voice Encryption over GSM:

Voice Encoding Methods for Digital Wireless Communications Systems

Appendix C GSM System and Modulation Description

From Concept to Production in Secure Voice Communications

Digital Modulation. David Tipper. Department of Information Science and Telecommunications University of Pittsburgh. Typical Communication System

ARIB STD-T64-C.S0042 v1.0 Circuit-Switched Video Conferencing Services

Calculating Bandwidth Requirements

Voice Activity Detection in the Tiger Platform. Hampus Thorell

PCM Encoding and Decoding:

Network administrators must be aware that delay exists, and then design their network to bring end-to-end delay within acceptable limits.

Choosing the Right Audio Codecs for VoIP over cdma2000 Networks:

Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles

A TOOL FOR TEACHING LINEAR PREDICTIVE CODING

Chapter 3 ATM and Multimedia Traffic

SIP Trunking and Voice over IP

Voice over IP Protocols And Compression Algorithms

Speech Coding Methods, Standards, and Applications. Jerry D. Gibson

ETSI TS V1.1.1 ( )

Boost Your Skills with On-Site Courses Tailored to Your Needs

Digital Transmission of Analog Data: PCM and Delta Modulation

Indepth Voice over IP and SIP Networking Course

Web-Conferencing System SAViiMeeting

Differences between Traditional Telephony and VoIP

TCOM 370 NOTES 99-6 VOICE DIGITIZATION AND VOICE/DATA INTEGRATION

Voice Over IP Per Call Bandwidth Consumption

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur

Basic principles of Voice over IP

Goal We want to know. Introduction. What is VoIP? Carrier Grade VoIP. What is Meant by Carrier-Grade? What is Meant by VoIP? Why VoIP?

Testing Voice Service for Next Generation Packet Voice Networks

Simulative Investigation of QoS parameters for VoIP over WiMAX networks

Implementing an In-Service, Non- Intrusive Measurement Device in Telecommunication Networks Using the TMS320C31

1 There are also private networks including leased-lines, analog PBXs

How To Understand The Differences Between A Fax And A Fax On A G3 Network

Speech Signal Processing: An Overview

Voice over IP in PDC Packet Data Network

Perceived Speech Quality Prediction for Voice over IP-based Networks

Challenges and Solutions in VoIP

Tutorial about the VQR (Voice Quality Restoration) technology

CS263: Wireless Communications and Sensor Networks

Adjusting Voice Quality

Performance Evaluation of VoIP Services using Different CODECs over a UMTS Network

Active Monitoring of Voice over IP Services with Malden

Packet Loss Concealment Algorithm for VoIP Transmission in Unreliable Networks

Voice and Fax/Modem transmission in VoIP networks

ADAPTIVE SPEECH QUALITY IN VOICE-OVER-IP COMMUNICATIONS. by Eugene Myakotnykh

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

Ch GSM PENN. Magda El Zarki - Tcom Spring 98

Lab 5 Getting started with analog-digital conversion

CDMA TECHNOLOGY. Brief Working of CDMA

How To Recognize Voice Over Ip On Pc Or Mac Or Ip On A Pc Or Ip (Ip) On A Microsoft Computer Or Ip Computer On A Mac Or Mac (Ip Or Ip) On An Ip Computer Or Mac Computer On An Mp3

How To Encode Data From A Signal To A Signal (Wired) To A Bitcode (Wired Or Coaxial)

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

TUTORIAL FOR CHAPTER 8

8. Cellular Systems. 1. Bell System Technical Journal, Vol. 58, no. 1, Jan R. Steele, Mobile Communications, Pentech House, 1992.

Cellular Network Organization. Cellular Wireless Networks. Approaches to Cope with Increasing Capacity. Frequency Reuse

Research Report. By An T. Le Oct 2005 Supervisor: Prof. Ravi Sankar, Ph.D. Oct 28, 2005 An T. Le -USF- ICONS group - SUS ans VoIP 1

Simulation Platform for the Planning and Design of Networks Carrying VoIP Traffic

CBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC

Radio over Internet Protocol (RoIP)

Hello viewers, welcome to today s lecture on cellular telephone systems.

Computers Are Your Future Prentice-Hall, Inc.

L9: Cepstral analysis

RECOMMENDATION ITU-R BO.786 *

PART 5D TECHNICAL AND OPERATING CHARACTERISTICS OF MOBILE-SATELLITE SERVICES RECOMMENDATION ITU-R M.1188

Achieving PSTN Voice Quality in VoIP

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification

GSM System. Global System for Mobile Communications

Revision of Lecture Eighteen

Lecture 1. Introduction to Wireless Communications 1

The PESQ Algorithm as the Solution for Speech Quality Evaluation on 2.5G and 3G Networks. Technical Paper

Mobile Communications TCS 455

Curso de Telefonía IP para el MTC. Sesión 2 Requerimientos principales. Mg. Antonio Ocampo Zúñiga

Sigma- Delta Modulator Simulation and Analysis using MatLab

Effect of WiFi systems on multimedia applications

AC : MULTIMEDIA SYSTEMS EDUCATION INNOVATIONS I: SPEECH

White Paper. PESQ: An Introduction. Prepared by: Psytechnics Limited. 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN

Adaptive Rate Voice over IP Quality Management Algorithm

12 Quality of Service (QoS)

Voice Over IP. Priscilla Oppenheimer

Lecture 2 Outline. EE 179, Lecture 2, Handout #3. Information representation. Communication system block diagrams. Analog versus digital systems

White Paper. ETSI Speech Quality Test Event Calling Testing Speech Quality of a VoIP Gateway

Transcription:

Digital Speech Processing David Tipper Associate Professor Graduate Program of Telecommunications and Networking University of Pittsburgh Telcom 2720 Slides 7 http://www.sis.pitt.edu/~dtipper/tipper.html Digital Speech Coding Digital Speech Convert analog speech to digital form and transmit digitally Applications Telephony: (cellular, wired and Internet- VoIP) Speech Storage (Automated call-centers) High-Fidelity recordings/voice Text-to-speech (machine generated speech) Issues Efficient use of bandwidth Compress to lower bit rate per user => more users Speech Quality Want tollgrade or better quality in a specific transmission environment Environment ( BER, packet lost, packet out of order, delay, etc.) Hardware complexity Speed (coding/decoding delay), computation requirement and power consumption Telcom 2720 2 1

Digital Speech Processing Speech coding in wireless systems All 1G systems have analog speech transmission 2G and 3G systems have digital speech Type of source coding Motivation for digital speech Increase system capacity Compression possible Quality/bandwidth tradeoffs can be made Improve quality of speech Error control coding possible, equalization, etc. Improve security as encryption possible for privacy Reduce Cost and Operations and Maintenance (OAM) Telcom 2720 3 Typical Wireless Communication System Source Source Encoder Channel Encoder Modulator Channel Destination Source Decoder Channel Decoder Demod -ulator Telcom 2720 4 2

Characteristics of Speech Bandwidth Most of energy between 20 Hz to about 7KHz, Human ear sensitive to energy between 50 Hz and 4KHz Time Signal High correlation Short term stationary Classified into four categories Voiced : created by air passed through vocal cords (e.g., ah, v) Unvoiced : created by air through mouth and lips (e.g., s, f ) Mixed or transitional Silence Telcom 2720 5 Characteristics of Speech Typical Voiced speech Typical Unvoiced speech Telcom 2720 6 3

Digital Speech Speech Coder: device that converts speech to digital Types of speech coders Waveform coders Convert any analog signal to digital form Vocoders (Parametric coders) Try to exploit special properties of speech signal to reduce bit rate Build model of speech transmit parameters of model Hybrid Coders Combine features of waveform and vocoders Telcom 2720 7 Speech Quality of Various Coders Mean Opinion Score is a subjective measure of quality Tradeoff in quality vs. data rate vs. complexity Telcom 2720 8 4

Waveform Coders (e.g.,pcm) Waveform Coders Convert any analog signal to digital - basically A/D converter signal sampled > twice highest frequency- then quantized into ` n bit samples Uniform quantization Example Pulse Code Modulation band limit speech < 4000 Hz pass speech through μ law compander sample 8000 Hz, 8 bit samples 64 Kbps DS0 rate Characteristics Quality High Complexity Low Bit rate High Delay -Low Robustness - High Telcom 2720 10 PCM Speech Coding input Bandpass filter compressor Sampleand-hold circuit PAM -to- Digital converter PCM μ law compander Transmission medium output Bandpass filter expander Hold circuit PAM Digital-to converter μ law expander Pulse code modulation (PCM) system with analog companding then digital conversion ITU G.700 standard basis for speech coding In PSTN in 60 s Telcom 2720 11 5

Companding 1 μ-law companding Output 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 5 15 100 40 μ = 255 0 (no compression) Compander emphasizes small values, de-emphasizes large values in-order to equalize SNR across samples. Reverse the mapping at the receiver with an expandor 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Input Telcom 2720 12 PCM Speech Coding Digitally companded PCM system ITU G.711 standard better quality speech than analog companding input Bandpass filter Sampleand-hold circuit PAM -to- Digital converter Linear PCM Digital compressor Compressed PCM PCM transmitter Transmission medium output Bandpass filter Hold circuit PAM Digital-to converter Linear PCM Digital expander PCM receiver Differential PCM (DPCM) : reduce bit rate from 64 Kbps to 32 Kbps) since change is small between sample transmit 1 sample then on transmit difference between samples use 4 bits to quantize adaptively adjust range of quantizer improves quality (ADPCM ITU G.726 ) Telcom 2720 13 6

DPCM Speech Coding input Low-pass filter Differentiator (summer) -to- Digital converter Encoded difference samples Accumulated signal level Integrator Digital-to converter DPCM transmitter DPCM input Digital-to converter Integrator Hold circuit Low-pass filter output DPCM receiver Telcom 2720 14 Subband Speech Coding Bandpass Filter 1 A/D 1 speech Bandpass Filter 2 A/D 2 Mux Channel encoder Bandpass Filter 3 A/D 3 Partition signal into non-overlapping frequency bands use different A/D quantizer for each band Example: 3 subbands 5600+12000 + 13600 = 31.2 Kbps band Range encoding ---------------------------------------- 1 50-700 Hz 4 bits 2 700-2000 Hz 3 bits 3 2000-3400Hz 2 bits Telcom 2720 15 7

Vocoders Vocoders (Parametric Coders) Models the vocalization of speech Speech sampled and broken into frames (~25 msec) Instead of transmitting digitized speech 1. Build model of speech 2. Transmit parameters of model 3. Synthesize approximation of speech Linear Predictive Coders (LPC) basic Vocoder model Models vocal tract as a filter Filter excitation periodic pulse (voiced speech) or noise (unvoiced speech) Transmitted parameters: gain, voiced/unvoiced decision, pitch (if voiced), LPC parameters Telcom 2720 16 Vocoders Linear Predictive Coders (LPC) Excitation periodic pulse (voiced speech) or noise (unvoiced speech) Transmitted parameters: gain, voiced/unvoiced decision, pitch (if voiced), LPC parameters Telcom 2720 17 8

Vocoders Example Tenth Order Linear Predictive Coder Samples Voice at 8000 Hz buffer 240 samples => 30 msec Filter Model (M=10 is order, G is gain, z -1 unit delay, b k are filter coefficients) H ( z ) = 1 + M G k = 1 b k z G = 5 bits, b k = 8 bits each, voiced/unvoiced decision = 1 bit, pitch = 6 bits => 92 bits/30 msec = 3067 bps k Telcom 2720 18 Vocoders LPC coders can achieve low bit rates 1.2 4.8 Kbps Characteristics of LPC Quality Low Complexity Moderate Bit Rate Low Delay Moderate Robustness Low Quality of pure LPC vocoder to low for cellular telephony - try to improve quality by using hybrid coders Try to improve the quality by refining model of speech, improve accuracy of model improve input to speech coder Telcom 2720 19 9

Vocoders Hybrid Coders Combine Vocoder and Waveform Coder concept Residual LPC (RELP) Codebook excited LPC (CELP) Telcom 2720 20 RELP Vocoder Residual Excited LPC improve quality of LPC by transmitting error (residue) along with LPC parameters s(n) Buffer/ Window residue v/u decision gain, pitch ST-LP Analysis LP parameters LPC Synthesis ENCODER Encoded output Block diagram of a RELP encoder Telcom 2720 21 10

GSM Speech Coding speech Low-pass filter A/D 104 kbps 13 kbps RPE-LTP Channel speech encoder encoder 8000 samples/s, 13 bits/sample GSM uses Regular Pulse Excited -- Linear Predictive Coder (RPE-- LPC) for speech Basically combine DPCM concept with LPC Information from previous samples used to predict the current sample. The LPC coefficients, plus an encoded form of the residual (predicted - actual sample = error), represent the signal. Telcom 2720 22 GSM Speech Coding (cont) Regular pulse excited - long term prediction (RPE-LRP) speech encoder (RELP speech coder 160 samples/ 20 ms from A/D (= 2080 bits) RPE-LTP speech encoder 36 LPC bits/20 ms 9 LTP bits/5 ms 47 RPE bits/5 ms 260 bits/20 ms to channel encoder LPC: linear prediction coding filter LTP: long term prediction pitch + input RPE: Residual Prediction Error: Telcom 2720 23 11

GSM Speech Coding (cont) Channel encoder 260 bits/ 20 ms = 13 kb/s 50 class 1a bits 3-bit CRC 182 class 1b bits 78 class 2 bits 53 bits 4 tail bits* (2,1,5) convolution coder 470 bits Bit interleaver Class 1a: CRC (3-bit error detection) and convolutional coding (error correction) Class 1b: convolutional coding Class 2: no error protection *tail bits to periodically reset convolutional coder 456 bits/ 20 ms = 22.8 kb/s Telcom 2720 24 Hybrid Vocoders Codebook Excited LPC Problem with simple LPC is U/V decision and pitch estimation doesn t model transitional speech well, and not always accurate Codebook approach pass speech through an analyzer to find closest match to a set of possible excitations (codebook) Transmit codebook pointer + LPC parameters NA-TDMA standard, IS-95, 3G, ITU G.729 standard Telcom 2720 25 12

Typical CELP Encoder Telcom 2720 26 CELP Speech Coders General CELP architecture Telcom 2720 27 13

CELP Speech Coders i Code Book #1 γ 1 γ 1 l Code Book #2 γ 2 LPC Synthesis Filter Output γ 2 h Code Book #3 β β Filter Coefficients Block diagram of the NA-TDMA (IS-54) speech coder subband codebook approach termed vector sum excited LPC (VSELPC) Telcom 2720 28 Evaluating Speech Coders Qualitative Comparison based on subjective procedures in ITU-T Rec. P. 830 Major Procedures Absolute Category Rating Subjects listen to samples and rank them on an absolute scale - result is a mean opinion score (MOS) Comparison Category Rating Subjects listen to coded samples and original uncoded sample (PCM or analog), the two are compared on a relative scale result is a comparison mean opinion score (CMOS) Mean Opinion Score (MOS) ------------------------------------- Excellent 5 Good 4 Fair 3 Poor 2 Bad 1 Comparison MOS (CMOS) ------------------------------------- Much Better 3 Better 2 Slightly Better 1 About the Same 0 Slightly Worse -1 Worse -2 Much Worse -3 Telcom 2720 29 14

Evaluating Speech Coders MOS for clear channel environment no errors Result vary a little with language and speaker gender Standard Speech coder Bit rate MOS PCM Waveform 64 Kbps 4.3 CT2 ADPCM 32 Kbps 4.1 DECT ADPCM 32 Kbps 4.1 NA-TDMA Hybrid VSELPC 8Kbps 3.0 GSM Hybrid RELPC 13 kbps 3.54 QCELP Hybrid CELP 14.4 Kbps 3.4 4.0 QCELP Hybrid CELP 9.6 Kbps 3.4 LPC Vocoder 2.4 Kbps 2.5 ITU G.729 Hybrid CELP 8.Kbps 3.9 Telcom 2720 30 Evaluating Speech Coders Types of environments recommended for testing coder quality Clean Channel no background noise Vehicle : emulate car background noise Street : emulate pedestrian environment Hoth : emulate background noise in office environment (voice band interference) Consider environments above for cases of Perfect Channel no transmission errors Random channel errors Bursty channel errors May consider repeated encoding/decoding (e.g., mobile to mobile call) Telcom 2720 31 15

Evaluating Speech Coders Background noise and errors degrade quality Repeated coding degrades quality Telcom 2720 32 Codec Selection For cellular need to consider Quality, Complexity, Delay, Compression Rate ITU Coder Bit Rate Coding Delay Decoding Delay Complexity G.711 64 Kbps 0 0 Low G.729 8 Kbps 15 ms 7.5 ms Medium G.723.a,b 6.4/5.3 Kbps 35.5 ms 18.75 ms High Telcom 2720 33 16

3G Standards Two competing 3G standards Both standards use multi-mode CELP vocoders 1. 3GPP/cdma2000 2. 3GPP/UMTS (SMV Multimode rate set 1) Variable bit rate vocoder Source Control of bit rate Channel coding treats all bits equally (AMR-NB Multi-rate) Fixed rate vocoder Voice Activity Detection Discontinuous Transmission network control of coder rate Tailors Channel coding to speech coder Telcom 2720 34 Silence Compression Much of a conversation is Silence (~40%) no need to transmit Voice Activity Detector (VAD) Hardware to detect silence period quickly Variable Bit Rate coders reduce bit rate when silence Discontinuous transmission (DTX) Stop transmitting frames Send minimal # of frames to keep connection up Comfort Noise Generator (CNG) Synthesize background noise avoids: Did you hang up? Random noise or reproduce speaker s ambient background For example GSM codec and popular VoIP G.723.1 codec has VAD/DTX/CNG Cdmaone and CDMA2000 codec use variable bit rate approach Telcom 2720 35 17

Silence Compression Telcom 2720 36 Voice Coding Basic Voice Coding Approaches Waveform Vocoders Hybrid Vocoders Evaluation of Vocoder Quality Codebook based vocoders use in new technology 3GPP and ITU recently standardized a AMR wideband CELP input 50-7000 HZ rather than 300-3400 Hz of current systems more natural quality speech slightly higher bit rate Telcom 2720 37 18