Speech Signal Processing: An Overview



Similar documents
Analysis/resynthesis with the short time Fourier transform

Analog Representations of Sound

School Class Monitoring System Based on Audio Signal Processing

B3. Short Time Fourier Transform (STFT)

Introduction and Comparison of Common Videoconferencing Audio Protocols I. Digital Audio Principles

Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids

L9: Cepstral analysis

MUSICAL INSTRUMENT FAMILY CLASSIFICATION

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB

Lecture 1-10: Spectrograms

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA

Thirukkural - A Text-to-Speech Synthesis System

Emotion Detection from Speech

A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song

CMAS. Compact Radio Signal Monitoring Solution

BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION

Understanding the Transition From PESQ to POLQA. An Ascom Network Testing White Paper

Ericsson T18s Voice Dialing Simulator

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification

GSM speech coding. Wolfgang Leister Forelesning INF 5080 Vårsemester Norsk Regnesentral

Computer Networks and Internets, 5e Chapter 6 Information Sources and Signals. Introduction

Marathi Interactive Voice Response System (IVRS) using MFCC and DTW

Dirac Live & the RS20i

Sampling Theorem Notes. Recall: That a time sampled signal is like taking a snap shot or picture of signal periodically.

Advanced Signal Processing and Digital Noise Reduction

1. Introduction to Spoken Dialogue Systems

K2 CW Filter Alignment Procedures Using Spectrogram 1 ver. 5 01/17/2002

Carla Simões, Speech Analysis and Transcription Software

Appendix C GSM System and Modulation Description

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.

From Concept to Production in Secure Voice Communications

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

Auto-Tuning Using Fourier Coefficients

Time Series Analysis: Introduction to Signal Processing Concepts. Liam Kilmartin Discipline of Electrical & Electronic Engineering, NUI, Galway

SR2000 FREQUENCY MONITOR

Introduction to Digital Audio

Web-Conferencing System SAViiMeeting

Digital Speech Coding

Course overview Processamento de sinais 2009/10 LEA

Artificial Neural Network for Speech Recognition

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA

High Quality Integrated Data Reconstruction for Medical Applications

SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY

Objective Speech Quality Measures for Internet Telephony

VoIP Technologies Lecturer : Dr. Ala Khalifeh Lecture 4 : Voice codecs (Cont.)

FOURIER TRANSFORM BASED SIMPLE CHORD ANALYSIS. UIUC Physics 193 POM

Video-Conferencing System

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance

Improved Fault Detection by Appropriate Control of Signal

Software Defined Radio

Establishing the Uniqueness of the Human Voice for Security Applications

Figure1. Acoustic feedback in packet based video conferencing system

HERMES: Human Exposure and Radiation Monitoring of Electromagnetic Sources.

Digital Audio Recording Analysis The Electric Network Frequency Criterion Catalin Grigoras, Ph.D, IAFPA

The Sonometer The Resonant String and Timbre Change after plucking

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

Probability and Random Variables. Generation of random variables (r.v.)

RECOMMENDATION ITU-R BR *, ** Parameters for international exchange of multi-channel sound recordings with or without accompanying picture ***

communication over wireless link handling mobile user who changes point of attachment to network

Recent advances in Digital Music Processing and Indexing

CLIO 8 CLIO 8 CLIO 8 CLIO 8

LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING

Abant Izzet Baysal University

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA

Doppler. Doppler. Doppler shift. Doppler Frequency. Doppler shift. Doppler shift. Chapter 19

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Chapter 10 Introduction to Time Series Analysis

Sound absorption and acoustic surface impedance

Final Year Project Progress Report. Frequency-Domain Adaptive Filtering. Myles Friel. Supervisor: Dr.Edward Jones

Introduction to Packet Voice Technologies and VoIP

11: AUDIO AMPLIFIER I. INTRODUCTION

Real Time Analysis of VoIP System under Pervasive Environment through Spectral Parameters

MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm

Department of Electrical and Computer Engineering Ben-Gurion University of the Negev. LAB 1 - Introduction to USRP

Propagation Channel Emulator ECP_V3

THE problem of tracking multiple moving targets arises

Wavelet analysis. Wavelet requirements. Example signals. Stationary signal 2 Hz + 10 Hz + 20Hz. Zero mean, oscillatory (wave) Fast decay (let)


Analog-to-Digital Voice Encoding

High Definition Wideband

Teaching Fourier Analysis and Wave Physics with the Bass Guitar

Speech Analysis for Automatic Speech Recognition

R&S FSW signal and spectrum analyzer: best in class now up to 50 GHz

DSAP - Digital Speech and Audio Processing

*For stability of the feedback loop, the differential gain must vary as

AN Application Note: FCC Regulations for ISM Band Devices: MHz. FCC Regulations for ISM Band Devices: MHz

TCOM 370 NOTES 99-6 VOICE DIGITIZATION AND VOICE/DATA INTEGRATION

Tutorial about the VQR (Voice Quality Restoration) technology

Use Data Budgets to Manage Large Acoustic Datasets

Wave Field Synthesis

AC : MULTIMEDIA SYSTEMS EDUCATION INNOVATIONS I: SPEECH

Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio

The Fundamentals of FFT-Based Audio Measurements in SmaartLive

(Refer Slide Time: 2:10)

WLAN Spectrum Analyzer Technology White Paper HUAWEI TECHNOLOGIES CO., LTD. Issue 01. Date

Transcription:

Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 1 / 13

Organization Introduction Sampling frequency and bit resolution Non-stationary nature Short term processing STFT and Spectrogram Energy and Pitch Cepstral analysis Linear prediction analysis Speech processing tasks Summary Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 2 / 13

Speech: Fundamental and effortless mode of communication among humans. Speech communication: Talker, listener and channel Speech Production Process: Message formulation, language coding, neuro-muscular commands, movement of speech production organs, acoustic pressure variations Speech Perception Process: acoustic pressure variations, movement of speech perception organs, neuro-muscular commands, message comprehension Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 3 / 13

What is present in Speech Signal? Message Speaker Emotion Language Dialect Sensor Channel How to analyze, extract and model these information Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 4 / 13

Sampling Frequency Acoustic pressure variations to electrical signal using microphone Digitization for storage, analysis and processing on a digital machine Sampling, quantization and Encoding Sampling Theorem: The sampling frequency should be greater than or equal to twice the maximum frequency Audio frequency range: 20 Hz to 20 khz Speech components up to 14 khz, but can consider the whole audio range. Min. Sampling frequency recommended is 40 khz Including some guard band it is 48 khz Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 5 / 13

Bit Rate Number of bits / sample Bit resolution Number of quantization levels Minimum 16 bits is recommended Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 6 / 13

Non-Stationary Nature Signal, system, and signals and systems Stationary vs non-stationary signal Significance of non-stationary nature of speech Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 7 / 13

Short Term Processing Need for short term processing Approach for short term processing Frame size and frame shift Short term time domain processing Short term frequency domain processing Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 8 / 13

Short Term Domain Parameters Short term energy Short term zero crossing rate Short term autocorrelation Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 9 / 13

Short Term Frequency Domain Parameters Short term Fourier transform DTFT, STFT, DFT, FFT Spectrogram Wideband spectrogram Narrowband spectrogram Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 10 / 13

Cepstral Analysis of Speech Separation of source and system components in cepstral domain Feature extraction stage of automatic speech processing systems Also in estimation of pitch Cepstrum pitch determination Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 11 / 13

Linear Prediction Analysis of Speech Separation of source and system components in time domain Filter coefficients for speech coding Pitch estimation by SIFT Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 12 / 13

Automatic Speech Processing Tasks Speech recognition Speaker recognition Speech synthesis Language identification Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, 2012 13 / 13