Speech Signal Processing: An Overview

1. Speech Signal Processing: An Overview
S. R. M. Prasanna
Department of Electronics and Electrical Engineering
Indian Institute of Technology Guwahati
December, 2012
Prasanna (EMST Lab, EEE, IITG) Speech Signal Processing: An Overview December 20, / 13

2. Organization
- Introduction
- Sampling frequency and bit resolution
- Non-stationary nature
- Short term processing
- STFT and Spectrogram
- Energy and Pitch
- Cepstral analysis
- Linear prediction analysis
- Speech processing tasks
- Summary

3. Introduction
- Speech: the fundamental and effortless mode of communication among humans.
- Speech communication: talker, listener and channel.
- Speech production process: message formulation, language coding, neuro-muscular commands, movement of speech production organs, acoustic pressure variations.
- Speech perception process: acoustic pressure variations, movement of speech perception organs, neuro-muscular commands, message comprehension.

4. What is present in Speech Signal?
- Message
- Speaker
- Emotion
- Language
- Dialect
- Sensor
- Channel
How do we analyze, extract and model this information?

5. Sampling Frequency
- Acoustic pressure variations are converted to an electrical signal using a microphone.
- Digitization for storage, analysis and processing on a digital machine: sampling, quantization and encoding.
- Sampling theorem: the sampling frequency should be at least twice the maximum frequency present in the signal.
- Audio frequency range: 20 Hz to 20 kHz.
- Speech components extend up to 14 kHz, but the whole audio range can be considered.
- The minimum recommended sampling frequency is 40 kHz.
- Including some guard band, it is 48 kHz.
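The sampling-rate arithmetic on this slide can be sketched in a few lines of Python. The function names `min_sampling_rate` and `alias_frequency` are illustrative and do not come from the slides. The second function shows where a tone would appear if the sampling frequency were chosen too low.

```python
def min_sampling_rate(f_max_hz):
    """Nyquist rate: twice the highest frequency present in the signal."""
    return 2.0 * f_max_hz


def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (between 0 and fs/2) of a real tone at f_hz
    after sampling at fs_hz."""
    f = f_hz % fs_hz            # sampling cannot tell f apart from f + k*fs
    return min(f, fs_hz - f)    # real tones fold back about fs/2


print(min_sampling_rate(20_000))         # 40000.0 (whole audio range)
print(alias_frequency(14_000, 48_000))   # 14000 (preserved at 48 kHz)
print(alias_frequency(14_000, 16_000))   # 2000 (aliased at 16 kHz)
```

With a 48 kHz sampling frequency the 14 kHz component is kept as it is, whereas at 16 kHz it would fold down to 2 kHz and corrupt the lower band.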

6. Bit Rate
- Number of bits / sample
- Bit resolution
- Number of quantization levels
- Minimum 16 bits is recommended
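The relation between bit resolution, quantization levels and bit rate can be sketched as follows. The helper names are hypothetical, and the `6.02 * bits + 1.76` dB figure is the usual textbook approximation for a full-scale sine wave through a uniform quantizer, not a value taken from the slides.

```python
import math


def quantization_levels(bits):
    """Number of distinct amplitude levels for a given bit resolution."""
    return 2 ** bits


def sqnr_db(bits):
    """Approximate signal-to-quantization-noise ratio (dB) of a uniform
    quantizer driven by a full-scale sine wave."""
    return 6.02 * bits + 1.76


def bit_rate(fs_hz, bits, channels=1):
    """Uncompressed bit rate in bits per second."""
    return fs_hz * bits * channels


def quantize(x, bits):
    """Round x in [-1, 1) to the nearest of 2**bits uniform levels."""
    step = 2.0 / (2 ** bits)
    q = math.floor(x / step + 0.5) * step
    return max(-1.0, min(1.0 - step, q))


print(quantization_levels(16))   # 65536
print(bit_rate(48_000, 16))      # 768000
```

Each additional bit doubles the number of levels and adds roughly 6 dB of signal-to-quantization-noise ratio.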

7. Non-Stationary Nature
- Signal, system, and signals and systems
- Stationary vs non-stationary signal
- Significance of non-stationary nature of speech

8. Short Term Processing
- Need for short term processing
- Approach for short term processing
- Frame size and frame shift
- Short term time domain processing
- Short term frequency domain processing
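Frame size and frame shift can be made concrete with a small framing routine. This is a minimal sketch: the function name `frame_signal` and the default values of 20 ms frames with a 10 ms shift are assumptions (common choices in the literature), not values given on the slide.

```python
def frame_signal(x, fs_hz, frame_ms=20.0, shift_ms=10.0):
    """Split a sample sequence into overlapping short-term frames.

    frame_ms is the frame size and shift_ms the frame shift, both in
    milliseconds. Trailing samples that do not fill a frame are dropped.
    """
    frame_len = int(round(fs_hz * frame_ms / 1000.0))
    shift = int(round(fs_hz * shift_ms / 1000.0))
    frames = []
    start = 0
    while start + frame_len <= len(x):
        frames.append(x[start:start + frame_len])
        start += shift
    return frames


# One second of dummy samples at 8 kHz: 160-sample frames, 80-sample shift.
frames = frame_signal(list(range(8000)), fs_hz=8000)
print(len(frames), len(frames[0]))
```

Within each frame the signal is treated as approximately stationary, so time domain and frequency domain parameters are computed frame by frame.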

9. Short Term Time Domain Parameters
- Short term energy
- Short term zero crossing rate
- Short term autocorrelation
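The three parameters listed on this slide can be computed for a single frame as sketched here. The function names are illustrative, and the test frame (a 100 Hz sine sampled at 8 kHz) is an assumed example rather than data from the lecture.

```python
import math


def short_term_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def autocorrelation(frame, max_lag):
    """Short term autocorrelation r[k] for k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


# Assumed example: 30 ms of a 100 Hz sine at 8 kHz (period = 80 samples).
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs) for n in range(240)]
r = autocorrelation(frame, 120)
pitch_lag = max(range(40, 121), key=lambda k: r[k])
print(short_term_energy(frame), zero_crossing_rate(frame), pitch_lag)
```

For voiced speech the autocorrelation shows a peak at the pitch period, which is why the lag of the largest peak away from zero is a simple pitch estimate. High energy with a low zero crossing rate is typical of voiced frames.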

10. Short Term Frequency Domain Parameters
- Short term Fourier transform
- DTFT, STFT, DFT, FFT
- Spectrogram
- Wideband spectrogram
- Narrowband spectrogram
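A spectrogram is the magnitude of the short term Fourier transform stacked frame by frame. The sketch below uses a naive DFT from the standard library so that it stays self-contained; a real system would use an FFT. The function names and the Hamming window are assumptions, not details from the slide.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))
            for i in range(n)]


def dft_magnitude(frame):
    """Magnitude of the DFT for bins 0..N/2 (naive O(N^2) evaluation)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def spectrogram(x, frame_len, shift):
    """List of magnitude spectra, one per windowed frame."""
    win = hamming(frame_len)
    spectra = []
    for start in range(0, len(x) - frame_len + 1, shift):
        seg = [x[start + i] * win[i] for i in range(frame_len)]
        spectra.append(dft_magnitude(seg))
    return spectra


def bin_spacing_hz(fs_hz, frame_len):
    """Frequency spacing between adjacent DFT bins."""
    return fs_hz / frame_len


fs = 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
spec = spectrogram(tone, frame_len=64, shift=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(peak_bin * bin_spacing_hz(fs, 64))
```

The window length sets the trade-off between the two spectrogram types. A short window (a few milliseconds) gives coarse bin spacing but fine time resolution, which is the wideband case. A long window (a few tens of milliseconds) resolves individual pitch harmonics at the cost of time resolution, which is the narrowband case.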

11. Cepstral Analysis of Speech
- Separation of source and system components in cepstral domain
- Feature extraction stage of automatic speech processing systems
- Also in estimation of pitch
- Cepstrum pitch determination
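Cepstrum pitch determination can be illustrated with the real cepstrum, taken here as the inverse DFT of the log magnitude spectrum. This is a minimal sketch under stated assumptions: the function names, the 50 to 400 Hz search range, and the synthetic frame (a decaying pulse train passed through a one-pole filter standing in for the vocal tract) are all hypothetical and not from the slides. A naive DFT is used to avoid external dependencies.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def real_cepstrum(frame):
    """Inverse DFT of the log magnitude spectrum of the frame."""
    n = len(frame)
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(frame)]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * q / n)
                for k in range(n)).real / n
            for q in range(n)]


def cepstral_pitch(frame, fs_hz, f_min=50.0, f_max=400.0):
    """Pitch estimate (Hz) from the largest cepstral peak in the
    quefrency range corresponding to f_min..f_max."""
    c = real_cepstrum(frame)
    q_min = int(fs_hz / f_max)
    q_max = min(int(fs_hz / f_min), len(frame) // 2)
    q_peak = max(range(q_min, q_max + 1), key=lambda q: c[q])
    return fs_hz / q_peak


def synthetic_voiced_frame(n=400, period=80, decay=0.8, pole=0.7):
    """Decaying pulse train (source) through a one-pole filter (system)."""
    excitation = [0.0] * n
    for m, start in enumerate(range(0, n, period)):
        excitation[start] = decay ** m
    out, prev = [], 0.0
    for e in excitation:
        prev = e + pole * prev
        out.append(prev)
    return out


print(cepstral_pitch(synthetic_voiced_frame(), fs_hz=8000))
```

The synthetic frame has a pulse period of 80 samples, which corresponds to 100 Hz at an 8 kHz sampling frequency. The filter contribution is concentrated at low quefrencies and the periodic excitation shows up as a peak at the pitch period, which is the source and system separation the slide refers to.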

12. Linear Prediction Analysis of Speech
- Separation of source and system components in time domain
- Filter coefficients for speech coding
- Pitch estimation by SIFT
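One common way to obtain the filter coefficients is the autocorrelation method solved with the Levinson-Durbin recursion. The slide does not name the method, so treat this as an assumed illustration with hypothetical function names. Inverse filtering the signal with the resulting coefficients gives the LP residual, an approximation of the source, while the coefficients model the system.

```python
def autocorr(x, order):
    """Autocorrelation values r[0..order] of a frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns (a, err) where a[0] = 1 and the prediction error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                # remaining prediction error
    return a, err


def lp_residual(x, a):
    """Inverse filter x with A(z) to obtain the LP residual."""
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(p + 1) if n - j >= 0)
            for n in range(len(x))]


# Assumed example: autocorrelation of a first-order process with pole 0.9.
a, err = levinson_durbin([1.0, 0.9, 0.81], order=2)
print(a, err)
```

For voiced speech the residual retains the pitch pulses with the vocal tract resonances largely removed. SIFT builds on this idea and estimates pitch from an inverse-filtered signal.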

13. Automatic Speech Processing Tasks
- Speech recognition
- Speaker recognition
- Speech synthesis
- Language identification


More information

Video-Conferencing System

Video-Conferencing System Video-Conferencing System Evan Broder and C. Christoher Post Introductory Digital Systems Laboratory November 2, 2007 Abstract The goal of this project is to create a video/audio conferencing system. Video

More information

Noise Removal in Speech Processing Using Spectral Subtraction

Noise Removal in Speech Processing Using Spectral Subtraction Journal of Signal and Information Processing, 24, 5, 32-4 Published Online May 24 in SciRes. http://www.scirp.org/journal/jsip http://dx.doi.org/.4236/jsip.24.526 Noise Removal in Speech Processing Using

More information

RightMark Audio Analyzer

RightMark Audio Analyzer RightMark Audio Analyzer Version 2.5 2001 http://audio.rightmark.org Tests description 1 Contents FREQUENCY RESPONSE TEST... 2 NOISE LEVEL TEST... 3 DYNAMIC RANGE TEST... 5 TOTAL HARMONIC DISTORTION TEST...

More information

Software Defined Radio

Software Defined Radio Software Defined Radio GNU Radio and the USRP Overview What is Software Defined Radio? Advantages of Software Defined Radio Traditional versus SDR Receivers SDR and the USRP Using GNU Radio Introduction

More information

APPLICATION OF FILTER BANK THEORY TO SUBBAND CODING OF IMAGES

APPLICATION OF FILTER BANK THEORY TO SUBBAND CODING OF IMAGES EC 623 ADVANCED DIGITAL SIGNAL PROCESSING TERM-PROJECT APPLICATION OF FILTER BANK THEORY TO SUBBAND CODING OF IMAGES Y. PRAVEEN KUMAR 03010240 KANCHAN MISHRA 03010242 Supervisor: Dr. S.R.M. Prasanna Department

More information

Elec 484 Final Project Report. Marlon Smith

Elec 484 Final Project Report. Marlon Smith Elec 484 Final Project Report Marlon Smith Abstract This report discusses the implementation of a variety of audio effects using a phase vocoder. Effects such as time stretching, pitch shifting, and robotization

More information

Cleaning and Quality Classification of Optically Recorded Voice Signals

Cleaning and Quality Classification of Optically Recorded Voice Signals 6 Recent Patents on Signal Processing, 2010, 2, 6-11 Open Access Cleaning and Quality Classification of Optically Recorded Voice Signals Yevgeny Beiderman 1, Yaniv Azani 2, Yoni Cohen 2, Chen Nisankoren

More information

HERMES: Human Exposure and Radiation Monitoring of Electromagnetic Sources. www.hermes-program.gr

HERMES: Human Exposure and Radiation Monitoring of Electromagnetic Sources. www.hermes-program.gr HERMES Program: The Greek Experience of an Electromagnetic Radiation Monitoring System A. Manassas, A. Boursianis, T. Samaras and J. N. Sahalos HERMES: Human Exposure and Radiation Monitoring of Electromagnetic

More information

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN. zl2211@columbia.edu. ml3088@columbia.edu

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN. zl2211@columbia.edu. ml3088@columbia.edu MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN Zheng Lai Zhao Liu Meng Li Quan Yuan zl2215@columbia.edu zl2211@columbia.edu ml3088@columbia.edu qy2123@columbia.edu I. Overview Architecture The purpose

More information

Digital Audio Recording Analysis The Electric Network Frequency Criterion Catalin Grigoras, Ph.D, IAFPA

Digital Audio Recording Analysis The Electric Network Frequency Criterion Catalin Grigoras, Ph.D, IAFPA Digital Audio Recording Analysis The Electric Network Frequency Criterion Catalin Grigoras, Ph.D, IAFPA INTRODUCTION Over the last years we saw a significant increase in the number of attempts to use digital

More information

Establishing the Uniqueness of the Human Voice for Security Applications

Establishing the Uniqueness of the Human Voice for Security Applications Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.

More information

The Sonometer The Resonant String and Timbre Change after plucking

The Sonometer The Resonant String and Timbre Change after plucking The Sonometer The Resonant String and Timbre Change after plucking EQUIPMENT Pasco sonometers (pick up 5 from teaching lab) and 5 kits to go with them BK Precision function generators and Tenma oscilloscopes

More information

VOR software receiver and decoder with dspic

VOR software receiver and decoder with dspic VOR software receiver and decoder with dspic By Josef Stastny 9-15-2004-1 - 1. Introduction VOR (VHF Omni-directional Radio range) is a radio navigation system used for civil and military navigation of

More information