Algorithm for Detection of Voice Signal Periodicity


Ovidiu Buza, Gavril Toderean, András Balogh, Department of Communications, Technical University of Cluj-Napoca, Romania
József Domokos, Department of Electrical Engineering, Sapientia University, Targu Mures, Romania

An original algorithm for detecting the periodicity of the voice signal.
[Figure: voiced segment with local maxima M_k(i), the pivot point PIV and the distance D]
Main characteristics of the algorithm:
- precise determination of each period within a voiced segment of speech
- accurate detection of pitch interval boundaries
- marking of the glottal peak of each period
The algorithm analyses the signal in the time domain, which makes it fast and efficient.

INTRODUCTION
Some existing methods for pitch detection:
- use the LPC model, detecting peaks in the LPC residual signal [1]
- compute spectral discontinuities using time-frequency transformations, or detect waveform discontinuities in the corresponding vocal tract signal [2]-[5]
- use autocorrelation, cepstrum and inverse filtering (SIFT) to estimate signal periodicity [6]
- apply statistical methods ([7]-[9]), which determine mean values of the F0 frequency over a considered frame, but not the precise extent of each signal period.

A generic algorithm
- Presented by Childers and Hu [1].
- The method uses the results of S/U/V (silence/unvoiced/voiced) segmentation of speech and the prediction error signal generated by LPC analysis.
- Waveform periodicity is computed by detecting the GCI (glottal closure instant) points, which correspond to the physical glottal vibrations, from the correlated signal Cte(n).
- In testing, the method gives good results but introduces some errors, especially in frames where the F0 frequency changes rapidly.
[Figure: prediction error elp(n) and correlated signal Cte(n), with the glottal peaks marked GP (GP = glottal peak)]

The Proposed Algorithm
- Implements a synchronous time-domain analysis: a pitch-synchronous algorithm
- Gives a precise period determination
- Applies well to voiced segments of the speech signal
Steps:
- Pivot Determination. The pivot point is the maximum point of the entire analysed segment; all subsequent period information is detected starting from it.
- Period Estimation. An initial estimate of the period is made around the pivot point.
- Glottal Peaks and Hiatus Points Detection. The glottal peaks of all periods to the left and to the right of the pivot point are determined.
- Period Segmentation. The boundaries of each period are detected.

A. Pivot Point Determination
[Figure: local maxima M_k(i), pivot point PIV, distance D]
The pivot point is the reference point of the analysis. It is established as follows:
- the signal is first filtered (median filter);
- zero-crossing points, local minimum points and local maximum points are detected (ZeroMinMax algorithm [10]);
- the pivot point is the sample with the highest amplitude among the local maximum points lying within a distance D from the beginning of the segment:
$PIV = \max(M_k(i)), \quad k = 0, \dots, N; \quad i \le D$
where
- N: the number of local maximum points $M_k(i)$ in the considered segment
- i: the sample index
- D: introduced to limit the computation for long segments
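The pivot search can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `local_maxima` is a hypothetical, simplified stand-in for the ZeroMinMax algorithm [10], the median pre-filtering step is omitted, and `d_limit` plays the role of D.

```python
def local_maxima(x):
    """Indices i where x rises into i and does not rise after it.

    Hypothetical stand-in for the ZeroMinMax detector, which is not
    described in detail in the slides.
    """
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]


def find_pivot(x, d_limit):
    """Index of the highest local maximum among samples with index <= d_limit.

    Returns None when the searched range contains no local maximum.
    """
    candidates = [i for i in local_maxima(x) if i <= d_limit]
    if not candidates:
        return None
    return max(candidates, key=lambda i: x[i])


# Toy signal with local maxima at samples 1, 5, 9 and 11.
signal = [0, 2, 0, -1, 0, 5, 0, -2, 0, 3, 0, 9, 0]
pivot_limited = find_pivot(signal, d_limit=10)  # the peak at 11 is outside D
pivot_full = find_pivot(signal, d_limit=12)
```

With the limit set to 10 samples the largest peak (sample 11) is not considered, which illustrates how D bounds the search on long segments.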

B. First Period Estimation
[Figure: pivot point PIV with the neighbouring maxima M_S(i) and M_D(j) at distances D_1 and D_2]
An initial estimate of the period is made around the pivot point.
- First, the local maximum points in the left, $M_S(i)$, and right, $M_D(j)$, vicinity of the pivot point whose amplitude is comparable with that of the pivot are detected.
- If the distances $D_1$ and $D_2$ from $M_S(i)$ and $M_D(j)$ to the pivot point are approximately equal (most cases), the initial estimate of the period is their average:
$D_1 = d(PIV, M_S(i))$
$D_2 = d(PIV, M_D(j))$
$PER = (D_1 + D_2) / 2$
- If $D_1$ and $D_2$ differ markedly (a few cases of sharpened voice), the value nearest to the median period of the previously processed segment is used, which increases robustness.
where
- $M_S(i)$: the first local maximum point to the left of the pivot point with comparable magnitude (its deviation from PIV is bounded by a threshold)
- $M_D(j)$: the first local maximum point to the right of the pivot point with comparable magnitude (its deviation from PIV is bounded by a threshold)
- PER: the first estimate of the period.
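A possible sketch of this step in Python follows. The slides do not give numeric thresholds, so `amp_ratio` (what counts as a comparable amplitude) and `tol` (what counts as approximately equal distances) are illustrative assumptions, as is returning `None` when no fallback value is available.

```python
def estimate_first_period(x, maxima, pivot, amp_ratio=0.7, tol=0.2,
                          prev_median=None):
    """Initial period PER (in samples) around the pivot, or None.

    maxima      -- ascending indices of local maxima of x
    amp_ratio   -- assumed fraction of the pivot amplitude a neighbour needs
    tol         -- assumed relative tolerance for D1 and D2 to count as equal
    prev_median -- median period of the previously processed segment, if any
    """
    threshold = amp_ratio * x[pivot]
    # First comparable maximum to the left (M_S) and to the right (M_D).
    left = next((i for i in reversed(maxima)
                 if i < pivot and x[i] >= threshold), None)
    right = next((j for j in maxima
                  if j > pivot and x[j] >= threshold), None)
    if left is None or right is None:
        return None
    d1, d2 = pivot - left, right - pivot
    if abs(d1 - d2) <= tol * max(d1, d2):
        return (d1 + d2) / 2          # PER = (D1 + D2) / 2
    if prev_median is not None:
        # Distances disagree: keep the one nearest to the previous median.
        return min((d1, d2), key=lambda d: abs(d - prev_median))
    return None


# Regular case: neighbours at distances 10 and 11 from the pivot.
x_regular = [0] * 40
x_regular[10], x_regular[20], x_regular[31] = 8, 10, 9
per_regular = estimate_first_period(x_regular, [10, 20, 31], pivot=20)

# Irregular case: distances 10 and 20, resolved with the previous median.
x_irregular = [0] * 50
x_irregular[10], x_irregular[20], x_irregular[40] = 8, 10, 9
per_fallback = estimate_first_period(x_irregular, [10, 20, 40], pivot=20,
                                     prev_median=11)
```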

C. Detecting Glottal Peaks
[Figure: peaks M_{k-1}(j) and M_k(i) separated by the distance D_k, with estimated period P_{k-1}; k runs from 0 at the pivot PIV to N_S on the left and to N_D on the right]
All local maximum points corresponding to glottal peaks are determined, moving from the pivot position to the left and to the right. Starting from a previous peak $M_{k-1}(j)$, the next peak $M_k(i)$ is found as follows:
1) estimate the position of the next peak: its distance from the previous peak equals the current estimated period $P_{k-1}$;
2) find the local maximum point closest to the estimated position;
3) update the current period value $P_k$ according to the position of the point just found.
If no maximum point is found at the expected position, because (A) the allowed period duration is exceeded or (B) the signal amplitude is low, the next local maximum point is marked as a hiatus (gap):
- period hiatus in case (A)
- amplitude hiatus in case (B).

C. Detecting Glottal Peaks
The condition for accepting a glottal peak $M_k(i)$ at iteration k:
$D_k = d(M_{k-1}(j), M_k(i))$
$|D_k - P_{k-1}| / P_{k-1} < \Delta$
where
- $M_{k-1}(j)$ is the peak determined at the previous iteration (k-1), located at sample number j;
- $D_k$ is the distance between the previous peak $M_{k-1}(j)$ and the current peak $M_k(i)$, with $k = 1, \dots, N_S$ to the left of the pivot point and $k = 1, \dots, N_D$ to the right of it;
- $P_{k-1}$ is the estimated period at iteration k-1, where $P_0$ is set in step 2 of the algorithm (first period estimation);
- $\Delta$ is the threshold for the relative error between the previous estimated period $P_{k-1}$ and the actual distance $D_k$.
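The acceptance test and the two hiatus cases might be combined as in the sketch below. The threshold values `delta` and `min_amplitude`, the order of the two checks, and the function name are assumptions made for illustration; the slides only state that a relative-error threshold and an amplitude criterion exist.

```python
def classify_candidate(d_k, p_prev, amplitude, delta=0.25, min_amplitude=0.1):
    """Classify the next local maximum relative to the previous peak.

    d_k           -- distance D_k from the previous peak, in samples
    p_prev        -- estimated period P_{k-1}, in samples
    amplitude     -- amplitude of the candidate maximum
    delta         -- assumed threshold on the relative period error
    min_amplitude -- assumed lower bound on a usable peak amplitude
    """
    relative_error = abs(d_k - p_prev) / p_prev
    if relative_error >= delta:
        return "period hiatus"       # case (A): outside the allowed period
    if amplitude < min_amplitude:
        return "amplitude hiatus"    # case (B): signal amplitude too low
    return "glottal peak"


accepted = classify_candidate(d_k=102, p_prev=100, amplitude=0.8)
too_far = classify_candidate(d_k=150, p_prev=100, amplitude=0.8)
too_weak = classify_candidate(d_k=100, p_prev=100, amplitude=0.01)
```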

C. Detecting Glottal Peaks
After a glottal peak $M_k(i)$ has been determined, the current period estimate $P_k$ is updated:
$P_k = (P_{k-1} \cdot N(k) + D_k) / (N(k) + 1)$
N(k) is a weighting factor:
- it can be set to the number of periods covered by the previous iterations: N(k) = k - 1, or
- it can be set to a constant: N(k) = C
The current algorithm uses N = 4, so that the estimated period adapts more quickly to variations of the signal frequency (due to the speaker's intonation).
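The update rule is a weighted running average, which a short Python sketch makes concrete. Only the formula and the value N = 4 come from the slides; the function name and the sample distances are illustrative.

```python
def update_period(p_prev, d_k, n=4):
    """P_k = (P_{k-1} * N + D_k) / (N + 1), with constant weight N (default 4)."""
    return (p_prev * n + d_k) / (n + 1)


# Track a slowly lengthening period, starting from P_0 = 100 samples.
distances = [100, 105, 110]
period = 100.0
history = []
for d_k in distances:
    period = update_period(period, d_k)
    history.append(period)
```

A small N gives each new distance a weight of 1/(N + 1), so the estimate follows intonation changes quickly; choosing N(k) = k - 1 instead would weight new distances less and less as k grows.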

C. Detecting Glottal Peaks
An example of automatic detection of glottal peaks in a voiced segment of speech.
[Figure: voiced segment with detected points 1 to 22]
All detected points (1 to 22) belong to the same voiced region. The algorithm proceeds as follows:
a) it detects point 9, the pivot point;
b) it sets the left peaks (from 8 down to 1);
c) it sets the right peaks (points 10 to 13); after point 13 it cannot identify a next peak at or near the estimated period distance, so the next peak (14) is marked as a hiatus;
d) the algorithm is resumed from this point to the end of the segment: a new pivot point (point 19) gives a new period value, from which all remaining peaks (15 to 18 and 20 to 22) are found.

D. Period Segmentation
[Figure: peaks M_k(i) and M_{k+1}(j), preceding zero points Z_k(m) and Z_{k+1}(n), period interval PER_k]
After all the peaks have been determined, the pitch interval boundaries are set:
- The starting point of each period interval is the first zero-crossing point before the period peak.
- Each period interval starts at its initial zero point and lasts until the initial zero point of the next interval.
- The period interval duration $PER_k$ corresponding to the peak $M_k(i)$ is computed as the distance between the two zero points that mark the period interval boundaries:
$PER_k = d(Z_k(m), Z_{k+1}(n))$
where
- $Z_k(m)$: the first zero point preceding $M_k(i)$, located at sample number m
- $Z_{k+1}(n)$: the first zero point preceding $M_{k+1}(j)$, located at sample number n
In sample units: $PER_k = n - m$
In the time domain: $PER_k(t) = (n - m) / F_{es}$
- $F_{es}$ is the sampling frequency of the signal.
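The segmentation step can be illustrated with the following Python sketch. It assumes that the glottal peaks are already known and that a zero point is an upward (negative to non-negative) crossing; the slides do not state the crossing direction, and both function names are hypothetical.

```python
def preceding_zero_crossing(x, peak):
    """First index at or before `peak` where x crosses zero upwards, or None."""
    for i in range(peak, 0, -1):
        if x[i - 1] < 0 <= x[i]:
            return i
    return None


def segment_periods(x, peaks, fs):
    """Return (start, end, samples, milliseconds, hertz) for each period.

    Each period runs from the zero point preceding one peak to the zero
    point preceding the next peak; fs is the sampling frequency in Hz.
    """
    zeros = [preceding_zero_crossing(x, p) for p in peaks]
    periods = []
    for z0, z1 in zip(zeros, zeros[1:]):
        if z0 is None or z1 is None or z1 <= z0:
            continue  # no usable boundary pair for this peak
        n_samples = z1 - z0                      # PER_k = n - m
        periods.append((z0, z1, n_samples,
                        1000.0 * n_samples / fs,  # PER_k(t) in ms
                        fs / n_samples))          # pitch frequency in Hz
    return periods


# Three identical 8-sample periods with peaks at samples 2, 10 and 18.
one_period = [-1, 1, 3, 1, -1, -3, -2, -1]
wave = one_period * 3
result = segment_periods(wave, peaks=[2, 10, 18], fs=8000)
```

At a sampling frequency of 8000 Hz, an 8-sample period corresponds to 1 ms and a pitch of 1000 Hz.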

Results Obtained
The algorithm works well for both male and female voices.
[Figure: result of pitch interval determination for a voiced segment of speech uttered by a male speaker]
[Table: period durations and pitch frequency for the above speech segment]
- Nw: length of the period in number of samples
- Tw: period duration in milliseconds
- Fw: pitch frequency.

Results Obtained
- For untainted voices at normal speaking rate, accuracy is high: there are few differences between manual and automatic period segmentation.
- Testing the algorithm on a series of sound files (~20 seconds) gave a correct detection rate of over 90% compared with manual segmentation of periods.
- For sharpened voices or noisy environments, the signal waveform is very fragmented and rich in high-order harmonics, so some variation in the detected period boundaries can be observed.
- Because the algorithm detects the glottal peaks and the corresponding boundaries in a uniform manner, an error that appears at one period tends to be compensated at the next period, giving a good overall result.
- Next phase of research: a comparative study with other methods that detect glottal peaks directly from the waveform signal, such as the Childers and Hu method [1] and the DYPSA algorithm [2].
[1] D.G. Childers, H.T. Hu, Speech synthesis by glottal excited linear prediction, Journal of the Acoustical Society of America, 1994.
[2] P.A. Naylor, A. Kounoudes, J. Gudnason, M. Brookes, Estimation of glottal closure instants in voiced speech using the DYPSA algorithm, IEEE Transactions on Audio, Speech, and Language Processing, Volume 15, Issue 1, pp. 34-43, January 2007.

Conclusions
- An original algorithm for determining the pitch intervals of a voice signal.
- The algorithm is very accurate and works exclusively in the time domain.
- Unlike methods that use the frequency domain, it does not require windowing or complex calculations, so it is very fast.
- The method involves four successive steps, for which four algorithms have been developed:
- an algorithm for determining the pivot point;
- an algorithm for determining an estimate of the period around the pivot point;
- an algorithm for determining the glottal peaks of the speech segment; it also detects the hiatus points of the segment and classifies them as period hiatus points or amplitude hiatus points;
- an algorithm for determining the end points of the pitch intervals corresponding to the glottal peaks of the analysed segment.

Remarks
- The glottal peaks detected by our algorithm are the most significant peaks in each period of the speech signal.
- These peaks correspond to the glottal closure instants (GCI), but are not identical to them:
> the GCI points are computed from the corresponding EGG signal (recorded together with the original speech signal, which requires a laryngograph)
> the glottal peaks are detected directly from the speech signal waveform
- Advantage: pitch intervals can be detected even when no EGG recording is available (waveform-only signals; vocal databases that do not store EGG).
[Figure: EGG signal]

References
1. D.G. Childers, H.T. Hu, Speech synthesis by glottal excited linear prediction, Journal of the Acoustical Society of America, 1994.
2. P.A. Naylor, A. Kounoudes, J. Gudnason, M. Brookes, Estimation of glottal closure instants in voiced speech using the DYPSA algorithm, IEEE Transactions on Audio, Speech, and Language Processing, Volume 15, Issue 1, pp. 34-43, January 2007.
3. M.R.P. Thomas, P.A. Naylor, The SIGMA algorithm: a glottal activity detector for electroglottographic signals, IEEE Transactions on Audio, Speech, and Language Processing, Volume 17, No. 8, November.
4. C. d'Alessandro et al., Phase-based methods for voice source analysis, in Advances in Nonlinear Speech Processing, International Conference on Nonlinear Speech Processing, NOLISP 2007, Paris, France, pp. 1-27, May 2007.
5. K. Schnell, Estimation of glottal closure instances from speech signals by weighted nonlinear prediction, in Advances in Nonlinear Speech Processing, International Conference on Nonlinear Speech Processing, NOLISP 2007, Paris, France, May 2007.
6. N.A. Kader, Pitch detection algorithm using a wavelet correlation model, The 17th National Radio Science Conference (NRSC).
7. S. Sakai, J. Glass, Fundamental frequency modeling for corpus-based speech synthesis based on a statistical learning technique, Spoken Language System Publications.
8. G. Proakis, C. M. Rader, F. Ling, M. Moonen, I. K. Proudler, C. L. Nikias, Algorithms for Statistical Signal Processing, Prentice Hall, 2002.
9. D. Joho, M. Bennewitz, S. Behnke, Pitch estimation using models of voiced speech on three levels, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Volume 4.
10. O. Buza, Contributions into Voice Signal Analysis and Text to Speech Synthesis for Romanian, PhD Thesis, Faculty of Electronics and Telecommunications, Cluj-Napoca, Romania, 2010.



More information

Audio Content Analysis for Online Audiovisual Data Segmentation and Classification

Audio Content Analysis for Online Audiovisual Data Segmentation and Classification IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 4, MAY 2001 441 Audio Content Analysis for Online Audiovisual Data Segmentation and Classification Tong Zhang, Member, IEEE, and C.-C. Jay

More information

Myanmar Continuous Speech Recognition System Based on DTW and HMM

Myanmar Continuous Speech Recognition System Based on DTW and HMM Myanmar Continuous Speech Recognition System Based on DTW and HMM Ingyin Khaing Department of Information and Technology University of Technology (Yatanarpon Cyber City),near Pyin Oo Lwin, Myanmar Abstract-

More information

have more skill and perform more complex

have more skill and perform more complex Speech Recognition Smartphone UI Speech Recognition Technology and Applications for Improving Terminal Functionality and Service Usability User interfaces that utilize voice input on compact devices such

More information

Construct User Guide

Construct User Guide Construct User Guide Contents Contents 1 1 Introduction 2 1.1 Construct Features..................................... 2 1.2 Speech Licenses....................................... 3 2 Scenario Management

More information

Overview of the research results on Voice over IP

Overview of the research results on Voice over IP Overview of the research results on Voice over IP F. Beritelli (Università di Catania) C. Casetti (Politecnico di Torino) S. Giordano (Università di Pisa) R. Lo Cigno (Politecnico di Torino) 1. Introduction

More information

A Study on Relationship between Power of Adapter and Total Harmonic Distortion of Earphone s leaking Sound

A Study on Relationship between Power of Adapter and Total Harmonic Distortion of Earphone s leaking Sound , pp.201-208 http://dx.doi.org/10.14257/ijmue.2014.9.6.20 A Study on Relationship between Power of Adapter and Total Harmonic Distortion of Earphone s leaking Sound Eun-Young Yi, Chan-Joong Jung and Myung-Jin

More information

Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables. Generation of random variables (r.v.) Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

More information

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction : A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)

More information

Carla Simões, t-carlas@microsoft.com. Speech Analysis and Transcription Software

Carla Simões, t-carlas@microsoft.com. Speech Analysis and Transcription Software Carla Simões, t-carlas@microsoft.com Speech Analysis and Transcription Software 1 Overview Methods for Speech Acoustic Analysis Why Speech Acoustic Analysis? Annotation Segmentation Alignment Speech Analysis

More information

Simple Voice over IP (VoIP) Implementation

Simple Voice over IP (VoIP) Implementation Simple Voice over IP (VoIP) Implementation ECE Department, University of Florida Abstract Voice over IP (VoIP) technology has many advantages over the traditional Public Switched Telephone Networks. In

More information

STUDY REGARDING THE USE OF THE TOOLS OFFERED BY MICROSOFT EXCEL SOFTWARE IN THE ACTIVITY OF THE BIHOR COUNTY COMPANIES

STUDY REGARDING THE USE OF THE TOOLS OFFERED BY MICROSOFT EXCEL SOFTWARE IN THE ACTIVITY OF THE BIHOR COUNTY COMPANIES STUDY REGARDING THE USE OF THE TOOLS OFFERED BY MICROSOFT EXCEL SOFTWARE IN THE ACTIVITY OF THE BIHOR COUNTY COMPANIES Ţarcă Naiana 1, Popa Adela 2 1 Faculty of Economics, University of Oradea, Oradea,

More information

SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY

SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY 3 th World Conference on Earthquake Engineering Vancouver, B.C., Canada August -6, 24 Paper No. 296 SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY ASHOK KUMAR SUMMARY One of the important

More information

Linear Predictive Coding

Linear Predictive Coding Linear Predictive Coding Jeremy Bradbury December 5, 2000 0 Outline I. Proposal II. Introduction A. Speech Coding B. Voice Coders C. LPC Overview III. Historical Perspective of Linear Predictive Coding

More information

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification (Revision 1.0, May 2012) General VCP information Voice Communication

More information

A WEB BASED TRAINING MODULE FOR TEACHING DIGITAL COMMUNICATIONS

A WEB BASED TRAINING MODULE FOR TEACHING DIGITAL COMMUNICATIONS A WEB BASED TRAINING MODULE FOR TEACHING DIGITAL COMMUNICATIONS Ali Kara 1, Cihangir Erdem 1, Mehmet Efe Ozbek 1, Nergiz Cagiltay 2, Elif Aydin 1 (1) Department of Electrical and Electronics Engineering,

More information

The Computer Music Tutorial

The Computer Music Tutorial Curtis Roads with John Strawn, Curtis Abbott, John Gordon, and Philip Greenspun The Computer Music Tutorial Corrections and Revisions The MIT Press Cambridge, Massachusetts London, England 2 Corrections

More information

Creating voices for the Festival speech synthesis system.

Creating voices for the Festival speech synthesis system. M. Hood Supervised by A. Lobb and S. Bangay G01H0708 Creating voices for the Festival speech synthesis system. Abstract This project focuses primarily on the process of creating a voice for a concatenative

More information

Digital Signal Controller Based Automatic Transfer Switch

Digital Signal Controller Based Automatic Transfer Switch Digital Signal Controller Based Automatic Transfer Switch by Venkat Anant Senior Staff Applications Engineer Freescale Semiconductor, Inc. Abstract: An automatic transfer switch (ATS) enables backup generators,

More information

Detection of Heart Diseases by Mathematical Artificial Intelligence Algorithm Using Phonocardiogram Signals

Detection of Heart Diseases by Mathematical Artificial Intelligence Algorithm Using Phonocardiogram Signals International Journal of Innovation and Applied Studies ISSN 2028-9324 Vol. 3 No. 1 May 2013, pp. 145-150 2013 Innovative Space of Scientific Research Journals http://www.issr-journals.org/ijias/ Detection

More information

A fast multi-class SVM learning method for huge databases

A fast multi-class SVM learning method for huge databases www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,

More information

Precision Diode Rectifiers

Precision Diode Rectifiers by Kenneth A. Kuhn March 21, 2013 Precision half-wave rectifiers An operational amplifier can be used to linearize a non-linear function such as the transfer function of a semiconductor diode. The classic

More information

Subjective SNR measure for quality assessment of. speech coders \A cross language study

Subjective SNR measure for quality assessment of. speech coders \A cross language study Subjective SNR measure for quality assessment of speech coders \A cross language study Mamoru Nakatsui and Hideki Noda Communications Research Laboratory, Ministry of Posts and Telecommunications, 4-2-1,

More information

Monophonic Music Recognition

Monophonic Music Recognition Monophonic Music Recognition Per Weijnitz Speech Technology 5p per.weijnitz@gslt.hum.gu.se 5th March 2003 Abstract This report describes an experimental monophonic music recognition system, carried out

More information

The LENA TM Language Environment Analysis System:

The LENA TM Language Environment Analysis System: FOUNDATION The LENA TM Language Environment Analysis System: The Interpreted Time Segments (ITS) File Dongxin Xu, Umit Yapanel, Sharmi Gray, & Charles T. Baer LENA Foundation, Boulder, CO LTR-04-2 September

More information

Digital Speech Coding

Digital Speech Coding Digital Speech Processing David Tipper Associate Professor Graduate Program of Telecommunications and Networking University of Pittsburgh Telcom 2720 Slides 7 http://www.sis.pitt.edu/~dtipper/tipper.html

More information

Jitter Transfer Functions in Minutes

Jitter Transfer Functions in Minutes Jitter Transfer Functions in Minutes In this paper, we use the SV1C Personalized SerDes Tester to rapidly develop and execute PLL Jitter transfer function measurements. We leverage the integrated nature

More information

Controller Design in Frequency Domain

Controller Design in Frequency Domain ECSE 4440 Control System Engineering Fall 2001 Project 3 Controller Design in Frequency Domain TA 1. Abstract 2. Introduction 3. Controller design in Frequency domain 4. Experiment 5. Colclusion 1. Abstract

More information

IBM Research Report. CSR: Speaker Recognition from Compressed VoIP Packet Stream

IBM Research Report. CSR: Speaker Recognition from Compressed VoIP Packet Stream RC23499 (W0501-090) January 19, 2005 Computer Science IBM Research Report CSR: Speaker Recognition from Compressed Packet Stream Charu Aggarwal, David Olshefski, Debanjan Saha, Zon-Yin Shae, Philip Yu

More information

Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT)

Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) Page 1 Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) ECC RECOMMENDATION (06)01 Bandwidth measurements using FFT techniques

More information

Dynamic sound source for simulating the Lombard effect in room acoustic modeling software

Dynamic sound source for simulating the Lombard effect in room acoustic modeling software Dynamic sound source for simulating the Lombard effect in room acoustic modeling software Jens Holger Rindel a) Claus Lynge Christensen b) Odeon A/S, Scion-DTU, Diplomvej 381, DK-2800 Kgs. Lynby, Denmark

More information

CBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC

CBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC CBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC 1. INTRODUCTION The CBS Records CD-1 Test Disc is a highly accurate signal source specifically designed for those interested in making

More information

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB FACULTY OF ENGINEERING AND SUSTAINABLE DEVELOPMENT. The Algorithms of Speech Recognition, Programming and Simulating in MATLAB Tingxiao Yang January 2012 Bachelor s Thesis in Electronics Bachelor s Program

More information

Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN

Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN PAGE 30 Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN Sung-Joon Park, Kyung-Ae Jang, Jae-In Kim, Myoung-Wan Koo, Chu-Shik Jhon Service Development Laboratory, KT,

More information

Measuring Line Edge Roughness: Fluctuations in Uncertainty

Measuring Line Edge Roughness: Fluctuations in Uncertainty Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as

More information

Timing Errors and Jitter

Timing Errors and Jitter Timing Errors and Jitter Background Mike Story In a sampled (digital) system, samples have to be accurate in level and time. The digital system uses the two bits of information the signal was this big

More information

Title : Analog Circuit for Sound Localization Applications

Title : Analog Circuit for Sound Localization Applications Title : Analog Circuit for Sound Localization Applications Author s Name : Saurabh Kumar Tiwary Brett Diamond Andrea Okerholm Contact Author : Saurabh Kumar Tiwary A-51 Amberson Plaza 5030 Center Avenue

More information

Voice Encoding Methods for Digital Wireless Communications Systems

Voice Encoding Methods for Digital Wireless Communications Systems SOUTHERN METHODIST UNIVERSITY Voice Encoding Methods for Digital Wireless Communications Systems BY Bryan Douglas Street address city state, zip e-mail address Student ID xxx-xx-xxxx EE6302 Section 324,

More information

Portable Time Interval Counter with Picosecond Precision

Portable Time Interval Counter with Picosecond Precision Portable Time Interval Counter with Picosecond Precision R. SZPLET, Z. JACHNA, K. ROZYC, J. KALISZ Department of Electronic Engineering Military University of Technology Gen. S. Kaliskiego 2, 00-908 Warsaw

More information

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1 WHAT IS AN FFT SPECTRUM ANALYZER? ANALYZER BASICS The SR760 FFT Spectrum Analyzer takes a time varying input signal, like you would see on an oscilloscope trace, and computes its frequency spectrum. Fourier's

More information

Noise. CIH Review PDC March 2012

Noise. CIH Review PDC March 2012 Noise CIH Review PDC March 2012 Learning Objectives Understand the concept of the decibel, decibel determination, decibel addition, and weighting Know the characteristics of frequency that are relevant

More information

Waveforms and the Speed of Sound

Waveforms and the Speed of Sound Laboratory 3 Seth M. Foreman February 24, 2015 Waveforms and the Speed of Sound 1 Objectives The objectives of this excercise are: to measure the speed of sound in air to record and analyze waveforms of

More information

Music technology. Draft GCE A level and AS subject content

Music technology. Draft GCE A level and AS subject content Music technology Draft GCE A level and AS subject content July 2015 Contents The content for music technology AS and A level 3 Introduction 3 Aims and objectives 3 Subject content 4 Recording and production

More information

Adaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP Matthias Pawig, Gerald Enzner, Member, IEEE, and Peter Vary, Fellow, IEEE

Adaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP Matthias Pawig, Gerald Enzner, Member, IEEE, and Peter Vary, Fellow, IEEE IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 1, JANUARY 2010 189 Adaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP Matthias Pawig, Gerald Enzner, Member, IEEE, and Peter

More information

The Calculation of G rms

The Calculation of G rms The Calculation of G rms QualMark Corp. Neill Doertenbach The metric of G rms is typically used to specify and compare the energy in repetitive shock vibration systems. However, the method of arriving

More information

VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS

VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James

More information