Algorithm for Detection of Voice Signal Periodicity
1 Algorithm for Detection of Voice Signal Periodicity
Ovidiu Buza, Gavril Toderean, András Balogh
Department of Communications, Technical University of Cluj-Napoca, Romania
József Domokos
Department of Electrical Engineering, Sapientia University, Targu Mures, Romania
2 Algorithm for Detection of Voice Signal Periodicity
An original algorithm for detecting the periodicity of the voice signal.
Main characteristics of the algorithm:
- precise determination of each period within a voiced segment of speech
- accurate detection of pitch interval boundaries
- marking of the glottal peak of each period
The algorithm uses time-domain analysis of the signal -> fast and efficient
3 INTRODUCTION
Some existing methods for pitch detection:
- use the LPC model, detecting peaks in the LPC residual signal [1]
- calculate spectral discontinuities using time-frequency transformations, or detect waveform discontinuities in the corresponding vocal tract signal [2]-[5]
Other methods use autocorrelation, the cepstrum and inverse filtering (SIFT) to estimate signal periodicity [6].
Statistical methods have also been developed [7]-[9] -> they determine mean values of the F0 frequency over a considered frame, but not the precise frame of each signal period.
4 A generic algorithm
Presented by Childers and Hu [1]. The method uses the results of S/U/V segmentation of speech and the prediction error signal generated by LPC analysis.
The waveform periodicity is determined by detecting GCI points, which correspond to physical glottal vibrations -> from the correlated signal Cte(n).
Testing this method: although it provides good results, it introduces errors, especially in frames where the F0 frequency changes rapidly.
[Figure: prediction error elp(n) and correlated signal Cte(n), with the glottal peaks (GP) marked]
5 The Proposed Algorithm
Implements a synchronous time-domain analysis: a pitch-synchronous algorithm.
Gives a precise period determination. Applies to voiced segments of the speech signal.
Pivot Determination: the pivot point is the maximum point of the entire analysed segment -> all subsequent period information is detected starting from it.
Period Estimation: initial estimation of the period around the pivot point.
Glottal Peaks and Hiatus Points Detection: the glottal peaks of all periods situated to the left and to the right of the pivot point are determined.
Period Segmentation: the boundaries of each period are detected.
6 A. Pivot Point Determination
The pivot point is the reference point of the segment.
Establishing the pivot point:
- initial filtering of the signal (median filter)
- zero-crossing points, local minimum and local maximum points are detected (ZeroMinMax algorithm [10])
- pivot point -> the sample point with the highest amplitude among the maximum points, within a distance D from the beginning of the segment:
PIV = max(M_k(i)), k = 0..N, i <= D
- N: the number of local maximum points M_k(i) in the considered segment
- i: the sample index
- D: introduced to limit the calculation for long-duration segments
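The pivot search can be sketched in Python (a minimal sketch: the median pre-filtering is omitted, and `find_local_maxima` is a simple stand-in for the ZeroMinMax algorithm [10]):

```python
import numpy as np

def find_local_maxima(x):
    # Simple stand-in for ZeroMinMax: indices where the signal
    # rises into a sample and does not rise out of it.
    x = np.asarray(x, dtype=float)
    return np.flatnonzero((x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:])) + 1

def find_pivot(x, D):
    # PIV = max(M_k(i)), k = 0..N, i <= D: the highest-amplitude
    # local maximum within the first D samples of the segment.
    x = np.asarray(x, dtype=float)
    maxima = find_local_maxima(x[:D])
    if maxima.size == 0:
        return None
    return int(maxima[np.argmax(x[maxima])])
```

For a segment [0, 1, 0, 3, 0, 2, 0] with D = 7, the local maxima are at indices 1, 3 and 5, and the pivot falls on the sample of amplitude 3 (index 3).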
7 B. First Period Estimation
An initial estimation of the period is made around the pivot point.
First, the local maximum points in the left, M_S(i), and right, M_D(j), vicinity of the pivot point having an amplitude comparable with that of the central point are detected.
If the distances D_1 and D_2 between M_S(i), M_D(j) and the pivot point are approximately equal (most cases) => initial estimation of the period: PER = average(D_1, D_2).
If D_1 and D_2 are quite different (a few cases of sharpened voice) => the value nearest to the median period of the previously processed segment is taken -> this increases robustness.
D_1 = d(PIV, M_S(i))
D_2 = d(PIV, M_D(j))
PER = (D_1 + D_2) / 2
- M_S(i): first local maximum point at the left of the pivot point with comparable magnitude
- M_D(j): first local maximum point at the right of the pivot point with comparable magnitude
- PER: the first estimation of the period
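In code, the first period estimation might look as follows (a sketch: the `amp_ratio` threshold for "comparable amplitude" and the `tol` threshold for "approximately equal" are assumed values, not taken from the slides):

```python
def estimate_first_period(x, maxima, piv, prev_period=None,
                          amp_ratio=0.7, tol=0.2):
    # M_S(i), M_D(j): nearest left/right maxima with amplitude
    # comparable to the pivot's.
    thresh = amp_ratio * x[piv]
    left = [i for i in maxima if i < piv and x[i] >= thresh]
    right = [j for j in maxima if j > piv and x[j] >= thresh]
    if not left or not right:
        return prev_period
    m_s, m_d = left[-1], right[0]
    d1, d2 = piv - m_s, m_d - piv          # D_1 = d(PIV, M_S), D_2 = d(PIV, M_D)
    if abs(d1 - d2) <= tol * max(d1, d2):  # distances approximately equal
        return (d1 + d2) / 2               # PER = average(D_1, D_2)
    if prev_period is None:
        return (d1 + d2) / 2
    # Sharpened voice: take the value nearest the previous segment's period.
    return min((d1, d2), key=lambda d: abs(d - prev_period))
```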
8 C. Detecting Glottal Peaks
All the local maximum points corresponding to glottal peaks are determined, starting from the pivot position and moving to the left and to the right.
Starting from a previous peak M_k-1(j), the next peak M_k(i) is found as follows:
1) first estimate the position of the next peak: the distance from the previous peak is taken equal to the estimated current period P_k-1;
2) determine the local maximum point situated at minimum distance from the estimated position;
3) the current period value P_k is updated according to the position of the point found.
If a maximum point is not found in the expected position:
-> the allowed period duration is exceeded (A), or
-> the signal amplitude is low (B)
=> the next local maximum point is marked as a hiatus (gap):
- period hiatus case (A)
- amplitude hiatus case (B)
9 C. Detecting Glottal Peaks
The condition for determining a glottal peak M_k(i) at iteration k:
D_k = d(M_k-1(j), M_k(i))
|D_k - P_k-1| / P_k-1 <= Δ
- M_k-1(j) is the peak determined at the previous iteration (k-1), situated at sample number j;
- D_k is the distance between the previous peak M_k-1(j) and the current peak M_k(i); k = 1..N_S at the left of the pivot point, k = 1..N_D at the right;
- P_k-1 is the estimated period at iteration k-1, where P_0 was set at step 2 (first period estimation) of the algorithm;
- Δ is the threshold for the relative error between the previously estimated period P_k-1 and the actual distance D_k.
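A rightward version of this peak tracking could be sketched as follows (a simplified sketch: the period is updated with the plain distance D_k rather than a weighted average, and `amp_min` is an assumed amplitude-hiatus threshold):

```python
def track_peaks_right(x, maxima, piv, p0, delta=0.25, amp_min=0.3):
    # Walk rightward from the pivot, accepting a maximum M_k(i) when
    # |D_k - P_{k-1}| / P_{k-1} <= delta; otherwise mark a hiatus.
    peaks, hiatus = [piv], []
    prev, period = piv, p0
    while True:
        expected = prev + period                      # 1) predicted peak position
        window = [i for i in maxima
                  if i > prev and abs(i - expected) <= delta * period]
        if not window:
            later = [i for i in maxima if i > expected]
            if not later:
                break                                 # end of segment
            hiatus.append(('period', later[0]))       # period hiatus (A)
            break
        cand = min(window, key=lambda i: abs(i - expected))  # 2) nearest maximum
        if x[cand] < amp_min * x[piv]:
            hiatus.append(('amplitude', cand))        # amplitude hiatus (B)
            break
        peaks.append(cand)
        prev, period = cand, cand - prev              # 3) update P_k
    return peaks, hiatus
```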
10 C. Detecting Glottal Peaks
After the determination of a glottal peak M_k(i), the current period estimation P_k is updated:
P_k = (P_k-1 * N(k) + D_k) / (N(k) + 1)
-> N(k) is a weighting factor; it can be set
- to the number of periods covered by the previous iterations: N(k) = k - 1, or
- to a constant: N(k) = C
In the current algorithm N = 4 => the estimated current period changes more rapidly, following the variations of signal frequency (due to the speaker's intonation).
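The weighted update is a one-liner; with N = 4 each new distance D_k moves the period estimate by one fifth of the difference:

```python
def update_period(p_prev, d_k, n=4):
    # P_k = (P_{k-1} * N(k) + D_k) / (N(k) + 1)
    return (p_prev * n + d_k) / (n + 1)
```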
11 C. Detecting Glottal Peaks
An example of automatic detection of glottal peaks in a voiced segment of speech. All detected points (1-22) belong to the same voiced region. The algorithm:
a) detects point 9, the pivot point, then
b) settles the left peaks (from 8 down to 1),
c) then the right peaks (points 10-13); after point 13 it could not identify a next peak situated inside or near the estimated period distance => the next peak (14) was marked as a hiatus;
d) the algorithm is then resumed from this point to the end of the segment => a new pivot point (point 19) -> a new period value -> all remaining peaks (15-18 and 20-22).
12 D. Period Segmentation
After all the peaks have been determined => the pitch interval boundaries.
The starting point of each period interval is the first zero-crossing point before the period peak. Each period interval starts at its initial zero point and lasts until the initial zero point of the next interval.
The period interval duration PER_k corresponding to the peak M_k(i) is computed as the distance between the two zero points marked as period interval boundaries:
PER_k = d(Z_k(m), Z_k+1(n))
- Z_k(m): the first zero point preceding M_k(i), situated at sample number m
- Z_k+1(n): the first zero point preceding M_k+1(j), situated at sample number n
In sample units: PER_k = n - m
In the time domain: PER_k(t) = (n - m) / F_es
- F_es is the sampling frequency of the signal.
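The boundary placement can be sketched as follows (a sketch: the "zero point" is taken here as the first sample after the sign change when walking left from each peak):

```python
def segment_periods(x, peaks):
    # Each pitch interval starts at the zero crossing just before its peak
    # and ends at the zero crossing just before the next peak.
    def zero_before(i):
        while i > 0 and x[i - 1] * x[i] > 0:   # walk left until the sign changes
            i -= 1
        return i
    bounds = [zero_before(p) for p in peaks]   # Z_k(m) for each M_k(i)
    # PER_k = d(Z_k(m), Z_{k+1}(n)) = n - m, in samples.
    return bounds, [n - m for m, n in zip(bounds, bounds[1:])]
```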
13 Results Obtained
The algorithm works well both for male and female voices.
Result of pitch interval determination for a voiced segment of speech uttered by a male speaker.
Period durations and pitch frequency for the above speech segment:
- Nw: length of periods in number of samples
- Tw: period duration in milliseconds
- Fw: pitch frequency
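The Nw -> Tw -> Fw conversion follows directly from the sampling frequency (the default `f_es` below is an assumed value; the slides do not state the sampling rate):

```python
def period_stats(nw, f_es=16000):
    # Nw samples -> period duration Tw in ms and pitch frequency Fw in Hz.
    tw_ms = 1000.0 * nw / f_es
    fw_hz = f_es / nw
    return tw_ms, fw_hz
```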
14 Results Obtained
For untainted voices at normal-speed utterance, the accuracy is high -> few differences between manual and automatic period segmentation.
Testing the algorithm on a series of sound files (~20 seconds) => a correct detection rate of over 90% compared to manual segmentation of periods.
For sharpened voices or noisy environments, the signal waveform is very fragmented and rich in high-order harmonics => some variation in detecting period boundaries can be observed.
Because of the uniform manner in which the algorithm detects the glottal peaks and the corresponding boundaries, an error appearing at one specific period tends to be compensated at the next period => a good overall result.
Next phase of research: a comparative study with other methods that detect glottal peaks directly from the waveform signal, such as the Childers and Hu method [1] and the DYPSA algorithm [2].
15 Conclusions
An original algorithm for determining the pitch intervals of a voice signal.
The algorithm is very accurate and works exclusively in the time domain.
Unlike other methods that use the frequency domain, it does not require windowing or complex calculations -> it is very fast.
The method involves four successive steps -> four algorithms have been developed:
- an algorithm for determining the pivot point;
- an algorithm for determining an estimate of the period around the pivot point;
- an algorithm for determining the glottal peaks of the speech segment; this algorithm also detects the hiatus points of the segment and classifies them as period hiatus points or amplitude hiatus points;
- an algorithm for determining the end points of the pitch intervals corresponding to the glottal peaks of the analysed segment.
16 Remarks
The glottal peaks detected by our algorithm are the most significant peaks in each period of the speech signal.
These peaks correspond to glottal closure instants (GCI), but are not identical to them:
- GCI points are computed from the corresponding EGG signal (recorded together with the original speech signal -> a laryngograph is needed);
- the glottal peaks are detected directly from the speech signal waveform.
=> Advantage: pitch intervals can be detected even without an EGG recording (only the signal waveform; vocal databases that do not store the EGG signal).
17 References
1. D.G. Childers, H.T. Hu, "Speech synthesis by glottal excited linear prediction", Journal of the Acoustical Society of America, 1994.
2. P.A. Naylor, A. Kounoudes, J. Gudnason, M. Brookes, "Estimation of glottal closure instants in voiced speech using the DYPSA algorithm", IEEE Transactions on Audio, Speech, and Language Processing, Volume 15, Issue 1, pp. 34-43, January 2007.
3. M.R.P. Thomas, P.A. Naylor, "The SIGMA algorithm: a glottal activity detector for electroglottographic signals", IEEE Transactions on Audio, Speech, and Language Processing, Volume 17, No. 8, November.
4. C. d'Alessandro et al., "Phase-based methods for voice source analysis", in Advances in Nonlinear Speech Processing, International Conference on Nonlinear Speech Processing, NOLISP 2007, Paris, France, pp. 1-27, May 2007.
5. K. Schnell, "Estimation of glottal closure instances from speech signals by weighted nonlinear prediction", in Advances in Nonlinear Speech Processing, International Conference on Nonlinear Speech Processing, NOLISP 2007, Paris, France, May 2007.
6. N.A. Kader, "Pitch detection algorithm using a wavelet correlation model", The 17th National Radio Science Conference (NRSC).
7. S. Sakai, J. Glass, "Fundamental frequency modeling for corpus-based speech synthesis based on a statistical learning technique", Spoken Language System Publications.
8. G. Proakis, C.M. Rader, F. Ling, M. Moonen, I.K. Proudler, C.L. Nikias, Algorithms for Statistical Signal Processing, Prentice Hall, 2002.
9. D. Joho, M. Bennewitz, S. Behnke, "Pitch estimation using models of voiced speech on three levels", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Volume 4.
10. O. Buza, Contributions into Voice Signal Analysis and Text to Speech Synthesis for Romanian, PhD Thesis, Faculty of Electronics and Telecommunications, Cluj-Napoca, Romania, 2010.
18 Algorithm for Detection of Voice Signal Periodicity
Ovidiu Buza, Gavril Toderean, András Balogh
Department of Communications, Technical University of Cluj-Napoca, Romania
József Domokos
Department of Electrical Engineering, Sapientia University, Targu Mures, Romania
More informationVoice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification
Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification (Revision 1.0, May 2012) General VCP information Voice Communication
More informationA WEB BASED TRAINING MODULE FOR TEACHING DIGITAL COMMUNICATIONS
A WEB BASED TRAINING MODULE FOR TEACHING DIGITAL COMMUNICATIONS Ali Kara 1, Cihangir Erdem 1, Mehmet Efe Ozbek 1, Nergiz Cagiltay 2, Elif Aydin 1 (1) Department of Electrical and Electronics Engineering,
More informationThe Computer Music Tutorial
Curtis Roads with John Strawn, Curtis Abbott, John Gordon, and Philip Greenspun The Computer Music Tutorial Corrections and Revisions The MIT Press Cambridge, Massachusetts London, England 2 Corrections
More informationCreating voices for the Festival speech synthesis system.
M. Hood Supervised by A. Lobb and S. Bangay G01H0708 Creating voices for the Festival speech synthesis system. Abstract This project focuses primarily on the process of creating a voice for a concatenative
More informationDigital Signal Controller Based Automatic Transfer Switch
Digital Signal Controller Based Automatic Transfer Switch by Venkat Anant Senior Staff Applications Engineer Freescale Semiconductor, Inc. Abstract: An automatic transfer switch (ATS) enables backup generators,
More informationDetection of Heart Diseases by Mathematical Artificial Intelligence Algorithm Using Phonocardiogram Signals
International Journal of Innovation and Applied Studies ISSN 2028-9324 Vol. 3 No. 1 May 2013, pp. 145-150 2013 Innovative Space of Scientific Research Journals http://www.issr-journals.org/ijias/ Detection
More informationA fast multi-class SVM learning method for huge databases
www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,
More informationPrecision Diode Rectifiers
by Kenneth A. Kuhn March 21, 2013 Precision half-wave rectifiers An operational amplifier can be used to linearize a non-linear function such as the transfer function of a semiconductor diode. The classic
More informationSubjective SNR measure for quality assessment of. speech coders \A cross language study
Subjective SNR measure for quality assessment of speech coders \A cross language study Mamoru Nakatsui and Hideki Noda Communications Research Laboratory, Ministry of Posts and Telecommunications, 4-2-1,
More informationMonophonic Music Recognition
Monophonic Music Recognition Per Weijnitz Speech Technology 5p per.weijnitz@gslt.hum.gu.se 5th March 2003 Abstract This report describes an experimental monophonic music recognition system, carried out
More informationThe LENA TM Language Environment Analysis System:
FOUNDATION The LENA TM Language Environment Analysis System: The Interpreted Time Segments (ITS) File Dongxin Xu, Umit Yapanel, Sharmi Gray, & Charles T. Baer LENA Foundation, Boulder, CO LTR-04-2 September
More informationDigital Speech Coding
Digital Speech Processing David Tipper Associate Professor Graduate Program of Telecommunications and Networking University of Pittsburgh Telcom 2720 Slides 7 http://www.sis.pitt.edu/~dtipper/tipper.html
More informationJitter Transfer Functions in Minutes
Jitter Transfer Functions in Minutes In this paper, we use the SV1C Personalized SerDes Tester to rapidly develop and execute PLL Jitter transfer function measurements. We leverage the integrated nature
More informationController Design in Frequency Domain
ECSE 4440 Control System Engineering Fall 2001 Project 3 Controller Design in Frequency Domain TA 1. Abstract 2. Introduction 3. Controller design in Frequency domain 4. Experiment 5. Colclusion 1. Abstract
More informationIBM Research Report. CSR: Speaker Recognition from Compressed VoIP Packet Stream
RC23499 (W0501-090) January 19, 2005 Computer Science IBM Research Report CSR: Speaker Recognition from Compressed Packet Stream Charu Aggarwal, David Olshefski, Debanjan Saha, Zon-Yin Shae, Philip Yu
More informationElectronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT)
Page 1 Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) ECC RECOMMENDATION (06)01 Bandwidth measurements using FFT techniques
More informationDynamic sound source for simulating the Lombard effect in room acoustic modeling software
Dynamic sound source for simulating the Lombard effect in room acoustic modeling software Jens Holger Rindel a) Claus Lynge Christensen b) Odeon A/S, Scion-DTU, Diplomvej 381, DK-2800 Kgs. Lynby, Denmark
More informationCBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC
CBS RECORDS PROFESSIONAL SERIES CBS RECORDS CD-1 STANDARD TEST DISC 1. INTRODUCTION The CBS Records CD-1 Test Disc is a highly accurate signal source specifically designed for those interested in making
More informationThe Algorithms of Speech Recognition, Programming and Simulating in MATLAB
FACULTY OF ENGINEERING AND SUSTAINABLE DEVELOPMENT. The Algorithms of Speech Recognition, Programming and Simulating in MATLAB Tingxiao Yang January 2012 Bachelor s Thesis in Electronics Bachelor s Program
More informationMembering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN
PAGE 30 Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN Sung-Joon Park, Kyung-Ae Jang, Jae-In Kim, Myoung-Wan Koo, Chu-Shik Jhon Service Development Laboratory, KT,
More informationMeasuring Line Edge Roughness: Fluctuations in Uncertainty
Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as
More informationTiming Errors and Jitter
Timing Errors and Jitter Background Mike Story In a sampled (digital) system, samples have to be accurate in level and time. The digital system uses the two bits of information the signal was this big
More informationTitle : Analog Circuit for Sound Localization Applications
Title : Analog Circuit for Sound Localization Applications Author s Name : Saurabh Kumar Tiwary Brett Diamond Andrea Okerholm Contact Author : Saurabh Kumar Tiwary A-51 Amberson Plaza 5030 Center Avenue
More informationVoice Encoding Methods for Digital Wireless Communications Systems
SOUTHERN METHODIST UNIVERSITY Voice Encoding Methods for Digital Wireless Communications Systems BY Bryan Douglas Street address city state, zip e-mail address Student ID xxx-xx-xxxx EE6302 Section 324,
More informationPortable Time Interval Counter with Picosecond Precision
Portable Time Interval Counter with Picosecond Precision R. SZPLET, Z. JACHNA, K. ROZYC, J. KALISZ Department of Electronic Engineering Military University of Technology Gen. S. Kaliskiego 2, 00-908 Warsaw
More informationANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1
WHAT IS AN FFT SPECTRUM ANALYZER? ANALYZER BASICS The SR760 FFT Spectrum Analyzer takes a time varying input signal, like you would see on an oscilloscope trace, and computes its frequency spectrum. Fourier's
More informationNoise. CIH Review PDC March 2012
Noise CIH Review PDC March 2012 Learning Objectives Understand the concept of the decibel, decibel determination, decibel addition, and weighting Know the characteristics of frequency that are relevant
More informationWaveforms and the Speed of Sound
Laboratory 3 Seth M. Foreman February 24, 2015 Waveforms and the Speed of Sound 1 Objectives The objectives of this excercise are: to measure the speed of sound in air to record and analyze waveforms of
More informationMusic technology. Draft GCE A level and AS subject content
Music technology Draft GCE A level and AS subject content July 2015 Contents The content for music technology AS and A level 3 Introduction 3 Aims and objectives 3 Subject content 4 Recording and production
More informationAdaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP Matthias Pawig, Gerald Enzner, Member, IEEE, and Peter Vary, Fellow, IEEE
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 1, JANUARY 2010 189 Adaptive Sampling Rate Correction for Acoustic Echo Control in Voice-Over-IP Matthias Pawig, Gerald Enzner, Member, IEEE, and Peter
More informationThe Calculation of G rms
The Calculation of G rms QualMark Corp. Neill Doertenbach The metric of G rms is typically used to specify and compare the energy in repetitive shock vibration systems. However, the method of arriving
More informationVEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James
More information