
Marquette University
Electrical and Computer Engineering Faculty Research and Publications
Electrical and Computer Engineering, Department of

2014

Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors

Marek B. Trawicki, Marquette University
Michael T. Johnson, Marquette University

Part of the Computer Engineering Commons, and the Electrical and Computer Engineering Commons

Recommended Citation
Trawicki, Marek B. and Johnson, Michael T., "Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors" (2014). Electrical and Computer Engineering Faculty Research and Publications.

Marquette University
Electrical and Computer Engineering Faculty Research and Publications/College of Engineering

This paper is NOT THE PUBLISHED VERSION; but the author's final, peer-reviewed manuscript. The published version may be accessed by following the link in the citation below.

Speech Communication, Vol. 57 (February 2014): DOI. This article is © Elsevier and permission has been granted for this version to appear in e-Publications@Marquette. Elsevier does not grant permission for this article to be further copied/distributed or hosted elsewhere without the express permission from Elsevier.

Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors

Marek B. Trawicki
Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, Milwaukee, WI

Michael T. Johnson
Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, Milwaukee, WI

Abstract

In this paper, the authors propose new perceptually-motivated Weighted Euclidean (WE) and Weighted Cosh (WCOSH) estimators that utilize more appropriate Chi statistical models for the speech prior with Gaussian statistical models for the noise likelihood. Whereas the perceptually-motivated WE and WCOSH cost functions emphasized spectral valleys rather than spectral peaks (formants) and indirectly accounted for auditory masking effects, the incorporation of the Chi distribution statistical models demonstrated distinct improvement over the Rayleigh statistical models for the speech prior. The estimators incorporate both weighting law and shape parameters on the cost functions and distributions. Performance is evaluated in terms of the Segmental Signal-to-Noise Ratio (SSNR), Perceptual Evaluation of Speech Quality (PESQ), and Signal-to-Noise Ratio (SNR) Loss objective quality measures to determine the amount of noise reduction along with overall speech quality and speech intelligibility improvement. Based on experimental results across three different input SNRs and eight unique noises along with various weighting law and shape parameters, the two general, less-complicated, closed-form estimators of WE and WCOSH with Chi speech priors provide significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility over the baseline WE and WCOSH estimators with the standard Rayleigh speech priors. Overall, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi distributions for the speech prior to improve enhancement.

Keywords

Speech enhancement, Probability, Amplitude estimation, Phase estimation, Parameter estimation

1. Introduction

Speech enhancement systems concern themselves with reducing the corrupting background noise in the noisy signal (Loizou, 2007).
The most common approach is to perform statistical estimation: minimizing the Bayes risk of the squared-error spectral amplitude cost function leads to the traditional Ephraim and Malah Minimum Mean-Square Error (MMSE) short-time spectral amplitude (STSA) estimator (Ephraim and Malah, 1984). Based on the effectiveness of that STSA estimator, researchers began to replace the squared-error spectral amplitude cost function with more subjectively meaningful cost functions. Ephraim and Malah (1985) also developed and implemented the MMSE log-spectral amplitude (LSA) estimator, which minimizes the squared error of the log-spectral amplitude, a more subjectively meaningful cost function that correlates well with human perception. From the STSA and LSA cost functions, Loizou (2005) constructed several perceptually-motivated spectral amplitude cost functions that emphasized spectral valleys rather than spectral peaks (formants) and indirectly accounted for auditory masking effects. Specifically, the Weighted Euclidean (WE) and Weighted Cosh (WCOSH) Bayesian estimators, which applied a weighting law parameter to the STSA cost function, performed best at reducing residual noise and producing better speech quality. In each of the corresponding spectral amplitude, log-spectral amplitude, and perceptually-motivated spectral amplitude estimators, the cost functions employed Rayleigh distributions for the statistical models of the speech priors and noise likelihoods. Eventually, researchers began to exploit alternative and more accurate statistical modeling assumptions than the Rayleigh distribution for both the speech prior and noise likelihood using the STSA cost function. Andrianakis and White (2009) continued with the MMSE spectral amplitude estimators using the Gamma distribution but introduced the Chi distribution for modeling the speech priors.
The Chi speech prior contains a shaping parameter that was varied to determine its effect on the quality of enhanced speech. From the results, the performance of the estimators was dependent on the shaping parameter, which controlled the trade-off between the level of residual noise and musical tones. As a generalization of Ephraim and Malah's MMSE STSA and LSA estimators along with Andrianakis and White's Chi distribution speech priors, Breithaupt et al. (2008) developed an MMSE STSA estimator that uses both a variable compression function in the error criterion and the Chi distribution as a prior model. The resulting two parameters provide for the reduction of musical noise, speech distortion, and noise distortion. Through the incorporation of Chi distribution statistical models for the speech prior, the squared-error cost functions demonstrated distinct improvement over the Rayleigh statistical models.

Despite the success of the spectral amplitude, log-spectral amplitude, and perceptually-motivated cost functions with Rayleigh statistical models and of the spectral amplitude cost functions with Chi distributions for the speech priors, there has not been any work to capitalize on their mutual benefits for speech enhancement. Specifically, the improved statistical models for the speech prior have only been incorporated with the original MMSE STSA estimator, not with the perceptually-motivated spectral amplitude (WE and WCOSH) cost functions. The fundamental purpose is to determine the effect that more accurate speech priors would have on improved cost functions for noise reduction. Instead of utilizing the Rayleigh distribution for the speech prior, the Chi distribution is employed in this work since it leads to more general, less complicated, closed-form estimator solutions. For specific values of the shaping parameter, the Chi distribution reduces to the half-Gaussian and Rayleigh distributions as special cases. Therefore, the focus of this work is to use the MMSE WE and WCOSH estimators with the Chi spectral speech prior distribution (Johnson et al., 1994) for reducing the background noise along with improving overall speech quality and speech intelligibility. The remainder of this paper is organized into the following sections: system and statistical models (Section 2), perceptually-motivated cost functions with Chi speech priors (Section 3), experiments and results (Section 4), and conclusion (Section 5).

2. System and statistical models

In the time domain, the single channel additive noise model is given as

$$y(t) = s(t) + d(t), \tag{1}$$

where $s(t)$, $d(t)$, and $y(t)$ represent the clean, noise, and noisy signals.
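By taking the short-time Fourier transform, (1) can be written in the frequency domain as

$$Y(l,k) = S(l,k) + D(l,k)$$

$$R(l,k)\,e^{j\vartheta(l,k)} = X(l,k)\,e^{j\alpha(l,k)} + N(l,k)\,e^{j\theta(l,k)}, \tag{2}$$

where $l$ and $k$ are the frame and frequency bin indices, $R$, $X$, and $N$ are the noisy, clean, and noise spectral amplitudes, and $\vartheta$, $\alpha$, and $\theta$ are the noisy, clean, and noise spectral phases. As opposed to using the traditional Rayleigh statistical models for both the speech prior and noise likelihood, the traditional Rayleigh speech prior given as

$$p(X,\alpha) = \frac{X}{\pi\sigma_X^2}\exp\left(-\frac{X^2}{\sigma_X^2}\right) \tag{3}$$

is modified through the use of Chi speech priors (Johnson et al., 1994), where $\sigma_X^2$ is the speech spectral variance. Specifically, the Chi speech prior is given as

$$p(X,\alpha) = \frac{2}{\theta^{a}\,\Gamma(a)}\,X^{2a-1}\exp\left(-\frac{X^2}{\theta}\right), \tag{4}$$

where $\sigma_X^2 = \theta a$ with shape parameter $a$ and scaling parameter $\theta$, and $\Gamma(\cdot)$ is the gamma function. With $a = 0.5$ and $a = 1$, (4) is equivalent to the half-Gaussian and Rayleigh distributions, respectively. The noise likelihood is still modeled as a Gaussian distribution given as

$$p(Y \mid X,\alpha) = \frac{1}{\pi\sigma_N^2}\exp\left(-\frac{\left|Y - Xe^{j\alpha}\right|^2}{\sigma_N^2}\right), \tag{5}$$

where $\sigma_N^2$ is the noise spectral variance. In order to simplify the notation, $\lambda = \sigma^2$, $\lambda_X = \sigma_X^2$, and $\lambda_N = \sigma_N^2$ are used as the spectral variances in the derivation of the WE and WCOSH estimators with Chi speech prior.

3. Perceptually-motivated cost functions with Chi speech priors

3.1. Weighted Euclidean (WE)

From the work in Loizou (2005), the Weighted Euclidean (WE) cost function is given as

$$d_{WE}(X,\hat{X}) = (X - \hat{X})^2\,X^{p} \tag{6}$$

with estimator equation

$$\hat{X}_{WE} = \frac{\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}{\int_0^\infty\!\int_0^{2\pi} X^{p}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}, \tag{7}$$

where $p$ is the weighting law parameter. For $p = 0$, (7) is equivalent to the MMSE STSA estimator (Ephraim and Malah, 1984; Loizou, 2005; Gray et al., 1980). Through the substitution of the statistical models in (4) and (5) and using integral identities in Gradshteyn and Ryzhik (2007), the spectral phase is integrated out of the two integrals as

$$\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \int_0^\infty X^{p+2a}\exp\left(-\frac{X^2}{\tilde{\lambda}}\right) J_0\!\left(2iX\sqrt{\frac{v}{\lambda}}\right)dX \tag{8}$$

and

$$\int_0^\infty\!\int_0^{2\pi} X^{p}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \int_0^\infty X^{p+2a-1}\exp\left(-\frac{X^2}{\tilde{\lambda}}\right) J_0\!\left(2iX\sqrt{\frac{v}{\lambda}}\right)dX, \tag{9}$$

where $\lambda$ is defined as

$$\frac{1}{\lambda} = \frac{1}{\lambda_X} + \frac{1}{\lambda_N} \tag{10}$$

in (A.3) of Ephraim and Malah (1984), $v = \frac{\xi}{1+\xi}\gamma$ as in Ephraim and Malah (1984), $J_0(\cdot)$ is the 0th-order Bessel function of the first kind, and

$$\frac{1}{\tilde{\lambda}} = \frac{a}{\lambda_X} + \frac{1}{\lambda_N}, \tag{11}$$

which is equivalent to $1/\lambda$ in (10) for $a = 1$. By utilizing further identities in Gradshteyn and Ryzhik (2007), (8) and (9) are given as

$$\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \Gamma\!\left(\frac{p+2a+1}{2}\right)\tilde{\lambda}^{\frac{p+2a+1}{2}}\;{}_1F_1\!\left(\frac{1-p-2a}{2};1;-\frac{v\tilde{\lambda}}{\lambda}\right) \tag{12}$$

and

$$\int_0^\infty\!\int_0^{2\pi} X^{p}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \Gamma\!\left(\frac{p+2a}{2}\right)\tilde{\lambda}^{\frac{p+2a}{2}}\;{}_1F_1\!\left(\frac{2-p-2a}{2};1;-\frac{v\tilde{\lambda}}{\lambda}\right), \tag{13}$$

where ${}_1F_1(\cdot;\cdot;\cdot)$ is the confluent hypergeometric function. With the combination and simplification of the integrals in (12) and (13), the final form of the new WE estimator with Chi speech prior in (4) is given as

$$\hat{X}_{WE,CHI} = G_{WE,CHI}\,R = \frac{\Gamma\!\left(\frac{p+2a+1}{2}\right)}{\Gamma\!\left(\frac{p+2a}{2}\right)}\,\frac{\sqrt{v_a}}{\gamma}\,\frac{{}_1F_1\!\left(\frac{1-p-2a}{2};1;-v\zeta_a\right)}{{}_1F_1\!\left(\frac{2-p-2a}{2};1;-v\zeta_a\right)}\,R, \tag{14}$$

where

$$v_a = \frac{\xi}{a+\xi}\,\gamma \tag{15}$$

and

$$\zeta_a = \frac{1+\xi}{a+\xi}, \tag{16}$$

for $p + 2a > 0$ with gain function $G_{WE,CHI}$ and a priori SNR $\xi = \sigma_X^2/\sigma_N^2$ and a posteriori SNR $\gamma = R^2/\sigma_N^2$. For $a = 1$ and $p = 0$, (14) is exactly equivalent to the STSA estimator with Rayleigh speech prior (Ephraim and Malah, 1984).

3.2. Weighted Cosh (WCOSH)

In Loizou (2005), the Weighted Cosh (WCOSH) cost function is given as

$$d_{WCOSH}(X,\hat{X}) = \left[\frac{X}{\hat{X}} + \frac{\hat{X}}{X} - 1\right]X^{p} \tag{17}$$

with estimator equation

$$\hat{X}_{WCOSH} = \left[\frac{\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}{\int_0^\infty\!\int_0^{2\pi} X^{p-1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}\right]^{1/2}, \tag{18}$$

where $p$ is the weighting law parameter. For $p = 0$, (18) is equivalent to the estimator for the Cosh cost function given in Loizou (2005) and Gray et al. (1980). In order to determine the final estimator equation for the WCOSH with Chi speech prior, the integrals are derived with the same approach as for the WE with Chi speech prior estimator in (14). By the substitution of the statistical models in (4) and (5) and using integral identities in Gradshteyn and Ryzhik (2007), the spectral phase is integrated out of the two integrals as

$$\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \int_0^\infty X^{p+2a}\exp\left(-\frac{X^2}{\tilde{\lambda}}\right) J_0\!\left(2iX\sqrt{\frac{v}{\lambda}}\right)dX \tag{19}$$

and

$$\int_0^\infty\!\int_0^{2\pi} X^{p-1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \int_0^\infty X^{p+2a-2}\exp\left(-\frac{X^2}{\tilde{\lambda}}\right) J_0\!\left(2iX\sqrt{\frac{v}{\lambda}}\right)dX, \tag{20}$$

where $1/\tilde{\lambda}$ is defined in (11). Through further identities in Gradshteyn and Ryzhik (2007), (19) and (20) are given as

$$\int_0^\infty\!\int_0^{2\pi} X^{p+1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \Gamma\!\left(\frac{p+2a+1}{2}\right)\tilde{\lambda}^{\frac{p+2a+1}{2}}\;{}_1F_1\!\left(\frac{1-p-2a}{2};1;-\frac{v\tilde{\lambda}}{\lambda}\right) \tag{21}$$

and

$$\int_0^\infty\!\int_0^{2\pi} X^{p-1}\,p(Y \mid X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \;\propto\; \Gamma\!\left(\frac{p+2a-1}{2}\right)\tilde{\lambda}^{\frac{p+2a-1}{2}}\;{}_1F_1\!\left(\frac{3-p-2a}{2};1;-\frac{v\tilde{\lambda}}{\lambda}\right). \tag{22}$$

With the combination and simplification of the integrals in (21) and (22) and using $v_a$ and $\zeta_a$ in (15) and (16), the final form of the new WCOSH estimator with Chi speech prior in (4) is given as

$$\hat{X}_{WCOSH,CHI} = G_{WCOSH,CHI}\,R = \left[\frac{\Gamma\!\left(\frac{p+2a+1}{2}\right)}{\Gamma\!\left(\frac{p+2a-1}{2}\right)}\right]^{1/2}\frac{\sqrt{v_a}}{\gamma}\left[\frac{{}_1F_1\!\left(\frac{1-p-2a}{2};1;-v\zeta_a\right)}{{}_1F_1\!\left(\frac{3-p-2a}{2};1;-v\zeta_a\right)}\right]^{1/2}R \tag{23}$$

for $p + 2a > 1$ with gain function $G_{WCOSH,CHI}$. For $a = 1$ and $p = 0$, (23) is similar to the LSA estimator with Rayleigh speech prior (Ephraim and Malah, 1985). Comparing the WE estimator in (14) with the WCOSH estimator in (18)-(23), the only differences are the integral in the denominator and the square root.

Figs. 1-3 (WE with Chi speech prior estimator) and Figs. 4-6 (WCOSH with Chi speech prior estimator) present the gain functions $G_{WE,CHI}$ and $G_{WCOSH,CHI}$ given in (14) and (23) using representative weighting law parameters of $p_{WE} = \{-1.0, -0.5, -0.25\}$ and $p_{WCOSH} = \{-0.75, -0.5, -0.25\}$ as a function of instantaneous SNR $\gamma_k - 1$ for three fixed a priori SNR $\xi_k$ values of 0, 5, and 10 dB and valid shaping parameter $a$ values.

Fig. 1. Gain curves for WE (p = -1.0) estimator with Chi prior.
Fig. 2. Gain curves for WE (p = -0.5) estimator with Chi prior.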
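The gain functions in (14) and (23) can be evaluated numerically. The following is a minimal sketch in pure Python rather than the authors' implementation; the function names (`hyp1f1_series`, `min_shape`, `gain_we_chi`, `gain_wcosh_chi`) are hypothetical. It assumes linear-scale (not dB) values of ξ and γ, and it applies Kummer's transformation, ${}_1F_1(\alpha;1;-z) = e^{-z}\,{}_1F_1(1-\alpha;1;z)$, so that every series has positive terms and the common factor $e^{-z}$ cancels in each ratio. It also uses the identity $v\zeta_a = v_a$, which follows from $v = \xi\gamma/(1+\xi)$ together with (15) and (16).

```python
import math


def hyp1f1_series(a, b, z, tol=1e-15, max_terms=5000):
    """Power series of the confluent hypergeometric function 1F1(a; b; z).

    Intended for a > 0, b > 0 and z >= 0, where every term is positive and
    no cancellation occurs. The loop stops once a term is negligible.
    """
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) / (b + n) * z / (n + 1)
        total += term
        if term <= tol * total:
            break
    return total


def min_shape(p, estimator):
    """Lower bound on the shape parameter a: p + 2a > 0 (WE), p + 2a > 1 (WCOSH)."""
    if estimator == "WE":
        return -p / 2.0
    if estimator == "WCOSH":
        return (1.0 - p) / 2.0
    raise ValueError("estimator must be 'WE' or 'WCOSH'")


def gain_we_chi(xi, gamma, p, a):
    """Gain G_WE,CHI of eq. (14) for linear-scale a priori/a posteriori SNRs."""
    if a <= min_shape(p, "WE"):
        raise ValueError("eq. (14) requires p + 2a > 0")
    v_a = xi / (a + xi) * gamma  # eq. (15); v * zeta_a equals v_a
    c = p + 2.0 * a
    # Ratio of 1F1 terms after Kummer's transformation (exp(-v_a) cancels).
    ratio = hyp1f1_series((c + 1.0) / 2.0, 1.0, v_a) / hyp1f1_series(c / 2.0, 1.0, v_a)
    gammas = math.gamma((c + 1.0) / 2.0) / math.gamma(c / 2.0)
    return gammas * math.sqrt(v_a) / gamma * ratio


def gain_wcosh_chi(xi, gamma, p, a):
    """Gain G_WCOSH,CHI of eq. (23) for linear-scale a priori/a posteriori SNRs."""
    if a <= min_shape(p, "WCOSH"):
        raise ValueError("eq. (23) requires p + 2a > 1")
    v_a = xi / (a + xi) * gamma
    c = p + 2.0 * a
    ratio = hyp1f1_series((c + 1.0) / 2.0, 1.0, v_a) / hyp1f1_series((c - 1.0) / 2.0, 1.0, v_a)
    gammas = math.gamma((c + 1.0) / 2.0) / math.gamma((c - 1.0) / 2.0)
    return math.sqrt(gammas * ratio) * math.sqrt(v_a) / gamma
```

With `a = 1` and `p = 0`, `gain_we_chi` reduces to the Ephraim and Malah STSA gain, and at a large a posteriori SNR both gains approach ξ/(a + ξ). Evaluating `gain_we_chi` at a low a posteriori SNR with `a` close to `min_shape(p, "WE")` gives a smaller gain than with `a = 1`, in line with the gain-curve behavior described in the text. The series is only practical for moderate values of `v_a`, since its terms grow roughly like exp(v_a).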

Fig. 3. Gain curves for WE (p = -0.25) estimator with Chi prior.
Fig. 4. Gain curves for WCOSH (p = -0.75) estimator with Chi prior.
Fig. 5. Gain curves for WCOSH (p = -0.5) estimator with Chi prior.
Fig. 6. Gain curves for WCOSH (p = -0.25) estimator with Chi prior.

From the gain curves, there are several interesting observations to note for both the WE and WCOSH estimators with Chi and Rayleigh speech priors. For both sets of estimators across the different a priori SNR $\xi_k$, the gains were smaller in value (more attenuation) as the shaping parameter $a$ approached its limiting value with a decrease in the instantaneous a posteriori SNR $\gamma_k - 1$. As the shaping parameter $a \to 1$, which is the Rayleigh speech prior, the gains had a flatter shape and larger value (less attenuation). Regardless of the a priori SNR $\xi_k$ and shaping parameter $a$, the gains all eventually converged to approximately 6 dB at around an a posteriori SNR of 8 1 dB with an increase of the instantaneous a posteriori SNR $\gamma_k - 1$, which was essentially independent of the weighting law parameter $p$. With the WE with Chi speech prior estimator, the increase in the weighting law parameter $p$, which in turn causes an increase in the range of valid shaping parameters $a$, generated gains with more attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ (and less attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$) using the limiting value of the shaping parameter $a$. The gains with an increase of weighting law parameter $p$ and shaping parameter $a \to 1$ (Rayleigh speech prior) had less attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ and no substantial change in attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$. For the WCOSH with Chi speech prior estimator, the gains were much more dependent on the a priori SNR $\xi_k$ than for the WE with Chi speech prior estimator. For a particular weighting law parameter $p$ across all shaping parameters $a$, the gains had less attenuation with an increase in the a priori SNR $\xi_k$. Comparing the same weighting law parameters $p = -0.5$ (Fig. 2, Fig. 5) and $p = -0.25$ (Fig. 3, Fig. 6) across the WE and WCOSH with Chi speech prior estimators, the gains associated with the WE with Chi speech prior estimator had significantly more attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ (and similar attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$) than the gains associated with the WCOSH with Chi speech prior estimator.

4. Experiments and results

The proposed WE and WCOSH with Chi speech prior estimators given in (14) and (23) were evaluated using the objective measures of Segmental Signal-to-Noise Ratio (SSNR) (Papamichalis, 1987), Perceptual Evaluation of Speech Quality (PESQ) (ITU, 2003; Hu and Loizou, 2007; Hu and Loizou, 2008; Rix et al., 2001), and Signal-to-Noise Ratio (SNR) Loss (Ma and Loizou, 2011) to assess noise reduction, overall speech quality, and speech intelligibility, where higher PESQ scores and lower SNR Loss scores indicate better performance.
In particular, the performance is given via SSNR, PESQ, and SNR Loss improvements, where each improvement is calculated as the measure for the output (enhanced signal) minus the measure for the input (noisy signal). Clean and noisy speech were taken from the noisy speech corpus (NOIZEUS) (Hu and Loizou, 2007), which contains 30 IEEE sentences (Subcommittee, 1969) (produced by three male and three female speakers) corrupted by eight different real-world noises at SNRs ranging from 0 to 15 dB in increments of 5 dB, where the noises were taken from the AURORA database (Pearce and Hirsch, 2000), which includes airport, babble, car, exhibition, restaurant, station, street, and train noises. The analysis conditions consisted of frames of 256 samples (25.6 ms) with 50% overlap using Hanning windows. Noise estimation was performed on an initial silence of 5 frames. The decision-directed (DD) smoothing approach (Ephraim and Malah, 1984) was utilized to estimate $\xi$ with $\alpha_{SNR} = 0.98$ using thresholds of $\xi_{min} = 10^{-25/10}$ and $\gamma_{min} = 4$. In order to evaluate the performance, the enhanced signals were reconstructed using the overlap-add technique. The shape parameter $a$ in the Chi speech prior was varied for specific weighting law parameters $p$ to determine its effect on enhancement, quality, and intelligibility with results averaged over 30 utterances across the 8 different noises at input SNRs of 0, 5, and 10 dB. As recommended by Loizou (2005), $p_{WE} = -1$ and $p_{WCOSH} = -0.5$ were selected as the weighting law parameters to achieve the best overall speech quality in the enhancement process. Fig. 7 and Fig. 8 illustrate the SSNR improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with particular weighting law parameter $p$.
The WE and WCOSH with Chi speech prior estimators consistently produced 3 db ( db input SNR), 1 db (5 db input SNR), and db (1 db input SNR) over the baseline WE and WCOSH with Rayleigh speech prior estimators, which typically occurred at the limiting value of the shaping parameter a for the corresponding weighting law parameter p of a.5 (p WE = 1.) and a.75 (p WCOSH =.5). At the limiting shaping parameter a, the WE and WCOSH with Chi speech prior estimators achieved maximum SSNR improvements of 9 13 db ( db input SNR), 6 9 db (5 db input SNR), and 4 5 db (1 db input SNR) across the car, train, station, exhibition, street, babble, and airport noises. In comparing the WE and WCOSH with Chi speech prior estimators, the WE with Chi speech prior estimator had slightly better SSNR improvement performance over the WCOSH with Chi speech prior estimator for noise reduction.
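The decision-directed estimate of ξ mentioned in the experimental setup can be sketched for a single frequency bin as follows. This is an illustrative sketch, not the authors' code: the function name `decision_directed_xi` and the `gain_fn` callback are hypothetical, the update rule is the standard Ephraim and Malah (1984) form $\hat{\xi}(l) = \alpha\,G^2(l-1)\,\gamma(l-1) + (1-\alpha)\max(\gamma(l)-1, 0)$, and the first-frame initialization is an assumption. Only the smoothing constant 0.98 and the floor $10^{-25/10}$ come from the text; the threshold on γ is left out.

```python
def decision_directed_xi(gamma_seq, gain_fn, alpha=0.98, xi_min=10 ** (-25 / 10)):
    """Decision-directed a priori SNR estimates for one frequency bin.

    gamma_seq: a posteriori SNRs (linear scale), one per frame.
    gain_fn:   callable (xi, gamma) -> spectral gain G of the chosen estimator.
    """
    xi_seq = []
    prev = None  # G(l-1)^2 * gamma(l-1), the previous clean-speech SNR estimate
    for gamma in gamma_seq:
        ml_term = max(gamma - 1.0, 0.0)  # half-wave rectified instantaneous SNR
        if prev is None:
            # Assumed initialization for the first frame.
            xi = alpha + (1.0 - alpha) * ml_term
        else:
            xi = alpha * prev + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)  # apply the floor xi_min
        gain = gain_fn(xi, gamma)
        prev = gain * gain * gamma
        xi_seq.append(xi)
    return xi_seq
```

For example, `decision_directed_xi(gammas, lambda xi, g: xi / (1.0 + xi))` runs the recursion with a Wiener gain as a stand-in; a gain function for the WE or WCOSH estimator with fixed `p` and `a` could be passed in the same way.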

Fig. 7. SSNR improvements for MMSE WE estimator with Chi prior (p = -1).
Fig. 8. SSNR improvements for MMSE WCOSH estimator with Chi prior (p = -0.5).

Fig. 9 and Fig. 10 present the PESQ improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with particular weighting law parameter $p$. In a similar fashion to the SSNR improvements, the WE and WCOSH with Chi speech prior estimators generated ..3 and ..1 gains over the baseline WE and WCOSH estimators with Rayleigh speech prior, with the most pronounced improvements occurring at input SNRs of 5 and 10 dB. In contrast to the SSNR improvements that were almost exclusively dependent on the limiting shaping parameter $a$, the PESQ improvements diminished at a =.7.8 (WE with Chi speech prior estimator) and a =.85.9 (WCOSH with Chi speech prior estimator). For both the WE and WCOSH with Chi prior estimators, the maximum PESQ improvements ranged from ..55 (5 dB input SNR), ..5 (10 dB input SNR), and (0 dB input SNR) across the restaurant, airport, babble, street, exhibition, station, train, and car noises. After examination of the WE and WCOSH with Chi speech prior estimators, the WE with Chi speech prior estimator had slightly better PESQ improvement performance than the WCOSH with Chi speech prior estimator for speech quality.

Fig. 9. PESQ improvements for MMSE WE estimator with Chi prior (p = -1).
Fig. 10. PESQ improvements for MMSE WCOSH estimator with Chi prior (p = -0.5).

Fig. 11 and Fig. 12 demonstrate the SNR Loss improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with particular weighting law parameter $p$. The WE and WCOSH with Chi speech prior estimators typically yielded .1 (0 dB input SNR), .5 (5 dB input SNR), and .5 (10 dB input SNR) over the corresponding baseline WE and WCOSH with Rayleigh speech prior estimators, which occurred at a wide range of shaping parameters $a$. In contrast to the SSNR and PESQ improvements, the SNR Loss improvements were most noticeable at input SNRs of 5,, and 1 dB and car, station, babble, airport, exhibition, restaurant, train, and street noises. In more specific terms, the WE and WCOSH with Chi speech prior estimators realized maximum SNR Loss improvements of (0 dB input SNR), .1.95 (5 dB input SNR), and From the WE and WCOSH with Chi speech prior estimators, the WE with Chi speech prior estimator often had larger decreases in SNR Loss over the baseline Rayleigh speech prior estimators than the WCOSH with Chi speech prior estimator.

Fig. 11. SNR Loss improvements for MMSE WE estimator with Chi prior (p = -1).
Fig. 12. SNR Loss improvements for MMSE WCOSH estimator with Chi prior (p = -0.5).

Tables 1-6 show the SSNR improvement, PESQ improvement, and SNR Loss improvement for the WE and WCOSH with Chi speech prior estimators for two additional and representative weighting law parameters $p$. Whereas the WE with Chi speech prior estimator was examined with the weighting law parameters p = -0.5 (a > 0.25) and p = -0.25 (a > 0.125), the WCOSH with Chi speech prior estimator was examined with the weighting law parameters p = -0.75 (a > 0.875) and p = -0.25 (a > 0.625) according to the relationships p + 2a > 0 and p + 2a > 1. For each weighting law parameter $p$ of the WE and WCOSH with Chi speech prior estimators at the particular noise and input SNR, the SSNR improvement, PESQ improvement, and SNR Loss improvement results are provided alongside their corresponding shaping parameter $a$, where the shaping parameter $a \to 1$ represents the baseline WE and WCOSH with Rayleigh speech prior estimators. In terms of SSNR improvements, the WE and WCOSH with Chi speech prior estimators generally produced .5.5 dB gains over the baseline WE and WCOSH with Rayleigh speech prior estimators. As the weighting law parameter $p$ was decreased in value, the SSNR improvement increased in value, where the maximum SSNR improvement ranged from 6 to 9 dB across the car, train, and babble noises. The WCOSH with Chi speech prior typically had smaller performance gains over the baseline WCOSH with Rayleigh speech prior because of the higher baseline SSNR improvements. For each of the WE and WCOSH with Chi speech prior estimators, the limiting factor in SSNR improvement was the lower bound of the shaping parameter $a$. For the PESQ improvements, the WE and WCOSH with Chi speech prior estimators generated upwards of .14 gains over the baseline WE and WCOSH with Rayleigh speech prior estimators.
In a similar way to the SSNR improvements, the increase in the weighting law parameter p caused a decrease in PESQ improvement. The maximum PESQ improvement ranged from.4 to.56 across the car, train, and babble noises, where the shaping parameter a reached the maximum at a =.5.7 (WE with Chi speech prior estimator) and a =.9.99 (WCOSH with Chi speech prior estimator). In general, the WCOSH with Chi speech prior estimator did not always follow the same relationship between the weighting law parameter p and PESQ improvement as the WE with Chi speech prior estimator. With the SNR Loss improvements, the WE and WCOSH with Chi speech prior estimators supplied nearly.19 gains over the baseline WE and WCOSH with Rayleigh speech prior estimators. As with SSNR improvement and PESQ improvement, the SNR Loss improvement decreased in value with an increase in the weighting law parameter p value. The car, babble, and train noises achieved maximum SNR Loss improvements of.88.11, which occurred in the range of a =.5.45 (WE with Chi speech prior estimator) and a/1. (WCOSH with Chi speech prior estimator). In most cases, the WCOSH with Chi speech prior estimator did not produce nearly as pronounced SNR Loss improvement gains compared to the WE with Chi speech prior estimator over the baseline and WE and WCOSH with Rayleigh speech prior estimators.
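The improvement scores reported in this section are differences between the measure on the enhanced signal and the same measure on the noisy signal. As an illustration for SSNR, here is a minimal sketch, not the evaluation code used in the paper. The function names are hypothetical, and the non-overlapping frames and the per-frame clamping range of -10 to 35 dB are assumptions taken from common practice rather than from the text.

```python
import math


def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB between a clean signal and a processed one.

    Frames are non-overlapping here; frames with zero clean energy are
    skipped and each frame SNR is clamped to [lo, hi] dB (assumed limits).
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        if signal_energy == 0.0:
            continue
        if error_energy == 0.0:
            snr_db = hi
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr_db, lo), hi))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)


def ssnr_improvement(clean, noisy, enhanced, **kwargs):
    """SSNR of the enhanced signal minus SSNR of the noisy signal, in dB."""
    return segmental_snr(clean, enhanced, **kwargs) - segmental_snr(clean, noisy, **kwargs)
```

A positive return value from `ssnr_improvement` means the enhanced signal is closer to the clean reference than the noisy input was; halving the error amplitude in every frame, for instance, corresponds to an improvement of about 6 dB as long as no frame hits the clamping limits.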

Table 1. SSNR improvements for MMSE WE estimator with Chi prior (p = -0.5 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.5 and p = -0.25; final row AVG.

Table 2. SSNR improvements for MMSE WCOSH estimator with Chi prior (p = -0.75 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.75 and p = -0.25; final row AVG.

Table 3. PESQ improvements for MMSE WE estimator with Chi prior (p = -0.5 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.5 and p = -0.25; final row AVG.

Table 4. PESQ improvements for MMSE WCOSH estimator with Chi prior (p = -0.75 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.75 and p = -0.25; final row AVG.

Table 5. SNR Loss improvements for MMSE WE estimator with Chi prior (p = -0.5 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.5 and p = -0.25; final row AVG.

Table 6. SNR Loss improvements for MMSE WCOSH estimator with Chi prior (p = -0.75 and p = -0.25). Columns: SNR [dB]; Babble, Car, and Train, each at p = -0.75 and p = -0.25; final row AVG.

5. Conclusion

In this paper, the authors derived novel perceptually-motivated WE and WCOSH estimators using the more appropriate Chi speech prior as a substitute for the traditional Rayleigh speech prior to model the speech spectral amplitude. Fundamentally, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi distributions for the speech prior to provide gains in all phases of enhancement. The WE and WCOSH with Chi speech prior estimators incorporated weighting law and shape parameters on the cost functions and distributions. Instead of measuring the performance simply with the SSNR objective quality metric to determine the amount of noise reduction, the estimators were also evaluated using the PESQ and SNR Loss objective quality metrics to ascertain the level of overall speech quality and speech intelligibility compared to the original noisy signals at input SNRs of 0, 5, and 10 dB across airport, babble, car, exhibition, restaurant, station, street, and train noises. With the WE and WCOSH with standard Rayleigh speech prior estimators serving as the baseline, the experimental results indicated that the new WE and WCOSH with Chi speech prior estimators provided significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility. Generally, the best results for the various objective quality metrics occurred for a particular weighting law parameter at the limiting value of the shaping parameter at lower input SNRs (SSNR improvement) and at various values of the shaping parameter at higher input SNRs (PESQ improvement and SNR Loss improvement). In more specific terms, the WE and WCOSH with Chi speech prior estimators consistently produced upwards of approximately 3 dB (SSNR improvement), .3 (PESQ improvement), and .5 (SNR Loss improvement) over the baseline WE and WCOSH with Rayleigh speech prior estimators.
In comparing the WE with Chi speech prior and WCOSH with Chi speech prior estimators, the WE with Chi speech prior estimator often had slightly better overall performance across the SSNR, PESQ, and SNR Loss objective quality metrics than the WCOSH with Chi speech prior estimator and would be the recommended estimator for filtering noisy signals with more negative values of the weighting law parameter. For future work, the WE and WCOSH estimators would involve further modifications to integrate even more generalized speech prior statistical models, namely the generalized Gamma speech prior, to obtain more gains in SSNR, PESQ, and SNR Loss improvements over the traditional Rayleigh speech prior.

References

P.C. Loizou. Speech Enhancement: Theory and Practice. CRC Press (2007)
Y. Ephraim, D. Malah. Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-32 (1984)
Y. Ephraim, D. Malah. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-33 (1985)
P.C. Loizou. Speech enhancement based on perceptually motivated Bayesian estimators of the magnitude spectrum. IEEE Transactions on Speech and Audio Processing, 13 (2005)
I. Andrianakis, P.R. White. Speech spectral amplitude estimators using optimally shaped gamma and chi priors. Speech Communication, 51 (2009)
C. Breithaupt, M. Krawczyk, R. Martin. Parameterized MMSE spectral magnitude estimation for the enhancement of noisy speech. In: International Conference on Acoustics, Speech, and Signal Processing (2008)
N. Johnson, S. Kotz, N. Balakrishnan. Continuous Univariate Distributions, vol. 1 (2nd ed.). John Wiley and Sons, New York (1994)
R.M. Gray, A. Buzo, J.A.H. Gray, Y. Matsuyama. Distortion measures for speech processing. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-28 (1980)
I.S. Gradshteyn, I.M. Ryzhik. Tables of Integrals, Series, and Products. Academic Press (2007)
P.E. Papamichalis. Practical Approaches to Speech Coding. Prentice-Hall, New York, NY (1987)
ITU. Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm. ITU-T Recommendation (2003)
Y. Hu, P.C. Loizou. Subjective comparison and evaluation of speech enhancement algorithms. Speech Communication, 49 (2007)
Y. Hu, P. Loizou. Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing, 16 (2008)
A. Rix, J. Beerends, M. Hollier, A. Hekstra. Perceptual evaluation of speech quality (PESQ) - A new method for speech quality assessment of telephone networks and codecs. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (2001)
J. Ma, P.C. Loizou. SNR loss: a new objective measure for predicting the intelligibility of noise-suppressed speech. Speech Communication, 53 (2011)
IEEE Subcommittee. IEEE recommended practice for speech quality measurements. IEEE Transactions on Audio and Electroacoustics, AU-17 (1969)
D. Pearce, H.-G. Hirsch. Performance evaluation of speech recognition systems under noisy conditions. In: 6th International Conference on Spoken Language Processing (ICSLP), Beijing, China (2000)

White Paper. PESQ: An Introduction. Prepared by: Psytechnics Limited. 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN

White Paper. PESQ: An Introduction. Prepared by: Psytechnics Limited. 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN PESQ: An Introduction White Paper Prepared by: Psytechnics Limited 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN t: +44 (0) 1473 261 800 f: +44 (0) 1473 261 880 e: info@psytechnics.com September

More information

Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network

Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network Recent Advances in Electrical Engineering and Electronic Devices Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network Ahmed El-Mahdy and Ahmed Walid Faculty of Information Engineering

More information

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore.

This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore. This document is downloaded from DR-NTU, Nanyang Technological University Library, Singapore. Title Transcription of polyphonic signals using fast filter bank( Accepted version ) Author(s) Foo, Say Wei;

More information

The Calculation of G rms

The Calculation of G rms The Calculation of G rms QualMark Corp. Neill Doertenbach The metric of G rms is typically used to specify and compare the energy in repetitive shock vibration systems. However, the method of arriving

More information

A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song

A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song , pp.347-354 http://dx.doi.org/10.14257/ijmue.2014.9.8.32 A Sound Analysis and Synthesis System for Generating an Instrumental Piri Song Myeongsu Kang and Jong-Myon Kim School of Electrical Engineering,

More information

AN-007 APPLICATION NOTE MEASURING MAXIMUM SUBWOOFER OUTPUT ACCORDING ANSI/CEA-2010 STANDARD INTRODUCTION CEA-2010 (ANSI) TEST PROCEDURE

AN-007 APPLICATION NOTE MEASURING MAXIMUM SUBWOOFER OUTPUT ACCORDING ANSI/CEA-2010 STANDARD INTRODUCTION CEA-2010 (ANSI) TEST PROCEDURE AUDIOMATICA AN-007 APPLICATION NOTE MEASURING MAXIMUM SUBWOOFER OUTPUT ACCORDING ANSI/CEA-2010 STANDARD by Daniele Ponteggia - dp@audiomatica.com INTRODUCTION The Consumer Electronics Association (CEA),

More information

Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition

Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition Integration of Short-Time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition Von der Fakultät IV Elektrotechnik und Informatik der Technischen

More information

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA Audio Engineering Society Convention Paper Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA The papers at this Convention have been selected on the basis of a submitted abstract

More information

PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS

PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS The 21 st International Congress on Sound and Vibration 13-17 July, 2014, Beijing/China PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS Dan Wang, Nanjie Yan and Jianxin Peng*

More information

MUSICAL INSTRUMENT FAMILY CLASSIFICATION

MUSICAL INSTRUMENT FAMILY CLASSIFICATION MUSICAL INSTRUMENT FAMILY CLASSIFICATION Ricardo A. Garcia Media Lab, Massachusetts Institute of Technology 0 Ames Street Room E5-40, Cambridge, MA 039 USA PH: 67-53-0 FAX: 67-58-664 e-mail: rago @ media.

More information

White Paper. Comparison between subjective listening quality and P.862 PESQ score. Prepared by: A.W. Rix Psytechnics Limited

White Paper. Comparison between subjective listening quality and P.862 PESQ score. Prepared by: A.W. Rix Psytechnics Limited Comparison between subjective listening quality and P.862 PESQ score White Paper Prepared by: A.W. Rix Psytechnics Limited 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN t: +44 (0) 1473 261 800

More information

Advanced Signal Processing and Digital Noise Reduction

Advanced Signal Processing and Digital Noise Reduction Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New

More information

MICROPHONE SPECIFICATIONS EXPLAINED

MICROPHONE SPECIFICATIONS EXPLAINED Application Note AN-1112 MICROPHONE SPECIFICATIONS EXPLAINED INTRODUCTION A MEMS microphone IC is unique among InvenSense, Inc., products in that its input is an acoustic pressure wave. For this reason,

More information

ACOUSTICAL CONSIDERATIONS FOR EFFECTIVE EMERGENCY ALARM SYSTEMS IN AN INDUSTRIAL SETTING

ACOUSTICAL CONSIDERATIONS FOR EFFECTIVE EMERGENCY ALARM SYSTEMS IN AN INDUSTRIAL SETTING ACOUSTICAL CONSIDERATIONS FOR EFFECTIVE EMERGENCY ALARM SYSTEMS IN AN INDUSTRIAL SETTING Dennis P. Driscoll, P.E. and David C. Byrne, CCC-A Associates in Acoustics, Inc. Evergreen, Colorado Telephone (303)

More information

RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA

RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA ABSTRACT Random vibration is becoming increasingly recognized as the most realistic method of simulating the dynamic environment of military

More information

Khalid Sayood and Martin C. Rost Department of Electrical Engineering University of Nebraska

Khalid Sayood and Martin C. Rost Department of Electrical Engineering University of Nebraska PROBLEM STATEMENT A ROBUST COMPRESSION SYSTEM FOR LOW BIT RATE TELEMETRY - TEST RESULTS WITH LUNAR DATA Khalid Sayood and Martin C. Rost Department of Electrical Engineering University of Nebraska The

More information

PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUMBER OF REFERENCE SYMBOLS

PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUMBER OF REFERENCE SYMBOLS PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUM OF REFERENCE SYMBOLS Benjamin R. Wiederholt The MITRE Corporation Bedford, MA and Mario A. Blanco The MITRE

More information

Analysis/resynthesis with the short time Fourier transform

Analysis/resynthesis with the short time Fourier transform Analysis/resynthesis with the short time Fourier transform summer 2006 lecture on analysis, modeling and transformation of audio signals Axel Röbel Institute of communication science TU-Berlin IRCAM Analysis/Synthesis

More information

5 Tips For Making the Most Out of Any Available Opportunities

5 Tips For Making the Most Out of Any Available Opportunities IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL 16, NO 3, MARCH 2008 541 Noise Tracking Using DFT Domain Subspace Decompositions Richard C Hendriks, Jesper Jensen, and Richard Heusdens

More information

A Microphone Array for Hearing Aids

A Microphone Array for Hearing Aids A Microphone Array for Hearing Aids by Bernard Widrow 1531-636X/06/$10.00 2001IEEE 0.00 26 Abstract A directional acoustic receiving system is constructed in the form of a necklace including an array of

More information

Ericsson T18s Voice Dialing Simulator

Ericsson T18s Voice Dialing Simulator Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of

More information

Measuring Line Edge Roughness: Fluctuations in Uncertainty

Measuring Line Edge Roughness: Fluctuations in Uncertainty Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as

More information

The Effective Number of Bits (ENOB) of my R&S Digital Oscilloscope Technical Paper

The Effective Number of Bits (ENOB) of my R&S Digital Oscilloscope Technical Paper The Effective Number of Bits (ENOB) of my R&S Digital Oscilloscope Technical Paper Products: R&S RTO1012 R&S RTO1014 R&S RTO1022 R&S RTO1024 This technical paper provides an introduction to the signal

More information

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA

APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer

More information

Emotion Detection from Speech

Emotion Detection from Speech Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction

More information

LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING

LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING RasPi Kaveri Ratanpara 1, Priyan Shah 2 1 Student, M.E Biomedical Engineering, Government Engineering college, Sector-28, Gandhinagar (Gujarat)-382028,

More information

Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking

Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking The perception and correct identification of speech sounds as phonemes depends on the listener extracting various

More information

Thirukkural - A Text-to-Speech Synthesis System

Thirukkural - A Text-to-Speech Synthesis System Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,

More information

MATLAB-based Applications for Image Processing and Image Quality Assessment Part II: Experimental Results

MATLAB-based Applications for Image Processing and Image Quality Assessment Part II: Experimental Results 154 L. KRASULA, M. KLÍMA, E. ROGARD, E. JEANBLANC, MATLAB BASED APPLICATIONS PART II: EXPERIMENTAL RESULTS MATLAB-based Applications for Image Processing and Image Quality Assessment Part II: Experimental

More information

Figure1. Acoustic feedback in packet based video conferencing system

Figure1. Acoustic feedback in packet based video conferencing system Real-Time Howling Detection for Hands-Free Video Conferencing System Mi Suk Lee and Do Young Kim Future Internet Research Department ETRI, Daejeon, Korea {lms, dyk}@etri.re.kr Abstract: This paper presents

More information

A TOOL FOR TEACHING LINEAR PREDICTIVE CODING

A TOOL FOR TEACHING LINEAR PREDICTIVE CODING A TOOL FOR TEACHING LINEAR PREDICTIVE CODING Branislav Gerazov 1, Venceslav Kafedziski 2, Goce Shutinoski 1 1) Department of Electronics, 2) Department of Telecommunications Faculty of Electrical Engineering

More information

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance

Automatic Evaluation Software for Contact Centre Agents voice Handling Performance International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,

More information

AUDIO signals are often contaminated by background environment

AUDIO signals are often contaminated by background environment 1830 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 56, NO. 5, MAY 2008 Audio Denoising by Time-Frequency Block Thresholding Guoshen Yu, Stéphane Mallat, Fellow, IEEE, and Emmanuel Bacry Abstract Removing

More information

Artificial Neural Network for Speech Recognition

Artificial Neural Network for Speech Recognition Artificial Neural Network for Speech Recognition Austin Marshall March 3, 2005 2nd Annual Student Research Showcase Overview Presenting an Artificial Neural Network to recognize and classify speech Spoken

More information

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA

SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,

More information

Performance Analysis of Interleaving Scheme in Wideband VoIP System under Different Strategic Conditions

Performance Analysis of Interleaving Scheme in Wideband VoIP System under Different Strategic Conditions Performance Analysis of Scheme in Wideband VoIP System under Different Strategic Conditions Harjit Pal Singh 1, Sarabjeet Singh 1 and Jasvir Singh 2 1 Dept. of Physics, Dr. B.R. Ambedkar National Institute

More information

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification

Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification (Revision 1.0, May 2012) General VCP information Voice Communication

More information

Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets

Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets Chris Kreucher a, J. Webster Stayman b, Ben Shapo a, and Mark Stuff c a Integrity Applications Incorporated 900 Victors

More information

Objective Speech Quality Measures for Internet Telephony

Objective Speech Quality Measures for Internet Telephony Objective Speech Quality Measures for Internet Telephony Timothy A. Hall National Institute of Standards and Technology 100 Bureau Drive, STOP 8920 Gaithersburg, MD 20899-8920 ABSTRACT Measuring voice

More information

Lecture 8: Signal Detection and Noise Assumption

Lecture 8: Signal Detection and Noise Assumption ECE 83 Fall Statistical Signal Processing instructor: R. Nowak, scribe: Feng Ju Lecture 8: Signal Detection and Noise Assumption Signal Detection : X = W H : X = S + W where W N(, σ I n n and S = [s, s,...,

More information

MultiDSLA. Measuring Network Performance. Malden Electronics Ltd

MultiDSLA. Measuring Network Performance. Malden Electronics Ltd MultiDSLA Measuring Network Performance Malden Electronics Ltd The Business Case for Network Performance Measurement MultiDSLA is a highly scalable solution for the measurement of network speech transmission

More information

L9: Cepstral analysis

L9: Cepstral analysis L9: Cepstral analysis The cepstrum Homomorphic filtering The cepstrum and voicing/pitch detection Linear prediction cepstral coefficients Mel frequency cepstral coefficients This lecture is based on [Taylor,

More information

SmartFocus Article 1 - Technical approach

SmartFocus Article 1 - Technical approach SmartFocus Article 1 - Technical approach Effective strategies for addressing listening in noisy environments The difficulty of determining the desired amplification for listening in noise is well documented.

More information

The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications

The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications Forensic Science International 146S (2004) S95 S99 www.elsevier.com/locate/forsciint The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications A.

More information

Complexity-bounded Power Control in Video Transmission over a CDMA Wireless Network

Complexity-bounded Power Control in Video Transmission over a CDMA Wireless Network Complexity-bounded Power Control in Video Transmission over a CDMA Wireless Network Xiaoan Lu, David Goodman, Yao Wang, and Elza Erkip Electrical and Computer Engineering, Polytechnic University, Brooklyn,

More information

Analog-to-Digital Voice Encoding

Analog-to-Digital Voice Encoding Analog-to-Digital Voice Encoding Basic Voice Encoding: Converting Analog to Digital This topic describes the process of converting analog signals to digital signals. Digitizing Analog Signals 1. Sample

More information

Establishing the Uniqueness of the Human Voice for Security Applications

Establishing the Uniqueness of the Human Voice for Security Applications Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.

More information

ACM SIGKDD Workshop on Intelligence and Security Informatics Held in conjunction with KDD-2010

ACM SIGKDD Workshop on Intelligence and Security Informatics Held in conjunction with KDD-2010 Fuzzy Association Rule Mining for Community Crime Pattern Discovery Anna L. Buczak, Christopher M. Gifford ACM SIGKDD Workshop on Intelligence and Security Informatics Held in conjunction with KDD-2010

More information

Function Guide for the Fourier Transformation Package SPIRE-UOL-DOC-002496

Function Guide for the Fourier Transformation Package SPIRE-UOL-DOC-002496 Function Guide for the Fourier Transformation Package SPIRE-UOL-DOC-002496 Prepared by: Peter Davis (University of Lethbridge) peter.davis@uleth.ca Andres Rebolledo (University of Lethbridge) andres.rebolledo@uleth.ca

More information

How To Understand The Quality Of A Wireless Voice Communication

How To Understand The Quality Of A Wireless Voice Communication Effects of the Wireless Channel in VOIP (Voice Over Internet Protocol) Networks Atul Ranjan Srivastava 1, Vivek Kushwaha 2 Department of Electronics and Communication, University of Allahabad, Allahabad

More information

Understanding the Transition From PESQ to POLQA. An Ascom Network Testing White Paper

Understanding the Transition From PESQ to POLQA. An Ascom Network Testing White Paper Understanding the Transition From PESQ to POLQA An Ascom Network Testing White Paper By Dr. Irina Cotanis Prepared by: Date: Document: Dr. Irina Cotanis 6 December 2011 NT11-22759, Rev. 1.0 Ascom (2011)

More information

Noise. CIH Review PDC March 2012

Noise. CIH Review PDC March 2012 Noise CIH Review PDC March 2012 Learning Objectives Understand the concept of the decibel, decibel determination, decibel addition, and weighting Know the characteristics of frequency that are relevant

More information

Radar Systems Engineering Lecture 6 Detection of Signals in Noise

Radar Systems Engineering Lecture 6 Detection of Signals in Noise Radar Systems Engineering Lecture 6 Detection of Signals in Noise Dr. Robert M. O Donnell Guest Lecturer Radar Systems Course 1 Detection 1/1/010 Block Diagram of Radar System Target Radar Cross Section

More information

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1 WHAT IS AN FFT SPECTRUM ANALYZER? ANALYZER BASICS The SR760 FFT Spectrum Analyzer takes a time varying input signal, like you would see on an oscilloscope trace, and computes its frequency spectrum. Fourier's

More information

Capacity Limits of MIMO Channels

Capacity Limits of MIMO Channels Tutorial and 4G Systems Capacity Limits of MIMO Channels Markku Juntti Contents 1. Introduction. Review of information theory 3. Fixed MIMO channels 4. Fading MIMO channels 5. Summary and Conclusions References

More information

Voice Quality Evaluation and the Impact of Wireless Packet Communication Systems

Voice Quality Evaluation and the Impact of Wireless Packet Communication Systems 1 Voice Quality Evaluation in Wireless Packet Communication Systems: A Tutorial and Performance Results for ROHC Stephan Rein Frank H. P. Fitzek Martin Reisslein Abstract As wireless systems are evolving

More information

How to Measure Network Performance by Using NGNs

How to Measure Network Performance by Using NGNs Speech Quality Measurement Tools for Dynamic Network Management Simon Broom, Mike Hollier Psytechnics, 23 Museum Street, Ipswich, Suffolk, UK IP1 1HN Phone +44 (0)1473 261800, Fax +44 (0)1473 261880 simon.broom@psytechnics.com

More information

Performance analysis of bandwidth efficient coherent modulation schemes with L-fold MRC and SC in Nakagami-m fading channels

Performance analysis of bandwidth efficient coherent modulation schemes with L-fold MRC and SC in Nakagami-m fading channels Title Performance analysis of bandwidth efficient coherent modulation schemes with L-fold MRC and SC in Nakagami-m fading channels Author(s) Lo, CM; Lam, WH Citation Ieee International Symposium On Personal,

More information

Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids

Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids Synergies and Distinctions Peter Vary RWTH Aachen University Institute of Communication Systems WASPAA, October 23, 2013 Mohonk Mountain

More information

The Effect of Network Cabling on Bit Error Rate Performance. By Paul Kish NORDX/CDT

The Effect of Network Cabling on Bit Error Rate Performance. By Paul Kish NORDX/CDT The Effect of Network Cabling on Bit Error Rate Performance By Paul Kish NORDX/CDT Table of Contents Introduction... 2 Probability of Causing Errors... 3 Noise Sources Contributing to Errors... 4 Bit Error

More information

Basic principles of Voice over IP

Basic principles of Voice over IP Basic principles of Voice over IP Dr. Peter Počta {pocta@fel.uniza.sk} Department of Telecommunications and Multimedia Faculty of Electrical Engineering University of Žilina, Slovakia Outline VoIP Transmission

More information

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions

JPEG compression of monochrome 2D-barcode images using DCT coefficient distributions Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai

More information

Room Acoustic Reproduction by Spatial Room Response

Room Acoustic Reproduction by Spatial Room Response Room Acoustic Reproduction by Spatial Room Response Rendering Hoda Nasereddin 1, Mohammad Asgari 2 and Ayoub Banoushi 3 Audio Engineer, Broadcast engineering department, IRIB university, Tehran, Iran,

More information

Signal Detection. Outline. Detection Theory. Example Applications of Detection Theory

Signal Detection. Outline. Detection Theory. Example Applications of Detection Theory Outline Signal Detection M. Sami Fadali Professor of lectrical ngineering University of Nevada, Reno Hypothesis testing. Neyman-Pearson (NP) detector for a known signal in white Gaussian noise (WGN). Matched

More information

Adaptive Equalization of binary encoded signals Using LMS Algorithm

Adaptive Equalization of binary encoded signals Using LMS Algorithm SSRG International Journal of Electronics and Communication Engineering (SSRG-IJECE) volume issue7 Sep Adaptive Equalization of binary encoded signals Using LMS Algorithm Dr.K.Nagi Reddy Professor of ECE,NBKR

More information

MATLAB-based Applications for Image Processing and Image Quality Assessment Part I: Software Description

MATLAB-based Applications for Image Processing and Image Quality Assessment Part I: Software Description RADIOENGINEERING, VOL. 20, NO. 4, DECEMBER 2011 1009 MATLAB-based Applications for Image Processing and Image Quality Assessment Part I: Software Description Lukáš KRASULA, Miloš KLÍMA, Eric ROGARD, Edouard

More information

Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT)

Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) Page 1 Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) ECC RECOMMENDATION (06)01 Bandwidth measurements using FFT techniques

More information

Classic EEG (ERPs)/ Advanced EEG. Quentin Noirhomme

Classic EEG (ERPs)/ Advanced EEG. Quentin Noirhomme Classic EEG (ERPs)/ Advanced EEG Quentin Noirhomme Outline Origins of MEEG Event related potentials Time frequency decomposition i Source reconstruction Before to start EEGlab Fieldtrip (included in spm)

More information

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm 1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,

More information

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

More information

A Comparison of Speech Coding Algorithms ADPCM vs CELP. Shannon Wichman

A Comparison of Speech Coding Algorithms ADPCM vs CELP. Shannon Wichman A Comparison of Speech Coding Algorithms ADPCM vs CELP Shannon Wichman Department of Electrical Engineering The University of Texas at Dallas Fall 1999 December 8, 1999 1 Abstract Factors serving as constraints

More information

BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION

BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium Peter.Vanroose@esat.kuleuven.ac.be

More information

IEEE Proof. Web Version. PROGRESSIVE speaker adaptation has been considered

IEEE Proof. Web Version. PROGRESSIVE speaker adaptation has been considered IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 1 A Joint Factor Analysis Approach to Progressive Model Adaptation in Text-Independent Speaker Verification Shou-Chun Yin, Richard Rose, Senior

More information

Auto-Tuning Using Fourier Coefficients

Auto-Tuning Using Fourier Coefficients Auto-Tuning Using Fourier Coefficients Math 56 Tom Whalen May 20, 2013 The Fourier transform is an integral part of signal processing of any kind. To be able to analyze an input signal as a superposition

More information

Characterizing Digital Cameras with the Photon Transfer Curve

Characterizing Digital Cameras with the Photon Transfer Curve Characterizing Digital Cameras with the Photon Transfer Curve By: David Gardner Summit Imaging (All rights reserved) Introduction Purchasing a camera for high performance imaging applications is frequently

More information

The CUSUM algorithm a small review. Pierre Granjon

The CUSUM algorithm a small review. Pierre Granjon The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................

More information

Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox. Application Note

Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox. Application Note Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox Application Note Introduction Of all the signal engines in the N7509A, the most complex is the multi-tone engine. This application

More information

Developing an Isolated Word Recognition System in MATLAB

Developing an Isolated Word Recognition System in MATLAB MATLAB Digest Developing an Isolated Word Recognition System in MATLAB By Daryl Ning Speech-recognition technology is embedded in voice-activated routing systems at customer call centres, voice dialling

More information

application note Directional Microphone Applications Introduction Directional Hearing Aids

application note Directional Microphone Applications Introduction Directional Hearing Aids APPLICATION NOTE AN-4 Directional Microphone Applications Introduction The inability to understand speech in noisy environments is a significant problem for hearing impaired individuals. An omnidirectional

More information

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses Michael R. Powers[ ] Temple University and Tsinghua University Thomas Y. Powers Yale University [June 2009] Abstract We propose a

More information

HD Radio FM Transmission System Specifications Rev. F August 24, 2011

HD Radio FM Transmission System Specifications Rev. F August 24, 2011 HD Radio FM Transmission System Specifications Rev. F August 24, 2011 SY_SSS_1026s TRADEMARKS HD Radio and the HD, HD Radio, and Arc logos are proprietary trademarks of ibiquity Digital Corporation. ibiquity,

More information

A Digital Audio Watermark Embedding Algorithm

A Digital Audio Watermark Embedding Algorithm Xianghong Tang, Yamei Niu, Hengli Yue, Zhongke Yin Xianghong Tang, Yamei Niu, Hengli Yue, Zhongke Yin School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang, 3008, China tangxh@hziee.edu.cn,

More information

BER Performance Analysis of SSB-QPSK over AWGN and Rayleigh Channel

BER Performance Analysis of SSB-QPSK over AWGN and Rayleigh Channel Performance Analysis of SSB-QPSK over AWGN and Rayleigh Channel Rahul Taware ME Student EXTC Department, DJSCOE Vile-Parle (W) Mumbai 056 T. D Biradar Associate Professor EXTC Department, DJSCOE Vile-Parle

More information

STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION

STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION Adiel Ben-Shalom, Michael Werman School of Computer Science Hebrew University Jerusalem, Israel. {chopin,werman}@cs.huji.ac.il

More information

Adjusting Voice Quality

Adjusting Voice Quality Adjusting Voice Quality Electrical Characteristics This topic describes the electrical characteristics of analog voice and the factors affecting voice quality. Factors That Affect Voice Quality The following

More information

PeakVue Analysis for Antifriction Bearing Fault Detection

PeakVue Analysis for Antifriction Bearing Fault Detection August 2011 PeakVue Analysis for Antifriction Bearing Fault Detection Peak values (PeakVue) are observed over sequential discrete time intervals, captured, and analyzed. The analyses are the (a) peak values

Tech Note. Introduction. Definition of Call Quality. Contents. Voice Quality Measurement Understanding VoIP Performance. Title Series.

Date: January 2005. Overview: This tech note describes commonly-used call quality measurement methods, explains the metrics

RECOMMENDATION ITU-R SM.1792. Measuring sideband emissions of T-DAB and DVB-T transmitters for monitoring purposes

(2007) Scope: This Recommendation provides guidance to measurement

Digital Modulation. David Tipper. Department of Information Science and Telecommunications University of Pittsburgh. Typical Communication System

David Tipper, Associate Professor, Department of Information Science and Telecommunications, University of Pittsburgh http://www.tele.pitt.edu/tipper.html Typical Communication System Source

Module 13 : Measurements on Fiber Optic Systems

Lecture : Measurements on Fiber Optic Systems. Objectives: In this lecture you will learn the following: Measurements on Fiber Optic Systems, Attenuation (Loss)

HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER

Gholamreza Anbarjafari icv Group, IMS Lab, Institute of Technology, University of Tartu, Tartu 50411, Estonia sjafari@ut.ee

Section 5.0 : Horn Physics. By Martin J. King, 6/29/08 Copyright 2008 by Martin J. King. All Rights Reserved.

Before discussing the design of a horn loaded loudspeaker system, it is

IMPLEMENTATION OF THE ADAPTIVE FILTER FOR VOICE COMMUNICATIONS WITH CONTROL SYSTEMS

JAN VAŇUŠ Abstract: In the paper is described use of the draft method for optimal setting values of the filter length

Jitter Measurements in Serial Data Signals

Michael Schnecker, Product Manager, LeCroy Corporation. Introduction: The increasing speed of serial data transmission systems places greater importance on measuring

EEG COHERENCE AND PHASE DELAYS: COMPARISONS BETWEEN SINGLE REFERENCE, AVERAGE REFERENCE AND CURRENT SOURCE DENSITY

Version 1, June 13, 2004. Rough Draft form. We apologize while we prepare the manuscript for publication but the data are valid and the conclusions are fundamental

SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY

13th World Conference on Earthquake Engineering, Vancouver, B.C., Canada, August 1-6, 2004, Paper No. 296. ASHOK KUMAR SUMMARY One of the important

A Secure File Transfer based on Discrete Wavelet Transformation and Audio Watermarking Techniques

Vineela Behara, Y Ramesh Department of Computer Science and Engineering Aditya Institute of Technology and

Audio Engineering Society. Convention Paper. Presented at the 119th Convention 2005 October 7-10 New York, New York USA

This convention paper has been reproduced from the author's advance manuscript, without editing,

The Optimization of Parameters Configuration for AMR Codec in Mobile Networks

2013 8th International Conference on Communications and Networking in China (CHINACOM) Nan Ha, Jing Wang, Zesong Fei, Wenzhi Li,

Welcome to the United States Patent and TradeMark Office

an Agency of the United States Department of Commerce United States Patent 5,159,703 Lowery October 27, 1992 Silent subliminal presentation system
