Marquette University
Electrical and Computer Engineering Faculty Research and Publications
Electrical and Computer Engineering, Department of
2-2014

Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors

Marek B. Trawicki, Marquette University
Michael T. Johnson, Marquette University

Part of the Computer Engineering Commons, and the Electrical and Computer Engineering Commons

Recommended Citation
Trawicki, Marek B. and Johnson, Michael T., "Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors" (2014). Electrical and Computer Engineering Faculty Research and Publications.
Marquette University
Electrical and Computer Engineering Faculty Research and Publications/College of Engineering

This paper is NOT THE PUBLISHED VERSION; it is the author's final, peer-reviewed manuscript. The published version may be accessed by following the link in the citation below.

Speech Communication, Vol. 57 (February 2014). DOI. This article is © Elsevier and permission has been granted for this version to appear in e-Publications@Marquette. Elsevier does not grant permission for this article to be further copied/distributed or hosted elsewhere without the express permission from Elsevier.

Speech Enhancement Using Bayesian Estimators of the Perceptually-Motivated Short-Time Spectral Amplitude (STSA) with Chi Speech Priors

Marek B. Trawicki
Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, Milwaukee, WI

Michael T. Johnson
Marquette University, Department of Electrical and Computer Engineering, Speech and Signal Processing Laboratory, Milwaukee, WI

Abstract

In this paper, the authors propose new perceptually-motivated Weighted Euclidean (WE) and Weighted Cosh (WCOSH) estimators that utilize more appropriate Chi statistical models for the speech prior with Gaussian statistical models for the noise likelihood. Whereas the perceptually-motivated WE and WCOSH cost functions emphasize spectral valleys rather than spectral peaks (formants) and indirectly account for auditory masking effects, the incorporation of the Chi distribution statistical models has demonstrated distinct improvement over the Rayleigh statistical models for the speech prior. The estimators incorporate both weighting law and shape parameters on the cost functions and distributions. Performance is evaluated in terms of the Segmental Signal-to-Noise Ratio (SSNR), Perceptual Evaluation of Speech Quality (PESQ), and Signal-to-Noise Ratio (SNR) Loss objective quality measures to determine the amount of noise reduction along with the improvement in overall speech quality and speech intelligibility. Based on experimental results across three different input SNRs and eight unique noises along with various weighting law and shape parameters, the two general, less complicated, closed-form estimators of WE and WCOSH with Chi speech priors provide significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility over the baseline WE and WCOSH with the standard Rayleigh speech priors. Overall, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi distributions for the speech prior to improve enhancement.

Keywords
Speech enhancement, Probability, Amplitude estimation, Phase estimation, Parameter estimation

1. Introduction

Speech enhancement systems concern themselves with reducing the corrupting background noise in the noisy signal (Loizou, 2007).
The most common approach is to perform statistical estimation: minimize the Bayes risk of the squared-error spectral amplitude cost function, which leads to the traditional Ephraim and Malah Minimum Mean-Square Error (MMSE) short-time spectral amplitude (STSA) estimator (Ephraim and Malah, 1984). Based on the effectiveness of that STSA estimator, researchers began to replace the squared-error spectral amplitude cost function with more subjectively meaningful cost functions. Ephraim and Malah (1985) also developed and implemented the MMSE log-spectral amplitude (LSA) estimator, which minimizes the squared error of the log-spectral amplitude, a more subjectively meaningful cost function that correlates well with human perception. From the STSA and LSA cost functions, Loizou (2005) constructed several perceptually-motivated spectral amplitude cost functions that emphasized spectral valleys rather than spectral peaks (formants) and indirectly accounted for auditory masking effects. Specifically, the Weighted Euclidean (WE) and Weighted Cosh (WCOSH) Bayesian estimators, which applied a weighting law parameter to the STSA cost function, performed best at reducing residual noise and producing better speech quality. In each of the corresponding spectral amplitude, log-spectral amplitude, and perceptually-motivated spectral amplitude estimators, the cost functions employed Rayleigh distributions for the statistical models of the speech priors and noise likelihoods.

Eventually, researchers began to exploit alternative and more accurate statistical modeling assumptions than the Rayleigh distribution for both the speech prior and noise likelihood using the STSA cost function. Andrianakis and White (2009) continued with the MMSE spectral amplitude estimators using the Gamma distribution but introduced the Chi distribution for modeling the speech priors.
The Chi speech prior contains a shaping parameter that was varied to determine its effect on the quality of enhanced speech. From the results, the performance of the estimators was dependent on the shaping parameter, which controlled the trade-off between the level of residual noise and musical tones. As a generalization of Ephraim and Malah's MMSE STSA and LSA estimators along with Andrianakis and White's Chi distribution speech priors, Breithaupt et al. (2008) developed an MMSE STSA estimator that uses both a variable compression function in the error criterion and the Chi distribution as a prior model. The resulting two parameters provide for the reduction of musical noise, speech distortion, and noise distortion. Through the incorporation of Chi distribution statistical models for the speech prior, the squared-error cost functions demonstrated distinct improvement over the Rayleigh statistical models.

Despite the success of the spectral amplitude, log-spectral amplitude, and perceptually-motivated cost functions with Rayleigh statistical models, and of spectral amplitude cost functions with Chi distributions for the speech priors, there has not been any work to capitalize on their mutual benefits for speech enhancement. Specifically, the improved statistical models for the speech prior have only been incorporated with the original MMSE STSA estimator, not with the perceptually-motivated spectral amplitude (WE and WCOSH) cost functions. The fundamental purpose is to determine the effect that more accurate speech priors have on improved cost functions for noise reduction. Instead of the Rayleigh distribution for the speech prior, the Chi distribution is employed in this work since it leads to more general, less complicated, closed-form estimator solutions. For specific values of the shaping parameter, the Chi distribution reduces to the Half-Gaussian and Rayleigh distributions as special cases. Therefore, the focus of this work is to use the MMSE WE and WCOSH estimators with the Chi spectral speech prior distribution (Johnson et al., 1994) for reducing the background noise along with improving overall speech quality and speech intelligibility.

The remainder of this paper is organized into the following sections: system and statistical models (Section 2), perceptually-motivated cost functions with Chi speech priors (Section 3), experiments and results (Section 4), and conclusion (Section 5).

2. System and statistical models

In the time domain, the single-channel additive noise model is given as

$$y(t) = s(t) + d(t) \tag{1}$$

where $s(t)$, $d(t)$, and $y(t)$ represent the clean, noise, and noisy signals.
By taking the short-time Fourier transform, (1) can be written in the frequency domain as

$$\begin{aligned} Y(l,k) &= S(l,k) + D(l,k) \\ R(l,k)e^{j\vartheta(l,k)} &= X(l,k)e^{j\alpha(l,k)} + N(l,k)e^{j\phi(l,k)} \end{aligned} \tag{2}$$

where $l$ and $k$ are the frame and frequency bin indices, with noisy, clean, and noise spectral amplitudes $R$, $X$, and $N$ and noisy, clean, and noise spectral phases $\vartheta$, $\alpha$, and $\phi$. As opposed to using the traditional Rayleigh statistical models for both the speech prior and noise likelihood, the traditional Rayleigh speech prior given as

$$p(X,\alpha) = \frac{X}{\pi\sigma_X^2}\exp\left(-\frac{X^2}{\sigma_X^2}\right) \tag{3}$$

is modified through the use of Chi speech priors (Johnson et al., 1994), where $\sigma_X^2$ is the speech spectral variance. Specifically, the Chi speech prior is given as

$$p(X,\alpha) = \frac{2}{\theta^{a}\,\Gamma(a)}\,X^{2a-1}\exp\left(-\frac{X^2}{\theta}\right), \tag{4}$$

where $\sigma_X^2 = \theta a$ with shape parameter $a$ and scaling parameter $\theta$, and $\Gamma(\cdot)$ is the gamma function. With $a = 0.5$ and $a = 1$, (4) is equivalent to the Half-Gaussian and Rayleigh distributions. The noise likelihood is still modeled as a Gaussian distribution given as

$$p(Y|X,\alpha) = \frac{1}{\pi\sigma_N^2}\exp\left(-\frac{\left|Y - Xe^{j\alpha}\right|^2}{\sigma_N^2}\right), \tag{5}$$

where $\sigma_N^2$ is the noise spectral variance. In order to simplify the notation, $\lambda = \sigma^2$, $\lambda_X = \sigma_X^2$, and $\lambda_N = \sigma_N^2$ are used as the spectral variances in the derivation of the WE with Chi speech prior estimator and the WCOSH with Chi speech prior estimator.

3. Perceptually-motivated cost functions with Chi speech priors

3.1. Weighted Euclidean (WE)

From the work in Loizou (2005), the Weighted Euclidean (WE) cost function is given as

$$d_{WE}(X,\hat{X}) = (X - \hat{X})^2 X^{p} \tag{6}$$

with estimator equation

$$\hat{X}_{WE} = \frac{\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}{\int_0^\infty\int_0^{2\pi} X^{p}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}, \tag{7}$$

where $p$ is the weighting law parameter. For $p = 0$, (7) is equivalent to the MMSE STSA estimator (Ephraim and Malah, 1984; Loizou, 2005; Gray et al., 1980). Through the substitution of the statistical models in (4), (5) and using the integral formulas in Gradshteyn and Ryzhik (2007), the spectral phase is integrated out of the two integrals as

$$\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \int_0^\infty X^{p+2a}\exp\left(-\frac{X^2}{\lambda_a}\right) J_0\left(i2X\sqrt{\frac{v}{\lambda}}\right)dX \tag{8}$$

and

$$\int_0^\infty\int_0^{2\pi} X^{p}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \int_0^\infty X^{p+2a-1}\exp\left(-\frac{X^2}{\lambda_a}\right) J_0\left(i2X\sqrt{\frac{v}{\lambda}}\right)dX, \tag{9}$$

where $v = \frac{\xi}{1+\xi}\gamma$ and $\lambda$ is defined as

$$\frac{1}{\lambda} = \frac{1}{\lambda_X} + \frac{1}{\lambda_N} \tag{10}$$

in (A.3) of Ephraim and Malah (1984), $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind, and

$$\frac{1}{\lambda_a} = \frac{a}{\lambda_X} + \frac{1}{\lambda_N}, \tag{11}$$

which is equivalent to $1/\lambda$ in (10) for $a = 1$. By utilizing the integral formulas in Gradshteyn and Ryzhik (2007), (8), (9) are given as

$$\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \Gamma\left(\frac{p+2a+1}{2}\right)\lambda_a^{(p+2a+1)/2}\;{}_1F_1\left(\frac{1-p-2a}{2};1;-\frac{v\lambda_a}{\lambda}\right) \tag{12}$$

and

$$\int_0^\infty\int_0^{2\pi} X^{p}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \Gamma\left(\frac{p+2a}{2}\right)\lambda_a^{(p+2a)/2}\;{}_1F_1\left(\frac{2-p-2a}{2};1;-\frac{v\lambda_a}{\lambda}\right), \tag{13}$$

where ${}_1F_1(\cdot;\cdot;\cdot)$ is the confluent hypergeometric function. With the combination and simplification of the integrals in (12), (13), the final form of the new WE estimator with Chi speech prior in (4) is given as

$$\hat{X}_{WE,CHI} = G_{WE,CHI}\,R = \frac{\Gamma\left(\frac{p+2a+1}{2}\right)}{\Gamma\left(\frac{p+2a}{2}\right)}\,\frac{\sqrt{v_a}}{\gamma}\,\frac{{}_1F_1\left(\frac{1-p-2a}{2};1;-v\zeta_a\right)}{{}_1F_1\left(\frac{2-p-2a}{2};1;-v\zeta_a\right)}\,R, \tag{14}$$

where

$$v_a = \frac{\xi}{a+\xi}\,\gamma \tag{15}$$

and

$$\zeta_a = \frac{1+\xi}{a+\xi}, \tag{16}$$

for $p + 2a > 0$, with gain function $G_{WE,CHI}$, a priori SNR $\xi = \sigma_X^2/\sigma_N^2$, and a posteriori SNR $\gamma = R^2/\sigma_N^2$. For $a = 1$ and $p = 0$, (14) is exactly equivalent to the STSA estimator with Rayleigh speech prior (Ephraim and Malah, 1984).

3.2. Weighted Cosh (WCOSH)

In Loizou (2005), the Weighted Cosh (WCOSH) cost function is given as

$$d_{WCOSH}(X,\hat{X}) = \left[\frac{X}{\hat{X}} + \frac{\hat{X}}{X} - 1\right]X^{p} \tag{17}$$

with estimator equation

$$\hat{X}_{WCOSH} = \left[\frac{\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}{\int_0^\infty\int_0^{2\pi} X^{p-1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX}\right]^{1/2}, \tag{18}$$

where $p$ is the weighting law parameter. For $p = 0$, (18) is equivalent to the estimator for the Cosh cost function given in Loizou (2005) and Gray et al. (1980). In order to determine the final estimator equation for the WCOSH with Chi speech prior, the integrals are derived with the same approach as for the WE with Chi speech prior estimator in (14). By the substitution of the statistical models in (4), (5) and using the integral formulas in Gradshteyn and Ryzhik (2007), the spectral phase is integrated out of the two integrals as

$$\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \int_0^\infty X^{p+2a}\exp\left(-\frac{X^2}{\lambda_a}\right) J_0\left(i2X\sqrt{\frac{v}{\lambda}}\right)dX \tag{19}$$

and

$$\int_0^\infty\int_0^{2\pi} X^{p-1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \int_0^\infty X^{p+2a-2}\exp\left(-\frac{X^2}{\lambda_a}\right) J_0\left(i2X\sqrt{\frac{v}{\lambda}}\right)dX, \tag{20}$$

where $1/\lambda_a$ is defined in (11). Through the integral formulas in Gradshteyn and Ryzhik (2007), (19), (20) are given as

$$\int_0^\infty\int_0^{2\pi} X^{p+1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \Gamma\left(\frac{p+2a+1}{2}\right)\lambda_a^{(p+2a+1)/2}\;{}_1F_1\left(\frac{1-p-2a}{2};1;-\frac{v\lambda_a}{\lambda}\right) \tag{21}$$

and

$$\int_0^\infty\int_0^{2\pi} X^{p-1}\,p(Y|X,\alpha)\,p(X,\alpha)\,d\alpha\,dX \propto \Gamma\left(\frac{p+2a-1}{2}\right)\lambda_a^{(p+2a-1)/2}\;{}_1F_1\left(\frac{3-p-2a}{2};1;-\frac{v\lambda_a}{\lambda}\right). \tag{22}$$

With the combination and simplification of the integrals in (21), (22) and using $v_a$ and $\zeta_a$ in (15) and (16), the final form of the new WCOSH estimator with Chi speech prior in (4) is given as

$$\hat{X}_{WCOSH,CHI} = G_{WCOSH,CHI}\,R = \left[\frac{\Gamma\left(\frac{p+2a+1}{2}\right)}{\Gamma\left(\frac{p+2a-1}{2}\right)}\right]^{1/2}\frac{\sqrt{v_a}}{\gamma}\left[\frac{{}_1F_1\left(\frac{1-p-2a}{2};1;-v\zeta_a\right)}{{}_1F_1\left(\frac{3-p-2a}{2};1;-v\zeta_a\right)}\right]^{1/2}R \tag{23}$$

for $p + 2a > 1$ with gain function $G_{WCOSH,CHI}$. For $a = 1$ and $p = 0$, (23) is similar to the LSA estimator with Rayleigh speech prior (Ephraim and Malah, 1985). Comparing the WCOSH equations (18) to (23) with their WE counterparts, the only differences are the integral in the denominator and the square root.

Fig. 1, Fig. 2, Fig. 3 (WE with Chi speech prior estimator) and Fig. 4, Fig. 5, Fig. 6 (WCOSH with Chi speech prior estimator) present the gain functions $G_{WE,CHI}$ and $G_{WCOSH,CHI}$ given in (14), (23) using representative weighting law parameters of $p_{WE} = \{-1.0, -0.5, -0.25\}$ and $p_{WCOSH} = \{-0.75, -0.5, -0.25\}$ as a function of instantaneous SNR $\gamma_k - 1$ for three fixed a priori SNR $\xi_k$ values of 0, 5, and 10 dB and valid shaping parameter $a$ values.

Fig. 1. Gain curves for WE ($p = -1.0$) estimator with Chi prior.
Fig. 2. Gain curves for WE ($p = -0.5$) estimator with Chi prior.
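As a rough numerical companion to the gain functions in (14) and (23), the following sketch evaluates both gains with SciPy. The function names `gain_we_chi` and `gain_wcosh_chi` are hypothetical labels introduced here and are not the authors' code. The sketch assumes the Ephraim and Malah definition $v = \xi\gamma/(1+\xi)$, under which $v\zeta_a$ reduces to $v_a$, and it takes $\xi$ and $\gamma$ on a linear scale rather than in dB. `scipy.special.hyp1f1` may lose accuracy for large negative arguments, so values at very high SNR should be treated with care.

```python
import numpy as np
from scipy.special import gammaln, hyp1f1


def _terms(xi, gamma_k, a):
    """Return (v_a, z, gamma_k) with z = -v * zeta_a, the 1F1 argument."""
    xi = np.asarray(xi, dtype=float)
    gamma_k = np.asarray(gamma_k, dtype=float)
    v = xi / (1.0 + xi) * gamma_k       # assumed Ephraim-Malah definition of v
    zeta_a = (1.0 + xi) / (a + xi)      # equation (16)
    v_a = xi / (a + xi) * gamma_k       # equation (15)
    return v_a, -v * zeta_a, gamma_k


def gain_we_chi(xi, gamma_k, p=-1.0, a=1.0):
    """WE gain with Chi prior, equation (14); valid for p + 2a > 0."""
    if p + 2.0 * a <= 0.0:
        raise ValueError("WE gain requires p + 2a > 0")
    v_a, z, gamma_k = _terms(xi, gamma_k, a)
    # Ratio of gamma functions computed in the log domain for stability.
    log_ratio = gammaln((p + 2 * a + 1) / 2) - gammaln((p + 2 * a) / 2)
    hyp_ratio = (hyp1f1((1 - p - 2 * a) / 2, 1.0, z)
                 / hyp1f1((2 - p - 2 * a) / 2, 1.0, z))
    return np.exp(log_ratio) * np.sqrt(v_a) / gamma_k * hyp_ratio


def gain_wcosh_chi(xi, gamma_k, p=-0.5, a=1.0):
    """WCOSH gain with Chi prior, equation (23); valid for p + 2a > 1."""
    if p + 2.0 * a <= 1.0:
        raise ValueError("WCOSH gain requires p + 2a > 1")
    v_a, z, gamma_k = _terms(xi, gamma_k, a)
    log_ratio = gammaln((p + 2 * a + 1) / 2) - gammaln((p + 2 * a - 1) / 2)
    hyp_ratio = (hyp1f1((1 - p - 2 * a) / 2, 1.0, z)
                 / hyp1f1((3 - p - 2 * a) / 2, 1.0, z))
    return np.sqrt(np.exp(log_ratio) * hyp_ratio) * np.sqrt(v_a) / gamma_k
```

For example, `gain_we_chi(10 ** (5 / 10), 2.0, p=-1.0, a=0.6)` evaluates the WE gain at a 5 dB a priori SNR. With `p=0.0` and `a=1.0` the WE gain should coincide with the classical Ephraim and Malah STSA gain, which gives a convenient sanity check.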
Fig. 3. Gain curves for WE ($p = -0.25$) estimator with Chi prior.
Fig. 4. Gain curves for WCOSH ($p = -0.75$) estimator with Chi prior.
Fig. 5. Gain curves for WCOSH ($p = -0.5$) estimator with Chi prior.
Fig. 6. Gain curves for WCOSH ($p = -0.25$) estimator with Chi prior.

From the gain curves, there are several interesting observations to note for both the WE and WCOSH with Chi and Rayleigh speech prior estimators. For both sets of estimators across the different a priori SNR $\xi_k$, the gains were smaller in value (more attenuation) as the shaping parameter $a$ approached its limiting value with a decrease in the instantaneous a posteriori SNR $\gamma_k - 1$. As the shaping parameter $a \to 1$, which is the Rayleigh speech prior, the gains had a flatter shape and larger value (less attenuation). Regardless of the a priori SNR $\xi_k$ and shaping parameter $a$, the gains all eventually converged to approximately 6 dB at around an a posteriori SNR of 8 1 dB with an increase of the instantaneous a posteriori SNR $\gamma_k - 1$, which was essentially independent of the weighting law parameter $p$.

With the WE with Chi speech prior estimator, the increase in the weighting law parameter $p$, which in turn causes an increase in the range of valid shaping parameters $a$, generated gains with more attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ (and less attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$) using the limiting value of the shaping parameter $a$. The gains with an increase of weighting law parameter $p$ and shaping parameter $a \to 1$ (Rayleigh speech prior) had less attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ and no substantial change in attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$. For the WCOSH with Chi speech prior estimator, the gains were much more dependent on the a priori SNR $\xi_k$ than for the WE with Chi speech prior estimator. For a particular weighting law parameter $p$ across all shaping parameters $a$, the gains had less attenuation with an increase in the a priori SNR $\xi_k$. Comparing the same weighting law parameters $p = -0.5$ (Fig. 2, Fig. 5) and $p = -0.25$ (Fig. 3, Fig. 6) across the WE and WCOSH with Chi speech prior estimators, the gains associated with the WE with Chi speech prior estimator had significantly more attenuation at lower instantaneous a posteriori SNR $\gamma_k - 1$ (and similar attenuation at higher instantaneous a posteriori SNR $\gamma_k - 1$) than the gains associated with the WCOSH with Chi speech prior estimator.

4. Experiments and results

The proposed WE and WCOSH with Chi speech prior estimators given in (14), (23) were evaluated using the objective measures of Segmental Signal-to-Noise Ratio (SSNR) (Papamichalis, 1987), Perceptual Evaluation of Speech Quality (PESQ) (ITU, 2003; Hu and Loizou, 2007; Hu and Loizou, 2008; Rix et al., 2001), and Signal-to-Noise Ratio (SNR) Loss (Ma and Loizou, 2011) to assess noise reduction, overall speech quality, and speech intelligibility, where higher PESQ scores and lower SNR Loss scores indicate better performance.
In particular, the performance is given via SSNR, PESQ, and SNR Loss improvements, where each improvement is calculated as the output score (enhanced signal) minus the input score (noisy signal). Clean and noisy speech were taken from the noisy speech corpus (NOIZEUS) (Hu and Loizou, 2007), which contains 30 IEEE sentences (Subcommittee, 1969) (produced by three male and three female speakers) corrupted by eight different real-world noises at SNRs ranging from 0 to 15 dB in increments of 5 dB. The noises were taken from the AURORA database (Pearce and Hirsch, 2000), which includes airport, babble, car, exhibition, restaurant, station, street, and train noises.

The analysis conditions consisted of frames of 256 samples (25.6 ms) with 50% overlap using Hanning windows. Noise estimation was performed on an initial silence of 5 frames. The decision-directed (DD) smoothing approach (Ephraim and Malah, 1984) was used to estimate $\xi$ with $\alpha_{SNR} = 0.98$ using thresholds of $\xi_{min} = 10^{-25/10}$ and $\gamma_{min} = 40$. In order to evaluate the performance, the enhanced signals were reconstructed using the overlap-add technique. The shape parameter $a$ in the Chi speech prior was varied for specific weighting law parameters $p$ to determine its effect on enhancement, quality, and intelligibility, with results averaged over 30 utterances across the 8 different noises at input SNRs of 0, 5, and 10 dB. As recommended by Loizou (2005), $p_{WE} = -1$ and $p_{WCOSH} = -0.5$ were selected as the weighting law parameters $p$ to achieve the best overall speech quality in the enhancement process.

Fig. 7 and Fig. 8 illustrate the SSNR improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with a particular weighting law parameter $p$.
The WE and WCOSH with Chi speech prior estimators consistently produced 3 dB (0 dB input SNR), 1 dB (5 dB input SNR), and dB (10 dB input SNR) over the baseline WE and WCOSH with Rayleigh speech prior estimators, which typically occurred at the limiting value of the shaping parameter $a$ for the corresponding weighting law parameter $p$, namely $a \to 0.5$ ($p_{WE} = -1.0$) and $a \to 0.75$ ($p_{WCOSH} = -0.5$). At the limiting shaping parameter $a$, the WE and WCOSH with Chi speech prior estimators achieved maximum SSNR improvements of 9 to 13 dB (0 dB input SNR), 6 to 9 dB (5 dB input SNR), and 4 to 5 dB (10 dB input SNR) across the car, train, station, exhibition, street, babble, and airport noises. Comparing the two estimators, the WE with Chi speech prior estimator had slightly better SSNR improvement than the WCOSH with Chi speech prior estimator for noise reduction.
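To make the improvement measure concrete, the following sketch computes a frame-based segmental SNR and the corresponding improvement as the enhanced-signal score minus the noisy-signal score. It is only an illustration under stated assumptions: the function names, the default frame length and hop, and the per-frame clamping limits of -10 and 35 dB are hypothetical choices made here (the clamping is a common convention in the literature), and none of them is taken from the authors' implementation.

```python
import numpy as np


def segmental_snr(clean, processed, frame_len=256, hop=128,
                  lo_db=-10.0, hi_db=35.0, eps=1e-12):
    """Mean per-frame SNR in dB between a clean and a processed signal.

    The clamping limits lo_db and hi_db are assumed defaults.
    """
    clean = np.asarray(clean, dtype=float)
    processed = np.asarray(processed, dtype=float)
    n = min(clean.size, processed.size)
    frame_snrs = []
    for start in range(0, n - frame_len + 1, hop):
        s = clean[start:start + frame_len]
        err = s - processed[start:start + frame_len]
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        frame_snrs.append(np.clip(snr_db, lo_db, hi_db))
    return float(np.mean(frame_snrs))


def ssnr_improvement(clean, noisy, enhanced, **kwargs):
    """SSNR of the enhanced signal minus SSNR of the noisy signal, in dB."""
    return segmental_snr(clean, enhanced, **kwargs) - segmental_snr(clean, noisy, **kwargs)
```

A positive return value from `ssnr_improvement` indicates that the enhanced signal is closer to the clean reference than the noisy input was. Halving the residual noise amplitude, for instance, raises every unclamped frame by about 6 dB.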
Fig. 7. SSNR improvements for MMSE WE estimator with Chi prior ($p = -1$).
Fig. 8. SSNR improvements for MMSE WCOSH estimator with Chi prior ($p = -0.5$).

Fig. 9 and Fig. 10 present the PESQ improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with a particular weighting law parameter $p$. In a similar fashion to the SSNR improvements, the WE and WCOSH with Chi speech prior estimators generated ..3 and ..1 gains over the baseline WE and WCOSH estimators with Rayleigh speech prior, with the most pronounced improvements occurring at input SNRs of 5 and 10 dB. In contrast to the SSNR improvements, which were almost exclusively dependent on the limiting shaping parameter $a$, the PESQ improvements diminished at a =.7.8 (WE with Chi speech prior estimator) and a =.85.9 (WCOSH with Chi speech prior estimator). For both the WE and WCOSH with Chi prior estimators, the maximum PESQ improvements ranged from ..55 (5 dB input SNR), ..5 (10 dB input SNR), and (0 dB input SNR) across the restaurant, airport, babble, street, exhibition, station, train, and car noises. Comparing the two estimators, the WE with Chi speech prior estimator had slightly better PESQ improvement than the WCOSH with Chi speech prior estimator for speech quality.
Fig. 9. PESQ improvements for MMSE WE estimator with Chi prior ($p = -1$).
Fig. 10. PESQ improvements for MMSE WCOSH estimator with Chi prior ($p = -0.5$).

Fig. 11 and Fig. 12 demonstrate the SNR Loss improvements for the WE and WCOSH with Chi speech prior estimators at various input SNRs, noises, and shaping parameters $a$ with a particular weighting law parameter $p$. The WE and WCOSH with Chi speech prior estimators typically yielded .1 (0 dB input SNR), .5 (5 dB input SNR), and .5 (10 dB input SNR) over the corresponding baseline WE and WCOSH with Rayleigh speech prior estimators, which occurred at a wide range of shaping parameters $a$. In contrast to the SSNR and PESQ improvements, the SNR Loss improvements were most noticeable at input SNRs of 5, 0, and 10 dB and for car, station, babble, airport, exhibition, restaurant, train, and street noises. In more specific terms, the WE and WCOSH with Chi speech prior estimators realized maximum SNR Loss improvements of (0 dB input SNR), .1.95 (5 dB input SNR), and (10 dB input SNR). Of the two estimators, the WE with Chi speech prior estimator often had larger decreases in SNR Loss over the baseline Rayleigh speech prior estimators than the WCOSH with Chi speech prior estimator.
Fig. 11. SNR Loss improvements for MMSE WE estimator with Chi prior ($p = -1$).
Fig. 12. SNR Loss improvements for MMSE WCOSH estimator with Chi prior ($p = -0.5$).

Table 1 to Table 6 show the SSNR improvement, PESQ improvement, and SNR Loss improvement for the WE and WCOSH with Chi speech prior estimators for two additional and representative weighting law parameters $p$. Whereas the WE with Chi speech prior estimator was examined with the weighting law parameters $p = -0.5$ ($a > 0.25$) and $p = -0.25$ ($a > 0.125$), the WCOSH with Chi speech prior estimator was examined with the weighting law parameters $p = -0.75$ ($a > 0.875$) and $p = -0.25$ ($a > 0.625$) according to the relationships $p + 2a > 0$ and $p + 2a > 1$. For each weighting law parameter $p$ of the WE and WCOSH with Chi speech prior estimators at the particular noise and input SNR, the SSNR improvement, PESQ improvement, and SNR Loss improvement results are provided alongside their corresponding shaping parameter $a$, where the shaping parameter $a \to 1$ represents the baseline WE and WCOSH with Rayleigh speech prior estimators.

In terms of SSNR improvements, the WE and WCOSH with Chi speech prior estimators generally produced .5.5 dB gains over the baseline WE and WCOSH with Rayleigh speech prior estimators. As the weighting law parameter $p$ was decreased in value, the SSNR improvement increased in value, where the maximum SSNR improvement ranged from 6 to 9 dB across the car, train, and babble noises. The WCOSH with Chi speech prior typically had smaller performance gains over the baseline WCOSH with Rayleigh speech prior because of the higher baseline SSNR improvements. For both the WE and WCOSH with Chi speech prior estimators, the limiting factor in SSNR improvement was the lower bound of the shaping parameter $a$.

For the PESQ improvements, the WE and WCOSH with Chi speech prior estimators generated upwards of .14 gains over the baseline WE and WCOSH with Rayleigh speech prior estimators. In a similar way to the SSNR improvements, the increase in the weighting law parameter $p$ caused a decrease in PESQ improvement. The maximum PESQ improvement ranged from .4 to .56 across the car, train, and babble noises, where the shaping parameter $a$ reached the maximum at a =.5.7 (WE with Chi speech prior estimator) and a =.9.99 (WCOSH with Chi speech prior estimator). In general, the WCOSH with Chi speech prior estimator did not always follow the same relationship between the weighting law parameter $p$ and PESQ improvement as the WE with Chi speech prior estimator.

With the SNR Loss improvements, the WE and WCOSH with Chi speech prior estimators supplied nearly .19 gains over the baseline WE and WCOSH with Rayleigh speech prior estimators. As with SSNR improvement and PESQ improvement, the SNR Loss improvement decreased in value with an increase in the weighting law parameter $p$. The car, babble, and train noises achieved maximum SNR Loss improvements of .88.11, which occurred in the range of a =.5.45 (WE with Chi speech prior estimator) and a/1. (WCOSH with Chi speech prior estimator). In most cases, the WCOSH with Chi speech prior estimator did not produce nearly as pronounced SNR Loss improvement gains as the WE with Chi speech prior estimator over the baseline WE and WCOSH with Rayleigh speech prior estimators.
Table 1. SSNR improvements for MMSE WE estimator with Chi prior ($p = -0.5$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.5$ and $p = -0.25$; final row AVG.

Table 2. SSNR improvements for MMSE WCOSH estimator with Chi prior ($p = -0.75$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.75$ and $p = -0.25$; final row AVG.

Table 3. PESQ improvements for MMSE WE estimator with Chi prior ($p = -0.5$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.5$ and $p = -0.25$; final row AVG.

Table 4. PESQ improvements for MMSE WCOSH estimator with Chi prior ($p = -0.75$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.75$ and $p = -0.25$; final row AVG.

Table 5. SNR Loss improvements for MMSE WE estimator with Chi prior ($p = -0.5$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.5$ and $p = -0.25$; final row AVG.

Table 6. SNR Loss improvements for MMSE WCOSH estimator with Chi prior ($p = -0.75$ and $p = -0.25$). Columns: SNR [dB]; Babble, Car, and Train, each at $p = -0.75$ and $p = -0.25$; final row AVG.
5. Conclusion

In this paper, the authors derived novel perceptually-motivated WE and WCOSH estimators using the more appropriate Chi speech prior as a substitute for the traditional Rayleigh speech prior to model the speech spectral amplitude. Fundamentally, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi distributions for the speech prior to provide gains in all phases of enhancement. The WE and WCOSH with Chi speech prior estimators incorporated weighting law and shape parameters on the cost functions and distributions. Instead of measuring the performance simply with the SSNR objective quality metric to determine the amount of noise reduction, the estimators were also evaluated using the PESQ and SNR Loss objective quality metrics to ascertain the level of overall speech quality and speech intelligibility compared to the original noisy signals at input SNRs of 0, 5, and 10 dB across airport, babble, car, exhibition, restaurant, station, street, and train noises.

With the WE and WCOSH with standard Rayleigh speech prior estimators serving as the baseline, the experimental results indicated that the new WE and WCOSH with Chi speech prior estimators provided significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility. Generally, the best results for the various objective quality metrics occurred for a particular weighting law parameter at the limiting value of the shaping parameter at lower input SNRs (SSNR improvement) and at various values of the shaping parameter at higher input SNRs (PESQ improvement and SNR Loss improvement). In more specific terms, the WE and WCOSH with Chi speech prior estimators consistently produced upwards of approximately 3 dB (SSNR improvement), .3 (PESQ improvement), and .5 (SNR Loss improvement) over the baseline WE and WCOSH with Rayleigh speech prior estimators.
Comparing the WE with Chi speech prior and WCOSH with Chi speech prior estimators, the WE with Chi speech prior estimator often had slightly better overall performance across the SSNR, PESQ, and SNR Loss objective quality metrics than the WCOSH with Chi speech prior estimator and would be the recommended estimator for filtering noisy signals with more negative values of the weighting law parameter. For future work, the WE and WCOSH estimators would involve further modifications to integrate even more generalized speech prior statistical models, namely the generalized Gamma speech prior, to obtain more gains in SSNR, PESQ, and SNR Loss improvements over the traditional Rayleigh speech prior.

References

Loizou, P.C., 2007. Speech Enhancement: Theory and Practice. CRC Press.
Ephraim, Y., Malah, D., 1984. Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech and Signal Processing ASSP-32.
Ephraim, Y., Malah, D., 1985. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Transactions on Acoustics, Speech and Signal Processing 33.
Loizou, P.C., 2005. Speech enhancement based on perceptually motivated Bayesian estimators of the magnitude spectrum. IEEE Transactions on Acoustics, Speech and Signal Processing 13.
Andrianakis, I., White, P.R., 2009. Speech spectral amplitude estimators using optimally-shaped gamma and chi priors. Speech Communication 51.
Breithaupt, C., Krawczyk, M., Martin, R., 2008. Parameterized MMSE spectral magnitude estimation for the enhancement of noisy speech. In: Presented at International Conference on Acoustics, Speech, and Signal Processing.
Johnson, N., Kotz, S., Balakrishnan, N., 1994. Continuous Univariate Distributions, vol. 1, 2nd ed. John Wiley and Sons, New York.
Gray, R.M., Buzo, A., Gray, J.A.H., Matsuyama, Y., 1980. Distortion measures for speech processing. IEEE Transactions on Acoustics, Speech and Signal Processing ASSP-28.
Gradshteyn, I.S., Ryzhik, I.M., 2007. Tables of Integrals, Series, and Products. Academic Press.
Papamichalis, P.E., 1987. Practical Approaches to Speech Coding. Prentice-Hall, New York, NY.
ITU, 2003. Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm. ITU-T Recommendation.
Hu, Y., Loizou, P.C., 2007. Subjective comparison and evaluation of speech enhancement algorithms. Speech Communication 49.
Hu, Y., Loizou, P., 2008. Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing 16.
Rix, A., Beerends, J., Hollier, M., Hekstra, A., 2001. Perceptual evaluation of speech quality (PESQ): A new method for speech quality assessment of telephone networks and codecs. In: Presented at IEEE International Conference on Acoustics, Speech, and Signal Processing.
Ma, J., Loizou, P.C., 2011. SNR loss: a new objective measure for predicting the intelligibility of noise-suppressed speech. Speech Communication 53.
Subcommittee, I., 1969. IEEE recommended practice for speech quality measurements. IEEE Transactions on Audio and Electroacoustics AU-17.
Pearce, D., Hirsch, H.-G., 2000. Performance evaluation of speech recognition systems under noisy conditions. In: Presented at 6th International Conference on Spoken Language Processing (ICSLP), Beijing, China.
More informationAudio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA
Audio Engineering Society Convention Paper Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA The papers at this Convention have been selected on the basis of a submitted abstract
More informationPERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS
The 21 st International Congress on Sound and Vibration 13-17 July, 2014, Beijing/China PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS Dan Wang, Nanjie Yan and Jianxin Peng*
More informationMUSICAL INSTRUMENT FAMILY CLASSIFICATION
MUSICAL INSTRUMENT FAMILY CLASSIFICATION Ricardo A. Garcia Media Lab, Massachusetts Institute of Technology 0 Ames Street Room E5-40, Cambridge, MA 039 USA PH: 67-53-0 FAX: 67-58-664 e-mail: rago @ media.
More informationWhite Paper. Comparison between subjective listening quality and P.862 PESQ score. Prepared by: A.W. Rix Psytechnics Limited
Comparison between subjective listening quality and P.862 PESQ score White Paper Prepared by: A.W. Rix Psytechnics Limited 23 Museum Street Ipswich, Suffolk United Kingdom IP1 1HN t: +44 (0) 1473 261 800
More informationAdvanced Signal Processing and Digital Noise Reduction
Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New
More informationMICROPHONE SPECIFICATIONS EXPLAINED
Application Note AN-1112 MICROPHONE SPECIFICATIONS EXPLAINED INTRODUCTION A MEMS microphone IC is unique among InvenSense, Inc., products in that its input is an acoustic pressure wave. For this reason,
More informationACOUSTICAL CONSIDERATIONS FOR EFFECTIVE EMERGENCY ALARM SYSTEMS IN AN INDUSTRIAL SETTING
ACOUSTICAL CONSIDERATIONS FOR EFFECTIVE EMERGENCY ALARM SYSTEMS IN AN INDUSTRIAL SETTING Dennis P. Driscoll, P.E. and David C. Byrne, CCC-A Associates in Acoustics, Inc. Evergreen, Colorado Telephone (303)
More informationRANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA
RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA ABSTRACT Random vibration is becoming increasingly recognized as the most realistic method of simulating the dynamic environment of military
More informationKhalid Sayood and Martin C. Rost Department of Electrical Engineering University of Nebraska
PROBLEM STATEMENT A ROBUST COMPRESSION SYSTEM FOR LOW BIT RATE TELEMETRY - TEST RESULTS WITH LUNAR DATA Khalid Sayood and Martin C. Rost Department of Electrical Engineering University of Nebraska The
More informationPHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUMBER OF REFERENCE SYMBOLS
PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUM OF REFERENCE SYMBOLS Benjamin R. Wiederholt The MITRE Corporation Bedford, MA and Mario A. Blanco The MITRE
More informationAnalysis/resynthesis with the short time Fourier transform
Analysis/resynthesis with the short time Fourier transform summer 2006 lecture on analysis, modeling and transformation of audio signals Axel Röbel Institute of communication science TU-Berlin IRCAM Analysis/Synthesis
More information5 Tips For Making the Most Out of Any Available Opportunities
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL 16, NO 3, MARCH 2008 541 Noise Tracking Using DFT Domain Subspace Decompositions Richard C Hendriks, Jesper Jensen, and Richard Heusdens
More informationA Microphone Array for Hearing Aids
A Microphone Array for Hearing Aids by Bernard Widrow 1531-636X/06/$10.00 2001IEEE 0.00 26 Abstract A directional acoustic receiving system is constructed in the form of a necklace including an array of
More informationEricsson T18s Voice Dialing Simulator
Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of
More informationMeasuring Line Edge Roughness: Fluctuations in Uncertainty
Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as
More informationThe Effective Number of Bits (ENOB) of my R&S Digital Oscilloscope Technical Paper
The Effective Number of Bits (ENOB) of my R&S Digital Oscilloscope Technical Paper Products: R&S RTO1012 R&S RTO1014 R&S RTO1022 R&S RTO1024 This technical paper provides an introduction to the signal
More informationAPPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer
More informationEmotion Detection from Speech
Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction
More informationLOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING
LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING RasPi Kaveri Ratanpara 1, Priyan Shah 2 1 Student, M.E Biomedical Engineering, Government Engineering college, Sector-28, Gandhinagar (Gujarat)-382028,
More informationWorkshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking
Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking The perception and correct identification of speech sounds as phonemes depends on the listener extracting various
More informationThirukkural - A Text-to-Speech Synthesis System
Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,
More informationMATLAB-based Applications for Image Processing and Image Quality Assessment Part II: Experimental Results
154 L. KRASULA, M. KLÍMA, E. ROGARD, E. JEANBLANC, MATLAB BASED APPLICATIONS PART II: EXPERIMENTAL RESULTS MATLAB-based Applications for Image Processing and Image Quality Assessment Part II: Experimental
More informationFigure1. Acoustic feedback in packet based video conferencing system
Real-Time Howling Detection for Hands-Free Video Conferencing System Mi Suk Lee and Do Young Kim Future Internet Research Department ETRI, Daejeon, Korea {lms, dyk}@etri.re.kr Abstract: This paper presents
More informationA TOOL FOR TEACHING LINEAR PREDICTIVE CODING
A TOOL FOR TEACHING LINEAR PREDICTIVE CODING Branislav Gerazov 1, Venceslav Kafedziski 2, Goce Shutinoski 1 1) Department of Electronics, 2) Department of Telecommunications Faculty of Electrical Engineering
More informationAutomatic Evaluation Software for Contact Centre Agents voice Handling Performance
International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,
More informationAUDIO signals are often contaminated by background environment
1830 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 56, NO. 5, MAY 2008 Audio Denoising by Time-Frequency Block Thresholding Guoshen Yu, Stéphane Mallat, Fellow, IEEE, and Emmanuel Bacry Abstract Removing
More informationArtificial Neural Network for Speech Recognition
Artificial Neural Network for Speech Recognition Austin Marshall March 3, 2005 2nd Annual Student Research Showcase Overview Presenting an Artificial Neural Network to recognize and classify speech Spoken
More informationSPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,
More informationPerformance Analysis of Interleaving Scheme in Wideband VoIP System under Different Strategic Conditions
Performance Analysis of Scheme in Wideband VoIP System under Different Strategic Conditions Harjit Pal Singh 1, Sarabjeet Singh 1 and Jasvir Singh 2 1 Dept. of Physics, Dr. B.R. Ambedkar National Institute
More informationVoice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification
Voice Communication Package v7.0 of front-end voice processing software technologies General description and technical specification (Revision 1.0, May 2012) General VCP information Voice Communication
More informationExploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets
Exploiting A Constellation of Narrowband RF Sensors to Detect and Track Moving Targets Chris Kreucher a, J. Webster Stayman b, Ben Shapo a, and Mark Stuff c a Integrity Applications Incorporated 900 Victors
More informationObjective Speech Quality Measures for Internet Telephony
Objective Speech Quality Measures for Internet Telephony Timothy A. Hall National Institute of Standards and Technology 100 Bureau Drive, STOP 8920 Gaithersburg, MD 20899-8920 ABSTRACT Measuring voice
More informationLecture 8: Signal Detection and Noise Assumption
ECE 83 Fall Statistical Signal Processing instructor: R. Nowak, scribe: Feng Ju Lecture 8: Signal Detection and Noise Assumption Signal Detection : X = W H : X = S + W where W N(, σ I n n and S = [s, s,...,
More informationMultiDSLA. Measuring Network Performance. Malden Electronics Ltd
MultiDSLA Measuring Network Performance Malden Electronics Ltd The Business Case for Network Performance Measurement MultiDSLA is a highly scalable solution for the measurement of network speech transmission
More informationL9: Cepstral analysis
L9: Cepstral analysis The cepstrum Homomorphic filtering The cepstrum and voicing/pitch detection Linear prediction cepstral coefficients Mel frequency cepstral coefficients This lecture is based on [Taylor,
More informationSmartFocus Article 1 - Technical approach
SmartFocus Article 1 - Technical approach Effective strategies for addressing listening in noisy environments The difficulty of determining the desired amplification for listening in noise is well documented.
More informationThe effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications
Forensic Science International 146S (2004) S95 S99 www.elsevier.com/locate/forsciint The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications A.
More informationComplexity-bounded Power Control in Video Transmission over a CDMA Wireless Network
Complexity-bounded Power Control in Video Transmission over a CDMA Wireless Network Xiaoan Lu, David Goodman, Yao Wang, and Elza Erkip Electrical and Computer Engineering, Polytechnic University, Brooklyn,
More informationAnalog-to-Digital Voice Encoding
Analog-to-Digital Voice Encoding Basic Voice Encoding: Converting Analog to Digital This topic describes the process of converting analog signals to digital signals. Digitizing Analog Signals 1. Sample
More informationEstablishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
More informationACM SIGKDD Workshop on Intelligence and Security Informatics Held in conjunction with KDD-2010
Fuzzy Association Rule Mining for Community Crime Pattern Discovery Anna L. Buczak, Christopher M. Gifford ACM SIGKDD Workshop on Intelligence and Security Informatics Held in conjunction with KDD-2010
More informationFunction Guide for the Fourier Transformation Package SPIRE-UOL-DOC-002496
Function Guide for the Fourier Transformation Package SPIRE-UOL-DOC-002496 Prepared by: Peter Davis (University of Lethbridge) peter.davis@uleth.ca Andres Rebolledo (University of Lethbridge) andres.rebolledo@uleth.ca
More informationHow To Understand The Quality Of A Wireless Voice Communication
Effects of the Wireless Channel in VOIP (Voice Over Internet Protocol) Networks Atul Ranjan Srivastava 1, Vivek Kushwaha 2 Department of Electronics and Communication, University of Allahabad, Allahabad
More informationUnderstanding the Transition From PESQ to POLQA. An Ascom Network Testing White Paper
Understanding the Transition From PESQ to POLQA An Ascom Network Testing White Paper By Dr. Irina Cotanis Prepared by: Date: Document: Dr. Irina Cotanis 6 December 2011 NT11-22759, Rev. 1.0 Ascom (2011)
More informationNoise. CIH Review PDC March 2012
Noise CIH Review PDC March 2012 Learning Objectives Understand the concept of the decibel, decibel determination, decibel addition, and weighting Know the characteristics of frequency that are relevant
More informationRadar Systems Engineering Lecture 6 Detection of Signals in Noise
Radar Systems Engineering Lecture 6 Detection of Signals in Noise Dr. Robert M. O Donnell Guest Lecturer Radar Systems Course 1 Detection 1/1/010 Block Diagram of Radar System Target Radar Cross Section
More informationANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1
WHAT IS AN FFT SPECTRUM ANALYZER? ANALYZER BASICS The SR760 FFT Spectrum Analyzer takes a time varying input signal, like you would see on an oscilloscope trace, and computes its frequency spectrum. Fourier's
More informationCapacity Limits of MIMO Channels
Tutorial and 4G Systems Capacity Limits of MIMO Channels Markku Juntti Contents 1. Introduction. Review of information theory 3. Fixed MIMO channels 4. Fading MIMO channels 5. Summary and Conclusions References
More informationVoice Quality Evaluation and the Impact of Wireless Packet Communication Systems
1 Voice Quality Evaluation in Wireless Packet Communication Systems: A Tutorial and Performance Results for ROHC Stephan Rein Frank H. P. Fitzek Martin Reisslein Abstract As wireless systems are evolving
More informationHow to Measure Network Performance by Using NGNs
Speech Quality Measurement Tools for Dynamic Network Management Simon Broom, Mike Hollier Psytechnics, 23 Museum Street, Ipswich, Suffolk, UK IP1 1HN Phone +44 (0)1473 261800, Fax +44 (0)1473 261880 simon.broom@psytechnics.com
More informationPerformance analysis of bandwidth efficient coherent modulation schemes with L-fold MRC and SC in Nakagami-m fading channels
Title Performance analysis of bandwidth efficient coherent modulation schemes with L-fold MRC and SC in Nakagami-m fading channels Author(s) Lo, CM; Lam, WH Citation Ieee International Symposium On Personal,
More informationAdvanced Speech-Audio Processing in Mobile Phones and Hearing Aids
Advanced Speech-Audio Processing in Mobile Phones and Hearing Aids Synergies and Distinctions Peter Vary RWTH Aachen University Institute of Communication Systems WASPAA, October 23, 2013 Mohonk Mountain
More informationThe Effect of Network Cabling on Bit Error Rate Performance. By Paul Kish NORDX/CDT
The Effect of Network Cabling on Bit Error Rate Performance By Paul Kish NORDX/CDT Table of Contents Introduction... 2 Probability of Causing Errors... 3 Noise Sources Contributing to Errors... 4 Bit Error
More informationBasic principles of Voice over IP
Basic principles of Voice over IP Dr. Peter Počta {pocta@fel.uniza.sk} Department of Telecommunications and Multimedia Faculty of Electrical Engineering University of Žilina, Slovakia Outline VoIP Transmission
More informationJPEG compression of monochrome 2D-barcode images using DCT coefficient distributions
Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai
More informationRoom Acoustic Reproduction by Spatial Room Response
Room Acoustic Reproduction by Spatial Room Response Rendering Hoda Nasereddin 1, Mohammad Asgari 2 and Ayoub Banoushi 3 Audio Engineer, Broadcast engineering department, IRIB university, Tehran, Iran,
More informationSignal Detection. Outline. Detection Theory. Example Applications of Detection Theory
Outline Signal Detection M. Sami Fadali Professor of lectrical ngineering University of Nevada, Reno Hypothesis testing. Neyman-Pearson (NP) detector for a known signal in white Gaussian noise (WGN). Matched
More informationAdaptive Equalization of binary encoded signals Using LMS Algorithm
SSRG International Journal of Electronics and Communication Engineering (SSRG-IJECE) volume issue7 Sep Adaptive Equalization of binary encoded signals Using LMS Algorithm Dr.K.Nagi Reddy Professor of ECE,NBKR
More informationMATLAB-based Applications for Image Processing and Image Quality Assessment Part I: Software Description
RADIOENGINEERING, VOL. 20, NO. 4, DECEMBER 2011 1009 MATLAB-based Applications for Image Processing and Image Quality Assessment Part I: Software Description Lukáš KRASULA, Miloš KLÍMA, Eric ROGARD, Edouard
More informationElectronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT)
Page 1 Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) ECC RECOMMENDATION (06)01 Bandwidth measurements using FFT techniques
More informationClassic EEG (ERPs)/ Advanced EEG. Quentin Noirhomme
Classic EEG (ERPs)/ Advanced EEG Quentin Noirhomme Outline Origins of MEEG Event related potentials Time frequency decomposition i Source reconstruction Before to start EEGlab Fieldtrip (included in spm)
More informationEnhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm
1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,
More informationEM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
More informationA Comparison of Speech Coding Algorithms ADPCM vs CELP. Shannon Wichman
A Comparison of Speech Coding Algorithms ADPCM vs CELP Shannon Wichman Department of Electrical Engineering The University of Texas at Dallas Fall 1999 December 8, 1999 1 Abstract Factors serving as constraints
More informationBLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium Peter.Vanroose@esat.kuleuven.ac.be
More informationIEEE Proof. Web Version. PROGRESSIVE speaker adaptation has been considered
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 1 A Joint Factor Analysis Approach to Progressive Model Adaptation in Text-Independent Speaker Verification Shou-Chun Yin, Richard Rose, Senior
More informationAuto-Tuning Using Fourier Coefficients
Auto-Tuning Using Fourier Coefficients Math 56 Tom Whalen May 20, 2013 The Fourier transform is an integral part of signal processing of any kind. To be able to analyze an input signal as a superposition
More informationCharacterizing Digital Cameras with the Photon Transfer Curve
Characterizing Digital Cameras with the Photon Transfer Curve By: David Gardner Summit Imaging (All rights reserved) Introduction Purchasing a camera for high performance imaging applications is frequently
More informationThe CUSUM algorithm a small review. Pierre Granjon
The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................
More informationAgilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox. Application Note
Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox Application Note Introduction Of all the signal engines in the N7509A, the most complex is the multi-tone engine. This application
More informationDeveloping an Isolated Word Recognition System in MATLAB
MATLAB Digest Developing an Isolated Word Recognition System in MATLAB By Daryl Ning Speech-recognition technology is embedded in voice-activated routing systems at customer call centres, voice dialling
More informationapplication note Directional Microphone Applications Introduction Directional Hearing Aids
APPLICATION NOTE AN-4 Directional Microphone Applications Introduction The inability to understand speech in noisy environments is a significant problem for hearing impaired individuals. An omnidirectional
More informationA Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University
A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses Michael R. Powers[ ] Temple University and Tsinghua University Thomas Y. Powers Yale University [June 2009] Abstract We propose a
More informationHD Radio FM Transmission System Specifications Rev. F August 24, 2011
HD Radio FM Transmission System Specifications Rev. F August 24, 2011 SY_SSS_1026s TRADEMARKS HD Radio and the HD, HD Radio, and Arc logos are proprietary trademarks of ibiquity Digital Corporation. ibiquity,
More informationA Digital Audio Watermark Embedding Algorithm
Xianghong Tang, Yamei Niu, Hengli Yue, Zhongke Yin Xianghong Tang, Yamei Niu, Hengli Yue, Zhongke Yin School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang, 3008, China tangxh@hziee.edu.cn,
More informationBER Performance Analysis of SSB-QPSK over AWGN and Rayleigh Channel
Performance Analysis of SSB-QPSK over AWGN and Rayleigh Channel Rahul Taware ME Student EXTC Department, DJSCOE Vile-Parle (W) Mumbai 056 T. D Biradar Associate Professor EXTC Department, DJSCOE Vile-Parle
More informationSTUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION
STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION Adiel Ben-Shalom, Michael Werman School of Computer Science Hebrew University Jerusalem, Israel. {chopin,werman}@cs.huji.ac.il
More informationAdjusting Voice Quality
Adjusting Voice Quality Electrical Characteristics This topic describes the electrical characteristics of analog voice and the factors affecting voice quality. Factors That Affect Voice Quality The following
More informationPeakVue Analysis for Antifriction Bearing Fault Detection
August 2011 PeakVue Analysis for Antifriction Bearing Fault Detection Peak values (PeakVue) are observed over sequential discrete time intervals, captured, and analyzed. The analyses are the (a) peak values
More informationTech Note. Introduction. Definition of Call Quality. Contents. Voice Quality Measurement Understanding VoIP Performance. Title Series.
Tech Note Title Series Voice Quality Measurement Understanding VoIP Performance Date January 2005 Overview This tech note describes commonly-used call quality measurement methods, explains the metrics
More informationRECOMMENDATION ITU-R SM.1792. Measuring sideband emissions of T-DAB and DVB-T transmitters for monitoring purposes
Rec. ITU-R SM.1792 1 RECOMMENDATION ITU-R SM.1792 Measuring sideband emissions of T-DAB and DVB-T transmitters for monitoring purposes (2007) Scope This Recommendation provides guidance to measurement
More informationDigital Modulation. David Tipper. Department of Information Science and Telecommunications University of Pittsburgh. Typical Communication System
Digital Modulation David Tipper Associate Professor Department of Information Science and Telecommunications University of Pittsburgh http://www.tele.pitt.edu/tipper.html Typical Communication System Source
More informationModule 13 : Measurements on Fiber Optic Systems
Module 13 : Measurements on Fiber Optic Systems Lecture : Measurements on Fiber Optic Systems Objectives In this lecture you will learn the following Measurements on Fiber Optic Systems Attenuation (Loss)
More informationHSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER
HSI BASED COLOUR IMAGE EQUALIZATION USING ITERATIVE n th ROOT AND n th POWER Gholamreza Anbarjafari icv Group, IMS Lab, Institute of Technology, University of Tartu, Tartu 50411, Estonia sjafari@ut.ee
More informationSection 5.0 : Horn Physics. By Martin J. King, 6/29/08 Copyright 2008 by Martin J. King. All Rights Reserved.
Section 5. : Horn Physics Section 5. : Horn Physics By Martin J. King, 6/29/8 Copyright 28 by Martin J. King. All Rights Reserved. Before discussing the design of a horn loaded loudspeaker system, it is
More informationIMPLEMENTATION OF THE ADAPTIVE FILTER FOR VOICE COMMUNICATIONS WITH CONTROL SYSTEMS
1. JAN VAŇUŠ IMPLEMENTATION OF THE ADAPTIVE FILTER FOR VOICE COMMUNICATIONS WITH CONTROL SYSTEMS Abstract: In the paper is described use of the draft method for optimal setting values of the filter length
More informationJitter Measurements in Serial Data Signals
Jitter Measurements in Serial Data Signals Michael Schnecker, Product Manager LeCroy Corporation Introduction The increasing speed of serial data transmission systems places greater importance on measuring
More informationEEG COHERENCE AND PHASE DELAYS: COMPARISONS BETWEEN SINGLE REFERENCE, AVERAGE REFERENCE AND CURRENT SOURCE DENSITY
Version 1, June 13, 2004 Rough Draft form We apologize while we prepare the manuscript for publication but the data are valid and the conclusions are fundamental EEG COHERENCE AND PHASE DELAYS: COMPARISONS
More informationSOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY
3 th World Conference on Earthquake Engineering Vancouver, B.C., Canada August -6, 24 Paper No. 296 SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY ASHOK KUMAR SUMMARY One of the important
More informationA Secure File Transfer based on Discrete Wavelet Transformation and Audio Watermarking Techniques
A Secure File Transfer based on Discrete Wavelet Transformation and Audio Watermarking Techniques Vineela Behara,Y Ramesh Department of Computer Science and Engineering Aditya institute of Technology and
More informationAudio Engineering Society. Convention Paper. Presented at the 119th Convention 2005 October 7 10 New York, New York USA
Audio Engineering Society Convention Paper Presented at the 9th Convention 5 October 7 New York, New York USA This convention paper has been reproduced from the author's advance manuscript, without editing,
More informationThe Optimization of Parameters Configuration for AMR Codec in Mobile Networks
01 8th International Conference on Communications and Networking in China (CHINACOM) The Optimization of Parameters Configuration for AMR Codec in Mobile Networks Nan Ha,JingWang, Zesong Fei, Wenzhi Li,
More informationWelcome to the United States Patent and TradeMark Office
Welcome to the United States Patent and TradeMark Office an Agency of the United States Department of Commerce United States Patent 5,159,703 Lowery October 27, 1992 Silent subliminal presentation system
More information