Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition

Leutnant, Volker; Krueger, Alexander; Haeb-Umbach, Reinhold

V. Leutnant, A. Krueger, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language Processing 21 (2013) 1640–1652.

DOI

10.1109/TASL.2013.2258013

Journal Article | English

Author

Leutnant, Volker; Krueger, Alexander; Haeb-Umbach, Reinhold

Department

Nachrichtentechnik (NT) / Heinz Nixdorf Institut

Abstract

In this contribution, we extend a previously proposed Bayesian approach for the enhancement of reverberant logarithmic mel power spectral coefficients for robust automatic speech recognition to the additional compensation of background noise. A recently proposed observation model is employed whose time-variant observation error statistics are obtained as a side product of the inference of the a posteriori probability density function of the clean speech feature vectors. Furthermore, a reduction of the computational effort and the memory requirements is achieved by using a recursive formulation of the observation model. The performance of the proposed algorithms is first studied experimentally on a connected digits recognition task with artificially created noisy reverberant data. It is shown that the use of the time-variant observation error model leads to a significant error rate reduction at low signal-to-noise ratios compared to a time-invariant model. Further experiments were conducted on a 5000-word task recorded in a reverberant and noisy environment. A significant word error rate reduction was obtained, demonstrating the effectiveness of the approach on real-world data.

Keywords

Bayes methods; compensation; error statistics; reverberation; speech recognition; Bayesian feature enhancement; background noise; clean speech feature vectors; connected digits recognition task; memory requirements; noisy reverberant data; posteriori probability density function; recursive formulation; reverberant logarithmic mel power spectral coefficients; robust automatic speech recognition; signal-to-noise ratios; time-variant observation; word error rate reduction; model-based Bayesian feature enhancement; observation model for reverberant and noisy speech; recursive observation model

Publishing Year

2013

Journal Title

IEEE Transactions on Audio, Speech, and Language Processing

Volume

21

Issue

8

Page

1640-1652

LibreCat-ID

11862

Cite this

Leutnant V, Krueger A, Haeb-Umbach R. Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing. 2013;21(8):1640-1652. doi:10.1109/TASL.2013.2258013

Leutnant, V., Krueger, A., & Haeb-Umbach, R. (2013). Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing, 21(8), 1640–1652. https://doi.org/10.1109/TASL.2013.2258013

@article{Leutnant_Krueger_Haeb-Umbach_2013, title={Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition}, volume={21}, DOI={10.1109/TASL.2013.2258013}, number={8}, journal={IEEE Transactions on Audio, Speech, and Language Processing}, author={Leutnant, Volker and Krueger, Alexander and Haeb-Umbach, Reinhold}, year={2013}, pages={1640–1652} }

Leutnant, Volker, Alexander Krueger, and Reinhold Haeb-Umbach. “Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition.” IEEE Transactions on Audio, Speech, and Language Processing 21, no. 8 (2013): 1640–52. https://doi.org/10.1109/TASL.2013.2258013.

V. Leutnant, A. Krueger, and R. Haeb-Umbach, “Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 8, pp. 1640–1652, 2013.

Leutnant, Volker, et al. “Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition.” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 8, 2013, pp. 1640–52, doi:10.1109/TASL.2013.2258013.
