Modeling the dynamics of speech and noise for speech feature enhancement in ASR

Windmann, Stefan; Haeb-Umbach, Reinhold

S. Windmann, R. Haeb-Umbach, in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–4412.

Download (ext.)

https://groups.uni-paderborn.de/nt/pubs/2008/WiHa08-1.pdf

DOI

10.1109/ICASSP.2008.4518633

Conference Paper | English

Author

Windmann, Stefan; Haeb-Umbach, Reinhold

Department

Nachrichtentechnik (NT) / Heinz Nixdorf Institut

Abstract

In this paper, a switching linear dynamical model (SLDM) approach for speech feature enhancement is improved by employing more accurate models for the dynamics of speech and noise. The model of the clean speech feature trajectory is improved by augmenting the state vector to capture information derived from the delta features. Further, a hidden noise state variable is introduced to obtain a more elaborate model of the noise dynamics. Approximate Bayesian inference in the SLDM is carried out by a bank of extended Kalman filters, whose outputs are combined according to the a posteriori probabilities of the individual state models. Experimental results on the AURORA2 database show improved recognition accuracy.
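
The combination step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each filter in the bank has already produced a Gaussian posterior (mean and covariance) together with an observation log-likelihood, and it collapses those outputs into a single Gaussian by moment matching, weighted by the a posteriori model probabilities. The function name `combine_filter_outputs`, its arguments, and the use of NumPy are all hypothetical choices made for illustration; the extended Kalman filter update itself is omitted.

```python
import numpy as np


def combine_filter_outputs(means, covs, log_likelihoods, priors):
    """Collapse per-model Gaussian posteriors into one Gaussian (moment matching).

    means:           (M, D) posterior means, one per filter in the bank
    covs:            (M, D, D) posterior covariances
    log_likelihoods: (M,) log-likelihood of the current observation per model
    priors:          (M,) prior model probabilities (assumed given)
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    # A posteriori model probabilities via Bayes' rule, computed in the
    # log domain and shifted by the maximum for numerical stability.
    log_post = np.log(np.asarray(priors, dtype=float)) + np.asarray(
        log_likelihoods, dtype=float
    )
    log_post -= np.max(log_post)
    weights = np.exp(log_post)
    weights /= weights.sum()

    # Combined mean: probability-weighted average of the filter means.
    mean = weights @ means

    # Combined covariance: weighted within-model covariance plus the
    # spread of the individual means around the combined mean.
    diffs = means - mean
    spread = np.einsum("m,mi,mj->ij", weights, diffs, diffs)
    cov = np.einsum("m,mij->ij", weights, covs) + spread
    return mean, cov, weights


# Toy usage with two hypothetical models in a two-dimensional feature space.
mean, cov, weights = combine_filter_outputs(
    means=[[0.0, 0.0], [2.0, 2.0]],
    covs=[np.eye(2), np.eye(2)],
    log_likelihoods=[0.0, 0.0],
    priors=[0.5, 0.5],
)
```

With equal priors and equal likelihoods, both models receive weight 0.5, so the combined mean lies halfway between the two filter means and the combined covariance grows by the spread between them.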

Keywords

a posteriori probability; AURORA2 database; Bayesian inference; Bayes methods; channel bank filters; extended Kalman filter banks; hidden noise state variable; Kalman filters; noise dynamics; speech enhancement; speech feature enhancement; speech feature trajectory; switching linear dynamical model approach

Publishing Year

2008

Proceedings Title

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008)

Page

4409-4412

LibreCat-ID

11939

Cite this

Windmann S, Haeb-Umbach R. Modeling the dynamics of speech and noise for speech feature enhancement in ASR. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008). 2008:4409-4412. doi:10.1109/ICASSP.2008.4518633

Windmann, S., & Haeb-Umbach, R. (2008). Modeling the dynamics of speech and noise for speech feature enhancement in ASR. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008) (pp. 4409–4412). https://doi.org/10.1109/ICASSP.2008.4518633

@inproceedings{Windmann_Haeb-Umbach_2008,
  title={Modeling the dynamics of speech and noise for speech feature enhancement in ASR},
  DOI={10.1109/ICASSP.2008.4518633},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008)},
  author={Windmann, Stefan and Haeb-Umbach, Reinhold},
  year={2008},
  pages={4409–4412}
}

Windmann, Stefan, and Reinhold Haeb-Umbach. “Modeling the Dynamics of Speech and Noise for Speech Feature Enhancement in ASR.” In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), 4409–12, 2008. https://doi.org/10.1109/ICASSP.2008.4518633.

S. Windmann and R. Haeb-Umbach, “Modeling the dynamics of speech and noise for speech feature enhancement in ASR,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–4412.

Windmann, Stefan, and Reinhold Haeb-Umbach. “Modeling the Dynamics of Speech and Noise for Speech Feature Enhancement in ASR.” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–12, doi:10.1109/ICASSP.2008.4518633.

All files available under the following license(s):

Copyright Statement:

This Item is protected by copyright and/or related rights. [...]

Link(s) to Main File(s)

URL

https://groups.uni-paderborn.de/nt/pubs/2008/WiHa08-1.pdf

Access Level

Closed Access
