Approaches to Iterative Speech Feature Enhancement and Recognition

Windmann, Stefan; Haeb-Umbach, Reinhold

S. Windmann, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language Processing 17 (2009) 974–984.

Download (ext.)

https://groups.uni-paderborn.de/nt/pubs/2009/WiHa09-1.pdf

DOI

10.1109/TASL.2009.2014894

Journal Article | English

Author

Windmann, Stefan; Haeb-Umbach, Reinhold

Department

Nachrichtentechnik (NT) / Heinz Nixdorf Institut

Abstract

In automatic speech recognition, hidden Markov models (HMMs) are commonly used for speech decoding, while switching linear dynamic models (SLDMs) can be employed for preceding model-based speech feature enhancement. In this paper, these model types are combined to obtain a novel iterative speech feature enhancement and recognition architecture. It is shown that speech feature enhancement with SLDMs can be improved by feeding back information from the HMM to the enhancement stage. Two different feedback structures are derived. In the first, the posteriors of the HMM states are used to control the model probabilities of the SLDMs, while in the second they are employed to directly influence the estimate of the speech feature distribution. Both approaches improve recognition accuracy on both the AURORA2 and AURORA4 databases compared to non-iterative speech feature enhancement with SLDMs. It is also shown that a combination with uncertainty decoding further enhances performance.
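
The first feedback structure in the abstract (HMM state posteriors controlling the SLDM model probabilities) can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes that the per-frame model probabilities are obtained by marginalizing a table P(model | state) over the HMM state posteriors. The function name, the table `p_model_given_state`, and the toy numbers are all hypothetical.

```python
def model_priors_from_state_posteriors(state_posteriors, model_given_state):
    """Map per-frame HMM state posteriors to per-frame SLDM model probabilities.

    state_posteriors[t][q]  : posterior of HMM state q at frame t (each row sums to 1)
    model_given_state[q][m] : ASSUMED table P(model m | state q) (each row sums to 1)
    """
    num_models = len(model_given_state[0])
    priors = []
    for gamma_t in state_posteriors:
        # Marginalize over HMM states: p_t(m) = sum_q gamma_t(q) * P(m | q)
        row = [
            sum(gamma_t[q] * model_given_state[q][m] for q in range(len(gamma_t)))
            for m in range(num_models)
        ]
        priors.append(row)
    return priors


# Toy example (made-up values): 2 frames, 3 HMM states, 2 SLDM models.
gamma = [[0.7, 0.2, 0.1],
         [0.1, 0.1, 0.8]]
p_model_given_state = [[0.9, 0.1],
                       [0.5, 0.5],
                       [0.2, 0.8]]
priors = model_priors_from_state_posteriors(gamma, p_model_given_state)
```

In an iterative setup of this kind, the resulting per-frame probabilities would be passed back to the enhancement stage before the next recognition pass. How the paper estimates and applies the mapping is described in the article itself.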

Keywords

AURORA2 databases; AURORA4 databases; automatic speech recognition; feedback structures; hidden Markov models; HMM; iterative methods; iterative speech feature enhancement; model probabilities; speech decoding; speech enhancement; speech feature distribution; speech recognition; switching linear dynamic models

Publishing Year

2009

Journal Title

IEEE Transactions on Audio, Speech, and Language Processing

Volume

17

Issue

5

Page

974-984

LibreCat-ID

11937

Cite this

Windmann S, Haeb-Umbach R. Approaches to Iterative Speech Feature Enhancement and Recognition. IEEE Transactions on Audio, Speech, and Language Processing. 2009;17(5):974-984. doi:10.1109/TASL.2009.2014894

Windmann, S., & Haeb-Umbach, R. (2009). Approaches to Iterative Speech Feature Enhancement and Recognition. IEEE Transactions on Audio, Speech, and Language Processing, 17(5), 974–984. https://doi.org/10.1109/TASL.2009.2014894

@article{Windmann_Haeb-Umbach_2009, title={Approaches to Iterative Speech Feature Enhancement and Recognition}, volume={17}, DOI={10.1109/TASL.2009.2014894}, number={5}, journal={IEEE Transactions on Audio, Speech, and Language Processing}, author={Windmann, Stefan and Haeb-Umbach, Reinhold}, year={2009}, pages={974–984} }

Windmann, Stefan, and Reinhold Haeb-Umbach. “Approaches to Iterative Speech Feature Enhancement and Recognition.” IEEE Transactions on Audio, Speech, and Language Processing 17, no. 5 (2009): 974–84. https://doi.org/10.1109/TASL.2009.2014894.

S. Windmann and R. Haeb-Umbach, “Approaches to Iterative Speech Feature Enhancement and Recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 5, pp. 974–984, 2009.

Windmann, Stefan, and Reinhold Haeb-Umbach. “Approaches to Iterative Speech Feature Enhancement and Recognition.” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 5, 2009, pp. 974–84, doi:10.1109/TASL.2009.2014894.

All files available under the following license(s):

Copyright Statement:

This Item is protected by copyright and/or related rights. [...]

Access Level

Closed Access
