Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR under mismatch conditions

Heymann, Jahn; Haeb-Umbach, Reinhold; Golik, P.; Schlueter, R.

J. Heymann, R. Haeb-Umbach, P. Golik, R. Schlueter, in: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On, 2015, pp. 5053–5057.

Download (ext.)

https://groups.uni-paderborn.de/nt/pubs/2015/hey_icassp_2015.pdf

DOI

10.1109/ICASSP.2015.7178933

Conference Paper | English

Author

Heymann, Jahn; Haeb-Umbach, Reinhold; Golik, P.; Schlueter, R.

Department

Nachrichtentechnik (NT) / Heinz Nixdorf Institut

Abstract

The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising Autoencoder (DA) both bring performance gains in severe single-channel speech recognition conditions. The former can be adjusted to different conditions by an appropriate parameter setting, while the latter needs to be trained on conditions similar to those expected at decoding time, which makes it vulnerable to a mismatch between training and test conditions. We use a DNN backend and study reverberant ASR under three types of mismatch conditions: different room reverberation times, different speaker-to-microphone distances, and the difference between artificially reverberated data and recordings made in a reverberant environment. We show that for these mismatch conditions BFE can provide the targets for a DA. This unsupervised adaptation provides a performance gain over the direct use of BFE and even makes it possible to compensate for the mismatch between real and simulated reverberant data.
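The adaptation idea in the abstract (an enhancement front end supplies the targets, and the DA is fine-tuned toward them on unlabeled mismatched data) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-hidden-layer network, the mean-squared-error objective, the learning rate, the random stand-in features, and in particular `placeholder_enhancer`, which is a hypothetical stand-in and does not implement BFE.

```python
import numpy as np


def init_da(dim, hidden, rng):
    """Create a small one-hidden-layer autoencoder (assumed architecture)."""
    return {
        "W1": rng.normal(0.0, 0.1, (dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, dim)),
        "b2": np.zeros(dim),
    }


def da_forward(params, x):
    """Return hidden activations and output features for input frames x."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    y = h @ params["W2"] + params["b2"]
    return h, y


def mse(params, x, targets):
    """Mean squared error between the DA output and the given targets."""
    _, y = da_forward(params, x)
    return float(np.mean((y - targets) ** 2))


def placeholder_enhancer(x):
    """Hypothetical stand-in for an enhancement front end. NOT BFE."""
    return 0.5 * x


def adapt_da(params, x, pseudo_targets, lr=0.2, steps=300):
    """Fine-tune a copy of the DA toward pseudo-targets by gradient descent."""
    p = {k: v.copy() for k, v in params.items()}
    for _ in range(steps):
        h, y = da_forward(p, x)
        grad_y = 2.0 * (y - pseudo_targets) / y.size   # d(MSE)/dy
        grad_h = grad_y @ p["W2"].T
        grad_pre = grad_h * (1.0 - h ** 2)             # tanh derivative
        p["W2"] -= lr * (h.T @ grad_y)
        p["b2"] -= lr * grad_y.sum(axis=0)
        p["W1"] -= lr * (x.T @ grad_pre)
        p["b1"] -= lr * grad_pre.sum(axis=0)
    return p


rng = np.random.default_rng(0)
x_mismatch = rng.normal(size=(256, 8))            # toy "mismatched" features
pseudo_targets = placeholder_enhancer(x_mismatch)  # no transcripts or clean data used
da = init_da(8, 16, rng)
loss_before = mse(da, x_mismatch, pseudo_targets)
da_adapted = adapt_da(da, x_mismatch, pseudo_targets)
loss_after = mse(da_adapted, x_mismatch, pseudo_targets)
print(loss_before, loss_after)
```

The sketch only shows the data flow: no clean reference or transcription of the mismatched data is needed, because the targets come from the enhancement step. Whether such adaptation helps recognition accuracy depends on the quality of those targets, which this toy example does not model.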

Keywords

codecs; signal denoising; speech recognition; Bayesian feature enhancement; denoising autoencoder; reverberant ASR; single-channel speech recognition; speaker-to-microphone distances; unsupervised adaptation; Adaptation models; Noise reduction; Reverberation; Speech; Training; deep neural networks; feature enhancement; robust speech recognition

Publishing Year

2015

Proceedings Title

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Page

5053-5057

LibreCat-ID

11813

Cite this

Heymann J, Haeb-Umbach R, Golik P, Schlueter R. Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant asr under mismatch conditions. In: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. ; 2015:5053-5057. doi:10.1109/ICASSP.2015.7178933

Heymann, J., Haeb-Umbach, R., Golik, P., & Schlueter, R. (2015). Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant asr under mismatch conditions. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on (pp. 5053–5057). https://doi.org/10.1109/ICASSP.2015.7178933

@inproceedings{Heymann_Haeb-Umbach_Golik_Schlueter_2015,
  title={Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant asr under mismatch conditions},
  DOI={10.1109/ICASSP.2015.7178933},
  booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on},
  author={Heymann, Jahn and Haeb-Umbach, Reinhold and Golik, P. and Schlueter, R.},
  year={2015},
  pages={5053–5057}
}

Heymann, Jahn, Reinhold Haeb-Umbach, P. Golik, and R. Schlueter. “Unsupervised Adaptation of a Denoising Autoencoder by Bayesian Feature Enhancement for Reverberant Asr under Mismatch Conditions.” In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On, 5053–57, 2015. https://doi.org/10.1109/ICASSP.2015.7178933.

J. Heymann, R. Haeb-Umbach, P. Golik, and R. Schlueter, “Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant asr under mismatch conditions,” in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, 2015, pp. 5053–5057.

Heymann, Jahn, et al. “Unsupervised Adaptation of a Denoising Autoencoder by Bayesian Feature Enhancement for Reverberant Asr under Mismatch Conditions.” Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On, 2015, pp. 5053–57, doi:10.1109/ICASSP.2015.7178933.

All files available under the following license(s):

Copyright Statement:

This Item is protected by copyright and/or related rights. [...]

Link(s) to Main File(s)

URL

https://groups.uni-paderborn.de/nt/pubs/2015/hey_icassp_2015.pdf

Access Level

Closed Access
