---
_id: '11813'
abstract:
- lang: eng
  text: 'The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising
    Autoencoder (DA) both bring performance gains in severe single-channel speech
    recognition conditions. The first can be adjusted to different conditions by an
    appropriate parameter setting, while the latter needs to be trained on conditions
    similar to the ones expected at decoding time, making it vulnerable to a mismatch
    between training and test conditions. We use a DNN backend and study reverberant
    ASR under three types of mismatch conditions: different room reverberation times,
    different speaker-to-microphone distances, and the difference between artificially
    reverberated data and the recordings in a reverberant environment. We show that
    for these mismatch conditions BFE can provide the targets for a DA. This unsupervised
    adaptation provides a performance gain over the direct use of BFE and even makes
    it possible to compensate for the mismatch between real and simulated reverberant data.'
author:
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
- first_name: P.
  full_name: Golik, P.
  last_name: Golik
- first_name: R.
  full_name: Schlueter, R.
  last_name: Schlueter
citation:
  ama: 'Heymann J, Haeb-Umbach R, Golik P, Schlueter R. Unsupervised adaptation of
    a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR under
    mismatch conditions. In: <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference On</i>. ; 2015:5053-5057. doi:<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>'
  apa: Heymann, J., Haeb-Umbach, R., Golik, P., & Schlueter, R. (2015). Unsupervised
    adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant
    ASR under mismatch conditions. In <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference on</i> (pp. 5053–5057). <a href="https://doi.org/10.1109/ICASSP.2015.7178933">https://doi.org/10.1109/ICASSP.2015.7178933</a>
  bibtex: '@inproceedings{Heymann_Haeb-Umbach_Golik_Schlueter_2015, title={Unsupervised
    adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant
    ASR under mismatch conditions}, DOI={<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>},
    booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International
    Conference on}, author={Heymann, Jahn and Haeb-Umbach, Reinhold and Golik, P.
    and Schlueter, R.}, year={2015}, pages={5053–5057} }'
  chicago: Heymann, Jahn, Reinhold Haeb-Umbach, P. Golik, and R. Schlueter. “Unsupervised
    Adaptation of a Denoising Autoencoder by Bayesian Feature Enhancement for Reverberant
    ASR under Mismatch Conditions.” In <i>Acoustics, Speech and Signal Processing
    (ICASSP), 2015 IEEE International Conference On</i>, 5053–57, 2015. <a href="https://doi.org/10.1109/ICASSP.2015.7178933">https://doi.org/10.1109/ICASSP.2015.7178933</a>.
  ieee: J. Heymann, R. Haeb-Umbach, P. Golik, and R. Schlueter, “Unsupervised adaptation
    of a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR
    under mismatch conditions,” in <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference on</i>, 2015, pp. 5053–5057.
  mla: Heymann, Jahn, et al. “Unsupervised Adaptation of a Denoising Autoencoder by
    Bayesian Feature Enhancement for Reverberant ASR under Mismatch Conditions.” <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On</i>,
    2015, pp. 5053–57, doi:<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>.
  short: 'J. Heymann, R. Haeb-Umbach, P. Golik, R. Schlueter, in: Acoustics, Speech
    and Signal Processing (ICASSP), 2015 IEEE International Conference On, 2015, pp.
    5053–5057.'
date_created: 2019-07-12T05:28:45Z
date_updated: 2022-01-06T06:51:09Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2015.7178933
keyword:
- codecs
- signal denoising
- speech recognition
- Bayesian feature enhancement
- denoising autoencoder
- reverberant ASR
- single-channel speech recognition
- speaker to microphone distances
- unsupervised adaptation
- Adaptation models
- Noise reduction
- Reverberation
- Speech
- Speech recognition
- Training
- deep neural networks
- feature enhancement
- robust speech recognition
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2015/hey_icassp_2015.pdf
oa: '1'
page: 5053-5057
publication: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International
  Conference on
status: public
title: Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement
  for reverberant ASR under mismatch conditions
type: conference
user_id: '44006'
year: '2015'
...
---
_id: '11862'
abstract:
- lang: eng
  text: In this contribution we extend a previously proposed Bayesian approach for
    the enhancement of reverberant logarithmic mel power spectral coefficients for
    robust automatic speech recognition to the additional compensation of background
    noise. A recently proposed observation model is employed whose time-variant observation
    error statistics are obtained as a side product of the inference of the a posteriori
    probability density function of the clean speech feature vectors. Furthermore, a reduction
    of the computational effort and the memory requirements is achieved by using
    a recursive formulation of the observation model. The performance of the proposed
    algorithms is first experimentally studied on a connected digits recognition task
    with artificially created noisy reverberant data. It is shown that the use of
    the time-variant observation error model leads to a significant error rate reduction
    at low signal-to-noise ratios compared to a time-invariant model. Further experiments
    were conducted on a 5000-word task recorded in a reverberant and noisy environment.
    A significant word error rate reduction was obtained, demonstrating the effectiveness
    of the approach on real-world data.
author:
- first_name: Volker
  full_name: Leutnant, Volker
  last_name: Leutnant
- first_name: Alexander
  full_name: Krueger, Alexander
  last_name: Krueger
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Leutnant V, Krueger A, Haeb-Umbach R. Bayesian Feature Enhancement for Reverberation
    and Noise Robust Speech Recognition. <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i>. 2013;21(8):1640-1652. doi:<a href="https://doi.org/10.1109/TASL.2013.2258013">10.1109/TASL.2013.2258013</a>
  apa: Leutnant, V., Krueger, A., & Haeb-Umbach, R. (2013). Bayesian Feature Enhancement
    for Reverberation and Noise Robust Speech Recognition. <i>IEEE Transactions on
    Audio, Speech, and Language Processing</i>, <i>21</i>(8), 1640–1652. <a href="https://doi.org/10.1109/TASL.2013.2258013">https://doi.org/10.1109/TASL.2013.2258013</a>
  bibtex: '@article{Leutnant_Krueger_Haeb-Umbach_2013, title={Bayesian Feature Enhancement
    for Reverberation and Noise Robust Speech Recognition}, volume={21}, DOI={<a href="https://doi.org/10.1109/TASL.2013.2258013">10.1109/TASL.2013.2258013</a>},
    number={8}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
    author={Leutnant, Volker and Krueger, Alexander and Haeb-Umbach, Reinhold}, year={2013},
    pages={1640–1652} }'
  chicago: 'Leutnant, Volker, Alexander Krueger, and Reinhold Haeb-Umbach. “Bayesian
    Feature Enhancement for Reverberation and Noise Robust Speech Recognition.” <i>IEEE
    Transactions on Audio, Speech, and Language Processing</i> 21, no. 8 (2013): 1640–52.
    <a href="https://doi.org/10.1109/TASL.2013.2258013">https://doi.org/10.1109/TASL.2013.2258013</a>.'
  ieee: V. Leutnant, A. Krueger, and R. Haeb-Umbach, “Bayesian Feature Enhancement
    for Reverberation and Noise Robust Speech Recognition,” <i>IEEE Transactions on
    Audio, Speech, and Language Processing</i>, vol. 21, no. 8, pp. 1640–1652, 2013.
  mla: Leutnant, Volker, et al. “Bayesian Feature Enhancement for Reverberation and
    Noise Robust Speech Recognition.” <i>IEEE Transactions on Audio, Speech, and Language
    Processing</i>, vol. 21, no. 8, 2013, pp. 1640–52, doi:<a href="https://doi.org/10.1109/TASL.2013.2258013">10.1109/TASL.2013.2258013</a>.
  short: V. Leutnant, A. Krueger, R. Haeb-Umbach, IEEE Transactions on Audio, Speech,
    and Language Processing 21 (2013) 1640–1652.
date_created: 2019-07-12T05:29:42Z
date_updated: 2022-01-06T06:51:11Z
department:
- _id: '54'
doi: 10.1109/TASL.2013.2258013
intvolume: '21'
issue: '8'
keyword:
- Bayes methods
- compensation
- error statistics
- reverberation
- speech recognition
- Bayesian feature enhancement
- background noise
- clean speech feature vectors
- connected digits recognition task
- memory requirements
- noisy reverberant data
- posteriori probability density function
- recursive formulation
- reverberant logarithmic mel power spectral coefficients
- robust automatic speech recognition
- signal-to-noise ratios
- time-variant observation
- word error rate reduction
- Robust automatic speech recognition
- model-based Bayesian feature enhancement
- observation model for reverberant and noisy speech
- recursive observation model
language:
- iso: eng
page: 1640-1652
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition
type: journal_article
user_id: '44006'
volume: 21
year: '2013'
...
---
_id: '11864'
abstract:
- lang: eng
  text: In this work, an observation model for the joint compensation of noise and
    reverberation in the logarithmic mel power spectral density domain is considered.
    It relates the features of the noisy reverberant speech to those of the non-reverberant
    speech and the noise. In contrast to enhancement of features only corrupted by
    reverberation (reverberant features), enhancement of noisy reverberant features
    requires a more sophisticated model for the error introduced by the proposed observation
    model. In a first consideration, it will be shown that this error is highly dependent
    on the instantaneous ratio of the power of reverberant speech to the power of
    the noise and, moreover, sensitive to the phase between reverberant speech and
    noise in the short-time discrete Fourier domain. Afterwards, a statistically motivated
    approach will be presented allowing for the model of the observation error to
    be inferred from the error model previously used for the reverberation-only case.
    Finally, the developed observation error model will be utilized in a Bayesian
    feature enhancement scheme, leading to improvements in word accuracy on the AURORA5
    database.
author:
- first_name: Volker
  full_name: Leutnant, Volker
  last_name: Leutnant
- first_name: Alexander
  full_name: Krueger, Alexander
  last_name: Krueger
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Leutnant V, Krueger A, Haeb-Umbach R. A Statistical Observation Model For
    Noisy Reverberant Speech Features and its Application to Robust ASR. In: <i>Signal
    Processing, Communications and Computing (ICSPCC), 2012 IEEE International Conference
    On</i>. ; 2012.'
  apa: Leutnant, V., Krueger, A., & Haeb-Umbach, R. (2012). A Statistical Observation
    Model For Noisy Reverberant Speech Features and its Application to Robust ASR.
    In <i>Signal Processing, Communications and Computing (ICSPCC), 2012 IEEE International
    Conference on</i>.
  bibtex: '@inproceedings{Leutnant_Krueger_Haeb-Umbach_2012, title={A Statistical
    Observation Model For Noisy Reverberant Speech Features and its Application to
    Robust ASR}, booktitle={Signal Processing, Communications and Computing (ICSPCC),
    2012 IEEE International Conference on}, author={Leutnant, Volker and Krueger,
    Alexander and Haeb-Umbach, Reinhold}, year={2012} }'
  chicago: Leutnant, Volker, Alexander Krueger, and Reinhold Haeb-Umbach. “A Statistical
    Observation Model For Noisy Reverberant Speech Features and Its Application to
    Robust ASR.” In <i>Signal Processing, Communications and Computing (ICSPCC), 2012
    IEEE International Conference On</i>, 2012.
  ieee: V. Leutnant, A. Krueger, and R. Haeb-Umbach, “A Statistical Observation Model
    For Noisy Reverberant Speech Features and its Application to Robust ASR,” in <i>Signal
    Processing, Communications and Computing (ICSPCC), 2012 IEEE International Conference
    on</i>, 2012.
  mla: Leutnant, Volker, et al. “A Statistical Observation Model For Noisy Reverberant
    Speech Features and Its Application to Robust ASR.” <i>Signal Processing, Communications
    and Computing (ICSPCC), 2012 IEEE International Conference On</i>, 2012.
  short: 'V. Leutnant, A. Krueger, R. Haeb-Umbach, in: Signal Processing, Communications
    and Computing (ICSPCC), 2012 IEEE International Conference On, 2012.'
date_created: 2019-07-12T05:29:44Z
date_updated: 2022-01-06T06:51:11Z
department:
- _id: '54'
keyword:
- Robust Automatic Speech Recognition
- Bayesian feature enhancement
- observation model for reverberant and noisy speech
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6335731
oa: '1'
publication: Signal Processing, Communications and Computing (ICSPCC), 2012 IEEE International
  Conference on
status: public
title: A Statistical Observation Model For Noisy Reverberant Speech Features and its
  Application to Robust ASR
type: conference
user_id: '44006'
year: '2012'
...