---
_id: '11813'
abstract:
- lang: eng
  text: 'The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising
    Autoencoder (DA) both bring performance gains in severe single-channel speech
    recognition conditions. The first can be adjusted to different conditions by an
    appropriate parameter setting, while the latter needs to be trained on conditions
    similar to the ones expected at decoding time, making it vulnerable to a mismatch
    between training and test conditions. We use a DNN backend and study reverberant
    ASR under three types of mismatch conditions: different room reverberation times,
    different speaker-to-microphone distances and the difference between artificially
    reverberated data and the recordings in a reverberant environment. We show that
    for these mismatch conditions BFE can provide the targets for a DA. This unsupervised
    adaptation provides a performance gain over the direct use of BFE and even makes
    it possible to compensate for the mismatch between real and simulated reverberant data.'
author:
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
- first_name: P.
  full_name: Golik, P.
  last_name: Golik
- first_name: R.
  full_name: Schlueter, R.
  last_name: Schlueter
citation:
  ama: 'Heymann J, Haeb-Umbach R, Golik P, Schlueter R. Unsupervised adaptation of
    a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR under
    mismatch conditions. In: <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference On</i>. ; 2015:5053-5057. doi:<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>'
  apa: Heymann, J., Haeb-Umbach, R., Golik, P., &#38; Schlueter, R. (2015). Unsupervised
    adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant
    ASR under mismatch conditions. In <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference on</i> (pp. 5053–5057). <a href="https://doi.org/10.1109/ICASSP.2015.7178933">https://doi.org/10.1109/ICASSP.2015.7178933</a>
  bibtex: '@inproceedings{Heymann_Haeb-Umbach_Golik_Schlueter_2015, title={Unsupervised
    adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant
    ASR under mismatch conditions}, DOI={<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>},
    booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International
    Conference on}, author={Heymann, Jahn and Haeb-Umbach, Reinhold and Golik, P.
    and Schlueter, R.}, year={2015}, pages={5053–5057} }'
  chicago: Heymann, Jahn, Reinhold Haeb-Umbach, P. Golik, and R. Schlueter. “Unsupervised
    Adaptation of a Denoising Autoencoder by Bayesian Feature Enhancement for Reverberant
    ASR under Mismatch Conditions.” In <i>Acoustics, Speech and Signal Processing
    (ICASSP), 2015 IEEE International Conference On</i>, 5053–57, 2015. <a href="https://doi.org/10.1109/ICASSP.2015.7178933">https://doi.org/10.1109/ICASSP.2015.7178933</a>.
  ieee: J. Heymann, R. Haeb-Umbach, P. Golik, and R. Schlueter, “Unsupervised adaptation
    of a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR
    under mismatch conditions,” in <i>Acoustics, Speech and Signal Processing (ICASSP),
    2015 IEEE International Conference on</i>, 2015, pp. 5053–5057.
  mla: Heymann, Jahn, et al. “Unsupervised Adaptation of a Denoising Autoencoder by
    Bayesian Feature Enhancement for Reverberant ASR under Mismatch Conditions.” <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On</i>,
    2015, pp. 5053–57, doi:<a href="https://doi.org/10.1109/ICASSP.2015.7178933">10.1109/ICASSP.2015.7178933</a>.
  short: 'J. Heymann, R. Haeb-Umbach, P. Golik, R. Schlueter, in: Acoustics, Speech
    and Signal Processing (ICASSP), 2015 IEEE International Conference On, 2015, pp.
    5053–5057.'
date_created: 2019-07-12T05:28:45Z
date_updated: 2022-01-06T06:51:09Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2015.7178933
keyword:
- codecs
- signal denoising
- speech recognition
- Bayesian feature enhancement
- denoising autoencoder
- reverberant ASR
- single-channel speech recognition
- speaker to microphone distances
- unsupervised adaptation
- Adaptation models
- Noise reduction
- Reverberation
- Speech
- Speech recognition
- Training
- deep neuronal networks
- denoising autoencoder
- feature enhancement
- robust speech recognition
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2015/hey_icassp_2015.pdf
oa: '1'
page: 5053-5057
publication: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International
  Conference on
status: public
title: Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement
  for reverberant ASR under mismatch conditions
type: conference
user_id: '44006'
year: '2015'
...
---
_id: '11716'
abstract:
- lang: eng
  text: The accuracy of automatic speech recognition systems in noisy and reverberant
    environments can be improved notably by exploiting the uncertainty of the estimated
    speech features using so-called uncertainty-of-observation techniques. In this
    paper, we introduce a new Bayesian decision rule that can serve as a mathematical
    framework from which both known and new uncertainty-of-observation techniques
    can be either derived or approximated. The new decision rule in its direct form
    leads to the new significance decoding approach for Gaussian mixture models, which
    results in better performance compared to standard uncertainty-of-observation
    techniques in different additive and convolutive noise scenarios.
author:
- first_name: Ahmed H.
  full_name: Abdelaziz, Ahmed H.
  last_name: Abdelaziz
- first_name: Steffen
  full_name: Zeiler, Steffen
  last_name: Zeiler
- first_name: Dorothea
  full_name: Kolossa, Dorothea
  last_name: Kolossa
- first_name: Volker
  full_name: Leutnant, Volker
  last_name: Leutnant
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Abdelaziz AH, Zeiler S, Kolossa D, Leutnant V, Haeb-Umbach R. GMM-based significance
    decoding. In: <i>Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
    Conference On</i>. ; 2013:6827-6831. doi:<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>'
  apa: Abdelaziz, A. H., Zeiler, S., Kolossa, D., Leutnant, V., &#38; Haeb-Umbach,
    R. (2013). GMM-based significance decoding. In <i>Acoustics, Speech and Signal
    Processing (ICASSP), 2013 IEEE International Conference on</i> (pp. 6827–6831).
    <a href="https://doi.org/10.1109/ICASSP.2013.6638984">https://doi.org/10.1109/ICASSP.2013.6638984</a>
  bibtex: '@inproceedings{Abdelaziz_Zeiler_Kolossa_Leutnant_Haeb-Umbach_2013, title={GMM-based
    significance decoding}, DOI={<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>},
    booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
    Conference on}, author={Abdelaziz, Ahmed H. and Zeiler, Steffen and Kolossa, Dorothea
    and Leutnant, Volker and Haeb-Umbach, Reinhold}, year={2013}, pages={6827–6831}
    }'
  chicago: Abdelaziz, Ahmed H., Steffen Zeiler, Dorothea Kolossa, Volker Leutnant,
    and Reinhold Haeb-Umbach. “GMM-Based Significance Decoding.” In <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On</i>,
    6827–31, 2013. <a href="https://doi.org/10.1109/ICASSP.2013.6638984">https://doi.org/10.1109/ICASSP.2013.6638984</a>.
  ieee: A. H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, and R. Haeb-Umbach, “GMM-based
    significance decoding,” in <i>Acoustics, Speech and Signal Processing (ICASSP),
    2013 IEEE International Conference on</i>, 2013, pp. 6827–6831.
  mla: Abdelaziz, Ahmed H., et al. “GMM-Based Significance Decoding.” <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On</i>,
    2013, pp. 6827–31, doi:<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>.
  short: 'A.H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, R. Haeb-Umbach, in:
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference
    On, 2013, pp. 6827–6831.'
date_created: 2019-07-12T05:26:53Z
date_updated: 2022-01-06T06:51:07Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2013.6638984
keyword:
- Bayes methods
- Gaussian processes
- convolution
- decision theory
- decoding
- noise
- reverberation
- speech coding
- speech recognition
- Bayesian decision rule
- GMM
- Gaussian mixture models
- additive noise scenarios
- automatic speech recognition systems
- convolutive noise scenarios
- decoding approach
- mathematical framework
- reverberant environments
- significance decoding
- speech feature estimation
- uncertainty-of-observation techniques
- Hidden Markov models
- Maximum likelihood decoding
- Noise
- Speech
- Speech recognition
- Uncertainty
- Uncertainty-of-observation
- modified imputation
- noise robust speech recognition
- significance decoding
- uncertainty decoding
language:
- iso: eng
page: 6827-6831
publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
  Conference on
publication_identifier:
  issn:
  - 1520-6149
status: public
title: GMM-based significance decoding
type: conference
user_id: '44006'
year: '2013'
...
---
_id: '11820'
abstract:
- lang: eng
  text: In this paper, we derive an uncertainty decoding rule for automatic speech
    recognition (ASR), which accounts for both corrupted observations and inter-frame
    correlation. The conditional independence assumption, prevalent in hidden Markov
    model-based ASR, is relaxed to obtain a clean speech posterior that is conditioned
    on the complete observed feature vector sequence. This is a more informative posterior
    than one conditioned only on the current observation. The novel decoding is used
    to obtain a transmission-error robust remote ASR system, where the speech capturing
    unit is connected to the decoder via an error-prone communication network. We
    show how the clean speech posterior can be computed for communication links
    characterized by either bit errors or packet loss. Recognition results are presented
    for both distributed and network speech recognition, where in the latter case
    common voice-over-IP codecs are employed.
author:
- first_name: Valentin
  full_name: Ion, Valentin
  last_name: Ion
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Ion V, Haeb-Umbach R. A Novel Uncertainty Decoding Rule With Applications to
    Transmission Error Robust Speech Recognition. <i>IEEE Transactions on Audio, Speech,
    and Language Processing</i>. 2008;16(5):1047-1060. doi:<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>
  apa: Ion, V., &#38; Haeb-Umbach, R. (2008). A Novel Uncertainty Decoding Rule With
    Applications to Transmission Error Robust Speech Recognition. <i>IEEE Transactions
    on Audio, Speech, and Language Processing</i>, <i>16</i>(5), 1047–1060. <a href="https://doi.org/10.1109/TASL.2008.925879">https://doi.org/10.1109/TASL.2008.925879</a>
  bibtex: '@article{Ion_Haeb-Umbach_2008, title={A Novel Uncertainty Decoding Rule
    With Applications to Transmission Error Robust Speech Recognition}, volume={16},
    DOI={<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>},
    number={5}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
    author={Ion, Valentin and Haeb-Umbach, Reinhold}, year={2008}, pages={1047–1060}
    }'
  chicago: 'Ion, Valentin, and Reinhold Haeb-Umbach. “A Novel Uncertainty Decoding
    Rule With Applications to Transmission Error Robust Speech Recognition.” <i>IEEE
    Transactions on Audio, Speech, and Language Processing</i> 16, no. 5 (2008): 1047–60.
    <a href="https://doi.org/10.1109/TASL.2008.925879">https://doi.org/10.1109/TASL.2008.925879</a>.'
  ieee: V. Ion and R. Haeb-Umbach, “A Novel Uncertainty Decoding Rule With Applications
    to Transmission Error Robust Speech Recognition,” <i>IEEE Transactions on Audio,
    Speech, and Language Processing</i>, vol. 16, no. 5, pp. 1047–1060, 2008.
  mla: Ion, Valentin, and Reinhold Haeb-Umbach. “A Novel Uncertainty Decoding Rule
    With Applications to Transmission Error Robust Speech Recognition.” <i>IEEE Transactions
    on Audio, Speech, and Language Processing</i>, vol. 16, no. 5, 2008, pp. 1047–60,
    doi:<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>.
  short: V. Ion, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language
    Processing 16 (2008) 1047–1060.
date_created: 2019-07-12T05:28:53Z
date_updated: 2022-01-06T06:51:10Z
department:
- _id: '54'
doi: 10.1109/TASL.2008.925879
intvolume: '        16'
issue: '5'
keyword:
- automatic speech recognition
- bit errors
- codecs
- communication links
- corrupted observations
- decoding
- distributed speech recognition
- error-prone communication network
- feature vector sequence
- hidden Markov model-based ASR
- hidden Markov models
- inter-frame correlation
- Internet telephony
- network speech recognition
- packet loss
- speech posterior
- speech recognition
- transmission error robust speech recognition
- uncertainty decoding
- voice-over-IP codecs
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2008/IoHa08-1.pdf
oa: '1'
page: 1047-1060
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: A Novel Uncertainty Decoding Rule With Applications to Transmission Error Robust
  Speech Recognition
type: journal_article
user_id: '44006'
volume: 16
year: '2008'
...
