---
_id: '11716'
abstract:
- lang: eng
  text: The accuracy of automatic speech recognition systems in noisy and reverberant
    environments can be improved notably by exploiting the uncertainty of the estimated
    speech features using so-called uncertainty-of-observation techniques. In this
    paper, we introduce a new Bayesian decision rule that can serve as a mathematical
    framework from which both known and new uncertainty-of-observation techniques
    can be either derived or approximated. The new decision rule in its direct form
    leads to the new significance decoding approach for Gaussian mixture models, which
    results in better performance compared to standard uncertainty-of-observation
    techniques in different additive and convolutive noise scenarios.
author:
- first_name: Ahmed H.
  full_name: Abdelaziz, Ahmed H.
  last_name: Abdelaziz
- first_name: Steffen
  full_name: Zeiler, Steffen
  last_name: Zeiler
- first_name: Dorothea
  full_name: Kolossa, Dorothea
  last_name: Kolossa
- first_name: Volker
  full_name: Leutnant, Volker
  last_name: Leutnant
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Abdelaziz AH, Zeiler S, Kolossa D, Leutnant V, Haeb-Umbach R. GMM-based significance
    decoding. In: <i>Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
    Conference On</i>. ; 2013:6827-6831. doi:<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>'
  apa: Abdelaziz, A. H., Zeiler, S., Kolossa, D., Leutnant, V., &#38; Haeb-Umbach,
    R. (2013). GMM-based significance decoding. In <i>Acoustics, Speech and Signal
    Processing (ICASSP), 2013 IEEE International Conference on</i> (pp. 6827–6831).
    <a href="https://doi.org/10.1109/ICASSP.2013.6638984">https://doi.org/10.1109/ICASSP.2013.6638984</a>
  bibtex: '@inproceedings{Abdelaziz_Zeiler_Kolossa_Leutnant_Haeb-Umbach_2013, title={GMM-based
    significance decoding}, DOI={<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>},
    booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
    Conference on}, author={Abdelaziz, Ahmed H. and Zeiler, Steffen and Kolossa, Dorothea
    and Leutnant, Volker and Haeb-Umbach, Reinhold}, year={2013}, pages={6827–6831}
    }'
  chicago: Abdelaziz, Ahmed H., Steffen Zeiler, Dorothea Kolossa, Volker Leutnant,
    and Reinhold Haeb-Umbach. “GMM-Based Significance Decoding.” In <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On</i>,
    6827–31, 2013. <a href="https://doi.org/10.1109/ICASSP.2013.6638984">https://doi.org/10.1109/ICASSP.2013.6638984</a>.
  ieee: A. H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, and R. Haeb-Umbach, “GMM-based
    significance decoding,” in <i>Acoustics, Speech and Signal Processing (ICASSP),
    2013 IEEE International Conference on</i>, 2013, pp. 6827–6831.
  mla: Abdelaziz, Ahmed H., et al. “GMM-Based Significance Decoding.” <i>Acoustics,
    Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On</i>,
    2013, pp. 6827–31, doi:<a href="https://doi.org/10.1109/ICASSP.2013.6638984">10.1109/ICASSP.2013.6638984</a>.
  short: 'A.H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, R. Haeb-Umbach, in:
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference
    On, 2013, pp. 6827–6831.'
date_created: 2019-07-12T05:26:53Z
date_updated: 2022-01-06T06:51:07Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2013.6638984
keyword:
- Bayes methods
- Gaussian processes
- convolution
- decision theory
- decoding
- noise
- reverberation
- speech coding
- speech recognition
- Bayesian decision rule
- GMM
- Gaussian mixture models
- additive noise scenarios
- automatic speech recognition systems
- convolutive noise scenarios
- decoding approach
- mathematical framework
- reverberant environments
- significance decoding
- speech feature estimation
- uncertainty-of-observation techniques
- Hidden Markov models
- Maximum likelihood decoding
- Noise
- Speech
- Speech recognition
- Uncertainty
- Uncertainty-of-observation
- modified imputation
- noise robust speech recognition
- uncertainty decoding
language:
- iso: eng
page: 6827-6831
publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
  Conference on
publication_identifier:
  issn:
  - 1520-6149
status: public
title: GMM-based significance decoding
type: conference
user_id: '44006'
year: '2013'
...
---
_id: '11917'
abstract:
- lang: eng
  text: In this paper we present a speech presence probability (SPP) estimation algorithm which
    exploits both temporal and spectral correlations of speech. To this end, the SPP
    estimation is formulated as the posterior probability estimation of the states
    of a two-dimensional (2D) Hidden Markov Model (HMM). We derive an iterative algorithm
    to decode the 2D-HMM which is based on the turbo principle. The experimental results
    show that indeed the SPP estimates improve from iteration to iteration, and further
    clearly outperform another state-of-the-art SPP estimation algorithm.
author:
- first_name: Dang Hai Tran
  full_name: Vu, Dang Hai Tran
  last_name: Vu
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Vu DHT, Haeb-Umbach R. Using the turbo principle for exploiting temporal and
    spectral correlations in speech presence probability estimation. In: <i>38th International
    Conference on Acoustics, Speech and Signal Processing (ICASSP 2013)</i>. ; 2013:863-867.
    doi:<a href="https://doi.org/10.1109/ICASSP.2013.6637771">10.1109/ICASSP.2013.6637771</a>'
  apa: Vu, D. H. T., &#38; Haeb-Umbach, R. (2013). Using the turbo principle for exploiting
    temporal and spectral correlations in speech presence probability estimation.
    In <i>38th International Conference on Acoustics, Speech and Signal Processing
    (ICASSP 2013)</i> (pp. 863–867). <a href="https://doi.org/10.1109/ICASSP.2013.6637771">https://doi.org/10.1109/ICASSP.2013.6637771</a>
  bibtex: '@inproceedings{Vu_Haeb-Umbach_2013, title={Using the turbo principle for
    exploiting temporal and spectral correlations in speech presence probability estimation},
    DOI={<a href="https://doi.org/10.1109/ICASSP.2013.6637771">10.1109/ICASSP.2013.6637771</a>},
    booktitle={38th International Conference on Acoustics, Speech and Signal Processing
    (ICASSP 2013)}, author={Vu, Dang Hai Tran and Haeb-Umbach, Reinhold}, year={2013},
    pages={863–867} }'
  chicago: Vu, Dang Hai Tran, and Reinhold Haeb-Umbach. “Using the Turbo Principle
    for Exploiting Temporal and Spectral Correlations in Speech Presence Probability
    Estimation.” In <i>38th International Conference on Acoustics, Speech and Signal
    Processing (ICASSP 2013)</i>, 863–67, 2013. <a href="https://doi.org/10.1109/ICASSP.2013.6637771">https://doi.org/10.1109/ICASSP.2013.6637771</a>.
  ieee: D. H. T. Vu and R. Haeb-Umbach, “Using the turbo principle for exploiting
    temporal and spectral correlations in speech presence probability estimation,”
    in <i>38th International Conference on Acoustics, Speech and Signal Processing
    (ICASSP 2013)</i>, 2013, pp. 863–867.
  mla: Vu, Dang Hai Tran, and Reinhold Haeb-Umbach. “Using the Turbo Principle for
    Exploiting Temporal and Spectral Correlations in Speech Presence Probability Estimation.”
    <i>38th International Conference on Acoustics, Speech and Signal Processing (ICASSP
    2013)</i>, 2013, pp. 863–67, doi:<a href="https://doi.org/10.1109/ICASSP.2013.6637771">10.1109/ICASSP.2013.6637771</a>.
  short: 'D.H.T. Vu, R. Haeb-Umbach, in: 38th International Conference on Acoustics,
    Speech and Signal Processing (ICASSP 2013), 2013, pp. 863–867.'
date_created: 2019-07-12T05:30:45Z
date_updated: 2022-01-06T06:51:12Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2013.6637771
keyword:
- correlation methods
- estimation theory
- hidden Markov models
- iterative methods
- probability
- spectral analysis
- speech processing
- 2D HMM
- SPP estimates
- iterative algorithm
- posterior probability estimation
- spectral correlation
- speech presence probability estimation
- state-of-the-art SPP estimation algorithm
- temporal correlation
- turbo principle
- two-dimensional hidden Markov model
- Correlation
- Decoding
- Estimation
- Iterative decoding
- Noise
- Speech
- Vectors
language:
- iso: eng
page: 863-867
publication: 38th International Conference on Acoustics, Speech and Signal Processing
  (ICASSP 2013)
publication_identifier:
  issn:
  - 1520-6149
status: public
title: Using the turbo principle for exploiting temporal and spectral correlations
  in speech presence probability estimation
type: conference
user_id: '44006'
year: '2013'
...
---
_id: '11937'
abstract:
- lang: eng
  text: In automatic speech recognition, hidden Markov models (HMMs) are commonly
    used for speech decoding, while switching linear dynamic models (SLDMs) can be
    employed for a preceding model-based speech feature enhancement. In this paper,
    these model types are combined in order to obtain a novel iterative speech feature
    enhancement and recognition architecture. It is shown that speech feature enhancement
    with SLDMs can be improved by feeding back information from the HMM to the enhancement
    stage. Two different feedback structures are derived. In the first, the posteriors
    of the HMM states are used to control the model probabilities of the SLDMs, while
    in the second they are employed to directly influence the estimate of the speech
    feature distribution. Both approaches lead to improvements in recognition accuracy
    on both the AURORA2 and AURORA4 databases compared to non-iterative speech feature
    enhancement with SLDMs. It is also shown that a combination with uncertainty decoding
    further enhances performance.
author:
- first_name: Stefan
  full_name: Windmann, Stefan
  last_name: Windmann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Windmann S, Haeb-Umbach R. Approaches to Iterative Speech Feature Enhancement
    and Recognition. <i>IEEE Transactions on Audio, Speech, and Language Processing</i>.
    2009;17(5):974-984. doi:<a href="https://doi.org/10.1109/TASL.2009.2014894">10.1109/TASL.2009.2014894</a>
  apa: Windmann, S., &#38; Haeb-Umbach, R. (2009). Approaches to Iterative Speech
    Feature Enhancement and Recognition. <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i>, <i>17</i>(5), 974–984. <a href="https://doi.org/10.1109/TASL.2009.2014894">https://doi.org/10.1109/TASL.2009.2014894</a>
  bibtex: '@article{Windmann_Haeb-Umbach_2009, title={Approaches to Iterative Speech
    Feature Enhancement and Recognition}, volume={17}, DOI={<a href="https://doi.org/10.1109/TASL.2009.2014894">10.1109/TASL.2009.2014894</a>},
    number={5}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
    author={Windmann, Stefan and Haeb-Umbach, Reinhold}, year={2009}, pages={974–984}
    }'
  chicago: 'Windmann, Stefan, and Reinhold Haeb-Umbach. “Approaches to Iterative Speech
    Feature Enhancement and Recognition.” <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i> 17, no. 5 (2009): 974–84. <a href="https://doi.org/10.1109/TASL.2009.2014894">https://doi.org/10.1109/TASL.2009.2014894</a>.'
  ieee: S. Windmann and R. Haeb-Umbach, “Approaches to Iterative Speech Feature Enhancement
    and Recognition,” <i>IEEE Transactions on Audio, Speech, and Language Processing</i>,
    vol. 17, no. 5, pp. 974–984, 2009.
  mla: Windmann, Stefan, and Reinhold Haeb-Umbach. “Approaches to Iterative Speech
    Feature Enhancement and Recognition.” <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i>, vol. 17, no. 5, 2009, pp. 974–84, doi:<a href="https://doi.org/10.1109/TASL.2009.2014894">10.1109/TASL.2009.2014894</a>.
  short: S. Windmann, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language
    Processing 17 (2009) 974–984.
date_created: 2019-07-12T05:31:08Z
date_updated: 2022-01-06T06:51:12Z
department:
- _id: '54'
doi: 10.1109/TASL.2009.2014894
intvolume: '        17'
issue: '5'
keyword:
- AURORA2 databases
- AURORA4 databases
- automatic speech recognition
- feedback structures
- hidden Markov models
- HMM
- iterative methods
- iterative speech feature enhancement
- model probabilities
- speech decoding
- speech enhancement
- speech feature distribution
- speech recognition
- switching linear dynamic models
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2009/WiHa09-1.pdf
oa: '1'
page: 974-984
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: Approaches to Iterative Speech Feature Enhancement and Recognition
type: journal_article
user_id: '44006'
volume: 17
year: '2009'
...
---
_id: '11820'
abstract:
- lang: eng
  text: In this paper, we derive an uncertainty decoding rule for automatic speech
    recognition (ASR), which accounts for both corrupted observations and inter-frame
    correlation. The conditional independence assumption, prevalent in hidden Markov
    model-based ASR, is relaxed to obtain a clean speech posterior that is conditioned
    on the complete observed feature vector sequence. This is a more informative posterior
    than one conditioned only on the current observation. The novel decoding is used
    to obtain a transmission-error robust remote ASR system, where the speech capturing
    unit is connected to the decoder via an error-prone communication network. We
    show how the clean speech posterior can be computed for communication links being
    characterized by either bit errors or packet loss. Recognition results are presented
    for both distributed and network speech recognition, where in the latter case
    common voice-over-IP codecs are employed.
author:
- first_name: Valentin
  full_name: Ion, Valentin
  last_name: Ion
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Ion V, Haeb-Umbach R. A Novel Uncertainty Decoding Rule With Applications to
    Transmission Error Robust Speech Recognition. <i>IEEE Transactions on Audio, Speech,
    and Language Processing</i>. 2008;16(5):1047-1060. doi:<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>
  apa: Ion, V., &#38; Haeb-Umbach, R. (2008). A Novel Uncertainty Decoding Rule With
    Applications to Transmission Error Robust Speech Recognition. <i>IEEE Transactions
    on Audio, Speech, and Language Processing</i>, <i>16</i>(5), 1047–1060. <a href="https://doi.org/10.1109/TASL.2008.925879">https://doi.org/10.1109/TASL.2008.925879</a>
  bibtex: '@article{Ion_Haeb-Umbach_2008, title={A Novel Uncertainty Decoding Rule
    With Applications to Transmission Error Robust Speech Recognition}, volume={16},
    DOI={<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>},
    number={5}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
    author={Ion, Valentin and Haeb-Umbach, Reinhold}, year={2008}, pages={1047–1060}
    }'
  chicago: 'Ion, Valentin, and Reinhold Haeb-Umbach. “A Novel Uncertainty Decoding
    Rule With Applications to Transmission Error Robust Speech Recognition.” <i>IEEE
    Transactions on Audio, Speech, and Language Processing</i> 16, no. 5 (2008): 1047–60.
    <a href="https://doi.org/10.1109/TASL.2008.925879">https://doi.org/10.1109/TASL.2008.925879</a>.'
  ieee: V. Ion and R. Haeb-Umbach, “A Novel Uncertainty Decoding Rule With Applications
    to Transmission Error Robust Speech Recognition,” <i>IEEE Transactions on Audio,
    Speech, and Language Processing</i>, vol. 16, no. 5, pp. 1047–1060, 2008.
  mla: Ion, Valentin, and Reinhold Haeb-Umbach. “A Novel Uncertainty Decoding Rule
    With Applications to Transmission Error Robust Speech Recognition.” <i>IEEE Transactions
    on Audio, Speech, and Language Processing</i>, vol. 16, no. 5, 2008, pp. 1047–60,
    doi:<a href="https://doi.org/10.1109/TASL.2008.925879">10.1109/TASL.2008.925879</a>.
  short: V. Ion, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language
    Processing 16 (2008) 1047–1060.
date_created: 2019-07-12T05:28:53Z
date_updated: 2022-01-06T06:51:10Z
department:
- _id: '54'
doi: 10.1109/TASL.2008.925879
intvolume: '        16'
issue: '5'
keyword:
- automatic speech recognition
- bit errors
- codecs
- communication links
- corrupted observations
- decoding
- distributed speech recognition
- error-prone communication network
- feature vector sequence
- hidden Markov model-based ASR
- hidden Markov models
- inter-frame correlation
- Internet telephony
- network speech recognition
- packet loss
- speech posterior
- speech recognition
- transmission error robust speech recognition
- uncertainty decoding
- voice-over-IP codecs
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2008/IoHa08-1.pdf
oa: '1'
page: 1047-1060
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: A Novel Uncertainty Decoding Rule With Applications to Transmission Error Robust
  Speech Recognition
type: journal_article
user_id: '44006'
volume: 16
year: '2008'
...
---
_id: '11825'
abstract:
- lang: eng
  text: In this paper, we propose an enhanced error concealment strategy at the server
    side of a distributed speech recognition (DSR) system, which is fully compatible
    with the existing DSR standard. It is based on a Bayesian approach, where the
    a posteriori probability density of the error-free feature vector is computed,
    given all received feature vectors which are possibly corrupted by transmission
    errors. Rather than computing a point estimate, such as the MMSE estimate, and
    plugging it into the Bayesian decision rule, we employ uncertainty decoding, which
    results in an integration over the uncertainty in the feature domain. In a typical
    scenario the communication between the thin client, often a mobile device, and
    the recognition server spreads across heterogeneous networks. Both bit errors
    on circuit-switched links and lost data packets on IP connections are mitigated
    by our approach in a unified manner. The experiments reveal improved robustness
    both for small- and large-vocabulary recognition tasks.
author:
- first_name: Valentin
  full_name: Ion, Valentin
  last_name: Ion
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Ion V, Haeb-Umbach R. Uncertainty decoding for distributed speech recognition
    over error-prone networks. <i>Speech Communication</i>. 2006;48(11):1435-1446.
    doi:<a href="https://doi.org/10.1016/j.specom.2006.03.007">10.1016/j.specom.2006.03.007</a>
  apa: Ion, V., &#38; Haeb-Umbach, R. (2006). Uncertainty decoding for distributed
    speech recognition over error-prone networks. <i>Speech Communication</i>, <i>48</i>(11),
    1435–1446. <a href="https://doi.org/10.1016/j.specom.2006.03.007">https://doi.org/10.1016/j.specom.2006.03.007</a>
  bibtex: '@article{Ion_Haeb-Umbach_2006, title={Uncertainty decoding for distributed
    speech recognition over error-prone networks}, volume={48}, DOI={<a href="https://doi.org/10.1016/j.specom.2006.03.007">10.1016/j.specom.2006.03.007</a>},
    number={11}, journal={Speech Communication}, author={Ion, Valentin and Haeb-Umbach,
    Reinhold}, year={2006}, pages={1435–1446} }'
  chicago: 'Ion, Valentin, and Reinhold Haeb-Umbach. “Uncertainty Decoding for Distributed
    Speech Recognition over Error-Prone Networks.” <i>Speech Communication</i> 48,
    no. 11 (2006): 1435–46. <a href="https://doi.org/10.1016/j.specom.2006.03.007">https://doi.org/10.1016/j.specom.2006.03.007</a>.'
  ieee: V. Ion and R. Haeb-Umbach, “Uncertainty decoding for distributed speech recognition
    over error-prone networks,” <i>Speech Communication</i>, vol. 48, no. 11, pp.
    1435–1446, 2006.
  mla: Ion, Valentin, and Reinhold Haeb-Umbach. “Uncertainty Decoding for Distributed
    Speech Recognition over Error-Prone Networks.” <i>Speech Communication</i>, vol.
    48, no. 11, 2006, pp. 1435–46, doi:<a href="https://doi.org/10.1016/j.specom.2006.03.007">10.1016/j.specom.2006.03.007</a>.
  short: V. Ion, R. Haeb-Umbach, Speech Communication 48 (2006) 1435–1446.
date_created: 2019-07-12T05:28:59Z
date_updated: 2022-01-06T06:51:10Z
department:
- _id: '54'
doi: 10.1016/j.specom.2006.03.007
intvolume: '        48'
issue: '11'
keyword:
- Channel error robustness
- Distributed speech recognition
- Soft features
- Uncertainty decoding
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2006/IoHa06-3.pdf
oa: '1'
page: 1435-1446
publication: Speech Communication
status: public
title: Uncertainty decoding for distributed speech recognition over error-prone networks
type: journal_article
user_id: '44006'
volume: 48
year: '2006'
...
