---
_id: '11846'
abstract:
- lang: eng
  text: In this paper, we present a new technique for automatic speech recognition
    (ASR) in reverberant environments. Our approach is aimed at the enhancement of
    the logarithmic Mel power spectrum, which is computed at an intermediate stage
    to obtain the widely used Mel frequency cepstral coefficients (MFCCs). Given the
    reverberant logarithmic Mel power spectral coefficients (LMPSCs), a minimum mean
    square error estimate of the clean LMPSCs is computed by carrying out Bayesian
    inference. We employ switching linear dynamical models as an a priori model for
    the dynamics of the clean LMPSCs. Further, we derive a stochastic observation
    model which relates the clean to the reverberant LMPSCs through a simplified model
    of the room impulse response (RIR). This model requires only two parameters, namely
    RIR energy and reverberation time, which can be estimated from the captured microphone
    signal. The performance of the proposed enhancement technique is studied on the
    AURORA5 database and compared to that of constrained maximum-likelihood linear
    regression (CMLLR). It is shown by experimental results that our approach significantly
    outperforms CMLLR and that up to 80% of the errors caused by the reverberation
    are recovered. In addition to the fact that the approach is compatible with the
    standard MFCC feature vectors, it leaves the ASR back-end unchanged. It is of
    moderate computational complexity and suitable for real time applications.
author:
- first_name: Alexander
  full_name: Krueger, Alexander
  last_name: Krueger
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Krueger A, Haeb-Umbach R. Model-Based Feature Enhancement for Reverberant Speech
    Recognition. <i>IEEE Transactions on Audio, Speech, and Language Processing</i>.
    2010;18(7):1692-1707. doi:<a href="https://doi.org/10.1109/TASL.2010.2049684">10.1109/TASL.2010.2049684</a>
  apa: Krueger, A., &#38; Haeb-Umbach, R. (2010). Model-Based Feature Enhancement
    for Reverberant Speech Recognition. <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i>, <i>18</i>(7), 1692–1707. <a href="https://doi.org/10.1109/TASL.2010.2049684">https://doi.org/10.1109/TASL.2010.2049684</a>
  bibtex: '@article{Krueger_Haeb-Umbach_2010, title={Model-Based Feature Enhancement
    for Reverberant Speech Recognition}, volume={18}, DOI={<a href="https://doi.org/10.1109/TASL.2010.2049684">10.1109/TASL.2010.2049684</a>},
    number={7}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
    author={Krueger, Alexander and Haeb-Umbach, Reinhold}, year={2010}, pages={1692–1707}
    }'
  chicago: 'Krueger, Alexander, and Reinhold Haeb-Umbach. “Model-Based Feature Enhancement
    for Reverberant Speech Recognition.” <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i> 18, no. 7 (2010): 1692–1707. <a href="https://doi.org/10.1109/TASL.2010.2049684">https://doi.org/10.1109/TASL.2010.2049684</a>.'
  ieee: A. Krueger and R. Haeb-Umbach, “Model-Based Feature Enhancement for Reverberant
    Speech Recognition,” <i>IEEE Transactions on Audio, Speech, and Language Processing</i>,
    vol. 18, no. 7, pp. 1692–1707, 2010.
  mla: Krueger, Alexander, and Reinhold Haeb-Umbach. “Model-Based Feature Enhancement
    for Reverberant Speech Recognition.” <i>IEEE Transactions on Audio, Speech, and
    Language Processing</i>, vol. 18, no. 7, 2010, pp. 1692–707, doi:<a href="https://doi.org/10.1109/TASL.2010.2049684">10.1109/TASL.2010.2049684</a>.
  short: A. Krueger, R. Haeb-Umbach, IEEE Transactions on Audio, Speech, and Language
    Processing 18 (2010) 1692–1707.
date_created: 2019-07-12T05:29:23Z
date_updated: 2022-01-06T06:51:11Z
department:
- _id: '54'
doi: 10.1109/TASL.2010.2049684
intvolume: '18'
issue: '7'
keyword:
- ASR
- AURORA5 database
- automatic speech recognition
- Bayesian inference
- belief networks
- CMLLR
- computational complexity
- constrained maximum likelihood linear regression
- least mean squares methods
- LMPSC computation
- logarithmic Mel power spectrum
- maximum likelihood estimation
- Mel frequency cepstral coefficients
- MFCC feature vectors
- microphone signal
- minimum mean square error estimation
- model-based feature enhancement
- regression analysis
- reverberant speech recognition
- reverberation
- RIR energy
- room impulse response
- speech recognition
- stochastic observation model
- stochastic processes
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2010/KrHa10.pdf
oa: '1'
page: 1692-1707
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: Model-Based Feature Enhancement for Reverberant Speech Recognition
type: journal_article
user_id: '44006'
volume: 18
year: '2010'
...