---
_id: '11753'
abstract:
- lang: eng
text: This contribution describes a step-wise source counting algorithm to determine
the number of speakers in an offline scenario. Each speaker is identified by a
variational expectation maximization (VEM) algorithm for complex Watson mixture
models and therefore directly yields beamforming vectors for a subsequent speech
separation process. An observation selection criterion is proposed which improves
the robustness of the source counting in noise. The algorithm is compared to an
alternative VEM approach with Gaussian mixture models based on directions of arrival
and shown to deliver improved source counting accuracy. The article concludes
by extending the offline algorithm towards a low-latency online estimation of
the number of active sources from the streaming input data.
author:
- first_name: Lukas
full_name: Drude, Lukas
id: '11213'
last_name: Drude
- first_name: Aleksej
full_name: Chinaev, Aleksej
last_name: Chinaev
- first_name: Dang Hai
full_name: Tran Vu, Dang Hai
last_name: Tran Vu
- first_name: Reinhold
full_name: Haeb-Umbach, Reinhold
id: '242'
last_name: Haeb-Umbach
citation:
ama: 'Drude L, Chinaev A, Tran Vu DH, Haeb-Umbach R. Towards Online Source Counting
in Speech Mixtures Applying a Variational EM for Complex Watson Mixture Models.
In: 14th International Workshop on Acoustic Signal Enhancement (IWAENC 2014).
; 2014:213-217.'
apa: Drude, L., Chinaev, A., Tran Vu, D. H., & Haeb-Umbach, R. (2014). Towards
Online Source Counting in Speech Mixtures Applying a Variational EM for Complex
Watson Mixture Models. In 14th International Workshop on Acoustic Signal Enhancement
(IWAENC 2014) (pp. 213–217).
bibtex: '@inproceedings{Drude_Chinaev_Tran Vu_Haeb-Umbach_2014, title={Towards Online
Source Counting in Speech Mixtures Applying a Variational EM for Complex Watson
Mixture Models}, booktitle={14th International Workshop on Acoustic Signal Enhancement
(IWAENC 2014)}, author={Drude, Lukas and Chinaev, Aleksej and Tran Vu, Dang Hai
and Haeb-Umbach, Reinhold}, year={2014}, pages={213–217} }'
chicago: Drude, Lukas, Aleksej Chinaev, Dang Hai Tran Vu, and Reinhold Haeb-Umbach.
“Towards Online Source Counting in Speech Mixtures Applying a Variational EM for
Complex Watson Mixture Models.” In 14th International Workshop on Acoustic
Signal Enhancement (IWAENC 2014), 213–17, 2014.
ieee: L. Drude, A. Chinaev, D. H. Tran Vu, and R. Haeb-Umbach, “Towards Online Source
Counting in Speech Mixtures Applying a Variational EM for Complex Watson Mixture
Models,” in 14th International Workshop on Acoustic Signal Enhancement (IWAENC
2014), 2014, pp. 213–217.
mla: Drude, Lukas, et al. “Towards Online Source Counting in Speech Mixtures Applying
a Variational EM for Complex Watson Mixture Models.” 14th International Workshop
on Acoustic Signal Enhancement (IWAENC 2014), 2014, pp. 213–17.
short: 'L. Drude, A. Chinaev, D.H. Tran Vu, R. Haeb-Umbach, in: 14th International
Workshop on Acoustic Signal Enhancement (IWAENC 2014), 2014, pp. 213–217.'
date_created: 2019-07-12T05:27:35Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
keyword:
- Accuracy
- Acoustics
- Estimation
- Mathematical model
- Source separation
- Speech
- Vectors
- Bayes methods
- Blind source separation
- Directional statistics
- Number of speakers
- Speaker diarization
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://groups.uni-paderborn.de/nt/pubs/2014/DrChTrHaeb14.pdf
oa: '1'
page: 213-217
publication: 14th International Workshop on Acoustic Signal Enhancement (IWAENC 2014)
related_material:
link:
- description: Poster
relation: supplementary_material
url: https://groups.uni-paderborn.de/nt/pubs/2014/DrChTrHaeb14_Poster.pdf
status: public
title: Towards Online Source Counting in Speech Mixtures Applying a Variational EM
for Complex Watson Mixture Models
type: conference
user_id: '44006'
year: '2014'
...
---
_id: '11716'
abstract:
- lang: eng
text: The accuracy of automatic speech recognition systems in noisy and reverberant
environments can be improved notably by exploiting the uncertainty of the estimated
speech features using so-called uncertainty-of-observation techniques. In this
paper, we introduce a new Bayesian decision rule that can serve as a mathematical
framework from which both known and new uncertainty-of-observation techniques
can be either derived or approximated. The new decision rule in its direct form
leads to the new significance decoding approach for Gaussian mixture models, which
results in better performance compared to standard uncertainty-of-observation
techniques in different additive and convolutive noise scenarios.
author:
- first_name: Ahmed H.
full_name: Abdelaziz, Ahmed H.
last_name: Abdelaziz
- first_name: Steffen
full_name: Zeiler, Steffen
last_name: Zeiler
- first_name: Dorothea
full_name: Kolossa, Dorothea
last_name: Kolossa
- first_name: Volker
full_name: Leutnant, Volker
last_name: Leutnant
- first_name: Reinhold
full_name: Haeb-Umbach, Reinhold
id: '242'
last_name: Haeb-Umbach
citation:
ama: 'Abdelaziz AH, Zeiler S, Kolossa D, Leutnant V, Haeb-Umbach R. GMM-based significance
decoding. In: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
Conference On. ; 2013:6827-6831. doi:10.1109/ICASSP.2013.6638984'
apa: Abdelaziz, A. H., Zeiler, S., Kolossa, D., Leutnant, V., & Haeb-Umbach,
R. (2013). GMM-based significance decoding. In Acoustics, Speech and Signal
Processing (ICASSP), 2013 IEEE International Conference on (pp. 6827–6831).
https://doi.org/10.1109/ICASSP.2013.6638984
bibtex: '@inproceedings{Abdelaziz_Zeiler_Kolossa_Leutnant_Haeb-Umbach_2013, title={GMM-based
significance decoding}, DOI={10.1109/ICASSP.2013.6638984},
booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
Conference on}, author={Abdelaziz, Ahmed H. and Zeiler, Steffen and Kolossa, Dorothea
and Leutnant, Volker and Haeb-Umbach, Reinhold}, year={2013}, pages={6827–6831}
}'
chicago: Abdelaziz, Ahmed H., Steffen Zeiler, Dorothea Kolossa, Volker Leutnant,
and Reinhold Haeb-Umbach. “GMM-Based Significance Decoding.” In Acoustics,
Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On,
6827–31, 2013. https://doi.org/10.1109/ICASSP.2013.6638984.
ieee: A. H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, and R. Haeb-Umbach, “GMM-based
significance decoding,” in Acoustics, Speech and Signal Processing (ICASSP),
2013 IEEE International Conference on, 2013, pp. 6827–6831.
mla: Abdelaziz, Ahmed H., et al. “GMM-Based Significance Decoding.” Acoustics,
Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On,
2013, pp. 6827–31, doi:10.1109/ICASSP.2013.6638984.
short: 'A.H. Abdelaziz, S. Zeiler, D. Kolossa, V. Leutnant, R. Haeb-Umbach, in:
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference
On, 2013, pp. 6827–6831.'
date_created: 2019-07-12T05:26:53Z
date_updated: 2022-01-06T06:51:07Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2013.6638984
keyword:
- Bayes methods
- Gaussian processes
- convolution
- decision theory
- decoding
- noise
- reverberation
- speech coding
- speech recognition
- Bayesian decision rule
- GMM
- Gaussian mixture models
- additive noise scenarios
- automatic speech recognition systems
- convolutive noise scenarios
- decoding approach
- mathematical framework
- reverberant environments
- significance decoding
- speech feature estimation
- uncertainty-of-observation techniques
- Hidden Markov models
- Maximum likelihood decoding
- Noise
- Speech
- Speech recognition
- Uncertainty
- Uncertainty-of-observation
- modified imputation
- noise robust speech recognition
- uncertainty decoding
language:
- iso: eng
page: 6827-6831
publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International
Conference on
publication_identifier:
issn:
- 1520-6149
status: public
title: GMM-based significance decoding
type: conference
user_id: '44006'
year: '2013'
...
---
_id: '11862'
abstract:
- lang: eng
text: In this contribution we extend a previously proposed Bayesian approach for
the enhancement of reverberant logarithmic mel power spectral coefficients for
robust automatic speech recognition to the additional compensation of background
noise. A recently proposed observation model is employed whose time-variant observation
error statistics are obtained as a side product of the inference of the a posteriori
probability density function of the clean speech feature vectors. Further, a reduction
of the computational effort and the memory requirements is achieved by using
a recursive formulation of the observation model. The performance of the proposed
algorithms is first experimentally studied on a connected digits recognition task
with artificially created noisy reverberant data. It is shown that the use of
the time-variant observation error model leads to a significant error rate reduction
at low signal-to-noise ratios compared to a time-invariant model. Further experiments
were conducted on a 5000 word task recorded in a reverberant and noisy environment.
A significant word error rate reduction was obtained, demonstrating the effectiveness
of the approach on real-world data.
author:
- first_name: Volker
full_name: Leutnant, Volker
last_name: Leutnant
- first_name: Alexander
full_name: Krueger, Alexander
last_name: Krueger
- first_name: Reinhold
full_name: Haeb-Umbach, Reinhold
id: '242'
last_name: Haeb-Umbach
citation:
ama: Leutnant V, Krueger A, Haeb-Umbach R. Bayesian Feature Enhancement for Reverberation
and Noise Robust Speech Recognition. IEEE Transactions on Audio, Speech, and
Language Processing. 2013;21(8):1640-1652. doi:10.1109/TASL.2013.2258013
apa: Leutnant, V., Krueger, A., & Haeb-Umbach, R. (2013). Bayesian Feature Enhancement
for Reverberation and Noise Robust Speech Recognition. IEEE Transactions on
Audio, Speech, and Language Processing, 21(8), 1640–1652. https://doi.org/10.1109/TASL.2013.2258013
bibtex: '@article{Leutnant_Krueger_Haeb-Umbach_2013, title={Bayesian Feature Enhancement
for Reverberation and Noise Robust Speech Recognition}, volume={21}, DOI={10.1109/TASL.2013.2258013},
number={8}, journal={IEEE Transactions on Audio, Speech, and Language Processing},
author={Leutnant, Volker and Krueger, Alexander and Haeb-Umbach, Reinhold}, year={2013},
pages={1640–1652} }'
chicago: 'Leutnant, Volker, Alexander Krueger, and Reinhold Haeb-Umbach. “Bayesian
Feature Enhancement for Reverberation and Noise Robust Speech Recognition.” IEEE
Transactions on Audio, Speech, and Language Processing 21, no. 8 (2013): 1640–52.
https://doi.org/10.1109/TASL.2013.2258013.'
ieee: V. Leutnant, A. Krueger, and R. Haeb-Umbach, “Bayesian Feature Enhancement
for Reverberation and Noise Robust Speech Recognition,” IEEE Transactions on
Audio, Speech, and Language Processing, vol. 21, no. 8, pp. 1640–1652, 2013.
mla: Leutnant, Volker, et al. “Bayesian Feature Enhancement for Reverberation and
Noise Robust Speech Recognition.” IEEE Transactions on Audio, Speech, and Language
Processing, vol. 21, no. 8, 2013, pp. 1640–52, doi:10.1109/TASL.2013.2258013.
short: V. Leutnant, A. Krueger, R. Haeb-Umbach, IEEE Transactions on Audio, Speech,
and Language Processing 21 (2013) 1640–1652.
date_created: 2019-07-12T05:29:42Z
date_updated: 2022-01-06T06:51:11Z
department:
- _id: '54'
doi: 10.1109/TASL.2013.2258013
intvolume: ' 21'
issue: '8'
keyword:
- Bayes methods
- compensation
- error statistics
- reverberation
- speech recognition
- Bayesian feature enhancement
- background noise
- clean speech feature vectors
- connected digits recognition task
- memory requirements
- noisy reverberant data
- posteriori probability density function
- recursive formulation
- reverberant logarithmic mel power spectral coefficients
- robust automatic speech recognition
- signal-to-noise ratios
- time-variant observation
- word error rate reduction
- Robust automatic speech recognition
- model-based Bayesian feature enhancement
- observation model for reverberant and noisy speech
- recursive observation model
language:
- iso: eng
page: 1640-1652
publication: IEEE Transactions on Audio, Speech, and Language Processing
status: public
title: Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition
type: journal_article
user_id: '44006'
volume: 21
year: '2013'
...
---
_id: '11939'
abstract:
- lang: eng
text: In this paper a switching linear dynamical model (SLDM) approach for speech
feature enhancement is improved by employing more accurate models for the dynamics
of speech and noise. The model of the clean speech feature trajectory is improved
by augmenting the state vector to capture information derived from the delta features.
Further, a hidden noise state variable is introduced to obtain a more elaborate
model for the noise dynamics. Approximate Bayesian inference in the SLDM is carried
out by a bank of extended Kalman filters, whose outputs are combined according
to the a posteriori probability of the individual state models. Experimental results
on the AURORA2 database show improved recognition accuracy.
author:
- first_name: Stefan
full_name: Windmann, Stefan
last_name: Windmann
- first_name: Reinhold
full_name: Haeb-Umbach, Reinhold
id: '242'
last_name: Haeb-Umbach
citation:
ama: 'Windmann S, Haeb-Umbach R. Modeling the dynamics of speech and noise for speech
feature enhancement in ASR. In: IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP 2008). ; 2008:4409-4412. doi:10.1109/ICASSP.2008.4518633'
apa: Windmann, S., & Haeb-Umbach, R. (2008). Modeling the dynamics of speech
and noise for speech feature enhancement in ASR. In IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP 2008) (pp. 4409–4412).
https://doi.org/10.1109/ICASSP.2008.4518633
bibtex: '@inproceedings{Windmann_Haeb-Umbach_2008, title={Modeling the dynamics
of speech and noise for speech feature enhancement in ASR}, DOI={10.1109/ICASSP.2008.4518633},
booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP 2008)}, author={Windmann, Stefan and Haeb-Umbach, Reinhold}, year={2008},
pages={4409–4412} }'
chicago: Windmann, Stefan, and Reinhold Haeb-Umbach. “Modeling the Dynamics of Speech
and Noise for Speech Feature Enhancement in ASR.” In IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP 2008), 4409–12, 2008. https://doi.org/10.1109/ICASSP.2008.4518633.
ieee: S. Windmann and R. Haeb-Umbach, “Modeling the dynamics of speech and noise
for speech feature enhancement in ASR,” in IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–4412.
mla: Windmann, Stefan, and Reinhold Haeb-Umbach. “Modeling the Dynamics of Speech
and Noise for Speech Feature Enhancement in ASR.” IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–12,
doi:10.1109/ICASSP.2008.4518633.
short: 'S. Windmann, R. Haeb-Umbach, in: IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP 2008), 2008, pp. 4409–4412.'
date_created: 2019-07-12T05:31:11Z
date_updated: 2022-01-06T06:51:12Z
department:
- _id: '54'
doi: 10.1109/ICASSP.2008.4518633
keyword:
- a posteriori probability
- AURORA2 database
- Bayesian inference
- Bayes methods
- channel bank filters
- extended Kalman filter banks
- hidden noise state variable
- Kalman filters
- noise dynamics
- speech enhancement
- speech feature enhancement
- speech feature trajectory
- switching linear dynamical model approach
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://groups.uni-paderborn.de/nt/pubs/2008/WiHa08-1.pdf
oa: '1'
page: 4409-4412
publication: IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP 2008)
status: public
title: Modeling the dynamics of speech and noise for speech feature enhancement in
ASR
type: conference
user_id: '44006'
year: '2008'
...
---
_id: '11870'
abstract:
- lang: eng
text: We derive a class of computationally inexpensive linear dimension reduction
criteria by introducing a weighted variant of the well-known K-class Fisher criterion
associated with linear discriminant analysis (LDA). It can be seen that LDA weights
contributions of individual class pairs according to the Euclidean distance of
the respective class means. We generalize upon LDA by introducing a different
weighting function.
author:
- first_name: M.
full_name: Loog, M.
last_name: Loog
- first_name: R.P.W.
full_name: Duin, R.P.W.
last_name: Duin
- first_name: Reinhold
full_name: Haeb-Umbach, Reinhold
id: '242'
last_name: Haeb-Umbach
citation:
ama: Loog M, Duin RPW, Haeb-Umbach R. Multiclass linear dimension reduction by weighted
pairwise Fisher criteria. IEEE Transactions on Pattern Analysis and Machine
Intelligence. 2001;23(7):762-766. doi:10.1109/34.935849
apa: Loog, M., Duin, R. P. W., & Haeb-Umbach, R. (2001). Multiclass linear dimension
reduction by weighted pairwise Fisher criteria. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 23(7), 762–766. https://doi.org/10.1109/34.935849
bibtex: '@article{Loog_Duin_Haeb-Umbach_2001, title={Multiclass linear dimension
reduction by weighted pairwise Fisher criteria}, volume={23}, DOI={10.1109/34.935849},
number={7}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
author={Loog, M. and Duin, R.P.W. and Haeb-Umbach, Reinhold}, year={2001}, pages={762–766}
}'
chicago: 'Loog, M., R.P.W. Duin, and Reinhold Haeb-Umbach. “Multiclass Linear Dimension
Reduction by Weighted Pairwise Fisher Criteria.” IEEE Transactions on Pattern
Analysis and Machine Intelligence 23, no. 7 (2001): 762–66. https://doi.org/10.1109/34.935849.'
ieee: M. Loog, R. P. W. Duin, and R. Haeb-Umbach, “Multiclass linear dimension reduction
by weighted pairwise Fisher criteria,” IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 23, no. 7, pp. 762–766, 2001.
mla: Loog, M., et al. “Multiclass Linear Dimension Reduction by Weighted Pairwise
Fisher Criteria.” IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 23, no. 7, 2001, pp. 762–66, doi:10.1109/34.935849.
short: M. Loog, R.P.W. Duin, R. Haeb-Umbach, IEEE Transactions on Pattern Analysis
and Machine Intelligence 23 (2001) 762–766.
date_created: 2019-07-12T05:29:51Z
date_updated: 2022-01-06T06:51:11Z
department:
- _id: '54'
doi: 10.1109/34.935849
intvolume: ' 23'
issue: '7'
keyword:
- approximate pairwise accuracy
- Bayes error
- Bayes methods
- error statistics
- Euclidean distance
- Fisher criterion
- linear dimension reduction
- linear discriminant analysis
- pattern classification
- statistical analysis
- statistical pattern classification
- weighting function
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://groups.uni-paderborn.de/nt/pubs/2001/LoDuHa01.pdf
oa: '1'
page: 762-766
publication: IEEE Transactions on Pattern Analysis and Machine Intelligence
status: public
title: Multiclass linear dimension reduction by weighted pairwise Fisher criteria
type: journal_article
user_id: '44006'
volume: 23
year: '2001'
...