A Study on Online Source Extraction in the Presence of Changing Speaker Positions
J. Heitkaemper, T. Feher, M. Freitag, R. Haeb-Umbach, in: International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia, 2019.
Conference Paper | English
Author
Heitkaemper, Jens;
Feher, Thomas;
Freitag, Michael;
Haeb-Umbach, Reinhold
Abstract
Multi-talker speech and moving speakers still pose a significant challenge to automatic speech recognition systems. Assuming an enrollment utterance of the target speaker is available, the so-called SpeakerBeam concept has recently been proposed to extract the target speaker from a speech mixture. If multi-channel input is available, spatial properties of the speaker can be exploited to support the source extraction. In this contribution we investigate different approaches to exploit such spatial information. In particular, we are interested in the question of how useful this information is if the target speaker changes his/her position. To this end, we present a SpeakerBeam-based source extraction network that is adapted to work on moving speakers by recursively updating the beamformer coefficients. Experimental results are presented on two data sets, one with artificially created room impulse responses, and one with real room impulse responses and noise recorded in a conference room. Interestingly, spatial features turn out to be advantageous even if the speaker position changes.
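The abstract does not state which beamformer or which update rule the paper uses, so the following is only an illustrative sketch of what "recursively updating the beamformer coefficients" can look like in general. It assumes a mask-based MVDR beamformer in the reference-channel formulation, spatial covariance matrices tracked with an exponential forgetting factor, and a target mask that is already available from some estimator. The function names, the forgetting factor `alpha`, and the diagonal `loading` value are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def update_covariance(phi_prev, y, mask, alpha=0.95):
    """Exponentially weighted update of a (D, D) spatial covariance matrix.

    y is the (D,) complex observation of one frame in one frequency bin,
    mask in [0, 1] weights how much this frame belongs to the source.
    alpha is an assumed forgetting factor, not a value from the paper.
    """
    return alpha * phi_prev + (1.0 - alpha) * mask * np.outer(y, y.conj())


def mvdr_weights(phi_target, phi_noise, ref_channel=0, loading=1e-6):
    """MVDR weights from target and noise covariance matrices."""
    num_channels = phi_noise.shape[0]
    # Diagonal loading keeps the linear solve stable while the noise
    # covariance estimate is still close to singular.
    ratio = np.linalg.solve(phi_noise + loading * np.eye(num_channels), phi_target)
    return ratio[:, ref_channel] / (np.trace(ratio) + 1e-12)


def online_extract(Y, target_mask, alpha=0.95, loading=1e-6):
    """Frame-by-frame extraction for a single frequency bin.

    Y has shape (T, D): T frames, D microphones.
    target_mask has shape (T,) with values in [0, 1].
    """
    num_frames, num_channels = Y.shape
    phi_target = np.zeros((num_channels, num_channels), dtype=complex)
    phi_noise = np.zeros((num_channels, num_channels), dtype=complex)
    out = np.zeros(num_frames, dtype=complex)
    for t in range(num_frames):
        # Update both statistics with the current frame, then recompute
        # the weights so they can follow a change in speaker position.
        phi_target = update_covariance(phi_target, Y[t], target_mask[t], alpha)
        phi_noise = update_covariance(phi_noise, Y[t], 1.0 - target_mask[t], alpha)
        w = mvdr_weights(phi_target, phi_noise, loading=loading)
        out[t] = np.vdot(w, Y[t])  # w^H y
    return out
```

In this sketch a smaller `alpha` lets the weights adapt faster to a position change at the cost of noisier covariance estimates, which is the usual trade-off for any recursive estimator of this kind.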
Publishing Year
2019
Proceedings Title
International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia
LibreCat-ID
Cite this
Heitkaemper J, Feher T, Freitag M, Haeb-Umbach R. A Study on Online Source Extraction in the Presence of Changing Speaker Positions. In: International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia; 2019.
Heitkaemper, J., Feher, T., Freitag, M., & Haeb-Umbach, R. (2019). A Study on Online Source Extraction in the Presence of Changing Speaker Positions. In International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia.
@inproceedings{Heitkaemper_Feher_Freitag_Haeb-Umbach_2019, title={A Study on Online Source Extraction in the Presence of Changing Speaker Positions}, booktitle={International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia}, author={Heitkaemper, Jens and Feher, Thomas and Freitag, Michael and Haeb-Umbach, Reinhold}, year={2019} }
Heitkaemper, Jens, Thomas Feher, Michael Freitag, and Reinhold Haeb-Umbach. “A Study on Online Source Extraction in the Presence of Changing Speaker Positions.” In International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia, 2019.
J. Heitkaemper, T. Feher, M. Freitag, and R. Haeb-Umbach, “A Study on Online Source Extraction in the Presence of Changing Speaker Positions,” in International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia, 2019.
Heitkaemper, Jens, et al. “A Study on Online Source Extraction in the Presence of Changing Speaker Positions.” International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia, 2019.
All files available under the following license(s):
Creative Commons Public Domain Dedication (CC0 1.0):
Main File(s)
File Name
SLSP_2019_Heitkaemper_Paper.pdf
578.60 KB
Access Level
Open Access
Last Uploaded
2019-11-08T07:47:12Z