Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR

N. Kanda, C. Boeddeker, J. Heitkaemper, Y. Fujita, S. Horiguchi, R. Haeb-Umbach, in: INTERSPEECH 2019, Graz, Austria, 2019.

Download
OA INTERSPEECH_2019_Boeddeker_Paper.pdf 216.20 KB
Conference Paper | English
Author
Kanda, Naoyuki; Boeddeker, Christoph; Heitkaemper, Jens; Fujita, Yusuke; Horiguchi, Shota; Haeb-Umbach, Reinhold
Abstract
In this paper, we present Hitachi and Paderborn University’s joint effort for automatic speech recognition (ASR) in a dinner party scenario. The main challenges of ASR systems for dinner party recordings obtained by multiple microphone arrays are (1) heavy speech overlaps, (2) severe noise and reverberation, (3) very natural conversational content, and possibly (4) insufficient training data. As an example of a dinner party scenario, we have chosen the data presented during the CHiME-5 speech recognition challenge, where the baseline ASR had a 73.3% word error rate (WER), and even the best performing system at the CHiME-5 challenge had a 46.1% WER. We extensively investigated a combination of the guided source separation-based speech enhancement technique and an already proposed strong ASR backend and found that a tight combination of these techniques provided substantial accuracy improvements. Our final system achieved WERs of 39.94% and 41.64% for the development and evaluation data, respectively, both of which are the best published results for the dataset. We also investigated with additional training data on the official small data in the CHiME-5 corpus to assess the intrinsic difficulty of this ASR task.
Publishing Year
2019
Proceedings Title
INTERSPEECH 2019, Graz, Austria
LibreCat-ID

Cite this

Kanda N, Boeddeker C, Heitkaemper J, Fujita Y, Horiguchi S, Haeb-Umbach R. Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR. In: INTERSPEECH 2019, Graz, Austria; 2019.
Kanda, N., Boeddeker, C., Heitkaemper, J., Fujita, Y., Horiguchi, S., & Haeb-Umbach, R. (2019). Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR. In INTERSPEECH 2019, Graz, Austria.
@inproceedings{Kanda_Boeddeker_Heitkaemper_Fujita_Horiguchi_Haeb-Umbach_2019, title={Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR}, booktitle={INTERSPEECH 2019, Graz, Austria}, author={Kanda, Naoyuki and Boeddeker, Christoph and Heitkaemper, Jens and Fujita, Yusuke and Horiguchi, Shota and Haeb-Umbach, Reinhold}, year={2019} }
Kanda, Naoyuki, Christoph Boeddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, and Reinhold Haeb-Umbach. “Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.” In INTERSPEECH 2019, Graz, Austria, 2019.
N. Kanda, C. Boeddeker, J. Heitkaemper, Y. Fujita, S. Horiguchi, and R. Haeb-Umbach, “Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR,” in INTERSPEECH 2019, Graz, Austria, 2019.
Kanda, Naoyuki, et al. “Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.” INTERSPEECH 2019, Graz, Austria, 2019.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
Access Level
OA Open Access
Last Uploaded
2019-11-08T07:45:15Z

