All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis

T. von Neumann, K. Kinoshita, M. Delcroix, S. Araki, T. Nakatani, R. Haeb-Umbach, in: ICASSP 2019, Brighton, UK, 2019.

Download
OA ICASSP_2019_Neumann_Paper.pdf 126.45 KB
Conference Paper | English
Author
von Neumann, Thilo; Kinoshita, Keisuke; Delcroix, Marc; Araki, Shoko; Nakatani, Tomohiro; Haeb-Umbach, Reinhold
Abstract
Automatic meeting analysis comprises the tasks of speaker counting, speaker diarization, and the separation of overlapped speech, followed by automatic speech recognition. All of this has to be carried out on arbitrarily long sessions and, ideally, in an online or block-online manner. While significant progress has been made on the individual tasks, this paper presents, for the first time, an all-neural approach to simultaneous speaker counting, diarization, and source separation. The NN-based estimator operates in a block-online fashion and tracks speakers even if they remain silent for a number of time blocks, thus learning a stable output order for the separated sources. The neural network is recurrent over time as well as over the number of sources. Simulation experiments show that the approach achieves state-of-the-art separation performance while also delivering good diarization and source counting results. It even generalizes well to numbers of blocks larger than those seen during training.
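The abstract gives no implementation details, so the following is only a toy sketch of the control flow it describes: an outer loop over time blocks, an inner loop over output slots (sources), and per-slot state carried from block to block so that each speaker keeps the same output slot even after being silent. Everything here is an assumption made for illustration: `toy_estimator` is a hand-written stand-in for the neural network, the "spectrum" is a short list of energies with one speaker per bin, and the residual-mask bookkeeping, the threshold, and all names are hypothetical rather than taken from the paper.

```python
from typing import Callable, List, Optional, Tuple

Block = List[float]    # toy "spectrum": one energy value per frequency bin
State = Optional[int]  # toy per-slot memory: index of the bin this slot tracks


def toy_estimator(block: Block, residual: List[float], state: State,
                  threshold: float = 0.5) -> Tuple[List[float], State, bool]:
    """Hypothetical stand-in for the network (NOT the paper's model).

    Returns (mask, new_state, active) for one output slot in one block.
    """
    mask = [0.0] * len(block)
    if state is None:
        # Free slot: claim the strongest bin not yet explained in this block.
        candidates = [i for i in range(len(block))
                      if residual[i] > 0.0 and block[i] > threshold]
        if not candidates:
            return mask, None, False
        state = max(candidates, key=lambda i: block[i])
    # A slot that already tracks a speaker only fires when that speaker is active.
    active = block[state] > threshold and residual[state] > 0.0
    if active:
        mask[state] = 1.0
    return mask, state, active


def process_stream(blocks: List[Block], estimator: Callable,
                   max_sources: int = 3) -> List[List[bool]]:
    """Block-online loop: recurrent over time (states) and over sources (residual)."""
    states: List[State] = [None] * max_sources  # carried across blocks
    diarization = []
    for block in blocks:                         # recursion over time
        residual = [1.0] * len(block)            # nothing explained yet
        activity = []
        for slot in range(max_sources):          # recursion over sources
            mask, states[slot], active = estimator(block, residual, states[slot])
            residual = [r - m for r, m in zip(residual, mask)]
            activity.append(active)
        diarization.append(activity)
    return diarization


def count_speakers(diarization: List[List[bool]]) -> int:
    """Number of output slots that were active in at least one block."""
    return sum(any(column) for column in zip(*diarization))
```

In this toy setting, a speaker who falls silent for a block and then returns is reported in the same slot as before, because the slot's state persists across blocks; the per-block activity flags serve as a crude diarization, and the number of slots that were ever active serves as the speaker count.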
Publishing Year
2019
Proceedings Title
ICASSP 2019, Brighton, UK
LibreCat-ID

Cite this

von Neumann T, Kinoshita K, Delcroix M, Araki S, Nakatani T, Haeb-Umbach R. All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis. In: ICASSP 2019, Brighton, UK. ; 2019.
von Neumann, T., Kinoshita, K., Delcroix, M., Araki, S., Nakatani, T., & Haeb-Umbach, R. (2019). All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis. In ICASSP 2019, Brighton, UK.
@inproceedings{von Neumann_Kinoshita_Delcroix_Araki_Nakatani_Haeb-Umbach_2019, title={All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis}, booktitle={ICASSP 2019, Brighton, UK}, author={von Neumann, Thilo and Kinoshita, Keisuke and Delcroix, Marc and Araki, Shoko and Nakatani, Tomohiro and Haeb-Umbach, Reinhold}, year={2019} }
Neumann, Thilo von, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, and Reinhold Haeb-Umbach. “All-Neural Online Source Separation, Counting, and Diarization for Meeting Analysis.” In ICASSP 2019, Brighton, UK, 2019.
T. von Neumann, K. Kinoshita, M. Delcroix, S. Araki, T. Nakatani, and R. Haeb-Umbach, “All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis,” in ICASSP 2019, Brighton, UK, 2019.
von Neumann, Thilo, et al. “All-Neural Online Source Separation, Counting, and Diarization for Meeting Analysis.” ICASSP 2019, Brighton, UK, 2019.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
Access Level
OA Open Access
Last Uploaded
2019-09-19T07:05:57Z

