Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation
L. Drude, T. von Neumann, R. Haeb-Umbach, in: ICASSP 2018, Calgary, Canada, 2018.
Conference Paper | English
Author
Drude, Lukas
von Neumann, Thilo
Haeb-Umbach, Reinhold
Abstract
Deep clustering (DC) and deep attractor networks (DANs) are data-driven approaches to monaural blind source separation. Both approaches provide astonishing single-channel performance but have not yet been generalized to block-online processing. When separating speech in a continuous stream with a block-online algorithm, it must be determined in each block which output stream belongs to which speaker. In this contribution we solve this block permutation problem by introducing an additional speaker identification embedding to the DAN model structure. We motivate this model decision by analyzing the embedding topology of DC and DANs and show that DC and DANs themselves are not sufficient for speaker identification. This model structure (a) improves the signal-to-distortion ratio (SDR) over a DAN baseline and (b) provides up to 61% and up to 34% relative reduction in permutation error rate and re-identification error rate, respectively, compared to an i-vector baseline.
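The block permutation problem described in the abstract can be illustrated with a minimal sketch. It is not the model from the paper: it assumes that some speaker embedding is already available for every output stream of a block, and it simply reorders the streams so that they best match a set of reference embeddings by cosine similarity. The function name `align_block`, the array shapes, and the toy values are hypothetical and chosen only for illustration.

```python
from itertools import permutations

import numpy as np


def align_block(reference, block):
    """Return a tuple p where p[i] is the stream of `block` matching speaker i.

    reference, block: arrays of shape (num_speakers, embedding_dim) holding
    one (assumed, hypothetical) speaker embedding per row.
    """
    # Normalize rows so that the dot product equals the cosine similarity.
    ref = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    blk = block / np.linalg.norm(block, axis=1, keepdims=True)
    # similarity[i, j]: reference speaker i compared with output stream j.
    similarity = ref @ blk.T
    num = similarity.shape[0]
    # Exhaustive search over permutations; feasible for two or three speakers.
    return max(
        permutations(range(num)),
        key=lambda p: sum(similarity[i, p[i]] for i in range(num)),
    )


# Toy example: the second block delivers the two speakers in swapped order.
reference = np.array([[1.0, 0.0], [0.0, 1.0]])
swapped_block = np.array([[0.1, 0.9], [0.8, 0.2]])
order = align_block(reference, swapped_block)
print(order)  # (1, 0): speaker 0 is found in stream 1, speaker 1 in stream 0
```

The exhaustive search is used here only because the number of simultaneous speakers is small; for more streams an assignment solver would be the natural replacement.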
Publishing Year
2018
Proceedings Title
ICASSP 2018, Calgary, Canada
Cite this
Drude L, von Neumann T, Haeb-Umbach R. Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation. In: ICASSP 2018, Calgary, Canada; 2018.
Drude, L., von Neumann, T., & Haeb-Umbach, R. (2018). Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation. In ICASSP 2018, Calgary, Canada.
@inproceedings{Drude_von Neumann_Haeb-Umbach_2018, title={Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation}, booktitle={ICASSP 2018, Calgary, Canada}, author={Drude, Lukas and von Neumann, Thilo and Haeb-Umbach, Reinhold}, year={2018} }
Drude, Lukas, Thilo von Neumann, and Reinhold Haeb-Umbach. “Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation.” In ICASSP 2018, Calgary, Canada, 2018.
L. Drude, T. von Neumann, and R. Haeb-Umbach, “Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation,” in ICASSP 2018, Calgary, Canada, 2018.
Drude, Lukas, et al. “Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation.” ICASSP 2018, Calgary, Canada, 2018.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Access Level
Closed Access
External material:
Supplementary Material
Description
Slides