Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering

T. Cord-Landwehr, T. Gburrek, M. Deegen, R. Haeb-Umbach, in: Proceedings of INTERSPEECH, 2025.

Conference Paper | English
Abstract
We propose a spatio-spectral, combined model-based and data-driven diarization pipeline consisting of TDOA-based segmentation followed by embedding-based clustering. The proposed system requires neither access to multi-channel training data nor prior knowledge about the number or placement of microphones. It works for both a compact microphone array and distributed microphones, with minor adjustments. Due to its superior handling of overlapping speech during segmentation, the proposed pipeline significantly outperforms the single-channel pyannote approach, both in a scenario with a compact microphone array and in a setup with distributed microphones. Additionally, we show that, unlike fully spatial diarization pipelines, the proposed system can correctly track speakers when they change positions.
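The abstract does not say how the time differences of arrival (TDOAs) are estimated. Purely as an illustration of the quantity that TDOA-based segmentation relies on, the sketch below estimates the delay between two microphone channels with GCC-PHAT, a widely used estimator. The choice of GCC-PHAT, the function name `gcc_phat_tdoa`, and its parameters are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def gcc_phat_tdoa(x, y, fs, max_delay_s=None):
    """Estimate the delay of y relative to x, in seconds, via GCC-PHAT.

    Hypothetical helper for illustration; not the paper's implementation.
    A positive result means the signal reaches channel y later than x.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum with PHAT weighting: keep only the phase.
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Limit the search range to physically plausible delays if requested.
    max_shift = n // 2
    if max_delay_s is not None:
        max_shift = min(max_shift, int(round(max_delay_s * fs)))
    max_shift = max(max_shift, 1)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    x = rng.standard_normal(4000)            # synthetic source signal
    y = np.concatenate((np.zeros(5), x[:-5]))  # same signal, 5 samples later
    print(gcc_phat_tdoa(x, y, fs, max_delay_s=0.002) * fs)
```

Because speakers at different positions produce different delay patterns, changes in such TDOA estimates over time can indicate speaker changes and simultaneous talkers. How the paper turns TDOAs into segments, and how the segments are then clustered with speaker embeddings, is described in the paper itself and is not reproduced here.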
Publishing Year
2025
Proceedings Title
Proceedings of INTERSPEECH
Conference
Interspeech 2025
Conference Location
Rotterdam
LibreCat-ID

Cite this

Cord-Landwehr T, Gburrek T, Deegen M, Haeb-Umbach R. Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering. In: Proceedings of INTERSPEECH. ; 2025. doi:10.21437/Interspeech.2025-1663
Cord-Landwehr, T., Gburrek, T., Deegen, M., & Haeb-Umbach, R. (2025). Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering. Proceedings of INTERSPEECH. Interspeech 2025, Rotterdam. https://doi.org/10.21437/Interspeech.2025-1663
@inproceedings{Cord-Landwehr_Gburrek_Deegen_Haeb-Umbach_2025, title={Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering}, DOI={10.21437/Interspeech.2025-1663}, booktitle={Proceedings of INTERSPEECH}, author={Cord-Landwehr, Tobias and Gburrek, Tobias and Deegen, Marc and Haeb-Umbach, Reinhold}, year={2025} }
Cord-Landwehr, Tobias, Tobias Gburrek, Marc Deegen, and Reinhold Haeb-Umbach. “Spatio-Spectral Diarization of Meetings by Combining TDOA-Based Segmentation and Speaker Embedding-Based Clustering.” In Proceedings of INTERSPEECH, 2025. https://doi.org/10.21437/Interspeech.2025-1663.
T. Cord-Landwehr, T. Gburrek, M. Deegen, and R. Haeb-Umbach, “Spatio-spectral diarization of meetings by combining TDOA-based segmentation and speaker embedding-based clustering,” presented at the Interspeech 2025, Rotterdam, 2025, doi: 10.21437/Interspeech.2025-1663.
Cord-Landwehr, Tobias, et al. “Spatio-Spectral Diarization of Meetings by Combining TDOA-Based Segmentation and Speaker Embedding-Based Clustering.” Proceedings of INTERSPEECH, 2025, doi:10.21437/Interspeech.2025-1663.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
File Name
main.pdf 921.92 KB
Access Level
OA Open Access
Last Uploaded
2025-08-29T09:43:32Z


Sources

arXiv 2506.16228
