{"oa":"1","publication":"INTERSPEECH 2017, Stockholm, Schweden","abstract":[{"text":"Recent advances in discriminatively trained mask estimation networks to extract a single source utilizing beamforming techniques demonstrate that the integration of statistical models and deep neural networks (DNNs) is a promising approach for robust automatic speech recognition (ASR) applications. In this contribution, we demonstrate how discriminatively trained embeddings on spectral features can be tightly integrated into statistical model-based source separation to separate and transcribe overlapping speech. Good generalization to unseen spatial configurations is achieved by estimating a statistical model at test time, while still leveraging discriminative training of deep clustering embeddings on a separate training set. We formulate an expectation maximization (EM) algorithm which jointly estimates a model for deep clustering embeddings and complex-valued spatial observations in the short-time Fourier transform (STFT) domain at test time. Extensive simulations confirm that the integrated model outperforms (a) a deep clustering model with a subsequent beamforming step and (b) an EM-based model with a beamforming step alone in terms of signal-to-distortion ratio (SDR) and perceptually motivated metric (PESQ) gains. ASR results on a reverberated dataset further show that the aforementioned gains translate to reduced word error rates (WERs) even in reverberant environments.","lang":"eng"}],"department":[{"_id":"54"}],"author":[{"first_name":"Lukas","id":"11213","full_name":"Drude, Lukas","last_name":"Drude"},{"last_name":"Haeb-Umbach","id":"242","full_name":"Haeb-Umbach, Reinhold","first_name":"Reinhold"}],"title":"Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings","status":"public","citation":{"apa":"Drude, L., & Haeb-Umbach, R. (2017). Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings. In INTERSPEECH 2017, Stockholm, Schweden.","short":"L. Drude, R. Haeb-Umbach, in: INTERSPEECH 2017, Stockholm, Schweden, 2017.","mla":"Drude, Lukas, and Reinhold Haeb-Umbach. “Tight Integration of Spatial and Spectral Features for BSS with Deep Clustering Embeddings.” INTERSPEECH 2017, Stockholm, Schweden, 2017.","ama":"Drude L, Haeb-Umbach R. Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings. In: INTERSPEECH 2017, Stockholm, Schweden. ; 2017.","bibtex":"@inproceedings{Drude_Haeb-Umbach_2017, title={Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings}, booktitle={INTERSPEECH 2017, Stockholm, Schweden}, author={Drude, Lukas and Haeb-Umbach, Reinhold}, year={2017} }","chicago":"Drude, Lukas, and Reinhold Haeb-Umbach. “Tight Integration of Spatial and Spectral Features for BSS with Deep Clustering Embeddings.” In INTERSPEECH 2017, Stockholm, Schweden, 2017.","ieee":"L. Drude and R. Haeb-Umbach, “Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings,” in INTERSPEECH 2017, Stockholm, Schweden, 2017."},"main_file_link":[{"open_access":"1","url":"https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Drude_paper.pdf"}],"language":[{"iso":"eng"}],"date_created":"2019-07-12T05:27:37Z","_id":"11754","user_id":"44006","year":"2017","related_material":{"link":[{"description":"Slides","url":"https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Drude_slides.pdf","relation":"supplementary_material"}]},"type":"conference","date_updated":"2022-01-06T06:51:08Z"}