Towards Online Source Counting in Speech Mixtures Applying a Variational EM for Complex Watson Mixture Models
Drude, Lukas
Chinaev, Aleksej
Tran Vu, Dang Hai
Haeb-Umbach, Reinhold
Accuracy
Acoustics
Estimation
Mathematical model
Source separation
Speech
Vectors
Bayes methods
Blind source separation
Directional statistics
Number of speakers
Speaker diarization
This contribution describes a step-wise source counting algorithm to determine the number of speakers in an offline scenario. Each speaker is identified by a variational expectation maximization (VEM) algorithm for complex Watson mixture models, which directly yields beamforming vectors for a subsequent speech separation process. An observation selection criterion is proposed which improves the robustness of the source counting in noise. The algorithm is compared to an alternative VEM approach with Gaussian mixture models based on directions of arrival and is shown to deliver improved source counting accuracy. The article concludes by extending the offline algorithm towards a low-latency online estimation of the number of active sources from streaming input data.
2014
info:eu-repo/semantics/conferenceObject
doc-type:conferenceObject
text
https://ris.uni-paderborn.de/record/11753
Drude L, Chinaev A, Tran Vu DH, Haeb-Umbach R. Towards Online Source Counting in Speech Mixtures Applying a Variational EM for Complex Watson Mixture Models. In: 14th International Workshop on Acoustic Signal Enhancement (IWAENC 2014). 2014:213-217.
eng
info:eu-repo/semantics/openAccess