Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments
J. Ebbers, R. Haeb-Umbach, in: Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), Barcelona, Spain, 2021, pp. 226–230.
Conference Paper
| English
Abstract
In this paper we present our system for the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments, where it ranked fourth. Our solution is an advancement of the system we used in the previous edition of the task. We use a forward-backward convolutional recurrent neural network (FBCRNN) for tagging and pseudo labeling, followed by tag-conditioned sound event detection (SED) models which are trained using strong pseudo labels provided by the FBCRNN. Our advancement over our earlier model is threefold. First, we introduce a strong label loss in the objective of the FBCRNN to take advantage of the strongly labeled synthetic data during training. Second, we perform multiple iterations of self-training for both the FBCRNN and the tag-conditioned SED models. Third, whereas we used only tag-conditioned CNNs as our SED model in the previous edition, we here explore more sophisticated tag-conditioned SED model architectures, namely bidirectional CRNNs and bidirectional convolutional transformer neural networks (CTNNs), and combine them. With metric- and class-specific tuning of median filter lengths for post-processing, our final SED model, consisting of 6 submodels (2 of each architecture), achieves on the public evaluation set polyphonic sound event detection scores (PSDS) of 0.455 for scenario 1 and 0.684 for scenario 2 as well as a collar-based F1-score of 0.596, outperforming the baselines and our model from the previous edition by far. Source code is publicly available at https://github.com/fgnt/pb_sed.
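The abstract mentions metric- and class-specific tuning of median filter lengths for post-processing the frame-level detections. As a rough illustration of that idea (not the paper's implementation — the actual tuned lengths and event classes are in the paper and the pb_sed repository), a sliding median filter with a per-class width can be sketched as follows; the class names and widths below are hypothetical:

```python
from statistics import median

def median_filter(seq, width):
    """Apply a sliding median filter of odd width to a 1-D sequence,
    padding with edge values so the output keeps the input length."""
    assert width % 2 == 1, "filter width must be odd"
    half = width // 2
    padded = [seq[0]] * half + list(seq) + [seq[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(seq))]

# Hypothetical per-class filter lengths (in frames); the paper tunes
# these per metric and per event class.
filter_lengths = {"Speech": 5, "Blender": 9}

# Toy binarized frame-level detections for one class: the median filter
# removes isolated spikes and fills short gaps inside an event.
frames = [0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0]
smoothed = median_filter(frames, filter_lengths["Speech"])
# smoothed == [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
```

Short filters preserve brief events such as speech onsets, while longer filters suit sustained sources; this is why a single global filter length is suboptimal across classes and metrics.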
Publishing Year
2021
Proceedings Title
Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021)
Page
226–230
Cite this
Ebbers J, Haeb-Umbach R. Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments. In: Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021); 2021:226–230.
Ebbers, J., & Haeb-Umbach, R. (2021). Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments. Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 226–230.
@inproceedings{Ebbers_Haeb-Umbach_2021, place={Barcelona, Spain}, title={Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments}, booktitle={Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021)}, author={Ebbers, Janek and Haeb-Umbach, Reinhold}, year={2021}, pages={226–230} }
Ebbers, Janek, and Reinhold Haeb-Umbach. “Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments.” In Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 226–230. Barcelona, Spain, 2021.
J. Ebbers and R. Haeb-Umbach, “Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments,” in Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 2021, pp. 226–230.
Ebbers, Janek, and Reinhold Haeb-Umbach. “Self-Trained Audio Tagging and Sound Event Detection in Domestic Environments.” Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 2021, pp. 226–230.
All files are available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
File Name
template.pdf
239.46 KB
Access Level
Open Access
Last Uploaded
2022-01-13T08:19:50Z