333 Publications
2025 | Conference Paper | LibreCat-ID: 59900
A. Werning and R. Häb-Umbach, “Distilling Efficient Audio Models using Data Pruning with CLAP,” in Proceedings of DAS|DAGA 2025, Copenhagen, 2025, doi: 10.71568/DASDAGA2025.149.
2025 | Conference Paper | LibreCat-ID: 59999
F. Rautenberg, M. Kuhlmann, F. Seebauer, J. Wiechmann, P. Wagner, and R. Haeb-Umbach, “Speech Synthesis along Perceptual Voice Quality Dimensions,” presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025, doi: 10.1109/icassp49660.2025.10888012.
2024 | Preprint | LibreCat-ID: 56273
S. Cornell et al., “The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization,” arXiv:2407.16447. 2024.
2024 | Conference Paper | LibreCat-ID: 57031
T. Gburrek, A. Meise, J. Schmalenstroeer, and R. Haeb-Umbach, “Diminishing Domain Mismatch for DNN-Based Acoustic Distance Estimation via Stochastic Room Reverberation Models,” 2024, doi: 10.1109/iwaenc61483.2024.10694103.
2024 | Journal Article | LibreCat-ID: 52958
C. Boeddeker, A. S. Subramanian, G. Wichern, R. Haeb-Umbach, and J. Le Roux, “TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 1185–1197, 2024, doi: 10.1109/taslp.2024.3350887.
2024 | Conference Paper | LibreCat-ID: 57085
T. Cord-Landwehr, C. Boeddeker, and R. Haeb-Umbach, “Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture Models,” presented at the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2024, doi: 10.1109/ICASSP49660.2025.10888445.
2024 | Report | LibreCat-ID: 57161
A. Werning and R. Haeb-Umbach, UPB-NT submission to DCASE24: Dataset pruning for targeted knowledge distillation. 2024.
2024 | Conference Paper | LibreCat-ID: 57160
A. Werning and R. Haeb-Umbach, “Target-Specific Dataset Pruning for Compression of Audio Tagging Models,” presented at the 32nd European Signal Processing Conference, Lyon, 2024.
2024 | Conference Paper | LibreCat-ID: 57099
Y. Xie, M. Kuhlmann, F. Rautenberg, Z.-H. Tan, and R. Häb-Umbach, “Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder,” in 2024 32nd European Signal Processing Conference (EUSIPCO), 2024, pp. 436–440.
2024 | Conference Paper | LibreCat-ID: 56004
T. von Neumann, C. Boeddeker, T. Cord-Landwehr, M. Delcroix, and R. Haeb-Umbach, “Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization,” 2024, doi: 10.1109/icasspw62465.2024.10625894.
2024 | Conference Paper | LibreCat-ID: 53659
T. Cord-Landwehr, C. Boeddeker, C. Zorilă, R. Doddipatla, and R. Haeb-Umbach, “Geodesic Interpolation of Frame-Wise Speaker Embeddings for the Diarization of Meeting Scenarios,” presented at the 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, 2024, doi: 10.1109/icassp48485.2024.10445911.
2024 | Conference Paper | LibreCat-ID: 56272
C. Boeddeker, T. Cord-Landwehr, and R. Haeb-Umbach, “Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment,” 2024, doi: 10.21437/interspeech.2024-1286.
2024 | Conference Paper | LibreCat-ID: 57659
P. Vieting, S. Berger, T. von Neumann, C. Boeddeker, R. Schlüter, and R. Haeb-Umbach, “Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription,” 2024.
2023 | Conference Paper | LibreCat-ID: 48269
T. Gburrek, J. Schmalenstroeer, and R. Haeb-Umbach, “On the Integration of Sampling Rate Synchronization and Acoustic Beamforming,” presented at the European Signal Processing Conference (EUSIPCO), Helsinki, 2023.
2023 | Conference Paper | LibreCat-ID: 48270
J. Schmalenstroeer, T. Gburrek, and R. Haeb-Umbach, “LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices,” presented at the ITG Conference on Speech Communication, Aachen, 2023.
2023 | Conference Paper | LibreCat-ID: 48355
F. Rautenberg, M. Kuhlmann, J. Wiechmann, F. Seebauer, P. Wagner, and R. Haeb-Umbach, “On Feature Importance and Interpretability of Speaker Representations,” presented at the ITG Conference on Speech Communication, Aachen, 2023.
2023 | Conference Paper | LibreCat-ID: 48410
J. Wiechmann, F. Rautenberg, P. Wagner, and R. Haeb-Umbach, “Explaining voice characteristics to novice voice practitioners - How successful is it?,” 2023.
2023 | Conference Paper | LibreCat-ID: 48391
R. Aralikatti, C. Boeddeker, G. Wichern, A. Subramanian, and J. Le Roux, “Reverberation as Supervision For Speech Separation,” 2023, doi: 10.1109/icassp49357.2023.10095022.
2023 | Conference Paper | LibreCat-ID: 46069
F. Seebauer, M. Kuhlmann, R. Haeb-Umbach, and P. Wagner, “Re-examining the quality dimensions of synthetic speech,” 2023.
2023 | Journal Article | LibreCat-ID: 35602
T. von Neumann, K. Kinoshita, C. Boeddeker, M. Delcroix, and R. Haeb-Umbach, “Segment-Less Continuous Speech Separation of Meetings: Training and Evaluation Criteria,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 576–589, 2023, doi: 10.1109/taslp.2022.3228629.
2023 | Conference Paper | LibreCat-ID: 49109
T. Gburrek, J. Schmalenstroeer, and R. Haeb-Umbach, “Spatial Diarization for Meeting Transcription with Ad-Hoc Acoustic Sensor Networks,” presented at the 57th Asilomar Conference on Signals, Systems, and Computers, 2023.
2023 | Conference Paper | LibreCat-ID: 44849
F. Rautenberg et al., “Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual Speaker Characteristics,” in Fortschritte der Akustik - DAGA 2023, Hamburg, 2023, pp. 1409–1412.
2023 | Conference Paper | LibreCat-ID: 49111
J. Ebbers, R. Haeb-Umbach, and R. Serizel, “Post-Processing Independent Evaluation of Sound Event Detection Systems,” in Proceedings of the 8th Detection and Classification of Acoustic Scenes and Events 2023 Workshop (DCASE2023), 2023, pp. 36–40.
2023 | Conference Paper | LibreCat-ID: 57098
F. Seebauer, M. Kuhlmann, R. Häb-Umbach, and P. Wagner, “Discerning Dimensions of Quality for State of the Art Synthetic Speech,” presented at the International Congress of Phonetic Sciences (ICPhS), Prague, 2023.
2023 | Conference Paper | LibreCat-ID: 57086
M. Kuhlmann, A. Meise, F. Seebauer, P. Wagner, and R. Häb-Umbach, “Investigating Speaker Embedding Disentanglement on Natural Read Speech,” in Speech Communication; 15th ITG Conference, 2023, pp. 121–125.
2023 | Conference Paper | LibreCat-ID: 48281
T. von Neumann, C. Boeddeker, K. Kinoshita, M. Delcroix, and R. Haeb-Umbach, “On Word Error Rate Definitions and Their Efficient Computation for Multi-Speaker Speech Recognition Systems,” 2023, doi: 10.1109/icassp49357.2023.10094784.
2023 | Conference Paper | LibreCat-ID: 48275
T. von Neumann, C. Boeddeker, M. Delcroix, and R. Haeb-Umbach, “MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems,” presented at the CHiME 2023 Workshop on Speech Processing in Everyday Environments, Dublin, 2023.
2023 | Conference Paper | LibreCat-ID: 47128
T. Cord-Landwehr, C. Boeddeker, C. Zorilă, R. Doddipatla, and R. Haeb-Umbach, “Frame-Wise and Overlap-Robust Speaker Embeddings for Meeting Diarization,” presented at the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, 2023, doi: 10.1109/icassp49357.2023.10095370.
2023 | Conference Paper | LibreCat-ID: 47129
T. Cord-Landwehr, C. Boeddeker, C. Zorilă, R. Doddipatla, and R. Haeb-Umbach, “A Teacher-Student Approach for Extracting Informative Speaker Embeddings From Speech Mixtures,” 2023, doi: 10.21437/interspeech.2023-1379.
2023 | Conference Paper | LibreCat-ID: 54439
C. Boeddeker, T. Cord-Landwehr, T. von Neumann, and R. Haeb-Umbach, “Multi-stage diarization refinement for the CHiME-7 DASR scenario,” 2023, doi: 10.21437/chime.2023-10.
2023 | Conference Paper | LibreCat-ID: 48390
S. Berger, P. Vieting, C. Boeddeker, R. Schlüter, and R. Haeb-Umbach, “Mixture Encoder for Joint Speech Separation and Recognition,” 2023, doi: 10.21437/interspeech.2023-1815.
2022 | Journal Article | LibreCat-ID: 33669
W. Zhang, X. Chang, C. Boeddeker, T. Nakatani, S. Watanabe, and Y. Qian, “End-to-End Dereverberation, Beamforming, and Speech Recognition in A Cocktail Party,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, doi: 10.1109/TASLP.2022.3209942.
2022 | Conference Paper | LibreCat-ID: 33471
J. Heitkämper, J. Schmalenstroeer, and R. Haeb-Umbach, “Neural Network Based Carrier Frequency Offset Estimation From Speech Transmitted Over High Frequency Channels,” presented at the 30th European Signal Processing Conference (EUSIPCO), Belgrade, 2022.
2022 | Conference Paper | LibreCat-ID: 33806
H. Afifi, H. Karl, T. Gburrek, and J. Schmalenstroeer, “Data-driven Time Synchronization in Wireless Multimedia Networks,” 2022, doi: 10.1109/iwcmc55113.2022.9824980.
2022 | Conference Paper | LibreCat-ID: 33847
T. Cord-Landwehr, T. von Neumann, C. Boeddeker, and R. Haeb-Umbach, “MMS-MSG: A Multi-purpose Multi-Speaker Mixture Signal Generator,” presented at the 2022 International Workshop on Acoustic Signal Enhancement (IWAENC), Bamberg, 2022.
2022 | Conference Paper | LibreCat-ID: 33807
T. Gburrek, J. Schmalenstroeer, and R. Haeb-Umbach, “On Synchronization of Wireless Acoustic Sensor Networks in the Presence of Time-Varying Sampling Rate Offsets and Speaker Changes,” 2022, doi: 10.1109/icassp43922.2022.9746284.
2022 | Journal Article | LibreCat-ID: 33451
C. Grimm, T. Fei, E. Warsitz, R. Farhoud, T. Breddermann, and R. Haeb-Umbach, “Warping of Radar Data Into Camera Image for Cross-Modal Supervision in Automotive Applications,” IEEE Transactions on Vehicular Technology, vol. 71, no. 9, pp. 9435–9449, 2022, doi: 10.1109/TVT.2022.3182411.
2022 | Conference Paper | LibreCat-ID: 33696
J. Wiechmann, T. Glarner, F. Rautenberg, P. Wagner, and R. Haeb-Umbach, “Technically enabled explaining of voice characteristics,” Bielefeld, 2022.
2022 | Conference Paper | LibreCat-ID: 33857
M. Kuhlmann, F. Seebauer, J. Ebbers, P. Wagner, and R. Haeb-Umbach, “Investigation into Target Speaking Rate Adaptation for Voice Conversion,” 2022, doi: 10.21437/interspeech.2022-10740.
2022 | Conference Paper | LibreCat-ID: 33808
T. Gburrek, J. Schmalenstroeer, J. Heitkaemper, and R. Haeb-Umbach, “Informed vs. Blind Beamforming in Ad-Hoc Acoustic Sensor Networks for Meeting Transcription,” presented at the 17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022), Bamberg, Germany, 2022, doi: 10.1109/IWAENC53105.2022.9914772.
2022 | Conference Paper | LibreCat-ID: 34072
J. Ebbers, R. Haeb-Umbach, and R. Serizel, “Threshold Independent Evaluation of Sound Event Detection Scores,” 2022.
2022 | Report | LibreCat-ID: 49113
J. Ebbers and R. Haeb-Umbach, Pre-Training And Self-Training For Sound Event Detection In Domestic Environments. 2022.
2022 | Conference Paper | LibreCat-ID: 33848
T. Cord-Landwehr, C. Boeddeker, T. von Neumann, C. Zorila, R. Doddipatla, and R. Haeb-Umbach, “Monaural source separation: From anechoic to reverberant environments,” presented at the 2022 International Workshop on Acoustic Signal Enhancement (IWAENC), 2022.
2022 | Conference Paper | LibreCat-ID: 33819
T. von Neumann, K. Kinoshita, C. Boeddeker, M. Delcroix, and R. Haeb-Umbach, “SA-SDR: A Novel Loss Function for Separation of Meeting Style Data,” 2022, doi: 10.1109/icassp43922.2022.9746757.
2022 | Misc | LibreCat-ID: 33816
T. Gburrek, C. Boeddeker, T. von Neumann, T. Cord-Landwehr, J. Schmalenstroeer, and R. Haeb-Umbach, A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network. arXiv, 2022.
2022 | Conference Paper | LibreCat-ID: 33954
C. Boeddeker, T. Cord-Landwehr, T. von Neumann, and R. Haeb-Umbach, “An Initialization Scheme for Meeting Separation with Spatial Mixture Models,” 2022, doi: 10.21437/interspeech.2022-10929.
2022 | Conference Paper | LibreCat-ID: 33958
K. Kinoshita, T. von Neumann, M. Delcroix, C. Boeddeker, and R. Haeb-Umbach, “Utterance-by-utterance overlap-aware neural diarization with Graph-PIT,” in Proc. Interspeech 2022, 2022, pp. 1486–1490, doi: 10.21437/Interspeech.2022-11408.
2021 | Journal Article | LibreCat-ID: 21065
R. Haeb-Umbach, J. Heymann, L. Drude, S. Watanabe, M. Delcroix, and T. Nakatani, “Far-Field Automatic Speech Recognition,” Proceedings of the IEEE, vol. 109, no. 2, pp. 124–148, 2021.
2021 | Conference Paper | LibreCat-ID: 28256
W. Zhang et al., “End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend,” 2021, doi: 10.1109/icassp39728.2021.9414464.
2021 | Conference Paper | LibreCat-ID: 28262
C. Li et al., “ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration,” 2021, doi: 10.1109/slt48900.2021.9383615.