TY - CONF AB - Multi-talker speech and moving speakers still pose a significant challenge to automatic speech recognition systems. Assuming an enrollment utterance of the target speaker is available, the so-called SpeakerBeam concept has been recently proposed to extract the target speaker from a speech mixture. If multi-channel input is available, spatial properties of the speaker can be exploited to support the source extraction. In this contribution we investigate different approaches to exploit such spatial information. In particular, we are interested in the question, how useful this information is if the target speaker changes his/her position. To this end, we present a SpeakerBeam-based source extraction network that is adapted to work on moving speakers by recursively updating the beamformer coefficients. Experimental results are presented on two data sets, one with artificially created room impulse responses, and one with real room impulse responses and noise recorded in a conference room. Interestingly, spatial features turn out to be advantageous even if the speaker position changes. AU - Heitkaemper, Jens AU - Feher, Thomas AU - Freitag, Michael AU - Haeb-Umbach, Reinhold ID - 14822 T2 - International Conference on Statistical Language and Speech Processing 2019, Ljubljana, Slovenia TI - A Study on Online Source Extraction in the Presence of Changing Speaker Positions ER - TY - CONF AB - This paper deals with multi-channel speech recognition in scenarios with multiple speakers. Recently, the spectral characteristics of a target speaker, extracted from an adaptation utterance, have been used to guide a neural network mask estimator to focus on that speaker. In this work we present two variants of speaker-aware neural networks, which exploit both spectral and spatial information to allow better discrimination between target and interfering speakers. 
Thus, we introduce either a spatial preprocessing prior to the mask estimation or a spatial plus spectral speaker characterization block whose output is directly fed into the neural mask estimator. The target speaker’s spectral and spatial signature is extracted from an adaptation utterance recorded at the beginning of a session. We further adapt the architecture for low-latency processing by means of block-online beamforming that recursively updates the signal statistics. Experimental results show that the additional spatial information clearly improves source extraction, in particular in the same-gender case, and that our proposal achieves state-of-the-art performance in terms of distortion reduction and recognition accuracy. AU - Martin-Donas, Juan M. AU - Heitkaemper, Jens AU - Haeb-Umbach, Reinhold AU - Gomez, Angel M. AU - Peinado, Antonio M. ID - 14824 T2 - INTERSPEECH 2019, Graz, Austria TI - Multi-Channel Block-Online Source Extraction based on Utterance Adaptation ER - TY - CONF AB - In this paper, we present Hitachi and Paderborn University’s joint effort for automatic speech recognition (ASR) in a dinner party scenario. The main challenges of ASR systems for dinner party recordings obtained by multiple microphone arrays are (1) heavy speech overlaps, (2) severe noise and reverberation, (3) very natural conversational content, and possibly (4) insufficient training data. As an example of a dinner party scenario, we have chosen the data presented during the CHiME-5 speech recognition challenge, where the baseline ASR had a 73.3% word error rate (WER), and even the best performing system at the CHiME-5 challenge had a 46.1% WER. We extensively investigated a combination of the guided source separation-based speech enhancement technique and an already proposed strong ASR backend and found that a tight combination of these techniques provided substantial accuracy improvements. 
Our final system achieved WERs of 39.94% and 41.64% for the development and evaluation data, respectively, both of which are the best published results for the dataset. We also investigated with additional training data on the official small data in the CHiME-5 corpus to assess the intrinsic difficulty of this ASR task. AU - Kanda, Naoyuki AU - Boeddeker, Christoph AU - Heitkaemper, Jens AU - Fujita, Yusuke AU - Horiguchi, Shota AU - Haeb-Umbach, Reinhold ID - 14826 T2 - INTERSPEECH 2019, Graz, Austria TI - Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR ER - TY - CHAP AU - Seipelt, Agnes Regina AU - Klugseder, Robert ED - Aringer, Klaus ED - Utz, Christian ED - Wozonig, Thomas ID - 14828 SN - 978-3-99012-553-3 T2 - Musik im Zusammenhang: Festschrift Peter Revers zum 65. Geburtstag TI - Digitale Musikanalyse auf Grundlage von MEI-codierten Daten ER - TY - GEN ED - Scheideler, Christian ED - Berenbrink, Petra ID - 14829 SN - 978-1-4503-6184-2 TI - The 31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019, Phoenix, AZ, USA, June 22-24, 2019 ER - TY - JOUR AU - Gmyr, Robert AU - Lefevre, Jonas AU - Scheideler, Christian ID - 14830 IS - 2 JF - Theory Comput. Syst. TI - Self-Stabilizing Metric Graphs VL - 63 ER - TY - GEN AU - Sabu, Nithin S. ID - 14831 TI - FPGA Acceleration of String Search Techniques in Huge Data Sets ER - TY - THES AU - Vaz, Gavin Francis ID - 14849 TI - Using Just-in-Time Code Generation to Transparently Accelerate Applications in Heterogeneous Systems ER - TY - THES AU - Mäcker, Alexander ID - 14851 TI - On Scheduling with Setup Times ER - TY - CONF AB - In a variety of industrial applications, liquids are atomized to produce aerosols for further processing. Example applications are the coating of surfaces with paints, the application of ultra-thin adhesive layers and the atomization of fuels for the production of combustible dispersions. 
In this publication different atomizing principles (standing-wave, capillary-wave, vibrating-mesh) are examined and discussed. Using an optimized standing-wave system, tough liquids with viscosities of up to about 100 Pa·s could be successfully atomized. AU - Dunst, Paul AU - Bornmann, Peter AU - Hemsel, Tobias AU - Littmann, Walter AU - Sextro, Walter ED - Lötters, Joost ED - Urban, Gerald ID - 14852 KW - atomization KW - ultrasound KW - standing-wave KW - capillary-wave KW - vibrating-mesh T2 - Conference Proceedings - The 4th Conference on MicroFluidic Handling Systems (MFHS2019) TI - Atomization of Fluids with Ultrasound ER -