{"title":"Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition","related_material":{"link":[{"relation":"supplementary_material","url":"https://groups.uni-paderborn.de/nt/pubs/2016/chime4_upbonly_poster.pdf","description":"Poster"}]},"author":[{"last_name":"Heymann","first_name":"Jahn","full_name":"Heymann, Jahn","id":"9168"},{"first_name":"Lukas","last_name":"Drude","id":"11213","full_name":"Drude, Lukas"},{"last_name":"Haeb-Umbach","first_name":"Reinhold","full_name":"Haeb-Umbach, Reinhold","id":"242"}],"publication":"Computer Speech and Language","citation":{"ama":"Heymann J, Drude L, Haeb-Umbach R. Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition. In: Computer Speech and Language; 2016.","chicago":"Heymann, Jahn, Lukas Drude, and Reinhold Haeb-Umbach. “Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition.” In Computer Speech and Language, 2016.","apa":"Heymann, J., Drude, L., & Haeb-Umbach, R. (2016). Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition. In Computer Speech and Language.","bibtex":"@inproceedings{Heymann_Drude_Haeb-Umbach_2016, title={Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition}, booktitle={Computer Speech and Language}, author={Heymann, Jahn and Drude, Lukas and Haeb-Umbach, Reinhold}, year={2016} }","ieee":"J. Heymann, L. Drude, and R. Haeb-Umbach, “Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition,” in Computer Speech and Language, 2016.","mla":"Heymann, Jahn, et al. “Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition.” Computer Speech and Language, 2016.","short":"J. Heymann, L. Drude, R. Haeb-Umbach, in: Computer Speech and Language, 2016."},"type":"conference","_id":"11834","department":[{"_id":"54"}],"date_updated":"2022-01-06T06:51:11Z","year":"2016","oa":"1","status":"public","date_created":"2019-07-12T05:29:09Z","abstract":[{"text":"We present a system for the 4th CHiME challenge which significantly increases the performance for all three tracks with respect to the provided baseline system. The front-end uses a bi-directional Long Short-Term Memory (BLSTM)-based neural network to estimate signal statistics. These statistics then steer a Generalized Eigenvalue beamformer. The back-end consists of a 22-layer-deep Wide Residual Network and two extra BLSTM layers. Working on a whole utterance instead of frames allows us to refine Batch-Normalization. We also train our own BLSTM-based language model. Adding discriminative speaker adaptation leads to further gains. The final system achieves a word error rate of 3.48% on the six-channel real test data. For the two-channel track we achieve 5.96%, and for the one-channel track 9.34%. This is the best reported performance on the challenge achieved by a single system, i.e., a configuration which does not combine multiple systems. At the same time, our system is independent of the microphone configuration. We can thus use the same components for all three tracks.","lang":"eng"}],"main_file_link":[{"url":"https://groups.uni-paderborn.de/nt/pubs/2016/chime4_upbonly_paper.pdf","open_access":"1"}],"language":[{"iso":"eng"}],"user_id":"44006"}