{"page":"109-131","date_updated":"2022-01-06T06:51:07Z","year":"2002","citation":{"bibtex":"@article{Beyerlein_Aubert_Haeb-Umbach_Harris_Klakow_Wendemuth_Molau_Ney_Pitz_Sixtus_2002, title={Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach}, number={37}, journal={Speech Communication}, author={Beyerlein, P. and Aubert, X. and Haeb-Umbach, Reinhold and Harris, M. and Klakow, D. and Wendemuth, A. and Molau, S. and Ney, N. and Pitz, Michael and Sixtus, A.}, year={2002}, pages={109–131} }","ieee":"P. Beyerlein et al., “Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach,” Speech Communication, no. 37, pp. 109–131, 2002.","chicago":"Beyerlein, P., X. Aubert, Reinhold Haeb-Umbach, M. Harris, D. Klakow, A. Wendemuth, S. Molau, N. Ney, Michael Pitz, and A. Sixtus. “Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach.” Speech Communication, no. 37 (2002): 109–31.","short":"P. Beyerlein, X. Aubert, R. Haeb-Umbach, M. Harris, D. Klakow, A. Wendemuth, S. Molau, N. Ney, M. Pitz, A. Sixtus, Speech Communication (2002) 109–131.","mla":"Beyerlein, P., et al. “Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach.” Speech Communication, no. 37, 2002, pp. 109–31.","apa":"Beyerlein, P., Aubert, X., Haeb-Umbach, R., Harris, M., Klakow, D., Wendemuth, A., … Sixtus, A. (2002). Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach. Speech Communication, (37), 109–131.","ama":"Beyerlein P, Aubert X, Haeb-Umbach R, et al. Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach. Speech Communication. 
2002;(37):109-131."},"language":[{"iso":"eng"}],"main_file_link":[{"url":"https://groups.uni-paderborn.de/nt/pubs/2002/BeAuHaHaKlWeMoNePiSi02.pdf","open_access":"1"}],"status":"public","issue":"37","_id":"11727","oa":"1","author":[{"first_name":"P.","last_name":"Beyerlein","full_name":"Beyerlein, P."},{"last_name":"Aubert","full_name":"Aubert, X.","first_name":"X."},{"first_name":"Reinhold","id":"242","last_name":"Haeb-Umbach","full_name":"Haeb-Umbach, Reinhold"},{"full_name":"Harris, M.","last_name":"Harris","first_name":"M."},{"first_name":"D.","last_name":"Klakow","full_name":"Klakow, D."},{"full_name":"Wendemuth, A.","last_name":"Wendemuth","first_name":"A."},{"first_name":"S.","full_name":"Molau, S.","last_name":"Molau"},{"first_name":"N.","full_name":"Ney, N.","last_name":"Ney"},{"first_name":"Michael","last_name":"Pitz","full_name":"Pitz, Michael"},{"first_name":"A.","last_name":"Sixtus","full_name":"Sixtus, A."}],"department":[{"_id":"54"}],"title":"Large Vocabulary Continuous Speech Recognition of Broadcast News - The Philips/RWTH Approach","type":"journal_article","publication":"Speech Communication","date_created":"2019-07-12T05:27:05Z","abstract":[{"lang":"eng","text":"Automatic speech recognition of real-life broadcast news (BN) data (Hub-4) has become a challenging research topic in recent years. This paper summarizes our key efforts to build a large vocabulary continuous speech recognition system for the heterogeneous BN task without incurring undesired complexity or computational cost. 
These key efforts included: - automatic segmentation of the audio signal into speech utterances; - efficient one-pass trigram decoding using look-ahead techniques; - optimal log-linear interpolation of a variety of acoustic and language models using discriminative model combination (DMC); - handling short-range and weak longer-range correlations in natural speech and language by the use of phrases and of distance-language models; - improving the acoustic modeling by robust feature extraction, channel normalization, adaptation techniques as well as automatic script selection and verification. The starting point of the system development was the Philips 64k-NAB word-internal triphone trigram system. On the speaker-independent but microphone-dependent NAB task (transcription of read newspaper texts) we obtained a word error rate of about 10\\%. Now, at the conclusion of the system development, we at Philips have arrived at a DMC-interpolated phrase-based crossword-pentaphone 4-gram system. This system transcribes BN data with an overall word error rate of about 17\\%."}],"user_id":"44006"}