{"user_id":"44006","citation":{"chicago":"Welling, L., Reinhold Haeb-Umbach, X. Aubert, and N. Haberland. “A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training.” In ICASSP 1998, Seattle, 1998.","mla":"Welling, L., et al. “A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training.” ICASSP 1998, Seattle, 1998.","bibtex":"@inproceedings{Welling_Haeb-Umbach_Aubert_Haberland_1998, title={A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training}, booktitle={ICASSP 1998, Seattle}, author={Welling, L. and Haeb-Umbach, Reinhold and Aubert, X. and Haberland, N.}, year={1998} }","apa":"Welling, L., Haeb-Umbach, R., Aubert, X., & Haberland, N. (1998). A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training. In ICASSP 1998, Seattle.","ieee":"L. Welling, R. Haeb-Umbach, X. Aubert, and N. Haberland, “A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training,” in ICASSP 1998, Seattle, 1998.","short":"L. Welling, R. Haeb-Umbach, X. Aubert, N. Haberland, in: ICASSP 1998, Seattle, 1998.","ama":"Welling L, Haeb-Umbach R, Aubert X, Haberland N. A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training. In: ICASSP 1998, Seattle. ; 1998."},"title":"A Study on Speaker Normalization Using Vocal Tract Normalization and Speaker Adaptive Training","oa":"1","date_created":"2019-07-12T05:31:07Z","department":[{"_id":"54"}],"publication":"ICASSP 1998, Seattle","year":"1998","status":"public","abstract":[{"text":"Although speaker normalization is attempted in very different manners, vocal tract normalization (VTN) and speaker adaptive training (SAT) share many common properties. 
We show that both lead to more compact representations of the phonetically relevant variations of the training data and that both achieve improved error rate performance only if a complementary normalization or adaptation operation is conducted on the test data. Algorithms for fast test speaker enrollment are presented for both normalization methods: in the framework of SAT, a pre-transformation step is proposed, which alone, i.e. without subsequent unsupervised MLLR adaption, reduces the error rate by almost 10% on the WSJ 5k test sets. For VTN, the use of a Gaussian mixture model makes obsolete a first recognition pass to obtain a preliminary transcription of the test utterance at hardly any loss in performance.","lang":"eng"}],"_id":"11936","type":"conference","main_file_link":[{"open_access":"1","url":"https://groups.uni-paderborn.de/nt/pubs/1998/ICASSP_1998_Haeb_paper.pdf"}],"author":[{"last_name":"Welling","first_name":"L.","full_name":"Welling, L."},{"full_name":"Haeb-Umbach, Reinhold","id":"242","first_name":"Reinhold","last_name":"Haeb-Umbach"},{"full_name":"Aubert, X.","first_name":"X.","last_name":"Aubert"},{"full_name":"Haberland, N.","first_name":"N.","last_name":"Haberland"}],"date_updated":"2022-01-06T06:51:12Z","language":[{"iso":"eng"}]}