{"oa":"1","abstract":[{"text":"In this paper we present an algorithm for the unsupervised segmentation of a lattice produced by a phoneme recognizer into words. Using a lattice rather than a single phoneme string accounts for the uncertainty of the recognizer about the true label sequence. An example application is the discovery of lexical units from the output of an error-prone phoneme recognizer in a zero-resource setting, where neither the lexicon nor the language model (LM) is known. We propose a computationally efficient iterative approach, which alternates between the following two steps: First, the most probable string is extracted from the lattice using a phoneme LM learned on the segmentation result of the previous iteration. Second, word segmentation is performed on the extracted string using a word and phoneme LM which is learned alongside the new segmentation. We present results on lattices produced by a phoneme recognizer on the WSJCAM0 dataset. We show that our approach delivers better segmentation performance than an earlier approach found in the literature, in particular for higher-order language models.","lang":"eng"}],"status":"public","date_created":"2019-07-12T05:28:46Z","related_material":{"link":[{"description":"Poster","relation":"supplementary_material","url":"https://groups.uni-paderborn.de/nt/pubs/2014/HeWaHa2014_Poster.pdf"}]},"main_file_link":[{"url":"https://groups.uni-paderborn.de/nt/pubs/2014/HeWaHa2014.pdf","open_access":"1"}],"year":"2014","user_id":"44006","department":[{"_id":"54"}],"author":[{"first_name":"Jahn","id":"9168","last_name":"Heymann","full_name":"Heymann, Jahn"},{"full_name":"Walter, Oliver","last_name":"Walter","first_name":"Oliver"},{"id":"242","first_name":"Reinhold","last_name":"Haeb-Umbach","full_name":"Haeb-Umbach, Reinhold"},{"first_name":"Bhiksha","last_name":"Raj","full_name":"Raj, Bhiksha"}],"publication":"39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)","date_updated":"2022-01-06T06:51:09Z","citation":{"ama":"Heymann J, Walter O, Haeb-Umbach R, Raj B. Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices. In: <i>39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)</i>. ; 2014.","mla":"Heymann, Jahn, et al. “Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices.” <i>39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)</i>, 2014.","bibtex":"@inproceedings{Heymann_Walter_Haeb-Umbach_Raj_2014, title={Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices}, booktitle={39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)}, author={Heymann, Jahn and Walter, Oliver and Haeb-Umbach, Reinhold and Raj, Bhiksha}, year={2014} }","apa":"Heymann, J., Walter, O., Haeb-Umbach, R., &#38; Raj, B. (2014). Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices. In <i>39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)</i>.","chicago":"Heymann, Jahn, Oliver Walter, Reinhold Haeb-Umbach, and Bhiksha Raj. “Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices.” In <i>39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)</i>, 2014.","short":"J. Heymann, O. Walter, R. Haeb-Umbach, B. Raj, in: 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014.","ieee":"J. Heymann, O. Walter, R. Haeb-Umbach, and B. Raj, “Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices,” in <i>39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)</i>, 2014."},"type":"conference","_id":"11814","language":[{"iso":"eng"}],"title":"Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices"}