{"user_id":"44006","year":"2014","author":[{"first_name":"Jahn","id":"9168","full_name":"Heymann, Jahn","last_name":"Heymann"},{"last_name":"Walter","full_name":"Walter, Oliver","first_name":"Oliver"},{"last_name":"Haeb-Umbach","id":"242","full_name":"Haeb-Umbach, Reinhold","first_name":"Reinhold"},{"full_name":"Raj, Bhiksha","last_name":"Raj","first_name":"Bhiksha"}],"title":"Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices","abstract":[{"text":"In this paper we present an algorithm for the unsupervised segmentation of a lattice produced by a phoneme recognizer into words. Using a lattice rather than a single phoneme string accounts for the uncertainty of the recognizer about the true label sequence. An example application is the discovery of lexical units from the output of an error-prone phoneme recognizer in a zero-resource setting, where neither the lexicon nor the language model (LM) is known. We propose a computationally efficient iterative approach, which alternates between the following two steps: First, the most probable string is extracted from the lattice using a phoneme LM learned on the segmentation result of the previous iteration. Second, word segmentation is performed on the extracted string using a word and phoneme LM which is learned alongside the new segmentation. We present results on lattices produced by a phoneme recognizer on the WSJCAM0 dataset. We show that our approach delivers superior segmentation performance than an earlier approach found in the literature, in particular for higher-order language models.","lang":"eng"}],"publication":"39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)","citation":{"apa":"Heymann, J., Walter, O., Haeb-Umbach, R., & Raj, B. (2014). Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices. 
In 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).","bibtex":"@inproceedings{Heymann_Walter_Haeb-Umbach_Raj_2014, title={Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices}, booktitle={39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)}, author={Heymann, Jahn and Walter, Oliver and Haeb-Umbach, Reinhold and Raj, Bhiksha}, year={2014} }","short":"J. Heymann, O. Walter, R. Haeb-Umbach, B. Raj, in: 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014.","mla":"Heymann, Jahn, et al. “Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices.” 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014.","ieee":"J. Heymann, O. Walter, R. Haeb-Umbach, and B. Raj, “Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices,” in 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014.","ama":"Heymann J, Walter O, Haeb-Umbach R, Raj B. Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices. In: 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014). ; 2014.","chicago":"Heymann, Jahn, Oliver Walter, Reinhold Haeb-Umbach, and Bhiksha Raj. 
“Iterative Bayesian Word Segmentation for Unsupervised Vocabulary Discovery from Phoneme Lattices.” In 39th International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), 2014."},"main_file_link":[{"url":"https://groups.uni-paderborn.de/nt/pubs/2014/HeWaHa2014.pdf","open_access":"1"}],"_id":"11814","oa":"1","status":"public","related_material":{"link":[{"relation":"supplementary_material","url":"https://groups.uni-paderborn.de/nt/pubs/2014/HeWaHa2014_Poster.pdf","description":"Poster"}]},"type":"conference","date_updated":"2022-01-06T06:51:09Z","language":[{"iso":"eng"}],"date_created":"2019-07-12T05:28:46Z","department":[{"_id":"54"}]}