---
res:
  bibo_abstract:
  - ' "Several self-localization algorithms have been proposed that determine the positions of either acoustic or visual sensors autonomously. Usually these positions are given in a modality-specific coordinate system, with an unknown rotation, translation and scale between the different systems. For a joint audiovisual tracking, where the different modalities support each other, the two modalities need to be mapped into a common coordinate system. In this paper we propose to estimate this mapping based on audiovisual correlates, i.e., a speaker that can be localized by both a microphone and a camera network separately. The voice is tracked by a microphone network, which first had to be calibrated by a self-localization algorithm, and the head is tracked by a calibrated camera network. Unlike existing Singular Value Decomposition-based approaches to estimate the coordinate system mapping, we propose to perform an estimation in the shape domain, which turns out to be computationally more efficient. Simulations of the self-localization of an acoustic sensor network and a following coordinate mapping for a joint speaker localization showed a significant improvement of the localization performance, since the modalities were able to support each other." @eng'
  bibo_authorlist:
  - foaf_Person:
      foaf_givenName: Florian
      foaf_name: Jacob, Florian
      foaf_surname: Jacob
  - foaf_Person:
      foaf_givenName: Reinhold
      foaf_name: Haeb-Umbach, Reinhold
      foaf_surname: Haeb-Umbach
      foaf_workInfoHomepage: http://www.librecat.org/personId=242
  dct_date: 2014^xs_gYear
  dct_language: eng
  dct_title: Coordinate Mapping Between an Acoustic and Visual Sensor Network in the Shape Domain for a Joint Self-Calibrating Speaker Tracking@
...