Online Diarization of Streaming Audio-Visual Data for Smart Environments
J. Schmalenstroeer, R. Haeb-Umbach, IEEE Journal of Selected Topics in Signal Processing 4 (2010) 845–856.
Journal Article | English
        Abstract
    For an environment to be perceived as being smart, contextual information has to be gathered to adapt the system's behavior and its interface towards the user. Being a rich source of context information, speech can be acquired unobtrusively by microphone arrays and then processed to extract information about the user and his environment. In this paper, a system for joint temporal segmentation, speaker localization, and identification is presented, which is supported by face identification from video data obtained from a steerable camera. Special attention is paid to latency aspects and online processing capabilities, as they are important for the application under investigation, namely ambient communication. Ambient communication describes the vision of terminal-less, session-less, and multi-modal telecommunication with remote partners, where the user can move freely within his home while the communication follows him. The speaker diarization serves as a context source, which has been integrated into a service-oriented middleware architecture and is provided to the application to select the most appropriate I/O device and to steer the camera towards the speaker during ambient communication.
    
  Keywords
    
        audio streaming; 
        audio visual data streaming; 
        context information speech; 
        face identification; 
        face recognition; 
        image segmentation; 
        middleware; 
        multimodal telecommunication; 
        online diarization; 
        service oriented middleware architecture; 
        sessionless telecommunication; 
        software architecture; 
        speaker identification; 
        speaker localization; 
        speaker recognition; 
        steerable camera; 
        telecommunication computing; 
        temporal segmentation; 
        terminal-less telecommunication; 
        video streaming
    
  Publishing Year
    2010
  Journal Title
    IEEE Journal of Selected Topics in Signal Processing
  Volume
      4
    Issue
      5
    Page
      845-856
    LibreCat-ID
    
  Cite this
Schmalenstroeer J, Haeb-Umbach R. Online Diarization of Streaming Audio-Visual Data for Smart Environments. IEEE Journal of Selected Topics in Signal Processing. 2010;4(5):845-856. doi:10.1109/JSTSP.2010.2050519
    Schmalenstroeer, J., & Haeb-Umbach, R. (2010). Online Diarization of Streaming Audio-Visual Data for Smart Environments. IEEE Journal of Selected Topics in Signal Processing, 4(5), 845–856. https://doi.org/10.1109/JSTSP.2010.2050519
    @article{Schmalenstroeer_Haeb-Umbach_2010, title={Online Diarization of Streaming Audio-Visual Data for Smart Environments}, volume={4}, DOI={10.1109/JSTSP.2010.2050519}, number={5}, journal={IEEE Journal of Selected Topics in Signal Processing}, author={Schmalenstroeer, Joerg and Haeb-Umbach, Reinhold}, year={2010}, pages={845–856} }
    Schmalenstroeer, Joerg, and Reinhold Haeb-Umbach. “Online Diarization of Streaming Audio-Visual Data for Smart Environments.” IEEE Journal of Selected Topics in Signal Processing 4, no. 5 (2010): 845–56. https://doi.org/10.1109/JSTSP.2010.2050519.
    J. Schmalenstroeer and R. Haeb-Umbach, “Online Diarization of Streaming Audio-Visual Data for Smart Environments,” IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 845–856, 2010, doi: 10.1109/JSTSP.2010.2050519.
    Schmalenstroeer, Joerg, and Reinhold Haeb-Umbach. “Online Diarization of Streaming Audio-Visual Data for Smart Environments.” IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, 2010, pp. 845–56, doi:10.1109/JSTSP.2010.2050519.
  
  Copyright Statement:
    This Item is protected by copyright and/or related rights. [...]
  Access Level
     Closed Access