---
_id: '15814'
abstract:
- lang: eng
  text: Once a popular theme of futuristic science fiction or far-fetched technology
    forecasts, digital home assistants with a spoken language interface have become
    a ubiquitous commodity today. This success has been made possible by major advancements
    in signal processing and machine learning for so-called far-field speech recognition,
    where the commands are spoken at a distance from the sound-capturing device. The
    challenges encountered are quite unique and different from many other use cases
    of automatic speech recognition. The purpose of this tutorial article is to describe,
    in a way amenable to the non-specialist, the key speech processing algorithms
    that enable reliable fully hands-free speech interaction with digital home assistants.
    These technologies include multi-channel acoustic echo cancellation, microphone
    array processing and dereverberation techniques for signal enhancement, reliable
    wake-up word and end-of-interaction detection, high-quality speech synthesis,
    as well as sophisticated statistical models for speech and language, learned from
    large amounts of heterogeneous training data. In all these fields, deep learning
    has occupied a critical role.
author:
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
- first_name: Shinji
  full_name: Watanabe, Shinji
  last_name: Watanabe
- first_name: Tomohiro
  full_name: Nakatani, Tomohiro
  last_name: Nakatani
- first_name: Michiel
  full_name: Bacchiani, Michiel
  last_name: Bacchiani
- first_name: Bjoern
  full_name: Hoffmeister, Bjoern
  last_name: Hoffmeister
- first_name: Michael L.
  full_name: Seltzer, Michael L.
  last_name: Seltzer
- first_name: Heiga
  full_name: Zen, Heiga
  last_name: Zen
- first_name: Mehrez
  full_name: Souden, Mehrez
  last_name: Souden
citation:
  ama: 'Haeb-Umbach R, Watanabe S, Nakatani T, et al. Speech Processing for Digital
    Home Assistants: Combining Signal Processing With Deep-Learning Techniques. <i>IEEE
    Signal Processing Magazine</i>. 2019;36(6):111-124. doi:<a href="https://doi.org/10.1109/MSP.2019.2918706">10.1109/MSP.2019.2918706</a>'
  apa: 'Haeb-Umbach, R., Watanabe, S., Nakatani, T., Bacchiani, M., Hoffmeister, B.,
    Seltzer, M. L., Zen, H., &#38; Souden, M. (2019). Speech Processing for Digital
    Home Assistants: Combining Signal Processing With Deep-Learning Techniques. <i>IEEE
    Signal Processing Magazine</i>, <i>36</i>(6), 111–124. <a href="https://doi.org/10.1109/MSP.2019.2918706">https://doi.org/10.1109/MSP.2019.2918706</a>'
  bibtex: '@article{Haeb-Umbach_Watanabe_Nakatani_Bacchiani_Hoffmeister_Seltzer_Zen_Souden_2019,
    title={Speech Processing for Digital Home Assistants: Combining Signal Processing
    With Deep-Learning Techniques}, volume={36}, DOI={<a href="https://doi.org/10.1109/MSP.2019.2918706">10.1109/MSP.2019.2918706</a>},
    number={6}, journal={IEEE Signal Processing Magazine}, author={Haeb-Umbach, Reinhold
    and Watanabe, Shinji and Nakatani, Tomohiro and Bacchiani, Michiel and Hoffmeister,
    Bjoern and Seltzer, Michael L. and Zen, Heiga and Souden, Mehrez}, year={2019},
    pages={111–124} }'
  chicago: 'Haeb-Umbach, Reinhold, Shinji Watanabe, Tomohiro Nakatani, Michiel Bacchiani,
    Bjoern Hoffmeister, Michael L. Seltzer, Heiga Zen, and Mehrez Souden. “Speech
    Processing for Digital Home Assistants: Combining Signal Processing With Deep-Learning
    Techniques.” <i>IEEE Signal Processing Magazine</i> 36, no. 6 (2019): 111–24.
    <a href="https://doi.org/10.1109/MSP.2019.2918706">https://doi.org/10.1109/MSP.2019.2918706</a>.'
  ieee: 'R. Haeb-Umbach <i>et al.</i>, “Speech Processing for Digital Home Assistants:
    Combining Signal Processing With Deep-Learning Techniques,” <i>IEEE Signal Processing
    Magazine</i>, vol. 36, no. 6, pp. 111–124, 2019, doi: <a href="https://doi.org/10.1109/MSP.2019.2918706">10.1109/MSP.2019.2918706</a>.'
  mla: 'Haeb-Umbach, Reinhold, et al. “Speech Processing for Digital Home Assistants:
    Combining Signal Processing With Deep-Learning Techniques.” <i>IEEE Signal Processing
    Magazine</i>, vol. 36, no. 6, 2019, pp. 111–24, doi:<a href="https://doi.org/10.1109/MSP.2019.2918706">10.1109/MSP.2019.2918706</a>.'
  short: R. Haeb-Umbach, S. Watanabe, T. Nakatani, M. Bacchiani, B. Hoffmeister, M.L.
    Seltzer, H. Zen, M. Souden, IEEE Signal Processing Magazine 36 (2019) 111–124.
date_created: 2020-02-06T07:26:20Z
date_updated: 2023-01-09T11:47:09Z
ddc:
- '000'
department:
- _id: '54'
doi: 10.1109/MSP.2019.2918706
file:
- access_level: open_access
  content_type: application/pdf
  creator: huesera
  date_created: 2020-02-06T07:28:26Z
  date_updated: 2020-02-06T07:28:26Z
  file_id: '15815'
  file_name: JournalIEEESignal ProcessingMagazine_2019_Haeb-Umbach_Paper.pdf
  file_size: 1085002
  relation: main_file
file_date_updated: 2020-02-06T07:28:26Z
has_accepted_license: '1'
intvolume: '36'
issue: '6'
language:
- iso: eng
oa: '1'
page: 111-124
publication: IEEE Signal Processing Magazine
publication_identifier:
  issn:
  - 1558-0792
status: public
title: 'Speech Processing for Digital Home Assistants: Combining Signal Processing
  With Deep-Learning Techniques'
type: journal_article
user_id: '242'
volume: 36
year: '2019'
...
