---
_id: '15952'
abstract:
- lang: eng
  text: Arbitrary sampling rate conversion has already received considerable attention
    in the past, but still lacks an equivalent representation of the effective time-dilation
    process in the block frequency domain. Good sampling rate converters in the time
    domain have been known, for instance, in terms of time-varying 'Sinc' or fixed
    'Farrow' polynomial filters. The former can deliver nearly exact conversion at
    high complexity, while the latter has pronounced computational efficiency with
    limited accuracy. Only recently, it was shown that a composite 'polyphase Farrow'
    form with high resampling precision can be implemented with quasi-fixed filters
    that operate at the input sampling rate. We therefore propose to capitalize on
    that fixed-filter architecture by translating the polyphase-Farrow filters
    into an equivalent FFT-based overlap-save form. Experimental evaluation and comparison
    with other state-of-the-art frequency-domain approaches show that the proposed
    algorithm currently offers the best price-performance ratio. It is thus an ideal
    candidate for the new framework of acoustic sensor networks that critically rests
    upon fast and accurate alignment of autonomous sampling processes.
author:
- first_name: Joerg
  full_name: Schmalenstroeer, Joerg
  id: '460'
  last_name: Schmalenstroeer
- first_name: Aleksej
  full_name: Chinaev, Aleksej
  last_name: Chinaev
- first_name: Gerald
  full_name: Enzner, Gerald
  last_name: Enzner
citation:
  ama: 'Schmalenstroeer J, Chinaev A, Enzner G. Fast and Accurate Audio Resampling
    for Acoustic Sensor Networks by Polyphase-Farrow Filters with FFT Realization.
    In: <i>Speech Communication; 13th ITG-Symposium</i>. ; 2018:1-5.'
  apa: Schmalenstroeer, J., Chinaev, A., &#38; Enzner, G. (2018). Fast and Accurate
    Audio Resampling for Acoustic Sensor Networks by Polyphase-Farrow Filters with
    FFT Realization. <i>Speech Communication; 13th ITG-Symposium</i>, 1–5.
  bibtex: '@inproceedings{Schmalenstroeer_Chinaev_Enzner_2018, title={Fast and Accurate
    Audio Resampling for Acoustic Sensor Networks by Polyphase-Farrow Filters with
    FFT Realization}, booktitle={Speech Communication; 13th ITG-Symposium}, author={Schmalenstroeer,
    Joerg and Chinaev, Aleksej and Enzner, Gerald}, year={2018}, pages={1–5} }'
  chicago: Schmalenstroeer, Joerg, Aleksej Chinaev, and Gerald Enzner. “Fast and Accurate
    Audio Resampling for Acoustic Sensor Networks by Polyphase-Farrow Filters with
    FFT Realization.” In <i>Speech Communication; 13th ITG-Symposium</i>, 1–5, 2018.
  ieee: J. Schmalenstroeer, A. Chinaev, and G. Enzner, “Fast and Accurate Audio Resampling
    for Acoustic Sensor Networks by Polyphase-Farrow Filters with FFT Realization,”
    in <i>Speech Communication; 13th ITG-Symposium</i>, 2018, pp. 1–5.
  mla: Schmalenstroeer, Joerg, et al. “Fast and Accurate Audio Resampling for Acoustic
    Sensor Networks by Polyphase-Farrow Filters with FFT Realization.” <i>Speech Communication;
    13th ITG-Symposium</i>, 2018, pp. 1–5.
  short: 'J. Schmalenstroeer, A. Chinaev, G. Enzner, in: Speech Communication; 13th
    ITG-Symposium, 2018, pp. 1–5.'
date_created: 2020-02-21T08:53:14Z
date_updated: 2024-11-14T09:42:35Z
department:
- _id: '54'
language:
- iso: eng
page: 1-5
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Speech Communication; 13th ITG-Symposium
publication_identifier:
  issn:
  - 'null'
quality_controlled: '1'
status: public
title: Fast and Accurate Audio Resampling for Acoustic Sensor Networks by Polyphase-Farrow
  Filters with FFT Realization
type: conference
user_id: '460'
year: '2018'
...
---
_id: '11717'
abstract:
- lang: eng
  text: In this work, we address the limited availability of large annotated databases
    for real-life audio event detection by utilizing the concept of transfer learning.
    This technique aims to transfer knowledge from a source domain to a target domain,
    even if source and target have different feature distributions and label sets.
    We hypothesize that all acoustic events share the same inventory of basic acoustic
    building blocks and differ only in the temporal order of these acoustic units.
    We then construct a deep neural network with convolutional layers for extracting
    the acoustic units and a recurrent layer for capturing the temporal order. Under
    the above hypothesis, transfer learning from a source to a target domain with
    a different acoustic event inventory is realized by transferring the convolutional
    layers from the source to the target domain. The recurrent layer is, however,
    learnt directly from the target domain. Experiments on the transfer from a synthetic
    source database to the real-life target database of DCASE 2016 demonstrate that
    transfer learning leads to improved detection performance on average. However,
    a successful transfer to events that are very different from those seen in the
    source domain could not be verified.
author:
- first_name: Prerna
  full_name: Arora, Prerna
  last_name: Arora
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Arora P, Haeb-Umbach R. A Study on Transfer Learning for Acoustic Event Detection
    in a Real Life Scenario. In: <i>IEEE 19th International Workshop on Multimedia
    Signal Processing (MMSP)</i>. ; 2017.'
  apa: Arora, P., &#38; Haeb-Umbach, R. (2017). A Study on Transfer Learning for Acoustic
    Event Detection in a Real Life Scenario. In <i>IEEE 19th International Workshop
    on Multimedia Signal Processing (MMSP)</i>.
  bibtex: '@inproceedings{Arora_Haeb-Umbach_2017, title={A Study on Transfer Learning
    for Acoustic Event Detection in a Real Life Scenario}, booktitle={IEEE 19th International
    Workshop on Multimedia Signal Processing (MMSP)}, author={Arora, Prerna and Haeb-Umbach,
    Reinhold}, year={2017} }'
  chicago: Arora, Prerna, and Reinhold Haeb-Umbach. “A Study on Transfer Learning
    for Acoustic Event Detection in a Real Life Scenario.” In <i>IEEE 19th International
    Workshop on Multimedia Signal Processing (MMSP)</i>, 2017.
  ieee: P. Arora and R. Haeb-Umbach, “A Study on Transfer Learning for Acoustic Event
    Detection in a Real Life Scenario,” in <i>IEEE 19th International Workshop on
    Multimedia Signal Processing (MMSP)</i>, 2017.
  mla: Arora, Prerna, and Reinhold Haeb-Umbach. “A Study on Transfer Learning for
    Acoustic Event Detection in a Real Life Scenario.” <i>IEEE 19th International
    Workshop on Multimedia Signal Processing (MMSP)</i>, 2017.
  short: 'P. Arora, R. Haeb-Umbach, in: IEEE 19th International Workshop on Multimedia
    Signal Processing (MMSP), 2017.'
date_created: 2019-07-12T05:26:54Z
date_updated: 2022-01-06T06:51:07Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/MMSP_2017_AroraHaeb.pdf
oa: '1'
publication: IEEE 19th International Workshop on Multimedia Signal Processing (MMSP)
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/MMSP_2017_AroraHaeb_poster.pdf
status: public
title: A Study on Transfer Learning for Acoustic Event Detection in a Real Life Scenario
type: conference
user_id: '44006'
year: '2017'
...
---
_id: '11735'
abstract:
- lang: eng
  text: This report describes the computation of gradients by algorithmic differentiation
    for statistically optimum beamforming operations. In particular, the differentiation of
    complex-valued functions is a key component of this approach. Therefore, the real-valued
    algorithmic differentiation is extended via the complex-valued chain rule. In
    addition to the basic mathematical operations, the derivative of the eigenvalue problem
    with complex-valued eigenvectors is one of the key results of this report. The
    potential of this approach is shown with experimental results on the CHiME-3 challenge
    database. There, the beamforming task is used as a front-end for an ASR system.
    With the developed derivatives a joint optimization of a speech enhancement and
    speech recognition system w.r.t. the recognition optimization criterion is possible.
author:
- first_name: Christoph
  full_name: Boeddeker, Christoph
  id: '40767'
  last_name: Boeddeker
- first_name: Patrick
  full_name: Hanebrink, Patrick
  last_name: Hanebrink
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Boeddeker C, Hanebrink P, Drude L, Heymann J, Haeb-Umbach R. <i>On the Computation
    of Complex-Valued Gradients with Application to Statistically Optimum Beamforming</i>.;
    2017.
  apa: Boeddeker, C., Hanebrink, P., Drude, L., Heymann, J., &#38; Haeb-Umbach, R.
    (2017). <i>On the Computation of Complex-valued Gradients with Application to
    Statistically Optimum Beamforming</i>.
  bibtex: '@book{Boeddeker_Hanebrink_Drude_Heymann_Haeb-Umbach_2017, title={On the
    Computation of Complex-valued Gradients with Application to Statistically Optimum
    Beamforming}, author={Boeddeker, Christoph and Hanebrink, Patrick and Drude, Lukas
    and Heymann, Jahn and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Boeddeker, Christoph, Patrick Hanebrink, Lukas Drude, Jahn Heymann, and
    Reinhold Haeb-Umbach. <i>On the Computation of Complex-Valued Gradients with Application
    to Statistically Optimum Beamforming</i>, 2017.
  ieee: C. Boeddeker, P. Hanebrink, L. Drude, J. Heymann, and R. Haeb-Umbach, <i>On
    the Computation of Complex-valued Gradients with Application to Statistically
    Optimum Beamforming</i>. 2017.
  mla: Boeddeker, Christoph, et al. <i>On the Computation of Complex-Valued Gradients
    with Application to Statistically Optimum Beamforming</i>. 2017.
  short: C. Boeddeker, P. Hanebrink, L. Drude, J. Heymann, R. Haeb-Umbach, On the
    Computation of Complex-Valued Gradients with Application to Statistically Optimum
    Beamforming, 2017.
date_created: 2019-07-12T05:27:15Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/ArXiv_2017_BoeddekerHanebrinkHaeb_Article.pdf
oa: '1'
status: public
title: On the Computation of Complex-valued Gradients with Application to Statistically
  Optimum Beamforming
type: report
user_id: '40767'
year: '2017'
...
---
_id: '11736'
abstract:
- lang: eng
  text: In this paper we show how a neural network for spectral mask estimation for
    an acoustic beamformer can be optimized by algorithmic differentiation. Using
    the beamformer output SNR as the objective function to maximize, the gradient
    is propagated through the beamformer all the way to the neural network which provides
    the clean speech and noise masks from which the beamformer coefficients are estimated
    by eigenvalue decomposition. A key theoretical result is the derivative of an
    eigenvalue problem involving complex-valued eigenvectors. Experimental results
    on the CHiME-3 challenge database demonstrate the effectiveness of the approach.
    The tools developed in this paper are a key component for an end-to-end optimization
    of speech enhancement and speech recognition.
author:
- first_name: Christoph
  full_name: Boeddeker, Christoph
  id: '40767'
  last_name: Boeddeker
- first_name: Patrick
  full_name: Hanebrink, Patrick
  last_name: Hanebrink
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Boeddeker C, Hanebrink P, Drude L, Heymann J, Haeb-Umbach R. Optimizing Neural-Network
    Supported Acoustic Beamforming by Algorithmic Differentiation. In: <i>Proc. IEEE
    Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>. ; 2017.'
  apa: Boeddeker, C., Hanebrink, P., Drude, L., Heymann, J., &#38; Haeb-Umbach, R.
    (2017). Optimizing Neural-Network Supported Acoustic Beamforming by Algorithmic
    Differentiation. In <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal
    Processing (ICASSP)</i>.
  bibtex: '@inproceedings{Boeddeker_Hanebrink_Drude_Heymann_Haeb-Umbach_2017, title={Optimizing
    Neural-Network Supported Acoustic Beamforming by Algorithmic Differentiation},
    booktitle={Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)},
    author={Boeddeker, Christoph and Hanebrink, Patrick and Drude, Lukas and Heymann,
    Jahn and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Boeddeker, Christoph, Patrick Hanebrink, Lukas Drude, Jahn Heymann, and
    Reinhold Haeb-Umbach. “Optimizing Neural-Network Supported Acoustic Beamforming
    by Algorithmic Differentiation.” In <i>Proc. IEEE Intl. Conf. on Acoustics, Speech
    and Signal Processing (ICASSP)</i>, 2017.
  ieee: C. Boeddeker, P. Hanebrink, L. Drude, J. Heymann, and R. Haeb-Umbach, “Optimizing
    Neural-Network Supported Acoustic Beamforming by Algorithmic Differentiation,”
    in <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>,
    2017.
  mla: Boeddeker, Christoph, et al. “Optimizing Neural-Network Supported Acoustic
    Beamforming by Algorithmic Differentiation.” <i>Proc. IEEE Intl. Conf. on Acoustics,
    Speech and Signal Processing (ICASSP)</i>, 2017.
  short: 'C. Boeddeker, P. Hanebrink, L. Drude, J. Heymann, R. Haeb-Umbach, in: Proc.
    IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2017.'
date_created: 2019-07-12T05:27:16Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/icassp_2017_boeddeker_paper.pdf
oa: '1'
publication: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)
status: public
title: Optimizing Neural-Network Supported Acoustic Beamforming by Algorithmic Differentiation
type: conference
user_id: '44006'
year: '2017'
...
---
_id: '11737'
abstract:
- lang: eng
  text: The benefits of both a logarithmic spectral amplitude (LSA) estimation and
    a modeling in a generalized spectral domain (where short-time amplitudes are raised
    to a generalized power exponent, not restricted to magnitude or power spectrum)
    are combined in this contribution to achieve a better tradeoff between speech
    quality and noise suppression in single-channel speech enhancement. A novel gain
    function is derived to enhance the logarithmic generalized spectral amplitudes
    of noisy speech. Experiments on the CHiME-3 dataset show that it outperforms the
    famous minimum mean squared error (MMSE) LSA gain function of Ephraim and Malah
    in terms of noise suppression by 1.4 dB, while the good speech quality of the
    MMSE-LSA estimator is maintained.
author:
- first_name: Aleksej
  full_name: Chinaev, Aleksej
  last_name: Chinaev
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Chinaev A, Haeb-Umbach R. A Generalized Log-Spectral Amplitude Estimator for
    Single-Channel Speech Enhancement. In: <i>Proc. IEEE Intl. Conf. on Acoustics,
    Speech and Signal Processing (ICASSP)</i>. ; 2017.'
  apa: Chinaev, A., &#38; Haeb-Umbach, R. (2017). A Generalized Log-Spectral Amplitude
    Estimator for Single-Channel Speech Enhancement. In <i>Proc. IEEE Intl. Conf.
    on Acoustics, Speech and Signal Processing (ICASSP)</i>.
  bibtex: '@inproceedings{Chinaev_Haeb-Umbach_2017, title={A Generalized Log-Spectral
    Amplitude Estimator for Single-Channel Speech Enhancement}, booktitle={Proc. IEEE
    Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)}, author={Chinaev,
    Aleksej and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Chinaev, Aleksej, and Reinhold Haeb-Umbach. “A Generalized Log-Spectral
    Amplitude Estimator for Single-Channel Speech Enhancement.” In <i>Proc. IEEE Intl.
    Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>, 2017.
  ieee: A. Chinaev and R. Haeb-Umbach, “A Generalized Log-Spectral Amplitude Estimator
    for Single-Channel Speech Enhancement,” in <i>Proc. IEEE Intl. Conf. on Acoustics,
    Speech and Signal Processing (ICASSP)</i>, 2017.
  mla: Chinaev, Aleksej, and Reinhold Haeb-Umbach. “A Generalized Log-Spectral Amplitude
    Estimator for Single-Channel Speech Enhancement.” <i>Proc. IEEE Intl. Conf. on
    Acoustics, Speech and Signal Processing (ICASSP)</i>, 2017.
  short: 'A. Chinaev, R. Haeb-Umbach, in: Proc. IEEE Intl. Conf. on Acoustics, Speech
    and Signal Processing (ICASSP), 2017.'
date_created: 2019-07-12T05:27:17Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/ChinHaeb17.pdf
oa: '1'
publication: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)
related_material:
  link:
  - description: Slides
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/ChinHaeb17_Slides.pdf
status: public
title: A Generalized Log-Spectral Amplitude Estimator for Single-Channel Speech Enhancement
type: conference
user_id: '44006'
year: '2017'
...
---
_id: '11754'
abstract:
- lang: eng
  text: Recent advances in discriminatively trained mask estimation networks to extract
    a single source utilizing beamforming techniques demonstrate that the integration
    of statistical models and deep neural networks (DNNs) is a promising approach
    for robust automatic speech recognition (ASR) applications. In this contribution
    we demonstrate how discriminatively trained embeddings on spectral features can
    be tightly integrated into statistical model-based source separation to separate
    and transcribe overlapping speech. Good generalization to unseen spatial configurations
    is achieved by estimating a statistical model at test time, while still leveraging
    discriminative training of deep clustering embeddings on a separate training set.
    We formulate an expectation maximization (EM) algorithm which jointly estimates
    a model for deep clustering embeddings and complex-valued spatial observations
    in the short-time Fourier transform (STFT) domain at test time. Extensive simulations
    confirm that the integrated model outperforms (a) a deep clustering model with
    a subsequent beamforming step and (b) an EM-based model with a beamforming step
    alone in terms of signal-to-distortion ratio (SDR) and perceptually motivated
    metric (PESQ) gains. ASR results on a reverberated dataset further show that
    the aforementioned gains translate to reduced word error rates (WERs) even in
    reverberant environments.
author:
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Drude L, Haeb-Umbach R. Tight integration of spatial and spectral features
    for BSS with Deep Clustering embeddings. In: <i>INTERSPEECH 2017, Stockholm, Schweden</i>.
    ; 2017.'
  apa: Drude, L., &#38; Haeb-Umbach, R. (2017). Tight integration of spatial and spectral
    features for BSS with Deep Clustering embeddings. In <i>INTERSPEECH 2017, Stockholm,
    Schweden</i>.
  bibtex: '@inproceedings{Drude_Haeb-Umbach_2017, title={Tight integration of spatial
    and spectral features for BSS with Deep Clustering embeddings}, booktitle={INTERSPEECH
    2017, Stockholm, Schweden}, author={Drude, Lukas and Haeb-Umbach, Reinhold}, year={2017}
    }'
  chicago: Drude, Lukas, and Reinhold Haeb-Umbach. “Tight Integration of Spatial and
    Spectral Features for BSS with Deep Clustering Embeddings.” In <i>INTERSPEECH
    2017, Stockholm, Schweden</i>, 2017.
  ieee: L. Drude and R. Haeb-Umbach, “Tight integration of spatial and spectral features
    for BSS with Deep Clustering embeddings,” in <i>INTERSPEECH 2017, Stockholm, Schweden</i>,
    2017.
  mla: Drude, Lukas, and Reinhold Haeb-Umbach. “Tight Integration of Spatial and Spectral
    Features for BSS with Deep Clustering Embeddings.” <i>INTERSPEECH 2017, Stockholm,
    Schweden</i>, 2017.
  short: 'L. Drude, R. Haeb-Umbach, in: INTERSPEECH 2017, Stockholm, Schweden, 2017.'
date_created: 2019-07-12T05:27:37Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Drude_paper.pdf
oa: '1'
publication: INTERSPEECH 2017, Stockholm, Schweden
related_material:
  link:
  - description: Slides
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Drude_slides.pdf
status: public
title: Tight integration of spatial and spectral features for BSS with Deep Clustering
  embeddings
type: conference
user_id: '44006'
year: '2017'
...
---
_id: '11770'
abstract:
- lang: eng
  text: 'In this contribution we show how to exploit text data to support word discovery
    from audio input in an underresourced target language. Given audio, of which a
    certain amount is transcribed at the word level, and additional unrelated text
    data, the approach is able to learn a probabilistic mapping from acoustic units
    to characters and utilize it to segment the audio data into words without the
    need of a pronunciation dictionary. This is achieved by three components: an unsupervised
    acoustic unit discovery system, a supervisedly trained acoustic unit-to-grapheme
    converter, and a word discovery system, which is initialized with a language model
    trained on the text data. Experiments for multiple setups show that the initialization
    of the language model with text data improves the word segmentation performance
    by a large margin.'
author:
- first_name: Thomas
  full_name: Glarner, Thomas
  id: '14169'
  last_name: Glarner
- first_name: Benedikt
  full_name: Boenninghoff, Benedikt
  last_name: Boenninghoff
- first_name: Oliver
  full_name: Walter, Oliver
  last_name: Walter
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Glarner T, Boenninghoff B, Walter O, Haeb-Umbach R. Leveraging Text Data for
    Word Segmentation for Underresourced Languages. In: <i>INTERSPEECH 2017, Stockholm,
    Schweden</i>. ; 2017.'
  apa: Glarner, T., Boenninghoff, B., Walter, O., &#38; Haeb-Umbach, R. (2017). Leveraging
    Text Data for Word Segmentation for Underresourced Languages. In <i>INTERSPEECH
    2017, Stockholm, Schweden</i>.
  bibtex: '@inproceedings{Glarner_Boenninghoff_Walter_Haeb-Umbach_2017, title={Leveraging
    Text Data for Word Segmentation for Underresourced Languages}, booktitle={INTERSPEECH
    2017, Stockholm, Schweden}, author={Glarner, Thomas and Boenninghoff, Benedikt
    and Walter, Oliver and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Glarner, Thomas, Benedikt Boenninghoff, Oliver Walter, and Reinhold Haeb-Umbach.
    “Leveraging Text Data for Word Segmentation for Underresourced Languages.” In
    <i>INTERSPEECH 2017, Stockholm, Schweden</i>, 2017.
  ieee: T. Glarner, B. Boenninghoff, O. Walter, and R. Haeb-Umbach, “Leveraging Text
    Data for Word Segmentation for Underresourced Languages,” in <i>INTERSPEECH 2017,
    Stockholm, Schweden</i>, 2017.
  mla: Glarner, Thomas, et al. “Leveraging Text Data for Word Segmentation for Underresourced
    Languages.” <i>INTERSPEECH 2017, Stockholm, Schweden</i>, 2017.
  short: 'T. Glarner, B. Boenninghoff, O. Walter, R. Haeb-Umbach, in: INTERSPEECH
    2017, Stockholm, Schweden, 2017.'
date_created: 2019-07-12T05:27:55Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Glarner_paper.pdf
oa: '1'
publication: INTERSPEECH 2017, Stockholm, Schweden
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Glarner_poster.pdf
status: public
title: Leveraging Text Data for Word Segmentation for Underresourced Languages
type: conference
user_id: '44006'
year: '2017'
...
---
_id: '11809'
abstract:
- lang: eng
  text: This paper presents an end-to-end training approach for a beamformer-supported
    multi-channel ASR system. A neural network which estimates masks for a statistically
    optimum beamformer is jointly trained with a network for acoustic modeling. To
    update its parameters, we propagate the gradients from the acoustic model all
    the way through feature extraction and the complex-valued beamforming operation.
    Besides avoiding a mismatch between the front-end and the back-end, this approach
    also eliminates the need for stereo data, i.e., the parallel availability of clean
    and noisy versions of the signals. Instead, it can be trained with real noisy
    multichannel data only. Also, relying on the signal statistics for beamforming,
    the approach makes no assumptions on the configuration of the microphone array.
    We further observe a performance gain through joint training in terms of word
    error rate in an evaluation of the system on the CHiME 4 dataset.
author:
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Christoph
  full_name: Boeddeker, Christoph
  id: '40767'
  last_name: Boeddeker
- first_name: Patrick
  full_name: Hanebrink, Patrick
  last_name: Hanebrink
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Heymann J, Drude L, Boeddeker C, Hanebrink P, Haeb-Umbach R. BEAMNET: End-to-End
    Training of a Beamformer-Supported Multi-Channel ASR System. In: <i>Proc. IEEE
    Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>. ; 2017.'
  apa: 'Heymann, J., Drude, L., Boeddeker, C., Hanebrink, P., &#38; Haeb-Umbach, R.
    (2017). BEAMNET: End-to-End Training of a Beamformer-Supported Multi-Channel ASR
    System. In <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing
    (ICASSP)</i>.'
  bibtex: '@inproceedings{Heymann_Drude_Boeddeker_Hanebrink_Haeb-Umbach_2017, title={BEAMNET:
    End-to-End Training of a Beamformer-Supported Multi-Channel ASR System}, booktitle={Proc.
    IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)}, author={Heymann,
    Jahn and Drude, Lukas and Boeddeker, Christoph and Hanebrink, Patrick and Haeb-Umbach,
    Reinhold}, year={2017} }'
  chicago: 'Heymann, Jahn, Lukas Drude, Christoph Boeddeker, Patrick Hanebrink, and
    Reinhold Haeb-Umbach. “BEAMNET: End-to-End Training of a Beamformer-Supported
    Multi-Channel ASR System.” In <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and
    Signal Processing (ICASSP)</i>, 2017.'
  ieee: 'J. Heymann, L. Drude, C. Boeddeker, P. Hanebrink, and R. Haeb-Umbach, “BEAMNET:
    End-to-End Training of a Beamformer-Supported Multi-Channel ASR System,” in <i>Proc.
    IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>, 2017.'
  mla: 'Heymann, Jahn, et al. “BEAMNET: End-to-End Training of a Beamformer-Supported
    Multi-Channel ASR System.” <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and
    Signal Processing (ICASSP)</i>, 2017.'
  short: 'J. Heymann, L. Drude, C. Boeddeker, P. Hanebrink, R. Haeb-Umbach, in: Proc.
    IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2017.'
date_created: 2019-07-12T05:28:40Z
date_updated: 2022-01-06T06:51:09Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/icassp_2017_heymann_paper.pdf
oa: '1'
project:
- _id: '52'
  name: Computing Resources Provided by the Paderborn Center for Parallel Computing
publication: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/icassp_2017_heymann_poster.pdf
status: public
title: 'BEAMNET: End-to-End Training of a Beamformer-Supported Multi-Channel ASR System'
type: conference
user_id: '40767'
year: '2017'
...
---
_id: '11811'
abstract:
- lang: eng
  text: 'Acoustic beamforming can greatly improve the performance of Automatic Speech
    Recognition (ASR) and speech enhancement systems when multiple channels are available.
    We recently proposed a way to support the model-based Generalized Eigenvalue beamforming
    operation with a powerful neural network for spectral mask estimation. The enhancement
    system has a number of desirable properties. In particular, no assumptions
    need to be made about the nature of the acoustic transfer function (e.g., being
    anechoic), nor does the array configuration need to be known. While the system
    has been originally developed to enhance speech in noisy environments, we show
    in this article that it is also effective in suppressing reverberation, thus leading
    to a generic trainable multi-channel speech enhancement system for robust speech
    processing. To support this claim, we consider two distinct datasets: The CHiME
    3 challenge, which features challenging real-world noise distortions, and the
    Reverb challenge, which focuses on distortions caused by reverberation. We evaluate
    the system both with respect to a speech enhancement and a recognition task. For
    the first task we propose a new way to cope with the distortions introduced by
    the Generalized Eigenvalue beamformer by renormalizing the target energy for each
    frequency bin, and measure its effectiveness in terms of the PESQ score. For the
    latter we feed the enhanced signal to a strong DNN back-end and achieve state-of-the-art
    ASR results on both datasets. We further experiment with different network architectures
    for spectral mask estimation: One small feed-forward network with only one hidden
    layer, one Convolutional Neural Network and one bi-directional Long Short-Term
    Memory network, showing that even a small network is capable of delivering significant
    performance improvements.'
author:
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: Heymann J, Drude L, Haeb-Umbach R. A Generic Neural Acoustic Beamforming Architecture
    for Robust Multi-Channel Speech Processing. <i>Computer Speech and Language</i>.
    2017.
  apa: Heymann, J., Drude, L., &#38; Haeb-Umbach, R. (2017). A Generic Neural Acoustic
    Beamforming Architecture for Robust Multi-Channel Speech Processing. <i>Computer
    Speech and Language</i>.
  bibtex: '@article{Heymann_Drude_Haeb-Umbach_2017, title={A Generic Neural Acoustic
    Beamforming Architecture for Robust Multi-Channel Speech Processing}, journal={Computer
    Speech and Language}, author={Heymann, Jahn and Drude, Lukas and Haeb-Umbach,
    Reinhold}, year={2017} }'
  chicago: Heymann, Jahn, Lukas Drude, and Reinhold Haeb-Umbach. “A Generic Neural
    Acoustic Beamforming Architecture for Robust Multi-Channel Speech Processing.”
    <i>Computer Speech and Language</i>, 2017.
  ieee: J. Heymann, L. Drude, and R. Haeb-Umbach, “A Generic Neural Acoustic Beamforming
    Architecture for Robust Multi-Channel Speech Processing,” <i>Computer Speech and
    Language</i>, 2017.
  mla: Heymann, Jahn, et al. “A Generic Neural Acoustic Beamforming Architecture for
    Robust Multi-Channel Speech Processing.” <i>Computer Speech and Language</i>,
    2017.
  short: J. Heymann, L. Drude, R. Haeb-Umbach, Computer Speech and Language (2017).
date_created: 2019-07-12T05:28:43Z
date_updated: 2022-01-06T06:51:09Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/ComputerSpeechLanguage_2017_heymann_paper.pdf
oa: '1'
publication: Computer Speech and Language
status: public
title: A Generic Neural Acoustic Beamforming Architecture for Robust Multi-Channel
  Speech Processing
type: journal_article
user_id: '44006'
year: '2017'
...
---
_id: '12081'
abstract:
- lang: eng
  text: 'The invention relates to a building or enclosure termination opening and/or
    closing apparatus having communication signed or encrypted by means of a key,
    and to a method for operating such. To allow simple, convenient and secure use
    by exclusively authorised users, the apparatus comprises: a first and a second
    user terminal, with secure forwarding of a time-limited key from the first to
    the second user terminal being possible. According to an alternative, individual
    keys are generated by a user identification and a secret device key.'
author:
- first_name: Florian
  full_name: Jacob, Florian
  last_name: Jacob
- first_name: Joerg
  full_name: Schmalenstroeer, Joerg
  id: '460'
  last_name: Schmalenstroeer
citation:
  ama: Jacob F, Schmalenstroeer J. Building or Enclosure Termination Closing and/or
    Opening Apparatus, and Method for Operating a Building or Enclosure Termination.
    2017.
  apa: Jacob, F., &#38; Schmalenstroeer, J. (2017). Building or Enclosure Termination
    Closing and/or Opening Apparatus, and Method for Operating a Building or Enclosure
    Termination.
  bibtex: '@article{Jacob_Schmalenstroeer_2017, title={Building or Enclosure Termination
    Closing and/or Opening Apparatus, and Method for Operating a Building or Enclosure
    Termination}, author={Jacob, Florian and Schmalenstroeer, Joerg}, year={2017}
    }'
  chicago: Jacob, Florian, and Joerg Schmalenstroeer. “Building or Enclosure Termination
    Closing and/or Opening Apparatus, and Method for Operating a Building or Enclosure
    Termination,” 2017.
  ieee: F. Jacob and J. Schmalenstroeer, “Building or Enclosure Termination Closing
    and/or Opening Apparatus, and Method for Operating a Building or Enclosure Termination.”
    2017.
  mla: Jacob, Florian, and Joerg Schmalenstroeer. <i>Building or Enclosure Termination
    Closing and/or Opening Apparatus, and Method for Operating a Building or Enclosure
    Termination</i>. 2017.
  short: F. Jacob, J. Schmalenstroeer, (2017).
date_created: 2019-07-19T08:07:11Z
date_updated: 2022-01-06T06:51:17Z
department:
- _id: '54'
ipc: WO2018/077610A
ipn: WO2018/077610A
publication_date: '2017'
status: public
title: Building or Enclosure Termination Closing and/or Opening Apparatus, and Method
  for Operating a Building or Enclosure Termination
type: patent
user_id: '460'
year: '2017'
...
---
_id: '11763'
abstract:
- lang: eng
  text: In this paper, we apply a high-resolution approach, i.e. the matrix pencil
    method (MPM), to the FMCW automotive radar system to separate the neighboring
    targets, which share similar parameters, i.e. range, relative speed and azimuth
    angle, and cause overlapping in the radar spectrum. In order to adapt the 1D model
    of MPM to the 2D range-velocity spectrum and simultaneously limit the computational
    cost, some preprocessing steps are proposed to construct a novel separation algorithm.
    Finally, this algorithm is evaluated on both simulated and real data, and the
    results indicate promising performance.
author:
- first_name: Tai
  full_name: Fei, Tai
  last_name: Fei
- first_name: Christopher
  full_name: Grimm, Christopher
  last_name: Grimm
- first_name: Ridha
  full_name: Farhoud, Ridha
  last_name: Farhoud
- first_name: Tobias
  full_name: Breddermann, Tobias
  last_name: Breddermann
- first_name: Ernst
  full_name: Warsitz, Ernst
  last_name: Warsitz
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Fei T, Grimm C, Farhoud R, Breddermann T, Warsitz E, Haeb-Umbach R. A Novel
    Target Separation Algorithm Applied to The Two-Dimensional Spectrum for FMCW Automotive
    Radar Systems. In: <i>IEEE International Conference on Microwave, Communications,
    Antennas and Electronic Systems</i>. ; 2017.'
  apa: Fei, T., Grimm, C., Farhoud, R., Breddermann, T., Warsitz, E., &#38; Haeb-Umbach,
    R. (2017). A Novel Target Separation Algorithm Applied to The Two-Dimensional
    Spectrum for FMCW Automotive Radar Systems. <i>IEEE International Conference on
    Microwave, Communications, Antennas and Electronic Systems</i>.
  bibtex: '@inproceedings{Fei_Grimm_Farhoud_Breddermann_Warsitz_Haeb-Umbach_2017,
    title={A Novel Target Separation Algorithm Applied to The Two-Dimensional Spectrum
    for FMCW Automotive Radar Systems}, booktitle={IEEE International conference on
    microwave, communications, antennas and electronic systems}, author={Fei, Tai
    and Grimm, Christopher and Farhoud, Ridha and Breddermann, Tobias and Warsitz,
    Ernst and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Fei, Tai, Christopher Grimm, Ridha Farhoud, Tobias Breddermann, Ernst Warsitz,
    and Reinhold Haeb-Umbach. “A Novel Target Separation Algorithm Applied to The
    Two-Dimensional Spectrum for FMCW Automotive Radar Systems.” In <i>IEEE International
    Conference on Microwave, Communications, Antennas and Electronic Systems</i>,
    2017.
  ieee: T. Fei, C. Grimm, R. Farhoud, T. Breddermann, E. Warsitz, and R. Haeb-Umbach,
    “A Novel Target Separation Algorithm Applied to The Two-Dimensional Spectrum for
    FMCW Automotive Radar Systems,” 2017.
  mla: Fei, Tai, et al. “A Novel Target Separation Algorithm Applied to The Two-Dimensional
    Spectrum for FMCW Automotive Radar Systems.” <i>IEEE International Conference
    on Microwave, Communications, Antennas and Electronic Systems</i>, 2017.
  short: 'T. Fei, C. Grimm, R. Farhoud, T. Breddermann, E. Warsitz, R. Haeb-Umbach,
    in: IEEE International Conference on Microwave, Communications, Antennas and Electronic
    Systems, 2017.'
date_created: 2019-07-12T05:27:47Z
date_updated: 2023-11-20T16:37:49Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/COMCAS_2017_FeiHaeb_paper.pdf
oa: '1'
publication: IEEE International conference on microwave, communications, antennas
  and electronic systems
quality_controlled: '1'
status: public
title: A Novel Target Separation Algorithm Applied to The Two-Dimensional Spectrum
  for FMCW Automotive Radar Systems
type: conference
user_id: '242'
year: '2017'
...
---
_id: '11772'
abstract:
- lang: eng
  text: In this paper, we present a hypothesis test for the classification of moving
    targets in the sight of an automotive radar sensor. For this purpose, a statistical
    model of the relative velocity between a stationary target and the radar sensor
    has been developed. With respect to the statistical properties a confidence interval
    is calculated and targets with relative velocity lying outside this interval are
    classified as moving targets. Compared to existing algorithms our approach is
    able to give robust classification independent of the number of observed moving
    targets and is characterized by an instantaneous classification, a simple parameterization
    of the model and an automatic calculation of the discriminating threshold.
author:
- first_name: Christopher
  full_name: Grimm, Christopher
  last_name: Grimm
- first_name: Tobias
  full_name: Breddermann, Tobias
  last_name: Breddermann
- first_name: Ridha
  full_name: Farhoud, Ridha
  last_name: Farhoud
- first_name: Tai
  full_name: Fei, Tai
  last_name: Fei
- first_name: Ernst
  full_name: Warsitz, Ernst
  last_name: Warsitz
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Grimm C, Breddermann T, Farhoud R, Fei T, Warsitz E, Haeb-Umbach R. Hypothesis
    Test for the Detection of Moving Targets in Automotive Radar. In: <i>IEEE International
    Conference on Microwave, Communications, Antennas and Electronic Systems (COMCAS)</i>.
    ; 2017.'
  apa: Grimm, C., Breddermann, T., Farhoud, R., Fei, T., Warsitz, E., &#38; Haeb-Umbach,
    R. (2017). Hypothesis Test for the Detection of Moving Targets in Automotive Radar.
    <i>IEEE International Conference on Microwave, Communications, Antennas and Electronic
    Systems (COMCAS)</i>.
  bibtex: '@inproceedings{Grimm_Breddermann_Farhoud_Fei_Warsitz_Haeb-Umbach_2017,
    title={Hypothesis Test for the Detection of Moving Targets in Automotive Radar},
    booktitle={IEEE International conference on microwave, communications, antennas
    and electronic systems (COMCAS)}, author={Grimm, Christopher and Breddermann,
    Tobias and Farhoud, Ridha and Fei, Tai and Warsitz, Ernst and Haeb-Umbach, Reinhold},
    year={2017} }'
  chicago: Grimm, Christopher, Tobias Breddermann, Ridha Farhoud, Tai Fei, Ernst Warsitz,
    and Reinhold Haeb-Umbach. “Hypothesis Test for the Detection of Moving Targets
    in Automotive Radar.” In <i>IEEE International Conference on Microwave, Communications,
    Antennas and Electronic Systems (COMCAS)</i>, 2017.
  ieee: C. Grimm, T. Breddermann, R. Farhoud, T. Fei, E. Warsitz, and R. Haeb-Umbach,
    “Hypothesis Test for the Detection of Moving Targets in Automotive Radar,” 2017.
  mla: Grimm, Christopher, et al. “Hypothesis Test for the Detection of Moving Targets
    in Automotive Radar.” <i>IEEE International Conference on Microwave, Communications,
    Antennas and Electronic Systems (COMCAS)</i>, 2017.
  short: 'C. Grimm, T. Breddermann, R. Farhoud, T. Fei, E. Warsitz, R. Haeb-Umbach,
    in: IEEE International Conference on Microwave, Communications, Antennas and Electronic
    Systems (COMCAS), 2017.'
date_created: 2019-07-12T05:27:57Z
date_updated: 2023-11-20T16:37:59Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/COMCAS_2017_GrimmHaeb_paper.pdf
oa: '1'
publication: IEEE International conference on microwave, communications, antennas
  and electronic systems (COMCAS)
quality_controlled: '1'
status: public
title: Hypothesis Test for the Detection of Moving Targets in Automotive Radar
type: conference
user_id: '242'
year: '2017'
...
---
_id: '11759'
abstract:
- lang: eng
  text: 'Variational Autoencoders (VAEs) have been shown to provide efficient neural-network-based
    approximate Bayesian inference for observation models for which exact inference
    is intractable. Their extension, the so-called Structured VAE (SVAE), allows inference
    in the presence of both discrete and continuous latent variables. Inspired by
    this extension, we developed a VAE with Hidden Markov Models (HMMs) as latent
    models. We applied the resulting HMM-VAE to the task of acoustic unit discovery
    in a zero resource scenario. Starting from an initial model based on variational
    inference in an HMM with Gaussian Mixture Model (GMM) emission probabilities,
    the accuracy of the acoustic unit discovery could be significantly improved by
    the HMM-VAE. In doing so we were able to demonstrate for an unsupervised learning
    task what is well-known in the supervised learning case: Neural networks provide
    superior modeling power compared to GMMs.'
author:
- first_name: Janek
  full_name: Ebbers, Janek
  id: '34851'
  last_name: Ebbers
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Thomas
  full_name: Glarner, Thomas
  id: '14169'
  last_name: Glarner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
- first_name: Bhiksha
  full_name: Raj, Bhiksha
  last_name: Raj
citation:
  ama: 'Ebbers J, Heymann J, Drude L, Glarner T, Haeb-Umbach R, Raj B. Hidden Markov
    Model Variational Autoencoder for Acoustic Unit Discovery. In: <i>INTERSPEECH
    2017, Stockholm, Schweden</i>. ; 2017.'
  apa: Ebbers, J., Heymann, J., Drude, L., Glarner, T., Haeb-Umbach, R., &#38; Raj,
    B. (2017). Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery.
    <i>INTERSPEECH 2017, Stockholm, Schweden</i>.
  bibtex: '@inproceedings{Ebbers_Heymann_Drude_Glarner_Haeb-Umbach_Raj_2017, title={Hidden
    Markov Model Variational Autoencoder for Acoustic Unit Discovery}, booktitle={INTERSPEECH
    2017, Stockholm, Schweden}, author={Ebbers, Janek and Heymann, Jahn and Drude,
    Lukas and Glarner, Thomas and Haeb-Umbach, Reinhold and Raj, Bhiksha}, year={2017}
    }'
  chicago: Ebbers, Janek, Jahn Heymann, Lukas Drude, Thomas Glarner, Reinhold Haeb-Umbach,
    and Bhiksha Raj. “Hidden Markov Model Variational Autoencoder for Acoustic Unit
    Discovery.” In <i>INTERSPEECH 2017, Stockholm, Schweden</i>, 2017.
  ieee: J. Ebbers, J. Heymann, L. Drude, T. Glarner, R. Haeb-Umbach, and B. Raj, “Hidden
    Markov Model Variational Autoencoder for Acoustic Unit Discovery,” 2017.
  mla: Ebbers, Janek, et al. “Hidden Markov Model Variational Autoencoder for Acoustic
    Unit Discovery.” <i>INTERSPEECH 2017, Stockholm, Schweden</i>, 2017.
  short: 'J. Ebbers, J. Heymann, L. Drude, T. Glarner, R. Haeb-Umbach, B. Raj, in:
    INTERSPEECH 2017, Stockholm, Schweden, 2017.'
date_created: 2019-07-12T05:27:42Z
date_updated: 2023-11-22T08:29:06Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Ebbers_paper.pdf
oa: '1'
publication: INTERSPEECH 2017, Stockholm, Schweden
quality_controlled: '1'
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Ebbers_poster.pdf
  - description: Slides
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/INTERSPEECH_2017_Ebbers_slides.pdf
status: public
title: Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery
type: conference
user_id: '34851'
year: '2017'
...
---
_id: '11895'
abstract:
- lang: eng
  text: Multi-channel speech enhancement algorithms rely on a synchronous sampling
    of the microphone signals. This, however, cannot always be guaranteed, especially
    if the sensors are distributed in an environment. To avoid performance degradation
    the sampling rate offset needs to be estimated and compensated for. In this contribution
    we extend the recently proposed coherence drift based method in two important
    directions. First, the increasing phase shift in the short-time Fourier transform
    domain is estimated from the coherence drift in a matched-filter-like fashion,
    where intermediate estimates are weighted by their instantaneous SNR. Second,
    an observed bias is removed by iterating between offset estimation and compensation
    by resampling a couple of times. The effectiveness of the proposed method is demonstrated
    by speech recognition results on the output of a beamformer with and without sampling
    rate offset compensation between the input channels. We compare MVDR and maximum-SNR
    beamformers in reverberant environments and further show that both benefit from
    a novel phase normalization, which we also propose in this contribution.
author:
- first_name: Joerg
  full_name: Schmalenstroeer, Joerg
  id: '460'
  last_name: Schmalenstroeer
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Christoph
  full_name: Boeddeker, Christoph
  id: '40767'
  last_name: Boeddeker
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Schmalenstroeer J, Heymann J, Drude L, Boeddeker C, Haeb-Umbach R. Multi-Stage
    Coherence Drift Based Sampling Rate Synchronization for Acoustic Beamforming.
    In: <i>IEEE 19th International Workshop on Multimedia Signal Processing (MMSP)</i>.
    ; 2017.'
  apa: Schmalenstroeer, J., Heymann, J., Drude, L., Boeddeker, C., &#38; Haeb-Umbach,
    R. (2017). Multi-Stage Coherence Drift Based Sampling Rate Synchronization for
    Acoustic Beamforming. <i>IEEE 19th International Workshop on Multimedia Signal
    Processing (MMSP)</i>.
  bibtex: '@inproceedings{Schmalenstroeer_Heymann_Drude_Boeddeker_Haeb-Umbach_2017,
    title={Multi-Stage Coherence Drift Based Sampling Rate Synchronization for Acoustic
    Beamforming}, booktitle={IEEE 19th International Workshop on Multimedia Signal
    Processing (MMSP)}, author={Schmalenstroeer, Joerg and Heymann, Jahn and Drude,
    Lukas and Boeddeker, Christoph and Haeb-Umbach, Reinhold}, year={2017} }'
  chicago: Schmalenstroeer, Joerg, Jahn Heymann, Lukas Drude, Christoph Boeddeker,
    and Reinhold Haeb-Umbach. “Multi-Stage Coherence Drift Based Sampling Rate Synchronization
    for Acoustic Beamforming.” In <i>IEEE 19th International Workshop on Multimedia
    Signal Processing (MMSP)</i>, 2017.
  ieee: J. Schmalenstroeer, J. Heymann, L. Drude, C. Boeddeker, and R. Haeb-Umbach,
    “Multi-Stage Coherence Drift Based Sampling Rate Synchronization for Acoustic
    Beamforming,” 2017.
  mla: Schmalenstroeer, Joerg, et al. “Multi-Stage Coherence Drift Based Sampling
    Rate Synchronization for Acoustic Beamforming.” <i>IEEE 19th International Workshop
    on Multimedia Signal Processing (MMSP)</i>, 2017.
  short: 'J. Schmalenstroeer, J. Heymann, L. Drude, C. Boeddeker, R. Haeb-Umbach,
    in: IEEE 19th International Workshop on Multimedia Signal Processing (MMSP), 2017.'
date_created: 2019-07-12T05:30:20Z
date_updated: 2023-10-26T08:12:05Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/MMSP_2017_SchHaeb.pdf
oa: '1'
publication: IEEE 19th International Workshop on Multimedia Signal Processing (MMSP)
quality_controlled: '1'
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2017/MMSP_2017_SchHaeb_poster.pdf
status: public
title: Multi-Stage Coherence Drift Based Sampling Rate Synchronization for Acoustic
  Beamforming
type: conference
user_id: '460'
year: '2017'
...
---
_id: '11773'
abstract:
- lang: eng
  text: In this paper we present an algorithm for the detection of moving targets
    in sight of an automotive radar sensor which can handle distorted ego-velocity
    information. In situations where biased or no velocity information is provided
    by the ego-vehicle, the algorithm is able to estimate the ego-velocity with high
    accuracy based on previously detected stationary targets; this estimate is subsequently
    used for the target classification. Compared to existing ego-velocity algorithms our
    approach provides fast and efficient inference without sacrificing the practical
    classification accuracy. In addition, the algorithm is characterized by simple
    parameterization and few but appropriate model assumptions for highly accurate
    production automotive radar sensors.
author:
- first_name: Christopher
  full_name: Grimm, Christopher
  last_name: Grimm
- first_name: Ridha
  full_name: Farhoud, Ridha
  last_name: Farhoud
- first_name: Tai
  full_name: Fei, Tai
  last_name: Fei
- first_name: Ernst
  full_name: Warsitz, Ernst
  last_name: Warsitz
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Grimm C, Farhoud R, Fei T, Warsitz E, Haeb-Umbach R. Detection of Moving Targets
    in Automotive Radar with Distorted Ego-Velocity Information. In: <i>IEEE Microwaves,
    Radar and Remote Sensing Symposium (MRRS)</i>. ; 2017.'
  apa: Grimm, C., Farhoud, R., Fei, T., Warsitz, E., &#38; Haeb-Umbach, R. (2017).
    Detection of Moving Targets in Automotive Radar with Distorted Ego-Velocity Information.
    <i>IEEE Microwaves, Radar and Remote Sensing Symposium (MRRS)</i>.
  bibtex: '@inproceedings{Grimm_Farhoud_Fei_Warsitz_Haeb-Umbach_2017, title={Detection
    of Moving Targets in Automotive Radar with Distorted Ego-Velocity Information},
    booktitle={IEEE Microwaves, Radar and Remote Sensing Symposium (MRRS)}, author={Grimm,
    Christopher and Farhoud, Ridha and Fei, Tai and Warsitz, Ernst and Haeb-Umbach,
    Reinhold}, year={2017} }'
  chicago: Grimm, Christopher, Ridha Farhoud, Tai Fei, Ernst Warsitz, and Reinhold
    Haeb-Umbach. “Detection of Moving Targets in Automotive Radar with Distorted Ego-Velocity
    Information.” In <i>IEEE Microwaves, Radar and Remote Sensing Symposium (MRRS)</i>,
    2017.
  ieee: C. Grimm, R. Farhoud, T. Fei, E. Warsitz, and R. Haeb-Umbach, “Detection of
    Moving Targets in Automotive Radar with Distorted Ego-Velocity Information,” 2017.
  mla: Grimm, Christopher, et al. “Detection of Moving Targets in Automotive Radar
    with Distorted Ego-Velocity Information.” <i>IEEE Microwaves, Radar and Remote
    Sensing Symposium (MRRS)</i>, 2017.
  short: 'C. Grimm, R. Farhoud, T. Fei, E. Warsitz, R. Haeb-Umbach, in: IEEE Microwaves,
    Radar and Remote Sensing Symposium (MRRS), 2017.'
date_created: 2019-07-12T05:27:59Z
date_updated: 2023-11-20T16:38:11Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2017/MRRS_2017_GrimmHaeb_paper.pdf
oa: '1'
publication: IEEE Microwaves, Radar and Remote Sensing Symposium (MRRS)
quality_controlled: '1'
status: public
title: Detection of Moving Targets in Automotive Radar with Distorted Ego-Velocity
  Information
type: conference
user_id: '242'
year: '2017'
...
---
_id: '11738'
abstract:
- lang: eng
  text: 'In this contribution we investigate a priori signal-to-noise ratio (SNR)
    estimation, a crucial component of a single-channel speech enhancement system
    based on spectral subtraction. The majority of the state-of-the-art a priori SNR
    estimators work in the power spectral domain, which is, however, not confirmed
    to be the optimal domain for the estimation. Motivated by the generalized spectral
    subtraction rule, we show how the estimation of the a priori SNR can be formulated
    in the so-called generalized SNR domain. This formulation allows us to generalize
    the widely used decision-directed (DD) approach. An experimental investigation
    with different noise types reveals the superiority of the generalized DD approach
    over the conventional DD approach in terms of both the mean opinion score - listening
    quality objective measure and the output global SNR in the medium to high input
    SNR regime, while we show that the power spectrum is the optimal domain for low
    SNR. We further develop a parameterization which adjusts the domain of estimation
    automatically according to the estimated input global SNR. Index Terms: single-channel
    speech enhancement, a priori SNR estimation, generalized spectral subtraction'
author:
- first_name: Aleksej
  full_name: Chinaev, Aleksej
  last_name: Chinaev
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Chinaev A, Haeb-Umbach R. A Priori SNR Estimation Using a Generalized Decision
    Directed Approach. In: <i>INTERSPEECH 2016, San Francisco, USA</i>. ; 2016.'
  apa: Chinaev, A., &#38; Haeb-Umbach, R. (2016). A Priori SNR Estimation Using a
    Generalized Decision Directed Approach. In <i>INTERSPEECH 2016, San Francisco,
    USA</i>.
  bibtex: '@inproceedings{Chinaev_Haeb-Umbach_2016, title={A Priori SNR Estimation
    Using a Generalized Decision Directed Approach}, booktitle={INTERSPEECH 2016,
    San Francisco, USA}, author={Chinaev, Aleksej and Haeb-Umbach, Reinhold}, year={2016}
    }'
  chicago: Chinaev, Aleksej, and Reinhold Haeb-Umbach. “A Priori SNR Estimation Using
    a Generalized Decision Directed Approach.” In <i>INTERSPEECH 2016, San Francisco,
    USA</i>, 2016.
  ieee: A. Chinaev and R. Haeb-Umbach, “A Priori SNR Estimation Using a Generalized
    Decision Directed Approach,” in <i>INTERSPEECH 2016, San Francisco, USA</i>, 2016.
  mla: Chinaev, Aleksej, and Reinhold Haeb-Umbach. “A Priori SNR Estimation Using
    a Generalized Decision Directed Approach.” <i>INTERSPEECH 2016, San Francisco,
    USA</i>, 2016.
  short: 'A. Chinaev, R. Haeb-Umbach, in: INTERSPEECH 2016, San Francisco, USA, 2016.'
date_created: 2019-07-12T05:27:18Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHa16.pdf
oa: '1'
publication: INTERSPEECH 2016, San Francisco, USA
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHa16_Poster.pdf
status: public
title: A Priori SNR Estimation Using a Generalized Decision Directed Approach
type: conference
user_id: '44006'
year: '2016'
...
---
_id: '11743'
abstract:
- lang: eng
  text: This contribution introduces a novel causal a priori signal-to-noise ratio
    (SNR) estimator for single-channel speech enhancement. To exploit the advantages
    of the generalized spectral subtraction, a normalized α-order magnitude (NAOM)
    domain is introduced where an a priori SNR estimation is carried out. In this
    domain, the NAOM coefficients of noise and clean speech signals are modeled by
    a Weibull distribution and a Weibull mixture model (WMM), respectively. While the
    parameters of the noise model are calculated from the noise power spectral density
    estimates, the speech WMM parameters are estimated from the noisy signal by applying
    a causal Expectation-Maximization algorithm. Further a maximum a posteriori estimate
    of the a priori SNR is developed. The experiments in different noisy environments
    show the superiority of the proposed estimator compared to the well-known decision-directed
    approach in terms of estimation error, estimator variance and speech quality of
    the enhanced signals when used for speech enhancement.
author:
- first_name: Aleksej
  full_name: Chinaev, Aleksej
  last_name: Chinaev
- first_name: Jens
  full_name: Heitkaemper, Jens
  id: '27643'
  last_name: Heitkaemper
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Chinaev A, Heitkaemper J, Haeb-Umbach R. A Priori SNR Estimation Using Weibull
    Mixture Model. In: <i>12. ITG Fachtagung Sprachkommunikation (ITG 2016)</i>. ;
    2016.'
  apa: Chinaev, A., Heitkaemper, J., &#38; Haeb-Umbach, R. (2016). A Priori SNR Estimation
    Using Weibull Mixture Model. In <i>12. ITG Fachtagung Sprachkommunikation (ITG
    2016)</i>.
  bibtex: '@inproceedings{Chinaev_Heitkaemper_Haeb-Umbach_2016, title={A Priori SNR
    Estimation Using Weibull Mixture Model}, booktitle={12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)}, author={Chinaev, Aleksej and Heitkaemper, Jens and Haeb-Umbach, Reinhold},
    year={2016} }'
  chicago: Chinaev, Aleksej, Jens Heitkaemper, and Reinhold Haeb-Umbach. “A Priori
    SNR Estimation Using Weibull Mixture Model.” In <i>12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)</i>, 2016.
  ieee: A. Chinaev, J. Heitkaemper, and R. Haeb-Umbach, “A Priori SNR Estimation Using
    Weibull Mixture Model,” in <i>12. ITG Fachtagung Sprachkommunikation (ITG 2016)</i>,
    2016.
  mla: Chinaev, Aleksej, et al. “A Priori SNR Estimation Using Weibull Mixture Model.”
    <i>12. ITG Fachtagung Sprachkommunikation (ITG 2016)</i>, 2016.
  short: 'A. Chinaev, J. Heitkaemper, R. Haeb-Umbach, in: 12. ITG Fachtagung Sprachkommunikation
    (ITG 2016), 2016.'
date_created: 2019-07-12T05:27:24Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHeiHa16.pdf
oa: '1'
publication: 12. ITG Fachtagung Sprachkommunikation (ITG 2016)
related_material:
  link:
  - description: Presentation
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHeiHa16_Presentation.pdf
status: public
title: A Priori SNR Estimation Using Weibull Mixture Model
type: conference
user_id: '44006'
year: '2016'
...
---
_id: '11744'
abstract:
- lang: eng
  text: Noise power spectral density (PSD) estimation is an indispensable component
    of speech spectral enhancement systems. In this paper we present a noise PSD tracking
    algorithm, which employs a noise presence probability estimate delivered by a
    deep neural network (DNN). The algorithm provides a causal noise PSD estimate
    and can thus be used in speech enhancement systems for communication purposes.
    An extensive performance comparison has been carried out with ten causal state-of-the-art
    noise tracking algorithms taken from the literature and categorized according to the applied
    techniques. The experiments showed that the proposed DNN-based noise PSD tracker
    outperforms all competing methods with respect to all tested performance measures,
    which include the noise tracking performance and the performance of a speech enhancement
    system employing the noise tracking component.
author:
- first_name: Aleksej
  full_name: Chinaev, Aleksej
  last_name: Chinaev
- first_name: Jahn
  full_name: Heymann, Jahn
  id: '9168'
  last_name: Heymann
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Chinaev A, Heymann J, Drude L, Haeb-Umbach R. Noise-Presence-Probability-Based
    Noise PSD Estimation by Using DNNs. In: <i>12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)</i>. ; 2016.'
  apa: Chinaev, A., Heymann, J., Drude, L., &#38; Haeb-Umbach, R. (2016). Noise-Presence-Probability-Based
    Noise PSD Estimation by Using DNNs. In <i>12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)</i>.
  bibtex: '@inproceedings{Chinaev_Heymann_Drude_Haeb-Umbach_2016, title={Noise-Presence-Probability-Based
    Noise PSD Estimation by Using DNNs}, booktitle={12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)}, author={Chinaev, Aleksej and Heymann, Jahn and Drude, Lukas and Haeb-Umbach,
    Reinhold}, year={2016} }'
  chicago: Chinaev, Aleksej, Jahn Heymann, Lukas Drude, and Reinhold Haeb-Umbach.
    “Noise-Presence-Probability-Based Noise PSD Estimation by Using DNNs.” In <i>12.
    ITG Fachtagung Sprachkommunikation (ITG 2016)</i>, 2016.
  ieee: A. Chinaev, J. Heymann, L. Drude, and R. Haeb-Umbach, “Noise-Presence-Probability-Based
    Noise PSD Estimation by Using DNNs,” in <i>12. ITG Fachtagung Sprachkommunikation
    (ITG 2016)</i>, 2016.
  mla: Chinaev, Aleksej, et al. “Noise-Presence-Probability-Based Noise PSD Estimation
    by Using DNNs.” <i>12. ITG Fachtagung Sprachkommunikation (ITG 2016)</i>, 2016.
  short: 'A. Chinaev, J. Heymann, L. Drude, R. Haeb-Umbach, in: 12. ITG Fachtagung
    Sprachkommunikation (ITG 2016), 2016.'
date_created: 2019-07-12T05:27:25Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHeyDrHa16.pdf
oa: '1'
publication: 12. ITG Fachtagung Sprachkommunikation (ITG 2016)
related_material:
  link:
  - description: Presentation
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2016/ChHeyDrHa16_Presentation.pdf
status: public
title: Noise-Presence-Probability-Based Noise PSD Estimation by Using DNNs
type: conference
user_id: '44006'
year: '2016'
...
---
_id: '11751'
author:
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Christoph
  full_name: Boeddeker, Christoph
  id: '40767'
  last_name: Boeddeker
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Drude L, Boeddeker C, Haeb-Umbach R. Blind Speech Separation based on Complex
    Spherical k-Mode Clustering. In: <i>Proc. IEEE Intl. Conf. on Acoustics, Speech
    and Signal Processing (ICASSP)</i>. ; 2016.'
  apa: Drude, L., Boeddeker, C., &#38; Haeb-Umbach, R. (2016). Blind Speech Separation
    based on Complex Spherical k-Mode Clustering. In <i>Proc. IEEE Intl. Conf. on
    Acoustics, Speech and Signal Processing (ICASSP)</i>.
  bibtex: '@inproceedings{Drude_Boeddeker_Haeb-Umbach_2016, title={Blind Speech Separation
    based on Complex Spherical k-Mode Clustering}, booktitle={Proc. IEEE Intl. Conf.
    on Acoustics, Speech and Signal Processing (ICASSP)}, author={Drude, Lukas and
    Boeddeker, Christoph and Haeb-Umbach, Reinhold}, year={2016} }'
  chicago: Drude, Lukas, Christoph Boeddeker, and Reinhold Haeb-Umbach. “Blind Speech
    Separation Based on Complex Spherical K-Mode Clustering.” In <i>Proc. IEEE Intl.
    Conf. on Acoustics, Speech and Signal Processing (ICASSP)</i>, 2016.
  ieee: L. Drude, C. Boeddeker, and R. Haeb-Umbach, “Blind Speech Separation based
    on Complex Spherical k-Mode Clustering,” in <i>Proc. IEEE Intl. Conf. on Acoustics,
    Speech and Signal Processing (ICASSP)</i>, 2016.
  mla: Drude, Lukas, et al. “Blind Speech Separation Based on Complex Spherical K-Mode
    Clustering.” <i>Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing
    (ICASSP)</i>, 2016.
  short: 'L. Drude, C. Boeddeker, R. Haeb-Umbach, in: Proc. IEEE Intl. Conf. on Acoustics,
    Speech and Signal Processing (ICASSP), 2016.'
date_created: 2019-07-12T05:27:33Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2016/icassp_2016_drude_paper.pdf
oa: '1'
publication: Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP)
related_material:
  link:
  - description: Slides
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2016/icassp_2016_drude_slides.pdf
status: public
title: Blind Speech Separation based on Complex Spherical k-Mode Clustering
type: conference
user_id: '44006'
year: '2016'
...
---
_id: '11756'
abstract:
- lang: eng
  text: Although complex-valued neural networks (CVNNs), networks which can operate
    with complex arithmetic, have been around for a while, they have not been given
    reconsideration since the breakthrough of deep network architectures. This paper
    presents a critical assessment whether the novel tool set of deep neural networks
    (DNNs) should be extended to complex-valued arithmetic. Indeed, with DNNs making
    inroads in speech enhancement tasks, the use of complex-valued input data, specifically
    the short-time Fourier transform coefficients, is an obvious consideration. In
    particular when it comes to performing tasks that heavily rely on phase information,
    such as acoustic beamforming, complex-valued algorithms are omnipresent. In this
    contribution we recapitulate backpropagation in CVNNs, develop complex-valued
    network elements, such as the split-rectified non-linearity, and compare real-
    and complex-valued networks on a beamforming task. We find that CVNNs hardly provide
    a performance gain and conclude that the effort of developing the complex-valued
    counterparts of the building blocks of modern deep or recurrent neural networks
    can hardly be justified.
author:
- first_name: Lukas
  full_name: Drude, Lukas
  id: '11213'
  last_name: Drude
- first_name: Bhiksha
  full_name: Raj, Bhiksha
  last_name: Raj
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Drude L, Raj B, Haeb-Umbach R. On the appropriateness of complex-valued neural
    networks for speech enhancement. In: <i>INTERSPEECH 2016, San Francisco, USA</i>.
    ; 2016.'
  apa: Drude, L., Raj, B., &#38; Haeb-Umbach, R. (2016). On the appropriateness of
    complex-valued neural networks for speech enhancement. In <i>INTERSPEECH 2016,
    San Francisco, USA</i>.
  bibtex: '@inproceedings{Drude_Raj_Haeb-Umbach_2016, title={On the appropriateness
    of complex-valued neural networks for speech enhancement}, booktitle={INTERSPEECH
    2016, San Francisco, USA}, author={Drude, Lukas and Raj, Bhiksha and Haeb-Umbach,
    Reinhold}, year={2016} }'
  chicago: Drude, Lukas, Bhiksha Raj, and Reinhold Haeb-Umbach. “On the Appropriateness
    of Complex-Valued Neural Networks for Speech Enhancement.” In <i>INTERSPEECH 2016,
    San Francisco, USA</i>, 2016.
  ieee: L. Drude, B. Raj, and R. Haeb-Umbach, “On the appropriateness of complex-valued
    neural networks for speech enhancement,” in <i>INTERSPEECH 2016, San Francisco,
    USA</i>, 2016.
  mla: Drude, Lukas, et al. “On the Appropriateness of Complex-Valued Neural Networks
    for Speech Enhancement.” <i>INTERSPEECH 2016, San Francisco, USA</i>, 2016.
  short: 'L. Drude, B. Raj, R. Haeb-Umbach, in: INTERSPEECH 2016, San Francisco, USA,
    2016.'
date_created: 2019-07-12T05:27:39Z
date_updated: 2022-01-06T06:51:08Z
department:
- _id: '54'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://groups.uni-paderborn.de/nt/pubs/2016/interspeech_2016_drude_paper.pdf
oa: '1'
publication: INTERSPEECH 2016, San Francisco, USA
related_material:
  link:
  - description: Poster
    relation: supplementary_material
    url: https://groups.uni-paderborn.de/nt/pubs/2016/interspeech_2016_drude_slides.pdf
status: public
title: On the appropriateness of complex-valued neural networks for speech enhancement
type: conference
user_id: '44006'
year: '2016'
...
