---
_id: '59999'
author:
- first_name: Frederik
  full_name: Rautenberg, Frederik
  id: '72602'
  last_name: Rautenberg
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Jana
  full_name: Wiechmann, Jana
  last_name: Wiechmann
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Rautenberg F, Kuhlmann M, Seebauer F, Wiechmann J, Wagner P, Haeb-Umbach R.
    Speech Synthesis along Perceptual Voice Quality Dimensions. In: <i>ICASSP 2025
    - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing
    (ICASSP)</i>. IEEE; 2025. doi:<a href="https://doi.org/10.1109/icassp49660.2025.10888012">10.1109/icassp49660.2025.10888012</a>'
  apa: Rautenberg, F., Kuhlmann, M., Seebauer, F., Wiechmann, J., Wagner, P., &#38;
    Haeb-Umbach, R. (2025). Speech Synthesis along Perceptual Voice Quality Dimensions.
    <i>ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal
    Processing (ICASSP)</i>. IEEE International Conference on Acoustics, Speech and
    Signal Processing (ICASSP), Hyderabad, India. <a href="https://doi.org/10.1109/icassp49660.2025.10888012">https://doi.org/10.1109/icassp49660.2025.10888012</a>
  bibtex: '@inproceedings{Rautenberg_Kuhlmann_Seebauer_Wiechmann_Wagner_Haeb-Umbach_2025,
    title={Speech Synthesis along Perceptual Voice Quality Dimensions}, DOI={<a href="https://doi.org/10.1109/icassp49660.2025.10888012">10.1109/icassp49660.2025.10888012</a>},
    booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech
    and Signal Processing (ICASSP)}, publisher={IEEE}, author={Rautenberg, Frederik
    and Kuhlmann, Michael and Seebauer, Fritz and Wiechmann, Jana and Wagner, Petra
    and Haeb-Umbach, Reinhold}, year={2025} }'
  chicago: Rautenberg, Frederik, Michael Kuhlmann, Fritz Seebauer, Jana Wiechmann,
    Petra Wagner, and Reinhold Haeb-Umbach. “Speech Synthesis along Perceptual Voice
    Quality Dimensions.” In <i>ICASSP 2025 - 2025 IEEE International Conference on
    Acoustics, Speech and Signal Processing (ICASSP)</i>. IEEE, 2025. <a href="https://doi.org/10.1109/icassp49660.2025.10888012">https://doi.org/10.1109/icassp49660.2025.10888012</a>.
  ieee: 'F. Rautenberg, M. Kuhlmann, F. Seebauer, J. Wiechmann, P. Wagner, and R.
    Haeb-Umbach, “Speech Synthesis along Perceptual Voice Quality Dimensions,” presented
    at the IEEE International Conference on Acoustics, Speech and Signal Processing
    (ICASSP), Hyderabad, India, 2025, doi: <a href="https://doi.org/10.1109/icassp49660.2025.10888012">10.1109/icassp49660.2025.10888012</a>.'
  mla: Rautenberg, Frederik, et al. “Speech Synthesis along Perceptual Voice Quality
    Dimensions.” <i>ICASSP 2025 - 2025 IEEE International Conference on Acoustics,
    Speech and Signal Processing (ICASSP)</i>, IEEE, 2025, doi:<a href="https://doi.org/10.1109/icassp49660.2025.10888012">10.1109/icassp49660.2025.10888012</a>.
  short: 'F. Rautenberg, M. Kuhlmann, F. Seebauer, J. Wiechmann, P. Wagner, R. Haeb-Umbach,
    in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and
    Signal Processing (ICASSP), IEEE, 2025.'
conference:
  end_date: 2025-04-11
  location: 'Hyderabad, India'
  name: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  start_date: 2025-04-06
date_created: 2025-05-20T08:17:22Z
date_updated: 2025-05-26T11:09:56Z
department:
- _id: '54'
- _id: '660'
doi: 10.1109/icassp49660.2025.10888012
language:
- iso: eng
project:
- _id: '129'
  grant_number: '438445824'
  name: 'TRR 318 - C06: TRR 318 - Technisch unterstütztes Erklären von Stimmcharakteristika
    (Teilprojekt C06)'
publication: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech
  and Signal Processing (ICASSP)
publication_status: published
publisher: IEEE
status: public
title: Speech Synthesis along Perceptual Voice Quality Dimensions
type: conference
user_id: '72602'
year: '2025'
...
---
_id: '61047'
author:
- first_name: Frederik
  full_name: Rautenberg, Frederik
  id: '72602'
  last_name: Rautenberg
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Jana
  full_name: Wiechmann, Jana
  last_name: Wiechmann
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Rautenberg F, Seebauer F, Wiechmann J, Kuhlmann M, Wagner P, Haeb-Umbach R.
    Synthesizing Speech with Selected Perceptual Voice Qualities – A Case Study with
    Creaky Voice. In: <i>Interspeech 2025</i>. ISCA; 2025. doi:<a href="https://doi.org/10.21437/Interspeech.2025-1443">10.21437/Interspeech.2025-1443</a>'
  apa: Rautenberg, F., Seebauer, F., Wiechmann, J., Kuhlmann, M., Wagner, P., &#38;
    Haeb-Umbach, R. (2025). Synthesizing Speech with Selected Perceptual Voice Qualities
    – A Case Study with Creaky Voice. <i>Interspeech 2025</i>. Interspeech, Rotterdam.
    <a href="https://doi.org/10.21437/Interspeech.2025-1443">https://doi.org/10.21437/Interspeech.2025-1443</a>
  bibtex: '@inproceedings{Rautenberg_Seebauer_Wiechmann_Kuhlmann_Wagner_Haeb-Umbach_2025,
    title={Synthesizing Speech with Selected Perceptual Voice Qualities – A Case Study
    with Creaky Voice}, DOI={<a href="https://doi.org/10.21437/Interspeech.2025-1443">10.21437/Interspeech.2025-1443</a>},
    booktitle={Interspeech 2025}, publisher={ISCA}, author={Rautenberg, Frederik and
    Seebauer, Fritz and Wiechmann, Jana and Kuhlmann, Michael and Wagner, Petra and
    Haeb-Umbach, Reinhold}, year={2025} }'
  chicago: Rautenberg, Frederik, Fritz Seebauer, Jana Wiechmann, Michael Kuhlmann,
    Petra Wagner, and Reinhold Haeb-Umbach. “Synthesizing Speech with Selected Perceptual
    Voice Qualities – A Case Study with Creaky Voice.” In <i>Interspeech 2025</i>.
    ISCA, 2025. <a href="https://doi.org/10.21437/Interspeech.2025-1443">https://doi.org/10.21437/Interspeech.2025-1443</a>.
  ieee: 'F. Rautenberg, F. Seebauer, J. Wiechmann, M. Kuhlmann, P. Wagner, and R.
    Haeb-Umbach, “Synthesizing Speech with Selected Perceptual Voice Qualities – A
    Case Study with Creaky Voice,” presented at the Interspeech, Rotterdam, 2025,
    doi: <a href="https://doi.org/10.21437/Interspeech.2025-1443">10.21437/Interspeech.2025-1443</a>.'
  mla: Rautenberg, Frederik, et al. “Synthesizing Speech with Selected Perceptual
    Voice Qualities – A Case Study with Creaky Voice.” <i>Interspeech 2025</i>, ISCA,
    2025, doi:<a href="https://doi.org/10.21437/Interspeech.2025-1443">10.21437/Interspeech.2025-1443</a>.
  short: 'F. Rautenberg, F. Seebauer, J. Wiechmann, M. Kuhlmann, P. Wagner, R. Haeb-Umbach,
    in: Interspeech 2025, ISCA, 2025.'
conference:
  end_date: 2025-08-21
  location: Rotterdam
  name: Interspeech
  start_date: 2025-08-17
date_created: 2025-08-28T08:39:01Z
date_updated: 2025-08-28T08:56:49Z
department:
- _id: '54'
- _id: '660'
doi: 10.21437/Interspeech.2025-1443
language:
- iso: eng
project:
- _id: '129'
  name: 'TRR 318; TP C06: Technisch unterstütztes Erklären von Stimmcharakteristika'
publication: Interspeech 2025
publisher: ISCA
status: public
title: Synthesizing Speech with Selected Perceptual Voice Qualities – A Case Study
  with Creaky Voice
type: conference
user_id: '72602'
year: '2025'
...
---
_id: '62164'
author:
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Petra
  full_name: Wagner, Petra
  id: '74505'
  last_name: Wagner
- first_name: Reinhold
  full_name: Häb-Umbach, Reinhold
  id: '242'
  last_name: Häb-Umbach
citation:
  ama: 'Kuhlmann M, Seebauer F, Wagner P, Häb-Umbach R. Towards Frame-level Quality
    Predictions of Synthetic Speech. In: <i>Interspeech 2025</i>. ISCA; 2025. doi:<a
    href="https://doi.org/10.21437/interspeech.2025-2190">10.21437/interspeech.2025-2190</a>'
  apa: Kuhlmann, M., Seebauer, F., Wagner, P., &#38; Häb-Umbach, R. (2025). Towards
    Frame-level Quality Predictions of Synthetic Speech. <i>Interspeech 2025</i>.
    <a href="https://doi.org/10.21437/interspeech.2025-2190">https://doi.org/10.21437/interspeech.2025-2190</a>
  bibtex: '@inproceedings{Kuhlmann_Seebauer_Wagner_Häb-Umbach_2025, title={Towards
    Frame-level Quality Predictions of Synthetic Speech}, DOI={<a href="https://doi.org/10.21437/interspeech.2025-2190">10.21437/interspeech.2025-2190</a>},
    booktitle={Interspeech 2025}, publisher={ISCA}, author={Kuhlmann, Michael and
    Seebauer, Fritz and Wagner, Petra and Häb-Umbach, Reinhold}, year={2025} }'
  chicago: Kuhlmann, Michael, Fritz Seebauer, Petra Wagner, and Reinhold Häb-Umbach.
    “Towards Frame-Level Quality Predictions of Synthetic Speech.” In <i>Interspeech
    2025</i>. ISCA, 2025. <a href="https://doi.org/10.21437/interspeech.2025-2190">https://doi.org/10.21437/interspeech.2025-2190</a>.
  ieee: 'M. Kuhlmann, F. Seebauer, P. Wagner, and R. Häb-Umbach, “Towards Frame-level
    Quality Predictions of Synthetic Speech,” 2025, doi: <a href="https://doi.org/10.21437/interspeech.2025-2190">10.21437/interspeech.2025-2190</a>.'
  mla: Kuhlmann, Michael, et al. “Towards Frame-Level Quality Predictions of Synthetic
    Speech.” <i>Interspeech 2025</i>, ISCA, 2025, doi:<a href="https://doi.org/10.21437/interspeech.2025-2190">10.21437/interspeech.2025-2190</a>.
  short: 'M. Kuhlmann, F. Seebauer, P. Wagner, R. Häb-Umbach, in: Interspeech 2025,
    ISCA, 2025.'
date_created: 2025-11-11T11:43:20Z
date_updated: 2025-11-11T11:45:12Z
department:
- _id: '54'
doi: 10.21437/interspeech.2025-2190
language:
- iso: eng
publication: Interspeech 2025
publication_status: published
publisher: ISCA
status: public
title: Towards Frame-level Quality Predictions of Synthetic Speech
type: conference
user_id: '49871'
year: '2025'
...
---
_id: '57099'
author:
- first_name: Yuying
  full_name: Xie, Yuying
  last_name: Xie
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Frederik
  full_name: Rautenberg, Frederik
  id: '72602'
  last_name: Rautenberg
- first_name: Zheng-Hua
  full_name: Tan, Zheng-Hua
  last_name: Tan
- first_name: Reinhold
  full_name: Häb-Umbach, Reinhold
  id: '242'
  last_name: Häb-Umbach
citation:
  ama: 'Xie Y, Kuhlmann M, Rautenberg F, Tan Z-H, Häb-Umbach R. Speaker and Style
    Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized
    Variational Autoencoder. In: <i>2024 32nd European Signal Processing Conference
    (EUSIPCO)</i>. ; 2024:436–440.'
  apa: Xie, Y., Kuhlmann, M., Rautenberg, F., Tan, Z.-H., &#38; Häb-Umbach, R. (2024).
    Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding
    Supported Factorized Variational Autoencoder. <i>2024 32nd European Signal Processing
    Conference (EUSIPCO)</i>, 436–440.
  bibtex: '@inproceedings{Xie_Kuhlmann_Rautenberg_Tan_Häb-Umbach_2024, title={Speaker
    and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported
    Factorized Variational Autoencoder}, booktitle={2024 32nd European Signal Processing
    Conference (EUSIPCO)}, author={Xie, Yuying and Kuhlmann, Michael and Rautenberg,
    Frederik and Tan, Zheng-Hua and Häb-Umbach, Reinhold}, year={2024}, pages={436–440}
    }'
  chicago: Xie, Yuying, Michael Kuhlmann, Frederik Rautenberg, Zheng-Hua Tan, and
    Reinhold Häb-Umbach. “Speaker and Style Disentanglement of Speech Based on Contrastive
    Predictive Coding Supported Factorized Variational Autoencoder.” In <i>2024 32nd
    European Signal Processing Conference (EUSIPCO)</i>, 436–440, 2024.
  ieee: Y. Xie, M. Kuhlmann, F. Rautenberg, Z.-H. Tan, and R. Häb-Umbach, “Speaker
    and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported
    Factorized Variational Autoencoder,” in <i>2024 32nd European Signal Processing
    Conference (EUSIPCO)</i>, 2024, pp. 436–440.
  mla: Xie, Yuying, et al. “Speaker and Style Disentanglement of Speech Based on Contrastive
    Predictive Coding Supported Factorized Variational Autoencoder.” <i>2024 32nd
    European Signal Processing Conference (EUSIPCO)</i>, 2024, pp. 436–440.
  short: 'Y. Xie, M. Kuhlmann, F. Rautenberg, Z.-H. Tan, R. Häb-Umbach, in: 2024 32nd
    European Signal Processing Conference (EUSIPCO), 2024, pp. 436–440.'
date_created: 2024-11-15T06:52:54Z
date_updated: 2024-11-15T06:54:40Z
department:
- _id: '54'
language:
- iso: eng
page: 436–440
publication: 2024 32nd European Signal Processing Conference (EUSIPCO)
status: public
title: Speaker and Style Disentanglement of Speech Based on Contrastive Predictive
  Coding Supported Factorized Variational Autoencoder
type: conference
user_id: '49871'
year: '2024'
...
---
_id: '48355'
abstract:
- lang: eng
  text: "Unsupervised speech disentanglement aims at separating fast varying from
    slowly varying components of a speech signal. In this contribution, we take a
    closer look at the embedding vector representing the slowly varying signal
    components, commonly named the speaker embedding vector. We ask, which
    properties of a speaker's voice are captured and investigate to which extent do
    individual embedding vector components sign responsible for them, using the
    concept of Shapley values. Our findings show that certain speaker-specific
    acoustic-phonetic properties can be fairly well predicted from the speaker
    embedding, while the investigated more abstract voice quality features cannot."
author:
- first_name: Frederik
  full_name: Rautenberg, Frederik
  id: '72602'
  last_name: Rautenberg
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Jana
  full_name: Wiechmann, Jana
  last_name: Wiechmann
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Rautenberg F, Kuhlmann M, Wiechmann J, Seebauer F, Wagner P, Haeb-Umbach R.
    On Feature Importance and Interpretability of Speaker Representations. In: <i>ITG
    Conference on Speech Communication</i>. ; 2023.'
  apa: Rautenberg, F., Kuhlmann, M., Wiechmann, J., Seebauer, F., Wagner, P., &#38;
    Haeb-Umbach, R. (2023). On Feature Importance and Interpretability of Speaker
    Representations. <i>ITG Conference on Speech Communication</i>. ITG Conference
    on Speech Communication, Aachen.
  bibtex: '@inproceedings{Rautenberg_Kuhlmann_Wiechmann_Seebauer_Wagner_Haeb-Umbach_2023,
    title={On Feature Importance and Interpretability of Speaker Representations},
    booktitle={ITG Conference on Speech Communication}, author={Rautenberg, Frederik
    and Kuhlmann, Michael and Wiechmann, Jana and Seebauer, Fritz and Wagner, Petra
    and Haeb-Umbach, Reinhold}, year={2023} }'
  chicago: Rautenberg, Frederik, Michael Kuhlmann, Jana Wiechmann, Fritz Seebauer,
    Petra Wagner, and Reinhold Haeb-Umbach. “On Feature Importance and Interpretability
    of Speaker Representations.” In <i>ITG Conference on Speech Communication</i>,
    2023.
  ieee: F. Rautenberg, M. Kuhlmann, J. Wiechmann, F. Seebauer, P. Wagner, and R. Haeb-Umbach,
    “On Feature Importance and Interpretability of Speaker Representations,” presented
    at the ITG Conference on Speech Communication, Aachen, 2023.
  mla: Rautenberg, Frederik, et al. “On Feature Importance and Interpretability of
    Speaker Representations.” <i>ITG Conference on Speech Communication</i>, 2023.
  short: 'F. Rautenberg, M. Kuhlmann, J. Wiechmann, F. Seebauer, P. Wagner, R. Haeb-Umbach,
    in: ITG Conference on Speech Communication, 2023.'
conference:
  end_date: 2023-09-22
  location: Aachen
  name: ITG Conference on Speech Communication
  start_date: 2023-09-20
date_created: 2023-10-20T08:04:46Z
date_updated: 2023-11-22T13:44:33Z
ddc:
- '000'
department:
- _id: '54'
- _id: '660'
external_id:
  arxiv:
  - '2310.12599'
file:
- access_level: closed
  content_type: application/pdf
  creator: frra
  date_created: 2023-10-20T08:20:58Z
  date_updated: 2023-10-20T08:20:58Z
  file_id: '48359'
  file_name: arxiv.pdf
  file_size: 272390
  relation: main_file
  success: 1
file_date_updated: 2023-10-20T08:20:58Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2310.12599
oa: '1'
project:
- _id: '129'
  grant_number: '438445824'
  name: 'TRR 318 - C06: TRR 318 - Technisch unterstütztes Erklären von Stimmcharakteristika
    (Teilprojekt C06)'
publication: ITG Conference on Speech Communication
status: public
title: On Feature Importance and Interpretability of Speaker Representations
type: conference
user_id: '72602'
year: '2023'
...
---
_id: '46069'
author:
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
citation:
  ama: 'Seebauer F, Kuhlmann M, Haeb-Umbach R, Wagner P. Re-examining the quality
    dimensions of synthetic speech. In: <i>12th Speech Synthesis Workshop (SSW) 2023</i>.
    ; 2023.'
  apa: Seebauer, F., Kuhlmann, M., Haeb-Umbach, R., &#38; Wagner, P. (2023). Re-examining
    the quality dimensions of synthetic speech. <i>12th Speech Synthesis Workshop
    (SSW) 2023</i>.
  bibtex: '@inproceedings{Seebauer_Kuhlmann_Haeb-Umbach_Wagner_2023, title={Re-examining
    the quality dimensions of synthetic speech}, booktitle={12th Speech Synthesis
    Workshop (SSW) 2023}, author={Seebauer, Fritz and Kuhlmann, Michael and Haeb-Umbach,
    Reinhold and Wagner, Petra}, year={2023} }'
  chicago: Seebauer, Fritz, Michael Kuhlmann, Reinhold Haeb-Umbach, and Petra Wagner.
    “Re-Examining the Quality Dimensions of Synthetic Speech.” In <i>12th Speech Synthesis
    Workshop (SSW) 2023</i>, 2023.
  ieee: F. Seebauer, M. Kuhlmann, R. Haeb-Umbach, and P. Wagner, “Re-examining the
    quality dimensions of synthetic speech,” 2023.
  mla: Seebauer, Fritz, et al. “Re-Examining the Quality Dimensions of Synthetic Speech.”
    <i>12th Speech Synthesis Workshop (SSW) 2023</i>, 2023.
  short: 'F. Seebauer, M. Kuhlmann, R. Haeb-Umbach, P. Wagner, in: 12th Speech Synthesis
    Workshop (SSW) 2023, 2023.'
date_created: 2023-07-15T16:10:20Z
date_updated: 2023-10-25T08:42:56Z
department:
- _id: '54'
has_accepted_license: '1'
language:
- iso: eng
project:
- _id: '129'
  grant_number: '438445824'
  name: 'TRR 318 - C06: TRR 318 - Technisch unterstütztes Erklären von Stimmcharakteristika
    (Teilprojekt C06)'
publication: 12th Speech Synthesis Workshop (SSW) 2023
status: public
title: Re-examining the quality dimensions of synthetic speech
type: conference
user_id: '242'
year: '2023'
...
---
_id: '44849'
author:
- first_name: Frederik
  full_name: Rautenberg, Frederik
  id: '72602'
  last_name: Rautenberg
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Janek
  full_name: Ebbers, Janek
  id: '34851'
  last_name: Ebbers
- first_name: Jana
  full_name: Wiechmann, Jana
  last_name: Wiechmann
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Rautenberg F, Kuhlmann M, Ebbers J, et al. Speech Disentanglement for Analysis
    and Modification of Acoustic and Perceptual Speaker Characteristics. In: <i>Fortschritte
    Der Akustik - DAGA 2023</i>. ; 2023:1409-1412.'
  apa: Rautenberg, F., Kuhlmann, M., Ebbers, J., Wiechmann, J., Seebauer, F., Wagner,
    P., &#38; Haeb-Umbach, R. (2023). Speech Disentanglement for Analysis and Modification
    of Acoustic and Perceptual Speaker Characteristics. <i>Fortschritte Der Akustik
    - DAGA 2023</i>, 1409–1412.
  bibtex: '@inproceedings{Rautenberg_Kuhlmann_Ebbers_Wiechmann_Seebauer_Wagner_Haeb-Umbach_2023,
    title={Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual
    Speaker Characteristics}, booktitle={Fortschritte der Akustik - DAGA 2023}, author={Rautenberg,
    Frederik and Kuhlmann, Michael and Ebbers, Janek and Wiechmann, Jana and Seebauer,
    Fritz and Wagner, Petra and Haeb-Umbach, Reinhold}, year={2023}, pages={1409–1412}
    }'
  chicago: Rautenberg, Frederik, Michael Kuhlmann, Janek Ebbers, Jana Wiechmann, Fritz
    Seebauer, Petra Wagner, and Reinhold Haeb-Umbach. “Speech Disentanglement for
    Analysis and Modification of Acoustic and Perceptual Speaker Characteristics.”
    In <i>Fortschritte Der Akustik - DAGA 2023</i>, 1409–12, 2023.
  ieee: F. Rautenberg <i>et al.</i>, “Speech Disentanglement for Analysis and Modification
    of Acoustic and Perceptual Speaker Characteristics,” in <i>Fortschritte der Akustik
    - DAGA 2023</i>, Hamburg, 2023, pp. 1409–1412.
  mla: Rautenberg, Frederik, et al. “Speech Disentanglement for Analysis and Modification
    of Acoustic and Perceptual Speaker Characteristics.” <i>Fortschritte Der Akustik
    - DAGA 2023</i>, 2023, pp. 1409–12.
  short: 'F. Rautenberg, M. Kuhlmann, J. Ebbers, J. Wiechmann, F. Seebauer, P. Wagner,
    R. Haeb-Umbach, in: Fortschritte Der Akustik - DAGA 2023, 2023, pp. 1409–1412.'
conference:
  end_date: 2023-03-09
  location: Hamburg
  name: DAGA 2023 - 49. Jahrestagung für Akustik
  start_date: 2023-03-06
date_created: 2023-05-15T08:48:54Z
date_updated: 2024-02-29T17:05:16Z
ddc:
- '000'
department:
- _id: '54'
- _id: '660'
file:
- access_level: open_access
  content_type: application/pdf
  creator: frra
  date_created: 2024-02-29T16:15:12Z
  date_updated: 2024-02-29T16:15:12Z
  file_id: '52221'
  file_name: Daga_2023_Rautenberg_Paper.pdf
  file_size: 289493
  relation: main_file
file_date_updated: 2024-02-29T16:15:12Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://pub.dega-akustik.de/DAGA_2023/data/articles/000105.pdf
oa: '1'
page: 1409-1412
project:
- _id: '129'
  grant_number: '438445824'
  name: 'TRR 318 - C06: TRR 318 - Technisch unterstütztes Erklären von Stimmcharakteristika
    (Teilprojekt C06)'
publication: Fortschritte der Akustik - DAGA 2023
publication_status: published
status: public
title: Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual
  Speaker Characteristics
type: conference
user_id: '72602'
year: '2023'
...
---
_id: '57098'
author:
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Reinhold
  full_name: Häb-Umbach, Reinhold
  id: '242'
  last_name: Häb-Umbach
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
citation:
  ama: 'Seebauer F, Kuhlmann M, Häb-Umbach R, Wagner P. DISCERNING DIMENSIONS OF QUALITY
    FOR STATE OF THE ART SYNTHETIC SPEECH. In: <i>Proceedings of the 20th International
    Congress of Phonetic Sciences</i>. ; 2023.'
  apa: Seebauer, F., Kuhlmann, M., Häb-Umbach, R., &#38; Wagner, P. (2023). DISCERNING
    DIMENSIONS OF QUALITY FOR STATE OF THE ART SYNTHETIC SPEECH. <i>Proceedings of
    the 20th International Congress of Phonetic Sciences</i>. International Congress
    of Phonetic Sciences (ICPhS), Prague.
  bibtex: '@inproceedings{Seebauer_Kuhlmann_Häb-Umbach_Wagner_2023, title={DISCERNING
    DIMENSIONS OF QUALITY FOR STATE OF THE ART SYNTHETIC SPEECH}, booktitle={Proceedings
    of the 20th International Congress of Phonetic Sciences}, author={Seebauer, Fritz
    and Kuhlmann, Michael and Häb-Umbach, Reinhold and Wagner, Petra}, year={2023}
    }'
  chicago: Seebauer, Fritz, Michael Kuhlmann, Reinhold Häb-Umbach, and Petra Wagner.
    “DISCERNING DIMENSIONS OF QUALITY FOR STATE OF THE ART SYNTHETIC SPEECH.” In <i>Proceedings
    of the 20th International Congress of Phonetic Sciences</i>, 2023.
  ieee: F. Seebauer, M. Kuhlmann, R. Häb-Umbach, and P. Wagner, “DISCERNING DIMENSIONS
    OF QUALITY FOR STATE OF THE ART SYNTHETIC SPEECH,” presented at the International
    Congress of Phonetic Sciences (ICPhS), Prague, 2023.
  mla: Seebauer, Fritz, et al. “DISCERNING DIMENSIONS OF QUALITY FOR STATE OF THE
    ART SYNTHETIC SPEECH.” <i>Proceedings of the 20th International Congress of Phonetic
    Sciences</i>, 2023.
  short: 'F. Seebauer, M. Kuhlmann, R. Häb-Umbach, P. Wagner, in: Proceedings of the
    20th International Congress of Phonetic Sciences, 2023.'
conference:
  end_date: 2023-08-11
  location: Prague
  name: International Congress of Phonetic Sciences (ICPhS)
  start_date: 2023-08-07
date_created: 2024-11-15T06:49:27Z
date_updated: 2024-11-15T06:54:55Z
department:
- _id: '54'
language:
- iso: eng
publication: Proceedings of the 20th International Congress of Phonetic Sciences
publication_identifier:
  isbn:
  - 978-80-908114-2-3
status: public
title: DISCERNING DIMENSIONS OF QUALITY FOR STATE OF THE ART SYNTHETIC SPEECH
type: conference
user_id: '49871'
year: '2023'
...
---
_id: '57086'
author:
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Adrian Tobias
  full_name: Meise, Adrian Tobias
  id: '79268'
  last_name: Meise
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Häb-Umbach, Reinhold
  id: '242'
  last_name: Häb-Umbach
citation:
  ama: 'Kuhlmann M, Meise AT, Seebauer F, Wagner P, Häb-Umbach R. Investigating Speaker
    Embedding Disentanglement on Natural Read Speech. In: <i>Speech Communication;
    15th ITG Conference</i>. ; 2023:121–125.'
  apa: Kuhlmann, M., Meise, A. T., Seebauer, F., Wagner, P., &#38; Häb-Umbach, R.
    (2023). Investigating Speaker Embedding Disentanglement on Natural Read Speech.
    <i>Speech Communication; 15th ITG Conference</i>, 121–125.
  bibtex: '@inproceedings{Kuhlmann_Meise_Seebauer_Wagner_Häb-Umbach_2023, title={Investigating
    Speaker Embedding Disentanglement on Natural Read Speech}, booktitle={Speech Communication;
    15th ITG Conference}, author={Kuhlmann, Michael and Meise, Adrian Tobias and Seebauer,
    Fritz and Wagner, Petra and Häb-Umbach, Reinhold}, year={2023}, pages={121–125}
    }'
  chicago: Kuhlmann, Michael, Adrian Tobias Meise, Fritz Seebauer, Petra Wagner, and
    Reinhold Häb-Umbach. “Investigating Speaker Embedding Disentanglement on Natural
    Read Speech.” In <i>Speech Communication; 15th ITG Conference</i>, 121–125, 2023.
  ieee: M. Kuhlmann, A. T. Meise, F. Seebauer, P. Wagner, and R. Häb-Umbach, “Investigating
    Speaker Embedding Disentanglement on Natural Read Speech,” in <i>Speech Communication;
    15th ITG Conference</i>, 2023, pp. 121–125.
  mla: Kuhlmann, Michael, et al. “Investigating Speaker Embedding Disentanglement
    on Natural Read Speech.” <i>Speech Communication; 15th ITG Conference</i>, 2023,
    pp. 121–125.
  short: 'M. Kuhlmann, A.T. Meise, F. Seebauer, P. Wagner, R. Häb-Umbach, in: Speech
    Communication; 15th ITG Conference, 2023, pp. 121–125.'
date_created: 2024-11-14T09:45:03Z
date_updated: 2026-01-05T10:12:23Z
department:
- _id: '54'
language:
- iso: eng
page: 121–125
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Speech Communication; 15th ITG Conference
status: public
title: Investigating Speaker Embedding Disentanglement on Natural Read Speech
type: conference
user_id: '49871'
year: '2023'
...
---
_id: '33857'
author:
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Fritz
  full_name: Seebauer, Fritz
  last_name: Seebauer
- first_name: Janek
  full_name: Ebbers, Janek
  id: '34851'
  last_name: Ebbers
- first_name: Petra
  full_name: Wagner, Petra
  last_name: Wagner
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Kuhlmann M, Seebauer F, Ebbers J, Wagner P, Haeb-Umbach R. Investigation into
    Target Speaking Rate Adaptation for Voice Conversion. In: <i>Interspeech 2022</i>.
    ISCA; 2022. doi:<a href="https://doi.org/10.21437/interspeech.2022-10740">10.21437/interspeech.2022-10740</a>'
  apa: Kuhlmann, M., Seebauer, F., Ebbers, J., Wagner, P., &#38; Haeb-Umbach, R. (2022).
    Investigation into Target Speaking Rate Adaptation for Voice Conversion. <i>Interspeech
    2022</i>. <a href="https://doi.org/10.21437/interspeech.2022-10740">https://doi.org/10.21437/interspeech.2022-10740</a>
  bibtex: '@inproceedings{Kuhlmann_Seebauer_Ebbers_Wagner_Haeb-Umbach_2022, title={Investigation
    into Target Speaking Rate Adaptation for Voice Conversion}, DOI={<a href="https://doi.org/10.21437/interspeech.2022-10740">10.21437/interspeech.2022-10740</a>},
    booktitle={Interspeech 2022}, publisher={ISCA}, author={Kuhlmann, Michael and
    Seebauer, Fritz and Ebbers, Janek and Wagner, Petra and Haeb-Umbach, Reinhold},
    year={2022} }'
  chicago: Kuhlmann, Michael, Fritz Seebauer, Janek Ebbers, Petra Wagner, and Reinhold
    Haeb-Umbach. “Investigation into Target Speaking Rate Adaptation for Voice Conversion.”
    In <i>Interspeech 2022</i>. ISCA, 2022. <a href="https://doi.org/10.21437/interspeech.2022-10740">https://doi.org/10.21437/interspeech.2022-10740</a>.
  ieee: 'M. Kuhlmann, F. Seebauer, J. Ebbers, P. Wagner, and R. Haeb-Umbach, “Investigation
    into Target Speaking Rate Adaptation for Voice Conversion,” 2022, doi: <a href="https://doi.org/10.21437/interspeech.2022-10740">10.21437/interspeech.2022-10740</a>.'
  mla: Kuhlmann, Michael, et al. “Investigation into Target Speaking Rate Adaptation
    for Voice Conversion.” <i>Interspeech 2022</i>, ISCA, 2022, doi:<a href="https://doi.org/10.21437/interspeech.2022-10740">10.21437/interspeech.2022-10740</a>.
  short: 'M. Kuhlmann, F. Seebauer, J. Ebbers, P. Wagner, R. Haeb-Umbach, in: Interspeech
    2022, ISCA, 2022.'
date_created: 2022-10-21T06:50:59Z
date_updated: 2023-10-25T09:04:45Z
ddc:
- '000'
department:
- _id: '54'
doi: 10.21437/interspeech.2022-10740
file:
- access_level: closed
  content_type: application/pdf
  creator: mikuhl
  date_created: 2023-07-15T16:16:12Z
  date_updated: 2023-07-15T16:16:12Z
  file_id: '46070'
  file_name: kuhlmann22_interspeech.pdf
  file_size: 303863
  relation: main_file
  success: 1
file_date_updated: 2023-07-15T16:16:12Z
has_accepted_license: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.isca-speech.org/archive/pdfs/interspeech_2022/kuhlmann22_interspeech.pdf
oa: '1'
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Interspeech 2022
publication_status: published
publisher: ISCA
quality_controlled: '1'
status: public
title: Investigation into Target Speaking Rate Adaptation for Voice Conversion
type: conference
user_id: '34851'
year: '2022'
...
---
_id: '29304'
abstract:
- lang: eng
  text: 'In this work we address disentanglement of style and content in speech signals.
    We propose a fully convolutional variational autoencoder employing two encoders:
    a content encoder and a style encoder. To foster disentanglement, we propose adversarial
    contrastive predictive coding. This new disentanglement method does neither need
    parallel data nor any supervision. We show that the proposed technique is capable
    of separating speaker and content traits into the two different representations
    and show competitive speaker-content disentanglement performance compared to other
    unsupervised approaches. We further demonstrate an increased robustness of the
    content representation against a train-test mismatch compared to spectral features,
    when used for phone recognition.'
author:
- first_name: Janek
  full_name: Ebbers, Janek
  id: '34851'
  last_name: Ebbers
- first_name: Michael
  full_name: Kuhlmann, Michael
  id: '49871'
  last_name: Kuhlmann
- first_name: Tobias
  full_name: Cord-Landwehr, Tobias
  id: '44393'
  last_name: Cord-Landwehr
- first_name: Reinhold
  full_name: Haeb-Umbach, Reinhold
  id: '242'
  last_name: Haeb-Umbach
citation:
  ama: 'Ebbers J, Kuhlmann M, Cord-Landwehr T, Haeb-Umbach R. Contrastive Predictive
    Coding Supported Factorized Variational Autoencoder for Unsupervised Learning
    of Disentangled Speech Representations. In: <i>Proceedings of the IEEE International
    Conference on Acoustics, Speech and Signal Processing (ICASSP)</i>. ; 2021:3860–3864.'
  apa: Ebbers, J., Kuhlmann, M., Cord-Landwehr, T., &#38; Haeb-Umbach, R. (2021).
    Contrastive Predictive Coding Supported Factorized Variational Autoencoder for
    Unsupervised Learning of Disentangled Speech Representations. <i>Proceedings of
    the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</i>,
    3860–3864.
  bibtex: '@inproceedings{Ebbers_Kuhlmann_Cord-Landwehr_Haeb-Umbach_2021, title={Contrastive
    Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised
    Learning of Disentangled Speech Representations}, booktitle={Proceedings of the
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    author={Ebbers, Janek and Kuhlmann, Michael and Cord-Landwehr, Tobias and Haeb-Umbach,
    Reinhold}, year={2021}, pages={3860–3864} }'
  chicago: Ebbers, Janek, Michael Kuhlmann, Tobias Cord-Landwehr, and Reinhold Haeb-Umbach.
    “Contrastive Predictive Coding Supported Factorized Variational Autoencoder for
    Unsupervised Learning of Disentangled Speech Representations.” In <i>Proceedings
    of the IEEE International Conference on Acoustics, Speech and Signal Processing
    (ICASSP)</i>, 3860–3864, 2021.
  ieee: J. Ebbers, M. Kuhlmann, T. Cord-Landwehr, and R. Haeb-Umbach, “Contrastive
    Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised
    Learning of Disentangled Speech Representations,” in <i>Proceedings of the IEEE
    International Conference on Acoustics, Speech and Signal Processing (ICASSP)</i>,
    2021, pp. 3860–3864.
  mla: Ebbers, Janek, et al. “Contrastive Predictive Coding Supported Factorized Variational
    Autoencoder for Unsupervised Learning of Disentangled Speech Representations.”
    <i>Proceedings of the IEEE International Conference on Acoustics, Speech and Signal
    Processing (ICASSP)</i>, 2021, pp. 3860–3864.
  short: 'J. Ebbers, M. Kuhlmann, T. Cord-Landwehr, R. Haeb-Umbach, in: Proceedings
    of the IEEE International Conference on Acoustics, Speech and Signal Processing
    (ICASSP), 2021, pp. 3860–3864.'
date_created: 2022-01-13T07:55:29Z
date_updated: 2023-11-22T08:29:42Z
ddc:
- '000'
department:
- _id: '54'
file:
- access_level: open_access
  content_type: application/pdf
  creator: ebbers
  date_created: 2022-01-13T07:56:30Z
  date_updated: 2022-01-13T08:19:19Z
  file_id: '29305'
  file_name: Template.pdf
  file_size: 236628
  relation: main_file
file_date_updated: 2022-01-13T08:19:19Z
has_accepted_license: '1'
language:
- iso: eng
oa: '1'
page: 3860–3864
project:
- _id: '52'
  name: 'PC2: Computing Resources Provided by the Paderborn Center for Parallel Computing'
publication: Proceedings of the IEEE International Conference on Acoustics, Speech
  and Signal Processing (ICASSP)
quality_controlled: '1'
status: public
title: Contrastive Predictive Coding Supported Factorized Variational Autoencoder
  for Unsupervised Learning of Disentangled Speech Representations
type: conference
user_id: '34851'
year: '2021'
...
