TRR 318 - Technically enabled explanation of speaker traits (Subproject C06) (Uni Paderborn)

Overview

A voice might be described as hoarse, clear, deep, or breathy. Researchers in Project C06 are investigating how different vocal traits sound and how a voice can be represented in all its facets. Linguists and computer scientists are working together to develop an intelligent system that professionals can use to explain the phenomenon of voice to general audiences. To do this, the artificial intelligence (AI) system generates speech samples in which the same content is spoken by different voices. The system may also help clinical linguists working in diagnostics, for example in identifying the vocal characteristics of Parkinson's disease. Demonstrating how the system measures differences between voices makes vocal modelling more transparent and comprehensible. The research team's goal is to find out whether people can better imitate and describe a voice with the help of the AI system.

Key Facts

Grant Number: 438445824

Project type: Research

Project duration: 07/2021 - 06/2025

Funded by: Deutsche Forschungsgemeinschaft (DFG)

Websites: Homepage
DFG database GEPRIS
Deep generative models for phonetics research

News

11.07.2023

Dissecting, understanding and manipulating the human voice

More Information

Principal Investigators

Prof. Dr. Reinhold Häb-Umbach

Communications Engineering / Heinz Nixdorf Institute

Petra Wagner

Universität Bielefeld

Cooperating Institutions

Universität Bielefeld

Publications

Speech Synthesis along Perceptual Voice Quality Dimensions

F. Rautenberg, M. Kuhlmann, F. Seebauer, J. Wiechmann, P. Wagner, R. Haeb-Umbach, in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2025.

Synthesizing Speech with Selected Perceptual Voice Qualities – A Case Study with Creaky Voice

F. Rautenberg, F. Seebauer, J. Wiechmann, M. Kuhlmann, P. Wagner, R. Haeb-Umbach, in: Interspeech 2025, ISCA, 2025.

Challenges and Limits in Explaining and Acoustic Modeling of Voice Characteristics

J. Wiechmann, P. Wagner, Journal of Voice (2025).

On Feature Importance and Interpretability of Speaker Representations

F. Rautenberg, M. Kuhlmann, J. Wiechmann, F. Seebauer, P. Wagner, R. Haeb-Umbach, in: ITG Conference on Speech Communication, 2023.

Explaining voice characteristics to novice voice practitioners - How successful is it?

J. Wiechmann, F. Rautenberg, P. Wagner, R. Haeb-Umbach, in: 20th International Congress of the Phonetic Sciences (ICPhS), 2023.
