MoVE

Modeling of Speech Attitudes and Application to an Expressive Conversational Agent

In a context where personal assistants and interactions with machines are part of our daily lives, voice has become the privileged modality of interaction with machines. Speech synthesis has made enormous progress in recent years, particularly through the use of deep learning and large multi-speaker databases. However, two principal limitations remain. The first is low expressivity: the agent's behavior is still often monomodal (voice only, as in assistants such as Alexa or Google Home) and remains very monotonous, which greatly reduces the acceptance, duration, and quality of interactions. The second is that the agent's behavior is poorly adapted, or not adapted at all, to the speaker and to the situation, which reduces the user's understanding of, and reaction time to, the information transmitted.

The MoVE project will develop neural learning algorithms to adapt the speech style of a synthetic voice to a specific interaction situation, focusing for example on the attitudes of the synthesized voice (cordial, smiling, authoritative, etc.). This improved adaptation of voice style will lead to a better understanding of the information communicated by the agent and will reduce human reaction time to the information provided (e.g., in an emergency situation).

With the supervision of:

Ircam, Sorbonne University, CNRS, Ministry of Culture

Discover other team projects

OpenTuning

Creative individuation through interaction design with generative music systems

Dates: March 2026 to December 2029

Inside Artificial Improvisation

Inside the black box of artificial improvisation

Dates: January 2026 to December 2029

INTIM

INteractive analysis/synthesis of musical TIMbre

Dates: September 2024 to March 2026