AI sparks dramatic advances in voice technologies

June 13, 2022

Interview with Nicolas Obin
Nicolas Obin is a lecturer at Sorbonne University and a researcher in the Sciences and Technologies of Music and Sound laboratory at IRCAM, where he has worked on voice synthesis for over a decade. A specialist in speech processing and human communication, he pursues research on applying the latest advances in artificial intelligence (AI) to the voice and related technologies. He works both with experts from the SCAI (the Sorbonne Center for Artificial Intelligence) and, as part of his artistic commitment at IRCAM, with renowned artists including Eric Rohmer, Philippe Parreno, Roman Polanski, Leos Carax, Georges Aperghis, and, this year, Alexander Schubert. We met him on the occasion of the ManiFeste-2022 festival, during which he is organizing the "Deep Voice, Paris" meetings that he co-founded with Xavier Fresquet of the SCAI.

In the age of AI, deep learning, and voice assistants, IRCAM is a pioneer in the creation of synthesized voices. Nicolas, can you tell us more about the research you do on a daily basis in the Sound Analysis-Synthesis team?

With voice assistants, the voice has become the favored modality of interaction between humans and the connected machines that populate our everyday lives. The voice makes it possible to bring a machine to life and to give it a semblance of humanity. My research focuses on the digital modeling of the human voice, at the interface of linguistics, computer science, machine learning, and artificial intelligence. The goal is to better understand the human voice and human communication in order to create speaking machines, clone a person's vocal identity, or manipulate personality attributes such as age, gender, attitudes, or emotions.
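To make this concrete, here is a minimal, hypothetical sketch (in PyTorch) of the kind of disentangled encoder/decoder model that voice cloning and attribute manipulation typically rely on: one encoder keeps the linguistic content, another summarizes the speaker's identity, and an explicit attribute vector conditions the decoder. All names, dimensions, and the attribute coding are illustrative assumptions, not IRCAM's actual architecture.

```python
# Hypothetical sketch of a disentangled voice model: content + speaker identity
# + explicit attribute codes (e.g. age, gender, emotion) are combined to
# reconstruct a mel-spectrogram. Dimensions and names are assumptions.
import torch
import torch.nn as nn

class VoiceModel(nn.Module):
    def __init__(self, n_mels=80, content_dim=128, speaker_dim=64, attr_dim=8):
        super().__init__()
        # Content encoder: keeps the linguistic/phonetic information.
        self.content_enc = nn.GRU(n_mels, content_dim, batch_first=True)
        # Speaker encoder: summarizes "who is speaking" into one vector.
        self.speaker_enc = nn.GRU(n_mels, speaker_dim, batch_first=True)
        # Decoder: reconstructs a mel-spectrogram from content, identity,
        # and attribute codes.
        self.decoder = nn.GRU(content_dim + speaker_dim + attr_dim,
                              256, batch_first=True)
        self.out = nn.Linear(256, n_mels)

    def forward(self, mel, attrs):
        content, _ = self.content_enc(mel)              # (B, T, content_dim)
        _, spk = self.speaker_enc(mel)                  # (1, B, speaker_dim)
        spk = spk[-1].unsqueeze(1).expand(-1, mel.size(1), -1)
        attrs = attrs.unsqueeze(1).expand(-1, mel.size(1), -1)
        h, _ = self.decoder(torch.cat([content, spk, attrs], dim=-1))
        return self.out(h)                              # reconstructed mel

# Toy usage: "re-synthesize" an utterance with a modified attribute vector.
model = VoiceModel()
mel = torch.randn(1, 200, 80)        # 200 frames of an 80-bin mel-spectrogram
attrs = torch.tensor([[0.2, 0.8, 0., 0., 0., 0., 0., 0.]])  # made-up attribute codes
edited = model(mel, attrs)
```

In a real system, the reconstructed mel-spectrogram would be rendered to audio by a neural vocoder, and the attribute vector could be edited between encoding and decoding to alter the perceived age, gender, or emotion.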

Our team, led by Axel Roebel, has extensive scientific and technological expertise in the human voice, including the transfer of our research advances into professional audio plugins dedicated to the voice, such as IrcamTools TRAX or IrcamLab TS, which are commonly used by sound designers in the film industry. For example, sound designer Nicolas Becker used IrcamLab TS to recreate the sensation of progressive hearing loss in the film Sound of Metal, for which he won the Academy Award for Best Sound.

In addition to artistic collaborations, we are constantly working with brands and companies to give an artificial voice to personal assistants, virtual agents, or humanoid robots.


Sciences and Technologies of Music and Sound laboratory at IRCAM © Philippe Barbosa

You are the co-founder of "Deep Voice, Paris", an annual event dedicated to voice and artificial intelligence, held this year from June 15 to 17. What is the focus of this second edition?

The theme of this second edition is diversity and inclusion in voice technologies, for a digital world that is better tailored to and more representative of the diversity of individuals, cultures, and languages. While there are between 6,000 and 7,000 living languages in the world today, including sign languages, only a few dozen (a hundred at best) are present in the digital world, whether in search engines, for translation, or in voice assistants. The objective of "Deep Voice, Paris" is to bring together the actors of scientific research and technological innovation to imagine the uses and practices of the future, but also to reflect on the contribution of digital technology to the world of today and tomorrow.

We are looking forward to welcoming some of the leading innovators in these fields: the members of the Q project, who created the first genderless artificial voice; Mozilla's Common Voice open-science initiative; the companies Navas Lab Europe and ReadSpeaker, which specialize in multilingual speech synthesis and virtual agents; and the Californian startup SANAS, which can transform a person's accent in quasi real time! The "Deep Voice, Paris" meetings are an opportunity to keep abreast of technological developments, to meet the people behind them, and to take part in discussions on their use in our daily lives.

With the ANR project TheVoice, you addressed the creation of voices for content production in the creative industry sector. Has this applied research consortium led to any significant achievements?

The ANR project TheVoice was an opportunity for us to work very closely with the creative industries (production and post-production companies, especially in dubbing). It allowed us to better understand the voice professions and their industrial and cultural stakes, and to bring entirely new artificial intelligence solutions to a sector that is especially demanding in terms of quality.

More specifically, we designed algorithms that enable us to transfer the vocal identity of one person onto the voice of another, in other words, a "vocal deepfake". We have already used these innovations, as part of a project conducted with IRCAM Amplify, for Thierry Ardisson's new TV show "Hôtel du Temps", in which deepfake technologies are used to give a digital life to personalities during an interview; to recreate the voice of Isaac Asimov, one of the founders of science fiction, in a documentary being produced by Arte; and to create artificial voices in Alexander Schubert's latest work Anima™, just performed at the Centre Pompidou during ManiFeste.
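As a rough illustration of the workflow behind such an identity transfer, the hedged sketch below encodes what the source speaker says, replaces the speaker embedding with one averaged over recordings of the target voice, and decodes the result. The functions and model sizes are assumptions made for the example; they do not describe the system used for these productions.

```python
# Hypothetical identity-transfer ("vocal deepfake") workflow: keep the content
# of a source utterance, swap in the target speaker's embedding, decode.
import torch
import torch.nn as nn

n_mels, content_dim, speaker_dim = 80, 128, 64
content_enc = nn.GRU(n_mels, content_dim, batch_first=True)
speaker_enc = nn.GRU(n_mels, speaker_dim, batch_first=True)
decoder = nn.Sequential(nn.Linear(content_dim + speaker_dim, 256),
                        nn.Tanh(), nn.Linear(256, n_mels))

def speaker_embedding(recordings):
    # Average a per-utterance identity vector over several target recordings.
    embs = [speaker_enc(mel)[1][-1] for mel in recordings]   # each (1, speaker_dim)
    return torch.stack(embs).mean(dim=0)                     # (1, speaker_dim)

def convert(source_mel, target_embedding):
    content, _ = content_enc(source_mel)                     # what is said, and how
    spk = target_embedding.unsqueeze(1).expand(-1, source_mel.size(1), -1)
    return decoder(torch.cat([content, spk], dim=-1))        # mel in the target's voice

source = torch.randn(1, 150, n_mels)                          # utterance by speaker A
target_refs = [torch.randn(1, 120, n_mels) for _ in range(3)] # recordings of speaker B
converted = convert(source, speaker_embedding(target_refs))   # a vocoder then renders audio
```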

And from the point of view of fundamental research, what are the challenges in creating the voices of the future, the next challenges researchers will have to take up?

The boom in artificial intelligence in the mid-2010s spurred impressive advances in every area of digital technology, and in voice technologies in particular. In 2018, the first artificial voices considered as natural as human voices were created, crossing the threshold of a kind of "vocal singularity". This milestone prefigures the rapid and profound changes linked to the modeling and simulation of digital humans and to our modes of interaction with machines, in an ever-increasing immersion in digital technology. Yet despite these advances, the voice remains a complex manifestation of the human being: in contrast to 3D animation, which is widely used in movies, video games, and virtual reality, the fields of application of artificial voices are still very limited.

The upcoming research challenges are numerous: manipulating the attributes of a voice to create digital filters that can sculpt the personality of a human or artificial voice; improving the modeling of the voice in interaction, in particular with its context, to allow fluid vocal interaction that is personalized and adapted to the interlocutor and the situation; all of this in an ultra-realistic way, possibly guided by physics... and in real time!
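The real-time requirement is worth illustrating: at a 16 kHz sampling rate with a 10 ms hop, any transformation has roughly 10 ms of compute budget per frame. The loop below, with a placeholder transform, simply measures whether that budget is respected; the figures and the function are assumptions made for the example.

```python
# Sketch of the real-time constraint: process audio frame by frame and check
# that each frame is transformed within its time budget.
import time
import numpy as np

SR, HOP = 16_000, 160          # 16 kHz audio, 10 ms hops
BUDGET = HOP / SR              # ~10 ms available per frame

def transform(frame: np.ndarray) -> np.ndarray:
    # Placeholder for a causal, low-latency voice model.
    return frame

stream = np.random.randn(SR)   # one second of fake input audio
worst = 0.0
for start in range(0, len(stream) - HOP + 1, HOP):
    t0 = time.perf_counter()
    _ = transform(stream[start:start + HOP])
    worst = max(worst, time.perf_counter() - t0)

print(f"worst frame time: {worst*1e3:.3f} ms (budget {BUDGET*1e3:.1f} ms)")
```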

This proliferation of research and innovation is a fabulous breeding ground for artistic experimentation. Artists can now produce voices and vocalizations that are virtually unheard of, that is to say, free from the constraints of nature and physics, whether to make a voice sing with an inhuman range (unattainable by a human being) or to create cyber-physical objects endowed with hybrid voices, such as a talking tree, lamp, or guitar. The possibilities for imagining new forms of expression and creative sound artifacts at the interface of the human and the machine, the singular and the universal, the real and the virtual, are endless.
