The objective of the Artificial Creative Intelligence and Data Science (ACIDS) project is to model musical creativity by developing innovative artificial intelligence and machine learning models, and to provide tools for the intuitive exploration of creativity. The project combines extensive theoretical, modeling, and experimental activity. The study of creativity in interactive human-AI situations is crucial for understanding symbiotic interactions. The availability of artificial intelligence models capable of demonstrating creative behaviors could give rise to a whole new category of generic creative learning systems.
Time is the very essence of music, and yet it is a complex, multi-scale, multi-faceted phenomenon. Music must therefore be examined at varying temporal granularities, since a multitude of time scales coexist (from the identity of individual notes to the structure of entire pieces). We therefore introduce the idea of deep learning over multiple temporal granularities, which could allow us to discover not only the salient features of a dataset, but also the time scale at which each feature is most informative.
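As an illustrative sketch of this multi-granularity idea (not the ACIDS codebase), parallel dilated 1-D convolutions can extract features at several time scales at once; all layer sizes and dilations below are assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleEncoder(nn.Module):
    """Encode a sequence at several time scales in parallel.

    Each branch uses a different dilation, so its receptive field covers a
    different temporal granularity (note-level up to phrase-level).
    """
    def __init__(self, in_channels=128, hidden=64, dilations=(1, 4, 16, 64)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_channels, hidden, kernel_size=3,
                      dilation=d, padding=d)  # padding=d keeps length unchanged
            for d in dilations
        ])

    def forward(self, x):
        # x: (batch, channels, time), e.g. a piano-roll or spectrogram
        feats = [torch.relu(branch(x)) for branch in self.branches]
        # Concatenating the branches keeps every granularity available, so
        # a downstream layer can weigh the most informative time scale.
        return torch.cat(feats, dim=1)

enc = MultiScaleEncoder()
out = enc(torch.randn(2, 128, 512))  # -> (2, 256, 512)
```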
For example, we recently developed the first Live Orchestral Piano system, with automatic learning of piano/orchestra repertoires (ACTOR project), which allows one to compose music for a classical orchestra in real time simply by playing on a MIDI keyboard. By observing the correlations between piano scores and their corresponding historical orchestrations, we can infer the spectral knowledge of composers. The probabilistic models we study are neural networks with conditional and temporal structures.
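A hypothetical sketch of such a conditional temporal model, in the spirit described above: a recurrent network reads piano frames and predicts, per time step, note probabilities for the orchestra. All names and dimensions are illustrative, not the actual Live Orchestral Piano architecture.

```python
import torch
import torch.nn as nn

class PianoToOrchestra(nn.Module):
    def __init__(self, piano_dim=88, orch_dim=512, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(piano_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, orch_dim)

    def forward(self, piano_roll):
        # piano_roll: (batch, time, 88) binary piano frames
        h, _ = self.rnn(piano_roll)
        # Independent Bernoulli probability per orchestral note: a simple
        # conditional model p(orchestra_t | piano_{1..t}).
        return torch.sigmoid(self.head(h))

model = PianoToOrchestra()
piano = (torch.rand(4, 32, 88) > 0.95).float()   # toy sparse piano roll
orch_probs = model(piano)                        # -> (4, 32, 512)
# Training would minimize binary cross-entropy against aligned orchestrations.
```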
ACIDS promotes an interactive, user-centered approach whose goal is to keep the focus on the human. For example (see the collaborations with the team's REACH and MERCI projects), collective human-machine interactions, including improvisation, interest us as a general model of human interaction in which decisions, initiatives, and cooperation are all at work; they constitute an ideal vantage point for observing, understanding, and modeling symbiotic interaction in general.
While most current research attempts to surpass previous approaches with ever more complex and cumbersome models, we argue for simple and controllable models. In addition, we believe that truly intelligent models should be able to learn and generalize from small amounts of data. One of ACIDS' core ideas rests on the manifold assumption: information that appears very complex in its original form may lie on a simpler and more organized space. We therefore intend to model the high-level semantics of music through the notion of latent spaces. This could lead not only to an understanding of the complex characteristics of music, but also to the production of understandable control parameters.
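A minimal sketch of this latent-space idea, under assumed dimensions: forcing high-dimensional data through a small bottleneck recovers a compact code whose coordinates can serve as control parameters.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, data_dim=1024, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(data_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))           # compact latent code
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, data_dim))

    def forward(self, x):
        z = self.encoder(x)        # 8 latent dimensions usable as controls
        return self.decoder(z), z

model = Autoencoder()
x = torch.randn(16, 1024)
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)  # train to reconstruct through z
```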
We also study the relationships between different instrumental timbres based on perceptual ratings. However, the resulting perceptual timbre spaces allow only a limited degree of interpretation and offer no capacity for generation or generalization.
At ACIDS we study variational auto-encoders (VAEs) that can compensate for these limitations by regularizing their latent space during training, so that the latent space of the audio shares the same topology as the perceptual timbre space. In this way, we bridge the gap between analysis, perception, and audio synthesis in a single system.

Sound synthesizers are ubiquitous in music, and they now even define entire musical genres. However, their complexity and their many parameters make them difficult to master. We have created an innovative generative probabilistic model that learns an invertible mapping between the continuous latent space of a synthesizer's audio capabilities and its parameters, by combining variational auto-encoders with normalizing flows. Thanks to this new model, we can learn the principal macro-controls of a synthesizer, travel through its organized space of sounds, infer parameters from audio in order to control the synthesizer with our voice, and even address the semantic dimension by learning how the controls adapt to given semantic concepts, all in a single model.

ACIDS' work is regularly used in contemporary creation, for example in the production of the voice of a synthetic opera singer (La fabrique des monstres, Ghisi / Peyret) or in collaborations with the composer Alexandre Schubert on gesture capture and learning.
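A hedged sketch of the latent-regularization idea described at the start of this section (not the actual ACIDS implementation): the standard VAE objective is extended with a term that pulls pairwise latent distances toward the corresponding perceptual timbre dissimilarities. Here `perc_dist` is an assumed pairwise-dissimilarity matrix for the batch, and `beta`/`gamma` are illustrative weights.

```python
import torch
import torch.nn.functional as F

def regularized_vae_loss(x, recon, mu, logvar, perc_dist, beta=1.0, gamma=1.0):
    # Standard ELBO terms: reconstruction plus KL divergence to the prior.
    rec = F.mse_loss(recon, x, reduction='mean')
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Topology term: pairwise latent distances should match the perceptual
    # timbre-space distances for the same batch items.
    lat_dist = torch.cdist(mu, mu)          # (batch, batch)
    topo = F.mse_loss(lat_dist, perc_dist)  # perc_dist: (batch, batch)
    return rec + beta * kld + gamma * topo
```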
IRCAM Team: Musical Representations
