SPA(S)M 4/4 : Frédéric Bevilacqua – Adding grist to the mill

Artistic Residencies: The Blog

Because Basile Chassaing is conducting his artistic residency on motion capture associated with synthesis and composition processes within IRCAM’s Sound Music Movement Interaction team, he can rely on the team’s expertise to carry out his project. In return, as Frédéric Bevilacqua, research director and head of the team, explains, they expect to use this opportunity to learn from Chassaing’s experiments with the tools they have developed.

It is not the first time that this research team has worked with motion capture associated with synthesis and composition processes – in fact, it is at the core of its research. That is how, over time and through a broad spectrum of creative projects, the R-IoT[1] motion sensors were developed, along with a range of software dedicated to the analysis and compositional exploitation of the data supplied by the sensors, sometimes involving interactive machine learning or, more recently, deep learning[2].

The performance of these sensors varies depending on whether they are used in instrumental or choreographic settings. “In principle, it should be the same,” says Frédéric Bevilacqua. “But in practice, it is a bit more complicated. First of all, when you work with a musician, it is natural that a gesture produces a sound; the work is directly in line with the notion of playing modes, extended or otherwise. Specific gestures are identified that generate sounds, and the musician is used to reproducing and refining their gestures according to the sound results obtained. The machine, in turn, also learns from the musician. Dance, however, is not as standardized. When you put a dancer in the role of a musician, it challenges their focus, even when they are not consciously interacting with the system. Furthermore, it opens up more possibilities when trying to define specific gestures, which consequently makes machine learning more complex.”

H2O - in memoriam A-68A / basile chassaing 2022-23

Basile Chassaing’s project SPA(S)M was therefore the perfect opportunity to bring together the team’s approach and that of a composer with prior experience in the field. The team was also drawn to Chassaing’s focus on the rhythmic aspects of writing applied to motion capture – a topic that had hitherto received less attention – and to his expectations about how recorded data could be used, and evolve, within a compositional framework.

Getting (re)acquainted with the state-of-the-art tools developed by the team gave Chassaing the opportunity to appreciate the full range of challenges they present. First and foremost, the sensor itself, the famous R-IoT, for which this residency provides the opportunity for a new iteration[3]. Then there is MuBu: developed for Max, MuBu processes the gestural data, making it an indispensable link in the chain between the sensor and the actual composition tools.

Amongst these compositional tools, the most advanced is CataRT, a real-time concatenative sound synthesis system.

“The researcher and developer Diemo Schwarz has developed an interface that makes it possible to use CataRT on a 2D surface or a trackpad (with pressure control). It allows you to control sound synthesis by moving in a virtual 2D space in which the system has distributed short snippets (grains) of sound according to previously selected descriptors (such as timbre, loudness, etc.).

Using this tool is fairly intuitive, as you have visual feedback for each action. The sound segments you see on the screen can simply be combined and played, which lets you easily control the synthesized textures. We could imagine transferring this principle to a multidimensional space and using it for dance motion capture, but the dancer would have no visual feedback, which makes controlling the tool very different. Also, some parts of the space within CataRT are relatively empty, which means the dancer could be moving yet producing no sound, because they have unknowingly wandered into an area devoid of any sound!”
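The grain-selection principle Bevilacqua describes – snippets laid out in a descriptor space, retrieved by the position of a finger or cursor – can be sketched in a few lines. This is a purely illustrative Python sketch, not CataRT’s actual API; the grain names and descriptor coordinates are invented:

```python
import math

# Hypothetical grains: name -> (centroid, loudness) coordinates,
# standing in for sound snippets distributed over a 2D descriptor
# space, in the manner of CataRT's interface.
grains = {
    "g0": (0.10, 0.20),
    "g1": (0.80, 0.30),
    "g2": (0.45, 0.90),
    "g3": (0.15, 0.75),
}

def nearest_grain(x, y, grains):
    """Return the grain whose descriptor coordinates lie closest to
    the (x, y) position of the controller on the 2D surface."""
    return min(grains, key=lambda g: math.dist((x, y), grains[g]))

# Moving through the virtual space triggers the closest snippet.
print(nearest_grain(0.2, 0.25, grains))  # -> g0
```

In a real system the lookup would run continuously at control rate, and the selected grain would be windowed and overlapped into the output stream; the nearest-neighbour query is the core of the mapping.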

SPA[S]M - extrait #1 / chassaing + grach / Royaumont 2023

“That is why I also showed Basile the work of Victor Parades, one of our doctoral students who, as part of his thesis, has developed a system that allows CataRT’s space to be redefined or reshaped to match the space captured by the dancer’s movements. Using this system, you have a hole-free space, which minimizes the chance of unwanted surprises arising from differences between the sound segments mapped to neighboring areas.”
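One general way to obtain a hole-free mapping – a toy illustration of the idea only, not the system described above – is to warp the control coordinate so that it only ever lands on populated positions, skipping the empty stretches of descriptor space entirely. A minimal 1D sketch, with invented grain positions:

```python
# Toy 1D illustration of "reshaping" a sparse control space: the
# grain positions cluster and leave gaps (here roughly 0.1-0.7),
# so a raw control value could easily land in silence.
grain_positions = sorted([0.05, 0.07, 0.10, 0.70, 0.72, 0.95])

def warp(u, positions):
    """Map a uniform control value u in [0, 1] directly onto the
    sorted list of grain positions, so every value reaches a grain
    and the empty regions of the space are skipped over."""
    i = min(int(u * len(positions)), len(positions) - 1)
    return positions[i]

# The middle of the control range jumps straight to a populated
# position instead of falling into the dead zone.
print(warp(0.5, grain_positions))  # -> 0.7
```

The same rank-based idea extends to several dimensions (e.g. warping each axis through the empirical distribution of grain coordinates), which is what makes a “hole-free” control space possible in principle.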

“Later, we showed him the work of Sarah Nabi, another doctoral student who, as part of her thesis co-directed by Philippe Esling, also worked with an R-IoT-equipped dancer, but this time on RAVE, a generative synthesis system originally developed by Antoine Caillon that relies on deep learning.”

The major difference between CataRT and RAVE is the nature of their synthesis process, which is concatenative for the former and generative for the latter. “Actually, the two systems are rather similar from a sound designer’s point of view,” says Frédéric Bevilacqua. “CataRT cuts sounds into segments and projects them into a space where they can then be rewritten. RAVE learns from a vast audio corpus and creates a so-called ‘latent’ space, which again is traversed. The biggest problem with generative synthesis is that training is extremely time-consuming, and the latent space is currently very difficult to control – these systems are still under construction and far from complete, whereas the parameters of CataRT have been mastered. In my opinion, the two systems are complementary. RAVE’s potential is to produce hybrid sounds or timbre transfer.”[4]
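The idea of traversing a latent space – and of hybridising timbres by interpolating between two latent points – can be illustrated with a toy decoder. Everything here is an assumption for illustration: a fixed random linear map stands in for RAVE’s learned neural decoder, and the “synthesis parameters” are abstract numbers:

```python
import numpy as np

# Toy stand-in for a learned decoder. In RAVE-like systems a neural
# network maps latent vectors to audio; here a fixed linear map sends
# a 2D latent point to 4 synthesis parameters, just to show what
# "traversing a latent space" means. (Illustrative only.)
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))  # latent dim 2 -> 4 parameters

def decode(z):
    """Decode a latent point into synthesis parameters."""
    return W @ z

# Timbre hybridisation: interpolate between two latent points and
# decode along the path - the hallmark use of a latent space.
z_a, z_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
path = [decode((1 - t) * z_a + t * z_b) for t in np.linspace(0, 1, 5)]
```

With a real model, each decoded point along the path would be a short stretch of audio whose timbre morphs gradually from one source sound toward the other; the difficulty Bevilacqua mentions is that the axes of such a learned space have no guaranteed perceptual meaning.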

In both cases, the experimental exploration phase with the dancers is absolutely essential. That is just as well, because it is at the heart of the research undertaken by Basile Chassaing, who is opening a new chapter in his residency at the start of this year, focusing on RAVE and the collaboration with Sarah Nabi.

[1] Read episode 2, with Emmanuel Fléty
[2] "Deep learning": the latest generation of artificial intelligence that we are hearing so much about at the moment.
[3] Read episode 2, with Emmanuel Fléty
[4] To learn more about this, read the articles on Maxime Mantovani’s artistic residency