Industrial Licenses

IRCAM manages a large portfolio of industrial licenses in collaboration with national and international partners. The licenses cover cutting-edge technological modules and applications that came out of various research projects, including SuperVP, Spat, Audio Descriptors, Modalys, Voice Casting, CataRT, Gesture/Voice Follower, and Audio to MIDI.

Information: Hugues Vinet, Director of the Research and Development Department


SuperVP

SuperVP is a signal-processing library built on an advanced phase vocoder implementation. It performs a wide range of high-quality signal transformations, including:

  • Transposition (preservation of the spectral envelope)
  • Time Stretching (preservation of transients)
  • Ambient noise filtering
  • Vocal transformation (trax)
  • Separation and remixing of sound components
  • Filtering

Format: C++ Library, MacOSX, Windows, Linux

Learn more about SuperVP

Audio to MIDI

Conversion of a monophonic or polyphonic audio signal (e.g. piano, guitar) to MIDI format. This makes it possible to rework a sound recording by layering MIDI-driven virtual instruments over it.
Format: C++ Library, MacOSX, Windows
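
As a rough illustration of the last step of any audio-to-MIDI conversion, the sketch below maps an already estimated fundamental frequency to the nearest MIDI note number using the standard equal-temperament formula. It is not the IRCAM library's API or algorithm; the function names and the 440 Hz reference tuning are assumptions made for this example, and the difficult part (estimating pitches from a polyphonic signal) is not shown.

```python
import math

A4_HZ = 440.0   # assumed reference tuning for A4
A4_MIDI = 69    # MIDI note number of A4

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz: float) -> int:
    """Round a fundamental frequency in Hz to the nearest MIDI note number."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # 12 semitones per octave, one octave per doubling of frequency.
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def midi_to_name(note: int) -> str:
    """Return a note name such as 'C4' (convention: MIDI 60 is C4)."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"
```

For example, a detected fundamental of 261.63 Hz maps to MIDI note 60, which this naming convention labels C4.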


Modalys

Modalys is an environment that lets users create unheard-of virtual instruments from simple physical objects such as strings, plates, tubes, membranes, plectra, bows, or hammers, and make them interact. Objects with complex shapes can also be built from a three-dimensional mesh or from measurements. Modalys brings these virtual instruments to life by calculating how they vibrate when played.
Format: C++ Library, MacOSX, Windows

Learn more about Modalys


Spat

Spat~, IRCAM’s spatializer, is software dedicated to real-time sound spatialization. Originally designed as a library, it enables musicians and sound engineers to control spatial sound processing for various sound reproduction systems. List of available modules:

  • Rendering and panning in 2D or 3D:
    - Stereo (AB, XY, MS), 5.1, etc.
    - Binaural rendering over headphones (with compensation of near-field effects)
    - Transaural rendering for loudspeakers
    - Vector-base amplitude panning (VBAP)
    - Distance-based amplitude panning (DBAP)
  • Artificial reverberation: scalable and customizable multi-channel reverberation based on a feedback delay network. Real-time multi-channel convolution without latency.
  • Perceptual control of acoustic quality and room effect: warmth and brilliance; presence/proximity of the sound source; presence of the room; early or late reverberation; intimacy, liveness. Simplified control of source radiation (aperture and orientation).
  • Low-level control: equalization, Doppler effect, air absorption, and more.

Format: C++ Library, MacOSX, Windows
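
To give an idea of what vector-base amplitude panning computes, here is a minimal two-dimensional sketch for a single pair of loudspeakers. It follows the textbook formulation (express the source direction as a weighted sum of the two loudspeaker directions, then normalize for constant power) and does not reproduce Spat's implementation or API; the function name and the angle convention (degrees, counterclockwise from the front) are assumptions made for this example.

```python
import math


def vbap_2d_gains(source_deg: float, spk1_deg: float, spk2_deg: float):
    """Return constant-power gains (g1, g2) for a source between two loudspeakers."""
    s = math.radians(source_deg)
    a = math.radians(spk1_deg)
    b = math.radians(spk2_deg)
    px, py = math.cos(s), math.sin(s)      # unit vector toward the source
    l1x, l1y = math.cos(a), math.sin(a)    # unit vector toward loudspeaker 1
    l2x, l2y = math.cos(b), math.sin(b)    # unit vector toward loudspeaker 2

    # Solve p = g1 * l1 + g2 * l2 with Cramer's rule.
    det = l1x * l2y - l2x * l1y
    if abs(det) < 1e-12:
        raise ValueError("loudspeakers must not be collinear with the listener")
    g1 = (px * l2y - l2x * py) / det
    g2 = (l1x * py - px * l1y) / det

    # A negative gain means the source lies outside the arc spanned by the pair.
    if g1 < -1e-9 or g2 < -1e-9:
        raise ValueError("source direction is outside the loudspeaker pair")
    g1, g2 = max(g1, 0.0), max(g2, 0.0)

    # Normalize so that g1^2 + g2^2 = 1 (constant perceived power).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

With loudspeakers at +30 and -30 degrees, a source straight ahead receives equal gains on both channels, while a source at +30 degrees is fed to the first loudspeaker only.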

Learn more about Spat

Audio Descriptors

Automatic extraction of information from musical recordings. List of available modules:

  • IrcamClass: classification by genre, mood, instrumentation, and other categories
  • IrcamBeat: estimation of tempo, meter, complexity, rhythmic percussiveness, and temporal positions of strong and weak beats
  • IrcamKeymode: estimation of the key
  • IrcamChord: estimation of chord progressions
  • IrcamStructure: estimation of a piece’s temporal structure
  • IrcamSummary: audio summary of a track, built by concatenating representative excerpts in sync with the tempo

Voice Casting

A system that computes the timbral similarity between a target voice and a corpus of voices. It is used for dubbing films: given a database of actors, it finds the voice that most closely matches the original actor’s voice.
Format: C++ Library, MacOSX, Windows
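
The matching step described above can be pictured as a similarity ranking over feature vectors. The sketch below assumes each voice has already been summarized as a fixed-length vector of timbral features and ranks the corpus by cosine similarity to the target. The feature vectors, the choice of cosine similarity, and all names are assumptions for illustration; the source does not say which features or which similarity measure Voice Casting uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def rank_voices(target, corpus):
    """Return corpus voice names sorted from most to least similar to the target.

    `corpus` maps a (hypothetical) actor name to that voice's feature vector.
    """
    return sorted(
        corpus,
        key=lambda name: cosine_similarity(target, corpus[name]),
        reverse=True,
    )
```

The first name returned by `rank_voices` is the closest match under this measure; the remaining names give fallback candidates in decreasing order of similarity.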

Gesture/Voice Follower

A set of modules for real-time recognition and synchronization of temporal forms (movements, voices, etc.). It works with all types of signals from live performances (e.g. motion sensors, image analysis, audio descriptors) and is widely used to design new types of instruments based on recorded sounds.
Format: C++ Library, MacOSX, Windows

Learn more about Gesture/Voice Follower


CataRT

Concatenative corpus-based synthesis uses a database of recorded sounds and a unit-selection algorithm that chooses segments from the database and concatenates them to synthesize a musical sequence. The selection is based on characteristics of the recordings obtained by signal analysis, such as pitch, energy, or spectrum.
Format: C++ Library, MacOSX, Windows
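
The unit-selection idea can be sketched in a few lines: for each target point in descriptor space, pick the corpus unit with the smallest weighted distance. This is a simplified illustration, not CataRT's implementation. The `Unit` class, the two descriptors (pitch as a MIDI note number, energy in dB), and the weights are assumptions for this example, and a real system would typically normalize the descriptors and might also penalize poor transitions between consecutive units.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """A segment of the sound database with its analysis descriptors."""
    name: str
    pitch: float    # MIDI note number (assumed descriptor)
    energy: float   # level in dB (assumed descriptor)


def select_units(corpus, targets, weights=(1.0, 1.0)):
    """For each (pitch, energy) target, return the closest unit in the corpus."""
    if not corpus:
        raise ValueError("corpus must contain at least one unit")
    w_pitch, w_energy = weights
    sequence = []
    for t_pitch, t_energy in targets:
        # Weighted squared Euclidean distance in descriptor space.
        best = min(
            corpus,
            key=lambda u: w_pitch * (u.pitch - t_pitch) ** 2
            + w_energy * (u.energy - t_energy) ** 2,
        )
        sequence.append(best)
    return sequence
```

Concatenating the audio of the returned units, in order, gives the synthesized sequence; raising one weight makes the selection favor that descriptor over the other.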

Learn more about CataRT