Processing by Phase Vocoder

Techniques for the analysis and transformation of sounds

The phase vocoder, one of the most effective techniques for the analysis and transformation of sounds, is the foundation of the SupervP software program. With the phase vocoder, sounds can be transposed, stretched, or shortened, and a practically limitless number of filters can be applied to them. Likewise, the sound quality of the transformed signals is extremely high when the technique is applied to speech. Numerous improvements and extensions have been introduced, for example:

  • Reassigned spectrum
  • Estimation of the spectral envelope via ‘true envelope’
  • Transposition with preservation of the spectral envelope
  • Transposition with the ‘shape invariant’ model
  • Generalized cross synthesis enabling the synthesis of hybrid sounds
  • Several methods for estimating the fundamental frequency (pitch) of a signal
  • Classification of the spectral peaks by nature: sinusoidal (voiced) or non-sinusoidal (unvoiced sounds or noise)
  • Segmentation of the time/frequency zones into transient and non-transient regions, with the option to increase or decrease the transient sections
  • Processing of the sinusoidal, non-sinusoidal, and transient time/frequency zones
  • The LF model of a glottal source, making it possible to transform a voice, etc.
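
To illustrate the basic stretching and shortening described above, here is a minimal sketch of a textbook phase vocoder in Python with NumPy. It is not the SupervP implementation and uses none of the extensions listed. The function names `stft` and `time_stretch`, the Hann window, and the default frame and hop sizes are all assumptions made for this example.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def time_stretch(x, rate, n_fft=1024, hop=256):
    """Change duration by roughly 1/rate without changing pitch.

    rate < 1 lengthens the sound, rate > 1 shortens it.
    Hypothetical helper for illustration only.
    """
    spec = stft(x, n_fft, hop)
    n_frames, n_bins = spec.shape
    # Expected phase advance per hop at each bin's centre frequency.
    omega = 2 * np.pi * hop * np.arange(n_bins) / n_fft
    # Fractional analysis-frame positions read by each synthesis frame.
    steps = np.arange(0, n_frames - 1, rate)
    phase = np.angle(spec[0])
    window = np.hanning(n_fft)
    out = np.zeros(len(steps) * hop + n_fft)
    norm = np.zeros_like(out)
    for k, t in enumerate(steps):
        i = int(t)
        frac = t - i
        # Interpolate magnitudes between neighbouring analysis frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        frame = np.fft.irfft(mag * np.exp(1j * phase), n=n_fft)
        # Windowed overlap-add, tracking the window energy for normalisation.
        out[k * hop:k * hop + n_fft] += frame * window
        norm[k * hop:k * hop + n_fft] += window ** 2
        # Deviation of the measured phase advance from the expected one,
        # wrapped to [-pi, pi], gives each bin's true frequency.
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi
    return out / np.maximum(norm, 1e-3)
```

For example, `time_stretch(x, 0.5)` returns a signal roughly twice as long as `x` at the same pitch. Transposition can then be obtained by resampling the stretched signal back to the original duration.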

These different modules of analysis, synthesis, and processing are used in several software programs on the market today.

IRCAM team: Sound Analysis & Synthesis.
