Effective Searches of Temporal Series
Searching for sounds in large-scale databases can be a painful and tedious task. Even when metadata is available, query results are often far from the sound the user has in mind. Today, no system turns a user's intuitive idea of a sound into an effective search; unlike songs, sound samples offer no high-level information such as melody or lyrics on which a query could be based.
Project Description & Goals
Starting from this observation, we have developed code that makes it possible to search effectively on temporal shapes while taking into account the multidimensional nature of sound perception. Searches are based on the temporal shape of the descriptors rather than on their mean values alone. Each descriptor is modeled by its mean, its standard deviation, and the shape of its temporal evolution, the latter through a symbolic representation that enables both compact storage and efficient search. It was essential, however, that the comparison of temporal series yield a similarity based on perceptual criteria, even for objects that may be very different mathematically. Using an approach derived from Dynamic Time Warping (DTW), we have developed a similarity measure that is robust to non-linear distortions such as changes in range, noise, and outlying values. Thanks to a new indexing algorithm, the best match in a database containing several million sound samples can be retrieved almost immediately.
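To make the idea of comparing temporal shapes concrete, here is a minimal sketch of textbook Dynamic Time Warping in Python. It is not the project's similarity measure, which is only described as derived from DTW. The function names z_normalize and dtw_distance, the absolute-difference local cost, and the optional window parameter are all assumptions chosen for illustration. Normalizing each curve to zero mean and unit standard deviation is one simple way to reduce sensitivity to differences in range.

```python
import math


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        # A constant curve has no shape; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


def dtw_distance(a, b, window=None):
    """Classic DTW cost between two series, with an optional band constraint.

    `window` (hypothetical parameter) limits how far the alignment may
    drift from the diagonal; None means no constraint.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least as wide as the length difference.
    window = max(window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Two curves with the same shape but different timing align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])

# Two curves with the same shape but different range match after normalization.
rescaled = dtw_distance(z_normalize([1, 2, 3]), z_normalize([10, 20, 30]))
```

This quadratic-time version compares one pair of curves at a time. Searching millions of samples, as described above, would additionally require an indexing scheme, which this sketch does not attempt to reproduce.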
Our study then expanded to the implementation of higher-level interactions. We studied queries that bear on several temporal curves simultaneously, going beyond the consideration of a single, often less relevant, criterion. Thanks to new heuristics, we have produced the first precise multi-objective search algorithm for temporal series.
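One common way to formalize a query over several curves at once is Pareto optimality: a candidate is kept if no other candidate is at least as close on every descriptor and strictly closer on at least one. The sketch below illustrates that notion in Python with a brute-force scan. Whether the project's algorithm uses exactly this formulation is an assumption, and the names dominates and pareto_front, the sample identifiers, and the distance values are all hypothetical.

```python
def dominates(p, q):
    """True if distance vector p is no worse than q everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(p, q))
            and any(x < y for x, y in zip(p, q)))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    `candidates` maps a sample name to a tuple of distances to the query,
    one distance per descriptor (smaller is better).
    """
    front = []
    for name, dists in candidates.items():
        dominated = any(
            dominates(other, dists)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical distances to a query on two descriptors.
candidates = {
    "sample_a": (0.1, 0.9),
    "sample_b": (0.5, 0.5),
    "sample_c": (0.6, 0.6),  # worse than sample_b on both descriptors
    "sample_d": (0.9, 0.1),
}
front = pareto_front(candidates)
```

Here sample_c is discarded because sample_b beats it on both descriptors, while the other three each represent a different trade-off. The scan compares every pair of candidates, so it only serves to illustrate the definition, not to search a large database.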
These techniques apply to all fields of scientific research because temporal information is ubiquitous. Multi-objective searches of temporal series open up numerous applications in fields ranging from medical analysis to robotics. They also make it possible to build a query-by-vocal-imitation system based on multiple spectral descriptors.
These advances have been implemented in an interface using iPad multi-touch technology.
IRCAM's team: Musical Representations team.