A Simple Music/Voice Separation Method Based on the Extraction of the Underlying Repeating Structure, by Zafar Rafii

Friday, July 29th, at 2 pm.
Telecom ParisTech (Dareau)
Room: DA-006 (Vitrine de la recherche)
Author: Zafar Rafii

Zafar Rafii is a Ph.D. candidate in Electrical Engineering & Computer Science at Northwestern University. He received a Master of Science in Electrical Engineering from Ecole Nationale Superieure de l’Electronique et de ses Applications (ENSEA) in France and from Illinois Institute of Technology (IIT) in Chicago. In France, he worked as a research engineer on source separation at Audionamix (aka Mist Technologies). His current research interests are centered around audio analysis and include signal processing, machine learning and cognitive science.
Webpage: http://www.cs.northwestern.edu/~zra446/

Repetition “is the basis of music as an art” (Schenker, 1954). This is especially true for popular songs, which are generally characterized by an underlying repeating musical structure over which the singer performs varying lyrics. Based on this simple observation, we propose to separate the repeating musical “background” from the non-repeating musical “foreground”. The basic idea is to identify the periodically repeating audio segments, compare them to a repeating segment model, and extract the repeating patterns via binary masking. The result is a simple but effective music/voice separation system. Unlike previous separation approaches, this method does not depend on special features, does not rely on complex frameworks, and does not need prior training. Because it relies only on self-similarity, it has the advantage of being simple, fast and completely automatable.
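
The three steps named in the abstract (find the repeating period, build a repeating segment model, apply a binary mask) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the speaker's implementation: the abstract does not say how the period is found or how the model is built, so the autocorrelation-based period estimate, the median across segments as the model, the log-magnitude tolerance `tol`, and the function names `estimate_period` and `repeating_mask` are all hypothetical choices. The input `V` is assumed to be a magnitude spectrogram of shape (frequency bins, time frames).

```python
import numpy as np


def estimate_period(V, min_lag=1, max_lag=None):
    """Guess the repeating period, in frames, of a magnitude spectrogram V.

    Assumption: the period is taken as the lag that maximizes the
    autocorrelation of the power spectrogram, averaged over frequency.
    The result may be a multiple of the shortest period.
    """
    P = V ** 2
    n_frames = P.shape[1]
    if max_lag is None:
        max_lag = n_frames // 3  # keep at least three repetitions
    beat = np.array([
        np.mean(P[:, :n_frames - lag] * P[:, lag:])
        for lag in range(min_lag, max_lag + 1)
    ])
    return min_lag + int(np.argmax(beat))


def repeating_mask(V, period, tol=1.0, eps=1e-9):
    """Return a boolean mask that is True where V matches the repeating model.

    Assumptions: the repeating segment model is the element-wise median of
    the period-long segments, and a bin counts as repeating when its
    log-magnitude is within `tol` of the model.
    """
    n_bins, n_frames = V.shape
    n_seg = -(-n_frames // period)  # ceiling division
    # Pad the last, possibly incomplete, segment with NaN so it is ignored.
    padded = np.full((n_bins, n_seg * period), np.nan)
    padded[:, :n_frames] = V
    segments = padded.reshape(n_bins, n_seg, period)
    model = np.nanmedian(segments, axis=1)  # shape (n_bins, period)
    # Repeat the model along time and compare it to the mixture.
    W = np.tile(model, (1, n_seg))[:, :n_frames]
    return np.abs(np.log(V + eps) - np.log(W + eps)) <= tol
```

In a complete system, the mask would be multiplied with the complex spectrogram of the mixture and inverted back to the time domain to obtain the repeating background; the complementary mask would give the non-repeating foreground, such as the voice.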
