Music Visuals I: Spectrogram

The idea behind these music visual experiments is to explore different ways of generating visuals, as automatically as possible, directly from a music track. Ideally, those visuals should be entertaining and reflect what stands out in the music from a listener's point of view.

In this video, I used Matlab to compute the spectrogram of a music track and display it as concentric color rings. Small rings in the center correspond to low frequencies, whereas larger rings show higher frequencies. The ring brightness is directly linked to the corresponding frequency amplitude in the spectrogram.
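
The mapping described above can be sketched in a few lines. This is not the Matlab code used for the video, only a rough Python/NumPy illustration under several assumptions: the function names `spectrogram` and `ring_image`, the frame and hop sizes, the linear frequency-to-radius mapping, and the grayscale output (instead of colors) are all choices made for this example.

```python
import numpy as np

def spectrogram(signal, frame_size=1024, hop=512):
    """Magnitude spectrogram, shape (n_frames, frame_size // 2 + 1)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

def ring_image(spectrum, size=201):
    """Render one spectrogram frame as concentric rings (grayscale, 0..1).

    Low-frequency bins land near the center, high-frequency bins near
    the edge. Brightness is the bin amplitude normalized by the frame peak.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    radius = np.hypot(x, y) / half            # 0 at the center, 1 at the rim
    levels = spectrum / (spectrum.max() + 1e-12)
    # Linear radius-to-bin mapping (an assumption of this sketch).
    bins = np.minimum((radius * len(levels)).astype(int), len(levels) - 1)
    image = levels[bins]
    image[radius > 1.0] = 0.0                 # black outside the largest ring
    return image

# Synthetic test input: one second of a 500 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500 * t)

spec = spectrogram(tone)
image = ring_image(spec[0])   # one image per frame gives the animation
```

Rendering one such image per spectrogram frame, in sync with the audio, gives an animation. A logarithmic frequency axis or a color map would be natural variations.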

While the spectrogram contains a lot of information about the frequencies in the sound, it is not easy to link what it shows to what we hear. The low frequencies corresponding to beats and percussion are easily distinguished, but higher frequencies contain a mix of harmonics from different sources such as piano and human voice. We can easily tell these apart by ear, yet it is hard to do so in the spectrogram visualization, where both are entwined.

To make the visualization more readable, higher-level information such as melody, pitch, and tempo can be extracted from the spectrogram. These are the properties we intuitively use to describe music, so they should guide an intuitive visualization. However, they are not always straightforward to compute from the sound wave data, and it gets even harder when the data is a mix of different instruments playing different melodies.
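
To give a feel for what extracting one of these properties involves, here is a deliberately naive tempo estimate in Python/NumPy. It is only a sketch under assumptions: the function names, the spectral-flux onset measure, the autocorrelation step, and the 60 to 180 BPM search range are choices made for this example. It works on a clean click track, and real mixed music is exactly where an approach like this tends to break down.

```python
import numpy as np

def onset_envelope(signal, frame_size=1024, hop=256):
    """Spectral flux: summed positive change in magnitude between frames."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.maximum(np.diff(mag, axis=0), 0.0).sum(axis=1)

def estimate_tempo(signal, sample_rate, hop=256, bpm_range=(60, 180)):
    """Pick the autocorrelation peak of the onset envelope within bpm_range."""
    env = onset_envelope(signal, hop=hop)
    env = env - env.mean()
    # Keep non-negative lags only; ac[k] is the correlation at a lag of k frames.
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    frame_rate = sample_rate / hop
    min_lag = int(round(frame_rate * 60 / bpm_range[1]))
    max_lag = int(round(frame_rate * 60 / bpm_range[0]))
    lag = min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))
    return 60.0 * frame_rate / lag

# Synthetic click track: a short decaying 1 kHz burst every 0.5 s (120 BPM).
sample_rate = 8192
n = np.arange(64)
click = np.exp(-n / 16) * np.sin(2 * np.pi * 1000 * n / sample_rate)
track = np.zeros(8 * sample_rate)
for start in range(0, len(track), sample_rate // 2):
    track[start:start + len(click)] += click

tempo = estimate_tempo(track, sample_rate)
```

The tempo resolution is limited by the hop size, since the beat period is measured in whole frames.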

Research on these topics is ongoing, and tools are already available for extracting these properties from sound data. For example, the Matlab MIR toolbox provides methods for extracting different features, and other challenges like melody extraction can be tackled using more advanced algorithms. While these methods give impressive results in some cases, they do not work with every input and still fail on challenging cases.

These tools are promising ways to improve the music analysis part of this project, though my priority for future experiments will probably be the visual generation part: getting interesting visuals even from simple, easy-to-compute music features.