DD.1 Embedding spaces and multivariate time

WP DD : Deep Discovery

Report Embedding spaces and multivariate time V1

Authors

Tristan Carsault, Jérôme Nika

Abstract

Our goal in this work is twofold: to develop an intelligent module that listens to and predicts chord sequences, and to propose an evaluation adapted to the two associated Music Information Retrieval (MIR) tasks, namely the real-time extraction of chord labels from a live audio stream and the prediction of a possible continuation of the extracted symbolic sequence.

We therefore propose two independent modules: one extracts chords in real time, and the other predicts a possible continuation of an input chord sequence. Both modules are available online, along with tutorials. These modules are intended for use in co-creative contexts, for example through integration within DYCI2 or SoMax.

Chords are bound by strong inherent hierarchical and functional relationships. However, most MIR research focuses on the performance of chord-based statistical models, without considering music-based evaluation or learning. Indeed, usual evaluations rely on a binary qualification of the classification outputs (right chord predicted versus wrong chord predicted).

Our research, detailed in the following report, therefore introduces a specifically tailored chord analyzer that measures the performance of chord-based models in terms of a functional qualification of the classification outputs (by taking into account the harmonic function of the chords). Then, in order to introduce musical knowledge into the learning process for the automatic chord extraction task, we present a specific musical distance for comparing predicted and labeled chords. Finally, we investigate the impact of including high-level metadata (such as key or downbeat position) when learning to predict chord sequences. We show that a model can obtain better accuracy or perplexity while producing biased results. Conversely, a model with a lower accuracy score can make errors that carry more musical meaning. Performing a goal-oriented evaluation therefore gives a better understanding of the results and supports a better-adapted design of MIR models.
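To make the difference between a binary and a functional evaluation concrete, here is a minimal sketch in Python. It is not the chord analyzer from the report: the chord label spelling, the `FUNCTION_IN_C_MAJOR` table (restricted to the key of C major), the function names, and the example sequences are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical, simplified mapping from chord labels to harmonic functions.
# It only covers diatonic triads in C major and is an assumption of this sketch.
FUNCTION_IN_C_MAJOR = {
    "C:maj": "tonic", "E:min": "tonic", "A:min": "tonic",
    "D:min": "subdominant", "F:maj": "subdominant",
    "G:maj": "dominant", "B:dim": "dominant",
}


def _check_lengths(predicted, reference):
    """Both sequences must be non-empty and aligned frame by frame."""
    if not reference or len(predicted) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")


def binary_accuracy(predicted, reference):
    """Usual evaluation: a prediction is either the right chord or a wrong one."""
    _check_lengths(predicted, reference)
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def qualify_errors(predicted, reference):
    """Functional evaluation: separate errors that keep the harmonic function."""
    _check_lengths(predicted, reference)
    counts = Counter()
    for p, r in zip(predicted, reference):
        p_func = FUNCTION_IN_C_MAJOR.get(p)
        r_func = FUNCTION_IN_C_MAJOR.get(r)
        if p == r:
            counts["correct"] += 1
        elif p_func is not None and p_func == r_func:
            counts["same_function"] += 1
        else:
            counts["other_error"] += 1
    return counts


# Made-up example sequences, not data from the report.
reference = ["C:maj", "F:maj", "G:maj", "C:maj"]
model_a = ["C:maj", "D:min", "G:maj", "A:min"]  # errors keep the function
model_b = ["C:maj", "F:maj", "C:maj", "G:maj"]  # errors change the function

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    print(name, binary_accuracy(predicted, reference),
          dict(qualify_errors(predicted, reference)))
```

In this toy setting both hypothetical models reach the same binary accuracy, yet only the first one makes errors that preserve the harmonic function of the reference chord, which is the kind of distinction a binary score cannot show.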

Report

Merci DD1 V1: Extracting and predicting chord progressions from a real-time audio stream

Code

Deep Chord progressions extraction & prediction source repository

Related documents