Realtime music segmentation using information geometric methods

Information geometry is a recent field of mathematics, and of statistical inference in particular, that studies the notions of probability and information by way of differential geometry. It brings together areas such as machine learning, information theory, signal processing, and differential geometry.

This project, undertaken by the MuTant Team-Project, aims to introduce the concepts of information geometry that are useful for developing a formal mathematical framework for manipulating audio streams. The idea is to provide alternative structures for manipulation that respect the temporal and probabilistic nature of audio streams better than the structures commonly used in audio content analysis applications.

This formal framework leads to two application areas: automatic structure learning and audio stream transformation. The first belongs to the general framework of audio content analysis, with applications to automatic structure discovery, automatic segmentation, automatic recognition of auditory scenes, etc. The second has applications in audio restoration, data encoding and compression, and new methods for sound transformation in analysis-synthesis schemes.

This page provides information about our research progress in this emerging field at Ircam.


  • Realtime segmentation of audio streams
  • Real-Time Transcription: Automatic transcription of polyphonic music in real time
  • Audio Oracle: Incremental analysis of audio structures
  • Guidage: Fast query by example retrieval for concatenative sound synthesis


  • IRCAM - MuTant Team-Project: Arshia Cont (Researcher), Arnaud Dessein (PhD Student, now at University of York)
  • External: Frederic Barbaresco (THALES Research), Frank Nielsen (LIX, École Polytechnique), Shlomo Dubnov (UCSD)


Selected Documents

Arnaud Dessein and Arshia Cont. An information-geometric approach to real-time audio segmentation. IEEE Signal Processing Letters, 20(4):331–334, April 2013. (paper) (draft) (bibtex)

Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time detection of overlapping sound events with non-negative matrix factorization. In Frank Nielsen and Rajendra Bhatia, editors, Matrix Information Geometry, chapter 14, pages 341–371. Springer, Berlin/Heidelberg, Germany, 2013. (paper) (draft) (bibtex) (web)

Arnaud Dessein and Arshia Cont. Online change detection in exponential families with unknown parameters. In Frank Nielsen and Frederic Barbaresco, editors, Geometric Science of Information: First International Conference, GSI 2013, Paris, France, August 28-30, 2013, Proceedings, volume 8085 of Lecture Notes in Computer Science, pages 633–640. Springer, Berlin/Heidelberg, Germany, 2013. (draft) (bibtex)

Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In 11th International Society for Music Information Retrieval Conference (ISMIR), pages 489–494, Utrecht, Netherlands, August 2010. (paper) (bibtex) (web)

Arnaud Dessein. Computational Methods of Information Geometry with Real-Time Applications in Audio Signal Processing. PhD thesis, Université Pierre et Marie Curie, Paris, France, December 2012. (manuscript) (résumé) (bibtex) (slides) (web)

Arshia Cont, Shlomo Dubnov, and Gérard Assayag. On the Information Geometry of Audio Streams with Applications to Similarity Computing. IEEE Transactions on Audio, Speech and Language Processing, 19(4):837–846, 2011. (preprint) (bibtex)

Arshia Cont. Modeling Musical Anticipation: From the time of music to the music of time. PhD thesis in Acoustics, Signal Processing, and Computer Science Applied to Music (ATIAM), University of Paris 6 (UPMC), Paris, France, and University of California San Diego (UCSD) (joint), 2008. (pdf) (bibtex)


music-information-geometry.txt · Last modified: 2013/08/21 15:24 by Arshia Cont