Supervised approach of rhythm quantification based on tree series enumeration

A new quantization system is currently under development. It takes an interactive approach that breaks with previous single-solution systems: a dedicated interface lets you visualize and edit the transcriptions of the input sequence. It is implemented in the newest version of the "rhythm" library.


To download the sources, go to:

Download the zip file, uncompress it, and move it to your library folder (either OM X.X/libraries, or the folder specified in Preferences/Libraries).

NOTE: This library is compatible with OM 6.10 and later. Make sure you have a compatible version.

The rq Class

The new quantization system is implemented as a new class: rq. The class has two inputs:

Double-click the quant-system to open the quantization interface.

The interface has three views:

The k-best parsing algorithm

The quantization algorithm is a recursive, lazy, dynamic-programming algorithm. It works by recursively subdividing segments into equal parts, then aligning the input points to the closest segment boundary. The algorithm enumerates the subdivisions that give the best results, ranked according to a criterion that combines two quality measures:

See the Further Reading section for more details about its in-depth functioning.
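
As a rough illustration of the "subdivide, then align" step described above, the following Python sketch builds the grid obtained by cutting a segment into equal parts level by level, then moves each onset to the closest grid point. The names grid and align, the use of milliseconds, and the summed-distance score are assumptions made for this example. It does not reproduce the library's lazy k-best enumeration, nor its actual ranking criterion.

```python
from fractions import Fraction


def grid(start, end, divisions):
    """Boundaries obtained by cutting [start, end] into equal parts, level by level."""
    points = [Fraction(start), Fraction(end)]
    for d in divisions:
        refined = []
        # Cut every current interval into d equal parts.
        for a, b in zip(points, points[1:]):
            refined.extend(a + (b - a) * k / d for k in range(d))
        refined.append(points[-1])
        points = refined
    return points


def align(onsets, points):
    """Move each onset to the closest boundary; return (aligned onsets, total distance)."""
    aligned = [min(points, key=lambda p: abs(p - t)) for t in onsets]
    error = sum(abs(p - t) for p, t in zip(aligned, onsets))
    return aligned, error


# Hypothetical onsets (ms) inside a 1000 ms segment.
onsets = [10, 240, 520, 760]
_, error_binary = align(onsets, grid(0, 1000, [2, 2]))   # grid at 0, 250, 500, 750, 1000
_, error_ternary = align(onsets, grid(0, 1000, [3]))     # grid at 0, 1000/3, 2000/3, 1000
```

With these onsets, the grid cut in 2 then 2 gives a smaller total distance than the grid cut in 3, so it would rank higher under this simplified score.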

Example patch

A simple example patch can be found here: patch exemple quant-system.omp

Typical workflow

Load a chord-seq into the quant-system

To do so, plug the chord-seq you want to quantify into the "chord-seq" input of the rq box, then evaluate it.

The input chord-seq should be monophonic: the chord objects inside it should not overlap (if they do, the algorithm removes the overlap). Chords that contain more than one note are handled correctly. If the notes inside a chord do not all have the same length, the algorithm takes the chord's length to be that of its longest note.
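
The two rules above can be sketched in Python as follows. The function name make_monophonic and the data layout (a list of onsets with note durations, in ms) are hypothetical, and truncating a chord at the next onset is only one plausible way of removing overlap; the source does not say how the library does it.

```python
def make_monophonic(chords):
    """chords: list of (onset_ms, [note durations in ms]), sorted by onset.

    Returns a list of (onset_ms, duration_ms) in which no chord overlaps the next.
    """
    result = []
    for i, (onset, note_durations) in enumerate(chords):
        # A chord lasts as long as its longest note.
        duration = max(note_durations)
        if i + 1 < len(chords):
            next_onset = chords[i + 1][0]
            # Assumption: overlap is removed by cutting the chord at the next onset.
            duration = min(duration, next_onset - onset)
        result.append((onset, duration))
    return result


# The first chord (longest note: 600 ms) overlaps the chord starting at 500 ms.
print(make_monophonic([(0, [400, 600]), (500, [300]), (1000, [200])]))
# prints [(0, 500), (500, 300), (1000, 200)]
```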

Segment the input

You can then either:

Segmentation marks can be placed anywhere, not necessarily on a chord. The "Slur" field indicates whether the last note of the previous segment overlaps with the current one, and if so, displays the amount of overlap (in ms).

For best results, segments should be about one bar long and should correspond to constant-tempo regions. Each segment will correspond to a bar in the final transcription.

Click "quantify"

Click "quantify" to run the algorithm on each segment. Computations can take a long time when the segments are long or when the schema is large. See the section How to make it faster.

Select in the k-best panel

In the k-best panel, select the transcriptions you want to keep. To display this panel (on the right), tick "Show solutions". To navigate through the solutions you can either:

The chosen solution will be automatically updated in the editor view.

Edit the transcription

Double-clicking a chord or a group displays a list of other possible transcriptions for this chord or group. You can then choose a solution from this list by double-clicking it. You can also compute more solutions by clicking "More".

You can also replace a subtree with any valid OM rhythm tree: select a chord or group, press "e", and type the subtree you want.

Pay attention to the cursor mode you are in: chord to select a single chord, group to select a subtree.

Editing is only available in "Edit mode".

Retrieve the final transcription

To get the final transcription as an independent voice object, use the function get-voice, which takes a quant-system as input and outputs a voice. An optional input selects whether the transcription is taken as displayed in Edit mode or in Render mode (default: Render).

Do not forget to lock the rq box before evaluating, otherwise you will lose your work!

Video demo

A video of the system in use, following the workflow above, can be found here: Video presentation.

Tempo smoothing

Our system estimates the tempo of each segment independently, in order to find, for each segment, the tempo that gives the best solution. As a result, the tempo can vary widely from one measure to the next, which is unsatisfactory in many cases. To solve this problem, we added a tempo-smoothing algorithm that looks for solutions keeping an approximately constant tempo over all segments, or at least over sequences of segments that are as long as possible.

The fitness of a sequence of tempi is assessed according to two criteria:

The balance between these two criteria can be set with the Smoothing parameter.

For each segment, the algorithm determines whether there is a tempo close to that of the previous segment. If, in the current segment, no tempo is close enough to the previous one, the algorithm can change the tempo and start a new sequence of tempi: we call this a "tempo jump". The Tempo jump penalty parameter sets how frequent these tempo jumps should be: the higher it is, the less frequent tempo jumps will be.
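
The idea of trading tempo continuity against tempo jumps can be illustrated with a small dynamic-programming sketch in Python. Everything in it is an assumption made for illustration: the function name smooth_tempi, the per-candidate local cost, the tolerance used to decide what counts as "close", and the way the smoothing and jump_penalty values enter the cost. It is not the library's algorithm and does not reproduce its two fitness criteria.

```python
def smooth_tempi(candidates, smoothing=1.0, jump_penalty=50.0, tolerance=0.1):
    """candidates: one list per segment of (tempo_bpm, local_cost) pairs.

    Returns the sequence of tempi with the lowest total cost (Viterbi-style search).
    """
    # best[j] = (total cost, tempo path) of the cheapest path ending at candidate j.
    best = [(cost, [tempo]) for tempo, cost in candidates[0]]
    for segment in candidates[1:]:
        new_best = []
        for tempo, cost in segment:
            options = []
            for prev_cost, path in best:
                prev = path[-1]
                if abs(tempo - prev) <= tolerance * prev:
                    # Close to the previous tempo: pay for the small drift.
                    step = smoothing * abs(tempo - prev)
                else:
                    # Too far from the previous tempo: this is a tempo jump.
                    step = jump_penalty
                options.append((prev_cost + step + cost, path + [tempo]))
            new_best.append(min(options, key=lambda o: o[0]))
        best = new_best
    return min(best, key=lambda o: o[0])[1]


# Hypothetical candidates for three segments: (tempo, cost of the best transcription).
candidates = [
    [(60, 0.0), (120, 1.0)],
    [(62, 2.0), (120, 0.0)],
    [(61, 0.0), (90, 0.0)],
]
steady = smooth_tempi(candidates)                      # high penalty: [60, 62, 61]
jumpy = smooth_tempi(candidates, jump_penalty=0.0)     # free jumps:   [60, 120, 61]
```

In this toy setting, a high jump penalty keeps the tempo near 60 throughout, while a zero penalty lets each segment pick its locally cheapest tempo.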


The "Parameters" button lets you edit the global quantization parameters (for all segments). There are several categories of parameters:

Segment parameters

These parameters are segment-dependent and can be edited for each segment:

To edit the quantization parameters for one segment only, double-click the corresponding segmentation mark. You can then re-run the algorithm on this segment only by selecting it and pressing "q", which can save a lot of computation time.

Segmentation parameters

This parameter concerns the segmentation algorithm.

Tempo smoothing parameters

These parameters concern the tempo-smoothing algorithm.

Display parameters

The Edit/Render switch selects between two display modes: Edit mode, where the display might not be ideal but the solution can be edited, and Render mode, where the display is prettier but editing is not permitted.

The Color mode option uses a color code (red: bad, green: good) to show the precision of the current transcription for the selected chord or group in the editor view.

The Show pulses option displays in the chord-seq view the quantization grid on which onsets are aligned.

The Show solutions switch displays the k-best panel on the right.

How to make it faster

The quantification algorithm requires a lot of computation, which can take a long time, especially on older computers. Fortunately, there are ways to make it faster.

Export functions

To export one or more transcriptions from the quant-system to OpenMusic objects, two functions are available:

Subdivision Schemas

The subdivision schema specifies which subdivision or series of subdivisions are allowed in each beat. It is given as a list.

For example, the schema (2 2 3) specifies that each beat can be cut in 2, then each part can be cut in 2, then each part can be cut again in 3.

Each segment can be cut again, or left as is.

For example, with the previous schema, the first half can be cut in 2 while the second half is left uncut. Likewise, the first quarter can be cut in 3 while the second quarter is left uncut.
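
Because every part can independently be cut or left as is, the number of possible subdivision trees grows quickly with the schema. The following Python sketch counts them for a flat schema; count_trees is a name chosen for this illustration, not a function of the library.

```python
def count_trees(schema):
    """Number of distinct subdivision trees allowed by a flat schema.

    Every part is either left uncut, or cut into schema[0] parts that are
    then subdivided independently according to the rest of the schema.
    """
    if not schema:
        return 1  # nothing left to cut: a single uncut part
    parts, rest = schema[0], schema[1:]
    return 1 + count_trees(rest) ** parts


print(count_trees([2, 2, 3]))  # prints 26
```

For the schema (2 2 3), this gives 26 possible trees per beat: 2 for a part at the last level, 1 + 2² = 5 at the middle level, and 1 + 5² = 26 at the top level.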

Nested lists can be used to describe alternative possibilities.

For example, the schema (2 (2 3) 3) specifies that each beat can be cut in 2, then each part can be cut in 2 or in 3, then each part can be cut again in 3.

When a choice is left between various subdivisions, the choice is made in each segment independently of the choices in the other segments.

For example, with the previous schema, the first half can be cut in 2 or 3, and the second half can independently be cut in 2 or 3; the choices can be different in each part.

Deeper nested lists can be used to describe alternative series of subdivisions. More generally, when the current depth is odd, the list describes successive choices, and when it is even, the list describes alternative choices.

For example, the schema (2 ( (2 3) (3 2) ) ) specifies that each beat can be cut in 2, then each part can be cut in 2 then 3, or in 3 then 2.

The schema ( ( ( ( (2 3) (3 2) ) 2) (5 (2 3) ) 7) ) is equivalent to leaving the choice between the following schemas: (2 3 2), (3 2 2), (5 2), (5 3), (7)
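
The odd/even depth rule stated above can be sketched as a short recursive Python function that expands a nested schema into the flat schemas it stands for. The function name expand and the use of Python lists in place of Lisp lists are choices made for this illustration; this is not the library's own code.

```python
from itertools import product


def expand(schema, depth=1):
    """Return all flat schemas described by a nested schema.

    Odd depth: the list is a series of successive subdivisions.
    Even depth: the list is a set of alternatives.
    """
    if isinstance(schema, int):
        return [(schema,)]
    if depth % 2 == 1:
        # Successive subdivisions: combine one expansion of each element, in order.
        parts = [expand(item, depth + 1) for item in schema]
        return [sum(combo, ()) for combo in product(*parts)]
    # Alternatives: collect the expansions of every element.
    result = []
    for item in schema:
        result.extend(expand(item, depth + 1))
    return result


print(expand([2, [2, 3], 3]))         # prints [(2, 2, 3), (2, 3, 3)]
print(expand([2, [[2, 3], [3, 2]]]))  # prints [(2, 2, 3), (2, 3, 2)]
```

The two calls correspond to the schemas (2 (2 3) 3) and (2 ( (2 3) (3 2) ) ) discussed earlier. Each flat schema still leaves every part free to be cut or left as is.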

More examples and further explanations can be found in the references given in the Further Reading section.

Further reading

To learn more about this transcription system:

For any inquiries, please contact: adrien[dot]ycart[at]