Abstract
We present a method for tempo estimation from audio recordings that is based on signal processing and peak tracking and does not depend on training on ground-truth data. First, an accentuation curve, emphasising the temporal location and accentuation of notes, is computed by detecting bursts of energy localised in time and frequency. This makes it possible to detect notes in dense polyphonic textures while ignoring spectral fluctuations produced by vibrato and tremolo. Periodicities in the accentuation curve are detected using an improved version of the autocorrelation function. Hierarchical metrical structures, composed of a large set of periodicities in pairwise harmonic relationships, are tracked over time. In this way, the metrical structure can be tracked even if the rhythmical emphasis switches from one metrical level to another.
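As a rough illustration of the periodicity-detection step, the following Python sketch estimates a single global tempo from an accentuation curve using plain autocorrelation. It is not the improved autocorrelation function of the method, and it does not track metrical structure over time. The function name `estimate_tempo`, its parameters, the BPM search range, and the synthetic input are all assumptions made for this example.

```python
import numpy as np


def estimate_tempo(accent, frame_rate, bpm_min=40.0, bpm_max=200.0):
    """Estimate one global tempo (in BPM) from an accentuation curve.

    Hypothetical helper for illustration only: it uses plain
    autocorrelation, not the improved variant described in the text.
    """
    x = np.asarray(accent, dtype=float)
    x = x - x.mean()  # remove the DC offset before correlating

    # Full autocorrelation; keep only the non-negative lags.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]

    # Convert the admissible tempo range into a lag range (in frames).
    lag_min = int(np.ceil(frame_rate * 60.0 / bpm_max))
    lag_max = int(np.floor(frame_rate * 60.0 / bpm_min))
    lag_max = min(lag_max, len(ac) - 1)
    if lag_min < 1 or lag_min > lag_max:
        raise ValueError("signal too short for the requested tempo range")

    # The strongest autocorrelation peak in range gives the beat period.
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return 60.0 * frame_rate / best_lag


# Synthetic accentuation curve (assumed data): one impulse every 0.5 s,
# sampled at 100 frames per second, i.e. a pulse at 120 BPM.
frame_rate = 100.0
accent = np.zeros(1000)
accent[::50] = 1.0

tempo = estimate_tempo(accent, frame_rate)
print(f"Estimated tempo: {tempo:.1f} BPM")
```

Restricting the search to a BPM range is one simple way to choose among the harmonically related peaks (here at lags 50, 100 and 150 frames); the method described above instead tracks the whole set of related periodicities as a metrical structure.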
Compared with all the other participants in the MIREX Audio Tempo Extraction task from 2006 to 2018, this approach ranks third among those that can track tempo variations. While the two best methods are based on machine learning, our method suggests a way to track tempo founded on signal processing and heuristics-based peak tracking. In addition, the approach offers, for the first time, a detailed representation of the dynamic evolution of the metrical structure. The method is integrated into MIRtoolbox, a freely available Matlab toolbox.