
vocadito: A dataset of solo vocals with $f_0$, note, and lyric annotations

Abstract

To complement the existing set of datasets, we present a small dataset entitled vocadito, consisting of 40 short excerpts of monophonic singing, sung in 7 different languages by singers with varying levels of training, and recorded on a variety of devices. We provide several types of annotations, including $f_0$, lyrics, and two different note annotations. All annotations were created by musicians. We provide an analysis of the differences between the two note annotations, and find that the agreement level is low, which has implications for evaluating vocal note estimation algorithms. We also analyze the relation between the $f_0$ and note annotations, and show that quantizing $f_0$ values in frequency does not provide a reasonable note estimate, reinforcing the difficulty of the note estimation task for singing voice. Finally, we provide baseline results from recent algorithms on vocadito for note and $f_0$ transcription. Vocadito is made freely available for public use.
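To make the quantization claim concrete, the following is a minimal sketch of one naive way to turn a frame-wise $f_0$ track into notes: round each voiced frame to the nearest equal-tempered semitone and merge runs of equal pitch. It is not the authors' code. The function names, the A4 = 440 Hz reference, the convention that unvoiced frames are marked with 0 Hz, and the example values are all assumptions made for illustration.

```python
import math


def hz_to_midi(f0_hz, a4_hz=440.0):
    """Convert a frequency in Hz to a fractional MIDI pitch (assumes A4 = 440 Hz)."""
    return 69.0 + 12.0 * math.log2(f0_hz / a4_hz)


def quantize_f0(f0_hz_frames):
    """Round each voiced frame to the nearest semitone.

    Frames with f0 <= 0 are treated as unvoiced (an assumed convention)
    and mapped to None.
    """
    return [round(hz_to_midi(f)) if f > 0 else None for f in f0_hz_frames]


def group_notes(pitches, hop_s):
    """Merge runs of identical quantized pitch into (onset_s, offset_s, midi) tuples."""
    notes = []
    start = 0
    for i in range(1, len(pitches) + 1):
        # Close the current run at the end of the track or when the pitch changes.
        if i == len(pitches) or pitches[i] != pitches[start]:
            if pitches[start] is not None:
                notes.append((start * hop_s, i * hop_s, pitches[start]))
            start = i
    return notes


# Hypothetical f0 track: one sustained A4 whose pitch briefly drifts sharp,
# as can happen with vibrato or a scoop.
f0_track = [440.0, 452.0, 455.0, 452.0, 440.0]
notes = group_notes(quantize_f0(f0_track), hop_s=0.01)
print(notes)
```

In this made-up example, a single sung note is split into three notes because one frame crosses the boundary between two semitones. This is one way that frequency quantization alone can give a poor note estimate, in line with the finding reported in the abstract.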
