GraFPrint: A GNN-Based Approach for Audio Identification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

14 October 2024

Main: 4 pages

2 figures

Bibliography: 1 page

Abstract

This paper introduces GraFPrint, an audio identification framework that leverages the structural learning capabilities of Graph Neural Networks (GNNs) to create robust audio fingerprints. Our method constructs a k-nearest neighbor (k-NN) graph from time-frequency representations and applies max-relative graph convolutions to encode local and global information. The network is trained with a self-supervised contrastive approach, which makes the learned representations more resilient to ambient distortions. GraFPrint demonstrates superior performance on large-scale datasets at various levels of granularity, and it is both lightweight and scalable, which makes it suitable for real-world applications with extensive reference databases.
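
To make the graph step concrete, the following is a minimal NumPy sketch of a k-NN graph followed by one max-relative graph convolution. It is an illustration under stated assumptions, not the GraFPrint implementation: the function names `knn_graph` and `max_relative_conv`, the Euclidean distance, the single linear layer with ReLU, and all sizes are hypothetical choices, and random vectors stand in for embeddings of time-frequency patches.

```python
import numpy as np


def knn_graph(x, k):
    """Return an (n, k) array holding the k nearest neighbors of each row of x.

    Assumption: neighbors are chosen by squared Euclidean distance in feature space.
    """
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # a node is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def max_relative_conv(x, neighbors, weight, bias):
    """One max-relative graph convolution.

    For each node i, take the element-wise maximum of (x_j - x_i) over its
    neighbors j, concatenate it with x_i, then apply a linear map and ReLU.
    """
    rel = x[neighbors] - x[:, None, :]         # (n, k, d) relative features
    agg = rel.max(axis=1)                      # (n, d) max over neighbors
    h = np.concatenate([x, agg], axis=1)       # (n, 2d)
    return np.maximum(h @ weight + bias, 0.0)  # (n, d_out)


# Hypothetical sizes: 32 nodes (patch embeddings), 16 features, 3 neighbors.
rng = np.random.default_rng(0)
n_nodes, dim, out_dim, k = 32, 16, 8, 3
nodes = rng.standard_normal((n_nodes, dim))

neighbors = knn_graph(nodes, k)
weight = rng.standard_normal((2 * dim, out_dim)) * 0.1
bias = np.zeros(out_dim)
out = max_relative_conv(nodes, neighbors, weight, bias)
print(out.shape)  # (32, 8)
```

Because the neighbors are selected in feature space rather than by position, a node can aggregate from patches that are far apart in time or frequency, which is one way a layer like this can mix local and global information.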
