
Contrastive Representation Learning for Acoustic Parameter Estimation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Abstract

A study is presented in which a contrastive learning approach is used to extract low-dimensional representations of the acoustic environment from single-channel, reverberant speech signals. Convolution of room impulse responses (RIRs) with anechoic source signals is leveraged as a data augmentation technique that offers considerable flexibility in the design of the upstream task. We evaluate the embeddings on three downstream tasks: regression of the reverberation time (RT60), regression of the clarity index (C50), and classification of rooms as small or large. We demonstrate that the learned representations generalize well to unseen data and perform similarly to a fully-supervised baseline.
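
As an illustration of the augmentation idea only, the following minimal sketch convolves two different dry signals with the same RIR, so that the resulting pair shares an acoustic environment and could serve as a positive pair in a contrastive setup. The pairing rule, the noise-based toy RIR, the white-noise stand-ins for anechoic speech, and all function names (`synthetic_rir`, `reverberate`, `make_positive_pair`) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def synthetic_rir(rt60, fs=16000, length_s=1.0, rng=None):
    """Toy RIR (assumption): white noise whose envelope decays 60 dB in rt60 seconds."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(int(fs * length_s)) / fs
    # An amplitude factor of 1/1000 corresponds to -60 dB, hence ln(1000).
    envelope = np.exp(-np.log(1000.0) * t / rt60)
    rir = rng.standard_normal(t.size) * envelope
    return rir / np.max(np.abs(rir))


def reverberate(dry, rir):
    """Convolve a dry signal with an RIR, trim to the dry length, peak-normalize."""
    wet = np.convolve(dry, rir)[: dry.size]
    return wet / (np.max(np.abs(wet)) + 1e-12)


def make_positive_pair(dry_a, dry_b, rir):
    """Hypothetical pairing rule: different source content, same room."""
    return reverberate(dry_a, rir), reverberate(dry_b, rir)


rng = np.random.default_rng(0)
fs = 16000
dry_a = rng.standard_normal(fs)  # stand-in for one second of anechoic speech
dry_b = rng.standard_normal(fs)  # stand-in for a different utterance
rir = synthetic_rir(rt60=0.5, fs=fs, rng=rng)
view_a, view_b = make_positive_pair(dry_a, dry_b, rir)
```

In practice the dry signals would be anechoic speech recordings and the RIRs would be measured or simulated; an encoder trained with a contrastive loss would then be encouraged to map `view_a` and `view_b` to nearby embeddings.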
