FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates

Abstract

This paper introduces FlowMAC, a novel neural audio codec for high-quality general audio compression at low bit rates, based on conditional flow matching (CFM). FlowMAC jointly learns a mel spectrogram encoder, quantizer and decoder. At inference time, the decoder integrates a continuous normalizing flow via an ODE solver to generate a high-quality mel spectrogram. This is the first time that a CFM-based approach has been applied to general audio coding, enabling scalable, simple and memory-efficient training. Our subjective evaluations show that FlowMAC at 3 kbps achieves quality similar to that of state-of-the-art GAN-based and DDPM-based neural audio codecs at double the bit rate. Moreover, FlowMAC offers a tunable inference pipeline, which makes it possible to trade off complexity against quality. This enables real-time coding on CPU while maintaining high perceptual quality.
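
To make the complexity/quality trade-off concrete, the following is a minimal sketch of fixed-step Euler integration of an ODE from t = 0 to t = 1, the general mechanism by which a flow-based decoder turns a starting point into an output. It is not FlowMAC's decoder: `toy_vector_field`, `TARGET`, and all function names are hypothetical stand-ins for a learned, conditioned vector field, and the choice of the Euler method is an assumption made for illustration. The toy field has a closed-form solution, so the integration error can be measured as a function of the number of solver steps.

```python
import math

# Hypothetical stand-in for the decoder output the flow should move toward.
TARGET = [1.0, -2.0, 0.5]


def toy_vector_field(x, t):
    """Toy field dx/dt = TARGET - x (a learned network would go here)."""
    return [ti - xi for ti, xi in zip(TARGET, x)]


def euler_integrate(vector_field, x0, num_steps):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h
        v = vector_field(x, t)  # one vector-field evaluation per step
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


def exact_solution(x0):
    """Closed-form solution of the toy ODE at t=1."""
    decay = math.exp(-1.0)
    return [ti + (xi - ti) * decay for ti, xi in zip(TARGET, x0)]


def max_abs_error(num_steps, x0=(0.0, 0.0, 0.0)):
    """Largest per-coordinate gap between Euler and the exact solution."""
    approx = euler_integrate(toy_vector_field, x0, num_steps)
    exact = exact_solution(x0)
    return max(abs(a - e) for a, e in zip(approx, exact))


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, max_abs_error(n))
```

Each step costs one evaluation of the vector field, so the step count directly sets the compute budget; in this toy setting, more steps give a smaller integration error.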

@article{pia2025_2409.17635,
  title={FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates},
  author={Nicola Pia and Martin Strauss and Markus Multrus and Bernd Edler},
  journal={arXiv preprint arXiv:2409.17635},
  year={2025}
}