ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2503.01485
53
0

FlowDec: A flow-based full-band general audio codec with high perceptual quality

3 March 2025
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
ArXivPDFHTML
Abstract

We propose FlowDec, a neural full-band audio codec for general audio sampled at 48 kHz that combines non-adversarial codec training with a stochastic postfilter based on a novel conditional flow matching method. Compared to the prior work ScoreDec which is based on score matching, we generalize from speech to general audio and move from 24 kbit/s to as low as 4 kbit/s, while improving output quality and reducing the required postfilter DNN evaluations from 60 to 6 without any fine-tuning or distillation techniques. We provide theoretical insights and geometric intuitions for our approach in comparison to ScoreDec as well as another recent work that uses flow matching, and conduct ablation studies on our proposed components. We show that FlowDec is a competitive alternative to the recent GAN-dominated stream of neural codecs, achieving FAD scores better than those of the established GAN-based codec DAC and listening test scores that are on par, and producing qualitatively more natural reconstructions for speech and harmonic structures in music.

View on arXiv
@article{welker2025_2503.01485,
  title={ FlowDec: A flow-based full-band general audio codec with high perceptual quality },
  author={ Simon Welker and Matthew Le and Ricky T. Q. Chen and Wei-Ning Hsu and Timo Gerkmann and Alexander Richard and Yi-Chiao Wu },
  journal={arXiv preprint arXiv:2503.01485},
  year={ 2025 }
}
Comments on this paper