PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

3 July 2024

Papers citing "PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation"

7 / 7 papers shown

Title
Audio-Language Datasets of Scenes and Events: A Survey Gijs Wijngaard Elia Formisano Michele Esposito M. Dumontier 79 2 0 10 Jan 2025
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach Rory Young Nicolas Pugeault AAML 57 0 0 14 Oct 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions Y. Wang Hangting Chen Dongchao Yang Zhiyong Wu Xixin Wu DiffM 40 2 0 19 Sep 2024
MambaFoley: Foley Sound Generation using Selective State-Space Models Marco Furio Colombo Francesca Ronchini Luca Comanducci Fabio Antonacci Mamba 20 1 0 13 Sep 2024
Fast Timing-Conditioned Latent Audio Diffusion Zach Evans CJ Carr Josiah Taylor Scott H. Hawley Jordi Pons DiffM 74 101 0 07 Feb 2024
T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis Yoonjin Chung Junwon Lee Juhan Nam 40 13 0 17 Jan 2024
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model Deepanway Ghosal Navonil Majumder Ambuj Mehrish Soujanya Poria 138 141 0 24 Apr 2023