ResearchTrend.AI
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models


29 January 2023
Haohe Liu
Zehua Chen
Yi Yuan
Xinhao Mei
Xubo Liu
Danilo P. Mandic
Wenwu Wang
Mark D. Plumbley
    DiffM

Papers citing "AudioLDM: Text-to-Audio Generation with Latent Diffusion Models"

50 / 72 papers shown
Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models
Riccardo Passoni
Francesca Ronchini
Luca Comanducci
Romain Serizel
Fabio Antonacci
DiffM
23
0
0
12 May 2025
TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
Paul Primus
Florian Schmid
Gerhard Widmer
CLIP
AI4TS
VLM
23
0
0
12 May 2025
FLAM: Frame-Wise Language-Audio Modeling
Yusong Wu
Christos Tsirigotis
Ke Chen
Cheng-Zhi Anna Huang
Aaron C. Courville
Oriol Nieto
Prem Seetharaman
Justin Salamon
43
0
0
08 May 2025
SonicRAG : High Fidelity Sound Effects Synthesis Based on Retrival Augmented Generation
Yu-Ren Guo
Wen-Kai Tai
43
0
0
06 May 2025
Sparse-to-Sparse Training of Diffusion Models
Inês Cardoso Oliveira
Decebal Constantin Mocanu
Luis A. Leiva
DiffM
78
0
0
30 Apr 2025
CoCoDiff: Diversifying Skeleton Action Features via Coarse-Fine Text-Co-Guided Latent Diffusion
Zhifu Zhao
Hanyang Hua
J. Li
Shaoxin Wu
Fu Li
Yangtao Zhou
Yang Li
DiffM
68
0
0
30 Apr 2025
TrueFake: A Real World Case Dataset of Last Generation Fake Images also Shared on Social Networks
S. Dell’Anna
Andrea Montibeller
Giulia Boato
54
0
0
29 Apr 2025
LoopGen: Training-Free Loopable Music Generation
Davide Marincione
Giorgio Strano
Donato Crisostomi
Roberto Ribuoli
Emanuele Rodolà
MGen
48
0
0
06 Apr 2025
DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model Gap
Shentong Mo
Zehua Chen
Fan Bao
Jun-Jie Zhu
DiffM
50
0
0
15 Mar 2025
DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness
Yiming Zhong
Qi Jiang
Jingyi Yu
Yuexin Ma
56
2
0
11 Mar 2025
DGFM: Full Body Dance Generation Driven by Music Foundation Models
Xinran Liu
Zhenhua Feng
Diptesh Kanojia
Wenwu Wang
DiffM
62
1
0
27 Feb 2025
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Ching Hua Lee
Chouchang Yang
Jaejin Cho
Yashas Malur Saidutta
R. S. Srinivasa
Yilin Shen
Hongxia Jin
DiffM
80
0
0
19 Feb 2025
A Reversible Solver for Diffusion SDEs
Zander Blasingame
Chen Liu
DiffM
54
0
0
12 Feb 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
116
1
0
07 Feb 2025
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
Siyuan Hou
Shansong Liu
Ruibin Yuan
Wei Xue
Ying Shan
Mangsuo Zhao
Chao Zhang
87
3
0
17 Jan 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
79
2
0
10 Jan 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
52
3
0
10 Jan 2025
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
Ruben Ciranni
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Giorgio Fabbro
Emanuele Rodolà
Luca Cosmo
59
7
0
10 Jan 2025
Text2Data: Low-Resource Data Generation with Textual Control
Shiyu Wang
Yihao Feng
Tian Lan
Ning Yu
Yu Bai
R. Xu
H. Wang
Caiming Xiong
S.
DiffM
80
0
0
03 Jan 2025
Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models
Tornike Karchkhadze
M. Izadi
Shlomo Dubnov
DiffM
39
2
0
31 Dec 2024
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
Chia-Yu Hung
Navonil Majumder
Zhifeng Kong
Ambuj Mehrish
Rafael Valle
Bryan Catanzaro
Soujanya Poria
52
5
0
30 Dec 2024
Spider: Any-to-Many Multimodal LLM
Jinxiang Lai
Jie Zhang
Jun Liu
Jian Li
Xiaocheng Lu
Song Guo
MLLM
54
2
0
14 Nov 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
57
3
0
14 Oct 2024
Presto! Distilling Steps and Layers for Accelerating Music Generation
Zachary Novack
Ge Zhu
Jonah Casebeer
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
45
5
0
07 Oct 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
T. Pham
Tri Ton
Chang D. Yoo
36
3
0
03 Oct 2024
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh
Sonal Kumar
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
Dinesh Manocha
DiffM
39
2
0
02 Oct 2024
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
63
4
0
26 Sep 2024
GALD-SE: Guided Anisotropic Lightweight Diffusion for Efficient Speech Enhancement
Chengzhong Wang
Jianjun Gu
Dingding Yao
Junfeng Li
Yonghong Yan
DiffM
40
0
0
23 Sep 2024
AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework
Yuhang Jia
Yang Chen
Jinghua Zhao
Shiwan Zhao
Wenjia Zeng
Yong Chen
Yong Qin
DiffM
29
1
0
19 Sep 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
Y. Wang
Hangting Chen
Dongchao Yang
Zhiyong Wu
Xixin Wu
DiffM
40
2
0
19 Sep 2024
High-Resolution Speech Restoration with Latent Diffusion Model
Tushar Dhyani
Florian Lux
Michele Mancusi
Giorgio Fabbro
Fritz Hohl
Ngoc Thang Vu
DiffM
30
0
0
17 Sep 2024
Language-Queried Target Sound Extraction Without Parallel Training Data
Hao Ma
Zhiyuan Peng
Xu Li
Yukai Li
Mingjie Shao
Qiuqiang Kong
Ju Liu
VLM
67
1
0
14 Sep 2024
Sub-graph Based Diffusion Model for Link Prediction
Hang Li
Wei Jin
Geri Skenderi
Harry Shomer
Wenzhuo Tang
Wenqi Fan
Jiliang Tang
DiffM
21
0
0
13 Sep 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
Yong Ren
Chenxing Li
Manjie Xu
Wei Liang
Yu Gu
Rilin Chen
Dong Yu
VGen
DiffM
43
6
0
13 Sep 2024
Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings
Tanisha Hisariya
Huan Zhang
Jinhua Liang
24
3
0
12 Sep 2024
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
Chang Zeng
Chunhui Wang
Xiaoxiao Miao
Jian Zhao
Zhonglin Jiang
Yong Chen
25
0
0
10 Sep 2024
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
65
3
0
23 Aug 2024
Video-to-Audio Generation with Hidden Alignment
Manjie Xu
Chenxing Li
Yong Ren
Rilin Chen
Yu Gu
Dong Yu
DiffM
VGen
43
11
0
10 Jul 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
25
11
0
08 Jul 2024
PAGURI: a user experience study of creative interaction with text-to-music models
Francesca Ronchini
Luca Comanducci
Gabriele Perego
Fabio Antonacci
24
2
0
05 Jul 2024
TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation
Jian Qian
Miao Sun
Sifan Zhou
Biao Wan
Minhao Li
Patrick Chiang
25
7
0
05 Jul 2024
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Ivan Villa-Renteria
Mason L. Wang
Zachary Shah
Zhe Li
Soohyun Kim
Neelesh Ramachandran
Mert Pilanci
34
0
0
27 Jun 2024
MusicScore: A Dataset for Music Score Modeling and Generation
Yuheng Lin
Zheqi Dai
Qiuqiang Kong
VLM
32
2
0
17 Jun 2024
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation
Wenhao Guan
K. Wang
Wangjin Zhou
Yang Wang
Feng Deng
Hui Wang
Lin Li
Q. Hong
Yong Qin
DiffM
28
3
0
12 Jun 2024
Generative Diffusion Models for Fast Simulations of Particle Collisions at CERN
Mikołaj Kita
Jan Dubiński
Przemysław Rokita
Kamil Deja
DiffM
AI4CE
27
2
0
05 Jun 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami
Zhaonan Li
Saumya Sinha
Truc Nguyen
48
1
0
30 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Wei Ping
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
40
29
0
29 May 2024
AdjointDEIS: Efficient Gradients for Diffusion Models
Zander Blasingame
Chen Liu
DiffM
30
2
0
23 May 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
48
9
0
20 May 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
54
17
0
28 Feb 2024