Retrieval-Augmented Text-to-Audio Generation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
14 September 2023
Yi Yuan
Haohe Liu
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
    RALM

Papers citing "Retrieval-Augmented Text-to-Audio Generation"

20 / 20 papers shown
Feedback-driven Retrieval-augmented Audio Generation with Large Audio Language Models
Junqi Zhao
Chenxing Li
Jinzheng Zhao
Rilin Chen
Dong Yu
Mark D. Plumbley
Wenwu Wang
82
0
0
02 Nov 2025
DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
Yi Yuan
Xubo Liu
Haohe Liu
Xiyuan Kang
Zhuo Chen
Yuping Wang
Mark D. Plumbley
Wenwu Wang
DiffM
104
0
0
07 Sep 2025
MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
Xiquan Li
Junxi Liu
Yuzhe Liang
Zhikang Niu
Wenxi Chen
Xie Chen
129
2
0
08 Aug 2025
A Review on Score-based Generative Models for Audio Applications
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM, MedIm
158
3
0
10 Jun 2025
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation
Yan Rong
Jinting Wang
Shan Yang
Guangzhi Lei
Li Liu
DiffM, VGen
213
1
0
28 May 2025
AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis
Dan Luo
Chengyuan Ma
Weiqin Li
Jun Wang
Wei Chen
Zhiyong Wu
244
4
0
14 Apr 2025
Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Yoshiki Masuyama
Gordon Wichern
François Germain
Christopher Ick
Jonathan Le Roux
223
4
0
22 Jan 2025
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
Information Fusion (Inf. Fusion), 2025
Rui Liu
Zhenqi Jia
F. Bao
Hong Li
163
7
0
11 Jan 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
355
20
0
10 Jan 2025
Sound-VECaps: Improving Audio Generation with Visual Enhanced Captions
Yi Yuan
Dongya Jia
Xiaobin Zhuang
Yuanzhe Chen
Zhengxi Liu
...
Longji Xu
Xubo Liu
Xiyuan Kang
Mark D. Plumbley
Wenwu Wang
VLM
311
4
0
03 Jan 2025
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker
Anton Smirnov
Jordi Pons
CJ Carr
Zack Zukowski
Zach Evans
Xubo Liu
265
49
0
29 Nov 2024
Leveraging Retrieval Augment Approach for Multimodal Emotion Recognition Under Missing Modalities
Qi Fan
Hongyu Yuan
Haolin Zuo
Rui Liu
Guanglai Gao
87
2
0
19 Sep 2024
Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Zhiqi Huang
Dan Luo
Jun Wang
Huan Liao
Zhiheng Li
Zhiyong Wu
VGen
146
5
0
13 Sep 2024
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation
Interspeech (Interspeech), 2024
Shuchen Shi
Ruibo Fu
Zhengqi Wen
Jianhua Tao
Tao Wang
...
Zhengqi Wen
Yukun Liu
Yongwei Li
Zhiyong Wang
Xiaopeng Wang
135
2
0
07 Jun 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM, MedIm
247
8
0
31 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
297
31
0
30 Apr 2024
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining
Yi Yuan
Zhuo Chen
Xubo Liu
Haohe Liu
Xuenan Xu
Dongya Jia
Yuanzhe Chen
Mark D. Plumbley
Wenwu Wang
CLIP, VLM
159
20
0
27 Apr 2024
A Survey on Retrieval-Augmented Text Generation for Large Language Models
Yizheng Huang
Jimmy X. Huang
3DV, RALM
231
79
0
17 Apr 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Penghao Zhao
Hailin Zhang
Qinhan Yu
Zhengren Wang
Yunteng Geng
Fangcheng Fu
Ling Yang
Wentao Zhang
Jie Jiang
Tengjiao Wang
3DV
832
426
0
29 Feb 2024
Retrieval Augmented End-to-End Spoken Dialog Models
Mingqiu Wang
Izhak Shafran
H. Soltau
Wei Han
Yuan Cao
Dian Yu
Laurent El Shafey
RALM, AuLLM
144
21
0
02 Feb 2024