arXiv:2305.18474
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
29 May 2023
Jia-Bin Huang
Yi Ren
Rongjie Huang
Dongchao Yang
Zhenhui Ye
Chen Zhang
Jinglin Liu
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
Papers citing
"Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation"
11 / 61 papers shown
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yongqi Wang
Ruofan Hu
Rongjie Huang
Zhiqing Hong
Ruiqi Li
Wenrui Liu
Fuming You
Tao Jin
Zhou Zhao
329
21
0
18 Mar 2024
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xuenan Xu
Xiaohang Xu
Zeyu Xie
Pingyue Zhang
Mengyue Wu
Kai Yu
135
8
0
07 Mar 2024
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
Yazhou Xing
Yin-Yin He
Zeyue Tian
Xintao Wang
Qifeng Chen
285
99
0
27 Feb 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
162
6
0
20 Feb 2024
Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Zhiwei Lin
Jun Chen
Boshi Tang
Binzhu Sha
Jing Yang
Yaolong Ju
Fan Fan
Shiyin Kang
Zhiyong Wu
Helen M. Meng
222
2
0
15 Jan 2024
Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Jinlong Xue
Yayue Deng
Yingming Gao
Ya Li
DiffM
212
58
0
02 Jan 2024
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
242
34
0
06 Dec 2023
VoiceLDM: Text-to-Speech with Environmental Context
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yeong-Won Lee
In-won Yeon
Juhan Nam
Joon Son Chung
VLM
DiffM
135
30
0
24 Sep 2023
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Interspeech, 2023
Yatong Bai
Trung D. Q. Dang
Dung N. Tran
K. Koishida
Somayeh Sojoudi
DiffM
321
32
0
19 Sep 2023
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu
Yiitan Yuan
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Qiao Tian
Yuping Wang
Wenwu Wang
Yuxuan Wang
Mark D. Plumbley
DiffM
264
368
0
10 Aug 2023
Separate Anything You Describe
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Xubo Liu
Qiuqiang Kong
Yan Zhao
Haohe Liu
Yiitan Yuan
Yuzhuo Liu
Rui Xia
Yuxuan Wang
Mark D. Plumbley
Wenwu Wang
VLM
261
69
0
09 Aug 2023