JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

Conference on Algebraic Informatics (CAI), 2023
9 August 2023
Peike Li
Bo-Yu Chen
Yao Yao
Yikai Wang
Allen Wang
Alex Jinpeng Wang
MGen, VLM, DiffM

Papers citing "JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models"

50 / 69 papers shown
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
Junyou Wang
Zehua Chen
Binjie Yuan
Kaiwen Zheng
Chang Li
Yuxuan Jiang
Jun Zhu
137
0
0
28 Sep 2025
LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation
Tom Baker
Javier Nistal
DiffM
237
1
0
13 Jun 2025
Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation
Or Tal
Felix Kreuk
Yossi Adi
AI4TS
337
0
0
10 Jun 2025
A Review on Score-based Generative Models for Audio Applications
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM, MedIm
214
3
0
10 Jun 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
Weiming Dong
Changsheng Xu
305
1
0
17 Apr 2025
Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation
Max W. Y. Lam
Yijin Xing
Weiya You
Jingcheng Wu
Zongyu Yin
...
T. Zhao
Chien-Hung Liu
Xuchen Song
Yang Li
Yahui Zhou
LRM
352
13
0
25 Mar 2025
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation
Chen Zhang
Yukun Ma
Qian Chen
Wen Wang
Shengkui Zhao
...
Yiheng Jiang
Chaohong Tan
Zhifu Gao
Zhihao Du
B. Ma
154
9
0
28 Feb 2025
NOTA: Multimodal Music Notation Understanding for Visual Large Language Model
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Mingni Tang
Jiajia Li
Lu Yang
Zhiqiang Zhang
Jinghao Tian
Hui Yuan
Guang Dai
Peijie Wang
210
2
0
17 Feb 2025
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Atharva Mehta
Shivam Chauhan
Amirbek Djanibekov
Atharva Kulkarni
Gus Xia
Monojit Choudhury
593
8
0
11 Feb 2025
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Siyuan Hou
Shansong Liu
Ruibin Yuan
Wei Xue
Ying Shan
Mangsuo Zhao
Chao Zhang
315
12
0
17 Jan 2025
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
International Conference on Machine Learning (ICML), 2024
K R Prajwal
Bowen Shi
Matthew Lee
Apoorv Vyas
Andros Tjandra
...
Baishan Guo
Huiyu Wang
Triantafyllos Afouras
David Kant
Wei-Ning Hsu
192
11
0
27 Oct 2024
Multi-Source Music Generation with Latent Diffusion
Zhongweiyang Xu
Debottam Dutta
Yu-Lin Wei
Romit Roy Choudhury
DiffM
406
5
0
10 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
316
17
0
01 Sep 2024
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Yun-Han Lan
Wen-Yi Hsiao
Hao-Chung Cheng
Yi-Hsuan Yang
198
21
0
21 Jul 2024
Audio Conditioning for Music Generation via Discrete Bottleneck Features
Simon Rouard
Yossi Adi
Jade Copet
Axel Roebel
Alexandre Défossez
MGen
308
5
0
17 Jul 2024
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching
Gaël Le Lan
Bowen Shi
Zhaoheng Ni
Sidd Srinivasan
Anurag Kumar
...
Varun K. Nagaraja
Ernie Chang
Wei-Ning Hsu
Yangyang Shi
Vikas Chandra
DiffM
120
2
0
04 Jul 2024
Towards Training Music Taggers on Synthetic Data
N. Kroher
Steven Manangu
A. Pikrakis
183
1
0
02 Jul 2024
JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
Boyu Chen
Peike Li
Yao Yao
Alex Wang
DiffM
184
3
0
18 Jun 2024
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation
Or Tal
Alon Ziv
Itai Gat
Felix Kreuk
Yossi Adi
195
29
0
16 Jun 2024
Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models
J. Nistal
Marco Pasini
Cyran Aouameur
M. Grachten
Stefan Lattner
DiffM
302
38
0
12 Jun 2024
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis
Zhijun Liu
Shuai Wang
Sho Inoue
Qibing Bai
Haizhou Li
DiffM
184
31
0
08 Jun 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM, MedIm
299
9
0
31 May 2024
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Yixiao Zhang
Yukara Ikemiya
Woosung Choi
Naoki Murata
Marco A. Martínez-Ramírez
Liwei Lin
Gus Xia
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
427
22
0
28 May 2024
Quality-aware Masked Diffusion Transformer for Enhanced Music Generation
Chang Li
Ruoyu Wang
Lijuan Liu
Jun Du
Yixuan Sun
Zilu Guo
Zhenrong Zhang
Yuan Jiang
J. Gao
Feng Ma
391
8
0
24 May 2024
Music Consistency Models
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
197
7
0
20 Apr 2024
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
Yixiao Zhang
Yukara Ikemiya
Gus Xia
Naoki Murata
Marco A. Martínez-Ramírez
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
307
43
0
09 Feb 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
502
192
0
07 Feb 2024
Bass Accompaniment Generation via Latent Diffusion
Marco Pasini
M. Grachten
Stefan Lattner
198
19
0
02 Feb 2024
Masked Audio Generation using a Single Non-Autoregressive Transformer
International Conference on Learning Representations (ICLR), 2024
Alon Ziv
Itai Gat
Gaël Le Lan
Tal Remez
Felix Kreuk
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
390
61
0
09 Jan 2024
Audiobox: Unified Audio Generation with Natural Language Prompts
Apoorv Vyas
Bowen Shi
Matt Le
Andros Tjandra
Yi-Chiao Wu
...
Chris Summers
Carleigh Wood
Joshua Lane
Mary Williamson
Wei-Ning Hsu
315
137
0
25 Dec 2023
Can MusicGen Create Training Data for MIR Tasks?
N. Kroher
Helena Cuesta
A. Pikrakis
MGen, VLM
207
2
0
15 Nov 2023
JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yao Yao
Peike Li
Boyu Chen
Alex Wang
DiffM
212
17
0
29 Oct 2023
Stack-and-Delay: a new codebook pattern for music generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Gaël Le Lan
Varun K. Nagaraja
Ernie Chang
David Kant
Zhaoheng Ni
Yangyang Shi
Forrest N. Iandola
Vikas Chandra
BDL
207
8
0
15 Sep 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA, AuLLM, VLM
253
390
0
22 Jun 2023
Simple and Controllable Music Generation
Neural Information Processing Systems (NeurIPS), 2023
Jade Copet
Felix Kreuk
Itai Gat
Tal Remez
David Kant
Gabriel Synnaeve
Yossi Adi
Alexandre Défossez
MGen
429
583
0
08 Jun 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
348
190
0
24 Apr 2023
Noise2Music: Text-conditioned Music Generation with Diffusion Models
Qingqing Huang
Daniel S. Park
Tao Wang
Timo I. Denk
Andy Ly
...
Jesse Engel
Quoc V. Le
William Chan
Zhifeng Chen
Wei Han
MGen, DiffM
338
245
0
08 Feb 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
International Conference on Machine Learning (ICML), 2023
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
400
427
0
30 Jan 2023
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
International Conference on Machine Learning (ICML), 2023
Haohe Liu
Zehua Chen
Yi Yuan
Xinhao Mei
Xubo Liu
Danilo Mandic
Wenwu Wang
Mark D. Plumbley
DiffM
726
665
0
29 Jan 2023
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
Flavio Schneider
Ojasv Kamal
Zhijing Jin
Bernhard Schölkopf
MGen
357
111
0
27 Jan 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
708
595
0
26 Jan 2023
High Fidelity Neural Audio Compression
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
300
982
0
24 Oct 2022
Scaling Instruction-Finetuned Language Models
Journal of Machine Learning Research (JMLR), 2022
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM, LRM
1.3K
3,790
0
20 Oct 2022
AudioGen: Textually Guided Audio Generation
International Conference on Learning Representations (ICLR), 2022
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
391
391
0
30 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
392
813
0
07 Sep 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
International Conference on Learning Representations (ICLR), 2022
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
711
2,323
0
02 Aug 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
470
5,280
0
26 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
262
379
0
20 Jul 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Neural Information Processing Systems (NeurIPS), 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
1.1K
7,473
0
23 May 2022
Symbolic music generation conditioned on continuous-valued emotions
IEEE Access, 2022
Serkan Sulun
M. Davies
Paula Viana
MGen
194
37
0
30 Mar 2022