Simple and Controllable Music Generation

8 June 2023
Jade Copet
Felix Kreuk
Itai Gat
Tal Remez
David Kant
Gabriel Synnaeve
Yossi Adi
Alexandre Défossez
    MGen
arXiv: 2306.05284

Papers citing "Simple and Controllable Music Generation"

50 / 256 papers shown
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
Tan-Hanh Pham
Hoang-Nam Le
Phu-Vinh Nguyen
Chris Ngo
Truong Son-Hy
AuLLM
LRM
81
1
0
21 Dec 2024
Tuning Music Education: AI-Powered Personalization in Learning Music
Mayank Sanganeria
Rohan Gala
70
0
0
18 Dec 2024
Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew's Treatise
Tornike Karchkhadze
Keren Shao
Shlomo Dubnov
75
0
0
12 Dec 2024
M6: Multi-generator, Multi-domain, Multi-lingual and cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases
Yupei Li
Hanqian Li
Lucia Specia
Björn Schuller
103
3
0
08 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
64
1
0
03 Dec 2024
The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024
Shuoyi Zhou
Yixuan Zhou
Weiqing Li
Jun Chen
Runchuan Ye
Weihao Wu
Zijian Lin
Shun Lei
Zhiyong Wu
102
1
0
02 Dec 2024
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker
Anton Smirnov
Jordi Pons
CJ Carr
Zack Zukowski
Zach Evans
Xubo Liu
77
9
0
29 Nov 2024
Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation
Marco Pasini
J. Nistal
Stefan Lattner
George Fazekas
69
3
0
27 Nov 2024
Compression of Higher Order Ambisonics with Multichannel RVQGAN
Toni Hirvonen
Mahmoud Namazi
70
0
0
18 Nov 2024
Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks
Felipe Marra
Lucas N. Ferreira
31
0
0
06 Nov 2024
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings
Kangxiang Xia
Dake Guo
J.-H. Yao
Liumeng Xue
Hanzhao Li
...
Lei Xie
Qingqing Zhang
L. Luo
M. Dong
Peng Sun
52
1
0
31 Oct 2024
A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation
Alexander H. Liu
Qirui Wang
Yuan Gong
James Glass
30
0
0
29 Oct 2024
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
Nate Gillman
Daksh Aggarwal
Michael Freeman
Saurabh Singh
Chen Sun
AI4TS
46
3
0
29 Oct 2024
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
Xize Cheng
Siqi Zheng
Zehan Wang
Minghui Fang
Ziang Zhang
...
Z. Ma
Shengpeng Ji
Jialong Zuo
Tao Jin
Zhou Zhao
30
1
0
28 Oct 2024
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
K R Prajwal
Bowen Shi
Matthew Lee
Apoorv Vyas
Andros Tjandra
...
Baishan Guo
Huiyu Wang
Triantafyllos Afouras
David Kant
Wei-Ning Hsu
43
5
0
27 Oct 2024
Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation
Maohao Shen
Shun Zhang
Jilong Wu
Zhiping Xiu
Ehab AlBadawy
Yiting Lu
M. Seltzer
Qing He
36
2
0
27 Oct 2024
SNAC: Multi-Scale Neural Audio Codec
Hubert Siuzdak
Florian Grötschla
Luca A. Lanzendörfer
19
10
0
18 Oct 2024
MeloTrans: A Text to Symbolic Music Generation Model Following Human Composition Habit
Yutian Wang
Wanyin Yang
Zhenrong Dai
Yilong Zhang
Kun Zhao
Hui Wang
37
2
0
17 Oct 2024
MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization
Ruiqi Li
Siqi Zheng
Xize Cheng
Ziang Zhang
Shengpeng Ji
Zhou Zhao
VGen
68
7
0
16 Oct 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Joey Tianyi Zhou
79
15
0
16 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
59
0
0
14 Oct 2024
Code Drift: Towards Idempotent Neural Audio Codecs
P. O'Reilly
Prem Seetharaman
Jiaqi Su
Zeyu Jin
Bryan Pardo
125
0
0
14 Oct 2024
M2M-Gen: A Multimodal Framework for Automated Background Music Generation in Japanese Manga Using Large Language Models
Megha Sharma
Muhammad Taimoor Haseeb
Gus Xia
Yoshimasa Tsuruoka
18
2
0
13 Oct 2024
SRC-gAudio: Sampling-Rate-Controlled Audio Generation
Chenxing Li
Manjie Xu
Dong Yu
DiffM
33
0
0
09 Oct 2024
Diversity-Rewarded CFG Distillation
Geoffrey Cideron
A. Agostinelli
Johan Ferret
Sertan Girgin
Romuald Elie
Olivier Bachem
Sarah Perrin
Alexandre Ramé
39
2
0
08 Oct 2024
Presto! Distilling Steps and Layers for Accelerating Music Generation
Zachary Novack
Ge Zhu
Jonah Casebeer
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
45
5
0
07 Oct 2024
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura
Takumi Hirose
Masanari Ohi
Hideki Nakayama
Nakamasa Inoue
VLM
29
1
0
06 Oct 2024
Graded Suspiciousness of Adversarial Texts to Human
Shakila Mahjabin Tonni
Pedro Faustini
Mark Dras
AAML
23
0
0
06 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David Harwath
52
3
0
05 Oct 2024
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Zixuan Wang
Chi-Keung Tang
DiffM
VGen
LLMAG
46
4
0
04 Oct 2024
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh
Sonal Kumar
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
Dinesh Manocha
DiffM
49
2
0
02 Oct 2024
Do Music Generation Models Encode Music Theory?
Megan Wei
Michael Freeman
Chris Donahue
Chen Sun
MGen
28
4
0
01 Oct 2024
Zero-Shot Text-to-Speech from Continuous Text Streams
Trung D. Q. Dang
David Aponte
Dung Tran
Tianyi Chen
K. Koishida
AuLLM
VLM
32
3
0
01 Oct 2024
Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
Lilac Atassi
41
0
0
01 Oct 2024
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
Kun Su
Xiulong Liu
Eli Shlizerman
VGen
30
6
0
27 Sep 2024
MIO: A Foundation Model on Multimodal Tokens
Zekun Wang
King Zhu
Chunpu Xu
Wangchunshu Zhou
Jiaheng Liu
...
Yuanxing Zhang
Ge Zhang
Ke Xu
Jie Fu
Wenhao Huang
MLLM
AuLLM
60
11
0
26 Sep 2024
FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
N. Pia
Martin Strauss
M. Multrus
B. Edler
39
0
0
26 Sep 2024
Temporally Aligned Audio for Video with Autoregression
Ilpo Viertola
Vladimir E. Iashin
Esa Rahtu
VGen
47
11
0
20 Sep 2024
MuCodec: Ultra Low-Bitrate Music Codec
Yaoxun Xu
Hangting Chen
Jianwei Yu
Wei Tan
Rongzhi Gu
Shun Lei
Zhiwei Lin
Zhiyong Wu
30
1
0
20 Sep 2024
Preference Alignment Improves Language Model-Based TTS
Jinchuan Tian
Chunlei Zhang
Jiatong Shi
Hao Zhang
Jianwei Yu
Shinji Watanabe
Dong Yu
32
7
0
19 Sep 2024
Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference
Edresson Casanova
Ryan Langman
Paarth Neekhara
Shehzeen Samarah Hussain
Jason Chun Lok Li
Subhankar Ghosh
Ante Jukić
Sang-gil Lee
AuLLM
31
2
0
18 Sep 2024
Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation
Haohan Guo
Fenglong Xie
Dongchao Yang
Xixin Wu
Helen Meng
43
1
0
18 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
Jee-weon Jung
Yihan Wu
Xin Wang
Ji-Hoon Kim
Soumi Maiti
...
Joon Son Chung
Wangyou Zhang
Seyun Um
Shinnosuke Takamichi
Shinji Watanabe
65
1
0
18 Sep 2024
FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
Luca Comanducci
Paolo Bestagini
Stefano Tubaro
35
7
0
16 Sep 2024
LOCKEY: A Novel Approach to Model Authentication and Deepfake Tracking
Mayank Kumar Singh
Naoya Takahashi
Wei-Hsiang Liao
Yuki Mitsufuji
50
1
0
12 Sep 2024
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos
Yan-Bo Lin
Yu Tian
L. Yang
Gedas Bertasius
Heng Wang
VGen
34
7
0
11 Sep 2024
An End-to-End Approach for Chord-Conditioned Song Generation
Shuochen Gao
Shun Lei
Fan Zhuo
Hangyu Liu
Feng Liu
Boshi Tang
Qiaochu Huang
Shiyin Kang
Zhiyong Wu
28
2
0
10 Sep 2024
Multi-Source Music Generation with Latent Diffusion
Zhongweiyang Xu
Debottam Dutta
Yu-Lin Wei
Romit Roy Choudhury
DiffM
42
1
0
10 Sep 2024
SongCreator: Lyrics-based Universal Song Generation
Shun Lei
Yixuan Zhou
Boshi Tang
Max W. Y. Lam
Feng Liu
Hangyu Liu
Jingcheng Wu
Shiyin Kang
Zhiyong Wu
Helen Meng
44
4
0
09 Sep 2024
Resource-Efficient Generative AI Model Deployment in Mobile Edge Networks
Yuxin Liang
Peng Yang
Yuanyuan He
Feng Lyu
18
2
0
09 Sep 2024