ResearchTrend.AI
Simple and Controllable Music Generation


8 June 2023
Jade Copet
Felix Kreuk
Itai Gat
Tal Remez
David Kant
Gabriel Synnaeve
Yossi Adi
Alexandre Défossez
    MGen

Papers citing "Simple and Controllable Music Generation"

50 / 256 papers shown
MetaBGM: Dynamic Soundtrack Transformation For Continuous Multi-Scene Experiences With Ambient Awareness And Personalization
Haoxuan Liu
Zihao Wang
HaoRong Hong
Youwei Feng
Jiaxing Yu
Han Diao
Yunfei Xu
Kaipeng Zhang
36
0
0
05 Sep 2024
LAST: Language Model Aware Speech Tokenization
A. Turetzky
Yossi Adi
34
2
0
05 Sep 2024
FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications
Hao-Han Guo
Kun Liu
Fei-Yu Shen
Yi-Chen Wu
Xu Tang
Kun Xie
Kai-Tuo Xu
42
20
0
05 Sep 2024
SymPAC: Scalable Symbolic Music Generation With Prompts And Constraints
Haonan Chen
Jordan B. L. Smith
Janne Spijkervet
Ju-Chiang Wang
Pei Zou
Bochen Li
Qiuqiang Kong
Xingjian Du
20
1
0
04 Sep 2024
Dynamic Motion Synthesis: Masked Audio-Text Conditioned Spatio-Temporal Transformers
Sohan Anisetty
James Hays
38
0
0
03 Sep 2024
EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
Jaeyeon Kim
Minjeong Jeon
Jaeyoon Jung
Sang Hoon Woo
Jinjoo Lee
26
2
0
02 Sep 2024
Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Minjeong Jeon
Sang Hoon Woo
Jinjoo Lee
24
1
0
02 Sep 2024
SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis
Haohan Guo
Fenglong Xie
Kun Xie
Dongchao Yang
Dake Guo
Xixin Wu
Helen Meng
34
4
0
02 Sep 2024
MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT
Jinlong Zhu
Keigo Sakurai
Ren Togo
Takahiro Ogawa
Miki Haseyama
GAN
30
1
0
02 Sep 2024
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang
Haoyue Zhan
Liwei Liu
Ruihong Zeng
Haotian Guo
Jiachen Zheng
Qiang Zhang
Shunsi Zhang
Zhizheng Wu
36
39
0
01 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
84
7
0
01 Sep 2024
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
Zhifei Xie
Changqiao Wu
AuLLM
VGen
VLM
SyDa
LRM
29
55
0
29 Aug 2024
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Miller
M. G. Tempini
Gopala Anumanchipalli
AuLLM
32
1
0
29 Aug 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
VLM
57
33
0
29 Aug 2024
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music
N. Shikarpur
Krishna Maneesha Dendukuri
Yusong Wu
Antoine Caillon
Cheng-Zhi Anna Huang
20
1
0
22 Aug 2024
Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?
Yuankun Xie
Chenxu Xiong
Xiaopeng Wang
Zhiyong Wang
Yi Lu
...
Yukun Liu
Zhengqi Wen
Jianhua Tao
Guanjun Li
Long Ye
AuLLM
28
1
0
20 Aug 2024
Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models
Ioannis Romanelis
Vlassios Fotis
Athanasios P. Kalogeras
Christos Alexakos
Konstantinos Moustakas
Adrian Munteanu
35
0
0
12 Aug 2024
TEAdapter: Supply abundant guidance for controllable text-to-music generation
Jialing Zou
Jiahao Mei
Xudong Nan
Jinghua Li
Daoguo Dong
Liang He
33
0
0
09 Aug 2024
PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data
Chih-Pin Tan
Hsin Ai
Yi-Hsin Chang
Shuen-Huei Guan
Yi-Hsuan Yang
42
2
0
02 Aug 2024
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
Michael Kolle
Maximilian Zorn
Jongmin Jung
Dasaem Jeong
36
0
0
02 Aug 2024
Combining audio control and style transfer using latent diffusion
Andreas Maier
Yuliya Burankova
Anne Hartebrodt
David B. Blumenthal
DiffM
34
2
0
31 Jul 2024
Recording First-person Experiences to Build a New Type of Foundation Model
Dionis Barcari
David Gamez
Aliya Grig
ALM
25
0
0
31 Jul 2024
A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology
David Gamez
Dionis Barcari
Aliya Grig
29
0
0
31 Jul 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Xiaowei Chi
Yatian Wang
Aosong Cheng
Pengjun Fang
Zeyue Tian
...
Wenhan Luo
Qifeng Chen
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
75
7
0
30 Jul 2024
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu
Zachary Novack
Amit Namburi
Jiaheng Dai
Hao-Wen Dong
Zhouhang Xie
Carol Chen
Julian McAuley
38
1
0
29 Jul 2024
Discrete Flow Matching
Itai Gat
Tal Remez
Neta Shaul
Felix Kreuk
Ricky T. Q. Chen
Gabriel Synnaeve
Yossi Adi
Y. Lipman
DiffM
52
57
0
22 Jul 2024
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Yun-Han Lan
Wen-Yi Hsiao
Hao-Chung Cheng
Yi-Hsuan Yang
48
7
0
21 Jul 2024
Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio
Roser Batlle-Roca
Wei-Hsiang Liao
Xavier Serra
Yuki Mitsufuji
Emilia Gómez
50
0
0
19 Jul 2024
Stable Audio Open
Zach Evans
Julian Parker
CJ Carr
Zack Zukowski
Josiah Taylor
Jordi Pons
72
38
0
19 Jul 2024
Audio Conditioning for Music Generation via Discrete Bottleneck Features
Simon Rouard
Yossi Adi
Jade Copet
Axel Roebel
Alexandre Défossez
MGen
54
1
0
17 Jul 2024
A Language Modeling Approach to Diacritic-Free Hebrew TTS
Amit Roth
A. Turetzky
Yossi Adi
32
2
0
16 Jul 2024
LiteFocus: Accelerated Diffusion Inference for Long Audio Synthesis
Zhenxiong Tan
Xinyin Ma
Gongfan Fang
Xinchao Wang
36
3
0
15 Jul 2024
BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features
Jing Luo
Xinyu Yang
Dorien Herremans
31
3
0
15 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serrà
DiffM
VGen
44
15
0
15 Jul 2024
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Zhening Xing
Gereon Fox
Yanhong Zeng
Xingang Pan
Mohamed A. Elgharib
Christian Theobalt
Kai Chen
VGen
27
3
0
11 Jul 2024
PAGURI: a user experience study of creative interaction with text-to-music models
Francesca Ronchini
Luca Comanducci
Gabriele Perego
Fabio Antonacci
35
3
0
05 Jul 2024
MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss
Yangyang Shu
Haiming Xu
Ziqin Zhou
Anton van den Hengel
Lingqiao Liu
27
3
0
05 Jul 2024
MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation
Zihao Wang
Haoxuan Liu
Jiaxing Yu
Tao Zhang
Yan Liu
Kaipeng Zhang
65
1
0
03 Jul 2024
Towards Training Music Taggers on Synthetic Data
N. Kroher
Steven Manangu
A. Pikrakis
52
1
0
02 Jul 2024
Accompanied Singing Voice Synthesis with Fully Text-controlled Melody
Ruiqi Li
Zhiqing Hong
Yongqi Wang
Lichao Zhang
Rongjie Huang
Siqi Zheng
Zhou Zhao
36
6
0
02 Jul 2024
Pictures Of MIDI: Controlled Music Generation via Graphical Prompts for Image-Based Diffusion Inpainting
Scott H. Hawley
35
2
0
01 Jul 2024
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Ivan Villa-Renteria
Mason L. Wang
Zachary Shah
Zhe Li
Soohyun Kim
Neelesh Ramachandran
Mert Pilanci
42
0
0
27 Jun 2024
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
Paarth Neekhara
Shehzeen Samarah Hussain
Subhankar Ghosh
Jason Chun Lok Li
Rafael Valle
Rohan Badlani
Boris Ginsburg
52
11
0
25 Jun 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità
Zhi-Wei Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
68
2
0
25 Jun 2024
Exploring compressibility of transformer based text-to-music (TTM) models
Vasileios Moschopoulos
Thanasis Kotsiopoulos
Pablo Peso Parada
Konstantinos Nikiforidis
Alexandros Stergiadis
Gerasimos Papakostas
Md. Asif Jalal
Jisi Zhang
Anastasios Drosou
Karthikeyan P. Saravanan
25
0
0
24 Jun 2024
Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data
Yu-Hua Chen
Woosung Choi
Wei-Hsiang Liao
Marco A. Martínez-Ramírez
K. Cheuk
Yuki Mitsufuji
J. Jang
Yi-Hsuan Yang
50
5
0
22 Jun 2024
JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
Boyu Chen
Peike Li
Yao Yao
Alex Wang
DiffM
37
3
0
18 Jun 2024
Improving Text-To-Audio Models with Synthetic Captions
Zhifeng Kong
Sang-gil Lee
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Rafael Valle
Soujanya Poria
Bryan Catanzaro
47
11
0
18 Jun 2024
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation
Or Tal
Alon Ziv
Itai Gat
Felix Kreuk
Yossi Adi
47
13
0
16 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
57
7
0
14 Jun 2024