2206.05408
Multi-instrument Music Synthesis with Spectrogram Diffusion
International Society for Music Information Retrieval Conference (ISMIR), 2022
11 June 2022
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
Papers citing
"Multi-instrument Music Synthesis with Spectrogram Diffusion"
41 / 41 papers shown
Title
Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training
Hong-Jie You
Jie-Jing Shao
Xiao-Wen Yang
Lin-Han Jia
Lan-Zhe Guo
Yu-Feng Li
ViT
24
0
0
02 Dec 2025
GuitarFlow: Realistic Electric Guitar Synthesis From Tablatures via Flow Matching and Style Transfer
Jackson Loth
Pedro Sarmento
Mark Sandler
M. Barthet
110
0
0
23 Oct 2025
TinyMusician: On-Device Music Generation with Knowledge Distillation and Mixed Precision Quantization
Hainan Wang
M. Hosseinzadeh
Reza Rawassizadeh
MQ
MGen
156
0
0
31 Aug 2025
A Review on Score-based Generative Models for Audio Applications
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM
MedIm
210
3
0
10 Jun 2025
Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio
Jongmin Jung
Dongmin Kim
Sihun Lee
Seola Cho
Hyungjoon Soh
Irmak Bukey
Chris Donahue
Dasaem Jeong
151
0
0
19 May 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
Weiming Dong
Changsheng Xu
297
1
0
17 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao Wang
Songruoyao Wu
Jiaxing Yu
Jianchao Tan
MGen
VGen
507
3
0
01 Apr 2025
Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew's Treatise
BigData Congress [Services Society] (BSS), 2024
Tornike Karchkhadze
Keren Shao
Shlomo Dubnov
222
0
0
12 Dec 2024
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
Neural Information Processing Systems (NeurIPS), 2024
Fuming You
Minghui Fang
Li Tang
Rongjie Huang
Yongqi Wang
Zhou Zhao
228
4
0
04 Nov 2024
Annotation-Free MIDI-to-Audio Synthesis via Concatenative Synthesis and Generative Refinement
Osamu Take
Taketo Akama
210
0
0
22 Oct 2024
Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
Lilac Atassi
417
0
0
01 Oct 2024
ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Daewoong Kim
Hao-Wen Dong
Dasaem Jeong
184
0
0
19 Sep 2024
Why Perturbing Symbolic Music is Necessary: Fitting the Distribution of Never-used Notes through a Joint Probabilistic Diffusion Model
Shipei Liu
Xiaoya Fan
Guowei Wu
DiffM
154
2
0
04 Aug 2024
Combining audio control and style transfer using latent diffusion
International Society for Music Information Retrieval Conference (ISMIR), 2024
Andreas Maier
Yuliya Burankova
Anne Hartebrodt
David B. Blumenthal
DiffM
208
10
0
31 Jul 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM
MedIm
271
8
0
31 May 2024
SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers
IEEE Open Journal of Signal Processing (JOSP), 2024
Junghyun Koo
Gordon Wichern
François Germain
Sameer Khurana
Jonathan Le Roux
198
7
0
02 Apr 2024
Improving Diffusion Models's Data-Corruption Resistance using Scheduled Pseudo-Huber Loss
Artem Khrapov
Vadim Popov
Tasnima Sadekova
Assel Yermekova
Mikhail Kudinov
DiffM
203
4
0
25 Mar 2024
MusicHiFi: Fast High-Fidelity Stereo Vocoding
Ge Zhu
Juan-Pablo Caceres
Zhiyao Duan
Nicholas J. Bryan
DiffM
250
8
0
15 Mar 2024
MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage
Hao Hao Tan
K. Cheuk
Taemin Cho
Wei-Hsiang Liao
Yuki Mitsufuji
139
2
0
15 Mar 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
494
191
0
07 Feb 2024
Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Hounsu Kim
Soonbeom Choi
Juhan Nam
221
7
0
24 Jan 2024
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
International Conference on Machine Learning (ICML), 2024
Cheng-i Wang
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
DiffM
267
69
0
22 Jan 2024
StemGen: A music generation model that listens
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Julian Parker
Janne Spijkervet
Katerina Kosta
Furkan Yesiler
Boris Kuznetsov
Ju-Chiang Wang
Matt Avent
Jitong Chen
Duc Le
MGen
276
43
0
14 Dec 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Ge Zhu
Yutong Wen
M. Carbonneau
Zhiyao Duan
DiffM
192
13
0
15 Nov 2023
Music ControlNet: Multiple Time-varying Controls for Music Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Shih-Lun Wu
Chris Donahue
Shinji Watanabe
Nicholas J. Bryan
DiffM
MGen
295
111
0
13 Nov 2023
Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ben Maman
Johannes Zeitler
Meinard Müller
Amit H. Bermano
DiffM
173
6
0
21 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
Frontiers in Signal Processing (FSP), 2023
B. Hayes
Jordie Shier
György Fazekas
Andrew McPherson
C. Saitis
202
39
0
29 Aug 2023
LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model
Siqi Yang
Zejun Yang
Zhisheng Wang
165
17
0
23 Aug 2023
Discrete Diffusion Probabilistic Models for Symbolic Music Generation
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Matthias Plasser
S. Peter
Gerhard Widmer
DiffM
MGen
152
19
0
16 May 2023
AudioSlots: A slot-centric generative model for audio separation
P. Reddy
Scott Wisdom
Klaus Greff
J. Hershey
Thomas Kipf
OCL
VLM
224
6
0
09 May 2023
Long-Term Rhythmic Video Soundtracker
International Conference on Machine Learning (ICML), 2023
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
269
18
0
02 May 2023
Attributing Image Generative Models using Latent Fingerprints
International Conference on Machine Learning (ICML), 2023
Guangyu Nie
Changhoon Kim
Yezhou Yang
Yi Ren
WIGM
DiffM
175
20
0
17 Apr 2023
Exploring Diffusion Models for Unsupervised Video Anomaly Detection
IEEE International Conference on Image Processing (ICIP), 2023
Anil Osman Tur
Nicola Dall’Asen
Cigdem Beyan
Elisa Ricci
DiffM
VGen
301
42
0
12 Apr 2023
Distribution Preserving Source Separation With Time Frequency Predictive Models
European Signal Processing Conference (EUSIPCO), 2023
Pedro J. Villasana T
J. Klejsa
Lars Villemoes
P. Hedelin
152
2
0
10 Mar 2023
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
International Conference on Learning Representations (ICLR), 2023
Giorgio Mariani
Irene Tallini
Emilian Postolache
Michele Mancusi
Luca Cosmo
Emanuele Rodolà
DiffM
523
65
0
04 Feb 2023
SingSong: Generating musical accompaniments from singing
Chris Donahue
Antoine Caillon
Adam Roberts
Ethan Manilow
P. Esling
...
Mauro Verzetti
Ian Simon
Olivier Pietquin
Neil Zeghidour
Jesse Engel
166
69
0
30 Jan 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
668
590
0
26 Jan 2023
Rock Guitar Tablature Generation via Natural Language Processing
Josue Casco-Rodriguez
180
1
0
12 Jan 2023
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan Shi
Erica Cooper
Xin Wang
Junichi Yamagishi
Shrikanth Narayanan
178
4
0
25 Nov 2022
Solving Audio Inverse Problems with a Diffusion Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Eloi Moliner
J. Lehtinen
Vesa Välimäki
DiffM
351
73
0
27 Oct 2022
DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
K. Cheuk
Ryosuke Sawata
Toshimitsu Uesaka
Naoki Murata
Naoya Takahashi
Shusuke Takahashi
Dorien Herremans
Yuki Mitsufuji
DiffM
162
21
0
11 Oct 2022