SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond

25 June 2024

Marco Comunità

Akira Takahashi

Mengjie Zhao

Takashi Shibuya

Shusuke Takahashi

Papers citing "SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond"

Title
AudioX: Diffusion Transformer for Anything-to-Audio Generation | Zeyue Tian, Yizhu Jin, Zhaoyang Liu, Ruibin Yuan, Xu Tan, Qifeng Chen, Wei Xue, Y. Guo | - | 65 3 0 | 13 Mar 2025
Distillation of Discrete Diffusion through Dimensional Correlations | Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji | DiffM | 56 0 0 | 11 Oct 2024
A Pytorch Reproduction of Masked Generative Image Transformer | Victor Besnier, Mickael Chen | ViT | 51 12 0 | 22 Oct 2023
Diverse and Vivid Sound Generation from Text Descriptions | Guangwei Li, Xuenan Xu, Lingfeng Dai, Mengyue Wu, K. Yu | - | 45 4 0 | 03 May 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model | Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Soujanya Poria | - | 138 141 0 | 24 Apr 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models | Rongjie Huang, Jia-Bin Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiaoyue Yin, Zhou Zhao | DiffM | 140 315 0 | 30 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers | Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, José Lezama, ..., Kevin Patrick Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan | DiffM | 197 517 0 | 02 Jan 2023
Supervised and Unsupervised Learning of Audio Representations for Music Understanding | Matthew C. McCallum, Filip Korzeniowski, Sergio Oramas, F. Gouyon, Andreas F. Ehmann | SSL | 76 36 0 | 07 Oct 2022
Codified audio language modeling learns useful representations for music information retrieval | Rodrigo Castellon, Chris Donahue, Percy Liang | - | 76 86 0 | 12 Jul 2021