It's Raw! Audio Generation with State-Space Models

20 February 2022

Papers citing "It's Raw! Audio Generation with State-Space Models"

36 / 36 papers shown

Title
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models Hung-Yueh Chiang Chi-chih Chang N. Frumkin Kai-Chiang Wu Mohamed S. Abdelfattah Diana Marculescu MQ 81 0 0 28 Mar 2025
SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures Hui Liu Chen Jia Fan Shi Xu Cheng Shengyong Chen Mamba 47 0 0 03 Mar 2025
Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability Carlos E. Luis A. Bottero Julia Vinogradska Felix Berkenkamp Jan Peters 76 1 0 20 Feb 2025
MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification Yapeng Li Yong Luo L. Zhang Zengmao Wang Bo Du Mamba 58 57 0 10 Jan 2025
Real-time Speech Enhancement on Raw Signals with Deep State-space Modeling Yan Ru Pei Ritik Shrivastava Fnu Sidharth 38 1 0 31 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs Michael Wornow Suhana Bedi Miguel Angel Fuentes Hernandez E. Steinberg Jason Alan Fries Christopher Ré Sanmi Koyejo N. Shah 95 4 0 09 Dec 2024
Layer-Adaptive State Pruning for Deep State Space Models Minseon Gwak Seongrok Moon Joohwan Ko PooGyeon Park 25 0 0 05 Nov 2024
Oscillatory State-Space Models T. Konstantin Rusch Daniela Rus AI4TS 76 4 0 04 Oct 2024
DiSPo: Diffusion-SSM based Policy Learning for Coarse-to-Fine Action Discretization Nayoung Oh Jaehyeong Jang Moonkyeong Jung Daehyung Park 89 0 0 23 Sep 2024
MambaFoley: Foley Sound Generation using Selective State-Space Models Marco Furio Colombo Francesca Ronchini Luca Comanducci Fabio Antonacci Mamba 20 1 0 13 Sep 2024
Salmon: A Suite for Acoustic Language Model Evaluation Gallil Maimon Amit Roth Yossi Adi ELM AuLLM 49 5 0 11 Sep 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems Patrick Emami Zhaonan Li Saumya Sinha Truc Nguyen 48 1 0 30 May 2024
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks Jerome Sieber Carmen Amo Alonso A. Didier M. Zeilinger Antonio Orvieto AAML 42 8 0 24 May 2024
PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining Kecen Li Chen Gong Zhixiang Li Yuzhong Zhao Xinwen Hou Tianhao Wang 23 10 0 19 Oct 2023
Focus Your Attention (with Adaptive IIR Filters) Shahar Lutati Itamar Zimerman Lior Wolf 27 9 0 24 May 2023
Simple Hardware-Efficient Long Convolutions for Sequence Modeling Daniel Y. Fu Elliot L. Epstein Eric N. D. Nguyen A. Thomas Michael Zhang Tri Dao Atri Rudra Christopher Ré 11 51 0 13 Feb 2023
SingSong: Generating musical accompaniments from singing Chris Donahue Antoine Caillon Adam Roberts Ethan Manilow P. Esling ... Mauro Verzetti Ian Simon Olivier Pietquin Neil Zeghidour Jesse Engel 25 52 0 30 Jan 2023
Rock Guitar Tablature Generation via Natural Language Processing Josue Casco-Rodriguez 18 1 0 12 Jan 2023
Hungry Hungry Hippos: Towards Language Modeling with State Space Models Daniel Y. Fu Tri Dao Khaled Kamal Saab A. Thomas Atri Rudra Christopher Ré 43 367 0 28 Dec 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations Felix Kreuk Yaniv Taigman Adam Polyak Jade Copet Gabriel Synnaeve Alexandre Défossez Yossi Adi 24 4 0 02 Nov 2022
Structured State Space Decoder for Speech Recognition and Synthesis Koichi Miyazaki Masato Murata Tomoki Koriyama 14 12 0 31 Oct 2022
Solving Audio Inverse Problems with a Diffusion Model Eloi Moliner J. Lehtinen Vesa Valimaki DiffM 20 49 0 27 Oct 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives Carlos Hernandez-Olivan Javier Hernandez-Olivan J. R. Beltrán MGen 27 6 0 25 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces Eric N. D. Nguyen Karan Goel Albert Gu Gordon W. Downs Preey Shah Tri Dao S. Baccus Christopher Ré VLM 22 37 0 12 Oct 2022
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models Matthew Baas Herman Kamper DiffM 14 8 0 11 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration Yuma Koizumi Kohei Yatabe Heiga Zen M. Bacchiani DiffM 42 29 0 03 Oct 2022
On the Parameterization and Initialization of Diagonal State Space Models Albert Gu Ankit Gupta Karan Goel Christopher Ré 14 294 0 23 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Tri Dao Daniel Y. Fu Stefano Ermon Atri Rudra Christopher Ré VLM 56 2,017 0 27 May 2022
Realization Theory Of Recurrent Neural ODEs Using Polynomial System Embeddings Martin Gonzalez Thibault Defourneau H. Hajri M. Petreczky 22 2 0 24 May 2022
Long Movie Clip Classification with State-Space Video Models Md. Mohaiminul Islam Gedas Bertasius VLM 36 100 0 04 Apr 2022
Diagonal State Spaces are as Effective as Structured State Spaces Ankit Gupta Albert Gu Jonathan Berant 34 288 0 27 Mar 2022
Zero-Shot Text-to-Image Generation Aditya A. Ramesh Mikhail Pavlov Gabriel Goh Scott Gray Chelsea Voss Alec Radford Mark Chen Ilya Sutskever VLM 253 4,764 0 24 Feb 2021
Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia Evgeny Kharitonov Wei-Ning Hsu Yossi Adi Adam Polyak ... Tu Nguyen Jade Copet Alexei Baevski A. Mohamed Emmanuel Dupoux AuLLM 174 336 0 01 Feb 2021
DDSP: Differentiable Digital Signal Processing Jesse Engel Lamtharn Hantrakul Chenjie Gu Adam Roberts DiffM 83 371 0 14 Jan 2020
High Fidelity Speech Synthesis with Adversarial Networks Mikolaj Binkowski Jeff Donahue Sander Dieleman Aidan Clark Erich Elsen Norman Casagrande Luis C. Cobo Karen Simonyan 213 239 0 25 Sep 2019
Aggregated Residual Transformations for Deep Neural Networks Saining Xie Ross B. Girshick Piotr Dollár Z. Tu Kaiming He 261 10,196 0 16 Nov 2016