Layer-Adaptive State Pruning for Deep State Space Models

Neural Information Processing Systems (NeurIPS), 2024

5 November 2024

Papers citing "Layer-Adaptive State Pruning for Deep State Space Models"

Hankel Singular Value Regularization for Highly Compressible State Space Models

Paul Schwerdtner

Jules Berman

Benjamin Peherstorfer

185

27 Oct 2025

A Deep State-Space Model Compression Method using Upper Bound on Output Error

Hiroki Sakamoto

Kazuhiro Sato

16 Oct 2025

Uncovering the Spectral Bias in Diagonal State Space Models

109

28 Aug 2025

Compression Method for Deep Diagonal State Space Model Based on H^2 Optimal Reduction

IEEE Control Systems Letters (L-CSS), 2025

Hiroki Sakamoto

Kazuhiro Sato

144

14 Jul 2025

State-Free Inference of State-Space Models: The Transfer Function Approach

International Conference on Machine Learning (ICML), 2024

Ramin Hasani

...

271

10 May 2024

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Albert Gu

Tri Dao

564

5,271

01 Dec 2023

Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks

IEEE International Conference on Computer Vision (ICCV), 2023

Zhe Wang

Weisi Lin

138

21 Aug 2023

Effectively Modeling Time Series with Simple Discrete State Spaces

International Conference on Learning Representations (ICLR), 2023

150

16 Mar 2023

On the Parameterization and Initialization of Diagonal State Space Models

Neural Information Processing Systems (NeurIPS), 2022

413

473

23 Jun 2022

Diagonal State Spaces are as Effective as Structured State Spaces

Neural Information Processing Systems (NeurIPS), 2022

Ankit Gupta

Albert Gu

Jonathan Berant

392

411

27 Mar 2022

It's Raw! Audio Generation with State-Space Models

International Conference on Machine Learning (ICML), 2022

261

233

20 Feb 2022

Efficiently Modeling Long Sequences with Structured State Spaces

International Conference on Learning Representations (ICLR), 2021

Albert Gu

Karan Goel

Christopher Ré

1.0K

2,871

31 Oct 2021

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers

294

935

26 Oct 2021

Long Range Arena: A Benchmark for Efficient Transformers

383

832

08 Nov 2020

Layer-adaptive sparsity for the Magnitude-based Pruning

Sejun Park

252

300

15 Oct 2020

HiPPO: Recurrent Memory with Optimal Polynomial Projections

398

805

17 Aug 2020

Rigging the Lottery: Making All Tickets Winners

International Conference on Machine Learning (ICML), 2019

537

686

25 Nov 2019

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

Neural Information Processing Systems (NeurIPS), 2019

197

244

06 Jun 2019

The State of Sparsity in Deep Neural Networks

Trevor Gale

Erich Elsen

Sara Hooker

385

839

25 Feb 2019

Learning long-range spatial dependencies with horizontal gated-recurrent units

315

170

21 May 2018

ListOps: A Diagnostic Dataset for Latent Tree Learning

Nikita Nangia

Samuel R. Bowman

254

151

17 Apr 2018

Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition

Pete Warden

235

1,850

09 Apr 2018

To prune, or not to prune: exploring the efficacy of pruning for model compression

Michael Zhu

Suyog Gupta

355

1,405

05 Oct 2017

Channel Pruning for Accelerating Very Deep Neural Networks

Yihui He

Xiangyu Zhang

Jian Sun

637

2,683

19 Jul 2017

Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science

Decebal Constantin Mocanu

402

697

15 Jul 2017

Gaussian Error Linear Units (GELUs)

Dan Hendrycks

Kevin Gimpel

971

6,140

27 Jun 2016

Learning both Weights and Connections for Efficient Neural Networks

Neural Information Processing Systems (NeurIPS), 2015

Song Han

564

7,320

08 Jun 2015

Norm-Based Capacity Control in Neural Networks

Behnam Neyshabur

Ryota Tomioka

Nathan Srebro

883

635

27 Feb 2015