Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1712.01312
Cited By
Learning Sparse Neural Networks through
L
0
L_0
L
0
Regularization
4 December 2017
Christos Louizos
Max Welling
Diederik P. Kingma
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning Sparse Neural Networks through $L_0$ Regularization"
50 / 152 papers shown
Title
Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition
Nathanael Tepakbong
Ding-Xuan Zhou
Xiang Zhou
33
0
0
13 May 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
123
1
0
10 Mar 2025
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures
Jiayu Qin
Jianchao Tan
K. Zhang
Xunliang Cai
Wei Wang
40
0
0
19 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
46
1
0
05 Feb 2025
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb
T. Weber
Bernd Bischl
David Rügamer
106
0
0
04 Feb 2025
Playing the Lottery With Concave Regularizers for Sparse Trainable Neural Networks
Giulia Fracastoro
Sophie M. Fosson
Andrea Migliorati
G. Calafiore
38
1
0
19 Jan 2025
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Philipp Mondorf
Sondre Wold
Barbara Plank
34
0
0
02 Oct 2024
Evaluating Model Robustness Using Adaptive Sparse L0 Regularization
Weiyou Liu
Zhenyang Li
Weitong Chen
AAML
20
1
0
28 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
42
3
0
19 Aug 2024
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
34
6
0
05 Jul 2024
Finding Transformer Circuits with Edge Pruning
Adithya Bhaskar
Alexander Wettig
Dan Friedman
Danqi Chen
62
16
0
24 Jun 2024
Geometric sparsification in recurrent neural networks
Wyatt Mackey
Ioannis Schizas
Jared Deighton
David L. Boothe, Jr.
Vasileios Maroulas
28
0
0
10 Jun 2024
Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Guangliang Liu
Milad Afshari
Xitong Zhang
Zhiyu Xue
Avrajit Ghosh
Bidhan Bashyal
Rongrong Wang
K. Johnson
27
0
0
06 Jun 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
37
5
0
30 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
28
3
0
02 May 2024
AdaFSNet: Time Series Classification Based on Convolutional Network with a Adaptive and Effective Kernel Size Configuration
Haoxiao Wang
Bo Peng
Jianhua Zhang
Xu Cheng
AI4TS
39
1
0
28 Apr 2024
Where does In-context Translation Happen in Large Language Models
Suzanna Sia
David Mueller
Kevin Duh
LRM
33
0
0
07 Mar 2024
Always-Sparse Training by Growing Connections with Guided Stochastic Exploration
Mike Heddes
Narayan Srinivasa
T. Givargis
Alexandru Nicolau
91
0
0
12 Jan 2024
The LLM Surgeon
Tycho F. A. van der Ouderaa
Markus Nagel
M. V. Baalen
Yuki Markus Asano
Tijmen Blankevoort
26
14
0
28 Dec 2023
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
Shivam Aggarwal
Hans Jakob Damsgaard
Alessandro Pappalardo
Giuseppe Franco
Thomas B. Preußer
Michaela Blott
Tulika Mitra
MQ
19
5
0
21 Nov 2023
End-to-end Feature Selection Approach for Learning Skinny Trees
Shibal Ibrahim
Kayhan Behdin
Rahul Mazumder
25
0
0
28 Oct 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
13
88
0
22 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
56
353
0
20 Jun 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
33
65
0
20 Jun 2023
GC-Flow: A Graph-Based Flow Network for Effective Clustering
Tianchun Wang
F. Mirzazadeh
X. Zhang
Jing Chen
BDL
40
7
0
26 May 2023
How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning
Rochelle Choenni
Dan Garrette
Ekaterina Shutova
32
15
0
22 May 2023
Discovering Causal Relations and Equations from Data
Gustau Camps-Valls
Andreas Gerhardus
Urmi Ninad
Gherardo Varando
Georg Martius
E. Balaguer-Ballester
Ricardo Vinuesa
Emiliano Díaz
L. Zanna
Jakob Runge
PINN
AI4Cl
AI4CE
CML
35
72
0
21 May 2023
SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving
Minjae Lee
Seongmin Park
Hyung-Se Kim
Minyong Yoon
Jangwhan Lee
Junwon Choi
Nam Sung Kim
Mingu Kang
Jungwook Choi
3DPC
26
4
0
12 May 2023
VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking
A. Nalmpantis
Apostolos Panagiotopoulos
John Gkountouras
Konstantinos Papakostas
Wilker Aziz
15
4
0
13 Apr 2023
NTK-SAP: Improving neural network pruning by aligning training dynamics
Yite Wang
Dawei Li
Ruoyu Sun
28
19
0
06 Apr 2023
Learning Sparsity of Representations with Discrete Latent Variables
Zhao Xu
Daniel Oñoro-Rubio
G. Serra
Mathias Niepert
13
0
0
03 Apr 2023
Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis
Haoyu He
Yuede Ji
H. H. Huang
23
20
0
26 Mar 2023
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
Sung-Feng Huang
Chia-Ping Chen
Zhi-Sheng Chen
Yu-Pao Tsai
Hung-yi Lee
16
2
0
21 Mar 2023
Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu
30
4
0
20 Mar 2023
On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee
Chenyang Li
Jihoon Chung
Mengnan Du
Haimin Wang
Xianlian Zhou
Bohao Shen
33
1
0
13 Mar 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
26
29
0
03 Mar 2023
DSD
2
^2
2
: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
24
7
0
02 Mar 2023
Balanced Training for Sparse GANs
Yite Wang
Jing Wu
N. Hovakimyan
Ruoyu Sun
32
9
0
28 Feb 2023
Considering Layerwise Importance in the Lottery Ticket Hypothesis
Benjamin Vandersmissen
José Oramas
15
1
0
22 Feb 2023
FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning
Anran Li
Hongyi Peng
Lan Zhang
Jiahui Huang
Qing-Wu Guo
Han Yu
Yang Liu
FedML
25
28
0
21 Feb 2023
DAG Learning on the Permutahedron
Valentina Zantedeschi
Luca Franceschi
Jean Kaddour
Matt J. Kusner
Vlad Niculae
27
11
0
27 Jan 2023
Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning
Athul Shibu
Abhishek Kumar
Heechul Jung
Dong-Gyu Lee
9
1
0
26 Jan 2023
SpArX: Sparse Argumentative Explanations for Neural Networks [Technical Report]
Hamed Ayoobi
Nico Potyka
Francesca Toni
16
17
0
23 Jan 2023
Learnable Heterogeneous Convolution: Learning both topology and strength
Rongzhen Zhao
Zhenzhi Wu
Qikun Zhang
19
6
0
13 Jan 2023
Inversion of Bayesian Networks
Jesse van Oostrum
Peter van Hintum
Nihat Ay
BDL
13
1
0
20 Dec 2022
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification
Lirui Xiao
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
27
10
0
06 Dec 2022
Robust Training of Graph Neural Networks via Noise Governance
Siyi Qian
Haochao Ying
Renjun Hu
Jingbo Zhou
Jintai Chen
D. Z. Chen
Jian Wu
NoLa
25
34
0
12 Nov 2022
Accounting for Temporal Variability in Functional Magnetic Resonance Imaging Improves Prediction of Intelligence
Y. Li
Xin Ma
Rajshekhar Sunderraman
Shihao Ji
Suprateek Kundu
16
6
0
11 Nov 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
18
3
0
28 Oct 2022
1
2
3
4
Next