ResearchTrend.AI
Learning Sparse Neural Networks through $L_0$ Regularization


4 December 2017
Christos Louizos
Max Welling
Diederik P. Kingma

Papers citing "Learning Sparse Neural Networks through $L_0$ Regularization"

50 / 154 papers shown
Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition
Nathanael Tepakbong
Ding-Xuan Zhou
Xiang Zhou
33
0
0
13 May 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
126
1
0
10 Mar 2025
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures
Jiayu Qin
Jianchao Tan
K. Zhang
Xunliang Cai
Wei Wang
40
0
0
19 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
46
1
0
05 Feb 2025
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb
T. Weber
Bernd Bischl
David Rügamer
109
0
0
04 Feb 2025
Playing the Lottery With Concave Regularizers for Sparse Trainable Neural Networks
Giulia Fracastoro
Sophie M. Fosson
Andrea Migliorati
G. Calafiore
40
1
0
19 Jan 2025
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Philipp Mondorf
Sondre Wold
Barbara Plank
34
0
0
02 Oct 2024
Evaluating Model Robustness Using Adaptive Sparse L0 Regularization
Weiyou Liu
Zhenyang Li
Weitong Chen
AAML
25
1
0
28 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
42
3
0
19 Aug 2024
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
34
6
0
05 Jul 2024
Finding Transformer Circuits with Edge Pruning
Adithya Bhaskar
Alexander Wettig
Dan Friedman
Danqi Chen
62
16
0
24 Jun 2024
Geometric sparsification in recurrent neural networks
Wyatt Mackey
Ioannis Schizas
Jared Deighton
David L. Boothe, Jr.
Vasileios Maroulas
28
0
0
10 Jun 2024
Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Guangliang Liu
Milad Afshari
Xitong Zhang
Zhiyu Xue
Avrajit Ghosh
Bidhan Bashyal
Rongrong Wang
K. Johnson
27
0
0
06 Jun 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
43
5
0
30 May 2024
A separability-based approach to quantifying generalization: which layer is best?
Luciano Dyballa
Evan Gerritz
Steven W. Zucker
OOD
28
3
0
02 May 2024
AdaFSNet: Time Series Classification Based on Convolutional Network with a Adaptive and Effective Kernel Size Configuration
Haoxiao Wang
Bo Peng
Jianhua Zhang
Xu Cheng
AI4TS
39
1
0
28 Apr 2024
Where does In-context Translation Happen in Large Language Models
Suzanna Sia
David Mueller
Kevin Duh
LRM
33
0
0
07 Mar 2024
Always-Sparse Training by Growing Connections with Guided Stochastic Exploration
Mike Heddes
Narayan Srinivasa
T. Givargis
Alexandru Nicolau
91
0
0
12 Jan 2024
The LLM Surgeon
Tycho F. A. van der Ouderaa
Markus Nagel
M. V. Baalen
Yuki Markus Asano
Tijmen Blankevoort
28
14
0
28 Dec 2023
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
Shivam Aggarwal
Hans Jakob Damsgaard
Alessandro Pappalardo
Giuseppe Franco
Thomas B. Preußer
Michaela Blott
Tulika Mitra
MQ
19
5
0
21 Nov 2023
End-to-end Feature Selection Approach for Learning Skinny Trees
Shibal Ibrahim
Kayhan Behdin
Rahul Mazumder
27
0
0
28 Oct 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
13
88
0
22 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
56
355
0
20 Jun 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
33
66
0
20 Jun 2023
GC-Flow: A Graph-Based Flow Network for Effective Clustering
Tianchun Wang
F. Mirzazadeh
X. Zhang
Jing Chen
BDL
40
7
0
26 May 2023
How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning
Rochelle Choenni
Dan Garrette
Ekaterina Shutova
32
15
0
22 May 2023
Discovering Causal Relations and Equations from Data
Gustau Camps-Valls
Andreas Gerhardus
Urmi Ninad
Gherardo Varando
Georg Martius
E. Balaguer-Ballester
Ricardo Vinuesa
Emiliano Díaz
L. Zanna
Jakob Runge
PINN
AI4Cl
AI4CE
CML
35
72
0
21 May 2023
SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving
Minjae Lee
Seongmin Park
Hyung-Se Kim
Minyong Yoon
Jangwhan Lee
Junwon Choi
Nam Sung Kim
Mingu Kang
Jungwook Choi
3DPC
26
4
0
12 May 2023
VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking
A. Nalmpantis
Apostolos Panagiotopoulos
John Gkountouras
Konstantinos Papakostas
Wilker Aziz
15
4
0
13 Apr 2023
Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning
Shangli Zhou
Mikhail A. Bragin
Lynn Pepin
Deniz Gurevin
Fei Miao
Caiwen Ding
14
3
0
08 Apr 2023
NTK-SAP: Improving neural network pruning by aligning training dynamics
Yite Wang
Dawei Li
Ruoyu Sun
28
19
0
06 Apr 2023
Learning Sparsity of Representations with Discrete Latent Variables
Zhao Xu
Daniel Oñoro-Rubio
G. Serra
Mathias Niepert
13
0
0
03 Apr 2023
Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis
Haoyu He
Yuede Ji
H. H. Huang
23
20
0
26 Mar 2023
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
Sung-Feng Huang
Chia-Ping Chen
Zhi-Sheng Chen
Yu-Pao Tsai
Hung-yi Lee
18
2
0
21 Mar 2023
Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu
30
4
0
20 Mar 2023
On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee
Chenyang Li
Jihoon Chung
Mengnan Du
Haimin Wang
Xianlian Zhou
Bohao Shen
33
1
0
13 Mar 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
29
29
0
03 Mar 2023
DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
24
7
0
02 Mar 2023
Balanced Training for Sparse GANs
Yite Wang
Jing Wu
N. Hovakimyan
Ruoyu Sun
32
9
0
28 Feb 2023
Considering Layerwise Importance in the Lottery Ticket Hypothesis
Benjamin Vandersmissen
José Oramas
15
1
0
22 Feb 2023
FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning
Anran Li
Hongyi Peng
Lan Zhang
Jiahui Huang
Qing-Wu Guo
Han Yu
Yang Liu
FedML
25
28
0
21 Feb 2023
DAG Learning on the Permutahedron
Valentina Zantedeschi
Luca Franceschi
Jean Kaddour
Matt J. Kusner
Vlad Niculae
27
11
0
27 Jan 2023
Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning
Athul Shibu
Abhishek Kumar
Heechul Jung
Dong-Gyu Lee
9
1
0
26 Jan 2023
SpArX: Sparse Argumentative Explanations for Neural Networks [Technical Report]
Hamed Ayoobi
Nico Potyka
Francesca Toni
16
17
0
23 Jan 2023
Learnable Heterogeneous Convolution: Learning both topology and strength
Rongzhen Zhao
Zhenzhi Wu
Qikun Zhang
21
6
0
13 Jan 2023
Inversion of Bayesian Networks
Jesse van Oostrum
Peter van Hintum
Nihat Ay
BDL
13
1
0
20 Dec 2022
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification
Lirui Xiao
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
27
10
0
06 Dec 2022
Robust Training of Graph Neural Networks via Noise Governance
Siyi Qian
Haochao Ying
Renjun Hu
Jingbo Zhou
Jintai Chen
D. Z. Chen
Jian Wu
NoLa
25
34
0
12 Nov 2022
Accounting for Temporal Variability in Functional Magnetic Resonance Imaging Improves Prediction of Intelligence
Y. Li
Xin Ma
Rajshekhar Sunderraman
Shihao Ji
Suprateek Kundu
19
6
0
11 Nov 2022