The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
12 October 2022
Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix X. Yu, Ruiqi Guo, Sanjiv Kumar
MoE
ArXiv · PDF · HTML

Papers citing "The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers"

15 papers shown

R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu (Allen) Zhang, Zechun Liu, Yuandong Tian, Harshit Khaitan, Z. Wang, Steven Li
28 Apr 2025

Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
Roberto Garcia, Jerry Liu, Daniel Sorvisto, Sabri Eyuboglu
23 Mar 2025

Repetition Neurons: How Do Language Models Produce Repetitions?
Tatsuya Hiraoka, Kentaro Inui
MILM
21 Feb 2025

Merging Feed-Forward Sublayers for Compressed Transformers
Neha Verma, Kenton W. Murray, Kevin Duh
AI4CE
10 Jan 2025

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo, Chenyang Song, Xu Han, Y. Chen, Chaojun Xiao, Zhiyuan Liu, Maosong Sun
04 Nov 2024

Dual sparse training framework: inducing activation map sparsity via Transformed ℓ1 regularization
Xiaolong Yu, Cong Tian
30 May 2024

FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference
Zirui Liu, Qingquan Song, Q. Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
08 Jan 2024

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence
Y. Ma, Doris Y. Tsao, H. Shum
11 Jul 2022

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s
Felix Chern, Blake A. Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar
28 Jun 2022

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
Tianlong Chen, Zhenyu (Allen) Zhang, Pengju Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang
OOD, AAML
20 Feb 2022

SCENIC: A JAX Library for Computer Vision Research and Beyond
Mostafa Dehghani, A. Gritsenko, Anurag Arnab, Matthias Minderer, Yi Tay
18 Oct 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
04 May 2021

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste
MQ
31 Jan 2021

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
FAtt
29 Jan 2021

Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
V. Papyan, Yaniv Romano, Michael Elad
27 Jul 2016