ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.04010
  4. Cited By
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

8 February 2021
Aojun Zhou
Yukun Ma
Junnan Zhu
Jianbo Liu
Zhijie Zhang
Kun Yuan
Wenxiu Sun
Hongsheng Li
ArXivPDFHTML

Papers citing "Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch"

50 / 145 papers shown
Title
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
Stanislas Laborde
Martin Cousseau
Antoun Yaacoub
Lionel Prevost
MQ
23
0
0
12 May 2025
Sparse-to-Sparse Training of Diffusion Models
Sparse-to-Sparse Training of Diffusion Models
Inês Cardoso Oliveira
Decebal Constantin Mocanu
Luis A. Leiva
DiffM
86
0
0
30 Apr 2025
Efficient LLMs with AMP: Attention Heads and MLP Pruning
Efficient LLMs with AMP: Attention Heads and MLP Pruning
Leandro Giusti Mugnaini
Bruno Yamamoto
Lucas Lauton de Alcantara
Victor Zacarias
Edson Bollis
Lucas Pellicer
A. H. R. Costa
Artur Jordao
37
0
0
29 Apr 2025
Periodic Online Testing for Sparse Systolic Tensor Arrays
Periodic Online Testing for Sparse Systolic Tensor Arrays
C. Peltekis
Chrysostomos Nicopoulos
G. Dimitrakopoulos
47
0
0
25 Apr 2025
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models
Jaewoo Lee
Keyang Xuan
Chanakya Ekbote
Sandeep Polisetty
Yi Ren Fung
Paul Pu Liang
VLM
37
0
0
14 Apr 2025
PQS (Prune, Quantize, and Sort): Low-Bitwidth Accumulation of Dot Products in Neural Network Computations
PQS (Prune, Quantize, and Sort): Low-Bitwidth Accumulation of Dot Products in Neural Network Computations
Vikas Natesh
H. T. Kung
MQ
139
0
0
12 Apr 2025
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression
Ivan Ilin
Peter Richtárik
26
0
0
06 Apr 2025
MDP: Multidimensional Vision Model Pruning with Latency Constraint
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose M. Alvarez
VLM
46
0
0
02 Apr 2025
Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process
Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process
Yuanze Li
Shihao Yuan
Haolin Wang
Qizhang Li
Ming-Yu Liu
Chen Xu
Guangming Shi
Wangmeng Zuo
56
0
0
17 Mar 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
48
0
0
26 Feb 2025
EvoP: Robust LLM Inference via Evolutionary Pruning
EvoP: Robust LLM Inference via Evolutionary Pruning
Shangyu Wu
Hongchao Du
Ying Xiong
Shuai Chen
Tei-Wei Kuo
Nan Guan
Chun Jason Xue
34
1
0
19 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
46
1
0
05 Feb 2025
Symmetric Pruning of Large Language Models
Symmetric Pruning of Large Language Models
Kai Yi
Peter Richtárik
AAML
VLM
57
0
0
31 Jan 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
78
5
0
28 Jan 2025
Meta-Sparsity: Learning Optimal Sparse Structures in Multi-task Networks through Meta-learning
Meta-Sparsity: Learning Optimal Sparse Structures in Multi-task Networks through Meta-learning
Richa Upadhyay
Ronald Phlypo
Rajkumar Saini
Marcus Liwicki
35
0
0
21 Jan 2025
MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks
MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks
Yifei Liu
Zhihang Zhong
Yifan Zhan
Sheng Xu
Xiao Sun
3DGS
51
3
0
29 Dec 2024
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using
  Reinforcement Learning and Graph Learning
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
Lixian Jing
Jianpeng Qi
Junyu Dong
Yanwei Yu
3DPC
AI4CE
39
0
0
24 Dec 2024
Preserving Deep Representations In One-Shot Pruning: A Hessian-Free
  Second-Order Optimization Framework
Preserving Deep Representations In One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
Ryan Lucas
Rahul Mazumder
74
0
0
27 Nov 2024
Layer Pruning with Consensus: A Triple-Win Solution
Layer Pruning with Consensus: A Triple-Win Solution
Leandro Giusti Mugnaini
Carolina Tavares Duarte
Anna H. Reali Costa
Artur Jordao
71
0
0
21 Nov 2024
AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient
  and Instant Deployment
AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
Y. Fu
Zhongzhi Yu
Junwei Li
Jiayi Qian
Yongan Zhang
Xiangchi Yuan
Dachuan Shi
Roman Yakunin
Y. Lin
29
2
0
15 Nov 2024
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Elia Cunegatti
Leonardo Lucio Custode
Giovanni Iacca
47
0
0
11 Nov 2024
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert
  Pruning and Intra-Expert Low-Rank Decomposition
MoE-I2^22: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
Yuanlin Duan
Wenqi Jia
Miao Yin
Yu Cheng
Bo Yuan
MoE
71
4
0
01 Nov 2024
Compressing Large Language Models with Automated Sub-Network Search
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker
B. Staffler
Frank Hutter
Aaron Klein
LRM
38
0
0
09 Oct 2024
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu
Q. Xiao
Shunxin Wang
N. Strisciuglio
Mykola Pechenizkiy
M. V. Keulen
D. Mocanu
Elena Mocanu
OOD
3DH
52
0
0
03 Oct 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Gongfan Fang
Hongxu Yin
Saurav Muralidharan
Greg Heinrich
Jeff Pool
Jan Kautz
Pavlo Molchanov
Xinchao Wang
35
3
0
26 Sep 2024
Are Sparse Neural Networks Better Hard Sample Learners?
Are Sparse Neural Networks Better Hard Sample Learners?
Q. Xiao
Boqian Wu
Lu Yin
Christopher Neil Gadzinski
Tianjin Huang
Mykola Pechenizkiy
D. Mocanu
35
1
0
13 Sep 2024
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu
Jun-Jie Zhu
Jianfei Chen
36
0
0
13 Sep 2024
Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for
  Transformer Pretraining
Mixed Sparsity Training: Achieving 4×\times× FLOP Reduction for Transformer Pretraining
Pihe Hu
Shaolong Li
Longbo Huang
28
0
0
21 Aug 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
45
4
0
03 Aug 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse
  Training
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
35
5
0
30 Jul 2024
Toward Efficient Permutation for Hierarchical N:M Sparsity on GPUs
Toward Efficient Permutation for Hierarchical N:M Sparsity on GPUs
Seungmin Yu
Xiaodie Yi
Hayun Lee
Dongkun Shin
21
1
0
30 Jul 2024
Nerva: a Truly Sparse Implementation of Neural Networks
Nerva: a Truly Sparse Implementation of Neural Networks
Wieger Wesselink
Bram Grooten
Qiao Xiao
Cássio Machado de Campos
Mykola Pechenizkiy
25
0
0
24 Jul 2024
Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with
  Latency Constraint
Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose Alvarez
VLM
36
3
0
17 Jun 2024
ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large
  Language Models
ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models
Xiang Meng
Kayhan Behdin
Haoyue Wang
Rahul Mazumder
37
3
0
12 Jun 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
34
5
0
31 May 2024
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large
  Language Models
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Xudong Lu
Aojun Zhou
Yuhui Xu
Renrui Zhang
Peng Gao
Hongsheng Li
26
7
0
25 May 2024
QGait: Toward Accurate Quantization for Gait Recognition with Binarized
  Input
QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input
Senmao Tian
Haoyu Gao
Gangyi Hong
Shuyun Wang
JingJie Wang
Xin Yu
Shunli Zhang
MQ
32
1
0
22 May 2024
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of
  Deep Neural Networks
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Xue Geng
Zhe Wang
Chunyun Chen
Qing Xu
Kaixin Xu
...
Zhenghua Chen
M. Aly
Jie Lin
Min-man Wu
Xiaoli Li
33
1
0
09 May 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training
Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu
Gaurav Gupta
Karthick Gopalswamy
Amith R. Mamidala
Hao Zhou
Jeffrey Huynh
Youngsuk Park
Ron Diamant
Anoop Deoras
Jun Huan
MQ
49
3
0
06 May 2024
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression
  and Deployment Toolkit for Prototype Hardware Accelerator Design
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Jian Meng
Yuan Liao
Anupreetham Anupreetham
Ahmed Hassan
Shixing Yu
Han-Sok Suh
Xiaofeng Hu
Jae-sun Seo
MQ
47
1
0
02 May 2024
SparseDM: Toward Sparse Efficient Diffusion Models
SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang
Jianfei Chen
He Li
Zhenpeng Mi
Jun-Jie Zhu
DiffM
60
8
0
16 Apr 2024
Rethinking Pruning for Vision-Language Models: Strategies for Effective
  Sparsity and Performance Restoration
Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration
Shwai He
Ang Li
Tianlong Chen
VLM
42
1
0
03 Apr 2024
Accelerating Transformer Pre-training with 2:4 Sparsity
Accelerating Transformer Pre-training with 2:4 Sparsity
Yuezhou Hu
Kang Zhao
Weiyu Huang
Jianfei Chen
Jun Zhu
57
7
0
02 Apr 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient
  LLMs Under Compression
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong
Jinhao Duan
Chenhui Zhang
Zhangheng Li
Chulin Xie
...
B. Kailkhura
Dan Hendrycks
Dawn Song
Zhangyang Wang
Bo-wen Li
34
24
0
18 Mar 2024
Abstracting Sparse DNN Acceleration via Structured Sparse Tensor
  Decomposition
Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition
Geonhwa Jeong
Po-An Tsai
A. Bambhaniya
S. Keckler
Tushar Krishna
25
7
0
12 Mar 2024
Not All Experts are Equal: Efficient Expert Pruning and Skipping for
  Mixture-of-Experts Large Language Models
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Xudong Lu
Qi Liu
Yuhui Xu
Aojun Zhou
Siyuan Huang
Bo-Wen Zhang
Junchi Yan
Hongsheng Li
MoE
32
25
0
22 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for
  Large Language Models
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
22
6
0
15 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A
  Survey
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
38
47
0
15 Feb 2024
Towards Meta-Pruning via Optimal Transport
Towards Meta-Pruning via Optimal Transport
Alexander Theus
Olin Geimer
Friedrich Wicke
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
MoMe
16
3
0
12 Feb 2024
Progressive Gradient Flow for Robust N:M Sparsity Training in
  Transformers
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
A. Bambhaniya
Amir Yazdanbakhsh
Suvinay Subramanian
Sheng-Chun Kao
Shivani Agrawal
Utku Evci
Tushar Krishna
54
16
0
07 Feb 2024
123
Next