2203.00091
Cited By
Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
28 February 2022
Zhaodong Chen
Yuying Quan
Zheng Qu
L. Liu
Yufei Ding
Yuan Xie
Papers citing "Dynamic N:M Fine-grained Structured Sparse Attention Mechanism"
13 / 13 papers shown
Title
Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light
Ali Hassani
Fengzhe Zhou
Aditya Kane
Jiannan Huang
Chieh-Yun Chen
...
Bing Xu
Haicheng Wu
Wen-mei W. Hwu
Ming-Yu Liu
Humphrey Shi
26
0
0
23 Apr 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
48
0
0
13 Mar 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
67
5
0
28 Jan 2025
Preserving Deep Representations In One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
Ryan Lucas
Rahul Mazumder
69
0
0
27 Nov 2024
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression
Yefei He
Feng Chen
Jing Liu
Wenqi Shao
Hong Zhou
K. Zhang
Bohan Zhuang
VLM
44
11
0
11 Oct 2024
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs
Shulin Zeng
Jun Liu
Guohao Dai
Xinhao Yang
Tianyu Fu
...
Zehao Wang
Ruoyu Zhang
Kairui Wen
Xuefei Ning
Yu Wang
54
55
0
08 Jan 2024
Learning Section Weights for Multi-Label Document Classification
Maziar Moradi Fard
Paula Sorolla Bayod
Kiomars Motarjem
Mohammad Alian Nejadi
S. Akhondi
Camilo Thorne
14
0
0
26 Nov 2023
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
Roberto L. Castro
Andrei Ivanov
Diego Andrade
Tal Ben-Nun
B. Fraguela
Torsten Hoefler
11
15
0
03 Oct 2023
Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design
Chao Fang
Wei Sun
Aojun Zhou
Zhongfeng Wang
11
3
0
22 Sep 2023
The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles
Md Shamim Hussain
Mohammed J. Zaki
D. Subramanian
29
2
0
02 Jun 2023
On Learning the Transformer Kernel
Sankalan Pal Chowdhury
Adamos Solomou
Kumar Avinava Dubey
Mrinmaya Sachan
ViT
39
14
0
15 Oct 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
251
2,009
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
238
578
0
12 Mar 2020