SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Neural Information Processing Systems (NeurIPS), 2020
22 March 2020
Xiaoya Li, Yuxian Meng, Mingxin Zhou, Qinghong Han, Leilei Gan, Jiwei Li

Papers citing "SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection"

16 papers shown
BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers
Patrik Okanovic, Sameer Deshmukh, Grzegorz Kwaśniewski, Yi Zhu, Haruto Fujii, ..., Maciej Besta, Kentaro Katayama, Takumi Honda, Yusuke Nagasaka, Torsten Hoefler
03 Jul 2025

Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He
Communities: MQ
15 Feb 2024

ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Reliability Engineering & System Safety (Reliab. Eng. Syst. Saf.), 2023
Yanfang Li, Huan Wang, Muxia Sun
Communities: LM&MA, AI4TS, AI4CE
10 May 2023

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Neural Information Processing Systems (NeurIPS), 2022
Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu
Communities: MGen
19 Oct 2022

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
International Conference on Machine Learning (ICML), 2022
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Dianbo Sui
Communities: 3DV
14 Oct 2022

Hierarchical Graph Transformer with Adaptive Node Sampling
Neural Information Processing Systems (NeurIPS), 2022
Zaixin Zhang, Qi Liu, Qingyong Hu, Cheekong Lee
08 Oct 2022

Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences
International Conference on Information and Knowledge Management (CIKM), 2022
Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qingguo Chen, Bing Wang
Communities: VLM
08 Aug 2022

Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
Communities: 3DV
27 Apr 2022

Faster Nearest Neighbor Machine Translation
Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong
15 Dec 2021

GNN-LM: Language Modeling based on Global Contexts via GNN
Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Leilei Gan, Jiwei Li
Communities: LRM
17 Oct 2021

Layer-wise Model Pruning based on Mutual Information
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Chun Fan, Jiwei Li, Xiang Ao, Leilei Gan, Yuxian Meng, Xiaofei Sun
28 Aug 2021

A Survey of Transformers
AI Open (AO), 2021
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
Communities: ViT
08 Jun 2021

Fast Nearest Neighbor Machine Translation
Findings, 2021
Yuxian Meng, Xiaoya Li, Xiayu Zheng, Leilei Gan, Xiaofei Sun, Tianwei Zhang, Jiwei Li
Communities: LRM
30 May 2021

Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries
Xiaofei Sun, Zijun Sun, Yuxian Meng, Jiwei Li, Chun Fan
14 Oct 2020

$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar
08 Jun 2020

An Attentive Survey of Attention Models
S. Chaudhari, Varun Mithal, Gungor Polatkan, R. Ramanath
05 Apr 2019