2005.14187
Cited By
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
28 May 2020
Hanrui Wang
Zhanghao Wu
Zhijian Liu
Han Cai
Ligeng Zhu
Chuang Gan
Song Han
Papers citing "HAT: Hardware-Aware Transformers for Efficient Natural Language Processing"
50 / 112 papers shown
Title
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
112
46
0
25 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
194
129
0
25 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Computer Vision and Pattern Recognition (CVPR), 2022
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
139
23
0
10 Apr 2022
Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yanyang Li
Fuli Luo
Runxin Xu
Songfang Huang
Fei Huang
Liwei Wang
105
3
0
06 Apr 2022
A Fast Post-Training Pruning Framework for Transformers
Neural Information Processing Systems (NeurIPS), 2022
Woosuk Kwon
Sehoon Kim
Michael W. Mahoney
Joseph Hassoun
Kurt Keutzer
A. Gholami
158
183
0
29 Mar 2022
Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering
IEEE Transactions on Multimedia (IEEE TMM), 2022
Zhou Yu
Zitian Jin
Jun Yu
Mingliang Xu
Hongbo Wang
Jianping Fan
124
5
0
24 Mar 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
International Symposium on Computer Architecture (ISCA), 2022
Ali Hadi Zadeh
Mostafa Mahmoud
Ameer Abdelhadi
Andreas Moshovos
MQ
153
37
0
23 Mar 2022
Training-free Transformer Architecture Search
Computer Vision and Pattern Recognition (CVPR), 2022
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
134
54
0
23 Mar 2022
Accelerating Neural Architecture Exploration Across Modalities Using Genetic Algorithms
Daniel Cummings
S. N. Sridhar
Anthony Sarah
Maciej Szankin
AI4CE
94
0
0
25 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Tao Ge
Si-Qing Chen
Furu Wei
MoE
208
28
0
16 Feb 2022
Fast Monte-Carlo Approximation of the Attention Mechanism
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hyunjun Kim
Jeonggil Ko
150
4
0
30 Jan 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
122
5
0
29 Jan 2022
Transformers in Medical Imaging: A Survey
Fahad Shamshad
Salman Khan
Syed Waqas Zamir
Muhammad Haris Khan
Munawar Hayat
Fahad Shahbaz Khan
Huazhu Fu
ViT
LM&MA
MedIm
244
865
0
24 Jan 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention
Neural Information Processing Systems (NeurIPS), 2022
Zhanghao Wu
Paras Jain
Matthew A. Wright
Azalia Mirhoseini
Joseph E. Gonzalez
Ion Stoica
GNN
188
352
0
21 Jan 2022
MAPLE: Microprocessor A Priori for Latency Estimation
Saad Abbasi
Alexander Wong
M. Shafiee
125
13
0
30 Nov 2021
Searching the Search Space of Vision Transformer
Minghao Chen
Kan Wu
Bolin Ni
Houwen Peng
Bei Liu
Jianlong Fu
Hongyang Chao
Haibin Ling
ViT
132
65
0
29 Nov 2021
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
115
116
0
24 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
British Machine Vision Conference (BMVC), 2021
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
101
3
0
15 Nov 2021
One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2021
Bingqian Lu
Jianyi Yang
Weiwen Jiang
Yiyu Shi
Shaolei Ren
183
25
0
01 Nov 2021
Pipeline Parallelism for Inference on Heterogeneous Edge Computing
Yang Hu
Connor Imes
Xuanang Zhao
Souvik Kundu
Peter A. Beerel
S. Crago
J. Walters
MoE
187
25
0
28 Oct 2021
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Panjie Qi
E. Sha
Qingfeng Zhuge
Hongwu Peng
Shaoyi Huang
Zhenglun Kong
Yuhong Song
Bingbing Li
111
54
0
19 Oct 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
198
55
0
18 Oct 2021
Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet
Haichuan Yang
Yuan Shangguan
Dilin Wang
Meng Li
P. Chuang
Xiaohui Zhang
Ganesh Venkatesh
Ozlem Kalinli
Vikas Chandra
152
14
0
15 Oct 2021
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
Vinod Ganesan
Gowtham Ramesh
Pratyush Kumar
90
9
0
10 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
133
168
0
27 Sep 2021
The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang
Chi Hu
Yongyu Mu
Zhongxiang Yan
Siming Wu
...
Hang Cao
Bei Li
Ye Lin
Tong Xiao
Jingbo Zhu
113
2
0
16 Sep 2021
RankNAS: Efficient Neural Architecture Search by Pairwise Ranking
Chi Hu
Chenglong Wang
Xiangnan Ma
Xia Meng
Yinqiao Li
Tong Xiao
Jingbo Zhu
Changliang Li
107
13
0
15 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
122
28
0
15 Sep 2021
Searching for Efficient Multi-Stage Vision Transformers
Yi-Lun Liao
S. Karaman
Vivienne Sze
ViT
93
19
0
01 Sep 2021
Generic Neural Architecture Search via Regression
Neural Information Processing Systems (NeurIPS), 2021
Yuhong Li
Cong Hao
Pan Li
Jinjun Xiong
Deming Chen
134
31
0
04 Aug 2021
Group Fisher Pruning for Practical Network Compression
Liyang Liu
Shilong Zhang
Zhanghui Kuang
Aojun Zhou
Jingliang Xue
Xinjiang Wang
Yimin Chen
Wenming Yang
Q. Liao
Wayne Zhang
152
173
0
02 Aug 2021
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Yichun Yin
Cheng Chen
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
VLM
112
50
0
29 Jul 2021
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
International Symposium on High-Performance Computer Architecture (HPCA), 2021
Hanrui Wang
Yongshan Ding
Jiaqi Gu
Zirui Li
Chengyue Wu
David Z. Pan
Frederic T. Chong
Song Han
261
222
0
22 Jul 2021
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations
British Machine Vision Conference (BMVC), 2021
Alexandra Dana
M. Shutman
Yotam Perlitz
Ran Vitek
Tomer Peleg
R. Jevnisek
ObjD
213
3
0
21 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
173
310
0
01 Jul 2021
LV-BERT: Exploiting Layer Variety for BERT
Findings (Findings), 2021
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
91
0
0
22 Jun 2021
HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning
Hayeon Lee
Sewoong Lee
Song Chong
Sung Ju Hwang
109
29
0
16 Jun 2021
FEAR: A Simple Lightweight Method to Rank Architectures
Debadeepta Dey
Shital C. Shah
Sébastien Bubeck
OOD
124
4
0
07 Jun 2021
You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient
Shaokun Zhang
Xiawu Zheng
Chenyi Yang
Yuchao Li
Yan Wang
Yong Li
Mengdi Wang
Shen Li
Jun Yang
Rongrong Ji
MQ
125
23
0
04 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Neural Information Processing Systems (NeurIPS), 2021
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
174
25
0
31 May 2021
Memory-Efficient Differentiable Transformer Architecture Search
Findings (Findings), 2021
Yuekai Zhao
Li Dong
Yelong Shen
Zhihua Zhang
Furu Wei
Weizhu Chen
ViT
117
19
0
31 May 2021
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Knowledge Discovery and Data Mining (KDD), 2021
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
105
85
0
30 May 2021
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Zhixing Tan
Zeyuan Yang
Meng Zhang
Qun Liu
Maosong Sun
Yang Liu
AI4CE
146
5
0
14 May 2021
Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
W. Lou
Lei Xun
Amin Sabet
Jia Bi
Jonathon S. Hare
G. Merrett
AI4CE
128
32
0
08 May 2021
Translational NLP: A New Paradigm and General Principles for Natural Language Processing Research
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Denis R. Newman-Griffis
J. Lehman
Carolyn Rose
H. Hochheiser
94
20
0
16 Apr 2021
Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design
IEEE Design & Test (DT), 2021
Cong Hao
Jordan Dotzel
Jinjun Xiong
Luca Benini
Zhiru Zhang
Deming Chen
137
40
0
25 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
IEEE International Conference on Computer Vision (ICCV), 2021
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
144
142
0
19 Mar 2021
AlphaNet: Improved Training of Supernets with Alpha-Divergence
International Conference on Machine Learning (ICML), 2021
Dilin Wang
Chengyue Gong
Meng Li
Qiang Liu
Vikas Chandra
325
49
0
16 Feb 2021
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Design Automation Conference (DAC), 2021
Yuhong Song
Weiwen Jiang
Bingbing Li
Panjie Qi
Qingfeng Zhuge
E. Sha
Sakyasingha Dasgupta
Yiyu Shi
Caiwen Ding
112
18
0
12 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
Kaoutar El Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
170
121
0
22 Jan 2021