HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
28 May 2020
Hanrui Wang
Zhanghao Wu
Zhijian Liu
Han Cai
Ligeng Zhu
Chuang Gan
Song Han
ArXiv (abs) | PDF | HTML | HuggingFace (2 upvotes) | GitHub (334★)

Papers citing "HAT: Hardware-Aware Transformers for Efficient Natural Language Processing"

50 / 112 papers shown
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
112
46
0
25 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
194
129
0
25 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Computer Vision and Pattern Recognition (CVPR), 2022
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
139
23
0
10 Apr 2022
Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yanyang Li
Fuli Luo
Runxin Xu
Songfang Huang
Fei Huang
Liwei Wang
105
3
0
06 Apr 2022
A Fast Post-Training Pruning Framework for Transformers
Neural Information Processing Systems (NeurIPS), 2022
Woosuk Kwon
Sehoon Kim
Michael W. Mahoney
Joseph Hassoun
Kurt Keutzer
A. Gholami
158
183
0
29 Mar 2022
Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering
IEEE Transactions on Multimedia (IEEE TMM), 2022
Zhou Yu
Zitian Jin
Jun Yu
Mingliang Xu
Hongbo Wang
Jianping Fan
124
5
0
24 Mar 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
International Symposium on Computer Architecture (ISCA), 2022
Ali Hadi Zadeh
Mostafa Mahmoud
Ameer Abdelhadi
Andreas Moshovos
MQ
153
37
0
23 Mar 2022
Training-free Transformer Architecture Search
Computer Vision and Pattern Recognition (CVPR), 2022
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
134
54
0
23 Mar 2022
Accelerating Neural Architecture Exploration Across Modalities Using Genetic Algorithms
Daniel Cummings
S. N. Sridhar
Anthony Sarah
Maciej Szankin
AI4CE
94
0
0
25 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Tao Ge
Si-Qing Chen
Furu Wei
MoE
208
28
0
16 Feb 2022
Fast Monte-Carlo Approximation of the Attention Mechanism
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hyunjun Kim
Jeonggil Ko
150
4
0
30 Jan 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
122
5
0
29 Jan 2022
Transformers in Medical Imaging: A Survey
Fahad Shamshad
Salman Khan
Syed Waqas Zamir
Muhammad Haris Khan
Munawar Hayat
Fahad Shahbaz Khan
Huazhu Fu
ViT, LM&MA, MedIm
244
865
0
24 Jan 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention
Neural Information Processing Systems (NeurIPS), 2022
Zhanghao Wu
Paras Jain
Matthew A. Wright
Azalia Mirhoseini
Joseph E. Gonzalez
Ion Stoica
GNN
188
352
0
21 Jan 2022
MAPLE: Microprocessor A Priori for Latency Estimation
Saad Abbasi
Alexander Wong
M. Shafiee
125
13
0
30 Nov 2021
Searching the Search Space of Vision Transformer
Minghao Chen
Kan Wu
Bolin Ni
Houwen Peng
Bei Liu
Jianlong Fu
Hongyang Chao
Haibin Ling
ViT
132
65
0
29 Nov 2021
Sparse is Enough in Scaling Transformers
Sebastian Jaszczur
Aakanksha Chowdhery
Afroz Mohiuddin
Lukasz Kaiser
Wojciech Gajewski
Henryk Michalewski
Jonni Kanerva
MoE
115
116
0
24 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
British Machine Vision Conference (BMVC), 2021
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
101
3
0
15 Nov 2021
One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2021
Bingqian Lu
Jianyi Yang
Weiwen Jiang
Yiyu Shi
Shaolei Ren
183
25
0
01 Nov 2021
Pipeline Parallelism for Inference on Heterogeneous Edge Computing
Yang Hu
Connor Imes
Xuanang Zhao
Souvik Kundu
Peter A. Beerel
S. Crago
J. Walters
MoE
187
25
0
28 Oct 2021
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Panjie Qi
E. Sha
Qingfeng Zhuge
Hongwu Peng
Shaoyi Huang
Zhenglun Kong
Yuhong Song
Bingbing Li
111
54
0
19 Oct 2021
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
198
55
0
18 Oct 2021
Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet
Haichuan Yang
Yuan Shangguan
Dilin Wang
Meng Li
P. Chuang
Xiaohui Zhang
Ganesh Venkatesh
Ozlem Kalinli
Vikas Chandra
152
14
0
15 Oct 2021
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
Vinod Ganesan
Gowtham Ramesh
Pratyush Kumar
90
9
0
10 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
133
168
0
27 Sep 2021
The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang
Chi Hu
Yongyu Mu
Zhongxiang Yan
Siming Wu
...
Hang Cao
Bei Li
Ye Lin
Tong Xiao
Jingbo Zhu
113
2
0
16 Sep 2021
RankNAS: Efficient Neural Architecture Search by Pairwise Ranking
Chi Hu
Chenglong Wang
Xiangnan Ma
Xia Meng
Yinqiao Li
Tong Xiao
Jingbo Zhu
Changliang Li
107
13
0
15 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
122
28
0
15 Sep 2021
Searching for Efficient Multi-Stage Vision Transformers
Yi-Lun Liao
S. Karaman
Vivienne Sze
ViT
93
19
0
01 Sep 2021
Generic Neural Architecture Search via Regression
Neural Information Processing Systems (NeurIPS), 2021
Yuhong Li
Cong Hao
Pan Li
Jinjun Xiong
Deming Chen
134
31
0
04 Aug 2021
Group Fisher Pruning for Practical Network Compression
Liyang Liu
Shilong Zhang
Zhanghui Kuang
Aojun Zhou
Jingliang Xue
Xinjiang Wang
Yimin Chen
Wenming Yang
Q. Liao
Wayne Zhang
152
173
0
02 Aug 2021
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Yichun Yin
Cheng Chen
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
VLM
112
50
0
29 Jul 2021
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
International Symposium on High-Performance Computer Architecture (HPCA), 2021
Hanrui Wang
Yongshan Ding
Jiaqi Gu
Zirui Li
Chengyue Wu
David Z. Pan
Frederic T. Chong
Song Han
261
222
0
22 Jul 2021
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations
British Machine Vision Conference (BMVC), 2021
Alexandra Dana
M. Shutman
Yotam Perlitz
Ran Vitek
Tomer Peleg
R. Jevnisek
ObjD
213
3
0
21 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
173
310
0
01 Jul 2021
LV-BERT: Exploiting Layer Variety for BERT
Findings (Findings), 2021
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
91
0
0
22 Jun 2021
HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning
Hayeon Lee
Sewoong Lee
Song Chong
Sung Ju Hwang
109
29
0
16 Jun 2021
FEAR: A Simple Lightweight Method to Rank Architectures
Debadeepta Dey
Shital C. Shah
Sébastien Bubeck
OOD
124
4
0
07 Jun 2021
You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient
Shaokun Zhang
Xiawu Zheng
Chenyi Yang
Yuchao Li
Yan Wang
Yong Li
Mengdi Wang
Shen Li
Jun Yang
Rongrong Ji
MQ
125
23
0
04 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Neural Information Processing Systems (NeurIPS), 2021
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
174
25
0
31 May 2021
Memory-Efficient Differentiable Transformer Architecture Search
Findings (Findings), 2021
Yuekai Zhao
Li Dong
Yelong Shen
Zhihua Zhang
Furu Wei
Weizhu Chen
ViT
117
19
0
31 May 2021
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Knowledge Discovery and Data Mining (KDD), 2021
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
105
85
0
30 May 2021
Dynamic Multi-Branch Layers for On-Device Neural Machine Translation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Zhixing Tan
Zeyuan Yang
Meng Zhang
Qun Liu
Maosong Sun
Yang Liu
AI4CE
146
5
0
14 May 2021
Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
W. Lou
Lei Xun
Amin Sabet
Jia Bi
Jonathon S. Hare
G. Merrett
AI4CE
128
32
0
08 May 2021
Translational NLP: A New Paradigm and General Principles for Natural Language Processing Research
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Denis R. Newman-Griffis
J. Lehman
Carolyn Rose
H. Hochheiser
94
20
0
16 Apr 2021
Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design
IEEE Design & Test (DT), 2021
Cong Hao
Jordan Dotzel
Jinjun Xiong
Luca Benini
Zhiru Zhang
Deming Chen
137
40
0
25 Mar 2021
Scalable Vision Transformers with Hierarchical Pooling
IEEE International Conference on Computer Vision (ICCV), 2021
Zizheng Pan
Bohan Zhuang
Jing Liu
Haoyu He
Jianfei Cai
ViT
144
142
0
19 Mar 2021
AlphaNet: Improved Training of Supernets with Alpha-Divergence
International Conference on Machine Learning (ICML), 2021
Dilin Wang
Chengyue Gong
Meng Li
Qiang Liu
Vikas Chandra
325
49
0
16 Feb 2021
Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices
Design Automation Conference (DAC), 2021
Yuhong Song
Weiwen Jiang
Bingbing Li
Panjie Qi
Qingfeng Zhuge
E. Sha
Sakyasingha Dasgupta
Yiyu Shi
Caiwen Ding
112
18
0
12 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
Kaoutar El Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
170
121
0
22 Jan 2021