HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
28 May 2020
Hanrui Wang
Zhanghao Wu
Zhijian Liu
Han Cai
Ligeng Zhu
Chuang Gan
Song Han
ArXiv (abs) | PDF | HTML | HuggingFace (2 upvotes) | GitHub (334★)

Papers citing "HAT: Hardware-Aware Transformers for Efficient Natural Language Processing"

50 / 115 papers shown
Elastic ViTs from Pretrained Models without Retraining
Walter Simoncini
Michael Dorkenwald
Tijmen Blankevoort
Cees G. M. Snoek
Yuki Markus Asano
VLM
71
0
0
20 Oct 2025
Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
Arjun Krishnakumar
R. Sukthanker
Hannan Javed Mahadik
Gabriela Kadlecová
Vladyslav Moroshan
Timur Carstensen
Frank Hutter
Aaron Klein
61
0
0
08 Oct 2025
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
Yuxuan Hu
Jihao Liu
Ke Wang
Jinliang Zhen
Weikang Shi
Manyuan Zhang
Qi Dou
R. Liu
Aojun Zhou
Hongsheng Li
143
1
0
06 Sep 2025
ESM: A Framework for Building Effective Surrogate Models for Hardware-Aware Neural Architecture Search
Design Automation Conference (DAC), 2025
Azaz-Ur-Rehman Nasir
Samroz Ahmad Shoaib
Muhammad Abdullah Hanif
Muhammad Shafique
70
0
0
02 Aug 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
247
0
0
11 Feb 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
AAAI Conference on Artificial Intelligence (AAAI), 2024
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
301
1
0
28 Jan 2025
Cross-layer Attention Sharing for Pre-trained Large Language Models
Yongyu Mu
Yuzhang Wu
Yuchun Fan
Chenglong Wang
Hengyu Li
...
Murun Yang
Fandong Meng
Jie Zhou
Tong Xiao
Jingbo Zhu
174
5
0
04 Aug 2024
Croppable Knowledge Graph Embedding
Yushan Zhu
Wen Zhang
Zhiqiang Liu
Yin Hua
Lei Liang
H. Chen
192
0
0
03 Jul 2024
The Need for Speed: Pruning Transformers with One Recipe
Samir Khaki
Konstantinos N. Plataniotis
238
14
0
26 Mar 2024
Multi-objective Differentiable Neural Architecture Search
R. Sukthanker
Arber Zela
B. Staffler
Samuel Dooley
Josif Grabocka
Katharina Eggensperger
437
1
0
28 Feb 2024
CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware
Souvik Kundu
Anthony Sarah
Vinay Joshi
O. J. Omer
S. Subramoney
141
0
0
19 Feb 2024
TransAxx: Efficient Transformers with Approximate Computing
Dimitrios Danopoulos
Georgios Zervakis
Dimitrios Soudris
Jörg Henkel
ViT
217
4
0
12 Feb 2024
Weight-Entanglement Meets Gradient-Based Neural Architecture Search
R. Sukthanker
Arjun Krishnakumar
Mahmoud Safari
Katharina Eggensperger
156
5
0
16 Dec 2023
DistDNAS: Search Efficient Feature Interactions within 2 Hours
BigData Congress [Services Society] (BSS), 2023
Tunhou Zhang
W. Wen
Igor Fedorov
Xi Liu
Buyun Zhang
...
Wen-Yen Chen
Yiping Han
Feng Yan
Hai Helen Li
Yiran Chen
226
1
0
01 Nov 2023
MatFormer: Nested Transformer for Elastic Inference
Neural Information Processing Systems (NeurIPS), 2023
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
215
57
0
11 Oct 2023
Quantized Transformer Language Model Implementations on Edge Devices
International Conference on Machine Learning and Applications (ICMLA), 2023
Mohammad Wali Ur Rahman
Murad Mehrab Abrar
Hunter Gibbons Copening
Salim Hariri
Sicong Shao
Pratik Satam
Soheil Salehi
MQ
121
20
0
06 Oct 2023
PRAT: PRofiling Adversarial aTtacks
Rahul Ambati
Naveed Akhtar
Lin Wang
Yogesh S Rawat
AAML
148
1
0
20 Sep 2023
RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework
Yuelei Wang
Ting Zhang
Liangjin Zhao
Lin Hu
Zhechao Wang
...
Kaiqiang Chen
Xuan Zeng
Zhirui Wang
Hongqi Wang
Xian Sun
169
7
0
16 Sep 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
398
6
0
18 Aug 2023
Training-free Neural Architecture Search for RNNs and Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aaron Serianni
Jugal Kalita
141
8
0
01 Jun 2023
Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
International Conference on Learning Representations (ICLR), 2023
Hayeon Lee
Sohyun An
Minseon Kim
Sung Ju Hwang
OOD
119
5
0
26 May 2023
System-status-aware Adaptive Network for Online Streaming Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Lin Geng Foo
Jia Gong
Zhipeng Fan
Jing Liu
AI4TS
176
15
0
28 Mar 2023
EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Shikhar Tuli
N. Jha
160
3
0
24 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Muhammad Usama
Junaid Qadir
357
65
0
21 Mar 2023
Gradient-Free Structured Pruning with Unlabeled Data
International Conference on Machine Learning (ICML), 2023
Azade Nova
H. Dai
Dale Schuurmans
SyDa
190
31
0
07 Mar 2023
AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (IEEE TCAD), 2023
Shikhar Tuli
N. Jha
202
45
0
28 Feb 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
243
138
0
27 Feb 2023
Speculative Decoding with Big Little Decoder
Neural Information Processing Systems (NeurIPS), 2023
Sehoon Kim
K. Mangalam
Suhong Moon
Jitendra Malik
Michael W. Mahoney
A. Gholami
Kurt Keutzer
MoE
264
148
0
15 Feb 2023
The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jared Fernandez
Jacob Kahn
Clara Na
Yonatan Bisk
Emma Strubell
FedML
231
12
0
13 Feb 2023
6-DoF Robotic Grasping with Transformer
Zhenjie Zhao
Han Yu
Hang Wu
Xuebo Zhang
ViT
127
0
0
29 Jan 2023
AttMEMO : Accelerating Transformers with Memoization on Big Memory Systems
Yuan Feng
Hyeran Jeon
F. Blagojevic
Cyril Guyot
Qing Li
Dong Li
GNN
131
7
0
23 Jan 2023
Convolution-enhanced Evolving Attention Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
Jiahao Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
199
9
0
16 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2022
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
173
2
0
06 Dec 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
International Symposium on High-Performance Computer Architecture (HPCA), 2022
Zhaoyang Han
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Hao Sun
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
195
95
0
15 Nov 2022
Efficiently Scaling Transformer Inference
Conference on Machine Learning and Systems (MLSys), 2022
Reiner Pope
Sholto Douglas
Aakanksha Chowdhery
Jacob Devlin
James Bradbury
Anselm Levskaya
Jonathan Heek
Kefan Xiao
Shivani Agrawal
J. Dean
217
439
0
09 Nov 2022
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM
Shira Guskin
Moshe Wasserblat
Chang Wang
Haihao Shen
MQ
186
2
0
31 Oct 2022
NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks
Huihong Shi
Haoran You
Yang Zhao
Zhongfeng Wang
Yingyan Lin
205
7
0
24 Oct 2022
Wide Attention Is The Way Forward For Transformers?
Jason Brown
Yiren Zhao
Ilia Shumailov
Robert D. Mullins
146
10
0
02 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Transactions of the Association for Computational Linguistics (TACL), 2022
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
289
137
0
31 Aug 2022
Efficient Sparsely Activated Transformers
Salar Latifi
Saurav Muralidharan
M. Garland
MoE
164
2
0
31 Aug 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
114
8
0
23 Aug 2022
Survey on Evolutionary Deep Learning: Principles, Algorithms, Applications and Open Issues
ACM Computing Surveys (ACM CSUR), 2022
Nan Li
Lianbo Ma
Guo-Ding Yu
Bing Xue
Mengjie Zhang
Yaochu Jin
133
90
0
23 Aug 2022
Neural Architecture Search on Efficient Transformers and Beyond
Zexiang Liu
Dong Li
Kaiyue Lu
Zhen Qin
Weixuan Sun
Jiacheng Xu
Yiran Zhong
143
20
0
28 Jul 2022
UFO: Unified Feature Optimization
European Conference on Computer Vision (ECCV), 2022
Teng Xi
Yifan Sun
Deli Yu
Bi Li
Nan Peng
...
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
149
11
0
21 Jul 2022
NASRec: Weight Sharing Neural Architecture Search for Recommender Systems
The Web Conference (WWW), 2022
Tunhou Zhang
Dehua Cheng
Yuchen He
Zhengxing Chen
Xiaoliang Dai
Liang Xiong
Feng Yan
Xue Yang
Yiran Chen
W. Wen
114
18
0
14 Jul 2022
STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Liwei Guo
Wonkyo Choe
F. Lin
90
19
0
11 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
International Journal of Computer Vision (IJCV), 2022
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Jianlong Wu
Yong Liu
Dacheng Tao
ViT
209
45
0
19 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Neural Information Processing Systems (NeurIPS), 2022
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
490
479
0
02 Jun 2022
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Journal of Artificial Intelligence Research (JAIR), 2022
Shikhar Tuli
Bhishma Dedhia
Shreshth Tuli
N. Jha
164
15
0
23 May 2022
A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities
Daniel Cummings
Anthony Sarah
S. N. Sridhar
Maciej Szankin
J. P. Muñoz
Sairam Sundaresan
162
8
0
19 May 2022