Efficient Transformers: A Survey

arXiv: 2009.06732 · 14 September 2020
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
VLM

Papers citing "Efficient Transformers: A Survey"

50 / 633 papers shown

Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Basil Mustafa, C. Riquelme, J. Puigcerver, Rodolphe Jenatton, N. Houlsby
VLM, MoE
28 · 183 · 0 · 06 Jun 2022

Exploring Transformers for Behavioural Biometrics: A Case Study in Gait Recognition
Paula Delgado-Santos, Ruben Tolosana, R. Guest, F. Deravi, R. Vera-Rodríguez
13 · 30 · 0 · 03 Jun 2022

BayesFormer: Transformer with Uncertainty Estimation
Karthik Abinav Sankararaman, Sinong Wang, Han Fang
UQCV, BDL
14 · 10 · 0 · 02 Jun 2022

Fair Comparison between Efficient Attentions
Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung
17 · 1 · 0 · 01 Jun 2022

Transformer with Fourier Integral Attentions
T. Nguyen, Minh Pham, Tam Nguyen, Khai Nguyen, Stanley J. Osher, Nhat Ho
17 · 4 · 0 · 01 Jun 2022

Transformers for Multi-Object Tracking on Point Clouds
Felicia Ruppel, F. Faion, Claudius Gläser, Klaus C. J. Dietmayer
3DPC
4 · 16 · 0 · 31 May 2022

Prompt Injection: Parameterization of Fixed Inputs
Eunbi Choi, Yongrae Jo, Joel Jang, Minjoon Seo
8 · 29 · 0 · 31 May 2022

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
VLM
56 · 2,017 · 0 · 27 May 2022

Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design
Jörg K.H. Franke, Frederic Runge, Frank Hutter
15 · 14 · 0 · 27 May 2022

Training Language Models with Memory Augmentation
Zexuan Zhong, Tao Lei, Danqi Chen
RALM
232 · 127 · 0 · 25 May 2022

ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Y. Zhang, Taesung Park, E. Kalogerakis
ViT
14 · 29 · 0 · 24 May 2022

Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob Drachmann Havtorn, Joakim Edin, ..., Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe
SSL, AI4TS
124 · 344 · 0 · 21 May 2022

Exploring Extreme Parameter Compression for Pre-trained Language Models
Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu
20 · 18 · 0 · 20 May 2022

Transformer with Memory Replay
R. Liu, Barzan Mozafari
OffRL
62 · 4 · 0 · 19 May 2022

Text Detection & Recognition in the Wild for Robot Localization
Z. Raisi, John S. Zelek
12 · 0 · 0 · 17 May 2022

Transkimmer: Transformer Learns to Layer-wise Skim
Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo
61 · 38 · 0 · 15 May 2022

Symphony Generation with Permutation Invariant Language Model
Jiafeng Liu, Yuanliang Dong, Zehua Cheng, Xinran Zhang, Xiaobing Li, Feng Yu, Maosong Sun
8 · 39 · 0 · 10 May 2022

Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication
Yang Wang, Zhen Gao, Dezhi Zheng, Sheng Chen, Deniz Gündüz, H. Vincent Poor
AI4CE
11 · 81 · 0 · 08 May 2022

Transformers in Time-series Analysis: A Tutorial
Sabeen Ahmed, Ian E. Nielsen, Aakash Tripathi, Shamoon Siddiqui, Ghulam Rasool, R. Ramachandran
AI4TS
19 · 142 · 0 · 28 Apr 2022

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
19 · 107 · 0 · 25 Apr 2022

ChapterBreak: A Challenge Dataset for Long-Range Language Models
Simeng Sun, Katherine Thai, Mohit Iyyer
10 · 19 · 0 · 22 Apr 2022

On the Locality of Attention in Direct Speech Translation
Belen Alastruey, Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussá
6 · 7 · 0 · 19 Apr 2022

Exploring Dimensionality Reduction Techniques in Multilingual Transformers
Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho
19 · 7 · 0 · 18 Apr 2022

Usage of specific attention improves change point detection
Anna Dmitrienko, Evgenia Romanenkova, Alexey Zaytsev
11 · 0 · 0 · 18 Apr 2022

Multi-Frame Self-Supervised Depth with Transformers
Vitor Campagnolo Guizilini, Rares Ambrus, Di Chen, Sergey Zakharov, Adrien Gaidon
ViT, MDE
10 · 84 · 0 · 15 Apr 2022

Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models
Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills
19 · 6 · 0 · 15 Apr 2022

Revisiting Transformer-based Models for Long Document Classification
Xiang Dai, Ilias Chalkidis, S. Darkner, Desmond Elliott
VLM
6 · 68 · 0 · 14 Apr 2022

Malceiver: Perceiver with Hierarchical and Multi-modal Features for Android Malware Detection
Niall McLaughlin
17 · 2 · 0 · 12 Apr 2022

A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith
24 · 6 · 0 · 11 Apr 2022

Linear Complexity Randomized Self-attention Mechanism
Lin Zheng, Chong-Jun Wang, Lingpeng Kong
11 · 31 · 0 · 10 Apr 2022

BERTuit: Understanding Spanish language in Twitter through a native transformer
Javier Huertas-Tato, Alejandro Martín, David Camacho
11 · 9 · 0 · 07 Apr 2022

Scaling Language Model Size in Cross-Device Federated Learning
Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, A. Suresh, Shankar Kumar, Rajiv Mathews
FedML
21 · 24 · 0 · 31 Mar 2022

MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
Alan Baade, Puyuan Peng, David F. Harwath
21 · 95 · 0 · 30 Mar 2022

Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant
34 · 288 · 0 · 27 Mar 2022

A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers, Flavius Frasincar
11 · 294 · 0 · 27 Mar 2022

Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang, Licheng Jiao, Xuantong Liu, F. Liu, Shuyuan Yang, Zhixi Feng, Xu Tang
ViT, MedIm
20 · 28 · 0 · 24 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai
12 · 5 · 0 · 23 Mar 2022

PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger, Thomas Paniagua, Xi Song, Naresh P. Cuntoor, Mun Wai Lee, Tianfu Wu
ViT
10 · 7 · 0 · 22 Mar 2022

Mask Usage Recognition using Vision Transformer with Transfer Learning and Data Augmentation
Hensel Donato Jahja, N. Yudistira, Sutrisno
ViT
12 · 8 · 0 · 22 Mar 2022

Efficient Classification of Long Documents Using Transformers
Hyunji Hayley Park, Yogarshi Vyas, Kashif Shah
4 · 51 · 0 · 21 Mar 2022

Memorizing Transformers
Yuhuai Wu, M. Rabe, DeLesley S. Hutchins, Christian Szegedy
RALM
11 · 170 · 0 · 16 Mar 2022

Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki, Akiko Aizawa
LLMAG
11 · 6 · 0 · 15 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
13 · 94 · 0 · 11 Mar 2022

WaveMix: Resource-efficient Token Mixing for Images
Pranav Jeevan, A. Sethi
6 · 10 · 0 · 07 Mar 2022

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen, Yuying Quan, Zheng Qu, L. Liu, Yufei Ding, Yuan Xie
14 · 21 · 0 · 28 Feb 2022

PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems
Zhengyun Zhao, Qiao Jin, Fangyuan Chen, Tuorui Peng, Sheng Yu
PINN
4 · 34 · 0 · 28 Feb 2022

A Differential Attention Fusion Model Based on Transformer for Time Series Forecasting
Benhan Li, Shengdong Du, Tianrui Li
AI4TS
12 · 2 · 0 · 23 Feb 2022

Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation
Jinjiang Guo, Qi Liu, Han Guo, Xi Lu
AI4CE
11 · 3 · 0 · 21 Feb 2022

Deep Learning for Hate Speech Detection: A Comparative Study
Jitendra Malik, Hezhe Qiao, Guansong Pang, A. Hengel
35 · 43 · 0 · 19 Feb 2022

The NLP Task Effectiveness of Long-Range Transformers
Guanghui Qin, Yukun Feng, Benjamin Van Durme
10 · 27 · 0 · 16 Feb 2022