ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Efficient Transformers: A Survey
arXiv:2009.06732 · 14 September 2020
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
Topics: VLM

Papers citing "Efficient Transformers: A Survey"

50 / 633 papers shown

MuLD: The Multitask Long Document Benchmark
  G. Hudson, Noura Al Moubayed · 15 Feb 2022 · 8 / 10 / 0
Hindi/Bengali Sentiment Analysis Using Transfer Learning and Joint Dual Input Learning with Self Attention
  Shahrukh Khan, Mahnoor Shahid · 11 Feb 2022 · 20 / 1 / 0
Deep Learning for Computational Cytology: A Survey
  Hao Jiang, Yanning Zhou, Yi-Mou Lin, R. Chan, Jiangshu Liu, Hao Chen · 10 Feb 2022 · 11 / 74 / 0
Universal Hopfield Networks: A General Framework for Single-Shot Associative Memory Models
  Beren Millidge, Tommaso Salvatori, Yuhang Song, Thomas Lukasiewicz, Rafal Bogacz · VLM · 09 Feb 2022 · 14 / 52 / 0
pNLP-Mixer: an Efficient all-MLP Architecture for Language
  Francesco Fusco, Damian Pascual, Peter W. J. Staar, Diego Antognini · 09 Feb 2022 · 21 / 29 / 0
Structured Time Series Prediction without Structural Prior
  Darko Drakulic, J. Andreoli · DiffM, AI4TS · 07 Feb 2022 · 9 / 3 / 0
Patch-Based Stochastic Attention for Image Editing
  Nicolas Cherel, Andrés Almansa, Y. Gousseau, A. Newson · 07 Feb 2022 · 17 / 6 / 0
Structure-Aware Transformer for Graph Representation Learning
  Dexiong Chen, Leslie O'Bray, Karsten M. Borgwardt · 07 Feb 2022 · 26 / 236 / 0
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records
  Xi Yang, Aokun Chen, Nima M. Pournejatian, Hoo-Chang Shin, Kaleb E. Smith, ..., Duane A. Mitchell, W. Hogan, E. Shenkman, Jiang Bian, Yonghui Wu · AI4MH, LM&MA · 02 Feb 2022 · 22 / 499 / 0
Corpus for Automatic Structuring of Legal Documents
  Prathamesh Kalamkar, Aman Tiwari, Astha Agarwal, S. Karn, Smita Gupta, Vivek Raghavan, Ashutosh Modi · AILaw · 31 Jan 2022 · 31 / 48 / 0
BOAT: Bilateral Local Attention Vision Transformer
  Tan Yu, Gangming Zhao, Ping Li, Yizhou Yu · ViT · 31 Jan 2022 · 20 / 27 / 0
Fast Monte-Carlo Approximation of the Attention Mechanism
  Hyunjun Kim, Jeonggil Ko · 30 Jan 2022 · 6 / 2 / 0
Transformers in Medical Imaging: A Survey
  Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, F. Khan, H. Fu · ViT, LM&MA, MedIm · 24 Jan 2022 · 106 / 653 / 0
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
  Haoyu Dong, Zhoujun Cheng, Xinyi He, Mengyuan Zhou, Anda Zhou, Fan Zhou, Ao Liu, Shi Han, Dongmei Zhang · LMTD · 24 Jan 2022 · 60 / 62 / 0
Continual Transformers: Redundancy-Free Attention for Online Inference
  Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis · CLL · 17 Jan 2022 · 14 / 11 / 0
Control of Dual-Sourcing Inventory Systems using Recurrent Neural Networks
  Lucas Böttcher, Thomas Asikis, I. Fragkos · BDL · 16 Jan 2022 · 6 / 10 / 0
Video Transformers: A Survey
  Javier Selva, A. S. Johansen, Sergio Escalera, Kamal Nasrollahi, T. Moeslund, Albert Clapés · ViT · 16 Jan 2022 · 20 / 101 / 0
GateFormer: Speeding Up News Feed Recommendation with Input Gated Transformers
  Peitian Zhang, Zheng Liu · AI4TS · 12 Jan 2022 · 8 / 1 / 0
SCROLLS: Standardized CompaRison Over Long Language Sequences
  Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, ..., Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy · RALM · 10 Jan 2022 · 20 / 132 / 0
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
  Lei Cheng, Ruslan Khalitov, Tong Yu, Zhirong Yang · 06 Jan 2022 · 20 / 32 / 0
Transformer Uncertainty Estimation with Hierarchical Stochastic Attention
  Jiahuan Pei, Cheng-Yu Wang, Gyuri Szarvas · 27 Dec 2021 · 6 / 22 / 0
Video Joint Modelling Based on Hierarchical Transformer for Co-summarization
  Haopeng Li, Qiuhong Ke, Mingming Gong, Zhang Rui · ViT · 27 Dec 2021 · 15 / 22 / 0
Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
  Md Tahmid Rahman Laskar, Enamul Hoque, J. Huang · 22 Dec 2021 · 28 / 44 / 0
Efficient Visual Tracking with Exemplar Transformers
  Philippe Blatter, Menelaos Kanakis, Martin Danelljan, Luc Van Gool · ViT · 17 Dec 2021 · 8 / 79 / 0
Block-Skim: Efficient Question Answering for Transformer
  Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu · 16 Dec 2021 · 14 / 30 / 0
Mask-combine Decoding and Classification Approach for Punctuation Prediction with real-time Inference Constraints
  Christoph Minixhofer, Ondřej Klejch, P. Bell · AI4CE · 15 Dec 2021 · 15 / 0 / 0
LongT5: Efficient Text-To-Text Transformer for Long Sequences
  Mandy Guo, Joshua Ainslie, David C. Uthus, Santiago Ontanon, Jianmo Ni, Yun-hsuan Sung, Yinfei Yang · VLM · 15 Dec 2021 · 29 / 306 / 0
Simple Local Attentions Remain Competitive for Long-Context Tasks
  Wenhan Xiong, Barlas Ouguz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad · 14 Dec 2021 · 25 / 29 / 0
Couplformer: Rethinking Vision Transformer with Coupling Attention Map
  Hai Lan, Xihao Wang, Xian Wei · ViT · 10 Dec 2021 · 15 / 3 / 0
Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences
  Yifan Chen, Qi Zeng, Dilek Z. Hakkani-Tür, Di Jin, Heng Ji, Yun Yang · 10 Dec 2021 · 17 / 4 / 0
Recurrent Glimpse-based Decoder for Detection with Transformer
  Zhe Chen, Jing Zhang, Dacheng Tao · ViT · 09 Dec 2021 · 11 / 27 / 0
FastSGD: A Fast Compressed SGD Framework for Distributed Machine Learning
  Keyu Yang, Lu Chen, Zhihao Zeng, Yunjun Gao · 08 Dec 2021 · 13 / 9 / 0
Attention-Based Model and Deep Reinforcement Learning for Distribution of Event Processing Tasks
  A. Mazayev, F. Al-Tam, N. Correia · 07 Dec 2021 · 10 / 5 / 0
Graph Conditioned Sparse-Attention for Improved Source Code Understanding
  Junyan Cheng, Iordanis Fostiropoulos, Barry W. Boehm · 01 Dec 2021 · 11 / 1 / 0
Score Transformer: Generating Musical Score from Note-level Representation
  Masahiro Suzuki · 01 Dec 2021 · 16 / 8 / 0
Sparse is Enough in Scaling Transformers
  Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Lukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva · MoE · 24 Nov 2021 · 16 / 100 / 0
Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
  Moritz Ibing, Gregor Kobsik, Leif Kobbelt · 24 Nov 2021 · 10 / 37 / 0
Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers
  John Guibas, Morteza Mardani, Zong-Yi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro · 24 Nov 2021 · 17 / 222 / 0
SimpleTRON: Simple Transformer with O(N) Complexity
  Uladzislau Yorsh, Alexander Kovalenko, Vojtěch Vančura, Daniel Vašata, Pavel Kordík, Tomáš Mikolov · 23 Nov 2021 · 17 / 1 / 0
PointMixer: MLP-Mixer for Point Cloud Understanding
  Jaesung Choe, Chunghyun Park, François Rameau, Jaesik Park, In So Kweon · 3DPC · 22 Nov 2021 · 32 / 98 / 0
Quality and Cost Trade-offs in Passage Re-ranking Task
  P. Podberezko, V. Mitskevich, Raman Makouski, P. Goncharov, Andrei Khobnia, Nikolay A Bushkov, Marina Chernyshevich · 18 Nov 2021 · 14 / 0 / 0
HiRID-ICU-Benchmark -- A Comprehensive Machine Learning Benchmark on High-resolution ICU Data
  Hugo Yèche, Rita Kuznetsova, M. Zimmermann, Matthias Huser, Xinrui Lyu, M. Faltys, Gunnar Rätsch · 16 Nov 2021 · 29 / 40 / 0
A Survey of Visual Transformers
  Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He · 3DGS, ViT · 11 Nov 2021 · 66 / 325 / 0
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
  Ruiyang Liu, Yinghui Li, Li Tao, Dun Liang, Haitao Zheng · 07 Nov 2021 · 77 / 96 / 0
CoreLM: Coreference-aware Language Model Fine-Tuning
  Nikolaos Stylianou, I. Vlahavas · 04 Nov 2021 · 6 / 2 / 0
Transformers for prompt-level EMA non-response prediction
  Supriya Nagesh, Alexander Moreno, Stephanie M Carpenter, Jamie Yap, Soujanya Chatterjee, ..., Santosh Kumar, Cho Lam, D. Wetter, Inbal Nahum-Shani, James M. Rehg · 01 Nov 2021 · 4 / 0 / 0
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method
  Yifan Chen, Qi Zeng, Heng Ji, Yun Yang · 29 Oct 2021 · 8 / 49 / 0
Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
  Wenqing Zheng, Qiangqiang Guo, H. Yang, Peihao Wang, Zhangyang Wang · AI4CE · 29 Oct 2021 · 6 / 11 / 0
Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
  Beidi Chen, Tri Dao, Eric Winsor, Zhao-quan Song, Atri Rudra, Christopher Ré · 28 Oct 2021 · 18 / 125 / 0
The Efficiency Misnomer
  Daoyuan Chen, Liuyi Yao, Dawei Gao, Ashish Vaswani, Yaliang Li · 25 Oct 2021 · 23 / 96 / 0