ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Efficient Transformers: A Survey
arXiv:2009.06732 · 14 September 2020
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
Tags: VLM
Links: arXiv · PDF · HTML

Papers citing "Efficient Transformers: A Survey" (50 of 633 shown)
 1. TypeFormer: Transformers for Mobile Keystroke Biometrics · Giuseppe Stragapede, Paula Delgado-Santos, Ruben Tolosana, R. Vera-Rodríguez, R. Guest, Aythami Morales · 26 Dec 2022
 2. Pretraining Without Attention · Junxiong Wang, J. Yan, Albert Gu, Alexander M. Rush · 20 Dec 2022
 3. JEMMA: An Extensible Java Dataset for ML4Code Applications [VLM] · Anjan Karmakar, Miltiadis Allamanis, Romain Robbes · 18 Dec 2022
 4. Two-stage Contextual Transformer-based Convolutional Neural Network for Airway Extraction from CT Images [ViT, MedIm] · Yanan Wu, Shuiqing Zhao, Shouliang Qi, Jie Feng, H. Pang, ..., Long Bai, Meng-Yi Li, Shuyue Xia, W. Qian, Hongliang Ren · 15 Dec 2022
 5. PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation [AI4TS] · Maxwell A. Xu, Alexander Moreno, Supriya Nagesh, V. Aydemir, D. Wetter, Santosh Kumar, James M. Rehg · 14 Dec 2022
 6. LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval [ELM, AILaw] · William F. Bruno, Dan Roth · 06 Dec 2022
 7. Document-Level Abstractive Summarization · Gonçalo Raposo, Afonso Raposo, Ana Sofia Carmo · 06 Dec 2022
 8. Human-in-the-Loop Hate Speech Classification in a Multilingual Context · Ana Kotarcic, Dominik Hangartner, Fabrizio Gilardi, Selina Kurer, K. Donnay · 05 Dec 2022
 9. LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition · Yuguang Yang, Y. Pan, Jingjing Yin, Heng Lu · 05 Dec 2022
10. Long-Document Cross-Lingual Summarization [RALM] · Shaohui Zheng, Zhixu Li, Jiaan Wang, Jianfeng Qu, An Liu, Lei Zhao, Zhigang Chen · 01 Dec 2022
11. BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch? · Joel Niklaus, Daniele Giofré · 30 Nov 2022
12. Medical Image Segmentation Review: The success of U-Net [SSeg] · Reza Azad, Ehsan Khodapanah Aghdam, Amelie Rauland, Yiwei Jia, Atlas Haddadi Avval, Afshin Bozorgpour, Sanaz Karimijafarbigloo, Joseph Paul Cohen, Ehsan Adeli, Dorit Merhof · 27 Nov 2022
13. Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges [FaML, AI4TS] · K. T. Baghaei, Amirreza Payandeh, Pooya Fayyazsanavi, Shahram Rahimi, Zhiqian Chen, Somayeh Bakhtiari Ramezani · 27 Nov 2022
14. DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention · Bosheng Qin, Juncheng Li, Siliang Tang, Yueting Zhuang · 24 Nov 2022
15. STGlow: A Flow-based Generative Framework with Dual Graphormer for Pedestrian Trajectory Prediction · Rongqin Liang, Yuanman Li, Jiantao Zhou, Xia Li · 21 Nov 2022
16. Token Turing Machines · Michael S. Ryoo, K. Gopalakrishnan, Kumara Kahatapitiya, Ted Xiao, Kanishka Rao, Austin Stone, Yao Lu, Julian Ibarz, Anurag Arnab · 16 Nov 2022
17. Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention · Wenhao Li, Xiaoyuan Yi, Jinyi Hu, Maosong Sun, Xing Xie · 14 Nov 2022
18. A Comprehensive Survey of Transformers for Computer Vision [ViT] · Sonain Jamil, Md. Jalil Piran, Oh-Jin Kwon · 11 Nov 2022
19. Linear Self-Attention Approximation via Trainable Feedforward Kernel · Uladzislau Yorsh, Alexander Kovalenko · 08 Nov 2022
20. Parallel Attention Forcing for Machine Translation · Qingyun Dou, Mark J. F. Gales · 06 Nov 2022
21. Deliberation Networks and How to Train Them · Qingyun Dou, Mark J. F. Gales · 06 Nov 2022
22. BERT for Long Documents: A Case Study of Automated ICD Coding · Arash Afkanpour, Shabir Adeel, H. Bassani, Arkady Epshteyn, Hongbo Fan, ..., Sanjana Woonna, S. Zamani, Elli Kanal, M. Fomitchev, Donny Cheung · 04 Nov 2022
23. Once-for-All Sequence Compression for Self-Supervised Speech Models · Hsuan-Jui Chen, Yen Meng, Hung-yi Lee · 04 Nov 2022
24. Agent-Time Attention for Sparse Rewards Multi-Agent Reinforcement Learning · Jennifer She, Jayesh K. Gupta, Mykel J. Kochenderfer · 31 Oct 2022
25. Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost · Sungjun Cho, Seonwoo Min, Jinwoo Kim, Moontae Lee, Honglak Lee, Seunghoon Hong · 27 Oct 2022
26. Automated Diagnosis of Cardiovascular Diseases from Cardiac Magnetic Resonance Imaging Using Deep Learning Models: A Review · M. Jafari, A. Shoeibi, Marjane Khodatars, Navid Ghassemi, Parisa Moridian, ..., Yu-Dong Zhang, Shui-Hua Wang, Juan M Gorriz, Hamid Alinejad-Rokny, U. Acharya · 26 Oct 2022
27. How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling [LM&MA] · Samuel Cahyawijaya, Bryan Wilie, Holy Lovenia, Huang Zhong, Mingqian Zhong, Yuk-Yu Nancy Ip, Pascale Fung · 25 Oct 2022
28. Characterizing Verbatim Short-Term Memory in Neural Language Models [KELM, RALM] · K. Armeni, C. Honey, Tal Linzen · 24 Oct 2022
29. Effective Pre-Training Objectives for Transformer-based Autoencoders · Luca Di Liello, Matteo Gabburo, Alessandro Moschitti · 24 Oct 2022
30. Composition, Attention, or Both? [CoGe] · Ryosuke Yoshida, Yohei Oseki · 24 Oct 2022
31. Graphically Structured Diffusion Models [DiffM] · Christian Weilbach, William Harvey, Frank D. Wood · 20 Oct 2022
32. An Empirical Analysis of SMS Scam Detection Systems · Muhammad Salman, Muhammad Ikram, M. Kâafar · 19 Oct 2022
33. Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction · Muralidhar Andoorveedu, Zhanda Zhu, Bojian Zheng, Gennady Pekhimenko · 19 Oct 2022
34. Linear Video Transformer with Feature Fixation · Kaiyue Lu, Zexia Liu, Jianyuan Wang, Weixuan Sun, Zhen Qin, ..., Xuyang Shen, Huizhong Deng, Xiaodong Han, Yuchao Dai, Yiran Zhong · 15 Oct 2022
35. CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling [3DV] · Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong · 14 Oct 2022
36. On Compressing Sequences for Self-Supervised Speech Models [SSL] · Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang · 13 Oct 2022
37. Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities · Brian Bartoldson, B. Kailkhura, Davis W. Blalock · 13 Oct 2022
38. MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers · Mohammadmahdi Nouriborji, Omid Rohanian, Samaneh Kouchaki, David A. Clifton · 12 Oct 2022
39. Designing Robust Transformers using Robust Kernel Density Estimation · Xing Han, Tongzheng Ren, T. Nguyen, Khai Nguyen, Joydeep Ghosh, Nhat Ho · 11 Oct 2022
40. An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification · Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis, Desmond Elliott · 11 Oct 2022
41. Retrieval Augmentation for T5 Re-ranker using External Sources [RALM, LRM] · Kai Hui, Tao Chen, Zhen Qin, Honglei Zhuang, Fernando Diaz, Michael Bendersky, Donald Metzler · 11 Oct 2022
42. Bird-Eye Transformers for Text Generation Models · Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz · 08 Oct 2022
43. Hierarchical Graph Transformer with Adaptive Node Sampling · Zaixin Zhang, Qi Liu, Qingyong Hu, Cheekong Lee · 08 Oct 2022
44. Time-Space Transformers for Video Panoptic Segmentation [ViT] · Andra Petrovai, S. Nedevschi · 07 Oct 2022
45. Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints · Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, L. Lakshmanan, C. C. T. Mendes, Gustavo de Rosa, S. Shah · 06 Oct 2022
46. VIMA: General Robot Manipulation with Multimodal Prompts [LM&Ro] · Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan · 06 Oct 2022
47. WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability [ViT, AI4TS] · Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang · 05 Oct 2022
48. Memory in humans and deep language models: Linking hypotheses for model augmentation [RALM] · Omri Raccah, Pheobe Chen, Ted Willke, David Poeppel, Vy A. Vo · 04 Oct 2022
49. Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning · Weicong Liang, Yuhui Yuan, Henghui Ding, Xiao Luo, Weihong Lin, Ding Jia, Zheng-Wei Zhang, Chao Zhang, Hanhua Hu · 03 Oct 2022
50. DARTFormer: Finding The Best Type Of Attention · Jason Brown, Yiren Zhao, Ilia Shumailov, Robert D. Mullins · 02 Oct 2022