Efficient Transformers: A Survey (arXiv:2009.06732)
14 September 2020
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
Tags: VLM

Papers citing "Efficient Transformers: A Survey"

Showing 50 of 633 citing papers.

Spike-driven Transformer
Man Yao, Jiakui Hu, Zhaokun Zhou, Liuliang Yuan, Yonghong Tian, Boxing Xu, Guoqi Li
04 Jul 2023 · 21 · 109 · 0

Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network
Zizhuo Li, Jiayi Ma
04 Jul 2023 · 27 · 2 · 0

Challenges in Domain-Specific Abstractive Summarization and How to Overcome them
Anum Afzal, Juraj Vladika, Daniel Braun, Florian Matthes
Tags: HILM
03 Jul 2023 · 25 · 10 · 0

SMILE: Evaluation and Domain Adaptation for Social Media Language Understanding
Vasilisa Bashlovkina, Riley Matthews, Zhaobin Kuang, Simon Baumgartner, Michael Bendersky
30 Jun 2023 · 25 · 4 · 0

Transformers in Healthcare: A Survey
Subhash Nerella, S. Bandyopadhyay, Jiaqing Zhang, Miguel Contreras, Scott Siegel, ..., Jessica Sena, B. Shickel, A. Bihorac, Kia Khezeli, Parisa Rashidi
Tags: MedIm, AI4CE
30 Jun 2023 · 19 · 24 · 0

FLuRKA: Fast and accurate unified Low-Rank & Kernel Attention
Ahan Gupta, Hao Guo, Yueming Yuan, Yan-Quan Zhou, Charith Mendis
27 Jun 2023 · 19 · 2 · 0

LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Daya Guo, Canwen Xu, Nan Duan, Jian Yin, Julian McAuley
26 Jun 2023 · 13 · 76 · 0

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu (Allen) Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, ..., Yuandong Tian, Christopher Ré, Clark W. Barrett, Zhangyang Wang, Beidi Chen
Tags: VLM
24 Jun 2023 · 47 · 246 · 0

GPT-Based Models Meet Simulation: How to Efficiently Use Large-Scale Pre-Trained Language Models Across Simulation Tasks
Philippe J. Giabbanelli
Tags: LLMAG, ALM, AI4CE
21 Jun 2023 · 15 · 11 · 0

RoTaR: Efficient Row-Based Table Representation Learning via Teacher-Student Training
Zui Chen, Lei Cao, S. Madden
20 Jun 2023 · 14 · 0 · 0

Building Blocks for a Complex-Valued Transformer Architecture
Florian Eilers, Xiaoyi Jiang
Tags: ViT
16 Jun 2023 · 24 · 6 · 0

Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi, Jianan Wang, Lei Zhang
15 Jun 2023 · 11 · 0 · 0

When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
Anuj Diwan, Eunsol Choi, David F. Harwath
14 Jun 2023 · 25 · 0 · 0

GCformer: An Efficient Framework for Accurate and Scalable Long-Term Multivariate Time Series Forecasting
Yanjun Zhao, Ziqing Ma, Tian Zhou, Liang Sun, M. Ye, Yi Qian
Tags: AI4TS
14 Jun 2023 · 14 · 19 · 0

Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models
Soochan Lee, Gunhee Kim
Tags: ReLM, LRM
12 Jun 2023 · 14 · 26 · 0

A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
Saidul Islam, Hanae Elmekki, Ahmed Elsebai, Jamal Bentahar, Najat Drawel, Gaith Rjoub, Witold Pedrycz
Tags: ViT, MedIm
11 Jun 2023 · 16 · 167 · 0

Attention-stacked Generative Adversarial Network (AS-GAN)-empowered Sensor Data Augmentation for Online Monitoring of Manufacturing System
Yuxuan Li, Chenang Liu
09 Jun 2023 · 11 · 2 · 0

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency
Shigeki Karita, R. Sproat, Haruko Ishikawa
07 Jun 2023 · 14 · 4 · 0

RITA: Group Attention is All You Need for Timeseries Analytics
Jiaming Liang, Lei Cao, Samuel Madden, Z. Ives, Guoliang Li
Tags: AI4TS
02 Jun 2023 · 16 · 0 · 0

An Overview on Generative AI at Scale with Edge-Cloud Computing
Yun Cheng Wang, Jintang Xue, Chengwei Wei, C.-C. Jay Kuo
02 Jun 2023 · 19 · 27 · 0

Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Matteo Pagliardini, Daniele Paliotta, Martin Jaggi, François Fleuret
Tags: LRM
01 Jun 2023 · 10 · 22 · 0

Coneheads: Hierarchy Aware Attention
Albert Tseng, Tao Yu, Toni J.B. Liu, Chris De Sa
Tags: 3DPC
01 Jun 2023 · 4 · 5 · 0

LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction
Jeremiah Milbauer, Annie Louis, Mohammad Javad Hosseini, Alex Fabrikant, Donald Metzler, Tal Schuster
31 May 2023 · 16 · 9 · 0

Blockwise Parallel Transformer for Large Context Models
Hao Liu, Pieter Abbeel
30 May 2023 · 28 · 11 · 0

Networked Time Series Imputation via Position-aware Graph Enhanced Variational Autoencoders
Dingsu Wang, Yuchen Yan, Ruizhong Qiu, Yada Zhu, Kaiyu Guan, A. Margenot, Hanghang Tong
Tags: AI4TS
29 May 2023 · 28 · 27 · 0

HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition
Florian Mai, Juan Pablo Zuluaga, Titouan Parcollet, P. Motlíček
29 May 2023 · 15 · 10 · 0

Key-Value Transformer
Ali Borji
28 May 2023 · 13 · 0 · 0

A Quantitative Review on Language Model Efficiency Research
Meng-Long Jiang, Hy Dang, Lingbo Tong
28 May 2023 · 22 · 0 · 0

Plug-and-Play Document Modules for Pre-trained Models
Chaojun Xiao, Zhengyan Zhang, Xu Han, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Xiangyang Li, Zhonghua Li, Zhao Cao, Maosong Sun
Tags: KELM
28 May 2023 · 22 · 5 · 0

Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers
Zahra Atashgahi, Mykola Pechenizkiy, Raymond N. J. Veldhuis, D. Mocanu
Tags: AI4TS, AI4CE
28 May 2023 · 19 · 1 · 0

Slide, Constrain, Parse, Repeat: Synchronous SlidingWindows for Document AMR Parsing
Sadhana Kumaravel, Tahira Naseem, Ramón Fernández Astudillo, Radu Florian, Salim Roukos
26 May 2023 · 44 · 0 · 0

Do We Really Need a Large Number of Visual Prompts?
Youngeun Kim, Yuhang Li, Abhishek Moitra, Ruokai Yin, Priyadarshini Panda
Tags: VLM, VPVLM
26 May 2023 · 34 · 5 · 0

Generating Images with Multimodal Language Models
Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
Tags: MLLM
26 May 2023 · 23 · 233 · 0

Neural Natural Language Processing for Long Texts: A Survey on Classification and Summarization
Dimitrios Tsirmpas, Ioannis Gkionis, Georgios Th. Papadopoulos, Ioannis Mademlis
Tags: AILaw, AI4TS, AI4CE
25 May 2023 · 18 · 16 · 0

Learning to Act through Evolution of Neural Diversity in Random Neural Networks
J. Pedersen, S. Risi
25 May 2023 · 15 · 2 · 0

Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurélien Lucchi, Thomas Hofmann
25 May 2023 · 32 · 51 · 0

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering
Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan
Tags: RALM
24 May 2023 · 19 · 25 · 0

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Ziwei He, Meng-Da Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, Jingwen Leng, Zhouhan Lin
Tags: ViT
24 May 2023 · 16 · 10 · 0

Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati, Itamar Zimerman, Lior Wolf
24 May 2023 · 27 · 9 · 0

Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan
Tags: AI4CE
24 May 2023 · 17 · 15 · 0

Adapting Language Models to Compress Contexts
Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen
Tags: LLMAG
24 May 2023 · 9 · 173 · 0

Segmented Recurrent Transformer: An Efficient Sequence-to-Sequence Model
Yinghan Long, Sayeed Shafayet Chowdhury, Kaushik Roy
24 May 2023 · 38 · 1 · 0

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Leo Liu, Tim Dettmers, Xi Victoria Lin, Ves Stoyanov, Xian Li
Tags: MoE
23 May 2023 · 13 · 9 · 0

NarrativeXL: A Large-scale Dataset For Long-Term Memory Models
A. Moskvichev, Ky-Vinh Mai
Tags: RALM
23 May 2023 · 8 · 1 · 0

Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
Xiangnan Chen, Qianwen Xiao, Juncheng Li, Duo Dong, Jun Lin, Xiaozhong Liu, Siliang Tang
23 May 2023 · 32 · 5 · 0

Challenges in Context-Aware Neural Machine Translation
Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma
23 May 2023 · 33 · 6 · 0

RWKV: Reinventing RNNs for the Transformer Era
Bo Peng, Eric Alcaide, Quentin G. Anthony, Alon Albalak, Samuel Arcadinho, ..., Qihang Zhao, P. Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
22 May 2023 · 72 · 550 · 0

FIT: Far-reaching Interleaved Transformers
Ting-Li Chen, Lala Li
22 May 2023 · 19 · 12 · 0

Prefix Propagation: Parameter-Efficient Tuning for Long Sequences
Jonathan Li, Will Aitken, R. Bhambhoria, Xiao-Dan Zhu
20 May 2023 · 9 · 14 · 0

Rethinking Data Augmentation for Tabular Data in Deep Learning
Soma Onishi, Shoya Meguro
Tags: LMTD
17 May 2023 · 18 · 14 · 0