2009.06732
Efficient Transformers: A Survey
14 September 2020
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
Papers citing
"Efficient Transformers: A Survey"
50 / 633 papers shown
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
Haofeng Liu
Mingqi Gao
Xuxiao Luo
Ziyue Wang
Guanyi Qin
J. Wu
Yueming Jin
18
0
0
13 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
34
0
0
13 May 2025
Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition
Andrew Kiruluta
Eric Lundy
Priscilla Burity
19
0
0
09 May 2025
T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction
Kun Peng
Chaodong Tong
Cong Cao
Hao Peng
Q. Li
Guanlin Wu
Lei Jiang
Yanbing Liu
Philip S. Yu
LMTD
43
0
0
08 May 2025
Polysemy of Synthetic Neurons Towards a New Type of Explanatory Categorical Vector Spaces
Michael Pichat
William Pogrund
Paloma Pichat
Judicael Poumay
Armanouche Gasparian
Samuel Demarchi
Martin Corbet
Alois Georgeon
Michael Veillet-Guillem
MILM
11
0
0
30 Apr 2025
Comparison of Different Deep Neural Network Models in the Cultural Heritage Domain
Teodor Boyadzhiev
Gabriele Lagani
Luca Ciampi
Giuseppe Amato
Krassimira Ivanova
VLM
47
0
0
30 Apr 2025
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints
Ruicheng Ao
Gan Luo
D. Simchi-Levi
Xinshang Wang
26
2
0
15 Apr 2025
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
Manvi Agarwal
Changhong Wang
Gaël Richard
27
0
0
07 Apr 2025
Optimal Scaling Laws for Efficiency Gains in a Theoretical Transformer-Augmented Sectional MoE Framework
Soham Sane
MoE
62
0
0
26 Mar 2025
Burst Image Super-Resolution with Mamba
Ozan Unal
Steven Marty
Dengxin Dai
Mamba
43
0
0
25 Mar 2025
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
40
0
0
22 Mar 2025
Intra-neuronal attention within language models Relationships between activation and semantics
Michael Pichat
William Pogrund
Paloma Pichat
Armanouche Gasparian
Samuel Demarchi
Corbet Alois Georgeon
Michael Veillet-Guillem
MILM
38
0
0
17 Mar 2025
Long-VMNet: Accelerating Long-Form Video Understanding via Fixed Memory
Saket Gurukar
Asim Kadav
VLM
50
0
0
17 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
73
0
0
07 Mar 2025
Malware Detection at the Edge with Lightweight LLMs: A Performance Evaluation
Christian Rondanini
B. Carminati
E. Ferrari
Antonio Gaudiano
Ashish Kundu
51
0
0
06 Mar 2025
MTS: A Deep Reinforcement Learning Portfolio Management Framework with Time-Awareness and Short-Selling
Fengchen Gu
Zhengyong Jiang
Ángel García-Fernández
Angelos Stefanidis
Jionglong Su
Huakang Li
AI4TS
AIFin
65
0
0
06 Mar 2025
ReaderLM-v2: Small Language Model for HTML to Markdown and JSON
Feng Wang
Zesheng Shi
Bo Wang
Nan Wang
Han Xiao
RALM
70
1
0
03 Mar 2025
Attention Condensation via Sparsity Induced Regularized Training
Eli Sason
Darya Frolova
Boris Nazarov
Felix Goldberd
89
0
0
03 Mar 2025
Transformer Meets Twicing: Harnessing Unattended Residual Information
Laziz U. Abdullaev
Tan M. Nguyen
37
2
0
02 Mar 2025
Revisiting Kernel Attention with Correlated Gaussian Process Representation
Long Minh Bui
Tho Tran Huu
Duy-Tung Dinh
T. Nguyen
Trong Nghia Hoang
24
2
0
27 Feb 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
37
0
0
24 Feb 2025
KernelBench: Can LLMs Write Efficient GPU Kernels?
Anne Ouyang
Simon Guo
Simran Arora
Alex L. Zhang
William Hu
Christopher Ré
Azalia Mirhoseini
ALM
38
1
0
14 Feb 2025
Unified Spatial-Temporal Edge-Enhanced Graph Networks for Pedestrian Trajectory Prediction
Ruochen Li
Tanqiu Qiao
Stamos Katsigiannis
Zhanxing Zhu
Hubert P. H. Shum
AI4TS
63
1
0
04 Feb 2025
ZETA: Leveraging Z-order Curves for Efficient Top-k Attention
Qiuhao Zeng
Jerry Huang
Peng Lu
Gezheng Xu
Boxing Chen
Charles X. Ling
Boyu Wang
45
1
0
24 Jan 2025
Integrating remote sensing data assimilation, deep learning and large language model for interactive wheat breeding yield prediction
Guofeng Yang
Nanfei Jin
Wenjie Ai
Zhonghua Zheng
Yuhong He
Yong He
33
0
0
08 Jan 2025
Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention
Zhenyu Guo
Wenguang Chen
32
0
0
01 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
106
592
0
31 Dec 2024
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
Chenlong Deng
Zhisong Zhang
Kelong Mao
Shuaiyi Li
Xinting Huang
Dong Yu
Zhicheng Dou
36
1
0
23 Dec 2024
Advances in Transformers for Robotic Applications: A Review
Nikunj Sanghai
Nik Bear Brown
AI4CE
70
0
0
13 Dec 2024
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs
Michael Wornow
Suhana Bedi
Miguel Angel Fuentes Hernandez
E. Steinberg
Jason Alan Fries
Christopher Ré
Sanmi Koyejo
N. Shah
95
4
0
09 Dec 2024
Even Sparser Graph Transformers
Hamed Shirzad
Honghao Lin
B. Venkatachalam
A. Velingker
David P. Woodruff
Danica J. Sutherland
GNN
91
3
0
25 Nov 2024
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
Ali Behrouz
Ali Parviz
Mahdi Karami
Clayton Sanford
Bryan Perozzi
Vahab Mirrokni
79
2
0
23 Nov 2024
freePruner: A Training-free Approach for Large Multimodal Model Acceleration
Bingxin Xu
Yuzhang Shang
Yunhao Ge
Qian Lou
Yan Yan
94
3
0
23 Nov 2024
Financial Risk Assessment via Long-term Payment Behavior Sequence Folding
Yiran Qiao
Yateng Tang
Xiang Ao
Qi Yuan
Ziming Liu
Chen Shen
Xuehao Zheng
62
0
0
22 Nov 2024
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
Yuhong Chou
Man Yao
Kexin Wang
Yuqi Pan
Ruijie Zhu
Yiran Zhong
Yu Qiao
J. Wu
Bo Xu
Guoqi Li
46
4
0
16 Nov 2024
Kernel Approximation using Analog In-Memory Computing
Julian Büchel
Giacomo Camposampiero
A. Vasilopoulos
C. Lammie
M. Le Gallo
Abbas Rahimi
A. Sebastian
38
0
0
05 Nov 2024
Provable Length Generalization in Sequence Prediction via Spectral Filtering
Annie Marsden
Evan Dogariu
Naman Agarwal
Xinyi Chen
Daniel Suo
Elad Hazan
34
1
0
01 Nov 2024
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou
Weizhi Gao
Yuchen Shen
Feiyi Wang
Xiaorui Liu
VLM
23
2
0
30 Oct 2024
Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting
Zhiwei Zhang
Shaojun E
Fandong Meng
Jie Zhou
Wenjuan Han
31
0
0
30 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
41
2
0
30 Oct 2024
Continuous Speech Tokenizer in Text To Speech
Yixing Li
Ruobing Xie
X. Sun
Yu Cheng
Zhanhui Kang
AuLLM
CLL
53
2
0
22 Oct 2024
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications
Tailai Song
Paolo Garza
Michela Meo
Maurizio Matteo Munafò
15
1
0
21 Oct 2024
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
Costin-Andrei Oncescu
Sanket Purandare
Stratos Idreos
Sham Kakade
VLM
AI4TS
3DV
16
0
0
16 Oct 2024
Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture
Sajad Movahedi
Antonio Orvieto
Seyed-Mohsen Moosavi-Dezfooli
AAML
AI4CE
58
0
0
15 Oct 2024
Predicting from Strings: Language Model Embeddings for Bayesian Optimization
Tung Nguyen
Qiuyi Zhang
Bangding Yang
Chansoo Lee
J. Bornschein
Yingjie Miao
Sagi Perel
Yutian Chen
Xingyou Song
BDL
20
1
0
14 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
80
0
0
09 Oct 2024
Guided Self-attention: Find the Generalized Necessarily Distinct Vectors for Grain Size Grading
Fang Gao
XueTao Li
Jiabao Wang
Shengheng Ma
Jun Yu
15
0
0
08 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
21
0
0
07 Oct 2024
S7: Selective and Simplified State Space Layers for Sequence Modeling
Taylan Soydan
Nikola Zubić
Nico Messikommer
Siddhartha Mishra
Davide Scaramuzza
24
3
0
04 Oct 2024
Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren
Zhicong Li
Yong Liu
39
1
0
04 Oct 2024