Linformer: Self-Attention with Linear Complexity


8 June 2020
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma

Papers citing "Linformer: Self-Attention with Linear Complexity"

50 / 648 papers shown
Mixture of In-Context Prompters for Tabular PFNs
Derek Xu
Olcay Cirit
Reza Asadi
Yizhou Sun
Wei Wang
31
9
0
25 May 2024
Spectraformer: A Unified Random Feature Framework for Transformer
Duke Nguyen
Aditya Joshi
Flora D. Salim
29
0
0
24 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan L. Yuille
Cihang Xie
AI4TS
VGen
SSL
51
1
0
24 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu
Jing Liu
Zizheng Pan
Yefei He
Gholamreza Haffari
Bohan Zhuang
MQ
35
30
0
23 May 2024
Retrievable Domain-Sensitive Feature Memory for Multi-Domain Recommendation
Yuang Zhao
Zhaocheng Du
Qinglin Jia
Linxuan Zhang
Zhenhua Dong
Ruiming Tang
25
2
0
21 May 2024
Large-Scale Multi-Center CT and MRI Segmentation of Pancreas with Deep Learning
Zheyuan Zhang
Elif Keles
Gorkem Durak
Yavuz Taktak
Onkar Susladkar
...
Cemal Yazici
T. Tirkes
B. Turkbey
Michael B. Wallace
Ulas Bagci
OOD
23
0
0
20 May 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Aryan Garg
Raghav Mallampali
Akshat Joshi
Shrisudhan Govindarajan
Kaushik Mitra
29
0
0
20 May 2024
Asymptotic theory of in-context learning by linear attention
Yue M. Lu
Mary I. Letey
Jacob A. Zavatone-Veth
Anindita Maiti
C. Pehlevan
19
10
0
20 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
ViT
32
2
0
18 May 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
39
45
0
17 May 2024
A Survey on Transformers in NLP with Focus on Efficiency
Wazib Ansar
Saptarsi Goswami
Amlan Chakrabarti
MedIm
38
2
0
15 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
39
47
0
13 May 2024
IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs
Yuzhen Mao
Martin Ester
Ke Li
30
6
0
05 May 2024
MFTraj: Map-Free, Behavior-Driven Trajectory Prediction for Autonomous Driving
Haicheng Liao
Zhenning Li
Chengyue Wang
Huanming Shen
Bonan Wang
Dongping Liao
Guofa Li
Chengzhong Xu
22
13
0
02 May 2024
DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling
Shikhar Tuli
Chi-Heng Lin
Yen-Chang Hsu
N. Jha
Yilin Shen
Hongxia Jin
AI4CE
30
1
0
01 May 2024
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation
Weiquan Huang
Yifei Shen
Yifan Yang
Mamba
33
4
0
30 Apr 2024
From Persona to Personalization: A Survey on Role-Playing Language Agents
Jiangjie Chen
Xintao Wang
Rui Xu
Siyu Yuan
Yikai Zhang
...
Caiyu Hu
Siye Wu
Scott Ren
Ziquan Fu
Yanghua Xiao
50
76
0
28 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
35
38
0
24 Apr 2024
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu
Yizhong Wang
Guangxuan Xiao
Hao-Chun Peng
Yao Fu
LRM
30
60
0
24 Apr 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
82
0
22 Apr 2024
Collaborative Filtering Based on Diffusion Models: Unveiling the Potential of High-Order Connectivity
Yukui Hou
Jin-Duk Park
Won-Yong Shin
21
13
0
22 Apr 2024
Guided Discrete Diffusion for Electronic Health Record Generation
Jun Han
Zixiang Chen
Yongqian Li
Yiwen Kou
Eran Halperin
Robert E. Tillman
Quanquan Gu
MedIm
DiffM
34
6
0
18 Apr 2024
Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier
A. Tsaris
Chengming Zhang
Xiao Wang
Junqi Yin
Siyan Liu
...
Jong Youl Choi
M. Wahib
Dan Lu
Prasanna Balaprakash
Feiyi Wang
18
1
0
17 Apr 2024
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Zicheng Liu
Li Wang
Siyuan Li
Zedong Wang
Haitao Lin
Stan Z. Li
VLM
27
4
0
17 Apr 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
Ali Modarressi
Abdullatif Köksal
Ayyoob Imani
Mohsen Fayyaz
Hinrich Schütze
KELM
104
9
0
17 Apr 2024
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Feiyang Chen
Ziqian Luo
Lisang Zhou
Xueting Pan
Ying Jiang
16
22
0
16 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Woomin Song
Seunghyuk Oh
Sangwoo Mo
Jaehyung Kim
Sukmin Yun
Jung-Woo Ha
Jinwoo Shin
30
14
0
16 Apr 2024
Adaptive Patching for High-resolution Image Segmentation with Transformers
Enzhi Zhang
Isaac Lyngaas
Peng Chen
Xiao Wang
Jun Igarashi
Yuankai Huo
M. Wahib
M. Munetomo
MedIm
24
1
0
15 Apr 2024
Foundational GPT Model for MEG
Richard Csaky
M. Es
Oiwi Parker Jones
M. Woolrich
27
2
0
14 Apr 2024
TransformerFAM: Feedback attention is working memory
Dongseong Hwang
Weiran Wang
Zhuoyuan Huo
K. Sim
P. M. Mengibar
32
12
0
14 Apr 2024
Adapting LLaMA Decoder to Vision Transformer
Jiahao Wang
Wenqi Shao
Mengzhao Chen
Chengyue Wu
Yong Liu
Taiqiang Wu
Kaipeng Zhang
Songyang Zhang
Kai-xiang Chen
Ping Luo
MLLM
38
4
0
10 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
51
74
0
08 Apr 2024
Softmax Attention with Constant Cost per Token
Franz A. Heinsen
21
1
0
08 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
50
48
0
08 Apr 2024
Bidirectional Long-Range Parser for Sequential Data Understanding
George Leotescu
Daniel Voinea
A. Popa
42
1
0
08 Apr 2024
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Debang Li
Junshi Huang
32
24
0
06 Apr 2024
Training LLMs over Neurally Compressed Text
Brian Lester
Jaehoon Lee
A. Alemi
Jeffrey Pennington
Adam Roberts
Jascha Narain Sohl-Dickstein
Noah Constant
32
6
0
04 Apr 2024
On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers
Cai Zhou
Rose Yu
Yusu Wang
32
7
0
04 Apr 2024
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs
Victor J. B. Jung
Alessio Burrello
Moritz Scherer
Francesco Conti
Luca Benini
22
4
0
03 Apr 2024
Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers
Sehyun Choi
26
3
0
03 Apr 2024
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
Muhammad Zubair Irshad
Sergey Zakharov
Vitor Campagnolo Guizilini
Adrien Gaidon
Z. Kira
Rares Ambrus
ViT
37
12
0
01 Apr 2024
From Similarity to Superiority: Channel Clustering for Time Series Forecasting
Jialin Chen
J. E. Lenssen
Aosong Feng
Weihua Hu
Matthias Fey
Leandros Tassiulas
J. Leskovec
Rex Ying
AI4TS
34
10
0
31 Mar 2024
Transformers-based architectures for stroke segmentation: A review
Yalda Zafari-Ghadim
Essam A. Rashed
M. Mabrok
MedIm
14
1
0
27 Mar 2024
Incorporating Exponential Smoothing into MLP: A Simple but Effective Sequence Model
Jiqun Chu
Zuoquan Lin
AI4TS
25
2
0
26 Mar 2024
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
Jinyi Li
Yihuai Lan
Lei Wang
Hao Wang
25
0
0
26 Mar 2024
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao
Di Wu
Jun Wang
29
25
0
26 Mar 2024
Block Selective Reprogramming for On-device Training of Vision Transformers
Sreetama Sarkar
Souvik Kundu
Kai Zheng
P. Beerel
37
2
0
25 Mar 2024
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud
Burhaneddin Yaman
Chun-Hao Liu
Diana Marculescu
38
2
0
24 Mar 2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Mohammad Mahmudul Alam
Edward Raff
Stella Biderman
Tim Oates
James Holt
AAML
30
3
0
23 Mar 2024
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
Yuzhang Shang
Mu Cai
Bingxin Xu
Yong Jae Lee
Yan Yan
VLM
29
104
0
22 Mar 2024