Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
arXiv:2106.12566 · 23 June 2021
Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu
Papers citing "Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding" (17 papers)
1. Let the Code LLM Edit Itself When You Edit the Code (Zhenyu He, Jun Zhang, Shengjie Luo, Jingjing Xu, Z. Zhang, Di He; 03 Jul 2024)
2. Efficient Attention via Control Variates (Lin Zheng, Jianbo Yuan, Chong-Jun Wang, Lingpeng Kong; 09 Feb 2023)
3. Single Cells Are Spatial Tokens: Transformers for Spatial Transcriptomic Data Imputation (Haifang Wen, Wenzhuo Tang, Wei Jin, Jiayuan Ding, Renming Liu, Xinnan Dai, Feng Shi, Lulu Shang, Jiliang Tang, Yuying Xie; 06 Feb 2023)
4. Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers (K. Choromanski, Shanda Li, Valerii Likhosherstov, Kumar Avinava Dubey, Shengjie Luo, Di He, Yiming Yang, Tamás Sarlós, Thomas Weingarten, Adrian Weller; 03 Feb 2023)
5. Mnemosyne: Learning to Train Transformers with Transformers (Deepali Jain, K. Choromanski, Kumar Avinava Dubey, Sumeet Singh, Vikas Sindhwani, Tingnan Zhang, Jie Tan; 02 Feb 2023)
6. Hungry Hungry Hippos: Towards Language Modeling with State Space Models (Daniel Y. Fu, Tri Dao, Khaled Kamal Saab, A. Thomas, Atri Rudra, Christopher Ré; 28 Dec 2022)
7. Lightweight Structure-Aware Attention for Visual Understanding (Heeseung Kwon, F. M. Castro, M. Marín-Jiménez, N. Guil, Alahari Karteek; 29 Nov 2022)
8. KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation (Ta-Chung Chi, Ting-Han Fan, Peter J. Ramadge, Alexander I. Rudnicky; 20 May 2022)
9. Attention Mechanism in Neural Networks: Where it Comes and Where it Goes (Derya Soydaner; 27 Apr 2022)
10. A Quality Index Metric and Method for Online Self-Assessment of Autonomous Vehicles Sensory Perception (Ce Zhang, A. Eskandarian; 04 Mar 2022)
11. FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks (Maksim Zubkov, Daniil Gavrilov; 23 Feb 2022)
12. Flowformer: Linearizing Transformers with Conservation Flows (Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long; 13 Feb 2022)
13. Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers (Amir Ardalan Kalantari, Mohammad Amini, Sarath Chandar, Doina Precup; 01 Feb 2022)
14. Can Vision Transformers Perform Convolution? (Shanda Li, Xiangning Chen, Di He, Cho-Jui Hsieh; 02 Nov 2021)
15. Ripple Attention for Visual Perception with Sub-quadratic Complexity (Lin Zheng, Huijie Pan, Lingpeng Kong; 06 Oct 2021)
16. Do Transformers Really Perform Bad for Graph Representation? (Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu; 09 Jun 2021)
17. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman; 20 Apr 2018)