2106.03143
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Neural Information Processing Systems (NeurIPS), 2021
6 June 2021
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD
ViT
Papers citing
"CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings"
41 / 41 papers shown
Positional Encoding via Token-Aware Phase Attention
Wang
Sheng Shen
Rémi Munos
Hongyuan Zhan
Yuandong Tian
118
0
0
16 Sep 2025
PiPViT: Patch-based Visual Interpretable Prototypes for Retinal Image Analysis
Marzieh Oghbaie
Teresa Araújo
Hrvoje Bogunović
ViT
MedIm
308
0
0
12 Jun 2025
LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junlong Tong
Jinlan Fu
Zixuan Lin
Yingqi Fan
Anhao Zhao
Hui Su
Xiaoyu Shen
317
2
0
22 May 2025
Learning to Adapt to Position Bias in Vision Transformer Classifiers
Robert-Jan Bruintjes
Jan van Gemert
321
0
0
19 May 2025
Context-aware Biases for Length Extrapolation
Ali Veisi
Hamidreza Amirzadeh
Amir Mansourian
455
1
0
11 Mar 2025
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang
Issei Sato
377
0
0
16 Dec 2024
Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting
Shubham Tanaji Kakde
Rony Mitra
Jasashwi Mandal
Manoj Kumar Tiwari
KELM
AI4TS
109
1
0
17 Nov 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
294
11
0
07 Oct 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
937
8
0
20 Sep 2024
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
European Conference on Computer Vision (ECCV), 2024
Pavel Suma
Giorgos Kordopatis-Zilos
Ahmet Iscen
Giorgos Tolias
VLM
309
8
0
06 Aug 2024
HSViT: Horizontally Scalable Vision Transformer
Chenhao Xu
Chang-Tsun Li
Chee Peng Lim
Douglas Creighton
ViT
175
6
0
08 Apr 2024
Rotary Position Embedding for Vision Transformer
Byeongho Heo
Song Park
Dongyoon Han
Sangdoo Yun
351
117
0
20 Mar 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Vasu Sharma
Amitava Das
148
39
0
15 Jan 2024
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
237
6
0
21 Nov 2023
Extending Input Contexts of Language Models through Training on Segmented Sequences
Petros Karypis
Julian McAuley
George Karypis
211
1
0
23 Oct 2023
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition
Andrew Rouditchenko
R. Collobert
Tatiana Likhomanenko
VLM
166
4
0
29 Sep 2023
Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping
Martin Pelikan
Sheikh Shams Azam
Vitaly Feldman
Jan Honza Silovsky
Kunal Talwar
Christopher G. Brinton
Tatiana Likhomanenko
463
8
0
29 Sep 2023
NeuroCodeBench: a plain C neural network benchmark for software verification
Edoardo Manino
R. Menezes
F. Shmarov
Lucas C. Cordeiro
139
4
0
07 Sep 2023
Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery
J. Park
Daniel Sungho Jung
Gyeongsik Moon
Kyoung Mu Lee
144
8
0
05 Sep 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Chi Han
Qifan Wang
Yuan Yao
Wenhan Xiong
Yu Chen
Heng Ji
Sinong Wang
468
94
0
30 Aug 2023
How to Scale Your EMA
Neural Information Processing Systems (NeurIPS), 2023
Dan Busbridge
Jason Ramapuram
Pierre Ablin
Tatiana Likhomanenko
Eeshan Gunesh Dhekane
Xavier Suau
Russ Webb
207
21
0
25 Jul 2023
Linearized Relative Positional Encoding
Zhen Qin
Weixuan Sun
Kaiyue Lu
Huizhong Deng
Dong Li
Xiaodong Han
Yuchao Dai
Lingpeng Kong
Yiran Zhong
110
18
0
18 Jul 2023
LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias
Knowledge Discovery and Data Mining (KDD), 2023
Mario Almagro
Emilio Almazán
Diego Ortego
David Jiménez
225
5
0
06 Jul 2023
Monotonic Location Attention for Length Generalization
International Conference on Machine Learning (ICML), 2023
Jishnu Ray Chowdhury
Cornelia Caragea
LLMAG
152
10
0
31 May 2023
The Impact of Positional Encoding on Length Generalization in Transformers
Neural Information Processing Systems (NeurIPS), 2023
Amirhossein Kazemnejad
Inkit Padhi
Karthikeyan N. Ramamurthy
Payel Das
Siva Reddy
295
290
0
31 May 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Edouard Grave
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
1.0K
5,722
0
14 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Muhammad Usama
Junaid Qadir
400
66
0
21 Mar 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
International Conference on Machine Learning (ICML), 2023
Shuangfei Zhai
Tatiana Likhomanenko
Etai Littwin
Dan Busbridge
Jason Ramapuram
Yizhe Zhang
Jiatao Gu
J. Susskind
AAML
291
109
0
11 Mar 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
International Conference on Learning Representations (ICLR), 2023
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
462
77
0
12 Feb 2023
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
245
5
0
11 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
209
21
0
02 Nov 2022
More Speaking or More Speakers?
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Dan Berrebbi
R. Collobert
Navdeep Jaitly
Tatiana Likhomanenko
189
6
0
02 Nov 2022
Continuous Pseudo-Labeling from the Start
International Conference on Learning Representations (ICLR), 2022
Dan Berrebbi
R. Collobert
Samy Bengio
Navdeep Jaitly
Tatiana Likhomanenko
159
17
0
17 Oct 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
ACM Multimedia (ACM MM), 2022
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
177
8
0
15 Jul 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Neural Information Processing Systems (NeurIPS), 2022
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
271
86
0
20 May 2022
Similarity and Content-based Phonetic Self Attention for Speech Recognition
Interspeech, 2022
Kyuhong Shim
Wonyong Sung
264
8
0
19 Mar 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
International Conference on Learning Representations (ICLR), 2022
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
356
14
0
17 Jan 2022
Pseudo-Labeling for Massively Multilingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Loren Lugosch
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
VLM
230
33
0
30 Oct 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
International Conference on Learning Representations (ICLR), 2021
Ofir Press
Noah A. Smith
M. Lewis
596
978
0
27 Aug 2021
AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
267
308
0
12 Aug 2021
Position Information in Transformers: An Overview
Computational Linguistics (CL), 2021
Philipp Dufter
Martin Schmitt
Hinrich Schütze
263
186
0
22 Feb 2021