
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

Neural Information Processing Systems (NeurIPS), 2021

6 June 2021

Papers citing "CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings"

41 / 41 papers shown

Positional Encoding via Token-Aware Phase Attention Wang Sheng Shen Rémi Munos Hongyuan Zhan Yuandong Tian 118 0 0 16 Sep 2025
PiPViT: Patch-based Visual Interpretable Prototypes for Retinal Image Analysis Marzieh Oghbaie Teresa Araújo Hrvoje Bogunović ViT MedIm 308 0 0 12 Jun 2025
LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Junlong Tong Jinlan Fu Zixuan Lin Yingqi Fan Anhao Zhao Hui Su Xiaoyu Shen 317 2 0 22 May 2025
Learning to Adapt to Position Bias in Vision Transformer Classifiers Robert-Jan Bruintjes Jan van Gemert 321 0 0 19 May 2025
Context-aware Biases for Length Extrapolation Ali Veisi Hamidreza Amirzadeh Amir Mansourian 455 1 0 11 Mar 2025
Rethinking Associative Memory Mechanism in Induction Head Shuo Wang Issei Sato 377 0 0 16 Dec 2024
Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting Shubham Tanaji Kakde Rony Mitra Jasashwi Mandal Manoj Kumar Tiwari KELM AI4TS 109 1 0 17 Nov 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Chuanyang Zheng Yihang Gao Han Shi Jing Xiong Jiankai Sun ... Xiaozhe Ren Michael Ng Xin Jiang Zhenguo Li Yu Li 294 11 0 07 Oct 2024
Towards LifeSpan Cognitive Systems Yu Wang Chi Han Tongtong Wu Xiaoxin He Wangchunshu Zhou ... Zexue He Wei Wang Gholamreza Haffari Heng Ji Julian McAuley KELM CLL 937 8 0 20 Sep 2024
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval European Conference on Computer Vision (ECCV), 2024 Pavel Suma Giorgos Kordopatis-Zilos Ahmet Iscen Giorgos Tolias VLM 309 8 0 06 Aug 2024
HSViT: Horizontally Scalable Vision Transformer Chenhao Xu Chang-Tsun Li Chee Peng Lim Douglas Creighton ViT 175 6 0 08 Apr 2024
Rotary Position Embedding for Vision Transformer Byeongho Heo Song Park Dongyoon Han Sangdoo Yun 351 117 0 20 Mar 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey Saurav Pawar S.M. Towhidul Islam Tonmoy S. M. M. Zaman Vinija Jain Vasu Sharma Amitava Das 148 39 0 15 Jan 2024
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis Honglin Li Yunlong Zhang Chenglu Zhu Jiatong Cai Sunyi Zheng Lin Yang VLM 237 6 0 21 Nov 2023
Extending Input Contexts of Language Models through Training on Segmented Sequences Petros Karypis Julian McAuley George Karypis 211 1 0 23 Oct 2023
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition Andrew Rouditchenko R. Collobert Tatiana Likhomanenko VLM 166 4 0 29 Sep 2023
Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping Martin Pelikan Sheikh Shams Azam Vitaly Feldman Jan Honza Silovsky Kunal Talwar Christopher G. Brinton Tatiana Likhomanenko 463 8 0 29 Sep 2023
NeuroCodeBench: a plain C neural network benchmark for software verification Edoardo Manino R. Menezes F. Shmarov Lucas C. Cordeiro 139 4 0 07 Sep 2023
Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery J. Park Daniel Sungho Jung Gyeongsik Moon Kyoung Mu Lee 144 8 0 05 Sep 2023
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Chi Han Qifan Wang Yuan Yao Wenhan Xiong Yu Chen Heng Ji Sinong Wang 468 94 0 30 Aug 2023
How to Scale Your EMA Neural Information Processing Systems (NeurIPS), 2023 Dan Busbridge Jason Ramapuram Pierre Ablin Tatiana Likhomanenko Eeshan Gunesh Dhekane Xavier Suau Russ Webb 207 21 0 25 Jul 2023
Linearized Relative Positional Encoding Zhen Qin Weixuan Sun Kaiyue Lu Huizhong Deng Dong Li Xiaodong Han Yuchao Dai Lingpeng Kong Yiran Zhong 110 18 0 18 Jul 2023
LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias Knowledge Discovery and Data Mining (KDD), 2023 Mario Almagro Emilio Almazán Diego Ortego David Jiménez 225 5 0 06 Jul 2023
Monotonic Location Attention for Length Generalization International Conference on Machine Learning (ICML), 2023 Jishnu Ray Chowdhury Cornelia Caragea LLMAG 152 10 0 31 May 2023
The Impact of Positional Encoding on Length Generalization in Transformers Neural Information Processing Systems (NeurIPS), 2023 Amirhossein Kazemnejad Inkit Padhi Karthikeyan N. Ramamurthy Payel Das Siva Reddy 295 290 0 31 May 2023
DINOv2: Learning Robust Visual Features without Supervision Maxime Oquab Timothée Darcet Théo Moutakanni Huy Q. Vo Marc Szafraniec ... Edouard Grave Julien Mairal Patrick Labatut Armand Joulin Piotr Bojanowski VLM CLIP SSL 1.0K 5,722 0 14 Apr 2023
Transformers in Speech Processing: A Survey S. Latif Aun Zaidi Heriberto Cuayáhuitl Fahad Shamshad Moazzam Shoukat Muhammad Usama Junaid Qadir 400 66 0 21 Mar 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse International Conference on Machine Learning (ICML), 2023 Shuangfei Zhai Tatiana Likhomanenko Etai Littwin Dan Busbridge Jason Ramapuram Yizhe Zhang Jiatao Gu J. Susskind AAML 291 109 0 11 Mar 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity International Conference on Learning Representations (ICLR), 2023 Hongkang Li Ming Wang Sijia Liu Pin-Yu Chen ViT MLT 462 77 0 12 Feb 2023
Continuous Soft Pseudo-Labeling in ASR Tatiana Likhomanenko R. Collobert Navdeep Jaitly Samy Bengio VLM 245 5 0 11 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 P. Swietojanski Stefan Braun Dogan Can Thiago Fraga da Silva Arnab Ghoshal ... Henry Mason Erik McDermott Honza Silovsky R. Travadi Xiaodan Zhuang 209 21 0 02 Nov 2022
More Speaking or More Speakers? IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Dan Berrebbi R. Collobert Navdeep Jaitly Tatiana Likhomanenko 189 6 0 02 Nov 2022
Continuous Pseudo-Labeling from the Start International Conference on Learning Representations (ICLR), 2022 Dan Berrebbi R. Collobert Samy Bengio Navdeep Jaitly Tatiana Likhomanenko 159 17 0 17 Oct 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP ACM Multimedia (ACM MM), 2022 Zhicai Wang Y. Hao Xingyu Gao Hao Zhang Shuo Wang Tingting Mu Xiangnan He 177 8 0 15 Jul 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation Neural Information Processing Systems (NeurIPS), 2022 Ta-Chung Chi Ting-Han Fan Peter J. Ramadge Alexander I. Rudnicky 271 86 0 20 May 2022
Similarity and Content-based Phonetic Self Attention for Speech Recognition Interspeech (Interspeech), 2022 Kyuhong Shim Wonyong Sung 264 8 0 19 Mar 2022
Continual Transformers: Redundancy-Free Attention for Online Inference International Conference on Learning Representations (ICLR), 2022 Lukas Hedegaard Arian Bakhtiarnia Alexandros Iosifidis CLL 356 14 0 17 Jan 2022
Pseudo-Labeling for Massively Multilingual Speech Recognition IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 Loren Lugosch Tatiana Likhomanenko Gabriel Synnaeve R. Collobert VLM 230 33 0 30 Oct 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation International Conference on Learning Representations (ICLR), 2021 Ofir Press Noah A. Smith M. Lewis 596 978 0 27 Aug 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing Katikapalli Subramanyam Kalyan A. Rajasekharan S. Sangeetha VLM LM&MA 267 308 0 12 Aug 2021
Position Information in Transformers: An Overview Computational Linguistics (CL), 2021 Philipp Dufter Martin Schmitt Hinrich Schütze 263 186 0 22 Feb 2021
