KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

Neural Information Processing Systems (NeurIPS), 2022
20 May 2022
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky

Papers citing "KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation"

50 / 56 papers shown
ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration
Yingjie Xia
Tao Liu
Jinglei Shi
Qingsong Xie
Heng Guo
Jian Yang
Xi Wang
VLM
103
0
0
05 Dec 2025
Selective Rotary Position Embedding
Sajad Movahedi
Timur Carstensen
Arshia Afzal
Frank Hutter
Antonio Orvieto
Volkan Cevher
377
2
0
21 Nov 2025
A Circular Argument : Does RoPE need to be Equivariant for Vision?
Chase van de Geijn
Timo Lüddecke
Polina Turishcheva
Alexander S. Ecker
221
2
0
11 Nov 2025
Indirect Attention: Turning Context Misalignment into a Feature
Bissmella Bahaduri
Hicham Talaoubrid
Fangchen Feng
Zuheng Ming
Anissa Mokraoui
203
0
0
30 Sep 2025
SAS: Simulated Attention Score
Chuanyang Zheng
J. Sun
Yihang Gao
Yuehao Wang
Peihao Wang
...
Atlas Wang
Mac Schwager
Anderson Schneider
Xiaodong Liu
Jianfeng Gao
AI4TS
302
3
0
10 Jul 2025
Long-Short Alignment for Effective Long-Context Modeling in LLMs
Tianqi Du
Haotian Huang
Yifei Wang
Yisen Wang
222
2
0
13 Jun 2025
Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zikai Xiao
Ziyang Wang
Wen Ma
Yan Zhang
Wei Shen
Yan Wang
Luqi Gong
Zuozhu Liu
243
3
0
10 Jun 2025
Native-Resolution Image Synthesis
Zidong Wang
Mengwei He
Xiangyu Yue
Xuming He
Yiyuan Zhang
357
7
0
03 Jun 2025
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
Kohei Saijo
Tetsuji Ogawa
360
4
0
28 Apr 2025
Between Linear and Sinusoidal: Rethinking the Time Encoder in Dynamic Graph Learning
Hsing-Huan Chung
Shravan Chaudhari
Xing Han
Yoav Wald
Suchi Saria
Joydeep Ghosh
AI4TS
309
1
0
10 Apr 2025
FactGuard: Leveraging Multi-Agent Systems to Generate Answerable and Unanswerable Questions for Enhanced Long-Context LLM Extraction
Qian Zhang
Fang Li
Jie Wang
Lingfeng Qiao
Yifei Yu
Di Yin
Xingwu Sun
RALM
391
3
0
08 Apr 2025
On Vanishing Variance in Transformer Length Generalization
Ruining Li
Gabrijel Boduljak
Jensen Zhou
308
3
0
03 Apr 2025
Where is this coming from? Making groundedness count in the evaluation of Document VQA models
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Armineh Nourbakhsh
Siddharth Parekh
Pranav Shetty
Zhao Jin
Sameena Shah
Carolyn Rose
327
3
0
24 Mar 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
581
15
0
17 Mar 2025
Context-aware Biases for Length Extrapolation
Ali Veisi
Hamidreza Amirzadeh
Amir Mansourian
637
2
0
11 Mar 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Zhiyu Li
Yu Li
437
4
0
25 Feb 2025
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning
Qifan Yu
Zhenyu He
Sijie Li
Xun Zhou
Jun Zhang
Jingjing Xu
Di He
OffRL
LRM
440
16
0
12 Feb 2025
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING
Connor Schenck
Isaac Reid
M. Jacob
Alex Bewley
Joshua Ainslie
...
Matthias Minderer
Dmitry Kalashnikov
Jonathan Tompson
Vikas Sindhwani
Krzysztof Choromanski
380
13
0
04 Feb 2025
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
Jiajun Zhu
Peihao Wang
Ruisi Cai
Jason D. Lee
Pan Li
Liang Luo
KELM
412
5
0
01 Jan 2025
Provable Length Generalization in Sequence Prediction via Spectral Filtering
Annie Marsden
Evan Dogariu
Naman Agarwal
Xinyi Chen
Daniel Suo
Elad Hazan
386
1
0
01 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
International Conference on Learning Representations (ICLR), 2024
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
778
43
0
31 Oct 2024
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuhan Chen
Ang Lv
Jian Luan
Bin Wang
Wen Liu
265
12
0
28 Oct 2024
Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
Neural Information Processing Systems (NeurIPS), 2024
Honglin Li
Yunlong Zhang
Pingyi Chen
Honglin Li
Chenglu Zhu
Lin Yang
MedIm
359
19
0
18 Oct 2024
MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno
R. Lotufo
Rodrigo Nogueira
LRM
290
0
0
08 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
416
12
0
07 Oct 2024
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation
Zixuan Li
Jing Xiong
Fanghua Ye
Chuanyang Zheng
Xun Wu
...
Xiaodan Liang
Chengming Li
Zhenan Sun
Lingpeng Kong
Ngai Wong
RALM
UQLM
414
9
0
03 Oct 2024
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Zhiyuan Hu
Yuliang Liu
Jinman Zhao
Suyuchen Wang
Yan Wang
...
Qing Gu
Anh Tuan Luu
See-Kiong Ng
Zhiwei Jiang
Bryan Hooi
408
23
0
31 Aug 2024
Human-inspired Episodic Memory for Infinite Context LLMs
Zafeirios Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
H. Ammar
Jun Wang
462
32
0
12 Jul 2024
Universal Length Generalization with Turing Programs
Kaiying Hou
David Brandfonbrener
Sham Kakade
Samy Jelassi
Eran Malach
275
21
0
03 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Zongzhang Zhang
Di He
KELM
314
3
0
03 Jul 2024
Transformers Can Do Arithmetic with the Right Embeddings
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
...
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
300
73
0
27 May 2024
MEP: Multiple Kernel Learning Enhancing Relative Positional Encoding Length Extrapolation
Weiguo Gao
238
1
0
26 Mar 2024
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
Zexuan Qiu
Jingjing Li
Shijue Huang
Wanjun Zhong
Irwin King
ELM
ALM
367
14
0
06 Mar 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
689
168
0
26 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
361
76
0
14 Feb 2024
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Yingyi Chen
Qinghua Tao
F. Tonin
Johan A. K. Suykens
303
4
0
02 Feb 2024
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
Zhenyu He
Guhao Feng
Shengjie Luo
Kai-Bo Yang
Liwei Wang
Jingjing Xu
Zhi Zhang
Hongxia Yang
Di He
256
25
0
29 Jan 2024
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
Qiquan Zhang
Meng Ge
Hongxu Zhu
E. Ambikairajah
Qi Song
Zhaoheng Ni
Haizhou Li
289
17
0
18 Jan 2024
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Computer Vision and Pattern Recognition (CVPR), 2024
Yiran Song
Qianyu Zhou
Hefei Ling
Deng-Ping Fan
Xuequan Lu
Lizhuang Ma
VLM
585
22
0
04 Jan 2024
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Xiang Cheng
Yuxin Chen
S. Sra
696
63
0
11 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
504
37
0
01 Dec 2023
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAG
KELM
479
114
0
21 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
359
6
0
21 Nov 2023
Addressing the Length Bias Problem in Document-Level Neural Machine Translation
Zhuocheng Zhang
Shuhao Gu
Min Zhang
Yang Feng
264
2
0
20 Nov 2023
Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
181
10
0
01 Nov 2023
CLEX: Continuous Length Extrapolation for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Guanzheng Chen
Xin Li
Zaiqiao Meng
Shangsong Liang
Li Bing
360
38
0
25 Oct 2023
Extending Input Contexts of Language Models through Training on Segmented Sequences
Petros Karypis
Julian McAuley
George Karypis
329
1
0
23 Oct 2023
From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers
Shaoxiong Duan
Yining Shi
Wei Xu
349
16
0
18 Oct 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
186
9
0
15 Sep 2023
Exploring Transformer Extrapolation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhen Qin
Yiran Zhong
Huiyuan Deng
178
12
0
19 Jul 2023