ResearchTrend.AI
Rethinking Positional Encoding in Language Pre-training

28 June 2020
Guolin Ke
Di He
Tie-Yan Liu

Papers citing "Rethinking Positional Encoding in Language Pre-training"

50 / 172 papers shown
Revisiting Transformers with Insights from Image Filtering
Laziz U. Abdullaev
Maksim Tkachenko
Tan M. Nguyen
ViT
129
0
0
12 Jun 2025
Theoretical Analysis of Positional Encodings in Transformer Models: Impact on Expressiveness and Generalization
Yin Li
20
0
0
05 Jun 2025
A Word is Worth 4-bit: Efficient Log Parsing with Binary Coded Decimal Recognition
Prerak Srivastava
Giulio Corallo
Sergey Rybalko
32
0
0
01 Jun 2025
PIPE: Physics-Informed Position Encoding for Alignment of Satellite Images and Time Series
Haobo Li
Eunseo Jung
Zixin Chen
Zhaowei Wang
Yueya Wang
Huamin Qu
Alexis Kai Hon Lau
8
0
0
27 May 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang
Yikang Shen
Kaiyue Wen
Shawn Tan
Mayank Mishra
Liliang Ren
Rameswar Panda
Yoon Kim
72
1
0
22 May 2025
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
59
0
0
19 Apr 2025
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
Manvi Agarwal
Changhong Wang
Gaël Richard
68
0
0
07 Apr 2025
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
158
0
0
03 Apr 2025
Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection
Romain Thoreau
Valerio Marsocci
Dawa Derksen
AI4CE
105
3
0
12 Mar 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
83
0
0
24 Feb 2025
Positional Encoding in Transformer-Based Time Series Models: A Survey
Habib Irani
Vangelis Metsis
AI4TS
80
2
0
17 Feb 2025
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang
Issei Sato
183
0
0
16 Dec 2024
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
156
3
0
10 Dec 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang
Qian Liu
Chao Du
Tongyao Zhu
Cunxiao Du
Kenji Kawaguchi
Tianyu Pang
231
8
0
20 Nov 2024
Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction
Yiqing Guo
K. Mokany
S. Levick
Jinyan Yang
P. Moghadam
MDE
234
2
0
25 Oct 2024
Just In Time Transformers
Ahmed Ala Eddine Benali
M. Cafaro
I. Epicoco
Marco Pulimeno
Enrico Junior Schioppa
AI4TS
21
0
0
22 Oct 2024
Mitigating Object Hallucination via Concentric Causal Attention
Yun Xing
Yiheng Li
Ivan Laptev
Shijian Lu
108
23
0
21 Oct 2024
MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno
R. Lotufo
Rodrigo Nogueira
LRM
69
0
0
08 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
83
3
0
07 Oct 2024
Efficient transformer with reinforced position embedding for language models
Yen-Che Hsiao
Abhishek Dutta
33
0
0
07 Oct 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM, CLL
482
2
0
20 Sep 2024
TeXBLEU: Automatic Metric for Evaluate LaTeX Format
Kyudan Jung
N. Kim
Hyongon Ryu
Sieun Hyeon
Seung-jun Lee
Hyeok-jae Lee
77
1
0
10 Sep 2024
DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting
Ruixin Ding
Yuqi Chen
Yu-Ting Lan
Wei Zhang
AI4TS
78
3
0
05 Aug 2024
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
Qi Sun
Hang Zhou
Wengang Zhou
Li Li
Houqiang Li
3DPC, 3DV
96
7
0
07 Jul 2024
Learning Positional Attention for Sequential Recommendation
Fan Luo
Juan Zhang
Shenghui Xu
37
2
0
03 Jul 2024
NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics
Jingbo Zhou
Shaorong Chen
Jun Xia
Sizhe Liu
Tianze Ling
Wenjie Du
Yue Liu
Jianwei Yin
Stan Z. Li
67
4
0
16 Jun 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
96
7
0
29 May 2024
Base of RoPE Bounds Context Length
Xin Men
Mingyu Xu
Bingning Wang
Qingyu Zhang
Hongyu Lin
Xianpei Han
Weipeng Chen
101
26
0
23 May 2024
Improving Transformers using Faithful Positional Encoding
Tsuyoshi Idé
Jokin Labaien
Pin-Yu Chen
61
0
0
15 May 2024
PoPE: Legendre Orthogonal Polynomials Based Position Encoding for Large Language Models
Arpit Aggarwal
37
0
0
29 Apr 2024
Low-resource neural machine translation with morphological modeling
Antoine Nzeyimana
66
6
0
03 Apr 2024
EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention
Zhen Tian
Wayne Xin Zhao
Changwang Zhang
Xin Zhao
Zhongrui Ma
Ji-Rong Wen
91
3
0
26 Mar 2024
Geotokens and Geotransformers
Eren Unlu
53
0
0
23 Mar 2024
AdaNovo: Adaptive De Novo Peptide Sequencing with Conditional Mutual Information
Jun Xia
Shaorong Chen
Jingbo Zhou
Tianze Ling
Wenjie Du
Sizhe Liu
Stan Z. Li
36
5
0
09 Mar 2024
Probabilistic Image-Driven Traffic Modeling via Remote Sensing
Scott Workman
Armin Hadzic
57
0
0
08 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
Frank Breitinger
Mark Scanlon
148
10
0
29 Feb 2024
Lissard: Long and Simple Sequential Reasoning Datasets
M. Bueno
R. Lotufo
Rodrigo Nogueira
RALM, LRM
35
2
0
12 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM, LM&MA, ELM
246
425
0
09 Feb 2024
XTSFormer: Cross-Temporal-Scale Transformer for Irregular Time Event Prediction
Tingsong Xiao
Zelin Xu
Wenchong He
Jim Su
Yupu Zhang
...
Jason Petho
Jiang Bian
P. Tighe
Parisa Rashidi
Zhe Jiang
AI4TS
67
4
0
03 Feb 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Aman Chadha
Amitava Das
68
29
0
15 Jan 2024
Graph Language Models
Moritz Plenz
Anette Frank
KELM, AI4CE
97
7
0
13 Jan 2024
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
119
28
0
09 Jan 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Zirui Liu
Chia-Yuan Chang
Huiyuan Chen
Helen Zhou
124
118
0
02 Jan 2024
Algebraic Positional Encodings
Konstantinos Kogkalidis
Jean-Philippe Bernardy
Vikas Garg
34
3
0
26 Dec 2023
SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation
Jia Li
Yanyan Shen
Lei Chen
Charles Wang Wai Ng
60
3
0
27 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
89
4
0
21 Nov 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin
Aaron Courville
Yiran Zhong
90
80
0
08 Nov 2023
Positional Encoding-based Resident Identification in Multi-resident Smart Homes
Zhiyi Song
Dipankar Chaki
Abdallah Lakhdari
A. Bouguettaya
55
2
0
27 Oct 2023
The Locality and Symmetry of Positional Encodings
Lihu Chen
Gaël Varoquaux
Fabian M. Suchanek
62
1
0
19 Oct 2023
Enhanced Transformer Architecture for Natural Language Processing
Woohyeon Moon
Taeyoung Kim
Bumgeun Park
Dongsoo Har
73
0
0
17 Oct 2023