Rethinking Positional Encoding in Language Pre-training
arXiv 2006.15595 (v4, latest) · 28 June 2020
Guolin Ke, Di He, Tie-Yan Liu
ArXiv (abs) · PDF · HTML · GitHub (251★)
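
The paper's core proposal is to untie positional and content information in self-attention: instead of adding position embeddings to the word embeddings before a shared projection, the word-to-word and position-to-position correlations are computed with separate parameters and summed (the [CLS] token's position is untied as well). A minimal PyTorch sketch of the additive part of that idea follows; all module and parameter names are illustrative assumptions, not the authors' released implementation (see the GitHub link above for that).

```python
# Minimal sketch of untied positional attention in the spirit of
# "Rethinking Positional Encoding in Language Pre-training" (TUPE).
# Names and shapes are illustrative assumptions, not the paper's code.
import math
import torch
import torch.nn as nn

class UntiedPositionalAttention(nn.Module):
    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.wq = nn.Linear(d_model, d_model)      # content (word) query
        self.wk = nn.Linear(d_model, d_model)      # content (word) key
        self.uq = nn.Linear(d_model, d_model)      # positional query (assumed name)
        self.uk = nn.Linear(d_model, d_model)      # positional key (assumed name)
        self.pos = nn.Embedding(max_len, d_model)  # absolute position table
        # Two dot-product terms are summed, so scores are scaled by
        # sqrt(2 * d) rather than the usual sqrt(d).
        self.scale = math.sqrt(2 * d_model)

    def attention_scores(self, x: torch.Tensor) -> torch.Tensor:
        """x: (batch, seq_len, d_model) word representations, with no
        position embeddings mixed into them."""
        n = x.size(1)
        p = self.pos(torch.arange(n, device=x.device))        # (n, d)
        content = self.wq(x) @ self.wk(x).transpose(-2, -1)   # word-word term
        position = self.uq(p) @ self.uk(p).transpose(-2, -1)  # pos-pos term
        # Untied: separate projections for the two correlation types,
        # combined additively before the softmax.
        return (content + position.unsqueeze(0)) / self.scale
```

These scores would then go through the usual softmax and value aggregation; the single-head case is shown for brevity.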

Papers citing "Rethinking Positional Encoding in Language Pre-training"

22 of 172 citing papers shown
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens
Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
ViT · 73 · 77 · 0 · 31 May 2021

Self-Attention Networks Can Process Bounded Hierarchical Languages
Shunyu Yao, Binghui Peng, Christos H. Papadimitriou, Karthik Narasimhan
87 · 83 · 0 · 24 May 2021

Relative Positional Encoding for Transformers with Linear Complexity
Antoine Liutkus, Ondřej Cífka, Shih-Lun Wu, Umut Simsekli, Yi-Hsuan Yang, Gaël Richard
84 · 48 · 0 · 18 May 2021

Towards Robust Vision Transformer
Xiaofeng Mao, Gege Qi, YueFeng Chen, Xiaodan Li, Ranjie Duan, Shaokai Ye, Yuan He, Hui Xue
ViT · 75 · 195 · 0 · 17 May 2021

How could Neural Networks understand Programs?
Dinglan Peng, Shuxin Zheng, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu
NAI · 67 · 65 · 0 · 10 May 2021

MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE
Shih-Lun Wu, Yi-Hsuan Yang
ViT · 109 · 55 · 0 · 10 May 2021

LANA: Towards Personalized Deep Knowledge Tracing Through Distinguishable Interactive Sequences
Yuhao Zhou, Xihua Li, Yunbo Cao, Xuemin Zhao, Qing Ye, Jiancheng Lv
AI4Ed · 51 · 9 · 0 · 21 Apr 2021

RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu
363 · 2,546 · 0 · 20 Apr 2021 (rotary position embedding is sketched in code after this list)

SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization
Amir Hertz, Or Perel, Raja Giryes, Olga Sorkine-Hornung, Daniel Cohen-Or
81 · 69 · 0 · 19 Apr 2021

A Simple and Effective Positional Encoding for Transformers
Pu-Chin Chen, Henry Tsai, Srinadh Bhojanapalli, Hyung Won Chung, Yin-Wen Chang, Chun-Sung Ferng
120 · 66 · 0 · 18 Apr 2021

Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai, Yijia Liu, Yansong Feng, Songfang Huang, Dongyan Zhao
VLM, AI4CE · 70 · 38 · 0 · 15 Apr 2021

Investigating the Limitations of Transformers with Simple Arithmetic Tasks
Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
LRM · 122 · 130 · 0 · 25 Feb 2021

Position Information in Transformers: An Overview
Philipp Dufter, Martin Schmitt, Hinrich Schütze
93 · 148 · 0 · 22 Feb 2021

Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder
Shuqi Lu, Di He, Chenyan Xiong, Guolin Ke, Waleed Malik, Zhicheng Dou, Paul N. Bennett, Tie-Yan Liu, Arnold Overwijk
RALM · 118 · 11 · 0 · 18 Feb 2021

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul N. Bennett, Jiawei Han, Xia Song
182 · 205 · 0 · 16 Feb 2021

Revisiting Language Encoding in Learning Multilingual Representations
Shengjie Luo, Kaiyuan Gao, Shuxin Zheng, Guolin Ke, Di He, Liwei Wang, Tie-Yan Liu
60 · 2 · 0 · 16 Feb 2021

Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang
193 · 187 · 0 · 07 Jan 2021

Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press, Noah A. Smith, Mike Lewis
306 · 91 · 0 · 31 Dec 2020

Learning from Mistakes: Using Mis-predictions as Harm Alerts in Language Pre-Training
Chen Xing, Wenhao Liu, Caiming Xiong
31 · 0 · 0 · 16 Dec 2020

Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo, Artur Kulmizev, Xiaoxi Mao
100 · 41 · 0 · 09 Nov 2020

Contextual BERT: Conditioning the Language Model Using a Global State
Timo I. Denk, Ana Peleteiro Ramallo
65 · 6 · 0 · 29 Oct 2020

ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
Dongling Xiao, Yukun Li, Han Zhang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
29 · 39 · 0 · 23 Oct 2020
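
The most-cited entry above, RoFormer, replaces additive position embeddings with rotary position embedding: each pair of query/key channels is rotated by a position-dependent angle, so the attention dot product depends only on the relative offset between tokens. The sketch below uses the half-split channel pairing as an illustrative assumption and is not RoFormer's published implementation.

```python
# Illustrative sketch of rotary position embedding (RoFormer-style).
# The half-split channel pairing is an assumption for clarity.
import torch

def apply_rotary(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """x: (batch, seq_len, dim) queries or keys, with even dim."""
    n, d = x.size(1), x.size(2)
    half = d // 2
    # One frequency per channel pair, geometrically spaced as in
    # sinusoidal position encodings.
    freqs = base ** (-torch.arange(half, device=x.device) / half)  # (half,)
    angles = torch.arange(n, device=x.device)[:, None] * freqs     # (n, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # Rotate each (x1, x2) channel pair by its position-dependent angle;
    # dot products of rotated q and k then depend on position differences.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

Applied to the per-head queries and keys just before the attention dot product, this injects position without adding anything to the token embeddings.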