2006.15595
Rethinking Positional Encoding in Language Pre-training
28 June 2020
Guolin Ke
Di He
Tie-Yan Liu
ArXiv (abs)
PDF
HTML
Github (251★)
Papers citing
"Rethinking Positional Encoding in Language Pre-training"
50 / 172 papers shown
Title
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindřich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daniel Novak
33
0
0
01 Aug 2022
Generalized Attention Mechanism and Relative Position for Transformer
R. Pandya
ViT
18
1
0
24 Jul 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
70
8
0
15 Jul 2022
Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus
Xiang Li
Jinglu Wang
Xiaohao Xu
Xiao Li
Bhiksha Raj
Yan Lu
VOS
107
37
0
04 Jul 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
174
140
0
14 Jun 2022
MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning
Weiguo Pian
Hanyu Peng
Xunzhu Tang
Tiezhu Sun
Haoye Tian
Andrew Habib
Jacques Klein
Tegawende F. Bissyande
56
12
0
13 Jun 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Fuchun Sun
Yu Li
Zhongang Qi
Hui Xiong
Ying Shan
ViT
69
17
0
26 May 2022
Your Transformer May Not be as Powerful as You Expect
Shengjie Luo
Shanda Li
Shuxin Zheng
Tie-Yan Liu
Liwei Wang
Di He
139
54
0
26 May 2022
Neural Additive Models for Nowcasting
Wonkeun Jo
Dongil Kim
116
4
0
20 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
98
73
0
20 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng
Sameera Ramasinghe
Xueqian Li
Simon Lucey
101
19
0
18 May 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
100
3
0
12 May 2022
Decoupled Side Information Fusion for Sequential Recommendation
Yueqi Xie
Peilin Zhou
Sunghun Kim
102
114
0
23 Apr 2022
DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks
Ziyang Luo
Yadong Xi
Jing Ma
Zhiwei Yang
Xiaoxi Mao
Changjie Fan
Rongsheng Zhang
42
3
0
19 Apr 2022
Dynamic Position Encoding for Transformers
Joyce Zheng
Mehdi Rezagholizadeh
Peyman Passban
21
1
0
18 Apr 2022
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume
Jianye Pang
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
ViT
MedIm
55
11
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
118
32
0
13 Apr 2022
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
MoE
78
16
0
07 Apr 2022
Visual Abductive Reasoning
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
92
40
0
26 Mar 2022
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
Zhigang Jiang
Zhongzheng Xiang
Jinhua Xu
Mingbi Zhao
ViT
3DV
100
35
0
03 Mar 2022
FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov
Daniil Gavrilov
47
0
0
23 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
110
66
0
15 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
176
884
0
07 Feb 2022
Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers
Amir Ardalan Kalantari
Mohammad Amini
Sarath Chandar
Doina Precup
83
4
0
01 Feb 2022
Rewiring with Positional Encodings for Graph Neural Networks
Rickard Brüel-Gabrielsson
Mikhail Yurochkin
Justin Solomon
AI4CE
117
33
0
29 Jan 2022
SoK: Vehicle Orientation Representations for Deep Rotation Estimation
Huahong Tu
Siyuan Peng
Vladimir Leung
Richard Gao
3DPC
84
1
0
08 Dec 2021
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin
Heng Fan
Zhipeng Zhang
Yong-mei Xu
Haibin Ling
ViT
107
322
0
02 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
284
1,839
0
18 Nov 2021
Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer
Yi-Jen Shih
Shih-Lun Wu
Frank Zalkow
Meinard Muller
Yi-Hsuan Yang
94
83
0
07 Nov 2021
Can Vision Transformers Perform Convolution?
Shanda Li
Xiangning Chen
Di He
Cho-Jui Hsieh
ViT
110
21
0
02 Nov 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
92
36
0
12 Oct 2021
Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
Yining Ma
Jingwen Li
Zhiguang Cao
Wen Song
Le Zhang
Zhenghua Chen
Jing Tang
168
140
0
06 Oct 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
27
1
0
27 Sep 2021
The Impact of Positional Encodings on Multilingual Compression
Vinit Ravishankar
Anders Søgaard
60
5
0
11 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
76
7
0
06 Sep 2021
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
83
22
0
05 Sep 2021
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
56
2
0
30 Aug 2021
Conditional DETR for Fast Training Convergence
Depu Meng
Xiaokang Chen
Zejia Fan
Gang Zeng
Houqiang Li
Yuhui Yuan
Lei-huan Sun
Jingdong Wang
ViT
116
628
0
13 Aug 2021
Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding
Chin-Jui Chang
Chun-Yi Lee
Yi-Hsuan Yang
63
21
0
11 Aug 2021
PiSLTRc: Position-informed Sign Language Transformer with Content-aware Convolution
Pan Xie
Mengyi Zhao
Xiaohui Hu
ViT
SLR
97
35
0
27 Jul 2021
SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers
Danfeng Hong
Zhu Han
Jing Yao
Lianru Gao
Bing Zhang
Antonio J. Plaza
Jocelyn Chanussot
ViT
77
913
0
07 Jul 2021
Rethinking Positional Encoding
Jianqiao Zheng
Sameera Ramasinghe
Simon Lucey
85
52
0
06 Jul 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
95
50
0
23 Jun 2021
Bootstrap Representation Learning for Segmentation on Medical Volumes and Sequences
Zejian Chen
Wei Zhuo
Tianfu Wang
Wufeng Xue
Dong Ni
101
6
0
23 Jun 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
91
302
0
17 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
57
15
0
10 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
109
443
0
09 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
202
1,148
0
08 Jun 2021
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD
ViT
84
60
0
06 Jun 2021
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Ulme Wennberg
G. Henter
MILM
93
22
0
03 Jun 2021