Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
23 May 2023 · arXiv:2305.13571
Ta-Chung Chi, Ting-Han Fan, Li-Wei Chen, Alexander I. Rudnicky, Peter J. Ramadge
Tags: VLM, MILM
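The title's claim can be illustrated with a toy numerical check: under causal self-attention without positional embeddings, the output at position t aggregates over t+1 tokens, so its variance shrinks with position and therefore carries latent positional information. The sketch below assumes uniform causal attention over i.i.d. unit-variance value vectors; it is only an illustration of that effect, not the paper's actual analysis or model.

```python
import numpy as np

# Toy check: with uniform causal attention and no positional embeddings,
# position t averages t+1 i.i.d. value vectors, so the variance of the
# attention output is roughly 1/(t+1) and thus encodes position.
rng = np.random.default_rng(0)
n_seqs, seq_len, d_model = 2000, 32, 64

# i.i.d. unit-variance "value" vectors standing in for token representations
# entering a self-attention layer that has no positional information.
values = rng.standard_normal((n_seqs, seq_len, d_model))

# Uniform causal attention: output at position t is the mean of values[:, :t+1].
causal_mean = np.cumsum(values, axis=1) / np.arange(1, seq_len + 1)[None, :, None]

# Empirical variance of the attention output at each position,
# estimated over sequences and feature dimensions.
var_per_position = causal_mean.var(axis=(0, 2))

for t in [0, 1, 3, 7, 15, 31]:
    print(f"position {t:2d}: variance ~ {var_per_position[t]:.3f} "
          f"(1/(t+1) = {1 / (t + 1):.3f})")
```

With unit-variance inputs, the printed variances track 1/(t+1), matching the intuition that later positions average over more tokens and so have smaller output variance.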
Papers citing "Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings" (4 of 4 papers shown)
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, L. Qiu
04 Jun 2024
Breaking Symmetry When Training Transformers
Chunsheng Zuo, Michael Guerzhoy
06 Feb 2024
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao, Thomas Wang, Daniel Hesslow, Lucile Saulnier, Stas Bekman, ..., Lintang Sutawika, Jaesung Tae, Zheng-Xin Yong, Julien Launay, Iz Beltagy
Tags: MoE, AI4CE
27 Oct 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
Tags: AIMat
31 Dec 2020