2410.06205
Round and Round We Go! What makes Rotary Positional Encodings useful?
8 October 2024
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Veličković
Papers citing "Round and Round We Go! What makes Rotary Positional Encodings useful?"
10 / 10 papers shown
Title
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
Yi Lu
Wanxu Zhao
Xin Zhou
Chenxin An
C. Wang
...
Jun Zhao
Tao Ji
Tao Gui
Qi Zhang
Xuanjing Huang
37
0
0
26 Apr 2025
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
Manvi Agarwal
Changhong Wang
Gaël Richard
22
0
0
07 Apr 2025
On the Spatial Structure of Mixture-of-Experts in Transformers
Daniel Bershatsky
Ivan V. Oseledets
MoE
32
0
0
06 Apr 2025
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Yuheng Wu
Wentao Guo
Zirui Liu
Heng Ji
Zhaozhuo Xu
Denghui Zhang
28
0
0
05 Apr 2025
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
37
0
0
29 Mar 2025
How can representation dimension dominate structurally pruned LLMs?
Mingxue Xu
Lisa Alazraki
Danilo P. Mandic
48
0
0
06 Mar 2025
Rotary Outliers and Rotary Offset Features in Large Language Models
André Jonasson
64
0
0
03 Mar 2025
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
Jingang Qu
David Holzmüller
Gaël Varoquaux
Marine Le Morvan
LMTD
73
4
0
08 Feb 2025
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING
Connor Schenck
Isaac Reid
M. Jacob
Alex Bewley
Joshua Ainslie
...
Matthias Minderer
Dmitry Kalashnikov
Jonathan Tompson
Vikas Sindhwani
Krzysztof Choromanski
54
1
0
04 Feb 2025
softmax is not enough (for sharp out-of-distribution)
Petar Veličković
Christos Perivolaropoulos
Federico Barbero
Razvan Pascanu
18
1
0
01 Oct 2024