2410.01104
Cited By
softmax is not enough (for sharp out-of-distribution)
1 October 2024
Petar Veličković
Christos Perivolaropoulos
Federico Barbero
Razvan Pascanu
Papers citing
"softmax is not enough (for sharp out-of-distribution)"
11 / 11 papers shown
Title
Knowledge Distillation for Speech Denoising by Latent Representation Alignment with Cosine Distance
Diep Luong
Mikko Heikkinen
K. Drossos
Tuomas Virtanen
34
0
0
06 May 2025
Bayesian Principles Improve Prompt Learning In Vision-Language Models
Mingyu Kim
Jongwoo Ko
Mijung Park
VLM
28
0
0
19 Apr 2025
Long Context In-Context Compression by Getting to the Gist of Gisting
Aleksandar Petrov
Mark Sandler
A. Zhmoginov
Nolan Miller
Max Vladymyrov
17
0
0
11 Apr 2025
On Vanishing Variance in Transformer Length Generalization
Ruining Li
Gabrijel Boduljak
Jensen Zhou
26
0
0
03 Apr 2025
Multi-Token Attention
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
40
1
0
01 Apr 2025
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
37
0
0
29 Mar 2025
Attend or Perish: Benchmarking Attention in Algorithmic Reasoning
Michal Spiegel
Michal Štefánik
Marek Kadlčík
Josef Kuchař
29
0
0
28 Feb 2025
Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski
Denis Janiak
Albert Sawczyn
Bogdan Gabrys
Tomasz Kajdanowicz
50
0
0
24 Feb 2025
What makes a good feedforward computational graph?
Alex Vitvitskyi
J. G. Araújo
Marc Lackenby
Petar Veličković
71
1
0
10 Feb 2025
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Veličković
47
16
0
08 Oct 2024
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
Weijian Deng
Jianfeng Zhang
Bo An
28
0
0
29 May 2024