2205.14173
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
27 May 2022
Lingkai Kong
Yuqing Wang
Molei Tao
ODL
Papers citing
"Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport"
5 / 5 papers shown
Title
Spectral-factorized Positive-definite Curvature Learning for NN Training
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Roger B. Grosse
47
0
0
10 Feb 2025
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,623
0
24 Feb 2021
Fast and accurate optimization on the orthogonal manifold without retraction
Pierre Ablin
Gabriel Peyré
51
26
0
15 Feb 2021
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
222
348
0
14 Jun 2018
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su
Stephen P. Boyd
Emmanuel J. Candes
105
1,152
0
04 Mar 2015