NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation | BigData Congress [Services Society] (BSS), 2024 |
Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing | International Conference on Learning Representations (ICLR), 2024 |