Cuttlefish: Low-Rank Model Training without All the Tuning

4 May 2023

Saurabh Agarwal

Pongsakorn U-chupala

Yoshiki Tanaka

Dimitris Papailiopoulos

Papers citing "Cuttlefish: Low-Rank Model Training without All the Tuning"

Title | Authors | Tags | Counts | Date
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile | Hangliang Ding, Dacheng Li, Runlong Su, Peiyuan Zhang, Zhijie Deng, Ion Stoica, Hao Zhang | VGen | 63 3 0 | 10 Feb 2025
Characterizing the Accuracy-Efficiency Trade-off of Low-rank Decomposition in Language Models | Chakshu Moar, Michael Pellauer, Hyoukjun Kwon | | 25 1 0 | 10 May 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection | Jiawei Zhao, Zhenyu (Allen) Zhang, Beidi Chen, Zhangyang Wang, A. Anandkumar, Yuandong Tian | | 25 173 0 | 06 Mar 2024
LLM360: Towards Fully Transparent Open-Source LLMs | Zhengzhong Liu, Aurick Qiao, W. Neiswanger, Hongyi Wang, Bowen Tan, ..., Zhiting Hu, Mark Schulze, Preslav Nakov, Timothy Baldwin, Eric P. Xing | | 27 68 0 | 11 Dec 2023
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition | Samuel Horváth, Stefanos Laskaridis, Shashank Rajput, Hongyi Wang | BDL | 21 4 0 | 28 Aug 2023
A Field Guide to Federated Optimization | Jianyu Wang, Zachary B. Charles, Zheng Xu, Gauri Joshi, H. B. McMahan, ..., Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu | FedML | 163 358 0 | 14 Jul 2021
MLP-Mixer: An all-MLP Architecture for Vision | Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy | | 239 2,554 0 | 04 May 2021
Initialization and Regularization of Factorized Neural Layers | M. Khodak, Neil A. Tenenholtz, Lester W. Mackey, Nicolò Fusi | | 63 56 0 | 03 May 2021
Comparing Rewinding and Fine-tuning in Neural Network Pruning | Alex Renda, Jonathan Frankle, Michael Carbin | | 216 354 0 | 05 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 294 6,927 0 | 20 Apr 2018
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam | 3DH | 948 20,214 0 | 17 Apr 2017
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights | Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen | MQ | 291 1,002 0 | 10 Feb 2017
Xception: Deep Learning with Depthwise Separable Convolutions | François Chollet | MDE, BDL, PINN | 193 14,190 0 | 07 Oct 2016