arXiv:1910.02653
Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization
7 October 2019
Paras Jain, Ajay Jain, Aniruddha Nrusimha, A. Gholami, Pieter Abbeel, Kurt Keutzer, Ion Stoica, Joseph E. Gonzalez
Links: arXiv (abs) · PDF · HTML · GitHub (131★)
Papers citing "Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization" (47 of 97 papers shown)
- POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging. Shishir G. Patil, Paras Jain, P. Dutta, Ion Stoica, Joseph E. Gonzalez. 15 Jul 2022.
- RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network. Vitaliy Chiley, Vithursan Thangarasa, Abhay Gupta, Anshul Samar, Joel Hestness, D. DeCoste. 28 Jun 2022.
- GACT: Activation Compressed Training for Generic Network Architectures. Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, ..., Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael W. Mahoney, Alvin Cheung. 22 Jun 2022. [VLM, GNN, MQ]
- Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models. Zhiquan Lai, Shengwei Li, Xudong Tang, Ke-shi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li. 10 Jun 2022.
- Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs. John Thorpe, Pengzhan Zhao, Jon Eyolfson, Yifan Qiao, Zhihao Jia, Minjia Zhang, Ravi Netravali, Guoqing Harry Xu. 26 Apr 2022.
- Scientometric Review of Artificial Intelligence for Operations & Maintenance of Wind Turbines: The Past, Present and Future. Joyjit Chatterjee, Nina Dethlefs. 30 Mar 2022.
- DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation. Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li. 30 Mar 2022.
- Survey on Large Scale Neural Network Training. Julia Gusak, Daria Cherniuk, Alena Shilova, A. Katrutsa, Daniel Bershatsky, ..., Lionel Eyraud-Dubois, Oleg Shlyazhko, Denis Dimitrov, Ivan Oseledets, Olivier Beaumont. 21 Feb 2022.
- Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers. Youjie Li, Amar Phanishayee, D. Murray, Jakub Tarnawski, Nam Sung Kim. 02 Feb 2022.
- Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, ..., Yuanzhong Xu, Danyang Zhuo, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica. 28 Jan 2022. [MoE]
- Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs. Taebum Kim, Eunji Jeong, Geonyong Kim, Yunmo Koo, Sehoon Kim, Gyeong-In Yu, Byung-Gon Chun. 23 Jan 2022. [AI4CE]
- Combined Scaling for Zero-shot Transfer Learning. Hieu H. Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, ..., Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le. 19 Nov 2021. [VLM]
- Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs. Hesham Mostafa. 11 Nov 2021. [GNN]
- Varuna: Scalable, Low-cost Training of Massive Deep Learning Models. Sanjith Athlur, Nitika Saran, Muthian Sivathanu, Ramachandran Ramjee, Nipun Kwatra. 07 Nov 2021. [GNN]
- A Data-Centric Optimization Framework for Machine Learning. Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler. 20 Oct 2021. [AI4CE]
- The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding. Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, T. Mowry. 19 Oct 2021.
- Hydra: A System for Large Multi-Model Deep Learning. Kabir Nagrecha, Arun Kumar. 16 Oct 2021. [MoE, AI4CE]
- PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management. Jiarui Fang, Zilin Zhu, Shenggui Li, Hui Su, Yang Yu, Jie Zhou, Yang You. 12 Aug 2021. [VLM]
- Accelerating Quadratic Optimization with Reinforcement Learning. Jeffrey Ichnowski, Paras Jain, Bartolomeo Stellato, G. Banjac, Michael Luo, Francesco Borrelli, Joseph E. Gonzalez, Ion Stoica, Ken Goldberg. 22 Jul 2021. [OffRL]
- KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks. J. G. Pauloski, Qi Huang, Lei Huang, Shivaram Venkataraman, Kyle Chard, Ian Foster, Zhao-jie Zhang. 04 Jul 2021.
- MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation. Sam Kumar, David Culler, Raluca A. Popa. 23 Jun 2021.
- Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads. John Thorpe, Yifan Qiao, Jon Eyolfson, Shen Teng, Guanzhou Hu, ..., Jinliang Wei, Keval Vora, Ravi Netravali, Miryung Kim, G. Xu. 24 May 2021. [GNN]
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training. Jianfei Chen, Lianmin Zheng, Z. Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph E. Gonzalez. 29 Apr 2021. [MQ]
- ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. Samyam Rajbhandari, Olatunji Ruwase, Jeff Rasley, Shaden Smith, Yuxiong He. 16 Apr 2021. [GNN]
- Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models. Dheevatsa Mudigere, Y. Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, ..., Ajit Mathews, Lin Qiao, M. Smelyanskiy, Bill Jia, Vijay Rao. 12 Apr 2021.
- Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM. Deepak Narayanan, Mohammad Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia. 09 Apr 2021. [MoE]
- Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis. Ajay Jain, Matthew Tancik, Pieter Abbeel. 01 Apr 2021.
- TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models. Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Basel Alomair, Ion Stoica. 16 Feb 2021. [MoE]
- Improving Panoptic Segmentation at All Scales. Lorenzo Porzi, Samuel Rota Buló, Peter Kontschieder. 14 Dec 2020.
- Memory Optimization for Deep Networks. Aashaka Shah, Chaoxia Wu, Jayashree Mohan, Vijay Chidambaram, Philipp Krahenbuhl. 27 Oct 2020.
- High-Capacity Expert Binary Networks. Adrian Bulat, Brais Martínez, Georgios Tzimiropoulos. 07 Oct 2020. [MQ]
- Accelerating Recommender Systems via Hardware "scale-in". S. Krishna, Ravi Krishna. 11 Sep 2020. [GNN, LRM]
- Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning. Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, G. Ganger, Eric Xing. 27 Aug 2020. [VLM]
- Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA. Mohamed Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka. 26 Aug 2020. [OODD]
- A Computational-Graph Partitioning Method for Training Memory-Constrained DNNs. Fareed Qararyah, Mohamed Wahib, Douga Dikbayir, M. E. Belviranli, Didem Unat. 19 Aug 2020.
- Skyline: Interactive In-Editor Computational Performance Profiling for Deep Neural Network Training. Geoffrey X. Yu, Tovi Grossman, Gennady Pekhimenko. 15 Aug 2020.
- The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism. Yosuke Oyama, N. Maruyama, Nikoli Dryden, Erin McCarthy, P. Harrington, J. Balewski, Satoshi Matsuoka, Peter Nugent, B. Van Essen. 25 Jul 2020. [3DV, AI4CE]
- DAPPLE: A Pipelined Data Parallel Approach for Training Large Models. Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, ..., Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin. 02 Jul 2020.
- Data Movement Is All You Need: A Case Study on Optimizing Transformers. A. Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler. 30 Jun 2020.
- Locally Masked Convolution for Autoregressive Models. Ajay Jain, Pieter Abbeel, Deepak Pathak. 22 Jun 2020. [DiffM, OffRL]
- Neural Parameter Allocation Search. Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko. 18 Jun 2020.
- Dynamic Tensor Rematerialization. Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock. 17 Jun 2020.
- Memory-Efficient Pipeline-Parallel DNN Training. Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, Matei A. Zaharia. 16 Jun 2020. [MoE]
- Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers. Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez. 26 Feb 2020.
- Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory. Julien Herrmann, Olivier Beaumont, Lionel Eyraud-Dubois, J. Herrmann, Alexis Joly, Alena Shilova. 27 Nov 2019. [BDL]
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models. Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He. 04 Oct 2019. [ALM, AI4CE]
- Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training. Bojian Zheng, Abhishek Tiwari, Nandita Vijaykumar, Gennady Pekhimenko. 22 May 2018.