1604.06174
Training Deep Nets with Sublinear Memory Cost
21 April 2016
Tianqi Chen
Bing Xu
Chiyuan Zhang
Carlos Guestrin
Papers citing "Training Deep Nets with Sublinear Memory Cost"
50 / 204 papers shown
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
M. Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
11
645
0
09 Apr 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
F. Karimi Nejadasl
O. Booij
J. C. V. Gemert
19
40
0
29 Mar 2021
Deep and Statistical Learning in Biomedical Imaging: State of the Art in 3D MRI Brain Tumor Segmentation
K. R. M. Fernando
Cris P Tsokos
25
53
0
09 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
103
27,682
0
26 Feb 2021
Jacobian Determinant of Normalizing Flows
Huadong Liao
Jiawei He
DRL
11
7
0
12 Feb 2021
Enabling Binary Neural Network Training on the Edge
Erwei Wang
James J. Davis
Daniele Moro
Piotr Zielinski
Jia Jie Lim
C. Coelho
S. Chatterjee
P. Cheung
G. Constantinides
MQ
20
24
0
08 Feb 2021
Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup
Luyu Gao
Yunyi Zhang
Jiawei Han
Jamie Callan
23
90
0
18 Jan 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
168
414
0
18 Jan 2021
A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
Sian Jin
Guanpeng Li
S. Song
Dingwen Tao
AI4CE
29
12
0
18 Nov 2020
RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
Yingqi Qu
Yuchen Ding
Jing Liu
Kai Liu
Ruiyang Ren
Xin Zhao
Daxiang Dong
Hua-Hong Wu
Haifeng Wang
RALM
OffRL
214
593
0
16 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
25
44
0
11 Oct 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
31
79
0
17 Sep 2020
Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA
M. Wahib
Haoyu Zhang
Truong Thao Nguyen
Aleksandr Drozd
Jens Domke
Lingqi Zhang
Ryousei Takano
Satoshi Matsuoka
OODD
34
23
0
26 Aug 2020
GANBERT: Generative Adversarial Networks with Bidirectional Encoder Representations from Transformers for MRI to PET synthesis
Hoo-Chang Shin
Alvin Ihsani
Swetha Mandava
Sharath Turuvekere Sreenivas
Christopher Forster
Jiook Cha
Alzheimer's Disease Neuroimaging Initiative
GAN
MedIm
16
19
0
10 Aug 2020
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama
N. Maruyama
Nikoli Dryden
Erin McCarthy
P. Harrington
J. Balewski
Satoshi Matsuoka
Peter Nugent
B. Van Essen
3DV
AI4CE
24
37
0
25 Jul 2020
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan
Yi Rong
Chen Meng
Zongyan Cao
Siyu Wang
...
Jun Yang
Lixue Xia
Lansong Diao
Xiaoyong Liu
Wei Lin
21
232
0
02 Jul 2020
PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
Siqi Bao
H. He
Fan Wang
Hua-Hong Wu
Haifeng Wang
Wenquan Wu
Zhen Guo
Zhibin Liu
Xinchao Xu
30
137
0
30 Jun 2020
LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation
Wentao Zhu
Can Zhao
Wenqi Li
H. Roth
Ziyue Xu
Daguang Xu
3DV
24
18
0
22 Jun 2020
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
16
93
0
17 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
17
212
0
16 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
24
432
0
11 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
58
1,646
0
08 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
29
7
0
05 Jun 2020
Hybrid Attention for Automatic Segmentation of Whole Fetal Head in Prenatal Ultrasound Volumes
Xin Yang
Xu Wang
Yi Wang
Haoran Dou
Shengli Li
H. Wen
Yi Lin
Pheng-Ann Heng
Dong Ni
16
19
0
28 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
28
3,913
0
10 Apr 2020
TorchIO: A Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning
Fernando Pérez-García
Rachel Sparks
Sébastien Ourselin
MedIm
LM&MA
138
427
0
09 Mar 2020
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin
Anton I. Gusev
FedML
14
48
0
10 Feb 2020
Efficient Memory Management for Deep Neural Net Inference
Yury Pisarchyk
Juhyun Lee
24
36
0
10 Jan 2020
Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory
Julien Herrmann
Olivier Beaumont
Lionel Eyraud-Dubois
J. Herrmann
Alexis Joly
Alena Shilova
BDL
21
29
0
27 Nov 2019
Streaming convolutional neural networks for end-to-end learning with multi-megapixel images
H. Pinckaers
Bram van Ginneken
G. Litjens
MedIm
24
94
0
11 Nov 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective
Sauptik Dhar
Junyao Guo
Jiayi Liu
S. Tripathi
Unmesh Kurup
Mohak Shah
17
141
0
02 Nov 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
70
6,370
0
26 Sep 2019
Profiling based Out-of-core Hybrid Method for Large Neural Networks
Yuki Ito
Haruki Imai
Tung D. Le
Yasushi Negishi
K. Kawachiya
R. Matsumiya
Toshio Endo
14
9
0
11 Jul 2019
Evaluating Protein Transfer Learning with TAPE
Roshan Rao
Nicholas Bhattacharya
Neil Thomas
Yan Duan
Xi Chen
John F. Canny
Pieter Abbeel
Yun S. Song
SSL
19
781
0
19 Jun 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
11
1,847
0
23 Apr 2019
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism
Nikoli Dryden
N. Maruyama
Tom Benson
Tim Moon
M. Snir
B. Van Essen
18
49
0
15 Mar 2019
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition
Yuntao Chen
Chenxia Han
Yanghao Li
Zehao Huang
Yi-Xin Jiang
Naiyan Wang
Zhaoxiang Zhang
73
30
0
14 Mar 2019
ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs
A. Gholami
Kurt Keutzer
George Biros
17
166
0
27 Feb 2019
Training on the Edge: The why and the how
Navjot Kukreja
Alena Shilova
Olivier Beaumont
Jan Huckelheim
N. Ferrier
P. Hovland
Gerard Gorman
14
33
0
13 Feb 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo
Wantao Liu
Wang Wang
Q. Lu
Songlin Hu
Jizhong Han
Ruixuan Li
11
9
0
21 Jan 2019
Learning Energy Based Inpainting for Optical Flow
Christoph Vogel
Huijuan Cao
T. Pock
3DPC
22
5
0
09 Nov 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
35
154
0
24 Jul 2018
Backdrop: Stochastic Backpropagation
Siavash Golkar
Kyle Cranmer
17
2
0
04 Jun 2018
Collaborative Learning for Deep Neural Networks
Guocong Song
Wei Chai
FedML
15
192
0
30 May 2018
Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng
Abhishek Tiwari
Nandita Vijaykumar
Gennady Pekhimenko
19
44
0
22 May 2018
Dynamic Control Flow in Large-Scale Machine Learning
Yuan Yu
Martín Abadi
P. Barham
E. Brevdo
M. Burrows
...
Michael Isard
M. Kudlur
R. Monga
D. Murray
Xiaoqiang Zheng
AI4CE
11
106
0
04 May 2018
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Trieu H. Trinh
Andrew M. Dai
Thang Luong
Quoc V. Le
23
179
0
01 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
30
701
0
26 Feb 2018
Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics using CNNs
Andres Milioto
C. Stachniss
SSeg
40
86
0
25 Feb 2018
SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
Linnan Wang
Jinmian Ye
Yiyang Zhao
Wei Yu Wu
Ang Li
S. Song
Zenglin Xu
Tim Kraska
3DH
33
264
0
13 Jan 2018