Memory-Efficient Backpropagation Through Time


10 June 2016
A. Gruslys
Rémi Munos
Ivo Danihelka
Marc Lanctot
Alex Graves

Papers citing "Memory-Efficient Backpropagation Through Time"

50 / 118 papers shown
OneTrans: Unified Feature Interaction and Sequence Modeling with One Transformer in Industrial Recommender
Zhaoqi Zhang
Haolei Pei
Jun Guo
Tianyu Wang
Yufei Feng
Hui Sun
Shaowei Liu
Aixin Sun
OffRL
262
30
0
30 Oct 2025
Learning Regularization Functionals for Inverse Problems: A Comparative Study
J. Hertrich
Matthias Joachim Ehrhardt
Alexander Denker
Stanislas Ducotterd
Zhenghan Fang
...
German Shâma Wache
Martin Zach
Yasi Zhang
Matthias Joachim Ehrhardt
Sebastian Neumayer
234
10
0
02 Oct 2025
DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing
IEEE International Conference on Cluster Computing (Cluster), 2025
Afif Boudaoud
A. Calotoiu
Marcin Copik
Torsten Hoefler
PINN
AI4CE
196
1
0
02 Sep 2025
Text2Stereo: Repurposing Stable Diffusion for Stereo Generation with Consistency Rewards
Aakash Garg
Libing Zeng
Andrii Tsarov
N. Kalantari
299
1
0
27 May 2025
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
International Conference on Learning Representations (ICLR), 2025
Anh Tong
Thanh Nguyen-Tang
Dongeun Lee
Duc Nguyen
Toan M. Tran
David Hall
Cheongwoong Kang
Jaesik Choi
593
10
0
03 Mar 2025
GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
Ding-Yong Hong
Tzu-Hsien Tsai
Ning Wang
Pangfeng Liu
Jan-Jan Wu
267
2
0
18 Feb 2025
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Antoine Simoulin
Namyong Park
Xiaoyi Liu
Grey Yang
479
8
0
31 Jan 2025
Optimal Gradient Checkpointing for Sparse and Recurrent Architectures using Off-Chip Memory
Wadjih Bencheikh
Jan Finkbeiner
Emre Neftci
297
4
0
16 Dec 2024
Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows
Journal of Geophysical Research (JGR), 2024
Feng Liu
Haipeng Li
Guangyuan Zou
Junlun Li
463
10
0
30 Nov 2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
Zhenheng Tang
Xueze Kang
Yiming Yin
Xinglin Pan
Yuxin Wang
...
Shaohuai Shi
Amelie Chi Zhou
Bo Li
Bingsheng He
Xiaowen Chu
AI4CE
283
10
0
16 Oct 2024
On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers
M. Deutel
Frank Hannig
Christopher Mutschler
Jürgen Teich
MQ
341
12
0
15 Jul 2024
Banishing LLM Hallucinations Requires Rethinking Generalization
Johnny Li
Saksham Consul
Eda Zhou
James Wong
Naila Farooqui
...
Zhuxiaona Wei
Tian Wu
Ben Echols
Sharon Zhou
Gregory Diamos
LRM
413
22
0
25 Jun 2024
Adding Conditional Control to Diffusion Models with Reinforcement Learning
Yulai Zhao
Masatoshi Uehara
Gabriele Scalia
Tommaso Biancalani
Sergey Levine
Ehsan Hajiramezanali
AI4CE
618
14
0
17 Jun 2024
ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability
Xiao Wang
A. Tsaris
Siyan Liu
Jong Youl Choi
Ming Fan
Wei Zhang
Ju Yin
M. Ashfaq
Dan Lu
Dali Wang
286
22
0
23 Apr 2024
Tiny Machine Learning: Progress and Futures
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Song Han
385
138
0
28 Mar 2024
Block Selective Reprogramming for On-device Training of Vision Transformers
Sreetama Sarkar
Souvik Kundu
Kai Zheng
Peter A. Beerel
263
5
0
25 Mar 2024
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang
Weiming Zhuang
Chen Chen
Lingjuan Lyu
304
19
0
21 Mar 2024
Feedback Efficient Online Fine-Tuning of Diffusion Models
Masatoshi Uehara
Yulai Zhao
Kevin Black
Ehsan Hajiramezanali
Gabriele Scalia
N. Diamant
Alex Tseng
Sergey Levine
Tommaso Biancalani
469
47
0
26 Feb 2024
Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control
Masatoshi Uehara
Yulai Zhao
Kevin Black
Ehsan Hajiramezanali
Gabriele Scalia
N. Diamant
Alex Tseng
Tommaso Biancalani
Sergey Levine
366
104
0
23 Feb 2024
Moonwalk: Inverse-Forward Differentiation
Dmitrii Krylov
Armin Karamzade
Roy Fox
208
1
0
22 Feb 2024
On the Resurgence of Recurrent Models for Long Sequences -- Survey and Research Opportunities in the Transformer Era
Matteo Tiezzi
Michele Casoni
Alessandro Betti
Tommaso Guidi
Marco Gori
S. Melacci
366
19
0
12 Feb 2024
InstructVideo: Instructing Video Diffusion Models with Human Feedback
Hangjie Yuan
Shiwei Zhang
Xiang Wang
Yujie Wei
Tao Feng
Yining Pan
Yingya Zhang
Ziwei Liu
Samuel Albanie
Dong Ni
VGen
293
90
0
19 Dec 2023
Compressed Context Memory For Online Language Model Interaction
Jang-Hyun Kim
Junyoung Yeom
Sangdoo Yun
Hyun Oh Song
KELM
367
33
1
06 Dec 2023
Coop: Memory is not a Commodity
Neural Information Processing Systems (NeurIPS), 2023
Jianhao Zhang
Shihan Ma
Peihong Liu
Jinhui Yuan
232
9
0
01 Nov 2023
FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound
Chaoyu Chen
Xin Yang
Yuhao Huang
Wenlong Shi
Yan Cao
...
Kejuan Yue
Yuanji Zhang
Yi Xiong
Dong Ni
Weijun Huang
3DH
160
0
0
30 Oct 2023
Long-term Dependency for 3D Reconstruction of Freehand Ultrasound Without External Tracker
IEEE Transactions on Biomedical Engineering (IEEE Trans. Biomed. Eng.), 2023
Qi Li
Ziyi Shen
Qian Li
D. Barratt
T. Dowrick
Matthew J. Clarkson
Tom Vercauteren
Yipeng Hu
189
16
0
16 Oct 2023
Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models
Zijian Zhang
Luping Liu
Zhijie Lin
Yichen Zhu
Zhou Zhao
DiffM
368
11
0
15 Oct 2023
Fast-ELECTRA for Efficient Pre-training
International Conference on Learning Representations (ICLR), 2023
Chengyu Dong
Liyuan Liu
Hao Cheng
Jingbo Shang
Jianfeng Gao
Xiaodong Liu
289
2
0
11 Oct 2023
Aligning Text-to-Image Diffusion Models with Reward Backpropagation
Mihir Prabhudesai
Anirudh Goyal
Deepak Pathak
Katerina Fragkiadaki
571
241
0
05 Oct 2023
OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
ACM Symposium on Cloud Computing (SoCC), 2023
Kuntai Du
Yuhan Liu
Yitian Hao
Qizheng Zhang
Haodong Wang
Yuyang Huang
Ganesh Ananthanarayanan
Junchen Jiang
244
2
0
03 Oct 2023
Score-based Data Assimilation for a Two-Layer Quasi-Geostrophic Model
Sacha Lewin
Gilles Louppe
306
12
0
03 Oct 2023
Directly Fine-Tuning Diffusion Models on Differentiable Rewards
International Conference on Learning Representations (ICLR), 2023
Amita Gajewar
Paul Vicol
G. Bansal
David J Fleet
337
361
0
29 Sep 2023
Enabling Resource-efficient AIoT System with Cross-level Optimization: A survey
IEEE Communications Surveys and Tutorials (COMST), 2023
Sicong Liu
Bin Guo
Cheng Fang
Ziqi Wang
Shiyan Luo
Zimu Zhou
Zhiwen Yu
AI4CE
354
40
0
27 Sep 2023
FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs
Zhenheng Tang
Yuxin Wang
Xin He
Longteng Zhang
Xinglin Pan
...
Rongfei Zeng
Kaiyong Zhao
Shaoshuai Shi
Bingsheng He
Xiaowen Chu
277
37
0
03 Sep 2023
Brain-inspired learning in artificial neural networks: a review
APL Machine Learning (AML), 2023
Samuel Schmidgall
Jascha Achterberg
Thomas Miconi
Louis Kirsch
Rojin Ziaei
S. P. Hajiseyedrazi
Nhan Duy Truong
277
109
0
18 May 2023
Domain Generalization for Mammographic Image Analysis with Contrastive Learning
Zheren Li
Zhiming Cui
Lichi Zhang
Sheng Wang
Chenjin Lei
...
Yajia Gu
Zaiyi Liu
Chunling Liu
Dinggang Shen
Jie‐Zhi Cheng
661
5
0
20 Apr 2023
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
Computer Vision and Pattern Recognition (CVPR), 2023
Yihao Chen
Xianbiao Qi
Jianan Wang
Lei Zhang
244
26
0
17 Apr 2023
End-to-End Diffusion Latent Optimization Improves Classifier Guidance
IEEE International Conference on Computer Vision (ICCV), 2023
Bram Wallace
Akash Gokul
Stefano Ermon
Nikhil Naik
578
117
0
23 Mar 2023
RAF: Holistic Compilation for Deep Learning Model Training
Cody Hao Yu
Haozheng Fan
Guangtai Huang
Zhen Jia
Yizhi Liu
...
Yuan Zhou
Haichen Shen
Junru Shao
Mu Li
Yida Wang
217
3
0
08 Mar 2023
Towards Vision Transformer Unrolling Fixed-Point Algorithm: a Case Study on Image Restoration
Peng Qiao
Sidun Liu
Tao Sun
Ke Yang
Y. Dou
ViT
298
3
0
29 Jan 2023
XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments
ACM Transactions on Architecture and Code Optimization (TACO), 2022
Manuela Schuler
Richard Membarth
P. Slusallek
285
4
0
19 Dec 2022
K-UNN: k-Space Interpolation With Untrained Neural Network
Zhuoxu Cui
Seng Jia
Qingyong Zhu
Congcong Liu
Zhilang Qiu
Yuanyuan Liu
Jing Cheng
Haifeng Wang
Yanjie Zhu
Dong Liang
172
1
0
11 Aug 2022
DIVISION: Memory Efficient Training via Dual Activation Precision
International Conference on Machine Learning (ICML), 2022
Guanchu Wang
Zirui Liu
Zhimeng Jiang
Ninghao Liu
Nannan Zou
Helen Zhou
MQ
484
4
0
05 Aug 2022
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network
Conference on Machine Learning and Systems (MLSys), 2022
Vitaliy Chiley
Vithursan Thangarasa
Abhay Gupta
Anshul Samar
Joel Hestness
D. DeCoste
265
14
0
28 Jun 2022
p-Meta: Towards On-device Deep Model Adaptation
Knowledge Discovery and Data Mining (KDD), 2022
Zhongnan Qu
Zimu Zhou
Yongxin Tong
Lothar Thiele
157
15
0
25 Jun 2022
Flexible Diffusion Modeling of Long Videos
Neural Information Processing Systems (NeurIPS), 2022
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian D. Weilbach
Frank Wood
DiffM
BDL
VGen
665
351
0
23 May 2022
Real-time Forecasting of Time Series in Financial Markets Using Sequentially Trained Many-to-one LSTMs
Kelum Gajamannage
Yonggi Park
AI4TS
AIFin
278
4
0
10 May 2022
Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation
Neural Computation (Neural Comput.), 2022
Nicolas Zucchet
João Sacramento
421
34
0
06 May 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
288
138
0
25 Apr 2022
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
International Conference on Learning Representations (ICLR), 2022
Joya Chen
Kai Xu
Yuhui Wang
Yifei Cheng
Angela Yao
318
9
0
28 Feb 2022