
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2021
16 April 2021
Samyam Rajbhandari
Olatunji Ruwase
Jeff Rasley
Shaden Smith
Yuxiong He
    GNN
ArXiv (abs) · PDF · HTML
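
For readers who want to try the technique this page tracks, ZeRO-Infinity is implemented in the DeepSpeed library. Below is a minimal sketch of enabling it, assuming the current DeepSpeed config schema; the toy model, batch size, learning rate, and NVMe path are illustrative placeholders, not values taken from the paper.

```python
import torch
import deepspeed

# Toy stand-in for a large model; ZeRO-Infinity targets models that do not
# fit in aggregate GPU memory.
model = torch.nn.Linear(4096, 4096)

# Sketch of a ZeRO-Infinity-style config: ZeRO stage 3 partitions parameters,
# gradients, and optimizer states across ranks, and the offload entries push
# them out to NVMe. The nvme_path is an assumed placeholder.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

# deepspeed.initialize wraps the model in an engine that performs the
# partitioning and offloading transparently during forward/backward/step.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```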

Papers citing "ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning"

Showing 50 of 235 citing papers.
SALPA: Spaceborne LiDAR Point Adjustment for Enhanced GEDI Footprint Geolocation
Narumasa Tsutsumida
Rei Mitsuhashi
Yoshito Sawada
Akira Kato
40
0
0
18 Nov 2025
10Cache: Heterogeneous Resource-Aware Tensor Caching and Migration for LLM Training
Sabiha Afroz
Redwan Ibne Seraj Khan
Hadeel Albahar
Jingoo Han
A. R. Butt
112
0
0
18 Nov 2025
Look as You Think: Unifying Reasoning and Visual Evidence Attribution for Verifiable Document RAG via Reinforcement Learning
Shuochen Liu
Pengfei Luo
Chao Zhang
Yuhao Chen
H. Zhang
Qi Liu
Xin Kou
Tong Xu
Enhong Chen
OOD, LRM
200
0
0
15 Nov 2025
AsyncHZP: Hierarchical ZeRO Parallelism with Asynchronous Scheduling for Scalable LLM Training
Huawei Bai
Yifan Huang
Wenqi Shi
Ansheng You
Feifan Shao
Tengfei Han
Minghui Yu
84
0
0
23 Oct 2025
OptPipe: Memory- and Scheduling-Optimized Pipeline Parallelism for LLM Training
Hongpei Li
Han Zhang
Huikang Liu
Dongdong Ge
Yinyu Ye
56
0
0
06 Oct 2025
SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training
Y. Liu
Guohao Wu
Shenglong Zhang
Wei Zhang
Qianchao Zhu
Zhouyang Li
Chenyu Wang
76
0
0
30 Sep 2025
LoRAFusion: Efficient LoRA Fine-Tuning for LLMs
Zhanda Zhu
Qidong Su
Yaoyao Ding
Kevin Song
Shang Wang
Gennady Pekhimenko
MoMe
156
0
0
30 Sep 2025
AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models
Jihu Guo
Tenghui Ma
Wei Gao
Peng Sun
Jiaxing Li
Xun Chen
Yuyang Jin
Dahua Lin
68
0
0
28 Sep 2025
PreScope: Unleashing the Power of Prefetching for Resource-Constrained MoE Inference
Enda Yu
Zhaoning Zhang
Dezun Dong
Yongwei Wu
Xiangke Liao
112
1
0
28 Sep 2025
Efficient Fine-Grained GPU Performance Modeling for Distributed Deep Learning of LLM
Biyao Zhang
Mingkai Zheng
Debargha Ganguly
Xuecen Zhang
Vikash Singh
Vipin Chaudhary
Zhao Zhang
62
0
0
26 Sep 2025
SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips
Xinyu Lian
Masahiro Tanaka
Olatunji Ruwase
Minjia Zhang
68
2
0
25 Sep 2025
RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation
Chao Yu
Y. Wang
Zhen Guo
Hao Lin
Si Xu
...
Boxun Li
Jianlei Yang
Z. Yang
Guohao Dai
Yu Wang
AI4CE
80
2
0
19 Sep 2025
MLP-Offload: Multi-Level, Multi-Path Offloading for LLM Pre-training to Break the GPU Memory Wall
Avinash Maurya
M. Rafique
Franck Cappello
Bogdan Nicolae
128
1
0
02 Sep 2025
DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction
Weilin Cai
Le Qin
Shwai He
Junwei Cui
Ang Li
Jiayi Huang
MoE
100
0
0
25 Aug 2025
Condition Weaving Meets Expert Modulation: Towards Universal and Controllable Image Generation
Guoqing Zhang
Xingtong Ge
Lu Shi
Xin Zhang
Muqing Xue
Wanru Xu
Yigang Cen
J. Zhang
DiffM
134
0
0
24 Aug 2025
LLaDA-MedV: Exploring Large Language Diffusion Models for Biomedical Image Understanding
Xuanzhao Dong
Wenhui Zhu
Xiwen Chen
Zhipeng Wang
Peijie Qiu
Shao Tang
Xin Li
Yalin Wang
MedIm, VLM
149
3
0
03 Aug 2025
Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures
International Symposium on Service Oriented Software Engineering (ISSOSE), 2025
Yashasvi Makin
Rahul Maliakkal
109
0
0
28 Jul 2025
STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning
Zixiao Huang
Junhao Hu
Hao Lin
Chunyang Zhu
Yueran Tang
...
Zhenhua Li
Shengen Yan
Zhenhua Zhu
Guohao Dai
Yu Wang
145
2
0
22 Jul 2025
The Serial Scaling Hypothesis
Yuxi Liu
Konpat Preechakul
Kananart Kuwaranancharoen
Yutong Bai
LRM
172
3
0
16 Jul 2025
Xiangqi-R1: Enhancing Spatial Strategic Reasoning in LLMs for Chinese Chess via Reinforcement Learning
Yuhao Chen
Shuochen Liu
Yuanjie Lyu
Chao Zhang
Jiayao Shi
Tong Xu
LRM
85
1
0
16 Jul 2025
Symbiosis: Multi-Adapter Inference and Fine-Tuning
Saransh Gupta
Umesh Deshpande
Travis Janssen
Swami Sundararaman
MoE
285
0
0
03 Jul 2025
RewardAnything: Generalizable Principle-Following Reward Models
Zhuohao Yu
Jiali Zeng
Weizheng Gu
Yidong Wang
Jindong Wang
Fandong Meng
Jie Zhou
Yue Zhang
Shikun Zhang
Wei Ye
LRM
355
10
0
04 Jun 2025
InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing
Shuaiyi Li
Zhisong Zhang
Yang Deng
Chenlong Deng
Tianqing Fang
Hongming Zhang
Haitao Mi
Dong Yu
Wai Lam
KELM
192
0
0
28 May 2025
Exploiting Block Coordinate Descent for Cost-Effective LLM Model Training
Zeyu Liu
Yunquan Zhang
Boyang Zhang
Guoyong Jiang
Xin Zhang
L. Xiao
Weifeng Zhang
Daning Cheng
208
0
0
23 May 2025
ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates
Tingfeng Lan
Yusen Wu
Bin Ma
Zhaoyuan Su
Rui Yang
Tekin Bicer
Masahiro Tanaka
Olatunji Ruwase
Dong Li
Yue Cheng
484
3
0
18 May 2025
Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Yufei Xiang
Yiqun Shen
Yeqin Zhang
Cam-Tu Nguyen
OffRL, LLMAG, KELM, LRM
451
4
0
17 May 2025
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production
Cheng Jin
Ziheng Jiang
Zhihao Bai
Zheng Zhong
Jing Liu
...
Yanghua Peng
Xuanzhe Liu
Xin Jin
Xin Liu
MoE
350
6
0
16 May 2025
FloE: On-the-Fly MoE Inference on Memory-constrained GPU
Yuxin Zhou
Zheng Li
Junxuan Zhang
Jue Wang
Yanjie Wang
Zhongle Xie
Ke Chen
Lidan Shou
MoE
383
3
0
09 May 2025
Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics
Cong Xu
Wenbin Liang
Mo Yu
Anan Liu
Jianchao Tan
Lizhuang Ma
Jiangming Wang
Jun Wang
Weinan Zhang
Wei Zhang
MQ
280
0
0
01 May 2025
SYMI: Efficient Mixture-of-Experts Training via Model and Optimizer State Decoupling
Athinagoras Skiadopoulos
Mark Zhao
Swapnil Gandhi
Thomas Norrie
Shrijeet Mukherjee
Christos Kozyrakis
MoE
281
0
0
28 Apr 2025
Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization
Pritam Sarkar
Ali Etemad
278
2
0
16 Apr 2025
Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models
Karan Jain
Mohammad Nayeem Teli
MedIm
146
0
0
15 Apr 2025
Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling
Design Automation Conference (DAC), 2025
Chaojian Li
Zhifan Ye
Massimiliano Lupo Pasini
Jong Youl Choi
Cheng Wan
Y. Lin
Dali Wang
227
2
0
10 Apr 2025
MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage
IEEE International Conference on Data Engineering (ICDE), 2025
Yongjun He
R. Waleffe
Zhichao Han
Johnu George
Binhang Yuan
...
Yinan Shan
Yang Zhao
Debojyoti Dutta
Theodoros Rekatsinas
Ce Zhang
AIFin
225
0
0
02 Apr 2025
Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training
Yijie Zheng
Bangjun Xiao
Lei Shi
Xiaoyang Li
Faming Wu
Tianyu Li
Xuefeng Xiao
Yanzhe Zhang
Longji Xu
Shouda Liu
MLLM, MoE
360
2
0
31 Mar 2025
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
European Conference on Computer Systems (EuroSys), 2025
Zhanda Zhu
Christina Giannoula
Muralidhar Andoorveedu
Qidong Su
Karttikeya Mangalam
Bojian Zheng
Gennady Pekhimenko
VLM, MoE
208
5
0
24 Mar 2025
Mixture of Lookup Experts
Shibo Jie
Yehui Tang
Kai Han
Yongqian Li
Duyu Tang
Zhi-Hong Deng
Yunhe Wang
MoE
374
3
0
20 Mar 2025
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
Shitong Shao
Hongwei Yi
Hanzhong Guo
Tian Ye
Daquan Zhou
Michael Lingelbach
Zhiqiang Xu
Bo Han
VGen
321
0
0
17 Mar 2025
Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
341
5
0
06 Mar 2025
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization
Xinyi Wan
Penghui Qi
Guangxing Huang
Jialin Li
161
3
0
03 Mar 2025
Progressive Sparse Attention: Algorithm and System Co-design for Efficient Attention in LLM Serving
Qihui Zhou
Peiqi Yin
Pengfei Zuo
James Cheng
CLL
236
3
0
01 Mar 2025
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2025
Zhiyuan Fang
Yuegui Huang
Zicong Hong
Yufeng Lyu
Wuhui Chen
Yue Yu
Fan Yu
Zibin Zheng
MoE
180
6
0
09 Feb 2025
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
Jingang Qu
David Holzmüller
Gaël Varoquaux
Marine Le Morvan
LMTD
854
58
0
08 Feb 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Information Fusion (Inf. Fusion), 2023
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Xiaoshi Zhong
LM&MA, AILaw
598
256
0
28 Jan 2025
A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science
Kaiyuan Tian
Linbo Qiao
Baihui Liu
Gongqingjian Jiang
Shanshan Li
Dongsheng Li
311
0
0
21 Jan 2025
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
International Conference on Learning Representations (ICLR), 2024
Junyu Chen
Han Cai
Junsong Chen
Enze Xie
Shang Yang
Haotian Tang
Zhekai Zhang
Yaojie Lu
Song Han
DiffM
432
14
0
20 Jan 2025
Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
Zhuohao Yu
Weizheng Gu
Yidong Wang
Xingru Jiang
Zhengran Zeng
Jindong Wang
Wei Ye
Shikun Zhang
LRM
400
5
0
19 Dec 2024
Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud
Himel Ghosh
145
2
0
23 Nov 2024
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
1.2K
10
0
20 Nov 2024
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference
Peng Tang
Jiacheng Liu
X. Hou
Yifei Pu
Jing Wang
Pheng-Ann Heng
Chong Li
Minyi Guo
MoE
292
25
0
03 Nov 2024