vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training
arXiv:2312.12391 · 27 November 2023
Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, Minsoo Rhu
Papers citing "vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training" (9 papers):
1. Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
   Luca Moroni, Giovanni Puccetti, Pere-Lluís Huguet Cabot, Andrei Stefan Bejgu, Edoardo Barba, Alessio Miaschi, F. Dell'Orletta, Andrea Esuli, Roberto Navigli — 23 Apr 2025
2. Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
   Mingyu Liang, Hiwot Tadese Kassa, Wenyin Fu, Brian Coutinho, Louis Feng, Christina Delimitrou — 12 Apr 2025
3. Maya: Optimizing Deep Learning Training Workloads using Emulated Virtual Accelerators
   Srihas Yarlagadda, A. Agrawal, Elton Pinto, Hakesh Darapaneni, Mitali Meratwal, Shivam Mittal, Pranavi Bajjuri, S., Alexey Tumanov — 26 Mar 2025
4. RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving
   Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu — 18 Mar 2025
5. LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
   Jaehong Cho, Minsu Kim, Hyunmin Choi, Guseul Heo, Jongse Park — 10 Aug 2024
6. Towards a Flexible and High-Fidelity Approach to Distributed DNN Training Emulation
   Banruo Liu, M. Ojewale, Yuhan Ding, Marco Canini — 05 May 2024
7. Training language models to follow instructions with human feedback
   Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe — 04 Mar 2022
8. ZeRO-Offload: Democratizing Billion-Scale Model Training
   Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyang Yang, Minjia Zhang, Dong Li, Yuxiong He — 18 Jan 2021
9. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
   M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro — 17 Sep 2019