2105.13120
Sequence Parallelism: Long Sequence Training from System Perspective
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
26 May 2021
Shenggui Li
Fuzhao Xue
Chaitanya Baranwal
Yongbin Li
Yang You
Papers citing
"Sequence Parallelism: Long Sequence Training from System Perspective"
50 / 74 papers shown
RELIC: Interactive Video World Model with Long-Horizon Memory
Yicong Hong
Yiqun Mei
Chongjian Ge
Yiran Xu
Yang Zhou
...
Eli Shechtman
Kalyan Sunkavalli
Feng Liu
Z. Li
Hao Tan
VGen
VLM
306
2
0
03 Dec 2025
PipeDiT: Accelerating Diffusion Transformers in Video Generation with Task Pipelining and Model Decoupling
S. Wang
Qiang Wang
Shaohuai Shi
VGen
134
0
0
15 Nov 2025
In-Context Learning with Unpaired Clips for Instruction-based Video Editing
Xinyao Liao
Xianfang Zeng
Ziye Song
Zhoujie Fu
Gang Yu
Guosheng Lin
131
5
0
16 Oct 2025
TASP: Topology-aware Sequence Parallelism
Y. Wang
Ke Hong
Xiuhong Li
Yuanchao Xu
Wenxun Wang
Guohao Dai
Y. Wang
165
0
0
30 Sep 2025
A Scalable Distributed Framework for Multimodal GigaVoxel Image Registration
Rohit Jena
Vedant Zope
Pratik Chaudhari
James C. Gee
FedML
131
0
0
29 Sep 2025
RollPacker: Mitigating Long-Tail Rollouts for Fast, Synchronous RL Post-Training
Wei Gao
Yuheng Zhao
Dakai An
Tianyuan Wu
Lunxi Cao
...
Yuchi Xu
Jiamang Wang
Lin Qu
B. Zheng
Wei Wang
OffRL
VLM
208
11
0
25 Sep 2025
Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference
Ruokai Yin
Sattwik Deb Mishra
Xuan Zuo
Hokchhay Tann
Preyas Shah
Apala Guha
OffRL
89
0
0
29 Aug 2025
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference
Xiaojuan Tang
Fanxu Meng
Pingzhi Tang
Yuxuan Wang
Di Yin
Xing Sun
M. Zhang
197
0
0
21 Aug 2025
Modality Agnostic Efficient Long Range Encoder
T. Parag
Ahmed Elgammal
158
0
0
25 Jul 2025
Accelerating Parallel Diffusion Model Serving with Residual Compression
Jiajun Luo
Yicheng Xiao
Jianru Xu
Yangxiu You
Rongwei Lu
Chen Tang
Jingyan Jiang
Zhi Wang
245
0
0
23 Jul 2025
ContentV: Efficient Training of Video Generation Models with Limited Compute
Wenfeng Lin
Renjie Chen
Boyuan Liu
Shiyue Yan
Ruoyu Feng
...
Chao Feng
Jiao Ran
Qi Wu
Zuotao Liu
Mingyu Guo
VGen
442
3
0
05 Jun 2025
Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks
Tao Yang
Ruibin Li
Yangming Shi
Yuqi Zhang
Qide Dong
Haoran Cheng
Weiguo Feng
Shilei Wen
Bingyue Peng
Lei Zhang
DiffM
VGen
264
0
0
02 Jun 2025
100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Wang Yang
Hongye Jin
Shaochen Zhong
Song Jiang
Qifan Wang
Vipin Chaudhary
Xiaotian Han
ELM
210
1
0
25 May 2025
FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding
Zhibin Wang
Rui Ning
Chao Fang
Zhonghui Zhang
Xi Lin
...
Rong Gu
Kun Yang
Guihai Chen
Sheng Zhong
Chen Tian
224
6
0
23 May 2025
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
Wang Yang
Zirui Liu
Hongye Jin
Qingyu Yin
Vipin Chaudhary
Xiaotian Han
ReLM
LRM
265
3
0
22 May 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang
Yikang Shen
Kaiyue Wen
Shawn Tan
Mayank Mishra
Liliang Ren
Rameswar Panda
Yoon Kim
878
12
0
22 May 2025
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production
Cheng Jin
Ziheng Jiang
Zhihao Bai
Zheng Zhong
Jing Liu
...
Yanghua Peng
Xuanzhe Liu
Xin Jin
Xin Liu
MoE
411
7
0
16 May 2025
Small Clips, Big Gains: Learning Long-Range Refocused Temporal Information for Video Super-Resolution
Xingyu Zhou
Wei Long
Jingbo Lu
Shiyin Jiang
Weiyi You
Haifeng Wu
Shuhang Gu
269
0
0
04 May 2025
SlimPipe: Memory-Thrifty and Efficient Pipeline Parallelism for Long-Context LLM Training
Zheng Li
Wenshu Fan
Wei Zhang
Tailing Yuan
Bin Chen
Chengru Song
Chen Zhang
212
3
0
20 Apr 2025
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
Mingyu Liang
Hiwot Tadese Kassa
Wenyin Fu
Brian Coutinho
Louis Feng
Christina Delimitrou
252
3
0
12 Apr 2025
Jupiter: Fast and Resource-Efficient Collaborative Inference of Generative LLMs on Edge Devices
IEEE Conference on Computer Communications (IEEE INFOCOM), 2025
Shengyuan Ye
Bei Ouyang
Liekang Zeng
Tianyi Qian
Xiaowen Chu
Jian Tang
Xu Chen
370
12
0
11 Apr 2025
Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training
Yijie Zheng
Bangjun Xiao
Lei Shi
Xiaoyang Li
Faming Wu
Tianyu Li
Xuefeng Xiao
Yanzhe Zhang
Longji Xu
Shouda Liu
MLLM
MoE
408
2
0
31 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Yixiao Liu
DiffM
VGen
386
0
0
21 Mar 2025
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism
Venmugil Elango
426
1
0
20 Mar 2025
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiao Wang
Qingyi Si
Yue Yu
Shiyu Zhu
Zheng Lin
Liqiang Nie
VLM
421
29
0
16 Mar 2025
Seesaw: High-throughput LLM Inference via Model Re-sharding
Qidong Su
Wei Zhao
Xuelong Li
Muralidhar Andoorveedu
Chenhao Jiang
Zhanda Zhu
Kevin Song
Christina Giannoula
Gennady Pekhimenko
LRM
366
7
0
09 Mar 2025
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yuxiang Huang
Mingye Li
Xu Han
Chaojun Xiao
Weilin Zhao
Sun Ao
Hao Zhou
Jie Zhou
Zhiyuan Liu
Maosong Sun
386
2
0
17 Feb 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Yue Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
LRM
ELM
437
21
0
25 Jan 2025
A Survey on Memory-Efficient Transformer-Based Model Training in AI for Science
Kaiyuan Tian
Linbo Qiao
Baihui Liu
Gongqingjian Jiang
Shanshan Li
Dongsheng Li
375
1
0
21 Jan 2025
TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication
Zongwu Wang
Fangxin Liu
Mingshuai Li
Li Jiang
LRM
310
1
0
29 Dec 2024
FlexSP: Accelerating Large Language Model Training via Flexible Sequence Parallelism
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
Yijiao Wang
Shiju Wang
Shenhan Zhu
Fangcheng Fu
Xinyi Liu
Xuefeng Xiao
Huixia Li
Jiashi Li
Faming Wu
Tengjiao Wang
391
0
0
02 Dec 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang
Qian Liu
Chao Du
Tongyao Zhu
Cunxiao Du
Kenji Kawaguchi
Tianyu Pang
469
10
0
20 Nov 2024
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
1.2K
11
0
20 Nov 2024
Context Parallelism for Scalable Million-Token Inference
Amy Yang
Jingyi Yang
Aya Ibrahim
Xinfeng Xie
Bangsheng Tang
Grigory Sizov
Jeremy Reizenstein
Jongsoo Park
Jianyu Huang
MoE
LRM
473
20
0
04 Nov 2024
Reducing the Cost of Dropout in Flash-Attention by Hiding RNG with GEMM
Haiyue Ma
Jian Liu
Ronny Krashinsky
226
0
0
10 Oct 2024
FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding
European Conference on Artificial Intelligence (ECAI), 2024
Jingyang Deng
Zhengyang Shen
Boyang Wang
Lixin Su
Suqi Cheng
Ying Nie
Junfeng Wang
D. Yin
Jinwen Ma
166
3
0
09 Oct 2024
How to Train Long-Context Language Models (Effectively)
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
665
92
0
03 Oct 2024
No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha
A. Agrawal
Haoran Qiu
Junda Chen
Íñigo Goiri
Chaojie Zhang
Rayyan Shahid
Ramachandran Ramjee
Alexey Tumanov
Esha Choukse
RALM
LRM
560
0
0
25 Sep 2024
PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference
Zeyu Zhang
Haiying Shen
VLM
349
1
0
23 Sep 2024
Achieving Peak Performance for Large Language Models: A Systematic Review
IEEE Access (IEEE Access), 2024
Z. R. K. Rostam
Sándor Szénási
Gábor Kertész
321
18
0
07 Sep 2024
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
Neural Information Processing Systems (NeurIPS), 2024
R. Prabhakar
Hengrui Zhang
D. Wentzlaff
294
1
0
14 Aug 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Yang Liu
369
32
0
29 Jul 2024
MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training
Cheng Luo
Jiawei Zhao
Zhuoming Chen
Beidi Chen
A. Anandkumar
265
5
0
22 Jul 2024
TorchGT: A Holistic System for Large-scale Graph Transformer Training
Mengdie Zhang
Jie Sun
Qi Hu
Yang Liu
Zeke Wang
Yonggang Wen
Tianwei Zhang
GNN
236
6
0
19 Jul 2024
Scaling Granite Code Models to 128K Context
Matt Stallone
Vaibhav Saxena
Leonid Karlinsky
Bridget McGinn
Tim Bula
...
Rogerio Feris
Nirmit Desai
David D. Cox
Ruchir Puri
Yikang Shen
266
8
0
18 Jul 2024
Inference Optimization of Foundation Models on AI Accelerators
Youngsuk Park
Kailash Budhathoki
Liangfu Chen
Jonas M. Kübler
Jiaji Huang
Matthäus Kleindessner
Jun Huan
Volkan Cevher
Yida Wang
George Karypis
313
14
0
12 Jul 2024
WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem
Ziming Liu
Shaoyu Wang
Shenggan Cheng
Zhongkai Zhao
Xuanlei Zhao
James Demmel
Yang You
219
1
0
30 Jun 2024
A Survey on Mixture of Experts in Large Language Models
Weilin Cai
Juyong Jiang
Fan Wang
Jing Tang
Sunghun Kim
Jiayi Huang
MoE
477
70
0
26 Jun 2024
Long Context Transfer from Language to Vision
Peiyuan Zhang
Kaichen Zhang
Bo Li
Guangtao Zeng
Jingkang Yang
Yuanhan Zhang
Ziyue Wang
Haoran Tan
Chunyuan Li
Ziwei Liu
VLM
315
349
0
24 Jun 2024
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
Ruoyu Qin
Zheming Li
Weiran He
Mingxing Zhang
Yongwei Wu
Weimin Zheng
Xinran Xu
702
120
0
24 Jun 2024