arXiv:2508.15881
TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference
21 August 2025
Xiaojuan Tang
Fanxu Meng
Pingzhi Tang
Yuxuan Wang
Di Yin
Xing Sun
M. Zhang
Papers citing "TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference"
No papers found