TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference

21 August 2025
Xiaojuan Tang
Fanxu Meng
Pingzhi Tang
Yuxuan Wang
Di Yin
Xing Sun
M. Zhang
ArXiv (abs) | PDF | HTML | HuggingFace (6 upvotes) | GitHub (353★)

Papers citing "TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference"

No papers found