ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2412.18106
  4. Cited By
Tackling the Dynamicity in a Production LLM Serving System with SOTA
  Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient
  Meta-kernels

Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels

24 December 2024
Mingcong Song
Xinru Tang
Fengfan Hou
Jing Li
Wei Wei
Yipeng Ma
Runqiu Xiao
Hongjie Si
D. Jiang
Shouyi Yin
Yang Hu
Guoping Long
ArXivPDFHTML

Papers citing "Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels"

Title
No papers