Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels
Papers citing "Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels"