Speed and Conversational Large Language Models: Not All Is About Tokens per Second

23 February 2025

Papers citing "Speed and Conversational Large Language Models: Not All Is About Tokens per Second"

Tempo: Application-aware LLM Serving with Mixed SLO Requirements. Wei Zhang, Zhiyu Wu, Yi Mu, Banruo Liu, Myungjin Lee, Fan Lai. 24 Apr 2025.
Spanish and LLM Benchmarks: is MMLU Lost in Translation? Irene Plaza, Nina Melero, Cristina del Pozo, Javier Conde, Pedro Reviriego, Marina Mayor-Rocher, María Grandury. 28 May 2024.