272
v1v2 (latest)

PromptWise: Online Learning for Cost-Aware Prompt Assignment in Generative Models

Main:11 Pages
29 Figures
Bibliography:5 Pages
2 Tables
Appendix:23 Pages
Abstract

The rapid advancement of generative AI has provided users with a wide range of well-trained models to address diverse prompts. When selecting a model for a given prompt, users should weigh not only its performance but also its service cost. However, existing model-selection methods typically emphasize performance while overlooking cost differences. In this paper, we introduce PromptWise, an online learning framework that assigns prompts to generative models in a cost-aware manner. PromptWise estimates prompt-model compatibility to select the least expensive model expected to deliver satisfactory outputs. Unlike standard contextual bandits that make a one-shot decision per prompt, PromptWise employs a cost-aware bandit structure that allows sequential model assignments per prompt to reduce total service cost. Through numerical experiments on tasks such as code generation and translation, we demonstrate that PromptWise can achieve performance comparable to baseline selection methods while incurring substantially lower costs. The code is available at:this http URL.

View on arXiv
Comments on this paper