Advancing and Benchmarking Personalized Tool Invocation for LLMs

Abstract

Tool invocation is a crucial mechanism for extending the capabilities of Large Language Models (LLMs) and has recently garnered significant attention. It enables LLMs to solve complex problems through tool calls while accessing up-to-date world knowledge. However, existing work primarily focuses on the fundamental ability of LLMs to invoke tools for problem-solving, without considering personalized constraints in tool invocation. In this work, we introduce the concept of Personalized Tool Invocation and define two key tasks: Tool Preference and Profile-dependent Query. Tool Preference addresses user preferences when selecting among functionally similar tools, while Profile-dependent Query considers cases where a user query lacks certain tool parameters, requiring the model to infer them from the user profile. To tackle these challenges, we propose PTool, a data synthesis framework designed for personalized tool invocation. Additionally, we construct PTBench, the first benchmark for evaluating personalized tool invocation. We then fine-tune various open-source models, demonstrating the effectiveness of our framework and providing valuable insights. Our benchmark is public at this https URL.
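
To make the two tasks concrete, here is a minimal Python sketch. It is not the PTool framework or the PTBench format: the `Tool` and `UserProfile` classes, the `select_tool` and `fill_parameters` helpers, and all tool names, platforms, and profile fields are hypothetical and chosen only for illustration. In the paper's setting an LLM makes these decisions; the rule-based logic here only shows what a correct outcome could look like for each task.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    platform: str                     # hypothetical provider label
    category: str                     # same category = functionally similar (assumption)
    required_params: tuple[str, ...]


@dataclass
class UserProfile:
    # category -> preferred platform (hypothetical profile layout)
    preferred_platforms: dict[str, str] = field(default_factory=dict)
    # free-form user attributes that may supply missing tool parameters
    attributes: dict[str, str] = field(default_factory=dict)


def select_tool(tools: list[Tool], category: str, profile: UserProfile) -> Tool:
    """Tool Preference: among functionally similar tools, pick the preferred one."""
    candidates = [t for t in tools if t.category == category]
    if not candidates:
        raise ValueError(f"no tool available for category {category!r}")
    preferred = profile.preferred_platforms.get(category)
    for tool in candidates:
        if tool.platform == preferred:
            return tool
    return candidates[0]  # no stated preference: fall back to the first candidate


def fill_parameters(
    tool: Tool, query_params: dict[str, str], profile: UserProfile
) -> tuple[dict[str, str], list[str]]:
    """Profile-dependent Query: take missing parameters from the user profile."""
    args: dict[str, str] = {}
    missing: list[str] = []
    for param in tool.required_params:
        if param in query_params:          # stated explicitly in the query
            args[param] = query_params[param]
        elif param in profile.attributes:  # inferred from the profile
            args[param] = profile.attributes[param]
        else:                              # cannot be resolved from either source
            missing.append(param)
    return args, missing


tools = [
    Tool("book_ride_a", "RideA", "ride_hailing", ("pickup", "destination")),
    Tool("book_ride_b", "RideB", "ride_hailing", ("pickup", "destination")),
]
profile = UserProfile(
    preferred_platforms={"ride_hailing": "RideB"},
    attributes={"pickup": "221B Baker Street"},
)

# The query "get me a ride to the airport" names a destination but no pickup point.
tool = select_tool(tools, "ride_hailing", profile)
args, missing = fill_parameters(tool, {"destination": "airport"}, profile)
print(tool.name, args, missing)
```

In this sketch the preferred platform decides between the two equivalent ride tools, and the pickup location is taken from the profile because the query omits it. Parameters that neither source provides are returned in `missing` rather than guessed.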

@article{huang2025_2505.04072,
  title={Advancing and Benchmarking Personalized Tool Invocation for LLMs},
  author={Xu Huang and Yuefeng Huang and Weiwen Liu and Xingshan Zeng and Yasheng Wang and Ruiming Tang and Hong Xie and Defu Lian},
  journal={arXiv preprint arXiv:2505.04072},
  year={2025}
}