ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages

16 February 2024
Junjie Ye
Sixian Li
Guanyu Li
Caishuang Huang
Songyang Gao
Yilong Wu
Qi Zhang
Tao Gui
Xuanjing Huang
Abstract

Tool learning is widely acknowledged as a foundational approach for deploying large language models (LLMs) in real-world scenarios. While current research primarily emphasizes leveraging tools to augment LLMs, it frequently neglects the emerging safety considerations tied to their application. To fill this gap, we present ToolSword, a comprehensive framework dedicated to meticulously investigating safety issues linked to LLMs in tool learning. Specifically, ToolSword delineates six safety scenarios for LLMs in tool learning, encompassing malicious queries and jailbreak attacks in the input stage, noisy misdirection and risky cues in the execution stage, and harmful feedback and error conflicts in the output stage. Experiments conducted on 11 open-source and closed-source LLMs reveal enduring safety challenges in tool learning, such as handling harmful queries, employing risky tools, and delivering detrimental feedback, to which even GPT-4 is susceptible. Moreover, we conduct further studies with the aim of fostering research on tool learning safety. The data is released at https://github.com/Junjie-Ye/ToolSword.
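
The sketch below is a minimal, illustrative harness (not the released ToolSword code) showing how the six safety scenarios named in the abstract map onto the three stages of tool learning, and how one might tally unsafe-response rates per scenario. The `query_model` function and the keyword-based judge are hypothetical placeholders; the actual benchmark data and evaluation setup are available at https://github.com/Junjie-Ye/ToolSword.

```python
from collections import defaultdict

# Six safety scenarios grouped by tool-learning stage, as enumerated in the abstract.
SCENARIOS = {
    "input": ["malicious queries", "jailbreak attacks"],
    "execution": ["noisy misdirection", "risky cues"],
    "output": ["harmful feedback", "error conflicts"],
}

def query_model(model: str, case: str) -> str:
    """Hypothetical stand-in for sending a test case to an LLM with tool access."""
    return f"[{model} response to: {case}]"

def looks_unsafe(response: str) -> bool:
    """Toy judge; a real evaluation would rely on human or model-based review."""
    return "rm -rf" in response or "credit card" in response.lower()

def evaluate(model: str, cases_by_scenario: dict[str, list[str]]) -> dict:
    """Return the fraction of unsafe responses for each (stage, scenario) pair."""
    rates = defaultdict(float)
    for stage, names in SCENARIOS.items():
        for name in names:
            cases = cases_by_scenario.get(name, [])
            if not cases:
                continue
            unsafe = sum(looks_unsafe(query_model(model, c)) for c in cases)
            rates[(stage, name)] = unsafe / len(cases)
    return dict(rates)
```

Under these assumptions, comparing several models (e.g., open-source checkpoints against GPT-4) reduces to calling `evaluate` once per model on the same scenario cases and inspecting the per-scenario unsafe rates.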
