2401.17167
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
30 January 2024
Shijue Huang
Wanjun Zhong
Jianqiao Lu
Qi Zhu
Jiahui Gao
Weiwen Liu
Yutai Hou
Xingshan Zeng
Yasheng Wang
Lifeng Shang
Xin Jiang
Ruifeng Xu
Qun Liu
LLMAG
Papers citing "Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios"
5 / 5 papers shown
Title
TRAIL: Trace Reasoning and Agentic Issue Localization
Darshan Deshpande
Varun Gangal
Hersh Mehta
Jitin Krishnan
Anand Kannappan
Rebecca Qian
16
0
0
13 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
0
0
26 Apr 2025
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Peijie Yu
Yifan Yang
J. Li
Zelong Zhang
Haorui Wang
Xiao Feng
Feng Zhang
LLMAG
103
0
0
03 Apr 2025
Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?
Seungbin Yang
chaeHun Park
Taehee Kim
Jaegul Choo
44
2
0
18 Jun 2024
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
46
21
0
22 Apr 2024