Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?

18 June 2024

Papers citing "Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?"

44 / 44 papers shown

VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications

...

197

30 Sep 2025

DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models

368

02 Apr 2025

InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context

Bryan L. M. de Oliveira

869

17 Feb 2025

Learning Evolving Tools for Large Language Models

International Conference on Learning Representations (ICLR), 2024

639

09 Oct 2024

Qwen2 Technical Report

Bowen Yu

...

Yuqiong Liu

Zeyu Cui

Zhenru Zhang

Zhifang Guo

Zhi-Wei Fan

OSLM VLM MU

650

1,712

15 Jul 2024

BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs

Zuozhu Liu

371

14 Jul 2024

WTU-EVAL: A Whether-or-Not Tool Usage Evaluation Benchmark for Large Language Models

Yuanzhe Zhang

Kang Liu

Jinan Xu

ELM LLMAG

183

02 Jul 2024

Tools Fail: Detecting Silent Errors in Faulty Tools

305

27 Jun 2024

CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models

Tong Zhang

273

20 May 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Ahmed Hassan Awadallah

...

Yue Zhang

640

1,927

22 Apr 2024

Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity

377

345

21 Mar 2024

What Are Tools Anyway? A Survey from the Language Model Perspective

Zhiruo Wang

Zhoujun Cheng

Hao Zhu

Daniel Fried

Graham Neubig

329

18 Mar 2024

StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Peng Li

Zhiyuan Liu

Maosong Sun

Yang Liu

ELM

456

12 Mar 2024

Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

Yu Gu

Yiheng Shu

Hao Yu

Xiao Liu

Yuxiao Dong

327

22 Feb 2024

ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages

Xuanjing Huang

416

16 Feb 2024

Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios

Jiahui Gao

...

Yasheng Wang

Lifeng Shang

Xin Jiang

Ruifeng Xu

Qun Liu

LLMAG

294

30 Jan 2024

RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Xuanjing Huang

280

16 Jan 2024

EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Xu Tan

Dongsheng Li

257

116

11 Jan 2024

Mistral 7B

Albert Q. Jiang

Alexandre Sablayrolles

A. Mensch

Chris Bamford

Devendra Singh Chaplot

...

396

3,000

10 Oct 2023

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

International Conference on Learning Representations (ICLR), 2023

Zhihong Shao

Yujiu Yang

437

263

29 Sep 2023

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

International Conference on Learning Representations (ICLR), 2023

...

Jie Zhou

Mark B. Gerstein

Dahai Li

Zhiyuan Liu

Maosong Sun

CLL ALM LLMAG ELM LM&MA

603

1,139

31 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models

Louis Martin

...

Sharan Narang

Sergey Edunov

8.7K

15,388

18 Jul 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

Neural Information Processing Systems (NeurIPS), 2023

...

3.2K

6,725

09 Jun 2023

Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions

249

257

04 Jun 2023

Large Language Models as Tool Makers

International Conference on Learning Representations (ICLR), 2023

Tianle Cai

288

262

26 May 2023

MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Sadao Kurohashi

291

26 May 2023

On the Tool Manipulation Capability of Open-source Large Language Models

281

25 May 2023

Gorilla: Large Language Model Connected with Massive APIs

Neural Information Processing Systems (NeurIPS), 2023

Tianjun Zhang

400

892

24 May 2023

Selectively Answering Ambiguous Questions

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Jeremy R. Cole

Michael J.Q. Zhang

D. Gillick

Julian Martin Eisenschlos

Bhuwan Dhingra

Jacob Eisenstein

UQLM

493

24 May 2023

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

International Conference on Learning Representations (ICLR), 2023

Zhihong Shao

Yujiu Yang

405

604

19 May 2023

ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings

Neural Information Processing Systems (NeurIPS), 2023

614

241

19 May 2023

API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Feifan Song

Zhoujun Li

Fei Huang

Yongbin Li

ELM RALM CLL

308

302

14 Apr 2023

GPT-4 Technical Report

...

4.7K

21,366

15 Mar 2023

Toolformer: Language Models Can Teach Themselves to Use Tools

Neural Information Processing Systems (NeurIPS), 2023

Luke Zettlemoyer

472

2,744

09 Feb 2023

Large Language Models are Better Reasoners with Self-Verification

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Jun Zhao

502

325

19 Dec 2022

ReAct: Synergizing Reasoning and Acting in Language Models

International Conference on Learning Representations (ICLR), 2022

Dian Yu

2.6K

5,491

06 Oct 2022

Decomposed Prompting: A Modular Approach for Solving Complex Tasks

International Conference on Learning Representations (ICLR), 2022

510

597

05 Oct 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.4K

14,735

28 Jan 2022

SimCSE: Simple Contrastive Learning of Sentence Embeddings

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

829

4,055

18 Apr 2021

Selective Question Answering under Domain Shift

Amita Kamath

Robin Jia

Abigail Z. Jacobs

OOD

255

246

16 Jun 2020

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.0K

53,198

28 May 2020

AmbigQA: Answering Ambiguous Open-domain Questions

Sewon Min

Julian Michael

Hannaneh Hajishirzi

Luke Zettlemoyer

420

403

22 Apr 2020

"None of the Above": Measure Uncertainty in Dialog Response Retrieval

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

215

04 Apr 2020

Reading Wikipedia to Answer Open-Domain Questions

Jason Weston

398

2,150

31 Mar 2017