Agentic Reasoning for Large Language Models

Tianxin Wei
Ting-Wei Li
Zhining Liu
Xuying Ning
Ze Yang
Jiaru Zou
Zhichen Zeng
Ruizhong Qiu
Xiao Lin
Dongqi Fu
Zihao Li
Mengting Ai
Duo Zhou
Wenxuan Bao
Yunzhe Li
Gaotang Li
Cheng Qian
Yu Wang
Xiangru Tang
Yin Xiao
Liri Fang
Hui Liu
Xianfeng Tang
Yuji Zhang
Chi Wang
Jiaxuan You
Heng Ji
Hanghang Tong
Jingrui He
Main: 73 pages, 12 figures; Bibliography: 62 pages, 6 tables
Abstract

Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making. While large language models (LLMs) demonstrate strong reasoning capabilities in closed-world settings, they struggle in open-ended and dynamic environments. Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction. In this survey, we organize agentic reasoning along three complementary dimensions. First, we characterize environmental dynamics through three layers: foundational agentic reasoning, which establishes core single-agent capabilities including planning, tool use, and search in stable environments; self-evolving agentic reasoning, which studies how agents refine these capabilities through feedback, memory, and adaptation; and collective multi-agent reasoning, which extends intelligence to collaborative settings involving coordination, knowledge sharing, and shared goals. Across these layers, we distinguish in-context reasoning, which scales test-time interaction through structured orchestration, from post-training reasoning, which optimizes behaviors via reinforcement learning and supervised fine-tuning. We further review representative agentic reasoning frameworks across real-world applications and benchmarks, including science, robotics, healthcare, autonomous research, and mathematics. This survey synthesizes agentic reasoning methods into a unified roadmap bridging thought and action, and outlines open challenges and future directions, including personalization, long-horizon interaction, world modeling, scalable multi-agent training, and governance for real-world deployment.
