2303.11366
Cited By
v4 (latest)
Reflexion: Language Agents with Verbal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
20 March 2023
Noah Shinn
Federico Cassano
Beck Labash
A. Gopinath
Karthik Narasimhan
Shunyu Yao
LLMAG
KELM
Papers citing "Reflexion: Language Agents with Verbal Reinforcement Learning"
50 / 1,269 papers shown
Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning
Jialu Du
Guiyang Hou
Yihui Fu
Chen Wu
Wenqi Zhang
Yongliang Shen
Weiming Lu
LLMAG
LRM
175
0
0
09 Oct 2025
xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning
Cheng Qian
Zuxin Liu
Shirley Kokane
Akshara Prabhakar
Jielin Qiu
...
Weiran Yao
Shelby Heinecke
Silvio Savarese
Caiming Xiong
Huan Wang
172
1
0
09 Oct 2025
FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline
Haotian Wu
Shufan Jiang
Chios Chen
Yiyang Feng
Hehai Lin
Heqing Zou
Yao Shu
Y. Li
AI4CE
268
0
0
08 Oct 2025
CompassLLM: A Multi-Agent Approach toward Geo-Spatial Reasoning for Popular Path Query
Md. Nazmul Islam Ananto
Shamit Fatin
Mohammed Eunus Ali
Md. Rizwan Parvez
LLMAG
LRM
115
0
0
08 Oct 2025
SanDRA: Safe Large-Language-Model-Based Decision Making for Automated Vehicles Using Reachability Analysis
Yuanfei Lin
Sebastian Illing
Matthias Althoff
181
1
0
08 Oct 2025
Cross-Modal Attention Guided Unlearning in Vision-Language Models
Karuna Bhaila
Aneesh Komanduri
Minh-Hao Van
Xintao Wu
MU
185
0
0
08 Oct 2025
BG-FlipIn: A Bayesian game framework for FlipIt-insider models in advanced persistent threats
Yang Jiao
Guanpu Chen
Yiguang Hong
AAML
103
0
0
08 Oct 2025
Adaptive Tool Generation with Models as Tools and Reinforcement Learning
Chenpeng Wang
Xiaojie Cheng
Chunye Wang
L. Yang
Lei Zhang
LRM
121
0
0
08 Oct 2025
Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation
Mufei Li
Dongqi Fu
Limei Wang
Si Zhang
Hanqing Zeng
...
Xiaoxin He
Xavier Bresson
Yinglong Xia
Chonglin Sun
Pan Li
226
0
0
08 Oct 2025
ProSEA: Problem Solving via Exploration Agents
William Nguyen
Vinh Luong
Christopher Nguyen
LLMAG
144
0
0
08 Oct 2025
Non-Stationary Online Structured Prediction with Surrogate Losses
Shinsaku Sakaue
Han Bao
Yuzhou Cao
131
2
0
08 Oct 2025
Mission Impossible: Feedback-Guided Dynamic Interactive Planning for Improving Reasoning on LLMs
Dong Yan
Gaochen Wu
Bowen Zhou
LLMAG
LRM
AI4CE
119
0
0
07 Oct 2025
Beyond Grid-Locked Voxels: Neural Response Functions for Continuous Brain Encoding
Haomiao Chen
K. Jamison
M. Sabuncu
Amy Kuceyeski
150
1
0
07 Oct 2025
Limited-Angle Tomography Reconstruction via Projector Guided 3D Diffusion
Zhantao Deng
Mériem Er-Rafik
Anna Sushko
C. Hébert
Pascal Fua
DiffM
MedIm
114
0
0
07 Oct 2025
A Survey on Agentic Security: Applications, Threats and Defenses
Asif Shahriar
M. Rahman
Sadif Ahmed
Farig Sadeque
Md Rizwan Parvez
146
3
0
07 Oct 2025
Learning to Crawl: Latent Model-Based Reinforcement Learning for Soft Robotic Adaptive Locomotion
Vaughn Gzenda
Robin Chhabra
120
0
0
07 Oct 2025
Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context
Yoav Gur-Arieh
Mor Geva
Atticus Geiger
KELM
150
3
0
07 Oct 2025
RareAgent: Self-Evolving Reasoning for Drug Repurposing in Rare Diseases
Lang Qin
Zijian Gan
Xu Cao
Pengcheng Jiang
Yankai Jiang
Jiawei Han
Kaishun Wu
Jintai Chen
LRM
172
0
0
07 Oct 2025
ARM: Discovering Agentic Reasoning Modules for Generalizable Multi-Agent Systems
Bohan Yao
Shiva Krishna Reddy Malay
Vikas Yadav
LM&Ro
LRM
152
0
0
07 Oct 2025
Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails
Siwei Han
Jiaqi Liu
Yaofeng Su
Wenbo Duan
Xinyuan Liu
Cihang Xie
Mohit Bansal
Mingyu Ding
Linjun Zhang
Huaxiu Yao
134
1
0
06 Oct 2025
TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
Pengfei He
Zhenwei Dai
Bing He
Hui Liu
Xianfeng Tang
...
Subhabrata Mukherjee
Suhang Wang
Yue Xing
Shucheng Zhou
Benoit Dumoulin
LLMAG
183
1
0
06 Oct 2025
LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning
Haoqiang Kang
Y. Zhang
Nikki Lijing Kuang
Nicklas Majamaki
Navdeep Jaitly
Yi-An Ma
Lianhui Qin
LRM
589
2
0
06 Oct 2025
AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems
Shambhavi Mishra
Gaurav Sahu
M. Pedersoli
Laurent Charlin
Jose Dolz
Christopher Pal
LRM
99
0
0
06 Oct 2025
ViTs: Teaching Machines to See Time Series Anomalies Like Human Experts
Zexin Wang
Changhua Pei
Yang Liu
Hengyue Jiang
Quan Zhou
...
Hang Cui
Jianhui Li
Gaogang Xie
Jingjing Li
Dan Pei
AI4TS
VLM
142
0
0
06 Oct 2025
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Qizheng Zhang
Changran Hu
Shubhangi Upasani
Boyuan Ma
Fenglu Hong
...
Mengmeng Ji
Hanchen Li
Urmish Thakker
James Zou
Kunle Olukotun
LLMAG
KELM
222
24
0
06 Oct 2025
Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization
Mohammad Mahdi Samiei Paqaleh
Arash Marioriyad
Arman Tahmasebi-Zadeh
Mohamadreza Fereydooni
Mahdi Ghaznavai
Mahdieh Soleymani Baghshah
120
0
0
06 Oct 2025
A global log for medical AI
Ayush Noori
Adam Rodman
Alan Karthikesalingam
Bilal A. Mateen
Christopher A. Longhurst
...
Noa Dagan
David Clifton
Ran D. Balicer
I. Kohane
Marinka Zitnik
172
0
0
05 Oct 2025
NegotiationGym: Self-Optimizing Agents in a Multi-Agent Social Simulation Environment
Shashank Mangla
Chris Hokamp
Jack Boylan
D. Ghalandari
Yuuv Jauhari
Lauren Cassidy
Oisin Duffy
93
2
0
05 Oct 2025
Zephyrus: An Agentic Framework for Weather Science
Sumanth Varambally
Marshall Fisher
Jas Thakker
Yiwei Chen
Zhirui Xia
...
Salva Rühling Cachay
Taylor Berg-Kirkpatrick
Duncan Watson-Parris
Yi-An Ma
Rose Yu
LLMAG
120
2
0
05 Oct 2025
Large Language Models Hallucination: A Comprehensive Survey
Aisha Alansari
Hamzah Luqman
HILM
LRM
461
1
0
05 Oct 2025
Utility-Learning Tension in Self-Modifying Agents
Charles L. Wang
Keir Dorchen
Peter Jin
129
0
0
05 Oct 2025
Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation
Hadi Nekoei
Aman Jaiswal
Patrice Béchard
Oleh Shliazhko
Orlando Marquez Ayala
Mathieu Reymond
Massimo Caccia
Alexandre Drouin
Sarath Chandar
Alexandre Lacoste
KELM
128
1
0
05 Oct 2025
SPOGW: a Score-based Preference Optimization method via Group-Wise comparison for workflows
Yitong Cui
Liu Liu
B. Yu
Jiayan Qiu
Xikai Zhang
Likang Xiao
Y. Liu
Quan Chen
159
0
0
05 Oct 2025
Adversarial Agent Collaboration for C to Rust Translation
Tianyu Li
Ruishi Li
Bo Wang
Brandon Paulsen
Umang Mathur
Prateek Saxena
154
2
0
04 Oct 2025
Towards Policy-Compliant Agents: Learning Efficient Guardrails For Policy Violation Detection
Xiaofei Wen
Wenjie Mo
Yanan Xie
Peng Qi
Muhao Chen
AAML
156
0
0
03 Oct 2025
Self-Reflective Generation at Test Time
Jian Mu
Qixin Zhang
Zhiyong Wang
Menglin Yang
Shuang Qiu
Chengwei Qin
Zhongxiang Dai
Yao Shu
LRM
144
1
0
03 Oct 2025
FOR-Prompting: From Objection to Revision via an Asymmetric Prompting Protocol
He Zhang
Anzhou Zhang
Jian Dai
LRM
69
0
0
02 Oct 2025
An Algorithmic Information-Theoretic Perspective on the Symbol Grounding Problem
Zhangchi Liu
106
0
0
02 Oct 2025
Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
Qiyuan Liu
Hao Xu
Xuhong Chen
Wei Chen
Yee Whye Teh
Ning Miao
ReLM
LRM
AI4CE
278
0
0
02 Oct 2025
ReSeek: A Self-Correcting Framework for Search Agents with Instructive Rewards
Shiyu Li
Yang Tang
Yifan Wang
P. Li
Xi Chen
KELM
LRM
176
1
0
01 Oct 2025
A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining
Sipeng Zhang
Longfei Yun
Zilong Wang
Jingbo Shang
Letian Peng
146
0
0
01 Oct 2025
MEMTRACK: Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments
Darshan Deshpande
Varun Gangal
Hersh Mehta
Anand Kannappan
Rebecca Qian
Peng Wang
97
0
0
01 Oct 2025
Fine-tuning with RAG for Improving LLM Learning of New Skills
Humaid Ibrahim
Nikolai Rozanov
Marek Rei
RALM
100
0
0
01 Oct 2025
Rethinking Thinking Tokens: LLMs as Improvement Operators
Lovish Madaan
Aniket Didolkar
Suchin Gururangan
John Quan
Ruan Silva
Ruslan Salakhutdinov
Manzil Zaheer
Sanjeev Arora
Anirudh Goyal
ReLM
LRM
191
1
1
01 Oct 2025
Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning
Elija Perrier
LRM
108
1
0
01 Oct 2025
MAVUL: Multi-Agent Vulnerability Detection via Contextual Reasoning and Interactive Refinement
Youpeng Li
Kartik Joshi
Xinda Wang
Eric Wong
125
1
0
30 Sep 2025
GRPO-λ: Credit Assignment improves LLM Reasoning
Prasanna Parthasarathi
Mathieu Reymond
Boxing Chen
Yufei Cui
Sarath Chandar
LRM
175
3
1
30 Sep 2025
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs
Sirou Zhu
Yanbin Jiang
Hejian Sang
Shao Tang
Qingquan Song
Biao He
Rohit Jain
Zhipeng Wang
Alborz Geramifard
137
0
0
30 Sep 2025
LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science
Alireza Salemi
Mihir Parmar
Palash Goyal
Yiwen Song
Jinsung Yoon
Hamed Zamani
Hamid Palangi
Tomas Pfister
LLMAG
136
4
0
30 Sep 2025
Interactive Learning for LLM Reasoning
Hehai Lin
Shilei Cao
Minzhi Li
Sudong Wang
Haotian Wu
Linyi Yang
Lixian Zhang
Chengwei Qin
LLMAG
LRM
280
1
0
30 Sep 2025