2303.17651
Self-Refine: Iterative Refinement with Self-Feedback
Neural Information Processing Systems (NeurIPS), 2023
30 March 2023
Aman Madaan
Niket Tandon
Prakhar Gupta
Skyler Hallinan
Luyu Gao
Sarah Wiegreffe
Uri Alon
Nouha Dziri
Shrimai Prabhumoye
Yiming Yang
Shashank Gupta
Bodhisattwa Prasad Majumder
Katherine Hermann
Sean Welleck
Amir Yazdanbakhsh
Peter Clark
ReLM
LRM
DiffM
Papers citing
"Self-Refine: Iterative Refinement with Self-Feedback"
50 / 1,154 papers shown
Title
Code-enabled language models can outperform reasoning models on diverse tasks
Cedegao E. Zhang
Cédric Colas
Gabriel Poesia
Joshua B. Tenenbaum
Jacob Andreas
ReLM
ALM
LRM
AI4CE
156
0
0
23 Oct 2025
Automated Cloud Infrastructure-as-Code Reconciliation with AI Agents
Zhenning Yang
Hui Guan
Victor Nicolet
Brandon Paulsen
Joey Dodds
Daniel Kroening
Ang Chen
88
0
0
23 Oct 2025
AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
Journal of Systems and Software (JSS), 2025
Qinghua Lu
Dehai Zhao
Yue Liu
Hao Zhang
Liming Zhu
Xiwei Xu
Angela Shi
Tristan Tan
Rick Kazman
88
0
0
23 Oct 2025
Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection
Jack Butler
Nikita Kozodoi
Zainab Afolabi
Brian Tyacke
Gaiar Baimuratov
85
0
0
23 Oct 2025
Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication
Yiming Lu
Xun Wang
Simin Ma
Shujian Liu
Sathish Indurthi
Song Wang
Haoyun Deng
Fei Liu
Kaiqiang Song
80
0
0
22 Oct 2025
Learning Affordances at Inference-Time for Vision-Language-Action Models
Ameesh Shah
William Chen
Adwait Godbole
Federico Mora
Sanjit A. Seshia
Sergey Levine
104
0
0
22 Oct 2025
Illusions of reflection: open-ended task reveals systematic failures in Large Language Models' reflective reasoning
Sion Weatherhead
Flora D. Salim
Aaron Belbasis
ReLM
LRM
ELM
178
0
0
21 Oct 2025
Chain-of-Conceptual-Thought Elicits Daily Conversation in Large Language Models
Qingqing Gu
Dan Wang
Yue Zhao
Xiaoyu Wang
Zhonglin Jiang
Yong Chen
Hongyan Li
Luo Ji
ReLM
LRM
233
0
0
21 Oct 2025
SOCIA-Nabla: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation
Yuncheng Hua
Sion Weatherhead
Mehdi Jafari
Hao Xue
Flora D. Salim
96
0
0
21 Oct 2025
Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
Guangfu Guo
Xiaoqian Lu
Yue Feng
LRM
140
0
0
21 Oct 2025
Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization
Osama Al Haddad
Muhammad Ikram
Ejaz Ahmed
Young Lee
192
0
0
21 Oct 2025
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng
Mian Deng
Chenjing Liang
Zeming Gao
Chennan Ma
Chenxing Lin
Haipeng Zhang
Songzhu Mei
Cheng-Yu Wang
Siqi Shen
137
0
0
21 Oct 2025
WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
Chunyang Li
Yilun Zheng
Xinting Huang
Tianqing Fang
Jiahao Xu
Yangqiu Song
L. Chen
Han Hu
ELM
104
0
0
21 Oct 2025
StreamingThinker: Large Language Models Can Think While Reading
Junlong Tong
Yingqi Fan
Anhao Zhao
Yunpu Ma
Xiaoyu Shen
RALM
LRM
259
1
0
20 Oct 2025
Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs
Paula Cordero-Encinar
Andrew Duncan
LRM
173
1
0
20 Oct 2025
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
Yihong Tang
Kehai Chen
Liang Yue
Jinxin Fan
Caishen Zhou
...
Kaiyang Guo
Xingshan Zeng
Wenjing Cun
L. Shang
Min Zhang
LLMAG
138
0
0
20 Oct 2025
Deep Self-Evolving Reasoning
Zihan Liu
Shun Zheng
Xumeng Wen
Yang Wang
Jiang Bian
Mao Yang
ReLM
LRM
111
1
0
20 Oct 2025
An Agentic Framework with LLMs for Solving Complex Vehicle Routing Problems
Ni Zhang
Zhiguang Cao
Jianan Zhou
Cong Zhang
Yew-Soon Ong
76
0
0
19 Oct 2025
LANPO: Bootstrapping Language and Numerical Feedback for Reinforcement Learning in LLMs
Ang Li
Yifei Wang
Zhihang Yuan
Stefanie Jegelka
Y. X. R. Wang
ALM
KELM
162
0
0
18 Oct 2025
Before you <think>, monitor: Implementing Flavell's metacognitive framework in LLMs
Nick Oh
LRM
79
0
0
18 Oct 2025
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
Guiyao Tie
Zenghui Yuan
Zeli Zhao
Chaoran Hu
Tianhe Gu
...
Ming Jin
Qingsong Wen
Lixing Chen
P. Zhou
Lichao Sun
KELM
ReLM
LRM
233
1
0
17 Oct 2025
VISTA: A Test-Time Self-Improving Video Generation Agent
Do Xuan Long
Xingchen Wan
Hootan Nakhost
Chen-Yu Lee
Tomas Pfister
Sercan Ö. Arık
VGen
TTA
190
2
0
17 Oct 2025
LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search
Shivam Singhal
Eran Malach
T. Poggio
Tomer Galanti
72
0
0
16 Oct 2025
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Guinan Su
Yanwu Yang
Li Shen
Lu Yin
Shiwei Liu
Jonas Geiping
MoE
KELM
156
1
0
16 Oct 2025
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Andrew Zhao
Reshmi Ghosh
Vitor Carvalho
Emily Lawton
Keegan Hines
Gao Huang
Jack W. Stokes
AAML
SILM
154
1
0
16 Oct 2025
Stable but Miscalibrated: A Kantian View on Overconfidence from Filters to Large Language Models
Akira Okutomi
LRM
159
0
0
16 Oct 2025
Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval
Yingchen Zhang
Ruqing Zhang
Jiafeng Guo
W. Peng
Sen Li
Fuyu Lv
LRM
148
0
0
15 Oct 2025
Training LLM Agents to Empower Humans
Evan Ellis
Vivek Myers
Jens Tuyls
Sergey Levine
Anca Dragan
Benjamin Eysenbach
158
0
0
15 Oct 2025
Generative Universal Verifier as Multimodal Meta-Reasoner
Xinchen Zhang
X. Zhang
Youbin Wu
Yanbin Cao
Renrui Zhang
Ruihang Chu
Ling Yang
Yujiu Yang
LRM
136
1
0
15 Oct 2025
Optimal Aggregation of LLM and PRM Signals for Efficient Test-Time Scaling
Peng Kuang
Yanli Wang
Xiaoyu Han
Yaowenqi Liu
Kaidi Xu
Haohan Wang
52
0
0
15 Oct 2025
BoN Appetit Team at LeWiDi-2025: Best-of-N Test-time Scaling Can Not Stomach Annotation Disagreements (Yet)
Tomas Ruiz
Siyao Peng
Barbara Plank
Carsten Schwemmer
64
1
0
14 Oct 2025
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Zhongwei Yu
Wannian Xia
Xue Yan
Bo Xu
Haifeng Zhang
Yali Du
Ning Yang
ReLM
LRM
89
0
0
14 Oct 2025
LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization
Yuanchen Wu
Saurabh Verma
Justin Lee
Fangzhou Xiong
Poppy Zhang
Amel Awadelkarim
Xu Chen
Yubai Yuan
Shawndra Hill
85
0
0
14 Oct 2025
Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models
Jung-Woo Shim
Yeong-Joon Ju
Ji-Hoon Park
Seong-Whan Lee
LRM
80
0
0
14 Oct 2025
KnowRL: Teaching Language Models to Know What They Know
Sahil Kale
Devendra Singh Dhami
KELM
92
0
0
13 Oct 2025
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding
Yuhang Li
Chenchen Zhang
Ruilin Lv
Ao Liu
K. Deng
Yuanxing Zhang
Jiaheng Liu
Wiggin Zhou
B. Zhou
LRM
71
3
0
13 Oct 2025
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
A. Zebaze
Rachel Bawden
Benoît Sagot
LRM
112
1
0
13 Oct 2025
FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks
Sabrina McCallum
Amit Parekh
Alessandro Suglia
LM&Ro
116
0
0
13 Oct 2025
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Zaixi Zhang
Souradip Chakraborty
Amrit Singh Bedi
Emilin Mathew
Varsha Saravanan
...
Eric Xing
R. Altman
George Church
M. Y. Wang
Mengdi Wang
SILM
355
0
0
13 Oct 2025
Tracing the Traces: Latent Temporal Signals for Efficient and Accurate Reasoning
Martina G. Vilas
Safoora Yousefi
Besmira Nushi
Eric Horvitz
Vidhisha Balachandran
LRM
92
0
0
12 Oct 2025
Towards Self-Refinement of Vision-Language Models with Triangular Consistency
Yunlong Deng
Guangyi Chen
Tianpei Gu
Lingjing Kong
Yan Li
Zeyu Tang
Kun Zhang
136
1
0
12 Oct 2025
PrediQL: Automated Testing of GraphQL APIs with LLMs
Shaolun Liu
Sina Marefat
Omar Tsai
Yu Chen
Zecheng Deng
Jia Wang
Mohammad A. Tayebi
93
0
0
12 Oct 2025
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
Hongwei Chen
Yishu Lei
Dan Zhang
Bo Ke
Danxiang Zhu
...
Shikun Feng
Jingzhou He
Yu Sun
Hua Wu
Haifeng Wang
ReLM
LRM
104
0
0
11 Oct 2025
Failure-Driven Workflow Refinement
Jusheng Zhang
Kaitong Cai
Qinglin Zeng
Ningyuan Liu
Stephen Fan
Ziliang Chen
Keze Wang
92
7
0
11 Oct 2025
MedAgentAudit: Diagnosing and Quantifying Collaborative Failure Modes in Medical Multi-Agent Systems
Lei Gu
Yinghao Zhu
Haoran Sang
Zixiang Wang
Dehao Sui
Wen Tang
Ewen M. Harrison
Junyi Gao
Lequan Yu
Liantao Ma
93
1
0
11 Oct 2025
Mitigating Hallucination in Multimodal Reasoning via Functional Attention Control
H. Lu
Bolun Chu
Weiye Fu
Guoshun Nan
Junning Liu
Minghui Pan
Qiankun Li
Yi Yu
Hua Wang
Kun Wang
LRM
96
0
0
11 Oct 2025
Answer-Consistent Chain-of-thought Reinforcement Learning For Multi-modal Large Langauge Models
Minbin Huang
Runhui Huang
Chuanyang Zheng
Jingyao Li
Guoxuan Chen
Han Shi
Hong Cheng
KELM
LRM
76
0
0
11 Oct 2025
MEC$^3$O: Multi-Expert Consensus for Code Time Complexity Prediction
Joonghyuk Hahn
Soohan Lim
Yo-Sub Han
96
0
0
10 Oct 2025
Automated Refinement of Essay Scoring Rubrics for Language Models via Reflect-and-Revise
Keno Harada
Lui Yoshida
Takeshi Kojima
Yusuke Iwasawa
Yutaka Matsuo
98
0
0
10 Oct 2025
Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics
Lianhao Zhou
Hongyi Ling
Cong Fu
Yepeng Huang
Michael Sun
...
X. Qian
Heng Ji
Wei Wang
Marinka Zitnik
Shuiwang Ji
LLMAG
LM&Ro
AI4CE
156
3
0
10 Oct 2025