2303.17651
Cited By
v1
v2 (latest)
Self-Refine: Iterative Refinement with Self-Feedback
Neural Information Processing Systems (NeurIPS), 2023
30 March 2023
Aman Madaan
Niket Tandon
Prakhar Gupta
Skyler Hallinan
Luyu Gao
Sarah Wiegreffe
Uri Alon
Nouha Dziri
Shrimai Prabhumoye
Yiming Yang
Shashank Gupta
Bodhisattwa Prasad Majumder
Katherine Hermann
Sean Welleck
Amir Yazdanbakhsh
Peter Clark
ReLM
LRM
DiffM
HuggingFace (2 upvotes)
Papers citing "Self-Refine: Iterative Refinement with Self-Feedback"
50 / 1,676 papers shown
Aligning Large Language Models with Procedural Rules: An Autoregressive State-Tracking Prompting for In-Game Trading
Minkyung Kim
J. Kim
Woongcheol Yang
Sangdon Park
Sohee Bae
104
0
0
28 Oct 2025
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
Bin Wang
Y. Zhong
MiDi Wan
W. Yu
YuanBing Ouyang
Y. Huang
Hui Li
SILM
AAML
207
1
0
27 Oct 2025
Language Server CLI Empowers Language Agents with Process Rewards
Yifan Zhang
Lanser Contributors
73
0
0
27 Oct 2025
Deductive Chain-of-Thought Augmented Socially-aware Robot Navigation World Model
Weizheng Wang
Obi Ike
Soyun Choi
Sungeun Hong
Byung-Cheol Min
LM&Ro
LRM
195
0
0
27 Oct 2025
Scalable Supervising Software Agents with Patch Reasoner
Junjielong Xu
Boyin Tan
Xiaoyuan Liu
Chao Peng
Pengfei Gao
Pinjia He
ALM
LRM
153
0
0
26 Oct 2025
Accelerating Materials Design via LLM-Guided Evolutionary Search
Nikhil Abhyankar
Sanchit Kabra
Saaketh Desai
Chandan K. Reddy
109
0
0
26 Oct 2025
Scalable Oversight via Partitioned Human Supervision
Ren Yin
Takashi Ishida
Masashi Sugiyama
165
0
0
26 Oct 2025
Hollywood Town: Long-Video Generation via Cross-Modal Multi-Agent Orchestration
Zheng Wei
Mingchen Li
Zeqian Zhang
Ruibin Yuan
Pan Hui
Huamin Qu
James Evans
Maneesh Agrawala
Anyi Rao
VGen
112
2
0
25 Oct 2025
Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning
Ravindra Aribowo Tarunokusumo
Rafael Fernandes Cunha
OffRL
ReLM
LRM
142
0
0
24 Oct 2025
FLAMES: Fine-tuning LLMs to Synthesize Invariants for Smart Contract Security
Mojtaba Eshghie
Gabriele Morello
Matteo Lauretano
Alexandre Bartel
Martin Monperrus
123
1
0
24 Oct 2025
Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection
Jack Butler
Nikita Kozodoi
Zainab Afolabi
Brian Tyacke
Gaiar Baimuratov
105
0
0
23 Oct 2025
AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
Journal of Systems and Software (JSS), 2025
Qinghua Lu
Dehai Zhao
Yue Liu
Hao Zhang
Liming Zhu
Xiwei Xu
Angela Shi
Tristan Tan
Rick Kazman
110
0
0
23 Oct 2025
Automated Cloud Infrastructure-as-Code Reconciliation with AI Agents
Zhenning Yang
Hui Guan
Victor Nicolet
Brandon Paulsen
Joey Dodds
Daniel Kroening
Ang Chen
120
0
0
23 Oct 2025
Code-enabled language models can outperform reasoning models on diverse tasks
Cedegao E. Zhang
Cédric Colas
Gabriel Poesia
Joshua B. Tenenbaum
Jacob Andreas
ReLM
ALM
LRM
AI4CE
189
0
0
23 Oct 2025
Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication
Yiming Lu
Xun Wang
Simin Ma
Shujian Liu
Sathish Indurthi
Song Wang
Haoyun Deng
Fei Liu
Kaiqiang Song
117
0
0
22 Oct 2025
Learning Affordances at Inference-Time for Vision-Language-Action Models
Ameesh Shah
William Chen
Adwait Godbole
Federico Mora
Sanjit A. Seshia
Sergey Levine
132
0
0
22 Oct 2025
WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
Chunyang Li
Yilun Zheng
Xinting Huang
Tianqing Fang
Jiahao Xu
Yangqiu Song
L. Chen
Han Hu
ELM
118
0
0
21 Oct 2025
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng
Mian Deng
Chenjing Liang
Zeming Gao
Chennan Ma
Chenxing Lin
Haipeng Zhang
Songzhu Mei
Cheng-Yu Wang
Siqi Shen
153
0
0
21 Oct 2025
Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
Guangfu Guo
Xiaoqian Lu
Yue Feng
LRM
180
1
0
21 Oct 2025
Chain-of-Conceptual-Thought Elicits Daily Conversation in Large Language Models
Qingqing Gu
Dan Wang
Yue Zhao
Xiaoyu Wang
Zhonglin Jiang
Yong Chen
Hongyan Li
Luo Ji
ReLM
LRM
292
0
0
21 Oct 2025
Illusions of reflection: open-ended task reveals systematic failures in Large Language Models' reflective reasoning
Sion Weatherhead
Flora D. Salim
Aaron Belbasis
ReLM
LRM
ELM
195
0
0
21 Oct 2025
SOCIA-Nabla: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation
Yuncheng Hua
Sion Weatherhead
Mehdi Jafari
Hao Xue
Flora D. Salim
139
0
0
21 Oct 2025
Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization
Osama Al Haddad
Muhammad Ikram
Ejaz Ahmed
Young Lee
234
0
0
21 Oct 2025
Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents
Yihong Tang
Kehai Chen
Liang Yue
Jinxin Fan
Caishen Zhou
...
Kaiyang Guo
Xingshan Zeng
Wenjing Cun
L. Shang
Min Zhang
LLMAG
161
0
0
20 Oct 2025
Deep Self-Evolving Reasoning
Zihan Liu
Shun Zheng
Xumeng Wen
Yang Wang
Jiang Bian
Mao Yang
ReLM
LRM
171
1
0
20 Oct 2025
Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs
Paula Cordero-Encinar
Andrew Duncan
LRM
200
1
0
20 Oct 2025
StreamingThinker: Large Language Models Can Think While Reading
Junlong Tong
Yingqi Fan
Anhao Zhao
Yunpu Ma
Xiaoyu Shen
RALM
LRM
346
2
0
20 Oct 2025
An Agentic Framework with LLMs for Solving Complex Vehicle Routing Problems
Ni Zhang
Zhiguang Cao
Jianan Zhou
Cong Zhang
Yew-Soon Ong
109
0
0
19 Oct 2025
Before you <think>, monitor: Implementing Flavell's metacognitive framework in LLMs
Nick Oh
LRM
138
0
0
18 Oct 2025
LANPO: Bootstrapping Language and Numerical Feedback for Reinforcement Learning in LLMs
Ang Li
Yifei Wang
Zhihang Yuan
Stefanie Jegelka
Y. X. R. Wang
ALM
KELM
178
0
0
18 Oct 2025
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
Guiyao Tie
Zenghui Yuan
Zeli Zhao
Chaoran Hu
Tianhe Gu
...
Ming Jin
Qingsong Wen
Lixing Chen
P. Zhou
Lichao Sun
KELM
ReLM
LRM
260
1
0
17 Oct 2025
VISTA: A Test-Time Self-Improving Video Generation Agent
Do Xuan Long
Xingchen Wan
Hootan Nakhost
Chen-Yu Lee
Tomas Pfister
Sercan Ö. Arık
VGen
TTA
252
3
0
17 Oct 2025
Stable but Miscalibrated: A Kantian View on Overconfidence from Filters to Large Language Models
Akira Okutomi
LRM
209
0
0
16 Oct 2025
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Guinan Su
Yanwu Yang
Li Shen
Lu Yin
Shiwei Liu
Jonas Geiping
MoE
KELM
195
2
0
16 Oct 2025
LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search
Shivam Singhal
Eran Malach
T. Poggio
Tomer Galanti
96
1
0
16 Oct 2025
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
Andrew Zhao
Reshmi Ghosh
Vitor Carvalho
Emily Lawton
Keegan Hines
Gao Huang
Jack W. Stokes
AAML
SILM
248
1
0
16 Oct 2025
Training LLM Agents to Empower Humans
Evan Ellis
Vivek Myers
Jens Tuyls
Sergey Levine
Anca Dragan
Benjamin Eysenbach
194
0
0
15 Oct 2025
Generative Universal Verifier as Multimodal Meta-Reasoner
Xinchen Zhang
X. Zhang
Youbin Wu
Yanbin Cao
Renrui Zhang
Ruihang Chu
Ling Yang
Yujiu Yang
LRM
181
4
0
15 Oct 2025
Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval
Yingchen Zhang
Ruqing Zhang
Jiafeng Guo
W. Peng
Sen Li
Fuyu Lv
LRM
197
0
0
15 Oct 2025
Optimal Aggregation of LLM and PRM Signals for Efficient Test-Time Scaling
Peng Kuang
Yanli Wang
Xiaoyu Han
Yaowenqi Liu
Kaidi Xu
Haohan Wang
82
0
0
15 Oct 2025
BoN Appetit Team at LeWiDi-2025: Best-of-N Test-time Scaling Can Not Stomach Annotation Disagreements (Yet)
Tomas Ruiz
Siyao Peng
Barbara Plank
Carsten Schwemmer
100
1
0
14 Oct 2025
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Zhongwei Yu
Wannian Xia
Xue Yan
Bo Xu
Haifeng Zhang
Yali Du
Ning Yang
ReLM
LRM
107
1
0
14 Oct 2025
Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models
Jung-Woo Shim
Yeong-Joon Ju
Ji-Hoon Park
Seong-Whan Lee
LRM
111
0
0
14 Oct 2025
LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization
Yuanchen Wu
Saurabh Verma
Justin Lee
Fangzhou Xiong
Poppy Zhang
Amel Awadelkarim
Xu Chen
Yubai Yuan
Shawndra Hill
118
1
0
14 Oct 2025
FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks
Sabrina McCallum
Amit Parekh
Alessandro Suglia
LM&Ro
117
0
0
13 Oct 2025
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
A. Zebaze
Rachel Bawden
Benoît Sagot
LRM
143
1
0
13 Oct 2025
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding
Yuhang Li
Chenchen Zhang
Ruilin Lv
Ao Liu
K. Deng
Yuanxing Zhang
Jiaheng Liu
Wiggin Zhou
B. Zhou
LRM
115
3
0
13 Oct 2025
KnowRL: Teaching Language Models to Know What They Know
Sahil Kale
Devendra Singh Dhami
KELM
112
0
0
13 Oct 2025
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
Zaixi Zhang
Souradip Chakraborty
Amrit Singh Bedi
Emilin Mathew
Varsha Saravanan
...
Eric Xing
R. Altman
George Church
M. Y. Wang
Mengdi Wang
SILM
438
1
0
13 Oct 2025
PrediQL: Automated Testing of GraphQL APIs with LLMs
Shaolun Liu
Sina Marefat
Omar Tsai
Yu Chen
Zecheng Deng
Jia Wang
Mohammad A. Tayebi
117
0
0
12 Oct 2025
Page 3 of 34