
Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners

1 June 2024

Papers citing "Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners"

LAGEA: Language Guided Embodied Agents for Robotic Manipulation Abdul Monaf Chowdhury Akm Moshiur Rahman Mazumder Rabeya Akter S. Arib LM&Ro 80 0 0 27 Sep 2025
Generalizability of Large Language Model-Based Agents: A Comprehensive Survey Minxing Zhang Yi Yang Roy Xie Bhuwan Dhingra Shuyan Zhou Jian Pei LLMAG LM&Ro AI4CE 150 1 0 19 Sep 2025
Survey of GenAI for Automotive Software Development: From Requirements to Executable Code Nenad Petrovic Vahid Zolfaghari André Schamschurko Sven Kirchner Fengjunjie Pan ... Krzysztof Lebioda Yinglei Song Yi Zhang Lukasz Mazur Alois Knoll 92 1 0 20 Jul 2025
Identifying Uncertainty in Self-Adaptive Robotics with Large Language Models C. Gomes Jalil Boudjadar Mirgita Frasheri Shaukat Ali Peter Gorm Larsen 166 2 0 29 Apr 2025
Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey Xiaoou Liu Tiejin Chen Longchao Da Chacha Chen Zhen Lin Hua Wei HILM 412 35 0 20 Mar 2025
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions ACM Computing Surveys (ACM CSUR), 2024 Ola Shorinwa Zhiting Mei Justin Lidard Allen Z. Ren Anirudha Majumdar HILM LRM 319 55 0 07 Dec 2024
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation International Conference on Learning Representations (ICLR), 2024 Jiafei Duan Wilbert Pumacay Nishanth Kumar Yi Ru Wang Shulin Tian Wentao Yuan Ranjay Krishna Dieter Fox Ajay Mandlekar Yijie Guo VLM LRM 228 71 0 01 Oct 2024
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning IEEE International Conference on Robotics and Automation (ICRA), 2024 Yinpei Dai Jayjun Lee Nima Fazeli Joyce Chai 145 27 0 23 Sep 2024
DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation Qian Feng David S. Martinez Lema M. Malmir Hang Li Jianxiang Feng Zhaopeng Chen Alois C. Knoll 237 11 0 24 Jul 2024
Self-Evaluation Improves Selective Generation in Large Language Models Jie Jessie Ren Yao-Min Zhao Tu Vu Peter J. Liu Balaji Lakshminarayanan ELM 269 59 0 14 Dec 2023
Foundation Models in Robotics: Applications, Challenges, and the Future Roya Firoozi Johnathan Tucker Stephen Tian Anirudha Majumdar Jiankai Sun ... Brian Ichter Danny Driess Jiajun Wu Cewu Lu Mac Schwager LM&Ro AI4CE LRM VLM 208 258 0 13 Dec 2023
Topology-Matching Normalizing Flows for Out-of-Distribution Detection in Robot Learning Conference on Robot Learning (CoRL), 2023 Jianxiang Feng Jongseok Lee Simon Geisler Stephan Günnemann Rudolph Triebel OODD 195 7 0 11 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions Lei Huang Weijiang Yu Weitao Ma Weihong Zhong Zhangyin Feng ... Qianglong Chen Weihua Peng Xiaocheng Feng Bing Qin Ting Liu LRM HILM 306 1,755 0 09 Nov 2023
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models Conference on Robot Learning (CoRL), 2023 Wenlong Huang Chen Wang Ruohan Zhang Yunzhu Li Jiajun Wu Li Fei-Fei LM&Ro 291 712 0 12 Jul 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation Neeraj Varshney Wenlin Yao Hongming Zhang Jianshu Chen Dong Yu HILM 334 218 0 08 Jul 2023
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners Conference on Robot Learning (CoRL), 2023 Allen Z. Ren Anushri Dixit Alexandra Bodrova Sumeet Singh Stephen Tu ... Jacob Varley Zhenjia Xu Dorsa Sadigh Andy Zeng Anirudha Majumdar LM&Ro 417 295 0 04 Jul 2023
DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023 Yanjiang Guo Yen-Jen Wang Lihan Zha Zheyuan Jiang Jianyu Chen LM&Ro 367 58 0 01 Jul 2023
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction Conference on Robot Learning (CoRL), 2023 Zeyi Liu Arpit Bahety Shuran Song LRM 352 180 0 27 Jun 2023
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs International Conference on Learning Representations (ICLR), 2023 Miao Xiong Zhiyuan Hu Xinyang Lu Yifei Li Jie Fu Junxian He Bryan Hooi 396 651 0 22 Jun 2023
AdaPlanner: Adaptive Planning from Feedback with Language Models Neural Information Processing Systems (NeurIPS), 2023 Haotian Sun Yuchen Zhuang Lingkai Kong Bo Dai Chao Zhang LLMAG 150 175 0 26 May 2023
Visual Instruction Tuning Neural Information Processing Systems (NeurIPS), 2023 Haotian Liu Chunyuan Li Qingyang Wu Yong Jae Lee SyDa VLM MLLM 862 7,083 0 17 Apr 2023
Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting Marta Skreta Naruki Yoshikawa Sebastian Arellano-Rubach Zhi Ji L. B. Kristensen Kourosh Darvish Alán Aspuru-Guzik Florian Shkurti Animesh Garg 196 65 0 24 Mar 2023
Reflexion: Language Agents with Verbal Reinforcement Learning Neural Information Processing Systems (NeurIPS), 2023 Noah Shinn Federico Cassano Beck Labash A. Gopinath Karthik Narasimhan Shunyu Yao LLMAG KELM 473 2,105 0 20 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection European Conference on Computer Vision (ECCV), 2023 Shilong Liu Zhaoyang Zeng Tianhe Ren Feng Li Hao Zhang ... Chun-yue Li Jianwei Yang Hang Su Jun Zhu Lei Zhang ObjD 606 3,093 0 09 Mar 2023
PaLM-E: An Embodied Multimodal Language Model International Conference on Machine Learning (ICML), 2023 Danny Driess F. Xia Mehdi S. M. Sajjadi Corey Lynch Aakanksha Chowdhery ... Marc Toussaint Klaus Greff Andy Zeng Igor Mordatch Peter R. Florence LM&Ro 325 2,145 0 06 Mar 2023
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents Zihao Wang Shaofei Cai Guanzhou Chen Hoang Trung-Dung Xiaojian Ma Yitao Liang LM&Ro LLMAG 399 423 0 03 Feb 2023
ReAct: Synergizing Reasoning and Acting in Language Models International Conference on Learning Representations (ICLR), 2022 Shunyu Yao Jeffrey Zhao Dian Yu Nan Du Izhak Shafran Karthik Narasimhan Yuan Cao LLMAG ReLM LRM 1.5K 4,819 0 06 Oct 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models Conference on Robot Learning (CoRL), 2022 Wenlong Huang F. Xia Ted Xiao Harris Chan Jacky Liang ... Tomas Jackson Linda Luu Sergey Levine Karol Hausman Brian Ichter LLMAG LM&Ro LRM 316 1,132 0 12 Jul 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances Conference on Robot Learning (CoRL), 2022 Michael Ahn Anthony Brohan Noah Brown Yevgen Chebotar Omar Cortes ... Ted Xiao Peng Xu Sichun Xu Mengyuan Yan Andy Zeng LM&Ro 475 2,514 0 04 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Neural Information Processing Systems (NeurIPS), 2022 Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 2.1K 13,906 0 28 Jan 2022
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents International Conference on Machine Learning (ICML), 2022 Wenlong Huang Pieter Abbeel Deepak Pathak Igor Mordatch LM&Ro 263 1,356 0 18 Jan 2022
Introspective Robot Perception using Smoothed Predictions from Bayesian Neural Networks International Symposium of Robotics Research (ISRR), 2021 Jianxiang Feng M. Durner Zoltán-Csaba Márton Ferenc Bálint-Benczédi Rudolph Triebel UQCV BDL 201 12 0 27 Sep 2021
Bayesian Active Learning for Sim-to-Real Robotic Perception IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021 Jianxiang Feng Jongseok Lee M. Durner Rudolph Triebel 227 17 0 23 Sep 2021
Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes Jongseok Lee Jianxiang Feng Matthias Humt M. Müller Rudolph Triebel UQCV 227 24 0 20 Sep 2021
A Survey of Uncertainty in Deep Neural Networks J. Gawlikowski Cedrique Rovile Njieutcheu Tassi Mohsin Ali Jongseok Lee Matthias Humt ... R. Roscher Muhammad Shahzad Wen Yang R. Bamler Xiaoxiang Zhu BDL UQCV OOD 487 1,432 0 07 Jul 2021
Estimating Model Uncertainty of Neural Networks in Sparse Information Form Jongseok Lee Matthias Humt Jianxiang Feng Rudolph Triebel BDL UQCV 198 50 0 20 Jun 2020
A General Framework for Uncertainty Estimation in Deep Learning IEEE Robotics and Automation Letters (RA-L), 2019 Antonio Loquercio Mattia Segu Davide Scaramuzza UQCV BDL OOD 260 329 0 16 Jul 2019
The Limits and Potentials of Deep Learning for Robotics Niko Sünderhauf Oliver Brock Walter J. Scheirer R. Hadsell Dieter Fox ... B. Upcroft Pieter Abbeel Wolfram Burgard Michael Milford Peter Corke 193 544 0 18 Apr 2018