Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
2411.18203
Cited By
v1
v2
v3
v4
v5 (latest)
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Computer Vision and Pattern Recognition (CVPR), 2024
27 November 2024
Di Zhang
Jingdi Lei
Junxian Li
Xunzhi Wang
Yong Liu
Zonglin Yang
Jiatong Li
Weida Wang
Steve Yang
Jianbo Wu
Peng Ye
Wanli Ouyang
Dongzhan Zhou
OffRL
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (40 upvotes)
Papers citing
"Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning"
26 / 76 papers shown
Title
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
International Conference on Learning Representations (ICLR), 2023
Pan Lu
Hritik Bansal
Tony Xia
Hamish Ivison
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
499
1,083
0
03 Oct 2023
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
Zhengyuan Yang
Linjie Li
Kevin Qinghong Lin
Jianfeng Wang
Chung-Ching Lin
Nasim Shakouri Mahmoudabadi
Lijuan Wang
LM&MA
293
805
0
29 Sep 2023
Aligning Large Multimodal Models with Factually Augmented RLHF
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhiqing Sun
Sheng Shen
Shengcao Cao
Haotian Liu
Chunyuan Li
...
Liangyan Gui
Yu-Xiong Wang
Yiming Yang
Kurt Keutzer
Trevor Darrell
VLM
257
567
0
25 Sep 2023
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Anas Awadalla
Irena Gao
Josh Gardner
Jack Hessel
Yusuf Hanafy
...
Simon Kornblith
Pang Wei Koh
Gabriel Ilharco
Mitchell Wortsman
Ludwig Schmidt
MLLM
320
525
0
02 Aug 2023
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension
Bohao Li
Rui Wang
Guangzhi Wang
Yuying Ge
Yixiao Ge
Ying Shan
MLLM
ELM
407
764
0
30 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
European Conference on Computer Vision (ECCV), 2023
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Yuan Liu
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
552
1,595
0
12 Jul 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
International Conference on Learning Representations (ICLR), 2023
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM
MLLM
355
389
0
26 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Neural Information Processing Systems (NeurIPS), 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
755
6,364
0
29 May 2023
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shunyu Yao
Dian Yu
Jeffrey Zhao
Izhak Shafran
Thomas Griffiths
Yuan Cao
Karthik Narasimhan
LM&Ro
LRM
AI4CE
463
2,967
0
17 May 2023
Evaluating Object Hallucination in Large Vision-Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yifan Li
Yifan Du
Kun Zhou
Jinpeng Wang
Wayne Xin Zhao
Ji-Rong Wen
MLLM
LRM
651
1,210
0
17 May 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
404
2,632
0
20 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
950
7,136
0
17 Apr 2023
REFINER: Reasoning Feedback on Intermediate Representations
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Debjit Paul
Mete Ismayilzada
Maxime Peyrard
Beatriz Borges
Antoine Bosselut
Robert West
Boi Faltings
ReLM
LRM
290
218
0
04 Apr 2023
Self-Refine: Iterative Refinement with Self-Feedback
Neural Information Processing Systems (NeurIPS), 2023
Aman Madaan
Niket Tandon
Prakhar Gupta
Skyler Hallinan
Luyu Gao
...
Bodhisattwa Prasad Majumder
Katherine Hermann
Sean Welleck
Amir Yazdanbakhsh
Peter Clark
ReLM
LRM
DiffM
692
2,470
0
30 Mar 2023
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
IEEE International Conference on Computer Vision (ICCV), 2023
Bo Jiang
Shaoyu Chen
Qing Xu
Bencheng Liao
Jiajie Chen
Helong Zhou
Qian Zhang
Wenyu Liu
Chang Huang
Xinggang Wang
424
428
0
21 Mar 2023
PaLM-E: An Embodied Multimodal Language Model
International Conference on Machine Learning (ICML), 2023
Danny Driess
F. Xia
Mehdi S. M. Sajjadi
Corey Lynch
Aakanksha Chowdhery
...
Marc Toussaint
Klaus Greff
Andy Zeng
Igor Mordatch
Peter R. Florence
LM&Ro
357
2,153
0
06 Mar 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Yejin Bang
Samuel Cahyawijaya
Nayeon Lee
Wenliang Dai
Jane Polak Scowcroft
...
Tiezheng Yu
Willy Chung
Quyet V. Do
Yan Xu
Pascale Fung
ReLM
LRM
614
1,583
0
08 Feb 2023
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
O. Yu. Golovneva
Moya Chen
Spencer Poff
Martin Corredor
Luke Zettlemoyer
Maryam Fazel-Zarandi
Asli Celikyilmaz
ReLM
LRM
262
192
0
15 Dec 2022
In-context Reinforcement Learning with Algorithm Distillation
International Conference on Learning Representations (ICLR), 2022
Michael Laskin
Luyu Wang
Junhyuk Oh
Emilio Parisotto
Stephen Spencer
...
Ethan A. Brooks
Maxime Gazeau
Himanshu Sahni
Satinder Singh
Volodymyr Mnih
OffRL
181
166
0
25 Oct 2022
Automatic Chain of Thought Prompting in Large Language Models
International Conference on Learning Representations (ICLR), 2022
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
416
826
0
07 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
303
461
0
06 Oct 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Neural Information Processing Systems (NeurIPS), 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
506
1,798
0
20 Sep 2022
ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning
European Conference on Computer Vision (ECCV), 2022
Shengchao Hu
Li Chen
Peng Wu
Guoying Gu
Junchi Yan
Dacheng Tao
223
364
0
15 Jul 2022
Large Language Models are Zero-Shot Reasoners
Neural Information Processing Systems (NeurIPS), 2022
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
1.3K
5,855
0
24 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Neural Information Processing Systems (NeurIPS), 2022
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
666
4,695
0
29 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
2.1K
14,012
0
28 Jan 2022
Previous
1
2