2401.02582
Cited By
CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs
5 January 2024
Daoan Zhang
Junming Yang
Hanjia Lyu
Zijian Jin
Yuan Yao
Mingkai Chen
Jiebo Luo
Papers citing
"CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs"
29 / 29 papers shown
G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
Jaehyun Jeon
Janghan Yoon
Minsoo Kim
Sumin Shim
Yejin Choi
Hanbin Kim
Youngjae Yu
AAML
33
0
0
08 May 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao
Xuqi Liu
Zhongqi Yue
Y. Wu
Shuang Chen
Juncheng Billy Li
Siliang Tang
Fei Wu
Tat-Seng Chua
Yueting Zhuang
OffRL
LRM
39
1
0
09 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra
Kate Saenko
Venkatesh Saligrama
CoGe
LRM
37
0
0
07 Apr 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
84
7
0
16 Mar 2025
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs
Dayu Yang
Tianyang Liu
Daoan Zhang
Antoine Simoulin
Xiaoyi Liu
...
Zhaopu Teng
Xin Qian
Grey Yang
Jiebo Luo
Julian McAuley
ReLM
OffRL
LRM
81
3
0
26 Feb 2025
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Rui Li
Peiyi Wang
Jingyuan Ma
Di Zhang
Lei Sha
Zhifang Sui
LLMAG
44
0
0
22 Feb 2025
Multi-granular Training Strategies for Robust Multi-hop Reasoning Over Noisy and Heterogeneous Knowledge Sources
Jackson Coleman
Isaiah Lawrence
Benjamin Turner
LRM
38
0
0
09 Feb 2025
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
Zihui Cheng
Qiguang Chen
Jin Zhang
Hao Fei
Xiaocheng Feng
Wanxiang Che
Min Li
L. Qin
VLM
MLLM
LRM
75
3
0
17 Dec 2024
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Yuanmin Tang
Xiaoting Qin
J. Zhang
Jing Yu
Gaopeng Gou
Gang Xiong
Qingwei Ling
Saravan Rajmohan
Dongmei Zhang
Qi Wu
LRM
66
1
0
15 Dec 2024
Chain-of-Thought in Large Language Models: Decoding, Projection, and Activation
H. Yang
Qianghua Zhao
Lei Li
AI4CE
LRM
66
1
0
05 Dec 2024
Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models
J. Liu
Yumeng Li
Boyuan Xiao
Yichang Jian
Ziang Qin
Tianjia Shao
Yao-Xiang Ding
Kun Zhou
MLLM
LRM
95
2
0
27 Nov 2024
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination
Haojie Zheng
Tianyang Xu
Hanchi Sun
Shu Pu
Ruoxi Chen
Lichao Sun
MLLM
LRM
64
8
0
15 Nov 2024
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla
Yu Hou
Yang Trista Cao
Hal Daumé III
Rachel Rudinger
ReLM
CoGe
VLM
36
3
0
29 Oct 2024
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web
Hongcheng Guo
Wei Zhang
Junhao Chen
Yaonan Gu
Jian Yang
...
Binyuan Hui
Tianyu Liu
Jianxin Ma
Chang Zhou
Zhoujun Li
25
1
0
14 Sep 2024
X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation
Hanjia Lyu
Ryan A. Rossi
Xiang Chen
Md Mehrab Tanjim
Stefano Petrangeli
Somdeb Sarkhel
Jiebo Luo
16
5
0
27 Aug 2024
Smart Vision-Language Reasoners
Denisa Roberts
Lucas Roberts
VLM
ReLM
LRM
36
4
0
05 Jul 2024
What is the Visual Cognition Gap between Humans and Multimodal LLMs?
Xu Cao
Bolin Lai
Wenqian Ye
Yunsheng Ma
Joerg Heintz
Jintai Chen
Jianguo Cao
James M. Rehg
37
8
0
14 Jun 2024
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu
Zekun Li
Peipei Li
Shuhan Xia
Xing Cui
Linzhi Huang
Huaibo Huang
Weihong Deng
Zhaofeng He
36
11
0
13 Jun 2024
An Empirical Analysis on Large Language Models in Debate Evaluation
Xinyi Liu
Pinxin Liu
Hangfeng He
ELM
24
4
0
28 May 2024
Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models
Yongsheng Yu
Jiebo Luo
LRM
AI4CE
24
2
0
24 May 2024
Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models
Qiji Zhou
Ruochen Zhou
Zike Hu
Panzhong Lu
Siyang Gao
Yue Zhang
LRM
38
12
0
22 May 2024
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Timin Gao
Peixian Chen
Mengdan Zhang
Chaoyou Fu
Yunhang Shen
...
Shengchuan Zhang
Xiawu Zheng
Xing Sun
Liujuan Cao
Rongrong Ji
MLLM
LRM
39
15
0
24 Apr 2024
FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
Hang Hua
Jing Shi
Kushal Kafle
Simon Jenni
Daoan Zhang
John Collomosse
Scott D. Cohen
Jiebo Luo
CoGe
VLM
42
9
0
23 Apr 2024
TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding
Bozhi Luan
Hao Feng
Hong Chen
Yonghui Wang
Wen-gang Zhou
Houqiang Li
MLLM
24
10
0
15 Apr 2024
AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue
Yunlong Tang
Daiki Shimada
Jing Bi
Chenliang Xu
VGen
27
17
0
24 Mar 2024
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models
Xueliang Zhao
Xinting Huang
Tingchen Fu
Qintong Li
Shansan Gong
Lemao Liu
Wei Bi
Lingpeng Kong
LRM
31
1
0
21 Feb 2024
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning
Yunlong Tang
Jinrui Zhang
Xiangchen Wang
Teng Wang
Feng Zheng
VLM
64
9
0
17 Jun 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022