2404.16033
Cited By
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
24 April 2024
Timin Gao
Peixian Chen
Mengdan Zhang
Chaoyou Fu
Yunhang Shen
Yan Zhang
Shengchuan Zhang
Xiawu Zheng
Xing Sun
Liujuan Cao
Rongrong Ji
MLLM
LRM
Papers citing "Cantor: Inspiring Multimodal Chain-of-Thought of MLLM"
20 / 20 papers shown
Title
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search
Yikun Wang
Siyin Wang
Qinyuan Cheng
Zhaoye Fei
Liang Ding
Qipeng Guo
Dacheng Tao
Xipeng Qiu
LRM
15
0
0
12 Apr 2025
SafeMLRM: Demystifying Safety in Multi-modal Large Reasoning Models
Junfeng Fang
Y. Wang
Ruipeng Wang
Zijun Yao
Kun Wang
An Zhang
X. Wang
Tat-Seng Chua
AAML
LRM
60
2
0
09 Apr 2025
Mind with Eyes: from Language Reasoning to Multimodal Reasoning
Zhiyu Lin
Yifei Gao
Xian Zhao
Yunfan Yang
Jitao Sang
LRM
45
1
0
23 Mar 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Oucheng Huang
Yuhang Ma
Zeng Zhao
Mingrui Wu
Jiayi Ji
Rongsheng Zhang
Z. Hu
Xiaoshuai Sun
Rongrong Ji
36
0
0
22 Mar 2025
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving
Jian Zhang
Zhiyuan Wang
Z. Wang
Xinyu Zhang
Fangzhi Xu
Qika Lin
Rui Mao
Erik Cambria
Jun Liu
LLMAG
54
0
0
21 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
74
7
0
16 Mar 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
69
8
0
21 Feb 2025
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Yuanmin Tang
Xiaoting Qin
J. Zhang
Jing Yu
Gaopeng Gou
Gang Xiong
Qingwei Ling
Saravan Rajmohan
Dongmei Zhang
Qi Wu
LRM
66
1
0
15 Dec 2024
A Survey of Hallucination in Large Visual Language Models
Wei Lan
Wenyi Chen
Qingfeng Chen
Shirui Pan
Huiyu Zhou
Yi-Lun Pan
LRM
25
4
0
20 Oct 2024
Exploring Prompt Engineering: A Systematic Review with SWOT Analysis
Aditi Singh
Abul Ehtesham
Gaurav Kumar Gupta
Nikhil Kumar Chatta
Saket Kumar
T. T. Khoei
28
1
0
09 Oct 2024
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration
Lai Wei
Wenkai Wang
Xiaoyu Shen
Yu Xie
Zhihao Fan
Xiaojin Zhang
Zhongyu Wei
Wei Chen
14
4
0
06 Oct 2024
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models
Shengsheng Qian
Zuyi Zhou
Dizhan Xue
Bing Wang
Changsheng Xu
LRM
34
1
0
19 Sep 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
CyCLIP: Cyclic Contrastive Language-Image Pretraining
Shashank Goel
Hritik Bansal
S. Bhatia
Ryan A. Rossi
Vishwa Vinay
Aditya Grover
CLIP
VLM
160
131
0
28 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Internet-Augmented Dialogue Generation
M. Komeili
Kurt Shuster
Jason Weston
RALM
231
278
0
15 Jul 2021