CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning

International Conference on Learning Representations (ICLR), 2024

6 February 2024

Ji Qi

Bin Xu

Lei Hou

Juanzi Li

Yuxiao Dong

Jie Tang

VLM

LRM

Papers citing "CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning"

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

164

28 Nov 2025

Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning

Ernesto Gabriel Hernández Montoya

...

325

14 Oct 2025

Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools

...

140

09 Oct 2025

Reinforced Visual Perception with Tools

155

01 Sep 2025

Explain Before You Answer: A Survey on Compositional Visual Reasoning

...

352

24 Aug 2025

SIFThinker: Spatially-Aware Image Focus for Visual Reasoning

279

08 Aug 2025

Learning Only with Images: Visual Reinforcement Learning with Reasoning, Rendering, and Visual Feedback

301

28 Jul 2025

LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance

459

08 Jul 2025

MMGeoLM: Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models

410

26 May 2025

FaceInsight: A Multimodal Large Language Model for Face Perception

392

22 Apr 2025

New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

491

27 Feb 2025

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

428

22 Feb 2025

Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs

Julian McAuley

Shuai Li

LRM

629

24 Apr 2024

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

294

14 Mar 2024