Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM, ReLM, LRM

Papers citing "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"

50 / 1,273 papers shown
DIQ-H: Evaluating Hallucination Persistence in VLMs Under Temporal Visual Degradation
Zexin Lin
Hawen Wan
Yebin Zhong
Xiaoqiang
VLM
149
0
0
03 Dec 2025
MemVerse: Multimodal Memory for Lifelong Learning Agents
J. Liu
Yifei Sun
Weihua Cheng
Haodong Lei
Yirong Chen
...
Nianchen Deng
Yi Yu
Shuyue Hu
Botian Shi
Ding Wang
KELM
195
2
0
03 Dec 2025
See, Think, Learn: A Self-Taught Multimodal Reasoner
Sourabh Sharma
Sonam Gupta
Sadbhawna
ReLM, LRM, VLM
229
0
0
02 Dec 2025
Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
Xiwen Wei
Mustafa Munir
R. Marculescu
CLL
267
0
0
02 Dec 2025
OneThinker: All-in-one Reasoning Model for Image and Video
Kaituo Feng
M. Zhang
Hongyu Li
Kaixuan Fan
Shuang Chen
...
Haoze Sun
Yan Feng
Peng Pei
Xunliang Cai
Xiangyu Yue
OffRL, MLLM, VLM, LRM
664
3
0
02 Dec 2025
Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models
Ziyi Tong
Feifei Sun
Le Minh Nguyen
17
0
0
02 Dec 2025
FiMMIA: scaling semantic perturbation-based membership inference across modalities
Anton A. Emelyanov
Sergei Kudriashov
Alena Fenogenova
142
0
0
02 Dec 2025
Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models
Zhongyu Yang
Dannong Xu
Wei Pang
Yingfang Yuan
VLM
188
0
0
01 Dec 2025
PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
Zeqing Wang
Keze Wang
Lei Zhang
VGen
130
0
0
01 Dec 2025
Comparative Analysis of 47 Context-Based Question Answer Models Across 8 Diverse Datasets
Muhammad Muneeb
David B. Ascher
Ahsan Baidar Bakht
78
0
0
29 Nov 2025
Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction
Jiazhen Liu
Mingkuan Feng
Long Chen
88
0
0
29 Nov 2025
AgriCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture
Yibin Wen
Qingmei Li
Zi Ye
Jiarui Zhang
Jing Wu
...
Yang Zhang
Lingyuan Zhao
Haohuan Fu
Huang Jianxi
Juepeng Zheng
ReLM, LRM
161
0
0
28 Nov 2025
A Rosetta Stone for AI Benchmarks
A. Ho
Jean-Stanislas Denain
David Atanasov
Samuel Albanie
Rohin Shah
ELM
265
0
0
28 Nov 2025
EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens
Ze Feng
Sen Yang
Boqiang Duan
Wankou Yang
Jingdong Wang
VLM
171
0
0
26 Nov 2025
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Yunze Man
S. S. Wang
Guowen Zhang
Johan Bjorck
Zhiqi Li
Liang-Yan Gui
Jim Fan
Jan Kautz
Yu Wang
Zhiding Yu
121
0
0
25 Nov 2025
M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation
Weizi Shao
Taolin Zhang
Zijie Zhou
Chen Chen
C. Wang
Xiaofeng He
78
0
0
25 Nov 2025
Harmonious Parameter Adaptation in Continual Visual Instruction Tuning for Safety-Aligned MLLMs
Z. J. Wang
Chang Che
Qi Wang
Hui Ma
Zenglin Shi
Cees G. M. Snoek
Meng Wang
CLL
196
0
0
25 Nov 2025
Object-Centric Vision Token Pruning for Vision Language Models
Guangyuan Li
R. Zhao
Jinhong Deng
Yanbo Wang
Joni Pajarinen
VLM
174
0
0
25 Nov 2025
INTERLACE: Interleaved Layer Pruning and Efficient Adaptation in Large Vision-Language Models
Parsa Madinei
Ryan Solgi
Ziqi Wen
Jonathan Skaza
Miguel P. Eckstein
Ramtin Pedarsani
VLM
181
0
0
24 Nov 2025
Parallel Vision Token Scheduling for Fast and Accurate Multimodal LMMs Inference
Wengyi Zhan
Mingbao Lin
Zhihang Lin
Rongrong Ji
MLLM, VLM, LRM
220
0
0
24 Nov 2025
Cross Domain Evaluation of Multimodal Chain-of-Thought Reasoning of different datasets into the Amazon CoT Framework
Nitya Tiwari
Parv Maheshwari
Vidisha Agarwal
LRM
100
0
0
24 Nov 2025
BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models
Juncheng Li
Y. Li
Hanxun Huang
Yunhao Chen
Xin Wang
Yixu Wang
Xingjun Ma
Yu-Gang Jiang
MLLM, AAML, VLM
198
0
0
24 Nov 2025
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
Yiming Qin
Bomin Wei
Jiaxin Ge
Konstantinos Kallidromitis
Stephanie Fu
Trevor Darrell
Xudong Wang
LRM, VLM
250
1
0
24 Nov 2025
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
Meng Lu
Ran Xu
Yi Fang
Wenxuan Zhang
Yue Yu
...
Guanghua Xiao
Hanrui Wang
Di Jin
W. Shi
Xuan Wang
LRM
139
1
0
24 Nov 2025
VADE: Variance-Aware Dynamic Sampling via Online Sample-Level Difficulty Estimation for Multimodal RL
Zengjie Hu
Jiantao Qiu
Tianyi Bai
Haojin Yang
Binhang Yuan
Qi Jing
Conghui He
Wentao Zhang
OffRL
226
0
0
24 Nov 2025
Self-Empowering VLMs: Achieving Hierarchical Consistency via Self-Elicited Knowledge Distillation
Wei Yang
Yiran Zhu
Zilin Li
Xunjia Zhang
Hongtao Wang
VLM
131
0
0
23 Nov 2025
FastMMoE: Accelerating Multimodal Large Language Models through Dynamic Expert Activation and Routing-Aware Token Pruning
Guoyang Xia
Yifeng Ding
Fengfa Li
Lei Ren
Wei Chen
Fangxiang Feng
Xiaojie Wang
MoE, VLM
187
0
0
22 Nov 2025
The PLLuM Instruction Corpus
Piotr Pęzik
Filip Żarnecki
Konrad Kaczyński
A. Cichosz
Zuzanna Deckert
...
Konrad Wojtasik
Arkadiusz Janz
P. Kazienko
Julia Moska
Jan Kocoń
104
0
0
21 Nov 2025
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models
Mark Endo
Serena Yeung-Levy
LRM
238
0
0
21 Nov 2025
EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards
Omkar Thawakar
Shravan Venkatraman
Ritesh Thawkar
Abdelrahman M. Shaker
Hisham Cholakkal
Rao Muhammad Anwer
Salman Khan
Fahad A Khan
SyDa, LRM, VLM
325
3
0
20 Nov 2025
Learning to Think Fast and Slow for Visual Language Models
Chenyu Lin
Cheng Chi
Jinlin Wu
Sharon Li
Kaiyang Zhou
ReLM, VLM
225
0
0
20 Nov 2025
Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security
Wei Zhao
Zhe Li
Yige Li
Jun Sun
AAML
104
0
0
20 Nov 2025
Parameter Importance-Driven Continual Learning for Foundation Models
LingXiang Wang
Hainan Zhang
Zhiming Zheng
KELM, CLL
481
0
0
19 Nov 2025
A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models
Duo Li
Zuhao Yang
Xiaoqin Zhang
Ling Shao
Shijian Lu
VLM
155
1
0
19 Nov 2025
Multimodal Evaluation of Russian-language Architectures
Artem Chervyakov
Ulyana Isaeva
Anton A. Emelyanov
Artem Safin
Maria Tikhonova
...
Ilseyar Alimova
A. Kapitanov
Alena Fenogenova
319
1
0
19 Nov 2025
Zero-Training Task-Specific Model Synthesis for Few-Shot Medical Image Classification
Yao Qin
Yangyang Yan
YuanChao Yang
Jinhua Pang
Huanyong Bi
Yuan Liu
HaiHua Wang
MedIm
135
0
0
18 Nov 2025
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
Xuankun Rong
Wenke Huang
Tingfeng Wang
Daiguo Zhou
Bo Du
Mang Ye
LRM
230
0
0
17 Nov 2025
CreBench: Human-Aligned Creativity Evaluation from Idea to Process to Product
Kaiwen Xue
Chenglong Li
Zhonghong Ou
Guoxin Zhang
Kaoyan Lu
...
Xinyu Liu
Qunlin Chen
Weiwei Qin
Yiran Shen
Jiayi Cen
121
0
0
17 Nov 2025
From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models
Wenxin Zhu
Andong Chen
Yuchen Song
Kehai Chen
Conghui Zhu
Ziyan Chen
Tiejun Zhao
LRM
450
0
0
17 Nov 2025
Explore How to Inject Beneficial Noise in MLLMs
Ruishu Zhu
Sida Huang
Ziheng Jiao
Hongyuan Zhang
205
4
0
17 Nov 2025
Learning with Preserving for Continual Multitask Learning
H. Wang
Siwoo Bae
Zirong Chen
Meiyi Ma
CLL
193
0
0
11 Nov 2025
Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning
Tianwen Lyu
Xiang Zhuang
Keyan Ding
Xinzhe Cao
Lei Liang
Wei Zhao
Qiang Zhang
H. Chen
LRM
108
0
0
11 Nov 2025
Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models
Jingwei Ni
Ekaterina Fadeeva
Tianyi Wu
Mubashara Akhtar
Jiaheng Zhang
...
Markus Leippold
Timothy Baldwin
See-Kiong Ng
Artem Shelmanov
Mrinmaya Sachan
LRM
228
0
0
09 Nov 2025
NVIDIA Nemotron Nano V2 VL
Nvidia
Amala Sanjay Deshmukh
Kateryna Chumachenko
Tuomas Rintamaki
Matthieu Le
...
Krzysztof Pawelec
Michael Evans
Katherine Luna
Jie Lou
Erick Galinkin
VLM
310
2
0
06 Nov 2025
ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-grained Evaluation
International Conference on Artificial Neural Networks (ICANN), 2025
Jing Gao
Shutiao Luo
Yumeng Liu
Yuanming Li
Hongji Zeng
111
0
0
05 Nov 2025
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
Kuei-Chun Kao
Hsu Tzu-Yin
Yunqi Hong
Ruochen Wang
Cho-Jui Hsieh
LRM
157
0
0
05 Nov 2025
SAIL-RL: Guiding MLLMs in When and How to Think via Dual-Reward RL Tuning
Fangxun Shu
Yongjie Ye
Yue Liao
Zijian Kang
Weijie Yin
Jiacong Wang
Xiao Liang
Shuicheng Yan
Chao Feng
OffRL, ReLM, LRM
237
1
0
04 Nov 2025
Multimodal Reasoning via Latent Refocusing
Jizheng Ma
Xiaofei Zhou
Yanlong Song
Han Yan
VLM, LRM
179
1
0
04 Nov 2025
Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models
Tianfan Peng
Yuntao Du
Pengzhou Ji
Shijie Dong
Kailin Jiang
...
Jinhe Bi
Qian Li
Wei Du
Feng Xiao
Lizhen Cui
VLM
273
0
0
04 Nov 2025
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
Yiyang Zhou
Haoqin Tu
Z. Wang
Zeyu Wang
Niklas Muennighoff
...
Shen Yan
Haoqi Fan
Cihang Xie
Huaxiu Yao
Qinghao Ye
LRM
256
2
0
04 Nov 2025