ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2209.09513
  4. Cited By
Learn to Explain: Multimodal Reasoning via Thought Chains for Science
  Question Answering
v1v2 (latest)

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
    ELMReLMLRM
ArXiv (abs)PDFHTMLHuggingFace (1 upvotes)

Papers citing "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"

50 / 1,273 papers shown
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of
  Multimodal Models
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models
Pengxiang Li
Zhi Gao
Bofei Zhang
Tao Yuan
Yuwei Wu
Mehrtash Harandi
Yunde Jia
Song-Chun Zhu
Qing Li
VLMMLLM
310
11
0
16 Jul 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large
  Vision-Language Models
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLMLRMVLM
307
10
0
16 Jul 2024
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
Haodong Duan
Xinyu Fang
Junming Yang
Xiangyu Zhao
Lin Chen
...
Yuhang Zang
Pan Zhang
Jiaqi Wang
Dahua Lin
Kai Chen
LM&MAVLM
734
363
0
16 Jul 2024
On Large Language Model Continual Unlearning
On Large Language Model Continual Unlearning
Chongyang Gao
Lixu Wang
Chenkai Weng
Tianlin Li
Qi Zhu
Qi Zhu
MU
275
0
0
14 Jul 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey
  from Co-Development Perspective
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
Zhen Qin
Daoyuan Chen
Wenhao Zhang
Liuyi Yao
Yilun Huang
Bolin Ding
Yaliang Li
Shuiguang Deng
353
13
0
11 Jul 2024
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal
  Perception
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Xiaotong Li
Fan Zhang
Haiwen Diao
Yueze Wang
Xinlong Wang
Ling-yu Duan
VLM
340
49
0
11 Jul 2024
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
Runhui Huang
Xinpeng Ding
Chunwei Wang
J. N. Han
Yulong Liu
Hengshuang Zhao
Hang Xu
Lu Hou
Wei Zhang
Xiaodan Liang
VLM
209
13
0
11 Jul 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for
  Resource-Limited Transfer Learning
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao
Bo Wan
Xu Jia
Yunzhi Zhuge
Ying Zhang
Huchuan Lu
Long Chen
VLM
252
11
0
10 Jul 2024
Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses
  from Diagram
Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram
Ming-Liang Zhang
Zhong-Zhi Li
Fei Yin
Liang Lin
Cheng-Lin Liu
LRM
167
9
0
10 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Shiyang Feng
Kaipeng Zhang
Ping Luo
MQ
538
85
0
10 Jul 2024
A Single Transformer for Scalable Vision-Language Modeling
A Single Transformer for Scalable Vision-Language Modeling
Yangyi Chen
Xingyao Wang
Yuan Yao
Heng Ji
LRM
297
32
0
08 Jul 2024
VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool
VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool
Yan Wang
Yawen Zeng
Jingsheng Zheng
Xiaofen Xing
Jin Xu
Xiangmin Xu
MLLMLRMVGen
223
31
0
07 Jul 2024
OmChat: A Recipe to Train Multimodal Language Models with Strong Long
  Context and Video Understanding
OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding
Tiancheng Zhao
Qianqian Zhang
Kyusong Lee
Peng Liu
Lu Zhang
Chunxin Fang
Jiajia Liao
Kelei Jiang
Yibo Ma
Ruochen Xu
MLLMVLM
266
8
0
06 Jul 2024
Rethinking Visual Prompting for Multimodal Large Language Models with
  External Knowledge
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Juil Sock
Lu Yuan
LRMVLM
238
11
0
05 Jul 2024
A Systematic Survey and Critical Review on Evaluating Large Language
  Models: Challenges, Limitations, and Recommendations
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Md Tahmid Rahman Laskar
Sawsan Alqahtani
M Saiful Bari
Mizanur Rahman
Mohammad Abdullah Matin Khan
...
Chee Wei Tan
Md. Rizwan Parvez
Enamul Hoque
Shafiq Joty
Jimmy Huang
ELMALM
283
91
0
04 Jul 2024
HEMM: Holistic Evaluation of Multimodal Foundation Models
HEMM: Holistic Evaluation of Multimodal Foundation Models
Paul Pu Liang
Akshay Goindani
Talha Chafekar
Leena Mathur
Haofei Yu
Ruslan Salakhutdinov
Louis-Philippe Morency
353
27
0
03 Jul 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model
  Supporting Long-Contextual Input and Output
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Rui Qian
...
Kai Chen
Jifeng Dai
Yu Qiao
Dahua Lin
Jiaqi Wang
304
172
0
03 Jul 2024
Synthetic Multimodal Question Generation
Synthetic Multimodal Question Generation
Ian Wu
Sravan Jayanthi
Vijay Viswanathan
Simon Rosenberg
Sina Pakazad
Tongshuang Wu
Graham Neubig
229
8
0
02 Jul 2024
Survey on Knowledge Distillation for Large Language Models: Methods,
  Evaluation, and Application
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
ALMKELM
293
75
0
02 Jul 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and
  Time
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury
Sayan Nag
Subhrajyoti Dasgupta
Jun Chen
Mohamed Elhoseiny
Ruohan Gao
Dinesh Manocha
VLMMLLM
412
23
0
01 Jul 2024
We-Math: Does Your Large Multimodal Model Achieve Human-like
  Mathematical Reasoning?
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao
Qiuna Tan
Guanting Dong
Minhui Wu
Chong Sun
...
Yida Xu
Muxi Diao
Zhimin Bao
Chen Li
Honggang Zhang
VLMLRM
297
166
0
01 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models:
  Enhancing Performance and Reducing Inference Costs
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
263
22
0
01 Jul 2024
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
Yusu Qian
Hanrong Ye
J. Fauconnier
Peter Grasch
Yinfei Yang
Zhe Gan
691
41
0
01 Jul 2024
BAPO: Base-Anchored Preference Optimization for Personalized Alignment
  in Large Language Models
BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee
Minchan Jeong
Yujin Kim
Hojung Jung
Jaehoon Oh
Sangmook Kim
Se-Young Yun
260
0
0
30 Jun 2024
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
Jinsheng Huang
Liang Chen
Taian Guo
Fu Zeng
Yusheng Zhao
...
Wei Ju
Luchen Liu
Tianyu Liu
Baobao Chang
Ming Zhang
362
10
0
29 Jun 2024
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework
  for Multimodal LLMs
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun
Haokun Lin
Rusiru Thushara
Mohammad Qazim Bhat
Yongxin Wang
...
Timothy Baldwin
Zhengzhong Liu
Eric P. Xing
Xiaodan Liang
Zhiqiang Shen
201
39
0
28 Jun 2024
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context
  Compression
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression
Jieneng Chen
Luoxin Ye
Ju He
Zhao-Yang Wang
Daniel Khashabi
Alan Yuille
VLM
173
1
0
28 Jun 2024
From the Least to the Most: Building a Plug-and-Play Visual Reasoner via
  Data Synthesis
From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis
Chuanqi Cheng
Jian Guan
Wei Wu
Rui Yan
LRM
196
15
0
28 Jun 2024
MM-Instruct: Generated Visual Instructions for Large Multimodal Model
  Alignment
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
Jihao Liu
Xin Huang
Jinliang Zheng
Boxiao Liu
Jia Wang
Osamu Yoshie
Yu Liu
Hongsheng Li
MLLMSyDa
225
6
0
28 Jun 2024
MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
Jinming Li
Yichen Zhu
Zhiyuan Xu
Jindong Gu
Minjie Zhu
Xin Liu
Ning Liu
Yaxin Peng
Feifei Feng
Jian Tang
LRMLM&Ro
227
15
0
28 Jun 2024
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for
  Foundation Models
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models
Zhong-Zhi Li
Ming-Liang Zhang
Fei Yin
Zhi-Long Ji
Jin-Feng Bai
Zhen-Ru Pan
Fan-Hu Zeng
Jian Xu
Jia-Xin Zhang
Cheng-Lin Liu
ELM
185
19
0
28 Jun 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Tingting Gao
Xi Li
MoE
432
8
0
28 Jun 2024
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and
  Understanding
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
Tao Zhang
Xiangtai Li
Hao Fei
Haobo Yuan
Shengqiong Wu
Shunping Ji
Chen Change Loy
Shuicheng Yan
LRMMLLMVLM
338
123
0
27 Jun 2024
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with
  Flowcharts
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts
Shubhankar Singh
Purvi Chaurasia
Yerram Varun
Pranshu Pandya
Vatsal Gupta
Vivek Gupta
Dan Roth
216
19
0
27 Jun 2024
Building Understandable Messaging for Policy and Evidence Review
  (BUMPER) with AI
Building Understandable Messaging for Policy and Evidence Review (BUMPER) with AI
Katherine A. Rosenfeld
Maike Sonnewald
Sonia J. Jindal
Kevin A. McCarthy
Joshua L. Proctor
212
1
0
27 Jun 2024
Curriculum Learning with Quality-Driven Data Selection
Curriculum Learning with Quality-Driven Data Selection
Biao Wu
Fang Meng
435
5
0
27 Jun 2024
Learning to Correct for QA Reasoning with Black-box LLMs
Learning to Correct for QA Reasoning with Black-box LLMs
Jaehyung Kim
Dongyoung Kim
Yiming Yang
LRM
250
6
0
26 Jun 2024
Mental Modeling of Reinforcement Learning Agents by Language Models
Mental Modeling of Reinforcement Learning Agents by Language Models
Wenhao Lu
Xufeng Zhao
Josua Spisak
Jae Hee Lee
Stefan Wermter
LLMAGLRMLM&Ro
253
3
0
26 Jun 2024
S3: A Simple Strong Sample-effective Multimodal Dialog System
S3: A Simple Strong Sample-effective Multimodal Dialog System
Elisei Rykov
Egor Malkershin
Ilseyar Alimova
233
0
0
26 Jun 2024
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
Xiangyu Zhao
Xiangtai Li
Haodong Duan
Haian Huang
Yining Li
Kai Chen
Hua Yang
VLMMLLM
335
21
0
25 Jun 2024
ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for
  Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
Ju-Seung Byun
Jiyun Chun
Jihyung Kil
Andrew Perrault
ReLMLRM
316
12
0
25 Jun 2024
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large
  Language Models
Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
Wenhao Shi
Zhiqiang Hu
Yi Bin
Junhua Liu
Yang Yang
See-Kiong Ng
Lidong Bing
Roy Ka-Wei Lee
SyDaMLLMLRM
316
113
0
25 Jun 2024
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Shengbang Tong
Ellis L Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
...
Xichen Pan
Austin Wang
Rob Fergus
Yann LeCun
Saining Xie
3DVMLLM
397
640
0
24 Jun 2024
Losing Visual Needles in Image Haystacks: Vision Language Models are
  Easily Distracted in Short and Long Contexts
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts
Aditya Sharma
Michael Saxon
William Yang Wang
VLM
269
9
0
24 Jun 2024
Evaluating Large Vision-and-Language Models on Children's Mathematical
  Olympiads
Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads
A. Cherian
Kuan-Chuan Peng
Suhas Lohit
Joanna Matthiesen
Kevin A. Smith
J. Tenenbaum
ELMLRM
220
15
0
22 Jun 2024
V-RECS, a Low-Cost LLM4VIS Recommender with Explanations, Captioning and
  Suggestions
V-RECS, a Low-Cost LLM4VIS Recommender with Explanations, Captioning and Suggestions
L. Podo
M. Angelini
Paola Velardi
313
3
0
21 Jun 2024
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
Sachit Menon
Richard Zemel
Carl Vondrick
LRM
255
9
0
20 Jun 2024
VisualRWKV: Exploring Recurrent Neural Networks for Visual Language
  Models
VisualRWKV: Exploring Recurrent Neural Networks for Visual Language ModelsInternational Conference on Computational Linguistics (COLING), 2024
Haowen Hou
Peigen Zeng
Fei Ma
Fei Richard Yu
VLM
187
12
0
19 Jun 2024
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts
  Language Models
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language ModelsConference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zihao Zeng
Yibo Miao
Hongcheng Gao
Hao Zhang
Zhijie Deng
MoE
249
30
0
19 Jun 2024
Synthetic Context Generation for Question Generation
Synthetic Context Generation for Question Generation
Naiming Liu
Zichao Wang
Richard Baraniuk
LRM
275
3
0
19 Jun 2024
Previous
123...161718...242526
Next
Page 17 of 26
Pageof 26