Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM, ReLM, LRM

Papers citing "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"

50 / 1,273 papers shown
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning
Bingchen Zhao
Yongshuo Zong
Letian Zhang
Timothy Hospedales
VLM
278
41
0
18 Jun 2024
TroL: Traversal of Layers for Large Language and Vision Models
Byung-Kwan Lee
Sangyun Chung
Chae Won Kim
Beomchan Park
Yong Man Ro
349
12
0
18 Jun 2024
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM, VLM
402
52
0
18 Jun 2024
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs
Ziyu Liu
Tao Chu
Yuhang Zang
Xilin Wei
Xiaoyi Dong
...
Zijian Liang
Yuanjun Xiong
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
200
67
0
17 Jun 2024
Unveiling Encoder-Free Vision-Language Models
Haiwen Diao
Yufeng Cui
Xiaotong Li
Yueze Wang
Huchuan Lu
Xinlong Wang
VLM
246
66
0
17 Jun 2024
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
Geewook Kim
Minjoon Seo
VLM
239
5
0
17 Jun 2024
Improving Multi-Agent Debate with Sparse Communication Topology
Yunxuan Li
Yibing Du
Jiageng Zhang
Le Hou
Peter Grabowski
Yeqing Li
Eugene Ie
LLMAG
221
65
0
17 Jun 2024
Preserving Knowledge in Large Language Model with Model-Agnostic Self-Decompression
Zilun Zhang
Yutao Sun
Tiancheng Zhao
Leigang Sha
Ruochen Xu
Kyusong Lee
Jianwei Yin
CLL, KELM
313
0
0
17 Jun 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
456
11
0
17 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
528
60
0
17 Jun 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models
Jaewoo Lee
Boyang Li
Sung Ju Hwang
VLM
298
20
0
16 Jun 2024
Reminding Multimodal Large Language Models of Object-aware Knowledge with Retrieved Tags
Daiqing Qi
Handong Zhao
Zijun Wei
Sheng Li
269
3
0
16 Jun 2024
Mixture-of-Subspaces in Low-Rank Adaptation
Taiqiang Wu
Jiahao Wang
Zhe Zhao
Ngai Wong
541
41
0
16 Jun 2024
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tu Anh Dinh
Carlos Mullov
Leonard Barmann
Zhaolin Li
Danni Liu
...
Michael Beigl
Rainer Stiefelhagen
Carsten Dachsbacher
Klemens Bohm
Jan Niehues
ELM
194
24
0
14 Jun 2024
VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Chenyu Zhou
Mengdan Zhang
Peixian Chen
Chaoyou Fu
Chunjiang Ge
Xiawu Zheng
Xing Sun
Rongrong Ji
VLM
159
5
0
14 Jun 2024
Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models
ICON (ICON), 2024
Manas Jhalani
Annervaz K M
Pushpak Bhattacharyya
111
3
0
14 Jun 2024
What is the Visual Cognition Gap between Humans and Multimodal LLMs?
Xu Cao
Yifan Shen
Bolin Lai
Wenqian Ye
Yunsheng Ma
...
Jintai Chen
Meihuan Huang
Jianguo Cao
Aidong Zhang
James M. Rehg
363
22
0
14 Jun 2024
ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis
Jian Chen
Peilin Zhou
Yining Hua
Dading Chong
Meng Cao
Yaowei Li
Wei Chen
Bing Zhu
Junwei Liang
Zixuan Yuan
VLM
471
3
0
14 Jun 2024
Explore the Limits of Omni-modal Pretraining at Scale
Yiyuan Zhang
Handong Li
Jing Liu
Xiangyu Yue
VLM, LRM
255
1
0
13 Jun 2024
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Fei Wang
Xingyu Fu
James Y. Huang
Zekun Li
Qin Liu
...
Kai-Wei Chang
Dan Roth
Sheng Zhang
Hoifung Poon
Muhao Chen
VLM
339
113
0
13 Jun 2024
ReMI: A Dataset for Reasoning with Multiple Images
Mehran Kazemi
Nishanth Dikkala
Ankit Anand
Petar Dević
Ishita Dasgupta
...
Bahare Fatemi
Pranjal Awasthi
Dee Guo
Sreenivas Gollapudi
Ahmed Qureshi
LRM, VLM
309
25
0
13 Jun 2024
SememeLM: A Sememe Knowledge Enhanced Method for Long-tail Relation Representation
Shuyi Li
Shaojuan Wu
Xiaowang Zhang
Zhiyong Feng
294
0
0
13 Jun 2024
SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models
Kehua Feng
Keyan Ding
Weijie Wang
Xiang Zhuang
Yuqi Tang
Ming Qin
Yu Zhao
ELM
354
12
0
13 Jun 2024
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
Hanqing Wang
Zeguan Xiao
Shuo Wang
Guanhua Chen
Guanhua Chen
381
52
0
13 Jun 2024
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases
Rithesh Murthy
Liangwei Yang
Juntao Tan
Tulika Awalgaonkar
Yilun Zhou
...
Zuxin Liu
Ming Zhu
Huan Wang
Caiming Xiong
Silvio Savarese
245
10
0
12 Jun 2024
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
Yi-Fan Zhang
Qingsong Wen
Chaoyou Fu
Xue Wang
Zhang Zhang
Liwen Wang
Rong Jin
306
69
0
12 Jun 2024
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li
Zhe Chen
Weiyun Wang
Wenhai Wang
Shenglong Ye
...
Dahua Lin
Yu Qiao
Botian Shi
Conghui He
Jifeng Dai
VLM, OffRL
271
48
0
12 Jun 2024
Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
Hao Yang
Yanyan Zhao
Yang Wu
Shilong Wang
Tian Zheng
Hongbo Zhang
Zongyang Ma
Wanxiang Che
Bing Qin
352
37
0
12 Jun 2024
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Tianle Gu
Zeyang Zhou
Kexin Huang
Dandan Liang
Yixu Wang
...
Keqing Wang
Yujiu Yang
Yan Teng
Botian Shi
Yingchun Wang
ELM
293
31
0
11 Jun 2024
Needle In A Multimodal Haystack
Weiyun Wang
Shuibo Zhang
Yiming Ren
Yuchen Duan
Tiantong Li
...
Ping Luo
Yu Qiao
Jifeng Dai
Wenqi Shao
Wenhai Wang
VLM
229
42
0
11 Jun 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
Neural Information Processing Systems (NeurIPS), 2024
David Romero
Chenyang Lyu
Haryo Akbarianto Wibowo
Teresa Lynn
Injy Hamed
...
Oana Ignat
Joan Nwatu
Amélie Reymond
Thamar Solorio
Alham Fikri Aji
322
90
0
10 Jun 2024
Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024
Jinwoo Ahn
Junhyeok Park
Min-Jun Kim
Kang-Hyeon Kim
So-Yeong Sohn
Yun-Ji Lee
Du-Seong Chang
Yu-Jung Heo
Eun-Sol Kim
LRM
176
0
0
10 Jun 2024
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Aman Rangapur
Kejian Shi
William Merrill
Aakanksha Naik
Shruti Singh
...
Luca Soldaini
Shannon Zejiang Shen
Doug Downey
Hannaneh Hajishirzi
Arman Cohan
487
26
0
10 Jun 2024
Evaluating Zero-Shot Long-Context LLM Compression
Chenyu Wang
Yihan Wang
Kai Li
302
0
0
10 Jun 2024
M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark
Wei Song
Yadong Li
Jianhua Xu
Guowei Wu
Lingfeng Ming
...
Weihua Luo
Houyi Li
Yi Du
Fangda Guo
Kaicheng Yu
ELM, LRM
288
13
0
08 Jun 2024
BayesAgent: Bayesian Agentic Reasoning Under Uncertainty via Verbalized Probabilistic Graphical Modeling
Hengguan Huang
Xing Shen
Songtao Wang
Dianbo Liu
Hao Wang
David Alejandro Duchene
Hao Wang
Samir Bhatt
239
0
0
08 Jun 2024
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xiongtao Zhou
Jie He
Yuhua Ke
Guangyao Zhu
Víctor Gutiérrez-Basulto
Jeff Z. Pan
238
28
0
07 Jun 2024
Think out Loud: Emotion Deducing Explanation in Dialogues
JiangNan Li
Zheng Lin
Lanrui Wang
Q. Si
Yanan Cao
Mo Yu
Peng Fu
Weiping Wang
Jie Zhou
219
3
0
07 Jun 2024
MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024
Cong Yang
Zuchao Li
Lefei Zhang
241
3
0
07 Jun 2024
POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models
Jianben He
Xingbo Wang
Shiyi Liu
Guande Wu
Claudio Silva
Huamin Qu
LRM
258
6
0
06 Jun 2024
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach
Dyah Adila
Shuai Zhang
Boran Han
Yuyang Wang
AAML, LLMSV
308
13
0
05 Jun 2024
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
319
20
0
05 Jun 2024
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang
H. Wu
Chunyi Li
Yingjie Zhou
Wei Sun
Xiongkuo Min
Zijian Chen
Xiaohong Liu
Weisi Lin
Guangtao Zhai
EGVM
426
40
0
05 Jun 2024
From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models
Xiaofeng Zhang
Chen Shen
Xiaosong Yuan
Shaotian Yan
Liang Xie
Wenxiao Wang
Chaochen Gu
Hao Tang
Jieping Ye
191
8
0
04 Jun 2024
Multimodal Reasoning with Multimodal Knowledge Graph
Junlin Lee
Yequan Wang
Jing Li
Min Zhang
275
55
0
04 Jun 2024
HoneyGPT: Breaking the Trilemma in Terminal Honeypots with Large Language Model
Ziyang Wang
Jianzhou You
Haining Wang
Tianwei Yuan
Shichao Lv
Yang Wang
Limin Sun
301
11
0
04 Jun 2024
Parrot: Multilingual Visual Instruction Tuning
Hai-Long Sun
Da-Wei Zhou
Yangfu Li
Shiyin Lu
Chao Yi
...
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
MLLM
718
19
0
04 Jun 2024
Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model
Kezhen Chen
Rahul Thapa
Rahul Chalamala
Ben Athiwaratkun
Shuaiwen Leon Song
James Zou
VLM
253
5
0
03 Jun 2024
Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Cheng Tan
Jingxuan Wei
Linzhuang Sun
Zhangyang Gao
Siyuan Li
Bihui Yu
Ruifeng Guo
Stan Z. Li
ReLM, LRM, 3DV
283
14
0
31 May 2024
Ovis: Structural Embedding Alignment for Multimodal Large Language Model
Shiyin Lu
Yang Li
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Han-Jia Ye
VLM, MLLM
435
128
0
31 May 2024