arXiv: 2209.09513

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM, ReLM, LRM

Papers citing "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"

50 / 1,273 papers shown
Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization
Richard Luo
Austin Peng
Adithya Vasudev
Rishabh Jain
141
3
0
31 May 2024
Visual Perception by Large Language Model's Weights
Feipeng Ma
Hongwei Xue
Guangting Wang
Yizhou Zhou
Fengyun Rao
Shilin Yan
Yueyi Zhang
Siying Wu
Mike Zheng Shou
Xiaoyan Sun
VLM
166
18
0
30 May 2024
Temporal Grounding of Activities using Multimodal Large Language Models
Young Chol Song
280
1
0
30 May 2024
Instruction-Guided Visual Masking
Jinliang Zheng
Jianxiong Li
Si Cheng
Yinan Zheng
Jiaming Li
Jihao Liu
Yu Liu
Jingjing Liu
Xianyuan Zhan
268
17
0
30 May 2024
PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations
Jiatong Li
Renjun Hu
Kunzhe Huang
Zhuang Yan
Qi Liu
Mengxiao Zhu
Xing Shi
Jialin Li
KELM
384
13
0
30 May 2024
Enhancing Large Vision Language Models with Self-Training on Image Comprehension
Yihe Deng
Pan Lu
Fan Yin
Ziniu Hu
Sheng Shen
James Zou
Kai-Wei Chang
Wei Wang
SyDa, VLM, LRM
239
71
0
30 May 2024
Matryoshka Query Transformer for Large Vision-Language Models
Wenbo Hu
Zi-Yi Dou
Liunian Harold Li
Amita Kamath
Nanyun Peng
Kai-Wei Chang
MLLM
304
27
0
29 May 2024
Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
Zhe Hu
Tuo Liang
Jing Li
Yiren Lu
Yunlai Zhou
Yiran Qiao
Jing Ma
Yu Yin
238
11
0
29 May 2024
Enhancing Descriptive Image Quality Assessment with A Large-scale Multi-modal Dataset
IEEE Transactions on Image Processing (TIP), 2024
Zhiyuan You
Jinjin Gu
Zheyuan Li
Xin Cai
Kaiwen Zhu
Chao Dong
Tianfan Xue
EGVM
479
38
0
29 May 2024
PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework
Eshaan Agarwal
Vivek Dani
T. Ganu
A. Nambi
LLMAG
196
0
0
28 May 2024
The Evolution of Multimodal Model Architectures
S. Wadekar
Abhishek Chaurasia
Vasu Sharma
Eugenio Culurciello
325
27
0
28 May 2024
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
Xin Xiao
Bohong Wu
Jiacong Wang
Chunyuan Li
Xun Zhou
Haoyuan Guo
VLM
185
20
0
28 May 2024
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
Haogeng Liu
Quanzeng You
Xiaotian Han
Yongfei Liu
Huaibo Huang
Ran He
Hongxia Yang
132
4
0
28 May 2024
LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design
Rui Kong
Qiyang Li
Xinyu Fang
Qingtian Feng
Qingfeng He
Yazhu Dong
Weijun Wang
Yuanchun Li
Linghe Kong
Yunxin Liu
MoE
262
15
0
28 May 2024
Matryoshka Multimodal Models
Mu Cai
Jianwei Yang
Jianfeng Gao
Yong Jae Lee
VLM
272
57
0
27 May 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
404
64
0
26 May 2024
M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought
Qiguang Chen
Libo Qin
Jin Zhang
Zhi Chen
Xiao Xu
Wanxiang Che
LRM
325
110
0
26 May 2024
Accelerating Transformers with Spectrum-Preserving Token Merging
Hoai-Chau Tran
D. M. Nguyen
Duy M. Nguyen
Trung Thanh Nguyen
Ngan Le
Pengtao Xie
Daniel Sonntag
James Y. Zou
Binh T. Nguyen
Mathias Niepert
276
26
0
25 May 2024
Disease-informed Adaptation of Vision-Language Models
Jiajin Zhang
Ge Wang
Mannudeep K. Kalra
Pingkun Yan
VLM
277
8
0
24 May 2024
Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
Jonas Becker
Jan Philip Wahle
Bela Gipp
Terry Ruas
375
18
0
24 May 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM, LRM
339
29
0
24 May 2024
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
Hongyu Wang
Jiayu Xu
Senwei Xie
Ruiping Wang
Jialin Li
Zhaojie Xie
Bin Zhang
Chuyan Xiong
Xilin Chen
ELM, VLM, LRM
411
10
0
24 May 2024
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
Xiyao Wang
Jiuhai Chen
Zhaoyang Wang
Yuhang Zhou
Yiyang Zhou
...
Wanrong Zhu
Tom Goldstein
Parminder Bhatia
Furong Huang
Cao Xiao
494
64
0
24 May 2024
Calibrated Self-Rewarding Vision Language Models
Neural Information Processing Systems (NeurIPS), 2024
Yiyang Zhou
Zhiyuan Fan
Dongjie Cheng
Sihan Yang
Zhaorun Chen
Chenhang Cui
Xiyao Wang
Yun Li
Linjun Zhang
Huaxiu Yao
VLM
304
66
0
23 May 2024
AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability
Fei Zhao
Taotian Pang
Chunhui Li
Zhen Wu
Junjie Guo
Shangyu Xing
Xinyu Dai
202
13
0
23 May 2024
Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
International Conference on Learning Representations (ICLR), 2024
Chongjie Si
Xuehui Wang
Xue Yang
Zhengqin Xu
Qingyun Li
Jifeng Dai
Yu Qiao
Yunbo Wang
Wei Shen
258
6
0
23 May 2024
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
International Conference on Learning Representations (ICLR), 2024
Yongxin Guo
Zhenglin Cheng
Xiaoying Tang
Tao Lin
Tao Lin
MoE
557
33
0
23 May 2024
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM, VLM
224
39
0
22 May 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
LRM, ALM, LM&MA, ELM
476
119
0
21 May 2024
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Jiaqi Li
Qianshan Wei
Chuanyi Zhang
Guilin Qi
Miaozeng Du
Yongrui Chen
Sheng Bi
Fan Liu
VLM, MU
507
32
0
21 May 2024
Imp: Highly Capable Large Multimodal Models for Mobile Devices
Zhenwei Shao
Zhou Yu
Jun Yu
Xuecheng Ouyang
Lihao Zheng
Zhenbiao Gai
Mingyang Wang
Jiajun Ding
282
23
0
20 May 2024
Rethinking Overlooked Aspects in Vision-Language Models
Yuan Liu
Le Tian
Xiao Zhou
Jie Zhou
VLM
243
2
0
20 May 2024
TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models
Junlong Jia
Ying Hu
Xi Weng
Yiming Shi
Chenyi Guo
...
Baichuan Zhou
Ziyu Liu
Jie Luo
Lei Huang
Ji Wu
222
13
0
20 May 2024
MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Siddhant Agarwal
Shivam Sharma
Preslav Nakov
Tanmoy Chakraborty
267
10
0
18 May 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
308
86
0
17 May 2024
Libra: Building Decoupled Vision System on Large Language Models
International Conference on Machine Learning (ICML), 2024
Yifan Xu
Xiaoshan Yang
Y. Song
Changsheng Xu
MLLM, VLM
208
10
0
16 May 2024
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan
Yixuan Liu
Aswathy Ajith
Clara Grazian
B. Hoex
Wenjie Zhang
Chunyu Kit
Tong Xie
Ian Foster
242
21
0
16 May 2024
Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling
International Conference on Language Resources and Evaluation (LREC), 2024
Guangmin Zheng
Jin Wang
Xiaobing Zhou
Xuejie Zhang
LRM
154
7
0
16 May 2024
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2024
Donya Rooein
Paul Rottger
Anastassia Shaitarova
Dirk Hovy
212
9
0
15 May 2024
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Wanting Xu
Yang Liu
Langping He
Xucheng Huang
Ling Jiang
VLM, MLLM
211
5
0
15 May 2024
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
Neural Information Processing Systems (NeurIPS), 2024
Jiachen Li
Xinyao Wang
Sijie Zhu
Chia-Wen Kuo
Lu Xu
Fan Chen
Jitesh Jain
Humphrey Shi
Longyin Wen
MLLM, MoE
226
56
0
09 May 2024
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
International Conference on Machine Learning (ICML), 2024
Shibo Jie
Yehui Tang
Ning Ding
Zhi-Hong Deng
Kai Han
Yunhe Wang
VLM
345
20
0
09 May 2024
Language-Image Models with 3D Understanding
International Conference on Learning Representations (ICLR), 2024
Jang Hyun Cho
Boris Ivanovic
Yulong Cao
Edward Schmerling
Yue Wang
...
Boyi Li
Yurong You
Philipp Krahenbuhl
Yan Wang
Marco Pavone
LRM
190
27
0
06 May 2024
What matters when building vision-language models?
Neural Information Processing Systems (NeurIPS), 2024
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
313
278
0
03 May 2024
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
Zefang Liu
Jiahua Luo
MoE, KELM
346
24
0
01 May 2024
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Zhe Chen
Weiyun Wang
Hao Tian
Shenglong Ye
Zhangwei Gao
...
Tong Lu
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
MLLM, VLM
534
1,004
0
25 Apr 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL, KELM, LRM
409
160
0
25 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
368
76
0
24 Apr 2024
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Timin Gao
Peixian Chen
Mengdan Zhang
Chaoyou Fu
Chunjiang Ge
...
Shengchuan Zhang
Xiawu Zheng
Xing Sun
Liujuan Cao
Rongrong Ji
MLLM, LRM
306
50
0
24 Apr 2024
What Makes Multimodal In-Context Learning Work?
Folco Bertini Baldassini
Mustafa Shukor
Matthieu Cord
Laure Soulier
Benjamin Piwowarski
438
40
0
24 Apr 2024