arXiv:2209.09513
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
Papers citing
"Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"
50 / 1,271 papers shown
Bootstrapping SparseFormers from Vision Foundation Models
Computer Vision and Pattern Recognition (CVPR), 2023
Ziteng Gao
Zhan Tong
Kevin Qinghong Lin
Joya Chen
Mike Zheng Shou
147
0
0
04 Dec 2023
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
231
11
0
04 Dec 2023
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites
Conference on Multimedia Modeling (MMM), 2023
Lei Wang
Jiabang He
Shenshen Li
Ning Liu
Ee-Peng Lim
MLLM
191
64
0
04 Dec 2023
Good Questions Help Zero-Shot Image Reasoning
Kaiwen Yang
Tao Shen
Xinmei Tian
Xiubo Geng
Chongyang Tao
Dacheng Tao
Wanrong Zhu
LRM
217
10
0
04 Dec 2023
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Computer Vision and Pattern Recognition (CVPR), 2023
Mu Cai
Haotian Liu
Dennis Park
Siva Karthik Mustikovela
Gregory P. Meyer
Yuning Chai
Yong Jae Lee
VLM
LRM
MLLM
309
147
0
01 Dec 2023
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Chaoyi Zhang
Kevin Qinghong Lin
Zhengyuan Yang
Jianfeng Wang
Linjie Li
Chung-Ching Lin
Zicheng Liu
Lijuan Wang
VGen
238
48
0
29 Nov 2023
Contrastive Vision-Language Alignment Makes Efficient Instruction Learner
Lizhao Liu
Xinyu Sun
Tianhang Xiang
Zhuangwei Zhuang
Liuren Yin
Mingkui Tan
VLM
159
4
0
29 Nov 2023
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
European Conference on Computer Vision (ECCV), 2023
Yanwei Li
Chengyao Wang
Jiaya Jia
VLM
MLLM
283
467
0
28 Nov 2023
SEED-Bench-2: Benchmarking Multimodal Large Language Models
Bohao Li
Yuying Ge
Yixiao Ge
Guangzhi Wang
Rui Wang
Ruimao Zhang
Ying Shan
MLLM
VLM
156
83
0
28 Nov 2023
Compositional Chain-of-Thought Prompting for Large Multimodal Models
Computer Vision and Pattern Recognition (CVPR), 2023
Chancharik Mitra
Brandon Huang
Trevor Darrell
Roei Herzig
MLLM
LRM
308
162
0
27 Nov 2023
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Computer Vision and Pattern Recognition (CVPR), 2023
Xiang Yue
Yuansheng Ni
Kai Zhang
Tianyu Zheng
Ruoqi Liu
...
Yibo Liu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
OSLM
ELM
VLM
841
1,575
0
27 Nov 2023
LANS: A Layout-Aware Neural Solver for Plane Geometry Problem
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhong-Zhi Li
Ming-Liang Zhang
Fei Yin
Cheng-Lin Liu
200
20
0
25 Nov 2023
Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training
European Conference on Computer Vision (ECCV), 2023
Cheng Tan
Jingxuan Wei
Zhangyang Gao
Linzhuang Sun
Siyuan Li
Ruifeng Guo
Xihong Yang
Stan Z. Li
LRM
256
28
0
23 Nov 2023
CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning
IEEE International Conference on Data Engineering (ICDE), 2023
Yilun Liu
Shimin Tao
Xiaofeng Zhao
Ming Zhu
Wenbing Ma
...
Min Zhang
Hongxia Ma
Li Zhang
Hao Yang
Yanfei Jiang
176
20
0
22 Nov 2023
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
European Conference on Computer Vision (ECCV), 2023
Lin Chen
Jinsong Li
Xiao-wen Dong
Pan Zhang
Conghui He
Yuan Liu
Feng Zhao
Dahua Lin
MLLM
VLM
351
927
0
21 Nov 2023
From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiaxin Ge
Sanjay Subramanian
Trevor Darrell
Boyi Li
LRM
241
4
0
21 Nov 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
1.4K
1,153
0
16 Nov 2023
Structured Chemistry Reasoning with Large Language Models
Siru Ouyang
Zhuosheng Zhang
Bing Yan
Xuan Liu
Yejin Choi
Jiawei Han
Lianhui Qin
LRM
141
26
0
16 Nov 2023
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing
Jie Ren
Han Xu
Yiding Liu
Yingqian Cui
Shuaiqiang Wang
D. Yin
Shucheng Zhou
OffRL
378
75
0
15 Nov 2023
XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zichen Chen
Jianda Chen
Mitali Gaidhani
Ambuj K. Singh
Misha Sra
189
4
0
15 Nov 2023
Unlock the Power: Competitive Distillation for Multi-Modal Large Language Models
Xinwei Li
Li Lin
Shuai Wang
Chen Qian
102
6
0
14 Nov 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
460
347
0
14 Nov 2023
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
Ziyi Lin
Chris Liu
Renrui Zhang
Shiyang Feng
Longtian Qiu
...
Siyuan Huang
Yichi Zhang
Xuming He
Jiaming Song
Yu Qiao
MLLM
VLM
280
272
0
13 Nov 2023
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Nishant Balepur
Shramay Palta
Rachel Rudinger
LRM
197
14
0
13 Nov 2023
Large Language Models are In-context Teachers for Knowledge Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiachen Zhao
Zonghai Yao
Zhichao Yang
Hong-ye Yu
ReLM
LRM
145
2
0
12 Nov 2023
InfMLLM: A Unified Framework for Visual-Language Tasks
Qiang-feng Zhou
Zhibin Wang
Wei Chu
Yinghui Xu
Hao Li
Yuan Qi
MLLM
119
12
0
12 Nov 2023
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Computer Vision and Pattern Recognition (CVPR), 2023
Zhang Li
Biao Yang
Qiang Liu
Zhiyin Ma
Shuo Zhang
Jingxu Yang
Yabo Sun
Yuliang Liu
Xiang Bai
MLLM
449
377
0
11 Nov 2023
Analyzing Modular Approaches for Visual Question Decomposition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Apoorv Khandelwal
Ellie Pavlick
Chen Sun
238
5
0
10 Nov 2023
u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model
Jinjin Xu
Liwu Xu
Yuzhe Yang
Xiang Li
Fanyi Wang
Yanchun Xie
Yi-Jie Huang
Yaqian Li
MoE
MLLM
VLM
380
24
0
09 Nov 2023
Agent Lumos: Unified and Modular Training for Open-Source Language Agents
Da Yin
Faeze Brahman
Abhilasha Ravichander
Khyathi Chandu
Kai-Wei Chang
Yejin Choi
Bill Yuchen Lin
LLMAG
268
59
0
09 Nov 2023
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
Zhen Yang
Yingxue Zhang
Fandong Meng
Jie Zhou
VLM
MLLM
171
4
0
08 Nov 2023
CogVLM: Visual Expert for Pretrained Language Models
Neural Information Processing Systems (NeurIPS), 2023
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM
MLLM
607
699
0
06 Nov 2023
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE
International Conference on Learning Representations (ICLR), 2023
Zeren Chen
Ziqin Wang
Zhen Wang
Huayang Liu
Zhen-fei Yin
Si Liu
Lu Sheng
Wanli Ouyang
Yu Qiao
Jing Shao
MoE
233
16
0
05 Nov 2023
Is GPT Powerful Enough to Analyze the Emotions of Memes?
International Conference on Machine Learning and Applications (ICMLA), 2023
Jingjing Wang
Joshua Luo
Grace Yang
Allen Hong
Feng Luo
ELM
AI4MH
142
5
0
01 Nov 2023
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
Neural Information Processing Systems (NeurIPS), 2023
Seongsu Bae
Daeun Kyung
Jaehee Ryu
Eunbyeol Cho
Gyubok Lee
...
Jungwoo Oh
Lei Ji
E. Chang
Tackeun Kim
Edward Choi
256
43
0
28 Oct 2023
An Early Evaluation of GPT-4V(ision)
Yang Wu
Shilong Wang
Hao Yang
Tian Zheng
Hongbo Zhang
Yanyan Zhao
Bing Qin
MLLM
ELM
145
48
0
25 Oct 2023
DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
Neural Information Processing Systems (NeurIPS), 2023
Ge Zheng
Bin Yang
Jiajin Tang
Hong-Yu Zhou
Sibei Yang
LRM
MLLM
276
178
0
25 Oct 2023
Data Pruning via Moving-one-Sample-out
Neural Information Processing Systems (NeurIPS), 2023
Haoru Tan
Sitong Wu
Fei Du
Yukang Chen
Zhibin Wang
Fan Wang
Xiaojuan Qi
327
62
0
23 Oct 2023
Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Gurusha Juneja
Subhabrata Dutta
Soumen Chakrabarti
Sunny Manchanda
Tanmoy Chakraborty
LRM
ReLM
308
19
0
21 Oct 2023
MarineGPT: Unlocking Secrets of Ocean to the Public
Ziqiang Zheng
Jipeng Zhang
Tuan-Anh Vu
Shizhe Diao
Yue Him Wong Tim
Sai-Kit Yeung
295
23
0
20 Oct 2023
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Yuchen Zhuang
Xiang Chen
Tong Yu
Saayan Mitra
Victor S. Bursztyn
Ryan Rossi
Somdeb Sarkhel
Chao Zhang
LLMAG
253
91
0
20 Oct 2023
Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education
Symposium on Information and Communication Technology (SICT), 2023
Duc-Vu Nguyen
Quoc-Nam Nguyen
440
7
0
18 Oct 2023
VLIS: Unimodal Language Models Guide Multimodal Language Generation
Jiwan Chung
Youngjae Yu
VLM
193
2
0
15 Oct 2023
MM-BigBench: Evaluating Multimodal Models on Multimodal Content Comprehension Tasks
Xiaocui Yang
Wenfang Wu
Shi Feng
Ming Wang
Daling Wang
Yang Li
Qi Sun
Yifei Zhang
Xiaoming Fu
Soujanya Poria
LRM
ELM
163
20
0
13 Oct 2023
Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
Junyu Lu
Di Zhang
Xiaojun Wu
Xinyu Gao
Ruyi Gan
Jiaxing Zhang
Yan Song
Pingjian Zhang
VLM
MLLM
155
11
0
12 Oct 2023
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
Cunxiang Wang
Xiaoze Liu
Yuanhao Yue
Xiangru Tang
Tianhang Zhang
...
Linyi Yang
Yongfeng Zhang
Xing Xie
Zheng Zhang
Yue Zhang
HILM
KELM
415
251
0
11 Oct 2023
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
Xiao Wang
Yuan Zhang
Tianze Chen
Songyang Gao
Senjie Jin
...
Rui Zheng
Yicheng Zou
Tao Gui
Xuanjing Huang
ALM
LRM
CLL
179
35
0
10 Oct 2023
On the Evaluation and Refinement of Vision-Language Instruction Tuning Datasets
Ning Liao
Shaofeng Zhang
Renqiu Xia
Min Cao
Yu Qiao
Junchi Yan
MLLM
139
0
0
10 Oct 2023
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
KAI-QING Zhou
Kwonjoon Lee
Teruhisa Misu
Xin Eric Wang
LRM
245
8
0
09 Oct 2023
Towards Better Chain-of-Thought Prompting Strategies: A Survey
Zihan Yu
Liang He
Zhen Wu
Xinyu Dai
Jiajun Chen
LRM
326
82
0
08 Oct 2023