2209.09513
Cited By
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Neural Information Processing Systems (NeurIPS), 2022
20 September 2022
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
Papers citing
"Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering"
50 / 1,266 papers shown
Title
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
International Conference on Learning Representations (ICLR), 2023
Pan Lu
Hritik Bansal
Tony Xia
Hamish Ivison
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
527
1,107
0
03 Oct 2023
Language Models as Knowledge Bases for Visual Word Sense Disambiguation
Anastasia Kritharoula
Maria Lymperaiou
Giorgos Stamou
164
4
0
03 Oct 2023
HallE-Control: Controlling Object Hallucination in Large Multimodal Models
Bohan Zhai
Shijia Yang
Chenfeng Xu
Sheng Shen
Kurt Keutzer
Chunyuan Li
Manling Li
MLLM
281
22
0
03 Oct 2023
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
International Conference on Machine Learning (ICML), 2023
Yongshuo Zong
Tingyang Yu
Ruchika Chavhan
Bingchen Zhao
Timothy M. Hospedales
MLLM
AAML
LRM
231
25
0
02 Oct 2023
At Which Training Stage Does Code Data Help LLMs Reasoning?
International Conference on Learning Representations (ICLR), 2023
Xiaogang Jia
Yue Liu
Yue Yu
Yuanliang Zhang
Yu Jiang
Changjian Wang
Shanshan Li
LRM
SyDa
334
88
0
28 Sep 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Avamarie Brueggeman
Andrea Madotto
Mohammad Kachuee
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
266
109
0
27 Sep 2023
NLPBench: Evaluating Large Language Models on Solving NLP Problems
Linxin Song
Jieyu Zhang
Lechao Cheng
Pengyuan Zhou
Wanrong Zhu
Irene Li
ELM
LM&MA
LRM
297
14
0
27 Sep 2023
Graph Neural Prompting with Large Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yijun Tian
Huan Song
Zichen Wang
Haozhu Wang
Ziqing Hu
Fang Wang
Nitesh Chawla
Panpan Xu
AI4CE
379
71
0
27 Sep 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming-Yuan Liu
Bing Qin
Ting Liu
LRM
AI4CE
451
218
0
27 Sep 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Wang
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Da Lin
Yuan Liu
MLLM
674
300
0
26 Sep 2023
Multimodal Deep Learning for Scientific Imaging Interpretation
Abdulelah S. Alshehri
Franklin L. Lee
Shihu Wang
98
3
0
21 Sep 2023
Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM
Yuan An
Jane Greenberg
Alexander Kalinowski
Xintong Zhao
Xiaohua Hu
F. Uribe-Romo
Kyle Langlois
Jacob Furst
Diego A. Gómez-Gualdrón
183
6
0
20 Sep 2023
Investigating the Catastrophic Forgetting in Multimodal Large Language Models
Yuexiang Zhai
Shengbang Tong
Xiao Li
Mu Cai
Qing Qu
Yong Jae Lee
Yi Ma
VLM
MLLM
CLL
272
119
0
19 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
International Conference on Learning Representations (ICLR), 2023
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLM
VLM
398
180
0
14 Sep 2023
TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Huayang Li
Siheng Li
Deng Cai
Longyue Wang
Lemao Liu
Taro Watanabe
Yujiu Yang
Shuming Shi
MLLM
289
22
0
14 Sep 2023
Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yangyi Chen
Karan Sikka
Michael Cogswell
Heng Ji
Ajay Divakaran
LRM
273
41
0
08 Sep 2023
Evaluation and Enhancement of Semantic Grounding in Large Vision-Language Models
Jiaying Lu
Jinmeng Rao
Kezhen Chen
Xiaoyuan Guo
Yawen Zhang
Baochen Sun
Carl Yang
Jie Yang
169
13
0
07 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Shiyang Feng
Peng Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Jiaming Song
Yu Qiao
MLLM
240
148
0
07 Sep 2023
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
L. Yu
Bowen Shi
Ramakanth Pasunuru
Benjamin Muller
O. Yu. Golovneva
...
Yaniv Taigman
Maryam Fazel-Zarandi
Asli Celikyilmaz
Luke Zettlemoyer
Armen Aghajanyan
MLLM
211
160
0
05 Sep 2023
CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning
Hongyu Hu
Jiyuan Zhang
Minyi Zhao
Zhenbang Sun
MLLM
164
76
0
05 Sep 2023
AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models
BenchCouncil International Symposium (ISB), 2023
Fei Tang
Wanling Gao
Luzhou Peng
Jianfeng Zhan
ELM
118
2
0
05 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
369
5
0
05 Sep 2023
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
Yupan Huang
Zaiqiao Meng
Fangyu Liu
Yixuan Su
Nigel Collier
Yutong Lu
MLLM
138
32
0
31 Aug 2023
UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory
Computer Vision and Pattern Recognition (CVPR), 2023
Haiwen Diao
Bo Wan
Yanzhe Zhang
Xuecong Jia
Huchuan Lu
Long Chen
VLM
185
25
0
28 Aug 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Ran Bi
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA
SyDa
162
14
0
27 Aug 2023
SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
AAAI Conference on Artificial Intelligence (AAAI), 2023
Liangtai Sun
Yang Han
Zihan Zhao
Da Ma
Zhe-Wei Shen
Baocai Chen
Lu Chen
Kai Yu
ELM
220
121
0
25 Aug 2023
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
Zhiyuan Zhao
Linke Ouyang
Bin Wang
Siyuan Huang
Pan Zhang
Xiao-wen Dong
Yuan Liu
Conghui He
MLLM
255
8
0
25 Aug 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
Jinze Bai
Shuai Bai
Shusheng Yang
Shijie Wang
Sinan Tan
Peng Wang
Junyang Lin
Chang Zhou
Jingren Zhou
MLLM
VLM
ObjD
489
1,523
0
24 Aug 2023
DeepLOC: Deep Learning-based Bone Pathology Localization and Classification in Wrist X-ray Images
International Joint Conference on the Analysis of Images, Social Networks and Texts (AIST), 2023
R. Dibo
Andrey V. Galichin
P. Astashev
Dmitry V. Dylov
Oleg Y. Rogov
128
5
0
24 Aug 2023
HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks
Zichao Dong
Weikun Zhang
Xufeng Huang
Hang Ji
Xin Zhan
Junbo Chen
VLM
87
6
0
24 Aug 2023
InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4
Lai Wei
Zihao Jiang
Weiran Huang
Lichao Sun
VLM
MLLM
250
74
0
23 Aug 2023
Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs
Ziyi Tang
Ruilin Wang
Weixing Chen
Keze Wang
Zehua Wang
Tianshui Chen
Liang Lin
LRM
212
12
0
23 Aug 2023
Tackling Vision Language Tasks Through Learning Inner Monologues
AAAI Conference on Artificial Intelligence (AAAI), 2023
Diji Yang
Kezhen Chen
Jinmeng Rao
Xiaoyuan Guo
Yawen Zhang
Jie Yang
Yujiao Shi
MLLM
207
13
0
19 Aug 2023
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
Ariel N. Lee
Cole J. Hunter
Nataniel Ruiz
ALM
ObjD
203
171
0
14 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
399
97
0
12 Aug 2023
Thinking Like an Expert: Multimodal Hypergraph-of-Thought (HoT) Reasoning to boost Foundation Modals
Fanglong Yao
Changyuan Tian
Jintao Liu
Zequn Zhang
Qing Liu
Li Jin
Shuchao Li
Xiaoyu Li
Xian Sun
ReLM
LRM
117
23
0
11 Aug 2023
Sci-CoT: Leveraging Large Language Models for Enhanced Knowledge Distillation in Small Models for Scientific QA
International Conference on Innovative Computing and Cloud Computing (ICCC), 2023
Yuhan Ma
Haiqi Jiang
Chenyou Fan
LRM
156
17
0
09 Aug 2023
Tiny LVLM-eHub: Early Multimodal Experiments with Bard
IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023
Wenqi Shao
Yutao Hu
Shiyang Feng
Meng Lei
Kaipeng Zhang
...
Peng Xu
Siyuan Huang
Jiaming Song
Yuning Qiao
Ping Luo
VLM
MLLM
187
24
0
07 Aug 2023
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh
Sibei Chen
Chun-Liang Li
Yasuhisa Fujii
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
LLMAG
SyDa
276
58
0
01 Aug 2023
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
International Conference on Applications of Natural Language to Data Bases (NLDB), 2023
Tiezhu Sun
Weiguo Pian
N. Daoudi
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
338
4
0
30 Jul 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
271
54
0
30 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
408
151
0
25 Jul 2023
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework
Jingxuan Wei
Cheng Tan
Zhangyang Gao
Linzhuang Sun
Siyuan Li
Bihui Yu
R. Guo
Stan Z. Li
LRM
335
16
0
24 Jul 2023
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
International Conference on Machine Learning (ICML), 2023
Xiaoxuan Wang
Ziniu Hu
Pan Lu
Yanqiao Zhu
Jieyu Zhang
Satyen Subramaniam
Arjun R. Loomba
Shichang Zhang
Luke Huan
Wei Wang
ELM
LRM
359
169
0
20 Jul 2023
MMBench: Is Your Multi-modal Model an All-around Player?
European Conference on Computer Vision (ECCV), 2023
Yuanzhan Liu
Haodong Duan
Yuanhan Zhang
Yue Liu
Songyang Zhang
...
Yuan Liu
Conghui He
Ziwei Liu
Kai-xiang Chen
Dahua Lin
644
1,622
0
12 Jul 2023
SVIT: Scaling up Visual Instruction Tuning
Bo Zhao
Boya Wu
Muyang He
Tiejun Huang
MLLM
261
157
0
09 Jul 2023
MultiQG-TI: Towards Question Generation from Multi-modal Sources
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2023
Zichao Wang
Richard Baraniuk
105
7
0
07 Jul 2023
CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care
Neural Information Processing Systems (NeurIPS), 2023
Tong Xiang
Liangzhi Li
Wangyue Li
Min‐Jun Bai
Lu Wei
Bowen Wang
Noa Garcia
241
8
0
04 Jul 2023
SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions
Sameera Horawalavithana
Sai Munikoti
Ian Stewart
Henry Kvinge
MLLM
129
26
0
03 Jul 2023
Visual Instruction Tuning with Polite Flamingo
AAAI Conference on Artificial Intelligence (AAAI), 2023
Delong Chen
Jianfeng Liu
Wenliang Dai
Baoyuan Wang
MLLM
332
52
0
03 Jul 2023