ManyModalQA: Modality Disambiguation and QA over Diverse Inputs

AAAI Conference on Artificial Intelligence (AAAI), 2020
22 January 2020
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
    AAML

Papers citing "ManyModalQA: Modality Disambiguation and QA over Diverse Inputs"

37 / 37 papers shown
Memory-QA: Answering Recall Questions Based on Multimodal Memories
Hongda Jiang
Xinyuan Zhang
Siddhant Garg
Rishab Arora
Shiun-Zu Kuo
...
Yue Liu
Aaron Colak
Ahmed Aly
Anuj Kumar
Xin Luna Dong
198
2
0
22 Sep 2025
Rethinking Information Synthesis in Multimodal Question Answering A Multi-Agent Perspective
Krishna Singh Rajput
Tejas Anvekar
Chitta Baral
Vivek Gupta
234
2
0
27 May 2025
VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering
Qi Zhi Lim
C. Lee
K. Lim
Kalaiarasi Sonai Muthu Anbananthen
299
1
0
11 Apr 2025
FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Seunghee Kim
Changhyeon Kim
Taeuk Kim
LRM
525
10
0
17 Dec 2024
CT2C-QA: Multimodal Question Answering over Chinese Text, Table and Chart
ACM Multimedia (MM), 2024
Bowen Zhao
Tianhao Cheng
Yuejie Zhang
Ying Cheng
Rui Feng
Xiaobo Zhang
LMTD
251
8
0
28 Oct 2024
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training
IEEE Transactions on Multimedia (IEEE TMM), 2024
Muhe Ding
Yang Ma
Pengda Qin
Yue Yu
Yuhong Li
Liqiang Nie
289
8
0
18 Oct 2024
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
International Conference on Learning Representations (ICLR), 2024
Wenbo Hu
Jia-Chen Gu
Zi-Yi Dou
Mohsen Fayyaz
Pan Lu
Kai-Wei Chang
Nanyun Peng
VLM
395
36
0
10 Oct 2024
Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models
IEEE Access, 2024
Akchay Srivastava
Atif Memon
ELM
251
7
0
19 Jun 2024
cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers
Anirudh S. Sundar
Jin Xu
William Gay
Christopher Richardson
Larry Heck
321
8
0
12 Jun 2024
MileBench: Benchmarking MLLMs in Long Context
Dingjie Song
Shunian Chen
Guiming Hardy Chen
Fei Yu
Xiang Wan
Benyou Wang
VLM
414
68
0
29 Apr 2024
iTBLS: A Dataset of Interactive Conversations Over Tabular Information
Anirudh S. Sundar
Christopher Richardson
William Gay
Larry Heck
LMTD
406
3
0
19 Apr 2024
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jielin Qiu
Andrea Madotto
Mohammad Kachuee
Paul A. Crook
Yongjun Xu
Xin Luna Dong
Christos Faloutsos
Lei Li
Babak Damavandi
Seungwhan Moon
269
18
0
07 Mar 2024
Exploring Hybrid Question Answering via Program-based Prompting
Qi Shi
Han Cui
Haofeng Wang
Qingfu Zhu
Wanxiang Che
Ting Liu
224
9
0
16 Feb 2024
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text
Wenting Zhao
Ye Liu
Tong Niu
Yao Wan
Philip S. Yu
Shafiq Joty
Yingbo Zhou
Semih Yavuz
LRM
278
9
0
31 Oct 2023
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
Neural Information Processing Systems (NeurIPS), 2023
Seongsu Bae
Daeun Kyung
Jaehee Ryu
Eunbyeol Cho
Gyubok Lee
...
Jungwoo Oh
Lei Ji
E. Chang
Tackeun Kim
Edward Choi
356
51
0
28 Oct 2023
MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model
Le Zhang
Yihong Wu
Fengran Mo
Jian-Yun Nie
Aishwarya Agrawal
MLLM, RALM
259
8
0
20 Oct 2023
Progressive Evidence Refinement for Open-domain Multimodal Retrieval Question Answering
Shuwen Yang
Anran Wu
Xingjiao Wu
Luwei Xiao
Tianlong Ma
Cheng Jin
Liang He
246
7
0
15 Oct 2023
MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images
Weihao Liu
Fangyu Lei
Tongxu Luo
Jiahe Lei
Shizhu He
Jun Zhao
Kang Liu
LMTD
216
17
0
09 Sep 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
International Conference on Learning Representations (ICLR), 2023
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
401
92
0
08 Aug 2023
Unified Language Representation for Question Answering over Text, Tables, and Images
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yu Bowen
Cheng Fu
Haiyang Yu
Fei Huang
Yongbin Li
LMTD
281
33
0
29 Jun 2023
Pre-Training Multi-Modal Dense Retrievers for Outside-Knowledge Visual Question Answering
International Conference on the Theory of Information Retrieval (ICTIR), 2023
Alireza Salemi
Mahta Rafiee
Hamed Zamani
232
13
0
28 Jun 2023
A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Alireza Salemi
Juan Altmayer Pizzorno
Hamed Zamani
166
25
0
26 Apr 2023
MPMQA: Multimodal Question Answering on Product Manuals
AAAI Conference on Artificial Intelligence (AAAI), 2023
Liangfu Zhang
Anwen Hu
Jing Zhang
Shuo Hu
Qin Jin
230
15
0
19 Apr 2023
cTBLS: Augmenting Large Language Models with Conversational Tables
Anirudh S. Sundar
Larry Heck
LMTD
409
12
0
21 Mar 2023
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Wenhu Chen
Hexiang Hu
Xi Chen
Pat Verga
William W. Cohen
RALM
425
259
0
06 Oct 2022
OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience
ACM Transactions on the Web (TWEB), 2022
Miaoran Li
Baolin Peng
Jianfeng Gao
Zhu Zhang
298
9
0
24 Jun 2022
Multimodal Conversational AI: A Survey of Datasets and Approaches
Anirudh S. Sundar
Larry Heck
176
35
0
13 May 2022
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
International Conference on Language Resources and Evaluation (LREC), 2022
Jayetri Bardhan
Anthony Colas
Kirk Roberts
D. Wang
CML
147
18
0
03 May 2022
Conversational Question Answering on Heterogeneous Sources
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Philipp Christmann
Rishiraj Saha Roy
Gerhard Weikum
365
50
0
25 Apr 2022
MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Revanth Reddy Gangi Reddy
Xilin Rui
Pengfei Yu
Xudong Lin
Haoyang Wen
...
Joey Tianyi Zhou
Avirup Sil
Shih-Fu Chang
Alex Schwing
Heng Ji
302
36
0
20 Dec 2021
Echo-Reconstruction: Audio-Augmented 3D Scene Reconstruction
Justin Wilson
Nicholas Rewkowski
Ming Lin
Henry Fuchs
176
1
0
05 Oct 2021
WebQA: Multihop and Multimodal QA
Computer Vision and Pattern Recognition (CVPR), 2021
Yingshan Chang
M. Narang
Hisami Suzuki
Guihong Cao
Jianfeng Gao
Yonatan Bisk
LRM
427
130
0
01 Sep 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
337
238
0
15 Jul 2021
Question Decomposition with Dependency Graphs
Conference on Automated Knowledge Base Construction (AKBC), 2021
Matan Hasson
Jonathan Berant
GNN
204
10
0
17 Apr 2021
Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Taichi Iki
Akiko Aizawa
VLM
271
22
0
16 Apr 2021
MultiModalQA: Complex Question Answering over Text, Tables and Images
International Conference on Learning Representations (ICLR), 2021
Alon Talmor
Ori Yoran
Amnon Catav
Dan Lahav
Yizhong Wang
Akari Asai
Gabriel Ilharco
Hannaneh Hajishirzi
Jonathan Berant
LMTD
336
220
0
13 Apr 2021
Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval
Akari Asai
Eunsol Choi
RALM
373
63
0
22 Oct 2020