2412.10704
Cited By
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
14 December 2024
Manan Suri, Puneet Mathur, Franck Dernoncourt, Kanika Goswami, Ryan Rossi, Dinesh Manocha
Papers citing "VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation"
3 / 3 papers shown
Title
How does Watermarking Affect Visual Language Models in Document Understanding?
Chunxue Xu, Yiwei Wang, Bryan Hooi, Yujun Cai, Songze Li
VLM
44, 0, 0
01 Apr 2025

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
S. Han, Peng Xia, Ruiyi Zhang, Tong Sun, Yun-Qing Li, Hongtu Zhu, Huaxiu Yao
VLM
55, 2, 0
18 Mar 2025

Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi, Amirhosein Zobeiri, Mahdi Dehghani, Mohammadali Mohammadkhani, Bardia Mohammadi, Omid Ghahroodi, M. Baghshah, Ehsaneddin Asgari
RALM
82, 3, 0
12 Feb 2025