VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

14 December 2024

Papers citing "VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation"

How does Watermarking Affect Visual Language Models in Document Understanding?
Chunxue Xu, Yiwei Wang, Bryan Hooi, Yujun Cai, Songze Li
VLM | 44 0 0 | 01 Apr 2025

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
S. Han, Peng Xia, Ruiyi Zhang, Tong Sun, Yun-Qing Li, Hongtu Zhu, Huaxiu Yao
VLM | 55 2 0 | 18 Mar 2025

Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi, Amirhosein Zobeiri, Mahdi Dehghani, Mohammadali Mohammadkhani, Bardia Mohammadi, Omid Ghahroodi, M. Baghshah, Ehsaneddin Asgari
RALM | 82 3 0 | 12 Feb 2025