PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering
arXiv:2404.12720 · 19 April 2024
Authors: Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, S. Han
Papers citing "PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering" (4 of 4 papers shown)

1. Unified Multi-Modal Interleaved Document Representation for Information Retrieval
   Jaewoo Lee, Joonho Ko, Jinheon Baek, Soyeong Jeong, Sung Ju Hwang
   1 citation · 03 Oct 2024

2. Training language models to follow instructions with human feedback
   Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
   Tags: OSLM, ALM · 11,730 citations · 04 Mar 2022

3. LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
   Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, ..., D. Florêncio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou
   Tags: ViT, MLLM · 492 citations · 29 Dec 2020

4. Big Bird: Transformers for Longer Sequences
   Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
   Tags: VLM · 1,982 citations · 28 Jul 2020