Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.09550
Cited By
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer
18 February 2021
Rafal Powalski
Łukasz Borchmann
Dawid Jurkiewicz
Tomasz Dwojak
Michal Pietruszka
Gabriela Pałka
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer"
30 / 30 papers shown
Title
KIEval: Evaluation Metric for Document Key Information Extraction
Minsoo Khang
Sang Chul Jung
Sungrae Park
Teakgyu Hong
47
0
0
07 Mar 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
44
0
0
26 Feb 2025
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu
Zhenrong Zhang
Jiefeng Ma
Shuhang Liu
Jun Du
Jianshu Zhang
Mamba
35
1
0
18 Sep 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
33
3
0
17 Jul 2024
Reconstructing training data from document understanding models
Jérémie Dentan
Arnaud Paran
A. Shabou
AAML
SyDa
36
1
0
05 Jun 2024
CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models
Haoxiang Shi
Jiaan Wang
Jiarong Xu
Cen Wang
Tetsuya Sakai
LMTD
26
0
0
20 May 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
29
5
0
27 Apr 2024
DocGraphLM: Documental Graph Language Model for Information Extraction
Dongsheng Wang
Zhiqiang Ma
Armineh Nourbakhsh
Kang Gu
Sameena Shah
26
8
0
05 Jan 2024
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
26
3
0
21 Jun 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
Bhanu Prakash Voutharoja
Lizhen Qu
Fatemeh Shiri
20
1
0
08 May 2023
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents
M. Dhouib
G. Bettaieb
A. Shabou
12
20
0
24 Apr 2023
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
Laura Nguyen
Thomas Scialom
Benjamin Piwowarski
Jacopo Staiano
24
7
0
26 Jan 2023
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson
Gordon Lim
Yutong Ai
David Kuang
Kevin Leach
OODD
OOD
29
18
0
14 Oct 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang
Zhengjie Huang
Bin Luo
Qianglong Chen
Qiming Peng
...
Weichong Yin
Shi Feng
Yu Sun
Dianhai Yu
Yin Zhang
ViT
22
11
0
18 Sep 2022
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks
Andrea Gemelli
Sanket Biswas
Enrico Civitelli
Josep Lladós
S. Marinai
13
15
0
23 Aug 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM
ViT
18
73
0
30 Mar 2022
DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition
Denis Coquenet
Clément Chatelain
Thierry Paquet
22
57
0
23 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
33
159
0
04 Mar 2022
WebFormer: The Web-page Transformer for Structure Information Extraction
Qifan Wang
Yi Fang
Anirudh Ravula
Fuli Feng
Xiaojun Quan
Dongfang Liu
ViT
141
65
0
01 Feb 2022
OCR-free Document Understanding Transformer
Geewook Kim
Teakgyu Hong
Moonbin Yim
Jeongyeon Nam
Jinyoung Park
Jinyeong Yim
Wonseok Hwang
Sangdoo Yun
Dongyoon Han
Seunghyun Park
ViT
48
262
0
30 Nov 2021
ICDAR 2021 Competition on Document VisualQuestion Answering
Rubèn Pérez Tito
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
30
23
0
10 Nov 2021
Information Extraction from Visually Rich Documents with Font Style Embeddings
Ismail Oussaid
William Vanhuffel
Pirashanth Ratnamogan
Mhamed Hajaiej
Alexis Mathey
Thomas Gilles
16
1
0
07 Nov 2021
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding
Junlong Li
Yiheng Xu
Lei Cui
Furu Wei
VLM
3DGS
21
59
0
16 Oct 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong
Donghyun Kim
Mingi Ji
Wonseok Hwang
Daehyun Nam
Sungrae Park
VLM
23
149
0
10 Aug 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
25
270
0
22 Jun 2021
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
22
202
0
26 Apr 2021
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
145
498
0
29 Dec 2020
From Dataset Recycling to Multi-Property Extraction and Beyond
Tomasz Dwojak
Michal Pietruszka
Łukasz Borchmann
Jakub Chlkedowski
Filip Graliñski
36
5
0
06 Nov 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
122
355
0
27 May 2019
1