Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer

18 February 2021
Rafal Powalski
Łukasz Borchmann
Dawid Jurkiewicz
Tomasz Dwojak
Michal Pietruszka
Gabriela Pałka
    ViT

Papers citing "Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer"

30 / 30 papers shown
Title
KIEval: Evaluation Metric for Document Key Information Extraction
Minsoo Khang
Sang Chul Jung
Sungrae Park
Teakgyu Hong
47
0
0
07 Mar 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
44
0
0
26 Feb 2025
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu
Zhenrong Zhang
Jiefeng Ma
Shuhang Liu
Jun Du
Jianshu Zhang
Mamba
35
1
0
18 Sep 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
33
3
0
17 Jul 2024
Reconstructing training data from document understanding models
Jérémie Dentan
Arnaud Paran
A. Shabou
AAML
SyDa
34
1
0
05 Jun 2024
CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models
Haoxiang Shi
Jiaan Wang
Jiarong Xu
Cen Wang
Tetsuya Sakai
LMTD
26
0
0
20 May 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
29
5
0
27 Apr 2024
DocGraphLM: Documental Graph Language Model for Information Extraction
Dongsheng Wang
Zhiqiang Ma
Armineh Nourbakhsh
Kang Gu
Sameena Shah
26
8
0
05 Jan 2024
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
26
3
0
21 Jun 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
Bhanu Prakash Voutharoja
Lizhen Qu
Fatemeh Shiri
20
1
0
08 May 2023
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents
M. Dhouib
G. Bettaieb
A. Shabou
12
20
0
24 Apr 2023
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
Laura Nguyen
Thomas Scialom
Benjamin Piwowarski
Jacopo Staiano
22
7
0
26 Jan 2023
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson
Gordon Lim
Yutong Ai
David Kuang
Kevin Leach
OODD
OOD
29
18
0
14 Oct 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang
Zhengjie Huang
Bin Luo
Qianglong Chen
Qiming Peng
...
Weichong Yin
Shi Feng
Yu Sun
Dianhai Yu
Yin Zhang
ViT
22
11
0
18 Sep 2022
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks
Andrea Gemelli
Sanket Biswas
Enrico Civitelli
Josep Lladós
S. Marinai
13
15
0
23 Aug 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM
ViT
18
73
0
30 Mar 2022
DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition
Denis Coquenet
Clément Chatelain
Thierry Paquet
22
57
0
23 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
31
159
0
04 Mar 2022
WebFormer: The Web-page Transformer for Structure Information Extraction
Qifan Wang
Yi Fang
Anirudh Ravula
Fuli Feng
Xiaojun Quan
Dongfang Liu
ViT
141
65
0
01 Feb 2022
OCR-free Document Understanding Transformer
Geewook Kim
Teakgyu Hong
Moonbin Yim
Jeongyeon Nam
Jinyoung Park
Jinyeong Yim
Wonseok Hwang
Sangdoo Yun
Dongyoon Han
Seunghyun Park
ViT
44
262
0
30 Nov 2021
ICDAR 2021 Competition on Document Visual Question Answering
Rubèn Pérez Tito
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
30
23
0
10 Nov 2021
Information Extraction from Visually Rich Documents with Font Style Embeddings
Ismail Oussaid
William Vanhuffel
Pirashanth Ratnamogan
Mhamed Hajaiej
Alexis Mathey
Thomas Gilles
16
1
0
07 Nov 2021
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding
Junlong Li
Yiheng Xu
Lei Cui
Furu Wei
VLM
3DGS
21
59
0
16 Oct 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong
Donghyun Kim
Mingi Ji
Wonseok Hwang
Daehyun Nam
Sungrae Park
VLM
23
149
0
10 Aug 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
25
270
0
22 Jun 2021
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
22
202
0
26 Apr 2021
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
145
498
0
29 Dec 2020
From Dataset Recycling to Multi-Property Extraction and Beyond
Tomasz Dwojak
Michal Pietruszka
Łukasz Borchmann
Jakub Chłędowski
Filip Graliński
34
5
0
06 Nov 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
122
355
0
27 May 2019