DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis

2 June 2022
B. Pfitzmann
Christoph Auer
Michele Dolfi
A. Nassar
Peter W. J. Staar

Papers citing "DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis"

50 / 56 papers shown
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding
Aniket Pal
Sanket Biswas
Alloy Das
Ayush Lodh
Priyanka Banerjee
Soumitri Chattopadhyay
Dimosthenis Karatzas
Josep Lladós
C. V. Jawahar
VLM
32
0
0
12 Apr 2025
Capybara-OMNI: An Efficient Paradigm for Building Omni-Modal Language Models
Xingguang Ji
Jiakang Wang
Hongzhi Zhang
Jingyuan Zhang
Haonan Zhou
Chenxi Sun
Y. Liu
Qi Wang
Fuzheng Zhang
MLLM
VLM
58
0
0
10 Apr 2025
Archival Faces: Detection of Faces in Digitized Historical Documents
Marek Vaško
Adam Herout
Michal Hradiš
CVBM
65
0
0
01 Apr 2025
AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization
Martin Kiss
Michal Hradiš
Martina Dvořáková
Václav Jiroušek
Filip Kersch
46
1
0
28 Mar 2025
SFDLA: Source-Free Document Layout Analysis
Sebastian Tewes
Yufan Chen
Omar Moured
Jiaming Zhang
Rainer Stiefelhagen
50
0
0
24 Mar 2025
PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction
Ting Sun
Cheng Cui
Yuning Du
Yi Liu
48
1
0
21 Mar 2025
TextBite: A Historical Czech Document Dataset for Logical Page Segmentation
Martin Kostelník
Karel Beneš
Michal Hradiš
37
0
0
20 Mar 2025
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
55
0
0
20 Mar 2025
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
A. Nassar
Andres Marafioti
Matteo Omenetti
Maksym Lysak
Nikolaos Livathinos
...
Yusik Kim
A. Said Gurbuz
Michele Dolfi
Miquel Farré
Peter W. J. Staar
55
3
0
14 Mar 2025
EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation
Hong Cai Chen
Longchang Wu
Yang Zhang
38
0
0
23 Feb 2025
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Y. Liu
Xiang Bai
51
1
0
22 Feb 2025
Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence
Granite Vision Team
Leonid Karlinsky
Assaf Arbelle
Abraham Daniels
A. Nassar
...
Sriram Raghavan
T. Syeda-Mahmood
Peter W. J. Staar
Tal Drory
Rogerio Feris
VLM
AI4TS
114
0
0
14 Feb 2025
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
106
0
0
12 Feb 2025
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents
Ilia Karmanov
A. Deshmukh
Lukas Voegtle
Philipp Fischer
Kateryna Chumachenko
...
Jarno Seppänen
Jupinder Parmar
Joseph Jennings
Andrew Tao
Karan Sapra
73
0
0
06 Feb 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
Linke Ouyang
Yuan Qu
Hongbin Zhou
Jiawei Zhu
Rui Zhang
...
Chao Xu
Bo Zhang
Botian Shi
Zhongying Tu
Conghui He
101
5
0
10 Dec 2024
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
Yew Ken Chia
Liying Cheng
Hou Pong Chan
Chaoqun Liu
Maojia Song
Sharifah Mahani Aljunied
Soujanya Poria
Lidong Bing
RALM
VLM
43
4
0
09 Nov 2024
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Zhiyuan Zhao
Hengrui Kang
Bin Wang
Conghui He
27
10
0
16 Oct 2024
AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing
Huawei Ji
Cheng Deng
Bo Xue
Zhouyang Jin
Jiaxin Ding
Xiaoying Gan
Luoyi Fu
Xinbing Wang
Chenghu Zhou
26
0
0
16 Sep 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding
Chuanghao Ding
Xuejing Liu
Wei Tang
Juan Li
Xiaoliang Wang
Rui Zhao
Cam-Tu Nguyen
Fei Tan
23
0
0
27 Aug 2024
DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding
Wenhui Liao
Jiapeng Wang
Hongliang Li
Chengyu Wang
Jun Huang
Lianwen Jin
38
0
0
27 Aug 2024
MaterioMiner -- An ontology-based text mining dataset for extraction of process-structure-property entities
Ali Riza Durmaz
Akhil Thomas
Lokesh Mishra
Rachana Niranjan Murthy
Thomas Straub
35
1
0
05 Aug 2024
DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification
S. Saifullah
S. Agne
Andreas Dengel
Sheraz Ahmed
29
0
0
04 Jul 2024
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models
Renqiu Xia
Song Mao
Xiangchao Yan
Hongbin Zhou
Bo Zhang
...
Yongwei Wang
Bin Wang
Junchi Yan
Fei Wu
Yu Qiao
48
10
0
17 Jun 2024
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding
Jiefeng Ma
Yan Wang
Chenyu Liu
Jun Du
Yu Hu
Zhenrong Zhang
Pengfei Hu
Qing Wang
Jianshu Zhang
36
0
0
13 Jun 2024
DocSynthv2: A Practical Autoregressive Modeling for Document Generation
Sanket Biswas
R. Jain
Vlad I. Morariu
Jiuxiang Gu
Puneet Mathur
Curtis Wigington
Tong Sun
Josep Lladós
43
1
0
12 Jun 2024
M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation
Benjamin Hsu
Xiaoyu Liu
Huayang Li
Yoshinari Fujinuma
Maria Nadejde
Xing Niu
Yair Kittenplon
Ron Litman
R. Pappagari
33
4
0
12 Jun 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Jordy Van Landeghem
Subhajit Maity
Ayan Banerjee
Matthew Blaschko
Marie-Francine Moens
Josep Lladós
Sanket Biswas
50
2
0
12 Jun 2024
UnSupDLA: Towards Unsupervised Document Layout Analysis
Talha Uddin Sheikh
Tahira Shehzadi
K. Hashmi
Didier Stricker
Muhammad Zeshan Afzal
28
2
0
10 Jun 2024
Towards Unified Multi-granularity Text Detection with Interactive Attention
Xingyu Wan
Chengquan Zhang
Pengyuan Lyu
Sen Fan
Zihan Ni
Kun Yao
Errui Ding
Jingdong Wang
60
1
0
30 May 2024
DLAFormer: An End-to-End Transformer For Document Layout Analysis
Jiawei Wang
Kai Hu
Qiang Huo
3DV
ViT
25
3
0
20 May 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
34
5
0
27 Apr 2024
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding
Chuwei Luo
Yufan Shen
Zhaoqing Zhu
Qi Zheng
Zhi Yu
Cong Yao
31
38
0
08 Apr 2024
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin
Xinyu Wei
Ruichuan An
Peng Gao
Bocheng Zou
Yulin Luo
Siyuan Huang
Shanghang Zhang
Hongsheng Li
VLM
63
33
0
29 Mar 2024
Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence
Hsiu-Wei Yang
Abhinav Agrawal
Pavlos Fragkogiannis
Shubham Nitin Mulay
27
1
0
27 Mar 2024
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
Yufan Chen
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ruiping Liu
Philip H. S. Torr
Rainer Stiefelhagen
OOD
29
5
0
21 Mar 2024
RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning
Congyun Jin
Ming Zhang
Xiaowei Ma
Yujiao Li
Yingbo Wang
...
Chenfei Chi
Xiangguo Lv
Fangzhou Li
Wei Xue
Yiran Huang
LM&MA
27
2
0
19 Feb 2024
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
Ayan Banerjee
Sanket Biswas
Josep Lladós
Umapada Pal
38
1
0
17 Feb 2024
Financial Report Chunking for Effective Retrieval Augmented Generation
Antonio Jimeno-Yepes
Yao You
Jan Milczek
Sebastian Laverde
Renyu Li
43
20
0
05 Feb 2024
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Ryota Tanaka
Taichi Iki
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
21
23
0
24 Jan 2024
Dynamic Relation Transformer for Contextual Text Block Detection
Jiawei Wang
Shunchi Zhang
Kai Hu
Chixiang Ma
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
27
0
0
17 Jan 2024
WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data
Maurice Weber
Carlo Siebenschuh
Rory Butler
Anton Alexandrov
Valdemar Thanner
...
Haris Jabbar
Ian T. Foster
Bo-wen Li
Rick L. Stevens
Ce Zhang
13
4
0
15 Dec 2023
ESG Accountability Made Easy: DocQA at Your Service
Lokesh Mishra
Cesar Berrospi
K. Dinkla
Diego Antognini
Francesco Fusco
...
Panagiotis Vagenas
Lucas Morin
Christoph Auer
Michele Dolfi
Peter W. J. Staar
28
3
0
30 Nov 2023
A Scalable Framework for Table of Contents Extraction from Complex ESG Annual Reports
Xinyu Wang
Lin Gui
Yulan He
LMTD
21
2
0
27 Oct 2023
Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis
Shangbang Long
Siyang Qin
Yasuhisa Fujii
Alessandro Bissacco
Michalis Raptis
24
5
0
25 Oct 2023
Unveiling Document Structures with YOLOv5 Layout Detection
Herman Sugiharto
Yorissa Silviana
Langa Khumalo
17
0
0
29 Sep 2023
Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis
Sotirios Kastanas
Shaomu Tan
Yijiang He
27
1
0
29 Aug 2023
Beyond Document Page Classification: Design, Datasets, and Challenges
Jordy Van Landeghem
Sanket Biswas
Matthew B. Blaschko
Marie-Francine Moens
37
6
0
24 Aug 2023
ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Christoph Auer
A. Nassar
Maksym Lysak
Michele Dolfi
Nikolaos Livathinos
Peter W. J. Staar
OOD
3DV
27
6
0
24 May 2023
WeLayout: WeChat Layout Analysis System for the ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Mingliang Zhang
Zhen Cao
Juntao Liu
Liqiang Niu
Fandong Meng
Jie Zhou
40
6
0
11 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee
Sanket Biswas
Josep Lladós
Umapada Pal
ViT
12
16
0
08 May 2023