1805.09687
Cited By
Corpus Conversion Service: A machine learning platform to ingest documents at scale [Poster abstract]
15 May 2018
Peter W. J. Staar
Michele Dolfi
Christoph Auer
C. Bekas
MedIm
Papers citing
"Corpus Conversion Service: A machine learning platform to ingest documents at scale [Poster abstract]"
17 / 17 papers shown
Title
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Y. Liu
Xiang Bai
48
1
0
22 Feb 2025
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
48
26
0
28 Mar 2024
ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Christoph Auer
A. Nassar
Maksym Lysak
Michele Dolfi
Nikolaos Livathinos
Peter W. J. Staar
OOD
3DV
27
6
0
24 May 2023
Optimized Table Tokenization for Table Structure Recognition
Maksym Lysak
Ahmed Nassar
Nikolaos Livathinos
Christoph Auer
Peter W. J. Staar
LMTD
25
13
0
05 May 2023
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
B. Pfitzmann
Christoph Auer
Michele Dolfi
A. Nassar
Peter W. J. Staar
16
85
0
02 Jun 2022
Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness
Christoph Auer
Michele Dolfi
A. Carvalho
Cesar Berrospi Ramis
Peter W. J. Staar
17
9
0
01 Jun 2022
TableFormer: Table Structure Understanding with Transformers
A. Nassar
Nikolaos Livathinos
Maksym Lysak
Peter W. J. Staar
LMTD
ViT
11
73
0
02 Mar 2022
CoVA: Context-aware Visual Attention for Webpage Information Extraction
Anurendra Kumar
Keval Morabia
Jingjing Wang
Kevin Chen-Chuan Chang
Alexander Schwing
23
11
0
24 Oct 2021
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups
Zejiang Shen
Kyle Lo
Lucy Lu Wang
Bailey Kuehl
Daniel S. Weld
Doug Downey
VLM
16
34
0
01 Jun 2021
Robust PDF Document Conversion Using Recurrent Neural Networks
Nikolaos Livathinos
Cesar Berrospi
Maksym Lysak
Viktor Kuropiatnyk
Ahmed Nassar
A. Carvalho
Michele Dolfi
Christoph Auer
K. Dinkla
Peter W. J. Staar
20
22
0
18 Feb 2021
Understanding in Artificial Intelligence
S. Maetschke
D. M. Iraola
Pieter Barnard
Elaheh Shafieibavani
Peter Zhong
Ying Xu
Antonio Jimeno Yepes
ELM
VLM
11
0
0
17 Jan 2021
Extracting Procedural Knowledge from Technical Documents
Shivali Agarwal
Shubham Atreja
V. Agarwal
19
4
0
20 Oct 2020
Cross-Domain Document Object Detection: Benchmark Suite and Method
K. Li
Curtis Wigington
Chris Tensmeyer
Handong Zhao
Nikolaos Barmpalios
Vlad I. Morariu
Varun Manjunatha
Tong Sun
Y. Fu
16
45
0
30 Mar 2020
A Machine Learning Framework for Data Ingestion in Document Images
Han Fu
Yunyu Bai
Zhuo Li
Jun Shen
Jianling Sun
19
1
0
11 Feb 2020
Image-based table recognition: data, model, and evaluation
Xu Zhong
Elaheh Shafieibavani
Antonio Jimeno Yepes
LMTD
16
212
0
25 Nov 2019
Fine-Grained Object Detection over Scientific Document Images with Region Embeddings
Ankur Goswami
Joshua McGrath
S. Peters
Theodoros Rekatsinas
ObjD
16
3
0
28 Oct 2019
PubLayNet: largest dataset ever for document layout analysis
Xu Zhong
Jianbin Tang
Antonio Jimeno Yepes
13
448
0
16 Aug 2019