ResearchTrend.AI


General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model


3 September 2024
Haoran Wei
Chenglong Liu
Jinyue Chen
Jia Wang
Lingyu Kong
Yanming Xu
Zheng Ge
Liang Zhao
Jianjian Sun
Yuang Peng
Chunrui Han
Xiangyu Zhang
    VLM

Papers citing "General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model"

13 / 13 papers shown
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling
Siqi Li
Yufan Shen
Xiangnan Chen
Jiayi Chen
Hengwei Ju
...
Licheng Wen
Botian Shi
Y. Liu
Xinyu Cai
Yu Qiao
VLM
ELM
84
0
0
30 Apr 2025
AutoP2C: An LLM-Based Agent Framework for Code Repository Generation from Multimodal Content in Academic Papers
Zijie Lin
Yiqing Shen
Qilin Cai
He Sun
Jinrui Zhou
Mingjun Xiao
45
0
0
28 Apr 2025
Kimi-VL Technical Report
Kimi Team
Angang Du
B. Yin
Bowei Xing
Bowen Qu
...
Zhiqi Huang
Zihao Huang
Zijia Zhao
Z. Chen
Zongyu Lin
MLLM
VLM
MoE
90
0
0
10 Apr 2025
Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models
Y. Zhang
Chunwang Zou
Bo Wang
Jing Qin
44
0
0
24 Mar 2025
PHT-CAD: Efficient CAD Parametric Primitive Analysis with Progressive Hierarchical Tuning
Ke Niu
Yuwen Chen
Haiyang Yu
Z. Chen
Xianghui Que
Bin Li
Xiangyang Xue
47
0
0
23 Mar 2025
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
Ahmed Heakl
Abdullah Sohail
Mukul Ranjan
Rania Hossam
Ghazi Ahmed
Mohamed El-Geish
Omar Maher
Zhiqiang Shen
Fahad A Khan
Salman Khan
VLM
47
0
0
20 Feb 2025
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents
Ilia Karmanov
A. Deshmukh
Lukas Voegtle
Philipp Fischer
Kateryna Chumachenko
...
Jarno Seppänen
Jupinder Parmar
Joseph Jennings
Andrew Tao
Karan Sapra
66
0
0
06 Feb 2025
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Kimi Team
Angang Du
Bofei Gao
Bowei Xing
Changjiu Jiang
...
Zhilin Yang
Zhiqi Huang
Zihao Huang
Ziyao Xu
Z. Yang
VLM
ALM
OffRL
AI4TS
LRM
82
128
0
22 Jan 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
Linke Ouyang
Yuan Qu
Hongbin Zhou
Jiawei Zhu
Rui Zhang
...
Chao Xu
Bo Zhang
Botian Shi
Zhongying Tu
Conghui He
86
5
0
10 Dec 2024
Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng
M. Li
Hongbin Zhou
Renqiu Xia
Renrui Zhang
...
Aojun Zhou
Botian Shi
Tao Chen
Bo Zhang
Xiangyu Yue
79
4
0
08 Dec 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
48
30
0
24 Jun 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
Thierry Paquet
28
9
0
12 Feb 2024
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukáš Neumann
Jiří Matas
Serge J. Belongie
169
458
0
26 Jan 2016