ResearchTrend.AI
PP-OCR: A Practical Ultra Lightweight OCR System

21 September 2020
Yuning Du
Chenxia Li
Ruoyu Guo
Xiaoting Yin
Weiwei Liu
Jun Zhou
Yifan Bai
Zilin Yu
Yehua Yang
Qingqing Dang
Hongya Wang

Papers citing "PP-OCR: A Practical Ultra Lightweight OCR System"

50 / 63 papers shown
Title
Uni-AIMS: AI-Powered Microscopy Image Analysis
Yanhui Hong
Nan Wang
Zhiyi Xia
Haoyi Tao
Xi Fang
...
Shengyu Li
Ziqi Chen
Zezhong Zhang
Guolin Ke
Linfeng Zhang
9
0
0
11 May 2025
Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI
Benjamin Raphael Ernhofer
Daniil Prokhorov
Jannica Langner
Dominik Bollmann
21
0
0
09 May 2025
SymbioticRAG: Enhancing Document Intelligence Through Human-LLM Symbiotic Collaboration
Qiang Sun
Tingting Bi
Sirui Li
E. Holden
Paul Duuring
Kai Niu
Wei Liu
22
0
0
05 May 2025
Use of Metric Learning for the Recognition of Handwritten Digits, and its Application to Increase the Outreach of Voice-based Communication Platforms
Devesh Pant
Dibyendu Talukder
Deepak Kumar
Rachit Pandey
Aaditeshwar Seth
Chetan Arora
14
1
0
26 Apr 2025
Step1X-Edit: A Practical Framework for General Image Editing
S. Liu
Yucheng Han
Peng Xing
Fukun Yin
Rui Wang
...
Yibo Zhu
Binxing Jiao
X. Zhang
Gang Yu
Daxin Jiang
DiffM
93
2
0
24 Apr 2025
AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine
Carlo Siebenschuh
Kyle Hippe
Ozan Gokdemir
Alexander Brace
A. Khan
...
V. Vishwanath
R. Stevens
Arvind Ramanathan
Ian Foster
Robert Underwood
MoE
31
0
0
23 Apr 2025
Detecting and Understanding Hateful Contents in Memes Through Captioning and Visual Question-Answering
Ali Anaissi
Junaid Akram
Kunal Chaturvedi
Ali Braytee
20
0
0
23 Apr 2025
A Lightweight Multi-Module Fusion Approach for Korean Character Recognition
Inho Jake Park
Jaehoon Jay Jeong
Ho-Sang Jo
23
0
0
08 Apr 2025
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures
Lucas Morin
Valéry Weber
A. Nassar
Gerhard Ingmar Meijer
Luc Van Gool
Yawei Li
Peter W. J. Staar
56
1
0
20 Mar 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Zhendong Wang
Jianmin Bao
Shuyang Gu
Dong Chen
Wengang Zhou
H. Li
DiffM
44
0
0
03 Mar 2025
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts
Muhammad Farid Adilazuarda
M. Wijanarko
Lucky Susanto
Khumaisa Nur'aini
Derry Wijaya
Alham Fikri Aji
49
0
0
25 Feb 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J. Taylor
DiffM
40
1
0
16 Feb 2025
CT2C-QA: Multimodal Question Answering over Chinese Text, Table and Chart
Bowen Zhao
Tianhao Cheng
Yuejie Zhang
Ying Cheng
Rui Feng
Xiaobo Zhang
LMTD
13
1
0
28 Oct 2024
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
S. Yu
C. Tang
Bokai Xu
Junbo Cui
Junhao Ran
...
Zhenghao Liu
Shuo Wang
Xu Han
Zhiyuan Liu
Maosong Sun
VLM
29
21
0
14 Oct 2024
Grounding Partially-Defined Events in Multimodal Data
Kate Sanders
Reno Kriz
David Etter
Hannah Recknor
Alexander Martin
Cameron Carpenter
Jingyang Lin
Benjamin Van Durme
22
1
0
07 Oct 2024
A Reflection on the Impact of Misspecifying Unidentifiable Causal Inference Models in Surrogate Endpoint Evaluation
Gokce Deliorman
Florian Stijven
Wim Van der Elst
Maria del Carmen Pardo
Ariel Alonso
CML
23
0
0
06 Oct 2024
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
Zhen Han
Zeyinzi Jiang
Yulin Pan
Jingfeng Zhang
Chaojie Mao
Chenwei Xie
Yu Liu
Jingren Zhou
DiffM
18
11
0
30 Sep 2024
AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing
Huawei Ji
Cheng Deng
Bo Xue
Zhouyang Jin
Jiaxin Ding
Xiaoying Gan
Luoyi Fu
Xinbing Wang
Chenghu Zhou
15
0
0
16 Sep 2024
PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction
Lei Sheng
Shuai-Shuai Xu
LMTD
16
0
0
08 Sep 2024
Spanish TrOCR: Leveraging Transfer Learning for Language Adaptation
Filipe Lauar
Valentin Laurent
21
0
0
09 Jul 2024
High-Throughput Phenotyping using Computer Vision and Machine Learning
Vivaan Singhvi
Langalibalele Lunga
Pragya Nidhi
Chris Keum
Varrun Prakash
15
0
0
08 Jul 2024
OSPC: Artificial VLM Features for Hateful Meme Detection
Peter Grönquist
VLM
16
0
0
03 Jul 2024
AHMsys: An Automated HVAC Modeling System for BIM Project
Long Hoang Dang
Duy-Hung Nguyen
Thai Quang Le
Thinh Truong Nguyen
Clark Mei
Vu Hoang
AI4CE
21
0
0
02 Jul 2024
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data
Renqing Luo
Yuhan Xu
25
0
0
24 Jun 2024
AnyTrans: Translate AnyText in the Image with Large Scale Models
Zhipeng Qian
Pei Zhang
Baosong Yang
Kai Fan
Yiwei Ma
Derek F. Wong
Xiaoshuai Sun
Rongrong Ji
VLM
26
1
0
17 Jun 2024
Impact of Stickers on Multimodal Chat Sentiment Analysis and Intent Recognition: A New Task, Dataset and Baseline
Yuanchen Shi
Biao Ma
Fang Kong
16
0
0
14 May 2024
Bridging the Gap Between End-to-End and Two-Step Text Spotting
Mingxin Huang
Hongliang Li
Yuliang Liu
Xiang Bai
Lianwen Jin
33
3
0
06 Apr 2024
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
Hao Shao
Shengju Qian
Han Xiao
Guanglu Song
Zhuofan Zong
Letian Wang
Yu Liu
Hongsheng Li
VGen
LRM
MLLM
47
35
0
25 Mar 2024
The future of document indexing: GPT and Donut revolutionize table of content processing
Degaga Wolde Feyisa
Haylemicheal Berihun
Amanuel Zewdu
Mahsa Najimoghadam
Marzieh Zare
16
0
0
12 Mar 2024
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models
Xin Li
Yunfei Wu
Xinghua Jiang
Zhihao Guo
Ming Gong
Haoyu Cao
Yinsong Liu
Deqiang Jiang
Xing Sun
VLM
27
12
0
29 Feb 2024
Lumos: Empowering Multimodal LLMs with Scene Text Recognition
Ashish Shenoy
Yichao Lu
Srihari Jayakumar
Debojeet Chatterjee
Mohsen Moslehpour
...
Shicong Zhao
Longfang Zhao
Ankit Ramchandani
Xin Luna Dong
Anuj Kumar
MLLM
14
1
0
12 Feb 2024
V-IRL: Grounding Virtual Intelligence in Real Life
Jihan Yang
Runyu Ding
Ellis L Brown
Xiaojuan Qi
Saining Xie
LM&Ro
46
18
0
05 Feb 2024
Emu Edit: Precise Image Editing via Recognition and Generation Tasks
Shelly Sheynin
Adam Polyak
Uriel Singer
Yuval Kirstain
Amit Zohar
Oron Ashual
Devi Parikh
Yaniv Taigman
11
51
0
16 Nov 2023
Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models
Zhang Li
Biao Yang
Qiang Liu
Zhiyin Ma
Shuo Zhang
Jingxu Yang
Yabo Sun
Yuliang Liu
Xiang Bai
MLLM
14
240
0
11 Nov 2023
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving
Tushar Choudhary
Vikrant Dewangan
Shivam Chandhok
Shubham Priyadarshan
Anushka Jain
A. K. Singh
Siddharth Srivastava
Krishna Murthy Jatavallabhula
K. M. Krishna
24
57
0
03 Oct 2023
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition
Changxu Cheng
P. Wang
Cheng Da
Qi Zheng
Cong Yao
18
15
0
24 Aug 2023
MolGrapher: Graph-based Visual Recognition of Chemical Structures
Lucas Morin
Martin Danelljan
M. I. Agea
A. Nassar
Valéry Weber
Ingmar Meijer
Peter W. J. Staar
F. I. F. Richard Yu
GNN
19
10
0
23 Aug 2023
AltDiffusion: A Multilingual Text-to-Image Diffusion Model
Fulong Ye
Guangyi Liu
Xinya Wu
Ledell Yu Wu
VLM
14
25
0
19 Aug 2023
A Novel Pipeline for Improving Optical Character Recognition through Post-processing Using Natural Language Processing
Aishik Rakshit
Samyak Mehta
Anirban Dasgupta
10
0
0
09 Jul 2023
MultiQG-TI: Towards Question Generation from Multi-modal Sources
Zichao Wang
Richard Baraniuk
10
5
0
07 Jul 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation
Yukang Yang
Dongnan Gui
Yuhui Yuan
Weicong Liang
Haisong Ding
Hang-Rui Hu
Kai Chen
DiffM
14
76
0
29 May 2023
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset
Md. Istiak Hossain Shihab
Md. Rakibul Hasan
Mahfuzur Rahman Emon
Syed Mobassir Hossen
Md. Nazmuddoha Ansary
...
Sayma Sultana Chowdhury
Farig Sadeque
Tahsin Reasat
Ahmed Imtiaz Humayun
Asif Sushmit
6
13
0
09 Mar 2023
DocILE Benchmark for Document Information Localization and Extraction
Štěpán Šimsa
Milan Šulc
Michal Uřičář
Yash J. Patel
Ahmed Hamdi
...
Matyáš Skalický
Jiří Matas
Antoine Doucet
Mickael Coustaty
Dimosthenis Karatzas
11
33
0
11 Feb 2023
Lexi: Self-Supervised Learning of the UI Language
Pratyay Banerjee
Shweti Mahajan
Kushal Arora
Chitta Baral
Oriana Riva
25
17
0
23 Jan 2023
Improving Inference Performance of Machine Learning with the Divide-and-Conquer Principle
Alex Kogan
LRM
14
0
0
12 Jan 2023
Text2Poster: Laying out Stylized Texts on Retrieved Images
Chuhao Jin
Hongteng Xu
Ruihua Song
Zhiwu Lu
DiffM
12
6
0
06 Jan 2023
Text Detection Forgot About Document OCR
Krzysztof Olejniczak
Milan Šulc
14
9
0
14 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
Lingjiao Chen
Zhihua Jin
Sabri Eyuboglu
Christopher Ré
Matei A. Zaharia
James Y. Zou
35
8
0
18 Sep 2022
Computer vision based vehicle tracking as a complementary and scalable approach to RFID tagging
P. Gaur
Abhilasha Bhardwaj
P. Shete
Mohini Laghate
D. Sarode
6
0
0
13 Sep 2022
TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents
Zhanzhan Cheng
Peng Zhang
Can Li
Qiao Liang
Yunlu Xu
Pengfei Li
Shiliang Pu
Yi Niu
Fei Wu
6
9
0
14 Jul 2022