An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015

21 July 2015

Papers citing "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"

50 / 680 papers shown

Global-Local Aware Scene Text Editing

IEEE International Conference on Multimedia and Expo (ICME), 2025

200

03 Dec 2025

Handwritten Text Recognition for Low Resource Languages

115

01 Dec 2025

HunyuanOCR Technical Report

...

Jianchen Zhu

Jie Jiang

Linus

Han Hu

Chengquan Zhang

MLLM VLM

557

24 Nov 2025

TopoReformer: Mitigating Adversarial Attacks Using Topological Purification in OCR Models

106

19 Nov 2025

OTSNet: A Neurocognitive-Inspired Observation-Thinking-Spelling Pipeline for Scene Text Recognition

Lixu Sun

Nurmemet Yolwas

Wushour Silamu

11 Nov 2025

SilhouetteTell: Practical Video Identification Leveraging Blurred Recordings of Video Subtitles

Guanchong Huang

Song Fang

103

31 Oct 2025

Efficient License Plate Recognition via Pseudo-Labeled Supervision with Grounding DINO and YOLOv8

International Workshop on Machine Learning for Signal Processing (MLSP), 2025

Zahra Ebrahimi Vargoorani

Amir Mohammad Ghoreyshi

Ching Yee Suen

28 Oct 2025

Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges

...

116

22 Oct 2025

CharDiff-LP: A Diffusion Model with Character-Level Guidance for License Plate Image Restoration

114

20 Oct 2025

Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding

270

17 Oct 2025

ClapperText: A Benchmark for Text Recognition in Low-Resource Archival Documents

122

17 Oct 2025

UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy

15 Oct 2025

LadderMoE: Ladder-Side Mixture of Experts Adapters for Bronze Inscription Recognition

136

02 Oct 2025

Exploring OCR-augmented Generation for Bilingual VQA

JoonHo Lee

Sunho Park

VLM

116

02 Oct 2025

From Videos to Indexed Knowledge Graphs -- Framework to Marry Methods for Multimodal Content Analysis and Understanding

103

01 Oct 2025

Automatic Intermodal Loading Unit Identification using Computer Vision: A Scoping Review

123

22 Sep 2025

GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer

Md. Mahmudul Hasan

Ahmed Nesar Tahsin Choudhury

Mahmudul Hasan

Md. Mosaddek Khan

132

22 Sep 2025

Emotion-Aware Speech Generation with Character-Specific Voices for Comics

Zhiwen Qian

Jinhua Liang

Huan Zhang

106

18 Sep 2025

Exploring Light-Weight Object Recognition for Real-Time Document Detection

213

07 Sep 2025

The Telephone Game: Evaluating Semantic Drift in Unified Models

175

04 Sep 2025

Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR

29 Aug 2025

Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?

Vittorio Pippi

Konstantina Nikolaidou

122

13 Aug 2025

Enhanced Generative Structure Prior for Chinese Text Image Super-resolution

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

Xiaoming Li

Wangmeng Zuo

Chen Change Loy

175

11 Aug 2025

TEACH: Text Encoding as Curriculum Hints for Scene Text Recognition

Xiahan Yang

Hui Zheng

VLM

111

02 Aug 2025

LPTR-AFLNet: Lightweight Integrated Chinese License Plate Rectification and Recognition Network

229

22 Jul 2025

Hyper-Local Deformable Transformers for Text Spotting on Historical Maps

Knowledge Discovery and Data Mining (KDD), 2024

Yijun Lin

Yao-Yi Chiang

144

17 Jun 2025

Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition

Panagiotis Kaliosis

John Pavlopoulos

228

11 Jun 2025

Text-Aware Image Restoration with Diffusion Models

292

11 Jun 2025

Task-driven real-world super-resolution of document scans

Applied Sciences (AS), 2025

121

08 Jun 2025

Sign Language: Towards Sign Understanding for Robot Autonomy

226

03 Jun 2025

TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance

Keren Ye

Ignacio Garcia Dorado

250

29 May 2025

WriteViT: Handwritten Text Generation with Vision Transformer

224

19 May 2025

Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing

Journal of King Saud University: Computer and Information Sciences (J. King Saud Univ. Comput. Inf. Sci.), 2025

349

17 May 2025

DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation

International Conference on Knowledge and Smart Technology (ICKST), 2025

Naphat Nithisopa

Teerapong Panboonyuen

ViT

209

07 May 2025

Visual Text Processing: A Comprehensive Review and Unified Evaluation

...

442

30 Apr 2025

Single Document Image Highlight Removal via A Large-Scale Real-World Dataset and A Location-Aware Network

149

19 Apr 2025

ViMo: A Generative Visual GUI World Model for App Agents

541

15 Apr 2025

Enabling Collaborative Parametric Knowledge Calibration for Retrieval-Augmented Vision Question Answering

238

05 Apr 2025

Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Computer Vision and Pattern Recognition (CVPR), 2025

376

03 Apr 2025

NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

Dongwoo Park

Suk Pil Ko

958

01 Apr 2025

Leveraging Contrast Information for Efficient Document Shadow Removal

351

01 Apr 2025

Improving Applicability of Deep Learning based Token Classification models during Training

Anket Mehra

Malte Prieß

Marian Himstedt

276

28 Mar 2025

Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2025

Jan Kohút

Michal Hradiš

342

25 Mar 2025

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Computer Vision and Pattern Recognition (CVPR), 2025

300

24 Mar 2025

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Computer Vision and Pattern Recognition (CVPR), 2025

407

20 Mar 2025

A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition

Ritabrata Chakraborty

Shivakumara Palaiahnakote

Umapada Pal

Cheng-Lin Liu

VLM

277

19 Mar 2025

Historic Scripts to Modern Vision: A Novel Dataset and A VLM Framework for Transliteration of Modi Script to Devanagari

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2025

302

17 Mar 2025

MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs

AAAI Conference on Artificial Intelligence (AAAI), 2025

291

06 Mar 2025

TextDoctor: Unified Document Image Inpainting via Patch Pyramid Diffusion Models

Wanglong Lu

Lingming Su

Jingjing Zheng

Vinícius Veloso de Melo

279

06 Mar 2025

DashCop: Automated E-ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025

Deepti Rawat

Keshav Gupta

Aryamaan Basu Roy

Ravi Kiran Sarvadevabhatla

284

01 Mar 2025