An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

arXiv: 1507.05717

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015
21 July 2015
Baoguang Shi
X. Bai
Cong Yao
    VLM

Papers citing "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"

50 / 680 papers shown
Global-Local Aware Scene Text Editing
IEEE International Conference on Multimedia and Expo (ICME), 2025
Fuxiang Yang
Tonghua Su
Donglin Di
Yin Chen
Xiangqian Wu
Zhongjie Wang
Lei Fan
DiffM
200
0
0
03 Dec 2025
Handwritten Text Recognition for Low Resource Languages
Sayantan Dey
Alireza Alaei
P. Roy
VLM
115
0
0
01 Dec 2025
HunyuanOCR Technical Report
Hunyuan Vision Team
Pengyuan Lyu
Xingyu Wan
Gengluo Li
Shangpin Peng
...
Jianchen Zhu
Jie Jiang
Linus
Han Hu
Chengquan Zhang
MLLM
VLM
557
0
0
24 Nov 2025
TopoReformer: Mitigating Adversarial Attacks Using Topological Purification in OCR Models
Bhagyesh Kumar
A S Aravinthakashan
Akshat Satyanarayan
Ishaan Gakhar
Ujjwal Verma
AAML
106
0
0
19 Nov 2025
OTSNet: A Neurocognitive-Inspired Observation-Thinking-Spelling Pipeline for Scene Text Recognition
Lixu Sun
Nurmemet Yolwas
Wushour Silamu
51
0
0
11 Nov 2025
SilhouetteTell: Practical Video Identification Leveraging Blurred Recordings of Video Subtitles
Guanchong Huang
Song Fang
103
0
0
31 Oct 2025
Efficient License Plate Recognition via Pseudo-Labeled Supervision with Grounding DINO and YOLOv8
International Workshop on Machine Learning for Signal Processing (MLSP), 2025
Zahra Ebrahimi Vargoorani
Amir Mohammad Ghoreyshi
Ching Yee Suen
69
0
0
28 Oct 2025
Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges
Cheng Huang
Nyima Tashi
Fan Gao
Yutong Liu
J. Li
...
Guojie Tang
Xiangxiang Wang
Jia Zhang
Tsengdar J. Lee
Yongbin Yu
116
0
0
22 Oct 2025
CharDiff-LP: A Diffusion Model with Character-Level Guidance for License Plate Image Restoration
Gyuhwan Park
Kihyun Na
Injung Kim
DiffM
114
0
0
20 Oct 2025
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
Sensen Gao
Shanshan Zhao
Xu Jiang
Lunhao Duan
Yong Xien Chng
Qing-Guo Chen
Weihua Luo
Kaifu Zhang
Jia-Wang Bian
Mingming Gong
270
1
0
17 Oct 2025
ClapperText: A Benchmark for Text Recognition in Low-Resource Archival Documents
Tingyu Lin
Marco Peer
Florian Kleber
Robert Sablatnig
122
0
0
17 Oct 2025
UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy
Tianshuo Xu
Kai Wang
Zhifei Chen
Leyi Wu
Tianshui Wen
Fei Chao
Ying-Cong Chen
DiffM
92
0
0
15 Oct 2025
LadderMoE: Ladder-Side Mixture of Experts Adapters for Bronze Inscription Recognition
Rixin Zhou
Peiqiang Qiu
Qian Zhang
Chuntao Li
Xi Yang
136
0
0
02 Oct 2025
Exploring OCR-augmented Generation for Bilingual VQA
JoonHo Lee
Sunho Park
VLM
116
0
0
02 Oct 2025
From Videos to Indexed Knowledge Graphs -- Framework to Marry Methods for Multimodal Content Analysis and Understanding
Basem Rizk
Joel Walsh
Mark Core
Benjamin Nye
103
0
0
01 Oct 2025
Automatic Intermodal Loading Unit Identification using Computer Vision: A Scoping Review
Emre Gülsoylu
Alhassan Abdelhalim
Derya Kara Boztas
Ole Grasse
Carlos Jahn
Simone Frintrop
Janick Edinger
123
0
0
22 Sep 2025
GraDeT-HTR: A Resource-Efficient Bengali Handwritten Text Recognition System utilizing Grapheme-based Tokenizer and Decoder-only Transformer
Md. Mahmudul Hasan
Ahmed Nesar Tahsin Choudhury
Mahmudul Hasan
Md. Mosaddek Khan
132
1
0
22 Sep 2025
Emotion-Aware Speech Generation with Character-Specific Voices for Comics
Zhiwen Qian
Jinhua Liang
Huan Zhang
106
0
0
18 Sep 2025
Exploring Light-Weight Object Recognition for Real-Time Document Detection
Lucas Wojcik
Luiz Coelho
Roger Granada
David Menotti
213
0
0
07 Sep 2025
The Telephone Game: Evaluating Semantic Drift in Unified Models
Sabbir Mollah
Rohit Gupta
S. Swetha
Qingyang Liu
Ahnaf Munir
Mubarak Shah
VLM
175
2
0
04 Sep 2025
Why Stop at Words? Unveiling the Bigger Picture through Line-Level OCR
Shashank Vempati
Nishit Anand
Gaurav Talebailkar
Arpan Garai
Chetan Arora
97
0
0
29 Aug 2025
Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?
Vittorio Pippi
Konstantina Nikolaidou
S. Cascianelli
George Retsinas
Giorgos Sfikas
Rita Cucchiara
Marcus Liwicki
DiffM
122
1
0
13 Aug 2025
Enhanced Generative Structure Prior for Chinese Text Image Super-resolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Xiaoming Li
Wangmeng Zuo
Chen Change Loy
175
0
0
11 Aug 2025
TEACH: Text Encoding as Curriculum Hints for Scene Text Recognition
Xiahan Yang
Hui Zheng
VLM
111
1
0
02 Aug 2025
LPTR-AFLNet: Lightweight Integrated Chinese License Plate Rectification and Recognition Network
Guangzhu Xu
Pengcheng Zuo
Zhi Ke
Bangjun Lei
229
0
0
22 Jul 2025
Hyper-Local Deformable Transformers for Text Spotting on Historical Maps
Knowledge Discovery and Data Mining (KDD), 2024
Yijun Lin
Yao-Yi Chiang
144
7
0
17 Jun 2025
Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
Panagiotis Kaliosis
John Pavlopoulos
228
1
0
11 Jun 2025
Text-Aware Image Restoration with Diffusion Models
Jaewon Min
J. Kim
Paul Hyunbin Cho
J. Lee
Jihye Park
Minkyu Park
S. Kim
Hyunhee Park
Seungryong Kim
292
1
0
11 Jun 2025
Task-driven real-world super-resolution of document scans
Applied Sciences (AS), 2025
Maciej Zyrek
Tomasz Tarasiewicz
Jakub Sadel
Aleksandra Krzywon
M. Kawulok
SupR
121
0
0
08 Jun 2025
Sign Language: Towards Sign Understanding for Robot Autonomy
Ayush Agrawal
Joel Loo
Nicky Zimmerman
David Hsu
SLR
226
0
0
03 Jun 2025
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance
Keren Ye
Ignacio Garcia Dorado
Michalis Raptis
M. Delbracio
Irene Zhu
P. Milanfar
Hossein Talebi
250
1
0
29 May 2025
WriteViT: Handwritten Text Generation with Vision Transformer
Dang Hoai Nam
Huynh Tong Dang Khoa
Vo Nguyen Le Duy
ViT
224
0
0
19 May 2025
Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing
Journal of King Saud University: Computer and Information Sciences (J. King Saud Univ. Comput. Inf. Sci.), 2025
Hanfang Cui
Longfei Song
Li Li
Dongxing Xu
Yanhua Long
349
0
0
17 May 2025
DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation
International Conference on Knowledge and Smart Technology (ICKST), 2025
Naphat Nithisopa
Teerapong Panboonyuen
ViT
209
0
0
07 May 2025
Visual Text Processing: A Comprehensive Review and Unified Evaluation
Yan Shu
Weichao Zeng
Fangmin Zhao
Zeyu Chen
Zhiyu Li
...
Paolo Rota
Xiang Bai
Lianwen Jin
Xu-Cheng Yin
Andrii Zadaianchuk
CoGe
442
6
0
30 Apr 2025
Single Document Image Highlight Removal via A Large-Scale Real-World Dataset and A Location-Aware Network
Lu Pan
Yu-Hsuan Huang
Hongxia Xie
Cheng Zhang
H Zhao
Hong-Han Shuai
Wen-Huang Cheng
149
0
0
19 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agents
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Haifeng Zhang
Jun Wang
Cheng Deng
LM&Ro
VGen
541
4
0
15 Apr 2025
Enabling Collaborative Parametric Knowledge Calibration for Retrieval-Augmented Vision Question Answering
Jiaqi Deng
Kaize Shi
Zonghan Wu
Huan Huo
Dingxian Wang
Guandong Xu
238
0
0
05 Apr 2025
Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation
Computer Vision and Pattern Recognition (CVPR), 2025
Xingguang Zhang
Nicholas Chimitt
Xijun Wang
Yu Yuan
Stanley H. Chan
376
3
0
03 Apr 2025
NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Dongwoo Park
Suk Pil Ko
958
0
0
01 Apr 2025
Leveraging Contrast Information for Efficient Document Shadow Removal
Wenshu Fan
Jiancheng Huang
Na Liu
Mingfu Yan
Yi Huang
Shifeng Chen
351
1
0
01 Apr 2025
Improving Applicability of Deep Learning based Token Classification models during Training
Anket Mehra
Malte Prieß
Marian Himstedt
276
0
0
28 Mar 2025
Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2025
Jan Kohút
Michal Hradiš
342
0
0
25 Mar 2025
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
Computer Vision and Pattern Recognition (CVPR), 2025
Yifei Zhang
Yu Xie
Jin Wei
Xiaomeng Yang
Can Ma
Xiangyang Ji
300
7
0
24 Mar 2025
Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation
Computer Vision and Pattern Recognition (CVPR), 2025
Andrea Maracani
Savas Ozkan
Sijun Cho
Hyowon Kim
Eunchung Noh
Jeongwon Min
Cho Jung Min
Dookun Park
Mete Ozay
407
1
0
20 Mar 2025
A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition
Ritabrata Chakraborty
Shivakumara Palaiahnakote
Umapada Pal
Cheng-Lin Liu
VLM
277
1
0
19 Mar 2025
Historic Scripts to Modern Vision: A Novel Dataset and A VLM Framework for Transliteration of Modi Script to Devanagari
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2025
Harshal Kausadikar
Tanvi Kale
Onkar Susladkar
Sparsh Mittal
302
0
0
17 Mar 2025
MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs
AAAI Conference on Artificial Intelligence (AAAI), 2025
T. Zhang
Zhuoxuan Jiang
Haotian Zhang
Lin Lin
Shaohua Zhang
LRM
291
1
0
06 Mar 2025
TextDoctor: Unified Document Image Inpainting via Patch Pyramid Diffusion Models
Wanglong Lu
Lingming Su
Jingjing Zheng
Vinícius Veloso de Melo
Farzaneh Shoeleh
J. Hawkin
T. Tricco
Hanli Zhao
Xianta Jiang
DiffM
279
2
0
06 Mar 2025
DashCop: Automated E-ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Deepti Rawat
Keshav Gupta
Aryamaan Basu Roy
Ravi Kiran Sarvadevabhatla
284
1
0
01 Mar 2025