An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015
21 July 2015
Baoguang Shi
X. Bai
Cong Yao
    VLM

Papers citing "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"

50 / 680 papers shown
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Yunxing Liu
Xiang Bai
335
14
0
22 Feb 2025
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
315
6
0
12 Feb 2025
PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts
Badri Vishal Kasuba
Dhruv Kudale
Venkatapathy Subramanian
P. Chaudhuri
Ganesh Ramakrishnan
294
1
0
10 Feb 2025
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
Jiawei Liu
Yuanzhi Zhu
Feiyu Gao
Zhiyong Yang
P. Wang
Junyang Lin
Xinyu Wang
Wenyu Liu
DiffM
354
0
0
08 Jan 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
979
0
0
06 Jan 2025
Efficient Video-Based ALPR System Using YOLO and Visual Rhythm
Victor Nascimento Ribeiro
Nina S. T. Hirata
217
0
0
04 Jan 2025
Instruction-Guided Scene Text Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Yongkun Du
Z. Chen
Yuchen Su
Caiyan Jia
Yu-Gang Jiang
490
17
0
03 Jan 2025
Disentanglement and Compositionality of Letter Identity and Letter Position in Variational Auto-Encoder Vision Models
Bruno Bianchi
Aakash Agrawal
S. Dehaene
Emmanuel Chemla
Yair Lakretz
DRL CoGe
350
0
0
11 Dec 2024
TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition
Xingsong Ye
Yongkun Du
Yunbo Tao
Z. Chen
DiffM
433
2
0
02 Dec 2024
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness
Ahmad Mohammadshirazi
Pinaki Prasad Guha Neogi
Ser-Nam Lim
R. Ramnath
433
6
0
29 Nov 2024
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
Yongkun Du
Z. Chen
Hongtao Xie
Caiyan Jia
Yu-Gang Jiang
384
19
0
24 Nov 2024
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
Neural Information Processing Systems (NeurIPS), 2024
Yadong Qu
Yuxin Wang
Bangbang Zhou
Zihan Wang
Hongtao Xie
Yongdong Zhang
236
2
0
23 Nov 2024
Learning based Geéz character handwritten recognition
Hailemicael Lulseged Yimer
Hailegabriel Dereje Degefa
Marco Cristani
Federico Cunico
193
2
0
20 Nov 2024
Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition
T. Lin
Jinglei Zhang
Yi Xu
Kai Chen
Rui Zhang
Chong Chen
349
0
0
18 Nov 2024
SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2024
Jing Zhang
Chang-rui Liu
Chun Yang
167
3
0
10 Nov 2024
HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction
Rujiao Long
Pengfei Wang
Zhibo Yang
Cong Yao
251
0
0
02 Nov 2024
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
A. S. Penamakuri
Anand Mishra
328
2
0
24 Oct 2024
Human-Inspired Long-Term Indoor Localization in Human-Oriented Environment
Nicky Zimmerman
Matteo Sodano
232
0
0
16 Oct 2024
ChartKG: A Knowledge-Graph-Based Representation for Chart Images
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
Zhiguang Zhou
Haoxuan Wang
Zhengqing Zhao
Fengling Zheng
Yongheng Wang
Wei Chen
Yong Wang
287
4
0
13 Oct 2024
Grounding Partially-Defined Events in Multimodal Data
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Kate Sanders
Reno Kriz
David Etter
Hannah Recknor
Alexander Martin
Cameron Carpenter
Jingyang Lin
Benjamin Van Durme
171
5
0
07 Oct 2024
HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
Adrian Chan
Anupam Mijar
Mehreen Saeed
Chau-Wai Wong
Akram Khater
648
3
0
03 Oct 2024
AI-Powered Augmented Reality for Satellite Assembly, Integration and Test
Alvaro Patricio
Joao Valente
Atabak Dehban
Ines Cadilha
Daniel Reis
Rodrigo Ventura
132
3
0
26 Sep 2024
Text Image Generation for Low-Resource Languages with Dual Translation Learning
Chihiro Noguchi
Shun Fukuda
Shoichiro Mihara
Masao Yamanaka
DiffM
204
0
0
26 Sep 2024
General Detection-based Text Line Recognition
Neural Information Processing Systems (NeurIPS), 2024
Raphael Baena
Syrine Kalleli
Mathieu Aubry
981
1
0
25 Sep 2024
One Model for Two Tasks: Cooperatively Recognizing and Recovering Low-Resolution Scene Text Images by Iterative Mutual Guidance
Minyi Zhao
Yang Wang
Jihong Guan
Shuigeng Zhou
182
0
0
22 Sep 2024
VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer
ACM Multimedia (MM), 2024
Humen Zhong
Zhibo Yang
Zhaohai Li
Peng Wang
Jun Tang
Wenqing Cheng
Cong Yao
252
5
0
18 Sep 2024
HTR-VT: Handwritten Text Recognition with Vision Transformer
Pattern Recognition (Pattern Recogn.), 2024
Yuting Li
Dexiong Chen
Tinglong Tang
Xi Shen
ViT
156
32
0
13 Sep 2024
Boosting CNN-based Handwriting Recognition Systems with Learnable Relaxation Labeling
S. Ferro
Alessandro Torcinovich
Arianna Traviglia
Marcello Pelillo
126
0
0
09 Sep 2024
PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction
Lei Sheng
Shuai-Shuai Xu
LMTD
200
0
0
08 Sep 2024
RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry
Zhaowei Wang
Ying Hao
Hao Wei
Qing Xiao
Lulu Chen
Yulong Li
Yue Yang
Tianyi Li
DiffM
102
2
0
05 Sep 2024
Platypus: A Generalized Specialist Model for Reading Text in Various Forms
European Conference on Computer Vision (ECCV), 2024
Peng Wang
Zhaohai Li
Jun Tang
Humen Zhong
Fei Huang
Zhibo Yang
Cong Yao
VLM ObjD
203
2
0
27 Aug 2024
Decoder Pre-Training with only Text for Scene Text Recognition
ACM Multimedia (MM), 2024
Shuai Zhao
Yongkun Du
Zhineng Chen
Yu-Gang Jiang
154
6
0
11 Aug 2024
Image-to-LaTeX Converter for Mathematical Formulas and Text
Daniil Gurgurov
Aleksey Morshnev
ViT VLM
188
3
0
07 Aug 2024
LEGO: Self-Supervised Representation Learning for Scene Text Images
Yujin Ren
Jiaxin Zhang
Lianwen Jin
SSL
252
0
0
04 Aug 2024
Self-Supervised Learning for Text Recognition: A Critical Survey
International Journal of Computer Vision (IJCV), 2024
Carlos Peñarrubia
J. J. Valero-Mas
Jorge Calvo-Zaragoza
424
4
0
29 Jul 2024
Visual Text Generation in the Wild
Yuanzhi Zhu
Jiawei Liu
Feiyu Gao
Wenyu Liu
Xinggang Wang
Peng Wang
Fei Huang
Cong Yao
Zhibo Yang
DiffM
240
14
0
19 Jul 2024
Qalam : A Multimodal LLM for Arabic Optical Character and Handwriting Recognition
Gagan Bhatia
El Moatez Billah Nagoudi
Fakhraddin Alwajih
Muhammad Abdul-Mageed
179
11
0
18 Jul 2024
Back to Newton's Laws: Learning Vision-based Agile Flight via Differentiable Physics
Yuang Zhang
Yu Hu
Yunlong Song
Danping Zou
Weiyao Lin
328
36
0
15 Jul 2024
Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework
Shengqi Xu
Run Sun
Yi Chang
Shuning Cao
Xueyao Xiao
Luxin Yan
189
6
0
11 Jul 2024
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan
Chengyu Lin
Wei Shen
Xiaokang Yang
267
15
0
10 Jul 2024
Spanish TrOCR: Leveraging Transfer Learning for Language Adaptation
Filipe Lauar
Valentin Laurent
142
2
0
09 Jul 2024
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition
Bangbang Zhou
Yadong Qu
Zixiao Wang
Zicheng Li
Boqiang Zhang
Hongtao Xie
266
3
0
08 Jul 2024
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data
Renqing Luo
Yuhan Xu
224
0
0
24 Jun 2024
Fusion of Movement and Naive Predictions for Point Forecasting in Univariate Random Walks
Cheng Zhang
148
0
0
20 Jun 2024
AnyTrans: Translate AnyText in the Image with Large Scale Models
Zhipeng Qian
Pei Zhang
Baosong Yang
Kai Fan
Yiwei Ma
Yang Li
Xiaoshuai Sun
Rongrong Ji
VLM
240
3
0
17 Jun 2024
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text
International Conference on Learning Representations (ICLR), 2024
Tianyu Zhang
Suyuchen Wang
Lu Li
Ge Zhang
Perouz Taslakian
Sai Rajeswar
Jie Fu
Bang Liu
Yoshua Bengio
261
5
0
10 Jun 2024
Classification of Non-native Handwritten Characters Using Convolutional Neural Network
F. A. Mamun
S. Chowdhury
J. E. Giti
H. Sarker
264
1
0
06 Jun 2024
Improving Text Generation on Images with Synthetic Captions
Jun Young Koh
Sang Hyun Park
Joy Song
DiffM
339
4
0
01 Jun 2024
LOGO: Video Text Spotting with Language Collaboration and Glyph Perception Model
Hongen Liu
Di Sun
Jiahao Wang
Lu Dong
Gang Pan
270
1
0
29 May 2024
Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering
Hiba Maryam
Ling Fu
Jiajun Song
Tajrian Abm Shafayet
Qidi Luo
Xiang Bai
Yuliang Liu
175
0
0
21 May 2024