arXiv:1507.05717
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015
21 July 2015
Baoguang Shi
X. Bai
Cong Yao
VLM
Papers citing "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"
50 / 680 papers shown
CustomText: Customized Textual Image Generation using Diffusion Models
Shubham Paliwal
Arushi Jain
Monika Sharma
Vikram Jamwal
Lovekesh Vig
132
5
0
21 May 2024
HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition
Honghui Chen
Yuhang Qiu
Jiabao Wang
Pingping Chen
Nam Ling
193
0
0
15 May 2024
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Zuan Gao
Yuxin Wang
Yadong Qu
Boqiang Zhang
Zixiao Wang
Jianjun Xu
Hongtao Xie
ViT
197
12
0
09 May 2024
Align, Minimize and Diversify: A Source-Free Unsupervised Domain Adaptation Method for Handwritten Text Recognition
María Alfaro-Contreras
Jorge Calvo-Zaragoza
214
0
0
28 Apr 2024
GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System
Lalita Kumari
Sukhdeep Singh
V. Rathore
Anuj Sharma
164
2
0
22 Apr 2024
A Dataset and Model for Realistic License Plate Deblurring
Haoyan Gong
Yuzheng Feng
Zhenrong Zhang
Xianxu Hou
Jingxin Liu
Siqi Huang
Hongbin Liu
161
9
0
21 Apr 2024
JSTR: Judgment Improves Scene Text Recognition
Masato Fujitake
239
1
0
09 Apr 2024
NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement
Giordano Cicchetti
Danilo Comminiello
158
8
0
08 Apr 2024
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding
Chuwei Luo
Yufan Shen
Zhaoqing Zhu
Qi Zheng
Zhi Yu
Cong Yao
370
95
0
08 Apr 2024
Bridging the Gap Between End-to-End and Two-Step Text Spotting
Mingxin Huang
Hongliang Li
Yuliang Liu
Xiang Bai
Lianwen Jin
213
11
0
06 Apr 2024
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
277
72
0
28 Mar 2024
Global License Plate Dataset
Siddharth Agrawal
156
1
0
22 Mar 2024
Practical End-to-End Optical Music Recognition for Pianoform Music
Jiří Mayer
Milan Straka
Jan Hajic
Pavel Pecina
121
12
0
20 Mar 2024
HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition
Yuyi Zhang
Yuanzhi Zhu
Dezhi Peng
Peirong Zhang
Zhenhua Yang
Zhibo Yang
Cong Yao
Lianwen Jin
193
10
0
20 Mar 2024
Efficient scene text image super-resolution with semantic guidance
LeoWu TomyEnrique
Xiangcheng Du
Kangliang Liu
Han Yuan
Zhao Zhou
Cheng Jin
VLM
181
8
0
20 Mar 2024
From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Kung-Hsiang Huang
Hou Pong Chan
Yi R. Fung
Haoyi Qiu
Mingyang Zhou
Shafiq Joty
Shih-Fu Chang
Chenhui Xu
AI4TS
467
56
0
18 Mar 2024
OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System
Chih-Chung Hsu
Chia-Ming Lee
Chun-Hung Sun
Kuang-Ming Wu
181
1
0
18 Mar 2024
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model
Jiahao Lyu
Jin Wei
Gangyan Zeng
Zeng Li
Enze Xie
Wei Wang
Can Ma
VLM
294
7
0
15 Mar 2024
IndicSTR12: A Dataset for Indic Scene Text Recognition
Harsh Lunia
Ajoy Mondal
C. V. Jawahar
168
3
0
12 Mar 2024
Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss
Xuhua Ren
Hengcan Shi
Jin Li
VLM
242
0
0
12 Mar 2024
LOCR: Location-Guided Transformer for Optical Character Recognition
Yu Sun
Dongzhan Zhou
Chen Lin
Conghui He
Wanli Ouyang
Han-Sen Zhong
297
4
0
04 Mar 2024
Efficiently Leveraging Linguistic Priors for Scene Text Spotting
Nguyen Nguyen
Yapeng Tian
Chenliang Xu
270
2
0
27 Feb 2024
Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition
Mingkun Yang
Biao Yang
Minghui Liao
Yingying Zhu
Xiang Bai
292
6
0
24 Feb 2024
Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition
Mingkun Yang
Biao Yang
Minghui Liao
Yingying Zhu
X. Bai
VLM
243
19
0
21 Feb 2024
VATr++: Choose Your Words Wisely for Handwritten Text Generation
Bram Vanherle
Vittorio Pippi
S. Cascianelli
Nick Michiels
F. Reeth
Rita Cucchiara
194
15
0
16 Feb 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2024
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
Thierry Paquet
275
20
0
12 Feb 2024
Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing
Yan Shu
Weichao Zeng
Zhenhang Li
Fangmin Zhao
Can Ma
241
8
0
05 Feb 2024
Text Image Inpainting via Global Structure-Guided Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Shipeng Zhu
Pengfei Fang
Chenjie Zhu
Zuoyan Zhao
Qiang Xu
Hui Xue
DiffM
229
17
0
26 Jan 2024
VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition
Xianfu Cheng
Weixiao Zhou
Xiang Li
Xiaoming Chen
Zhiqiang Wang
Tongliang Li
Zhoujun Li
359
1
0
18 Jan 2024
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
International Journal of Computer Vision (IJCV), 2024
Mingxin Huang
Dezhi Peng
Hongliang Li
Zhenghao Peng
Chongyu Liu
Dahua Lin
Yuliang Liu
Xiang Bai
Lianwen Jin
362
7
0
15 Jan 2024
Spatio-Temporal Turbulence Mitigation: A Translational Perspective
Computer Vision and Pattern Recognition (CVPR), 2024
Xingguang Zhang
Nicholas Chimitt
Yiheng Chi
Zhiyuan Mao
Stanley H. Chan
307
21
0
08 Jan 2024
Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling
IEEE Transactions on Image Processing (TIP), 2024
Shi-Xue Zhang
Chun Yang
Xiaobin Zhu
Hongyang Zhou
Hongfa Wang
Xu-Cheng Yin
297
15
0
08 Jan 2024
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
427
12
0
29 Dec 2023
Word length-aware text spotting: Enhancing detection and recognition in dense text image
Hao Wang
Huabing Zhou
Yanduo Zhang
Tao Lu
Jiayi Ma
205
1
0
25 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Can Ma
DiffM
387
7
0
19 Dec 2023
Cross-Lingual Learning in Multilingual Scene Text Recognition
Jeonghun Baek
Yusuke Matsui
Kiyoharu Aizawa
214
1
0
17 Dec 2023
Diffusion-based Blind Text Image Super-Resolution
Computer Vision and Pattern Recognition (CVPR), 2023
Yuzhe Zhang
Jiawei Zhang
Hao Li
Zhouxia Wang
Luwei Hou
Dongqing Zou
Liheng Bian
280
28
0
13 Dec 2023
Toward Real Text Manipulation Detection: New Dataset and New Solution
Dongliang Luo
Yuliang Liu
Rui Yang
Xianjin Liu
Jishen Zeng
Yu Zhou
Xiang Bai
207
11
0
12 Dec 2023
IDPL-PFOD2: A New Large-Scale Dataset for Printed Farsi Optical Character Recognition
Fatemeh Asadi-zeydabadi
Ali Afkari-Fahandari
Amin Faraji
Elham Shabaninia
Hossein Nezamabadi-pour
148
3
0
02 Dec 2023
Towards Higher Ranks via Adversarial Weight Pruning
Neural Information Processing Systems (NeurIPS), 2023
Yuchuan Tian
Hanting Chen
Tianyu Guo
Chao Xu
Yunhe Wang
231
5
0
29 Nov 2023
DSText V2: A Comprehensive Video Text Spotting Dataset for Dense and Small Text
Pattern Recognition (Pattern Recogn.), 2023
Weijia Wu
Yiming Zhang
Yefei He
Luoming Zhang
Zhenyu Lou
Hong Zhou
Xiang Bai
226
9
0
29 Nov 2023
PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution
ACM Multimedia (ACM MM), 2023
Zuoyan Zhao
Hui Xue
Pengfei Fang
Shipeng Zhu
DiffM
252
11
0
29 Nov 2023
STR-Cert: Robustness Certification for Deep Text Recognition on Deep Learning Pipelines and Vision Transformers
Daqian Shao
Lukas Fesser
Marta Z. Kwiatkowska
188
0
0
28 Nov 2023
Vulnerability Analysis of Transformer-based Optical Character Recognition to Adversarial Attacks
Lucas Beerens
D. Higham
207
1
0
28 Nov 2023
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
European Conference on Computer Vision (ECCV), 2023
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
DiffM
283
103
0
28 Nov 2023
Data Generation for Post-OCR correction of Cyrillic handwriting
Evgenii Davydkin
Aleksandr Markelov
Egor Iuldashev
Anton Dudkin
I. Krivorotov
293
4
0
27 Nov 2023
Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yuxuan Zhou
Liangcai Gao
Zhi Tang
Baole Wei
DiffM
213
13
0
22 Nov 2023
Towards Detecting, Recognizing, and Parsing the Address Information from Bangla Signboard: A Deep Learning-based Approach
Hasan Murad
Mohammed Eunus Ali
177
0
0
22 Nov 2023
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding
Hao Feng
Qi Liu
Hao Liu
Wen-gang Zhou
Houqiang Li
Can Huang
VLM
343
93
0
20 Nov 2023
Scene Text Image Super-resolution based on Text-conditional Diffusion Models
Chihiro Noguchi
Shun Fukuda
Masao Yamanaka
DiffM
228
22
0
16 Nov 2023