
Fine-tuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023
Main: 15 pages · 11 figures · 3 tables · Bibliography: 3 pages
Abstract

In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple fine-tuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of fine-tuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, fine-tuning on new writers provided an average relative character error rate (CER) improvement of 25 % for 16 text lines and 50 % for 256 text lines.
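The recipe pairs a pre-trained CTC model with light data augmentation on the small target-writer set. As a rough illustration of the augmentation component, the following minimal NumPy sketch perturbs a grayscale text-line image with a small horizontal shift and pixel noise; the specific transforms and parameters here are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def augment_line(img, rng, max_shift=2, noise_std=0.05):
    """Lightly augment a grayscale text-line image (values in [0, 1]).

    Illustrative transforms only: a small horizontal jitter plus additive
    Gaussian pixel noise. The paper's actual augmentation may differ.
    """
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.roll(img, shift, axis=1)                   # horizontal jitter
    out = out + rng.normal(0.0, noise_std, out.shape)   # pixel noise
    return np.clip(out, 0.0, 1.0)                       # keep valid range

rng = np.random.default_rng(0)
line = rng.random((32, 256))        # toy 32x256 text-line image
aug = augment_line(line, rng)
print(aug.shape)                    # augmented image keeps its shape
```

During fine-tuning, each of the few target-domain lines would be re-augmented on every epoch, which is one plausible reason such small sets (16 to 256 lines) do not overfit.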
