OBD-Finder: Explainable Coarse-to-Fine Text-Centric Oracle Bone Duplicates Discovery

Abstract

Oracle Bone Inscription (OBI) is the earliest systematic writing system in China, and identifying Oracle Bone (OB) duplicates is a fundamental problem in OBI research. In this work, we design a progressive OB duplicate discovery framework that combines unsupervised low-level keypoint matching with high-level, text-centric, content-based matching to refine and rank candidate OB duplicates with semantic awareness and interpretability. We compare our approach with state-of-the-art content-based image retrieval and image matching methods. It achieves comparable recall and the highest simplified mean reciprocal rank scores for both Top-5 and Top-15 retrieval results, with significantly better computational efficiency. In real-world deployment, we have discovered over 60 pairs of new OB duplicates that OBI researchers had missed for decades. The models, video illustration and demonstration of this work are available at: this https URL.
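
For readers unfamiliar with the ranking metric mentioned above, the following is a minimal sketch of the standard mean reciprocal rank restricted to the Top-k results. It is not the paper's "simplified" variant, whose definition is not given in the abstract; the function names and the toy identifiers are hypothetical and used only for illustration.

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank_at_k(ranked: Sequence[Hashable],
                         relevant: Set[Hashable],
                         k: int) -> float:
    """Return 1/rank of the first relevant item within the top k, else 0.0."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank_at_k(rankings: Sequence[Sequence[Hashable]],
                              relevants: Sequence[Set[Hashable]],
                              k: int) -> float:
    """Average the top-k reciprocal rank over all queries.

    Standard MRR@k, shown as an assumption-labelled illustration; the
    paper's simplified variant may be defined differently.
    """
    if len(rankings) != len(relevants):
        raise ValueError("rankings and relevants must have the same length")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank_at_k(r, rel, k)
                for r, rel in zip(rankings, relevants))
    return total / len(rankings)


# Toy data: two queries, each with a ranked list of candidate duplicates.
toy_rankings = [["a", "b", "c"], ["x", "y", "z"]]
toy_relevants = [{"b"}, {"q"}]
mrr_top5 = mean_reciprocal_rank_at_k(toy_rankings, toy_relevants, k=5)
```

In the toy data, the first query finds its true duplicate at rank 2 and the second finds none, so the Top-5 score is (1/2 + 0) / 2 = 0.25.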

@article{zhang2025_2505.03836,
  title={OBD-Finder: Explainable Coarse-to-Fine Text-Centric Oracle Bone Duplicates Discovery},
  author={Chongsheng Zhang and Shuwen Wu and Yingqi Chen and Matthias Aßenmacher and Christian Heumann and Yi Men and Gaojuan Fan and João Gama},
  journal={arXiv preprint arXiv:2505.03836},
  year={2025}
}