Optical Chemical Structure Recognition (OCSR) aims to convert images of chemical structures into machine-readable formats so that they can be stored and queried efficiently in chemical databases. Although a number of OCSR tools have been developed, little is known about how well they perform under different kinds of image degradation. This work presents a new dataset of chemical structure images that have been systematically degraded by compression, noise, distortion, and black overlays. Publicly available OCSR tools were evaluated on each of these subsets to determine how robust they are to adverse conditions. The results show notable variation in performance and highlight each tool's strengths and weaknesses. MolScribe had the highest recognition rate on undamaged images (94.6%) and performed best under heavy compression (55.8% at 99%). MolVec was highly robust to noise and black overlay (86.8% at 40%), although its performance declined under extreme distortion (<70%). DECIMER was highly sensitive to noise and black overlay, with recognition rates below 30%, and Imago had the lowest baseline accuracy (73.6%). This evaluation offers new insight into how OCSR tools perform when image quality deteriorates, as well as practical benchmarks for future tool development.
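As a rough illustration of the kind of evaluation described above, the following minimal sketch degrades a grayscale image and scores recognition results. It is not the authors' pipeline. The function names, the salt-and-pepper noise model, the column-wise overlay, and the reading of a level such as 40% as a fraction of pixels or image area are all assumptions made for illustration. Exact string comparison stands in for the structure comparison a real evaluation would likely perform with a cheminformatics toolkit.

```python
import random

# Hypothetical helpers: images are lists of rows of grayscale values
# (0 = black, 255 = white). None of these names come from the paper.

def add_salt_pepper_noise(img, fraction, rng):
    """Return a copy in which `fraction` of the pixels are reset to black or white."""
    out = [row[:] for row in img]
    height, width = len(out), len(out[0])
    n_pixels = round(fraction * height * width)
    # Pick distinct pixel positions, then overwrite each with 0 or 255.
    for idx in rng.sample(range(height * width), n_pixels):
        out[idx // width][idx % width] = rng.choice((0, 255))
    return out

def add_black_overlay(img, fraction):
    """Return a copy whose leftmost columns, about `fraction` of the width, are black."""
    width = len(img[0])
    n_cols = round(fraction * width)
    return [[0] * n_cols + row[n_cols:] for row in img]

def recognition_rate(predicted, expected):
    """Fraction of predictions that exactly match the expected strings."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and equal in length")
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the degradation is reproducible
    blank = [[255] * 10 for _ in range(10)]
    noisy = add_salt_pepper_noise(blank, 0.4, rng)
    covered = add_black_overlay(blank, 0.4)
    print(sum(v == 0 for row in covered for v in row), "black pixels in overlay")
    print(recognition_rate(["CCO", "C"], ["CCO", "CC"]))
```

Each degradation returns a new image and leaves the input untouched, so the same source image can be reused across degradation types and levels.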
@article{lin2025_2502.15768,
  title={Exploring the Role of Artificial Intelligence and Machine Learning in Process Optimization for Chemical Industry},
  author={Zishuo Lin and Jiajie Wang and Zhe Yan and Peiyong Ma},
  journal={arXiv preprint arXiv:2502.15768},
  year={2025}
}