This paper introduces AdaptoVision, a novel convolutional neural network (CNN) architecture designed to balance computational complexity against classification accuracy. By combining enhanced residual units, depth-wise separable convolutions, and hierarchical skip connections, AdaptoVision significantly reduces parameter count and computational requirements while preserving competitive performance across a range of benchmark and medical image datasets. Extensive experiments show that AdaptoVision achieves state-of-the-art results on the BreakHis dataset and comparable accuracy on other benchmarks, notably 95.3% on CIFAR-10 and 85.77% on CIFAR-100, without relying on any pretrained weights. The model's streamlined architecture and strategic simplifications promote effective feature extraction and robust generalization, making it particularly suitable for deployment in real-time and resource-constrained environments.
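To give a feel for why depth-wise separable convolutions reduce parameter count, the sketch below compares weight counts for a standard convolution and a depth-wise separable one. It is a general illustration of the technique, not AdaptoVision's actual layer configuration: the helper names and the layer sizes (128 input channels, 256 output channels, 3x3 kernel) are hypothetical, and bias terms are omitted.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depth-wise k x k convolution followed by a
    1 x 1 point-wise convolution (bias omitted)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 128, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)

print(f"standard:            {std} weights")
print(f"depthwise separable: {sep} weights")
print(f"reduction factor:    {std / sep:.2f}x")
```

For these assumed sizes the separable layer needs 33,920 weights against 294,912 for the standard one. In general the ratio of separable to standard weights is 1/c_out + 1/k², so the saving grows with the number of output channels and the kernel size.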
@article{sabrin2025_2504.12652,
  title={AdaptoVision: A Multi-Resolution Image Recognition Model for Robust and Scalable Classification},
  author={Md. Sanaullah Chowdhury and Lameya Sabrin},
  journal={arXiv preprint arXiv:2504.12652},
  year={2025}
}