
AdaptoVision: A Multi-Resolution Image Recognition Model for Robust and Scalable Classification

17 April 2025

Md. Sanaullah Chowdhury, Lameya Sabrin



10 pages (main), 3 pages (bibliography), 4 figures, 7 tables

Abstract

This paper introduces AdaptoVision, a novel convolutional neural network (CNN) architecture designed to balance computational complexity against classification accuracy. By combining enhanced residual units, depth-wise separable convolutions, and hierarchical skip connections, AdaptoVision significantly reduces parameter count and computational requirements while preserving competitive performance across benchmark and medical image datasets. Extensive experiments show that AdaptoVision achieves state-of-the-art results on the BreakHis dataset and competitive accuracy on standard benchmarks, notably 95.3% on CIFAR-10 and 85.77% on CIFAR-100, without relying on any pretrained weights. The model's streamlined architecture and strategic simplifications promote effective feature extraction and robust generalization, making it particularly suitable for deployment in real-time and resource-constrained environments.
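To make the parameter savings from depth-wise separable convolutions concrete, the following is a minimal sketch of the standard weight-count arithmetic. The layer sizes (a 3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions and are not taken from the paper; bias terms are ignored, and the function names are hypothetical.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise separable convolution (bias ignored).

    Depth-wise step: one k x k filter per input channel.
    Point-wise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print(f"standard:  {std} weights")
    print(f"separable: {sep} weights")
    print(f"reduction: {std / sep:.2f}x")
```

For this assumed layer, the standard convolution needs 73,728 weights and the separable version 8,768, roughly an 8.4x reduction. The actual savings in AdaptoVision depend on its layer configuration, which the abstract does not specify.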
