Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems

Benjamin Chen Ming Choong
Tao Luo
Cheng Liu
Bingsheng He
Wei Zhang
Joey Tianyi Zhou
Main: 22 pages, 40 figures, 2 tables; bibliography: 3 pages
Abstract

Deep neural networks generate and process large volumes of data, posing challenges for low-resource embedded systems. In-memory computing has been demonstrated as an efficient computing infrastructure and shows promise for embedded AI applications. Among emerging memory technologies, racetrack memory is a non-volatile technology that supports high-density fabrication, making it a good fit for in-memory computing. However, integrating in-memory arithmetic circuits with memory cells affects both memory density and power efficiency, and it remains challenging to build efficient in-memory arithmetic circuits on racetrack memory within area and energy constraints. To this end, we present an efficient in-memory convolutional neural network (CNN) accelerator optimized for racetrack memory. We design a series of fundamental arithmetic circuits as in-memory computing cells suited to multiply-and-accumulate operations. Moreover, we explore the design space of racetrack memory based systems and CNN model architectures, employing co-design to improve the efficiency and performance of CNN inference in racetrack memory while maintaining model accuracy. Our circuits and model-system co-optimization strategies achieve a small memory bank area together with significant improvements in energy and performance for racetrack memory based embedded systems.
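
For context, the multiply-and-accumulate (MAC) operation targeted by the in-memory computing cells is the inner loop of convolution. The minimal C sketch below illustrates that workload only; the 1-D convolution shape, array sizes, and 8-bit quantization are assumptions for illustration and are not taken from the paper, which implements MAC with arithmetic circuits placed inside the racetrack memory array rather than in software.

```c
#include <stdio.h>

/* Illustrative MAC kernel: a 1-D convolution over quantized 8-bit data,
 * the kind of operation an in-memory accelerator would map onto logic
 * located beside the memory cells. Sizes and values are hypothetical. */

#define IN_LEN  8
#define K_LEN   3
#define OUT_LEN (IN_LEN - K_LEN + 1)

int main(void) {
    signed char x[IN_LEN] = {1, 2, 3, 4, 5, 6, 7, 8};  /* input activations */
    signed char w[K_LEN]  = {1, 0, -1};                /* kernel weights    */
    int y[OUT_LEN];

    for (int i = 0; i < OUT_LEN; i++) {
        int acc = 0;                      /* accumulator */
        for (int k = 0; k < K_LEN; k++) {
            acc += w[k] * x[i + k];       /* the multiply-and-accumulate step */
        }
        y[i] = acc;
    }

    for (int i = 0; i < OUT_LEN; i++) {
        printf("y[%d] = %d\n", i, y[i]);
    }
    return 0;
}
```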

@article{choong2025_2507.01429,
  title={Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems},
  author={Benjamin Chen Ming Choong and Tao Luo and Cheng Liu and Bingsheng He and Wei Zhang and Joey Tianyi Zhou},
  journal={arXiv preprint arXiv:2507.01429},
  year={2025}
}