LoRA Fine-Tuning Without GPUs: A CPU-Efficient Meta-Generation Framework for LLMs

Low-Rank Adapters (LoRAs) have transformed the fine-tuning of Large Language Models (LLMs) by enabling parameter-efficient updates. However, their widespread adoption remains limited by the reliance on GPU-based training. In this work, we propose a theoretically grounded approach to LoRA fine-tuning designed specifically for users with limited computational resources, particularly those restricted to standard laptop CPUs. Our method learns a meta-operator that maps any input dataset, represented as a probability distribution, to a set of LoRA weights by leveraging a large bank of pre-trained adapters for the Mistral-7B-Instruct-v0.2 model. Instead of performing new gradient-based updates, our pipeline constructs adapters via lightweight combinations of existing LoRAs directly on CPU. While the resulting adapters do not match the performance of GPU-trained counterparts, they consistently outperform the base Mistral model on downstream tasks, offering a practical and accessible alternative to traditional GPU-based fine-tuning.
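The core mechanism described above, mapping a dataset representation to a lightweight combination of pre-trained bank adapters, can be sketched in a few lines of NumPy. The snippet below is an illustrative mock-up rather than the authors' actual pipeline: the function `embed_dataset`, the randomly initialized adapter bank, and the softmax-weighted blend are all assumptions standing in for the paper's learned meta-operator and its probability-distribution view of the input dataset.

```python
import numpy as np

# Hypothetical LoRA bank: each entry pairs low-rank factors (A, B) with an
# embedding of the dataset the adapter was trained on. All names here are
# illustrative, not taken from the paper.
rng = np.random.default_rng(0)
d_model, rank, n_adapters, emb_dim = 64, 8, 5, 32

bank = [
    {
        "A": rng.normal(size=(rank, d_model)),    # LoRA "down" projection
        "B": rng.normal(size=(d_model, rank)),    # LoRA "up" projection
        "dataset_emb": rng.normal(size=emb_dim),  # embedding of training data
    }
    for _ in range(n_adapters)
]

def embed_dataset(texts):
    """Toy stand-in for a real dataset embedder (e.g., a mean sentence
    embedding); deterministic only within a single Python process."""
    seed = abs(hash(tuple(texts))) % 2**32
    return np.random.default_rng(seed).normal(size=emb_dim)

def meta_generate_lora(texts, bank, temperature=1.0):
    """CPU-only adapter construction: softmax-weighted blend of bank adapters,
    selected by cosine similarity between dataset embeddings."""
    q = embed_dataset(texts)
    sims = np.array([
        e["dataset_emb"] @ q
        / (np.linalg.norm(e["dataset_emb"]) * np.linalg.norm(q) + 1e-8)
        for e in bank
    ])
    w = np.exp(sims / temperature)
    w /= w.sum()
    # Blend the full low-rank updates Delta_W_i = B_i @ A_i. Averaging the
    # factors A and B separately would be cheaper but is only a heuristic.
    delta_W = sum(wi * (e["B"] @ e["A"]) for wi, e in zip(w, bank))
    return w, delta_W

weights, delta_W = meta_generate_lora(["example instruction", "example reply"], bank)
print("mixture weights:", np.round(weights, 3))
print("Delta W shape:", delta_W.shape)  # applied as W + Delta_W at inference
```

In a realistic setting the bank would hold adapters for the attention projections of Mistral-7B-Instruct-v0.2, and the blended update would be added to the frozen base weight at inference time. Blending the full Delta_W_i matrices, rather than their factors, keeps the combination exact at the cost of a higher-rank update.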
@article{arabpour2025_2507.01806,
  title={LoRA Fine-Tuning Without GPUs: A CPU-Efficient Meta-Generation Framework for LLMs},
  author={Reza Arabpour and Haitz Sáez de Ocáriz Borde and Anastasis Kratsios},
  journal={arXiv preprint arXiv:2507.01806},
  year={2025}
}