
SeqProFT: Sequence-only Protein Property Prediction with LoRA Finetuning

18 November 2024

Abstract

Protein language models (PLMs) have demonstrated remarkable capabilities in learning relationships between protein sequences and functions. However, finetuning these large models requires substantial computational resources, often with suboptimal task-specific results. This study investigates how parameter-efficient finetuning via LoRA can enhance protein property prediction while significantly reducing computational demands. By applying LoRA to ESM-2 and ESM-C models of varying sizes and evaluating 10 diverse protein property prediction tasks, we demonstrate that smaller models with LoRA adaptation can match or exceed the performance of larger models without adaptation. Additionally, we integrate contact map information through a multi-head attention mechanism, improving model comprehension of structural features. Our systematic analysis reveals that LoRA finetuning enables faster convergence, better performance, and more efficient resource utilization, providing practical guidance for protein research applications in resource-constrained environments. The code is available at this https URL.
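To make the parameter-efficiency claim concrete, the following is a minimal pure-Python sketch of the general LoRA idea: a frozen weight matrix W plus a trainable low-rank update B·A scaled by alpha/rank. It is not the paper's implementation; the class name `LoRALinear`, the helper `matvec`, the layer sizes, the rank, and the alpha value are all illustrative assumptions, and no ESM-2 or ESM-C weights are involved.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical linear layer: frozen W (d_out x d_in) plus low-rank update B @ A."""

    def __init__(self, d_in, d_out, rank, alpha, seed=0):
        rng = random.Random(seed)
        # Frozen pretrained weight (random here, purely for illustration).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors: A is small random, B starts at zero,
        # so the adapted layer initially reproduces the frozen layer.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))  # low-rank path
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


# Illustrative sizes only (assumed, not taken from the paper).
layer = LoRALinear(d_in=64, d_out=64, rank=4, alpha=8)
x = [1.0] * 64
print(layer.trainable_params(), layer.frozen_params())
```

In this toy setting only the entries of A and B would be updated during finetuning, which is why the trainable parameter count grows with rank × (d_in + d_out) rather than d_in × d_out.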


Main: 9 pages, 4 figures, 5 tables. Bibliography: 1 page.
