Replicating ReLM Results: Validating Large Language Models with ReLM

Abstract

Validating Large Language Models with ReLM explores the application of formal languages to evaluate and control Large Language Models (LLMs) for memorization, bias, and zero-shot performance. Current approaches for evaluating these behaviors are often slow, imprecise, costly, or introduce biases of their own, yet such evaluation is essential when deploying LLMs in production. This project reproduces key results from the original ReLM paper and expounds on the approach and its applications, with an emphasis on their relevance to the field of systems for machine learning.
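
To make the idea concrete, the sketch below illustrates the kind of regex-defined validation query ReLM supports, using a naive baseline: sample completions from a causal language model and filter them with a regular expression (here, probing for memorized phone-number-like strings). This is not ReLM's API; ReLM itself compiles such patterns into automata to constrain or enumerate decoding directly, which is far more efficient than sampling and filtering. The model name and pattern are illustrative assumptions, not details from the paper.

# A minimal sketch of regex-based LLM validation (NOT ReLM's actual API):
# sample completions and test each against a pattern, as a baseline for
# the memorization-style queries that ReLM answers via constrained decoding.
import re

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small causal LM works for this demo
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "My phone number is"
# Pattern describing the behavior under test (a US-style phone number).
pattern = re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}")

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    do_sample=True,
    num_return_sequences=8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)

for seq in outputs:
    text = tokenizer.decode(seq, skip_special_tokens=True)
    match = pattern.search(text[len(prompt):])  # test only the completion
    if match:
        print(f"pattern match (possible memorization): {match.group()!r}")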

@article{adamson2025_2504.12357,
  title={Replicating ReLM Results: Validating Large Language Models with ReLM},
  author={Reece Adamson and Erin Song},
  journal={arXiv preprint arXiv:2504.12357},
  year={2025}
}