A Library of LLM Intrinsics for Retrieval-Augmented Generation

In the developer community for large language models (LLMs), there is not yet a clean pattern analogous to a software library to support very large-scale collaboration. Even for the commonplace use case of Retrieval-Augmented Generation (RAG), it is not currently possible to write a RAG application against a well-defined set of APIs agreed upon by different LLM providers. Inspired by the idea of compiler intrinsics, we propose some elements of such a concept by introducing a library of LLM intrinsics for RAG. An LLM intrinsic is defined as a capability that can be invoked through a well-defined API that is reasonably stable and independent of how the LLM intrinsic itself is implemented. The intrinsics in our library are released as LoRA adapters on HuggingFace and through a software interface with clear structured input/output characteristics on top of vLLM as an inference platform, accompanied in both places by documentation and code. This article describes the intended usage, training details, and evaluations for each intrinsic, as well as compositions of multiple intrinsics.
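To make the invocation pattern concrete, the sketch below shows how one such intrinsic might be called through vLLM's standard LoRA mechanism. It is a minimal illustration only: the base model ID, the adapter name and path, and the JSON input/output shape are assumptions made for the example, not the library's documented API; the actual identifiers and schemas accompany the released adapters on HuggingFace.

```python
# Minimal sketch: invoking a hypothetical "answerability" RAG intrinsic
# served as a LoRA adapter on vLLM. The model ID, adapter path, and the
# structured input/output format below are illustrative assumptions.
import json

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load the base model with LoRA support enabled.
llm = LLM(model="ibm-granite/granite-3.1-8b-instruct", enable_lora=True)

# Structured input: the user question plus retrieved passages.
prompt = json.dumps({
    "question": "When was the first transatlantic telegraph cable completed?",
    "documents": [
        "The first transatlantic telegraph cable was completed in 1858."
    ],
})

outputs = llm.generate(
    prompts=[prompt],
    sampling_params=SamplingParams(temperature=0.0, max_tokens=64),
    # The adapter weights would be downloaded from the HuggingFace release;
    # the local path here is a placeholder.
    lora_request=LoRARequest("answerability", 1, "/path/to/answerability-lora"),
)

# Structured output: e.g. a JSON verdict such as {"answerable": true}.
print(outputs[0].outputs[0].text)
```

Because the adapter is addressed by name through a stable generate-time interface, an application written against this pattern need not know how the intrinsic is implemented underneath, which is the sense in which the abstract draws the analogy to compiler intrinsics.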
@article{danilevsky2025_2504.11704,
  title   = {A Library of LLM Intrinsics for Retrieval-Augmented Generation},
  author  = {Marina Danilevsky and Kristjan Greenewald and Chulaka Gunasekara and Maeda Hanafi and Lihong He and Yannis Katsis and Krishnateja Killamsetty and Yatin Nandwani and Lucian Popa and Dinesh Raghu and Frederick Reiss and Vraj Shah and Khoi-Nguyen Tran and Huaiyu Zhu and Luis Lastras},
  journal = {arXiv preprint arXiv:2504.11704},
  year    = {2025}
}