HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights

7 May 2025
Ozan Gokdemir
Carlo Siebenschuh
Alexander Brace
Azton Wells
Brian Hsu
Kyle Hippe
Priyanka V. Setty
Aswathy Ajith
J. G. Pauloski
Varuni K. Sastry
Sam Foreman
Huihuo Zheng
Heng Ma
B. Kale
Nicholas Chia
Thomas Gibbs
M. Papka
Thomas Brettin
Francis J. Alexander
A. Anandkumar
Ian Foster
R. Stevens
V. Vishwanath
A. Ramanathan
Abstract

The volume of scientific literature is growing exponentially, leading to underutilized discoveries, duplicated efforts, and limited cross-disciplinary collaboration. Retrieval Augmented Generation (RAG) offers a way to assist scientists by improving the factuality of Large Language Models (LLMs) in processing this influx of information. However, scaling RAG to handle millions of articles introduces significant challenges, including the high computational costs associated with parsing documents and embedding scientific knowledge, as well as the algorithmic complexity of aligning these representations with the nuanced semantics of scientific content. To address these issues, we introduce HiPerRAG, a RAG workflow powered by high-performance computing (HPC) to index and retrieve knowledge from more than 3.6 million scientific articles. At its core are Oreo, a high-throughput model for multimodal document parsing, and ColTrast, a query-aware encoder fine-tuning algorithm that enhances retrieval accuracy by using contrastive learning and late-interaction techniques. HiPerRAG delivers robust performance on existing scientific question answering benchmarks and two new benchmarks introduced in this work, achieving 90% accuracy on SciQ and 76% on PubMedQA, outperforming both domain-specific models such as PubMedGPT and commercial LLMs such as GPT-4. Scaling to thousands of GPUs on the Polaris, Sunspot, and Frontier supercomputers, HiPerRAG delivers million-document-scale RAG workflows for unifying scientific knowledge and fostering interdisciplinary innovation.
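The abstract names two general techniques behind ColTrast: late interaction and contrastive learning. The sketch below illustrates those two ideas in their generic textbook form and is not the paper's ColTrast algorithm. It assumes a ColBERT-style MaxSim score over per-token embeddings and an InfoNCE loss with in-batch negatives. The function names, the temperature value, and the toy vectors are all hypothetical choices made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_tokens, doc_tokens):
    """Generic late-interaction score (assumed MaxSim form).

    For every query token embedding, take the highest cosine similarity
    against any document token embedding, then sum over query tokens.
    """
    q = [normalize(t) for t in query_tokens]
    d = [normalize(t) for t in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qt, dt)) for dt in d)
        for qt in q
    )


def in_batch_contrastive_loss(queries, docs, temperature=0.05):
    """Generic InfoNCE loss with in-batch negatives.

    queries[i] is assumed to match docs[i]; every other document in the
    batch serves as a negative for that query.
    """
    losses = []
    for i, query in enumerate(queries):
        logits = [maxsim_score(query, doc) / temperature for doc in docs]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional token embeddings (hypothetical data).
q1 = [[1.0, 0.0], [0.0, 1.0]]
q2 = [[-1.0, 0.0], [0.0, -1.0]]
d1 = [[1.0, 0.0], [0.0, 1.0]]
d2 = [[-1.0, 0.0], [0.0, -1.0]]

aligned = in_batch_contrastive_loss([q1, q2], [d1, d2])
swapped = in_batch_contrastive_loss([q1, q2], [d2, d1])
```

In this toy setup the loss is near zero when each query is paired with its matching document and large when the pairing is swapped, which is the signal a contrastive fine-tuning step would use to pull matching query and document representations together.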

@article{gokdemir2025_2505.04846,
  title={HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights},
  author={Ozan Gokdemir and Carlo Siebenschuh and Alexander Brace and Azton Wells and Brian Hsu and Kyle Hippe and Priyanka V. Setty and Aswathy Ajith and J. Gregory Pauloski and Varuni Sastry and Sam Foreman and Huihuo Zheng and Heng Ma and Bharat Kale and Nicholas Chia and Thomas Gibbs and Michael E. Papka and Thomas Brettin and Francis J. Alexander and Anima Anandkumar and Ian Foster and Rick Stevens and Venkatram Vishwanath and Arvind Ramanathan},
  journal={arXiv preprint arXiv:2505.04846},
  year={2025}
}