
Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks

Abstract

Large Language Models (LLMs) have proven immensely beneficial in education by capturing vast amounts of literature-based information, allowing them to generate contextually relevant content without relying on external sources. In this paper, we propose a generative AI-powered GATE question-answering framework (GATE stands for Graduate Aptitude Test in Engineering) that leverages LLMs to explain GATE solutions and support students in their exam preparation. We conducted extensive benchmarking to select the optimal embedding model and LLM, evaluating our framework on criteria such as latency, faithfulness, and relevance, with additional validation through human evaluation. Our chatbot integrates state-of-the-art embedding models and LLMs to deliver accurate, context-aware responses. Through rigorous experimentation, we identified configurations that balance performance and computational efficiency, ensuring a reliable chatbot to serve students' needs. Additionally, we discuss the challenges faced in data processing and modeling, and the solutions we implemented. Our work explores the application of Retrieval-Augmented Generation (RAG) for GATE Q/A explanation tasks, and our findings demonstrate significant improvements in retrieval accuracy and response quality. This research offers practical insights for developing effective AI-driven educational tools while highlighting areas for future enhancement in usability and scalability.

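To make the RAG pipeline described in the abstract concrete, the following is a minimal sketch of the retrieval step for GATE Q/A. It assumes a sentence-transformers embedding model ("all-MiniLM-L6-v2") and a tiny illustrative corpus; the actual embedding models, LLMs, corpus, and prompts benchmarked in the paper are not specified here.

# Minimal sketch of a RAG-style retrieval step for GATE Q/A (illustrative only).
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical corpus of GATE solution explanations (not from the paper).
corpus = [
    "GATE CS: Heapify runs in O(log n) time because the element sifts down at most the height of the heap.",
    "GATE EC: A system is stable when all poles of its transfer function lie in the left half of the s-plane.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
doc_vecs = model.encode(corpus, normalize_embeddings=True)

def retrieve(question, k=1):
    """Return the top-k corpus passages ranked by cosine similarity."""
    q_vec = model.encode([question], normalize_embeddings=True)
    scores = doc_vecs @ q_vec.T  # cosine similarity (embeddings are unit-normalized)
    top = np.argsort(-scores[:, 0])[:k]
    return [(corpus[i], float(scores[i, 0])) for i in top]

# The retrieved passages would then be inserted into the LLM prompt, and the
# generated explanation scored for latency, faithfulness, and relevance.
print(retrieve("Why is heapify O(log n)?"))

In a full system, the retrieval step above would be followed by a generation step with the selected LLM; swapping the embedding model or LLM is what the paper's benchmarking varies.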
@article{khan2025_2503.00781,
  title={Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks},
  author={Umar Ali Khan and Ekram Khan and Fiza Khan and Athar Ali Moinuddin},
  journal={arXiv preprint arXiv:2503.00781},
  year={2025}
}