ChaTA: Towards an Intelligent Question-Answer Teaching Assistant using
Open-Source LLMs
- AI4Ed
To address the challenges of scalable and intelligent question-answering (QA), we introduce an innovative solution that leverages open-source Large Language Models (LLMs) to ensure data privacy. We use models from the LLaMA-2 family and augmentations including retrieval augmented generation (RAG), supervised fine-tuning (SFT), and an alternative to reinforcement learning with human feedback (RLHF). We perform our experiments on a Piazza dataset from an introductory CS course with 10k QA pairs and 1.5k pairs of preferences data and conduct both human evaluations and automatic LLM evaluations on a small subset. We find preliminary evidence that modeling techniques collectively enhance the quality of answers by 33%, and RAG is an impactful addition. This work paves the way for the development of ChaTA, an intelligent QA assistant customizable for courses with an online QA platform.
View on arXiv