AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

3 January 2024
Ernest Perkowski
Rui Pan
Tuan Dung Nguyen
Yuan-Sen Ting
Sandor Kruk
Tong Zhang
Charlie O'Neill
Maja Jabłońska
Ze-Chang Sun
Michael J. Smith
Huiling Liu
Kevin Schawinski
K. Iyer
I. Ciucă
Abstract

We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like GPT-4 excel in broader question-answering scenarios due to superior reasoning capabilities, our findings suggest that continual pre-training with limited resources can still enhance model performance on specialized topics. Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use. Comprehensive quantitative benchmarking is currently in progress and will be detailed in an upcoming full paper. The model, AstroLLaMA-Chat, is now available at https://huggingface.co/universeTBD, providing the first open-source conversational AI tool tailored for the astronomy community.
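
Since the release is distributed through the Hugging Face Hub, the sketch below shows one plausible way to load and query the checkpoint with the transformers library. The repository id used here is an assumption for illustration only; the actual checkpoint name should be verified on the universeTBD organization page linked above.

# Minimal sketch: loading AstroLLaMA-Chat via Hugging Face transformers.
# The repository id below is hypothetical; check https://huggingface.co/universeTBD
# for the actual released checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "universeTBD/astrollama-chat"  # assumed id, verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "What observational signatures distinguish Type Ia supernovae?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))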
