AstroLLaVA: towards the unification of astronomical data and natural language

11 April 2025
Sharaf Zaman
Michael J. Smith
Pranav Khetarpal
Rishabh Chakrabarty
Michele Ginolfi
Marc Huertas-Company
Maja Jabłońska
Sandor Kruk
Matthieu Le Lain
Sergio José Rodríguez Méndez
Dimitrios Tanoglidis
Abstract

We present AstroLLaVA, a vision language model for astronomy that enables interaction with astronomical imagery through natural dialogue. By fine-tuning the LLaVA model on a diverse dataset of ∼30k images with captions and question-answer pairs sourced from NASA's 'Astronomy Picture of the Day', the European Southern Observatory, and the NASA/ESA Hubble Space Telescope, we create a model capable of answering open-ended questions about astronomical concepts depicted visually. Our two-stage fine-tuning process adapts the model to both image captioning and visual question answering in the astronomy domain. We demonstrate AstroLLaVA's performance on an astronomical visual question answering benchmark and release the model weights, code, and training set to encourage further open source work in this space. Finally, we suggest a roadmap towards general astronomical data alignment with pre-trained language models, and provide an open space for collaboration towards this end for interested researchers.
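
Since the abstract describes a LLaVA-based model with released weights, the sketch below illustrates how one might query such a checkpoint about an astronomical image with the Hugging Face transformers API. The checkpoint identifier, image filename, and chat prompt template are assumptions for illustration (the template follows the base LLaVA convention), not details taken from the paper; consult the authors' release for the actual names and usage.

    # Minimal sketch: asking a LLaVA-style model an open-ended question about an image.
    # "path/or/hub-id-of-AstroLLaVA" and "example_galaxy.jpg" are placeholders.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "path/or/hub-id-of-AstroLLaVA"  # placeholder, not the verified checkpoint name

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # Any astronomical image works here; this file name is illustrative only.
    image = Image.open("example_galaxy.jpg")

    # LLaVA-style prompt: an image token followed by a natural-language question.
    prompt = "USER: <image>\nWhat kind of object is shown, and what features stand out?\nASSISTANT:"

    inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=200)
    print(processor.decode(output_ids[0], skip_special_tokens=True))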

@article{zaman2025_2504.08583,
  title={AstroLLaVA: towards the unification of astronomical data and natural language},
  author={Sharaf Zaman and Michael J. Smith and Pranav Khetarpal and Rishabh Chakrabarty and Michele Ginolfi and Marc Huertas-Company and Maja Jabłońska and Sandor Kruk and Matthieu Le Lain and Sergio José Rodríguez Méndez and Dimitrios Tanoglidis},
  journal={arXiv preprint arXiv:2504.08583},
  year={2025}
}