Repurposing the scientific literature with vision-language models

26 February 2025
Anton Alyakin
Jaden Stryker
Daniel Alexander Alber
Karl L. Sangwon
Jin Vivian Lee
Brandon Duderstadt
Akshay Save
David Kurland
Spencer Frome
Shrutika Singh
Jeff Zhang
Eunice Yang
Ki Yun Park
Cordelia Orillac
Aly A. Valliani
Sean N. Neifert
Albert Liu
Aneek Patel
Christopher Livia
Darryl Lau
Ilya Laufer
Peter A. Rozman
Eveline Teresa Hidalgo
Howard Riina
Rui Feng
Todd C. Hollon
Yindalon Aphinyanaphongs
John G. Golfinos
Laura Snyder
Eric Leuthardt
Douglas Kondziolka
Eric Karl Oermann
Abstract

Leading vision-language models (VLMs) are trained on general Internet content, overlooking the rich, domain-specific knowledge of scientific journals. Training on specialty-specific literature could yield high-performance, task-specific tools, enabling generative AI to match generalist models on specialty publishing, educational, and clinical tasks. We created NeuroPubs, a multimodal dataset of 23,000 Neurosurgery Publications articles (134M words, 78K image-caption pairs). Using NeuroPubs, VLMs generated publication-ready graphical abstracts (70% of 100 abstracts) and board-style questions indistinguishable from human-written ones (54% of 89,587 questions). We used these questions to train CNS-Obsidian, a 34B-parameter VLM. In a blinded, randomized controlled trial, our model demonstrated non-inferiority to the then state-of-the-art GPT-4o in neurosurgical differential diagnosis (clinical utility: 40.62% vs. 57.89% upvotes, p=0.1150; accuracy: 59.38% vs. 65.79%, p=0.3797). Our pilot study demonstrates how training generative AI models on specialty-specific journal content, without large-scale internet data, yields high-performance academic and clinical tools and enables domain-tailored AI across diverse fields.
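As a rough illustration of the kind of comparison reported above, the sketch below runs a two-sided two-proportion z-test on counts consistent with the reported percentages. The denominators (32 and 38 trial arms) are assumptions inferred from the percentages, and the authors' actual analysis may have used a different test, so this is not a reproduction of the paper's p-values:

```python
from math import sqrt, erfc

def two_proportion_ztest(x1: int, n1: int, x2: int, n2: int):
    """Two-sided two-proportion z-test using a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability of N(0, 1)
    return z, p_value

# Assumed counts: 40.62% ~ 13/32 (CNS-Obsidian) vs. 57.89% ~ 22/38 (GPT-4o)
z, p = two_proportion_ztest(13, 32, 22, 38)
print(f"clinical utility: z = {z:.3f}, p = {p:.3f}")
```

With these assumed counts the difference is not statistically significant at the 0.05 level, consistent with the non-inferiority framing of the trial.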

@article{alyakin2025_2502.19546,
  title={Repurposing the scientific literature with vision-language models},
  author={Anton Alyakin and Jaden Stryker and Daniel Alexander Alber and Karl L. Sangwon and Jin Vivian Lee and Brandon Duderstadt and Akshay Save and David Kurland and Spencer Frome and Shrutika Singh and Jeff Zhang and Eunice Yang and Ki Yun Park and Cordelia Orillac and Aly A. Valliani and Sean Neifert and Albert Liu and Aneek Patel and Christopher Livia and Darryl Lau and Ilya Laufer and Peter A. Rozman and Eveline Teresa Hidalgo and Howard Riina and Rui Feng and Todd Hollon and Yindalon Aphinyanaphongs and John G. Golfinos and Laura Snyder and Eric Leuthardt and Douglas Kondziolka and Eric Karl Oermann},
  journal={arXiv preprint arXiv:2502.19546},
  year={2025}
}