Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark

21 April 2025
Jasper Götting
Pedro Medeiros
Jon G Sanders
Nathaniel Li
Long Phan
Karam Elabd
Lennart Justen
Dan Hendrycks
Seth Donoughe
Abstract

We present the Virology Capabilities Test (VCT), a large language model (LLM) benchmark that measures the capability to troubleshoot complex virology laboratory protocols. Constructed from the inputs of dozens of PhD-level expert virologists, VCT consists of 322 multimodal questions covering fundamental, tacit, and visual knowledge that is essential for practical work in virology laboratories. VCT is difficult: expert virologists with access to the internet score an average of 22.1% on questions specifically in their sub-areas of expertise. However, the most performant LLM, OpenAI's o3, reaches 43.8% accuracy, outperforming 94% of expert virologists even within their sub-areas of specialization. The ability to provide expert-level virology troubleshooting is inherently dual-use: it is useful for beneficial research, but it can also be misused. Therefore, the fact that publicly available models outperform virologists on VCT raises pressing governance considerations. We propose that the capability of LLMs to provide expert-level troubleshooting of dual-use virology work should be integrated into existing frameworks for handling dual-use technologies in the life sciences.

@article{götting2025_2504.16137,
  title={Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark},
  author={Jasper Götting and Pedro Medeiros and Jon G Sanders and Nathaniel Li and Long Phan and Karam Elabd and Lennart Justen and Dan Hendrycks and Seth Donoughe},
  journal={arXiv preprint arXiv:2504.16137},
  year={2025}
}