
A Proper Scoring Rule for Virtual Staining

Samuel Tonks
Steve Hood
Ryan Musso
Ceridwen Hopely
Steve Titus
Minh Doan
Iain Styles
Alexander Krull
Main: 8 Pages
4 Figures
Bibliography: 3 Pages
Abstract

Generative virtual staining (VS) models for high-throughput screening (HTS) can provide an estimated posterior distribution of possible biological feature values for each input and cell. However, when evaluating a VS model, the true posterior is unavailable. Existing evaluation protocols check only the accuracy of the marginal distribution over the dataset, not that of the predicted posteriors. We introduce information gain (IG) as a cell-wise evaluation framework that enables direct assessment of predicted posteriors. IG is a strictly proper scoring rule with a sound theoretical motivation, which makes it interpretable and allows results to be compared across models and features. We evaluate diffusion- and GAN-based models on an extensive HTS dataset using IG and other metrics, and show that IG can reveal substantial performance differences that other metrics cannot.
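
The abstract does not state the formula for IG, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that IG is the per-cell difference in log score (in bits) between the predicted posterior and a dataset-level marginal baseline, and that both distributions are discretized over shared feature bins. The function name `information_gain` and its parameters are hypothetical.

```python
import numpy as np

def information_gain(posterior_probs, marginal_probs, true_bins):
    """Per-cell log-score gain (bits) of predicted posteriors over a marginal baseline.

    ASSUMPTION: this definition (log posterior minus log marginal at the true
    value) is illustrative; the abstract does not give the paper's formula.

    posterior_probs: (n_cells, n_bins) predicted probability per bin, per cell
    marginal_probs:  (n_bins,) dataset-level baseline probability per bin
    true_bins:       (n_cells,) index of the bin containing each cell's true value
    """
    posterior_probs = np.asarray(posterior_probs, dtype=float)
    marginal_probs = np.asarray(marginal_probs, dtype=float)
    true_bins = np.asarray(true_bins, dtype=int)

    eps = 1e-12  # guards against log(0) when a bin has zero predicted mass
    rows = np.arange(len(true_bins))

    # Log probability assigned to the true value by each cell's posterior
    log_post = np.log2(posterior_probs[rows, true_bins] + eps)
    # Log probability assigned to the same value by the marginal baseline
    log_marg = np.log2(marginal_probs[true_bins] + eps)

    per_cell = log_post - log_marg
    return per_cell.mean(), per_cell
```

Under this assumed definition, a model whose posterior simply equals the marginal scores 0 bits for every cell, and a positive mean indicates that the predicted posteriors carry cell-specific information beyond the dataset-level distribution.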
