Celler: A Genomic Language Model for Long-Tailed Single-Cell Annotation

28 March 2025
Huan Zhao
Yiming Liu
Jina Yao
Ling Xiong
Zexin Zhou
Zixing Zhang
Abstract

Recent breakthroughs in single-cell technology have opened unprecedented opportunities to decode the molecular complexity of biological systems, especially those linked to diseases unique to humans. However, these advances have also introduced new obstacles, specifically the efficient annotation of large-scale, long-tailed single-cell data from disease conditions. To address this challenge, we introduce Celler, a state-of-the-art generative pre-training model designed specifically for single-cell data annotation. Celler incorporates two key components. First, we introduce the Gaussian Inflation (GInf) Loss function. By dynamically adjusting sample weights, GInf Loss significantly enhances the model's ability to learn from rare categories while reducing the risk of overfitting to common categories. Second, we introduce a Hard Data Mining (HDM) strategy into the training process that specifically targets hard-to-learn minority samples, which significantly improves the model's predictive accuracy. Additionally, to further advance research in this field, we have constructed a large-scale single-cell dataset, Celler-75, which encompasses 40 million cells distributed across 80 human tissues and 75 specific diseases. This dataset provides critical support for comprehensively exploring the potential of single-cell technology in disease research. Our code is available at this https URL.
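
The abstract does not give the formula for GInf Loss or the selection rule used by HDM, so the sketch below does not reproduce either. It only illustrates the two general ideas named above using well-known stand-ins: inverse-frequency class weighting in place of GInf's dynamic sample weights, and top-k loss selection restricted to minority classes in place of HDM. All function names, the weighting formula, and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights (an assumed stand-in, NOT the GInf formula).

    Uses n / (k * count_c), so the weights average to 1 over all samples.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def cross_entropy(probs, label):
    """Negative log-probability the model assigns to the true label."""
    return -math.log(max(probs[label], 1e-12))


def weighted_losses(prob_rows, labels, weights):
    """Per-sample cross-entropy scaled by the weight of the sample's class."""
    return [weights[y] * cross_entropy(p, y) for p, y in zip(prob_rows, labels)]


def mine_hard_minority(losses, labels, minority, k):
    """Indices of the k highest-loss samples whose label is a minority class.

    A generic hard-example-mining rule (an assumed stand-in, NOT the HDM rule).
    """
    candidates = [i for i, y in enumerate(labels) if y in minority]
    return sorted(candidates, key=lambda i: losses[i], reverse=True)[:k]


# Toy long-tailed batch: four cells of a common type, two of a rare type.
labels = ["common", "common", "common", "common", "rare", "rare"]
prob_rows = [
    {"common": 0.9, "rare": 0.1},
    {"common": 0.8, "rare": 0.2},
    {"common": 0.9, "rare": 0.1},
    {"common": 0.7, "rare": 0.3},
    {"common": 0.8, "rare": 0.2},  # rare cell the model gets wrong (hard)
    {"common": 0.1, "rare": 0.9},  # rare cell the model gets right (easy)
]

weights = class_weights(labels)
losses = weighted_losses(prob_rows, labels, weights)
hard = mine_hard_minority(losses, labels, minority={"rare"}, k=1)
print(weights, hard)
```

In this toy batch the rare class receives twice the weight of the common class, and the misclassified rare cell is the one selected for further training. A real implementation would operate on model logits in a deep learning framework and would follow the definitions in the paper itself.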

@article{zhao2025_2504.00020,
  title={Celler: A Genomic Language Model for Long-Tailed Single-Cell Annotation},
  author={Huan Zhao and Yiming Liu and Jina Yao and Ling Xiong and Zexin Zhou and Zixing Zhang},
  journal={arXiv preprint arXiv:2504.00020},
  year={2025}
}