arXiv:2406.18045

PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

26 June 2024
Linqing Chen
Weilei Wang
Zilong Bai
Peng Xu
Yan Fang
Jie Fang
Wentao Wu
Lizhi Zhou
Ruiji Zhang
Yubin Xia
Chaobo Xu
Ran Hu
Licong Xu
Qijun Cai
Haoran Hua
Jing Sun
Jin Liu
Tian Qiu
Haowen Liu
Meng Hu
Xiuwen Li
Fei Gao
Yufu Wang
Lin Tie
Chaochao Wang
Jianping Lu
Cheng Sun
Yixin Wang
Shengjie Yang
Yuancheng Li
Lu Jin
Lisha Zhang
Fu Bian
Zhongkai Ye
Lidong Pei
Changyang Tu
    AI4MH
    LM&MA
Abstract

Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general-purpose LLMs often fall short. In this study, we introduce PharmaGPT, a suite of domain-specialized LLMs with 13 billion and 70 billion parameters, trained on a comprehensive corpus tailored to the bio-pharmaceutical and chemical domains. Our evaluation shows that PharmaGPT surpasses existing general models on domain-specific benchmarks such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. Remarkably, this performance is achieved with a model that has only a fraction, sometimes just one-tenth, of the parameters of general-purpose large models. This advancement establishes a new benchmark for LLMs in the bio-pharmaceutical and chemical fields, addressing the existing gap in specialized language modeling. It also suggests a promising path for enhanced research and development, paving the way for more precise and effective NLP applications in these areas.
