arXiv:2204.11817

Translation between Molecules and Natural Language

25 April 2022
Carl N. Edwards
T. Lai
Kevin Ros
Garrett Honke
Kyunghyun Cho
Heng Ji
Abstract

We present MolT5, a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings. MolT5 allows for new, useful, and challenging analogs of traditional vision-language tasks, such as molecule captioning and text-based de novo molecule generation (altogether: translation between molecules and language), which we explore for the first time. Since MolT5 pretrains models on single-modal data, it helps overcome the chemistry domain shortcoming of data scarcity. Furthermore, we consider several metrics, including a new cross-modal embedding-based metric, to evaluate the tasks of molecule captioning and text-based molecule generation. Our results show that MolT5-based models are able to generate outputs, both molecules and captions, which in many cases are high quality.
