
Enhanced Multi-Tuple Extraction for Alloys: Integrating Pointer Networks and Augmented Attention

Abstract

Extracting high-quality structured information from scientific literature is crucial for advancing material design through data-driven methods. Despite considerable research on natural language processing for dataset extraction, effective approaches to multi-tuple extraction from scientific literature remain scarce because of the complex interrelations among tuples and contextual ambiguities. In this study, we illustrate multi-tuple extraction of mechanical properties for multi-principal-element alloys and present a novel framework that combines an entity extraction model based on MatSciBERT and pointer networks with an allocation model that uses inter- and intra-entity attention. Our experiments on tuple extraction yield F1 scores of 0.963, 0.947, 0.848, and 0.753 on datasets containing 1, 2, 3, and 4 tuples, respectively, confirming the effectiveness of the model. Furthermore, an F1 score of 0.854 was achieved on a randomly curated dataset. These results highlight the model's capacity to deliver precise and structured information, offering a robust alternative to large language models and equipping researchers with essential data for data-driven innovation.
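To make the reported metric concrete, the following is a minimal sketch of how a tuple-level F1 score could be computed under exact-match scoring. The abstract does not state the matching criterion or the tuple schema, so both the exact-match rule and the four-field layout (alloy, property, value, unit) are assumptions made for illustration, and `tuple_f1` is a hypothetical helper rather than the authors' evaluation code.

```python
from typing import Iterable, Tuple

# Hypothetical tuple schema (an assumption, not taken from the paper):
# (alloy composition, property name, value, unit)
PropertyTuple = Tuple[str, str, str, str]


def tuple_f1(predicted: Iterable[PropertyTuple], gold: Iterable[PropertyTuple]) -> float:
    """Exact-match F1 between predicted and gold tuples for one passage."""
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set and not gold_set:
        return 1.0  # nothing to extract and nothing extracted
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: a passage with two gold tuples, one predicted correctly.
gold = [
    ("AlloyA", "yield strength", "300", "MPa"),
    ("AlloyB", "yield strength", "1200", "MPa"),
]
predicted = [
    ("AlloyA", "yield strength", "300", "MPa"),
    ("AlloyB", "yield strength", "300", "MPa"),  # value assigned to the wrong alloy
]
score = tuple_f1(predicted, gold)
```

In this illustration the second prediction attaches a value to the wrong alloy, which is the kind of allocation error that becomes more likely as the number of tuples per passage grows.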

@article{hei2025_2503.06861,
  title={Enhanced Multi-Tuple Extraction for Alloys: Integrating Pointer Networks and Augmented Attention},
  author={Mengzhe Hei and Zhouran Zhang and Qingbao Liu and Yan Pan and Xiang Zhao and Yongqian Peng and Yicong Ye and Xin Zhang and Shuxin Bai},
  journal={arXiv preprint arXiv:2503.06861},
  year={2025}
}