
Research profiles highlight scientists' research focus, enabling talent discovery and collaboration, but they are often outdated. Automated, scalable methods are urgently needed to keep profiles current. We design and evaluate two large language model (LLM)-based methods for generating scientific interest profiles, one summarizing PubMed abstracts and the other using Medical Subject Headings (MeSH) terms, and compare them with researchers' self-summarized interests. We collected titles, MeSH terms, and abstracts of PubMed publications for 595 faculty at Columbia University Irving Medical Center and obtained human-written profiles for 167 of them. GPT-4o-mini was prompted to summarize each researcher's interests. Manual and automated evaluations characterized similarities between machine-generated and self-written profiles. ROUGE-L, BLEU, and METEOR scores were low, reflecting little terminological overlap. BERTScore analysis revealed moderate semantic similarity (F1: 0.542 for MeSH-based, 0.555 for abstract-based) despite the low lexical overlap. In validation, paraphrased summaries achieved a higher F1 of 0.851. Comparing original and manually paraphrased summaries indicated limitations of such metrics. Kullback-Leibler (KL) divergence of TF-IDF values (8.56 for MeSH-based, 8.58 for abstract-based) suggests that machine summaries employ different keywords than human-written ones. Manual reviews showed 77.78% rated MeSH-based profiling "good" or "excellent," with readability rated favorably in 93.44% of cases, though granularity and accuracy varied. Panel reviews favored 67.86% of MeSH-derived profiles over abstract-derived ones. LLMs promise to automate scientific interest profiling at scale. MeSH-derived profiles have better readability than abstract-derived ones. Machine-generated summaries differ from human-written ones in concept choice, with the latter introducing more novel ideas.
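To make the notion of "low lexical overlap" concrete, the following is a minimal sketch of a ROUGE-L style score: an F1 computed from the longest common subsequence (LCS) of tokens. It is not the evaluation code used in the study. The whitespace tokenization, the balanced F1 (rather than a weighted F-measure), the function names `lcs_length` and `rouge_l_f1`, and the two example sentences are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L style F1 over lowercased, whitespace-split tokens (an assumed, simplified variant)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical profiles, invented for illustration only.
human = "I study how genes shape heart disease risk"
machine = "Research focuses on cardiovascular genetics and risk prediction"

print(rouge_l_f1(human, machine))  # 0.125: only "risk" is shared
```

The two example sentences describe closely related interests yet share a single token, so the LCS-based score is low. This is the kind of gap between lexical and semantic similarity that the abstract reports when contrasting ROUGE-L, BLEU, and METEOR with BERTScore.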