ResearchTrend.AI
Synthesized Annotation Guidelines are Knowledge-Lite Boosters for Clinical Information Extraction

1 April 2025
Enshuo Hsu
Martin Ugbala
Krishna Kumar Kookal
Zouaidi Kawtar
Nicholas L. Rider
Muhammad F. Walji
Kirk Roberts
Abstract

Generative information extraction using large language models (LLMs), particularly through few-shot learning, has become a popular method. Recent studies indicate that providing a detailed, human-readable guideline, similar to the annotation guidelines traditionally used to train human annotators, can significantly improve performance. However, constructing such guidelines is both labor- and knowledge-intensive. Additionally, the definitions are often tailored to specific needs, making them highly task-specific and rarely reusable, and handling these subtle differences requires considerable effort and attention to detail. In this study, we propose a self-improving method that harnesses the knowledge-summarization and text-generation capacity of LLMs to synthesize annotation guidelines while requiring virtually no human input. Our zero-shot experiments on four clinical named entity recognition benchmarks (2012 i2b2 EVENT, 2012 i2b2 TIMEX, 2014 i2b2, and 2018 n2c2) showed improvements of 25.86%, 4.36%, 0.20%, and 7.75% in strict F1 scores over the no-guideline baseline. The LLM-synthesized guidelines matched or outperformed human-written guidelines by 1.15% to 4.14% in most tasks. In conclusion, this study proposes a novel LLM self-improving method that requires minimal knowledge and human input and is applicable to multiple biomedical domains.
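The "strict F1" figures above are the standard entity-level metric: a predicted entity counts as correct only if both its character span and its type match a gold annotation exactly. A minimal sketch of that metric (a hypothetical helper for illustration, not taken from the paper's code):

```python
def strict_f1(gold, pred):
    """Strict micro-F1 for NER.

    Entities are (start, end, type) tuples; a prediction is a true
    positive only if span offsets AND entity type match exactly.
    """
    gold_set, pred_set = set(gold), set(pred)
    if not gold_set or not pred_set:
        return 0.0
    tp = len(gold_set & pred_set)          # exact span+type matches
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: one entity is typed EVENT instead of TIMEX, so it is
# penalized despite the correct span.
gold = [(0, 4, "EVENT"), (10, 18, "TIMEX"), (25, 30, "EVENT")]
pred = [(0, 4, "EVENT"), (10, 18, "EVENT"), (25, 30, "EVENT")]
score = strict_f1(gold, pred)  # 2 of 3 correct → F1 ≈ 0.667
```

Relaxed variants (span overlap, type-agnostic) exist as well, but the paper's headline numbers use the strict criterion, which is why guideline quality, which sharpens entity-type definitions, moves the score so much.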

@article{hsu2025_2504.02871,
  title={Synthesized Annotation Guidelines are Knowledge-Lite Boosters for Clinical Information Extraction},
  author={Enshuo Hsu and Martin Ugbala and Krishna Kumar Kookal and Zouaidi Kawtar and Nicholas L. Rider and Muhammad F. Walji and Kirk Roberts},
  journal={arXiv preprint arXiv:2504.02871},
  year={2025}
}