A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks

6 October 2023
Fangshuo Liao
Md Tahmid Rahman Laskar
Cruz Barnum
Jimmy Xiangji Huang
AI4MH, LM&MA

Papers citing "A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks"

31 / 31 papers shown
Conversational No-code, Multi-agentic Disease Module Identification and Drug Repurposing Prediction with ChatDRex
Simon Süwer
Kester Bagemihl
Sylvie Baier
Lucia Dicunta
M. List
Jan Baumbach
Andreas Maier
Fernando M. Delgado-Chaves
170
0
0
26 Nov 2025
BanglaMedQA and BanglaMMedBench: Evaluating Retrieval-Augmented Generation Strategies for Bangla Biomedical Question Answering
Sadia Sultana
Saiyma Sittul Muna
Mosammat Zannatul Samarukh
Ajwad Abrar
Tareque Mohmud Chowdhury
RALM
256
1
0
06 Nov 2025
DACIP-RC: Domain Adaptive Continual Instruction Pre-Training via Reading Comprehension on Business Conversations
Elena Khasanova
Harsh Saini
Md Tahmid Rahman Laskar
Xue-Yong Fu
Cheng Chen
Shashi Bhushan TN
CLL
143
1
0
09 Oct 2025
RELATE: Relation Extraction in Biomedical Abstracts with LLMs and Ontology Constraints
Olawumi Olasunkanmi
Mathew Satursky
Hong Yi
Chris Bizon
Harlin Lee
Stanley Ahalt
121
1
0
23 Sep 2025
Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance
Yue Fang
Yuxin Guo
Jiaran Gao
Hongxin Ding
Xinke Jiang
...
Yinghao Zhu
Zhibang Yang
Liantao Ma
Junfeng Zhao
Yasha Wang
LM&MA, LRM
225
3
0
19 Aug 2025
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Md Tahmid Rahman Laskar
Israt Jahan
Elham Dolatabadi
Chun Peng
E. Hoque
J. Huang
LM&MA
217
12
0
01 Jun 2025
Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare
Natallia Kokash
Lei Wang
Thomas H. Gillespie
Adam Belloum
Paola Grosso
Sara Quinney
Lang Li
Bernard de Bono
248
8
0
26 May 2025
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization
Ajwad Abrar
Farzana Tabassum
Sabbir Ahmed
LM&MA, ELM, AI4MH
422
8
0
08 May 2025
EvidenceBench: A Benchmark for Extracting Evidence from Biomedical Papers
Jiadong Wang
Weili Cao
Kaicheng Wang
Xiaoyue Wang
Ashish Dalvi
...
David E. Neal
Maxim Khan
Christopher D. Rosin
R. Paturi
Leon Bergen
361
3
0
25 Apr 2025
Classification of User Reports for Detection of Faulty Computer Components using NLP Models: A Case Study
Maria de Lourdes M. Silva
André L. C. Mendonça
Eduardo R. D. Neto
Iago C. Chaves
Felipe T. Brito
V. A. E. Farias
Javam C. Machado
182
2
0
20 Mar 2025
Can Frontier LLMs Replace Annotators in Biomedical Text Mining? Analyzing Challenges and Exploring Solutions
Yichong Zhao
Susumu Goto
316
2
0
05 Mar 2025
Position: Beyond Assistance - Reimagining LLMs as Ethical and Adaptive Co-Creators in Mental Health Care
Abeer Badawi
Md Tahmid Rahman Laskar
J. Huang
Shaina Raza
Elham Dolatabadi
AI4MH
313
1
0
21 Feb 2025
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain
Haochen Zhao
Xiangru Tang
Ziran Yang
Xiao Han
Xuanzhi Feng
...
Senhao Cheng
Di Jin
Yilun Zhao
Arman Cohan
Mark B. Gerstein
ELM
281
8
0
23 Nov 2024
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Tobi Olatunji
Charles Nimo
A. Owodunni
Tassallah Abdullahi
Emmanuel Ayodele
...
Michael Best
Irfan Essa
Stephen E. Moore
Chris Fourie
Mercy Nyamewaa Asiedu
LM&MA
912
25
0
23 Nov 2024
Zodiac: A Cardiologist-Level LLM Framework for Multi-Agent Diagnostics
Yuan Zhou
Peng Zhang
Mengya Song
Alice Zheng
Yiwen Lu
Zhiheng Liu
Yong Chen
Zhaohan Xi
LM&MA
176
10
0
02 Oct 2024
Assessing and Enhancing Large Language Models in Rare Disease Question-answering
Guanchu Wang
Junhao Ran
Ruixiang Tang
Chia-Yuan Chang
Yu-Neng Chuang
Zirui Liu
Vladimir Braverman
Zhandong Liu
Xia Hu
LM&MA
268
21
0
15 Aug 2024
Stochastic Parrots or ICU Experts? Large Language Models in Critical Care Medicine: A Scoping Review
Tongyue Shi
Jun Ma
Zihan Yu
Haowei Xu
Minqi Xiong
Meirong Xiao
Yilin Li
Huiying Zhao
Guilan Kong
254
4
0
27 Jul 2024
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Md Tahmid Rahman Laskar
Sawsan Alqahtani
M Saiful Bari
Mizanur Rahman
Mohammad Abdullah Matin Khan
...
Chee Wei Tan
Md. Rizwan Parvez
Enamul Hoque
Shafiq Joty
Jimmy Huang
ELM, ALM
317
110
0
04 Jul 2024
Evaluation of Language Models in the Medical Context Under Resource-Constrained Settings
Andrea Posada
Daniel Rueckert
Felix Meissen
Philip Muller
LM&MA, ELM
267
1
0
24 Jun 2024
Large Language Models in the Clinic: A Comprehensive Benchmark
Andrew Liu
Hongjian Zhou
Yining Hua
Omid Rohanian
Anshul Thakur
Lei A. Clifton
David Clifton
AI4MH, LM&MA
342
24
0
25 Apr 2024
A Comprehensive Survey on Evaluating Large Language Model Applications in the Medical Industry
Yining Huang
Keke Tang
Meilian Chen
Boyuan Wang
ELM, LM&MA
520
31
0
24 Apr 2024
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
Taojun Hu
Xiao-Hua Zhou
ELM
373
50
0
14 Apr 2024
Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review
Jinge Wang
Zien Cheng
Qiuming Yao
Li Liu
Dong Xu
Gangqing Hu
LM&MA, AI4CE
399
3
0
22 Mar 2024
RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain
William James Bolton
Rafael Poyiadzi
Edward R. Morrell
Gabriela van Bergen Gonzalez Bueno
Lea Goetz
289
7
0
21 Mar 2024
Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey
Qizhi Pei
Lijun Wu
Ran Bi
Jinhua Zhu
Yue Wang
Guoqing Liu
Tao Qin
Lijun Wu
Rui Yan
AI4CE
545
25
0
03 Mar 2024
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
Melanie Subbiah
Sean Zhang
Lydia B. Chilton
Kathleen McKeown
449
29
0
02 Mar 2024
Biomedical Entity Linking as Multiple Choice Question Answering
Zhenxi Lin
Ziheng Zhang
Xian Wu
Yefeng Zheng
439
4
0
23 Feb 2024
An Evaluation of Large Language Models in Bioinformatics Research
Hengchuang Yin
Zhonghui Gu
Fanhao Wang
Yiparemu Abuduhaibaier
Yanqiao Zhu
Xinming Tu
Xian-Sheng Hua
Xiao Luo
Luke Huan
LM&MA
275
8
0
21 Feb 2024
Tiny Titans: Can Smaller Large Language Models Punch Above Their Weight in the Real World for Meeting Summarization?
Xue-Yong Fu
Md Tahmid Rahman Laskar
Elena Khasanova
Cheng-Hsiung Chen
TN ShashiBhushan
ALM
339
44
0
01 Feb 2024
A comparative study of zero-shot inference with large language models and supervised modeling in breast cancer pathology classification
Research Square (RS), 2024
Madhumita Sushil
T. Zack
Divneet Mandair
Zhiwei Zheng
Ahmed Wali
Yan-Ning Yu
Yuwei Quan
A. Butte
332
9
0
25 Jan 2024
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP
International Conference on Language Resources and Evaluation (LREC), 2023
M. Kabir
Mohammed Saidul Islam
Md Tahmid Rahman Laskar
Mir Tafseer Nayeem
M Saiful Bari
Enamul Hoque
LM&MA
391
31
0
22 Sep 2023