Towards Leveraging Large Language Models for Automated Medical Q&A Evaluation

3 September 2024
Jack Krolik
Herprit Mahal
Feroz Ahmad
Gaurav Trivedi
Bahador Saket
ELM, LM&MA

Papers citing "Towards Leveraging Large Language Models for Automated Medical Q&A Evaluation"

6 / 6 papers shown
The Biased Oracle: Assessing LLMs' Understandability and Empathy in Medical Diagnoses
Jianzhou Yao
Shunchang Liu
Guillaume Drui
Rikard Pettersson
Alessandro Blasimme
Sara Kijewski
02 Nov 2025
LaQual: A Novel Framework for Automated Evaluation of LLM App Quality
Yan Wang
Xinyi Hou
Yanjie Zhao
Weiguo Lin
Haoyu Wang
Junjun Si
ELM
26 Aug 2025
DistillNote: Toward a Functional Evaluation Framework of LLM-Generated Clinical Note Summaries
Heloisa Oss Boll
Antonio Oss Boll
Leticia Puttlitz Boll
Ameen Abu-Hanna
Iacer Calixto
20 Jun 2025
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Yu Sun
Xingyu Qian
Weiwen Xu
Hao Zhang
Chenghao Xiao
...
Yu Rong
Wenbing Huang
Qifeng Bai
LRM
11 Jun 2025
ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases
Y. Li
Xiaojun Zeng
Chihua Fang
Jian Yang
Fucang Jia
L. Zhang
LM&MA, ELM, AI4MH
30 May 2025
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Zonghai Yao
Aditya Parashar
Huixue Zhou
Won Seok Jang
Feiyun Ouyang
Zhichao Yang
Hong-ye Yu
ELM
17 Oct 2024