1809.00732
emrQA: A Large Corpus for Question Answering on Electronic Medical Records
3 September 2018
Anusri Pampari
Preethi Raghavan
Jennifer J. Liang
Jian-wei Peng
Papers citing
"emrQA: A Large Corpus for Question Answering on Electronic Medical Records"
50 / 127 papers shown
Title
Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering
Feras AlMannaa
Talia Tseriotou
Jenny Chim
Maria Liakata
ELM
163
0
0
21 Oct 2025
PaperArena: An Evaluation Benchmark for Tool-Augmented Agentic Reasoning on Scientific Literature
Daoyu Wang
Mingyue Cheng
Qi Liu
Shuo Yu
Zirui Liu
Ze Guo
LRM
209
1
0
13 Oct 2025
ELAIPBench: A Benchmark for Expert-Level Artificial Intelligence Paper Understanding
Xinbang Dai
Huikang Hu
Yongrui Chen
Jiaqi Li
Rihui Jin
Yuyang Zhang
Xiaoguang Li
Lifeng Shang
Guilin Qi
RALM
ELM
75
0
0
12 Oct 2025
TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning
Chuang Jiang
Mingyue Cheng
Xiaoyu Tao
Qingyang Mao
Jie Ouyang
Qi Liu
LLMAG
LMTD
ReLM
LRM
216
2
0
08 Sep 2025
Chatbot To Help Patients Understand Their Health
Won Seok Jang
Hieu Tran
Manav Mistry
SaiKiran Gandluri
Yifan Zhang
Sharmin Sultana
Sunjae Kwon
Yuan-kang Zhang
Zonghai Yao
Hong-ye Yu
AI4MH
LM&MA
155
0
0
06 Sep 2025
A Graph-Based Test-Harness for LLM Evaluation
Jessica Lundin
Guillaume Chabot-Couture
ELM
71
1
0
28 Aug 2025
Evaluating Retrieval-Augmented Generation vs. Long-Context Input for Clinical Reasoning over EHRs
Skatje Myers
Dmitriy Dligach
T. Miller
Samantha Barr
Yanjun Gao
M. Churpek
Anoop Mayampurath
Majid Afshar
RALM
83
2
0
20 Aug 2025
DR.EHR: Dense Retrieval for Electronic Health Record with Knowledge Injection and Synthetic Data
Zhengyun Zhao
Huaiyuan Ying
Yue Zhong
S. Yu
107
0
0
24 Jul 2025
From Queries to Criteria: Understanding How Astronomers Evaluate LLMs
Alina Hyk
Kiera McCormick
Mian Zhong
I. Ciucă
Sanjib Sharma
John F. Wu
J. E. G. Peek
K. Iyer
Ziang Xiao
Anjalie Field
149
2
0
21 Jul 2025
Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering
Sai Prasanna Teja Reddy Bogireddy
Abrar Majeedi
Viswanatha Reddy Gajjala
Zhuoyan Xu
Siddhant Rai
Vaishnav Potlapalli
233
1
0
12 Jun 2025
Toward Scientific Reasoning in LLMs: Training from Expert Discussions via Reinforcement Learning
Ming Yin
Yuanhao Qu
Dyllan Liu
Ling Yang
Le Cong
171
0
0
26 May 2025
MedScore: Generalizable Factuality Evaluation of Free-Form Medical Answers by Domain-adapted Claim Decomposition and Verification
Heyuan Huang
Alexandra DeLucia
Vijay Murari Tiyyala
Mark Dredze
HILM
MedIm
282
1
0
24 May 2025
Experience Retrieval-Augmentation with Electronic Health Records Enables Accurate Discharge QA
Justice Ou
Tinglin Huang
Yilun Zhao
Ziyang Yu
Peiqing Lu
Rex Ying
RALM
162
3
0
23 Mar 2025
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning
Xiangru Tang
Daniel Shao
Jiwoong Sohn
Jiapeng Chen
Jiayi Zhang
...
Yilun Zhao
Chenglin Wu
Wenqi Shi
Arman Cohan
Mark B. Gerstein
AI4MH
LRM
ELM
LM&MA
276
25
0
10 Mar 2025
BPQA Dataset: Evaluating How Well Language Models Leverage Blood Pressures to Answer Biomedical Questions
Chi Hang
Ruiqi Deng
L. Jiang
Zihao Yang
Anton Alyakin
Daniel Alber
E. Oermann
AI4MH
LM&MA
168
0
0
06 Mar 2025
EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports
L. Moukheiber
Mira Moukheiber
Dana Moukheiber
Jae-Woo Ju
Hyung-Chul Lee
LM&MA
300
1
0
04 Mar 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Information Fusion (Inf. Fusion), 2023
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Xiaoshi Zhong
LM&MA
AILaw
626
256
0
28 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
PLOS Digital Health (PDH), 2024
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
488
15
0
08 Jan 2025
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Xuanliang Zhang
Dingzirui Wang
Baoxin Wang
Longxu Dou
Xinyuan Lu
Keyan Xu
Dayong Wu
Qingfu Zhu
Wanxiang Che
LMTD
930
5
0
16 Dec 2024
Large Language Model Benchmarks in Medical Tasks
Lawrence K. Q. Yan
Ming Li
Yujiao Shi
Cheng Fei
...
Junyu Liu
Xinyuan Song
Riyang Bao
Zekun Jiang
Ziyuan Qin
LM&MA
AI4MH
603
19
0
28 Oct 2024
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zetian Ouyang
Yishuai Qiu
Linlin Wang
Gerard de Melo
Ya Zhang
Yanfeng Wang
Liang He
LM&MA
AI4MH
ELM
137
10
0
04 Oct 2024
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations
Smart Health (SH), 2024
Ziyu Wang
Hao Li
Di Huang
Chae-Won Shin
Amir M. Rahmani
LM&MA
315
28
0
28 Sep 2024
WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain
Rounak Meyur
Hung Phan
S. Wagle
Jan Strube
M. Halappanavar
Sameera Horawalavithana
Anurag Acharya
Sai Munikoti
223
0
0
21 Aug 2024
RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions
Gregory Kell
A. Roberts
Serge Umansky
Yuti Khare
Najma Ahmed
...
Chloe Simela
Jack Coumbe
Julian Rozario
Ryan-Rhys Griffiths
Iain J. Marshall
122
0
0
16 Aug 2024
MedSyn: LLM-based Synthetic Medical Text Generation Framework
Gleb Kumichev
Pavel Blinov
Yulia Kuzkina
Vasily Goncharov
Galina Zubkova
Nikolai Zenovkin
Aleksei Goncharov
Andrey Savchenko
SyDa
MedIm
262
31
0
04 Aug 2024
KaPQA: Knowledge-Augmented Product Question-Answering
Swetha Eppalapally
Daksh Dangi
Chaithra Bhat
Ankita Gupta
Ruiyi Zhang
...
Karishma Bagga
Seunghyun Yoon
Nedim Lipka
Ryan Rossi
Franck Dernoncourt
RALM
192
3
0
22 Jul 2024
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
Shraman Pramanick
Rama Chellappa
Subhashini Venugopalan
373
48
0
12 Jul 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
247
85
0
06 Jun 2024
A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions
Lei Liu
Xiaoyan Yang
Junchi Lei
Xiaoyang Liu
Yue Shen
...
Peng Wei
Jinjie Gu
Zhixuan Chu
Zhan Qin
Kui Ren
LM&MA
AILaw
233
37
0
06 Jun 2024
LG AI Research & KAIST at EHRSQL 2024: Self-Training Large Language Models with Pseudo-Labeled Unanswerable Questions for a Reliable Text-to-SQL System on EHRs
Clinical Natural Language Processing Workshop (ClinicalNLP), 2024
Yongrae Jo
Seongyun Lee
Minju Seo
Sung Ju Hwang
Moontae Lee
130
5
0
18 May 2024
MedConceptsQA: Open Source Medical Concepts QA Benchmark
Ofir Ben Shoham
Nadav Rappoport
AI4MH
ELM
234
10
0
12 May 2024
DOLOMITES: Domain-Specific Long-Form Methodical Tasks
Transactions of the Association for Computational Linguistics (TACL), 2024
Chaitanya Malaviya
Priyanka Agrawal
Kuzman Ganchev
Pranesh Srinivasan
Fantine Huot
Jonathan Berant
Mark Yatskar
Dipanjan Das
Mirella Lapata
Chris Alberti
239
9
0
09 May 2024
Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL
Yongjin Yang
Sihyeon Kim
Sangmook Kim
Gyubok Lee
Se-Young Yun
Edward Choi
170
3
0
29 Apr 2024
emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information
Jimenez Eladio
Hao Wu
167
4
0
18 Apr 2024
Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
Taojun Hu
Xiao-Hua Zhou
ELM
215
42
0
14 Apr 2024
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Hanjie Chen
Zhouxiang Fang
Yash Singla
Mark Dredze
ELM
AI4MH
400
82
0
28 Feb 2024
EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries
Sunjun Kweon
Jiyoun Kim
Heeyoung Kwak
Dongchul Cha
Hangyul Yoon
Kwanghyun Kim
Jeewon Yang
Seunghyun Won
Edward Choi
LM&MA
446
24
0
25 Feb 2024
Me LLaMA: Foundation Large Language Models for Medical Applications
Qianqian Xie
Qingyu Chen
Aokun Chen
C.A.I. Peng
Yan Hu
...
Huan He
Lucila Ohno-Machado
Yonghui Wu
Hua Xu
Jiang Bian
LM&MA
AI4MH
306
37
0
20 Feb 2024
Retrieval-Augmented Thought Process as Sequential Decision Making
T. Pouplin
Hao Sun
Samuel Holt
M. Schaar
KELM
RALM
LRM
110
2
0
12 Feb 2024
SA-MDKIF: A Scalable and Adaptable Medical Domain Knowledge Injection Framework for Large Language Models
Tianhan Xu
Zhe Hu
LingSen Chen
Bin Li
LM&MA
187
1
0
01 Feb 2024
XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering
Joel Stremmel
A. Saeedi
Hamid Hassanzadeh
Sanjit Batra
Jeffrey K Hertzberg
Jaime Murillo
Eran Halperin
MedIm
138
1
0
06 Dec 2023
Fine-tuning pre-trained extractive QA models for clinical document parsing
Ashwyn Sharma
David I. Feldman
Aneesh Jain
208
0
0
04 Dec 2023
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge
Hongjian Zhou
Fenglin Liu
Boyang Gu
Xinyu Zou
Jinfa Huang
...
Yefeng Zheng
Lei A. Clifton
Zheng Li
Fenglin Liu
David Clifton
LM&MA
564
179
0
09 Nov 2023
Adapting Pre-trained Generative Models for Extractive Question Answering
IEEE Games Entertainment Media Conference (IEEE GEM), 2023
Prabir Mallick
Tapas Nayak
Indrajit Bhattacharya
155
10
0
06 Nov 2023
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
Neural Information Processing Systems (NeurIPS), 2023
Seongsu Bae
Daeun Kyung
Jaehee Ryu
Eunbyeol Cho
Gyubok Lee
...
Jungwoo Oh
Lei Ji
E. Chang
Tackeun Kim
Edward Choi
248
43
0
28 Oct 2023
MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zexue He
Yu Wang
An Yan
Yao Liu
Eric Y. Chang
Amilcare Gentili
Julian McAuley
Chun-Nan Hsu
ELM
371
22
0
21 Oct 2023
CLIFT: Analysing Natural Distribution Shift on Question Answering Models in Clinical Domain
Ankit Pal
180
2
0
19 Oct 2023
NuclearQA: A Human-Made Benchmark for Language Models for the Nuclear Domain
Anurag Acharya
Sai Munikoti
Aaron Hellinger
Sara Smith
S. Wagle
Sameera Horawalavithana
ELM
223
6
0
17 Oct 2023
Emerging Challenges in Personalized Medicine: Assessing Demographic Effects on Biomedical Question Answering Systems
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Sagi Shaier
Kevin Bennett
Lawrence E Hunter
Katharina von der Wense
146
0
0
16 Oct 2023
Question Answering for Electronic Health Records: A Scoping Review of datasets and models
Journal of Medical Internet Research (JMIR), 2023
Jayetri Bardhan
Kirk Roberts
Daisy Zhe Wang
338
8
0
12 Oct 2023