2306.06494
Cited By
Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark
10 June 2023
Li Xu
Bo Liu
Ameer Hamza Khan
Lu Fan
Xiao-Ming Wu
LM&MA
Papers citing
"Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark"
11 / 11 papers shown
Title
MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models
Shansong Liu
Atin Sakkeer Hussain
Qilong Wu
Chenshuo Sun
Ying Shan
AuLLM
61
3
0
09 Dec 2024
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
Bo Liu
K. Zou
Liming Zhan
Zexin Lu
Xiaoyu Dong
Yidi Chen
Chengqiang Xie
Jiannong Cao
Xiao-Ming Wu
Huazhu Fu
120
0
0
25 Nov 2024
Daily Physical Activity Monitoring -- Adaptive Learning from Multi-source Motion Sensor Data
Haoting Zhang
Donglin Zhan
Yunduan Lin
Jinghai He
Qing Zhu
Z. Shen
Zeyu Zheng
40
0
0
26 May 2024
RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning
Congyun Jin
Ming Zhang
Xiaowei Ma
Yujiao Li
Yingbo Wang
...
Chenfei Chi
Xiangguo Lv
Fangzhou Li
Wei Xue
Yiran Huang
LM&MA
25
2
0
19 Feb 2024
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
27
14
0
11 Dec 2023
Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge
Zhihong Chen
Guanbin Li
Xiang Wan
119
65
0
15 Sep 2022
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
79
111
0
15 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation
Fenglin Liu
Chenyu You
Xian Wu
Shen Ge
Sheng Wang
Xu Sun
MedIm
73
91
0
08 Nov 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
927
0
24 Sep 2019
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,464
0
06 Jun 2016