ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.00890
  4. Cited By
LLaVA-Med: Training a Large Language-and-Vision Assistant for
  Biomedicine in One Day

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day

1 June 2023
Chunyuan Li
Cliff Wong
Sheng Zhang
Naoto Usuyama
Haotian Liu
Jianwei Yang
Tristan Naumann
Hoifung Poon
Jianfeng Gao
    LM&MA
    MedIm
ArXivPDFHTML

Papers citing "LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day"

50 / 110 papers shown
Title
Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Clinical Pathology Analysis
Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Clinical Pathology Analysis
Shengxuming Zhang
Weihan Li
Tianhong Gao
Jiacong Hu
Haoming Luo
Mingli Song
Xiuming Zhang
Mingli Song
Zunlei Feng
LM&MA
101
0
0
12 Dec 2024
On Domain-Specific Post-Training for Multimodal Large Language Models
On Domain-Specific Post-Training for Multimodal Large Language Models
Daixuan Cheng
Shaohan Huang
Ziyu Zhu
Xintong Zhang
Wayne Xin Zhao
Zhongzhi Luan
Bo Dai
Zhenliang Zhang
VLM
87
2
0
29 Nov 2024
Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
Xi Zhang
Zaiqiao Meng
Jake Lever
Edmond S. L. Ho
MedIm
94
0
0
28 Nov 2024
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
Bo Liu
K. Zou
Liming Zhan
Zexin Lu
Xiaoyu Dong
Yidi Chen
Chengqiang Xie
Jiannong Cao
Xiao-Ming Wu
Huazhu Fu
118
0
0
25 Nov 2024
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge
Vishwesh Nath
Wenqi Li
Dong Yang
Andriy Myronenko
Mingxin Zheng
...
Holger Roth
Daguang Xu
Baris Turkbey
Holger Roth
Daguang Xu
VLM
90
4
0
19 Nov 2024
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Yingzi Ma
Jiongxiao Wang
Fei-Yue Wang
Siyuan Ma
Jiazhao Li
...
B. Li
Yejin Choi
M. Chen
Chaowei Xiao
Chaowei Xiao
MU
52
6
0
05 Nov 2024
Collective Model Intelligence Requires Compatible Specialization
Collective Model Intelligence Requires Compatible Specialization
Jyothish Pari
Samy Jelassi
Pulkit Agrawal
MoMe
30
1
0
04 Nov 2024
The Potential of LLMs in Medical Education: Generating Questions and Answers for Qualification Exams
The Potential of LLMs in Medical Education: Generating Questions and Answers for Qualification Exams
Yunqi Zhu
Wen Tang
Ying Sun
Xuebing Yang
Liyang Dou
Yifan Gu
Yuanyuan Wu
Wensheng Zhang
Ying Sun
Xuebing Yang
LM&MA
ELM
38
1
0
31 Oct 2024
Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
Seulbi Lee
J. Kim
Sangheum Hwang
LRM
43
0
0
19 Oct 2024
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Chenhang Cui
An Zhang
Yiyang Zhou
Zhaorun Chen
Gelei Deng
Huaxiu Yao
Tat-Seng Chua
58
4
0
18 Oct 2024
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding
Ying Chen
Guoan Wang
Yuanfeng Ji
Yanjun Li
Jin Ye
Tianbin Li
Bin Zhang
Nana Pei
Rongshan Yu
Yu Qiao
VLM
LM&MA
51
2
0
15 Oct 2024
From Generalist to Specialist: Adapting Vision Language Models via
  Task-Specific Visual Instruction Tuning
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning
Yang Bai
Yang Zhou
Jun Zhou
Rick Siow Mong Goh
Daniel Ting
Yong Liu
VLM
44
0
0
09 Oct 2024
LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models
LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models
Zhenyue Qin
Yu Yin
Dylan Campbell
Xuansheng Wu
Ke Zou
Yih-Chung Tham
Ninghao Liu
Xiuzhen Zhang
Qingyu Chen
36
1
0
02 Oct 2024
MIO: A Foundation Model on Multimodal Tokens
MIO: A Foundation Model on Multimodal Tokens
Zekun Wang
King Zhu
Chunpu Xu
Wangchunshu Zhou
Jiaheng Liu
...
Yuanxing Zhang
Ge Zhang
Ke Xu
Jie Fu
Wenhao Huang
MLLM
AuLLM
48
11
0
26 Sep 2024
MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid
  via Edge LLM
MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid via Edge LLM
Sijie Ji
Xinzhe Zheng
Jiawei Sun
Renqi Chen
Wei Gao
Mani Srivastava
AI4MH
32
2
0
16 Sep 2024
An Embedding is Worth a Thousand Noisy Labels
An Embedding is Worth a Thousand Noisy Labels
Francesco Di Salvo
Sebastian Doerrich
Ines Rieger
Christian Ledig
NoLa
68
0
0
26 Aug 2024
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
Xiaochen Wang
Jiaqi Wang
Houping Xiao
J. Chen
Fenglong Ma
MedIm
61
7
0
17 Aug 2024
Beyond the Hype: A dispassionate look at vision-language models in medical scenario
Beyond the Hype: A dispassionate look at vision-language models in medical scenario
Yang Nan
Huichi Zhou
Xiaodan Xing
Guang Yang
46
3
0
16 Aug 2024
AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging
AgentsCoMerge: Large Language Model Empowered Collaborative Decision Making for Ramp Merging
Senkang Hu
Zhengru Fang
Zihan Fang
Yiqin Deng
Xianhao Chen
Yuguang Fang
Sam Kwong
40
13
0
07 Aug 2024
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
Yunfei Xie
Ce Zhou
Lang Gao
Juncheng Wu
Xianhang Li
...
Sheng Liu
Lei Xing
James Zou
Cihang Xie
Yuyin Zhou
LM&MA
MedIm
71
23
0
06 Aug 2024
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Fushuo Huo
Wenchao Xu
Zhong Zhang
Haozhao Wang
Zhicheng Chen
Peilin Zhao
VLM
MLLM
61
18
0
04 Aug 2024
Privacy-preserving datasets by capturing feature distributions with
  Conditional VAEs
Privacy-preserving datasets by capturing feature distributions with Conditional VAEs
Francesco Di Salvo
David Tafler
Sebastian Doerrich
Christian Ledig
CML
21
0
0
01 Aug 2024
ExpertAF: Expert Actionable Feedback from Video
ExpertAF: Expert Actionable Feedback from Video
Kumar Ashutosh
Tushar Nagarajan
Georgios Pavlakos
Kris M. Kitani
Kristen Grauman
VGen
42
2
0
01 Aug 2024
Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering
Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering
Danfeng Guo
Sumitaka Honji
LRM
59
0
0
31 Jul 2024
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation
Liwen Sun
James Zhao
Megan Han
Chenyan Xiong
MedIm
40
7
0
21 Jul 2024
DALL-M: Context-Aware Clinical Data Augmentation with LLMs
DALL-M: Context-Aware Clinical Data Augmentation with LLMs
Chihcheng Hsieh
Catarina Moreira
Isabel Blanco Nobre
Sandra Costa Sousa
Chun Ouyang
M. Brereton
Joaquim A. Jorge
Jacinto C. Nascimento
41
0
0
11 Jul 2024
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question
  Answering
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
Pingyi Chen
Chenglu Zhu
Sunyi Zheng
Honglin Li
Lin Yang
35
6
0
08 Jul 2024
CaRe-Ego: Contact-aware Relationship Modeling for Egocentric Interactive Hand-object Segmentation
CaRe-Ego: Contact-aware Relationship Modeling for Egocentric Interactive Hand-object Segmentation
Yuejiao Su
Yi Wang
Lap-Pui Chau
57
1
0
08 Jul 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
59
23
0
28 Jun 2024
Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming
Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming
Victor-Alexandru Pădurean
Adish Singla
ELM
44
3
0
14 Jun 2024
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun
Gregory M. Goldgof
Alexander Schubert
Zhiqing Sun
Thomas Hartvigsen
A. Butte
Ahmed Alaa
LM&MA
27
4
0
29 May 2024
Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge
  Transfer
Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer
Zengqun Zhao
Yu Cao
Shaogang Gong
Ioannis Patras
37
6
0
29 May 2024
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
Laura Fieback
Jakob Spiegelberg
Hanno Gottschalk
MLLM
54
5
0
29 May 2024
A Misleading Gallery of Fluid Motion by Generative Artificial
  Intelligence
A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence
Ali Kashefi
VGen
40
5
0
24 May 2024
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image
  Analysis
A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
Yue Yang
Mona Gandhi
Yufei Wang
Yifan Wu
Michael S. Yao
Christopher Callison-Burch
James C. Gee
Mark Yatskar
38
3
0
23 May 2024
BiomedParse: a biomedical foundation model for image parsing of
  everything everywhere all at once
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once
Theodore Zhao
Yu Gu
Jianwei Yang
Naoto Usuyama
Ho Hin Lee
...
B. Piening
Carlo Bifulco
Mu-Hsin Wei
Hoifung Poon
Sheng Wang
MedIm
18
22
0
21 May 2024
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
Xinyi Wang
Grazziela Figueredo
Ruizhe Li
W. Zhang
Weitong Chen
Xin Chen
MedIm
ViT
39
2
0
21 May 2024
Simplifying Multimodality: Unimodal Approach to Multimodal Challenges in
  Radiology with General-Domain Large Language Model
Simplifying Multimodality: Unimodal Approach to Multimodal Challenges in Radiology with General-Domain Large Language Model
Seonhee Cho
Choonghan Kim
Jiho Lee
Chetan Chilkunda
Sujin Choi
Joo Heung Yoon
42
0
0
29 Apr 2024
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang
Guandao Yang
Leonidas J. Guibas
32
3
0
26 Apr 2024
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT
  Analysis
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis
Xiaoman Zhang
Chaoyi Wu
Ziheng Zhao
Jiayu Lei
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
23
16
0
25 Apr 2024
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models
Bo Lin
Yingjing Xu
Xuanwen Bao
Zhou Zhao
Zuyong Zhang
Zhouyang Wang
54
2
0
23 Apr 2024
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering
Tiancheng Gu
Kaicheng Yang
Dongnan Liu
Weidong Cai
MedIm
24
2
0
19 Apr 2024
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation
  in Operating Rooms
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms
Diandian Guo
Manxi Lin
Jialun Pei
He Tang
Yueming Jin
Pheng-Ann Heng
24
1
0
14 Apr 2024
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Guan-Feng Wang
Long Bai
Wan Jun Nah
Jie Wang
Zhaoxi Zhang
Zhen Chen
Jinlin Wu
Mobarakol Islam
Hongbin Liu
Hongliang Ren
40
14
0
22 Mar 2024
Effectiveness Assessment of Recent Large Vision-Language Models
Effectiveness Assessment of Recent Large Vision-Language Models
Yao Jiang
Xinyu Yan
Ge-Peng Ji
Keren Fu
Meijun Sun
Huan Xiong
Deng-Ping Fan
Fahad Shahbaz Khan
27
14
0
07 Mar 2024
CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction Tuning
CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction Tuning
Yanqi Dai
Dong Jing
Nanyi Fei
Zhiwu Lu
Nanyi Fei
Guoxing Yang
Zhiwu Lu
43
3
0
07 Mar 2024
Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation
Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation
Maksim Kuprashevich
Grigorii Alekseenko
Irina Tolstykh
ELM
48
3
0
04 Mar 2024
Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction
Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction
Kuniaki Saito
Kihyuk Sohn
Chen-Yu Lee
Yoshitaka Ushiku
62
2
0
16 Feb 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
Hui Xiong
AI4CE
77
14
0
30 Jan 2024
LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts
  in Instruction Finetuning MLLMs
LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs
Shaoxiang Chen
Zequn Jie
Lin Ma
MoE
38
46
0
29 Jan 2024
Previous
123
Next