LOVA3: Learning to Visual Question Answering, Asking and Assessment | Neural Information Processing Systems (NeurIPS), 2024 |
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs | International Conference on Learning Representations (ICLR), 2024 |
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts | Computer Vision and Pattern Recognition (CVPR), 2023 |
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning | International Joint Conference on Artificial Intelligence (IJCAI), 2023 | Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Hao-Ran Wei, ..., Jian‐Yuan Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang |
Localized Questions in Medical Visual Question Answering | International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023 |
A Review on Explainability in Multimodal Deep Neural Nets | IEEE Access, 2021 |