Re-Imagining Multimodal Instruction Tuning: A Representation View. International Conference on Learning Representations (ICLR), 2025
Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens. Computer Vision and Pattern Recognition (CVPR), 2024
Referential communication in heterogeneous communities of pre-trained visual deep networks. International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023