Synthetic Visual Genome | Computer Vision and Pattern Recognition (CVPR), 2025
RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning | IEEE International Conference on Document Analysis and Recognition (ICDAR), 2025
ProAPO: Progressively Automatic Prompt Optimization for Visual Classification | Computer Vision and Pattern Recognition (CVPR), 2025
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration | IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Towards Visual Grounding: A Survey | IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Aria-UI: Visual Grounding for GUI Instructions | Annual Meeting of the Association for Computational Linguistics (ACL), 2024
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
F-LMM: Grounding Frozen Large Multimodal Models | Computer Vision and Pattern Recognition (CVPR), 2024