
Title |
|---|
DGTRSD & DGTRS-CLIP: A Dual-Granularity Remote Sensing Image-Text Dataset and Vision Language Foundation Model for Alignment. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025 |
Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding. Computer Vision and Pattern Recognition (CVPR), 2025 |
Towards Visual Grounding: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024 |
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling. Neural Information Processing Systems (NeurIPS), 2024 |