
Title |
|---|
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation. Computer Vision and Pattern Recognition (CVPR), 2024. Shuting He, Henghui Ding |
Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation. IEEE Transactions on Medical Imaging (IEEE TMI), 2024 |
GiT: Towards Generalist Vision Transformer through Universal Language Interface. European Conference on Computer Vision (ECCV), 2024 |
CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model. Neural Information Processing Systems (NeurIPS), 2024 |
-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations. European Conference on Computer Vision (ECCV), 2024 |
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning. International Conference on Learning Representations (ICLR), 2024 |
Unifying Visual and Vision-Language Tracking via Contrastive Learning. AAAI Conference on Artificial Intelligence (AAAI), 2024 |