Bridging Modality Gap for Visual Grounding with Effective Cross-modal Distillation. Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2023
DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization. IEEE International Conference on Computer Vision (ICCV), 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory. IEEE International Conference on Computer Vision (ICCV), 2023
Correspondence Matters for Video Referring Expression Comprehension. ACM Multimedia (ACM MM), 2022
LocVTP: Video-Text Pre-training for Temporal Localization. European Conference on Computer Vision (ECCV), 2022