Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

3 July 2020

Liwei Wang

Jing-ling Huang

Yin Li

Kun Xu

Zhengyuan Yang

Dong Yu

ObjD

Papers citing "Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation"

43 / 43 papers shown

LIHE: Linguistic Instance-Split Hyperbolic-Euclidean Framework for Generalized Weakly-Supervised Referring Expression Comprehension

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

191

15 Nov 2025

Learning Egocentric In-Hand Object Segmentation through Weak Supervision from Human Narrations

417

30 Sep 2025

Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding

208

08 Sep 2025

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

481

21 May 2025

3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment

IEEE International Conference on Robotics and Automation (ICRA), 2025

280

03 May 2025

Towards Visual Grounding: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

1.1K

28 Dec 2024

Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM

Navid Rajabi

Jana Kosecka

236

29 Apr 2024

How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding

257

29 Feb 2024

Cycle-Consistency Learning for Captioning and Grounding

321

23 Dec 2023

SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning

William Heyden

Habib Ullah

M. Salman Siddiqui

Fadi Al Machot

VLM

300

20 Dec 2023

Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment

IEEE Transactions on Multimedia (IEEE TMM), 2023

634

15 Dec 2023

EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model

IEEE Transactions on Multimedia (IEEE TMM), 2023

Guozhang Li

Xinpeng Ding

De Cheng

Jie Li

Nannan Wang

Xinbo Gao

507

05 Dec 2023

Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

325

12 Nov 2023

Shatter and Gather: Learning Referring Image Segmentation with Text Supervision

IEEE International Conference on Computer Vision (ICCV), 2023

339

29 Aug 2023

Referring Image Segmentation Using Text Supervision

IEEE International Conference on Computer Vision (ICCV), 2023

Fang Liu

344

28 Aug 2023

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

IEEE International Conference on Computer Vision (ICCV), 2023

Xize Cheng

Zhou Zhao

222

18 Jul 2023

Top-Down Framework for Weakly-supervised Grounded Image Captioning

Yi Wang

269

13 Jun 2023

Weakly-Supervised Visual-Textual Grounding with Semantic Prior Refinement

British Machine Vision Conference (BMVC), 2023

238

18 May 2023

CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

IEEE Transactions on Multimedia (IEEE TMM), 2023

Linhui Xiao

Xiaoshan Yang

Fang Peng

Ming Yan

Yaowei Wang

Changsheng Xu

ObjD VLM

563

15 May 2023

Focusing On Targets For Improving Weakly Supervised Visual Grounding

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

V. Pham

Nao Mishima

ObjD

238

22 Feb 2023

Who are you referring to? Coreference resolution in image narrations

IEEE International Conference on Computer Vision (ICCV), 2022

359

26 Nov 2022

A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation

Lei Zhang

207

15 Nov 2022

Exploring Generalizable Distillation for Efficient Medical Image Segmentation

IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2022

261

26 Jul 2022

Contrastive Deep Supervision

European Conference on Computer Vision (ECCV), 2022

336

12 Jul 2022

DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

AAAI Conference on Artificial Intelligence (AAAI), 2022

Huajun Chen

493

04 Jul 2022

Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations

Computer Vision and Pattern Recognition (CVPR), 2022

Ziyan Yang

Kushal Kafle

Franck Dernoncourt

Vicente Ordónez Román

VLM

454

30 Jun 2022

A Unified Continuous Learning Framework for Multi-modal Knowledge Discovery and Pre-training

Xuanjing Huang

172

11 Jun 2022

Guiding Visual Question Answering with Attention Priors

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

286

25 May 2022

Region-aware Knowledge Distillation for Efficient Image-to-Image Translation

British Machine Vision Conference (BMVC), 2022

315

25 May 2022

Beyond Bounding Box: Multimodal Knowledge Learning for Object Detection

198

09 May 2022

Adapting CLIP For Phrase Localization Without Further Training

315

07 Apr 2022

Multi-View Transformer for 3D Visual Grounding

Computer Vision and Pattern Recognition (CVPR), 2022

466

191

05 Apr 2022

TubeDETR: Spatio-Temporal Video Grounding with Transformers

Computer Vision and Pattern Recognition (CVPR), 2022

390

127

30 Mar 2022

Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language Structures via Dependency Relationships

Computer Vision and Pattern Recognition (CVPR), 2022

289

27 Mar 2022

Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

Computer Vision and Pattern Recognition (CVPR), 2022

Gao Huang

430

16 Mar 2022

GroupViT: Semantic Segmentation Emerges from Text Supervision

Computer Vision and Pattern Recognition (CVPR), 2022

937

685

22 Feb 2022

Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching

Neurocomputing, 2022

Hengcan Shi

Munawar Hayat

Jianfei Cai

ObjD

275

18 Jan 2022

Injecting Semantic Concepts into End-to-End Image Captioning

Xiaowei Hu

Yezhou Yang

Zicheng Liu

ViT VLM

289

124

09 Dec 2021

Making a Bird AI Expert Work for You and Me

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

341

06 Dec 2021

Weakly-Supervised Video Object Grounding via Causal Intervention

358

01 Dec 2021

A Survey on Temporal Sentence Grounding in Videos

406

16 Sep 2021

Distributed Attention for Grounded Image Captioning

Wenping Wang

470

02 Aug 2021

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

Computer Vision and Pattern Recognition (CVPR), 2021

298

24 Mar 2021