1704.03470
Cited By
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
11 April 2017
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
Papers citing
"Learning Two-Branch Neural Networks for Image-Text Matching Tasks"
50 / 189 papers shown
Annotating Satellite Images of Forests with Keywords from a Specialized Corpus in the Context of Change Detection
International Conference on Content-Based Multimedia Indexing (CBMI), 2023
Nathalie Neptune
Josiane Mothe
102
1
0
16 Sep 2025
Visual Grounding from Event Cameras
Lingdong Kong
Dongyue Lu
Ao Liang
Rong Li
Yuhao Dong
Tianshuai Hu
Lai Xing Ng
Wei Tsang Ooi
Benoit R. Cottereau
VGen
170
1
0
11 Sep 2025
Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding
Jiangnan Xie
Xiaolong Zheng
Liang Zheng
ObjD
206
0
0
08 Sep 2025
LLaVA-RE: Binary Image-Text Relevancy Evaluation with Multimodal Large Language Model
Tao Sun
Oliver Liu
JinJin Li
Lan Ma
124
0
0
07 Aug 2025
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Lingdong Kong
Dongyue Lu
Ao Liang
Rong Li
Yuhao Dong
Tianshuai Hu
Lai Xing Ng
Wei Tsang Ooi
Benoit R. Cottereau
VGen
384
5
0
23 Jul 2025
Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
Duc Cao-Dinh
Khai Le-Duc
Anh Dao
Bach Phan Tat
Chris Ngo
Duy M. H. Nguyen
Nguyen X. Khanh
Thanh Nguyen-Tang
287
0
0
01 Jul 2025
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
Junli Liu
Qizhi Chen
Zechuan Wang
Yiwen Tang
Yiting Zhang
Chi Yan
Dong Wang
Xiaochen Li
Jiangwei Zhong
CoGe
627
11
0
10 Apr 2025
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
Guanqi Zhan
Yuanpei Liu
Kai Han
Weidi Xie
Andrew Zisserman
VLM
1.2K
4
0
21 Feb 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
1.1K
43
0
28 Dec 2024
Linguistics-Vision Monotonic Consistent Network for Sign Language Production
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xu Wang
Shengeng Tang
Peipei Song
Shuo Wang
D. Guo
Richang Hong
SLR
378
10
0
22 Dec 2024
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding
International Conference on Pattern Recognition (ICPR), 2024
Yang Liu
Daizong Liu
Wei Hu
3DPC
438
9
0
21 Oct 2024
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding
ACM Multimedia (MM), 2024
Minghang Zheng
Jiahua Zhang
Qingchao Chen
Yuxin Peng
Yang Liu
ObjD
342
7
0
29 Aug 2024
Language-driven Grasp Detection with Mask-guided Attention
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Tuan V. Vo
M. Vu
Baoru Huang
An Vuong
Ngan Le
T. Vo
Anh Nguyen
235
6
0
29 Jul 2024
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi
Takashi Shibata
Makoto Terao
VLM
365
4
0
17 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
463
22
0
03 Jul 2024
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
355
4
0
05 Jun 2024
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
Jie Wang
Joemon M. Jose
262
3
0
05 Jun 2024
3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting
Xuri Ge
Songpei Xu
Fuhai Chen
Jie Wang
Guoxin Wang
Shan An
Joemon M. Jose
3DPC
324
24
0
26 Apr 2024
N-Modal Contrastive Losses with Applications to Social Media Data in Trimodal Space
William Theisen
Walter J. Scheirer
259
1
0
18 Mar 2024
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival
Yuanxin Zhao
Mi Zhang
Bingnan Yang
Zhan Zhang
Jiaju Kang
Jianya Gong
251
5
0
16 Mar 2024
REPAIR: Rank Correlation and Noisy Pair Half-replacing with Memory for Noisy Correspondence
IEEE Transactions on Multimedia (IEEE TMM), 2024
Ruochen Zheng
Jiahao Hong
Changxin Gao
Nong Sang
201
3
0
13 Mar 2024
How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding
Jiamin Luo
Jianing Zhao
Jingjing Wang
Guodong Zhou
257
0
0
29 Feb 2024
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2023
Jiaxi Wang
Wenhui Hu
Xueyang Liu
Beihu Wu
Yuting Qiu
Yingying Cai
347
2
0
29 Dec 2023
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Wei Tang
Liang Li
Xuejing Liu
Lu Jin
Jinhui Tang
Zechao Li
305
45
0
19 Dec 2023
Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment
IEEE Transactions on Multimedia (IEEE TMM), 2023
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
632
5
0
15 Dec 2023
Negative Pre-aware for Noisy Cross-modal Matching
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xu-Yao Zhang
Hao Li
Mang Ye
391
17
0
10 Dec 2023
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models
Haicheng Liao
Huanming Shen
Zhenning Li
Chengyue Wang
Guofa Li
Yiming Bie
Chengzhong Xu
310
80
0
06 Dec 2023
Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Tianrui Hui
Zihan Ding
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
Jiao Dai
Jizhong Han
Si Liu
353
8
0
02 Nov 2023
RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments
Neural Information Processing Systems (NeurIPS), 2023
Mengxue Qu
Yu-Huan Wu
Wu Liu
Xiaodan Liang
Jingkuan Song
Yao-Min Zhao
Yunchao Wei
273
21
0
26 Oct 2023
NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Haowei Wang
Jiayi Ji
Tianyu Guo
Yilong Yang
Weihao Ye
Xiaoshuai Sun
Rongrong Ji
428
10
0
17 Oct 2023
Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
Menghao Li
Chunlei Wang
W. Feng
Shuchang Lyu
Guangliang Cheng
Xiangtai Li
Binghao Liu
Qi Zhao
301
7
0
23 Jul 2023
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training
IEEE Transactions on Image Processing (IEEE TIP), 2023
Chong Liu
Yuqi Zhang
Hongsong Wang
Weihua Chen
F. Wang
Yan Huang
Yixing Shen
Liang Wang
281
48
0
15 Jun 2023
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Abisek Rajakumar Kalarani
P. Bhattacharyya
Niyati Chhaya
Sumit Shekhar
CoGe
VLM
290
12
0
01 Jun 2023
Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving
Wenhao Cheng
Junbo Yin
Wei Li
Ruigang Yang
Jianbing Shen
3DPC
239
22
0
25 May 2023
Click-Feedback Retrieval
Zeyu Wang
Yuehua Wu
329
0
0
28 Apr 2023
BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
Computer Vision and Pattern Recognition (CVPR), 2023
Shuo Yang
Zhaopan Xu
Kai Wang
Yang You
Huanjin Yao
Tongliang Liu
Min Xu
314
58
0
22 Mar 2023
Scene Graph Based Fusion Network For Image-Text Retrieval
IEEE International Conference on Multimedia and Expo (ICME), 2023
Guoliang Wang
Yanlei Shang
Yongzhe Chen
209
4
0
20 Mar 2023
Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening
Min Cao
Yang Bai
Wenwen Qiang
Ziqiang Cao
Liqiang Nie
Min Zhang
247
4
0
14 Mar 2023
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
Ziyang Luo
Pu Zhao
Can Xu
Xiubo Geng
Tao Shen
Chongyang Tao
Jing Ma
Qingwen Lin
Daxin Jiang
VLM
CLIP
220
3
0
06 Feb 2023
Open-vocabulary Object Segmentation with Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Ziyi Li
Qinye Zhou
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
VLM
384
95
0
12 Jan 2023
Universal Multimodal Representation for Language Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Zhuosheng Zhang
Kehai Chen
Rui Wang
Masao Utiyama
Eiichiro Sumita
Z. Li
Hai Zhao
SSL
324
32
0
09 Jan 2023
Learning Multimodal Data Augmentation in Feature Space
International Conference on Learning Representations (ICLR), 2022
Zichang Liu
Zhiqiang Tang
Xingjian Shi
Aston Zhang
Mu Li
Anshumali Shrivastava
A. Wilson
299
30
0
29 Dec 2022
Multimodal Query-guided Object Localization
Aditay Tripathi
Rajath R Dani
Anand Mishra
Anirban Chakraborty
276
0
0
01 Dec 2022
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2022
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
450
43
0
28 Nov 2022
SLAN: Self-Locator Aided Network for Cross-Modal Understanding
Jiang-Tian Zhai
Tao Gui
Tong Wu
Xinghan Chen
Jiangjiang Liu
Bo Ren
Ming-Ming Cheng
ObjD
VLM
194
1
0
28 Nov 2022
Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
Neural Information Processing Systems (NeurIPS), 2022
Eslam Mohamed Bakr
Yasmeen Alsaedy
Mohamed Elhoseiny
3DPC
246
62
0
25 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
281
28
0
15 Nov 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
257
36
0
17 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
Asian Conference on Computer Vision (ACCV), 2022
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
180
4
0
09 Oct 2022
Learning to embed semantic similarity for joint image-text retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Noam Malali
Y. Keller
252
12
0
07 Oct 2022