Learning Two-Branch Neural Networks for Image-Text Matching Tasks

11 April 2017
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
    VLM

Papers citing "Learning Two-Branch Neural Networks for Image-Text Matching Tasks"

50 / 189 papers shown
Annotating Satellite Images of Forests with Keywords from a Specialized Corpus in the Context of Change Detection
International Conference on Content-Based Multimedia Indexing (CBMI), 2023
Nathalie Neptune
Josiane Mothe
102
1
0
16 Sep 2025
Visual Grounding from Event Cameras
Lingdong Kong
Dongyue Lu
Ao Liang
Rong Li
Yuhao Dong
Tianshuai Hu
Lai Xing Ng
Wei Tsang Ooi
Benoit R. Cottereau
VGen
170
1
0
11 Sep 2025
Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding
Jiangnan Xie
Xiaolong Zheng
Liang Zheng
ObjD
206
0
0
08 Sep 2025
LLaVA-RE: Binary Image-Text Relevancy Evaluation with Multimodal Large Language Model
Tao Sun
Oliver Liu
JinJin Li
Lan Ma
124
0
0
07 Aug 2025
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Lingdong Kong
Dongyue Lu
Ao Liang
Rong Li
Yuhao Dong
Tianshuai Hu
Lai Xing Ng
Wei Tsang Ooi
Benoit R. Cottereau
VGen
384
5
0
23 Jul 2025
Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
Duc Cao-Dinh
Khai Le-Duc
Anh Dao
Bach Phan Tat
Chris Ngo
Duy M. H. Nguyen
Nguyen X. Khanh
Thanh Nguyen-Tang
287
0
0
01 Jul 2025
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
Junli Liu
Qizhi Chen
Zechuan Wang
Yiwen Tang
Yiting Zhang
Chi Yan
Dong Wang
Xiaochen Li
Jiangwei Zhong
CoGe
627
11
0
10 Apr 2025
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
Guanqi Zhan
Yuanpei Liu
Kai Han
Weidi Xie
Andrew Zisserman
VLM
1.2K
4
0
21 Feb 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
1.1K
43
0
28 Dec 2024
Linguistics-Vision Monotonic Consistent Network for Sign Language Production
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xu Wang
Shengeng Tang
Peipei Song
Shuo Wang
D. Guo
Richang Hong
SLR
378
10
0
22 Dec 2024
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding
International Conference on Pattern Recognition (ICPR), 2024
Yang Liu
Daizong Liu
Wei Hu
3DPC
438
9
0
21 Oct 2024
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding
ACM Multimedia (MM), 2024
Minghang Zheng
Jiahua Zhang
Qingchao Chen
Yuxin Peng
Yang Liu
ObjD
342
7
0
29 Aug 2024
Language-driven Grasp Detection with Mask-guided Attention
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Tuan V. Vo
M. Vu
Baoru Huang
An Vuong
Ngan Le
T. Vo
Anh Nguyen
235
6
0
29 Jul 2024
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi
Takashi Shibata
Makoto Terao
VLM
365
4
0
17 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
463
22
0
03 Jul 2024
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
355
4
0
05 Jun 2024
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
Jie Wang
Joemon M. Jose
262
3
0
05 Jun 2024
3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting
Xuri Ge
Songpei Xu
Fuhai Chen
Jie Wang
Guoxin Wang
Shan An
Joemon M. Jose
3DPC
324
24
0
26 Apr 2024
N-Modal Contrastive Losses with Applications to Social Media Data in Trimodal Space
William Theisen
Walter J. Scheirer
259
1
0
18 Mar 2024
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival
Yuanxin Zhao
Mi Zhang
Bingnan Yang
Zhan Zhang
Jiaju Kang
Jianya Gong
251
5
0
16 Mar 2024
REPAIR: Rank Correlation and Noisy Pair Half-replacing with Memory for Noisy Correspondence
IEEE Transactions on Multimedia (IEEE TMM), 2024
Ruochen Zheng
Jiahao Hong
Changxin Gao
Nong Sang
201
3
0
13 Mar 2024
How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding
Jiamin Luo
Jianing Zhao
Jingjing Wang
Guodong Zhou
257
0
0
29 Feb 2024
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Jiaxi Wang
Wenhui Hu
Xueyang Liu
Beihu Wu
Yuting Qiu
Yingying Cai
347
2
0
29 Dec 2023
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Wei Tang
Liang Li
Xuejing Liu
Lu Jin
Jinhui Tang
Zechao Li
305
45
0
19 Dec 2023
Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment
IEEE Transactions on Multimedia (IEEE TMM), 2023
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
632
5
0
15 Dec 2023
Negative Pre-aware for Noisy Cross-modal Matching
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xu-Yao Zhang
Hao Li
Mang Ye
391
17
0
10 Dec 2023
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models
Haicheng Liao
Huanming Shen
Zhenning Li
Chengyue Wang
Guofa Li
Yiming Bie
Chengzhong Xu
310
80
0
06 Dec 2023
Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Tianrui Hui
Zihan Ding
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
Jiao Dai
Jizhong Han
Si Liu
353
8
0
02 Nov 2023
RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments
Neural Information Processing Systems (NeurIPS), 2023
Mengxue Qu
Yu-Huan Wu
Wu Liu
Xiaodan Liang
Jingkuan Song
Yao-Min Zhao
Yunchao Wei
273
21
0
26 Oct 2023
NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Haowei Wang
Jiayi Ji
Tianyu Guo
Yilong Yang
Weihao Ye
Xiaoshuai Sun
Rongrong Ji
428
10
0
17 Oct 2023
Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
Menghao Li
Chunlei Wang
W. Feng
Shuchang Lyu
Guangliang Cheng
Xiangtai Li
Binghao Liu
Qi Zhao
301
7
0
23 Jul 2023
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training
IEEE Transactions on Image Processing (IEEE TIP), 2023
Chong Liu
Yuqi Zhang
Hongsong Wang
Weihua Chen
F. Wang
Yan Huang
Yixing Shen
Liang Wang
281
48
0
15 Jun 2023
"Let's not Quote out of Context": Unified Vision-Language Pretraining
  for Context Assisted Image Captioning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Abisek Rajakumar Kalarani
P. Bhattacharyya
Niyati Chhaya
Sumit Shekhar
CoGe, VLM
290
12
0
01 Jun 2023
Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving
Wenhao Cheng
Junbo Yin
Wei Li
Ruigang Yang
Jianbing Shen
3DPC
239
22
0
25 May 2023
Click-Feedback Retrieval
Zeyu Wang
Yuehua Wu
329
0
0
28 Apr 2023
BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
Computer Vision and Pattern Recognition (CVPR), 2023
Shuo Yang
Zhaopan Xu
Kai Wang
Yang You
Huanjin Yao
Tongliang Liu
Min Xu
314
58
0
22 Mar 2023
Scene Graph Based Fusion Network For Image-Text Retrieval
IEEE International Conference on Multimedia and Expo (ICME), 2023
Guoliang Wang
Yanlei Shang
Yongzhe Chen
209
4
0
20 Mar 2023
Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening
Min Cao
Yang Bai
Wenwen Qiang
Ziqiang Cao
Liqiang Nie
Min Zhang
247
4
0
14 Mar 2023
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
Ziyang Luo
Pu Zhao
Can Xu
Xiubo Geng
Tao Shen
Chongyang Tao
Jing Ma
Qingwen Lin
Daxin Jiang
VLM, CLIP
220
3
0
06 Feb 2023
Open-vocabulary Object Segmentation with Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Ziyi Li
Qinye Zhou
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
VLM
384
95
0
12 Jan 2023
Universal Multimodal Representation for Language Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Zhuosheng Zhang
Kehai Chen
Rui Wang
Masao Utiyama
Eiichiro Sumita
Z. Li
Hai Zhao
SSL
324
32
0
09 Jan 2023
Learning Multimodal Data Augmentation in Feature Space
International Conference on Learning Representations (ICLR), 2022
Zichang Liu
Zhiqiang Tang
Xingjian Shi
Aston Zhang
Mu Li
Anshumali Shrivastava
A. Wilson
299
30
0
29 Dec 2022
Multimodal Query-guided Object Localization
Aditay Tripathi
Rajath R Dani
Anand Mishra
Anirban Chakraborty
276
0
0
01 Dec 2022
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2022
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
450
43
0
28 Nov 2022
SLAN: Self-Locator Aided Network for Cross-Modal Understanding
Jiang-Tian Zhai
Tao Gui
Tong Wu
Xinghan Chen
Jiangjiang Liu
Bo Ren
Ming-Ming Cheng
ObjD, VLM
194
1
0
28 Nov 2022
Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding
Neural Information Processing Systems (NeurIPS), 2022
Eslam Mohamed Bakr
Yasmeen Alsaedy
Mohamed Elhoseiny
3DPC
246
62
0
25 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
281
28
0
15 Nov 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
257
36
0
17 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
Asian Conference on Computer Vision (ACCV), 2022
A. Fragomeni
Michael Wray
Dima Damen
CLIP, ViT
180
4
0
09 Oct 2022
Learning to embed semantic similarity for joint image-text retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Noam Malali
Y. Keller
252
12
0
07 Oct 2022