ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.00272
  4. Cited By
Improving Visual Grounding with Visual-Linguistic Verification and
  Iterative Reasoning

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

30 April 2022
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
    ObjD
ArXivPDFHTML

Papers citing "Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning"

50 / 58 papers shown
Title
Efficient Adaptation For Remote Sensing Visual Grounding
Efficient Adaptation For Remote Sensing Visual Grounding
Hasan Moughnieh
Mohamad Chalhoub
Hasan Nasrallah
Cristiano Nattero
Paolo Campanella
Giovanni Nico
A. Ghandour
51
0
0
29 Mar 2025
Visual Position Prompt for MLLM based Visual Grounding
Visual Position Prompt for MLLM based Visual Grounding
Wei Tang
Yanpeng Sun
Qinying Gu
Zechao Li
VLM
50
0
0
19 Mar 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Liangtao Shi
Ting Liu
Xiantao Hu
Yue Hu
Quanjun Yin
Richang Hong
ObjD
51
0
0
24 Feb 2025
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Ting Liu
Zunnan Xu
Yue Hu
Liangtao Shi
Zhiqiang Wang
Quanjun Yin
65
2
0
03 Jan 2025
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
Yaxian Wang
Henghui Ding
Shuting He
Xudong Jiang
Bifan Wei
Jun Liu
ObjD
47
1
0
03 Jan 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
152
0
0
01 Dec 2024
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive
  Position Correction for Visual Grounding
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding
Minghong Xie
Hao Wu
Huafeng Li
Yafei Zhang
Dapeng Tao
Z. Yu
ObjD
40
1
0
31 Oct 2024
Context-Infused Visual Grounding for Art
Context-Infused Visual Grounding for Art
Selina Khan
Nanne van Noord
ObjD
35
1
0
16 Oct 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled
  Multi-modal Fusion
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Ming Dai
Lingfeng Yang
Yihao Xu
Zhenhua Feng
Wankou Yang
ObjD
27
9
0
26 Sep 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
46
3
0
16 Sep 2024
NeIn: Telling What You Don't Want
NeIn: Telling What You Don't Want
Nhat-Tan Bui
Dinh-Hieu Hoang
Quoc-Huy Trinh
Minh-Triet Tran
Truong Nguyen
Susan Gauch
43
2
0
09 Sep 2024
Make Graph-based Referring Expression Comprehension Great Again through
  Expression-guided Dynamic Gating and Regression
Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression
Jingcheng Ke
Dele Wang
Jun-Cheng Chen
I-Hong Jhuo
Chia-Wen Lin
Yen-Yu Lin
33
0
0
05 Sep 2024
ResVG: Enhancing Relation and Semantic Understanding in Multiple
  Instances for Visual Grounding
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding
Minghang Zheng
Jiahua Zhang
Qingchao Chen
Yuxin Peng
Yang Liu
ObjD
29
2
0
29 Aug 2024
Disentangle and denoise: Tackling context misalignment for video moment
  retrieval
Disentangle and denoise: Tackling context misalignment for video moment retrieval
Kaijing Ma
Han Fang
Xianghao Zang
Chao Ban
Lanxiang Zhou
Zhongjiang He
Yongxiang Li
Hao Sun
Zerun Feng
Xingsong Hou
60
1
0
14 Aug 2024
Towards Zero-Shot Annotation of the Built Environment with
  Vision-Language Models (Vision Paper)
Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper)
Bin Han
Yiwei Yang
A. Caspi
Bill Howe
VLM
24
1
0
01 Aug 2024
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
ACTRESS: Active Retraining for Semi-supervised Visual Grounding
Weitai Kang
Mengxue Qu
Yunchao Wei
Yan Yan
38
6
0
03 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
45
4
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual
  Grounding
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
33
9
0
03 Jul 2024
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun
Gregory M. Goldgof
Alexander Schubert
Zhiqing Sun
Thomas Hartvigsen
A. Butte
Ahmed Alaa
LM&MA
42
4
0
29 May 2024
OmniBind: Teach to Build Unequal-Scale Modality Interaction for
  Omni-Bind of All
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All
Yuanhuiyi Lyu
Xueye Zheng
Dahun Kim
Lin Wang
51
13
0
25 May 2024
MLS-Track: Multilevel Semantic Interaction in RMOT
MLS-Track: Multilevel Semantic Interaction in RMOT
Zeliang Ma
Yang Song
Zhe Cui
Zhicheng Zhao
Fei Su
Delong Liu
Jingyu Wang
36
3
0
18 Apr 2024
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large
  Vision Language Models
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
Tongtian Yue
Jie Cheng
Longteng Guo
Xingyuan Dai
Zijia Zhao
Xingjian He
Gang Xiong
Yisheng Lv
Jing Liu
43
9
0
20 Mar 2024
Adversarial Testing for Visual Grounding via Image-Aware Property
  Reduction
Adversarial Testing for Visual Grounding via Image-Aware Property Reduction
Zhiyuan Chang
Mingyang Li
Junjie Wang
Cheng Li
Boyu Wu
Fanjiang Xu
Qing Wang
AAML
36
0
0
02 Mar 2024
How to Understand "Support"? An Implicit-enhanced Causal Inference
  Approach for Weakly-supervised Phrase Grounding
How to Understand "Support"? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding
Jiamin Luo
Jianing Zhao
Jingjing Wang
Guodong Zhou
46
0
0
29 Feb 2024
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor
  Image Comprehension in Remote Sensing Domain
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain
Wei Zhang
Miaoxin Cai
Tong Zhang
Zhuang Yin
Xuerui Mao
24
88
0
30 Jan 2024
Unifying Visual and Vision-Language Tracking via Contrastive Learning
Unifying Visual and Vision-Language Tracking via Contrastive Learning
Yinchao Ma
Yuyang Tang
Wenfei Yang
Tianzhu Zhang
Jinpeng Zhang
Mengxue Kang
ObjD
18
12
0
20 Jan 2024
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal
  Distillation
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
Jiaxi Wang
Wenhui Hu
Xueyang Liu
Beihu Wu
Yuting Qiu
Yingying Cai
15
0
0
29 Dec 2023
Cycle-Consistency Learning for Captioning and Grounding
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
42
7
0
23 Dec 2023
Context Disentangling and Prototype Inheriting for Robust Visual
  Grounding
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Wei Tang
Liang Li
Xuejing Liu
Lu Jin
Jinhui Tang
Zechao Li
38
24
0
19 Dec 2023
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
56
4
0
15 Dec 2023
Mono3DVG: 3D Visual Grounding in Monocular Images
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan
Yuan. Yuan
Zhitong Xiong
MDE
36
9
0
13 Dec 2023
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging
  Cross-Modal Attention with Large Language Models
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models
Haicheng Liao
Huanming Shen
Zhenning Li
Chengyue Wang
Guofa Li
Yiming Bie
Chengzhong Xu
36
50
0
06 Dec 2023
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive
  Survey and Evaluation
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation
Yinjie Lei
Zixuan Wang
Feng Chen
Guoqing Wang
Peng Wang
Yang Yang
34
8
0
24 Oct 2023
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Chunlei Wang
Wenquan Feng
Xiangtai Li
Guangliang Cheng
Shuchang Lyu
Binghao Liu
Lijiang Chen
Qi Zhao
ObjD
VLM
26
9
0
22 Oct 2023
Shatter and Gather: Learning Referring Image Segmentation with Text
  Supervision
Shatter and Gather: Learning Referring Image Segmentation with Text Supervision
Dongwon Kim
Nam-Won Kim
Cuiling Lan
Suha Kwak
VLM
42
19
0
29 Aug 2023
Language-Guided Diffusion Model for Visual Grounding
Language-Guided Diffusion Model for Visual Grounding
Sijia Chen
Baochun Li
37
5
0
18 Aug 2023
Grounded Image Text Matching with Mismatched Relation Reasoning
Grounded Image Text Matching with Mismatched Relation Reasoning
Yu Wu
Yan-Tao Wei
Haozhe Jasper Wang
Yongfei Liu
Sibei Yang
Xuming He
31
6
0
02 Aug 2023
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects
  in Cluttered Indoor Scenes
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes
Yuhao Lu
Yixuan Fan
Beixing Deng
F. Liu
Yali Li
Shengjin Wang
38
28
0
01 Aug 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
Zehan Wang
Haifeng Huang
Yang Zhao
Lin Li
Xize Cheng
Yichen Zhu
Aoxiong Yin
Zhou Zhao
3DPC
32
20
0
25 Jul 2023
Iterative Robust Visual Grounding with Masked Reference based
  Centerpoint Supervision
Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
Menghao Li
Chunlei Wang
W. Feng
Shuchang Lyu
Guangliang Cheng
Xiangtai Li
Binghao Liu
Qi Zhao
28
5
0
23 Jul 2023
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
23
3
0
13 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and
  Outlook of Recent Work
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
23
0
0
02 Jun 2023
Exploring Affordance and Situated Meaning in Image Captions: A
  Multimodal Analysis
Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis
Pin-Er Chen
Po-Ya Angela Wang
Hsin-Yu Chou
Yu-Hsiang Tseng
S. Hsieh
23
1
0
24 May 2023
Vision-Language Models in Remote Sensing: Current Progress and Future
  Trends
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
Xiang Li
Congcong Wen
Yuan Hu
Zhenghang Yuan
Xiao Xiang Zhu
VLM
21
71
0
09 May 2023
ScanERU: Interactive 3D Visual Grounding based on Embodied Reference
  Understanding
ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding
Ziyang Lu
Yunqiang Pei
Guoqing Wang
Yang Yang
Zheng Wang
Heng Tao Shen
46
7
0
23 Mar 2023
Joint Visual Grounding and Tracking with Natural Language Specification
Joint Visual Grounding and Tracking with Natural Language Specification
Li Zhou
Zikun Zhou
Kaige Mao
Zhenyu He
30
59
0
21 Mar 2023
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and
  Grounding
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
50
25
0
28 Nov 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing
  Data
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
Yangfan Zhan
Zhitong Xiong
Yuan. Yuan
74
106
0
23 Oct 2022
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual
  Grounding
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
53
63
0
29 Sep 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual
  Grounding
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
27
23
0
28 Sep 2022
12
Next