Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

Computer Vision and Pattern Recognition (CVPR), 2020
1 March 2020
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
    ObjD

Papers citing "Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension"

49 / 49 papers shown
Making Dialogue Grounding Data Rich: A Three-Tier Data Synthesis Framework for Generalized Referring Expression Comprehension
Juexi Shao
Siyou Li
Yujian Gan
Chris Madge
Vanja Karan
Massimo Poesio
210
0
0
02 Dec 2025
Generative Adversarial Gumbel MCTS for Abstract Visual Composition Generation
Zirui Zhao
Boye Niu
David Hsu
W. Lee
GAN
293
0
0
01 Dec 2025
Referring Expressions as a Lens into Spatial Language Grounding in Vision-Language Models
Akshar Tumu
Varad Shinde
Parisa Kordjamshidi
122
0
0
08 Nov 2025
CoT Referring: Improving Referring Expression Tasks with Grounded Reasoning
Qihua Dong
Luis Figueroa
Handong Zhao
Kushal Kafle
Jason Kuen
Zhihong Ding
Scott D. Cohen
Y. Fu
ObjD, LRM
275
3
0
03 Oct 2025
GeoRef: Referring Expressions in Geometry via Task Formulation, Synthetic Supervision, and Reinforced MLLM-based Solutions
Bing Liu
Wenqiang Yv
X. J. Yang
S. Wang
Junzhuo Liu
Peng Wang
G. Wang
Yang Yang
H. Shen
ObjD
227
0
0
25 Sep 2025
Audio-3DVG: Unified Audio -- Point Cloud Fusion for 3D Visual Grounding
Duc Cao-Dinh
Khai Le-Duc
Anh Dao
Bach Phan Tat
Chris Ngo
Duy M. H. Nguyen
Nguyen X. Khanh
Thanh Nguyen-Tang
287
0
0
01 Jul 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Cheng Chi
Minglan Lin
Yaoxu Lyu
...
Yulong Ao
Yonghua Lin
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
LM&Ro
529
18
0
06 May 2025
KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
Xinyu Ma
Ziyang Ding
Zhicong Luo
Chong Chen
Zonghao Guo
Yang Li
Xiaoyi Feng
Maosong Sun
VLM, LRM
391
20
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
464
9
0
17 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
977
15
0
11 Mar 2025
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
X. J. Yang
Jing Liu
Peng Wang
Guoqing Wang
Yue Yang
Mengqi Li
ObjD
537
8
0
27 Feb 2025
Acknowledging Focus Ambiguity in Visual Questions
Chongyan Chen
Yu-Yun Tseng
Zhuoheng Li
Anush Venkatesh
Danna Gurari
379
0
0
04 Jan 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
1.1K
43
0
28 Dec 2024
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Junzhuo Liu
Xiaohu Yang
Weiwei Li
Peng Wang
ObjD
478
17
0
23 Sep 2024
Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models
Jierun Chen
Fangyun Wei
Jinjing Zhao
Sizhe Song
Bohuai Wu
Zhuoxuan Peng
S.-H. Gary Chan
Hongyang R. Zhang
308
43
0
24 Jun 2024
Bootstrapping Referring Multi-Object Tracking
Yani Zhang
Dongming Wu
Wencheng Han
Xingping Dong
424
22
0
07 Jun 2024
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar
Runwei Guan
Liye Jia
Fengyufan Yang
Shanliang Yao
Erick Purwanto
...
Eng Gee Lim
Jeremy S. Smith
Ka Lok Man
Xuming Hu
Yutao Yue
471
22
0
19 Mar 2024
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen
Tiancheng Zhao
Mingwei Zhu
Yuxiang Cai
VLM, ObjD
505
31
0
22 Dec 2023
Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter
Conference on Robot Learning (CoRL), 2023
Georgios Tziafas
Yucheng Xu
Arushi Goel
Mohammadreza Kasaei
Zhibin Li
Hamidreza Kasaei
303
45
0
09 Nov 2023
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
Zhenfang Chen
Rui Sun
Wenjun Liu
Yining Hong
Chuang Gan
LRM
374
24
0
08 Nov 2023
Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
U. Sahin
Hang Li
Qadeer Ahmad Khan
Zorah Lähner
Volker Tresp
VLM, CoGe
243
29
0
07 Nov 2023
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
IEEE International Conference on Computer Vision (ICCV), 2023
Chengyang Zhao
Songlin Yang
Zhenfang Chen
Mingyu Ding
Chuang Gan
473
24
0
10 Oct 2023
InstructDET: Diversifying Referring Object Detection with Generalized Instructions
International Conference on Learning Representations (ICLR), 2023
Ronghao Dang
Jiangyan Feng
Haodong Zhang
Chongjian Ge
Lin Song
...
Chengju Liu
Qi Chen
Feng Zhu
Rui Zhao
Yibing Song
ObjD
529
16
0
08 Oct 2023
Dense Object Grounding in 3D Scenes
ACM Multimedia (ACM MM), 2023
Wencan Huang
Daizong Liu
Wei Hu
287
26
0
05 Sep 2023
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Yuhao Lu
Yixuan Fan
Beixing Deng
Fan Liu
Yali Li
Shengjin Wang
302
64
0
01 Aug 2023
Described Object Detection: Liberating Object Detection with Flexible Expressions
Neural Information Processing Systems (NeurIPS), 2023
Chi Xie
Zhao Zhang
YiXuan Wu
Feng Zhu
Rui Zhao
Shuang Liang
ObjD
353
56
0
24 Jul 2023
Advancing Visual Grounding with Scene Knowledge: Benchmark and Method
Computer Vision and Pattern Recognition (CVPR), 2023
Zhihong Chen
Ruifei Zhang
Yibing Song
Xiang Wan
Guanbin Li
258
32
0
21 Jul 2023
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models
Avinash Madasu
Vasudev Lal
CoGe
355
5
0
28 Jun 2023
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Neural Information Processing Systems (NeurIPS), 2023
Zirui Zhao
W. Lee
David Hsu
LRM, LLMAG, LM&Ro
466
357
0
23 May 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Conference on Robot Learning (CoRL), 2023
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
238
29
0
07 Apr 2023
3D Concept Learning and Reasoning from Multi-View Images
Computer Vision and Pattern Recognition (CVPR), 2023
Yining Hong
Chun-Tse Lin
Yilun Du
Zhenfang Chen
J. Tenenbaum
Chuang Gan
3DV
410
83
0
20 Mar 2023
PACO: Parts and Attributes of Common Objects
Computer Vision and Pattern Recognition (CVPR), 2023
Vignesh Ramanathan
Anmol Kalia
Vladan Petrovic
Yiqian Wen
Baixue Zheng
...
Abhishek Kadian
Amir Mousavi
Yi-Zhe Song
Abhimanyu Dubey
D. Mahajan
VLM
292
158
0
04 Jan 2023
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Computer Vision and Pattern Recognition (CVPR), 2022
Zixian Ma
Jerry Hong
Mustafa Omer Gul
Mona Gandhi
Irena Gao
Ranjay Krishna
CoGe
449
200
0
13 Dec 2022
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Neural Information Processing Systems (NeurIPS), 2022
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
339
150
0
17 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
281
28
0
15 Nov 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
289
215
0
23 Oct 2022
Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mitja Nikolaus
Emmanuelle Salin
Stéphane Ayache
Abdellah Fourtassi
Benoit Favre
179
17
0
21 Oct 2022
RefCrowd: Grounding the Target in Crowd with Referring Expressions
ACM Multimedia (ACM MM), 2022
Heqian Qiu
Hongliang Li
Taijin Zhao
Lanxiao Wang
Qingbo Wu
Fanman Meng
ObjD
318
10
0
16 Jun 2022
Referring Image Matting
Computer Vision and Pattern Recognition (CVPR), 2022
Jizhizi Li
Jing Zhang
Dacheng Tao
ObjD, VLM
260
35
0
10 Jun 2022
Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
Computer Vision and Pattern Recognition (CVPR), 2022
Yining Hong
Kaichun Mo
L. Yi
Leonidas Guibas
Antonio Torralba
J. Tenenbaum
Chuang Gan
251
5
0
05 May 2022
FindIt: Generalized Localization with Natural Language Queries
European Conference on Computer Vision (ECCV), 2022
Weicheng Kuo
Fred Bertsch
Wei Li
A. Piergiovanni
M. Saffar
A. Angelova
ObjD
261
18
0
31 Mar 2022
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension
Fuhai Chen
Xuri Ge
Xiaoshuai Sun
Yue Gao
Jianzhuang Liu
Feiyue Huang
Rongrong Ji
217
0
0
12 Mar 2022
COVR: A test-bed for Visually Grounded Compositional Generalization with real images
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Ben Bogin
Shivanshu Gupta
Matt Gardner
Jonathan Berant
CoGe
198
31
0
22 Sep 2021
YouRefIt: Embodied Reference Understanding with Language and Gesture
IEEE International Conference on Computer Vision (ICCV), 2021
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
LM&Ro
336
55
0
08 Sep 2021
A Better Loss for Visual-Textual Grounding
ACM Symposium on Applied Computing (SAC), 2021
Davide Rigoni
Luciano Serafini
A. Sperduti
ObjD
318
3
0
11 Aug 2021
Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision
Agathe Balayn
B. Kulynych
S. Guerses
252
4
0
05 Jul 2021
Understanding Synonymous Referring Expressions via Contrastive Features
International Journal of Computer Vision (IJCV), 2021
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
ObjD
241
5
0
20 Apr 2021
OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Ke-Jyun Wang
Yun-Hsuan Liu
Hung-Ting Su
Jen-Wei Wang
Yu-Siang Wang
Winston H. Hsu
Wen-Chin Chen
249
28
0
13 Mar 2021
Referring Expression Comprehension: A Survey of Methods and Datasets
IEEE Transactions on Multimedia (TMM), 2020
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
478
128
0
19 Jul 2020