1711.08389
Cited By
Conditional Image-Text Embedding Networks
22 November 2017
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
Papers citing
"Conditional Image-Text Embedding Networks"
26 / 26 papers shown
Title
Language-Guided Diffusion Model for Visual Grounding
Sijia Chen
Baochun Li
27
5
0
18 Aug 2023
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
37
24
0
28 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
21
21
0
15 Nov 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
17
23
0
28 Sep 2022
Ground then Navigate: Language-guided Navigation in Dynamic Scenes
Kanishk Jain
Varun Chhangani
Amogh Tiwari
K. M. Krishna
Vineet Gandhi
LM&Ro
18
27
0
24 Sep 2022
PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding
Zihan Ding
Zixiang Ding
Tianrui Hui
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
Si Liu
12
12
0
11 Aug 2022
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
34
113
0
30 Apr 2022
Adapting CLIP For Phrase Localization Without Further Training
Jiahao Li
G. Shakhnarovich
Raymond A. Yeh
VLM
CLIP
28
25
0
07 Apr 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
189
499
0
22 Feb 2022
Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjD
18
10
0
18 Jan 2022
Anchoring to Exemplars for Training Mixture-of-Expert Cell Embeddings
Siqi Wang
Manyuan Lu
Nikita Moshkov
Juan C. Caicedo
Bryan A. Plummer
13
4
0
06 Dec 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
10
187
0
06 Jun 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
57
858
0
26 Apr 2021
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
21
329
0
17 Apr 2021
Few-Shot Visual Grounding for Natural Human-Robot Interaction
Georgios Tziafas
S. Kasaei
19
6
0
17 Mar 2021
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe-nan Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
13
111
0
03 Aug 2020
MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval
Mark Hamilton
Stephanie Fu
Mindren Lu
Johnny Bui
Darius Bopp
...
Margaret Wang
Marina Rogers
Lei Zhang
Chris Hoder
William T. Freeman
19
3
0
14 Jul 2020
Grounding-Tracking-Integration
Zhengyuan Yang
T. Kumar
Tianlang Chen
Jinsong Su
Jiebo Luo
22
53
0
13 Dec 2019
A Real-time Global Inference Network for One-stage Referring Expression Comprehension
Yiyi Zhou
Rongrong Ji
Gen Luo
Xiaoshuai Sun
Jinsong Su
Xinghao Ding
Chia-Wen Lin
Q. Tian
ObjD
24
60
0
07 Dec 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
Reuben Tan
Huijuan Xu
Kate Saenko
Bryan A. Plummer
25
67
0
27 Sep 2019
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Zihao W. Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
VLM
23
299
0
12 Sep 2019
Phrase Localization Without Paired Training Examples
Josiah Wang
Lucia Specia
24
41
0
20 Aug 2019
Zero-Shot Grounding of Objects from Natural Language Queries
Arka Sadhu
Kan Chen
Ram Nevatia
ObjD
28
156
0
20 Aug 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
12
360
0
18 Aug 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao W. Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
19
180
0
03 Mar 2019
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
152
1,465
0
06 Jun 2016