2409.10419
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
16 September 2024
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
Papers citing
"HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models"
23 / 73 papers shown
End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB
Stefan Ainetter
Friedrich Fraundorfer
298
151
0
12 Jul 2021
Cross-Modal Progressive Comprehension for Referring Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Si Liu
Tianrui Hui
Shaofei Huang
Yunchao Wei
Yue Liu
Guanbin Li
EgoV
VOS
252
164
0
15 May 2021
Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation
Computer Vision and Pattern Recognition (CVPR), 2021
Guang Feng
Zhiwei Hu
Lihe Zhang
Huchuan Lu
EgoV
236
200
0
05 May 2021
TransVG: End-to-End Visual Grounding with Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
646
442
0
17 Apr 2021
A Joint Network for Grasp Detection Conditioned on Natural Language Commands
IEEE International Conference on Robotics and Automation (ICRA), 2021
Yiye Chen
Ruinian Xu
Yunzhi Lin
Patricio A. Vela
204
54
0
01 Apr 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Computer Vision and Pattern Recognition (CVPR), 2021
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
287
47
0
14 Mar 2021
RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images
IEEE International Conference on Robotics and Automation (ICRA), 2021
Minghao Gou
Haoshu Fang
Zhanda Zhu
Shengwei Xu
Chenxi Wang
Cewu Lu
191
127
0
03 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
2.0K
42,087
0
26 Feb 2021
A Recurrent Vision-and-Language BERT for Navigation
Computer Vision and Pattern Recognition (CVPR), 2020
Yicong Hong
Qi Wu
Yuankai Qi
Cristian Rodriguez-Opazo
Stephen Gould
LM&Ro
326
385
0
26 Nov 2020
ACRONYM: A Large-Scale Grasp Dataset Based on Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2020
Clemens Eppner
Arsalan Mousavian
Dieter Fox
309
245
0
18 Nov 2020
Real-Time Deep Learning Approach to Visual Servo Control and Grasp Detection for Autonomous Robotic Manipulation
E. G. Ribeiro
R. Q. Mendes
V. Grassi
216
75
0
13 Oct 2020
GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision
European Conference on Computer Vision (ECCV), 2020
Lei Ke
Shichao Li
Yanan Sun
Yu-Wing Tai
Chi-Keung Tang
3DPC
151
54
0
26 Jul 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
European Conference on Computer Vision (ECCV), 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
436
507
0
18 Dec 2019
Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations
IEEE Robotics and Automation Letters (RA-L), 2019
Shuran Song
Andy Zeng
Johnny Lee
Thomas Funkhouser
341
258
0
09 Dec 2019
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
160
154
0
11 Jun 2018
Jacquard: A Large Scale Dataset for Robotic Grasp Detection
Amaury Depierre
Emmanuel Dellandrea
Liming Chen
334
370
0
30 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
527
913
0
24 Jan 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
786
2,953
0
22 Sep 2017
Modulating early visual processing by language
H. D. Vries
Florian Strub
Jérémie Mary
Hugo Larochelle
Olivier Pietquin
Aaron Courville
550
518
0
02 Jul 2017
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
305
231
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
576
1,529
0
31 Jul 2016
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
347
570
0
13 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Anjali Narayan-Chen
Svetlana Lazebnik
619
2,388
0
19 May 2015