HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
arXiv:2409.10419

16 September 2024
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami

Papers citing "HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models"

23 / 73 papers shown
End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB
Stefan Ainetter
Friedrich Fraundorfer
298
151
0
12 Jul 2021
Cross-Modal Progressive Comprehension for Referring Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Si Liu
Tianrui Hui
Shaofei Huang
Yunchao Wei
Yue Liu
Guanbin Li
EgoV, VOS
252
164
0
15 May 2021
Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation
Computer Vision and Pattern Recognition (CVPR), 2021
Guang Feng
Zhiwei Hu
Lihe Zhang
Huchuan Lu
EgoV
236
200
0
05 May 2021
TransVG: End-to-End Visual Grounding with Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
646
442
0
17 Apr 2021
A Joint Network for Grasp Detection Conditioned on Natural Language Commands
IEEE International Conference on Robotics and Automation (ICRA), 2021
Yiye Chen
Ruinian Xu
Yunzhi Lin
Patricio A. Vela
204
54
0
01 Apr 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Computer Vision and Pattern Recognition (CVPR), 2021
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
287
47
0
14 Mar 2021
RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images
IEEE International Conference on Robotics and Automation (ICRA), 2021
Minghao Gou
Haoshu Fang
Zhanda Zhu
Shengwei Xu
Chenxi Wang
Cewu Lu
191
127
0
03 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP, VLM
2.0K
42,087
0
26 Feb 2021
A Recurrent Vision-and-Language BERT for Navigation
Computer Vision and Pattern Recognition (CVPR), 2020
Yicong Hong
Qi Wu
Yuankai Qi
Cristian Rodriguez-Opazo
Stephen Gould
LM&Ro
326
385
0
26 Nov 2020
ACRONYM: A Large-Scale Grasp Dataset Based on Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2020
Clemens Eppner
Arsalan Mousavian
Dieter Fox
309
245
0
18 Nov 2020
Real-Time Deep Learning Approach to Visual Servo Control and Grasp Detection for Autonomous Robotic Manipulation
E. G. Ribeiro
R. Q. Mendes
V. Grassi
216
75
0
13 Oct 2020
GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision
European Conference on Computer Vision (ECCV), 2020
Lei Ke
Shichao Li
Yanan Sun
Yu-Wing Tai
Chi-Keung Tang
3DPC
151
54
0
26 Jul 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
European Conference on Computer Vision (ECCV), 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
436
507
0
18 Dec 2019
Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations
IEEE Robotics and Automation Letters (RA-L), 2019
Shuran Song
Andy Zeng
Johnny Lee
Thomas Funkhouser
341
258
0
09 Dec 2019
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
160
154
0
11 Jun 2018
Jacquard: A Large Scale Dataset for Robotic Grasp Detection
Amaury Depierre
Emmanuel Dellandrea
Liming Chen
334
370
0
30 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
527
913
0
24 Jan 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
786
2,953
0
22 Sep 2017
Modulating early visual processing by language
H. D. Vries
Florian Strub
Jérémie Mary
Hugo Larochelle
Olivier Pietquin
Aaron Courville
550
518
0
02 Jul 2017
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
305
231
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
576
1,529
0
31 Jul 2016
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
347
570
0
13 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Anjali Narayan-Chen
Svetlana Lazebnik
619
2,388
0
19 May 2015