2310.10221
RoboLLM: Robotic Vision Tasks Grounded on Multimodal Large Language Models
16 October 2023
Zijun Long, George Killick, R. McCreadie, Gerardo Aragon Camarasa
VLM
Papers citing "RoboLLM: Robotic Vision Tasks Grounded on Multimodal Large Language Models"
7 / 7 papers shown
From Decision to Action in Surgical Autonomy: Multi-Modal Large Language Models for Robot-Assisted Blood Suction
Sadra Zargarzadeh, Maryam Mirzaei, Yafei Ou, Mahdi Tavakoli
21 | 1 | 0
14 Aug 2024

Robot Instance Segmentation with Few Annotations for Grasping
Moshe Kimhi, David Vainshtein, Chaim Baskin, Dotan Di Castro
45 | 2 | 0
01 Jul 2024

CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion
Zijun Long, George Killick, Lipeng Zhuang, Gerardo Aragon Camarasa, Zaiqiao Meng, R. McCreadie
VLM
29 | 2 | 0
22 Feb 2024

CrisisViT: A Robust Vision Transformer for Crisis Image Classification
Zijun Long, R. McCreadie, Muhammad Imran
48 | 9 | 0
05 Jan 2024

Elucidating and Overcoming the Challenges of Label Noise in Supervised Contrastive Learning
Zijun Long, George Killick, Lipeng Zhuang, R. McCreadie, Gerardo Aragon Camarasa, Paul Henderson
18 | 5 | 0
25 Nov 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi
VLM, MLLM
244 | 4,186 | 0
30 Jan 2023

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
ViT
263 | 3,538 | 0
24 Feb 2021