2304.02643
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing "Segment Anything"
50 / 4,194 papers shown
Title
DynamicGSG: Dynamic 3D Gaussian Scene Graphs for Environment Adaptation
Luzhou Ge
Xiangyu Zhu
Zhuo Yang
Xuesong Li
3DGS
72
0
0
21 Feb 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
101
1
0
20 Feb 2025
Contrastive Localized Language-Image Pre-Training
Hong-You Chen
Zhengfeng Lai
Han Zhang
Xuben Wang
Marcin Eichner
Keen You
Meng Cao
Bowen Zhang
Yue Yang
Zhe Gan
CLIP
VLM
68
7
0
20 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
221
2
0
20 Feb 2025
What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
Zhihang Liu
Chen-Wei Xie
Bin Wen
Feiwu Yu
Jixuan Chen
...
Pandeng Li
Yun Zheng
Hongtao Xie
VLM
CoGe
102
0
0
19 Feb 2025
Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
Xinyu Tian
Shu Zou
Zhaoyuan Yang
Mengqi He
Jing Zhang
VLM
50
0
0
19 Feb 2025
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image
Kaixin Yao
Longwen Zhang
Xinhao Yan
Yan Zeng
Qixuan Zhang
Wei Yang
Lan Xu
Jiayuan Gu
Jingyi Yu
34
3
0
18 Feb 2025
Understanding and Evaluating Hallucinations in 3D Visual Language Models
Ruiying Peng
Kaiyuan Li
Weichen Zhang
Chen Gao
Xinlei Chen
Yong Li
57
0
0
18 Feb 2025
L4P: Low-Level 4D Vision Perception Unified
Abhishek Badki
Hang Su
Bowen Wen
Orazio Gallo
VLM
92
1
0
18 Feb 2025
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
122
9
0
18 Feb 2025
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi
Wenyao Zhang
Yufei Ding
Runpei Dong
Xinqiang Yu
...
Xin Jin
Kaisheng Ma
Zhizheng Zhang
He Wang
Li Yi
LM&Ro
133
4
0
18 Feb 2025
UPCMR: A Universal Prompt-guided Model for Random Sampling Cardiac MRI Reconstruction
Donghang Lyu
Chinmay Rao
Marius Staring
M. Osch
M. Doneva
Hildo J. Lamb
Nicola Pezzotti
46
1
0
18 Feb 2025
Pre-training Auto-regressive Robotic Models with 4D Representations
Dantong Niu
Yuvan Sharma
Haoru Xue
Giscard Biamby
Junyi Zhang
Ziteng Ji
Trevor Darrell
Roei Herzig
80
1
0
18 Feb 2025
SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection
Yun Peng
Xiao Lin
Nachuan Ma
Jiayuan Du
Chuangwei Liu
Chengju Liu
Qi Chen
46
3
0
17 Feb 2025
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin
Ante Wang
Moye Chen
Jingyao Liu
Hao Liu
Jinsong Su
Xinyan Xiao
LRM
50
2
0
17 Feb 2025
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Khanh Nguyen
Ghulam Mubashar Hassan
Ajmal Mian
3DPC
54
0
0
15 Feb 2025
HIPPo: Harnessing Image-to-3D Priors for Model-free Zero-shot 6D Pose Estimation
Yibo Liu
Zhaodong Jiang
Binbin Xu
Guile Wu
Y. Ren
Tongtong Cao
Bingbing Liu
Rui Heng Yang
Amir Rasouli
J. Shan
54
1
0
14 Feb 2025
MonoForce: Learnable Image-conditioned Physics Engine
R. Agishev
Karel Zimmermann
95
0
0
14 Feb 2025
E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization
T. Pham
Zhang Kang
Ji Woo Hong
Xuran Zheng
Chang D. Yoo
87
0
0
13 Feb 2025
Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection
Yi Yu
Xue Yang
Yansheng Li
Zhenjun Han
Feipeng Da
Junchi Yan
73
0
0
13 Feb 2025
Bilevel Learning for Bilevel Planning
Bowen Li
Tom Silver
Sebastian A. Scherer
Alexander G. Gray
84
2
0
12 Feb 2025
Color Universal Design Neural Network for the Color Vision Deficiencies
Sunyong Seo
Jinho Park
68
0
0
12 Feb 2025
MatSwap: Light-aware material transfers in images
Ivan Lopes
Valentin Deschaintre
Yannick Hold-Geoffroy
Raoul de Charette
DiffM
90
0
0
11 Feb 2025
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
Li Hu
Guangyuan Wang
Zhen Shen
Xin Gao
Dechao Meng
Lian Zhuo
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM
VGen
105
9
0
10 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Zheng Yang
Mike Zheng Shou
MoE
82
0
0
10 Feb 2025
Fully Exploiting Vision Foundation Model's Profound Prior Knowledge for Generalizable RGB-Depth Driving Scene Parsing
Sicen Guo
Tianyou Wen
Chuang-Wei Liu
Qijun Chen
Rui Fan
62
0
0
10 Feb 2025
FunduSAM: A Specialized Deep Learning Model for Enhanced Optic Disc and Cup Segmentation in Fundus Images
Jinchen Yu
Yongwei Nie
Fei Qi
Wenxiong Liao
Hongmin Cai
MedIm
59
0
0
10 Feb 2025
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
232
1
0
08 Feb 2025
LeAP: Consistent multi-domain 3D labeling using Foundation Models
Simon Gebraad
Andras Palffy
Holger Caesar
191
1
0
06 Feb 2025
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models?
Mennatullah Siam
VLM
89
1
0
06 Feb 2025
No Free Lunch in Annotation either: An objective evaluation of foundation models for streamlining annotation in animal tracking
Emil Mededovic
Valdy Laurentius
Yuli Wu
Marcin Kopaczka
Zhu Chen
Mareike Schulz
René Tolba
Johannes Stegmaier
102
1
0
06 Feb 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
Feng Wang
Yaodong Yu
Guoyizhe Wei
Wei Shao
Yuyin Zhou
Alan Yuille
Cihang Xie
ViT
101
4
0
06 Feb 2025
Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
Yi Yu
Botao Ren
Peiyuan Zhang
Mingxin Liu
Junwei Luo
Shaofeng Zhang
Feipeng Da
Junchi Yan
Xue Yang
3DPC
131
1
0
06 Feb 2025
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Xianghui Ze
Zhenbo Song
Qiwei Wang
Jianfeng Lu
Yujiao Shi
65
0
0
05 Feb 2025
Disentangling CLIP for Multi-Object Perception
Samyak Rawlekar
Yujun Cai
Yiwei Wang
Ming-Hsuan Yang
Narendra Ahuja
VLM
CoGe
80
0
0
05 Feb 2025
ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models
Ying Zhang
Maoliang Yin
Wenfu Bi
Haibao Yan
Shaohan Bian
Cui-Hua Zhang
C. Hua
83
2
0
05 Feb 2025
Tell2Reg: Establishing spatial correspondence between images by the same language prompts
Wen Yan
Qianye Yang
Shiqi Huang
Yipei Wang
S. Punwani
M. Emberton
V. Stavrinides
Yipeng Hu
D. Barratt
92
0
0
05 Feb 2025
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling
Xiaowen Qiu
Jincheng Yang
Yian Wang
Zhehuan Chen
Yufei Wang
Tsun-Hsuan Wang
Zhou Xian
Chuang Gan
105
5
0
04 Feb 2025
LAST SToP For Modeling Asynchronous Time Series
Shubham Gupta
Thibaut Durand
Graham Taylor
Lilian W. Białokozowicz
AI4TS
44
0
0
04 Feb 2025
Exploring Few-Shot Defect Segmentation in General Industrial Scenarios with Metric Learning and Vision Foundation Models
Tongkun Liu
Bing Li
Xiao Jin
Yupeng Shi
Qiuying Li
Xiang Wei
68
0
0
03 Feb 2025
Scalable, Training-Free Visual Language Robotics: A Modular Multi-Model Framework for Consumer-Grade GPUs
Marie Samson
Bastien Muraccioli
Fumio Kanehiro
105
1
0
03 Feb 2025
AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis
B. Alawode
I. I. Ganapathi
S. Javed
Naoufel Werghi
Mohammed Bennamoun
Arif Mahmood
CLIP
VLM
80
1
0
03 Feb 2025
DesCLIP: Robust Continual Adaptation via General Attribute Descriptions for Pretrained Vision-Language Models
Chiyuan He
Zihuan Qiu
Fanman Meng
Linfeng Xu
Qingbo Wu
Yiming Li
VLM
CLL
KELM
73
0
0
02 Feb 2025
Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation
Bin Xie
Hao Tang
Dawen Cai
Yan Yan
Gady Agam
MedIm
VLM
64
1
0
02 Feb 2025
PM-MOE: Mixture of Experts on Private Model Parameters for Personalized Federated Learning
Yu Feng
Yangli-ao Geng
Yifan Zhu
Zongfu Han
Xie Yu
Kaiwen Xue
Haoran Luo
Mengyang Sun
Guangwei Zhang
Meina Song
FedML
MoE
73
0
0
01 Feb 2025
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
Xingyu Miao
Haoran Duan
Yang Bai
Tejal Shah
Jun Song
Yang Long
R. Ranjan
Ling Shao
84
4
0
31 Jan 2025
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping
Pu Yang
Yunzhen Feng
Ziyuan Chen
Yuhang Wu
Zhuoyuan Li
DiffM
106
0
0
31 Jan 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
Luca Ciampi
Ali Azmoudeh
Elif Ecem Akbaba
Erdi Sarıtaş
Ziya Ata Yazıcı
H. K. Ekenel
Giuseppe Amato
Fabrizio Falchi
102
0
0
31 Jan 2025
Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation
Rohan Chacko
Nicolai Haeni
Eldar Khaliullin
Lin Sun
Douglas Lee
3DGS
49
1
0
31 Jan 2025
Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation
Lin Chen
Qi Yang
Kun Ding
Zhu Li
Gang Shen
Fei Li
Qiyuan Cao
Shiming Xiang
VLM
64
0
0
29 Jan 2025