2304.02643
Cited By
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing "Segment Anything"
50 / 4,188 papers shown
Title
The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs
Jonathan Sauder
Viktor Domazetoski
G. Banc-Prandi
Gabriela Perna
Anders Meibom
D. Tuia
58
0
0
25 Mar 2025
Show and Segment: Universal Medical Image Segmentation via In-Context Learning
Yunhe Gao
Di Liu
Zhuowei Li
Yongbin Li
Dongdong Chen
Mu Zhou
Dimitris N. Metaxas
VLM
53
0
0
25 Mar 2025
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
Sangwon Beak
Hyeonwoo Kim
Hanbyul Joo
46
0
0
25 Mar 2025
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
Mingju Gao
Yike Pan
Huan-ang Gao
Zongzheng Zhang
Wenyi Li
Hao Dong
Hao Tang
Li Yi
Hao Zhao
VGen
47
0
0
25 Mar 2025
Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications
Ben Rahman
VLM
42
1
0
25 Mar 2025
Video Anomaly Detection with Contours - A Study
M. Siemon
I. Nikolov
T. Moeslund
Kamal Nasrollahi
3DH
55
0
0
25 Mar 2025
DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image
Hyeongjin Nam
Donghwan Kim
Jeongtaek Oh
Kyoung Mu Lee
DiffM
3DH
56
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Yunfan LU
Sifei Liu
...
Jan Kautz
Enze Xie
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
166
0
0
25 Mar 2025
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
Deepayan Das
Davide Talon
Yiming Wang
Massimiliano Mancini
Elisa Ricci
VLM
LRM
50
0
0
24 Mar 2025
Towards Human-Understandable Multi-Dimensional Concept Discovery
Arne Grobrugge
Niklas Kühl
G. Satzger
Philipp Spitzer
44
0
0
24 Mar 2025
MaSS13K: A Matting-level Semantic Segmentation Benchmark
C. Xie
Minghan Li
Hui Zeng
Jun Luo
Lei Zhang
VLM
76
0
0
24 Mar 2025
LiDAR Remote Sensing Meets Weak Supervision: Concepts, Methods, and Perspectives
Yuan Gao
Shaobo Xia
P. Wang
Xiaohuan Xi
Sheng Nie
Cheng-Xiang Wang
50
1
0
24 Mar 2025
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang
Yuxuan Yuan
Chia-Ju Chen
Zeyu Zhang
Yue Huang
Kun Zhang
57
0
0
24 Mar 2025
EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos
Nathan Darjana
Ryo Fujii
Hideo Saito
Hiroki Kajita
56
0
0
24 Mar 2025
Towards Training-free Anomaly Detection with Vision and Language Foundation Models
Jinjin Zhang
Guodong Wang
Yizhou Jin
Di Huang
42
1
0
24 Mar 2025
Target-Aware Video Diffusion Models
Taeksoo Kim
Hanbyul Joo
DiffM
VGen
91
1
0
24 Mar 2025
Structure-Aware Correspondence Learning for Relative Pose Estimation
Yihan Chen
Wenfei Yang
Huan Ren
Shifeng Zhang
Tianzhu Zhang
Feng Wu
3DPC
71
0
0
24 Mar 2025
Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models
Zichen Miao
Wei Chen
Qiang Qiu
92
1
0
24 Mar 2025
DiffV2IR: Visible-to-Infrared Diffusion Model via Vision-Language Understanding
Lingyan Ran
Lidong Wang
Guangcong Wang
Peng Wang
Yuyao Zhang
59
0
0
24 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS
VLM
88
0
0
23 Mar 2025
PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation
Yiheng Zhong
Zihong Luo
Chengzhi Liu
Feilong Tang
Zelin Peng
Ming Hu
Y. Hu
Jionglong Su
Zongyuan Ge
Imran Razzak
MedIm
65
0
0
23 Mar 2025
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Jiaxin Huang
Runnan Chen
Ziwen Li
Zhengqing Gao
Xiao He
Yandong Guo
Mingming Gong
Tongliang Liu
LRM
56
0
0
23 Mar 2025
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee
Debasmit Das
Munawar Hayat
Sungha Choi
Kyuwoong Hwang
Fatih Porikli
VLM
68
1
0
23 Mar 2025
FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation
Dong Zhao
Jinlong Li
Shuang Wang
Mengyao Wu
Qi Zang
N. Sebe
Zhun Zhong
182
0
0
23 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
70
0
0
23 Mar 2025
PanopticSplatting: End-to-End Panoptic Gaussian Splatting
Yuxuan Xie
Xuan Yu
Changjian Jiang
Sitong Mao
Shunbo Zhou
Rui Fan
R. Xiong
Yansen Wang
3DGS
48
0
0
23 Mar 2025
PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding
Hongjia Zhai
Yiming Li
Zhenzhe Li
Xiaokun Pan
Yijia He
Guofeng Zhang
50
0
0
23 Mar 2025
Vehicular Road Crack Detection with Deep Learning: A New Online Benchmark for Comprehensive Evaluation of Existing Algorithms
Nachuan Ma
Zhengfei Song
Qiang Hu
Chuang-Wei Liu
Yu Han
Yanting Zhang
Rui Fan
Lihua Xie
57
0
0
23 Mar 2025
Serial Low-rank Adaptation of Vision Transformer
Houqiang Zhong
Shaocheng Shen
Ke Cai
Zhenglong Wu
Jiangchao Yao
Yuan Cheng
Xuefei Li
Xiaoyun Zhang
Li-Na Song
Qiang Hu
47
0
0
22 Mar 2025
RAIDER: Tool-Equipped Large Language Model Agent for Robotic Action Issue Detection, Explanation and Recovery
Silvia Izquierdo-Badiola
Carlos Rizzo
Guillem Alenyà
LLMAG
LM&Ro
84
0
0
22 Mar 2025
Co-op: Correspondence-based Novel Object Pose Estimation
Sungphill Moon
Hyeontae Son
Dongcheol Hur
Sangwook Kim
3DH
69
1
0
22 Mar 2025
RefCut: Interactive Segmentation with Reference Guidance
Zheng Lin
Nan Zhou
Chen-Xi Du
Deng-Ping Fan
Shi-Min Hu
65
0
0
22 Mar 2025
GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots
Bin Fu
Jiajian Li
Bin Zhang
Ruiping Wang
Xilin Chen
3DGS
41
0
0
22 Mar 2025
Enhancing Martian Terrain Recognition with Deep Constrained Clustering
Tejas Panambur
M. Parente
57
0
0
22 Mar 2025
GOAL: Global-local Object Alignment Learning
Hyungyu Choi
Young Kyun Jang
Chanho Eom
VLM
165
0
0
22 Mar 2025
Dereflection Any Image with Diffusion Priors and Diversified Data
Jichen Hu
Chen-Ning Yang
Zanwei Zhou
Jiemin Fang
Xiaokang Yang
Q. Tian
Wei-Ming Shen
47
0
0
21 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
52
1
0
21 Mar 2025
RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images and A Benchmark
Ziteng Cui
Jianfei Yang
Tatsuya Harada
VLM
56
0
0
21 Mar 2025
Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation
Qinghe Ma
Jian Zhang
Zekun Li
Lei Qi
Qian Yu
Yinghuan Shi
MedIm
50
1
0
21 Mar 2025
MagicColor: Multi-Instance Sketch Colorization
Yuyao Zhang
Yue Ma
Bingyuan Wang
Qifeng Chen
Zeyu Wang
DiffM
73
0
0
21 Mar 2025
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer
Qingyu Shi
Jianzong Wu
Jinbin Bai
Jun Zhang
Lu Qi
Xiaomeng Li
Yunhai Tong
48
0
0
21 Mar 2025
ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail
Chandan Yeshwanth
Dávid Rozenberszki
Angela Dai
77
0
0
21 Mar 2025
OpenCity3D: What do Vision-Language Models know about Urban Environments?
Valentin Bieri
Marco Zamboni
Nicolas S. Blumer
Qingxuan Chen
Francis Engelmann
56
1
0
21 Mar 2025
Is there anything left? Measuring semantic residuals of objects removed from 3D Gaussian Splatting
Simona Kocour
Assia Benbihi
Aikaterini Adam
Torsten Sattler
3DPC
41
0
0
21 Mar 2025
Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos
Yuang Feng
Shuyong Gao
Fuzhen Yan
Yicheng Song
Lingyi Hong
J. Hu
Wenqiang Zhang
VOS
53
0
0
21 Mar 2025
Controllable Segmentation-Based Text-Guided Style Editing
Jingwen Li
Aravind Chandrasekar
Mariana Rocha
Chao Li
Yuqing Chen
55
0
0
20 Mar 2025
Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model
Yingmao Miao
Zhanpeng Huang
Rui Han
Zibin Wang
Chenhao Lin
Chao Shen
DiffM
52
0
0
20 Mar 2025
M2N2V2: Multi-Modal Unsupervised and Training-free Interactive Segmentation
Markus Karmann
Peng-Tao Jiang
Bo Li
O. Urfalioglu
47
0
0
20 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
201
0
0
20 Mar 2025
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions
Xiaomeng Chu
Jiajun Deng
Guoliang You
Wei Liu
Xuzhao Li
Jianmin Ji
Wenjie Qu
84
0
0
20 Mar 2025