2304.02643
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing "Segment Anything"
50 / 4,200 papers shown
Title
PGP-SAM: Prototype-Guided Prompt Learning for Efficient Few-Shot Medical Image Segmentation
Zhonghao Yan
Zijin Yin
Tianyu Lin
Xiangzhu Zeng
Kongming Liang
Zhanyu Ma
VLM
MedIm
59
0
0
12 Jan 2025
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
Duc-Hai Pham
Duc Dung Nguyen
Anh Pham
Ho Lai Tuan
P. Nguyen
Khoi Duc Minh Nguyen
Rang Nguyen
3DPC
56
1
0
10 Jan 2025
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming-Hsuan Yang
Sergey Tulyakov
DiffM
VGen
81
7
0
10 Jan 2025
Edit as You See: Image-guided Video Editing via Masked Motion Modeling
Zhi-Lin Huang
Yebin Liu
Chujun Qin
Zihan Wang
Dong Zhou
Dong Li
E. Barsoum
DiffM
VGen
46
0
0
08 Jan 2025
ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization
Kourosh Darvish
Marta Skreta
Yuchi Zhao
Naruki Yoshikawa
Sagnik Som
...
Han Hao
Haoping Xu
Alán Aspuru-Guzik
Animesh Garg
Florian Shkurti
64
23
0
08 Jan 2025
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control
Yuanpeng Tu
Hao Luo
Xi Chen
S. Ji
Xiang Bai
Hengshuang Zhao
VGen
DiffM
52
3
0
08 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
55
10
0
08 Jan 2025
Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting
K. Gao
Liangzhi Li
Hongjie He
Dening Lu
Linlin Xu
Jonathan Li
GP
3DGS
32
2
0
08 Jan 2025
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Mingjie Pan
Jiyao Zhang
Tianshu Wu
Yinghao Zhao
Wenlong Gao
Hao Dong
LM&Ro
63
8
0
08 Jan 2025
Concept Matching with Agent for Out-of-Distribution Detection
YuXiao Lee
Xiaofeng Cao
Jingcai Guo
Wei Ye
Qing Guo
Yi Chang
71
0
0
08 Jan 2025
START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation
Jintao Guo
Lei Qi
Yinghuan Shi
Yang Gao
38
2
0
08 Jan 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
75
2
0
08 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Yiyao Yu
Xinzhe Ni
Zicheng Lin
Jin Zeng
Yujiu Yang
LRM
83
14
0
08 Jan 2025
Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy
Risha Goel
Zain Shabeeb
Isabel Panicker
Vida Jamali
VLM
26
0
0
06 Jan 2025
FoundPAD: Foundation Models Reloaded for Face Presentation Attack Detection
Guray Ozgur
Eduarda Caldeira
Tahar Chettaoui
Fadi Boutros
Raghavendra Ramachandra
Naser Damer
AAML
CVBM
39
0
1
06 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
91
12
0
06 Jan 2025
ProTracker: Probabilistic Integration for Robust and Accurate Point Tracking
Tingyang Zhang
Chen Wang
Zhiyang Dou
Qingzhe Gao
Jiahui Lei
Baoquan Chen
Lingjie Liu
3DV
51
0
0
06 Jan 2025
MObI: Multimodal Object Inpainting Using Diffusion Models
Alexandru Buburuzan
Anuj Sharma
John Redford
P. Dokania
Romain Mueller
DiffM
96
1
0
06 Jan 2025
Efficient Architectures for High Resolution Vision-Language Models
Miguel Carvalho
Bruno Martins
MLLM
VLM
50
0
0
05 Jan 2025
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
Robert Joseph George
David Pitt
Jiawei Zhao
Jean Kossaifi
Cheng Luo
Yuandong Tian
Anima Anandkumar
45
1
0
04 Jan 2025
DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data
Yuanpeng Tu
Xi Chen
Ser-Nam Lim
Hengshuang Zhao
47
0
0
03 Jan 2025
Exploiting Boundary Loss for the Hierarchical Panoptic Segmentation of Plants and Leaves
Madeleine Darbyshire
Elizabeth I. Sklar
Simon Parsons
60
0
0
03 Jan 2025
STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes
Jiawei Yang
Jiahui Huang
Yuxiao Chen
Yan Wang
Boyi Li
...
Peter Karkus
Danfei Xu
Boris Ivanovic
Yue Wang
Marco Pavone
3DGS
80
5
0
03 Jan 2025
Instruction-Guided Scene Text Recognition
Yongkun Du
Z. Chen
Yuchen Su
Caiyan Jia
Yu-Gang Jiang
78
3
0
03 Jan 2025
GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
Rahul Sajnani
Jeroen Vanbaar
Jie Min
Kapil D. Katyal
Srinath Sridhar
DiffM
59
10
0
03 Jan 2025
A Novel Shape Guided Transformer Network for Instance Segmentation in Remote Sensing Images
Dawen Yu
Shunping Ji
ViT
62
1
0
03 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
104
48
0
03 Jan 2025
PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM
Runnan Chen
Zhaoqing Wang
Jiepeng Wang
Yuexin Ma
Mingming Gong
Wenping Wang
Tongliang Liu
3DGS
41
2
0
03 Jan 2025
Is Segment Anything Model 2 All You Need for Surgery Video Segmentation? A Systematic Evaluation
Cheng Yuan
Jian Jiang
Kunyi Yang
Lv Wu
Rui Wang
...
Yifan Zhou
Wanli Song
Haoran Wang
Qi Dou
Yutong Ban
41
1
0
03 Jan 2025
Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models
Yifan Zhang
Junhui Hou
66
1
0
03 Jan 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
84
6
0
03 Jan 2025
Region-Guided Attack on the Segment Anything Model (SAM)
Xiaoliang Liu
Furao Shen
Jian Zhao
AAML
33
0
0
03 Jan 2025
PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation
Zhenyu Li
Wenqing Cui
S. Bhat
Peter Wonka
MDE
46
0
0
03 Jan 2025
RORem: Training a Robust Object Remover with Human-in-the-Loop
Ruibin Li
Tao Yang
Song Guo
Lefei Zhang
58
3
0
01 Jan 2025
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Zhaoliang Wan
Yonggen Ling
Senlin Yi
Lu Qi
Wangwei Lee
...
Xiao Teng
Peng Lu
Xu Yang
Ming-Hsuan Yang
Hui Cheng
65
5
0
31 Dec 2024
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation
Ting Zhang
Zhiqiang Yuan
Yeshuang Zhu
Jinchao Zhang
DiffM
109
0
0
31 Dec 2024
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
67
4
0
31 Dec 2024
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
Zhipeng Chen
Lan Yang
Yonggang Qi
Honggang Zhang
Kaiyue Pang
Ke Li
Yi-Zhe Song
DiffM
102
0
0
31 Dec 2024
Dual-Space Augmented Intrinsic-LoRA for Wind Turbine Segmentation
Shubh Singhal
Raül Pérez-Gonzalo
Andreas Espersen
Antonio Agudo
44
0
0
31 Dec 2024
MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping
Amirreza Fateh
Mohammad Reza Mohammadi
Mohammad Reza Jahed Motlagh
ViT
80
5
0
31 Dec 2024
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
Hao Zhang
Tat-Seng Chua
Shuicheng Yan
71
39
0
31 Dec 2024
4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives
Zeyu Yang
Zijie Pan
Xiatian Zhu
Li Zhang
Yu-Gang Jiang
Philip H. S. Torr
3DGS
48
0
0
31 Dec 2024
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
Zhifang Zhang
Yuwei Niu
Xin Liu
Beibei Li
VPVLM
VLM
67
0
0
31 Dec 2024
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine
Hanguang Xiao
Feizhong Zhou
Xianglong Liu
Tianqi Liu
Zhipeng Li
Xin Liu
Xiaoxuan Huang
AILaw
LM&MA
LRM
66
19
0
31 Dec 2024
Protective Perturbations against Unauthorized Data Usage in Diffusion-based Image Generation
Sen Peng
Jijia Yang
Mingyue Wang
Jianfei He
Xiaohua Jia
DiffM
55
0
0
25 Dec 2024
StaR Maps: Unveiling Uncertainty in Geospatial Relations
Simon Kohaut
Benedict Flade
Julian Eggert
Devendra Singh Dhami
Kristian Kersting
47
1
0
24 Dec 2024
LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding
Hao Li
Roy Qin
Zhengyu Zou
Diqi He
Yangqiu Song
Bingquan Dai
Dingwen Zhang
Jiawei Han
3DGS
58
1
0
23 Dec 2024
Personalized Large Vision-Language Models
Chau Pham
Hoang Phan
David Doermann
Yunjie Tian
VLM
54
3
0
23 Dec 2024
AFANet: Adaptive Frequency-Aware Network for Weakly-Supervised Few-Shot Semantic Segmentation
Jiaqi Ma
Guo-Sen Xie
Fang Zhao
Zechao Li
44
0
0
23 Dec 2024
Learning Dynamic Local Context Representations for Infrared Small Target Detection
Guoyi Zhang
Guangsheng Xu
Han Wang
Siyang Chen
Yunxiao Shan
Xiaohu Zhang
47
1
0
23 Dec 2024