arXiv:2304.02643
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing "Segment Anything"
50 / 4,187 papers shown
Title
Boosting Multi-View Stereo with Depth Foundation Model in the Absence of Real-World Labels
Jie Zhu
Bo Peng
Zhe Zhang
Bingzheng Liu
Jianjun Lei
33
0
0
16 Apr 2025
Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections
Alireza Salehi
Mohammadreza Salehi
Reshad Hosseini
Cees G. M. Snoek
Makoto Yamada
Mohammad Sabokrou
VLM
33
0
0
15 Apr 2025
FACT: Foundation Model for Assessing Cancer Tissue Margins with Mass Spectrometry
Mohammad Farahmand
A. Jamzad
Fahimeh Fooladgar
Laura Connolly
Martin Kaufmann
Kevin Yi Mi Ren
John Rudan
Doug McKay
Gabor Fichtinger
P. Mousavi
43
0
0
15 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
Xinyu Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
49
6
0
15 Apr 2025
Easy3D: A Simple Yet Effective Method for 3D Interactive Segmentation
Andrea Simonelli
Norman Muller
Peter Kontschieder
26
0
0
15 Apr 2025
MediSee: Reasoning-based Pixel-level Perception in Medical Images
Qinyue Tong
Ziqian Lu
Jun Liu
Yangming Zheng
Zheming Lu
LRM
23
0
0
15 Apr 2025
Explicit and Implicit Representations in AI-based 3D Reconstruction for Radiology: A Systematic Review
Yuezhe Yang
Boyu Yang
Yaqian Wang
Yang He
Xingbo Dong
Zhe Jin
38
0
0
15 Apr 2025
LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation
Hanning Chen
Yang Ni
Wenjun Huang
Hyunwoo Oh
Yezi Liu
Tamoghno Das
Mohsen Imani
VLM
LRM
36
0
0
15 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Yixuan Wang
Huiyu Xu
Zhibo Wang
Jiacheng Du
Zehan Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
54
0
0
15 Apr 2025
Reimagining Urban Science: Scaling Causal Inference with Large Language Models
Yutong Xia
Ao Qu
Yunhan Zheng
Yihong Tang
Dingyi Zhuang
...
Cathy Wu
R. Zimmermann
Lijun Sun
Roger Zimmermann
Jinhua Zhao
AI4CE
75
0
0
15 Apr 2025
Deep Learning in Concealed Dense Prediction
Pancheng Zhao
Deng-Ping Fan
Shupeng Cheng
Salman Khan
F. Khan
David A. Clifton
P. Xu
Jufeng Yang
VLM
25
0
0
15 Apr 2025
TT3D: Table Tennis 3D Reconstruction
Thomas Gossard
Andreas Ziegler
A. Zell
33
0
0
14 Apr 2025
Zero-shot Autonomous Microscopy for Scalable and Intelligent Characterization of 2D Materials
J. Yang
Ruoyan Avery Yin
Chi Jiang
Yuepeng Hu
X. Zhu
...
Zongyou Yin
Jing Kong
Neil Gong
Z. Z. Ren
Haozhe Wang
26
0
0
14 Apr 2025
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
Tao Zhang
X. Li
Zilong Huang
Y. Li
Weixian Lei
XueQing Deng
Shihao Chen
S. Ji
Jiashi Feng
MLLM
LRM
62
2
0
14 Apr 2025
ToolTipNet: A Segmentation-Driven Deep Learning Baseline for Surgical Instrument Tip Detection
Zijian Wu
Shuojue Yang
Yueming Jin
Septimiu E. Salcudean
MedIm
35
1
0
13 Apr 2025
GeoNav: Empowering MLLMs with Explicit Geospatial Reasoning Abilities for Language-Goal Aerial Navigation
Haotian Xu
Yue Hu
Chen Gao
Zhengqiu Zhu
Yong Zhao
Yongqian Li
Quanjun Yin
39
0
0
13 Apr 2025
SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model
Kaiyu Li
Zepeng Xin
Li Pang
Chao Pang
Yupeng Deng
Jing Yao
Guisong Xia
Deyu Meng
Zhi Wang
Xiangyong Cao
VLM
LRM
37
0
0
13 Apr 2025
Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation
Jia Wei
Xiaoqi Zhao
Jonghye Woo
J. Ouyang
G. El Fakhri
Qingyu Chen
Xiaofeng Liu
22
0
0
13 Apr 2025
Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking
You Wu
Xucheng Wang
Xiangyang Yang
Mengyuan Liu
Dan Zeng
Hengzhou Ye
Shuiwang Li
34
0
0
12 Apr 2025
Visual moral inference and communication
Warren Zhu
Aida Ramezani
Yang Xu
33
0
0
12 Apr 2025
PathSeqSAM: Sequential Modeling for Pathology Image Segmentation with SAM2
Mingyang Zhu
Yinting Liu
Mingyu Li
Jiacheng Wang
21
0
0
12 Apr 2025
DoorBot: Closed-Loop Task Planning and Manipulation for Door Opening in the Wild with Haptic Feedback
Zhi Wang
Yuchen Mo
Shengmiao Jin
Wenzhen Yuan
34
1
0
12 Apr 2025
DreamFuse: Adaptive Image Fusion with Diffusion Transformer
Junjia Huang
Pengxiang Yan
Jiyang Liu
Jie Wu
Zhao Wang
Yitong Wang
Liang Lin
G. Li
37
0
0
11 Apr 2025
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
Jialu Li
Shoubin Yu
Han Lin
Jaemin Cho
Jaehong Yoon
Joey Tianyi Zhou
DiffM
VGen
50
0
0
11 Apr 2025
Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions
Tommaso Galliena
Tommaso Apicella
Stefano Rosa
Pietro Morerio
Alessio Del Bue
Lorenzo Natale
39
0
0
11 Apr 2025
Adversarial Examples in Environment Perception for Automated Driving (Review)
Jun Yan
Huilin Yin
AAML
34
0
0
11 Apr 2025
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
Cheng-Yu Hsieh
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Chun-Liang Li
Ranjay Krishna
Oncel Tuzel
Hadi Pouransari
VLM
162
0
0
11 Apr 2025
Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models
Jiahuan Long
Tingsong Jiang
Wen Yao
Yizhe Xiong
Zhengqin Xu
Shuai Jia
Chao Ma
24
0
0
11 Apr 2025
SynthFM: Training Modality-agnostic Foundation Models for Medical Image Segmentation without Real Medical Data
Sourya Sengupta
Satrajit Chakrabarty
Keerthi Sravan Ravi
Gopal Avinash
Ravi Soni
MedIm
29
0
0
11 Apr 2025
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
ViT
29
0
0
11 Apr 2025
FMLGS: Fast Multilevel Language Embedded Gaussians for Part-level Interactive Agents
Xin Tan
Yuzhou Ji
He Zhu
Yuan Xie
3DGS
36
0
0
11 Apr 2025
Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation
Bram Vanherle
Brent Zoomers
Jeroen Put
F. Reeth
Nick Michiels
3DGS
34
0
0
11 Apr 2025
Robust SAM: On the Adversarial Robustness of Vision Foundation Models
Jiahuan Long
Zhengqin Xu
Tingsong Jiang
Wen Yao
Shuai Jia
Chao Ma
Xiaoqian Chen
AAML
VLM
39
1
0
11 Apr 2025
On Background Bias of Post-Hoc Concept Embeddings in Computer Vision DNNs
Gesina Schwalbe
Georgii Mikriukov
Edgar Heinert
Stavros Gerolymatos
Mert Keser
Alois Knoll
Matthias Rottmann
Annika Mütze
31
0
0
11 Apr 2025
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
51
1
0
11 Apr 2025
FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment
Sebastián Barbas Laina
Simon Boche
Sotiris Papatheodorou
Simon Schaefer
Jaehyung Jung
Stefan Leutenegger
52
0
0
11 Apr 2025
CoProSketch: Controllable and Progressive Sketch Generation with Diffusion Model
Ruohao Zhan
Yijin Li
Yisheng He
Shuo Chen
Yichen Shen
Xinyu Chen
Zilong Dong
Zhaoyang Huang
Guofeng Zhang
DiffM
34
0
0
11 Apr 2025
Towards Unconstrained 2D Pose Estimation of the Human Spine
Muhammad Gul Zain Ali Khan
Stephan Krauß
Didier Stricker
3DH
58
0
0
10 Apr 2025
HoloPart: Generative 3D Part Amodal Segmentation
Yanting Yang
Y. Guo
Yukun Huang
Zi-Xin Zou
Zhipeng Yu
Yangguang Li
Yan-Pei Cao
Xihui Liu
DiffM
45
1
0
10 Apr 2025
MARS: a Multimodal Alignment and Ranking System for Few-Shot Segmentation
Nico Catalano
Stefano Samele
Paolo Pertino
Matteo Matteucci
3DPC
53
0
0
10 Apr 2025
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation
Linyan Huang
Haonan Lin
Yanning Zhou
Kaiwen Xiao
47
0
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
76
0
0
10 Apr 2025
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding
Henghao Zhao
Ge-Peng Ji
Rui Yan
Huan Xiong
Zechao Li
24
0
0
10 Apr 2025
Multi-Modal Data Fusion for Moisture Content Prediction in Apple Drying
Shichen Li
Chenhui Shao
36
1
0
10 Apr 2025
Wheat3DGS: In-field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting
Daiwei Zhang
Joaquin Gajardo
Tomislav Medic
Isinsu Katircioglu
Mike Boss
Norbert Kirchgessner
Achim Walter
Lukas Roth
29
0
0
09 Apr 2025
MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
Ylli Sadikaj
Hongkuan Zhou
Lavdim Halilaj
Stefan Schmid
Steffen Staab
Claudia Plant
23
0
0
09 Apr 2025
Are We Done with Object-Centric Learning?
Alexander Rubinstein
Ameya Prabhu
Matthias Bethge
Seong Joon Oh
OCL
634
0
0
09 Apr 2025
Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding
Pedro Hermosilla
Christian Stippel
Leon Sick
SSL
3DPC
79
0
0
09 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
39
0
0
09 Apr 2025
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking
Chang Nie
Yiqing Xu
Guangming Wang
Zhe Liu
Yanzi Miao
Hesheng Wang
VLM
41
0
0
09 Apr 2025