2304.02643
Segment Anything
5 April 2023
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
Laura Gustafson
Tete Xiao
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
Papers citing "Segment Anything"
50 / 4,267 papers shown
Title
Ego3DT: Tracking Every 3D Object in Ego-centric Videos
Shengyu Hao
Wenhao Chai
Zhonghan Zhao
Meiqi Sun
Wendi Hu
...
Yixian Zhao
Qi Li
Yizhou Wang
Xi Li
Gaoang Wang
45
1
0
11 Oct 2024
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Binghui Li
Yuanzhi Li
OOD
44
2
0
11 Oct 2024
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
38
5
0
11 Oct 2024
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
Yue Yang
Shanghang Zhang
Wenqi Shao
Kaipeng Zhang
Yi Bin
Yu Wang
Ping Luo
50
2
0
11 Oct 2024
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
Botao Ren
Xue Yang
Yi Yu
Junwei Luo
Zhidong Deng
57
6
0
10 Oct 2024
Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision
Shengcao Cao
Liang-Yan Gui
Yu-Xiong Wang
52
3
0
10 Oct 2024
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Hang Yin
Xiuwei Xu
Zhenyu Wu
Jie Zhou
Jiwen Lu
53
16
0
10 Oct 2024
Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
Junyi Cao
Shanyan Guan
Yanhao Ge
Wei Li
Xiaokang Yang
Chao Ma
AI4CE
52
6
0
10 Oct 2024
UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images
Zeyu Chen
Jingyi Tang
Gu Wang
Shengquan Li
Xinghui Li
Xiangyang Ji
Xiu Li
42
0
0
10 Oct 2024
Distribution Guidance Network for Weakly Supervised Point Cloud Semantic Segmentation
Zhiyi Pan
Wei-Nan Gao
Shan Liu
Ge Li
37
1
0
10 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
56
6
0
10 Oct 2024
ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation
Léo Machado
Hélène Philippe
Elodie Ferreres
Julien Khlaut
Julie Dupuis
...
Corentin Dancette
Daniel Tordjman
Pierre Manceron
Paul Hérent
52
0
0
10 Oct 2024
Exploring Foundation Models in Remote Sensing Image Change Detection: A Comprehensive Survey
Zihan Yu
Tianxiao Li
Yuxin Zhu
Rongze Pan
43
1
0
10 Oct 2024
Delta-ICM: Entropy Modeling with Delta Function for Learned Image Compression
Takahiro Shindo
Taiju Watanabe
Yui Tatsumi
Hiroshi Watanabe
50
1
0
10 Oct 2024
Moyun: A Diffusion-Based Model for Style-Specific Chinese Calligraphy Generation
Kaiyuan Liu
Jiahao Mei
Hengyu Zhang
Yihuai Zhang
Xingjiao Wu
Daoguo Dong
Liang He
DiffM
32
0
0
10 Oct 2024
Fine-detailed Neural Indoor Scene Reconstruction using multi-level importance sampling and multi-view consistency
Xinghui Li
Yuchen Ji
Xiansong Lai
Wanting Zhang
3DV
45
1
0
10 Oct 2024
O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out
Mısra Yavuz
Fatma Guney
38
0
0
10 Oct 2024
Metamizer: a versatile neural optimizer for fast and accurate physics simulations
Nils Wandel
Stefan Schulz
Reinhard Klein
PINN
AI4CE
56
1
0
10 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
52
94
0
10 Oct 2024
Interactive4D: Interactive 4D LiDAR Segmentation
Ilya Fradlin
Idil Esen Zulfikar
Kadir Yilmaz
Theodora Kontogianni
Bastian Leibe
55
2
0
10 Oct 2024
3D Vision-Language Gaussian Splatting
Qucheng Peng
Benjamin Planche
Zhongpai Gao
Meng Zheng
Anwesa Choudhuri
Terrence Chen
Chong Chen
Ziyan Wu
3DGS
57
5
0
10 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
74
15
0
10 Oct 2024
Generalizing Segmentation Foundation Model Under Sim-to-real Domain-shift for Guidewire Segmentation in X-ray Fluoroscopy
Yuxuan Wen
Evgenia Roussinova
Olivier Brina
Paolo Machi
Mohamed Bouri
OOD
MedIm
41
1
0
09 Oct 2024
Exploring Efficient Foundational Multi-modal Models for Video Summarization
Karan Samel
Apoorva Beedu
Nitish Sontakke
Irfan Essa
37
1
0
09 Oct 2024
Structured Spatial Reasoning with Open Vocabulary Object Detectors
Negar Nejatishahidin
Madhukar Reddy Vongala
Jana Kosecka
51
3
0
09 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
68
4
0
09 Oct 2024
Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation
Runze Chen
Haiyong Luo
Fang Zhao
Jingze Yu
Yupeng Jia
Juan Wang
Xuepeng Ma
MDE
59
1
0
09 Oct 2024
Bridge the Points: Graph-based Few-shot Segment Anything Semantically
Anqi Zhang
Guangyu Gao
Jianbo Jiao
C. Liu
Yunchao Wei
VLM
50
5
0
09 Oct 2024
Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery
Ang He
Ximei Wu
Xing Xu
Jing Chen
Xiaobin Guo
Sheng Xu
31
0
0
09 Oct 2024
K-SAM: A Prompting Method Using Pretrained U-Net to Improve Zero Shot Performance of SAM on Lung Segmentation in CXR Images
Mohamed Deriche
Mohammad Marufur
33
0
0
09 Oct 2024
HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding
Keliang Li
Zaifei Yang
Jiahe Zhao
Hongze Shen
Ruibing Hou
Hong Chang
Shiguang Shan
Xilin Chen
VLM
50
0
0
09 Oct 2024
Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments
Meng Yu
Luojie Yang
Xunjie He
Yi Yang
Yufeng Yue
VLM
44
0
0
09 Oct 2024
Towards Natural Image Matting in the Wild via Real-Scenario Prior
Ruihao Xia
Yu Liang
Peng-Tao Jiang
Hao Zhang
Qianru Sun
Yang Tang
Bo Li
Pan Zhou
56
0
0
09 Oct 2024
NaVIP: An Image-Centric Indoor Navigation Solution for Visually Impaired People
Jun Yu
Yifan Zhang
Badrinadh Aila
V. Namboodiri
53
1
0
08 Oct 2024
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
Siyuan Li
Juanxi Tian
Zedong Wang
Luyuan Zhang
Zicheng Liu
Weiyang Jin
Yang Liu
Baigui Sun
Stan Z. Li
45
0
0
08 Oct 2024
Context-Aware Command Understanding for Tabletop Scenarios
Paul Gajewski
Antonio Galiza Cerdeira Gonzalez
B. Indurkhya
LM&Ro
23
0
0
08 Oct 2024
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Venkata Naren Devarakonda
Raktim Gautam Goswami
Ali Umut Kaypak
Naman Patel
Rooholla Khorrambakht
Prashanth Krishnamurthy
Farshad Khorrami
LM&Ro
59
5
0
08 Oct 2024
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation
Rutav Shah
Albert Yu
Yifeng Zhu
Yuke Zhu
Roberto Martín-Martín
LM&Ro
61
6
0
08 Oct 2024
Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images
Shiyu Miao
Delong Chen
Fan Liu
Chuanyi Zhang
Yanhui Gu
Shengjie Guo
Jun Zhou
36
2
0
08 Oct 2024
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
Chi-Lam Cheang
Guangzeng Chen
Ya Jing
Tao Kong
Hang Li
...
Hongtao Wu
Jiafeng Xu
Yichu Yang
Hanbo Zhang
Minzhao Zhu
VGen
LM&Ro
71
57
0
08 Oct 2024
Towards Unsupervised Eye-Region Segmentation for Eye Tracking
Jiangfan Deng
Zhuang Jia
Zhaoxue Wang
Xiang Long
Daniel K. Du
25
1
0
08 Oct 2024
ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution
Corban Rivera
Grayson Byrd
William Paul
Tyler Feldman
Meghan Booker
...
Krishna Murthy Jatavallabhula
Celso M. De Melo
Lalithkumar Seenivasan
Mathias Unberath
Rama Chellappa
LLMAG
LM&Ro
36
1
0
08 Oct 2024
AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation
Boyuan Cao
Jiaxin Ye
Yujie Wei
Hongming Shan
50
3
0
08 Oct 2024
Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts
Zhiwei Lin
Yongtao Wang
Zhi Tang
ObjD
VLM
41
4
0
08 Oct 2024
DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing
June Suk Choi
Kyungmin Lee
Jongheon Jeong
Saining Xie
Jinwoo Shin
Kimin Lee
DiffM
AAML
52
4
0
08 Oct 2024
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
Gihyun Kwon
Jong Chul Ye
DiffM
66
4
0
08 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
90
70
0
08 Oct 2024
FogROS2-PLR: Probabilistic Latency-Reliability For Cloud Robotics
Kaiyuan Chen
Nan Tian
Christian Juette
Tianshuang Qiu
Liu Ren
John Kubiatowicz
Ken Goldberg
60
1
0
07 Oct 2024
Toward General Object-level Mapping from Sparse Views with 3D Diffusion Priors
Ziwei Liao
Binbin Xu
Steven L. Waslander
DiffM
56
3
0
07 Oct 2024
GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting
Yukang Cao
Masoud Hadi
Liang Pan
Ziwei Liu
3DGS
DiffM
63
4
0
07 Oct 2024