2408.00714
Cited By
SAM 2: Segment Anything in Images and Videos
International Conference on Learning Representations (ICLR), 2024
1 August 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
Tengyu Ma
Haitham Khedr
Roman Rädle
Chloe Rolland
Laura Gustafson
Eric Mintun
Junting Pan
Kalyan Vasudev Alwala
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
VLM
MLLM
HuggingFace (116 upvotes)
Papers citing
"SAM 2: Segment Anything in Images and Videos"
34 / 834 papers shown
Title
GroundingBooth: Grounding Text-to-Image Customization
Zhexiao Xiong
Wei Xiong
Jing Shi
Chentao Song
Yizhi Song
Nathan Jacobs
DiffM
377
12
0
13 Sep 2024
Robust Real-time Segmentation of Bio-Morphological Features in Human Cherenkov Imaging during Radiotherapy via Deep Learning
Shiru Wang
Yao Chen
Lesley A. Jarvis
Yucheng Tang
D. Gladstone
K. Samkoe
Brian W Pogue
P. Brůža
Rongxiao Zhang
MedIm
82
1
0
09 Sep 2024
PRoGS: Progressive Rendering of Gaussian Splats
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Brent Zoomers
Maarten Wijnants
Ivan Molenaers
Joni Vanherck
Jeroen Put
Lode Jorissen
Nick Michiels
3DGS
147
5
0
03 Sep 2024
Cross-domain Multi-step Thinking: Zero-shot Fine-grained Traffic Sign Recognition in the Wild
Knowledge-Based Systems (KBS), 2024
Yaozong Gan
Guang Li
Ren Togo
Keisuke Maeda
Takahiro Ogawa
Miki Haseyama
247
1
0
03 Sep 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Journal of Geophysical Research (JGR), 2024
Zhixiang Guo
Xinming Wu
Luming Liang
Hanlin Sheng
Nuo Chen
Zhengfa Bi
AI4CE
225
9
0
22 Aug 2024
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Lin Zhao
Xiao Chen
Eric Z. Chen
Yikang Liu
Terrence Chen
Shanhui Sun
VLM
240
18
0
16 Aug 2024
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
Haofeng Liu
Erli Zhang
Junde Wu
Mingxuan Hong
Yueming Jin
MedIm
216
37
0
15 Aug 2024
Prompt-Based Segmentation at Multiple Resolutions and Lighting Conditions using Segment Anything Model 2
Osher Rafaeli
T. Svoray
Roni Blushtein-Livnon
Ariel Nahlieli
VLM
367
12
0
13 Aug 2024
Polyp SAM 2: Advancing Zero shot Polyp Segmentation in Colorectal Cancer Detection
International Conferences on Human-Machine Systems (ICCHS), 2024
Mobina Mansoori
Sajjad Shahabodini
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
MedIm
391
11
0
12 Aug 2024
Interactive 3D Medical Image Segmentation with SAM 2
Chuyun Shen
Wenhao Li
Yuhang Shi
Xiangfeng Wang
VLM
MedIm
330
21
0
05 Aug 2024
Earth System Data Cubes: Avenues for advancing Earth system research
Environmental Data Science (EDS), 2024
David Montero
Guido Kraemer
Anca Anghelea
C. Aybar
Gunnar Brandt
...
Francesco Martinuzzi
Martin Reinhardt
Maximilian Sochting
Khalil Teber
Miguel D. Mahecha
181
12
0
05 Aug 2024
Medical SAM 2: Segment medical images as video via Segment Anything Model 2
Jiayuan Zhu
Yunli Qi
A. El Abbadi
VLM
MedIm
278
156
0
01 Aug 2024
Segment anything model 2: an application to 2D and 3D medical images
Haoyu Dong
Han Gu
Yaqian Chen
Jichen Yang
Yuwen Chen
Maciej A. Mazurowski
VLM
MedIm
264
28
0
01 Aug 2024
Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2
Lv Tang
Bo Li
VLM
156
13
0
31 Jul 2024
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Hao Ding
Tuxun Lu
Yuqian Zhang
Ruixing Liang
Hongchao Shu
...
Bo Wang
Marcos Fernández-Rodríguez
Estevao Lima
João L. Vilaça
Mathias Unberath
516
7
0
16 Jul 2024
psifx -- Psychological and Social Interactions Feature Extraction Package
Guillaume Rochette
Mathieu Rochat
M. Vowels
153
1
0
14 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
508
25
0
14 Jul 2024
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li
Deshui Miao
Zhenyu He
Longji Xu
Huchuan Lu
Ming-Hsuan Yang
VOS
267
5
0
10 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
411
6
0
10 Jul 2024
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
Yuxuan Zhang
Tianheng Cheng
Lianghui Zhu
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
VLM
519
53
0
28 Jun 2024
DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning
Zeyu Gao
Yao Mu
Jinye Qu
Mengkang Hu
Lingyue Guo
Ping Luo
Shanghang Zhang
Yanfeng Lu
321
19
0
14 Jun 2024
GUIOdyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
Quanfeng Lu
Wenqi Shao
Zitao Liu
Lingxiao Du
Fanqing Meng
Boxuan Li
Botong Chen
Siyuan Huang
Kaipeng Zhang
Ping Luo
305
88
0
12 Jun 2024
HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction
Jikai Wang
Qifan Zhang
Yu-Wei Chao
Bowen Wen
Xiaohu Guo
Yu Xiang
3DH
418
8
0
10 Jun 2024
CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation
Chenying Liu
C. Albrecht
Yi Wang
Xiao Xiang Zhu
482
4
0
02 May 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Siyuan Huang
LRM
LM&Ro
282
50
0
16 Apr 2024
Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models
Yutao Ouyang
Jinhan Li
Yunfei Li
Zhongyu Li
Chao Yu
Koushil Sreenath
Yi Wu
331
21
0
08 Apr 2024
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Neural Information Processing Systems (NeurIPS), 2024
Dongzhi Jiang
Guanglu Song
Xiaoshi Wu
Renrui Zhang
Dazhong Shen
Zhuofan Zong
Yu Liu
Jiaming Song
VLM
369
50
0
04 Apr 2024
Crafting Dynamic Virtual Activities with Advanced Multimodal Models
International Symposium on Mixed and Augmented Reality (ISMAR), 2024
Changyang Li
Qingan Yan
Minyoung Kim
Z. Li
Yi Tian Xu
Lap-Fai Yu
115
0
0
15 Mar 2024
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Neural Information Processing Systems (NeurIPS), 2024
Haiwen Huang
Songyou Peng
Dan Zhang
Andreas Geiger
VLM
194
5
0
14 Mar 2024
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models
Qingdong He
Jinlong Peng
Zhengkai Jiang
Xiaobin Hu
Jiangning Zhang
3DPC
VLM
379
8
0
11 Mar 2024
Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback
V. Bhat
Ali Umut Kaypak
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
LM&Ro
335
32
0
13 Feb 2024
Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation
Xianjie Liu
Keren Fu
Qijun Zhao
VLM
500
1
0
30 Dec 2023
Audio-Visual Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM
VOS
301
11
0
28 Oct 2023
Temporal Transductive Inference for Few-Shot Video Object Segmentation
International Journal of Computer Vision (IJCV), 2022
Mennatullah Siam
Konstantinos G. Derpanis
Richard P. Wildes
VOS
252
10
0
27 Mar 2022