SAM 2: Segment Anything in Images and Videos

International Conference on Learning Representations (ICLR), 2024
1 August 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
Tengyu Ma
Haitham Khedr
Roman Rädle
Chloe Rolland
Laura Gustafson
Eric Mintun
Junting Pan
Kalyan Vasudev Alwala
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
    VLM, MLLM

Papers citing "SAM 2: Segment Anything in Images and Videos"

50 / 863 papers shown
Masquerade: Learning from In-the-wild Human Videos using Data-Editing
Marion Lepert
Jiaying Fang
Jeannette Bohg
VGen
220
12
0
13 Aug 2025
A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation
Shuting He
Peilin Ji
Yitong Yang
Changshuo Wang
Jiayi Ji
Yinglin Wang
Henghui Ding
3DGS
312
10
0
13 Aug 2025
Designing Memory-Augmented AR Agents for Spatiotemporal Reasoning in Personalized Task Assistance
Dongwook Choi
Taeyoon Kwon
Dongil Yang
Hyojun Kim
Jinyoung Yeo
159
0
0
12 Aug 2025
HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis
Timo Teufel
Pulkit Gera
Xilong Zhou
Umar Iqbal
Pramod Rao
Jan Kautz
Vladislav Golyanik
Christian Theobalt
3DH
172
4
0
12 Aug 2025
Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
Haoran Wang
Zekun Li
Jian Zhang
Lei Qi
Yinghuan Shi
VOS, VGen
214
0
0
11 Aug 2025
ReferSplat: Referring Segmentation in 3D Gaussian Splatting
Shuting He
Guangquan Jie
Changshuo Wang
Yun Zhou
Shuming Hu
Guanbin Li
Henghui Ding
3DGS
179
6
0
11 Aug 2025
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
Joonghyuk Shin
Alchan Hwang
Yujin Kim
Daneul Kim
Jaesik Park
DiffM
123
4
0
11 Aug 2025
NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction
Tianle Zeng
Junlei Hu
Gerardo Loza Galindo
Sharib Ali
Duygu Sarikaya
Pietro Valdastri
Dominic Jones
92
0
0
11 Aug 2025
SAGOnline: Segment Any Gaussians Online
Wentao Sun
Quanyun Wu
Hanqing Xu
Kyle Gao
Zhengsen Xu
Yiping Chen
Dedong Zhang
Lingfei Ma
John S. Zelek
Jonathan Li
3DGS
268
1
0
11 Aug 2025
OctreeNCA: Single-Pass 184 MP Segmentation on Consumer Hardware
Nick Lemke
John Kalkhof
Niklas Babendererde
Anirban Mukhopadhyay
85
0
0
09 Aug 2025
CannyEdit: Selective Canny Control and Dual-Prompt Guidance for Training-Free Image Editing
Weiyan Xie
Han Gao
Didan Deng
Kaican Li
April Hua Liu
Yongxiang Huang
Nevin L. Zhang
DiffM
206
0
0
09 Aug 2025
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Huimin Wu
Xiaojian Ma
Haozhe Zhao
Yanpeng Zhao
Qing Li
DiffM
153
2
0
08 Aug 2025
F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
L. Chen
Zhiying Wu
Tianye Lei
Xuexue Bai
Ming Feng
Yuxi Wang
Gaofeng Meng
Zhen Lei
Hongbin Liu
MedIm
101
1
0
07 Aug 2025
Segmenting the Complex and Irregular in Two-Phase Flows: A Real-World Empirical Study with SAM2
Semanur Küçük
Cosimo Della Santina
Angeliki Laskari
51
0
0
07 Aug 2025
MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes
Henghui Ding
Kaining Ying
Chang-rui Liu
Shuting He
Xudong Jiang
Yu-Gang Jiang
Juil Sock
Song Bai
VOS
352
24
0
07 Aug 2025
Segment Any Vehicle: Semantic and Visual Context Driven SAM and A Benchmark
Xiao Wang
Ziwen Wang
Wentao Wu
Anjie Wang
Jiashu Wu
Yantao Pan
Chenglong Li
VLM
198
0
0
06 Aug 2025
SAM2-UNeXT: An Improved High-Resolution Baseline for Adapting Foundation Models to Downstream Segmentation Tasks
Xinyu Xiong
Zihuang Wu
L. Zhang
Lei Lu
Ming-hui Li
Guanbin Li
191
2
0
05 Aug 2025
ActionSink: Toward Precise Robot Manipulation with Dynamic Integration of Action Flow
Shanshan Guo
Xiwen Liang
Junfan Lin
Yuzheng Zhuang
Guanbin Li
Xiaodan Liang
162
1
0
05 Aug 2025
Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing
Hongyu Shen
Junfeng Ni
Yixin Chen
Weishuo Li
Mingtao Pei
Siyuan Huang
3DGS
161
8
0
05 Aug 2025
H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction
Heng Jia
Linchao Zhu
Na Zhao
3DGS
190
0
0
05 Aug 2025
Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach
Yifan Liao
Yuxin Cao
Yedi Zhang
Wentao He
Yan Xiao
Xianglong Du
Zhiyong Huang
Jin Song Dong
AAML
141
4
0
04 Aug 2025
DreamPainter: Image Background Inpainting for E-commerce Scenarios
Sijie Zhao
Jing Cheng
Yaoyao Wu
Hao Xu
Shaohui Jiao
DiffM
114
0
0
04 Aug 2025
Multimodal Referring Segmentation: A Survey
Henghui Ding
Song Tang
Shuting He
Chang-rui Liu
Zuxuan Wu
Yu-Gang Jiang
395
11
0
01 Aug 2025
SDMatte: Grafting Diffusion Models for Interactive Matting
Daigang Xu
Yu Liang
H. Zhang
Jinwei Chen
Wei Dong
L. Chen
Wanyu Liu
Bo Li
P. Jiang
DiffM
251
2
0
01 Aug 2025
Video Color Grading via Look-Up Table Generation
Seunghyun Shin
Dongmin Shin
Jisu Shin
Hae-Gon Jeon
Joon-Young Lee
DiffM, VGen
123
1
0
01 Aug 2025
Omni-Scan: Creating Visually-Accurate Digital Twin Object Models Using a Bimanual Robot with Handover and Gaussian Splat Merging
Tianshuang Qiu
Zehan Ma
Karim El-Refai
Hiya Shah
Chung Min Kim
Justin Kerr
Ken Goldberg
3DGS
182
2
0
01 Aug 2025
SAMSA 2.0: Prompting Segment Anything with Spectral Angles for Hyperspectral Interactive Medical Image Segmentation
Alfie Roddan
Tobias Czempiel
Chi Xu
Daniel Elson
Stamatia Giannarou
VLM
126
0
0
01 Aug 2025
Semantic and Temporal Integration in Latent Diffusion Space for High-Fidelity Video Super-Resolution
Yiwen Wang
Xinning Chai
Yuhong Zhang
Zhengxue Cheng
Jun Zhao
Rong Xie
Li Song
DiffM, VGen
126
0
0
01 Aug 2025
AniMer+: Unified Pose and Shape Estimation Across Mammalia and Aves via Family-Aware Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Jin Lyu
Liang An
Li Lin
Pujin Cheng
Yebin Liu
Xiaoying Tang
170
0
0
01 Aug 2025
Fine-grained Spatiotemporal Grounding on Egocentric Videos
Shuo Liang
Yiwu Zhong
Zi-Yuan Hu
Yeyao Tao
Liwei Wang
EgoV
288
5
0
01 Aug 2025
Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting
Seunggeun Chi
Enna Sachdeva
Pin-Hao Huang
Kwonjoon Lee
DiffM
123
2
0
01 Aug 2025
Towards Affordable Tumor Segmentation and Visualization for 3D Breast MRI Using SAM2
Solha Kang
Eugene Kim
J. Vankerschaver
Utku Ozbulak
137
0
0
31 Jul 2025
Enhanced Velocity Field Modeling for Gaussian Video Reconstruction
Zhenyang Li
Xiaoyang Bai
Tongchen Zhang
Pengfei Shen
Weiwei Xu
Yifan Peng
3DGS
190
0
0
31 Jul 2025
RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping
Dongming Wu
Yanping Fu
Saike Huang
Yingfei Liu
Fan Jia
...
Feng Dai
Tiancai Wang
Rao Muhammad Anwer
Fahad Shahbaz Khan
Jianbing Shen
3DV
121
3
0
31 Jul 2025
SAMSA: Segment Anything Model Enhanced with Spectral Angles for Hyperspectral Interactive Medical Image Segmentation
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Alfie Roddan
Tobias Czempiel
Chi Xu
Daniel Elson
Stamatia Giannarou
VLM
93
1
0
31 Jul 2025
Robust and Efficient 3D Gaussian Splatting for Urban Scene Reconstruction
Zhensheng Yuan
Haozhi Huang
Zhen Xiong
Di Wang
Guanghua Yang
3DGS
145
2
0
30 Jul 2025
Beyond Rigid AI: Towards Natural Human-Machine Symbiosis for Interoperative Surgical Assistance
Lalithkumar Seenivasan
Jiru Xu
R. Soberanis-Mukul
Hao Ding
Grayson Byrd
Yu-Chun Ku
Jose L. Porras
M. Ishii
Mathias Unberath
117
0
0
30 Jul 2025
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying
Henghui Ding
Guangquan Jie
Yu Jiang
VOS
335
5
0
30 Jul 2025
Neural Multi-View Self-Calibrated Photometric Stereo without Photometric Stereo Cues
Xu Cao
Takafumi Taketomi
3DV
187
0
0
30 Jul 2025
HRVVS: A High-resolution Video Vasculature Segmentation Network via Hierarchical Autoregressive Residual Priors
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Xincheng Yao
Yijun Yang
Kangwei Guo
Ruiqiang Xiao
Haipeng Zhou
Haisu Tao
Zhiqiang Wang
Lei Zhu
VOS
238
0
0
30 Jul 2025
From Waveforms to Pixels: A Survey on Audio-Visual Segmentation
Jia Li
Yapeng Tian
VOS
219
2
0
29 Jul 2025
MOVE: Motion-Guided Few-Shot Video Object Segmentation
Kaining Ying
Hengrui Hu
Henghui Ding
VOS
244
3
0
29 Jul 2025
Semantic Segmentation of iPS Cells: Case Study on Model Complexity in Biomedical Imaging
Maoquan Zhang
Bisser Raytchev
Xiujuan Sun
VLM
128
0
0
29 Jul 2025
SAMITE: Position Prompted SAM2 with Calibrated Memory for Visual Object Tracking
Qianxiong Xu
Lanyun Zhu
Chenxi Liu
Guosheng Lin
Cheng Long
Ziyue Li
Rui Zhao
129
1
0
29 Jul 2025
RIS-LAD: A Benchmark and Model for Referring Low-Altitude Drone Image Segmentation
Kai Ye
YingShi Luan
Zhudi Chen
Guangyue Meng
Pingyang Dai
Liujuan Cao
197
0
0
28 Jul 2025
SAMwave: Wavelet-Driven Feature Enrichment for Effective Adaptation of Segment Anything Model
Saurabh Yadav
Avi Gupta
Koteswar Rao Jerripothula
VLM
168
0
0
27 Jul 2025
Latest Object Memory Management for Temporally Consistent Video Instance Segmentation
Seunghun Lee
Jiwan Seo
Minwoo Choi
Kiljoon Han
Jaehoon Jeong
Zane Durante
Ehsan Adeli
Sang Hyun Park
Sunghoon Im
VOS
220
1
0
26 Jul 2025
HumanSAM: Classifying Human-centric Forgery Videos in Human Spatial, Appearance, and Motion Anomaly
Chang Liu
Yunfan Ye
Fan Zhang
Q. Zhou
Yuchuan Luo
Zhiping Cai
253
2
0
26 Jul 2025
HQ-SMem: Video Segmentation and Tracking Using Memory Efficient Object Embedding With Selective Update and Self-Supervised Distillation Feedback
Elham Soltani Kazemi
Imad Eddine Toubal
Gani Rahmon
Jaired Collins
K. Palaniappan
VOS
196
0
0
25 Jul 2025
Object-centric Video Question Answering with Visual Grounding and Referring
Haochen Wang
Qirui Chen
Cilin Yan
Jiayin Cai
Xiaolong Jiang
Yao Hu
Weidi Xie
Stratis Gavves
MLLM, VOS
267
5
0
25 Jul 2025