SAM 2: Segment Anything in Images and Videos

International Conference on Learning Representations (ICLR), 2024
1 August 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
Tengyu Ma
Haitham Khedr
Roman Rädle
Chloe Rolland
Laura Gustafson
Eric Mintun
Junting Pan
Kalyan Vasudev Alwala
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
    VLM, MLLM

Papers citing "SAM 2: Segment Anything in Images and Videos"

50 / 863 papers shown
WS$^2$: Weakly Supervised Segmentation using Before-After Supervision in Waste Sorting
Andrea Marelli
Alberto Foresti
Leonardo Pesce
Giacomo Boracchi
Mario Grosso
118
0
0
08 Sep 2025
Co-Seg: Mutual Prompt-Guided Collaborative Learning for Tissue and Nuclei Segmentation
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Qing Xu
Wenting Duan
Daming Gao
156
2
0
08 Sep 2025
GELATO: Multi-Instruction Trajectory Reshaping via Geometry-Aware Multiagent-based Orchestration
Junhui Huang
Yuhe Gong
Changsheng Li
Xingguang Duan
Luis F. C. Figueredo
148
0
0
07 Sep 2025
MonoGlass3D: Monocular 3D Glass Detection with Plane Regression and Adaptive Feature Fusion
Kai Zhang
Guoyang Zhao
Jianxing Shi
B. Liu
Weiqing Qi
Jun Ma
126
0
0
06 Sep 2025
Enhancing Self-Driving Segmentation in Adverse Weather Conditions: A Dual Uncertainty-Aware Training Approach to SAM Optimization
Dharsan Ravindran
Kevin Wang
Zhuoyuan Cao
Saleh Abdelrahman
Jeffery Wu
111
0
0
05 Sep 2025
PAOLI: Pose-free Articulated Object Learning from Sparse-view Images
Jianning Deng
Kartic Subr
Hakan Bilen
OCL
248
0
0
04 Sep 2025
SLENet: A Guidance-Enhanced Network for Underwater Camouflaged Object Detection
Xinxin Huang
Han Sun
Ningzhong Liu
Huiyu Zhou
Yinan Yao
148
1
0
04 Sep 2025
Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data
Honglu Zhou
Xiangyu Peng
Shrikant B. Kendre
Michael S Ryoo
Silvio Savarese
Caiming Xiong
Juan Carlos Niebles
130
1
0
03 Sep 2025
PixFoundation 2.0: Do Video Multi-Modal LLMs Use Motion in Visual Grounding?
Mennatullah Siam
VGen
118
0
0
02 Sep 2025
Scalable Option Learning in High-Throughput Environments
Mikael Henaff
Scott Fujimoto
Michael Matthews
Michael Rabbat
OffRL
208
1
0
30 Aug 2025
Visually Grounded Narratives: Reducing Cognitive Burden in Researcher-Participant Interaction
Runtong Wu
Jiayao Song
Fei Teng
Xianhao Ren
Yuyan Gao
Kailun Yang
144
0
0
30 Aug 2025
DGL-RSIS: Decoupling Global Spatial Context and Local Class Semantics for Training-Free Remote Sensing Image Segmentation
Boyi Li
Ce Zhang
Richard M. Timmerman
Wenxuan Bao
114
0
0
30 Aug 2025
3D-LATTE: Latent Space 3D Editing from Textual Instructions
Maria Parelli
Michael Oechsle
Michael Niemeyer
Federico Tombari
Andreas Geiger
DiffM
300
2
0
29 Aug 2025
SPGrasp: Spatiotemporal Prompt-driven Grasp Synthesis in Dynamic Scenes
Yunpeng Mei
Hongjie Cao
Yinqiu Xia
Wei Xiao
Zhaohan Feng
Gang Wang
Jie Chen
157
0
0
28 Aug 2025
Dino U-Net: Exploiting High-Fidelity Dense Features from Foundation Models for Medical Image Segmentation
Yifan Gao
Haoyue Li
Feng Yuan
Xiaosong Wang
Xin Gao
MedIm, AI4CE
118
4
0
28 Aug 2025
Generalizable Object Re-Identification via Visual In-Context Prompting
Zhizhong Huang
Xiaoming Liu
97
3
0
28 Aug 2025
Color Bind: Exploring Color Perception in Text-to-Image Models
Shay Shomer Chai
Wenxuan Peng
Bharath Hariharan
Hadar Averbuch-Elor
DiffM
213
1
0
27 Aug 2025
FreeVPS: Repurposing Training-Free SAM2 for Generalizable Video Polyp Segmentation
Qiang Hu
Ying Zhou
Gepeng Ji
Nick Barnes
Qiang Li
Zhiwei Wang
153
0
0
27 Aug 2025
SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control
Quanfeng Lu
Zhantao Ma
Shuai Zhong
Jin Wang
Dahai Yu
Michael K. Ng
Ping Luo
220
0
0
27 Aug 2025
ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
Shreya Gummadi
M. V. Gasparino
Gianluca Capezzuto
Marcelo Becker
Girish Chowdhary
151
0
0
26 Aug 2025
Autoregressive Universal Video Segmentation Model
Miran Heo
Sukjun Hwang
Min-Hung Chen
Y. Wang
Albert Gu
Seon Joo Kim
Ryo Hachiuma
VOS
242
1
0
26 Aug 2025
ArgusCogito: Chain-of-Thought for Cross-Modal Synergy and Omnidirectional Reasoning in Camouflaged Object Segmentation
Jianwen Tan
H. Zhang
Rui Xiong
Han Zhou
Hongfei Wang
Ye Li
LRM
140
0
0
25 Aug 2025
SafeBimanual: Diffusion-based Trajectory Optimization for Safe Bimanual Manipulation
Haoyuan Deng
Wenkai Guo
Qianzhun Wang
Zhenyu Wu
Ziwei Wang
116
0
0
25 Aug 2025
Quickly Tuning Foundation Models for Image Segmentation
Breenda Das
Lennart Purucker
Timur Carstensen
Frank Hutter
MLLM, VLM
140
0
0
24 Aug 2025
LodeStar: Long-horizon Dexterity via Synthetic Data Augmentation from Human Demonstrations
Weikang Wan
Jiawei Fu
Xiaodi Yuan
Yifeng Zhu
Hao Su
161
3
0
24 Aug 2025
WebSight: A Vision-First Architecture for Robust Web Agents
Tanvir Bhathal
Asanshay Gupta
LRM
134
2
0
23 Aug 2025
NeuralMeshing: Complete Object Mesh Extraction from Casual Captures
Floris Erich
Naoya Chiba
Abdullah Mustafa
Ryo Hanai
Noriaki Ando
Yusuke Yoshiyasu
Y. Domae
144
0
0
22 Aug 2025
Seeing Clearly, Forgetting Deeply: Revisiting Fine-Tuned Video Generators for Driving Simulation
Chun-Peng Chang
Chen-Yu Wang
Julian Schmidt
Holger Caesar
A. Pagani
VGen
265
1
0
22 Aug 2025
Towards Open World Detection: A Survey
Andrei-Stefan Bulzan
Cosmin Cernazanu-Glavan
ObjD, VLM
220
0
0
22 Aug 2025
Self-Validated Learning for Particle Separation: A Correctness-Based Self-Training Framework Without Human Labels
Philipp D. Lösel
Aleese Barron
Yulai Zhang
Matthias Fabian
Benjamin Young
Nicolas Francois
Andrew M. Kingston
112
0
0
22 Aug 2025
Lang2Lift: A Framework for Language-Guided Pallet Detection and Pose Estimation Integrated in Autonomous Outdoor Forklift Operation
Huy Hoang Nguyen
Johannes Huemer
Markus Murschitz
Tobias Glueck
Minh Nhat Vu
Andreas Kugi
104
0
0
21 Aug 2025
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
Yanxu Meng
Haoning Wu
Ya Zhang
Weidi Xie
VGen
393
9
0
21 Aug 2025
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
Zhiheng Liu
XueQing Deng
Shoufa Chen
Angtian Wang
Qiushan Guo
Mingfei Han
Zeyue Xue
M. Chen
Ping Luo
Linjie Yang
DiffM, VGen
171
5
0
21 Aug 2025
WeedSense: Multi-Task Learning for Weed Segmentation, Height Estimation, and Growth Stage Classification
Toqi Tahamid Sarker
Khaled R Ahmed
Taminul Islam
Cristiana Bernardi Rankrape
Karla Gage
114
3
0
20 Aug 2025
GaussianArt: Unified Modeling of Geometry and Motion for Articulated Objects
Licheng Shen
Saining Zhang
Honghan Li
Peilin Yang
Zihao Huang
Zongzheng Zhang
Hao Zhao
3DGS, 3DV
207
4
0
20 Aug 2025
RynnEC: Bringing MLLMs into Embodied World
Ronghao Dang
Yuqian Yuan
Yunxuan Mao
Kehan Li
Jiangpin Liu
Zhikai Wang
Xin Li
F. Wang
Deli Zhao
VGen, LM&Ro
216
6
0
19 Aug 2025
Train Once, Deploy Anywhere: Realize Data-Efficient Dynamic Object Manipulation
Zhuoling Li
Xiaoyang Wu
Zhenhua Xu
Hengshuang Zhao
122
1
0
19 Aug 2025
subCellSAM: Zero-Shot (Sub-)Cellular Segmentation for Hit Validation in Drug Discovery
Jacob Hanimann
Daniel Siegismund
Mario Wieser
Stephan Steigele
VLM
111
0
0
19 Aug 2025
MR6D: Benchmarking 6D Pose Estimation for Mobile Robots
Anas Gouda
Shrutarv Awasthi
Christian Blesing
Lokeshwaran Manohar
Frank Hoffmann
Alice Kirchheim
139
0
0
19 Aug 2025
Unleashing Semantic and Geometric Priors for 3D Scene Completion
Shiyuan Chen
Wei Sui
Bohao Zhang
Zeyd Boukhers
John See
Cong Yang
131
1
0
19 Aug 2025
Odo: Depth-Guided Diffusion for Identity-Preserving Body Reshaping
Siddharth Khandelwal
Sridhar Kamath
Arjun Jain
DiffM
214
0
0
18 Aug 2025
Precise Action-to-Video Generation Through Visual Action Prompts
Yuang Wang
Chao Wen
Haoyu Guo
Sida Peng
Minghan Qin
Hujun Bao
Xiaowei Zhou
Ruizhen Hu
VGen
147
4
0
18 Aug 2025
AIM 2025 Rip Current Segmentation (RipSeg) Challenge Report
Andrei Dumitriu
Florin Miron
Florin Tatui
Radu Tudor Ionescu
Radu Timofte
...
Puhua Chen
Xu Liu
Jin Hu
Jinyang Xu
Biao Liu
241
0
0
18 Aug 2025
SIS-Challenge: Event-based Spatio-temporal Instance Segmentation Challenge at the CVPR 2025 Event-based Vision Workshop
Friedhelm Hamann
Emil Mededovic
Fabian Gülhan
Yuli Wu
Johannes Stegmaier
...
Kanghan Oh
Gi Hyun Lim
Boxuan Yang
Bowen Du
Guillermo Gallego
ISeg
219
0
0
18 Aug 2025
DynamicPose: Real-time and Robust 6D Object Pose Tracking for Fast-Moving Cameras and Objects
Tingbang Liang
Yixin Zeng
Jiatong Xie
Boyu Zhou
127
0
0
16 Aug 2025
Remove360: Benchmarking Residuals After Object Removal in 3D Gaussian Splatting
Simona Kocour
Assia Benbihi
Torsten Sattler
3DPC
131
0
0
15 Aug 2025
LEARN: A Story-Driven Layout-to-Image Generation Framework for STEM Instruction
Maoquan Zhang
Bisser Raytchev
Xiujuan Sun
DiffM
95
0
0
15 Aug 2025
Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception
Junjie Wang
Keyu Chen
Yulin Li
Bin Chen
Hengshuang Zhao
Xiaojuan Qi
Zhuotao Tian
CLIP, VLM
142
1
0
15 Aug 2025
Privacy-enhancing Sclera Segmentation Benchmarking Competition: SSBC 2025
Matej Vitek
Darian Tomašević
Abhijit Das
Sabari Nathan
Gökhan Özbulak
...
Raghavendra Ramachandra
Aditya Nigam
Umapada Pal
Peter Peer
Vitomir Štruc
148
0
0
14 Aug 2025
Towards Agentic AI for Multimodal-Guided Video Object Segmentation
Tuyen Tran
T. Hoang Ngan Le
Truyen Tran
VOS
184
0
0
14 Aug 2025