arXiv: 2408.00714
SAM 2: Segment Anything in Images and Videos
International Conference on Learning Representations (ICLR), 2024
1 August 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
Tengyu Ma
Haitham Khedr
Roman Rädle
Chloe Rolland
Laura Gustafson
Eric Mintun
Junting Pan
Kalyan Vasudev Alwala
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
VLM
MLLM
HuggingFace (116 upvotes)
Papers citing "SAM 2: Segment Anything in Images and Videos"
50 / 861 papers shown
How Can Objects Help Video-Language Understanding?
Zitian Tang
Shijie Wang
Junho Cho
Jaewook Yoo
Chen Sun
356
3
0
10 Apr 2025
Are We Done with Object-Centric Learning?
Alexander Rubinstein
Christian Schroeder de Witt
Matthias Bethge
Seong Joon Oh
OCL
2.1K
3
0
09 Apr 2025
Few-Shot Adaptation of Grounding DINO for Agricultural Domain
Rajhans Singh
Rafael Bidese Puhl
Kshitiz Dhakal
Sudhir Sornapudi
309
3
0
09 Apr 2025
Falcon: Fractional Alternating Cut with Overcoming Minima in Unsupervised Segmentation
Xiao Zhang
Xiangyu Han
Xiwen Lai
Yao Sun
Pei Zhang
Konrad Kording
289
0
0
08 Apr 2025
HER-Seg: Holistically Efficient Segmentation for High-Resolution Medical Images
Qing Xu
Zhenye Lou
Chenxin Li
Xiangjian He
Rong Qu
Tesema Fiseha Berhanu
Yi Wang
Wenting Duan
Daming Gao
MedIm
284
0
0
08 Apr 2025
S^4M: Boosting Semi-Supervised Instance Segmentation with SAM
Heeji Yoon
Heeseong Shin
Eunbeen Hong
Hyunwook Choi
Hansang Cho
Daun Jeong
Seungryong Kim
237
1
0
07 Apr 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang
Jing Bi
Chao Huang
Susan Liang
Daiki Shimada
...
Jinxi He
Liu He
Zeliang Zhang
Jiebo Luo
Chenliang Xu
263
8
0
07 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Botian Shi
Chao Dong
Yihao Liu
MLLM
320
10
0
07 Apr 2025
CMaP-SAM: Contraction Mapping Prior for SAM-driven Few-shot Segmentation
Shuai Chen
Fanman Meng
Haoran Wei
Qi Wu
Linfeng Xu
Haoyang Li
Hongliang Li
270
0
0
07 Apr 2025
SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation
Junjie Jiang
Zelin Wang
Manqi Zhao
Yin Li
Dongsheng Jiang
718
13
0
06 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Yongqian Li
Yanhong Zeng
Yuwei Guo
Dahua Lin
Tianfan Xue
Bo Dai
VGen
263
5
0
05 Apr 2025
Performance Analysis of Deep Learning Models for Femur Segmentation in MRI Scan
Conference on Algebraic Informatics (AI), 2025
Mengyuan Liu
Yixiao Chen
Anning Tian
Xinmeng Wu
Mozhi Shen
Tianchou Gong
Jeongkyu Lee
MedIm
202
2
0
05 Apr 2025
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Chenyu Zhang
Daniil Cherniavskii
Antonios Tragoudaras
Antonios Vozikis
Thijmen Nijdam
Mark Bodracska
Andrii Zadaianchuk
E. Gavves
EGVM
VGen
294
13
0
03 Apr 2025
MG-Gen: Single Image to Motion Graphics Generation
Takahiro Shirakawa
Tomoyuki Suzuki
Takuto Narumoto
Daichi Haraguchi
VGen
605
0
0
03 Apr 2025
ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
Computer Vision and Pattern Recognition (CVPR), 2025
Jiayi Gao
Zijin Yin
Changcheng Hua
Yuxin Peng
Kongming Liang
Zhanyu Ma
Jiaxin Guo
Yang Liu
VGen
DiffM
317
7
0
03 Apr 2025
BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation
Van Nguyen Nguyen
Stephen Tyree
Andrew Guo
Mederic Fourmy
Anas Gouda
...
Stan Birchfield
Jiri Matas
Yann Labbé
M. Sundermeyer
Tomás Hodan
3DPC
691
15
0
03 Apr 2025
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Information Fusion (Inf. Fusion), 2025
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Yucheng Wang
290
4
0
02 Apr 2025
UnIRe: Unsupervised Instance Decomposition for Dynamic Urban Scene Reconstruction
Yunxuan Mao
R. Xiong
Longji Xu
Yiyi Liao
3DPC
1.0K
2
0
01 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
401
46
0
01 Apr 2025
Coca-Splat: Collaborative Optimization for Camera Parameters and 3D Gaussians
Jiamin Wu
Hongyang Li
Xiaoke Jiang
Xingtai Lv
Lei Zhang
3DGS
333
0
0
01 Apr 2025
Zero-Shot 4D Lidar Panoptic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Yushan Zhang
Aljosa Osep
Laura Leal-Taixé
Tim Meinhardt
3DPC
350
5
0
01 Apr 2025
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety
Computer Vision and Pattern Recognition (CVPR), 2025
Andrei Dumitriu
Florin Tatui
Florin Miron
Aakash Ralhan
Radu Tudor Ionescu
Radu Timofte
371
1
0
01 Apr 2025
PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification
Salim Khazem
Jérémy Fix
Cédric Pradalier
167
3
0
01 Apr 2025
ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos
IEEE International Conference on Robotics and Automation (ICRA), 2025
Junyao Shi
Zhuolun Zhao
Tianyou Wang
Ian Pedroza
Amy Luo
Jie Wang
Jason Ma
Dinesh Jayaraman
LM&Ro
287
13
0
31 Mar 2025
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
Xingyu Chen
Yue Chen
Yuliang Xiu
Andreas Geiger
Anpei Chen
3DPC
VGen
398
46
0
31 Mar 2025
Multi-Task Learning for Extracting Menstrual Characteristics from Clinical Notes
Anna Shopova
Christoph Lippert
Leslee J. Shaw
Eugenia Alleva
298
4
0
31 Mar 2025
SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency
Yanbo Wang
Yongtao Chen
Chuan Cao
Tianchen Deng
Wentao Zhao
Jingchuan Wang
Weidong Chen
374
5
0
31 Mar 2025
SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance
Suzanne Stathatos
Michael Hobley
Markus Marks
Pietro Perona
411
1
0
31 Mar 2025
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025
Tianming Liang
Haichao Jiang
Wei-Shi Zheng
Jian-Fang Hu
304
1
0
30 Mar 2025
EAP4EMSIG -- Enhancing Event-Driven Microscopy for Microfluidic Single-Cell Analysis
Nils Friederich
Angelo Jovin Yamachui Sitcheu
Annika Nassal
Erenus Yildiz
Matthias Pesch
...
D. Kohlheyer
Hanno Scharr
Johannes Seiffarth
K. Nöh
Ralf Mikut
280
0
0
30 Mar 2025
A GAN-Enhanced Deep Learning Framework for Rooftop Detection from Historical Aerial Imagery
International Journal of Remote Sensing (IJRS), 2025
Pengyu Chen
Sicheng Wang
Cuizhen Wang
Senrong Wang
Beiao Huang
Lu Huang
Zhe Zang
365
3
0
29 Mar 2025
Segment Any Motion in Videos
Computer Vision and Pattern Recognition (CVPR), 2025
Nan Huang
Wenzhao Zheng
Chenfeng Xu
Kurt Keutzer
Shanghang Zhang
Angjoo Kanazawa
Qianqian Wang
VOS
318
13
0
28 Mar 2025
Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
Yiren Lu
Yunlai Zhou
Yiran Qiao
Chaoda Song
Tuo Liang
Jing Ma
Huan Wang
Yu Yin
3DGS
266
3
0
28 Mar 2025
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
Hairong Yin
Huangying Zhan
Yi Tian Xu
Raymond A. Yeh
278
3
0
27 Mar 2025
A Unified Image-Dense Annotation Generation Model for Underwater Scenes
Computer Vision and Pattern Recognition (CVPR), 2025
Hongkai Lin
Dingkang Liang
Zhenghao Qi
X. Bai
DiffM
331
2
0
27 Mar 2025
Online Reasoning Video Segmentation with Just-in-Time Digital Twins
Yiqing Shen
Bohan Liu
Chenjia Li
Lalithkumar Seenivasan
Mathias Unberath
VOS
420
14
0
27 Mar 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Computer Vision and Pattern Recognition (CVPR), 2025
Shijie Zhou
Hui Ren
Yijia Weng
Shuwang Zhang
Zhen Wang
...
Zhiwen Fan
Suya You
Ziyi Wang
Leonidas Guibas
A. Kadambi
VGen
3DGS
369
5
0
26 Mar 2025
DINeMo: Learning Neural Mesh Models with no 3D Annotations
Weijie Guo
Guofeng Zhang
Wufei Ma
Jieneng Chen
3DH
371
0
0
26 Mar 2025
DynOPETs: A Versatile Benchmark for Dynamic Object Pose Estimation and Tracking in Moving Camera Scenarios
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Xiangting Meng
Jiaqi Yang
Mingshu Chen
C. Yan
Yujiao Shi
Wenchao Ding
L. Kneip
323
0
0
25 Mar 2025
Semi-SMD: Semi-Supervised Metric Depth Estimation via Surrounding Cameras for Autonomous Driving
Yusen Xie
Zhengmin Huang
Shaojie Shen
Jun Ma
417
1
0
25 Mar 2025
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang
Jinghao Li
Yu-Wing Tai
DiffM
555
6
0
25 Mar 2025
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Yuli Zhou
Guolei Sun
Yawei Li
Yuqian Fu
Luca Benini
Ender Konukoglu
344
4
0
25 Mar 2025
RP-SAM2: Refining Point Prompts for Stable Surgical Instrument Segmentation
Nuren Zhaksylyk
Ibrahim Almakky
Jay N. Paranjape
S. Vedula
S. Sikder
Vishal M. Patel
Mohammad Yaqub
314
1
0
25 Mar 2025
RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation
Chengbo Yuan
Suraj Joshi
Shaoting Zhu
Hang Su
Hang Zhao
Yang Gao
VGen
327
23
0
24 Mar 2025
OmnimatteZero: Fast Training-free Omnimatte with Pre-trained Video Diffusion Models
Dvir Samuel
Matan Levy
N. Darshan
Gal Chechik
Rami Ben-Ari
DiffM
376
0
0
23 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS
VLM
661
20
0
23 Mar 2025
Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook
Xu Zheng
Ziqiao Weng
Yuanhuiyi Lyu
Lutao Jiang
Haiwei Xue
Bin Ren
Danda Pani Paudel
Andrii Zadaianchuk
Luc Van Gool
Xuming Hu
3DV
380
25
0
23 Mar 2025
SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank Transformation
Abdelrahman Elsayed
Sarim Hashmi
Mohammed Elseiagy
Hu Wang
Mohammad Yaqub
Ibrahim Almakky
OOD
337
1
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Jingdong Sun
Zuxuan Wu
VGen
470
18
0
20 Mar 2025
M3: 3D-Spatial MultiModal Memory
International Conference on Learning Representations (ICLR), 2025
Xueyan Zou
Yuchen Song
Ri-Zhao Qiu
Xuanbin Peng
Jianglong Ye
Sifei Liu
Xiaolong Wang
3DGS
261
2
0
20 Mar 2025