arXiv:2406.06163
Extending Segment Anything Model into Auditory and Temporal Dimensions for Audio-Visual Segmentation
10 June 2024
Juhyeong Seon, Woobin Im, Sebin Lee, Jumin Lee, Sung-eui Yoon
Papers citing
"Extending Segment Anything Model into Auditory and Temporal Dimensions for Audio-Visual Segmentation"
CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation
Kexin Li, Zongxin Yang, Lei Chen, Yezhou Yang, Jun Xiao
VOS | 28 | 49 | 0 | 18 Sep 2023

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation
Shentong Mo, Yapeng Tian
VLM | 79 | 47 | 0 | 03 May 2023

Masked Autoencoders Are Scalable Vision Learners
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
ViT, TPM | 258 | 7,337 | 0 | 11 Nov 2021

Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius, Heng Wang, Lorenzo Torresani
ViT | 275 | 1,939 | 0 | 09 Feb 2021