2309.10787
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
19 September 2023
Yuan Tseng
Layne Berry
Yi-Ting Chen
I-Hsiang Chiu
Hsuan-Hao Lin
Max Liu
Puyuan Peng
Yi-Jen Shih
Hung-Yu Wang
Haibin Wu
Po-Yao Huang
Chun-Mao Lai
Shang-Wen Li
David F. Harwath
Yu Tsao
Shinji Watanabe
Abdel-rahman Mohamed
Chi-Luen Feng
Hung-yi Lee
VLM
SSL
Papers citing
"AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models"
5 / 5 papers shown
Title
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
49
20
0
27 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
5,353
0
11 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
212
682
0
13 Oct 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
206
1,954
0
14 Jun 2018
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
138
782
0
16 Nov 2016