2307.04760
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
10 July 2023
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
Papers citing
"Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos"
7 / 7 papers shown
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
46
8
0
20 May 2024
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
224
1,017
0
13 Oct 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Bernard Ghanem
55
51
0
11 Jan 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
196
0
08 Jan 2021
Audio-Visual Floorplan Reconstruction
Senthil Purushwalkam
S. V. A. Garí
V. Ithapu
Carl Schissler
Philip Robinson
Abhinav Gupta
Kristen Grauman
VGen
3DV
60
41
0
31 Dec 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao
Changan Chen
Ziad Al-Halah
Carl Schissler
Kristen Grauman
MDE
SSL
156
83
0
04 May 2020