2103.01353
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
1 March 2021
Francisco Rivera Valverde
Juana Valeria Hurtado
Abhinav Valada
Papers citing "There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge"
8 / 8 papers shown
Title
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
49
7
0
11 Nov 2024
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
32
7
0
23 Oct 2023
Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight
Yunhua Zhang
Hazel Doughty
Cees G. M. Snoek
VLM
26
0
0
05 Dec 2022
Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization
Yuanyuan Jiang
Jianqin Yin
Yonghao Dang
27
4
0
11 Oct 2022
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds
Wu Zheng
Ming-Hong Hong
Li Jiang
Chi-Wing Fu
3DPC
14
29
0
30 Jun 2022
Amodal Panoptic Segmentation
Rohit Mohan
Abhinav Valada
9
43
0
23 Feb 2022
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
192
204
0
23 Jan 2020
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016