2003.04210
Cited By
Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
European Conference on Computer Vision (ECCV), 2020
9 March 2020
A. Vasudevan
Dengxin Dai
Luc Van Gool
ObjD
Papers citing
"Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds"
32 / 32 papers shown
ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation
Mengchen Zhang
Qi Chen
Tong Wu
Zihan Liu
Dahua Lin
VGen
255
2
0
02 Dec 2025
Clink! Chop! Thud! -- Learning Object Sounds from Real-World Interactions
Mengyu Yang
Yiming Chen
Haozheng Pei
Siddhant Agarwal
Arun Balajee Vasudevan
James Hays
158
0
0
02 Oct 2025
Deep Learning for Personalized Binaural Audio Reproduction
Xikun Lu
Yunda Chen
Zehua Chen
Jie Wang
Mingxing Liu
Hongmei Hu
C. Zheng
Stefan Bleeck
Jinqiu Sang
264
2
0
30 Aug 2025
ViSAGe: Video-to-Spatial Audio Generation
International Conference on Learning Representations (ICLR), 2025
Jaeyeon Kim
Heeseung Yun
Gunhee Kim
VGen
252
14
0
13 Jun 2025
Cross-Modal Urban Sensing: Evaluating Sound-Vision Alignment Across Street-Level and Aerial Imagery
Pengyu Chen
Xiao Huang
Teng Fei
Sicheng Wang
231
0
0
03 Jun 2025
OmniAudio: Generating Spatial Audio from 360-Degree Video
Huadai Liu
Tianyi Luo
Qikai Jiang
Kaicheng Luo
Peiwen Sun
...
Xin Li
Shiliang Zhang
Zhijie Yan
Zhou Zhao
Wei Xue
VGen
586
15
0
21 Apr 2025
HAVT-IVD: Heterogeneity-Aware Cross-Modal Network for Audio-Visual Surveillance: Idling Vehicles Detection With Multichannel Audio and Multiscale Visual Cues
Xiwen Li
Ross T. Whitaker
Tolga Tasdizen
425
0
0
15 Apr 2025
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
605
15
0
11 Nov 2024
Estimating Indoor Scene Depth Maps from Ultrasonic Echoes
IEEE International Conference on Image Processing (ICIP), 2024
Junpei Honma
Akisato Kimura
Go Irie
MDE
295
1
0
05 Sep 2024
Visual Prompt Selection for In-Context Learning Segmentation
Wei Suo
Lanqing Lai
Mengyang Sun
Hanwang Zhang
Peng Wang
Yanning Zhang
VLM
334
12
0
14 Jul 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
254
6
0
21 Dec 2023
Segment Beyond View: Handling Partially Missing Modality for Audio-Visual Semantic Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Renjie Wu
Hu Wang
Feras Dayoub
Hsiang-Ting Chen
291
11
0
14 Dec 2023
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
796
10
0
23 Oct 2023
The Un-Kidnappable Robot: Acoustic Localization of Sneaking People
IEEE International Conference on Robotics and Automation (ICRA), 2023
Mengyu Yang
Patrick Grady
Samarth Brahmbhatt
Arun Balajee Vasudevan
Charles C. Kemp
James Hays
502
1
0
05 Oct 2023
The Audio-Visual BatVision Dataset for Research on Sight and Sound
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Amandine Brunetto
Sascha Hornauer
Stella X. Yu
Fabien Moutarde
364
7
0
13 Mar 2023
Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research
Conference on Visual Media Production (VMP), 2022
Davide Berghi
M. Volino
Philip J. B. Jackson
VGen
206
6
0
04 Dec 2022
Estimating Visual Information From Audio Through Manifold Learning
Fabrizio Pedersoli
Dryden Wiebe
A. Banitalebi
Yong Zhang
George Tzanetakis
K. M. Yi
SSL
371
9
0
03 Aug 2022
Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision
Xiangjie Sui
Esa Rahtu
Hang Zhao
MDE
396
8
0
03 Jul 2022
Self-supervised Learning of Audio Representations from Audio-Visual Data using Spatial Alignment
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Shanshan Wang
Archontis Politis
A. Mesaros
Maria Sandsten
SSL
160
12
0
02 Jun 2022
Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
Hao Ai
Zidong Cao
Jin Zhu
Haotian Bai
Yucheng Chen
Ling Wang
402
45
0
21 May 2022
Invisible-to-Visible: Privacy-Aware Human Segmentation using Airborne Ultrasound via Collaborative Learning Probabilistic U-Net
Risako Tanigawa
Yasunori Ishii
Kazuki Kozuka
Takayoshi Yamashita
187
1
0
11 May 2022
Visually Supervised Speaker Detection and Localization via Microphone Array
IEEE International Workshop on Multimedia Signal Processing (MMSP), 2021
Davide Berghi
A. Hilton
Philip J. B. Jackson
241
11
0
07 Mar 2022
Sound and Visual Representation Learning with Multiple Pretraining Tasks
Computer Vision and Pattern Recognition (CVPR), 2022
A. Vasudevan
Dengxin Dai
Luc Van Gool
SSL
290
7
0
04 Jan 2022
Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
282
33
0
15 Nov 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos
IEEE International Conference on Computer Vision (ICCV), 2021
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
357
123
0
11 Oct 2021
ASOD60K: An Audio-Induced Salient Object Detection Dataset for Panoramic Videos
Yi Zhang
386
8
0
24 Jul 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Computer Vision and Pattern Recognition (CVPR), 2021
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
235
74
0
13 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Computer Vision and Pattern Recognition (CVPR), 2021
Yapeng Tian
Chenliang Xu
AAML
367
41
0
05 Apr 2021
Beyond Image to Depth: Improving Depth Prediction using Echoes
Computer Vision and Pattern Recognition (CVPR), 2021
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
382
43
0
15 Mar 2021
Capturing Omni-Range Context for Omnidirectional Segmentation
Computer Vision and Pattern Recognition (CVPR), 2021
Kailun Yang
Kailai Li
Simon Reiß
Xinxin Hu
Rainer Stiefelhagen
209
84
0
09 Mar 2021
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Computer Vision and Pattern Recognition (CVPR), 2021
Francisco Rivera Valverde
Juana Valeria Hurtado
Abhinav Valada
261
86
0
01 Mar 2021
Depth Estimation from Monocular Images and Sparse Radar Data
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
Juan Lin
Dengxin Dai
Luc Van Gool
MDE
271
90
0
30 Sep 2020