Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds

European Conference on Computer Vision (ECCV), 2020

9 March 2020

Luc Van Gool

Papers citing "Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds"

32 / 32 papers shown

ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

255

02 Dec 2025

Clink! Chop! Thud! -- Learning Object Sounds from Real-World Interactions

Arun Balajee Vasudevan

James Hays

158

02 Oct 2025

Deep Learning for Personalized Binaural Audio Reproduction

264

30 Aug 2025

ViSAGe: Video-to-Spatial Audio Generation

International Conference on Learning Representations (ICLR), 2025

252

13 Jun 2025

Cross-Modal Urban Sensing: Evaluating Sound-Vision Alignment Across Street-Level and Aerial Imagery

231

03 Jun 2025

OmniAudio: Generating Spatial Audio from 360-Degree Video

...

586

21 Apr 2025

HAVT-IVD: Heterogeneity-Aware Cross-Modal Network for Audio-Visual Surveillance: Idling Vehicles Detection With Multichannel Audio and Multiscale Visual Cues

Xiwen Li

Ross T. Whitaker

Tolga Tasdizen

425

15 Apr 2025

AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

605

11 Nov 2024

Estimating Indoor Scene Depth Maps from Ultrasonic Echoes

International Conference on Information Photonics (ICIP), 2024

295

05 Sep 2024

Visual Prompt Selection for In-Context Learning Segmentation

Peng Wang

334

14 Jul 2024

Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization

Davide Berghi

Philip J. B. Jackson

254

21 Dec 2023

Segment Beyond View: Handling Partially Missing Modality for Audio-Visual Semantic Segmentation

AAAI Conference on Artificial Intelligence (AAAI), 2023

Renjie Wu

Hu Wang

Feras Dayoub

Hsiang-Ting Chen

291

14 Dec 2023

Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions

796

23 Oct 2023

The Un-Kidnappable Robot: Acoustic Localization of Sneaking People

IEEE International Conference on Robotics and Automation (ICRA), 2023

Mengyu Yang

Patrick Grady

Samarth Brahmbhatt

Arun Balajee Vasudevan

Charles C. Kemp

James Hays

502

05 Oct 2023

The Audio-Visual BatVision Dataset for Research on Sight and Sound

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

364

13 Mar 2023

Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research

Conference on Visual Media Production (VMP), 2022

206

04 Dec 2022

Estimating Visual Information From Audio Through Manifold Learning

Yong Zhang

371

03 Aug 2022

Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision

Xiangjie Sui

Esa Rahtu

Hang Zhao

MDE

396

03 Jul 2022

Self-supervised Learning of Audio Representations from Audio-Visual Data using Spatial Alignment

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

160

02 Jun 2022

Deep Learning for Omnidirectional Vision: A Survey and New Perspectives

402

21 May 2022

Invisible-to-Visible: Privacy-Aware Human Segmentation using Airborne Ultrasound via Collaborative Learning Probabilistic U-Net

187

11 May 2022

Visually Supervised Speaker Detection and Localization via Microphone Array

IEEE International Workshop on Multimedia Signal Processing (MMSP), 2021

Davide Berghi

A. Hilton

Philip J. B. Jackson

241

07 Mar 2022

Sound and Visual Representation Learning with Multiple Pretraining Tasks

Computer Vision and Pattern Recognition (CVPR), 2022

A. Vasudevan

Dengxin Dai

Luc Van Gool

SSL

290

04 Jan 2022

Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021

282

15 Nov 2021

Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos

IEEE International Conference on Computer Vision (ICCV), 2021

357

123

11 Oct 2021

ASOD60K: An Audio-Induced Salient Object Detection Dataset for Panoramic Videos

Yi Zhang

386

24 Jul 2021

Visually Informed Binaural Audio Generation without Binaural Audios

Computer Vision and Pattern Recognition (CVPR), 2021

235

13 Apr 2021

Can audio-visual integration strengthen robustness under multimodal attacks?

Computer Vision and Pattern Recognition (CVPR), 2021

Yapeng Tian

Chenliang Xu

AAML

367

05 Apr 2021

Beyond Image to Depth: Improving Depth Prediction using Echoes

Computer Vision and Pattern Recognition (CVPR), 2021

382

15 Mar 2021

Capturing Omni-Range Context for Omnidirectional Segmentation

Computer Vision and Pattern Recognition (CVPR), 2021

Kailun Yang

209

09 Mar 2021

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

Computer Vision and Pattern Recognition (CVPR), 2021

Francisco Rivera Valverde

Juana Valeria Hurtado

Abhinav Valada

261

01 Mar 2021

Depth Estimation from Monocular Images and Sparse Radar Data

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020

Juan Lin

Dengxin Dai

Luc Van Gool

MDE

271

30 Sep 2020