Self-Supervised Generation of Spatial Audio for 360 Video
arXiv: 1809.02587

7 September 2018
Pedro Morgado
Nuno Vasconcelos
Timothy R. Langlois
Oliver Wang
    MDE

Papers citing "Self-Supervised Generation of Spatial Audio for 360 Video"

50 / 117 papers shown
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
Wenxiang Guo
Changhao Pan
Zhiyuan Zhu
Xintong Hu
Yu Zhang
...
Z. Chen
Yanhao Yu
Qiange Huang
Fei Wu
Zhou Zhao
163
0
0
12 Oct 2025
StereoSync: Spatially-Aware Stereo Audio Generation from Video
Christian Marinoni
R. F. Gramaccioni
Kazuki Shimada
Takashi Shibuya
Yuki Mitsufuji
Danilo Comminiello
DiffM
VGen
74
2
0
07 Oct 2025
Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment
Y. Liu
Shaofan Yang
Kai Li
Xu Li
77
1
0
26 Sep 2025
Lightweight Implicit Neural Network for Binaural Audio Synthesis
Xikun Lu
Fang Liu
Weizhi Shi
Jinqiu Sang
92
0
0
17 Sep 2025
Deep Learning for Personalized Binaural Audio Reproduction
Xikun Lu
Yunda Chen
Zehua Chen
Jie Wang
Mingxing Liu
Hongmei Hu
C. Zheng
Stefan Bleeck
Jinqiu Sang
120
2
0
30 Aug 2025
Spherical Vision Transformers for Audio-Visual Saliency Prediction in 360-Degree Videos
Mert Cokelek
Halit Ozsoy
Nevrez Imamoglu
C. Ozcinar
Inci Ayhan
Erkut Erdem
Aykut Erdem
MDE
116
1
0
27 Aug 2025
ASAudio: A Survey of Advanced Spatial Audio Research
Zhiyuan Zhu
Yu Zhang
Wenxiang Guo
Changhao Pan
Zhou Zhao
121
1
0
08 Aug 2025
ViSAGe: Video-to-Spatial Audio Generation
International Conference on Learning Representations (ICLR), 2025
Jaeyeon Kim
Heeseung Yun
Gunhee Kim
VGen
165
9
0
13 Jun 2025
Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation
Theodore Barfoot
Luis C. Garcia-Peraza-Herrera
Samet Akcay
Ben Glocker
Tom Vercauteren
UQCV
333
0
0
04 Jun 2025
In-the-wild Audio Spatialization with Flexible Text-guided Localization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tianrui Pan
J. Tang
Longxiang Zhang
Jie Tang
Gangshan Wu
141
2
0
01 Jun 2025
Learning to Highlight Audio by Watching Movies
Computer Vision and Pattern Recognition (CVPR), 2025
Chao Huang
Ruohan Gao
J. M. F. Tsang
Jan Kurcius
Cagdas Bilen
Chenliang Xu
Anurag Kumar
Sanjeel Parekh
VGen
217
3
0
17 May 2025
Differentiable Room Acoustic Rendering with Multi-View Vision Priors
Derong Jin
Ruohan Gao
247
2
0
30 Apr 2025
OmniAudio: Generating Spatial Audio from 360-Degree Video
Huadai Liu
Tianyi Luo
Qikai Jiang
Kaicheng Luo
Peiwen Sun
...
Xin Li
Shiliang Zhang
Zhijie Yan
Zhou Zhao
Wei Xue
VGen
372
10
0
21 Apr 2025
Hearing Anywhere in Any Environment
Computer Vision and Pattern Recognition (CVPR), 2025
Xiulong Liu
Anurag Kumar
P. Calamia
Sebastia V. Amengual
Calvin Murdock
Ishwarya Ananthabhotla
Philip Robinson
Eli Shlizerman
V. Ithapu
Ruohan Gao
218
6
0
14 Apr 2025
AV-Surf: Surface-Enhanced Geometry-Aware Novel-View Acoustic Synthesis
Hadam Baek
Hannie Shin
Jiyoung Seo
Chanwoo Kim
Saerom Kim
Hyeongbok Kim
Sangpil Kim
167
1
0
17 Mar 2025
Aligning Audio-Visual Joint Representations with an Agentic Workflow
Neural Information Processing Systems (NeurIPS), 2024
Shentong Mo
Yibing Song
201
2
0
30 Oct 2024
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Saksham Singh Kushwaha
Jianbo Ma
Mark R. P. Thomas
Yapeng Tian
Avery Bruni
111
7
0
15 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Neural Information Processing Systems (NeurIPS), 2024
Rory Young
Nicolas Pugeault
AAML
298
20
0
14 Oct 2024
End-to-end multi-channel speaker extraction and binaural speech synthesis
Cheng Chi
Xiaoyu Li
Andong Li
Yuxuan Ke
Yao Ge
Xiaodong Li
C. Zheng
117
0
0
08 Oct 2024
Self-Supervised Audio-Visual Soundscape Stylization
European Conference on Computer Vision (ECCV), 2024
Tingle Li
Renhao Wang
Po-Yao Huang
Andrew Owens
Gopala Anumanchipalli
DiffM
SSL
219
7
0
22 Sep 2024
Multi-scale Multi-instance Visual Sound Localization and Segmentation
Shentong Mo
Haofan Wang
220
3
0
31 Aug 2024
How Does Audio Influence Visual Attention in Omnidirectional Videos? Database and Model
IEEE Transactions on Image Processing (TIP), 2024
Yuxin Zhu
Huiyu Duan
Kaiwei Zhang
Yucheng Zhu
Xilei Zhu
Long Teng
Xiongkuo Min
Guangtao Zhai
203
6
0
10 Aug 2024
Audio-visual Generalized Zero-shot Learning the Easy Way
Shentong Mo
Pedro Morgado
207
7
0
18 Jul 2024
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang
Dejan Marković
Chenliang Xu
Alexander Richard
214
12
0
18 Jul 2024
Semantic Grouping Network for Audio Source Separation
Shentong Mo
Yapeng Tian
178
5
0
04 Jul 2024
SOAF: Scene Occlusion-aware Neural Acoustic Field
Huiyu Gao
Jiahao Ma
David Ahmedt-Aristizabal
Chuong H. Nguyen
Miaomiao Liu
340
5
0
02 Jul 2024
Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
Neural Information Processing Systems (NeurIPS), 2024
Ning-Hsu Wang
Yu-Lun Liu
MDE
212
19
0
18 Jun 2024
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
Swapnil Bhosale
Haosen Yang
Helen Treharne
Jiankang Deng
Xiatian Zhu
283
10
0
13 Jun 2024
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Rishit Dagli
Shivesh Prakash
Robert Wu
H. Khosravani
305
14
0
06 Jun 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
349
15
0
20 May 2024
Unified Video-Language Pre-training with Synchronized Audio
Shentong Mo
Haofan Wang
Huaxia Li
Xu Tang
228
2
0
12 May 2024
MIMOSA: Human-AI Co-Creation of Computational Spatial Audio Effects on Videos
Zheng Ning
Zheng Zhang
Jerrick Ban
Kaiwen Jiang
Ruohong Gan
Yapeng Tian
Tao Li
VGen
89
9
0
23 Apr 2024
Interpreting End-to-End Deep Learning Models for Speech Source Localization Using Layer-wise Relevance Propagation
European Signal Processing Conference (EUSIPCO), 2024
Luca Comanducci
Fabio Antonacci
Augusto Sarti
119
1
0
04 Apr 2024
Text-to-Audio Generation Synchronized with Videos
Shentong Mo
Jing Shi
Yapeng Tian
DiffM
VGen
155
26
0
08 Mar 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
183
5
0
21 Dec 2023
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Computer Vision and Pattern Recognition (CVPR), 2023
Shentong Mo
Pedro Morgado
206
27
0
02 Dec 2023
Weakly-Supervised Audio-Visual Segmentation
Neural Information Processing Systems (NeurIPS), 2023
Shentong Mo
Bhiksha Raj
VOS
231
18
0
25 Nov 2023
Cross-modal Generative Model for Visual-Guided Binaural Stereo Generation
Knowledge-Based Systems (KBS), 2023
Zhaojian Li
Jiangwei Zhong
Yuan Yuan
182
9
0
13 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Neural Information Processing Systems (NeurIPS), 2023
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
113
8
0
01 Nov 2023
LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Yuxin Ye
Wenming Yang
Yapeng Tian
169
12
0
31 Oct 2023
Audio-Visual Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Wenzhen Yue
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM
VOS
285
11
0
28 Oct 2023
Measuring Acoustics with Collaborative Multiple Agents
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yinfeng Yu
Changan Chen
Lele Cao
Fangkai Yang
Gang Hua
198
7
0
09 Oct 2023
Class-Incremental Grouping Network for Continual Audio-Visual Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Shentong Mo
Weiguo Pian
Yapeng Tian
CLL
VLM
144
31
0
11 Sep 2023
PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data
ACM Symposium on User Interface Software and Technology (UIST), 2023
Zheng Zhang
Zheng Ning
Chenliang Xu
Yapeng Tian
Toby Jia-Jun Li
211
11
0
27 Jul 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
273
7
0
10 Jul 2023
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Computer Vision and Pattern Recognition (CVPR), 2023
Samuel Clarke
Ruohan Gao
Mason Wang
M. Rau
Julia Xu
Jui-Hsien Wang
Doug L. James
Jiajun Wu
193
13
0
16 Jun 2023
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
IEEE International Conference on Robotics and Automation (ICRA), 2023
Ruohan Gao
Hao Li
Gokul Dharan
Zhuzhu Wang
Chengshu Li
Fei Xia
Silvio Savarese
Li Fei-Fei
Jiajun Wu
289
14
0
01 Jun 2023
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
International Conference on Machine Learning (ICML), 2023
Shentong Mo
Pedro Morgado
162
25
0
30 May 2023
DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment
Shentong Mo
Jing Shi
Yapeng Tian
100
17
0
22 May 2023
A Comprehensive Survey on Segment Anything Model for Vision and Beyond
Chunhui Zhang
Li Liu
Yawen Cui
Guanjie Huang
Weilin Lin
Yiqian Yang
Yuehong Hu
VLM
316
127
0
14 May 2023