2202.06406
Cited By
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI Conference on Artificial Intelligence (AAAI), 2022
13 February 2022
Xian Liu
Rui Qian
Hang Zhou
Di Hu
Weiyao Lin
Ziwei Liu
Bolei Zhou
Xiaowei Zhou
Github (29★)
Papers citing
"Visual Sound Localization in the Wild by Cross-Modal Interference Erasing"
21 / 21 papers shown
Learning from Silence and Noise for Visual Sound Source Localization
Xavier Juanola
G. Morais
Magdalena Fuentes
Gloria Haro
SSL
144
0
0
29 Aug 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
264
9
0
06 Mar 2025
AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation
IEEE Transactions on Multimedia (TMM), 2025
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Yifan Wang
Pingping Zhang
Lijun Wang
Huchuan Lu
Mamba
VOS
105
13
0
14 Jan 2025
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tianyu Yang
Yiyang Nan
Lisen Dai
Zhenwen Liang
Yapeng Tian
Wei Wei
280
1
0
07 Nov 2024
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xavier Juanola
Gloria Haro
Magdalena Fuentes
344
4
0
01 Oct 2024
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Hu Su
Zhiqing Wang
Yuhao Zhao
Wei Zou
Siyang Sun
Yun Zheng
SSL
213
16
0
05 Mar 2024
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Yuhao Zhao
Hu Su
Wei Zou
215
4
0
05 Mar 2024
STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment
International Conference on Machine Learning (ICML), 2023
Jaewoo Lee
Jaehong Yoon
Wonjae Kim
Yunji Kim
Sung Ju Hwang
CLL
275
1
0
12 Oct 2023
BAVS: Bootstrapping Audio-Visual Segmentation by Integrating Foundation Knowledge
IEEE Transactions on Multimedia (IEEE TMM), 2023
Chen Liu
Peike Li
Hu Zhang
Lincheng Li
Zi Huang
Dadong Wang
Xin Yu
VOS
164
43
0
20 Aug 2023
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics
ACM Multimedia (ACM MM), 2023
Chen Liu
Peike Li
Xingqun Qi
Hu Zhang
Lincheng Li
Dadong Wang
Xin Yu
VOS
215
43
0
31 Jul 2023
Connecting Multi-modal Contrastive Representations
Neural Information Processing Systems (NeurIPS), 2023
Zehan Wang
Yang Zhao
Xize Cheng
Haifeng Huang
Jiageng Liu
...
Lin Li
Yongqiang Wang
Aoxiong Yin
Ziang Zhang
Zhou Zhao
173
40
0
22 May 2023
Egocentric Auditory Attention Localization in Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
219
23
0
28 Mar 2023
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Computer Vision and Pattern Recognition (CVPR), 2023
Tiantian Geng
Teng Wang
Yanfu Zhang
Runmin Cong
Feng Zheng
181
59
0
22 Mar 2023
Audio-Driven Co-Speech Gesture Video Generation
Neural Information Processing Systems (NeurIPS), 2022
Xian Liu
Qianyi Wu
Hang Zhou
Yuanqi Du
Wayne Wu
Dahua Lin
Ziwei Liu
SLR
VGen
261
67
0
05 Dec 2022
A Closer Look at Weakly-Supervised Audio-Visual Source Localization
Neural Information Processing Systems (NeurIPS), 2022
Shentong Mo
Pedro Morgado
237
79
0
30 Aug 2022
Static and Dynamic Concepts for Self-supervised Video Representation Learning
European Conference on Computer Vision (ECCV), 2022
Rui Qian
Shuangrui Ding
Xian Liu
Dahua Lin
SSL
156
26
0
26 Jul 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Computer Vision and Pattern Recognition (CVPR), 2022
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
251
200
0
26 Mar 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Xian Liu
Qianyi Wu
Hang Zhou
Yinghao Xu
Rui Qian
Xinyi Lin
Xiaowei Zhou
Wayne Wu
Bo Dai
Bolei Zhou
SLR
211
133
0
24 Mar 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022
Xian Liu
Yinghao Xu
Qianyi Wu
Hang Zhou
Wayne Wu
Bolei Zhou
VGen
DiffM
3DH
190
161
0
19 Jan 2022
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction
Hailong Ning
Bin Zhao
Zhanxuan Hu
Lang He
Ercheng Pei
236
12
0
17 Sep 2021
Vision-Infused Deep Audio Inpainting
IEEE International Conference on Computer Vision (ICCV), 2019
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
286
91
0
24 Oct 2019