arXiv: 2202.06406
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI Conference on Artificial Intelligence (AAAI), 2022
13 February 2022
Xian Liu
Rui Qian
Hang Zhou
Di Hu
Weiyao Lin
Ziwei Liu
Bolei Zhou
Xiaowei Zhou
Github (29★)
Papers citing
"Visual Sound Localization in the Wild by Cross-Modal Interference Erasing"
21 / 21 papers shown
Learning from Silence and Noise for Visual Sound Source Localization
Xavier Juanola
G. Morais
Magdalena Fuentes
Gloria Haro
SSL
164
0
0
29 Aug 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
285
9
0
06 Mar 2025
AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation
IEEE Transactions on Multimedia (TMM), 2025
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Yifan Wang
Pingping Zhang
Lijun Wang
Huchuan Lu
Mamba
VOS
133
14
0
14 Jan 2025
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tianyu Yang
Yiyang Nan
Lisen Dai
Zhenwen Liang
Yapeng Tian
Wei Wei
340
1
0
07 Nov 2024
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xavier Juanola
Gloria Haro
Magdalena Fuentes
387
4
0
01 Oct 2024
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Hu Su
Zhiqing Wang
Yuhao Zhao
Wei Zou
Siyang Sun
Yun Zheng
SSL
233
16
0
05 Mar 2024
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Yuhao Zhao
Hu Su
Wei Zou
247
4
0
05 Mar 2024
STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment
International Conference on Machine Learning (ICML), 2023
Jaewoo Lee
Jaehong Yoon
Wonjae Kim
Yunji Kim
Sung Ju Hwang
CLL
288
1
0
12 Oct 2023
BAVS: Bootstrapping Audio-Visual Segmentation by Integrating Foundation Knowledge
IEEE Transactions on Multimedia (IEEE TMM), 2023
Chen Liu
Peike Li
Hu Zhang
Lincheng Li
Zi Huang
Dadong Wang
Xin Yu
VOS
176
43
0
20 Aug 2023
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics
ACM Multimedia (ACM MM), 2023
Chen Liu
Peike Li
Xingqun Qi
Hu Zhang
Lincheng Li
Dadong Wang
Xin Yu
VOS
255
44
0
31 Jul 2023
Connecting Multi-modal Contrastive Representations
Neural Information Processing Systems (NeurIPS), 2023
Zehan Wang
Yang Zhao
Xize Cheng
Haifeng Huang
Jiageng Liu
...
Lin Li
Yongqiang Wang
Aoxiong Yin
Ziang Zhang
Zhou Zhao
193
40
0
22 May 2023
Egocentric Auditory Attention Localization in Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
224
23
0
28 Mar 2023
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Computer Vision and Pattern Recognition (CVPR), 2023
Tiantian Geng
Teng Wang
Yanfu Zhang
Runmin Cong
Feng Zheng
192
61
0
22 Mar 2023
Audio-Driven Co-Speech Gesture Video Generation
Neural Information Processing Systems (NeurIPS), 2022
Xian Liu
Qianyi Wu
Hang Zhou
Yuanqi Du
Wayne Wu
Dahua Lin
Ziwei Liu
SLR
VGen
274
67
0
05 Dec 2022
A Closer Look at Weakly-Supervised Audio-Visual Source Localization
Neural Information Processing Systems (NeurIPS), 2022
Shentong Mo
Pedro Morgado
241
79
0
30 Aug 2022
Static and Dynamic Concepts for Self-supervised Video Representation Learning
European Conference on Computer Vision (ECCV), 2022
Rui Qian
Shuangrui Ding
Xian Liu
Dahua Lin
SSL
176
26
0
26 Jul 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Computer Vision and Pattern Recognition (CVPR), 2022
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
279
201
0
26 Mar 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Xian Liu
Qianyi Wu
Hang Zhou
Yinghao Xu
Rui Qian
Xinyi Lin
Xiaowei Zhou
Wayne Wu
Bo Dai
Bolei Zhou
SLR
247
135
0
24 Mar 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022
Xian Liu
Yinghao Xu
Qianyi Wu
Hang Zhou
Wayne Wu
Bolei Zhou
VGen
DiffM
3DH
225
162
0
19 Jan 2022
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction
Hailong Ning
Bin Zhao
Zhanxuan Hu
Lang He
Ercheng Pei
257
12
0
17 Sep 2021
Vision-Infused Deep Audio Inpainting
IEEE International Conference on Computer Vision (ICCV), 2019
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
298
92
0
24 Oct 2019