Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2007.06355
Cited By
Multiple Sound Sources Localization from Coarse to Fine
13 July 2020
Rui Qian
Di Hu
Heinrich Dinkel
Mengyue Wu
N. Xu
Weiyao Lin
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Multiple Sound Sources Localization from Coarse to Fine"
50 / 104 papers shown
Title
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization
Sooyoung Park
Arda Senocak
Joon Son Chung
VLM
43
0
0
08 May 2025
OpenAVS: Training-Free Open-Vocabulary Audio Visual Segmentation with Foundational Models
Shengkai Chen
Yifang Yin
Jinming Cao
Shili Xiang
Zhenguang Liu
Roger Zimmermann
VOS
VLM
37
0
0
30 Apr 2025
Improving Sound Source Localization with Joint Slot Attention on Image and Audio
Inho Kim
Youngkil Song
Jicheol Park
Won Hwa Kim
Suha Kwak
22
0
0
21 Apr 2025
Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment
Chen Liu
Peike Li
Liying Yang
Dadong Wang
Lincheng Li
Xin Yu
VOS
60
0
0
17 Mar 2025
Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing
Pengcheng Zhao
Jinxing Zhou
Yang Zhao
D. Guo
Yanxiang Chen
82
1
0
15 Dec 2024
Towards Open-Vocabulary Audio-Visual Event Localization
Jinxing Zhou
D. Guo
Ruohao Guo
Yuxin Mao
Jingjing Hu
Yiran Zhong
Xiaojun Chang
M. Wang
VLM
46
4
0
18 Nov 2024
SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering
Tianyu Yang
Yiyang Nan
Lisen Dai
Zhenwen Liang
Yapeng Tian
X. Zhang
28
0
0
07 Nov 2024
Aligning Audio-Visual Joint Representations with an Agentic Workflow
Shentong Mo
Yibing Song
21
0
0
30 Oct 2024
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
Xavier Juanola
Gloria Haro
Magdalena Fuentes
28
2
0
01 Oct 2024
Multi-scale Multi-instance Visual Sound Localization and Segmentation
Shentong Mo
Haofan Wang
25
2
0
31 Aug 2024
Enhancing Sound Source Localization via False Negative Elimination
Zengjie Song
Jiangshe Zhang
Yuxi Wang
Junsong Fan
Zhaoxiang Zhang
35
0
0
29 Aug 2024
AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation
Zili Wang
Qi Yang
Linsu Shi
Jiazhong Yu
M. Tanveer
Fei Li
Shiming Xiang
VOS
14
1
0
03 Aug 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
24
3
0
18 Jul 2024
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang
Dejan Marković
Chenliang Xu
Alexander Richard
31
4
0
18 Jul 2024
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation
Juncheng Ma
Peiwen Sun
Yaoting Wang
Di Hu
VOS
41
7
0
16 Jul 2024
Semantic Grouping Network for Audio Source Separation
Shentong Mo
Yapeng Tian
34
4
0
04 Jul 2024
SAVE: Segment Audio-Visual Easy way using Segment Anything Model
Khanh-Binh Nguyen
Chae Jung Park
VLM
VOS
24
1
0
02 Jul 2024
MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
Tanvir Mahmud
Shentong Mo
Yapeng Tian
Diana Marculescu
26
4
0
07 Jun 2024
Progressive Confident Masking Attention Network for Audio-Visual Segmentation
Yuxuan Wang
Feng Dong
Jinchao Zhu
Shuyue Zhu
VOS
43
0
0
04 Jun 2024
Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-wise Pseudo Labeling
Jinxing Zhou
Dan Guo
Yiran Zhong
Meng Wang
VLM
53
18
0
03 Jun 2024
Unified Video-Language Pre-training with Synchronized Audio
Shentong Mo
Haofan Wang
Huaxia Li
Xu Tang
30
2
0
12 May 2024
SemiPL: A Semi-supervised Method for Event Sound Source Localization
Yue Li
Baiqiao Yin
Jinfu Liu
Jiajun Wen
Jiaying Lin
Mengyuan Liu
35
0
0
30 Apr 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
14
2
0
09 Apr 2024
T-VSL: Text-Guided Visual Sound Source Localization in Mixtures
Tanvir Mahmud
Yapeng Tian
Diana Marculescu
42
7
0
02 Apr 2024
Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
Dongjin Kim
Sung-Jin Um
Sangmin Lee
Jung Uk Kim
30
4
0
26 Mar 2024
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Jiangkang Deng
Xiatian Zhu
VOS
30
5
0
21 Mar 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya-Qin Zhang
Yanfeng Wang
34
9
0
17 Mar 2024
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Hu Su
Zhiqing Wang
Yuhao Zhao
Wei Zou
Siyang Sun
Yun Zheng
SSL
41
12
0
05 Mar 2024
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Yuhao Zhao
Hu Su
Wei Zou
31
4
0
05 Mar 2024
EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving
Jiacheng Lin
Jiajun Chen
Kunyu Peng
Xuan He
Zhiyong Li
Rainer Stiefelhagen
Kailun Yang
48
6
0
28 Feb 2024
Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues
Tianxiang Chen
Zhentao Tan
Tao Gong
Qi Chu
Yue-bo Wu
Bin Liu
Le Lu
Jieping Ye
Nenghai Yu
VOS
32
3
0
04 Feb 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
22
5
0
08 Jan 2024
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang
Xing Nie
Tong Li
Pengfei Gao
Ying Guo
Cheng Zhen
Pengfei Yan
Shiming Xiang
VOS
21
12
0
11 Dec 2023
Weakly-Supervised Audio-Visual Segmentation
Shentong Mo
Bhiksha Raj
VOS
32
12
0
25 Nov 2023
Can CLIP Help Sound Source Localization?
Sooyoung Park
Arda Senocak
Joon Son Chung
22
6
0
07 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
15
2
0
01 Nov 2023
Leveraging Hyperbolic Embeddings for Coarse-to-Fine Robot Design
Heng Dong
Junyu Zhang
Chongjie Zhang
24
2
0
01 Nov 2023
LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
Yuxin Ye
Wenming Yang
Yapeng Tian
23
9
0
31 Oct 2023
Multimodal Variational Auto-encoder based Audio-Visual Segmentation
Yuxin Mao
Jing Zhang
Mochu Xiang
Yiran Zhong
Yuchao Dai
25
33
0
12 Oct 2023
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
25
3
0
10 Oct 2023
QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition
Xiang Li
Jinglu Wang
Xiaohao Xu
Xiulian Peng
Rita Singh
Yan Lu
Bhiksha Raj
VOS
31
10
0
29 Sep 2023
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
19
17
0
19 Sep 2023
CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation
Kexin Li
Zongxin Yang
Lei Chen
Yezhou Yang
Jun Xiao
VOS
37
49
0
18 Sep 2023
Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation
Shaofei Huang
Han Li
Yuqing Wang
Hongji Zhu
Jiao Dai
Jizhong Han
Wenge Rong
Si Liu
VOS
17
16
0
18 Sep 2023
Class-Incremental Grouping Network for Continual Audio-Visual Learning
Shentong Mo
Weiguo Pian
Yapeng Tian
CLL
VLM
27
21
0
11 Sep 2023
Hyperbolic Audio-visual Zero-shot Learning
Jie Hong
Zeeshan Hayder
Junlin Han
Pengfei Fang
Mehrtash Harandi
L. Petersson
12
13
0
24 Aug 2023
Improving Audio-Visual Segmentation with Bidirectional Generation
Dawei Hao
Yuxin Mao
Bowen He
Xiaodong Han
Yuchao Dai
Yiran Zhong
VOS
VGen
23
29
0
16 Aug 2023
Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization
Sung-Jin Um
Dongjin Kim
Jung Uk Kim
9
4
0
11 Aug 2023
Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization
Tianyu Liu
Peng Zhang
Wei Huang
Yufei Zha
Tao You
Yanni Zhang
SSL
8
2
0
09 Aug 2023
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics
Chen Liu
Peike Li
Xingqun Qi
Hu Zhang
Lincheng Li
Dadong Wang
Xin Yu
VOS
24
28
0
31 Jul 2023
1
2
3
Next