2007.06355
Multiple Sound Sources Localization from Coarse to Fine
13 July 2020
Rui Qian
Di Hu
Heinrich Dinkel
Mengyue Wu
N. Xu
Weiyao Lin
Papers citing
"Multiple Sound Sources Localization from Coarse to Fine"
50 / 104 papers shown
Title
PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data
Zheng Zhang
Zheng Ning
Chenliang Xu
Yapeng Tian
Toby Jia-Jun Li
51
6
0
27 Jul 2023
Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation
Jinxian Liu
Chen Ju
Chaofan Ma
Yanfeng Wang
Yu Wang
Ya-Qin Zhang
VOS
16
23
0
25 Jul 2023
AVSegFormer: Audio-Visual Segmentation with Transformer
Sheng Gao
Zhe Chen
Guo Chen
Wenhai Wang
Tong Lu
VOS
24
46
0
03 Jul 2023
Visually-Guided Sound Source Separation with Audio-Visual Predictive Coding
Zengjie Song
Zhaoxiang Zhang
19
1
0
19 Jun 2023
Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective
Yingying Fan
Yu Wu
Bo Du
Yutian Lin
26
7
0
01 Jun 2023
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
Shentong Mo
Pedro Morgado
30
21
0
30 May 2023
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan
Renrui Zhang
Ziyu Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
VOS
11
29
0
25 May 2023
Connecting Multi-modal Contrastive Representations
Zehan Wang
Yang Zhao
Xize Cheng
Haifeng Huang
Jiageng Liu
...
Lin Li
Yongqiang Wang
Aoxiong Yin
Ziang Zhang
Zhou Zhao
17
22
0
22 May 2023
Annotation-free Audio-Visual Segmentation
Jinxian Liu
Yu Wang
Chen Ju
Chaofan Ma
Ya-Qin Zhang
Weidi Xie
VOS
VLM
29
28
0
18 May 2023
TransAVS: End-To-End Audio-Visual Segmentation With Transformer
Yuhang Ling
Yuxi Li
Zhenye Gan
Jiangning Zhang
M. Chi
Yabiao Wang
VOS
ViT
26
1
0
12 May 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai
Fiona Ryan
Wenqi Jia
Miao Liu
James M. Rehg
EgoV
19
8
0
06 May 2023
Audio-Visual Grouping Network for Sound Localization from Mixtures
Shentong Mo
Yapeng Tian
37
40
0
29 Mar 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
24
15
0
28 Mar 2023
Egocentric Audio-Visual Object Localization
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
EgoV
18
28
0
23 Mar 2023
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
Weixuan Sun
Jiayi Zhang
Jianyuan Wang
Zheyuan Liu
Yiran Zhong
Tianpeng Feng
Yandong Guo
Yanhao Zhang
Nick Barnes
SSL
19
42
0
20 Mar 2023
Improving Audio-Visual Video Parsing with Pseudo Visual Labels
Jinxing Zhou
Dan Guo
Yiran Zhong
Meng Wang
VLM
31
11
0
04 Mar 2023
Audio-Visual Segmentation with Semantics
Jinxing Zhou
Xuyang Shen
Jianyuan Wang
Jiayi Zhang
Weixuan Sun
...
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
38
37
0
30 Jan 2023
Vision Transformers are Parameter-Efficient Audio-Visual Learners
Yan-Bo Lin
Yi-Lin Sung
Jie Lei
Mohit Bansal
Gedas Bertasius
26
72
0
15 Dec 2022
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
16
25
0
07 Dec 2022
Mix and Localize: Localizing Sound Sources in Mixtures
Xixi Hu
Ziyang Chen
Andrew Owens
12
50
0
28 Nov 2022
Contrastive Positive Sample Propagation along the Audio-Visual Event Line
Jinxing Zhou
Dan Guo
Meng Wang
13
48
0
18 Nov 2022
Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization
Dennis Fedorishin
D. Mohan
Bhavin Jawade
S. Setlur
V. Govindaraju
VGen
11
10
0
06 Nov 2022
MarginNCE: Robust Sound Localization with a Negative Margin
Sooyoung Park
Arda Senocak
Joon Son Chung
SSL
6
13
0
03 Nov 2022
Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization
Yuanyuan Jiang
Jianqin Yin
Yonghao Dang
27
4
0
11 Oct 2022
AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization
Tanvir Mahmud
Diana Marculescu
CLIP
11
31
0
11 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
28
16
0
05 Oct 2022
A Closer Look at Weakly-Supervised Audio-Visual Source Localization
Shentong Mo
Pedro Morgado
79
64
0
30 Aug 2022
Static and Dynamic Concepts for Self-supervised Video Representation Learning
Rui Qian
Shuangrui Ding
Xian Liu
Dahua Lin
SSL
9
22
0
26 Jul 2022
Audio-Visual Segmentation
Jinxing Zhou
Jianyuan Wang
J. Zhang
Weixuan Sun
Jing Zhang
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
20
109
0
11 Jul 2022
Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation
Jinxian Liu
Chen Ju
Weidi Xie
Ya-Qin Zhang
10
38
0
26 Jun 2022
Conditional Vector Graphics Generation for Music Cover Images
V. Efimova
Ivan Jarsky
Ilya Bizyaev
Andrey Filchenkov
GAN
19
3
0
15 May 2022
How to Listen? Rethinking Visual Sound Localization
Ho-Hsiang Wu
Magdalena Fuentes
Prem Seetharaman
J. P. Bello
ObjD
14
4
0
11 Apr 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
133
0
26 Mar 2022
Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes
Zengjie Song
Yuxi Wang
Junsong Fan
T. Tan
Zhaoxiang Zhang
SSL
14
42
0
25 Mar 2022
Localizing Visual Sounds the Easy Way
Shentong Mo
Pedro Morgado
11
77
0
17 Mar 2022
Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language
Otniel-Bogdan Mercea
Lukas Riesch
A. Sophia Koepke
Zeynep Akata
8
48
0
07 Mar 2022
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
Xian Liu
Rui Qian
Hang Zhou
Di Hu
Weiyao Lin
Ziwei Liu
Bolei Zhou
Xiaowei Zhou
6
25
0
13 Feb 2022
Self-Supervised Moving Vehicle Detection from Audio-Visual Cues
Jannik Zürn
Wolfram Burgard
SSL
21
8
0
30 Jan 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
25
40
0
06 Jan 2022
Class-aware Sounding Objects Localization via Audiovisual Correspondence
Di Hu
Yake Wei
Rui Qian
Weiyao Lin
Ruihua Song
Ji-Rong Wen
24
41
0
22 Dec 2021
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
18
37
0
08 Dec 2021
Structure from Silence: Learning Scene Structure from Ambient Sound
Ziyang Chen
Xixi Hu
Andrew Owens
20
26
0
10 Nov 2021
Space-Time Memory Network for Sounding Object Localization in Videos
Sizhe Li
Yapeng Tian
Chenliang Xu
21
10
0
10 Nov 2021
Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction
Hailong Ning
Bin Zhao
Zhanxuan Hu
Lang He
Ercheng Pei
14
10
0
17 Sep 2021
Dual Normalization Multitasking for Audio-Visual Sounding Object Localization
Tokuhiro Nishikawa
Daiki Shimada
Jerry Jun Yokono
13
0
0
01 Jun 2021
Self-supervised object detection from audio-visual correspondence
Triantafyllos Afouras
Yuki M. Asano
Francois Fagan
Andrea Vedaldi
Florian Metze
SSL
19
45
0
13 Apr 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
11
53
0
13 Apr 2021
Contrastive Learning of Global-Local Video Representations
Shuang Ma
Zhaoyang Zeng
Daniel J. McDuff
Yale Song
SSL
22
7
0
07 Apr 2021
Localizing Visual Sounds the Hard Way
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
ObjD
11
184
0
06 Apr 2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian
Di Hu
Chenliang Xu
ObjD
13
86
0
05 Apr 2021