1803.08842
Cited By
Audio-Visual Event Localization in Unconstrained Videos
23 March 2018
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
Papers citing "Audio-Visual Event Localization in Unconstrained Videos"
48 / 298 papers shown
Can audio-visual integration strengthen robustness under multimodal attacks?
Computer Vision and Pattern Recognition (CVPR), 2021
Yapeng Tian
Chenliang Xu
AAML
260
40
0
05 Apr 2021
Cross-Modal learning for Audio-Visual Video Parsing
Interspeech, 2021
Jatin Lamba
Abhishek
Jayaprakash Akula
Rishabh Dabral
Preethi Jyothi
Ganesh Ramakrishnan
232
8
0
03 Apr 2021
Unsupervised Sound Localization via Iterative Contrastive Learning
Computer Vision and Image Understanding (CVIU), 2021
Yan-Bo Lin
Hung-Yu Tseng
Hsin-Ying Lee
Yen-Yu Lin
Ming-Hsuan Yang
SSL
154
40
0
01 Apr 2021
Positive Sample Propagation along the Audio-Visual Event Line
Computer Vision and Pattern Recognition (CVPR), 2021
Jinxing Zhou
Liang Zheng
Yiran Zhong
Shijie Hao
Meng Wang
213
124
0
01 Apr 2021
Learning Audio-Visual Correlations from Variational Cross-Modal Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Ye Zhu
Yu Wu
Hugo Latapie
Yi Yang
Yan Yan
SSL
232
21
0
05 Feb 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
IEEE International Conference on Computer Vision (ICCV), 2021
Sangho Lee
Jiwan Chung
Youngjae Yu
Gunhee Kim
Thomas Breuel
Gal Chechik
Yale Song
309
65
0
26 Jan 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Computer Vision and Pattern Recognition (CVPR), 2021
Ruohan Gao
Kristen Grauman
CVBM
390
235
0
08 Jan 2021
ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
Samyak Jain
P. Yarlagadda
Shreyank Jyoti
Shyamgopal Karthik
Subramanian Ramanathan
Vineet Gandhi
ViT
251
81
0
11 Dec 2020
Multi-Instrumentalist Net: Unsupervised Generation of Music from Body Movements
Kun Su
Xiulong Liu
Eli Shlizerman
198
32
0
07 Dec 2020
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering
Vasu Sharma
Gurneet Arora
Navpreet Kaloty
168
39
0
16 Nov 2020
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
International Conference on Learning Representations (ICLR), 2020
Efthymios Tzinis
Scott Wisdom
A. Jansen
Shawn Hershey
Tal Remez
D. Ellis
J. Hershey
311
78
0
02 Nov 2020
Object Permanence Through Audio-Visual Representations
IEEE Access, 2020
Fanjun Bu
Chien-Ming Huang
183
2
0
20 Oct 2020
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
Di Hu
Rui Qian
Minyue Jiang
Xiao Tan
Shilei Wen
Errui Ding
Weiyao Lin
Dejing Dou
181
149
0
12 Oct 2020
AVECL-UMONS database for audio-visual event classification and localization
Mathilde Brousmiche
Stéphane Dupont
Jean Rouat
126
3
0
02 Oct 2020
Learning to Set Waypoints for Audio-Visual Navigation
Changan Chen
Sagnik Majumder
Ziad Al-Halah
Ruohan Gao
Santhosh Kumar Ramakrishnan
Kristen Grauman
SSL
212
5
0
21 Aug 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
201
13
0
18 Aug 2020
Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
Bin Duan
Hao Tang
Wei Wang
Ziliang Zong
Guowei Yang
Yan Yan
137
73
0
14 Aug 2020
Foley Music: Learning to Generate Music from Videos
Chuang Gan
Deng Huang
Peihao Chen
J. Tenenbaum
Antonio Torralba
VGen
119
152
0
21 Jul 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
228
207
0
21 Jul 2020
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
European Conference on Computer Vision (ECCV), 2020
Hang Zhou
Xudong Xu
Dahua Lin
Xiaogang Wang
Ziwei Liu
DiffM
191
94
0
20 Jul 2020
Talking-head Generation with Rhythmic Head Motion
European Conference on Computer Vision (ECCV), 2020
Lele Chen
Guofeng Cui
Celong Liu
Zhong Li
Ziyi Kou
Yi Tian Xu
Chenliang Xu
159
197
0
16 Jul 2020
Leveraging Category Information for Single-Frame Visual Sound Source Separation
Xiangjie Sui
Esa Rahtu
129
9
0
15 Jul 2020
Multiple Sound Sources Localization from Coarse to Fine
European Conference on Computer Vision (ECCV), 2020
Rui Qian
Di Hu
Heinrich Dinkel
Mengyue Wu
N. Xu
Weiyao Lin
241
179
0
13 Jul 2020
Do We Need Sound for Sound Source Localization?
Asian Conference on Computer Vision (ACCV), 2020
Takashi Oya
Shohei Iwase
Ryota Natsume
Takahiro Itazuri
Shugo Yamaguchi
Shigeo Morishima
131
25
0
11 Jul 2020
Labelling unlabelled videos from scratch with multi-modal self-supervision
Neural Information Processing Systems (NeurIPS), 2020
Yuki M. Asano
Mandela Patrick
Christian Rupprecht
Andrea Vedaldi
SSL
253
161
0
24 Jun 2020
Audeo: Audio Generation for a Silent Performance Video
Neural Information Processing Systems (NeurIPS), 2020
Kun Su
Xiulong Liu
Eli Shlizerman
VGen
188
72
0
23 Jun 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network
Xiangjie Sui
Esa Rahtu
195
25
0
04 Jun 2020
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Di Hu
Xuhong Li
Lichao Mou
P. Jin
Dong Chen
L. Jing
Xiaoxiang Zhu
Dejing Dou
151
6
0
18 May 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation
European Conference on Computer Vision (ECCV), 2020
Ruohan Gao
Changan Chen
Ziad Al-Halah
Carl Schissler
Kristen Grauman
MDE
SSL
417
90
0
04 May 2020
VGGSound: A Large-scale Audio-Visual Dataset
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
226
743
0
29 Apr 2020
Semantic Object Prediction and Spatial Sound Super-Resolution with Binaural Sounds
European Conference on Computer Vision (ECCV), 2020
A. Vasudevan
Dengxin Dai
Luc Van Gool
ObjD
193
50
0
09 Mar 2020
Deep Audio-Visual Learning: A Survey
International Journal of Automation and Computing (IJAC), 2020
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
Ran He
176
175
0
14 Jan 2020
STAViS: Spatio-Temporal AudioVisual Saliency Network
Computer Vision and Pattern Recognition (CVPR), 2020
A. Tsiami
Petros Koutras
Petros Maragos
172
80
0
09 Jan 2020
SoundSpaces: Audio-Visual Navigation in 3D Environments
Changan Chen
Unnat Jain
Carl Schissler
S. V. A. Garí
Ziad Al-Halah
V. Ithapu
Philip Robinson
Kristen Grauman
239
28
0
24 Dec 2019
Listen to Look: Action Recognition by Previewing Audio
Computer Vision and Pattern Recognition (CVPR), 2019
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
256
280
0
10 Dec 2019
Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Arda Senocak
Tae-Hyun Oh
Junsik Kim
Ming-Hsuan Yang
In So Kweon
SSL
162
62
0
20 Nov 2019
Learning to Localize Temporal Events in Large-scale Video Data
Mikel Bober-Irizar
Miha Škalič
David Austin
88
1
0
25 Oct 2019
A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions
Conference on Computational Natural Language Learning (CoNLL), 2019
Jack Hessel
Bo Pang
Zhenhai Zhu
Radu Soricut
161
39
0
07 Oct 2019
Learning to Have an Ear for Face Super-Resolution
Computer Vision and Pattern Recognition (CVPR), 2019
Givi Meishvili
Simon Jenni
Paolo Favaro
SupR
CVBM
145
24
0
27 Sep 2019
Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2019
Shah Nawaz
Muhammad Kamran Janjua
I. Gallo
Arif Mahmood
Alessandro Calefati
131
43
0
18 Sep 2019
Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
A. Rana
C. Ozcinar
A. Smolic
99
30
0
16 Aug 2019
Self-Supervised Audio-Visual Co-Segmentation
Andrew Rouditchenko
Hang Zhao
Chuang Gan
Josh H. McDermott
Antonio Torralba
VLM
SSL
119
107
0
18 Apr 2019
Audio-Visual Model Distillation Using Acoustic Images
Andrés F. Pérez
Valentina Sanguineti
Pietro Morerio
Vittorio Murino
VLM
149
30
0
16 Apr 2019
Co-Separating Sounds of Visual Objects
Ruohan Gao
Kristen Grauman
276
221
0
16 Apr 2019
Dual-modality seq2seq network for audio-visual event localization
Yan-Bo Lin
Yu-Jhe Li
Y. Wang
171
150
0
20 Feb 2019
2.5D Visual Sound
Ruohan Gao
Kristen Grauman
VGen
245
143
0
11 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
164
21
0
07 Dec 2018
Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2018
Sanjeel Parekh
A. Ozerov
S. Essid
Ngoc Q. K. Duong
P. Pérez
G. Richard
119
16
0
09 Nov 2018