2106.09669
Cited By
Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention
17 June 2021
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
VLM
Papers citing
"Improving On-Screen Sound Separation for Open-Domain Videos with Audio-Visual Self-Attention"
9 / 9 papers shown
Title
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi, Triantafyllos Afouras, Andrew Zisserman
40 | 28 | 0
02 Jan 2025

Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian
37 | 0 | 0
27 Mar 2024

VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos, V. S. Kadandale, G. Haro
ViT
23 | 19 | 0
08 Mar 2022

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou
13 | 25 | 0
13 Feb 2022

Multi-modal Residual Perceptron Network for Audio-Video Emotion Recognition
Xin Chang, W. Skarbek
22 | 19 | 0
21 Jul 2021

VidTr: Video Transformer Without Convolutions
Yanyi Zhang, Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, I. Marsic, Joseph Tighe
ViT
136 | 193 | 0
23 Apr 2021

Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius, Heng Wang, Lorenzo Torresani
ViT
280 | 1,982 | 0
09 Feb 2021

VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao, Kristen Grauman
CVBM
190 | 198 | 0
08 Jan 2021

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam
3DH
950 | 20,567 | 0
17 Apr 2017