2106.03932
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
7 June 2021
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
Papers citing
"How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild"
10 / 10 papers shown
Title
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
24
16
0
28 Mar 2023
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function
Qing Wang
Hang Chen
Yannan Jiang
Zhe Wang
Yuyang Wang
Jun Du
Chin-Hui Lee
14
4
0
26 Oct 2022
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022
Yuanhang Zhang
Susan Liang
Shuang Yang
Shiguang Shan
8
4
0
22 Jun 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
16
10
0
21 Jun 2022
Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization
Hao Jiang
Calvin Murdock
V. Ithapu
EgoV
25
40
0
06 Jan 2022
Learning Spatial-Temporal Graphs for Active Speaker Detection
Sourya Roy
Kyle Min
Subarna Tripathi
T. Guha
Somdeb Majumdar
25
3
0
02 Dec 2021
A trained humanoid robot can perform human-like crossmodal social attention and conflict resolution
Di Fu
Fares Abawi
Hugo C. C. Carneiro
Matthias Kerzel
Ziwei Chen
Erik Strahl
Xun Liu
S. Wermter
14
6
0
02 Nov 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Bernard Ghanem
57
51
0
11 Jan 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
2,224
0
14 Jun 2018
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Daniel Stoller
Sebastian Ewert
S. Dixon
AI4TS
101
588
0
08 Jun 2018