Self-supervised learning for audio-visual speaker diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
13 February 2020
Yifan Ding
Yong-mei Xu
Shi-Xiong Zhang
Yahuan Cong
Liqiang Wang
    VLM

Papers citing "Self-supervised learning for audio-visual speaker diarization"

14 / 14 papers shown
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
Mao-Kui He
Jun Du
Shu-Tong Niu
Qing-Feng Liu
Chin-Hui Lee
232
2
0
15 Oct 2024
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
Victoria Mingote
Alfonso Ortega
A. Miguel
Eduardo Lleida
319
3
0
09 Sep 2024
Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification
Mahrukh Awan
Asmar Nadeem
Muhammad Junaid Awan
Armin Mustafa
Syed Sameed Husain
399
5
0
26 Aug 2024
Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Bruno Korbar
Jaesung Huh
Andrew Zisserman
270
8
0
22 Jan 2024
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
374
14
0
25 Oct 2023
Hyperbolic Audio-visual Zero-shot Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Jie Hong
Zeeshan Hayder
Junlin Han
Pengfei Fang
Mehrtash Harandi
L. Petersson
278
25
0
24 Aug 2023
Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser
Neural Information Processing Systems (NeurIPS), 2023
Yun-hsuan Lai
Yen-Chun Chen
Y. Wang
327
24
0
27 May 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Computer Vision and Pattern Recognition (CVPR), 2023
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
276
31
0
19 Jan 2023
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
331
76
0
20 Aug 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
International Workshop on Machine Learning for Signal Processing (MLSP), 2022
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
231
21
0
21 Jun 2022
Audio Self-supervised Learning: A Survey
Patterns (Patterns), 2022
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
338
136
0
02 Mar 2022
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
268
29
0
17 Aug 2021
The Right to Talk: An Audio-Visual Transformer Approach
IEEE International Conference on Computer Vision (ICCV), 2021
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
237
39
0
06 Aug 2021
UniCon: Unified Context Network for Robust Active Speaker Detection
ACM Multimedia (ACM MM), 2021
Yuanhang Zhang
Susan Liang
Shuang Yang
Xiao-Chang Liu
Zhongqin Wu
Shiguang Shan
Xilin Chen
CVBM
195
44
0
05 Aug 2021