Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2002.08742
Cited By
Disentangled Speech Embeddings using Cross-modal Self-supervision
20 February 2020
Arsha Nagrani
Joon Son Chung
Samuel Albanie
Andrew Zisserman
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Disentangled Speech Embeddings using Cross-modal Self-supervision"
20 / 20 papers shown
Title
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
26
2
0
09 Apr 2024
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
18
26
0
20 Feb 2023
Audio Representation Learning by Distilling Video as Privileged Information
Amirhossein Hajavi
Ali Etemad
13
4
0
06 Feb 2023
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
Se Jin Park
Minsu Kim
Joanna Hong
J. Choi
Y. Ro
CVBM
19
85
0
02 Nov 2022
Audio-Visual Person-of-Interest DeepFake Detection
D. Cozzolino
Alessandro Pianese
Matthias Nießner
L. Verdoliva
28
59
0
06 Apr 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
25
50
0
02 Feb 2022
Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification
Sung Hwan Mun
Min Hyun Han
Dongjune Lee
Jihwan Kim
N. Kim
SSL
22
3
0
16 Dec 2021
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
Mufan Sang
Haoqi Li
F. Liu
Andrew O. Arnold
Li Wan
SSL
16
38
0
08 Dec 2021
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
Shijing Si
Jianzong Wang
Xiaoyang Qu
Ning Cheng
Wenqi Wei
Xinghua Zhu
Jing Xiao
VGen
10
15
0
10 Jul 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
26
360
0
22 Apr 2021
Disentanglement for audio-visual emotion recognition using multitask setup
Raghuveer Peri
Srinivas Parthasarathy
Charles Bradshaw
Shiva Sundaram
20
11
0
11 Feb 2021
Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs
Nurendra Choudhary
Nikhil S. Rao
S. Katariya
Karthik Subbian
Chandan K. Reddy
16
64
0
23 Dec 2020
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning
Wei Xia
Chunlei Zhang
Chao Weng
Meng Yu
Dong Yu
SSL
14
77
0
13 Dec 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
16
106
0
13 Aug 2020
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
17
250
0
10 Aug 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network
Lingyu Zhu
Esa Rahtu
14
23
0
04 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
24
12
0
01 Jun 2020
FaceFilter: Audio-visual speech separation using still images
Soo-Whan Chung
Soyeon Choe
Joon Son Chung
Hong-Goo Kang
CVBM
21
66
0
14 May 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
219
2,233
0
14 Jun 2018
1