ResearchTrend.AI
Disentangled Speech Embeddings using Cross-modal Self-supervision

20 February 2020
Arsha Nagrani
Joon Son Chung
Samuel Albanie
Andrew Zisserman
    SSL

Papers citing "Disentangled Speech Embeddings using Cross-modal Self-supervision"

20 / 20 papers shown
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
26
2
0
09 Apr 2024
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
18
26
0
20 Feb 2023
Audio Representation Learning by Distilling Video as Privileged Information
Amirhossein Hajavi
Ali Etemad
13
4
0
06 Feb 2023
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
Se Jin Park
Minsu Kim
Joanna Hong
J. Choi
Y. Ro
CVBM
19
85
0
02 Nov 2022
Audio-Visual Person-of-Interest DeepFake Detection
D. Cozzolino
Alessandro Pianese
Matthias Nießner
L. Verdoliva
28
59
0
06 Apr 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
25
50
0
02 Feb 2022
Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification
Sung Hwan Mun
Min Hyun Han
Dongjune Lee
Jihwan Kim
N. Kim
SSL
22
3
0
16 Dec 2021
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
Mufan Sang
Haoqi Li
F. Liu
Andrew O. Arnold
Li Wan
SSL
16
38
0
08 Dec 2021
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
Shijing Si
Jianzong Wang
Xiaoyang Qu
Ning Cheng
Wenqi Wei
Xinghua Zhu
Jing Xiao
VGen
10
15
0
10 Jul 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
26
360
0
22 Apr 2021
Disentanglement for audio-visual emotion recognition using multitask setup
Raghuveer Peri
Srinivas Parthasarathy
Charles Bradshaw
Shiva Sundaram
20
11
0
11 Feb 2021
Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs
Nurendra Choudhary
Nikhil S. Rao
S. Katariya
Karthik Subbian
Chandan K. Reddy
16
64
0
23 Dec 2020
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning
Wei Xia
Chunlei Zhang
Chao Weng
Meng Yu
Dong Yu
SSL
14
77
0
13 Dec 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
16
106
0
13 Aug 2020
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
17
250
0
10 Aug 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network
Lingyu Zhu
Esa Rahtu
14
23
0
04 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
24
12
0
01 Jun 2020
FaceFilter: Audio-visual speech separation using still images
Soo-Whan Chung
Soyeon Choe
Joon Son Chung
Hong-Goo Kang
CVBM
21
66
0
14 May 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
219
2,233
0
14 Jun 2018