ResearchTrend.AI
Speech2Video: Cross-Modal Distillation for Speech to Video Generation

10 July 2021
Shijing Si
Jianzong Wang
Xiaoyang Qu
Ning Cheng
Wenqi Wei
Xinghua Zhu
Jing Xiao
    VGen

Papers citing "Speech2Video: Cross-Modal Distillation for Speech to Video Generation"

11 / 11 papers shown
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
27
5
0
17 Aug 2023
MusicFace: Music-driven Expressive Singing Face Synthesis
Peng Liu
W. Deng
Hengda Li
Jintai Wang
Yinglin Zheng
Yiwei Ding
Xiaohu Guo
Ming Zeng
CVBM
30
10
0
24 Mar 2023
MARLIN: Masked Autoencoder for facial video Representation LearnINg
Zhixi Cai
Shreya Ghosh
Kalin Stefanov
Abhinav Dhall
Jianfei Cai
Hamid Rezatofighi
Reza Haffari
Munawar Hayat
ViT
CVBM
20
60
0
12 Nov 2022
SVLDL: Improved Speaker Age Estimation Using Selective Variance Label Distribution Learning
Zuheng Kang
Jianzong Wang
Junqing Peng
Jing Xiao
19
3
0
18 Oct 2022
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator
Shijing Si
Jianzong Wang
Xulong Zhang
Xiaoyang Qu
Ning Cheng
Jing Xiao
14
1
0
21 Sep 2022
SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning
Zuheng Kang
Junqing Peng
Jianzong Wang
Jing Xiao
11
3
0
27 Jun 2022
Improving Human Image Synthesis with Residual Fast Fourier Transformation and Wasserstein Distance
Jianhan Wu
Shijing Si
Jianzong Wang
Jing Xiao
32
1
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
23
33
0
22 May 2022
Towards Speaker Age Estimation with Label Distribution Learning
Shijing Si
Jianzong Wang
Junqing Peng
Jing Xiao
6
21
0
23 Feb 2022
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
279
10,348
0
12 Dec 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
221
2,233
0
14 Jun 2018