Large-scale multilingual audio visual dubbing

6 November 2020
Yi Yang
Brendan Shillingford
Yannis Assael
Miaosen Wang
Wendi Liu
Yutian Chen
Yu Zhang
Eren Sezener
Luis C. Cobo
Misha Denil
Y. Aytar
Nando de Freitas

Papers citing "Large-scale multilingual audio visual dubbing"

13 / 13 papers shown
Human Motion Video Generation: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Haiwei Xue
Xiangyang Luo
Zhanghao Hu
Shu Zhang
Xunzhi Xiang
...
Fei Ma
Zhiyong Wu
Changpeng Yang
Zonghong Dai
Fei Richard Yu
EGVM, VGen
269
33
0
04 Sep 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
387
0
0
14 Mar 2025
The Tug-of-War Between Deepfake Generation and Detection
Hannah Lee
Changyeon Lee
Kevin Farhat
Lin Qiu
Steve Geluso
Aerin Kim
O. Etzioni
500
12
0
08 Jul 2024
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation
Ruibo Fu
Shuchen Shi
Hongming Guo
Tao Wang
Chunyu Qiang
...
Zhiyong Wang
Yukun Liu
Zhengqi Wen
Shuai Zhang
Guanjun Li
VGen
112
2
0
15 Jun 2024
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing
Neha Sahipjohn
Ashishkumar Gudmalwar
Nirmesh Shah
Pankaj Wasnik
R. Shah
285
13
0
13 Jun 2024
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Kevin Cai
Chonghua Liu
David M. Chan
VGen
196
9
0
10 Jan 2024
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
449
3
0
12 Apr 2023
Learning to Dub Movies via Hierarchical Prosody Models
Computer Vision and Pattern Recognition (CVPR), 2022
Gaoxiang Cong
Liang Li
Yuankai Qi
Zhengjun Zha
Qi Wu
Wen-yu Wang
Bin Jiang
Ming-Hsuan Yang
Qin Huang
450
44
0
08 Dec 2022
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Interspeech, 2022
Anusha Prakash
Arun Kumar
Ashish Seth
Bhagyashree Mukherjee
Ishika Gupta
...
D. Sharma
H. Murthy
P. Bhattacharya
S. Umesh
R. Sangal
261
5
0
01 Nov 2022
Towards Realistic Visual Dubbing with Heterogeneous Sources
ACM Multimedia (MM), 2021
Tianyi Xie
Liucheng Liao
Cheng Bi
Benlai Tang
Xiang Yin
Jianfei Yang
Mingjie Wang
Jiali Yao
Yang Zhang
Zejun Ma
VGen
199
44
0
17 Jan 2022
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
258
51
0
08 Dec 2021
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech
Computer Vision and Pattern Recognition (CVPR), 2021
Michael Hassid
Michelle Tadmor Ramanovich
Brendan Shillingford
Miaosen Wang
Ye Jia
Tal Remez
DiffM
251
21
0
19 Nov 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM, VGen
438
55
0
15 Oct 2021