ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.16430
  4. Cited By
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech
  with Adversarial Learning and Architecture Design

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

31 July 2023
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
ArXivPDFHTML

Papers citing "VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design"

6 / 6 papers shown
Title
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a $50K Budget
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a 50KBudget50K Budget50KBudget
Xin Li
Kaikai Jia
Hao Sun
Jun Dai
Z. L. Jiang
99
0
0
27 Apr 2025
MathReader : Text-to-Speech for Mathematical Documents
MathReader : Text-to-Speech for Mathematical Documents
Sieun Hyeon
Kyudan Jung
N. Kim
Hyun Gon Ryu
Jaeyoung Do
36
1
0
13 Jan 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang
Jiawei Zhang
J. Wang
Xinyuan Qian
Xu-cheng Yin
CVBM
45
0
0
02 Jan 2025
Fake it to make it: Using synthetic data to remedy the data shortage in
  joint multimodal speech-and-gesture synthesis
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
34
4
0
30 Apr 2024
High Fidelity Speech Synthesis with Adversarial Networks
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
215
239
0
25 Sep 2019
Transfer Learning from Speaker Verification to Multispeaker
  Text-To-Speech Synthesis
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
204
819
0
12 Jun 2018
1