arXiv:2504.02407
F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
Versions: v1, v2, v3 (latest)

3 April 2025
Xiaohui Sun
Ruitong Xiao
Jianye Mo
Bowen Wu
Qun Yu
Baoxun Wang
ArXiv (abs) | PDF | HTML | GitHub (1541★)

Papers citing "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"

31 / 31 papers shown
YingMusic-SVC: Real-World Robust Zero-Shot Singing Voice Conversion with Flow-GRPO and Singing-Specific Inductive Biases
Gongyu Chen
Xiaoyu Zhang
Zhenqiang Weng
Junjie Zheng
Da Shen
Chaofan Ding
Wei-Qiang Zhang
Zihao Chen
97
3
0
04 Dec 2025
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance
Junjie Zheng
Chunbo Hao
Guobin Ma
Xiaoyu Zhang
Gongyu Chen
Chaofan Ding
Zihao Chen
Lei Xie
DiffM
233
4
0
04 Dec 2025
Step-Audio-EditX Technical Report
Chao Yan
Boyong Wu
Peng Yang
Pengfei Tan
Guoqiang Hu
...
Xiangyu Zhang
Daxin Jiang
Shuchang Zhou
Gang Yu
214
3
0
05 Nov 2025
Vox-Evaluator: Enhancing Stability and Fidelity for Zero-shot TTS with A Multi-Level Evaluator
H. Wang
Na Li
Chuke Wang
Shu Wu
Zhifeng Li
Dong Yu
DiffM
177
0
0
23 Oct 2025
No Verifiable Reward for Prosody: Toward Preference-Guided Prosody Learning in TTS
Seungyoun Shin
Dongha Ahn
Jiwoo Kim
Sungwook Jeon
144
0
0
23 Sep 2025
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
Luozhijie Jin
Zijie Qiu
J. Liu
Zijie Diao
Lifeng Qiao
Ning Ding
Alex Lamb
Xipeng Qiu
AI4CE
168
4
0
28 Aug 2025
Multi-Metric Preference Alignment for Generative Speech Restoration
Junan Zhang
Xueyao Zhang
Jing Yang
Yuancheng Wang
Fan Fan
Zhizheng Wu
399
6
0
24 Aug 2025
CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training
Zhihao Du
Changfeng Gao
Yuxuan Wang
Fan Yu
Tianyu Zhao
...
Mengzhe Chen
Yafeng Chen
Shiliang Zhang
Wen Wang
Jieping Ye
AuLLM
406
95
0
23 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yongqian Li
Jiaheng Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
1.0K
319
0
08 May 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain
Paarth Neekhara
Xuesong Yang
Edresson Casanova
Subhankar Ghosh
Mikyas T. Desta
Roy Fejgin
Rafael Valle
Jason Chun Lok Li
491
22
0
07 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
OffRL, AI4TS, LRM, ReLM, VLM
1.8K
5,342
0
22 Jan 2025
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
818
366
0
09 Oct 2024
Preference Alignment Improves Language Model-Based TTS
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jinchuan Tian
Chunlei Zhang
Jiatong Shi
Hao Zhang
Jianwei Yu
Shinji Watanabe
Dong Yu
272
25
0
19 Sep 2024
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xiaoxue Gao
Chen Zhang
Yiming Chen
Huayun Zhang
Nancy F. Chen
293
35
0
16 Sep 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Keyu An
Qian Chen
Chong Deng
Zhihao Du
Changfeng Gao
...
Bin Zhang
Qinglin Zhang
Shiliang Zhang
Nan Zhao
Siqi Zheng
AuLLM
481
140
0
04 Jul 2024
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS
Sefik Emre Eskimez
Xiaofei Wang
Manthan Thakker
Canrun Li
Chung-Hsien Tsai
...
Min Tang
Xu Tan
Yanqing Liu
Sheng Zhao
Naoyuki Kanda
VLM
341
176
0
26 Jun 2024
Nemotron-4 340B Technical Report
Nvidia
Bo Adler
Niket Agarwal
Ashwath Aithal
...
Jimmy Zhang
Jing Zhang
Vivienne Zhang
Yian Zhang
Chen Zhu
339
122
0
17 Jun 2024
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Sanyuan Chen
Shujie Liu
Long Zhou
Yanqing Liu
Xu Tan
Jinyu Li
Sheng Zhao
Yao Qian
Furu Wei
VLM
351
175
0
08 Jun 2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Philip Anastassiou
Jiawei Chen
Jingshu Chen
Yuanzhe Chen
Zhuo Chen
...
Wenjie Zhang
Yanzhe Zhang
Zilin Zhao
Dejian Zhong
Xiaobin Zhuang
407
316
0
04 Jun 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
...
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
MoE
575
1,094
0
07 May 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak
Guillermo Cámbara
Yang Li
Fatih Beyhan
Arent van Korlaar
...
Bartosz Putrycz
Soledad López Gambino
Kayeon Yoo
Elena Sokolova
Thomas Drugman
LM&MA
478
116
0
12 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM, LRM
2.1K
5,487
0
05 Feb 2024
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Neural Information Processing Systems (NeurIPS), 2023
Matt Le
Apoorv Vyas
Bowen Shi
Brian Karrer
Leda Sari
...
Mary Williamson
Vimal Manohar
Yossi Adi
Jay Mahadeokar
Wei-Ning Hsu
AuLLM
386
478
0
23 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Neural Information Processing Systems (NeurIPS), 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
1.1K
8,135
0
29 May 2023
Training Diffusion Models with Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
757
778
0
22 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Interspeech, 2023
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
320
129
0
18 May 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
Zi-Hua Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
VLM
455
253
0
07 Mar 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2023
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
616
1,138
0
05 Jan 2023
Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hongji Wang
Che-Yuan Liang
Shuai Wang
Zhengyang Chen
Binbin Zhang
Xu Xiang
Yan Deng
Y. Qian
359
217
0
31 Oct 2022
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
353
189
0
12 Oct 2021
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
1.5K
26,647
0
20 Jul 2017