Textless Speech Emotion Conversion using Discrete and Decomposed Representations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
14 November 2021
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi

Papers citing "Textless Speech Emotion Conversion using Discrete and Decomposed Representations"

29 papers shown

Heptapod: Language Modeling on Visual Signals
Yongxin Zhu, J. Chen, Yuanzhe Chen, Zhuo Chen, Dongya Jia, Jian Cong, Xiaobin Zhuang, Yuping Wang
08 Oct 2025

HuLA: Prosody-Aware Anti-Spoofing with Multi-Task Learning for Expressive and Emotional Synthetic Speech
Aurosweta Mahapatra, Ismail Rasim Ulgen, Berrak Sisman
25 Sep 2025

Maestro-EVC: Controllable Emotional Voice Conversion Guided by References and Explicit Prosody
Jinsung Yoon, Wooyeol Jeong, Jio Gim, Young-Joo Suh
09 Aug 2025

Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion
Seymanur Akti, T. Nguyen, Jan Niehues
04 Jun 2025

Audio-to-Audio Emotion Conversion With Pitch And Duration Style Transfer
Soumya Dutta, Avni Jain, Sriram Ganapathy
23 May 2025

On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora, Kai-Wei Chang, Chung-Ming Chien, Yifan Peng, Haibin Wu, Yossi Adi, Emmanuel Dupoux, Hung-yi Lee, Karen Livescu, Shinji Watanabe
11 Apr 2025

Scaling Analysis of Interleaved Speech-Text Language Models
Gallil Maimon, Michael Hassid, Amit Roth, Yossi Adi
03 Apr 2025

Unsupervised Speech Segmentation: A General Approach Using Speech Language Models
Avishai Elmakies, Omri Abend, Yossi Adi
08 Jan 2025

A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma, Yongqian Li, Yifan Xie, Y. He, Yujiao Shi, ..., Z. Liu, Wei Yao, Fuji Ren, Fei Richard Yu, Shiguang Ni
10 Dec 2024

Enhancing TTS Stability in Hebrew using Discrete Semantic Units
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Ella Zeldes, Or Tal, Yossi Adi
28 Oct 2024

Speech Recognition Rescoring with Large Speech-Text Foundation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Prashanth Gurunath Shivakumar, J. Kolehmainen, Aditya Gourav, Yi Gu, Ankur Gandhe, Ariya Rastrow, I. Bulyko
25 Sep 2024

Estimating the Completeness of Discrete Speech Units
Spoken Language Technology Workshop (SLT), 2024
Sung-Lin Yeh, Hao Tang
09 Sep 2024

The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin
11 Jun 2024

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
The Speaker and Language Recognition Workshop (Odyssey), 2024
Zongyang Du, Junchen Lu, Kun Zhou, Lakshmish Kaushik, Berrak Sisman
02 May 2024

SpiRit-LM: Interleaved Spoken and Written Language Model
Tu Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-jussá, Maha Elbayad, ..., Itai Gat, Gabriel Synnaeve, Juan Pino, Benoît Sagot, Emmanuel Dupoux
08 Feb 2024

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment
IEEE Transactions on Affective Computing, 2024
Hyoung-Seok Oh, Sang-Hoon Lee, Deok-Hyun Cho, Seong-Whan Lee
16 Jan 2024

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara, Shehzeen Samarah Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, F. Koushanfar, Julian McAuley
14 Oct 2023

Enhancing expressivity transfer in textless speech-to-speech translation
Automatic Speech Recognition & Understanding (ASRU), 2023
J. Duret, Benjamin O’Brien, Yannick Esteve, Titouan Parcollet
11 Oct 2023

JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
IEEE Access, 2023
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari
09 Oct 2023

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS
Xin Wang, Taein Kwon, Wei-Ning Hsu, Yossi Adi, Tu Nguyen, D. Bohus, Emmanuel Dupoux, Neel Joshi, Abdelrahman Mohamed
29 Sep 2023

Towards General-Purpose Text-Instruction-Guided Voice Conversion
Automatic Speech Recognition & Understanding (ASRU), 2023
Chun-Yi Kuan, Chen-An Li, Tsung-Yuan Hsu, Tzu-Quan Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-yiin Chang, Hung-yi Lee
25 Sep 2023

Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ji-Hoon Kim, Jaehun Kim, Joon Son Chung
29 Aug 2023

From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
Neural Information Processing Systems (NeurIPS), 2023
Robin San Roman, Yossi Adi, Antoine Deleforge, Romain Serizel, Gabriel Synnaeve, Alexandre Défossez
02 Aug 2023

Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data
Speech Synthesis Workshop (SSW), 2023
J. Duret, Titouan Parcollet, Yannick Esteve
29 Jun 2023

AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz
22 May 2023

WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions
International Conference on Human Factors in Computing Systems (CHI), 2023
Jun Rekimoto
03 Mar 2023

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
Transactions of the Association for Computational Linguistics (TACL), 2023
Eugene Kharitonov, Damien Vincent, Zalan Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matthew Sharifi, Marco Tagliasacchi, Neil Zeghidour
07 Feb 2023

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Proceedings of the IEEE, 2022
Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe İymen, M. Sezgin, Xiangheng He, ..., Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao
06 Oct 2022

AudioGen: Textually Guided Audio Generation
International Conference on Learning Representations (ICLR), 2022
Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi
30 Sep 2022