ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.06795
  4. Cited By
Efficient Sequence Transduction by Jointly Predicting Tokens and
  Durations

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

13 April 2023
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
ArXivPDFHTML

Papers citing "Efficient Sequence Transduction by Jointly Predicting Tokens and Durations"

14 / 14 papers shown
Title
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Haoshuai Zhou
Boxuan Cao
Changgeng Mo
Linkai Li
Shan Xiang Wang
AI4CE
14
0
0
13 May 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain
Paarth Neekhara
Xuesong Yang
Edresson Casanova
Subhankar Ghosh
Mikyas T. Desta
Roy Fejgin
Rafael Valle
Jason Chun Lok Li
59
2
0
07 Feb 2025
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
J. Hu
Zuchao Li
Mengjia Shen
Haojun Ai
Sheng Li
Jun Zhang
29
0
0
20 Jan 2025
HAINAN: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
HAINAN: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
Hainan Xu
Travis M. Bartley
Vladimir Bataev
Boris Ginsburg
63
0
0
03 Oct 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training
  for Enhanced Speech Recognition and Translation
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
24
2
0
09 Sep 2024
Romanization Encoding For Multilingual ASR
Romanization Encoding For Multilingual ASR
Wen Ding
Fei Jia
Hainan Xu
Yu Xi
Junjie Lai
Boris Ginsburg
16
0
0
05 Jul 2024
Label-Looping: Highly Efficient Decoding for Transducers
Label-Looping: Highly Efficient Decoding for Transducers
Vladimir Bataev
Hainan Xu
Daniel Galvez
Vitaly Lavrukhin
Boris Ginsburg
23
4
0
10 Jun 2024
Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models
  on GPU
Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU
Daniel Galvez
Vladimir Bataev
Hainan Xu
Tim Kaldewey
21
5
0
06 Jun 2024
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration
  Transducer
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Yu Xi
Hao Li
Baochen Yang
Haoyu Li
Hai-kun Xu
Kai Yu
22
1
0
20 Mar 2024
Speech-based Slot Filling using Large Language Models
Speech-based Slot Filling using Large Language Models
Guangzhi Sun
Shutong Feng
Dongcheng Jiang
Chao Zhang
Milica Gasic
P. Woodland
13
1
0
13 Nov 2023
Leveraging Multilingual Self-Supervised Pretrained Models for
  Sequence-to-Sequence End-to-End Spoken Language Understanding
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
Pavel Denisov
Ngoc Thang Vu
24
1
0
09 Oct 2023
ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object
  Interactions in Industrial Scenarios
ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object Interactions in Industrial Scenarios
Francesco Ragusa
Rosario Leonardi
Michele Mazzamuto
Claudia Bonanno
Rosario Scavo
Antonino Furnari
G. Farinella
22
7
0
26 Sep 2023
Echo State Speech Recognition
Echo State Speech Recognition
H. Shrivastava
Ankush Garg
Yuan Cao
Yu Zhang
Tara N. Sainath
31
22
0
18 Feb 2021
NeMo: a toolkit for building AI applications using Neural Modules
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
179
287
0
14 Sep 2019
1