LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models

4 October 2023
Aleksandr Meister
Matvei Novikov
Nikolay Karpov
Evelina Bakhturina
Vitaly Lavrukhin
Boris Ginsburg

Papers citing "LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models"

9 / 9 papers shown
SupertonicTTS: Towards Highly Scalable and Efficient Text-to-Speech System
Hyeongju Kim
Jinhyeok Yang
Yechan Yu
Seunghun Ji
Jacob Morton
Frederik Bous
Joon Byun
Juheon Lee
51
0
0
29 Mar 2025
DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation
Haomin Zhang
Chang Liu
Junjie Zheng
Zihao Chen
Chaofan Ding
Xinhan Di
DiffM
VGen
85
0
0
28 Mar 2025
Methods to Increase the Amount of Data for Speech Recognition for Low Resource Languages
Alexan Ayrapetyan
Sofia Kostandian
Ara Yeroyan
Mher Yerznkanyan
Nikolay Karpov
Nune Tadevosyan
Vitaly Lavrukhin
Boris Ginsburg
63
0
0
08 Jan 2025
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Yushen Chen
Zhikang Niu
Ziyang Ma
Keqi Deng
Chunhui Wang
Jian Zhao
Kai Yu
Xie Chen
33
52
0
09 Oct 2024
Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR
Yael Segal-Feldman
Aviv Shamsian
Aviv Navon
Gill Hetz
Joseph Keshet
27
1
0
24 Sep 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
37
2
0
09 Sep 2024
Cellwise robust and sparse principal component analysis
Pia Pfeiffer
Laura Vana-Gur
Peter Filzmoser
13
0
0
28 Aug 2024
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS
Sefik Emre Eskimez
Xiaofei Wang
Manthan Thakker
Canrun Li
Chung-Hsien Tsai
...
Min Tang
Xu Tan
Yanqing Liu
Sheng Zhao
Naoyuki Kanda
VLM
35
47
0
26 Jun 2024
Whispy: Adapting STT Whisper Models to Real-Time Environments
Antonio Bevilacqua
Paolo Saviano
A. Amirante
S. Romano
21
3
0
06 May 2024