TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation

12 May 2018
François Hernandez
Vincent Nguyen
Sahar Ghannay
N. Tomashenko
Yannick Esteve
    VLM

Papers citing "TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation"

50 / 205 papers shown
Title
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
46
17
0
20 Feb 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
49
13
0
19 Feb 2024
Exploring the limits of decoder-only models trained on public speech recognition corpora
Ankit Gupta
G. Saon
Brian Kingsbury
OffRL
25
5
0
31 Jan 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
...
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
34
40
0
30 Jan 2024
R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation
Jiaxin Guo
Zhanglin Wu
Zongyao Li
Hengchao Shang
Daimeng Wei
Xiaoyu Chen
Zhiqiang Rao
Shaojun Li
Hao Yang
35
1
0
11 Jan 2024
Audio-visual fine-tuning of audio-only ASR models
Avner May
Dmitriy Serdyuk
Ankit Parag Shah
Otavio Braga
Olivier Siohan
31
3
0
14 Dec 2023
Compression of end-to-end non-autoregressive image-to-speech system for low-resourced devices
Gokul Srinivasagan
Michael Deisher
Munir Georges
VLM
24
0
0
30 Nov 2023
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Jing Pan
Jian Wu
Yashesh Gaur
S. Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
ELM
35
26
0
03 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
27
52
0
01 Nov 2023
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
32
1
0
24 Oct 2023
Multi-stage Large Language Model Correction for Speech Recognition
Jie Pu
Thai-Son Nguyen
Sebastian Stüker
LRM
35
6
0
17 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
31
3
0
12 Oct 2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Tobi Olatunji
Tejumade Afonja
Aditya Yadavalli
Chris C. Emezue
Sahib Singh
...
Joanne I. Osuchukwu
Salomey Osei
A. Tonja
Naome A. Etori
Clinton Mbataku
32
16
0
30 Sep 2023
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Xuankai Chang
Brian Yan
Kwanghee Choi
Jee-weon Jung
Yichen Lu
...
Pengcheng Guo
Yao-Fei Cheng
Pavel Denisov
Kohei Saijo
Hsiu-Hsuan Wang
33
38
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
Eng Siong Chng
32
42
0
27 Sep 2023
Speech collage: code-switched audio generation by collaging monolingual corpora
A. Hussein
Dorsa Zeinali
Ondřej Klejch
Brian Yan
Shammur A. Chowdhury
Ahmed M. Ali
Shinji Watanabe
Sanjeev Khudanpur
27
1
0
27 Sep 2023
Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023
Sara Papi
Marco Gaido
Matteo Negri
43
7
0
27 Sep 2023
Updated Corpora and Benchmarks for Long-Form Speech Recognition
Jennifer Drexler Fox
Desh Raj
Natalie Delworth
Quinn Mcnamara
Corey Miller
Miguel Jetté
AuLLM
36
7
0
26 Sep 2023
Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Dongji Gao
Hainan Xu
Desh Raj
Leibny Paola García Perera
Daniel Povey
Sanjeev Khudanpur
35
4
0
26 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
39
36
0
25 Sep 2023
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
39
9
0
18 Sep 2023
Training dynamic models using early exits for automatic speech recognition on resource-constrained devices
George August Wright
Umberto Cappellazzo
Salah Zaiem
Desh Raj
Lucas Ondel Yang
Daniele Falavigna
Mohamed Nabih Ali
Alessio Brutti
42
2
0
18 Sep 2023
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing
Chen Wang
Minpeng Liao
Zhongqiang Huang
Jinliang Lu
Junhong Wu
Yuchen Liu
Chengqing Zong
Jiajun Zhang
AuLLM
33
38
0
02 Sep 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
38
4
0
12 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
30
4
0
24 Jul 2023
Ed-Fed: A generic federated learning framework with resource-aware client selection for edge devices
Zitha Sasindran
Harsha Yelchuri
T. V. Prabhakar
FedML
29
4
0
14 Jul 2023
Adapting an ASR Foundation Model for Spoken Language Assessment
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
27
11
0
13 Jul 2023
The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task
Kun Song
Yinjiao Lei
Pei-Ning Chen
Yiqing Cao
Kun Wei
Yongmao Zhang
Linfu Xie
Ning Jiang
Guoqing Zhao
29
1
0
10 Jul 2023
Can Generative Large Language Models Perform ASR Error Correction?
Rao Ma
Mengjie Qian
Potsawee Manakul
Mark Gales
Kate Knill
AuLLM
KELM
27
49
0
09 Jul 2023
When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
Anuj Diwan
Eunsol Choi
David Harwath
54
0
0
14 Jun 2023
KIT's Multilingual Speech Translation System for IWSLT 2023
Danni Liu
Thai-Binh Nguyen
Sai Koneru
Enes Yavuz Ugan
Ngoc-Quan Pham
Tuan-Nam Nguyen
Tu Anh Dinh
Carlos Mullov
A. Waibel
Jan Niehues
28
7
0
08 Jun 2023
Adapting an Unadaptable ASR System
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
33
3
0
01 Jun 2023
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models
David Qiu
David Rim
Shaojin Ding
Oleg Rybakov
Yanzhang He
MQ
35
4
0
24 May 2023
InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition
Zhibing Lai
Tianren Zhang
Qi Liu
Xinyuan Qian
Li-Fang Wei
Songlu Chen
Feng Chen
Xu-Cheng Yin
35
2
0
24 May 2023
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Dima Rekesh
Nithin Rao Koluguri
Samuel Kriman
Somshubra Majumdar
Vahid Noroozi
...
Oleksii Hrinchuk
Krishna Puvvada
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
56
84
0
08 May 2023
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
Xubo Liu
Egor Lakomkin
Konstantinos Vougioukas
Pingchuan Ma
Honglie Chen
...
Niko Moritz
J. Kolár
Stavros Petridis
M. Pantic
Christian Fuegen
52
19
0
30 Mar 2023
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis
Karren D. Yang
Ting-Yao Hu
Jen-Hao Rick Chang
H. Koppula
Oncel Tuzel
48
12
0
27 Mar 2023
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels
Pingchuan Ma
A. Haliassos
Adriana Fernandez-Lopez
Honglie Chen
Stavros Petridis
M. Pantic
27
107
0
25 Mar 2023
Right the docs: Characterising voice dataset documentation practices used in machine learning
Kathy Reid
Elizabeth T. Williams
27
2
0
19 Mar 2023
Visual Information Matters for ASR Error Correction
Bannihati Kumar Vanya
Shanbo Cheng
Ningxin Peng
Yuchen Zhang
29
3
0
16 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
79
254
0
02 Mar 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
45
210
0
01 Mar 2023
deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition
Dianwen Ng
Ruixi Zhang
J. Yip
Zhao Yang
Jinjie Ni
Chong Zhang
Yukun Ma
Chongjia Ni
Eng Siong Chng
B. Ma
29
14
0
28 Feb 2023
Efficient Ensemble for Multimodal Punctuation Restoration using Time-Delay Neural Network
Xing Yi Liu
Homayoon Beigi
9
1
0
26 Feb 2023
Federated Learning for ASR based on Wav2vec 2.0
Tuan Nguyen
Salima Mdhaffar
N. Tomashenko
J. Bonastre
Yannick Esteve
FedML
47
10
0
20 Feb 2023
ASR Bundestag: A Large-Scale political debate dataset in German
Johannes Wirth
René Peinl
32
1
0
12 Feb 2023
MAC: A unified framework boosting low resource automatic speech recognition
Zeping Min
Qian Ge
Zhong Li
E. Weinan
21
1
0
05 Feb 2023
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks
Suwon Shon
Siddhant Arora
Chyi-Jiunn Lin
Ankita Pasad
Felix Wu
Roshan S. Sharma
Wei Wu
Hung-yi Lee
Karen Livescu
Shinji Watanabe
ELM
21
32
0
20 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
M. Pantic
SSL
45
49
0
12 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
79
3,338
0
06 Dec 2022