arXiv: 1805.04699
TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation

12 May 2018
François Hernandez
Vincent Nguyen
Sahar Ghannay
N. Tomashenko
Yannick Esteve
    VLM

Papers citing "TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation"

50 / 204 papers shown
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
34
0
0
30 Apr 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
52
3
0
11 Apr 2025
Scaling Analysis of Interleaved Speech-Text Language Models
Gallil Maimon
Michael Hassid
Amit Roth
Yossi Adi
AuLLM
45
0
0
03 Apr 2025
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
G. Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
ObjD
101
0
0
26 Mar 2025
Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
Minsu Kim
Rodrigo Mira
Honglie Chen
Stavros Petridis
M. Pantic
69
0
0
13 Mar 2025
Text-Speech Language Models with Improved Cross-Modal Transfer by Aligning Abstraction Levels
Santiago Cuervo
Adel Moumen
Yanis Labrak
Sameer Khurana
Antoine Laurent
Mickael Rouvier
R. Marxer
77
1
0
08 Mar 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
38
3
0
19 Feb 2025
Soundwave: Less is More for Speech-Text Alignment in LLMs
Yunke Zhang
Zhiheng Liu
Fan Bu
Ruiyu Zhang
Benyou Wang
Yiming Li
AuLLM
SyDa
VLM
107
0
0
18 Feb 2025
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
Xiangyu Lu
Wang Xu
Haoyu Wang
Hongyun Zhou
Haiyan Zhao
Conghui Zhu
T. Zhao
M. Yang
Mamba
AuLLM
66
0
0
16 Feb 2025
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction
Moreno La Quatra
Valerio Mario Salerno
Yu Tsao
Sabato Marco Siniscalchi
99
0
0
22 Jan 2025
Towards Maximum Likelihood Training for Transducer-based Streaming Speech Recognition
Hyeonseung Lee
J. Yoon
Sungsoo Kim
N. Kim
71
0
0
26 Nov 2024
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
Yoshiki Masuyama
Koichi Miyazaki
Masato Murata
Mamba
43
0
0
11 Nov 2024
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci
Marco Gaido
Beatrice Savoldi
Matteo Negri
Mauro Cettolo
L. Bentivogli
57
1
0
03 Nov 2024
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Qinglin Zhang
Luyao Cheng
Chong Deng
Qian Chen
Wen Wang
...
Jiaqing Liu
Hai Yu
Chaohong Tan
Zhihao Du
Shiliang Zhang
SyDa
BDL
AuLLM
VLM
64
13
0
23 Oct 2024
The First VoicePrivacy Attacker Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Emmanuel Vincent
Junichi Yamagishi
131
2
0
09 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
82
0
0
09 Oct 2024
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages
Marco Gaido
Sara Papi
L. Bentivogli
Alessio Brutti
Mauro Cettolo
R. Gretter
M. Matassoni
Mohamed Nabih
Matteo Negri
44
1
0
01 Oct 2024
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
28
0
0
29 Sep 2024
Speech Recognition Rescoring with Large Speech-Text Foundation Models
Prashanth Gurunath Shivakumar
J. Kolehmainen
Aditya Gourav
Yi Gu
Ankur Gandhe
Ariya Rastrow
I. Bulyko
AuLLM
31
0
0
25 Sep 2024
Revisiting Acoustic Features for Robust ASR
Muhammad Ahmed Shah
Bhiksha Raj
AAML
21
0
0
24 Sep 2024
A quest through interconnected datasets: lessons from highly-cited ICASSP papers
Cynthia C. S. Liem
Doğa Taşcılar
Andrew M. Demetriou
30
0
0
19 Sep 2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Yufeng Yang
Desh Raj
Ju Lin
Niko Moritz
Junteng Jia
...
Egor Lakomkin
Yiteng Huang
Jacob Donley
Jay Mahadeokar
Ozlem Kalinli
36
2
0
17 Sep 2024
ASR Error Correction using Large Language Models
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
KELM
46
1
0
14 Sep 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
44
3
0
09 Sep 2024
Cellwise robust and sparse principal component analysis
Pia Pfeiffer
Laura Vana-Gur
Peter Filzmoser
20
0
0
28 Aug 2024
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
Mengjie Qian
Siyuan Tang
Rao Ma
Kate Knill
Mark Gales
CLL
39
3
0
09 Jul 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
52
19
0
05 Jul 2024
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models
Bolaji Yusuf
M. Baskar
Andrew Rosenberg
Bhuvana Ramabhadran
45
1
0
05 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
42
6
0
30 Jun 2024
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024
Sai Koneru
Thai-Binh Nguyen
Ngoc-Quan Pham
Danni Liu
Zhaolin Li
Alexander Waibel
Jan Niehues
OffRL
44
3
0
24 Jun 2024
Self-Train Before You Transcribe
Robert Flynn
Anton Ragni
41
0
0
17 Jun 2024
Multi-Modal Retrieval For Large Language Model Based Speech Recognition
J. Kolehmainen
Aditya Gourav
Prashanth Gurunath Shivakumar
Yile Gu
Ankur Gandhe
Ariya Rastrow
Grant P. Strimel
I. Bulyko
40
4
0
13 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
46
3
0
13 Jun 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models
Jinchuan Tian
Yifan Peng
William Chen
Kwanghee Choi
Karen Livescu
Shinji Watanabe
37
5
0
13 Jun 2024
Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices
A. Sigurgeirsson
Eddie L. Ungless
39
2
0
11 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
70
8
0
10 Jun 2024
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing
V. Trinh
Rosy Southwell
Yiwen Guan
Xinlu He
Zhiyong Wang
Jacob Whitehill
OffRL
36
2
0
04 Jun 2024
Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients
Mohamed Nabih Ali
Alessio Brutti
Daniele Falavigna
45
0
0
27 May 2024
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition
Zijin Gu
Tatiana Likhomanenko
Richard He Bai
Erik McDermott
R. Collobert
Navdeep Jaitly
AuLLM
58
2
0
24 May 2024
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Chengwei Qin
Pin-Yu Chen
Chng Eng Siong
Chao Zhang
VLM
33
3
0
23 May 2024
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling
Marco Gaido
Sara Papi
Matteo Negri
Mauro Cettolo
L. Bentivogli
43
1
0
17 May 2024
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models
Yuchen Hu
Chen Chen
Chengwei Qin
Qiushi Zhu
Eng Siong Chng
Ruizhe Li
AuLLM
KELM
54
5
0
16 May 2024
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
Vyas Raina
Rao Ma
Charles G McGhee
Kate Knill
Mark Gales
AAML
33
5
0
09 May 2024
Whispy: Adapting STT Whisper Models to Real-Time Environments
Antonio Bevilacqua
Paolo Saviano
A. Amirante
S. Romano
23
3
0
06 May 2024
Automatic Speech Recognition System-Independent Word Error Rate Estimation
Chanho Park
Mingjie Chen
Thomas Hain
26
0
0
25 Apr 2024
Cross-Domain Audio Deepfake Detection: Dataset and Analysis
Yuang Li
Min Zhang
Mengxin Ren
Miaomiao Ma
Daimeng Wei
Hao Yang
43
4
0
07 Apr 2024
Scaling Properties of Speech Language Models
Santiago Cuervo
R. Marxer
31
9
0
31 Mar 2024
SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation
Jiayu Du
Jinpeng Li
Guoguo Chen
Wei-Qiang Zhang
ELM
37
3
0
13 Mar 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
32
5
0
08 Mar 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
46
17
0
20 Feb 2024