ResearchTrend.AI
Libri-Light: A Benchmark for ASR with Limited or No Supervision


17 December 2019
Jacob Kahn
M. Rivière
Weiyi Zheng
Evgeny Kharitonov
Qiantong Xu
Pierre-Emmanuel Mazaré
Julien Karadayi
Vitaliy Liptchinsky
R. Collobert
Christian Fuegen
Tatiana Likhomanenko
Gabriel Synnaeve
Armand Joulin
Abdel-rahman Mohamed
Emmanuel Dupoux
    AuLLM

Papers citing "Libri-Light: A Benchmark for ASR with Limited or No Supervision"

50 / 475 papers shown
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Zili Huang
Desh Raj
Leibny Paola García-Perera
Sanjeev Khudanpur
155
29
0
01 Nov 2022
token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text
Xianghu Yue
Junyi Ao
Xiaoxue Gao
Haizhou Li
SSL
60
8
0
30 Oct 2022
Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition
Zezhong Jin
Dading Zhong
Xiao Song
Zhaoyi Liu
Naipeng Ye
Qingcheng Zeng
51
2
0
28 Oct 2022
Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models
Ramon Sanabria
Hao Tang
Sharon Goldwater
SSL
105
19
0
28 Oct 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
80
30
0
27 Oct 2022
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang
Changli Tang
Ziyang Ma
Zhisheng Zheng
Xie Chen
Weiqiang Zhang
110
1
0
27 Oct 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
63
9
0
26 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
74
9
0
23 Oct 2022
Guided contrastive self-supervised pre-training for automatic speech recognition
Aparna Khare
Minhua Wu
Saurabhchand Bhati
J. Droppo
Roland Maas
SSL
52
0
0
22 Oct 2022
Named Entity Detection and Injection for Direct Speech Translation
Marco Gaido
Yun Tang
Ilia Kulikov
Rongqing Huang
Hongyu Gong
Hirofumi Inaguma
63
3
0
21 Oct 2022
Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech
Cheol Jun Cho
Peter Wu
Abdel-rahman Mohamed
Gopala K. Anumanchipalli
82
34
0
21 Oct 2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
Gary Wang
Ekin D. Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel S. Park
110
2
0
19 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
79
19
0
19 Oct 2022
MaSS: Multi-attribute Selective Suppression
Chun-Fu Chen
Shaohan Hu
Zhong-Zhi Shi
Prateek Gulati
Bill Moriarty
Marco Pistoia
Vincenzo Piuri
P. Samarati
CVBM
50
4
0
18 Oct 2022
Continuous Pseudo-Labeling from the Start
Dan Berrebbi
R. Collobert
Samy Bengio
Navdeep Jaitly
Tatiana Likhomanenko
51
16
0
17 Oct 2022
TransFusion: Transcribing Speech with Multinomial Diffusion
Matthew Baas
Kevin Eloff
Herman Kamper
DiffM
31
4
0
14 Oct 2022
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning
Ali Safaya
E. Erzin
41
1
0
13 Oct 2022
Towards visually prompted keyword localisation for zero-resource spoken languages
Leanne Nortje
Herman Kamper
45
6
0
12 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
126
58
0
07 Oct 2022
Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining
H. S. Bovbjerg
Zheng-Hua Tan
VLM
79
3
0
04 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Hyung Kim
Hyung-Jeong Yang
37
3
0
01 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
168
117
0
30 Sep 2022
SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data
Zi-Hua Zhang
Sanyuan Chen
Long Zhou
Yu Wu
Shuo Ren
...
Zhuoyuan Yao
Xun Gong
Lirong Dai
Jinyu Li
Furu Wei
79
57
0
30 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
78
3
0
28 Sep 2022
An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis
Tobias Hallmen
Silvan Mertes
Dominik Schiller
Elisabeth André
47
5
0
28 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
64
5
0
26 Sep 2022
Self-Relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst
Dang-Linh Trinh
Minh-Cong Vo
Gueesang Lee
57
3
0
15 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
163
616
0
07 Sep 2022
Learning a Dual-Mode Speech Recognition Model via Self-Pruning
Chunxi Liu
Yuan Shangguan
Haichuan Yang
Yangyang Shi
Raghuraman Krishnamoorthi
Ozlem Kalinli
SSL
84
7
0
25 Jul 2022
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices
Harlin Lee
Aaqib Saeed
58
2
0
12 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Tomoki Toda
59
17
0
10 Jul 2022
Distance-Based Sound Separation
K. Patterson
K. Wilson
Scott Wisdom
J. Hershey
49
22
0
01 Jul 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL
VLM
43
15
0
27 Jun 2022
Is the Language Familiarity Effect gradual? A computational modelling approach
Maureen de Seyssel
Guillaume Wisniewski
Emmanuel Dupoux
17
2
0
27 Jun 2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations
Detai Xin
Shinnosuke Takamichi
Hiroshi Saruwatari
44
14
0
21 Jun 2022
Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training
Bowen Zhang
Songjun Cao
Xiaoming Zhang
Yike Zhang
Long Ma
T. Shinozaki
SSL
62
6
0
16 Jun 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
246
53
0
30 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
273
367
0
21 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
87
16
0
18 May 2022
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data
Alena Aksenova
Zhehuai Chen
Chung-Cheng Chiu
D. Esch
Pavel Golik
...
Levi King
Bhuvana Ramabhadran
Andrew Rosenberg
Suzan Schwartz
Gary Wang
100
23
0
16 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
94
39
0
02 May 2022
Ultra Fast Speech Separation Model with Teacher Student Learning
Sanyuan Chen
Yu-Huan Wu
Zhuo Chen
Jian Wu
Takuya Yoshioka
Shujie Liu
Jinyu Li
Xiangzhan Yu
75
14
0
27 Apr 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
95
42
0
27 Apr 2022
HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition
J. Yoon
Beom Jun Woo
N. Kim
66
13
0
13 Apr 2022
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance
Lin Zhang
Xin Wang
Erica Cooper
Nicholas W. D. Evans
Junichi Yamagishi
109
58
0
11 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
92
108
0
07 Apr 2022
Speech Pre-training with Acoustic Piece
Shuo Ren
Shujie Liu
Yu Wu
Long Zhou
Furu Wei
SSL
57
17
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
140
58
0
06 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
54
34
0
06 Apr 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
56
27
0
05 Apr 2022