ResearchTrend.AI
Libri-Light: A Benchmark for ASR with Limited or No Supervision

17 December 2019
Jacob Kahn
M. Rivière
Weiyi Zheng
Evgeny Kharitonov
Qiantong Xu
Pierre-Emmanuel Mazaré
Julien Karadayi
Vitaliy Liptchinsky
R. Collobert
Christian Fuegen
Tatiana Likhomanenko
Gabriel Synnaeve
Armand Joulin
Abdel-rahman Mohamed
Emmanuel Dupoux
    AuLLM
arXiv: 1912.07875

Papers citing "Libri-Light: A Benchmark for ASR with Limited or No Supervision"

50 / 475 papers shown
End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations
Bolaji Yusuf
J. Černocký
Murat Saraclar
72
2
0
15 Aug 2023
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Xiaofei Wang
Manthan Thakker
Zhuo Chen
Naoyuki Kanda
Sefik Emre Eskimez
Sanyuan Chen
M. Tang
Shujie Liu
Jinyu Li
Takuya Yoshioka
109
86
0
14 Aug 2023
EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis
Tu Nguyen
Wei-Ning Hsu
Antony D'Avirro
Bowen Shi
Itai Gat
...
Gabriel Synnaeve
Michael Hassid
Felix Kreuk
Yossi Adi
Emmanuel Dupoux
75
62
0
10 Aug 2023
Federated Representation Learning for Automatic Speech Recognition
Guruprasad V Ramesh
Gopinath Chennupati
Milind Rao
Anit Kumar Sahu
Ariya Rastrow
J. Droppo
50
0
0
03 Aug 2023
Adaptation of Whisper models to child speech recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Peter Corcoran
H. Cucu
46
34
0
24 Jul 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
Varun Krishna
T. Sai
Sriram Ganapathy
SSL
57
2
0
14 Jul 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang
Jinglin Liu
Yi Ren
Jinzheng He
Zhe Ye
...
Pengfei Wei
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
120
52
0
14 Jul 2023
Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling
Hengguan Huang
Jagadeesh Balam
Boris Ginsburg
59
4
0
13 Jul 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers
Yuan Gong
Sameer Khurana
Leonid Karlinsky
James R. Glass
86
71
0
06 Jul 2023
Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters
Anshu Bhatia
Sanchit Sinha
Saket Dingliwal
Karthik Gopalakrishnan
S. Bodapati
Katrin Kirchhoff
75
6
0
02 Jul 2023
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
155
36
0
30 Jun 2023
Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information
Jiuxin Lin
Peng Wang
Heinrich Dinkel
Jun Chen
Zhiyong Wu
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
67
9
0
28 Jun 2023
Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
116
47
0
28 Jun 2023
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Matt Le
Apoorv Vyas
Bowen Shi
Brian Karrer
Leda Sari
...
Mary Williamson
Vimal Manohar
Yossi Adi
Jay Mahadeokar
Wei-Ning Hsu
AuLLM
121
306
0
23 Jun 2023
Learning When to Trust Which Teacher for Weakly Supervised ASR
Aakriti Agrawal
Milind Rao
Anit Kumar Sahu
Gopinath Chennupati
A. Stolcke
43
0
0
21 Jun 2023
Visually grounded few-shot word learning in low-resource settings
Leanne Nortje
Dan Oneaţă
Herman Kamper
VLM
49
4
0
20 Jun 2023
Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data
Yuka Ko
Ryo Fukuda
Yuta Nishikawa
Yasumasa Kano
Katsuhito Sudoh
Satoshi Nakamura
61
6
0
14 Jun 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
William Chen
Xuankai Chang
Yifan Peng
Zhaoheng Ni
Soumi Maiti
Shinji Watanabe
SSL
87
27
0
11 Jun 2023
Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages
Claytone Sikasote
Kalinda Siaminwe
Stanly Mwape
Bangiwe Zulu
Mofya Phiri
Martin Phiri
David Zulu
Mayumbo Nyirenda
Antonios Anastasopoulos
79
8
0
07 Jun 2023
PolyVoice: Language Models for Speech to Speech Translation
Qianqian Dong
Zhiying Huang
Qiao Tian
Chen Xu
Tom Ko
...
Lu Lu
Zejun Ma
Yuping Wang
Mingxuan Wang
Yuxuan Wang
105
25
0
05 Jun 2023
BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models
Marvin Lavechin
Yaya Sy
Hadrien Titeux
María Andrea Cruz Blandón
Okko Räsänen
H. Bredin
Emmanuel Dupoux
Alejandrina Cristià
AuLLM
141
13
0
02 Jun 2023
Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
OT
40
3
0
02 Jun 2023
DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
Haoyu Wang
Siyuan Wang
Weiqiang Zhang
Jinfeng Bai
71
2
0
02 Jun 2023
Exploration on HuBERT with Multiple Resolutions
Jiatong Shi
Yun Tang
Hirofumi Inaguma
Hongyu Gong
J. Pino
Shinji Watanabe
103
9
0
01 Jun 2023
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
99
27
0
01 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
37
0
0
31 May 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
83
27
0
30 May 2023
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models
Yu-Hsiang Wang
Huan Chen
Kai-Wei Chang
Winston H. Hsu
Hung-yi Lee
106
7
0
30 May 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Guanyong Wu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
79
5
0
29 May 2023
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation
Tianrui Wang
Long Zhou
Zi-Hua Zhang
Yu-Huan Wu
Shujie Liu
Yashesh Gaur
Zhuo Chen
Jinyu Li
Furu Wei
92
106
0
25 May 2023
Visually grounded few-shot word acquisition with fewer shots
Leanne Nortje
Benjamin van Niekerk
Herman Kamper
63
1
0
25 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
51
2
0
24 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
123
8
0
24 May 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayutsh Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
107
45
0
24 May 2023
On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications
Vamsikrishna Chemudupati
Marzieh S. Tahaei
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
SSL
128
7
0
23 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM, SyDa
127
61
0
22 May 2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
DiffM
81
5
0
22 May 2023
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities
Dong Zhang
Shimin Li
Xin Zhang
Jun Zhan
Pengyu Wang
Yaqian Zhou
Xipeng Qiu
AuLLM, MLLM
132
344
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
75
26
0
17 May 2023
SoundStorm: Efficient Parallel Audio Generation
Zalan Borsos
Matthew Sharifi
Damien Vincent
Eugene Kharitonov
Neil Zeghidour
Marco Tagliasacchi
90
110
0
16 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
75
26
0
15 May 2023
An Exploration into the Performance of Unsupervised Cross-Task Speech Representations for "In the Wild" Edge Applications
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Tiago H. Falk
SSL
70
2
0
09 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
83
3
0
09 May 2023
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Dima Rekesh
Nithin Rao Koluguri
Samuel Kriman
Somshubra Majumdar
Vahid Noroozi
...
Oleksii Hrinchuk
Krishna Puvvada
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
101
92
0
08 May 2023
Learning Robust Self-attention Features for Speech Emotion Recognition with Label-adaptive Mixup
Lei Kang
Lichao Zhang
Dazhi Jiang
61
6
0
07 May 2023
Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research
María Andrea Cruz Blandón
Alejandrina Cristià
Okko Räsänen
42
1
0
03 May 2023
Understanding Shared Speech-Text Representations
Gary Wang
Kyle Kastner
Ankur Bapna
Zhehuai Chen
Andrew Rosenberg
Bhuvana Ramabhadran
Yu Zhang
AuLLM
95
7
0
27 Apr 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
104
247
0
18 Apr 2023
HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
Soumya Dutta
Sriram Ganapathy
103
18
0
14 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
68
26
0
13 Apr 2023