arXiv: 2305.10615
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

18 May 2023
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
Wei-Ping Huang
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
    ELM

Papers citing "ML-SUPERB: Multilingual Speech Universal PERformance Benchmark"

48 / 48 papers shown
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
Andrew Rouditchenko
Saurabhchand Bhati
Samuel Thomas
Hilde Kuehne
Rogerio Feris
85
1
0
03 Feb 2025
Discrete Speech Unit Extraction via Independent Component Analysis
Tomohiko Nakamura
Kwanghee Choi
Keigo Hojo
Yoshiaki Bando
Satoru Fukayama
Shinji Watanabe
38
0
0
11 Jan 2025
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Yen-Ju Lu
Jing Liu
Thomas Thebaud
Laureano Moro Velázquez
Ariya Rastrow
Najim Dehak
Jesus Villalba
65
1
0
05 Dec 2024
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Shih-Heng Wang
Zih-Ching Chen
Jiatong Shi
Ming To Chuang
Guan-Ting Lin
Kuan Po Huang
David F. Harwath
Shang-Wen Li
Hung-yi Lee
70
1
0
27 Nov 2024
An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions
Theo Clark
Benedetta Cevoli
Eloy de Jong
Timofey Abramski
Jamie Dougherty
SSL
31
0
0
31 Oct 2024
Exploiting Longitudinal Speech Sessions via Voice Assistant Systems for Early Detection of Cognitive Decline
Kristin Qi
Jiatong Shi
Caroline Summerour
J. Batsis
Xiaohui Liang
26
0
0
16 Oct 2024
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
Brian Yan
Vineel Pratap
Shinji Watanabe
Michael Auli
21
0
0
27 Sep 2024
Enhancing Synthetic Training Data for Speech Commands: From ASR-Based Filtering to Domain Adaptation in SSL Latent Space
Sebastião Quintas
Isabelle Ferrané
Thomas Pellegrini
23
0
0
19 Sep 2024
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration
Masao Someki
Kwanghee Choi
Siddhant Arora
William Chen
Samuele Cornell
Jionghao Han
Yifan Peng
Jiatong Shi
Vaibhav Srivastav
Shinji Watanabe
VLM
23
0
0
14 Sep 2024
STAB: Speech Tokenizer Assessment Benchmark
Shikhar Vashishth
Harman Singh
Shikhar Bharadwaj
Sriram Ganapathy
Chulayuth Asawaroengchai
Kartik Audhkhasi
Andrew Rosenberg
Ankur Bapna
Bhuvana Ramabhadran
41
0
0
04 Sep 2024
A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition
Shreya G. Upadhyay
Carlos Busso
Chi-Chun Lee
26
3
0
06 Jul 2024
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
Salima Mdhaffar
Haroun Elleuch
Fethi Bougares
Yannick Esteve
36
0
0
05 Jul 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations
Kunal Dhawan
Nithin Rao Koluguri
Ante Jukić
Ryan Langman
Jagadeesh Balam
Boris Ginsburg
31
1
0
03 Jul 2024
Less Forgetting for Better Generalization: Exploring Continual-learning Fine-tuning Methods for Speech Self-supervised Representations
Salah Zaiem
Titouan Parcollet
S. Essid
CLL
28
3
0
30 Jun 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Zhehuai Chen
He Huang
Oleksii Hrinchuk
Krishna C. Puvvada
Nithin Rao Koluguri
Piotr Żelasko
Jagadeesh Balam
Boris Ginsburg
AuLLM
RALM
26
10
0
28 Jun 2024
Interface Design for Self-Supervised Speech Models
Yi-Jen Shih
David Harwath
41
1
0
18 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
43
7
0
14 Jun 2024
SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models
Yuxun Tang
Yuning Wu
Jiatong Shi
Qin Jin
39
5
0
13 Jun 2024
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation
Yifeng Yu
Jiatong Shi
Yuning Wu
Shinji Watanabe
23
3
0
13 Jun 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
70
5
0
12 Jun 2024
Self-Supervised Speech Representations are More Phonetic than Semantic
Kwanghee Choi
Ankita Pasad
Tomohiko Nakamura
Satoru Fukayama
Karen Livescu
Shinji Watanabe
21
14
0
12 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
35
15
0
11 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
54
8
0
10 Jun 2024
Efficient Compression of Multitask Multilingual Speech Models
Thomas Palmeira Ferraz
31
0
0
02 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
35
19
0
15 Apr 2024
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge
Yiwei Guo
Chenrun Wang
Yifan Yang
Hankun Wang
Ziyang Ma
...
Hanzheng Li
Shuai Fan
Hui Zhang
Xie Chen
Kai Yu
23
1
0
09 Apr 2024
AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies
José-M. Acosta-Triana
David Gimeno-Gómez
Carlos David Martínez Hinarejos
VLM
VGen
29
2
0
20 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Takanori Ashihara
Shoko Araki
J. Černocký
26
1
0
17 Feb 2024
Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?
Orchid Chetia Phukan
Gautam Siddharth Kashyap
Arun Balaji Buduru
Rajesh Sharma
18
0
0
02 Feb 2024
Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus
Yi-Hui Chou
Kalvin Chang
Meng-Ju Wu
Winston Ou
Alice Wen-Hsin Bi
...
Iu-Tshian Phoann
Winnie Chang
Chenxuan Cui
Noel Chen
Jiatong Shi
29
3
0
06 Dec 2023
CL-MASR: A Continual Learning Benchmark for Multilingual ASR
Luca Della Libera
Pooneh Mousavi
Salah Zaiem
Cem Subakan
Mirco Ravanelli
AuLLM
CLL
27
13
0
25 Oct 2023
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
25
15
0
09 Oct 2023
HuBERTopic: Enhancing Semantic Representation of HuBERT through Self-supervision Utilizing Topic Model
Takashi Maekaku
Jiatong Shi
Xuankai Chang
Yuya Fujita
Shinji Watanabe
15
1
0
06 Oct 2023
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios
Tejes Srivastava
Jiatong Shi
William Chen
Shinji Watanabe
24
1
0
05 Oct 2023
Evaluating Self-Supervised Speech Representations for Indigenous American Languages
Chih-Chen Chen
William Chen
Rodolfo Zevallos
John E. Ortega
34
7
0
05 Oct 2023
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
Kuan-Po Huang
Chih-Kai Yang
Yu-Kuan Fu
Ewan Dunbar
Hung-yi Lee
24
5
0
04 Oct 2023
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi
H. Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
31
24
0
04 Oct 2023
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
Hongfei Xue
Qijie Shao
Tommy Yuan
Peikun Chen
Jie Liu
Lei Xie
24
2
0
29 Sep 2023
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Xuankai Chang
Brian Yan
Kwanghee Choi
Jee-weon Jung
Yichen Lu
...
Pengcheng Guo
Yao-Fei Cheng
Pavel Denisov
Kohei Saijo
Hsiu-Hsuan Wang
18
36
0
27 Sep 2023
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning
William Chen
Jiatong Shi
Brian Yan
Dan Berrebbi
Wangyou Zhang
Yifan Peng
Xuankai Chang
Soumi Maiti
Shinji Watanabe
11
8
0
26 Sep 2023
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
Chien-yu Huang
Ke-Han Lu
Shi Wang
Chi-Yuan Hsiao
Chun-Yi Kuan
...
Roshan S. Sharma
Shinji Watanabe
Bhiksha Ramakrishnan
Shady Shehata
Hung-yi Lee
AuLLM
24
50
0
18 Sep 2023
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Heng-Jui Chang
Ning Dong
Ruslan Mavlyutov
Sravya Popuri
Yu-An Chung
26
6
0
14 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
32
14
0
11 Sep 2023
Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition
Anant Singh
Akshat Gupta
24
4
0
17 Aug 2023
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
C. Zhang
Bo-wen Li
Tara N. Sainath
Trevor Strohman
S. Mavandadi
Shuo-yiin Chang
Parisa Haghani
124
28
0
13 Sep 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
73
281
0
25 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
124
339
0
21 May 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM
AILaw
ELM
46
21
0
21 Mar 2022