SUPERB: Speech processing Universal PERformance Benchmark

3 May 2021
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
Yist Y. Lin
Andy T. Liu
Jiatong Shi
Xuankai Chang
Guan-Ting Lin
Tzu-hsien Huang
Wei-Cheng Tseng
Ko-tik Lee
Da-Rong Liu
Zili Huang
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
    SSL

Papers citing "SUPERB: Speech processing Universal PERformance Benchmark"

50 / 160 papers shown
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
18
3
0
12 Oct 2023
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
34
15
0
09 Oct 2023
XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words
Robin Algayres
Pablo Diego-Simon
Benoît Sagot
Emmanuel Dupoux
28
1
0
08 Oct 2023
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Zhisheng Zheng
Ziyang Ma
Yu Wang
Xie Chen
26
2
0
28 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification
Harunori Kawano
Sota Shimizu
30
1
0
22 Aug 2023
Improving Joint Speech-Text Representations Without Alignment
Cal Peyser
Zhong Meng
Ke Hu
Rohit Prabhavalkar
Andrew Rosenberg
Tara N. Sainath
M. Picheny
Kyunghyun Cho
VLM
26
4
0
11 Aug 2023
Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains
Martin Lebourdais
Théo Mariotte
Marie Tahon
Anthony Larcher
Antoine Laurent
Silvio Montrésor
S. Meignier
Jean-Hugh Thomas
VLM
25
5
0
24 Jul 2023
Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition
Weidong Chen
Xiaofen Xing
Peihao Chen
Xiangmin Xu
VLM
28
35
0
20 Jul 2023
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis
Siyang Wang
G. Henter
Joakim Gustafson
Éva Székely
42
5
0
11 Jul 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Gene-Ping Yang
Yue Gu
Qingming Tang
Dongsu Du
Yuzong Liu
14
5
0
06 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
25
4
0
02 Jul 2023
When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
Anuj Diwan
Eunsol Choi
David F. Harwath
39
0
0
14 Jun 2023
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System
Khazar Khorrami
María Andrea Cruz Blandón
Tuomas Virtanen
Okko Rasanen
SSL
20
1
0
05 Jun 2023
Self-supervised representations in speech-based depression detection
Wen Wu
C. Zhang
P. Woodland
14
23
0
20 May 2023
Scaling laws for language encoding models in fMRI
Richard Antonello
Aditya R. Vaidya
Alexander G. Huth
MedIm
19
55
0
19 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
24
5
0
19 May 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David F. Harwath
SSL
VLM
18
7
0
19 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
55
58
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
19
24
0
17 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
14
3
0
09 May 2023
Computational modeling of semantic change
Nina Tahmasebi
Haim Dubossarsky
26
6
0
13 Apr 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
23
2
0
12 Apr 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Brian Yan
Jiatong Shi
Yun Tang
H. Inaguma
Yifan Peng
...
Zhaoheng Ni
Moto Hira
Soumi Maiti
J. Pino
Shinji Watanabe
19
20
0
10 Apr 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
19
17
0
03 Apr 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
30
38
0
27 Feb 2023
Phone and speaker spatial organization in self-supervised speech representations
Pablo Riera
M. Cerdeiro
L. Pepino
Luciana Ferrer
SSL
16
1
0
24 Feb 2023
Perceive and predict: self-supervised speech representation based loss functions for speech enhancement
George Close
William Ravenscroft
Thomas Hain
Stefan Goetze
SSL
30
12
0
11 Jan 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages
Sreepratha Ram
Hanan Aldarmaki
SSL
19
3
0
03 Jan 2023
Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models
Changli Tang
Yujin Wang
Xie Chen
Weiqiang Zhang
23
2
0
20 Dec 2022
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech
Kazuki Kawamura
Jun Rekimoto
12
0
0
08 Dec 2022
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models
Zih-Ching Chen
Yu-Shun Sung
Hung-yi Lee
13
16
0
01 Dec 2022
EURO: ESPnet Unsupervised ASR Open-source Toolkit
Dongji Gao
Jiatong Shi
Shun-Po Chuang
Leibny Paola García-Perera
Hung-yi Lee
Shinji Watanabe
Sanjeev Khudanpur
19
8
0
30 Nov 2022
Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu
Chen An Li
Tung-Yu Wu
Hung-yi Lee
17
1
0
29 Nov 2022
TESSP: Text-Enhanced Self-Supervised Speech Pre-training
Zhuoyuan Yao
Shuo Ren
Sanyuan Chen
Ziyang Ma
Pengcheng Guo
Linfu Xie
22
5
0
24 Nov 2022
Device Directedness with Contextual Cues for Spoken Dialog Systems
Dhanush Bekal
S. Srinivasan
S. Bodapati
S. Ronanki
Katrin Kirchhoff
31
1
0
23 Nov 2022
Exploring WavLM on Speech Enhancement
Hyungchan Song
Sanyuan Chen
Zhuo Chen
Yu-Huan Wu
Takuya Yoshioka
M. Tang
Jong Won Shin
Shujie Liu
8
16
0
18 Nov 2022
Compressing Transformer-based self-supervised models for speech processing
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
30
6
0
17 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
24
13
0
17 Nov 2022
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
22
109
0
08 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
27
12
0
06 Nov 2022
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup
Vasista Sai Lodagala
Sreyan Ghosh
S. Umesh
SSL
33
5
0
02 Nov 2022
Avoid Overthinking in Self-Supervised Models for Speech Recognition
Dan Berrebbi
Brian Yan
Shinji Watanabe
LRM
11
4
0
01 Nov 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
24
30
0
27 Oct 2022
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang
Changli Tang
Ziyang Ma
Zhisheng Zheng
Xie Chen
Weiqiang Zhang
29
1
0
27 Oct 2022
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Qiu-shi Zhu
Long Zhou
Jie M. Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
48
37
0
27 Oct 2022
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0
Marie Kunesova
Zbynek Zajíc
SSL
VLM
13
15
0
26 Oct 2022
Real-time Speech Interruption Analysis: From Cloud to Client Deployment
Quchen Fu
Szu-Wei Fu
Yaran Fan
Yu-Huan Wu
Zhuo Chen
J. Gupchup
Ross Cutler
26
0
0
24 Oct 2022
Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
25
2
0
24 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
30
2
0
23 Oct 2022