2310.02720
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
International Conference on Learning Representations (ICLR), 2023
4 October 2023
Jiatong Shi
Hirofumi Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
Papers citing
"Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction"
26 / 26 papers shown
DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models
Yuanyuan Wang
Dongchao Yang
Yiwen Shao
Hangting Chen
Jiankun Zhao
Zhiyong Wu
Chao Yang
Xixin Wu
160
1
0
12 Aug 2025
An Exploration of Mamba for Speech Self-Supervised Models
Tzu-Quan Lin
Heng-Cheng Kuo
Tzu-Chieh Wei
H. Cheng
Chun Wei Chen
Hsien-Fu Hsiao
Yu Tsao
Hung-yi Lee
Mamba
152
1
0
14 Jun 2025
Uni-VERSA: Versatile Speech Assessment with a Unified Network
Jiatong Shi
Hye-jin Shim
Shinji Watanabe
211
3
0
27 May 2025
Optimizing Speech Multi-View Feature Fusion through Conditional Computation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Weiqiao Shan
Yuhao Zhang
Yuchen Han
Yangqiu Song
X. Zhao
Yongqian Li
Hao Fei
Hao Yang
Tong Xiao
Jingbo Zhu
154
0
0
14 Jan 2025
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Shih-Heng Wang
Zih-Ching Chen
Jiatong Shi
Ming-To Chuang
Guan-Ting Lin
Kuan Po Huang
David Harwath
Shang-Wen Li
Hung-yi Lee
296
3
0
27 Nov 2024
Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2024
Shih-Heng Wang
Jiatong Shi
Chien-yu Huang
Shinji Watanabe
Hung-yi Lee
221
1
0
27 Nov 2024
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
309
5
0
31 Oct 2024
An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions
Theo Clark
Benedetta Cevoli
Eloy de Jong
Timofey Abramski
Jamie Dougherty
SSL
208
0
0
31 Oct 2024
JOOCI: a Framework for Learning Comprehensive Speech Representations
Hemant Yadav
R. Shah
Sunayana Sitaram
316
0
0
14 Oct 2024
Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2024
Yang Yuhang
Peng Yizhou
Eng Siong Chng
Xionghu Zhong
AuLLM
AI4CE
185
2
0
24 Sep 2024
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
Li-Wei Chen
Takuya Higuchi
He Bai
Ahmed Hussen Abdelaziz
Alexander Rudnicky
Shinji Watanabe
Tatiana Likhomanenko
B. Theobald
Zakaria Aldeneh
297
1
0
16 Sep 2024
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm
ACM Multimedia (MM), 2024
Yuning Wu
Jiatong Shi
Yifeng Yu
Yuxun Tang
Tao Qian
Yueqian Lin
Jionghao Han
Xinyi Bai
Shinji Watanabe
Qin Jin
202
7
0
11 Sep 2024
SSDM: Scalable Speech Dysfluency Modeling
Neural Information Processing Systems (NeurIPS), 2024
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Mille
M. G. Tempini
Gopala Anumanchipalli
AuLLM
286
19
0
29 Aug 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
304
0
0
20 Aug 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
329
44
0
30 Jun 2024
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
Yuxun Tang
Jiatong Shi
Yuning Wu
Qin Jin
221
25
0
16 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Interspeech (Interspeech), 2024
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
192
14
0
14 Jun 2024
SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models
Yuxun Tang
Yuning Wu
Jiatong Shi
Qin Jin
233
7
0
13 Jun 2024
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation
Yifeng Yu
Jiatong Shi
Yuning Wu
Shinji Watanabe
213
9
0
13 Jun 2024
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu
Chunlei Zhang
Jiatong Shi
Yuxun Tang
Shan Yang
Qin Jin
256
13
0
12 Jun 2024
MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations
Interspeech (Interspeech), 2024
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
301
3
0
09 Jun 2024
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
Yongyi Zang
Jiatong Shi
You Zhang
Ryuichi Yamamoto
Jionghao Han
...
Shengyuan Xu
Wenxiao Zhao
Jing Guo
Tomoki Toda
Zhiyao Duan
210
25
0
04 Jun 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Automatic Speech Recognition & Understanding (ASRU), 2023
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
375
18
0
09 Oct 2023
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios
Interspeech (Interspeech), 2023
Tejes Srivastava
Jiatong Shi
William Chen
Shinji Watanabe
231
2
0
05 Oct 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Interspeech (Interspeech), 2023
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
321
87
0
18 May 2023
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
456
391
0
25 Oct 2019