2310.02720
Cited By
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
International Conference on Learning Representations (ICLR), 2023
4 October 2023
Jiatong Shi
Hirofumi Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
Papers citing
"Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction"
26 / 26 papers shown
DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models
Yuanyuan Wang
Dongchao Yang
Yiwen Shao
Hangting Chen
Jiankun Zhao
Zhiyong Wu
Chao Yang
Xixin Wu
142
1
0
12 Aug 2025
An Exploration of Mamba for Speech Self-Supervised Models
Tzu-Quan Lin
Heng-Cheng Kuo
Tzu-Chieh Wei
H. Cheng
Chun Wei Chen
Hsien-Fu Hsiao
Yu Tsao
Hung-yi Lee
Mamba
128
1
0
14 Jun 2025
Uni-VERSA: Versatile Speech Assessment with a Unified Network
Jiatong Shi
Hye-jin Shim
Shinji Watanabe
189
3
0
27 May 2025
Optimizing Speech Multi-View Feature Fusion through Conditional Computation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Weiqiao Shan
Yuhao Zhang
Yuchen Han
Yangqiu Song
X. Zhao
Yongqian Li
Hao Fei
Hao Yang
Tong Xiao
Jingbo Zhu
145
0
0
14 Jan 2025
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Shih-Heng Wang
Zih-Ching Chen
Jiatong Shi
Ming-To Chuang
Guan-Ting Lin
Kuan Po Huang
David Harwath
Shang-Wen Li
Hung-yi Lee
291
3
0
27 Nov 2024
Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2024
Shih-Heng Wang
Jiatong Shi
Chien-yu Huang
Shinji Watanabe
Hung-yi Lee
209
0
0
27 Nov 2024
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
293
5
0
31 Oct 2024
An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions
Theo Clark
Benedetta Cevoli
Eloy de Jong
Timofey Abramski
Jamie Dougherty
SSL
186
0
0
31 Oct 2024
JOOCI: a Framework for Learning Comprehensive Speech Representations
Hemant Yadav
R. Shah
Sunayana Sitaram
291
0
0
14 Oct 2024
Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2024
Yang Yuhang
Peng Yizhou
Eng Siong Chng
Xionghu Zhong
AuLLM
AI4CE
158
2
0
24 Sep 2024
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
Li-Wei Chen
Takuya Higuchi
He Bai
Ahmed Hussen Abdelaziz
Alexander Rudnicky
Shinji Watanabe
Tatiana Likhomanenko
B. Theobald
Zakaria Aldeneh
273
1
0
16 Sep 2024
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm
ACM Multimedia (MM), 2024
Yuning Wu
Jiatong Shi
Yifeng Yu
Yuxun Tang
Tao Qian
Yueqian Lin
Jionghao Han
Xinyi Bai
Shinji Watanabe
Qin Jin
174
7
0
11 Sep 2024
SSDM: Scalable Speech Dysfluency Modeling
Neural Information Processing Systems (NeurIPS), 2024
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Mille
M. G. Tempini
Gopala Anumanchipalli
AuLLM
243
19
0
29 Aug 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
288
0
0
20 Aug 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
311
42
0
30 Jun 2024
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
Yuxun Tang
Jiatong Shi
Yuning Wu
Qin Jin
204
25
0
16 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Interspeech (Interspeech), 2024
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
179
13
0
14 Jun 2024
SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models
Yuxun Tang
Yuning Wu
Jiatong Shi
Qin Jin
213
7
0
13 Jun 2024
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation
Yifeng Yu
Jiatong Shi
Yuning Wu
Shinji Watanabe
185
9
0
13 Jun 2024
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu
Chunlei Zhang
Jiatong Shi
Yuxun Tang
Shan Yang
Qin Jin
232
13
0
12 Jun 2024
MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations
Interspeech (Interspeech), 2024
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
281
2
0
09 Jun 2024
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
Yongyi Zang
Jiatong Shi
You Zhang
Ryuichi Yamamoto
Jionghao Han
...
Shengyuan Xu
Wenxiao Zhao
Jing Guo
Tomoki Toda
Zhiyao Duan
191
24
0
04 Jun 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Automatic Speech Recognition & Understanding (ASRU), 2023
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
354
18
0
09 Oct 2023
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios
Interspeech (Interspeech), 2023
Tejes Srivastava
Jiatong Shi
William Chen
Shinji Watanabe
221
2
0
05 Oct 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Interspeech (Interspeech), 2023
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
306
86
0
18 May 2023
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
436
389
0
25 Oct 2019