DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT
Heng-Jui Chang, Shu-Wen Yang, Hung-yi Lee
arXiv: 2110.01900 · 5 October 2021 · [SSL]
Papers citing "DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT" (50 / 116 papers shown)
- StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models (Yeona Hong, Hyewon Han, Woo-Jin Chung, Hong-Goo Kang) [MQ], 21 Apr 2025
- Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking with Self-supervised Learning Features (Jiajun Deng, Yaolong Ju, Jing Yang, Simon Lui, Xunying Liu), 13 Mar 2025
- Leveraging Cross-Attention Transformer and Multi-Feature Fusion for Cross-Linguistic Speech Emotion Recognition (Ruoyu Zhao, Xiantao Jiang, Fei Yu, Victor C.M. Leung, Tao Wang, S. Zhang), 06 Jan 2025
- Lillama: Large Language Models Compression via Low-Rank Feature Distillation (Yaya Sy, Christophe Cerisara, Irina Illina) [MQ], 31 Dec 2024
- DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models (Heng-Jui Chang, Hongyu Gong, Changhan Wang, James R. Glass, Yu-An Chung), 31 Oct 2024
- EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning (Ashish Seth, Ramaneswaran Selvakumar, S. Sakshi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha), 17 Oct 2024
- Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget (Andy T. Liu, Yi-Cheng Lin, Haibin Wu, Stefan Winkler, Hung-yi Lee), 09 Sep 2024
- SSDM: Scalable Speech Dysfluency Modeling (Jiachen Lian, Xuanru Zhou, Z. Ezzes, Jet M J Vonk, Brittany Morin, D. Baquirin, Zachary Mille, M. G. Tempini, Gopala Anumanchipalli) [AuLLM], 29 Aug 2024
- Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data (Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-yi Lee), 15 Jul 2024
- uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes (Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed), 01 Jul 2024
- Exploring compressibility of transformer based text-to-music (TTM) models (Vasileios Moschopoulos, Thanasis Kotsiopoulos, Pablo Peso Parada, Konstantinos Nikiforidis, Alexandros Stergiadis, Gerasimos Papakostas, Md. Asif Jalal, Jisi Zhang, Anastasios Drosou, Karthikeyan P. Saravanan), 24 Jun 2024
- Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models (Dominik Wagner, Ilja Baumann, K. Riedhammer, Tobias Bocklet) [MQ], 16 Jun 2024
- One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model (Zhaoqing Li, Haoning Xu, Tianzi Wang, Shoukang Hu, Zengrui Jin, Shujie Hu, Jiajun Deng, Mingyu Cui, Mengzhe Geng, Xunying Liu) [MQ], 14 Jun 2024
- MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model (Jiatong Shi, Xutai Ma, Hirofumi Inaguma, Anna Y. Sun, Shinji Watanabe), 14 Jun 2024
- LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks (Amit Meghanani, Thomas Hain), 13 Jun 2024
- AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers (Emil Biju, Anirudh Sriram, Mert Pilanci), 13 Jun 2024
- GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model (Yingying Gao, Shilei Zhang, Chao Deng, Junlan Feng), 12 Jun 2024
- Sustainable self-supervised learning for speech representations (Luis Lugo, Valentin Vielzeuf), 11 Jun 2024
- DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models (T. Lin, Hung-yi Lee, Hao Tang), 08 Jun 2024
- On the social bias of speech self-supervised models (Yi-Cheng Lin, T. Lin, Hsi-Che Lin, Andy T. Liu, Hung-yi Lee), 07 Jun 2024
- To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation (Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed) [VLM], 06 Jun 2024
- DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation (Jie Xu, Karthikeyan P. Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, Mete Ozay), 10 May 2024
- A Large-Scale Evaluation of Speech Foundation Models (Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Jeff Lai, ..., Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee), 15 Apr 2024
- Masked Modeling Duo: Towards a Universal Audio Pre-training Framework (Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, K. Kashino), 09 Apr 2024
- An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning (Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk), 13 Mar 2024
- SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations (Amit Meghanani, Thomas Hain), 10 Mar 2024
- SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning (Luca Zampierin, G. B. Hacene, Bac Nguyen, Mirco Ravanelli), 26 Feb 2024
- OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification (Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe) [VLM], 20 Feb 2024
- Evaluating and Improving Continual Learning in Spoken Language Understanding (Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj) [CLL], 16 Feb 2024
- SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data (Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David F. Harwath) [VLM], 10 Feb 2024
- The last Dance: Robust backdoor attack via diffusion models and bayesian approach (Orson Mengara) [DiffM], 05 Feb 2024
- On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification (Calum Heggan, S. Budgett, Timothy M. Hospedales, Mehrdad Yaghoobi) [SSL], 02 Feb 2024
- OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer (Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, ..., Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe) [VLM, OSLM], 30 Jan 2024
- CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition (Junfeng Hou, Peiyao Wang, Jincheng Zhang, Meng-Da Yang, Minwei Feng, Jingcheng Yin), 04 Jan 2024
- Noise robust distillation of self-supervised speech models via correlation metrics (Fabian Ritter Gutierrez, Kuan-Po Huang, Dianwen Ng, Jeremy H. M. Wong, Hung-yi Lee, Chng Eng Siong, Nancy F. Chen), 19 Dec 2023
- Efficiency-oriented approaches for self-supervised speech representation learning (Luis Lugo, Valentin Vielzeuf) [SSL], 18 Dec 2023
- STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models (Kangwook Jang, Sungnyun Kim, Hoi-Rim Kim), 14 Dec 2023
- R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces (Heng-Jui Chang, James R. Glass), 15 Nov 2023
- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling (Sanchit Gandhi, Patrick von Platen, Alexander M. Rush) [VLM], 01 Nov 2023
- EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios (Tejes Srivastava, Jiatong Shi, William Chen, Shinji Watanabe), 05 Oct 2023
- Continual Contrastive Spoken Language Understanding (Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, A. Brutti, Bhiksha Raj) [CLL], 04 Oct 2023
- Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference (Masao Someki, N. Eng, Yosuke Higuchi, Shinji Watanabe), 26 Sep 2023
- Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data (Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, ..., Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe) [VLM], 25 Sep 2023
- Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation (Danilo de Oliveira, Timo Gerkmann) [VLM], 18 Sep 2023
- CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders (Heng-Jui Chang, Ning Dong, Ruslan Mavlyutov, Sravya Popuri, Yu-An Chung), 14 Sep 2023
- SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models (Xin Zhang, Dong Zhang, Shimin Li, Yaqian Zhou, Xipeng Qiu), 31 Aug 2023
- Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer (Kyuhong Shim, Jinkyu Lee, Simyoung Chang, Kyuwoong Hwang), 31 Aug 2023
- Let There Be Sound: Reconstructing High Quality Speech from Silent Videos (Ji-Hoon Kim, Jaehun Kim, Joon Son Chung), 29 Aug 2023
- Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads (Salah Zaiem, Youcef Kemiche, Titouan Parcollet, S. Essid, Mirco Ravanelli) [SSL], 28 Aug 2023
- Video Multimodal Emotion Recognition System for Real World Applications (Sun-Kyung Lee, Jong-Hwan Kim) [CVBM], 28 Aug 2023