ResearchTrend.AI
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT (arXiv:2110.01900)

5 October 2021
Heng-Jui Chang
Shu-Wen Yang
Hung-yi Lee
    SSL

Papers citing "DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT"

50 / 116 papers shown
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
Yeona Hong
Hyewon Han
Woo-Jin Chung
Hong-Goo Kang
MQ
23
0
0
21 Apr 2025
Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking with Self-supervised Learning Features
Jiajun Deng
Yaolong Ju
Jing Yang
Simon Lui
Xunying Liu
45
0
0
13 Mar 2025
Leveraging Cross-Attention Transformer and Multi-Feature Fusion for Cross-Linguistic Speech Emotion Recognition
Ruoyu Zhao
Xiantao Jiang
Fei Yu
Victor C.M. Leung
Tao Wang
S. Zhang
27
0
0
06 Jan 2025
Lillama: Large Language Models Compression via Low-Rank Feature Distillation
Yaya Sy
Christophe Cerisara
Irina Illina
MQ
69
0
0
31 Dec 2024
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
26
0
0
31 Oct 2024
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
Ashish Seth
Ramaneswaran Selvakumar
S. Sakshi
Sonal Kumar
Sreyan Ghosh
Dinesh Manocha
24
0
0
17 Oct 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
28
1
0
09 Sep 2024
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Mille
M. G. Tempini
Gopala Anumanchipalli
AuLLM
30
1
0
29 Aug 2024
Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data
Liang-Hsuan Tseng
Zih-Ching Chen
Wei-Shun Chang
Cheng-Kuang Lee
Tsung-Ren Huang
Hung-yi Lee
36
1
0
15 Jul 2024
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
Abdul Waheed
Karima Kadaoui
Muhammad Abdul-Mageed
Muhammad Abdul-Mageed
19
1
0
01 Jul 2024
Exploring compressibility of transformer based text-to-music (TTM) models
Vasileios Moschopoulos
Thanasis Kotsiopoulos
Pablo Peso Parada
Konstantinos Nikiforidis
Alexandros Stergiadis
Gerasimos Papakostas
Md. Asif Jalal
Jisi Zhang
Anastasios Drosou
Karthikeyan P. Saravanan
23
0
0
24 Jun 2024
Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models
Dominik Wagner
Ilja Baumann
K. Riedhammer
Tobias Bocklet
MQ
25
1
0
16 Jun 2024
One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
Zhaoqing Li
Haoning Xu
Tianzi Wang
Shoukang Hu
Zengrui Jin
Shujie Hu
Jiajun Deng
Mingyu Cui
Mengzhe Geng
Xunying Liu
MQ
18
1
0
14 Jun 2024
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model
Jiatong Shi
Xutai Ma
Hirofumi Inaguma
Anna Y. Sun
Shinji Watanabe
50
7
0
14 Jun 2024
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks
Amit Meghanani
Thomas Hain
30
1
0
13 Jun 2024
AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers
Emil Biju
Anirudh Sriram
Mert Pilanci
32
0
0
13 Jun 2024
GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
19
0
0
12 Jun 2024
Sustainable self-supervised learning for speech representations
Luis Lugo
Valentin Vielzeuf
29
2
0
11 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
T. Lin
Hung-yi Lee
Hao Tang
22
1
0
08 Jun 2024
On the social bias of speech self-supervised models
Yi-Cheng Lin
T. Lin
Hsi-Che Lin
Andy T. Liu
Hung-yi Lee
29
3
0
07 Jun 2024
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation
Abdul Waheed
Karima Kadaoui
Muhammad Abdul-Mageed
VLM
25
3
0
06 Jun 2024
DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation
Jie Xu
Karthikeyan P. Saravanan
Rogier van Dalen
Haaris Mehmood
David Tuckey
Mete Ozay
56
5
0
10 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
23
10
0
09 Apr 2024
An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
48
1
0
13 Mar 2024
SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations
Amit Meghanani
Thomas Hain
25
3
0
10 Mar 2024
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning
Luca Zampierin
G. B. Hacene
Bac Nguyen
Mirco Ravanelli
25
2
0
26 Feb 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
35
17
0
20 Feb 2024
Evaluating and Improving Continual Learning in Spoken Language Understanding
Muqiao Yang
Xiang Li
Umberto Cappellazzo
Shinji Watanabe
Bhiksha Raj
CLL
24
0
0
16 Feb 2024
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data
Hsuan-Fu Wang
Yi-Jen Shih
Heng-Jui Chang
Layne Berry
Puyuan Peng
Hung-yi Lee
Hsin-Min Wang
David F. Harwath
VLM
26
2
0
10 Feb 2024
The last Dance: Robust backdoor attack via diffusion models and bayesian approach
Orson Mengara
DiffM
26
4
0
05 Feb 2024
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
Calum Heggan
S. Budgett
Timothy M. Hospedales
Mehrdad Yaghoobi
SSL
13
1
0
02 Feb 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
...
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
19
38
0
30 Jan 2024
CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
Junfeng Hou
Peiyao Wang
Jincheng Zhang
Meng-Da Yang
Minwei Feng
Jingcheng Yin
21
1
0
04 Jan 2024
Noise robust distillation of self-supervised speech models via correlation metrics
Fabian Ritter Gutierrez
Kuan-Po Huang
Dianwen Ng
Jeremy H. M. Wong
Hung-yi Lee
Chng Eng Siong
Nancy F. Chen
14
1
0
19 Dec 2023
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
19
1
0
18 Dec 2023
STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models
Kangwook Jang
Sungnyun Kim
Hoi-Rim Kim
23
1
0
14 Dec 2023
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
15
3
0
15 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
6
49
0
01 Nov 2023
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios
Tejes Srivastava
Jiatong Shi
William Chen
Shinji Watanabe
32
1
0
05 Oct 2023
Continual Contrastive Spoken Language Understanding
Umberto Cappellazzo
Enrico Fini
Muqiao Yang
Daniele Falavigna
A. Brutti
Bhiksha Raj
CLL
23
1
0
04 Oct 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
10
0
0
26 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
23
33
0
25 Sep 2023
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation
Danilo de Oliveira
Timo Gerkmann
VLM
15
3
0
18 Sep 2023
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders
Heng-Jui Chang
Ning Dong
Ruslan Mavlyutov
Sravya Popuri
Yu-An Chung
37
6
0
14 Sep 2023
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
Xin Zhang
Dong Zhang
Shimin Li
Yaqian Zhou
Xipeng Qiu
25
61
0
31 Aug 2023
Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer
Kyuhong Shim
Jinkyu Lee
Simyoung Chang
Kyuwoong Hwang
19
2
0
31 Aug 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
8
5
0
29 Aug 2023
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
12
11
0
28 Aug 2023
Video Multimodal Emotion Recognition System for Real World Applications
Sun-Kyung Lee
Jong-Hwan Kim
CVBM
8
0
0
28 Aug 2023