arXiv:2004.05274

Improved Speech Representations with Multi-Target Autoregressive Predictive Coding

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

11 April 2020


Papers citing "Improved Speech Representations with Multi-Target Autoregressive Predictive Coding"

37 / 37 papers shown

Multi-objective Non-intrusive Hearing-aid Speech Assessment Model


Hsin-Tien Chiang

Yu Tsao

John H. L. Hansen

276

8

0

15 Nov 2023

Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models

Interspeech (Interspeech), 2023

Alessandro Ragano

480

4

0

22 Sep 2023

On the Robustness of Arabic Speech Dialect Identification

Interspeech (Interspeech), 2023

AbdelRahim Elmadany

Muhammad Abdul-Mageed

158

17

0

01 Jun 2023

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Rongjie Huang

Zhou Zhao

403

32

0

22 May 2023

Investigating Enhancements to Contrastive Predictive Coding for Human Activity Recognition

Annual IEEE International Conference on Pervasive Computing and Communications (PerCom), 2022

H. Haresamudram

404

20

0

11 Nov 2022

Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Neural Information Processing Systems (NeurIPS), 2022

Kaizhi Qian

Cheng-I Jeff Lai

471

10

0

02 Nov 2022

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Spoken Language Technology Workshop (SLT), 2022

Tzu-Quan Lin

...

Shinji Watanabe

Abdel-rahman Mohamed

332

38

0

16 Oct 2022

Self-Supervised Speech Representation Learning: A Review

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

Abdel-rahman Mohamed

Jakob Drachmann Havtorn

...

Tara N. Sainath

Shinji Watanabe

796

475

0

21 May 2022

Federated Self-Supervised Learning for Acoustic Event Classification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ming Sun

Spyros Matsoukas

203

14

0

22 Mar 2022

A Brief Overview of Unsupervised Neural Speech Representation Learning


Jakob Drachmann Havtorn

266

13

0

01 Mar 2022

Assessing the State of Self-Supervised Human Activity Recognition using Wearables

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2022

H. Haresamudram

481

128

0

22 Feb 2022

Textless Speech Emotion Conversion using Discrete and Decomposed Representations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Eugene Kharitonov

Abdel-rahman Mohamed

Emmanuel Dupoux

Yossi Adi

404

47

0

14 Nov 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

...

Jian Wu

1.4K

2,988

0

26 Oct 2021

Don't speak too fast: The impact of data bias on self-supervised speech models

303

35

0

15 Oct 2021

DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT

832

211

0

05 Oct 2021

Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch

Jakob Poncelet

182

3

0

29 Sep 2021

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

Automatic Speech Recognition & Understanding (ASRU), 2021

Chung-Cheng Chiu

404

525

0

07 Aug 2021

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

Kushal Lakhotia

Ruslan Salakhutdinov

Abdel-rahman Mohamed

783

4,394

0

14 Jun 2021

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Interspeech (Interspeech), 2021

Marcely Zanon Boito

Salima Mdhaffar

...

François Portet

Solange Rossato

Fabien Ringeval

Laurent Besacier

282

74

0

23 Apr 2021

Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning

Jama Hussein Mohamud

Laurent Besacier

243

7

0

16 Mar 2021

General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework

174

6

0

03 Feb 2021

Generative Spoken Language Modeling from Raw Audio

Transactions of the Association for Computational Linguistics (TACL), 2021

Kushal Lakhotia

Evgeny Kharitonov

Yossi Adi

...

Emmanuel Dupoux

778

459

0

01 Feb 2021

End2End Acoustic to Semantic Transduction

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Valentin Pelloin

Nathalie Camelin

Antoine Laurent

Antoine Caubrière

152

16

0

01 Feb 2021

A comparison of self-supervised speech representations as input features for unsupervised acoustic word embeddings

Spoken Language Technology Workshop (SLT), 2020

Lisa van Staden

188

17

0

14 Dec 2020

DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization

208

114

0

11 Dec 2020

Contrastive Predictive Coding for Human Activity Recognition

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2020

H. Haresamudram

447

150

0

09 Dec 2020

End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Brian Kingsbury

189

13

0

16 Nov 2020

Towards Semi-Supervised Semantics Understanding from Speech


Cheng-I Jeff Lai

239

7

0

11 Nov 2020

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Interspeech (Interspeech), 2020

Alexander H. Liu

284

93

0

01 Nov 2020

Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning

Interspeech (Interspeech), 2020

Wei Zou

Xiangang Li

425

74

0

27 Oct 2020

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

Michael Auli

349

6

0

24 Oct 2020

Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

371

48

0

23 Oct 2020

Similarity Analysis of Self-Supervised Speech Representations


Yonatan Belinkov

428

45

0

22 Oct 2020

Pretraining Techniques for Sequence-to-Sequence Voice Conversion

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020

Hirokazu Kameoka

392

48

0

07 Aug 2020

A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition

Wei Zou

Xiangang Li

280

31

0

20 May 2020

Vector-Quantized Autoregressive Predictive Coding


Hao Tang

324

126

0

17 May 2020

Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

648

394

0

25 Oct 2019
