TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech

12 July 2020
Andy T. Liu
Shang-Wen Li
Hung-yi Lee
    SSL

Papers citing "TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech"

50 / 215 papers shown
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
Tianchi Liu
Duc-Tuan Truong
Rohan Kumar Das
K. Lee
Haizhou Li
31
0
0
08 Apr 2025
Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation
Wupeng Wang
Zexu Pan
X. Li
Shuai Wang
Haizhou Li
AI4TS
34
0
0
03 Apr 2025
JiTTER: Jigsaw Temporal Transformer for Event Reconstruction for Self-Supervised Sound Event Detection
Hyeonuk Nam
Yong-Hwa Park
34
1
0
28 Feb 2025
How Redundant Is the Transformer Stack in Speech Representation Models?
Teresa Dorszewski
Albert Kjøller Jacobsen
Lenka Tětková
Lars Kai Hansen
104
0
0
20 Jan 2025
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
Pengcheng Guo
Xuankai Chang
Hang Lv
Shinji Watanabe
Lei Xie
61
0
0
07 Dec 2024
Speech Separation with Pretrained Frontend to Minimize Domain Mismatch
Wupeng Wang
Zexu Pan
X. Li
Shuai Wang
H. Li
29
4
0
05 Nov 2024
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
26
0
0
31 Oct 2024
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications
Tailai Song
Paolo Garza
Michela Meo
Maurizio Matteo Munafò
15
1
0
21 Oct 2024
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
Li-Wei Chen
Takuya Higuchi
He Bai
Ahmed Hussen Abdelaziz
Alexander Rudnicky
Shinji Watanabe
Tatiana Likhomanenko
B. Theobald
Zakaria Aldeneh
38
0
0
16 Sep 2024
Stimulus Modality Matters: Impact of Perceptual Evaluations from Different Modalities on Speech Emotion Recognition System Performance
Huang-Cheng Chou
Haibin Wu
Chi-Chun Lee
27
0
0
16 Sep 2024
Self-supervised Learning for Acoustic Few-Shot Classification
Jingyong Liang
Bernd Meyer
Issac Ning Lee
Thanh-Toan Do
SSL
52
0
0
15 Sep 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
31
1
0
09 Sep 2024
Probing self-attention in self-supervised speech models for cross-linguistic differences
Sai Gopinath
Joselyn Rodriguez
MILM
51
0
0
04 Sep 2024
Progressive Residual Extraction based Pre-training for Speech Representation Learning
Tianrui Wang
Jin Li
Ziyang Ma
Rui Cao
Xie Chen
...
Meng Ge
Xiaobao Wang
Yuguang Wang
Jianwu Dang
Nyima Tashi
SSL
35
0
0
31 Aug 2024
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Kai-Wei Chang
Haibin Wu
Yu-Kai Wang
Yuan-Kuei Wu
Hua Shen
Wei-Cheng Tseng
Iu-thing Kang
Shang-Wen Li
Hung-yi Lee
39
3
0
23 Aug 2024
Convexity-based Pruning of Speech Representation Models
Teresa Dorszewski
Lenka Tětková
Lars Kai Hansen
25
2
0
16 Aug 2024
Transformer-based Single-Cell Language Model: A Survey
Wei Lan
Guohang He
Mingyang Liu
Qingfeng Chen
Junyue Cao
Wei Peng
MedIm
LRM
22
7
0
18 Jul 2024
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
Salima Mdhaffar
Haroun Elleuch
Fethi Bougares
Yannick Esteve
49
0
0
05 Jul 2024
A dual task learning approach to fine-tune a multilingual semantic speech encoder for Spoken Language Understanding
G. Laperriere
Sahar Ghannay
Bassam Jabaian
Yannick Esteve
21
0
0
17 Jun 2024
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks
Amit Meghanani
Thomas Hain
36
1
0
13 Jun 2024
GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
19
0
0
12 Jun 2024
On the social bias of speech self-supervised models
Yi-Cheng Lin
T. Lin
Hsi-Che Lin
Andy T. Liu
Hung-yi Lee
32
3
0
07 Jun 2024
Population Transformer: Learning Population-level Representations of Neural Activity
Geeling Chau
Christopher Wang
Sabera Talukder
Vighnesh Subramaniam
Saraswati Soedarmadji
Yisong Yue
Boris Katz
Andrei Barbu
MedIm
54
0
0
05 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
29
10
0
09 Apr 2024
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning
Nik Vaessen
David A. van Leeuwen
25
3
0
21 Feb 2024
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
Haibin Wu
Huang-Cheng Chou
Kai-Wei Chang
Lucas Goncalves
Jiawei Du
Jyh-Shing Roger Jang
Chi-Chun Lee
Hung-Yi Lee
29
11
0
20 Feb 2024
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR
Liang-Hsuan Tseng
En-Pei Hu
Cheng-Han Chiang
Yuan Tseng
Hung-yi Lee
Lin-shan Lee
Shao-Hua Sun
59
1
0
06 Feb 2024
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
Calum Heggan
S. Budgett
Timothy M. Hospedales
Mehrdad Yaghoobi
SSL
19
1
0
02 Feb 2024
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh
Teo Susnjak
Tong Liu
Paul Watters
Malka N. Halgamuge
79
46
0
18 Dec 2023
Acoustic models of Brazilian Portuguese Speech based on Neural Transformers
M. Gauy
Marcelo Finger
16
2
0
14 Dec 2023
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
14
4
0
27 Nov 2023
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Hsin-Tien Chiang
Szu-Wei Fu
Hsin-Min Wang
Yu Tsao
John H. L. Hansen
34
2
0
15 Nov 2023
Investigating Self-Supervised Deep Representations for EEG-based Auditory Attention Decoding
Karan Thakkar
Jiarui Hai
Mounya Elhilali
11
1
0
01 Nov 2023
An Investigation of Representation and Allocation Harms in Contrastive Learning
Subha Maity
Mayank Agarwal
Mikhail Yurochkin
Yuekai Sun
25
2
0
02 Oct 2023
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
Ahmed Adel Attia
Yashish M. Siriwardena
Carol Y. Espy-Wilson
SSL
26
4
0
17 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
32
15
0
11 Sep 2023
Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning
Saurabhchand Bhati
Jesús Villalba
Laureano Moro Velázquez
Thomas Thebaud
Najim Dehak
CLIP
25
3
0
08 Sep 2023
Leveraging Label Information for Multimodal Emotion Recognition
Pei-Hsin Wang
Sunlu Zeng
Junqing Chen
Lu Fan
Meng Chen
Youzheng Wu
Xiaodong He
25
4
0
05 Sep 2023
Acoustic-to-articulatory inversion for dysarthric speech: Are pre-trained self-supervised representations favorable?
S. K. Maharana
Krishna Kamal Adidam
Shoumik Nandi
Ajitesh Srivastava
22
2
0
03 Sep 2023
End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations
Bolaji Yusuf
J. Černocký
Murat Saraclar
22
2
0
15 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
19
1
0
09 Aug 2023
CroSSL: Cross-modal Self-Supervised Learning for Time-series through Latent Masking
Shohreh Deldari
Dimitris Spathis
Mohammad Malekzadeh
F. Kawsar
Flora D. Salim
Akhil Mathur
9
15
0
31 Jul 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
Varun Krishna
T. Sai
Sriram Ganapathy
SSL
15
2
0
14 Jul 2023
Semantic enrichment towards efficient speech representations
G. Laperriere
H. Nguyen
Sahar Ghannay
Bassam Jabaian
Yannick Esteve
43
2
0
03 Jul 2023
Evaluation of Speech Representations for MOS prediction
F. S. Oliveira
Edresson Casanova
Arnaldo Cândido Júnior
L. Gris
A. S. Soares
A. R. G. Filho
24
4
0
16 Jun 2023
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Ziyang Ma
Zhisheng Zheng
Guanrou Yang
Yu Wang
C. Zhang
Xie Chen
SSL
22
8
0
15 Jun 2023
Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement
Hejung Yang
Hong-Goo Kang
SSL
15
0
0
14 Jun 2023
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
Haiyang Sun
Fulin Zhang
Yingying Gao
Zheng Lian
Shilei Zhang
Junlan Feng
14
4
0
12 Jun 2023
Probing self-supervised speech models for phonetic and phonemic information: a case study in aspiration
Kinan Martin
Jon Gauthier
Canaan Breiss
R. Levy
SSL
11
14
0
09 Jun 2023