ResearchTrend.AI
Speech Model Pre-training for End-to-End Spoken Language Understanding

7 April 2019
Loren Lugosch
Mirco Ravanelli
Patrick Ignoto
Vikrant Singh Tomar
Yoshua Bengio
    SyDa
    AuLLM

Papers citing "Speech Model Pre-training for End-to-End Spoken Language Understanding"

50 / 51 papers shown
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
69
4
0
18 Mar 2025
Data-efficient Performance Modeling via Pre-training
Chunting Liu
Riyadh Baghdadi
41
0
0
24 Jan 2025
Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
Rui Liu
Zening Ma
SSL
34
1
0
10 Jun 2024
Investigating the 'Autoencoder Behavior' in Speech Self-Supervised Models: a focus on HuBERT's Pretraining
Valentin Vielzeuf
SSL
36
0
0
14 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
25
3
0
15 Nov 2023
Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data
Seunghan Yang
Byeonggeun Kim
Kyuhong Shim
Simyoung Chang
22
1
0
31 Aug 2023
Multimodal Audio-textual Architecture for Robust Spoken Language Understanding
Anderson R. Avila
Mehdi Rezagholizadeh
Chao Xing
11
1
0
12 Jun 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
23
2
0
12 Apr 2023
A Data-Efficient Visual-Audio Representation with Intuitive Fine-tuning for Voice-Controlled Robots
Peixin Chang
Shuijing Liu
Tianchen Ji
Neeloy Chakraborty
Kaiwen Hong
Katherine Driggs-Campbell
32
3
0
23 Jan 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu
Chen An Li
Tung-Yu Wu
Hung-yi Lee
17
1
0
29 Nov 2022
Active Learning of Non-semantic Speech Tasks with Pretrained Models
Harlin Lee
Aaqib Saeed
Andrea L. Bertozzi
VLM
14
2
0
31 Oct 2022
On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors
Z. Bukhsh
Aaqib Saeed
OODD
22
9
0
27 Oct 2022
Taxonomic Classification of IoT Smart Home Voice Control
M. Hewitt
H. Cunningham
11
1
0
24 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
Lingjiao Chen
Zhihua Jin
Sabri Eyuboglu
Christopher Ré
Matei A. Zaharia
James Y. Zou
37
9
0
18 Sep 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
Thierry Desot
François Portet
Michel Vacher
17
12
0
17 Jul 2022
Two-Pass Low Latency End-to-End Spoken Language Understanding
Siddhant Arora
Siddharth Dalmia
Xuankai Chang
Brian Yan
A. Black
Shinji Watanabe
VLM
19
19
0
14 Jul 2022
Toward Low-Cost End-to-End Spoken Language Understanding
Marco Dinarelli
M. Naguib
François Portet
9
5
0
01 Jul 2022
Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models
Daniel Bermuth
Alexander Poeppel
W. Reif
15
7
0
29 Jun 2022
STOP: A dataset for Spoken Task Oriented Semantic Parsing
Paden Tomasello
Akshat Shrivastava
Daniel Lazar
Po-Chun Hsu
Duc Le
...
Robin Algayres
Tu Nguyen
Emmanuel Dupoux
Luke Zettlemoyer
Abdel-rahman Mohamed
9
35
0
29 Jun 2022
On Building Spoken Language Understanding Systems for Low Resourced Languages
Akshat Gupta
17
8
0
25 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
124
344
0
21 May 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
16
7
0
11 Apr 2022
Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model
N. J. Wang
Lu Wang
Yandan Sun
Haimei Kang
Dejun Zhang
AuLLM
11
3
0
07 Apr 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
17
22
0
31 Mar 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Kuan Po Huang
Yuanbin Fu
Yu Zhang
Hung-yi Lee
14
28
0
30 Mar 2022
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models
Heting Gao
Junrui Ni
Kaizhi Qian
Yang Zhang
Shiyu Chang
M. Hasegawa-Johnson
VLM
6
31
0
29 Mar 2022
A Speech Representation Anonymization Framework via Selective Noise Perturbation
Minh Tran
M. Soleymani
22
4
0
26 Mar 2022
Building Robust Spoken Language Understanding by Cross Attention between Phoneme Sequence and ASR Hypothesis
Zexun Wang
Yuquan Le
Yi Zhu
Yuming Zhao
M.-W. Feng
Meng Chen
Xiaodong He
15
5
0
22 Mar 2022
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
32
16
0
14 Dec 2021
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning
Yi-Chen Chen
Shu-Wen Yang
Cheng-Kuang Lee
Simon See
Hung-yi Lee
SSL
11
12
0
18 Oct 2021
Don't speak too fast: The impact of data bias on self-supervised speech models
Yen Meng
Yi-Hui Chou
Andy T. Liu
Hung-yi Lee
34
24
0
15 Oct 2021
Decoupled Contrastive Learning
Chun-Hsiao Yeh
Cheng-Yao Hong
Yen-Chi Hsu
Tyng-Luh Liu
Yubei Chen
Yann LeCun
171
182
0
13 Oct 2021
Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification
Bidisha Sharma
Maulik C. Madhavi
Xuehao Zhou
Haizhou Li
12
2
0
28 Sep 2021
Integrating Dialog History into End-to-End Spoken Language Understanding Systems
Jatin Ganhotra
Samuel Thomas
H. Kuo
Sachindra Joshi
G. Saon
Zoltán Tüske
Brian Kingsbury
19
10
0
18 Aug 2021
Learning a Neural Diff for Speech Models
J. Macoskey
Grant P. Strimel
Ariya Rastrow
10
2
0
03 Aug 2021
Did the Model Change? Efficiently Assessing Machine Learning API Shifts
Lingjiao Chen
Tracy Cai
Matei A. Zaharia
James Y. Zou
18
17
0
29 Jul 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model
Ankita Pasad
Ju-Chieh Chou
Karen Livescu
SSL
22
287
0
10 Jul 2021
Representation based meta-learning for few-shot spoken intent recognition
Ashish R. Mittal
Samarth Bharadwaj
Shreya Khare
Saneem A. Chemmengath
Karthik Sankaranarayanan
Brian Kingsbury
13
11
0
29 Jun 2021
Open, Sesame! Introducing Access Control to Voice Services
Dominika Woszczyk
Alvin Lee
Soteris Demetriou
AAML
16
0
0
27 Jun 2021
Pre-training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning
Qian Chen
Wen Wang
Qinglin Zhang
11
10
0
21 Apr 2021
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
Sujeong Cha
Wang Hou
Hyun Jung
M. Phung
M. Picheny
H. Kuo
Samuel Thomas
E. Morais
VLM
14
15
0
07 Apr 2021
Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers
Loren Lugosch
Piyush Papreja
Mirco Ravanelli
A. Heba
Titouan Parcollet
6
12
0
04 Apr 2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
11
19
0
15 Feb 2021
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
14
7
0
11 Nov 2020
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining
Cheng-I Jeff Lai
Yung-Sung Chuang
Hung-yi Lee
Shang-Wen Li
James R. Glass
VLM
SSL
16
58
0
26 Oct 2020
Semantic Complexity in End-to-End Spoken Language Understanding
Joseph P. McKenna
Samridhi Choudhary
Michael Stephen Saxon
Grant P. Strimel
Athanasios Mouchtaris
12
14
0
06 Aug 2020
A Data Efficient End-To-End Spoken Language Understanding Architecture
Marco Dinarelli
Nikita Kapoor
Bassam Jabaian
Laurent Besacier
3DV
12
20
0
14 Feb 2020
SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering
Yung-Sung Chuang
Chi-Liang Liu
Hung-yi Lee
Lin-shan Lee
AuLLM
17
39
0
25 Oct 2019