ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01086
  4. Cited By
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo
  Languages

Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages

2 May 2022
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
    SyDa
ArXivPDFHTML

Papers citing "Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages"

28 / 28 papers shown
Title
On The Landscape of Spoken Language Models: A Comprehensive Survey
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
45
2
0
11 Apr 2025
SpeechPrompt: Prompting Speech Language Models for Speech Processing
  Tasks
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
Kai-Wei Chang
Haibin Wu
Yu-Kai Wang
Yuan-Kuei Wu
Hua Shen
Wei-Cheng Tseng
Iu-thing Kang
Shang-Wen Li
Hung-yi Lee
39
3
0
23 Aug 2024
On the Evaluation of Speech Foundation Models for Spoken Language
  Understanding
On the Evaluation of Speech Foundation Models for Spoken Language Understanding
Siddhant Arora
Ankita Pasad
Chung-Ming Chien
Jionghao Han
Roshan S. Sharma
...
William Chen
Suwon Shon
Hung-yi Lee
Karen Livescu
Shinji Watanabe
ELM
43
4
0
14 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech
  Units for Spoken Language Understanding
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
44
2
0
13 Jun 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
54
0
0
13 May 2024
Compact Speech Translation Models via Discrete Speech Units Pretraining
Compact Speech Translation Models via Discrete Speech Units Pretraining
Tsz Kin Lam
Alexandra Birch
Barry Haddow
45
2
0
29 Feb 2024
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL
  Models
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models
Ruchao Fan
Natarajan Balaji Shankar
Abeer Alwan
14
0
0
14 Feb 2024
Predicting positive transfer for improved low-resource speech
  recognition using acoustic pseudo-tokens
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
Nay San
Georgios Paraskevopoulos
Aryaman Arora
Xiluo He
Prabhjot Kaur
Oliver Adams
Dan Jurafsky
20
7
0
03 Feb 2024
Retrieval Augmented End-to-End Spoken Dialog Models
Retrieval Augmented End-to-End Spoken Dialog Models
Mingqiu Wang
Izhak Shafran
H. Soltau
Wei Han
Yuan Cao
Dian Yu
Laurent El Shafey
RALM
AuLLM
22
11
0
02 Feb 2024
When Large Language Models Meet Vector Databases: A Survey
When Large Language Models Meet Vector Databases: A Survey
Zhi Jing
Yongye Su
Yikun Han
Bo Yuan
Haiyun Xu
Chunjiang Liu
Kehai Chen
Min Zhang
53
35
0
30 Jan 2024
R-Spin: Efficient Speaker and Noise-invariant Representation Learning
  with Acoustic Pieces
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
25
3
0
15 Nov 2023
Toward Joint Language Modeling for Speech Units and Text
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou
Chung-Ming Chien
Wei-Ning Hsu
Karen Livescu
Arun Babu
Alexis Conneau
Alexei Baevski
Michael Auli
VLM
26
19
0
12 Oct 2023
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech
  Model
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model
Kai-Wei Chang
Ming-Hsin Chen
Yun-Ping Lin
Jing Neng Hsu
Paul Kuo-Ming Huang
Chien-yu Huang
Shang-Wen Li
Hung-yi Lee
21
6
0
04 Oct 2023
What Do Self-Supervised Speech Models Know About Words?
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
18
26
0
30 Jun 2023
Recent Advances in Direct Speech-to-text Translation
Recent Advances in Direct Speech-to-text Translation
Chen Xu
Rong Ye
Qianqian Dong
Chengqi Zhao
Tom Ko
Mingxuan Wang
Tong Xiao
Jingbo Zhu
12
18
0
20 Jun 2023
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for
  Speech Understanding
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding
Mingqiu Wang
Izhak Shafran
H. Soltau
Wei Han
Yuan Cao
Dian Yu
Laurent El Shafey
RALM
AuLLM
11
9
0
08 Jun 2023
DUB: Discrete Unit Back-translation for Speech Translation
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
11
23
0
19 May 2023
ChatGPT for Shaping the Future of Dentistry: The Potential of
  Multi-Modal Large Language Model
ChatGPT for Shaping the Future of Dentistry: The Potential of Multi-Modal Large Language Model
Hanyao Huang
Ou Zheng
Dongdong Wang
Jiayi Yin
Zijin Wang
...
H. Yin
Chuan Xu
Renjie Yang
Q. Zheng
B. Shi
MedIm
AI4MH
AI4CE
LM&MA
58
171
0
23 Mar 2023
Structured Pruning of Self-Supervised Pre-trained Models for Speech
  Recognition and Understanding
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding
Yifan Peng
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Shinji Watanabe
19
34
0
27 Feb 2023
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding
  Tasks
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks
Suwon Shon
Siddhant Arora
Chyi-Jiunn Lin
Ankita Pasad
Felix Wu
Roshan S. Sharma
Wei Yu Wu
Hung-yi Lee
Karen Livescu
Shinji Watanabe
ELM
19
31
0
20 Dec 2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech
  Recognition
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Xiaohuan Zhou
Jiaming Wang
Zeyu Cui
Shiliang Zhang
Zhijie Yan
Jingren Zhou
Chang Zhou
27
12
0
29 Nov 2022
Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model
  for Telephonic-Speech ASR
Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR
Vrunda N. Sukhadia
Anjana Arunkumar
S. Umesh
11
1
0
03 Nov 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken
  sentence embeddings
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
30
2
0
23 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder
  Based Speech-Text Pre-training
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
61
57
0
07 Oct 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
205
1,654
0
15 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language
  Processing
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
110
192
0
14 Oct 2021
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
292
5,761
0
29 Apr 2021
Generative Spoken Language Modeling from Raw Audio
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
174
336
0
01 Feb 2021
1