2505.17496
Cited By
Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models
23 May 2025
Chi-Yuan Hsiao
Ke-Han Lu
Kai-Wei Chang
Chih-Kai Yang
Wei-Chih Chen
Hung-yi Lee
CLL
MoMe
Papers citing
"Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models"
32 / 32 papers shown
Title
Understanding Textual Capability Degradation in Speech LLMs via Parameter Importance Analysis
Chao Wang
Rui Zheng
Yang Ai
Zhen-Hua Ling
60
0
0
28 Sep 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
277
52
0
11 Apr 2025
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Ke-Han Lu
Zhehuai Chen
Szu-Wei Fu
Chao-Han Huck Yang
Jagadeesh Balam
Boris Ginsburg
Yu-Te Wang
Hung-yi Lee
AuLLM
SyDa
322
34
0
28 Jan 2025
Lifelong Learning of Large Language Model based Agents: A Roadmap
Junhao Zheng
Chengming Shi
Xidi Cai
Qiuke Li
Duzhen Zhang
Xuefei Liu
Dong Yu
Qianli Ma
CLL
KELM
LLMAG
LM&Ro
AI4CE
172
35
0
13 Jan 2025
Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
Chih-Kai Yang
Yu-Kuan Fu
Chen-An Li
Yi-Cheng Lin
Yu-Xiang Lin
...
Ulin Sanga
Xuanjun Chen
Po-Chun Hsu
Shu-Wen Yang
Hung-yi Lee
AuLLM
255
12
0
11 Nov 2024
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Chien-yu Huang
Wei-Chih Chen
Shu-Wen Yang
Andy T. Liu
Chen-An Li
...
David Harwath
Shinji Watanabe
Hung-yi Lee
ELM
AuLLM
173
55
0
08 Nov 2024
Moshi: a speech-text foundation model for real-time dialogue
Alexandre Défossez
Laurent Mazaré
Manu Orsini
Amélie Royer
P. Pérez
Edouard Grave
Neil Zeghidour
AuLLM
387
327
0
17 Sep 2024
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
Zhifei Xie
Changqiao Wu
AuLLM
VGen
VLM
SyDa
LRM
283
151
0
29 Aug 2024
Qwen2-Audio Technical Report
Yunfei Chu
Jin Xu
Qian Yang
Haojie Wei
Xipin Wei
...
Yuanjun Lv
Jinzheng He
Junyang Lin
Chang Zhou
Jingren Zhou
AuLLM
VLM
213
346
0
15 Jul 2024
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
Ke-Han Lu
Zhehuai Chen
Szu-Wei Fu
He Huang
Boris Ginsburg
Yu-Chiang Frank Wang
Hung-yi Lee
VLM
AuLLM
178
34
0
27 Jun 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Yuntian Deng
Radha Poovendran
Yejin Choi
Bill Yuchen Lin
SyDa
267
238
0
12 Jun 2024
Seamless: Multilingual Expressive and Streaming Speech Translation
Seamless Communication
Loïc Barrault
Yu-An Chung
Mariano Coria Meglioli
David Dale
...
Paden Tomasello
Changhan Wang
Jeff Wang
Skyler Wang
Mary Williamson
163
219
0
08 Dec 2023
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Yunfei Chu
Jin Xu
Xiaohuan Zhou
Qian Yang
Shiliang Zhang
Zhijie Yan
Chang Zhou
Jingren Zhou
AuLLM
262
571
0
14 Nov 2023
Instruction-Following Evaluation for Large Language Models
Jeffrey Zhou
Tianjian Lu
Swaroop Mishra
Siddhartha Brahma
Sujoy Basu
Yi Luan
Denny Zhou
Le Hou
ELM
ALM
LRM
239
520
0
14 Nov 2023
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
International Conference on Machine Learning (ICML), 2023
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
402
471
0
06 Nov 2023
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
LM&MA
AuLLM
273
416
0
20 Oct 2023
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Siddhant Arora
Hayato Futami
Jee-weon Jung
Yifan Peng
Roshan S. Sharma
Yosuke Kashiwagi
E. Tsunoo
Karen Livescu
Shinji Watanabe
ELM
198
11
0
04 Oct 2023
Joint Audio and Speech Understanding
Automatic Speech Recognition & Understanding (ASRU), 2023
Yuan Gong
Alexander H. Liu
Hongyin Luo
Leonid Karlinsky
James R. Glass
AuLLM
391
114
0
25 Sep 2023
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chien-yu Huang
Ke-Han Lu
Shi Wang
Chi-Yuan Hsiao
Chun-Yi Kuan
...
Roshan S. Sharma
Shinji Watanabe
Bhiksha Ramakrishnan
Shady Shehata
Hung-yi Lee
AuLLM
196
84
0
18 Sep 2023
Mitigating the Alignment Tax of RLHF
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yong Lin
Hangyu Lin
Wei Xiong
Shizhe Diao
Zeming Zheng
...
Han Zhao
Nan Jiang
Heng Ji
Xingtai Lv
Tong Zhang
MoMe
CLL
391
119
0
12 Sep 2023
TIES-Merging: Resolving Interference When Merging Models
Neural Information Processing Systems (NeurIPS), 2023
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Joey Tianyi Zhou
MoMe
316
498
0
02 Jun 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
International Conference on Learning Representations (ICLR), 2023
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayutsh Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
296
82
0
24 May 2023
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dong Zhang
Shimin Li
Xin Zhang
Jun Zhan
Pengyu Wang
Yaqian Zhou
Xipeng Qiu
AuLLM
MLLM
441
497
0
18 May 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
3.3K
20,007
0
15 Mar 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2023
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
350
982
0
05 Jan 2023
AudioLM: a Language Modeling Approach to Audio Generation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
320
789
0
07 Sep 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
290
247
0
14 Oct 2021
Generative Spoken Language Modeling from Raw Audio
Transactions of the Association for Computational Linguistics (TACL), 2021
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
407
421
0
01 Feb 2021
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
405
2,374
0
12 Oct 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Interspeech (Interspeech), 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
301
900
0
24 Jun 2020
Experience Replay for Continual Learning
David Rolnick
Arun Ahuja
Jonathan Richard Schwarz
Timothy Lillicrap
Greg Wayne
CLL
372
1,373
0
28 Nov 2018
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
International Conference on Learning Representations (ICLR), 2013
Ian Goodfellow
M. Berk Mirza
Xia Da
Aaron Courville
Yoshua Bengio
365
1,563
0
21 Dec 2013