2402.05755
SpiRit-LM: Interleaved Spoken and Written Language Model
8 February 2024
Tu Nguyen
Benjamin Muller
Bokai Yu
Marta R. Costa-jussá
Maha Elbayad
Sravya Popuri
Paul-Ambroise Duquenne
Robin Algayres
Ruslan Mavlyutov
Itai Gat
Gabriel Synnaeve
Juan Pino
Benoît Sagot
Emmanuel Dupoux
AuLLM
VLM
Papers citing
"SpiRit-LM: Interleaved Spoken and Written Language Model"
27 / 27 papers shown
Title
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Yemin Shi
Yu Shu
Siwei Dong
Guangyi Liu
Jaward Sesay
Jingwen Li
Zhiting Hu
AuLLM
VLM
43
0
0
05 May 2025
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
Qingkai Fang
Yan Zhou
Shoutao Guo
Shaolei Zhang
Yang Feng
AuLLM
51
0
0
05 May 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
39
1
0
11 Apr 2025
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling
Liang-Hsuan Tseng
Yi-Chang Chen
Kuan-Yi Lee
Da-shan Shiu
Hung-yi Lee
AuLLM
52
0
0
09 Apr 2025
VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation
Yuhao Wang
Heyang Liu
Ziyang Cheng
Ronghua Wu
Qunshan Gu
Yanfeng Wang
Yu Wang
45
0
0
05 Apr 2025
Overcoming Vocabulary Constraints with Pixel-level Fallback
Jonas F. Lotz
Hendra Setiawan
Stephan Peitz
Yova Kementchedjhieva
35
0
0
02 Apr 2025
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens
Shivam Mehta
Nebojsa Jojic
Hannes Gamper
28
0
0
28 Mar 2025
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Dingdong Wang
Jin Xu
Ruihang Chu
Zhifang Guo
X. Wang
Jincenzi Wu
Dongchao Yang
Shengpeng Ji
Junyang Lin
AuLLM
83
0
0
04 Mar 2025
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Tianpeng Li
J. Liu
Tao Zhang
Yuanbo Fang
Da Pan
...
Guosheng Dong
Jianhua Xu
Haoze Sun
Zenan Zhou
Weipeng Chen
AuLLM
50
3
0
24 Feb 2025
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation
Wenyi Yu
Siyin Wang
Xiaoyu Yang
Xianzhao Chen
Xiaohai Tian
J. Zhang
Guangzhi Sun
Lu Lu
Y. Wang
Chao Zhang
AuLLM
64
6
0
27 Nov 2024
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback
Guan-Ting Lin
Prashanth Gurunath Shivakumar
Aditya Gourav
Yile Gu
Ankur Gandhe
Hung-yi Lee
I. Bulyko
23
8
0
04 Nov 2024
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models
Heng-Jui Chang
Hongyu Gong
Changhan Wang
James R. Glass
Yu-An Chung
26
0
0
31 Oct 2024
Enhancing TTS Stability in Hebrew using Discrete Semantic Units
Ella Zeldes
Or Tal
Yossi Adi
17
0
0
28 Oct 2024
Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation
Maohao Shen
Shun Zhang
Jilong Wu
Zhiping Xiu
Ehab AlBadawy
Yiting Lu
M. Seltzer
Qing He
33
2
0
27 Oct 2024
Roadmap towards Superhuman Speech Understanding using Large Language Models
Fan Bu
Yuhao Zhang
X. Wang
Benyou Wang
Q. Liu
H. Li
LM&MA
ELM
AuLLM
33
1
0
17 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David F. Harwath
42
3
0
05 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
57
14
0
01 Oct 2024
Speech Recognition Rescoring with Large Speech-Text Foundation Models
Prashanth Gurunath Shivakumar
J. Kolehmainen
Aditya Gourav
Yi Gu
Ankur Gandhe
Ariya Rastrow
I. Bulyko
AuLLM
21
0
0
25 Sep 2024
Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents
Bandhav Veluri
Benjamin Peloquin
Bokai Yu
Hongyu Gong
Shyamnath Gollakota
AuLLM
OffRL
34
13
0
23 Sep 2024
Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach
Maxime Poli
Emmanuel Chemla
Emmanuel Dupoux
21
2
0
16 Sep 2024
Salmon: A Suite for Acoustic Language Model Evaluation
Gallil Maimon
Amit Roth
Yossi Adi
ELM
AuLLM
49
5
0
11 Sep 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
39
2
0
13 Jun 2024
MAD Speech: Measures of Acoustic Diversity of Speech
Matthieu Futeral
A. Agostinelli
Marco Tagliasacchi
Neil Zeghidour
Eugene Kharitonov
46
1
0
16 Apr 2024
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
73
297
0
22 May 2023
"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
Eric Michael Smith
Melissa Hall
Melanie Kambadur
Eleonora Presani
Adina Williams
65
128
0
18 May 2022
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
174
336
0
01 Feb 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020