2406.06582
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing
4 June 2024
V. Trinh, Rosy Southwell, Yiwen Guan, Xinlu He, Zhiyong Wang, Jacob Whitehill
OffRL
Papers citing "Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing"
2 / 2 papers shown
Title
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins, Marcely Zanon Boito, André F. T. Martins
47
0
0
13 Mar 2025
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan, V. Trinh, Vivek Voleti, Jacob Whitehill
34
1
0
13 Sep 2024