2311.02248
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
3 November 2023
Jing Pan
Jian Wu
Yashesh Gaur
S. Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
ELM
Papers citing
"COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning"
26 / 26 papers shown
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
36
1
0
11 Apr 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
99
1
0
28 Jan 2025
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
Kai-Tuo Xu
Feng-Long Xie
Xu Tang
Yao Hu
54
4
0
24 Jan 2025
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Tsz Kin Lam
Marco Gaido
Sara Papi
L. Bentivogli
Barry Haddow
29
0
0
04 Jan 2025
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
Chun-Yi Kuan
Hung-yi Lee
AuLLM
LRM
56
1
0
03 Jan 2025
SpeechQE: Estimating the Quality of Direct Speech Translation
HyoJung Han
Kevin Duh
Marine Carpuat
18
0
0
28 Oct 2024
Roadmap towards Superhuman Speech Understanding using Large Language Models
Fan Bu
Yuhao Zhang
X. Wang
Benyou Wang
Q. Liu
H. Li
LM&MA
ELM
AuLLM
33
1
0
17 Oct 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models
Tengfei Yu
Xuebo Liu
Zhiyi Hou
Liang Ding
Dacheng Tao
Min Zhang
32
0
0
04 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
57
14
0
01 Oct 2024
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
16
0
0
29 Sep 2024
LA-RAG: Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
19
2
0
13 Sep 2024
SALSA: Speedy ASR-LLM Synchronous Aggregation
Ashish R. Mittal
Darshan Prabhu
Sunita Sarawagi
P. Jyothi
21
2
0
29 Aug 2024
Language Model Can Listen While Speaking
Ziyang Ma
Yakun Song
Chenpeng Du
Jian Cong
Zhuo Chen
Yuping Wang
Y. Wang
Xie Chen
AuLLM
29
23
0
05 Aug 2024
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
Ke-Han Lu
Zhehuai Chen
Szu-Wei Fu
He Huang
Boris Ginsburg
Yu-Chiang Frank Wang
Hung-yi Lee
VLM
AuLLM
20
9
0
27 Jun 2024
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models
Vahid Noroozi
Zhehuai Chen
Somshubra Majumdar
Steve Huang
Jagadeesh Balam
Boris Ginsburg
SyDa
26
3
0
18 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
31
2
0
13 Jun 2024
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
Chun-Yi Kuan
Wei-Ping Huang
Hung-yi Lee
AuLLM
19
1
0
12 Jun 2024
BLSP-KD: Bootstrapping Language-Speech Pre-training via Knowledge Distillation
Chen Wang
Minpeng Liao
Zhongqiang Huang
Jiajun Zhang
ALM
AuLLM
32
1
0
29 May 2024
Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
BDL
32
4
0
23 Apr 2024
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations
Guan-Ting Lin
Cheng-Han Chiang
Hung-yi Lee
16
22
0
20 Feb 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
33
11
0
19 Feb 2024
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition
Yukiya Hono
Koh Mitsuda
Tianyu Zhao
Kentaro Mitsui
Toshiaki Wakatsuki
Kei Sawada
AuLLM
21
8
0
06 Dec 2023
SLM: Bridge the thin gap between speech and text foundation models
Mingqiu Wang
Wei Han
Izhak Shafran
Zelin Wu
Chung-Cheng Chiu
...
Zhong Meng
Golan Pundak
Nikhil Siddhartha
J. Schalkwyk
Yonghui Wu
AuLLM
37
56
0
30 Sep 2023
End-to-End Speech Recognition Contextualization with Large Language Models
Egor Lakomkin
Chunyang Wu
Yassir Fathullah
Ozlem Kalinli
M. Seltzer
Christian Fuegen
47
17
0
19 Sep 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
73
281
0
25 May 2022