Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.10446
Cited By
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
19 January 2024
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
19 / 19 papers shown
Title
Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs
Umberto Cappellazzo
Minsu Kim
Stavros Petridis
47
0
0
09 Mar 2025
Retrieval-Augmented Speech Recognition Approach for Domain Challenges
Peng Shen
Xugang Lu
Hisashi Kawai
RALM
60
0
0
24 Feb 2025
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
Andrew Rouditchenko
Saurabhchand Bhati
Samuel Thomas
Hilde Kuehne
Rogerio Feris
87
1
0
03 Feb 2025
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction
Moreno La Quatra
Valerio Mario Salerno
Yu Tsao
Sabato Marco Siniscalchi
78
0
0
22 Jan 2025
Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition
Rui Liu
Hongyu Yuan
H. Li
38
0
0
03 Jan 2025
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Umberto Cappellazzo
Minsu Kim
Honglie Chen
Pingchuan Ma
Stavros Petridis
Daniele Falavigna
Alessio Brutti
Maja Pantic
28
9
0
18 Sep 2024
SIFToM: Robust Spoken Instruction Following through Theory of Mind
Lance Ying
Jason Xinyu Liu
Shivam Aarya
Yizirui Fang
Stefanie Tellex
J. Tenenbaum
Tianmin Shu
LM&Ro
48
2
0
17 Sep 2024
LA-RAG:Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
21
2
0
13 Sep 2024
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
Yangze Li
Xiong Wang
Songjun Cao
Yike Zhang
Long Ma
Lei Xie
AuLLM
48
0
0
18 Aug 2024
Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction
Rithik Sachdev
Zhong-Qiu Wang
Chao-Han Huck Yang
19
3
0
23 Jul 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen
Sining Sun
Changhao Shan
Qing Yang
Lei Xie
35
2
0
27 Jun 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
Sreyan Ghosh
Sonal Kumar
Ashish Seth
Purva Chiniya
Utkarsh Tyagi
R. Duraiswami
Dinesh Manocha
33
0
0
06 Jun 2024
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Chengwei Qin
Pin-Yu Chen
Chng Eng Siong
Chao Zhang
VLM
27
3
0
23 May 2024
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
Bingshen Mu
Yangze Li
Qijie Shao
Kun Wei
Xucheng Wan
Naijun Zheng
Huan Zhou
Lei Xie
29
5
0
06 May 2024
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
38
47
0
10 Oct 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
E. Chng
12
15
0
11 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition
Yichong Leng
Xu Tan
Rui Wang
Linchen Zhu
Jin Xu
...
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
15
40
0
29 Sep 2021
1