arXiv:2402.05457
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
8 February 2024
Chen Chen
Ruizhe Li
Yuchen Hu
Sabato Marco Siniscalchi
Pin-Yu Chen
Ensiong Chng
Chao-Han Huck Yang
Papers citing "It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition"
21 / 21 papers shown
Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition
Rui Liu
Hongyu Yuan
H. Li
35
0
0
03 Jan 2025
Effective Text Adaptation for LLM-based ASR through Soft Prompt Fine-Tuning
Yingyi Ma
Zhe Liu
Ozlem Kalinli
65
0
0
09 Dec 2024
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
Yoshiki Masuyama
Koichi Miyazaki
Masato Murata
Mamba
28
0
0
11 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
37
0
0
01 Nov 2024
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Umberto Cappellazzo
Minsu Kim
Honglie Chen
Pingchuan Ma
Stavros Petridis
Daniele Falavigna
Alessio Brutti
Maja Pantic
18
9
0
18 Sep 2024
LA-RAG: Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
19
2
0
13 Sep 2024
SALSA: Speedy ASR-LLM Synchronous Aggregation
Ashish R. Mittal
Darshan Prabhu
Sunita Sarawagi
P. Jyothi
21
2
0
29 Aug 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen
Sining Sun
Changhao Shan
Qing Yang
Lei Xie
29
2
0
27 Jun 2024
Large Language Models for Dysfluency Detection in Stuttered Speech
Dominik Wagner
Sebastian P. Bayerl
Ilja Baumann
K. Riedhammer
Elmar Nöth
Tobias Bocklet
30
3
0
16 Jun 2024
Soundscape Captioning using Sound Affective Quality Network and Large Language Model
Yuanbo Hou
Qiaoqiao Ren
A. Mitchell
Wenwu Wang
Jian Kang
Tony Belpaeme
Dick Botteldooren
26
3
0
09 Jun 2024
Crossmodal ASR Error Correction with Discrete Speech Units
Yuanchao Li
Pinzhen Chen
Peter Bell
Catherine Lai
21
6
0
26 May 2024
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
Bingshen Mu
Yangze Li
Qijie Shao
Kun Wei
Xucheng Wan
Naijun Zheng
Huan Zhou
Lei Xie
27
5
0
06 May 2024
MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection
Taeheon Kim
Sangyun Chung
Damin Yeom
Youngjoon Yu
Hak Gu Kim
Y. Ro
25
2
0
22 Mar 2024
Multi-stage Large Language Model Correction for Speech Recognition
Jie Pu
Thai-Son Nguyen
Sebastian Stüker
LRM
17
6
0
17 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
35
47
0
10 Oct 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
96
81
0
04 May 2023
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
Chenzhuang Du
Jiaye Teng
Tingle Li
Yichen Liu
Tianyuan Yuan
Yue Wang
Yang Yuan
Hang Zhao
33
38
0
02 May 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
E. Chng
10
15
0
11 Apr 2023
UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup
Zongbo Han
Zhipeng Liang
Fan Yang
Liu Liu
Lanqing Li
Yatao Bian
P. Zhao
Bing Wu
Changqing Zhang
Jianhua Yao
45
34
0
19 Sep 2022
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
23
45
0
02 Feb 2022
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
245
9,042
0
06 Jun 2015