ResearchTrend.AI
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

7 October 2023
Zhihao Du
Jiaming Wang
Qian Chen
Yunfei Chu
Zhifu Gao
Zerui Li
Kai Hu
Xiaohuan Zhou
Jin Xu
Ziyang Ma
Wen Wang
Siqi Zheng
Chang Zhou
Zhijie Yan
Shiliang Zhang
    LLMAG
    VLM
    AuLLM
    LM&MA

Papers citing "LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT"

14 / 14 papers shown
Title
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a $50K Budget
Xin Li
Kaikai Jia
Hao Sun
Jun Dai
Z. L. Jiang
34
0
0
27 Apr 2025
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Prabhat Pandey
R. Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
36
0
0
12 Apr 2025
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
Chun-Yi Kuan
Hung-yi Lee
AuLLM
LRM
56
1
0
03 Jan 2025
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
57
14
0
01 Oct 2024
Salmon: A Suite for Acoustic Language Model Evaluation
Gallil Maimon
Amit Roth
Yossi Adi
ELM
AuLLM
49
5
0
11 Sep 2024
Language Model Can Listen While Speaking
Ziyang Ma
Yakun Song
Chenpeng Du
Jian Cong
Zhuo Chen
Yuping Wang
Y. Wang
Xie Chen
AuLLM
29
23
0
05 Aug 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
21
9
0
15 Jun 2024
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models
Runyan Yang
Huibao Yang
Xiqing Zhang
Tiantian Ye
Ying Liu
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
23
0
0
12 Jun 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
33
2
0
09 Jun 2024
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation
Zhehuai Chen
He Huang
A. Andrusenko
Oleksii Hrinchuk
Krishna C. Puvvada
Jason Chun Lok Li
Subhankar Ghosh
Jagadeesh Balam
Boris Ginsburg
LRM
16
48
0
13 Oct 2023
LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models
Teerapat Jenrungrot
Michael Chinen
W. Kleijn
Jan Skoglund
Zalan Borsos
Neil Zeghidour
Marco Tagliasacchi
39
19
0
23 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
77
249
0
02 Mar 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
137
304
0
30 Jan 2023
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
110
192
0
14 Oct 2021