arXiv:2104.09426
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
19 April 2021
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
Papers citing
"Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers"
24 / 24 papers shown
Title
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
21
0
0
29 Sep 2024
An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems
Hitesh Tulsiani
David M. Chan
Shalini Ghosh
Garima Lalwani
Prabhat Pandey
Ankish Bansal
Sri Garimella
Ariya Rastrow
Björn Hoffmeister
26
0
0
16 Sep 2024
Advanced Long-Content Speech Recognition With Factorized Neural Transducer
Xun Gong
Yu Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Yanmin Qian
16
6
0
20 Mar 2024
Using Large Language Model for End-to-End Chinese ASR and NER
Yuang Li
Jiawei Yu
Min Zhang
Mengxin Ren
Yanqing Zhao
Xiaofeng Zhao
Miaomiao Ma
Chang Su
Hao-Yu Yang
29
7
0
21 Jan 2024
Promptformer: Prompted Conformer Transducer for ASR
Sergio Duarte Torres
Arunasish Sen
Aman Rana
Lukas Drude
Alejandro Gomez-Alanis
Andreas Schwarz
Leif Rädel
Volker Leutnant
27
3
0
14 Jan 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam
Shuo-yiin Chang
Tara N. Sainath
Rohit Prabhavalkar
Quan Wang
Shaan Bijwadia
14
3
0
18 Dec 2023
Generative Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Kwangyoun Kim
Prashant Sridhar
Yi-Te Hsu
Shinji Watanabe
Karen Livescu
12
2
0
15 Dec 2023
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
30
1
0
24 Oct 2023
Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation
Kun Wei
Bei Li
Hang Lv
Quan Lu
Ning Jiang
Lei Xie
31
3
0
22 Oct 2023
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
A. Hussein
Brian Yan
Antonios Anastasopoulos
Shinji Watanabe
Sanjeev Khudanpur
29
3
0
27 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
25
1
0
22 Sep 2023
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
29
9
0
18 Sep 2023
BASS: Block-wise Adaptation for Speech Summarization
Roshan S. Sharma
Kenneth Zheng
Siddhant Arora
Shinji Watanabe
Rita Singh
Bhiksha Raj
21
7
0
17 Jul 2023
Accelerating Transducers through Adjacent Token Merging
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
17
4
0
28 Jun 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui
Jiawen Kang
Jiajun Deng
Xiaoyue Yin
Yutao Xie
Xie Chen
Xunying Liu
14
8
0
23 Jun 2023
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
25
7
0
16 Dec 2022
LongFNT: Long-form Speech Recognition with Factorized Neural Transducer
Xun Gong
Yu-Huan Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Y. Qian
RALM
11
10
0
17 Nov 2022
Contextual-Utterance Training for Automatic Speech Recognition
Alejandro Gomez-Alanis
Lukas Drude
A. Schwarz
R. Swaminathan
Simon Wiesler
21
1
0
27 Oct 2022
Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
16
9
0
03 Jul 2022
Conversational Speech Recognition By Learning Conversation-level Characteristics
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
22
7
0
16 Feb 2022
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
19
4
0
13 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
13
362
0
02 Nov 2021
Speech Summarization using Restricted Self-Attention
Roshan S. Sharma
Shruti Palaskar
A. Black
Florian Metze
17
33
0
12 Oct 2021
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition
Zhiyun Lu
Yanwei Pan
Thibault Doutre
Parisa Haghani
Liangliang Cao
Rohit Prabhavalkar
C. Zhang
Trevor Strohman
AuLLM
72
14
0
08 Oct 2021